From NWChem
5:30:40 PM PST - Tue, Nov 26th 2013
Previously, I asked how to clear shared memory left behind by stray processes. Edo told me to use a script called 'ipcreset', shipped with the NWChem package, to deal with it (http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id1001/Errors_Running_Nwchem_6.3.ht...). This script does a great job most of the time. However, there are some cases it cannot handle. For example, I ran ipcreset, but the segment was still there afterwards:
[tpirojsi@compute-0-55 ~]$ ipcs -a

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 589826     tpirojsi   600        2147483648 12         dest

------ Semaphore Arrays --------
key        semid      owner      perms      nsems

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
As far as I can tell, a stray segment that can be cleared has nattch '0', but one with a non-zero nattch (12 in the example above) cannot be cleared. I have also tried 'ipcrm', as suggested on many websites, but without success. Any idea how to fix this problem, please?
Best,
Tee
Edited On 1:14:14 AM PST - Wed, Nov 27th 2013 by Tpirojsi

Huub Forum:Admin, Forum:Mod, NWChemDeveloper, bureaucrat, sysop
11:55:54 AM PST - Wed, Nov 27th 2013
Hi Tee,
Using ipcrm will only clear up a shared memory segment when no processes are attached to it (nattch 0, as you found). In your example there are still 12 processes attached to the shared memory region, so it will be deleted only when those 12 processes detach from it. One way to force the processes to detach is to kill them (if you know which processes those are); otherwise the processes will detach automatically when the calculation completes and they terminate. Either way, as soon as the processes detach, the shared memory segment will disappear automatically if you previously scheduled it for deletion using ipcrm.
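To see which segments are still held open, the ipcs output can be filtered on the nattch column. The following is only a rough sketch, assuming the Linux `ipcs -m` column layout shown in your post (key, shmid, owner, perms, bytes, nattch, status); `attached_segments` is a made-up helper name, not something shipped with NWChem. Note that the `dest` status in your output suggests the segment is already marked for destruction and is just waiting for the 12 attached processes to let go.

```shell
# List shared memory segments that still have processes attached.
# Reads `ipcs -m` output on stdin and prints "shmid nattch" per match.
# attached_segments is a hypothetical helper, not part of NWChem.
attached_segments() {
    owner=$1
    awk -v owner="$owner" '
        # Data rows start with a hex key; column 2 is the shmid,
        # column 3 the owner, column 6 the attach count (nattch).
        $1 ~ /^0x/ && $3 == owner && $6 + 0 > 0 { print $2, $6 }
    '
}

# Typical use on a compute node (left commented out here):
#   ipcs -m | attached_segments "$USER"
#   ipcrm -m 589826   # marks the segment for deletion; the memory is
#                     # only released once nattch drops to 0
```

Feed it `ipcs -m` rather than `ipcs -a`, since message queue rows have a different meaning in the sixth column.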
I hope this helps, Huub

12:36:17 PM PST - Wed, Nov 27th 2013
Hi Huub,
Thank you for your advice. How do I check which processes those are and kill them as you suggested? The processes have been hanging there for a while and don't seem to detach by themselves.
I logged in to the nodes where the unterminated processes reside. I ran the top command and saw many nwchem processes running, but pkill nwchem and killall nwchem did not work.
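On the question of which processes are attached: one possible approach, assuming a Linux node, is to look through /proc/PID/maps, where an attached System V segment normally shows up as a /SYSV... mapping whose inode column equals the shmid. The sketch below relies on that assumption; `pids_attached_to` is a hypothetical helper name, and the optional second argument (an alternative /proc root) exists only to make it easy to try out on sample files.

```shell
# Print the PIDs whose memory map contains the given SysV shmid.
# Assumes Linux: in /proc/PID/maps, field 5 is the inode (equal to the
# shmid for SysV segments) and field 6 is a path beginning with /SYSV.
# pids_attached_to is a hypothetical helper, not part of NWChem.
pids_attached_to() {
    shmid=$1
    proc_root=${2:-/proc}
    for maps in "$proc_root"/[0-9]*/maps; do
        # Skip processes we may not inspect or that already exited.
        [ -r "$maps" ] || continue
        if awk -v id="$shmid" '
               $5 == id && $6 ~ /^\/SYSV/ { found = 1 }
               END { exit !found }
           ' "$maps" 2>/dev/null; then
            pid=${maps%/maps}       # strip the trailing /maps
            echo "${pid##*/}"       # keep only the PID directory name
        fi
    done
}

# Typical use (left commented out here):
#   pids_attached_to 589826
```

This only sees processes whose maps file you are allowed to read, which should include your own nwchem processes.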
Tee
Edited On 12:50:29 PM PST - Wed, Nov 27th 2013 by Tpirojsi

12:49:48 PM PST - Wed, Nov 27th 2013
My bad. It had to be killall -9 nwchem. Now the processes are killed. Thank you.
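For anyone finding this later: plain killall and pkill send SIGTERM, which a hung process may never act on, while -9 sends SIGKILL, which cannot be caught or ignored. A gentler variant is to try SIGTERM first and fall back to SIGKILL only if the process is still around after a short wait. The following is just a sketch of that idea; `terminate_pid` and the 5 second default are my own choices, not anything from NWChem.

```shell
# Ask a process to exit with SIGTERM, then fall back to SIGKILL.
# Arguments: PID, grace period in seconds (default 5).
# terminate_pid is a hypothetical helper, not part of NWChem.
terminate_pid() {
    pid=$1
    grace=${2:-5}
    # If the signal cannot be delivered, the process is gone or not ours.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only checks the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    # Still there after the grace period: same effect as kill -9.
    kill -KILL "$pid" 2>/dev/null
    return 0
}

# Typical use on each node (left commented out here):
#   for pid in $(pgrep -u "$USER" -x nwchem); do
#       terminate_pid "$pid" 5
#   done
```

Once the processes are gone, nattch drops to 0 and a segment already marked for deletion disappears, as Huub described.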