CCSD(T) Calculation with Quadruple Zeta Basis Set -- Memory Issue
From NWChem
Viewed 319 times, With a total of 14 Posts
Clicked A Few Times
Threads 1
Posts 8
6:30:54 PM PDT - Mon, Jun 18th 2018
Hello NWChem Developer,
I am trying to run CCSD(T) energy calculations with a quadruple-zeta basis set on a 5-atom system, but the memory requirement implied by the 2-e file size seems to be off the charts (> 100 GB). The input file reads:
Quote:username
echo
memory stack 1300 mb heap 200 mb global 1500 mb
start im
title "im"
charge 1
geometry units angstroms print xyz noautosym noautoz
C -2.23423902 0.59425408 -0.03224283
O -1.12129315 1.09129114 -0.09445519
O -3.30588587 0.19083810 0.02028232
Br 1.41553615 -0.39477191 0.02227492
H -0.18608027 0.45084374 -0.04234683
end
basis
C library aug-cc-pvqz
H library aug-cc-pvqz
O library aug-cc-pvqz
# BASIS SET: (15s,12p,13d,3f,2g) -> [7s,6p,5d,3f,2g]
Br S
78967.5000000 0.0000280 -0.0000110
11809.7000000 0.0002140 -0.0000860
2687.1400000 0.0010560 -0.0004350
760.0360000 0.0036880 -0.0014570
241.8110000 0.0079340 -0.0033810
38.4914000 0.1528680 -0.0576580
24.0586000 -0.2786020 0.1123250
14.3587000 -0.2188500 0.0756730
... (to keep it short)
end
ECP
Br nelec 10
Br ul
2 1.0000000 0.0000000
Br S
...
end
scf
doublet
THRESH 1.0e-5
MAXITER 100
TOL2E 1e-7
end
tce
ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
end
task tce
The error message reads
Quote:username
2-e (intermediate) file size = 106977507300
2-e (intermediate) file name = ./im.v2i
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
available GA memory 1572841816 bytes
createfile: failed ga_create size/nproc bytes 5348875365
------------------------------------------------------------------------
------------------------------------------------------------------------
I could change the memory options at the beginning of the file, but it seems a little unrealistic to allocate a GA region as large as ~100 GB on the nodes I am using (two 10-core Intel Xeon E5-2680v2 "Ivy Bridge" processors at 2.8 GHz, i.e. 20 cores total, with 128 GB of memory, about 6.8 GB per core).
I have also tried different "IO" and "2emet" options, for example,
Quote:username
tce
ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
2eorb
2emet 13
tilesize 10
attilesize 40
end
set tce:xmem 100
and
Quote:username
tce
tilesize 2
io ga
2EORB
2EMET 15
idiskx 1
ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
end
but the job seems to hang after printing "v2 file size = ".
Any insight on this issue is greatly appreciated!
Thank you in advance,
Rui
Edoapra Forum:Admin, Forum:Mod, bureaucrat, sysop
Forum Vet
Threads 9
Posts 1522
9:47:59 AM PDT - Tue, Jun 19th 2018
createfile: failed ga_create size/nproc bytes 5348875365
5348875365 bytes = 5348875365/1024/1024/1024 = 4.98 GB
Please change the memory line to
memory stack 1300 mb heap 200 mb global 6000 mb
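As a rough check on where this number comes from (a sketch, not from the thread: it assumes the printed "size/nproc" value is the 2-e intermediate file size divided by the number of ranks, which the figures above suggest was 20 here):
#!/bin/bash
# Rough per-rank GA estimate (sketch; the rank count of 20 is inferred from
# 106977507300 / 5348875365 = 20, it is not stated explicitly in the post).
v2i_bytes=106977507300            # 2-e (intermediate) file size from the output
nranks=20                         # assumed number of ranks sharing the GA file
per_rank_bytes=$(( v2i_bytes / nranks ))
per_rank_mb=$(( per_rank_bytes / 1024 / 1024 ))
echo "per-rank GA chunk needed : ${per_rank_bytes} bytes (~${per_rank_mb} MB)"
echo "per-rank GA available with 'global 1500 mb' : 1572841816 bytes"
# ~5100 MB needed vs ~1500 MB available, hence the suggested 'global 6000 mb'.
Spreading the same file over more ranks shrinks the per-rank chunk proportionally, which is also why running on more cores can help.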
Clicked A Few Times
Threads 1
Posts 8
10:50:04 PM PDT - Tue, Jun 19th 2018
Thank you for the prompt response, Edoapra.
I had to adjust the memory to
Quote:username memory stack 1000 mb heap 100 mb global 5300 mb
so that it does not exceed the per-core memory (6.8 GB),
but now I run into an error like the following:
Quote:username
slurmstepd: error: Step 3840722.0 exceeded memory limit (123363455 > 122880000), being killed
slurmstepd: error: Step 3840722.0 exceeded memory limit (123618673 > 122880000), being killed
slurmstepd: error: Step 3840722.0 exceeded memory limit (123451708 > 122880000), being killed
slurmstepd: error: *** STEP 3840722.0 ON prod2-0143 CANCELLED AT 2018-06-20T04:05:00 ***
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: Exceeded job memory limit
srun: Job step aborted: Waiting up to 122 seconds for job step to finish.
srun: error: prod2-0148: tasks 100-119: Killed
srun: error: prod2-0150: tasks 140-159: Killed
srun: error: prod2-0149: tasks 120-139: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/156552/smaps
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/135960/smaps
srun: error: prod2-0145: tasks 41,43,45,47,49,51,53,55,57,59: Killed
srun: error: prod2-0146: tasks 63,65,69,71,75,77,79: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/234980/smaps
srun: error: prod2-0145: tasks 40,42,44,46,48,50,52,54,56,58: Killed
srun: error: prod2-0146: tasks 61,67,73: Killed
srun: error: prod2-0143: tasks 0-19: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/77821/smaps
srun: error: prod2-0146: tasks 60,62,64,66,68,70,72,74,76,78: Killed
srun: error: prod2-0144: tasks 20-39: Killed
slurmstepd: error: _get_pss: ferror() indicates error on file /proc/17624/smaps
srun: error: prod2-0147: tasks 80-99: Killed
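A rough per-node budget (a sketch; it assumes the NWChem stack/heap/global settings dominate each task's footprint and reads the slurmstepd limit as kilobytes) shows why the step is killed:
#!/bin/bash
# Per-node memory budget for this run: 20 tasks per node, each with
# "memory stack 1000 mb heap 100 mb global 5300 mb".
per_task_mb=$(( 1000 + 100 + 5300 ))                  # 6400 MB per task
tasks_per_node=20
node_request_mb=$(( per_task_mb * tasks_per_node ))   # 128000 MB per node
slurm_limit_kb=122880000                              # limit printed by slurmstepd
echo "NWChem request per node : ${node_request_mb} MB"
echo "SLURM step memory limit : about $(( slurm_limit_kb / 1024 )) MB"
# The request already fills the node's 128 GB, so MPI/GA overhead pushes
# the step past the SLURM limit and it gets killed.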
I have also tried another memory allocation
Quote:username memory stack 400 mb heap 100 mb global 6000 mb
and it yielded a different error
Quote:username
2-e (intermediate) file size = 107432197225
2-e (intermediate) file name = ./vim.v2i
tce_ao2e: MA problem k_ijkl 18
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
0:
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
For more information see the NWChem manual at
http://www.nwchem-sw.org/index.php/NWChem_Documentation
For further details see manual section:
Currently I am using 160 cores -- do you think I should use more cores so that the GA allocation on each core is smaller?
Thank you very much,
Rui
Clicked A Few Times
Threads 1
Posts 8
4:28:48 AM PDT - Wed, Jun 20th 2018
More CPUs, still failed
In the hope of reducing the per-core memory requirement, I tested the job with 200 cores (up from 160). However, the run still does not seem to get the expected amount of global (GA) memory. For example, the memory line reads:
Quote:username memory stack 900 mb heap 200 mb global 4300 mb
but the error message shows:
Quote:username
tce_ao2e: fast2e=1
half-transformed integrals in memory
2-e (intermediate) file size = 107432197225
2-e (intermediate) file name = ./vim.v2i
Cpu & wall time / sec 214.7 266.1
available GA memory 211394680 bytes
------------------------------------------------------------------------
createfile: failed ga_create size/nproc bytes 3079838825
------------------------------------------------------------------------
------------------------------------------------------------------------
current input line :
129: task tce
even though the input file clearly requests 4300 mb of GA per process.
Would you please let me know how to fix this?
Thank you,
Rui
Edoapra Forum:Admin, Forum:Mod, bureaucrat, sysop
Forum Vet
Threads 9
Posts 1522
10:04:21 AM PDT - Wed, Jun 20th 2018
Please report the tce input block you are currently using and the number of processors.
Clicked A Few Times
Threads 1
Posts 8
2:27:15 PM PDT - Wed, Jun 20th 2018
The TCE input block reads:
Quote:username
tce
ccsd(t)
FREEZE atomic
thresh 1e-6
maxiter 100
end
I am currently using 10 nodes with 20 cores per node. The memory on each core is 6GB. The job script reads:
Quote:username
#!/bin/bash
#SBATCH --job-name=vim
#SBATCH --partition=kill.q
#SBATCH --exclusive
#SBATCH --nodes=10
#SBATCH --tasks-per-node=20
#SBATCH --cpus-per-task=1
#SBATCH --error=%A.err
#SBATCH --time=0-10:59:59 ## time format is DD-HH:MM:SS
#SBATCH --output=%A.out
export I_MPI_FABRICS=shm:tmi
export I_MPI_PMI_LIBRARY=/opt/local/slurm/default/lib64/libpmi.so
source /global/opt/intel_2016/mkl/bin/mklvars.sh intel64
module load intel_2016/ics intel_2016/impi
export NWCHEM_TARGET=LINUX64
# CHANGE TO THE CORRECT PATH
export ARMCI_DEFAULT_SHMMAX=8096
export MPIRUN_PATH="srun"
export MPIRUN_NPOPT="-n"
export INPUT="vim"
$MPIRUN_PATH $MPIRUN_NPOPT ${SLURM_NTASKS} $NWCHEM_EXECUTABLE $INPUT.nw
Thank you!
Edoapra Forum:Admin, Forum:Mod, bureaucrat, sysop
Forum Vet
Threads 9
Posts 1522
5:39:56 PM PDT - Wed, Jun 20th 2018
Please try the following input
echo
permanent_dir /global/cscratch1/sd/apra/arar
memory stack 1300 mb heap 200 mb global 7000 mb
start im
title "im"
charge 1
geometry #units angstroms print xyz noautosym noautoz
C -2.23423902 0.59425408 -0.03224283
O -1.12129315 1.09129114 -0.09445519
O -3.30588587 0.19083810 0.02028232
Br 1.41553615 -0.39477191 0.02227492
H -0.18608027 0.45084374 -0.04234683
end
basis spherical
C library aug-cc-pvqz
H library aug-cc-pvqz
O library aug-cc-pvqz
Br library aug-cc-pvqz-pp
end
ECP
Br library aug-cc-pvqz-pp
end
scf
doublet
THRESH 1.0e-5
MAXITER 100
TOL2E 1e-12
end
tce
ccsd(t)
FREEZE atomic
tilesize 8
attilesize 12
thresh 1e-6
maxiter 100
end
task tce
Clicked A Few Times
Threads 1
Posts 8
7:41:10 PM PDT - Wed, Jun 20th 2018
Thank you, Edoapra.
I just want to make sure I understand this correctly.
Should I use 200 cores, with each core allocating the following amount of memory?
Quote:username memory stack 1300 mb heap 200 mb global 7000 mb
Edoapra Forum:Admin, Forum:Mod, bureaucrat, sysop
Forum Vet
Threads 9
Posts 1522
10:41:41 AM PDT - Thu, Jun 21st 2018
Quote:Srhhh Jun 20th 6:41 pm
Thank you, Edoapra.
I just want to make sure I understand this correctly.
Should I use 200 cores, with each core allocating the following amount of memory?
Quote:username memory stack 1300 mb heap 200 mb global 7000 mb
You should use only 10 tasks-per-node, for a total of 100 cores, since you mentioned that you have 6 GB/core.
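To make the arithmetic explicit, here is a minimal sketch of the suggested layout (10 tasks per node with the memory line from the proposed input; the 128 GB node size is taken from earlier in the thread):
#!/bin/bash
# Per-node budget with 10 tasks per node and
# "memory stack 1300 mb heap 200 mb global 7000 mb" per task.
per_task_mb=$(( 1300 + 200 + 7000 ))                  # 8500 MB per task
tasks_per_node=10
node_request_mb=$(( per_task_mb * tasks_per_node ))   # 85000 MB per node
echo "NWChem request per node: ${node_request_mb} MB (node has ~128000 MB)"
# With the original 20 tasks per node this would be 170000 MB and would
# again exceed the node memory.
In the job script this corresponds to changing --tasks-per-node=20 to --tasks-per-node=10; srun will then launch 100 tasks across the 10 nodes.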
Clicked A Few Times
Threads 1
Posts 8
12:29:10 PM PDT - Thu, Jun 21st 2018
Thank you, Edoapra.
I managed to get more cores (20 nodes, 400 cores), so I was able to run without the MA allocation issue using the following memory line:
Quote:username memory stack 1000 mb heap 200 mb global 4400 mb
Everything else in the input file is identical to your previous comment. It took about 5 minutes for the calculation to reach
Quote:username
tce_ao2e: fast2e=1
half-transformed integrals in memory
2-e (intermediate) file size = 105684005025
2-e (intermediate) file name = ./vim.v2i
Cpu & wall time / sec 144.8 184.8
tce_mo2e: fast2e=1
2-e integrals stored in memory
but the calculation has been hanging there for over eight hours -- nothing has been written to the folder or the output file at all. I also noticed some vim.aoints.x files that do not seem to have been cleaned up properly. Is this behavior normal for a calculation of this size, or is this QZ calculation pushing the limits of NWChem?
Thanks again.
Clicked A Few Times
Threads 1
Posts 8
6:52:31 PM PDT - Thu, Jun 21st 2018
Unstable CCSD iterations
The test run from my previous comment did reach the CCSD iterations (each iteration takes about 1 hour of wall time), but the iterations appear to be unstable. Please see below:
Quote:username
t2 file handle = -995
CCSD iterations
-----------------------------------------------------------------
Iter Residuum Correlation Cpu Wall V2*C2
-----------------------------------------------------------------
1 0.3745619466040 -1.0830661992146 1975.7 3034.0 759.3
2 0.3338130779425 -1.0377329617715 1992.9 3058.0 760.8
3 7.2614902105214 -1.0607684520852 1991.8 3049.5 762.0
4 60.1400573985661 -1.0597624893767 1986.2 3038.5 759.7
5 1384.5956104600380 -1.0695691959406 1993.2 3050.9 765.8
MICROCYCLE DIIS UPDATE: 5 5
The geometry for this calculation was optimized at the CCSD(T)/aug-cc-pVTZ level, so this error should not come from a bad geometry.
Thank you!
Edoapra Forum:Admin, Forum:Mod, bureaucrat, sysop
Forum Vet
Threads 9
Posts 1522
11:18:26 AM PDT - Fri, Jun 22nd 2018
Quote:Srhhh Jun 21st 5:52 pm
The test run from my previous comment did reach the CCSD iterations (each iteration takes about 1 hour of wall time), but the iterations appear to be unstable.
Thank you!
Did you use a spherical or cartesian basis?
Clicked A Few Times
Threads 1
Posts 8
12:50:42 PM PDT - Fri, Jun 22nd 2018
I had been using a Cartesian basis and have now changed to spherical. I will update you on how this test goes.
Another problem just occurred:
Quote:username
[25] Received an Error in Communication: (-991) 25:nga_get_common:cannot locate region: ./vim.r1.d1 [18591:18511 ,1:1 ]:
[212] Received an Error in Communication: (-991) 212:nga_get_common:cannot locate region: ./vim.r1.d1 [18526:18511 ,1:1 ]:
application called MPI_Abort(comm=0x84000000, -991) - process 212
[173] Received an Error in Communication: (-991) 173:nga_get_common:cannot locate region: ./vim.r1.d1 [18721:18511 ,1:1 ]:
application called MPI_Abort(comm=0x84000000, -991) - process 173
application called MPI_Abort(comm=0x84000000, -991) - process 25
[179] Received an Error in Communication: (-991) 179:nga_get_common:cannot locate region: ./vim.r1.d1 [18656:18511 ,1:1 ]:
application called MPI_Abort(comm=0x84000000, -991) - process 179
srun: error: prod2-0101: task 212: Exited with exit code 33
srun: error: prod2-0029: task 25: Exited with exit code 33
srun: error: prod2-0096: tasks 173,179: Exited with exit code 33
From what I can find online, this also seems to be memory-related (even though the MA 'test' passed and the CCSD iterations started). Does DIIS require additional memory?
Thank you
Forum Regular
Threads 45
Posts 216
6:11:06 PM PDT - Fri, Jun 22nd 2018
This calculation is very demanding on hardware. I have tried NWChem 6.8 on a Mac using aug-cc-pvdz.
...
Iterations converged
CCSD correlation energy / hartree = ...
CCSD total energy / hartree = ...
Singles contributions
Doubles contributions
...
CCSD[T] correction energy / hartree = ...
CCSD[T] correlation energy / hartree = ...
CCSD(T) correction energy / hartree = ...
CCSD(T) correlation energy / hartree = ...
CCSD(T) total energy / hartree = ...
...
Edited On 10:22:07 PM PDT - Sat, Jul 14th 2018 by Xiongyan21
Forum Regular
Threads 45
Posts 216
9:06:09 PM PDT - Fri, Jun 22nd 2018
I have tried aug-cc-pvtz, which I think is adequate for many practical purposes, with "ROHF" and the other needed keywords added to the proper input groups.
I am afraid the original aug-cc-pvqz calculation could only succeed on a high-performance supercomputer with an official NWChem installation, such as those at a US national lab.
NWChem 6.8 on a Mac gave
...
Iterations converged
CCSD correlation energy / hartree = ...
CCSD total energy / hartree = ...
Singles contributions
...
Edited On 4:57:02 AM PDT - Mon, Jul 9th 2018 by Xiongyan21