Vladimir | 8:33:17 PM PDT - Sat, May 31st 2014
Hi,
I am using the following input to calculate the MR-CCSD energy of the cubane transition state.
title "cubane MK-CCSD(2,2)/cc-pVDZ calculation"
scratch_dir /mnt/scratch
memory stack 100 mb heap 100 mb global 11000 mb
geometry
symmetry C2v
H -1.32680 2.05167 -0.03231
C -0.78899 1.09343 0.03178
H 1.32680 2.05167 -0.03231
C 0.78899 1.09343 0.03178
H -1.48781 -0.00000 2.07216
C -1.20592 0.00000 1.01567
H 1.48781 -0.00000 2.07216
C 1.20592 -0.00000 1.01567
H -1.44937 0.00000 -1.98042
C -0.77946 0.00000 -1.10963
H 1.44937 0.00000 -1.98042
C 0.77946 0.00000 -1.10963
H -1.32680 -2.05167 -0.03231
C -0.78899 -1.09343 0.03178
H 1.32680 -2.05167 -0.03231
C 0.78899 -1.09343 0.03178
end
scf
direct
end
basis spherical
H library cc-pVDZ
C library cc-pVDZ
end
tce
mkccsd
2emet 1
freeze atomic
end
mrccdata
root 1
nref 2
22222222222222222222222222220
22222222222222222222222222202
end
task tce energy
I am using an Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz with 8 GB of memory + 120 GB of swap.
SHMMAX is set to 16 GB (echo 16384000000 > /proc/sys/kernel/shmmax).
With OpenBLAS parallelization I'm running the calculation in a single process:
mpirun -np 1 nwchem N8.nw > N8.nwo
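In case it is relevant, the OpenBLAS thread count can also be pinned explicitly; OPENBLAS_NUM_THREADS is the standard OpenBLAS variable, and 4 is only an illustrative value for this 4-core CPU:
# illustrative: let OpenBLAS use the 4 cores while NWChem runs as a single MPI process
export OPENBLAS_NUM_THREADS=4
mpirun -np 1 nwchem N8.nw > N8.nwo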
But I hit an issue in the computation of the 2-e integrals:
MRCC tiling completed in 0.0 0.0
tce_ao1e_fock2e 36.28000 36.36079
F: 1 in bytes = 87040
tce_mo1e 0.03200 0.06773
eone,etwo,enrep,energy -1126.356944754460 460.249162408168 358.838446347621 -307.269335998670
mrcc_uhf_energy 8.78800 8.78590
tce_ao1e_fock2e 35.68800 35.74738
F: 2 in bytes = 87040
tce_mo1e 0.02400 0.02567
eone,etwo,enrep,energy -1125.936417042576 459.900381432998 358.838446347621 -307.197589261957
mrcc_uhf_energy 9.31200 9.33206
2-e(intermediate) /mnt/scratch/cubane. in bytes= 8159223808
Ref. 1 Half 2-e 915.42 1115.33
V 2-e /mnt/scratch/cubane. in bytes= 1437934592
0:armci_malloc:malloc 1 failed: 1437934600
(rank:0 hostname:kbob-G41MT-S2 pid:23581):ARMCI DASSERT fail. ../../ga-5-2/armci/src/memory/memory.c:PARMCI_Malloc():880 cond:0
I tried setting the environment variable ARMCI_DEFAULT_SHMMAX to different values (4096, 16000, 16384), but nothing changed.
No additional errors were reported with the value 16000.
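For reference, each attempt was a plain export before launching; as far as I understand, ARMCI interprets ARMCI_DEFAULT_SHMMAX in megabytes, so 16000 should correspond to roughly 16 GB:
# one of the attempts (value in MB, as I read the ARMCI documentation)
export ARMCI_DEFAULT_SHMMAX=16000
mpirun -np 1 nwchem N8.nw > N8.nwo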
I also tried not using the GA I/O scheme.
In that case the computation of the 2-e integrals completed successfully, but the MRCC iterations themselves fail without GA initialization.
I am using the development snapshot Nwchem-6.3.revision25564-src.2014-05-03 (May 03, 2014) with the patch from http://www.nwchem-sw.org/index.php/Special:AWCforum/sp/id4530.
Can anyone help figure out what the problem might be and suggest a workaround?
Thanks!
Edited On 8:36:39 PM PDT - Sat, May 31st 2014 by Vladimir

Edoapra | 3:14:12 PM PDT - Mon, Jun 2nd 2014
Vladimir,
What value of ARMCI_NETWORK did you use for your installation? Or have you left it undefined?
Thanks, Edo
Vladimir | 6:55:09 PM PDT - Mon, Jun 2nd 2014
My build script:
#!/bin/sh
#sudo apt-get install python2.7-dev zlib1g-dev libssl-dev gfortran
#Edit src/config/makefile.h and add "-lz -lssl" to the end of line 2094 (needed by python)
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/usr/lib/openblas-base -lopenblas"
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openblas-base
#sudo apt-get install libopenmpi-dev openmpi-bin
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MRCC_METHODS=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export ARMCI_NETWORK=SOCKETS
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib
export FC=gfortran
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make > make.log 2>&1
cd ../contrib
./getmem.nwchem
I found a note about ARMCI_NETWORK on this forum:
http://www.nwchem-sw.org/index.php/Special:AWCforum/sp/id1600
In my case (with OpenBLAS parallelization) I'm running mpirun -np 1 nwchem N8.nw.
Do I need to compile NWChem with MPI, or can I compile without it?
What value should I set for the variable ARMCI_NETWORK if I use only one node?
Edited On 1:10:23 AM PDT - Tue, Jun 3rd 2014 by Vladimir

Edoapra | 9:15:48 AM PDT - Tue, Jun 3rd 2014
Vladimir,
I suggest you first try ARMCI_NETWORK=MPI-TS (the default value).
I have just tried your input and it works with the following memory line and nproc=4:
memory stack 400 mb heap 100 mb global 3100 mb
The ARMCI_DEFAULT_SHMMAX issue becomes irrelevant for ARMCI_NETWORK=MPI-TS.
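A rough sketch of the rebuild, reusing the build script you posted (only the ARMCI_NETWORK line changes; a full make clean is the simplest way to be sure the GA/ARMCI layer under src/tools is rebuilt with the new setting):
export ARMCI_NETWORK=MPI-TS   # instead of SOCKETS
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make > make.log 2>&1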
Vladimir | 11:29:52 PM PDT - Tue, Jun 3rd 2014
Thank you very much, Edo.
Your magical spell worked perfectly.
But I still do not understand why the calculation does not work with a single process.
Vladimir | 8:12:37 AM PDT - Thu, Jun 5th 2014
I found another way to fix the issue.
https://groups.google.com/forum/#!msg/hpctools/-bYstidUAYA/LqZ38W1f1ukJ
Just apply this change:
-#define DEFAULT_MAX_NALLOC (4*1024*1024*16)
+#define DEFAULT_MAX_NALLOC (8*1024*1024*16)
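A hypothetical way to script the change (the exact path depends on where the ga-5-2 sources sit in your NWChem tree; memory.c is the file named in the ARMCI assertion, so adjust the path as needed):
# hypothetical one-liner; point it at the ga-5-2 directory in your own tree
sed -i 's/(4\*1024\*1024\*16)/(8*1024*1024*16)/' ga-5-2/armci/src/memory/memory.c
# rebuild NWChem afterwards so the larger limit is compiled into the GA/ARMCI layer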
Incidentally, nproc=1 with OpenBLAS parallelization is 7 times faster than nproc=4.
Edoapra | 9:37:25 AM PDT - Thu, Jun 5th 2014
Vladimir,
Thank you very much for your feedback.
In order to reproduce your single-processor finding, I need a few more details about your setup:
1) memory line used in NWChem input file
2) value of ARMCI_NETWORK used
Cheers, Edo
Vladimir | 6:10:25 PM PDT - Thu, Jun 5th 2014
1. Memory line as in the first post:
memory stack 400 mb heap 100 mb global 11000 mb
2. ARMCI_NETWORK not set (default).