Hi everyone,
For my master's thesis I'm using (py)NEST for the first time. I have some
questions about parallel computing. First, a short intro to give you the
context:
My research group has access to 4 clusters: 2 with a focus on RAM/CPU and 2
with a focus on GPU. These are shared systems; resources are dynamically
allocated depending on the number of users doing calculations.
The way these clusters are set up allows me to simply set
"local_num_threads" to a value of 10, and if no one else is using it the
cluster manager will say I get 1000% CPU power, and the speed will increase
about 8-fold as expected. (In case you find that number confusing: the
manager rates 1 core as 100%, so 56 cores translate to 5600% CPU power.)
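To make the setup above concrete, here is a minimal sketch of the percentage bookkeeping and the thread request. The helper names (percent_to_cores, clamp_threads) are made up for illustration, and the clamping rule is an assumption rather than anything NEST prescribes. The only NEST calls are nest.ResetKernel and nest.SetKernelStatus with the "local_num_threads" key, and they are skipped when PyNEST cannot be imported.

```python
def percent_to_cores(cpu_percent):
    """Convert the cluster manager's rating (100% per core) into a core count."""
    return cpu_percent / 100.0


def clamp_threads(requested, granted_percent):
    """Hypothetical helper: never ask for more threads than whole cores granted."""
    whole_cores = int(percent_to_cores(granted_percent))
    return max(1, min(requested, whole_cores))


if __name__ == "__main__":
    # Full allocation of 1000% means 10 cores, so all 10 threads are kept.
    n_threads = clamp_threads(10, 1000)
    try:
        import nest  # PyNEST; only present where NEST is installed

        nest.ResetKernel()
        # Thread count has to be set before any nodes are created.
        nest.SetKernelStatus({"local_num_threads": n_threads})
    except ImportError:
        print("PyNEST not available; would request", n_threads, "threads")
```

With a grant of 300% the helper would fall back to 3 threads. Whether clamping like this is a sensible approach at all is part of what question 2 below asks.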
However, problems start to occur when others are using the cluster as well.
I might get 500% CPU power one moment, and just 300% the next. At this
point it is actually FASTER to only request 1 thread and get the full 100%
power. I'm guessing this is due to the architecture; something with nodes A
and B waiting on node C which has been temporarily allocated to my
colleague?
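A minimal sketch of how the 1-thread versus 10-thread comparison could be timed in a repeatable way. Everything here is hypothetical scaffolding: time_thread_counts and dummy_run are illustrative names, and dummy_run is only a stand-in for a function that would reset the kernel, set "local_num_threads", build the network and simulate. All individual timings are kept rather than averaged, since the allocation on a shared machine fluctuates between runs.

```python
import time


def time_thread_counts(run, thread_counts, repeats=3):
    """Return {n_threads: [wall-clock seconds per repeat]} for a callable run(n_threads)."""
    results = {}
    for n in thread_counts:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(n)
            timings.append(time.perf_counter() - start)
        results[n] = timings
    return results


def dummy_run(n_threads):
    """Placeholder workload; it ignores n_threads and just burns a little CPU."""
    sum(i * i for i in range(20000))


if __name__ == "__main__":
    for n, timings in time_thread_counts(dummy_run, [1, 10]).items():
        formatted = ", ".join(f"{t:.4f}" for t in timings)
        print(f"{n:>2} threads: {formatted} s")
```

Recording the granted CPU percentage next to each timing would show whether the slow runs coincide with the moments the allocation drops.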
*So, the questions:*
1. Does NEST prefer CPU or GPU? Is it RAM intensive? I couldn't find this
on your website, or anywhere really. Are there ballpark numbers for the
memory required for 10^9 connections?
2. Am I abusing the "local_num_threads" setting by applying it to a
cluster like this? Why is 1 thread sometimes faster than 10 threads?
3. Would switching to MPI help?
4. If so: the documentation says I need to run
$NEST_SOURCE_DIR/configure --with-mpi, but there is no "configure" folder
or command to be found anywhere, and this line fails. What and where is
this "configure"?
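Regarding the memory part of question 1, the arithmetic behind a ballpark is simple once a per-connection size is known. The sketch below uses a placeholder of 50 bytes per connection. That value is purely an assumption for illustration and is not taken from the NEST documentation; the real figure would depend on the synapse model and the NEST version, and neuron and infrastructure overhead are ignored.

```python
def connection_memory_gib(n_connections, bytes_per_connection):
    """Memory for the connections alone, in GiB (1 GiB = 2**30 bytes)."""
    return n_connections * bytes_per_connection / 2**30


# ASSUMPTION: placeholder value for illustration, not a documented NEST figure.
ASSUMED_BYTES_PER_CONNECTION = 50

if __name__ == "__main__":
    gib = connection_memory_gib(10**9, ASSUMED_BYTES_PER_CONNECTION)
    print(f"10^9 connections at {ASSUMED_BYTES_PER_CONNECTION} B each: {gib:.1f} GiB")
```

Swapping in a measured or documented per-connection size would turn this into an actual estimate.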
Any help would be greatly appreciated. Thank you for your time.
Kind regards,
Jimmy Mulder