Dear Jan,
Thank you very much for your very detailed analysis. We will try to reproduce this as soon
as possible.
Three questions:
- You only use threads, no MPI parallelization, correct?
- Your machine has >= 32 cores?
- Do the neurons receive the expected input currents, and in particular the same currents
regardless of the number of threads?
Best,
Hans Ekkehard
--
Prof. Dr. Hans Ekkehard Plesser
Head, Department of Data Science
Faculty of Science and Technology
Norwegian University of Life Sciences
PO Box 5003, 1432 Aas, Norway
Phone +47 6723 1560
Email hans.ekkehard.plesser@nmbu.no
Home http://arken.nmbu.no/~plesser
On 28/04/2022, 16:22, "Jan Střeleček" <strelda@protonmail.com> wrote:
Dear NEST developers,
In our group, we are working on a model of the primary visual cortex and use
step_current_generator devices to simulate the input current of the LGN neurons. We
noticed that the simulation time of our model is very sensitive to the number of
step_current_generators. While narrowing down the cause, we found that this might be due
to an issue with the parallelization of the step_current_generator. A simple system in
which the problem can be observed is attached below as simple_example.py. It creates NS
step_current_generators and connects them to NL neurons of type iaf_cond_exp with a fixed
indegree. Increasing the number of step_current_generators does not benefit from
multithreading as one would expect, in contrast to increasing the number of neurons; see
the technical details below. Our rough estimate is that, comparing 1 and 32 threads, the
part of the simulation attributable to the generators runs 10 to 20 times slower than the
parallelization would suggest.
Technical details:
To quantify the slowdown attributable to the step_current_generators relative to the
neurons, we fitted the model
simulation time = a NL + b NS
by linear regression, where NL is the number of neurons and NS is the number of
step_current_generators. See slowdown_example.png.
We then calculated the ratio b/a and measured it as a function of the number of threads.
A larger difference between the ratio for 1 thread and the ratio for 32 threads means a
greater problem in the parallelization of the step_current_generators.
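
The regression described above can be sketched in a few lines of plain Python. This is
only an illustration of the fit simulation time = a NL + b NS, not the script used for
the measurements: the function name fit_time_model and all numbers are made up, and the
timings are synthetic values generated from assumed coefficients rather than measured
NEST runtimes.

```python
def fit_time_model(nl_values, ns_values, times):
    """Least-squares fit of time = a * NL + b * NS (no intercept term).

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    s_ll = sum(nl * nl for nl in nl_values)
    s_ss = sum(ns * ns for ns in ns_values)
    s_ls = sum(nl * ns for nl, ns in zip(nl_values, ns_values))
    s_lt = sum(nl * t for nl, t in zip(nl_values, times))
    s_st = sum(ns * t for ns, t in zip(ns_values, times))

    det = s_ll * s_ss - s_ls * s_ls
    if det == 0:
        raise ValueError("NL and NS are collinear; vary them independently")

    a = (s_lt * s_ss - s_ls * s_st) / det
    b = (s_ll * s_st - s_ls * s_lt) / det
    return a, b


# Synthetic example data (assumed values, NOT measurements):
# cost per neuron and cost per generator for one fixed thread count.
assumed_a, assumed_b = 0.002, 0.05

# Vary NL and NS independently on a small grid.
grid = [(nl, ns) for nl in (1000, 2000, 4000) for ns in (100, 200, 400)]
nl_values = [nl for nl, _ in grid]
ns_values = [ns for _, ns in grid]
times = [assumed_a * nl + assumed_b * ns for nl, ns in grid]

a, b = fit_time_model(nl_values, ns_values, times)
ratio = b / a  # the quantity compared across thread counts
print(f"a = {a:.6f}, b = {b:.6f}, b/a = {ratio:.2f}")
```

Repeating the fit once per thread count, with measured wall-clock times in place of the
synthetic ones, gives one b/a value per thread count; these values can then be compared
as described above.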
Some additional results:
· interval_dependence.png - the slowdown does not depend on the amplitude_times
parameter of the step_current_generator.
· indegree_dependence.png - the slowdown depends on the indegree used in
nest.Connect(source, neurons). Specifically, the slowdown is worse for low indegree
values. This indicates that the slowdown depends on the number of step_current_generators
created, not on the current injections themselves.
Are you aware of any lack of parallelization in the step_current_generator or in the
current injection itself? If so, are there any plans to improve it?
best regards,
Jan Střeleček