Thanks for the quick response,
- we use only multithreading here, no MPI
- we are able to use more than 32 threads,
- we did not check whether the neurons receive the expected input currents. We might look
into it in the future if it becomes important.
Sincerely,
Jan Střeleček
On Thu, Apr 28, 2022 at 16:43, Hans Ekkehard Plesser <hans.ekkehard.plesser(a)nmbu.no>
wrote:
Dear Jan,
Thank you very much for your very detailed analysis. We will try to reproduce this as
soon as possible.
Three questions:
- You only use threads, no MPI parallelization, correct?
- Your machine has >= 32 cores?
- Do the neurons receive the expected input currents, especially the same currents
independent of number of threads?
Best,
Hans Ekkehard
--
Prof. Dr. Hans Ekkehard Plesser
Head, Department of Data Science
Faculty of Science and Technology
Norwegian University of Life Sciences
PO Box 5003, 1432 Aas, Norway
Phone +47 6723 1560
Email hans.ekkehard.plesser(a)nmbu.no
Home
http://arken.nmbu.no/~plesser
On 28/04/2022, 16:22, "Jan Střeleček" <strelda(a)protonmail.com> wrote:
Dear NEST developers,
In our group, we're working on a model of the primary visual cortex and use
step_current_source generators to simulate the input current of the LGN neurons. We
noticed that the simulation time of our model was very sensitive to the number of
step_current_sources. When trying to narrow down the cause, we found out that this might
be due to an issue with the parallelization of the step_current_source_generators. The
resulting simple system in which the problem can be observed is attached below,
simple_example.py. It essentially creates NS step_current_generators and connects them to
NL neurons with fixed indegree. The iaf_cond_exp neuron model is used here. Increasing the
number of step_current sources does not benefit from multithreading as much as one would
expect, in contrast to increasing the number of neurons; see the technical details below.
Our rough estimate is that, going from 1 to 32 threads, the step_current sources run 10 to
20 times slower than ideal parallel scaling would suggest.
Technical details:
The relative slowdown due to the parallelization of step_current_sources was measured
using linear regression over
simulation time = a NL + b NS.
See slowdown_example.png.
The ratio b/a was then calculated and measured as a function of the number of threads. A
bigger difference between the ratio for 1 thread and for 32 threads indicates a greater
parallelization problem in the step_current_generators.
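For readers who want to repeat the analysis, here is a minimal sketch of how the fit
simulation time = a NL + b NS and the ratio b/a could be computed for one thread count.
It is not the script used for the attached plots; the function name fit_time_model and
all numbers are hypothetical, and the "timings" are synthetic values generated from known
coefficients, not measurements.

```python
def fit_time_model(n_neurons, n_sources, sim_times):
    """Least-squares fit of sim_time = a * NL + b * NS (no intercept term).

    Solves the 2x2 normal equations with Cramer's rule.
    """
    s_ll = sum(l * l for l in n_neurons)
    s_ss = sum(s * s for s in n_sources)
    s_ls = sum(l * s for l, s in zip(n_neurons, n_sources))
    s_lt = sum(l * t for l, t in zip(n_neurons, sim_times))
    s_st = sum(s * t for s, t in zip(n_sources, sim_times))

    det = s_ll * s_ss - s_ls * s_ls
    if det == 0:
        # NL and NS were varied proportionally, so a and b cannot be separated.
        raise ValueError("NL and NS must be varied independently across runs")

    a = (s_lt * s_ss - s_st * s_ls) / det
    b = (s_st * s_ll - s_lt * s_ls) / det
    return a, b


# Synthetic example (assumed values, NOT measured): one run per (NL, NS) pair.
true_a, true_b = 0.002, 0.05
runs = [(nl, ns) for nl in (100, 200, 400) for ns in (10, 20, 40)]
nl_values = [nl for nl, _ in runs]
ns_values = [ns for _, ns in runs]
times = [true_a * nl + true_b * ns for nl, ns in runs]

a, b = fit_time_model(nl_values, ns_values, times)
ratio = b / a  # cost of one source relative to one neuron
print(f"a = {a:.6f}, b = {b:.6f}, b/a = {ratio:.2f}")
```

Repeating the fit for each thread count and comparing the resulting b/a values gives the
thread dependence described above.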
Some additional results:
- interval_dependence.png - the slowdown does not depend on amplitude_times in the
step_current_source function
- indegree_dependence.png - the slowdown depends on the indegree of nest.Connect(source,
neurons). Specifically, the slowdown is worse for low indegree values. This shows the
slowdown depends on the number of step_current_sources created, not on the injections
themselves.
Are you aware of any lack of parallelization in the step_current_source or in the current
injection itself? If so, are there any plans for improving it?
Best regards,
Jan Střeleček