Hi all,
I'm trying to understand some of the inner workings of NEST. Right now I'm
running simulations with close to half a million elements, using mpirun on a
cluster with 25 nodes. The problem I am having is that the "setup" phase
(layer creation and connections) takes close to 8 min, while the simulation
itself only takes 1 min.
So I tried to use Python's multiprocessing package to speed it up, with the
following code:
import multiprocessing

import nest
import nest.topology as topology

nest.ResetKernel()
nest.SetKernelStatus({"local_num_threads": 1})
#...
# Each entry: (pre layer, post layer, projection dict, label for logging).
connections = [
    (layer1, layer1, conn_ee_dict, 1),
    (layer1, layer2, conn_ee_dict, 2),
    (layer2, layer2, conn_ee_dict, 3),
    (layer2, layer1, conn_ee_dict, 4),
]

# Process the connections.
def parallel_topology_connect(parameters):
    pre, post, projection, number = parameters
    print(f"Connection number: {number}")
    topology.ConnectLayers(pre, post, projection)

# Run each ConnectLayers call in its own worker process.
pool = multiprocessing.Pool(processes=4)
pool.map(parallel_topology_connect, connections)
The above example takes around 0.9 s, but if the last two lines are
replaced with a sequential loop, it takes 2.1 s:
for pre, post, projection, number in connections:
    print(f"Connection number: {number}")
    topology.ConnectLayers(pre, post, projection)
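A minimal sketch of how the two variants can be timed side by side, so the comparison is reproducible without NEST installed. Everything here is illustrative: `fake_connect`, `time_sequential`, `time_pool` and `JOBS` are hypothetical names, `time.sleep` stands in for `topology.ConnectLayers`, and the "fork" start method is assumed (Linux, as on the cluster).

```python
import multiprocessing
import time

# Hypothetical stand-ins for the (pre, post, projection, number) tuples.
JOBS = [
    ("layer1", "layer1", {}, 1),
    ("layer1", "layer2", {}, 2),
    ("layer2", "layer2", {}, 3),
    ("layer2", "layer1", {}, 4),
]


def fake_connect(parameters):
    """Placeholder for topology.ConnectLayers: sleeps instead of connecting."""
    pre, post, projection, number = parameters
    time.sleep(0.05)
    return number


def time_sequential(jobs):
    """Run the jobs one after another; return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [fake_connect(job) for job in jobs]
    return results, time.perf_counter() - start


def time_pool(jobs, processes=4):
    """Run the jobs in a process pool; return (results, elapsed seconds)."""
    # Assumption: POSIX "fork" start method, so workers inherit the parent state.
    ctx = multiprocessing.get_context("fork")
    start = time.perf_counter()
    with ctx.Pool(processes=processes) as pool:
        results = pool.map(fake_connect, jobs)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for label, runner in (("sequential", time_sequential), ("pool", time_pool)):
        _, elapsed = runner(JOBS)
        print(f"{label}: {elapsed:.3f} s")
```

Note that the pool timing includes the cost of starting the worker processes, so for very short jobs the pool can come out slower than the plain loop.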
So far multiprocessing works great. The problem comes when the
"local_num_threads" parameter is changed from 1 to 2 or more (on the
cluster it could be 32): the code gets stuck in topology.ConnectLayers
without any error, and after a while I just stopped it.
I also noticed that topology.ConnectLayers spawns just one thread to
connect the layers, even when local_num_threads is set to more than one.
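A related check that may be worth running, independent of NEST: workers in a `multiprocessing.Pool` are separate processes, each with its own copy of the parent's memory, so state changed inside a worker is not visible in the parent afterwards. The sketch below shows this with a plain Python list as a hypothetical stand-in for kernel state (`CONNECTIONS`, `connect_in_worker` and `run_in_pool` are made-up names, and the "fork" start method is assumed). The analogous check in NEST would presumably be to compare the connection count reported in the parent, I believe via `nest.GetKernelStatus("num_connections")`, before and after `pool.map`.

```python
import multiprocessing

# Hypothetical stand-in for state held by the simulator kernel.
CONNECTIONS = []


def connect_in_worker(number):
    """Record a 'connection' in this process's copy of CONNECTIONS."""
    CONNECTIONS.append(number)
    # Length of the list as seen by the worker that ran this job.
    return len(CONNECTIONS)


def run_in_pool(numbers, processes=2):
    """Run the jobs in a pool; return (worker-side counts, parent-side list)."""
    # Assumption: POSIX "fork" start method, as on a Linux cluster.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=processes) as pool:
        worker_counts = pool.map(connect_in_worker, numbers)
    # The parent's list was never touched by the workers.
    return worker_counts, list(CONNECTIONS)


if __name__ == "__main__":
    counts, parent_view = run_in_pool([1, 2, 3, 4])
    print("counts seen inside workers:", counts)
    print("connections visible in parent:", parent_view)
```

If the same pattern holds for the NEST kernel, the connections built inside the workers would not exist in the process that later calls Simulate, which would also affect how the 0.9 s timing should be read.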
Any idea what is going on?
Thanks in advance
Juan Manuel
PS: The full example code is attached (60 lines of code). The
local_num_threads and multiprocessing_flag variables change the behavior
of the code.