Dear Michele,
I assume you use the hpc_benchmark.py without any modifications? On my laptop,
mpirun -np 2 python install/share/doc/nest/examples/pynest/hpc_benchmark.py
executes without problems with NEST 3.6 (and with current master).
Interestingly, for the hpc_benchmark, NEST should never even get to the else block to which the assertion on line 107 in target_table.cpp belongs, since all connections in the network are primary connections. So something seems to go wrong in the communication of information about connections to the presynaptic side.
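To illustrate the reasoning, here is a toy model in plain Python. It is a sketch, not NEST's actual C++ code: the class and field names are hypothetical and only mirror the names in the assertion message, and the number of synapse types is an arbitrary assumption. It shows how a record that arrives with corrupted fields can end up in the secondary branch and trip a bounds check of this kind.

```python
# Toy model only: names mirror the assertion message, not NEST's real code.
from dataclasses import dataclass


@dataclass
class ToyTargetData:
    source_lid: int   # local index of the presynaptic neuron
    syn_id: int       # synapse type index
    is_primary: bool  # in the real simulator this information arrives via MPI


class ToyTargetTable:
    def __init__(self, num_local_neurons, num_syn_types):
        self.targets = [[] for _ in range(num_local_neurons)]
        # One list of buffer positions per neuron and synapse type (assumed layout).
        self.secondary_send_buffer_pos = [
            [[] for _ in range(num_syn_types)] for _ in range(num_local_neurons)
        ]

    def add_target(self, data):
        lid = data.source_lid
        if data.is_primary:
            # Path expected for every connection in a purely primary network.
            self.targets[lid].append(data.syn_id)
        else:
            # Analogue of the failing bounds check from the error report.
            if not data.syn_id < len(self.secondary_send_buffer_pos[lid]):
                raise AssertionError("syn_id out of range in secondary branch")
            self.secondary_send_buffer_pos[lid][data.syn_id].append(0)


def corrupted_record_triggers_assert():
    table = ToyTargetTable(num_local_neurons=1, num_syn_types=3)
    table.add_target(ToyTargetData(0, 1, is_primary=True))  # intact record
    try:
        # Same kind of record, but with garbage fields as if damaged in transit.
        table.add_target(ToyTargetData(0, 200, is_primary=False))
    except AssertionError:
        return True
    return False
```

In this toy, the intact record takes the primary branch, while the damaged one reaches the else branch and fails the check.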
Could you try with this branch
https://github.com/heplesser/nest-simulator/tree/36_nosingle
It makes sure all MPI communication strictly happens in the OpenMP master thread (in NEST 3.6, it may happen inside OpenMP single constructs). This should not make a difference for the hpc_benchmark, since it uses a single thread by default.
Best, Hans Ekkehard
--
Prof. Dr. Hans Ekkehard Plesser
Department of Data Science
Faculty of Science and Technology
Norwegian University of Life Sciences
PO Box 5003, 1432 Aas, Norway
Phone +47 6723 1560
Email hans.ekkehard.plesser@nmbu.no
Home http://arken.nmbu.no/~plesser
From: Michele Martinelli <michele.martinelli@roma1.infn.it>
Date: Monday, 29 January 2024 at 10:35
To: users@nest-simulator.org <users@nest-simulator.org>
Subject: [NEST Users] Assert failed running hpc_benchmark
Dear NEST Users & Developers,
we're currently working on a custom OpenMPI BTL (supporting a custom FPGA-based NIC) at the National Institute for Nuclear Physics in Rome, Italy, and we get an error when running hpc_benchmark, which we currently use as a simple validation test. We run it with 2 processes (one on each of 2 hosts) using a command like:
mpirun -n 2 -H host1:1,host2:1 --bynode --report-bindings -mca btl apelink,self,sm python hpc_benchmark.py
(apelink is our custom BTL component)
but then we see this error:
python: [...]/NEST_with_local_ompi/nest-simulator-3.6/nestkernel/target_table.cpp:107: void nest::TargetTable::add_target(size_t, size_t, const nest::TargetData&): Assertion `syn_id < secondary_send_buffer_pos_[ tid ][ lid ].size()' failed.
[host:23979] *** Process received signal ***
[host:23979] Signal: Aborted (6)
[host:23979] Signal code: (-6)
My guess is that we are transferring something incorrectly (maybe during the initialization/setup phase?), but I'm not sure what the assert expects to have in secondary_send_buffer_pos_[ tid ][ lid ].size() and how this field should be set.
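One way to test the "transferring something incorrectly" guess independently of NEST is a small all-to-all integrity check. The following is only a sketch under stated assumptions: it assumes mpi4py is installed against the same Open MPI build, the helper names and block size are made up for illustration, and the choice of an all-to-all exchange is an assumption about which communication pattern is worth exercising.

```python
# Hypothetical standalone check; mpi4py is needed only when run under mpirun.
import os
from array import array

WORDS_PER_PEER = 1024  # arbitrary block size chosen for this sketch


def make_block(sender, receiver, n=WORDS_PER_PEER):
    """Deterministic 32-bit pattern encoding sender, receiver and position."""
    return array(
        "I",
        (((sender & 0xFF) << 24) | ((receiver & 0xFF) << 16) | (i & 0xFFFF)
         for i in range(n)),
    )


def find_mismatches(recvbuf, my_rank, size, n=WORDS_PER_PEER):
    """Return (sender, index) pairs whose received word differs from the pattern."""
    bad = []
    for sender in range(size):
        expected = make_block(sender, my_rank, n)
        got = recvbuf[sender * n:(sender + 1) * n]
        bad.extend((sender, i) for i in range(n) if got[i] != expected[i])
    return bad


def main():
    from mpi4py import MPI  # imported here so the helpers work without MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    sendbuf = array("I")
    for receiver in range(size):
        sendbuf.extend(make_block(rank, receiver))
    recvbuf = array("I", [0]) * (size * WORDS_PER_PEER)
    # Every rank sends one block to every rank, including itself.
    comm.Alltoall([sendbuf, MPI.UNSIGNED], [recvbuf, MPI.UNSIGNED])
    bad = find_mismatches(recvbuf, rank, size)
    print(f"rank {rank}: {len(bad)} corrupted words", bad[:5])


# Open MPI's mpirun sets this variable; the exchange runs only when it is present.
if "OMPI_COMM_WORLD_SIZE" in os.environ:
    main()
```

Launched with the same mpirun options as the benchmark, each rank should report 0 corrupted words if the data arrives intact. Any reported (sender, index) pairs would point at the transfer path rather than at NEST.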
Best,
Michele