What's the timeline for this? Is there any open-source discussion or
proposal for the implementation that I could read, to understand how the
problem is being tackled and perhaps to see whether I could propose a
low-effort interface decoupled from SONATA on disk? If there is something
reading data from disk, I might as well stream my data to it directly.
On Fri, 7 Oct 2022, 16:09 Hans Ekkehard Plesser, <hans.ekkehard.plesser(a)nmbu.no> wrote:
Hi Robin,
We are currently working with the Allen Institute to develop an efficient
reader for large SONATA network specifications. We assume here that all
connectivity is collected in HDF5 files, and we expect significant performance
gains if the data is sorted by target neuron and the SONATA files provide
"indices" tables. Would this help you?
Dividing network data according to a specific compute-node configuration
seems rather restrictive to me, and right now there is no way to read such
data per process.
Best,
Hans Ekkehard
--
Prof. Dr. Hans Ekkehard Plesser
Head, Department of Data Science
Faculty of Science and Technology
Norwegian University of Life Sciences
PO Box 5003, 1432 Aas, Norway
Phone +47 6723 1560
Email hans.ekkehard.plesser(a)nmbu.no
Home: http://arken.nmbu.no/~plesser
From: Robin Gilbert De Schepper <robingilbert.deschepper(a)unipv.it>
Reply to: NEST User Mailing List <users(a)nest-simulator.org>
Date: Friday, 7 October 2022 at 15:41
To: NEST User Mailing List <users(a)nest-simulator.org>
Subject: [NEST Users] Fastest way to transfer dense connectivity data in a parallel simulation?
Hi all!
In the world of biophysically detailed modelling, it is commonplace for the
connectome to be generated by algorithms that specify connections as dense
tabular data, with each row describing a synaptic location on a cell pair
(SONATA, for example).
A) In NEST I can't really find a way to fit this data into any of the
connection rules: I want to specify pairwise connections from a multiset A to
a multiset B (see the sketch after these questions).
- Is this possible with `pairwise_bernoulli`, or do the inputs have to be
strict sets?
- The probability step is superfluous here; can it be skipped?
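For illustration, what I would like to write is roughly the following (a
sketch assuming NEST 3's array-based `Connect` with the `one_to_one` rule
accepts repeated node IDs; the table values are made up and the neurons are
assumed to have been created already):

    import numpy as np
    import nest

    # Made-up dense connection table: one row per synapse (node IDs may repeat).
    pre_ids = np.array([1, 1, 2, 5, 5, 5])
    post_ids = np.array([7, 8, 7, 9, 9, 10])
    weights = np.array([0.5, 1.2, 0.8, 0.3, 0.3, 1.0])
    delays = np.full(len(pre_ids), 1.0)

    # Element-wise connections: row i connects pre_ids[i] to post_ids[i],
    # with per-row weight and delay.
    nest.Connect(pre_ids, post_ids,
                 conn_spec="one_to_one",
                 syn_spec={"synapse_model": "static_synapse",
                           "weight": weights,
                           "delay": delays})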
B) Then there's the fact that NEST parallelizes transparently, but since this
data was generated in parallel by tiling the biological volume, I already have
neatly fragmented data available on each node of the distributed cluster. It
would be a waste to communicate all the data to every node only for NEST to
redistribute it another way; the data is too big to allgather and fit into the
memory of any single node. Not only would this be a lot of overhead to
implement, but NEST would throw away all but `1 / Nnodes` of the data on each
node again, leaving me with a reshuffled version of my starting data.
Is there a way to bypass the transparency and imperatively declare the cells
and connections on each machine? A hypothetical sketch of what I mean follows
below.
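What I am picturing is something like this on every rank (purely hypothetical:
the per-rank file layout is invented, and it assumes connections whose target
neuron is not local to the calling process would simply be skipped rather than
redistributed):

    import h5py
    import nest

    rank = nest.Rank()                        # MPI rank of this process
    shard = f"connectome_rank{rank:04d}.h5"   # hypothetical per-rank shard file

    # Hypothetical shard layout: flat per-synapse columns.
    with h5py.File(shard, "r") as f:
        pre = f["source_node_id"][:]
        post = f["target_node_id"][:]
        weight = f["weight"][:]
        delay = f["delay"][:]

    # Assumption: rows whose target neuron does not live on this process
    # would need to be ignored here instead of being sent to other ranks.
    nest.Connect(pre, post,
                 conn_spec="one_to_one",
                 syn_spec={"synapse_model": "static_synapse",
                           "weight": weight,
                           "delay": delay})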
--
Robin De Schepper, MSc (they/them)
Department of Brain and Behavioral Sciences
Unit of Neurophysiology
University of Pavia, Italy
Via Forlanini 6, 27100 Pavia - Italy
Tel: (+39) 038298-7607
http://www-5.unipv.it/dangelo/
Interested in large scale network modelling?
Discover our framework <https://bsb.readthedocs.io/en/latest/>:
<https://github.com/dbbs-lab/bsb>
_______________________________________________
NEST Users mailing list -- users(a)nest-simulator.org
To unsubscribe send an email to users-leave(a)nest-simulator.org