Hello, dear NEST team!
I am trying to launch an experiment using nest-server-mpi and NRP on an
HPC system. We also use a module generated by NESTML. I have run into an
issue that I have not been able to solve myself so far. Perhaps you could
help me.
We have a working experiment for nest-server v3.7, which also works
with v3.8.
I tried to launch nest-simulator in a container on the HPC, but I ran into
some MPI incompatibility issues, could not solve them quickly, and gave up
on that approach for now.
Then I installed nest-simulator 3.7 from source on the HPC. The simulation
with nest-server also worked there. After patching nest-server-mpi
(serialization issue), I could launch it on the HPC and communicate with
the API, but the experiment didn't work, so I switched to the local
nest-simulator container for testing.
Locally, in the original nest/nest-simulator:3.7 and 3.8 containers,
nest-server-mpi does not start. After fixing the issues (with docopt and
serialization), nest-server-mpi launches in the nest/nest-simulator:dev
container. But here is the problem: with this version the experiment
crashes with nest-server as well as with nest-server-mpi, and I cannot
figure out the reason from the logs and git history...
I also tried patching versions 3.7 and 3.8. With the patches I can start
nest-server-mpi and use the API, but the experiment itself doesn't work.
So, in summary:
The experiment works with nest-server 3.7 and 3.8, but doesn't work with
the patched nest-server-mpi.
The experiment doesn't work with nest-server in the :dev container.
This is what happens in the latest container with nest-server:
nrp-nest-simulator |
> nrp-nest-simulator | Apr 03 16:11:15 SimulationManager::set_status [Info]:
> nrp-nest-simulator | Temporal resolution changed from 0.1 to 0.1 ms.
> nrp-nest-simulator |
> nrp-nest-simulator | Apr 03 16:11:15 Install [Info]:
> nrp-nest-simulator | loaded module controller_module
And that's it. On the other side I get:
[json.exception.out_of_range.403] key 'data' not found
The NEST script being executed at this point is:
> nest.ResetKernel()
> nest.SetKernelStatus({"resolution": res})
> nest.Install("controller_module")
> for i in range(njt):
>     planner_p = nest.Create("tracking_neuron_nestml", n=N, params={
>         "kp": plan_params["kp"], "base_rate": plan_params["base_rate"],
>         "pos": True, "traj": trj, "simulation_steps": len(trj)})
>     planner_n = nest.Create("tracking_neuron_nestml", n=N, params={
>         "kp": plan_params["kp"], "base_rate": plan_params["base_rate"],
>         "pos": False, "traj": trj, "simulation_steps": len(trj)})
The same piece of code works in nest-server v3.7 and v3.8.
With the patched nest-server-mpi v3.8 or v3.7, it at least throws an
exception:
nrp-nest-simulator | ========================================
> nrp-nest-simulator |
> nrp-nest-simulator | ==> MASTER 0/1743700791.3478301 (route_api_call):
> call=GetStatus, args=[[303]], kwargs={}
> nrp-nest-simulator | ==> MASTER 0/1743700791.3478405 (GetStatus): sending
> call bcast
> nrp-nest-simulator | ==> MASTER 0/1743700791.3478618 (GetStatus): sending
> data bcast, data=([[303]], {})
> nrp-nest-simulator | ==> MASTER 0/1743700791.3480251 (GetStatus): local
> call, args=[NodeCollection(metadata=None, model=spike_recorder, size=1,
> first=303)], kwargs={}
> nrp-nest-simulator | ==> MASTER 0/1743700791.3480933 (GetStatus): waiting
> for response gather
> nrp-nest-simulator | ==> MASTER 0/1743700791.3484681 (GetStatus):
> received response gather, data=[({'element_type': 'recorder', 'events':
> {'senders': array([204]), 'times': array([13.5])}, 'frozen': False,
> 'global_id': 303, 'label': 'Brain stem pos', 'local': True, 'model':
> 'spike_recorder', 'model_id': 96, 'n_events': 1, 'node_uses_wfr': False,
> 'origin': 0.0, 'record_to': 'memory', 'start': 0.0, 'stop':
> 1.7976931348623157e+308, 'thread': 0, 'thread_local_id': 152,
> 'time_in_steps': False, 'vp': 0},), [{'element_type': 'recorder', 'events':
> {'senders': [], 'times': []}, 'frozen': False, 'global_id': 303, 'label':
> 'Brain stem pos', 'local': True, 'model': 'spike_recorder', 'model_id': 96,
> 'n_events': 0, 'node_uses_wfr': False, 'origin': 0.0, 'record_to':
> 'memory', 'start': 0.0, 'stop': 1.7976931348623157e+308, 'thread': 0,
> 'thread_local_id': 152, 'time_in_steps': False, 'vp': 1}]]
> nrp-nest-simulator | [2025-04-03 19:19:51,348] ERROR in app: Exception on
> /api/GetStatus [POST]
> nrp-nest-simulator | Traceback (most recent call last):
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask/app.py", line 2070, in wsgi_app
> nrp-nest-simulator | response = self.full_dispatch_request()
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask/app.py", line 1515, in
> full_dispatch_request
> nrp-nest-simulator | rv = self.handle_user_exception(e)
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask_cors/extension.py", line 165, in
> wrapped_function
> nrp-nest-simulator | return
> cors_after_request(app.make_response(f(*args, **kwargs)))
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask/app.py", line 1513, in
> full_dispatch_request
> nrp-nest-simulator | rv = self.dispatch_request()
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask/app.py", line 1499, in
> dispatch_request
> nrp-nest-simulator | return
> self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
> nrp-nest-simulator | File
> "/opt/nest/lib/python3.10/site-packages/nest/server/hl_api_server.py", line
> 317, in route_api_call
> nrp-nest-simulator | return jsonify(response)
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask/json/__init__.py", line 348, in
> jsonify
> nrp-nest-simulator | f"{dumps(data, indent=indent,
> separators=separators)}\n",
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask/json/__init__.py", line 129, in dumps
> nrp-nest-simulator | rv = _json.dumps(obj, **kwargs)
> nrp-nest-simulator | File "/usr/lib/python3.10/json/__init__.py", line
> 238, in dumps
> nrp-nest-simulator | **kw).encode(obj)
> nrp-nest-simulator | File "/usr/lib/python3.10/json/encoder.py", line
> 199, in encode
> nrp-nest-simulator | chunks = self.iterencode(o, _one_shot=True)
> nrp-nest-simulator | File "/usr/lib/python3.10/json/encoder.py", line
> 257, in iterencode
> nrp-nest-simulator | return _iterencode(o, 0)
> nrp-nest-simulator | File
> "/usr/lib/python3/dist-packages/flask/json/__init__.py", line 56, in default
> nrp-nest-simulator | return super().default(o)
> nrp-nest-simulator | File "/usr/lib/python3.10/json/encoder.py", line
> 179, in default
> nrp-nest-simulator | raise TypeError(f'Object of type
> {o.__class__.__name__} '
> nrp-nest-simulator | TypeError: Object of type int64 is not JSON
> serializable
> nrp-nest-simulator | [2025-04-03 19:19:51,349] INFO in _internal:
> 172.18.0.3 - - [03/Apr/2025 19:19:51] "POST /api/GetStatus HTTP/1.1" 500 -
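For reference, the final TypeError in this traceback can be reproduced without NEST. The sketch below assumes numpy is installed and that the int64 is a numpy scalar somewhere in the gathered response. Which field actually carries it is not visible in the log, so the `status` dictionary is only a hypothetical stand-in, and `numpy_default` is an illustrative helper, not part of nest-server. It shows the standard `default=` hook of `json.dumps` converting numpy values to plain Python types:

```python
import json

import numpy as np


def numpy_default(obj):
    """Fallback for json.dumps: turn numpy scalars and arrays into plain Python."""
    if isinstance(obj, (np.generic, np.ndarray)):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Hypothetical stand-in for one rank's GetStatus reply; the values are copied
# from the log, but the numpy types are an assumption.
status = {
    "global_id": np.int64(303),
    "n_events": np.int64(1),
    "events": {"senders": np.array([204]), "times": np.array([13.5])},
}

# Without the hook, the standard encoder rejects the numpy integer.
try:
    json.dumps(status)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With the hook, every numpy value is converted before encoding.
encoded = json.dumps(status, default=numpy_default)

print(error_message)
print(encoded)
```

If the patched server converts only part of the response (for example the replies from worker ranks but not the master's own reply), a hook like this applied at the `jsonify` step would be one place to look. That is a guess from the log, not a confirmed diagnosis.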
Do you have any clues where I should dig? I would appreciate any help.
Best wishes, Viktor