Spiking Neural Networks vs. State-Space Models

Spiking Neural Networks (SNNs) and State-Space Models (SSMs) are often grouped together as “the alternatives to attention” — but they come from different traditions, encode different assumptions about time, and behave very differently on real hardware. This is a short, opinionated comparison from the perspective of building a production neuromorphic LLM.

What they actually are

An SNN models a neuron as a leaky integrate-and-fire (LIF) unit: the membrane potential accumulates weighted inputs and leaks over time, and the unit emits a binary spike and resets when the potential crosses a threshold. Information lives in spike timing, not in real-valued activations. The model is event-driven by construction: on hardware that exploits this, silent neurons consume essentially no compute.
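To make the leaky integrate-and-fire behaviour concrete, here is a minimal discrete-time sketch in plain Python. The function name `lif_run` and the parameters `leak`, `threshold` and `v_reset` are illustrative assumptions, not part of any SDK mentioned in this article.

```python
def lif_run(inputs, leak=0.9, threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    All parameter names and default values are illustrative assumptions.
    """
    v = 0.0                      # membrane potential
    spikes = []
    for x in inputs:
        v = leak * v + x         # leak the old potential, then integrate the input
        if v >= threshold:       # threshold crossed: emit a spike and reset
            spikes.append(1)
            v = v_reset
        else:                    # below threshold: stay silent
            spikes.append(0)
    return spikes


spikes = lif_run([0.6, 0.6, 0.0, 0.0, 0.3, 0.9])
print(spikes)  # [0, 1, 0, 0, 0, 1]
```

The output is a binary train: what the neuron communicates is when it fires, not how large its potential was.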

An SSM (S4, S5, Mamba and successors) models a sequence as a continuous-time linear dynamical system with learned state matrices, discretised for digital inference. It produces real-valued outputs and scales linearly in sequence length, and the stability of its recurrence is well understood from classical control theory, which helps keep training well-behaved.
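The discretise-then-recur idea can be sketched for a diagonal SSM, where each state channel evolves independently. This sketch assumes a zero-order-hold discretisation (one common choice) and a single scalar input per step. The names `discretize_zoh` and `ssm_scan` are hypothetical and do not come from any particular library.

```python
import math


def discretize_zoh(a, b, dt):
    """Zero-order-hold discretisation of the scalar system x' = a*x + b*u (a != 0)."""
    a_bar = math.exp(a * dt)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar


def ssm_scan(u, a, b, c, dt):
    """Run a diagonal SSM over the scalar sequence u.

    a, b, c are per-channel lists (illustrative parameterisation).
    One fixed-cost update per token, so total cost is linear in len(u).
    """
    params = [discretize_zoh(a_i, b_i, dt) for a_i, b_i in zip(a, b)]
    x = [0.0] * len(a)           # hidden state, one value per channel
    ys = []
    for u_k in u:
        # x_k = A_bar * x_{k-1} + B_bar * u_k, applied channel by channel
        x = [a_bar * x_i + b_bar * u_k for (a_bar, b_bar), x_i in zip(params, x)]
        # y_k = C * x_k, a real-valued readout
        ys.append(sum(c_i * x_i for c_i, x_i in zip(c, x)))
    return ys


ys = ssm_scan([1.0] * 10, a=[-1.0], b=[1.0], c=[1.0], dt=0.1)
```

With a constant input of 1 and `a = -1`, the state follows `1 - exp(-t)`, which is the exact step response of the continuous system sampled at `t = k * dt`.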

Side-by-side

Dimension	SNN	SSM
Signal	Binary spikes (event)	Real-valued state (continuous)
Native task	Sensor / temporal coding	Long-context sequence modeling
Training	Surrogate gradients; non-trivial	Backprop; well-behaved
Per-token cost	Proportional to spike activity	Constant per token (total cost linear in length)
Hardware fit	Neuromorphic chips (Loihi, TrueNorth); sparse GPU	Dense GPU / TPU; well-supported in CUDA
Best at	Sub-watt sensing, latency-critical edge	Long context, text, audio, foundation models

Why combine them

The honest answer is that neither is the whole solution for an edge LLM on Jetson:

A pure-SNN LLM today loses too much accuracy on text — the discrete spike code throws away information that matters for language.
A pure-SSM LLM is efficient but does the same work on every token, even when the input is uninformative. Skipping that work is the property that makes edge deployment energy-viable.

The Neuratron architecture inside Caroline takes the SSM backbone for sequence modeling (so we keep accuracy and long context), then layers SNN-style sparse activation and event gating on top (so silent inputs do no work). That is what we mean when we call Caroline a neuromorphic LLM rather than just “an LLM with sparsity”.
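As a rough illustration of event gating on top of an SSM recurrence, the sketch below updates a scalar state only when the input magnitude exceeds a threshold, and otherwise holds the previous state and output. This is a toy under stated assumptions, not the Neuratron implementation: the gate rule, the name `gated_ssm_scan` and the parameter `eps` are all hypothetical. Holding the state on silent steps is itself an approximation, because an ungated SSM would still decay the state on those steps.

```python
def gated_ssm_scan(u, a_bar, b_bar, c, eps=1e-3):
    """Scalar SSM recurrence with a hypothetical magnitude-based event gate.

    Returns the outputs and the number of state updates actually performed.
    """
    x, y = 0.0, 0.0
    ys = []
    updates = 0
    for u_k in u:
        if abs(u_k) >= eps:              # event: the input carries signal
            x = a_bar * x + b_bar * u_k  # ordinary discretised SSM update
            y = c * x
            updates += 1
        # silent input: state and output are held, no update arithmetic runs
        ys.append(y)
    return ys, updates


ys, updates = gated_ssm_scan([1.0, 0.0, 0.0, 0.5, 0.0], a_bar=0.9, b_bar=0.1, c=2.0)
print(updates)  # 2 of 5 steps did any work
```

The work done is proportional to the number of events, not the sequence length, which is the SNN-style property the combination aims for.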
