Recent papers related to spiking neurons and LLMs
In this article we briefly discuss recent work that combines large language models (LLMs) with spiking neural networks (SNNs).
BrainGPT (2024)
SNN-based LLM with over 100M parameters; dual-model architecture using ANN-to-SNN conversion plus STDP-based fine-tuning.
Achieved full parity with the ANN model on language tasks, with roughly 33% less energy and 66% faster training convergence than the baseline. Demonstrates lossless conversion of a Transformer to spiking form.
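To make the ANN-to-SNN conversion idea concrete, here is a minimal sketch (not BrainGPT's actual implementation) of rate-based conversion: a ReLU activation is replaced by an integrate-and-fire neuron whose firing rate over many timesteps approximates the original activation value.

```python
def if_neuron_rate(x, threshold=1.0, timesteps=100):
    """Integrate-and-fire neuron: over many timesteps its firing rate
    approximates ReLU(x) clipped at the threshold (rate coding)."""
    v = 0.0
    spikes = 0
    for _ in range(timesteps):
        v += x                  # integrate a constant input current
        if v >= threshold:      # fire, then reset by subtraction
            spikes += 1
            v -= threshold
    return spikes / timesteps   # firing rate in [0, 1]

# The rate approximates the ANN activation ReLU(x) / threshold:
print(if_neuron_rate(0.37))    # ~0.37
print(if_neuron_rate(-0.5))    # 0.0 (negative input never fires)
```

The "lossless" claim in such conversions typically hinges on using enough timesteps that this rate approximation error becomes negligible.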
BrainTransformers-3B (2024)
A 3-billion-parameter Transformer implemented with spiking neurons (spike-based MatMul, Softmax, SiLU, etc.).
Matched the performance of similarly sized ANN LLMs on benchmarks (63% MMLU, among others) while operating in a spike-efficient manner. Opens the door to neuromorphic large language models for NLP.
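The appeal of spike-based MatMul can be sketched in a few lines: when one operand is a binary spike matrix, the product needs no multiplications at all, only accumulation of selected weight rows (this is a generic illustration, not the paper's kernel).

```python
import numpy as np

def spike_matmul(spikes, weights):
    """Matrix multiply where the left operand is binary spikes (0/1).
    Each output row is just the sum of the weight rows selected by
    that row's active spikes -- pure addition, no multiplications,
    which is the energy win on neuromorphic hardware."""
    out = np.zeros((spikes.shape[0], weights.shape[1]))
    for i, row in enumerate(spikes):
        active = np.nonzero(row)[0]           # indices of spiking neurons
        out[i] = weights[active].sum(axis=0)  # accumulate selected rows
    return out

spikes = np.array([[1, 0, 1],
                   [0, 1, 0]])
W = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])
print(spike_matmul(spikes, W))   # equals spikes @ W
```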
SpikeLLM (2024)
Spiking-driven quantization for LLMs (applied to LLaMA-2); uses integrate-and-fire neurons to identify important channels and prune or quantize the rest.
Reduced perplexity by 25% and improved accuracy by more than 3% in a 4-bit quantized LLaMA-7B model. Enables large models to run at lower precision and energy via bio-inspired quantization.
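A hypothetical sketch of how integrate-and-fire dynamics can flag salient channels (the function name and details here are illustrative, not SpikeLLM's actual algorithm): channels whose activations carry larger magnitudes drive their IF neuron to fire more often, and the resulting firing rates can rank channels for precision allocation.

```python
import numpy as np

def channel_saliency(acts, threshold=1.0):
    """Illustrative spike-based saliency: each channel's activations
    drive an integrate-and-fire neuron; channels that fire more often
    carry larger magnitudes and would be kept at higher precision,
    while low-rate channels can be quantized aggressively."""
    n_tokens, n_channels = acts.shape
    v = np.zeros(n_channels)
    counts = np.zeros(n_channels)
    for t in range(n_tokens):
        v += np.abs(acts[t])        # integrate activation magnitude
        fired = v >= threshold
        counts += fired
        v[fired] -= threshold       # reset by subtraction
    return counts / n_tokens        # per-channel firing rate

acts = np.array([[0.1, 2.0], [0.2, 1.5], [0.1, 2.5]])
rates = channel_saliency(acts)
# channel 1 fires far more than channel 0 -> treat it as salient
```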
SpikeGPT (2023)
Fully spiking generative language model; replaces Transformer attention with a spiking RNN (RWKV) architecture, trained on text.
First demonstration of a spiking LLM that can handle language generation (Enwik8, WikiText) with purely spiking dynamics. Validated that SNNs can perform autoregressive text generation.
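The core efficiency argument for replacing attention with a spiking recurrence can be illustrated with a toy leaky integrate-and-fire layer (a sketch, not SpikeGPT's RWKV-based block): the state is a single membrane vector, so each new token costs O(d) rather than growing quadratically with sequence length.

```python
import numpy as np

def lif_sequence(inputs, decay=0.9, threshold=1.0):
    """Leaky integrate-and-fire recurrence over a token sequence.
    State is one membrane-potential vector, so processing each token
    is constant cost -- the motivation for replacing quadratic
    self-attention with an RNN-style spiking recurrence."""
    v = np.zeros(inputs.shape[1])
    spikes_out = []
    for x in inputs:                      # one step per token
        v = decay * v + x                 # leaky integration
        s = (v >= threshold).astype(float)
        v = v - s * threshold             # soft reset where fired
        spikes_out.append(s)
    return np.array(spikes_out)           # binary spike train, shape (T, d)

seq = np.array([[0.6, 1.2], [0.6, 0.1], [0.6, 0.1]])
print(lif_sequence(seq))
```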
SpikeBERT (2023)
Spiking version of BERT for language understanding; Transformer-based SNN trained via two-stage distillation from a pretrained BERT.
Achieved near-ANN performance on NLP tasks (sentiment classification, NLI) using an SNN. Showed that knowledge from a large Transformer can be transferred to a spiking network, greatly closing the accuracy gap.
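The distillation idea can be sketched with a generic knowledge-distillation objective (a standard formulation, not necessarily SpikeBERT's exact two-stage loss): the student matches the teacher's temperature-softened output distribution while also fitting the hard labels.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label,
                      T=2.0, alpha=0.5):
    """Generic knowledge-distillation loss: KL divergence between
    temperature-softened teacher and student distributions, mixed
    with cross-entropy against the hard label."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)))   # soft targets
    ce = -np.log(softmax(student_logits)[label])     # hard label
    return alpha * kl + (1 - alpha) * ce

same = distillation_loss(np.array([2., 0.]), np.array([2., 0.]), 0)
off  = distillation_loss(np.array([0., 2.]), np.array([2., 0.]), 0)
# a student that disagrees with the teacher incurs a larger loss
```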
Meta-SpikeFormer (2024)
Generalized spiking Vision Transformer architecture with spike-based self-attention; tested on vision tasks (ImageNet, COCO detection, etc.).
Reached 80.0% top-1 ImageNet accuracy (an SNN record) and outperformed all prior CNN-based SNNs. First spiking model to support classification, detection, and segmentation in one network, guiding next-generation neuromorphic chip designs.
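Spike-based self-attention, as used in this line of work, can be sketched as follows (a simplified illustration, not the paper's exact operator): with binary Q, K, V, the attention map reduces to integer spike-coincidence counts and the softmax is dropped entirely, since spike counts are already non-negative.

```python
import numpy as np

def spiking_self_attention(Q, K, V, scale=0.25):
    """Spike-based self-attention sketch: Q, K, V are binary spike
    matrices, so QK^T and the weighting of V reduce to integer
    accumulation; no softmax is needed because spike counts are
    non-negative. The result is re-binarized by thresholding."""
    attn = Q @ K.T                       # spike-coincidence counts
    out = (attn @ V) * scale
    return (out >= 1.0).astype(float)    # re-spike via thresholding

rng = np.random.default_rng(0)
Q = (rng.random((4, 8)) < 0.3).astype(float)
K = (rng.random((4, 8)) < 0.3).astype(float)
V = (rng.random((4, 8)) < 0.3).astype(float)
out = spiking_self_attention(Q, K, V)
print(out)   # binary output spikes, shape (4, 8)
```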
Neuro-LIFT (2025)
LLM + SNN hybrid framework for drone navigation; LLM parses human commands, SNN with event-camera handles vision and control.
Demonstrated real-time autonomous flight through obstacles via natural-language instructions. The spiking vision module enabled low-latency response, achieving tasks that frame-based vision could not within the same power budget.
Loihi RL Controller (2023)
Spiking neural network policy trained with reinforcement learning and deployed on Loihi 2 neuromorphic chip to control a robot arm.
Achieved ~100× lower energy than equivalent CPU control, with on-par latency and precision in a force-control task. Validates the efficiency of neuromorphic hardware for real-world robotic control, highlighting SNN advantages in embodied agents.
ChatGPT-Generated SNN Design (2024)
Using a large language model (ChatGPT) to automate neuromorphic design (natural language to Verilog for spiking circuits).
Showcased that LLMs can aid hardware design by generating synthesizable HDL code for SNNs. Marks an initial step towards AI-assisted development of spiking neural hardware and architectures.