Edge AI — Memristor Edge Learning & Energy-Efficient Attention — A Curated Roundup

This page summarizes key research directions in Edge Computing, Neuro-Inspired Memristor Edge Learning, and Energy-Efficient Attention Mechanisms. Each section lists representative publications with short descriptions — a quick reference for exploring energy-aware and hardware-integrated AI systems.

Edge Computing & Edge AI — Surveys and Foundations

Training Machine Learning Models at the Edge: A Survey — arXiv
A broad overview of edge-side learning and continual adaptation, covering communication, hardware, and optimization trade-offs for training directly on devices.
Federated Learning in Edge Computing: A Systematic Survey — Sensors Journal
Examines federated learning frameworks designed for bandwidth-limited edge networks, emphasizing privacy, synchronization, and decentralized optimization.
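
To make the aggregation step concrete, below is a minimal numpy sketch of federated averaging (FedAvg), the baseline most of these frameworks extend. The client count, model size, and size-weighted scheme are illustrative assumptions, not details from any one survey.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate locally trained weights, weighted by local dataset size."""
    total = sum(client_sizes)
    # Clients with more local data contribute proportionally more.
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Toy round: three edge devices share a 4-parameter model (hypothetical shapes).
rng = np.random.default_rng(0)
clients = [rng.normal(size=4) for _ in range(3)]  # weights after local training
sizes = [100, 250, 50]                            # local sample counts
print(fedavg(clients, sizes))
```
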
Edge Artificial Intelligence: A Systematic Review — arXiv
Classifies edge AI systems by hardware type, data locality, and workload, offering an up-to-date taxonomy of architectures for embedded and real-time AI.

Edge Learning with Fully Integrated Neuro-Inspired Memristor Chips

Edge Learning Using a Fully Integrated Neuro-Inspired Memristor Chip — Science
Demonstrates a neuromorphic memristor chip capable of on-chip learning and inference. Combines analog crossbar computation and adaptive weight storage for ultra-low-power training directly at the edge.
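
As a rough software model of what such a chip computes physically (read voltages drive the rows, weights are stored as conductances, and each column sums current by Kirchhoff's law), the numpy sketch below is illustrative only; the array size and conductance range are assumptions, not the chip's specifications.

```python
import numpy as np

def crossbar_mvm(voltages, conductances):
    """Ideal memristor crossbar: column currents I = G^T @ V (Ohm + Kirchhoff)."""
    return conductances.T @ voltages

rng = np.random.default_rng(1)
G = rng.uniform(1e-6, 1e-4, size=(8, 4))  # conductances in siemens (8 rows, 4 cols)
v = rng.uniform(0.0, 0.2, size=8)         # read voltages on the 8 row lines
print(crossbar_mvm(v, G))                 # one analog step yields the full product
```
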
A Fully-Integrated Memristor Chip for Edge Learning — Springer
Describes the circuit and materials design of a memristor array optimized for learning tasks, integrating local update rules and low-voltage synaptic operations.
On-Chip Learning with Memristor-Based Neural Networks — arXiv
Explores algorithms that tolerate device variability and noise in memristor arrays, enabling stable local learning and efficient analog weight updates.
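
A minimal sketch of one variability-tolerant recipe from this literature: a sign-based update (in the spirit of Manhattan-rule training) with multiplicative write noise. The step size, noise level, and toy objective are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_sign_update(weights, grad, step=0.01, write_noise=0.2):
    """Move each device one fixed step against the gradient sign, perturbed by
    multiplicative programming noise (a common memristor write model)."""
    pulse = -step * np.sign(grad)
    return weights + pulse * (1.0 + write_noise * rng.normal(size=weights.shape))

# Toy quadratic objective: pull weights toward a target despite write noise.
target = np.array([0.5, -0.3, 0.8])
w = np.zeros(3)
for _ in range(200):
    w = noisy_sign_update(w, 2 * (w - target))
print(w)  # lands near the target even with ~20% write variability
```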

Energy-Efficient Attention for Edge AI

EcoFormer: Energy-Saving Attention with Linear Complexity — NeurIPS / arXiv
Introduces an energy-optimized attention mechanism that uses binary hashing and kernel approximations to achieve linear complexity and lower compute cost with little accuracy loss.
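
The sketch below shows the kernel trick this line of work builds on rather than EcoFormer's binary hashing itself: rewrite softmax(QK^T)V as phi(Q) (phi(K)^T V), so cost grows linearly in sequence length. The ELU-plus-one feature map and the shapes are assumptions.

```python
import numpy as np

def phi(x):
    """Simple positive feature map (ELU + 1); a stand-in for a learned kernel."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """O(n*d^2) attention: softmax(QK^T)V approximated as phi(Q) [phi(K)^T V]."""
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                # d x d summary, built once
    z = Qp @ Kp.sum(axis=0)      # per-query normalizer
    return (Qp @ kv) / z[:, None]

rng = np.random.default_rng(3)
n, d = 512, 32
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (512, 32), no n x n score matrix built
```
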
ITA: An Energy-Efficient Attention and Softmax Accelerator — arXiv
Presents a hardware design for streaming integer attention computations, reducing memory access and improving inference latency on low-power devices.
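
Not ITA's exact datapath, but a sketch of the integer-only softmax style such accelerators rely on: subtract the running max, replace e^x with 2^x read from a small fixed-point lookup table, and normalize with integer arithmetic. The table depth and Q8 output format are assumptions.

```python
import numpy as np

def int_softmax(logits_q, frac_bits=8):
    """Integer-only softmax sketch: max-subtraction keeps exponents <= 0, a
    fixed-point 2^x table replaces exp, and division stays in integer math."""
    x = logits_q - logits_q.max()  # integer logits, all <= 0 after subtraction
    # Table of round(2^(-i) * 2^frac_bits) for i = 0..15; deeper entries are ~0.
    table = np.round(2.0 ** -np.arange(16) * (1 << frac_bits)).astype(np.int64)
    e = table[np.clip(-x, 0, 15)]
    return (e * (1 << frac_bits)) // e.sum()  # probabilities in Q8 fixed point

logits = np.array([3, 1, 0, -2], dtype=np.int64)  # quantized attention scores
p = int_softmax(logits)
print(p, p.sum())  # entries sum to ~256, i.e. ~1.0 in Q8
```
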
Analog In-Memory Attention Architectures for Low-Energy AI — Preprint
Describes analog in-memory implementations of transformer attention that minimize off-chip data movement and exploit resistive memory arrays for energy-proportional compute.
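
Below is a functional model of that dataflow, with the two resistive arrays replaced by plain matrix products: keys and values stay resident as programmed conductances while queries stream through, so no key or value ever leaves the array. The shapes are arbitrary, and the float softmax stands in for peripheral circuitry.

```python
import numpy as np

def in_memory_attention(Q, K, V):
    """Two-crossbar attention sketch: K and V are stationary operands."""
    scores = Q @ K.T                             # crossbar 1, programmed with K
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)           # softmax in peripheral circuits
    return w @ V                                 # crossbar 2, programmed with V

rng = np.random.default_rng(4)
Q, K, V = (rng.normal(size=(16, 8)) for _ in range(3))
print(in_memory_attention(Q, K, V).shape)  # (16, 8)
```
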
Efficient Transformers: A Survey — ACM
Summarizes efficient transformer architectures — including sparse, low-rank, and kernelized attention — providing key context for lightweight attention in embedded systems.
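
As one concrete member of the sparse family the survey covers, here is a minimal sliding-window attention in numpy; the window size and scaling are illustrative choices.

```python
import numpy as np

def windowed_attention(Q, K, V, w=4):
    """Sparse attention sketch: each token attends only to w neighbors on each
    side, cutting cost from O(n^2) to O(n*w)."""
    n, d = Q.shape
    out = np.empty_like(V)
    for i in range(n):
        lo, hi = max(0, i - w), min(n, i + w + 1)
        s = Q[i] @ K[lo:hi].T / np.sqrt(d)
        a = np.exp(s - s.max())
        out[i] = (a / a.sum()) @ V[lo:hi]
    return out

rng = np.random.default_rng(5)
Q, K, V = (rng.normal(size=(64, 16)) for _ in range(3))
print(windowed_attention(Q, K, V).shape)  # (64, 16)
```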

Cross-Layer and System Co-Design References

Energy-Efficient AI on the Edge — Springer (Book Chapter)
Discusses practical techniques like model pruning, quantization, and adaptive runtime scheduling for optimizing AI deployments on microcontrollers and edge NPUs.
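
A minimal sketch of the symmetric int8 post-training quantization such chapters describe; the tensor size and the single per-tensor scale (rather than per-channel scales) are simplifying assumptions.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization: int8 weights plus one float scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(6)
w = rng.normal(scale=0.1, size=1000).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - q.astype(np.float32) * s).max()
print(f"max reconstruction error: {err:.5f} (scale = {s:.5f})")
```
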
Energy-Efficient Approximate Edge Inference Systems — ACM
Analyzes system-level design choices for approximate computing and dynamic precision scaling to achieve energy–latency trade-offs in real-world embedded AI.
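
A toy illustration of the dynamic precision-scaling idea: try the cheapest bit-width first and fall back whenever the quantization error exceeds a budget. The candidate widths, error metric, and 5% budget are assumptions, not values from the paper.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization of a tensor to the given bit-width."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / levels
    return np.round(x / scale) * scale

def adaptive_precision(x, budget=0.05, widths=(2, 4, 8)):
    """Pick the lowest-energy bit-width whose relative error fits the budget;
    fall back to full precision if none does."""
    for b in widths:  # cheapest first
        xq = quantize(x, b)
        if np.abs(x - xq).max() <= budget * np.abs(x).max():
            return xq, b
    return x, 32

rng = np.random.default_rng(7)
_, chosen = adaptive_precision(rng.normal(size=256))
print(f"selected bit-width: {chosen}")
```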

Quick Takeaways