Brainy Bytes: Reinforcement Learning in the Neuromorphic Niche
Image generated by Bing DALL·E, created on April 18, 2024.


What's the Buzz: Three years after launching its second-generation neuromorphic chip, Loihi 2, Intel, in partnership with the US Department of Energy's Sandia National Laboratories, introduced Hala Point, a system comprising 1,152 Loihi 2 chips. Hala Point represents a notable advance in neuromorphic computing, boasting 1.15 billion artificial neurons and 128 billion synapses across 140,544 cores, a substantial increase over its predecessor, Pohoiki Springs.

Spilling the Beans: Neuromorphic computing is like trying to make a mini-brain out of silicon. It's inspired by the noodle-y complexities of our human brains—think billions of neurons chatting through synapses. Instead of the standard ones and zeros dance, these chips mimic neurons firing and chilling, which can mean faster processing with significantly less power. Why does this matter? In a world hungry for smarter tech and greener solutions, these brainy chips are like having a super-efficient, problem-solving wizard in your pocket. Plus, they're fab at tackling tasks that make traditional computers sweat, from sniffing out patterns to learning on the fly!

Why give a Hoot: Get excited about the possibilities of Reinforcement Learning on neuromorphic systems! It's not just about saving energy; it's a high-tech jamboree where energy-sipping chips crunch hefty computations for IoT devices and mobile robots. Imagine cars and robots making split-second decisions with brain-like swiftness, all while learning nifty tricks directly on the hardware, and you've got a recipe for tech that's both smart and speedy! Who knew learning could be so electric and efficient?


Keep scrolling down if you are interested in details...

Overview of Biological, Neural, and Silicon Computing

Fig.01 Biological, Deep Neural Network and Silicon Computing Ecosystem (Source: Roy, K., Jaiswal, A., & Panda, P. (2019).)

The depiction of the Biological Brain Network illustrates an intricate web of neurons and synapses that utilize temporal spikes for rapid, efficient communication across different regions. In contrast, the Deep Convolutional Neural Network (DCNN) for Object Detection employs synaptic-like storage and neuronal nonlinearity to develop broad data representations. After training via backpropagation, this network evolves from recognizing simple features such as edges and color blobs to more detailed features like the parts of an animal's face, reflecting the hierarchical structure of the visual cortex. Meanwhile, the Silicon Computing Ecosystem is characterized by a conventional division between processing units and memory storage, leading to the 'memory wall bottleneck.' This separation significantly contributes to the high energy demands and computational power needed to achieve the exceptional accuracy of deep neural networks operated on sophisticated cloud servers.

Fig.02 Fundamental differences between the traditional (von Neumann) and neuromorphic architecture with regard to their operation, organization, programming, communication, and timing. (Source: Schuman et al. 2022)

Neuromorphic systems, designed to mimic the brain's processing, fundamentally differ from traditional digital systems by utilizing analog data processing, asynchronous communication, and spiking information representation. This field is supported by researchers focused on emulating neural functions, simulating networks, and developing bio-inspired electronic devices. Central to these efforts are memristors, which adjust resistance based on electrical history and support functionalities akin to memory and cognition. Neuromorphic engineering leverages these principles through analog electronics to deliver innovations like smart sensors and video processing systems, benefiting from low power usage and adaptive responses. [Mehonic, A., & Kenyon, A. J. (2022)]

Fig.03 The graph shows the escalation in computing power demands over the last forty years, measured in petaFLOPS days. Initially, the demand for computing power doubled biennially until 2012; since then, it has accelerated, doubling roughly every two months. (Source: Mehonic et al. 2022)

Why a crucial tech? Conventional systems separate memory and processing, leading to high energy use. As AI usage expands in sectors like IoT and autonomous technologies, the demand for power intensifies, surpassing efficiency improvements traditional computing can achieve. Inspired by the human brain, neuromorphic computing integrates memory and processing, uses parallel architectures, and encodes information uniquely. This approach dramatically reduces energy consumption and offers a sustainable solution to future computational needs.

Other key benefits of transitioning to neuromorphic computing for machine learning include real-time processing, robustness to noise, and scalability. These systems reduce latency and bandwidth needs while enhancing sensory processing and adaptability. Their bio-inspired capabilities enable complex, dynamic learning, ideal for applications requiring immediate, reliable responses.

Features of Neuromorphic Computing: They exhibit Highly Parallel Operation, enabling numerous neurons and synapses to work concurrently. While each neuron and synapse performs simple tasks, their collective power significantly enhances overall system performance. The Collocated Processing and Memory in neuromorphic hardware integrates these functions, eliminating the traditional separation seen in von Neumann architectures. This integration speeds up throughput and cuts energy use by reducing the frequent data accesses required by conventional systems. Inherent Scalability is another strength of neuromorphic computing; systems like SpiNNaker and Loihi show that adding more chips scales the network capacity seamlessly. Neuromorphic systems employ Event-Driven Computation, where neurons process information only in response to incoming data, leading to high efficiency and activity-dependent energy use. Lastly, Stochasticity in neuron operation introduces beneficial randomness, making these systems more robust and adaptable to various computational demands. [Schuman et al. 2022]
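To make the event-driven idea above concrete, here is a minimal, illustrative Python sketch (not tied to any particular chip or framework): neurons sit idle until a spike event arrives, and only the targets of that event are touched. All names, weights, and delays are assumptions chosen purely for illustration.

```python
import heapq

def run_event_driven(events, weights, threshold=1.0):
    """Process a spiking network only when spike events arrive.

    events  : list of (time, neuron_id) input spikes
    weights : dict mapping source neuron -> list of (target, weight, delay)
    Returns the (time, neuron_id) spikes emitted by the network.
    """
    potentials = {}            # membrane potential per neuron, default 0
    queue = list(events)
    heapq.heapify(queue)       # priority queue ordered by event time
    emitted = []
    while queue:
        t, src = heapq.heappop(queue)
        for target, w, delay in weights.get(src, []):
            v = potentials.get(target, 0.0) + w
            if v >= threshold:             # threshold crossed -> spike
                potentials[target] = 0.0   # reset after firing
                emitted.append((t + delay, target))
                heapq.heappush(queue, (t + delay, target))
            else:
                potentials[target] = v     # idle neurons stay untouched
    return emitted
```

Note that energy use in such a scheme scales with spike activity, not with network size: a quiet network does almost no work, which is exactly the activity-dependent efficiency described above.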

'' Neuromorphic systems work best with spiking neural networks (SNNs) because they use spikes—a form of information processing that closely mimics natural neural activity. ''


Spiking neural networks (SNNs): The 'spikes' in SNN refer to discrete events that occur at specific points in time, signaling the transmission of information between neurons. These spikes are analogous to the action potentials in biological neurons, which are sudden and brief electrical impulses used by neurons to communicate with each other.

Fig.04 An example of a Spiking Neural Network (SNN) operating in the temporal domain (Source: Schuman et al. 2022)

The figure shows synapses with a time delay, where information is transmitted via spikes across the network. The network's functioning at two different times, time t (left) and time t + 1 (right), is depicted to demonstrate how the network's state evolves over time.

In an SNN, a neuron accumulates input signals until it reaches a predefined threshold. Once this threshold is crossed, the neuron generates a spike. This spike is then sent to other neurons to which it is connected, conveying information through the network. The key characteristic of these spikes is that they are binary events (they either occur or they don't) and carry information primarily through their timing and occurrence rather than through varying signal intensity. This binary, time-sensitive nature allows SNNs to model neural activity in a way that closely mimics biological neural dynamics, making them particularly effective for tasks that benefit from temporal pattern recognition and real-time processing.
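The accumulate-threshold-fire-reset cycle described above can be sketched in a few lines of Python. This is a toy leaky integrate-and-fire neuron with illustrative constants, not a model of any specific hardware:

```python
def lif_neuron(input_currents, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire: accumulate input, leak charge over time,
    and emit a binary spike when the membrane potential crosses threshold."""
    v = 0.0
    spikes = []
    for i in input_currents:
        v = leak * v + i          # old charge decays, new input accumulates
        if v >= threshold:
            spikes.append(1)      # spike: a binary, all-or-nothing event
            v = 0.0               # reset after firing
        else:
            spikes.append(0)
    return spikes
```

The output is a spike train of ones and zeros; the information lies in when the ones occur, which is precisely the time-sensitive coding described above.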

The Exciting Potential of Neuromorphic Computing in ML

Machine learning domains like Computer Vision are poised to benefit significantly from advancements in neuromorphic computing, where SNNs are beginning to bridge the gap in processing dynamic visual information like the human brain [Neftci et al. 2019, Tavanaei et al. 2019]. For NLP, the ability of SNNs to handle sequential data is recognized but remains underexplored [Xiao et al. 2022, Yu et al. 2020]. Additionally, their role in Reasoning and Decision Making, especially in reinforcement learning scenarios, is crucial yet underutilized [Bing et al. 2020, Friedmann et al. 2020]. SNNs also offer solutions to Lifelong Learning by potentially addressing the challenge of catastrophic forgetting through temporal data processing [Parisi et al. 2019, Bellec et al. 2020]. In One-shot Learning [Kheradpisheh et al. 2021] and Unsupervised Learning with Minimal Supervision [Panda et al. 2020, Bellec et al. 2019], they could surpass the learning efficiency of traditional deep learning models. Moreover, the application of neuroscience-driven techniques such as Spike-Timing-Dependent Plasticity and dendritic learning could further refine SNN capabilities, enhancing their feature selectivity and efficiency in spike-based learning across these applications [Azghadi et al. 2014, Zenke et al. 2020].
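As a taste of one neuroscience-driven technique mentioned above, here is a sketch of the classic pair-based Spike-Timing-Dependent Plasticity (STDP) rule: a synapse is strengthened when the presynaptic neuron fires just before the postsynaptic one, and weakened in the opposite order. The amplitudes and time constant below are illustrative placeholders, not values from any cited paper:

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based STDP weight update, dt = t_post - t_pre (in ms).

    Pre-before-post (dt > 0) potentiates the synapse; post-before-pre
    (dt < 0) depresses it. The effect decays exponentially with |dt|.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau)    # potentiation
    return -a_minus * math.exp(dt / tau)       # depression
```

Because the update depends only on local spike timings, rules like this are a natural fit for on-chip learning, where no global backpropagation signal is available.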


Reinforcement Learning into the mix

How does RL benefit from neuromorphic computation?

Energy-Efficient Learning utilizes the low power consumption of neuromorphic systems for extensive RL computations, crucial for energy-sensitive applications like mobile robotics and IoT devices [Zanatta et al., 2023, Koursioumpas et al., 2024]. Real-Time Decision Making benefits from the rapid, parallel processing capabilities of neuromorphic hardware, critical for dynamic environments such as autonomous vehicles and robotics, where decisions must be instant and reliable [Amaya & von Arnim, 2023, Liu et al., 2021]. On-Device Adaptation emphasizes learning directly on hardware, which enhances data security by minimizing cloud reliance and cuts latency for faster response times [Rosenfeld et al., 2021, Zhang et al., 2020]. Handling Sensory Inputs uses neuromorphic chips to process sensory data with high efficiency, mimicking the human brain's ability to manage diverse sensory inputs, ideal for context-aware computing [Tang et al., 2021, Weidel et al., 2021].

Case Studies:

CASE-01

Oikonomou et al. (2023) introduce a hybrid Deep Deterministic Policy Gradient (DDPG) model that combines spiking neural networks (SNNs) and deep neural networks (DNNs) for controlling a 6-degree-of-freedom robotic arm, focusing on energy efficiency and target-reaching tasks. Implemented on neuromorphic hardware, the model considerably reduces power consumption. The hybrid-DDPG achieves a high average success rate of 0.97, outperforming the DDPG trained with sub-goals, which has a comparable success rate but requires nearly double the time. The hybrid model also slightly surpasses the standard DDPG in execution time, demonstrating its efficiency and effectiveness in dynamic environments with optimized training methodologies.

Fig.05 Overview of the actor model (or policy) that receives joint states as input spikes and provides actions for the robotic arm (Source: Oikonomou et al. (2023))

The evaluation of the hybrid-DDPG model was conducted using an NVIDIA GeForce RTX 3090 GPU. The robotic simulations were carried out on a virtual Kuka robot arm within a simulated environment created using the Robot Operating System (ROS) and the Gazebo simulator. This setup facilitated the integration of sensors such as a camera and laser scanner, which were essential for tasks like detecting the target object and preventing collisions.

CASE-02

Akl et al. (2023) integrated Spiking Neural Networks (SNNs) with Deep Reinforcement Learning (DRL) to tackle complex control tasks from OpenAI Gym on neuromorphic chips like Intel Loihi. They found that surrogate gradients and randomized neuron membrane parameters enhanced learning efficiency and model adaptability in continuous action environments.
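The surrogate-gradient trick can be illustrated simply: the spike nonlinearity is a hard threshold whose true derivative is zero almost everywhere, so during backpropagation it is replaced by a smooth stand-in. The sketch below uses a fast-sigmoid-shaped surrogate; the exact surrogate function and its sharpness parameter are my own illustrative assumptions, not the paper's implementation:

```python
def spike_forward(v, threshold=1.0):
    """Forward pass: a hard threshold. Non-differentiable at the
    threshold, and zero gradient everywhere else."""
    return 1.0 if v >= threshold else 0.0

def spike_surrogate_grad(v, threshold=1.0, beta=10.0):
    """Backward-pass stand-in: the derivative of a fast sigmoid
    centred on the threshold. beta controls how sharply the
    surrogate gradient is concentrated around the threshold."""
    x = beta * abs(v - threshold)
    return beta / (1.0 + x) ** 2
```

During training, the forward pass keeps the binary spikes intact while gradients flow through the surrogate, which is what lets standard DRL optimizers update the weights of a spiking policy at all.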

Fig.06 Encoding and decoding method used in the experiments (Source: Akl et al. (2023))

Fig.06 provides an overview of the encoding and decoding methods used in the experiments across OpenAI Gym environments (A), which vary in observation and action dimensions. Each observation dimension is encoded by a two-neuron input scheme that separates it into a positive and a negative neuron (B). This input vector is multiplied by the first weight matrix to produce a constant current injected into the first hidden layer's neurons (C). Here, membrane potentials (blue curves), firing thresholds (dashed green lines), and spike emissions (red dots) are shown. Spikes travel to the second hidden layer, influencing activity there, before reaching the output layer (D), whose firing thresholds are set to infinity so that action decisions are read from the final membrane potentials.
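A rough Python sketch of this encoding/decoding scheme may help; the function names and the tanh squashing of the action are my assumptions for illustration, not the authors' code:

```python
import math

def encode_two_neuron(observation):
    """Each observation dimension feeds a (positive, negative) neuron pair:
    the positive neuron receives max(x, 0), the negative max(-x, 0)."""
    encoded = []
    for x in observation:
        encoded.extend([max(x, 0.0), max(-x, 0.0)])
    return encoded

def decode_actions(final_potentials, scale=1.0):
    """Output neurons have effectively infinite thresholds, so they never
    spike; the continuous action is read from each neuron's final membrane
    potential (squashed to [-1, 1] here -- an illustrative assumption)."""
    return [math.tanh(v / scale) for v in final_potentials]
```

The appeal of the two-neuron scheme is that both positive and negative observation values become non-negative input currents, which is what spiking neurons can actually consume.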

The control tasks were complex, continuous control problems from OpenAI Gym environments: Ant-v3, HalfCheetah-v3, Hopper-v3, and Pendulum-v0. These tasks require dynamic, real-time decision-making. The results show statistically significant performance improvements (p < 0.001) across these environments. The approach also reduced the average firing rates in the neural networks, indicating more efficient neural activity.

Challenges and Limitations:

Neuromorphic computing faces several key challenges that limit its broader adoption and practical application. Accessibility and usability issues stem from a lack of user-friendly hardware and software, restricting experimentation across diverse fields. There is also a problem with integration into heterogeneous computing environments, where reliance on conventional host systems introduces inefficiencies. Moreover, the field suffers from a lack of established benchmarks and metrics, making it hard to gauge progress or compare technologies. Additionally, programming complexities due to the absence of higher-level abstractions make development time-consuming and restrict usage to specialized applications, further hindering its expansion.

Conclusion:

Neuromorphic computing, inspired by the biological brain, holds promising potential to revolutionize machine learning through features such as low power consumption, real-time processing, robustness to noise, and scalability. By integrating processing and memory, neuromorphic systems dramatically reduce energy use and overcome limitations such as the 'memory wall' of traditional architectures. Despite these advantages, significant challenges remain: limited accessibility due to complex programming demands, difficulty integrating with existing computing environments, and a lack of standardized benchmarks for evaluating progress. Overcoming these barriers is crucial for the wider adoption and optimization of neuromorphic computing in fields ranging from autonomous systems to advanced robotics and Internet of Things (IoT) applications.


TECH TEASER (standardized benchmarks): Neurobench [Yik et al. 2023], developed collaboratively by a broad community, proposes a structured, open-source framework to evaluate neuromorphic approaches systematically. It includes two tracks: an algorithm track for hardware-independent evaluations and a system track focusing on hardware-dependent assessments. This framework is designed to support iterative updates and community-driven improvements, aiming to standardize how neuromorphic computing's effectiveness is measured, thereby fostering technological advancement and facilitating comparison with conventional computing methods.


Reference:

1. Amaya, C., & von Arnim, A. (2023). Neurorobotic reinforcement learning for domains with parametrical uncertainty. Frontiers in Neurorobotics, 17, 1239581.

2. Akl, M., Ergene, D., Walter, F., & Knoll, A. (2023). Toward robust and scalable deep spiking reinforcement learning. Frontiers in Neurorobotics, 16, 1075647.

3. Azghadi, M. R., Iannella, N., Al-Sarawi, S. F., Indiveri, G., & Abbott, D. (2014). Spike-based synaptic plasticity in silicon: design, implementation, application, and challenges. Proceedings of the IEEE, 102(5), 717-737.

4. Bellec, G., Salaj, D., Subramoney, A., Legenstein, R., & Maass, W. (2020). Long short-term memory and learning-to-learn in networks of spiking neurons. Nature Communications, 11(1), 1-15.

5. Bellec, G., Scherr, F., Hajek, E., Salaj, D., Legenstein, R., & Maass, W. (2019). Biologically inspired alternatives to backpropagation through time for learning in recurrent neural nets. arXiv preprint arXiv:1901.09049.

6. Bing, Z., Meschede, C., Röhrbein, F., Huang, K., & Knoll, A. C. (2020). A Survey of Robotics Control Based on Learning-Inspired Spiking Neural Networks. Frontiers in Neurorobotics, 13, 35.

7. Friedmann, S., Frémaux, N., Schemmel, J., Gerstner, W., & Meier, K. (2020). Reward-based learning under hardware constraints—Using a RISC processor embedded in a neuromorphic substrate. Frontiers in Neuroscience.

8. Kheradpisheh, S. R., Ganjtabesh, M., Thorpe, S. J., & Masquelier, T. (2021). STDP-based spiking deep convolutional neural networks for object recognition. Neural Networks, 99, 56-67.

9. Koursioumpas, N., Magoula, L., Petropouleas, N., Thanopoulos, A. I., Panagea, T., Alonistioti, N., ... & Khalili, R. (2024). A safe deep reinforcement learning approach for energy efficient federated learning in wireless communication networks. IEEE Transactions on Green Communications and Networking.

10. Liu, J., Lu, H., Luo, Y., & Yang, S. (2021). Spiking neural network-based multi-task autonomous learning for mobile robots. Engineering Applications of Artificial Intelligence, 104, 104362.

11. Mehonic, A., & Kenyon, A. J. (2022). Brain-inspired computing needs a master plan. Nature, 604(7905), 255-260.

12. Neftci, E., Mostafa, H., & Zenke, F. (2019). Surrogate Gradient Learning in Spiking Neural Networks. IEEE Signal Processing Magazine, 36(6), 61-63.

13. Oikonomou, K. M., Kansizoglou, I., & Gasteratos, A. (2023). A hybrid reinforcement learning approach with a spiking actor network for efficient robotic arm target reaching. IEEE Robotics and Automation Letters.

14. Panda, P., & Roy, K. (2020). Learning to generate sequences with combination of Hebbian and non-Hebbian plasticity in recurrent spiking neural networks. Frontiers in Neuroscience, 14, 598.

15. Parisi, G. I., Kemker, R., Part, J. L., Kanan, C., & Wermter, S. (2019). Continual lifelong learning with neural networks: A review. Neural Networks, 113, 54-71.

16. Parisi, G. I., & Lomonaco, V. (2020). Online continual learning on sequences. In Recent Trends in Learning From Data: Tutorials from the INNS Big Data and Deep Learning Conference (INNSBDDL2019) (pp. 197-221). Springer International Publishing.

17. Rosenfeld, B., Rajendran, B., & Simeone, O. (2021, June). Fast on-device adaptation for spiking neural networks via online-within-online meta-learning. In 2021 IEEE Data Science and Learning Workshop (DSLW) (pp. 1-6). IEEE.

18. Roy, K., Jaiswal, A., & Panda, P. (2019). Towards spike-based machine intelligence with neuromorphic computing. Nature, 575(7784), 607-617.

19. Schuman, C. D., Kulkarni, S. R., Parsa, M., Mitchell, J. P., & Kay, B. (2022). Opportunities for neuromorphic computing algorithms and applications. Nature Computational Science, 2(1), 10-19.

20. Tavanaei, A., Ghodrati, M., Kheradpisheh, S. R., Masquelier, T., & Maida, A. (2019). Deep learning in spiking neural networks. Neural Networks, 111, 47-63.

21. Tang, G., Kumar, N., Yoo, R., & Michmizos, K. (2021, October). Deep reinforcement learning with population-coded spiking neural network for continuous control. In Conference on Robot Learning (pp. 2016-2029). PMLR.

22. Weidel, P., Duarte, R., & Morrison, A. (2021). Unsupervised learning and clustered connectivity enhance reinforcement learning in spiking neural networks. Frontiers in computational neuroscience, 15, 543872.

23. Xiao, R., Wan, Y., Yang, B., Zhang, H., Tang, H., Wong, D. F., & Chen, B. (2022). Towards energy-preserving natural language understanding with spiking neural networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31, 439-447.

24. Yik, J., Ahmed, S. H., Ahmed, Z., Anderson, B., Andreou, A. G., Bartolozzi, C., ... & Reddi, V. J. (2023). Neurobench: Advancing neuromorphic computing through collaborative, fair and representative benchmarking. arXiv preprint arXiv:2304.04640.

25. Yu, Q., Tang, H., Tan, K. C., & Li, H. (2020). A Spiking Neural Network System for Robust Sequence Recognition. IEEE Transactions on Neural Networks and Learning Systems.

26. Zanatta, L., Di Mauro, A., Barchi, F., Bartolini, A., Benini, L., & Lombardi, M. (2021). Assessing the frontiers of ultra-low power and energy efficient designs: From spiking neural networks to neuromorphic hardware. Energy & Environmental Science, 14(4), 1880-1909.

27. Zhao, W., Qu, H., & Yi, X. (2023). Neuroevolutionary algorithms for deep reinforcement learning: A comprehensive survey. Neural Computing and Applications.


More articles by Chayan Banerjee, PhD
