Tiny ML and Its Data-centric Role

In this post I share what you need to know to understand why the future of ML is tiny and bright.

We start with the basics of Tiny Machine Learning (TinyML), because they lay the groundwork for building TinyML applications.

You may have read my last article about the three ways in which machine learning learns. Supervised machine learning gives us the ability to develop data-driven function approximations. These approximations can be applied to essentially any kind of data: images, sounds, sensor outputs, and so on. As a primer for TinyML, it helps to understand some of the historical context of machine learning.

How Do Today’s Computers Compare to the Past?

The ideas behind this technology date back to the 1940s, but it has become practical only recently because two things happened. First, we now have plenty of data to train models. Second, we have enough compute power to train them affordably, in terms of both cost and power.

This became possible thanks to improvements in processing and storage technologies, as well as a reduction in the cost of memory. These improvements made it possible to train models on a personal computer using only the CPU. Shortly after, computation on graphics processing units (GPUs) became necessary to handle larger datasets. This is where cloud computing came in: it lets us use GPUs through a web dashboard and pay only for the compute we use to train ML models.

This is the point at which a new paradigm began, commonly referred to as the compute-centric paradigm.

More recently we have seen the development of specialized application-specific integrated circuits (ASICs) and tensor processing units (TPUs), which can pack roughly the power of 8 GPUs.

This hardware has made possible some of the largest machine learning models ever created, such as Bard by Google, Llama by Meta, and GPT-3 by OpenAI. GPT-3 alone has 175 billion parameters. The insatiable demand for NLP model complexity is driven by the desire to enable a wide variety of applications, such as question answering, text summarization, better user experience with personal assistants, text generation to aid sentence completion, and so forth.

So where does TinyML add value?

The key to the answer is in the data. The data used in machine learning applications comes from a data source. This is often a camera, a microphone, or some other sensor tasked with capturing information about the physical world and transducing it into a format digestible by computers. This is how the Internet of Things (IoT) emerged: everyday things with access to the internet.

These IoT devices periodically send data over the internet to the cloud, where it is often stored in a data warehouse that can easily be accessed from a cloud instance. Warehousing data from heterogeneous, distributed devices in this way requires a large amount of data to be transmitted over network protocols from the IoT devices to the cloud.

The challenges in Machine Learning

Some individuals have raised concerns with this infrastructure, namely related to (1) privacy, (2) latency, (3) storage, and (4) energy efficiency.

Let us look at these four main concerns:

  1. Privacy: Data in transit could be intercepted by a malicious actor, and it becomes inherently less secure when warehoused in a single location.
  2. Latency: Standard IoT devices transmit data to the cloud for processing and then receive a response based on the algorithm output. The devices are pretty “dumb” and fully dependent on the speed of the internet to produce a result.
  3. Storage: For many IoT devices, a good percentage of the data they transmit is probably useless, so you are spending money on data storage that you don’t need.
  4. Energy Efficiency: Transmitting data (via wires or wirelessly) is energy-intensive, around an order of magnitude more so than onboard computations.

How can we solve these ML problems?

The solution is TinyML. By giving embedded systems the ability to perform on-device machine learning, these problems are largely solved. If an IoT system can process its own data, it no longer needs to transmit that data off the device, which makes it highly energy-efficient. This ability also reduces latency to a minimum. By keeping the data primarily on the device and minimizing communications, security and privacy improve. One final insight: an intelligent system that only activates when necessary requires less storage capacity and fewer external communications.
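To make the idea concrete, here is a minimal Python sketch of the difference between streaming every sample to the cloud and reporting only detected events. It is my own illustration, not code from the sources. The threshold detector is only a stand-in for a trained on-device model, and the names detect_events, bytes_per_sample, and bytes_per_event, as well as the byte sizes, are hypothetical.

```python
from typing import List, Sequence, Tuple


def detect_events(samples: Sequence[float], threshold: float,
                  window: int = 4) -> List[Tuple[int, float]]:
    """Scan non-overlapping windows and keep those that look like an event.

    The mean absolute amplitude is a placeholder for a real on-device
    model (for example, a small neural network). Returns a list of
    (start_index, score) pairs.
    """
    events = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        score = sum(abs(x) for x in chunk) / window
        if score > threshold:
            events.append((start, score))
    return events


def bytes_streamed(samples: Sequence[float], bytes_per_sample: int = 2) -> int:
    """Payload if every raw sample is sent to the cloud (assumed sample size)."""
    return len(samples) * bytes_per_sample


def bytes_event_driven(events: Sequence[Tuple[int, float]],
                       bytes_per_event: int = 8) -> int:
    """Payload if only detected events are sent (assumed message size)."""
    return len(events) * bytes_per_event


if __name__ == "__main__":
    # 200 mostly silent samples with one short burst of activity.
    samples = [0.0] * 96 + [0.9, -0.8, 0.7, -0.9] + [0.0] * 100
    events = detect_events(samples, threshold=0.5)
    print("events:", events)
    print("streamed bytes:", bytes_streamed(samples))
    print("event-driven bytes:", bytes_event_driven(events))
```

In this toy signal only one window crosses the threshold, so the device would send a single short message instead of the full stream. Real savings depend on how rare events are and on how the messages are encoded.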

What is the Future for ML?

The future is the data-centric paradigm. This paradigm flips the compute-centric paradigm on its head, taking the computing power to the data, instead of the data to the computing power.

Many machine learning applications run on edge devices, which are often located in remote areas with sparse access to power and minimal networking. A TinyML device communicates only when necessary, that is, when an event occurs, so its average consumption stays well below 1 mW, allowing it to run continuously on a small coin battery for more than a year.
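A rough back-of-the-envelope check shows how strongly battery life depends on average power. The Python sketch below assumes a coin cell of about 225 mAh at 3 V (typical of a CR2032-class cell, which is my assumption and not a figure from the sources) and ignores self-discharge and voltage drop. The active power, sleep power, and duty cycle are illustrative values, not measurements.

```python
def battery_energy_mwh(capacity_mah: float, voltage_v: float) -> float:
    """Approximate stored energy in milliwatt-hours (mAh x V)."""
    return capacity_mah * voltage_v


def average_power_mw(active_mw: float, sleep_mw: float,
                     duty_cycle: float) -> float:
    """Time-weighted average power for a device that is active only a
    fraction `duty_cycle` of the time and asleep otherwise."""
    return duty_cycle * active_mw + (1.0 - duty_cycle) * sleep_mw


def lifetime_days(energy_mwh: float, avg_power_mw: float) -> float:
    """Idealized runtime in days at a constant average power draw."""
    return energy_mwh / avg_power_mw / 24.0


if __name__ == "__main__":
    energy = battery_energy_mwh(225.0, 3.0)  # assumed coin cell

    # Case 1: constant 1 mW draw.
    print("constant 1 mW:", lifetime_days(energy, 1.0), "days")

    # Case 2: awake 0.5% of the time at 10 mW, asleep at 0.01 mW
    # (hypothetical numbers chosen only for illustration).
    avg = average_power_mw(active_mw=10.0, sleep_mw=0.01, duty_cycle=0.005)
    print("duty-cycled:", lifetime_days(energy, avg), "days")
```

Under these assumptions, a constant 1 mW draw would empty the cell in about four weeks, while the duty-cycled device lasts more than a year. The year-long lifetime therefore comes from the device sleeping most of the time and waking only when an event occurs.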

In conclusion, thanks to their low cost and their ability to run continuously for more than a year on just a watch battery, these smart embedded devices present countless opportunities for industrial applications.

I invite you to find out how we can help you develop TinyML and data products to solve your problems. Visit us at CONAUTI

or feel free to contact me to co-create a cutting-edge solution for your problems. Write to me on WhatsApp.

Enrique Suárez

TinyML & Data Product Development Leader.

Source: Harvard and Google.

See you in the next article.
