Machine Learning (ML) is traditionally associated with large servers and enormous amounts of computing power. However, recent improvements in the efficiency of both algorithms and processing platforms mean that it is now possible to include ML functionality on much smaller embedded systems.
This article covers some of the advantages and challenges of distributed machine learning applications, discusses some of the suitable ML methodologies for use in an embedded environment, and finally looks at some of the pitfalls for the unwary.
Over the last 15 to 20 years, Machine Learning has become one of the key building blocks of IT systems and a part of all of our lives, albeit a mostly hidden one. The ever-increasing amounts of available data require increasingly sophisticated processing techniques, and the advances in ML over this period have been quite mind-blowing.
Machine learning can appear in many guises. Early applications included web page ranking, collaborative filtering (used to suggest possible extra purchases in online stores) and automatic translation of documents. More recently, many security applications, such as access control, have come to use face recognition and speech recognition based on ML techniques.
The automotive industry is also a leading adopter of ML: modern vehicles now have hundreds of sensors which send data to dozens of microcontrollers, and distributed data processing is key to real-time behaviour.
Industrial equipment is becoming increasingly intelligent as manufacturers add functionality for condition monitoring and predictive maintenance to reduce service and failure costs, and there is a vast array of new consumer products, from vacuum cleaners to fitness monitors, which utilise real-time autonomous information processing.
What these applications have in common is the requirement to take complex, real-time data streamed from devices such as accelerometers and vibration, sound, electrical and biometric sensors, and to use this data to find signatures of specific events and conditions, or to detect anomalies which are the precursor to a failure, all locally on the embedded system.
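As a rough illustration of local anomaly detection on a sensor stream, the sketch below keeps a running mean and variance using Welford's online algorithm and flags samples that fall far outside the baseline. This is only one simple approach, chosen here because it needs constant memory. The class name, the z-score threshold of 4.0 and the warm-up length of 20 samples are all illustrative assumptions, not values taken from any particular product.

```python
import math


class StreamingAnomalyDetector:
    """Flags samples that deviate strongly from a running baseline.

    Uses Welford's online algorithm, so memory use stays constant
    no matter how many samples have been seen.
    """

    def __init__(self, threshold=4.0, warmup=20):
        self.threshold = threshold  # z-score limit (assumed value)
        self.warmup = warmup        # samples to observe before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, sample):
        """Return True if `sample` looks anomalous.

        Only samples judged normal are folded into the baseline.
        """
        anomalous = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0.0:
                anomalous = abs(sample - self.mean) / std > self.threshold
        if not anomalous:
            self.count += 1
            delta = sample - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (sample - self.mean)
        return anomalous
```

A caller would feed each new reading to `update()` and act on a `True` result. On a real microcontroller the same arithmetic would more likely be written in C with fixed-point or single-precision values, but the structure is the same.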
The key to successful ML applications is being able to formalise the problem(s) in an algorithmic fashion, which is crucial if we want to avoid reinventing the wheel for every new application. Much of the art of machine learning is therefore to reduce a range of fairly disparate problems to a set of comparatively well-constrained scenarios. Much of the science of machine learning is then to solve those problems and provide a high level of reliability for the solutions.
In essence, Machine Learning is a powerful method for building models that use data to make predictions. In embedded systems, which typically run on microcontrollers and are constrained by processing power, memory, size, weight, power consumption and cost, machine learning can be difficult to implement, as these environments cannot usually make use of the same tools that work in cloud server environments.
However, as embedded systems become ever more powerful, there is a growing set of scenarios in which ML does become a viable option. Not all cases are suitable, but where it is possible there are some clear advantages, such as:
- Improved response times
- Reduced reliance on internet connectivity
- Distributed decision handling
- Lower power and bandwidth requirements for communications
- Lower power requirements for the system
- Reduced data flow
These benefits allow the creation of lower cost, more responsive systems by devolving responsibility for data analysis and decision making to the edge, which in turn simplifies the requirements for the cloud system.
However, there are also challenges which must be overcome in the embedded ML environment.
The first challenge is data variability. Traditional ML techniques often struggle with sensor signal data. Real-world data is noisy and full of variation, meaning that the key indicators may look different in different circumstances. Not only may the target data show variation, but so may the background data, and sometimes background variation can be as important as target variation, so it is important to analyse both.
The second challenge is actually implementing real-time detection in the embedded firmware. The embedded processor is resource-constrained, so being able to do real-time local detection will inevitably add complexity to the system. There are also power and cost constraints which restrict the ability of the embedded system to perform as efficiently as possible.
In the end, ML is really all about data classification. There is a wide variety of classification types and techniques, such as Linear Classifiers, Nearest Neighbour, Support Vector Machines, Decision Trees and Neural Networks. Traditionally, many of these have required specialised hardware to fully realise the potential of machine learning, and processor vendors continue to offer more powerful AI processors.
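To make the idea of classification concrete, here is a minimal sketch of the simplest technique in that list, a one-nearest-neighbour classifier. The feature names, the values and the class labels are hypothetical and serve only to illustrate the mechanism.

```python
def nearest_neighbour(train, query):
    """Return the label of the training example closest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    def sq_dist(a, b):
        # Squared Euclidean distance; the square root is not needed
        # because only the ordering of distances matters.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(train, key=lambda pair: sq_dist(pair[0], query))
    return label


# Hypothetical features: (RMS vibration, dominant frequency in kHz)
TRAINING_SET = [
    ((0.10, 1.0), "normal"),
    ((0.12, 1.1), "normal"),
    ((0.80, 3.0), "bearing_fault"),
    ((0.85, 3.2), "bearing_fault"),
]

print(nearest_neighbour(TRAINING_SET, (0.11, 1.05)))  # normal
print(nearest_neighbour(TRAINING_SET, (0.78, 2.9)))   # bearing_fault
```

Nearest neighbour needs no training step, but it must store every training example and compare against all of them at run time, which is exactly the kind of memory and compute trade-off that matters on a small microcontroller.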
However, in parallel, framework developers and ML algorithm developers are also striving to optimise frameworks and neural network architectures to support resource-constrained embedded platforms more effectively. On mobile platforms, it is possible to use Apple Core ML and the Android Neural Networks API, but on lower-performance systems, frameworks such as TensorFlow Mobile and TensorFlow Lite provide more general solutions as part of the TinyML revolution.
Model architectures and algorithms also continue to evolve to address resource limitations of embedded devices, with some shrinking so far that they can fit into MCUs with 2 Kbytes of RAM and 32 Kbytes of flash memory.
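One widely used way of shrinking a model is to store its weights as 8-bit integers rather than 32-bit floats. The article does not say which techniques a given model uses, so the sketch below is purely illustrative: it shows symmetric int8 quantisation with a single shared scale factor, plus a back-of-envelope flash budget check. The function names and the 20,000-weight model size are assumptions, and the check ignores the flash needed for the application code itself.

```python
def quantise_int8(weights):
    """Map float weights to int8 values plus one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the symmetric int8 range.
    quantised = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantised, scale


def dequantise(quantised, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantised]


def fits_in_flash(num_weights, bytes_per_weight, flash_bytes=32 * 1024):
    """Rough check of weight storage only (code size is ignored)."""
    return num_weights * bytes_per_weight <= flash_bytes


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)

# Hypothetical model with 20,000 weights and 32 Kbytes of flash.
print(fits_in_flash(20_000, 4))  # False: 80,000 bytes as 32-bit floats
print(fits_in_flash(20_000, 1))  # True: 20,000 bytes as int8
```

Each restored weight differs from the original by at most half a quantisation step, which many models tolerate well, although the accuracy impact should always be measured rather than assumed.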
When training the ML algorithm it is important to note that ML works best with information-rich data.
Machine learning isn’t magic. Algorithms must be trained for both positive and negative scenarios, so that they can tell the difference. Learning data should be clean and of good quality, but crucially it must also contain the information being sought: if relationships of interest don’t exist in the data, no algorithm will find them.
The aim of the training is to develop Generalisation, which is the ability of a classifier or detector to correctly identify examples that were not included in the original training set. However, it is also important not to overtrain, i.e. to end up with a classifier that has learned to produce highly accurate results on the specific data it was trained on, but cannot replicate the same accuracy on similar examples it hasn’t seen before.
The best way to guard against overtraining is to use training sets that capture as much of the expected variation in target and environment as possible. Using small sample sizes with limited variation and then reusing training data for validation are common causes of overtraining.
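The danger of reusing training data for validation can be shown with a small sketch. It uses a one-nearest-neighbour classifier on a single scalar feature, and the data set is hypothetical and hand-made, with one deliberately noisy "normal" reading at 0.60. Scoring the classifier on its own training data gives a perfect result, while scoring it on held-out examples reveals the error that the memorised noise causes.

```python
def predict(train, x):
    """1-nearest-neighbour prediction on a single scalar feature."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


def accuracy(train, examples):
    """Fraction of `examples` that the classifier labels correctly."""
    correct = sum(1 for x, label in examples if predict(train, x) == label)
    return correct / len(examples)


# Hypothetical one-feature data set; 0.60 is a noisy "normal" reading.
TRAIN = [(0.10, "normal"), (0.20, "normal"), (0.60, "normal"),
         (0.50, "fault"), (0.80, "fault"), (0.90, "fault")]
HELD_OUT = [(0.15, "normal"), (0.30, "normal"),
            (0.58, "fault"), (0.85, "fault")]

print(accuracy(TRAIN, TRAIN))     # 1.0: each point is its own neighbour
print(accuracy(TRAIN, HELD_OUT))  # 0.75: the memorised noise costs a point
```

The perfect training score says nothing about generalisation, because every training point is trivially its own nearest neighbour. Only the held-out score is a meaningful estimate of how the classifier will behave on unseen data.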
For best results, training should be done in a server environment, as this allows rapid turnaround of model improvements and the use of multiple data sets in quick succession. While training ML algorithms in an embedded environment is possible, it can be time-consuming and inefficient.
To summarise, ML in embedded systems is developing rapidly, and there are already sets of tools, frameworks and algorithms to help implement ML in a resource-constrained system. However, these tools and frameworks come with their own challenges, and a detailed understanding of embedded development is still crucial to success.
Nowhere is it more crucial to test early and often than with ML. Extensive testing with real-world data is the key to perfecting ML classifiers and detectors. Early small-scale pilot programmes with friendly customers can help validate the ML models and improve the training data to cover additional variations and scenarios.