The ML operations
Internal solutions require you to design, implement and maintain the required models and infrastructure, but in return, you will own the entire platform and benefit from a long-term cost advantage and a technological edge in the market. In this section, we provide an overview of key aspects to consider when designing internal Machine Learning solutions.
As a guiding principle, when you begin planning an ML solution, start by considering the requirements for integrating the complete solution in a production environment. Then consider the requirements of a minimum viable product (MVP) that is tightly constrained but fully integrated into your overall system. Finally, consider the requirements of a proof of concept (PoC) that can be executed with limited data and in an isolated environment. Conversely, when you begin implementing an ML solution, proceed in the reverse order: from PoC to MVP to production (Ameisen, 2020).
| Adapt to constant evolutions
The benefits of implementing state-of-the-art ML algorithms are clear when they give you a technological advantage in the market. ML is constantly evolving, so even if your engineers implemented a solution similar to the one your use case requires only a few weeks ago, it is good practice to check whether the relevant ML topics have seen any new developments since then. Such developments may reduce the amount of training data, computing time and computing costs required to achieve the desired accuracy.
| A body is needed to ground the brain into the world
When people look at Artificial Intelligence or Machine Learning applications, they often think of algorithms that function as the brain of a system that performs a complex task. What they are often unaware of is that such a brain requires a body to ground itself in the world.
The embodiment of ML algorithms determines the memory, computing power, sensing bandwidth and communication speed they will be able to exploit for the task to be performed (Gift & Deza, 2021). Implementing such a body may even consume most of the project resources when developing internal solutions. Taking this into account helps you plan the implementation roadmap more accurately and thus reduces the risk of delays and underestimated budgets.
Typically, ML algorithms require two types of embodiment: one for learning (model training) and another for laboring (inference in production).
The learning infrastructure focuses on memory-rich hardware for parallel computation and efficient read/write operations with the relevant databases. In some cases, though, such infrastructure may not be required, as pre-trained models might be available and accurate enough for the task at hand. This is possible with some open-source models, or with models purchased from external providers that trained them using their own infrastructure.
The laboring infrastructure also requires hardware for parallel computation. However, its memory requirements drop once a model has finished training, and the model can be further optimized by reducing its size (number of parameters). Such optimization can reduce operating costs considerably, especially when using GPU cloud instances.
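As a rough illustration of why a smaller model is cheaper to operate, the memory needed just to hold a model's weights grows with the number of parameters multiplied by the bytes stored per parameter. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a sizing tool: the function name `weights_memory_mb`, the parameter counts and the 4 bytes per parameter (32-bit floats) are illustrative and do not describe any specific model or provider. It also ignores the memory used by activations and by the serving runtime.

```python
def weights_memory_mb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Approximate memory needed to hold the model weights, in megabytes.

    Assumes every parameter is stored with the same number of bytes
    (4 bytes corresponds to a 32-bit float).
    """
    return num_parameters * bytes_per_parameter / 1_000_000


# Hypothetical model sizes, chosen only for illustration.
original_mb = weights_memory_mb(100_000_000)  # model as trained
reduced_mb = weights_memory_mb(25_000_000)    # same model after size reduction

# Fraction of weight memory saved by the smaller model.
saving = 1 - reduced_mb / original_mb
print(f"{original_mb:.0f} MB -> {reduced_mb:.0f} MB ({saving:.0%} less)")
```

An estimate of this kind can help decide whether a reduced model fits on a smaller, cheaper instance, although the actual saving depends on the pricing of the hardware you use.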
| Some models might need frequent updates
You should understand the update frequency that your models require, whether you own both infrastructures (for learning and for laboring) or only the one for laboring. In some cases, such updates may simply consist of replacing the model with a smaller or more efficient one, just like swapping one file for another. In other cases, you may need to retrain your models every few months, weeks or days, depending on how frequently their world changes. For example, a self-driving car may need to learn to detect an updated version of a traffic sign every few years, or a speech recognition system may need to learn new slang every few weeks.
The required update frequency has a large impact on the architecture design, so it is important to discuss it during the first stages of development. A common case is a hybrid system, in which a human in the loop reviews the edge cases where the model has low confidence in its predictions. In such scenarios, a good user interface may be crucial to obtain continuous and well-standardized feedback that allows the model to catch up with a constantly changing world. Recent advances in continual learning are bringing together the learning and the laboring stages, effectively simplifying the design, implementation and maintenance of ML systems (Parisi, 2019). However, the field of continual learning is still in its infancy, and not all ML applications in production environments are amenable to its use.
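The human-in-the-loop review described above can be sketched as a simple routing step: predictions with high confidence are accepted automatically, and the rest are queued for a person to check. This is a minimal sketch under assumptions, not a prescribed design. The `Prediction` class, the `route_predictions` function and the 0.8 threshold are hypothetical names and values introduced here for illustration; a suitable threshold depends on your application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float  # model confidence between 0.0 and 1.0


def route_predictions(
    predictions: List[Prediction], threshold: float = 0.8
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into auto-accepted ones and ones needing human review."""
    accepted: List[Prediction] = []
    needs_review: List[Prediction] = []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            accepted.append(prediction)
        else:
            needs_review.append(prediction)
    return accepted, needs_review


# Hypothetical predictions, for illustration only.
batch = [
    Prediction("a1", "stop_sign", 0.97),
    Prediction("a2", "yield_sign", 0.55),
    Prediction("a3", "stop_sign", 0.80),
]
accepted, needs_review = route_predictions(batch)
```

The labels that reviewers assign to the queued items can then be stored as new training examples, which is one way to collect the standardized feedback mentioned above.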
| Be alerted when data shifts
Finally, once you have built the infrastructure for the learning and laboring of your ML models, the last stage is to monitor the quality of the data being ingested by the system. In particular, you need to understand whether a sudden drop in the accuracy of your model predictions is due to changes in the quality of the data, the content of the data or the behavior of the internal infrastructure (Ng, 2018). Monitoring is related to data engineering and presents different challenges depending on the embodiment required by the application. For instance, IoT or mobile devices may have limited resources to assess the incoming data, or data privacy regulations may constrain its inspection, so you may need to consult external specialists for ad hoc solutions.
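One simple way to be alerted to a change in the incoming data is to compare a summary statistic of each new batch against a reference sample collected when the model was known to perform well. The sketch below is only one possible check under stated assumptions: it looks at a single numeric feature and measures how far the mean of the incoming batch has moved, in units of the reference standard deviation. The function names, the sample values and the alert threshold of 2.0 are hypothetical, and real monitoring usually tracks many features and more robust statistics.

```python
import statistics
from typing import Sequence


def mean_shift_score(reference: Sequence[float], incoming: Sequence[float]) -> float:
    """Distance between the two means, in reference standard deviations."""
    reference_std = statistics.stdev(reference)
    if reference_std == 0:
        raise ValueError("reference sample has no variation")
    difference = statistics.mean(incoming) - statistics.mean(reference)
    return abs(difference) / reference_std


def has_shifted(
    reference: Sequence[float], incoming: Sequence[float], max_score: float = 2.0
) -> bool:
    """Return True when the incoming batch has drifted beyond the threshold."""
    return mean_shift_score(reference, incoming) > max_score


# Hypothetical values of one numeric input feature, for illustration only.
reference_sample = [10.0, 11.0, 9.0, 10.5, 9.5]
similar_batch = [10.2, 9.8, 10.1]
shifted_batch = [14.0, 15.0, 13.5]

alert_on_similar = has_shifted(reference_sample, similar_batch)
alert_on_shifted = has_shifted(reference_sample, shifted_batch)
```

A check like this is cheap enough to run on devices with limited resources, but it only flags that something changed. Deciding whether the cause is data quality, data content or the infrastructure still requires further investigation.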