
Concept Drift in Machine Learning: Model Drift and Adaptation (Part III)

Now that we have a better understanding of concept drift and drift detection mechanisms from the Part I and Part II blog posts, let's move on to the third and final part of this series: model drift and adaptation.

As time goes on, models stop performing as well as they used to. We usually judge their quality with metrics like accuracy or error rate, or even business-oriented measures such as click-through rate. But here's the thing: no model lasts forever. Some stay reliable for months or even years, while others need to be updated almost daily with new data to keep performing well.


Performance declining over time

Model Adaptation


As data evolves, our models must evolve with it to maintain their predictive power and reliability. So, let's dive into some key strategies for model adaptation in the context of model drift.


  • The first strategy, called incremental learning or online learning, is actually a whole subfield of machine learning. Instead of re-training the entire model from scratch, incremental learning focuses on updating the model gradually. Models in this category are designed to be updated on each new data instance as it arrives, without forgetting their past knowledge.


  • The most common strategy, though, is model retraining: re-training the model on the most recent data. This process helps the model adapt to sudden drifts.


  • Another robust strategy for dealing with model drift is ensemble learning. By combining multiple models, where each model is trained on a different subset or concept of the stream, the final model becomes more robust and adaptive to drifts.


  • Feature engineering can also affect performance. Sometimes, model drift is driven by changes in the data's features rather than in the underlying distribution. Regularly assessing and updating the feature engineering techniques can help keep the model relevant and accurate.
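To make the first strategy above concrete, here is a minimal sketch of incremental learning using scikit-learn's SGDClassifier and its partial_fit method; the synthetic stream and all parameter values are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(42)
model = SGDClassifier(random_state=42)
classes = np.array([0, 1])  # all labels must be declared on the first partial_fit call

# Simulate a data stream arriving in small batches.
for step in range(100):
    X_batch = rng.normal(size=(10, 4))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    # partial_fit updates the model incrementally instead of
    # re-fitting it from scratch on all data seen so far.
    model.partial_fit(X_batch, y_batch, classes=classes)

X_test = rng.normal(size=(200, 4))
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(int)
print(f"accuracy: {model.score(X_test, y_test):.2f}")
```

Libraries built specifically for streams (for example, river) follow the same update-per-instance pattern.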


Of course, other methodologies could also be included here, such as continuous monitoring and drift detection, which we covered in Parts I and II, or regularly scheduled model updates. In the case of neural networks, transfer learning can also be considered a powerful technique for model adaptation, because the model can be fine-tuned on new concepts without having to be trained from scratch.


Stability - Plasticity Dilemma


Since we are talking about adapting models to new concepts, we should also take into consideration the stability-plasticity dilemma. The stability-plasticity dilemma in stream learning is like a balancing act. On one hand, we want the model to be stable, meaning it doesn't change too quickly with every new piece of data. This stability is important because we don't want the model to be overly sensitive to minor fluctuations in the data.


On the other hand, we also want the model to be plastic, meaning it can adapt and learn from new data. If it's too stable, it might miss important changes or trends in the data. So, the dilemma is finding the right balance between stability and plasticity. We want the model to be stable enough to avoid overreacting to noise but plastic enough to capture meaningful changes in the data.
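A toy way to see this trade-off (not from the original discussion, just an illustration) is an exponentially weighted moving average over a stream: a single decay parameter directly controls how stable or plastic the estimate is.

```python
def ewma(stream, alpha):
    """Exponentially weighted moving average of a stream.

    alpha near 0 -> stable: slow to react, robust to noise;
    alpha near 1 -> plastic: fast to react, sensitive to noise.
    """
    estimate = None
    for x in stream:
        estimate = x if estimate is None else (1 - alpha) * estimate + alpha * x
    return estimate

# A stream whose mean jumps from 0 to 5 halfway through (a sudden drift).
stream = [0.0] * 50 + [5.0] * 50

print(ewma(stream, alpha=0.01))  # stable: still far from 5 after the drift
print(ewma(stream, alpha=0.5))   # plastic: has essentially reached 5
```

The same tension appears in real learners: a learning rate, window size, or forgetting factor plays the role of alpha.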

Stability - Plasticity dilemma

The solution for handling concept drift should strike a delicate balance between stability and adaptability. The approach where the model is updated in response to detected concept drift occurrences is often called "informed adaptation" or the "active approach." On the other hand, we have the "blind adaptation" strategy, also known as the "passive approach," in which the model is updated continuously as new data instances arrive, without explicitly detecting drift events.


Informed vs Blind Adaptation


Informed approaches are particularly well suited when we need a detailed description of both when and how drifts occur, because they actively identify concept drifts through triggering mechanisms, also known as drift detection mechanisms.


These methods are reactive: when they detect a concept drift, they can choose between options such as retraining the model completely from scratch, updating the model on a recent selection of data, using a window strategy like the ones we saw earlier, or applying various weighting approaches. The main components of the informed-methods process are the detection mechanism, the window strategy for retaining old and new data, and the strategy for updating the model once drift is detected.
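The informed loop can be sketched as follows. This is a deliberately simplified, self-contained example: the "model" is just the majority label of a sliding window, the detector is a hand-rolled error-rate threshold standing in for real detectors like DDM or ADWIN, and the stream is synthetic with a sudden drift at instance 500.

```python
import random
from collections import deque

class ErrorRateDetector:
    """Toy detector: flags drift when the recent error rate crosses a threshold."""
    def __init__(self, window=30, threshold=0.5):
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def update(self, error):  # error: 1 if the prediction was wrong, else 0
        self.recent.append(error)
        return (len(self.recent) == self.recent.maxlen
                and sum(self.recent) / len(self.recent) > self.threshold)

def train(data):
    """Toy 'model': predict the majority label of the training window."""
    labels = [y for _, y in data]
    return max(set(labels), key=labels.count)

random.seed(0)
# Sudden drift: the dominant label flips from 0 to 1 at instance 500.
stream = [(i, 0 if random.random() < 0.9 else 1) for i in range(500)]
stream += [(i, 1 if random.random() < 0.9 else 0) for i in range(500, 1000)]

window = deque(maxlen=100)            # recent data retained for retraining
window.extend(stream[:100])
model = train(window)
detector = ErrorRateDetector()
drift_points = []

for i, (x, y) in enumerate(stream[100:], start=100):
    error = int(model != y)           # predict, then compare with the truth
    window.append((x, y))
    if detector.update(error):        # informed: adapt only when drift is flagged
        model = train(window)         # retrain on the recent window
        detector = ErrorRateDetector()
        drift_points.append(i)

print(drift_points)  # detections should cluster shortly after i = 500
```

The three components named above are all visible here: the detection mechanism (ErrorRateDetector), the window strategy (the deque of recent instances), and the update strategy (retraining on that window when drift fires).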


On the contrary, blind methods implicitly and continuously adapt the learner to the current concept, operating without the need for explicit drift detection. These methods systematically discard outdated concepts at a fixed rate, irrespective of whether any change has occurred, which makes them ideal for gradual or incremental drifts. Incremental learning, the field of machine learning concerned with updating models instance by instance, fits in this category. Some of the strategies used in blind adaptation techniques are sliding windows, instance weighting, and ensemble learning.


Conclusions


Machine learning models inevitably experience performance degradation over time, often due to shifts in data distribution or changes in input feature-target variable relationships. This phenomenon, known as model drift, requires strategies for maintaining model accuracy and reliability. Methods such as incremental learning, model retraining, ensemble learning, and feature engineering are effective approaches to address model drift.


When we talk about online systems, we also have to consider the stability-plasticity dilemma, which is crucial for balancing a model's ability to remain stable while adapting to new data. In practice, a model must balance retaining previously learned information (stability) against adapting to new information (plasticity) without compromising overall performance. By employing informed or blind adaptation methods, models can be continuously updated to retain their predictive power and relevance. AI developers have to understand the application domain before selecting one strategy over the other.



** If you are interested in other ML use-cases, please contact me using the form (and also include a publicly available dataset for this case, I'm always curious to explore new problems).
