Model and Data Drift

Miscellaneous

Model drift and data drift refer to changes in a deployed machine learning model’s environment, either in its input data or in the relationship it was trained to capture, that degrade the model’s performance over time.

Drift is common in deployed machine learning models: performance degrades over time because the model does not adapt to new patterns in the data. Monitoring for drift and addressing it is crucial to keeping the model accurate and reliable.

Model drift, or concept drift, occurs when the relationship between input features and target labels changes over time. This means that even if the distribution of the input data remains the same, the way the model should interpret it has changed. It can happen for various reasons, such as changes in user behavior, or values that were once extraordinary and highly predictive becoming commonplace over time.
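A minimal sketch of this idea, using synthetic data only: the inputs are drawn from the same uniform distribution before and after the change, but the rule that assigns labels moves from a threshold of 0.5 to 0.7. The names `make_data`, `accuracy`, and `model`, and both thresholds, are illustrative assumptions rather than part of any particular library.

```python
import random

def make_data(n, threshold, rng):
    """Draw inputs uniformly from [0, 1); the label is 1 when x exceeds `threshold`."""
    xs = [rng.random() for _ in range(n)]
    ys = [1 if x > threshold else 0 for x in xs]
    return xs, ys

def accuracy(predict, xs, ys):
    """Fraction of inputs for which the prediction equals the label."""
    correct = sum(1 for x, y in zip(xs, ys) if predict(x) == y)
    return correct / len(xs)

def model(x):
    """Stand-in for a trained model: a fixed rule matching the original concept."""
    return 1 if x > 0.5 else 0

rng = random.Random(0)

# Same input distribution in both periods; only the labelling rule changes.
xs_old, ys_old = make_data(10_000, threshold=0.5, rng=rng)
xs_new, ys_new = make_data(10_000, threshold=0.7, rng=rng)

acc_old = accuracy(model, xs_old, ys_old)
acc_new = accuracy(model, xs_new, ys_new)
print(f"accuracy before drift: {acc_old:.3f}")
print(f"accuracy after drift:  {acc_new:.3f}")
```

The model is perfect on the original concept and loses roughly the share of inputs that fall between the two thresholds afterwards, even though a check on the input distribution alone would show nothing unusual.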

Data drift occurs when the statistical properties of the input data change over time. Data drift is often categorized as sudden (e.g. new laws, catastrophic events), gradual (e.g. user preferences changing over time), or recurring (e.g. fashion trends or seasonal effects like Christmas shopping).
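One way to put a number on a change in a single numeric feature is the Population Stability Index (PSI), which compares binned proportions of a reference sample and a current sample. The sketch below is an illustration under assumptions: the function name `psi`, the equal-width binning over the reference range, the `eps` floor for empty bins, and the synthetic Gaussian samples are all choices made for this example, and it assumes the reference sample is not constant.

```python
import math
import random

def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins  # assumes hi > lo

    def proportions(sample):
        counts = [0] * bins
        for v in sample:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm below stays defined.
        return [max(c / len(sample), eps) for c in counts]

    p = proportions(reference)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

rng = random.Random(42)
reference = [rng.gauss(0, 1) for _ in range(5000)]
same = [rng.gauss(0, 1) for _ in range(5000)]      # no drift
shifted = [rng.gauss(1, 1) for _ in range(5000)]   # mean moved by one standard deviation

psi_same = psi(reference, same)
psi_shifted = psi(reference, shifted)
print(f"PSI, same distribution:    {psi_same:.4f}")
print(f"PSI, shifted distribution: {psi_shifted:.4f}")
```

A commonly quoted rule of thumb treats values below about 0.1 as stable and values above about 0.25 as a substantial shift, but suitable thresholds depend on the feature and the bin count.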

Drift detection involves monitoring the model’s performance and the data distribution to identify any significant changes. Techniques such as statistical tests, control charts, and machine learning-based methods can be used for drift detection.
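As a rough illustration of the control-chart approach, the sketch below monitors the error rate of consecutive windows of predictions and raises an alarm when a window exceeds an upper control limit derived from a baseline error rate. The function names `control_limit` and `first_alarm`, the window size of 200, the three-sigma limit, and the deterministic error stream are assumptions made for this example.

```python
import math

def control_limit(baseline_error_rate, window_size, sigmas=3.0):
    """Upper control limit for the error rate of one window (p-chart style)."""
    p = baseline_error_rate
    return p + sigmas * math.sqrt(p * (1 - p) / window_size)

def first_alarm(errors, baseline_error_rate, window_size=200, sigmas=3.0):
    """Index of the first full window whose error rate exceeds the limit, or None."""
    limit = control_limit(baseline_error_rate, window_size, sigmas)
    for w in range(len(errors) // window_size):
        window = errors[w * window_size:(w + 1) * window_size]
        if sum(window) / window_size > limit:
            return w
    return None

# Deterministic stand-in for a stream of 0/1 prediction errors:
# every 10th prediction is wrong for the first 2000, then every 4th is wrong.
stream = [1 if i % 10 == 0 else 0 for i in range(2000)]
stream += [1 if i % 4 == 0 else 0 for i in range(2000)]

alarm = first_alarm(stream, baseline_error_rate=0.10)
print(alarm)  # prints 10: the first window after the error rate rises to 25%
```

This variant needs ground-truth labels to compute errors, so it detects drift only once labels arrive; distribution checks on the inputs can run earlier but will not catch a pure change in the input-to-label relationship.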

Understanding and addressing drift is essential for maintaining the performance and reliability of machine learning models in production.

Related terms
Model Drift, Concept Drift, Data Drift, Drift Detection, Online Learning