MLOps Fundamentals: The Lifecycle of Machine Learning¶

Estimated time to read: 3 minutes

Machine Learning Operations (MLOps) is the application of DevOps principles to the machine learning lifecycle. While traditional DevOps versions and deploys code and binaries, MLOps must handle Code + Data + Models — three artefacts that change independently and must stay in sync.

This guide covers the foundational stages of a production-grade MLOps environment.


1. MLOps vs. Traditional DevOps¶

| Feature | DevOps | MLOps |
| --- | --- | --- |
| Primary Unit | Code / Binary | Code + Data + Model |
| Success Metric | Uptime, Latency, Throughput | Accuracy, Precision, Drift |
| Testing | Unit & Integration tests | Model validation, Data quality tests |
| Monitoring | System health | Prediction drift, Data drift |

2. The MLOps Lifecycle¶

```mermaid
graph LR
    A[Data Prep] --> B[Model Training]
    B --> C[Model Validation]
    C --> D[Model Registry]
    D --> E[Deployment / Serving]
    E --> F[Monitoring]
    F --> G{Drift Detected?}
    G -- Yes --> A
    G -- No --> F
```

Stage 1: Data Preparation & Feature Engineering¶

ML models are only as good as the data they are trained on. This stage involves cleaning, transforming, and tracking data versions.

  • Feature Store: A centralized repository for sharing and managing features across training and serving. See Feature Engineering, DevOps & Observability.
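The key property of a feature store is that training and serving retrieve features through the same lookup path, which prevents training/serving skew. A minimal sketch (the `InMemoryFeatureStore` class and its entity IDs are illustrative, not a real feature-store API):

```python
class InMemoryFeatureStore:
    """Toy feature store: one lookup path shared by training and serving."""

    def __init__(self):
        self._features = {}  # (entity_id, feature_name) -> value

    def write(self, entity_id, feature_name, value):
        self._features[(entity_id, feature_name)] = value

    def get_features(self, entity_id, feature_names):
        # The same retrieval call builds training sets and fetches
        # features at prediction time, avoiding skew between the two.
        return {name: self._features.get((entity_id, name))
                for name in feature_names}

store = InMemoryFeatureStore()
store.write("user_42", "avg_order_value", 31.5)
store.write("user_42", "orders_last_30d", 4)

row = store.get_features("user_42", ["avg_order_value", "orders_last_30d"])
```

A production feature store (e.g., Feast or Vertex AI Feature Store) adds point-in-time correctness and low-latency online serving on top of this idea.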

Stage 2: Model Training & Experiment Tracking¶

Instead of just committing code, you are tracking experiments: hyper-parameters, data versions, and training results.

  • Tools: MLflow, Weights & Biases, Kubeflow.
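The record those tools keep per run boils down to a small structured document. A hand-rolled sketch of what gets logged (the `log_run` helper is hypothetical; real trackers like MLflow add UIs, comparison, and artefact storage):

```python
import json
import time
import uuid
from pathlib import Path

def log_run(params, metrics, data_version, run_dir="runs"):
    """Record one training run: hyper-parameters, data version, results."""
    record = {
        "run_id": uuid.uuid4().hex[:8],
        "timestamp": time.time(),
        "params": params,              # e.g. learning rate, tree depth
        "data_version": data_version,  # which dataset snapshot was used
        "metrics": metrics,            # e.g. validation accuracy
    }
    path = Path(run_dir)
    path.mkdir(exist_ok=True)
    (path / f"{record['run_id']}.json").write_text(json.dumps(record, indent=2))
    return record

run = log_run({"learning_rate": 0.1, "max_depth": 6},
              {"val_accuracy": 0.93},
              data_version="2024-05-01")
```

The point is the triple: parameters + data version + metrics. Without all three, a run cannot be reproduced or fairly compared.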

Stage 3: Model Registry¶

A versioned store for your trained models. You never deploy a "loose" file from a developer's laptop. You deploy a versioned artefact from the registry.
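Conceptually, a registry is a versioned map with a promotion workflow. A minimal sketch (the `ModelRegistry` class and its stage names are illustrative assumptions, not a specific product's API):

```python
class ModelRegistry:
    """Toy registry: versioned artefacts with a promotion stage."""

    def __init__(self):
        self._models = {}  # model name -> list of version entries

    def register(self, name, artefact):
        versions = self._models.setdefault(name, [])
        versions.append({"version": len(versions) + 1,
                         "artefact": artefact,
                         "stage": "staging"})
        return versions[-1]["version"]

    def promote(self, name, version):
        # Exactly one version of a model serves production traffic.
        for entry in self._models[name]:
            if entry["stage"] == "production":
                entry["stage"] = "archived"
        self._models[name][version - 1]["stage"] = "production"

    def production_model(self, name):
        for entry in self._models[name]:
            if entry["stage"] == "production":
                return entry["artefact"]
        return None

registry = ModelRegistry()
registry.register("churn", artefact="churn-v1.bin")
registry.register("churn", artefact="churn-v2.bin")
registry.promote("churn", version=2)
```

Deployment tooling then asks the registry for the current production artefact instead of trusting whatever file happens to be lying around.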

Stage 4: Deployment & Serving¶

Models can be deployed as:

  • Batch Prediction: Scoring millions of rows on a schedule (e.g., in BigQuery ML).
  • Real-time Serving: An API endpoint that returns a prediction in milliseconds (e.g., via Vertex AI or Seldon).
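Both modes wrap the same model behind different entry points. A sketch of the two paths (the stub `predict` rule and feature names are placeholders for a real trained model):

```python
import json

def predict(model, features):
    # Stand-in for a real trained model: any callable scoring rule works.
    return 1 if features["orders_last_30d"] > 2 else 0

def batch_score(model, rows):
    # Batch prediction: score many rows in one scheduled job.
    return [predict(model, row) for row in rows]

def handle_request(model, request_json):
    # Real-time serving: one request in, one low-latency prediction out.
    features = json.loads(request_json)
    return {"prediction": predict(model, features)}

model = None  # the stub ignores it; a real system loads this from the registry
scores = batch_score(model, [{"orders_last_30d": 5}, {"orders_last_30d": 1}])
response = handle_request(model, '{"orders_last_30d": 5}')
```

The design choice is latency versus throughput: batch amortizes model loading over millions of rows, while real-time serving keeps the model resident in memory to answer single requests quickly.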

Stage 5: Monitoring & Retraining¶

You must monitor for Drift:

  • Data Drift: The input data distribution changes (e.g., a new sensor version starts sending different values).
  • Model Drift: The model's performance degrades in production because the real-world relationship between inputs and outputs has changed.
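Data drift can be caught without labels by comparing live feature statistics against the training-time baseline. A simple mean-shift check (the `mean_shift_alert` helper and its 3-standard-error threshold are illustrative; production systems use tests like PSI or Kolmogorov–Smirnov):

```python
import statistics

def mean_shift_alert(baseline, live, threshold=3.0):
    """Flag drift when the live mean sits far from the training-time
    mean, measured in standard errors of the baseline sample."""
    mu = statistics.mean(baseline)
    se = statistics.stdev(baseline) / len(baseline) ** 0.5
    z = abs(statistics.mean(live) - mu) / se
    return z > threshold

baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]  # training data
stable   = [10.1, 9.9, 10.0, 10.2]   # live traffic, same distribution
shifted  = [12.5, 12.8, 12.4, 12.6]  # e.g. a new sensor version
```

Model drift, by contrast, needs ground-truth labels (which often arrive late), which is why data-drift checks are usually the first alerting line.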


3. MLOps Maturity Levels¶

The OpsAtScale Data Maturity Assessment maps these capabilities.

🔴 Level 0: Manual (No MLOps)¶

  • Manual data prep and training.
  • Models passed as files (e.g., .pkl) via email or Slack.
  • No monitoring of performance.

🟡 Level 1: Automated Pipeline¶

  • Code to train is versioned.
  • Automated pipeline triggers training when new data arrives.
  • Model validation gates before deployment.

🔵 Level 2: CI/CD/CT (Continuous Training)¶

  • CT: The system automatically retrains the model when performance drifts or new data is available.
  • A/B Testing: New models are compared against "champion" models before full rollout. See A/B Testing with BigQuery ML.
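The champion/challenger decision at Level 2 reduces to a gated comparison. A sketch (the `choose_champion` helper, its metric name, and the +1-point margin are assumptions for illustration):

```python
def choose_champion(champion_metrics, challenger_metrics, min_gain=0.01):
    """Promote the challenger only if it beats the current champion by a
    margin large enough to matter (here: one point of accuracy)."""
    if challenger_metrics["accuracy"] >= champion_metrics["accuracy"] + min_gain:
        return "challenger"
    return "champion"

winner = choose_champion({"accuracy": 0.91}, {"accuracy": 0.93})
```

In a real CT pipeline this gate runs automatically after each retraining, and a losing challenger is archived rather than deployed.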

4. Summary Checklist¶

  • Versioning: Are you versioning your data samples as well as your code?
  • Registry: Are you using a formal Model Registry for all deployments?
  • Monitoring: Do you have alerts for when prediction accuracy drops below a threshold?
  • Drift: Do you monitor the statistical distribution of your input features?
  • Reproducibility: Can you rebuild a model from 6 months ago with the exact same data and code?