The purpose of MLOps is to address the challenges that arise in machine-learning projects. Typical project types:
- New product/capability
- Automation/assistance of existing manual tasks
- Replacement of existing ML system
-
Scoping
- Define Project
- Decide on key metrics
- Model Accuracy
- Latency
- Throughput (QPS, queries per second)
- Cloud/Edge/Browser computing?
- Cloud
- flexible computing power
- Edge
- Lower latency
- Offline processing (keeps working through network incidents)
- Real-time/Batch
- Logging
- Security
- Privacy
- e.g. patient record
- Estimate required resources and timeline
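The latency and throughput metrics above can be estimated with a small benchmark before committing to a cloud or edge deployment. A minimal sketch, where `predict` and `requests` are hypothetical stand-ins for the deployed model and a sample workload:

```python
import statistics
import time

def measure_serving_metrics(predict, requests):
    """Measure per-request latency and overall throughput (QPS).

    `predict` and `requests` are hypothetical stand-ins for the
    deployed model and a sample workload."""
    latencies = []
    start = time.perf_counter()
    for req in requests:
        t0 = time.perf_counter()
        predict(req)                      # one model call per request
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "p50_latency_s": statistics.median(latencies),
        "p95_latency_s": sorted(latencies)[int(0.95 * (len(latencies) - 1))],
        "throughput_qps": len(requests) / elapsed,
    }

# Benchmark a dummy model on 100 dummy requests.
metrics = measure_serving_metrics(lambda x: x * 2, list(range(100)))
```

Comparing the measured p95 latency and QPS against the targets agreed on during scoping shows early whether the chosen deployment (cloud, edge, or browser) can meet them.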
-
Data
- Defining the data and establishing the baseline
- Labeling & Organizing the data
- Data label consistency
- e.g. Audio transcription
- Um, today's weather/Um...today's weather/today's weather?
- Volume normalization
- silence before/after each audio clip?
- Validating the data
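Label consistency can often be enforced mechanically once the team agrees on a convention. A minimal sketch that normalizes audio-transcription labels so the three variants above map to one label; the filler-word list is an assumption:

```python
import re

# Filler words to strip -- an assumed convention; agree on one list
# with all labelers.
FILLERS = {"um", "uh"}

def normalize_transcript(text):
    """Map transcript variants to one labeling convention:
    lowercase, drop punctuation, strip filler words, collapse spaces."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return " ".join(t for t in tokens if t not in FILLERS)

# All three variants from the notes above collapse to one label:
# "Um, today's weather" / "Um...today's weather" / "today's weather?"
```

Running every incoming label through one normalizer is usually cheaper than re-briefing labelers, and it makes inconsistencies measurable.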
-
Modeling, Model Development
- Two types of AI/ML development
- Model centric (tends to be research and academia)
- Data centric (tends to be production system)
- Challenges in model development
- Doing well on training data set
- Doing well on dev/test data set
- Doing well on business metrics/project goals
- Inputs
- Code (algorithm/model)
- Hyperparameters
- Data
- In research/academia, adjusting the code/hyperparameters is relatively emphasized.
- In production system, focus is more on the data
-
- Performing error analysis
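Error analysis is often done by tagging misclassified examples and counting errors per tag, which shows where extra data or label fixes would help most. A minimal sketch; the tags and examples are hypothetical:

```python
from collections import Counter

def error_counts_by_tag(examples):
    """Count misclassifications per tag.

    `examples` is a list of dicts with 'label', 'prediction',
    and 'tags' keys (a hypothetical record format)."""
    counts = Counter()
    for ex in examples:
        if ex["prediction"] != ex["label"]:
            counts.update(ex["tags"])     # attribute the error to every tag
    return counts

examples = [
    {"label": "scratch", "prediction": "ok", "tags": ["dark", "blurry"]},
    {"label": "ok", "prediction": "ok", "tags": ["dark"]},
    {"label": "scratch", "prediction": "ok", "tags": ["dark"]},
]
# 'dark' accounts for 2 of the errors, 'blurry' for 1 -- so collecting
# more well-lit training data would be the first thing to try.
```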
-
Deployment
- Deploying the model to production
- Gradual ramp-up of traffic
- Rollback possibility
- ML System Deployment patterns
- Shadow deployment
- Gradual replacement of the manual process by running the ML system in parallel
- e.g. You've built a new system for making loan-approval decisions. For now, its output is not used in any decision-making process, and a human loan officer is solely responsible for deciding which loans to approve, but the system's output is logged for analysis.
- Canary deployment
- A small portion of the traffic is handled by the ML system first, to avoid significant negative impact
- Blue-green deployment
- Old (Blue) / New (Green)
- Keep the Blue version running and switch traffic to the Green version (gradually or all at once)
- Quick rollback to the previous version is possible because the Blue version is kept running
- The recommended approach is to gradually automate the manual process; aim for full automation only after a certain maturity level is reached.
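A canary rollout can be sketched by hashing a request id to a stable bucket, so a small, deterministic fraction of traffic reaches the new model while the rest stays on the old one. The 5% fraction and the blue/green model names are assumptions:

```python
import hashlib

# Fraction of traffic sent to the new (Green) model -- an assumed
# starting point; tune it to your risk tolerance.
CANARY_FRACTION = 0.05

def route(request_id, old_model="blue", new_model="green"):
    """Deterministically route a request: the same id always hits
    the same model, so each user gets a consistent experience."""
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # stable value in [0, 1)
    return new_model if bucket < CANARY_FRACTION else old_model
```

Raising `CANARY_FRACTION` step by step, while watching the monitoring metrics below, implements the gradual ramp-up with rollback that these notes recommend.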
-
Monitoring
- Spotting data/concept drift
- e.g. Audio recognition ML project
- Users get older and their voices change => emotion detection may become inaccurate
- Spotting issues in data pipeline
- Pipeline monitoring
- If multiple ML models are deployed as microservices, the output of ML model 1 feeds into and affects the output of ML model 2
- Methods
- Use dashboards (most common)
- Software Metrics
- e.g. Traffic load monitoring
- Input metrics
- e.g. avg. audio length, image brightness, number of missing values
- Output metrics
- e.g. frequency of recognition errors, inaccurate recommendations
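An input metric such as average image brightness can be watched with a simple threshold on how far it shifts from the training-time baseline. A minimal sketch; the z-score threshold of 3.0 is an assumed starting point:

```python
import statistics

def drifted(baseline, window, z_threshold=3.0):
    """Flag drift when the production window's mean moves more than
    `z_threshold` baseline standard deviations from the baseline mean.
    The threshold of 3.0 is an assumed starting point."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    z = abs(statistics.mean(window) - mu) / sigma
    return z > z_threshold

# e.g. average image brightness per batch, recorded at training time:
baseline = [10.0, 10.5, 9.5, 10.2, 9.8] * 4
```

Wiring a check like this into the dashboard turns the input metrics above from something you eyeball into something that can page you.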
-
Maintaining
- Improving/fixing the model based on the new data coming in from the production environment
- Configuration
- Data Collection
- Feature Extraction
- Data Verification
- Machine Resource Management
- ML Code
- Analysis Tools
- Process Management Tools
- Serving Infrastructure
- Monitoring
Ref. D. Sculley et al., NIPS 2015: Hidden Technical Debt in Machine Learning Systems
Notes
- Data drift is, in short, a change in the input data distribution (the tendency of X changes).
- Concept drift: the X -> Y mapping changes; the underlying relationship changes.
- e.g. The lighting in a room changed, so the photos taken are overall brighter, which hides the scratches on the objects.
- Data Provenance

References
- https://towardsdatascience.com/machine-learning-in-production-why-you-should-care-about-data-and-concept-drift-d96d0bc907fb
- https://christophergs.com/machine%20learning/2020/03/14/how-to-monitor-machine-learning-models/
- https://youtu.be/06-AZXmwHjo
- http://arxiv.org/abs/2011.09926
- https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf