MLflow is an MLOps tool that enables data scientists to quickly productionize their machine learning projects. To achieve this, MLflow has four major components: Tracking, Projects, Models, and Registry. MLflow lets you train, reuse, and deploy models with any library and package them into reproducible steps.

MLflow is designed to work with any machine learning library and requires minimal changes to integrate into an existing codebase.

In this video, learn about common pain points for machine learning developers, such as tracking experiments, reproducibility, deployment tooling, and model versioning.

Machine learning suffers from a reproducibility crisis.

Deterministic machine learning is not only incredibly important for academia to verify research papers, but also for developers in enterprise scenarios.

Here’s a great video on how to address this shortcoming.

Because non-deterministic ML has many causes, especially when GPUs are in play, I conducted several experiments and identified the causes and their corresponding solutions (where one exists).
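As a flavor of the kind of fix involved, here is a minimal sketch of seed pinning, the most common first step toward deterministic runs. Only the standard-library parts are executed; the framework-specific lines (NumPy, PyTorch, CUDA) are shown as comments because they depend on your stack, and this helper is illustrative rather than taken from the video.

```python
import os
import random

def seed_everything(seed: int = 42) -> None:
    """Pin the RNG seeds that most often cause run-to-run drift."""
    random.seed(seed)                         # Python's own Mersenne Twister
    # Note: PYTHONHASHSEED only affects hashing if set *before* the
    # interpreter starts; setting it here documents intent for subprocesses.
    os.environ["PYTHONHASHSEED"] = str(seed)
    # numpy:  np.random.seed(seed)
    # torch:  torch.manual_seed(seed); torch.use_deterministic_algorithms(True)
    # CUDA:   os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

seed_everything(7)
a = [random.random() for _ in range(3)]
seed_everything(7)
b = [random.random() for _ in range(3)]
assert a == b  # identical draws after reseeding
```

GPU kernels add further sources of non-determinism (e.g. non-deterministic reductions) that seeding alone does not cover, which is what makes the full investigation worthwhile.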

In the last several months, MLflow has introduced significant platform enhancements that simplify machine learning lifecycle management.

Expanded autologging capabilities, including a new integration with scikit-learn, have streamlined the instrumentation and experimentation process in MLflow Tracking.

Additionally, schema management functionality has been incorporated into MLflow Models, enabling users to seamlessly inspect and control model inference APIs for batch and real-time scoring. 

Taking deep learning models to production and doing so reliably is one of the next frontiers of MLOps.

With the advent of Redis modules and the availability of C APIs for the major deep learning frameworks, it is now possible to turn Redis into a reliable runtime for deep learning workloads, providing a simple solution for a model serving microservice.

RedisAI ships with several cool features, such as support for multiple frameworks, CPU and GPU backends, auto-batching, and DAG execution, and will soon add automatic monitoring capabilities. In this talk, we’ll explore some of these RedisAI features and see how easy it is to integrate MLflow and RedisAI to build an efficient productionization pipeline.

[Originally aired as part of the Data+AI Online Meetup and Bay Area MLflow meetup]

Learn how Azure ML supports open source ML frameworks and MLflow.

Take a walk through scikit-learn and PyTorch examples that show the built-in support for ML frameworks.


PyTorch, the popular open-source ML framework, has continued to evolve rapidly since the introduction of PyTorch 1.0, which brought an accelerated workflow from research to production.

In this video, take a deep dive into some of the most important new advances, including model-parallel distributed training, model optimization, and on-device deployment, as well as the latest libraries that support production-scale deployment working in concert with MLflow.

Solving a data science problem is about more than making a model.

It entails data cleaning, exploration, modeling and tuning, production deployment, and workflows governing each of these steps.

Databricks has a great video on how MLflow fits into the data science process.

In this simple example, we’ll take a look at how health data can be used to predict life expectancy. It starts with data engineering in Apache Spark, followed by data exploration and by model tuning and logging with Hyperopt and MLflow. It continues with examples of how the Model Registry governs model promotion, and of simple deployment to production with MLflow as a job or dashboard.

Databricks, the company behind the commercial development of Apache Spark, is placing its machine learning lifecycle project MLflow under the stewardship of the Linux Foundation.

MLflow provides a programmatic way to deal with all the pieces of a machine learning project through all its phases: construction, training, fine-tuning, deployment, management, and revision. It tracks and manages the datasets, model instances, model parameters, and algorithms used in machine learning projects, so they can be versioned, stored in a central repository, and repackaged easily for reuse by other data scientists.

Databricks just livestreamed this tech talk earlier today.

Developers and data scientists around the world have developed tens of thousands of open source projects to help track, understand, and address the spread of COVID-19. Given the sheer volume, finding a project to contribute to can prove challenging. To make this easier, we built a recommendation system that highlights projects based on the programming languages and keywords a user enters.

This talk will go through the full cycle of implementing this system: gathering data, building/tracking models, deploying the model, and creating a UI to utilize the model.

Databricks just posted part 3 of a 3-part online technical workshop series on Managing the Complete Machine Learning Lifecycle with MLflow. If you’re interested in learning about machine learning and MLflow, this workshop series is for you!


This workshop is an introduction to MLflow. Machine Learning (ML) development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models.

To solve these challenges, the open source MLflow project simplifies the entire ML lifecycle. MLflow introduces simple abstractions to package reproducible projects, track results, encapsulate models for use with many existing tools, and share models through a central repository, accelerating the ML lifecycle for organizations of any size.

Related Links: