Introducing MLflow for End-to-End Machine Learning on Databricks

Solving a data science problem is about more than making a model. It entails data cleaning, exploration, modeling and tuning, production deployment, and workflows governing...
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake

Change Data Capture (CDC) is a typical use case in Real-Time Data Warehousing. It tracks the data change log (binlog) of a relational database (OLTP),...
Delta Lake from a Data Engineer’s Perspective

In this session, take a walk through the daily struggles of a data engineer as we cover what is truly needed to...
Deep Dive into the New Features of Apache Spark 3.0

Databricks provides an in-depth look at the new features of Spark 3.0.
Introducing Apache Spark 3.0

Here’s a keynote from Matei Zaharia, the original creator of Apache Spark, that contains a retrospective of the last 10 years and a look forward to...
End-to-End Deep Learning with Horovod on Apache Spark

Databricks explores the power of Horovod and what it means for data scientists and AI engineers. The newly introduced Horovod Spark Estimator API enables TensorFlow...
MLflow is now a Linux Foundation project

Databricks, the company behind the commercial development of Apache Spark, is placing its machine learning lifecycle project MLflow under the stewardship of the Linux Foundation....
Introduction to Azure Databricks

Ayman El-Ghazali recently presented this Introduction to Databricks from the perspective of a SQL DBA at the NoVA SQL Users Group. Code available at: https://github.com/thesqlpro/blog This is...
A COVID19 Story with Azure Databricks

A colleague of mine, Ayman El-Ghazali, worked through data from the state of Maryland. Code is available on GitHub. I chose not to source my data directly...
Slowly Changing Dimensions (SCD) Type 2

Databricks recently streamed this tech chat on SCD, or Slowly Changing Dimensions. We will discuss a popular online analytical processing (OLAP) fundamental - slowly changing...