Optimizing Geospatial Queries with Dynamic File Pruning

One of the most significant benefits provided by Databricks Delta is the ability to use z-ordering and dynamic file pruning to significantly reduce the amount...
The Beauty of Data Privacy Engineering

Privacy engineering is an emerging discipline within the software and data engineering domains aiming to provide methodologies, tools, and techniques such that the engineered systems...
Spark SQL Beyond the Official Documentation

This video with David Vrba focuses on some internal features of Spark SQL which are not well described in official documentation with a strong emphasis...
December 2020 Databricks Customer Newsletter

Need a quick video to keep up on all the recent happenings with Databricks? Look no further than this December 2020 newsletter the team put...
Michael Armbrust Demystifies Delta Lakes

On the latest episode of Data Brew, Denny Lee talks to Michael Armbrust about Delta Lake. Delta Lake is an open source storage layer that...
How to Implement a GAN on Databricks

Here’s a brilliant Lightning talk from Data + AI Summit 2020 by Dr. Evan Eames. We implemented a pix2pix Generative Adversarial Network (GAN) on Databricks...
Data Quality Testing in the Medallion Architecture with PyTest and PySpark

Here’s a great Lightning talk from Data + AI Summit 2020 by Carter Kilgour on “Why data quality is especially important in the medallion architecture,...
Delta Lakehouse Data Profiler and SQL Analytics Demo

Coming from a data warehousing and BI background, Franco Patano wanted to have a catalogue of the Lakehouse, including schema and profiling statistics. He created...
Comparing Azure Synapse, Snowflake and Databricks for Common Data Workloads

In this video, Chris Seferlis describes some of the most common data workloads being deployed on Azure and which of the 3 major...
BI to Lakehouse Round 3: Community Questions Answered

Considering shifting gears into Spark Data Engineering? We have another fun session with Simon Whiteley and Denny Lee as they answer your questions from their...