Spark vs. Tez: What's the Difference?

At work recently, a question came up about whether Spark or Tez is better. Here's an article with some interesting perspectives. On paper, Spark...
What’s in Hive 3.0?

“What is new in Apache Hive 3.0?” from DataWorks Summit
How Big is Big Data?

Here’s an interesting look at how big big data is from Computerphile, for those not satisfied with my “Costco Test for Big Data.”
Code-free modern data warehouse using Azure SQL DW and Data Factory

Gaurav Malhotra joins Scott Hanselman to show how to build a modern data warehouse solution, from ingress of structured, unstructured, and semi-structured data to code-free data...
Databricks open-sources Delta Lake to make data lakes more reliable

Databricks announced that it has open-sourced Delta Lake, a storage layer that makes it easier to ensure data integrity as new data flows into an enterprise’s...
Lambda Architecture in the Cloud with Azure Databricks

In this talk, Andrei Varanoch demonstrates a blueprint for a Lambda Architecture implementation in Microsoft Azure, with Azure Databricks, a PaaS Spark offering...
The What, Why & How of Azure Databricks

In this video, Dinesh Priyankara explains Azure Databricks, why and where it should be used, and how to get started with it. It speaks about modern...
Ingesting data with Azure Databricks & Azure SQL Data Warehouse

In this video, learn how to use Azure Databricks to ingest data into Azure SQL Data Warehouse to speed up your data pipeline and get more...
Leveraging HPC to Accelerate Virtual Drug Screening

Here's another story of how big data, high-performance computing, and TensorFlow are reshaping medicine as we know it. Virtual drug screening has the...
LinkedIn Open Sources a Tool that Formats Big Data for TensorFlow

LinkedIn has just open-sourced a tool it created to convert Apache Spark-based big data into a format that can be readily consumed by TensorFlow....