Chris Seferlis discusses one of the newer, lesser-known data services in Azure: Data Explorer.

If you’re looking to run extremely fast queries over large sets of log and IoT data, this may be the right tool for you. He also explains that it’s not a replacement for Azure Synapse or Azure Databricks, but works nicely alongside them in the overall architecture of the Azure Data Platform.
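For a sense of what querying Data Explorer from code might look like, here is a minimal Python sketch using the azure-kusto-data client library; the cluster URL, database name, and table name are hypothetical placeholders, not values from the video.

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

# Hypothetical cluster and database names, for illustration only.
cluster = "https://mycluster.westus.kusto.windows.net"
kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(cluster)
client = KustoClient(kcsb)

# A simple KQL query over an assumed IoT telemetry table:
# count events per device over the last hour.
query = """
DeviceTelemetry
| where Timestamp > ago(1h)
| summarize Events = count() by DeviceId
| top 10 by Events desc
"""

response = client.execute("iot-logs", query)
for row in response.primary_results[0]:
    print(row["DeviceId"], row["Events"])
```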

In this video, Chris Seferlis discusses some of the reasons you might want to choose Azure Data Factory over Azure Synapse Workspaces with Synapse Studio.

Even though many of the features overlap, there are still scenarios where I’d use ADF and pass on the additional features of Synapse. Let me know your thoughts below; please like, comment, share, and follow me on Twitter: @bizdataviz

In this video, Chris Seferlis continues discussing the Modern Data Platform in Azure with Part 3: Data Processing.

Tools Discussed:

Gaurav Malhotra joins Scott Hanselman to show how wrangling data flows work in Azure Data Factory.

This provides a code-free, serverless environment that simplifies data preparation in the cloud and scales to any data size with no infrastructure management required.

It uses the industry-leading Power Query data preparation technology (also used in Power Platform dataflows, Excel, and Power BI) to prepare and shape the data. Built to handle all the complexities and scale challenges of big data integration, wrangling data flows use Apache Spark execution to help you easily prepare data at scale.
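Although wrangling data flows are authored visually rather than in code, the pipelines that contain them can still be operated programmatically. Below is a minimal sketch assuming the azure-mgmt-datafactory Python SDK and a hypothetical pipeline named WranglePipeline; the subscription, resource group, and factory names are placeholders.

```python
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Placeholder identifiers -- substitute your own subscription and factory.
subscription_id = "<subscription-id>"
resource_group = "rg-data-platform"
factory_name = "adf-demo"
pipeline_name = "WranglePipeline"  # assumed to contain a wrangling data flow

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Kick off a run; the Power Query work itself executes on managed Spark.
run = adf_client.pipelines.create_run(resource_group, factory_name, pipeline_name)

# Poll until the run reaches a terminal state.
status = "InProgress"
while status in ("Queued", "InProgress"):
    time.sleep(30)
    status = adf_client.pipeline_runs.get(resource_group, factory_name, run.run_id).status

print(f"Pipeline run {run.run_id} finished with status: {status}")
```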

Gaurav Malhotra joins Scott Hanselman to show how you can run your Azure Machine Learning (AML) service pipelines as a step in your Azure Data Factory (ADF) pipelines.

This enables you to run your machine learning models with data from multiple sources (85+ data connectors supported in ADF).

This seamless integration enables batch prediction scenarios such as identifying possible loan defaults, determining sentiment, and analyzing customer behavior patterns.     
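As a rough illustration of what that integration can look like from code, the sketch below uses the azure-mgmt-datafactory Python SDK to add an Azure ML execute-pipeline activity to a factory pipeline. The resource names, published AML pipeline ID, and linked service name are placeholders, and the exact model class names and constructors may vary between SDK versions.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureMLExecutePipelineActivity,
    LinkedServiceReference,
    PipelineResource,
)

# Placeholder names -- adjust to your own factory and AML resources.
subscription_id = "<subscription-id>"
resource_group = "rg-data-platform"
factory_name = "adf-demo"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# An activity that invokes a published AML pipeline by its ID, through an
# Azure Machine Learning linked service assumed to exist in the factory.
aml_step = AzureMLExecutePipelineActivity(
    name="ScoreLoanDefaults",
    ml_pipeline_id="<published-aml-pipeline-id>",
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="AzureMLService"
    ),
)

adf_client.pipelines.create_or_update(
    resource_group, factory_name, "BatchScoringPipeline",
    PipelineResource(activities=[aml_step]),
)
```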

Related Links

Data integration is complex, with many moving parts that span hybrid data environments. Data integration projects typically have upstream and downstream dependencies, making dependency management an important aspect of any job scheduling.

Gaurav Malhotra joins Scott Hanselman to show how you can create dependent pipelines in Azure Data Factory by creating dependencies between the tumbling window triggers in your pipelines. Using these dependencies ensures that a trigger is executed only after the trigger it depends on has completed successfully in your data factory.
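Below is a sketch of what such a dependency might look like when defined through the azure-mgmt-datafactory Python SDK; the trigger, pipeline, and factory names are placeholders, and the model signatures may vary slightly across SDK versions.

```python
from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference,
    TriggerPipelineReference,
    TriggerReference,
    TriggerResource,
    TumblingWindowTrigger,
    TumblingWindowTriggerDependencyReference,
)

# Placeholder identifiers -- substitute your own values.
subscription_id = "<subscription-id>"
resource_group = "rg-data-platform"
factory_name = "adf-demo"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# A downstream tumbling window trigger that fires only after the matching
# window of an assumed upstream trigger ("IngestTrigger") completes successfully.
downstream_trigger = TumblingWindowTrigger(
    pipeline=TriggerPipelineReference(
        pipeline_reference=PipelineReference(
            type="PipelineReference", reference_name="TransformPipeline"
        )
    ),
    frequency="Hour",
    interval=1,
    start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    max_concurrency=1,
    depends_on=[
        TumblingWindowTriggerDependencyReference(
            reference_trigger=TriggerReference(
                type="TriggerReference", reference_name="IngestTrigger"
            )
        )
    ],
)

adf_client.triggers.create_or_update(
    resource_group,
    factory_name,
    "TransformTrigger",
    TriggerResource(properties=downstream_trigger),
)
```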

Related Links:

Donovan Brown and Gopi Chigakkagari discuss how to integrate Azure Pipelines with various third-party tools to achieve a full DevOps cycle with multi-cloud support. You can continue to use your existing tools and get the benefits of Azure Pipelines: application release orchestration, deployment, approvals, and full traceability all the way to the code or issue.



Related resources:

Data integration is complex, with many moving parts. It helps organizations combine data and complex business processes in hybrid data environments. Failures are common in data integration workflows; they can happen because data does not arrive on time, because of functional code issues in your pipelines, because of infrastructure problems, and so on.

A common requirement is the ability to rerun failed activities within data integration workflows. You may also need to rerun activities to reprocess data after an error upstream in data processing. Azure Data Factory now enables you to rerun the entire pipeline, or to rerun it downstream from a particular activity inside the pipeline.
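One way this rerun capability is exposed programmatically is through the create_run operation, which accepts a reference to a previous run and a recovery flag. Below is a minimal sketch with the azure-mgmt-datafactory Python SDK, using placeholder subscription, factory, pipeline, and run identifiers.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Placeholder identifiers -- substitute your own values.
subscription_id = "<subscription-id>"
resource_group = "rg-data-platform"
factory_name = "adf-demo"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Rerun an earlier, failed pipeline run starting from its failed activities
# instead of re-executing the whole pipeline from the beginning.
rerun = adf_client.pipelines.create_run(
    resource_group,
    factory_name,
    "NightlyLoadPipeline",
    reference_pipeline_run_id="<failed-run-id>",
    is_recovery=True,
    start_from_failure=True,
)
print("Rerun started:", rerun.run_id)
```

To rerun from a specific activity onward rather than from the failure point, the same call also accepts a start_activity_name argument naming the activity at which the recovered run should begin.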