Change Data Capture (CDC) is a typical use case in Real-Time Data Warehousing. It tracks the data change log (binlog) of a relational database (OLTP), and replay these change log timely to an external storage to do Real-Time OLAP, such as delta/kudu.

To implement a robust CDC streaming pipeline, lots of factors should be concerned, such as how to ensure data accuracy , how to process OLTP source schema changed, whether it is easy to build for variety databases with less code. This talk will share the practice for simplify CDC pipeline with SparkStreaming SQL and Delta Lake.

Science fiction draws its inspiration from science and, in return, science fiction often inspires science. Think of the Star Trek communicator from the 1960s that evolved into the cellphone of the 90s.

Here’s great video exploring the intersection of hard science and science fiction and how one inspires the other. The cast of The Expanse sits down with brilliant minds from Blue Origin to discuss life beyond our planet.