Computerphile has a great video where Rebecca Tickle explains the inner workings of Apache Spark and what makes it better than MapReduce. As an added bonus, she uses Scala in the demo.

It’s also interesting to note that she used Spark in her day job, pulling IoT data from trucks (“lorries”).

Programming thousands of machines is no easy task.

One approach pioneered by Google is known as MapReduce.

MapReduce provides a programming model that simplifies programming thousands of machines by breaking distributed programs down into two steps: map and reduce.
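
To make the two steps concrete, here is a minimal single-machine sketch of the classic word-count example in plain Python. It is not Google's implementation or any real framework's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample lines are made up for illustration. In a real cluster, the map calls would run on many machines in parallel and the framework would handle the grouping step in between.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for what the framework does
    between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Each line could be mapped on a different machine; here we just loop.
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, purely for illustration.
counts = word_count(["the quick brown fox", "the lazy dog", "The fox"])
print(counts)
```

The point of the split is that `map_phase` only ever looks at one line and `reduce_phase` only ever looks at one key, so neither needs to know about the rest of the data. That independence is what lets the work be spread across thousands of machines.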