Here’s an interesting talk Albert Franziu Cros on a CI/CD setup composed by a Spark Streaming job in K8s consuming from Kafka.

Over the last year, we have been moving from a batch processing jobs setup with Airflow using EC2s to a powerful & scalable setup using Airflow & Spark in K8s.

The increasing need of moving forward with all the technology changes, the new community advances, and multidisciplinary teams, forced us to design a solution where we were able to run multiple Spark versions at the same time by avoiding duplicating infrastructure and simplifying its deployment, maintenance, and development.

In our talk, we will be covering our journey about how we ended up with a CI/CD setup composed by a Spark Streaming job in K8s consuming from Kafka, using the Spark Operator and deploying with ArgoCD.

We are quickly getting to the point that any AI engineer worth their salt needs to have a good grip on the fundamentals of kubernetes.  It’s not just for the DevOps crowd anymore.

Official kubernetes (k8s) website says, Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes builds upon 15 years of experience of running production workloads at Google , […]