In this PyData London talk, Kevin Lemagnen covers something that I’ve long wondered about: the maintainability of code created in data science projects.

Notebooks are great: they let you explore your data and prototype models quickly. But they make it hard to follow good software practices. In this tutorial, we will go through a case study. We will see how to refactor our code as a testable and maintainable Python package with entry points to tune, train and test our model so it can easily be integrated into a CI/CD flow.
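To make the "entry points" idea concrete, here is a minimal sketch of what a command-line front end with tune, train and test commands could look like. It is not taken from the talk: the program name `model-cli`, the handler functions, and the `--data` and `--trials` options are all hypothetical, and the handlers only return a message where real code would call into the package.

```python
import argparse


def tune(args):
    """Placeholder (hypothetical): a real package would run a hyperparameter search."""
    return f"tune: {args.trials} trials on {args.data}"


def train(args):
    """Placeholder (hypothetical): a real package would fit and save the model."""
    return f"train: fitting on {args.data}"


def evaluate(args):
    """Placeholder (hypothetical): a real package would score the model on held-out data."""
    return f"test: evaluating on {args.data}"


def build_parser():
    """Build a parser with one subcommand per stage of the workflow."""
    parser = argparse.ArgumentParser(prog="model-cli")
    subparsers = parser.add_subparsers(dest="command", required=True)
    commands = (("tune", tune), ("train", train), ("test", evaluate))
    for name, handler in commands:
        sub = subparsers.add_parser(name)
        sub.add_argument("--data", default="data.csv")
        if name == "tune":
            sub.add_argument("--trials", type=int, default=10)
        # Remember which function handles this subcommand.
        sub.set_defaults(func=handler)
    return parser


def main(argv=None):
    """Parse the arguments and dispatch to the chosen handler."""
    args = build_parser().parse_args(argv)
    message = args.func(args)
    print(message)
    return message
```

Because `main` accepts an argument list, a unit test or a CI job can call something like `main(["train", "--data", "trips.csv"])` directly. If the code lives in a package, a `[project.scripts]` entry in `pyproject.toml` can point a console command at `main`.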

Python has quickly grown to be the de facto language for AI and a leading language of data science. Its support is so widespread that developers have a wide array of open source libraries to choose from. Here’s a great roundup of 24 of the best.

In fact, there are so many Python libraries that it can become overwhelming to keep abreast of them all. That’s why I decided to take away that pain and compile this list of 24 awesome Python libraries covering the end-to-end data science lifecycle.

At first glance, it may not be obvious how reliant Uber is on data or how much of a powerhouse in machine learning and data science it has become. Forbes has an article on the company’s best practices for machine learning model management, a skill every organization needs (or will need) to master.

Uber is one of those organizations that rely heavily on data. Each day, millions of trips take place in 700 cities across the world, generating information on traffic, preferred routes, estimated times of arrival/delivery, drop-off locations, and more that enables Uber to deliver a smooth riding experience to its […]

Erica Joy (@EricaJoy) joins Ashley McNamara (@ashleymcnamara) to share her not-so-secret personal mission: making genealogy information open, queryable, and easily parsable. She shares a bit about why this is so critical, common challenges, and tips for rebuilding your own family tree, or using open data to uncover whatever information you need for your personal mission.

Explore open source at Microsoft

Erica’s favorite open source genealogy tools and services:

Bloomberg takes a look at the unique role of data science in professional basketball.

From the description:

With her PhD in math, Ivana Seric had expected to wind up with a career in academia—but thanks to the growing use of statistical analysis in the NBA, she took a job with the Philadelphia 76ers instead. As a data scientist, she helps the team’s coaches devise smarter strategies to win.