Here’s an interesting session from the SciPy 2020 virtual conference.

As a foundational tutorial in statistics and Bayesian inference, the intended audience is Pythonistas who are interested in gaining a foundational knowledge of probability theory and the basics of parameter estimation. Knowledge of `numpy`, `matplotlib`, and Python are prerequisites for this tutorial, in addition to curiosity and an excitement to learn new things!

Jupyter Notebooks are a key tool for data scientists everywhere.

This article provides a high-level overview of Project Jupyter and the widely popular Jupyter notebook technology.

Project Jupyter is a nonprofit organization created to “develop open-source software, open-standards, and services for interactive computing across dozens of programming languages.” Spun off from IPython in 2014 by co-founder Fernando Pérez, Project Jupyter supports execution environments in several dozen languages.

The Career Force goes through her top 5 free dataset resources in this video.

  1. Data.gov: https://data.gov Data.gov is a large dataset aggregator and the home of the US Government’s open data.
  2. FiveThirtyEight: https://data.fivethirtyeight.com/ This is a great resource to not only see datasets, but also see how a well-respected analytics organization provides meaningful insights and commentary on the data.
  3. Kaggle: https://www.kaggle.com/ Kaggle is a great resource not only for free datasets, but for data science topics in general.
  4. Data.World: https://data.world/ There are hundreds of thousands of free datasets for anyone that sets up an account on data.world.
  5. Google Dataset Search: https://datasetsearch.research.google.com/ By accessing thousands of different repositories across the web, Google Dataset Search provides access to almost 25 million different publicly available datasets.

Databricks just livestreamed this tech talk earlier today.

Developers and data scientists around the world have developed tens of thousands of open source projects to help track, understand, and address the spread of COVID-19. Given the sheer volume, finding a project to contribute to can prove challenging. To make this easier, we built a recommendation system that highlights projects based on the programming languages and keywords a user enters.

This talk will go through the full cycle of implementing this system: gathering data, building/tracking models, deploying the model, and creating a UI to utilize the model.
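To give a rough feel for what recommending projects from languages and keywords can look like, here is a minimal sketch. It is not the system from the talk: the project records, the `recommend` function, and the scoring method (cosine similarity over word counts) are all assumptions made for illustration, and a real system would work from gathered repository data and a trained, tracked model.

```python
import math
from collections import Counter

# Hypothetical project records; names, languages, and descriptions are
# made up for illustration and do not come from the talk.
PROJECTS = [
    {"name": "case-tracker", "language": "python",
     "description": "dashboard tracking covid case counts by region"},
    {"name": "epi-model", "language": "r",
     "description": "sir epidemic model simulation"},
    {"name": "symptom-app", "language": "javascript",
     "description": "mobile symptom checker app"},
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(languages, keywords, projects=PROJECTS, top_n=2):
    """Return up to top_n project names matching the languages and keywords.

    Projects are first filtered by language (no filter if the list is empty),
    then ranked by keyword similarity; zero-score projects are dropped.
    """
    wanted = {lang.lower() for lang in languages}
    query = Counter(tokenize(" ".join(keywords)))
    scored = []
    for project in projects:
        if wanted and project["language"] not in wanted:
            continue
        score = cosine(query, Counter(tokenize(project["description"])))
        if score > 0:
            scored.append((score, project["name"]))
    # Highest score first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


if __name__ == "__main__":
    print(recommend(["python", "r"], ["epidemic", "model"]))
```

Filtering on language before scoring keeps the ranking focused on projects a contributor could actually work on; swapping the word-count similarity for a learned model would leave the rest of the flow unchanged.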

Here’s an interesting interview with the team behind Julia, an up-and-coming language for data science and AI.

At the same time, Julia is general purpose, and provides facilities for creating dashboards, documentation, REST APIs, web applications, integration with databases, and much more. As a result, Julia is now seeing significant commercial adoption in a number of industries. Data scientists and engineers across industries not only use Julia to develop their models, but are able to deploy their programs to production with a single click using Julia Computing’s products.