Databricks livestreamed this tech talk earlier today.

Developers and data scientists around the world have developed tens of thousands of open source projects to help track, understand, and address the spread of COVID-19. Given the sheer volume, finding a project to contribute to can prove challenging. To make this easier, we built a recommendation system that highlights projects based on the programming languages and keywords a user enters.

This talk will go through the full cycle of implementing this system: gathering data, building and tracking models, deploying the model, and creating a UI to use the model.
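To make the idea of recommending projects from languages and keywords concrete, here is a minimal sketch in Python. It is not the system built in the talk: the talk describes building and tracking models, whereas this sketch swaps in a plain overlap score. The `Project` fields, the sample projects, and the weighting (a language match counts double a keyword match) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """Hypothetical record for one open source project."""
    name: str
    languages: frozenset
    description: str


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def score(project, languages, keywords):
    """Overlap score: each language match counts 2, each keyword match counts 1.

    The weights are an arbitrary assumption, not taken from the talk.
    """
    wanted_langs = {lang.lower() for lang in languages}
    project_langs = {lang.lower() for lang in project.languages}
    lang_hits = len(wanted_langs & project_langs)
    wanted_words = {word.lower() for word in keywords}
    keyword_hits = len(wanted_words & tokenize(project.description))
    return 2 * lang_hits + keyword_hits


def recommend(projects, languages, keywords, top_n=3):
    """Return the names of the best-scoring projects, dropping zero scores."""
    scored = [(score(p, languages, keywords), p) for p in projects]
    matches = [(s, p) for s, p in scored if s > 0]
    # Highest score first; ties are broken alphabetically by name.
    matches.sort(key=lambda pair: (-pair[0], pair[1].name))
    return [p.name for _, p in matches[:top_n]]


# Made-up sample data, for illustration only.
PROJECTS = [
    Project("covid-dashboard", frozenset({"Python", "JavaScript"}),
            "Interactive dashboard tracking case counts"),
    Project("genome-tools", frozenset({"R"}),
            "Genome sequence analysis utilities"),
    Project("spread-model", frozenset({"Python"}),
            "Epidemiological model of virus spread"),
]

if __name__ == "__main__":
    print(recommend(PROJECTS, ["Python"], ["model", "spread"]))
```

A real system would likely replace `score` with a trained model or text embeddings and gather project metadata from a source such as a code hosting API, but the input and output shape (languages and keywords in, ranked projects out) would stay similar.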

Lex Fridman interviews Dmitry Korkin on the latest episode of his podcast.

Dmitry Korkin is a professor of bioinformatics and computational biology at Worcester Polytechnic Institute, where he specializes in bioinformatics of complex disease, computational genomics, systems biology, and biomedical data analytics. I came across Dmitry’s work in February, when his group used the viral genome of COVID-19 to reconstruct the 3D structure of its major viral proteins and their interactions with human proteins, in effect creating a structural genomics map of the coronavirus and making this data open and available to researchers everywhere. We talked about the biology of COVID-19, SARS, and viruses in general, and how computational methods can help us understand their structure and function in order to develop antiviral drugs and vaccines. This conversation is part of the Artificial Intelligence podcast.

Time Index:

  • 0:00 – Introduction
  • 2:33 – Viruses are terrifying and fascinating
  • 6:02 – How hard is it to engineer a virus?
  • 10:48 – What makes a virus contagious?
  • 29:52 – Figuring out the function of a protein
  • 53:27 – Functional regions of viral proteins
  • 1:19:09 – Biology of a coronavirus treatment
  • 1:34:46 – Is a virus alive?
  • 1:37:05 – Epidemiological modeling
  • 1:55:27 – Russia
  • 2:02:31 – Science bobbleheads
  • 2:06:31 – Meaning of life

Databricks held this Online Tech Talk, hosted by Denny Lee, Developer Advocate at Databricks, to explore what data professionals can do to help the world beat the virus.

My name is Denny Lee and I’m a Developer Advocate at Databricks. But before this, I was a biostatistician working on HIV/AIDS research at the Fred Hutchinson Cancer Research Center and University of Washington Virology Lab in the Seattle area. Watching my friends and colleagues working the front lines of this current pandemic has inspired me to see if we, as the data science community, can potentially help with “flattening the curve”. But before we dive into data science, remember: the most important things you can do are to wash your hands and practice social distancing! A great reference is How to Protect Yourself (https://www.cdc.gov/coronavirus/2019-ncov/prepare/prevention.html).

With the current concerns over SARS-CoV-2 and COVID-19, various COVID-19 datasets are now available on Kaggle and GitHub, as well as competitions such as the COVID-19 Open Research Dataset Challenge (CORD-19) (https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge#). Whether you are a student or a professional data scientist, we thought we could help out by providing a primer session with notebooks on how to start analyzing these datasets.