Great Learning has provided this free 7-hour course on statistics for Data Science.

This course is taught by Dr. Abhinanda Sarkar, who holds a Ph.D. in Statistics from Stanford University. He has taught applied mathematics at the Massachusetts Institute of Technology (MIT); been on the research staff at IBM; led Quality, Engineering Development, and Analytics functions at General Electric (GE); and co-founded OmiX Labs.

These are the topics covered in this full course:

  1. Statistics vs Machine Learning – 2:22
  2. Types of Statistics [Descriptive, Prescriptive and Predictive] – 9:05
  3. Types of Data – 1:50:45
  4. Correlation – 2:46:02
  5. Covariance – 2:52:33
  6. Introduction to Probability – 4:26:55
  7. Conditional Probability with Bayes’ Theorem – 5:24:00
  8. Binomial Distribution – 6:17:01
  9. Poisson Distribution – 6:36:02

It’s been a long time since I did anything substantial with SEO.

I have been fortunate in that FranksWorld.com has been around long enough to have built up a fair amount of SEO value.

However, there’s always room to improve, and here’s an interesting read on how to leverage data science for SEO.

Have You Ever Done a Link Disavow? A properly maintained disavow file can make or break your search engine rankings. If you’ve ever (or never!) done a disavow and are wondering if there are links you should add or remove, let Jim Boykin review your backlinks and propose a […]

Will Kwan spent 50 days creating an AI startup as a project for Y Combinator Startup School.

You can try it out here: https://omnipost.co.

I’m building a machine learning/SaaS startup. In this video, I share the results of my first 50 days of full-time work, explaining my business strategy, showing the core features I designed and programmed, and summarizing what I learned from my users. I also give an overview of all the programming frameworks and APIs I used.

Here’s an interesting article on how to represent a categorical feature, with hundreds of levels, in a model in R.

In this post, we will discuss using an embedding matrix as an alternative to one-hot encoded categorical features in modeling. We usually find references to embedding matrices in natural language processing applications, but they may also be used on tabular data. An embedding matrix replaces the sparse one-hot encoded matrix with an array of vectors, where each vector represents one level of the feature. Using an embedding matrix can greatly reduce the memory needed to handle categorical features.
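The linked article works in R, but the core idea is language-agnostic. Here’s a minimal NumPy sketch (with made-up sizes: 10,000 rows, 500 levels, 8-dimensional embeddings) showing that looking up a row of an embedding matrix gives the same result as multiplying a one-hot matrix by it, while storing far fewer values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 10,000 rows of a feature with 500 levels,
# each level mapped to an 8-dimensional embedding vector.
n_rows, n_levels, embed_dim = 10_000, 500, 8
codes = rng.integers(0, n_levels, size=n_rows)  # integer-coded category

# One-hot encoding: one column per level -> n_rows x n_levels, mostly zeros
one_hot = np.zeros((n_rows, n_levels))
one_hot[np.arange(n_rows), codes] = 1.0

# Embedding matrix: one vector per level -> n_levels x embed_dim
# (in a real model these vectors would be learned during training)
embedding = rng.normal(size=(n_levels, embed_dim))
embedded = embedding[codes]  # a simple row lookup replaces the matrix product

# The lookup is exactly equivalent to one_hot @ embedding
assert np.allclose(embedded, one_hot @ embedding)

# 5,000,000 stored values for one-hot vs 80,000 for the embedded feature
print(one_hot.size, embedded.size)
```

The memory saving is the point: the one-hot matrix grows with the number of levels, while the embedded representation grows only with the embedding dimension you choose.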