Linear vs. Logistic Regression

In my experience as a manager, it is worthwhile to see how people respond to simple, elementary questions in their own field.  These questions are not an indication of whether someone can be a competent software engineer, but they are a clear sign of how thorough the engineer is, how they respond when they don't know the answer, and most importantly, how deeply they have mastered their field.

So, there is a good probability that when I am interviewing a machine learning developer or a data scientist, one of my questions will be:

What is the difference between linear regression and logistic regression?

A complete answer covers the following points (illustrated in the code sketch after this list):

  • Linear regression predicts continuous (numerical) values.
  • Logistic regression predicts binary, discrete values (pairs such as true or false).
  • Multinomial logistic regression predicts multiclass, discrete values (more than two classes).
  • Linear regression is optimized with least squares.
  • Logistic regression is optimized with maximum likelihood.
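
To make the contrast concrete, here is a minimal sketch in Python with scikit-learn; the toy data (hours studied, exam scores, pass/fail) is made up purely for illustration:

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    # Feature: hours studied
    X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])

    # Continuous target (exam score) -> linear regression, least squares fit
    scores = np.array([52.0, 55.0, 61.0, 64.0, 70.0, 74.0, 79.0, 83.0])
    linear = LinearRegression().fit(X, scores)
    print(linear.predict([[9]]))          # a number, roughly 88

    # Binary target (pass/fail) -> logistic regression, maximum likelihood fit
    passed = np.array([0, 0, 0, 1, 1, 1, 1, 1])
    logistic = LogisticRegression().fit(X, passed)
    print(logistic.predict([[9]]))        # a class label: 1 (pass)
    print(logistic.predict_proba([[9]]))  # a probability for each class

The linear model outputs an unbounded number, while the logistic model outputs a class (and, via predict_proba, a probability), which is exactly the distinction above.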

Extra credit if the following is given:

Anscombe’s Quartet

I love simple examples that make an important point.  Anscombe's quartet is a great way to show that data visualization is critical to understanding any statistical pattern.

For anyone who hasn’t seen it before: what do these four graphs have in common?

They have the same basic statistical properties (verified in the code sketch after this table):

    Property                                    Value
    Mean of x in each case                      9 (exact)
    Sample variance of x in each case           11 (exact)
    Mean of y in each case                      7.50 (to 2 decimal places)
    Sample variance of y in each case           between 4.122 and 4.127 (to 3 decimal places)
    Correlation between x and y in each case    0.816 (to 3 decimal places)
    Linear regression line in each case         y = 3.00 + 0.500x (to 2 and 3 decimal places, respectively)
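
It is easy to check these numbers yourself.  Here is a quick sketch with NumPy, using the quartet's values as published in Anscombe's 1973 paper:

    import numpy as np

    # Anscombe's quartet (Anscombe, 1973); datasets I-III share the same x values
    x123 = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
    quartet = {
        "I":   (x123, np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])),
        "II":  (x123, np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])),
        "III": (x123, np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])),
        "IV":  (np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], dtype=float),
                np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])),
    }

    for name, (xs, ys) in quartet.items():
        m, b = np.polyfit(xs, ys, 1)   # least squares line y = m*x + b
        r = np.corrcoef(xs, ys)[0, 1]
        print(f"{name}: mean x = {xs.mean():.0f}, var x = {xs.var(ddof=1):.0f}, "
              f"mean y = {ys.mean():.2f}, var y = {ys.var(ddof=1):.3f}, "
              f"r = {r:.3f}, line: y = {b:.2f} + {m:.3f}x")

All four datasets print nearly identical summary statistics, yet plotting them (for example with matplotlib's scatter) reveals four completely different shapes, which is exactly the point.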

Linear Regression

Right now, my son is learning y = mx + b in junior high.  I've told him that, in my opinion, the two deepest insights he will take away from algebra are this and the quadratic equation.

My son was not convinced.  He said it was obvious to him that the quadratic equation was deep because the equation was complicated, but in his view, graphing a line was pretty straightforward, since m is the slope (whether the line goes up or down) and b is the y-intercept (where the line crosses the y axis, at x = 0).

I asked him if he knew the difference between causality and correlation.  He said he did, and asked me not to repeat my anecdote about ice cream sales and drowning rates (as ice cream sales increase, more people drown.  This is a non-causal correlation where both variables have a common cause, warmer weather: more people buy ice cream and more people go swimming.  Details here).

I told him that one of the most powerful ways to analyze a correlation is through linear regression.  When he looked unclear, I added: using y = mx + b.

There are many types of correlation.  But sometimes, if there is a strong enough relationship between two variables, a prediction can be made.  Of course, this rests on two assumptions that people often skip.

He was quite impressed.  I realized after we had finished talking that I forgot to mention how to figure out which line best matches a scatter plot.  But I guess I had already used up my 15-minute quota of his attention.
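
For the record, the standard answer is the method of least squares: pick the m and b that minimize the sum of squared vertical distances from the points to the line.  A minimal sketch, with made-up scatter data for illustration:

    import numpy as np

    # Made-up scatter data, roughly following y = 2x
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

    # Closed-form least squares for y = m*x + b:
    #   m = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)**2)
    #   b = y_bar - m * x_bar
    x_bar, y_bar = x.mean(), y.mean()
    m = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b = y_bar - m * x_bar
    print(f"best-fit line: y = {m:.3f}x + {b:.3f}")  # roughly y = 2x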

 

Almost finished with my Udacity Nanodegree

I've recently completed my 4th project as part of the Udacity Nanodegree.  To finish the degree, I need to do one more project in an area of my choice.  No surprise: I'll be choosing a topic in deep learning for my final project.

To get the ball rolling, I'll be working with Keras, which runs on top of either TensorFlow or Theano, and I've selected a challenge from Kaggle on image classification.  It should be fun.  Since it's an image classification problem, I'll start by training a Convolutional Neural Network (CNN).
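
To give a sense of where I'll start, here is a minimal sketch of a small CNN in the Keras Sequential API; the input shape, layer sizes, and the 10-class output are placeholders rather than the actual Kaggle setup:

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    # Placeholder architecture: two conv/pool blocks, then a dense classifier
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax'),   # placeholder: 10 classes
    ])

    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    # model.fit(x_train, y_train, epochs=10, validation_split=0.1)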

 

 

Springer-Verlag and the price of math books

Sieve theory is one of the areas of number theory that I've long wanted to dive into.  I'm starting with George Greaves's Sieves in Number Theory, which I picked up at the wonderful Martin Luther King Library in San Jose.

This book was favorably reviewed by the AMS.

I liked the introduction of the book so much that I was planning to purchase it, and then I saw (on the day I am writing this blog entry: April 22, 2016) the price on Amazon: $229.00 new and $85.56 used.  Wow.

There have been lots of articles about how expensive math books are.  For example, this is a blog entry from the well-known mathematician Tim Gowers.

Sometimes, professors post PDFs of their books on their home pages.  When I investigated, I found out that Professor George Greaves had died in 2008 at the age of 67.

It looks like a really impressive book!  I guess I’ll stick with my library edition.