Linear vs. Logistic Regression

In my experience as a manager, it is worthwhile to see how people respond to simple, elementary questions in their own field.  These questions do not tell me whether a person is a competent software engineer, but they are a clear sign of how thorough the engineer is, how they respond when they don’t know the answer, and most importantly, how deeply they have mastered their field.

So there is a good chance that when I interview a machine learning developer or a data scientist, one of my questions will be:

What is the difference between linear regression and logistic regression?

A complete answer covers the following:

  • Linear regression makes predictions about continuous values (numerical values).
  • Logistic regression makes predictions about binary, discrete values (value pairs such as true or false).
  • Multinomial logistic regression makes predictions about multiclass, discrete values (discrete values of more than 2 classes).
  • Linear regression uses least squares to optimize.
  • Logistic regression uses maximum likelihood to optimize.
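
To make the contrast concrete, here is a minimal sketch in plain Python.  The toy data (hours studied, test scores, pass/fail) is invented for illustration, and the function names, learning rate and step count are my own assumptions, not a reference implementation.  The linear fit uses the closed-form least squares solution; the logistic fit climbs the log-likelihood with simple gradient ascent.

```python
import math

# Hypothetical toy data, invented for illustration only.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [52.0, 55.0, 61.0, 64.0, 70.0, 73.0]  # continuous target
passed = [0, 0, 1, 0, 1, 1]                    # binary target


def fit_linear(xs, ys):
    """Least squares fit of y = m*x + b using the closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=20000):
    """Maximum likelihood fit of P(y=1) = sigmoid(w*x + b) by gradient ascent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the average log-likelihood with respect to w and b.
        errors = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        w += lr * sum(e * x for e, x in zip(errors, xs)) / n
        b += lr * sum(errors) / n
    return w, b


m, b_lin = fit_linear(hours, scores)
w, b_log = fit_logistic(hours, passed)

# Linear regression predicts a number; logistic regression predicts a probability.
predicted_score = m * 4.5 + b_lin
pass_probability = sigmoid(w * 4.5 + b_log)
print("predicted score after 4.5 hours:", round(predicted_score, 2))
print("probability of passing after 4.5 hours:", round(pass_probability, 3))
```

In practice most people would reach for a library rather than hand-rolled loops, but writing it out shows that the two models differ both in what they predict and in what they optimize.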

Extra credit if the following is given:



Anscombe’s Quartet

I love simple examples that make an important point.  Anscombe’s quartet is a great way to show that data visualization is critical to understanding any statistical pattern.

For anyone who hasn’t seen it before: what do these four graphs have in common?

They have the same basic statistical properties:

Property                                        Value
Mean of x in each case                          9 (exact)
Sample variance of x in each case               11 (exact)
Mean of y in each case                          7.50 (to 2 decimal places)
Sample variance of y in each case               4.122 or 4.127 (to 3 decimal places)
Correlation between x and y in each case        0.816 (to 3 decimal places)
Linear regression line in each case             y = 3.00 + 0.500x (to 2 and 3 decimal places, respectively)
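
These numbers are easy to check for yourself.  The sketch below uses the four data sets as published by Anscombe and only the Python standard library; the helper name summarize is my own.  It computes each summary property from its textbook formula, so treat it as an illustration rather than a statistics package.

```python
from statistics import mean, variance

# Anscombe's four data sets. Sets I to III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary statistics and least squares line for one data set."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "var_x": variance(xs),   # sample variance (divides by n - 1)
        "mean_y": my,
        "var_y": variance(ys),
        "corr": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four rows of output are nearly identical, which is exactly the point: the summary statistics cannot tell these data sets apart, but a plot can.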

Linear Regression

Right now, my son is learning y=mx+b in junior high.  I’ve told him that in my opinion, the two deepest insights that he will learn from algebra are this and the quadratic equation.

My son was not convinced.  He said it was obvious to him that the quadratic equation was deep because the equation was complicated, but in his view graphing a line was pretty straightforward, since m is the slope (whether the line goes up or down) and b is the y-intercept (where the line crosses the y axis, at x=0).

I asked him if he knew the difference between causality and correlation.  He said he did and told me not to repeat my anecdote about ice cream sales and drowning rates (as ice cream sales increase, more people drown.  This is a non-causal correlation where both variables have a common cause, warmer weather: more people buy ice cream and more people go swimming.  Details here).

I told him that one of the most powerful ways to analyze correlation is through a linear regression.  When he looked unsure, I added: using y=mx+b.

There are many types of correlation.  But sometimes, if there is a strong enough relationship between two variables, a prediction can be made.  Of course, this rests on two assumptions that people often skip.

He was quite impressed.  I realized after we had finished talking that I forgot to mention how to figure out which line best matches a scatter plot.  But I guess I had already used up my 15-minute quota of his attention.
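
For the record, one common answer to the "which line best matches" question is least squares: pick the m and b that make the total squared vertical distance between the points and the line as small as possible.  Here is a minimal sketch of that idea.  The points and the candidate lines are invented for illustration, and the function names are my own.

```python
# Hypothetical scatter-plot points, invented for illustration only.
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]


def sum_squared_error(m, b, pts):
    """Total squared vertical distance between the points and y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in pts)


def least_squares_line(pts):
    """Return the slope m and intercept b that minimize the squared error."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    m = sxy / sxx
    return m, mean_y - m * mean_x


best_m, best_b = least_squares_line(points)
best_error = sum_squared_error(best_m, best_b, points)
print("fitted line: y = %.2fx + %.2f, squared error %.4f" % (best_m, best_b, best_error))

# A few hand-picked guesses. None of them should beat the fitted line.
candidates = [(2.0, 0.0), (1.5, 1.0), (2.5, -1.0)]
for m, b in candidates:
    error = sum_squared_error(m, b, points)
    print("guess:       y = %.2fx + %.2f, squared error %.4f" % (m, b, error))
```

Squaring the distances keeps positive and negative misses from cancelling out and penalizes big misses more than small ones, which is also why a single outlier can pull the line a long way.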