Module 2.1 · 60 min

Logistic Regression: Linear Models for Classification

Classification seems different from regression, but the core ideas transfer directly. Logistic regression shows how.

Learning Outcome

Understand logistic regression as linear regression with a different loss function and interpretation.

Core Teachings

Key concepts with source texts

Setup:
  • Input: features x (email text, tumor size, image pixels, ...)
  • Output: class y ∈ {0, 1} (spam/not spam, malignant/benign, cat/dog, ...)

Why Not Linear Regression? If we use linear regression to predict class (0 or 1), the predictions can be any real number. A prediction of 1.5 or -0.3 doesn't make sense as a class.
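
To see this concretely, here is a minimal sketch (the toy data is made up, not from the lesson) that fits an ordinary least-squares line to 0/1 labels and gets predictions outside [0, 1]:

```python
import numpy as np

# Toy 1-D data: label is 1 when the feature is "large" (e.g., tumor size).
x = np.array([1.0, 2.0, 3.0, 4.0, 8.0, 9.0, 10.0, 11.0])
y = np.array([0,   0,   0,   0,   1,   1,   1,    1])

# Ordinary least-squares line fit directly to the 0/1 labels.
slope, intercept = np.polyfit(x, y, 1)

print(slope * 0.0 + intercept)   # below 0 for small x (about -0.28)
print(slope * 14.0 + intercept)  # above 1 for large x (about 1.54)
```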

The Logistic Function (Sigmoid): We transform the linear output into a probability:

$$P(y=1|x) = \sigma(\mathbf{w}^T \mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^T \mathbf{x}}}$$

Properties of σ(z):
  • Always between 0 and 1 (interpretable as a probability)
  • σ(0) = 0.5 (decision boundary at linear score = 0)
  • Monotonically increasing (higher linear score → higher probability)

Interpretation: The linear part (w^T x) computes a 'score.' The sigmoid squashes this to a probability. We predict class 1 if P(y=1|x) > 0.5 (i.e., score > 0).
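
A minimal sketch of the formula above and the resulting decision rule (helper names like predict_proba are illustrative, not from the lesson):

```python
import numpy as np

def sigmoid(z):
    """Squash a real-valued score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, x):
    """P(y = 1 | x) = sigmoid(w . x) for weight vector w and feature vector x."""
    return sigmoid(np.dot(w, x))

def predict_class(w, x, threshold=0.5):
    """Predict class 1 when the probability exceeds the threshold (i.e., score > 0)."""
    return int(predict_proba(w, x) > threshold)

# Sanity checks against the properties listed above.
print(sigmoid(0.0))                  # 0.5: the decision boundary
print(sigmoid(5.0), sigmoid(-5.0))   # close to 1 and close to 0

# Made-up weights and features, just to exercise the decision rule.
print(predict_class(np.array([2.0, -1.0]), np.array([1.0, 1.0])))  # score = 1 > 0 -> class 1
```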

Practice This

For a spam classifier with weights w_contains_viagra = 4, w_from_friend = -3: calculate the probability an email is spam if it contains 'viagra' (1) but is from a friend (1). Then calculate if it's not from a friend (0).
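
After working the numbers by hand, a quick sketch like this can be used to check them (assuming no bias/intercept term, since none is given in the exercise):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Weights from the exercise; no bias term is assumed because none is given.
w_contains_viagra = 4.0
w_from_friend = -3.0

# Case 1: contains 'viagra' (1) AND from a friend (1) -> score = 4 - 3 = 1
score_1 = w_contains_viagra * 1 + w_from_friend * 1
print(sigmoid(score_1))   # about 0.73

# Case 2: contains 'viagra' (1), NOT from a friend (0) -> score = 4
score_2 = w_contains_viagra * 1 + w_from_friend * 0
print(sigmoid(score_2))   # about 0.98
```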

Deep Dive

Why This Matters

Logistic regression is the workhorse of industry ML for classification. It's interpretable (weights show feature importance), fast to train, and often surprisingly effective. Understanding it deeply prepares you for neural networks, which can be viewed as many logistic-regression-like units stacked together.

Study Materials

Primary sources with guided reading

Watch: YouTube (StatQuest)

Logistic Regression, Clearly Explained - StatQuest (16 min)
Why Watch This?

To understand logistic regression as transforming a linear model into probability outputs, and why cross-entropy is the right loss function.

While Watching, Ask Yourself:
  1. Why can't we use regular linear regression for classification?
  2. What does the sigmoid function do? Why is it useful?
  3. What is maximum likelihood, and how does it lead to cross-entropy loss?
After Watching, You Should:

Understand logistic regression as 'linear regression passed through a sigmoid, trained with cross-entropy loss.'
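
To make that one-line summary concrete, here is a minimal numpy sketch (an illustration, not the video's code) that trains a logistic regression by gradient descent on the cross-entropy loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, lr=0.1, steps=1000):
    """Minimize the cross-entropy loss
         L(w) = -mean(y * log(p) + (1 - y) * log(1 - p)),  p = sigmoid(X w),
       by plain gradient descent. X includes a column of ones for the bias."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        gradient = X.T @ (p - y) / len(y)   # gradient of cross-entropy w.r.t. w
        w -= lr * gradient
    return w

# Tiny toy problem: one feature plus a bias column.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0, 0, 1, 1])
w = train_logistic_regression(X, y)
print(w)                 # learned [bias, feature weight]
print(sigmoid(X @ w))    # predicted probabilities close to the labels
```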

Reflection & Critical Thinking

Write your thoughts before revealing answers

Consider these points:

  • What does a negative weight mean for the relationship?
  • If income increases by $10,000, how does the linear score change?
  • How does that change in score affect probability through the sigmoid?
  • Is the effect on probability constant, or does it depend on the starting point?

Your Thoughts

Writing your thoughts first will deepen your understanding

This interpretation skill is essential for explaining ML models to stakeholders—a critical skill in applied ML.
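
For the reflection questions above, a small sketch like the following (the income weight of +0.5 per $10,000 is made up for illustration; the lesson does not give one) shows that the same change in linear score moves the probability by very different amounts depending on the starting score:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weight: +0.5 added to the linear score per $10,000 of income.
delta_score = 0.5

for base_score in [-4.0, 0.0, 3.0]:
    before = sigmoid(base_score)
    after = sigmoid(base_score + delta_score)
    print(f"score {base_score:+.1f} -> {base_score + delta_score:+.1f}: "
          f"probability {before:.3f} -> {after:.3f} (change {after - before:+.3f})")
```

The score change is constant (linear part), but the probability change is largest near the 0.5 boundary and tiny when the model is already confident.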