Linear Regression and Logistic Regression: Their Role in AI and Machine Learning


A deep dive into the foundational logic of AI: understanding the difference between predicting values (Linear) and predicting probabilities (Logistic).


It is easy to get lost in the hype of Deep Learning and Large Language Models. But before transformers and neural networks, there was regression.

If you don't understand these two algorithms, you don't understand AI. They are the "Hello World" of machine learning, but they are also the workhorses of production analytics.

Opening Context

We often see developers throwing XGBoost or Neural Networks at simple problems. It’s like using a flamethrower to light a candle.

Why does this matter? Because complexity breeds bugs. Linear and Logistic regression are interpretable. When they fail, you know why. When a neural network fails, you have a billion parameters of noise.

Mental Model: The Line vs. The S-Curve

Think of Linear Regression as a Ruler. You are trying to draw the "best fit" straight line through a scatter plot of data.

  • Input: "House Size"
  • Output: "Price" (Continuous value: $500k, $550k)

Think of Logistic Regression as a Switch. You are trying to draw a line that separates data into two buckets, but mapped to a probability curve.

  • Input: "Email words"
  • Output: "Spam or Not Spam?" (Probability: 0.1, 0.9)

Key Difference: Linear predicts how much. Logistic predicts yes or no (or more accurately, the probability of yes).

Hands-On Example

Let's look at how we implement these in Python using scikit-learn.

Linear Regression (Predicting Values)

from sklearn.linear_model import LinearRegression
import numpy as np

# Data: [Hours Studied]
X = np.array([[1], [2], [3], [4], [5]])
# Target: [Test Score]
y = np.array([50, 60, 70, 80, 90])

model = LinearRegression()
model.fit(X, y)

prediction = model.predict([[6]])
print(f"Predicted score for 6 hours: {prediction[0]:.1f}")
# Output: Predicted score for 6 hours: 100.0

Logistic Regression (Predicting Probabilities)

from sklearn.linear_model import LogisticRegression

# Data: [Hours Studied]
X = np.array([[1], [2], [8], [9]])
# Target: [Fail(0) or Pass(1)]
y = np.array([0, 0, 1, 1])

model = LogisticRegression()
model.fit(X, y)

probability = model.predict_proba([[5]])
print(f"Probability of passing with 5 hours: {probability[0][1]:.2f}")

Notice the output types. One is a raw number (Score), the other is a confidence score (Probability).

Under the Hood

Linear Regression

It minimizes the Residual Sum of Squares (RSS): the sum of the squared vertical distances between the data points and the regression line, RSS = Σ(yᵢ − ŷᵢ)². Equation: y = mx + c
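For a single feature, the RSS-minimizing slope and intercept have a well-known closed form. A minimal sketch with NumPy, reusing the hours-studied data from the hands-on example:

```python
import numpy as np

# Same toy data as the hands-on example: hours studied vs. test score
X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([50, 60, 70, 80, 90], dtype=float)

# Closed-form least-squares estimates for y = m*x + c
m = np.sum((X - X.mean()) * (y - y.mean())) / np.sum((X - X.mean()) ** 2)
c = y.mean() - m * X.mean()

# RSS of the fitted line: zero here because the toy data is perfectly linear
rss = np.sum((y - (m * X + c)) ** 2)
print(m, c, rss)  # slope 10.0, intercept 40.0, RSS ~0.0
```

This is exactly what `LinearRegression.fit` recovers on the same data; scikit-learn just generalizes it to many features.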

Logistic Regression

It uses the Sigmoid Function to squash the output between 0 and 1. Equation: y = 1 / (1 + e^-z) where z = mx + c

This "squashing" is what allows it to handle classification. If the output is > 0.5, classify as Class 1. If < 0.5, Class 0.
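The squashing is easy to see by evaluating the sigmoid directly; a quick sketch (the z values are arbitrary, chosen for illustration):

```python
import numpy as np

def sigmoid(z):
    """Map any real number into the (0, 1) interval."""
    return 1.0 / (1.0 + np.exp(-z))

# Large negative z -> near 0, zero -> exactly 0.5, large positive -> near 1
print(sigmoid(-10))  # ~4.54e-05
print(sigmoid(0))    # 0.5
print(sigmoid(10))   # ~0.99995
```

The 0.5 threshold mentioned above corresponds to z = 0, i.e. the point where mx + c crosses zero.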

Common Mistakes

Using Linear Regression for Classification

"If I predict a value of 0.8, that's Class 1!" Why it fails: Linear regression is sensitive to outliers. One massive data point can skew the line and ruin your classification boundary.
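One way to see this failure is to fit both models on 0/1 labels that include one extreme but valid positive point. A small sketch (the data is made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Binary labels; one extreme positive example at x = 100 acts as the outlier
X = np.array([[1], [2], [3], [6], [7], [100]])
y = np.array([0, 0, 0, 1, 1, 1])

lin = LinearRegression().fit(X, y)
log = LogisticRegression().fit(X, y)

# The x where the linear fit crosses 0.5 -- the naive "classification boundary"
boundary_lin = (0.5 - lin.intercept_) / lin.coef_[0]
print(f"Linear 0.5 crossing at x = {boundary_lin:.1f}")  # ~19.8, far right of the real gap (3 to 6)
print("Logistic predictions at x = 2 and x = 7:", log.predict([[2], [7]]))
```

The single point at x = 100 drags the line so far that even the legitimate positives at 6 and 7 fall below 0.5, while logistic regression keeps its boundary between the two clusters.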

Assuming Correlation is Causation

Just because the regression line fits well (High R-squared) doesn't mean X causes Y. They could both be caused by Z.

Production Reality

In production, you rarely use raw Linear Regression. You use Regularized versions like Ridge or Lasso (L1/L2 regularization) to prevent overfitting.
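Swapping in a regularized model is essentially a one-line change in scikit-learn; a sketch on the earlier toy data (the alpha value is arbitrary, chosen for illustration):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([50, 60, 70, 80, 90])

# Ridge (L2) shrinks coefficients toward zero; Lasso (L1) can zero them out entirely
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

# The unregularized slope on this data is exactly 10; both penalties pull it down
print(f"Ridge slope: {ridge.coef_[0]:.2f}")  # ~9.09
print(f"Lasso slope: {lasso.coef_[0]:.2f}")  # ~9.50
```

On a clean one-feature toy set the shrinkage only nudges the slope; the payoff comes with many correlated or noisy features, where the penalty keeps coefficients from blowing up.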

For Logistic Regression, it remains the industry standard for:

  • Credit Scoring (High transparency required)
  • Ad Click-Through Rate (CTR) prediction (Ultra-low latency required)

Author’s Take

I will always try Logistic Regression before I try a Neural Network.

If Logistic Regression gets me to 85% accuracy and is explainable to my boss, and a Neural Network gets me to 87% but is a black box that costs 10x more to run... I am shipping the Logistic Regression.

Conclusion

These aren't just "beginner" algorithms. They are the foundation of statistical learning. Master the Line and the Curve, and you have mastered the basics of prediction.
