Least squares regression aims to minimize the sum of the squared
differences between observed and predicted values. For classification, the
idea is to adapt this approach to predict class labels, typically by encoding
classes numerically (e.g., -1 and 1 for binary classification).
➢ Process
• Encoding Class Labels: For binary classification, the class labels (e.g.,
positive and negative) are encoded as -1 and 1. For multi-class
classification, labels are often encoded using a one-vs-all strategy.
• Model Fitting: The regression model fits a linear function to these
encoded labels by minimizing the sum of squared errors. This involves
solving for the weights 𝑤 in the linear model:
𝑦 = 𝑋𝑤 + 𝜖
where 𝑋 is the feature matrix, 𝑦 is the vector of encoded labels, and 𝜖 is
the error term. The least-squares solution is 𝑤 = (𝑋ᵀ𝑋)⁻¹𝑋ᵀ𝑦.
• Prediction: The fitted linear model is used to predict the class label for
new data points. For binary classification, the sign of the predicted
value determines the class (positive if 𝑦 > 0, negative if 𝑦 < 0). For multi-
class classification, the class with the highest predicted value is chosen.
A minimal worked sketch of these steps is given after this list.
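The following is a minimal sketch of the process above in Python with NumPy; the function names and the synthetic data are illustrative assumptions, not taken from the source. Binary labels are encoded as -1/+1, the weights are fitted by minimizing the squared error, and new points are classified by the sign of the prediction.

    import numpy as np

    def fit_least_squares_classifier(X, y):
        # Fit w by minimizing ||Xb w - y||^2, with labels y encoded as -1 / +1.
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])     # add a bias column
        # lstsq solves the normal equations (X^T X) w = X^T y in a stable way.
        w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
        return w

    def predict(X, w):
        # The sign of the fitted linear function gives the predicted class.
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])
        return np.sign(Xb @ w)

    # Illustrative usage on two synthetic Gaussian clusters.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
    y = np.hstack([-np.ones(50), np.ones(50)])            # classes encoded as -1 and +1
    w = fit_least_squares_classifier(X, y)
    print("training accuracy:", np.mean(predict(X, w) == y))

For the multi-class (one-vs-all) case, the same fit is repeated with one target vector per class (for example, +1 for that class and -1 for the rest), and a new point is assigned to the class whose fitted function returns the highest value.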
➢ Advantages
• Simplicity: Least squares regression is straightforward to implement
and understand.
• Efficiency: Computationally efficient, especially for small to medium-
sized datasets.
• Interpretability: The linear model’s coefficients can be easily
interpreted to understand the relationship between features and the
target variable.
➢ Limitations
• Not Optimized for Classification: Least squares regression is designed
for continuous outcomes; because the squared-error loss also penalizes
points that are predicted far on the correct side of the boundary, it may
not place classification boundaries effectively.
• Assumptions: Assumes the classes can be separated by a linear decision
boundary, which may not hold in practice.
• Sensitivity to Outliers: Can be sensitive to outliers, which may distort
the classification boundary, as illustrated in the sketch below.
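A small, self-contained illustration of this outlier sensitivity (the data and values are synthetic assumptions used only for demonstration): a few points that lie far from the boundary, but on the correct side, still pull the least-squares boundary toward them.

    import numpy as np

    rng = np.random.default_rng(1)
    # One feature, two classes encoded as -1 and +1.
    X = np.vstack([rng.normal(-2, 1, (50, 1)), rng.normal(2, 1, (50, 1))])
    y = np.hstack([-np.ones(50), np.ones(50)])

    def fit(X, y):
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])     # add a bias column
        w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
        return w

    def boundary(w):
        # In one dimension the decision boundary is where w[0] * x + w[1] = 0.
        return -w[1] / w[0]

    print("boundary without outliers:", boundary(fit(X, y)))

    # Add a few extreme points that already lie on the correct (+1) side.
    X_out = np.vstack([X, np.full((5, 1), 30.0)])
    y_out = np.hstack([y, np.ones(5)])
    print("boundary with outliers:   ", boundary(fit(X_out, y_out)))

The boundary shifts toward the +1 class even though the added points are classified correctly, so points of that class near the original boundary can become misclassified.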