Logistic Regression: Python implementation from scratch without using sklearn

Jan 3, 2022

Table of contents:

Generate data
Split data into train (75%) and test (25%)
Standardize the data
Initialize the weight_vector and intercept
Compute Sigmoid
Compute Log Loss
Calculate Gradient w.r.t. ‘w’
Calculate Gradient w.r.t. ‘b’
Train the custom model
Compare custom model with sklearn SGDClassifier model
End Notes
References

In my previous article, I explained the concepts behind Logistic Regression; please go through it if you want the theory. In this article, I cover a Python implementation of Logistic Regression with L2 regularization using SGD (Stochastic Gradient Descent), written without sklearn's model classes, and compare the result with sklearn's SGDClassifier.

Let’s get started with the Python implementation. Below are the steps:

1. Generate data: First, we use sklearn.datasets.make_classification to generate a classification dataset with n_classes classes (2 in our case).
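A minimal sketch of this step follows. The sample count, feature counts, and random_state are illustrative assumptions, not values taken from the article.

```python
from sklearn.datasets import make_classification

# Illustrative parameters (assumptions): 1000 samples, 15 features, 2 classes.
X, y = make_classification(
    n_samples=1000,
    n_features=15,
    n_informative=10,
    n_redundant=5,
    n_classes=2,
    random_state=15,
)

print(X.shape, y.shape)  # (1000, 15) (1000,)
```

Here X holds one row per sample and y holds the class label (0 or 1) for each row.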

2. Split data into train (75%) and test (25%): using sklearn.model_selection.train_test_split.
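The split could look like the sketch below. The dataset is regenerated here only so the snippet runs on its own, and its size and the random_state values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Regenerate a small illustrative dataset (assumed sizes).
X, y = make_classification(n_samples=1000, n_features=15, random_state=15)

# test_size=0.25 keeps 75% of the rows for training and 25% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=15
)

print(X_train.shape, X_test.shape)  # (750, 15) (250, 15)
```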

3. Standardize the data: using sklearn.preprocessing.StandardScaler. StandardScaler standardizes features by removing the mean and scaling to unit variance. The standard score of a sample x is calculated as z = (x - u) / s, where u is the mean of the training samples and s is their standard deviation.
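As a rough illustration, the snippet below fits the scaler on the training set only and reuses the same u and s for the test set. The random arrays stand in for the real train/test split and are purely an assumption for the sake of a runnable example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): features with mean near 5 and std near 3.
rng = np.random.default_rng(15)
X_train = rng.normal(loc=5.0, scale=3.0, size=(750, 15))
X_test = rng.normal(loc=5.0, scale=3.0, size=(250, 15))

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # learns u and s from the training data
X_test_std = scaler.transform(X_test)        # applies the training u and s

# The same result computed by hand with z = (x - u) / s.
manual = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)
print(np.allclose(X_train_std, manual))  # True
```

Fitting on the training set alone keeps information about the test set out of the preprocessing step.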

4. Initialize the weight vector and intercept term to zeros. The weight vector has one entry per feature, and the intercept is a single scalar.
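One way to write this is sketched below. The function name initialize_weights and its n_features argument are hypothetical, chosen for illustration.

```python
import numpy as np

def initialize_weights(n_features):
    """Return a zero weight vector of length n_features and a zero intercept."""
    w = np.zeros(n_features)  # one weight per feature
    b = 0.0                   # scalar intercept
    return w, b

w, b = initialize_weights(15)
print(w.shape, b)  # (15,) 0.0
```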

5. Compute Sigmoid: sigmoid(z) = 1 / (1 + exp(-z)).
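A direct NumPy translation of the formula might look like this. It is a plain sketch with no safeguards against overflow for very large negative z.

```python
import numpy as np

def sigmoid(z):
    """Map a real value (or array of values) to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))  # 0.5
```

In the model, z is the linear score w·x + b, so sigmoid(z) is read as the predicted probability of class 1.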

6. Compute Log Loss: the average negative log-likelihood of the true labels under the predicted probabilities.
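As a minimal sketch, assuming the standard binary log loss with the natural logarithm, loss = -(1/n) * sum(y_i * log(p_i) + (1 - y_i) * log(1 - p_i)), the computation could be written as follows. The function name and the eps clipping are illustrative choices, and the article's own formula may use a different log base.

```python
import numpy as np

def log_loss(y_true, y_pred, eps=1e-15):
    """Binary log loss (natural log); y_pred holds probabilities of class 1."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so log() never receives 0.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

# Predicting 0.5 for every sample gives a loss of ln(2), about 0.693.
print(log_loss([1, 0], [0.5, 0.5]))
```

Confident correct predictions push the loss toward 0, while confident wrong predictions are penalized heavily.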