
Building A Logistic Regression in Python, Step by Step

Logistic Regression is a supervised classification algorithm used to predict categorical outcomes: the output of the estimator is discrete. The idea behind logistic regression grows out of the intuition of linear regression. For example, as in Figure 1, suppose there are two classes, class 0 and class 1, and we need to estimate the class using an independent variable X. We can start with simple linear regression, fitting a line between the independent variable and the dependent variable (the class). With the fitted line (solid yellow line) we can estimate a probability: a low value is estimated as class 0 and a high value as class 1.

The question now is what threshold probability we should use to assign a class to a sample. By default we can take 0.5 as the threshold and estimate the class accordingly. But there is a problem with the default threshold of 0.5. Suppose there are extreme values of X (not outliers); by common sense these samples also belong to class 1.
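The effect described above can be sketched with synthetic data (the values below are illustrative, not from the figure): fitting an ordinary least-squares line to 0/1 labels and thresholding at 0.5 misclassifies a clear class-1 point once extreme X values are present, because they drag the line down.

```python
import numpy as np

# Synthetic 1-D data: class 0 at small X, class 1 at large X,
# plus a few extreme (but valid, non-outlier) class-1 values of X.
X = np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0, 40.0, 50.0])
y = np.array([0,   0,   0,   1,   1,   1,   1,    1])

# Ordinary least-squares fit: y_hat = b0 + b1 * X
b1, b0 = np.polyfit(X, y, 1)
y_hat = b0 + b1 * X

# Threshold at 0.5: the extreme X values flatten the line,
# so X = 6 (a clear class-1 sample) falls just below 0.5.
pred = (y_hat >= 0.5).astype(int)
print(pred)
```

Running this shows the sample at X = 6 predicted as class 0 even though its true label is 1, which is exactly why a fixed 0.5 cut on a least-squares line breaks down.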

When you fit a linear regression that includes the extreme values, the line looks like the yellow dotted line, and the default threshold of 0.5 is no longer appropriate for estimating the class. This happens because linear regression fits by least squares, and least squares is not appropriate for this problem. To tackle it, logistic regression uses Maximum Likelihood Estimation (MLE), where the goal is to maximize the likelihood. In most data-science optimizations the goal is to find a minimum, either with calculus (minimizing the sum of squared errors in linear regression, and so on) or with numerical techniques like gradient descent (minimizing the deviance in logistic regression, and so on). Maximizing the likelihood is equivalent to minimizing the negative log-likelihood.
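The "Maximum Likelihood => Minimum of Negative Log-Likelihood" equivalence can be made concrete. A minimal sketch (the function names here are my own, not from the repository): for predicted probabilities p and labels y, the negative log-likelihood is the quantity that gradient-based optimizers actually minimize.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(beta, X, y):
    """Negative log-likelihood of a logistic model.

    Maximizing the likelihood is the same as minimizing this value,
    so MLE fits reduce to an ordinary minimization problem.
    """
    p = sigmoid(X @ beta)
    eps = 1e-12  # guard against log(0)
    return -np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Toy check: coefficients that separate the classes well should give
# a lower negative log-likelihood than coefficients that invert them.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0, 0, 1, 1])
good = neg_log_likelihood(np.array([-3.0, 2.0]), X, y)
bad = neg_log_likelihood(np.array([3.0, -2.0]), X, y)
print(good < bad)
```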

Logistic Regression in Python

Step-1: Develop a transformed linear regression and compute the probability of each data point

Output of Probability Using Linear Regression

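Step-1 can be sketched as follows: the linear predictor is passed through the logistic (sigmoid) transform so every data point gets a probability in (0, 1). The coefficients below are hypothetical, chosen only to illustrate the transform.

```python
import numpy as np

def sigmoid(z):
    # Logistic transform of the linear predictor
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, beta):
    """Transformed linear regression: probability for each data point."""
    return sigmoid(X @ beta)

# First column of X is the intercept term; beta is illustrative.
X = np.array([[1.0, -2.0], [1.0, 0.0], [1.0, 3.0]])
beta = np.array([0.5, 1.0])
print(predict_proba(X, beta))
```

Unlike the raw linear-regression output, these values can never leave the (0, 1) interval, so they are always interpretable as probabilities.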

Step-2: Find the best odds ratio using MLE.

Output of Maximum Likelihood Estimation

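One way to carry out Step-2 is gradient descent on the negative log-likelihood, whose gradient has the closed form X.T @ (p - y). This is a sketch under that assumption (the repository may use a different optimizer); np.exp(beta[j]) then gives the estimated odds ratio for feature j.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_mle(X, y, lr=0.1, n_iter=5000):
    """Maximize the likelihood by gradient descent on the
    negative log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (p - y) / len(y)  # gradient of the NLL
        beta -= lr * grad
    return beta

# Toy fit: symmetric separable data, first column is the intercept.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
beta = fit_mle(X, y)
print(np.exp(beta[1]))  # estimated odds ratio for the feature
```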

Combining Linear Regression and Odds Ratio = Logistic Regression

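Putting Step-1 (the sigmoid-transformed linear model) and Step-2 (MLE by gradient descent) together gives logistic regression. A minimal self-contained sketch, not the repository's exact notebook code:

```python
import numpy as np

class LogisticRegressionGD:
    """Logistic regression = sigmoid-transformed linear regression (Step-1)
    fit by maximizing the likelihood with gradient descent (Step-2)."""

    def __init__(self, lr=0.1, n_iter=5000):
        self.lr = lr
        self.n_iter = n_iter

    def _add_intercept(self, X):
        # Prepend a column of ones so beta[0] is the intercept
        return np.hstack([np.ones((X.shape[0], 1)), X])

    def fit(self, X, y):
        Xb = self._add_intercept(X)
        self.beta = np.zeros(Xb.shape[1])
        for _ in range(self.n_iter):
            p = 1.0 / (1.0 + np.exp(-(Xb @ self.beta)))
            grad = Xb.T @ (p - y) / len(y)  # gradient of the NLL
            self.beta -= self.lr * grad
        return self

    def predict_proba(self, X):
        Xb = self._add_intercept(X)
        return 1.0 / (1.0 + np.exp(-(Xb @ self.beta)))

    def predict(self, X, threshold=0.5):
        return (self.predict_proba(X) >= threshold).astype(int)

# Toy usage on a 1-D separable problem
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 0, 1, 1])
model = LogisticRegressionGD().fit(X, y)
print(model.predict(X))
```

On this separable toy data the fitted model recovers all four labels; for real use, a library implementation such as scikit-learn's `LogisticRegression` adds regularization and better optimizers.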


by datascienceanywhere

About

Math behind logistic regression, using Python.
