Logistic Regression

Algorithm

Logistic regression is used to classify instances based on the values of their predictor variables.

It is commonly applied in scenarios where the goal is to predict the probability that an input data item belongs to a certain class. The output is a probability value between 0 and 1, indicating the likelihood of the input belonging to the target class.

Logistic regression works by fitting a logistic function to the data, which models the probability of the dependent variable as a function of the independent variables. The logistic function, also known as the sigmoid function, ensures that the output values are constrained between 0 and 1.

For example, in a binomial logistic regression, the classification is between two groups, such as predicting whether an individual will be employed or not. The model estimates the probability of employment based on predictor variables like education level and work experience. If the probability is greater than 0.5, the individual is classified as employed; otherwise, they are classified as not employed.

Binomial logistic regression is the simplest form of logistic regression where the outcome variable is binary. It is used to predict the probability of one of two possible outcomes, such as success/failure or yes/no. The model uses a logistic function to model the probability of the outcome as a function of the predictor variables.

Multinomial logistic regression extends this concept to cases where there are three or more mutually exclusive categories. It calculates the probability for each category using separate binomial logistic regressions and combines them into a single model. An example is predicting the type of cuisine a person might prefer based on their dietary habits and preferences.

Nested logistic regression is used when there are hierarchical structures within the choices being modeled. For instance, predicting whether a consumer will choose beef, pork, salad, or lentils might start with a dichotomy into meat/non-meat to capture the fact that only two of the choices are relevant to vegetarians.

Logistic regression is important because it provides a probabilistic framework for classification tasks and can handle binary, multinomial, and ordinal outcomes. It is widely used due to its simplicity and effectiveness in many practical applications.

Alias
Logit Regression Probit Regression Maximum Entropy Classifier
Related terms
Classification Supervised Learning Probability