Training Sessions in Statistics

Regression Models for Categorical Data

Linear regression is not appropriate for modeling binary responses such as pass/fail or alive/dead. Learn the principle of logistic regression, its similarities with linear regression, and its specific tools. Good practices for model building are also presented.

Course Details

Upon completion of this module, participants will be able to:

  • Understand the context of use of logistic regression
  • Understand why ordinary regression fails for the modeling of categorical variables
  • Construct a logistic regression model
  • Assess the goodness-of-fit of the model to the data
  • Identify common issues in logistic regression, diagnose problems and fix them
  • Interpret statistical software output

This module is aimed at all scientific staff who collect categorical data and who must make decisions based on them. The regression techniques covered in this session will be particularly useful for people who deal with qualitative response variables (measurements) in finance, epidemiology, medicine, genetics, social sciences, econometrics and marketing.

Participants must have attended the following modules.

or possess an equivalent background.

Introduction to Logistic Regression

  • Goal: To Study the Relationship between a Categorical Variable and a Set of Explanatory Variables
  • Why Does Ordinary Multiple Linear Regression Fail for the Analysis of a Categorical Response Variable?
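
As a rough illustration of the second point, the sketch below fits ordinary least squares to a binary outcome and shows that the fitted line can return "probabilities" below 0 or above 1. The toy data (hours of study versus pass/fail) and the helper name `linear_prediction` are hypothetical and are not taken from the course materials.

```python
import numpy as np

# Hypothetical toy data (an assumption for illustration only):
# x = hours of study, y = 1 for pass, 0 for fail.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)

# Ordinary least squares fit of y on x, with an intercept column.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

def linear_prediction(x_new):
    """Fitted value of the straight line at x_new (hypothetical helper)."""
    return beta[0] + beta[1] * np.asarray(x_new, dtype=float)

# Predictions at a low, a central and a high value of x.
preds = linear_prediction([-2.0, 4.5, 12.0])
print(preds)  # the first value is negative, the last one exceeds 1
```

Because a straight line is unbounded, nothing keeps its fitted values inside the interval [0, 1], which is one reason a model for the probability itself is preferred.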

Refresher on Multiple Linear Regression

  • Definition and Estimation of the Model
  • Interpretation of Model Coefficients
  • Goodness-of-Fit and Validation Techniques

Classical Case: a Binary Response Variable

  • Basic Principle: Modeling the Probability of Observing a Given Value of the Response Variable
  • Example
  • Interpretation of Statistical Software Output: Coefficients and Mathematical Transformations, Odds Ratios, Statistical Testing of Model Coefficients
  • Comparison of Logistic Regression Software Output with Multiple Linear Regression
  • Goodness-of-Fit Measures: Nested Models, Cross-Validation Techniques
  • Using the Model for Predictive Purposes
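
The topics above can be sketched in a few lines: a minimal logistic regression fitted by Newton-Raphson on the log-likelihood, using only NumPy. The toy data, the function name `fit_logistic` and the fixed iteration count are assumptions made for illustration; in practice, statistical software provides this fit together with standard errors and tests.

```python
import numpy as np

# Hypothetical toy data (an assumption for illustration only):
# x = hours of study, y = 1 for pass, 0 for fail.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])  # intercept + one predictor

def sigmoid(z):
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Maximum likelihood fit by Newton-Raphson (hypothetical helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)               # current fitted probabilities
        weights = p * (1.0 - p)             # Bernoulli variances
        gradient = X.T @ (y - p)            # score vector
        hessian = X.T @ (X * weights[:, None])  # observed information
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

beta = fit_logistic(X, y)
probs = sigmoid(X @ beta)        # fitted probabilities, always in (0, 1)
odds_ratio = np.exp(beta[1])     # multiplicative change in odds per unit of x
score = X.T @ (y - probs)        # should be close to zero at the optimum

print("coefficients:", beta)
print("odds ratio for x:", odds_ratio)
```

The model is linear on the log-odds scale, so the exponential of a coefficient reads as an odds ratio. New observations can be scored with `sigmoid(X_new @ beta)`, where `X_new` includes the intercept column.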

Overview of the Case of Ordinal and Nominal Response Variables

Practical Considerations

  • Procedures Available in Statistical Software
  • Implementation and Interpretation

Recommended Duration: 1 day

Course materials:

  • Course notes on the statistical techniques
  • Sample datasets

    Related Sessions

    • An applied set of modules focusing on the most widely used multivariate methods and their applications in several fields. Learn about the principle of the methods, the data needed, and the information they provide.

    • Learn about preference mapping techniques to explore and understand consumer preferences. Applications dealing with segmentation and the identification of niche markets are discussed. Focus on pitfalls and good practices.

    • Predictive analytics (PA) is on everyone's lips. But what is it really all about? Discover its principle, implementation, typical pitfalls and good practices. An overview of the most commonly used models is provided.

    • The primary goal of this method is to discover which variables are best able to discriminate between two or more known groups in your data. Discriminant analysis may also be used to build predictive analytics models.