Applied Empirical Modeling of Nonlinearity and Endogeneity in Regression Models (WS 2017/18)

Course Number

Field(s) of Study
PhD students

PhD Seminar

Course Language

Course schedule

Day       Time         Frequency    Date        Room
Thursday  09:00-18:00  single date  05.10.2017
Friday    09:00-18:00  single date  06.10.2017
Monday    09:00-18:00  single date  09.10.2017
Tuesday   09:00-12:00  single date  10.10.2017


The course is given by Prof. Dr. Richard T. Gretz (University of Texas at San Antonio).

The course takes place in room 006, MCM.

All doctoral students and post-docs of the faculty of business and economics, as well as Minor Research students, may apply for this course by sending an email to Tanja Geringhoff. Please note that the course is limited to 25 participants for didactic reasons. Places will be allocated on a first-come, first-served basis. The application deadline is September 7, 2017.

Additional course information (e.g., the complete syllabus) and materials will be made available to the course participants via learnweb. The enrollment passphrase will be sent to the course participants via e-mail.

Information for students of the Minor Research: If your application was successful, please register at the examination office for the early examination period. The examination modalities will be published here once they have been finalized.

Information for all PhD students: The course is handled as an A certificate for the PhD program.


Empirical problems often do not fit the modeling assumptions of Ordinary Least Squares (OLS) estimation. This workshop looks at two scenarios that are frequently encountered in applied work: (1) nonlinearities in dependent and independent variables and (2) endogeneity and non-random sample selection, which are addressed with instrumental variable techniques. The goal of this workshop is to provide researchers with tools to address some of the inadequacies of traditional OLS estimation in each setting.

We begin by looking at different nonlinear approaches to modeling discrete choice. We also extend the theme of nonlinearity to the independent variable side by discussing the interpretation of interaction effects in traditional OLS. Then we consider different instrumental variable strategies to deal with the problem of endogeneity. Finally, we combine the themes of nonlinearity and instrumental variables by considering selection models to deal with non-random samples.

By the end of the workshop, you should be able to understand and (equally important) run Stata code to model dichotomous dependent variables with logit and probit estimations, perform instrumental variable estimation and accompanying tests of instrument exogeneity and relevance, and estimate models controlling for selection bias.

To achieve these goals, the course is structured in a way that aims at covering the following sub-questions and aspects:


Thursday Morning: Dichotomous Dependent Variables and Probit/Logit Estimation – Part 1

  1. What are dichotomous dependent variables and why can’t we use regular OLS?
  2. What is a “link” function? How do probit and logit link to probability?
  3. How do probit/logit improve on OLS? How do you interpret probit/logit coefficients?
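
As a rough illustration of the link-function idea in question 2, the sketch below (in Python rather than the Stata used in the workshop) shows how the logistic CDF and the standard normal CDF map any linear index into the interval (0, 1), while the untransformed linear index can fall outside it. The coefficient values are made up purely for illustration and are not taken from the course materials.

```python
import math


def logit_link(index):
    """Logistic CDF: maps a linear index x'b to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-index))


def probit_link(index):
    """Standard normal CDF: the probit counterpart of logit_link."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))


# Hypothetical coefficients, chosen only for illustration.
intercept, slope = 0.2, 0.3

for x in (-5.0, 0.0, 5.0):
    index = intercept + slope * x
    # Read as a probability, the raw linear index can be below 0 or above 1;
    # both link functions always stay strictly inside (0, 1).
    print(f"x={x:+.1f}  linear={index:+.2f}  "
          f"logit={logit_link(index):.3f}  probit={probit_link(index):.3f}")
```

The same index is passed to all three columns only to make the comparison visible; in an actual estimation each model would produce its own coefficients.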

Thursday Afternoon: Dichotomous Dependent Variables and Probit/Logit Estimation – Part 2

  1. How do probit/logit coefficients compare to each other and to OLS? How do you determine marginal effects in probit/logit?
  2. What’s the major difference between probit and logit?
  3. How do you test hypotheses in probit/logit? How do you determine which model is “better”?
  4. (time permitting) What about heteroskedasticity?
  5. Review probit and/or logit in action (Geyskens et al 2015; Keller et al 2016; Liu et al 2016; Mantin and Eran 2016).
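
To make the marginal-effects question concrete, here is a minimal Python sketch (the workshop itself works in Stata). It assumes a single continuous regressor: the marginal effect is then the coefficient times the density of the link function at the linear index, so it varies across observations instead of being constant as in OLS. Function names and inputs are hypothetical.

```python
import math


def logit_marginal_effect(beta, index):
    """dP/dx for logit: beta * p * (1 - p), with p the logistic CDF at index."""
    p = 1.0 / (1.0 + math.exp(-index))
    return beta * p * (1.0 - p)


def probit_marginal_effect(beta, index):
    """dP/dx for probit: beta * standard normal density at index."""
    density = math.exp(-0.5 * index ** 2) / math.sqrt(2.0 * math.pi)
    return beta * density


# Same hypothetical coefficient, evaluated at different values of the index.
for index in (0.0, 1.0, 2.0):
    print(f"index={index:.1f}  "
          f"logit ME={logit_marginal_effect(1.0, index):.4f}  "
          f"probit ME={probit_marginal_effect(1.0, index):.4f}")
```

The effect is largest at an index of zero (a predicted probability of 0.5) and shrinks in the tails, which is why marginal effects are usually reported at specific values or averaged over the sample.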

Friday Morning: Nonlinearities on the Right Hand Side and Possible Multicollinearity

  1. How do we incorporate and interpret interaction effects in our regression models?
  2. What is the main effect vs. simple effect? How can Stata help us to determine marginal effects when interactions are present?
  3. What is multicollinearity? How will it impact our estimations? What are some strategies to measure and deal with possible multicollinearity?
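
The following Python sketch illustrates two of the quantities behind these questions, under simplifying assumptions. For a model y = b0 + b_x*x + b_z*z + b_xz*x*z, the effect of x at a given z is b_x + b_xz*z, so the coefficient on x alone is only the effect at z = 0. The variance inflation factor (VIF) is one common multicollinearity measure, computed from the R-squared of regressing one regressor on the others. All coefficient values are hypothetical.

```python
def simple_effect_of_x(b_x, b_xz, z):
    """Effect of x on y at a given z in y = b0 + b_x*x + b_z*z + b_xz*x*z."""
    return b_x + b_xz * z


def vif_from_r_squared(r_squared):
    """Variance inflation factor from the R-squared of the auxiliary
    regression of one regressor on all other regressors."""
    return 1.0 / (1.0 - r_squared)


# Hypothetical coefficients: the effect of x shrinks as z grows.
b_x, b_xz = 0.5, -0.2
for z in (0.0, 1.0, 2.5):
    print(f"z={z:.1f}  effect of x = {simple_effect_of_x(b_x, b_xz, z):+.2f}")

# An auxiliary R-squared of 0.9 inflates the coefficient variance tenfold.
print(f"VIF at R^2=0.9: {vif_from_r_squared(0.9):.1f}")
```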

Friday Afternoon: Endogeneity and Instrumental Variables – Part 1

  1. What is ‘endogeneity’? What are some possible reasons why an independent variable could be endogenous? What impact does endogeneity have on OLS estimates?
  2. What is an ‘instrumental variable’? What are the two important requirements for an instrumental variable? Given these requirements, how can instrumental variables address the problem of endogeneity?
  3. What is Two-Stage Least Squares (2SLS) and how does it compare to OLS?
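
A small simulation can show why these questions matter. The Python sketch below (not the Stata workflow used in the workshop) assumes one endogenous regressor, one instrument, and a made-up data-generating process in which the regressor x is correlated with the error term u. OLS is then biased upward, while 2SLS, which replaces x with its first-stage prediction from the instrument z, recovers a value close to the true coefficient.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    return covariance(x, y) / covariance(x, x)


def two_stage_least_squares(y, x, z):
    """2SLS slope with one endogenous regressor x and one instrument z."""
    # Stage 1: regress x on z and keep the fitted values.
    first_stage_slope = ols_slope(z, x)
    mean_x, mean_z = mean(x), mean(z)
    x_hat = [mean_x + first_stage_slope * (zi - mean_z) for zi in z]
    # Stage 2: regress y on the fitted values from stage 1.
    return ols_slope(x_hat, y)


# Assumed data-generating process (illustrative only).
random.seed(2017)
n = 5000
true_beta = 2.0
z = [random.gauss(0, 1) for _ in range(n)]   # instrument, unrelated to u
u = [random.gauss(0, 1) for _ in range(n)]   # structural error
x = [zi + ui for zi, ui in zip(z, u)]        # x depends on u: endogenous
y = [true_beta * xi + ui for xi, ui in zip(x, u)]

beta_ols = ols_slope(x, y)
beta_iv = two_stage_least_squares(y, x, z)
print(f"true beta = {true_beta}, OLS = {beta_ols:.3f}, 2SLS = {beta_iv:.3f}")
```

Running the two stages by hand like this gives the correct point estimate but not the correct standard errors, which is one reason to rely on a dedicated routine in practice.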

Monday Morning: Endogeneity and Instrumental Variables – Part 2

  1. How do you test for ‘endogeneity’? How do you test the ‘overidentification restrictions’? How are overidentification restrictions interpreted?
  2. How do you test for instrument relevance? What happens when instruments are ‘weak’?
  3. 2SLS vs. Generalized Method of Moments (GMM) estimation
  4. (time permitting) Endogeneity and systems of equations – what is three-stage least squares (3SLS)?
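
As one possible illustration of the instrument-relevance question, the Python sketch below computes the first-stage F statistic for a single instrument, which in that case equals the squared t statistic on the instrument. The simulated data and the two instrument strengths are assumptions made for the example. A first-stage F above 10 is a commonly cited rule of thumb for relevance, not a formal guarantee.

```python
import random


def first_stage_f(x, z):
    """F statistic for H0: the coefficient on z is zero in a regression
    of x on z with an intercept (equals t squared with one instrument)."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    s_zz = sum((zi - mean_z) ** 2 for zi in z)
    s_zx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    slope = s_zx / s_zz
    residual_ss = sum((xi - mean_x - slope * (zi - mean_z)) ** 2
                      for zi, xi in zip(z, x))
    sigma2 = residual_ss / (n - 2)  # residual variance estimate
    return slope ** 2 * s_zz / sigma2


def simulate_first_stage(strength, n, rng):
    """Hypothetical first stage: x = strength * z + noise."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [strength * zi + rng.gauss(0, 1) for zi in z]
    return x, z


rng = random.Random(2017)
x_strong, z_strong = simulate_first_stage(1.0, 500, rng)
x_weak, z_weak = simulate_first_stage(0.02, 500, rng)

f_strong = first_stage_f(x_strong, z_strong)
f_weak = first_stage_f(x_weak, z_weak)
print(f"strong instrument F = {f_strong:.1f}, weak instrument F = {f_weak:.1f}")
```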

Monday Afternoon: Endogeneity and Instrumental Variables – Part 3

  1. Instrumental variables in practice. Two big problems: weak instruments and exogeneity (Murray 2006)
  2. The state of instrumental variables in marketing (Rossi 2014)
  3. The hunt for good instruments (will definitely talk about: Elberse 2010; Germann et al 2015; Levitt 1996; Levitt 1997; Petersen et al 2015) (possibly talk about: Geyskens et al 2015; Keller et al 2016; Liu et al 2016; Mantin and Eran 2016).
  4. “Best Practices”

Tuesday Morning: Selection Bias and Heckman’s Correction

  1. What is selection bias and how will it impact estimation? What is a ‘control function’?
  2. Heckman’s correction
  3. Selection bias in action (Germann et al 2015; Liu 2016; Allen et al 2016)
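
The central ingredient of Heckman's two-step correction can be sketched briefly. After a probit selection equation is estimated, the inverse Mills ratio is computed from each selected observation's fitted index and added as an extra regressor in the outcome equation. The Python sketch below only computes that ratio; the index values are hypothetical and the estimation steps themselves are not shown.

```python
import math


def normal_pdf(value):
    """Standard normal density."""
    return math.exp(-0.5 * value ** 2) / math.sqrt(2.0 * math.pi)


def normal_cdf(value):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(value / math.sqrt(2.0)))


def inverse_mills_ratio(selection_index):
    """phi(index) / Phi(index), where index is the fitted linear index
    from the probit selection equation."""
    return normal_pdf(selection_index) / normal_cdf(selection_index)


# Observations that were unlikely to be selected (low index) get a large
# correction term; those almost certain to be selected get one near zero.
for index in (-2.0, 0.0, 2.0):
    print(f"selection index={index:+.1f}  "
          f"inverse Mills ratio={inverse_mills_ratio(index):.4f}")
```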


  • Ronny Behrens (accompanying)