This paper discusses the differences between linear multiple regression and binary logistic regression, as well as the benefits of the regression model.

## Difference between linear multiple regression and binary logistic regression

Binary logistic regression. Using the ustudy.sav file, the following hypothesis will be tested:

H1: The likelihood of a faculty member being at a public or a private university is predicted by the number of years since they received their doctorate and by their salary.

Dependent variable:

- UNIV (type of university): binary dependent variable. If the dependent variable has more than two levels, a multinomial logistic regression must be used instead of the binary one.

Independent variables:

- YRSPHD (years since PhD): independent variable with ratio data
- SALARY (salary): independent variable with ratio data

You need to know how to run the analysis in SPSS.
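Written out, H1 corresponds to a model of the following general form. This is a sketch of the standard binary logistic specification; the coefficient symbols $b_0$, $b_1$, $b_2$ are notation introduced here, and which type of university is coded 1 is an assumption that depends on how UNIV is coded in the data file.

```latex
\ln\!\left(\frac{P(\text{UNIV}=1)}{1-P(\text{UNIV}=1)}\right)
  = b_0 + b_1\,\text{YRSPHD} + b_2\,\text{SALARY}
```

The left-hand side is the log-odds (logit) of belonging to the category coded 1, so each coefficient describes a change in log-odds per unit change in the predictor, not a change in the outcome itself.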

Note: Categorical variables (with nominal or ordinal data) can be included in the regression model. However, any such categorical predictors must be declared as Categorical in the SPSS Logistic Regression window; SPSS will then dummy-code them automatically.

Procedure: To test the predictive model, a binary logistic regression will be performed as follows:

- Select ANALYZE / REGRESSION / BINARY LOGISTIC... from the menu.
- Move UNIV into the Dependent area and the variables YRSPHD and SALARY into the Covariates area; make sure “Enter” is selected under Method.
- Click OK.
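For readers who want to see what the Enter-method fit computes, the following is a minimal Python sketch using only the standard library. It does not read ustudy.sav: the data are synthetic and generated inside the script, the coding of UNIV as 0/1, the salary scale (thousands), and all function names are assumptions made for illustration, and the numbers it prints will not match the SPSS output for the real file.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def score_and_info(X, y, beta):
    """Gradient of the log-likelihood and the information matrix at beta."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * x for b, x in zip(beta, xi)))
        for i in range(k):
            grad[i] += (yi - p) * xi[i]
            for j in range(k):
                info[i][j] += p * (1 - p) * xi[i] * xi[j]
    return grad, info


def log_likelihood(X, y, beta):
    ll = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * x for b, x in zip(beta, xi)))
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll


def fit_logit(X, y, iters=25):
    """Maximum-likelihood fit by Newton-Raphson, all predictors entered at once."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad, info = score_and_info(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta


# Synthetic stand-in for the data file (hypothetical values, NOT ustudy.sav).
random.seed(0)
X, y = [], []
for _ in range(400):
    yrsphd = random.uniform(1, 35)
    salary = 40 + 1.2 * yrsphd + random.gauss(0, 8)  # assumed to be in thousands
    p_true = sigmoid(-3.0 + 0.02 * yrsphd + 0.05 * salary)
    X.append([1.0, yrsphd, salary])  # leading 1.0 is the constant term
    y.append(1 if random.random() < p_true else 0)

n = len(y)
beta = fit_logit(X, y)
ll_model = log_likelihood(X, y, beta)
p0 = sum(y) / n
ll_null = sum(y) * math.log(p0) + (n - sum(y)) * math.log(1 - p0)

# Model chi-square: drop in -2 log likelihood from the constant-only model.
chi2 = 2 * (ll_model - ll_null)
p_model = math.exp(-chi2 / 2)  # chi-square upper tail, exact for df = 2

# Pseudo R-square values.
cox_snell = 1 - math.exp(-chi2 / n)
nagelkerke = cox_snell / (1 - math.exp(2 * ll_null / n))

# Wald statistics: (B / SE)^2, each referred to a chi-square with df = 1.
info = score_and_info(X, y, beta)[1]
se = [math.sqrt(solve(info, [float(i == j) for i in range(3)])[j]) for j in range(3)]
wald = [(b / s) ** 2 for b, s in zip(beta, se)]
p_wald = [math.erfc(math.sqrt(w / 2)) for w in wald]

print(f"-2 Log likelihood: {-2 * ll_model:.3f}")
print(f"Model chi-square (df=2): {chi2:.3f}, p = {p_model:.4g}")
print(f"Cox & Snell R-square: {cox_snell:.3f}, Nagelkerke R-square: {nagelkerke:.3f}")
for name, b, s, w, p in zip(["Constant", "YRSPHD", "SALARY"], beta, se, wald, p_wald):
    print(f"{name:9s} B = {b:8.4f}  S.E. = {s:.4f}  Wald = {w:7.3f}  p = {p:.4g}")
```

The printed quantities are meant to mirror what SPSS reports in the Omnibus Tests of Model Coefficients, Model Summary, and Variables in the Equation tables, so the sketch can be used to check one's understanding of where each figure comes from.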

### Questions

Answer the following questions:

1. What are the most important differences between linear multiple regression and binary/multinomial logistic regression?

2. Is the regression model significant? Explain. (Check the chi-square, df, and p of the model in the Omnibus Tests of Model Coefficients table.)

3. What are the “R-square” values and how do you interpret them? (See the Model Summary section of the SPSS output.)

4. Explain the difference between the Cox & Snell and the Nagelkerke R-square indicators. (See Hair et al.)

5. What is the interpretation of the -2 Log likelihood value?

6. Are the predictors statistically significant? Explain. (See the Variables in the Equation table.)

7. Why are Wald tests, and not t-tests as in multiple regression, used to answer question 6 above?
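As a reference while working through the questions, the statistics they name are conventionally defined as shown below. The notation is introduced here and is not part of the original exercise: $L_0$ is the likelihood of the constant-only model, $L_M$ the likelihood of the fitted model, $n$ the sample size, $k$ the number of predictors, and $b_j$ an estimated coefficient with standard error $SE(b_j)$.

```latex
\begin{aligned}
-2LL &= -2\ln L_M \\
\chi^2_{\text{model}} &= (-2\ln L_0) - (-2\ln L_M), \qquad df = k \\
R^2_{\text{Cox \& Snell}} &= 1 - \left(\frac{L_0}{L_M}\right)^{2/n} \\
R^2_{\text{Nagelkerke}} &= \frac{R^2_{\text{Cox \& Snell}}}{1 - L_0^{\,2/n}} \\
\text{Wald}_j &= \left(\frac{b_j}{SE(b_j)}\right)^2, \qquad df = 1
\end{aligned}
```

With two predictors (YRSPHD and SALARY), the model chi-square has $df = 2$. The denominator of the Nagelkerke measure is the largest value the Cox & Snell measure can reach for the given data.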