logistic regression gives very strange results

I am puzzled by the results of glm with the quasibinomial family: they are not what I would expect, even in the simplest cases.

I have this contingency table (weighted): xtabs(W_FSTUWT ~ misalignment + ST330Q01WA, data = PISA_2022_ENG)

misalignment          No        Yes
0              51398.124  31953.678
1              18889.466   8652.387

According to this weighted cross-tabulation, the odds ratio for Yes vs No is about 0.74 (i.e. the odds of scoring 1 on this outcome for Yes students are 0.74 times the odds for No students), which should translate into a logit coefficient of about -0.3 (the natural logarithm of 0.74).
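
For reference, this is how those numbers come out of the table above. It is only a sketch of the arithmetic: the object names (`tab`, `odds_yes`, and so on) are illustrative and not taken from the attached script.

```r
# Weighted counts copied from the cross-tabulation above
# (rows: misalignment 0/1, columns: ST330Q01WA No/Yes).
tab <- matrix(
  c(51398.124, 31953.678,
    18889.466,  8652.387),
  nrow = 2, byrow = TRUE,
  dimnames = list(misalignment = c("0", "1"), ST330Q01WA = c("No", "Yes"))
)

# Odds of misalignment = 1 within each group
odds_no  <- tab["1", "No"]  / tab["0", "No"]
odds_yes <- tab["1", "Yes"] / tab["0", "Yes"]

# Odds ratio (Yes vs No) and its natural log, i.e. the expected logit coefficient
odds_ratio <- odds_yes / odds_no
log_odds_ratio <- log(odds_ratio)

round(odds_ratio, 2)      # about 0.74
round(log_odds_ratio, 2)  # about -0.31
```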

glm(formula = misalignment ~ ST330Q01WA, family=quasibinomial, weights = W_FSTUWT, data = PISA_2022_ENG)

gives a logit coefficient of 4.583e+14 (i.e. essentially +infinity). This is very different from the unweighted regression, glm(formula = misalignment ~ ST330Q01WA, family=binomial, data = PISA_2022_ENG), where the coefficient is -0.31419 (much closer to what I would expect).
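
As a sanity check, a weighted quasibinomial fit on just the four cells of the table should recover the log odds ratio I expect. This is a minimal sketch, not the attached script: the data frame `cells` and the weight column `w` are made-up names, and I am assuming the outcome is coded as numeric 0/1 and the predictor has levels "No" and "Yes" as in the table header. The standard errors from a four-row fit are not meaningful; only the coefficient is of interest here.

```r
# One row per cell of the weighted cross-tabulation; the cell totals act as weights.
cells <- data.frame(
  misalignment = c(0, 1, 0, 1),
  ST330Q01WA   = factor(c("No", "No", "Yes", "Yes"), levels = c("No", "Yes")),
  w            = c(51398.124, 18889.466, 31953.678, 8652.387)
)

# Same model formula and family as the problematic call, but on the aggregated cells.
fit <- glm(misalignment ~ ST330Q01WA,
           family  = quasibinomial,
           weights = w,
           data    = cells)

# The slope should match log(odds ratio) from the table, roughly -0.31.
coef(fit)[["ST330Q01WAYes"]]
```

If this gives about -0.31 while the call on PISA_2022_ENG does not, the difference presumably lies in how the variables are stored in that data frame rather than in the weights themselves, but that is a guess I have not verified.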

Whassup?

This comes from code shared by @Jonathan.DIAZ, which I attach: Logistic-Regression-Career-Misalignment-by-Participation-in-Internship.R

Edited by Francesco.AVVISATI