In your GitHub logistic regression folder, there is only one file, called lr.p. I was looking at it.

Question 1: Besides the logistic function and L2 penalty, what other parameters did you use in your logistic regression model?

Question 2: Did you standardize the data before running the model?

Question 3: When you trained the model, did you use cross-validation or something else?

Hi Nancy,

1.)

The logistic regression model came from sklearn:

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

The only non-default parameters were the solver, set to saga (because it's a large, sparse data set), and max_iter, which was increased from 100 to 1000 to improve convergence.
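
For reference, the setup described above can be sketched as follows. This is a minimal illustration, not the contents of lr.p; the variable name model is an assumption, and every parameter not shown is left at its scikit-learn default.

```python
from sklearn.linear_model import LogisticRegression

# Only solver and max_iter are changed from the scikit-learn defaults.
# saga is suited to large, sparse data; max_iter is raised from the
# default of 100 to give the solver more iterations to converge.
model = LogisticRegression(solver="saga", max_iter=1000)
```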

2.)

The diagnosis codes were binary (0/1). Age was not scaled, nor were the interaction terms.

3.)

We employed a train-test split, reserving 1/3 of the data for testing. We stratified on y to ensure we had roughly equal representation of true labels in both our training and test data.
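
A split like the one described above might look like the sketch below. The arrays X and y here are placeholder data made up for illustration, and random_state is an assumption added only to make the example repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data (hypothetical): 300 rows of binary features,
# with 20% positive labels.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 5))
y = np.array([0] * 240 + [1] * 60)

# Reserve 1/3 of the rows for testing; stratify=y keeps the share of
# positive labels roughly the same in the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1 / 3, stratify=y, random_state=0
)
```

Comparing y_train.mean() with y_test.mean() is a quick way to confirm that the label proportions match after the split.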

Did you use the raw count of inpatient admits over the last 12 months, or did you cap it at a fixed value for outliers? If you capped it, what was your cutoff point?

We used the straight 12-month cutoff.

Sorry, I did not make it clear. How did you deal with outliers in the inpatient admit count? Did you cap it at a certain value or just use it as is? Did you include one-day stays in the inpatient admit counts?