Tal Korem
@tkorem.bsky.social
Microbiome, network inference, metabolism and reproductive health. All views are mine.
239 followers · 159 following · 51 posts
tkorem.bsky.social

But CV is used not just for evaluation but also for hyperparameter tuning, and distributional bias impacts hyperparameters that affect regression to the mean. For example, we show that it biases model selection toward weaker regularization, which might hurt generalization and downstream deployment.

Image: A comparison of LOOCV and Rebalanced LOOCV evaluations of logistic regression models with varying regularization strength, on one of the evaluations analyzed above. LOOCV attains its best auROC (0.817) with weak regularization (1e-6 to 1e-2), while Rebalanced LOOCV attains its best auROC (0.845) with strong regularization (100 to 1e5).
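
A minimal sketch of why regression to the mean matters under LOOCV, not the authors' implementation. It uses a "model" that ignores features and predicts the positive rate of its training fold. Holding out a positive lowers that rate and holding out a negative raises it, so the predictions are anti-correlated with the held-out labels. The rebalanced variant here is an assumption about what rebalancing could mean: it also drops one training sample of the opposite class, so every fold has the same class balance. All function names are hypothetical.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loocv_mean_predictor(labels):
    """Standard LOOCV: score each held-out sample with the positive rate of the rest."""
    total, n = sum(labels), len(labels)
    return [(total - y) / (n - 1) for y in labels]


def rebalanced_loocv_mean_predictor(labels):
    """Assumed rebalancing: also drop one training sample of the opposite class,
    so the training positive rate no longer depends on the held-out label."""
    total, n = sum(labels), len(labels)
    # Removing the test sample (label y) and one sample with label 1 - y
    # always removes exactly one positive and one negative in total.
    return [(total - y - (1 - y)) / (n - 2) for y in labels]


# Toy labels: 8 positives, 12 negatives; no features are used at all.
labels = [1] * 8 + [0] * 12

loocv_auc = auroc(labels, loocv_mean_predictor(labels))
rebalanced_auc = auroc(labels, rebalanced_loocv_mean_predictor(labels))

print(loocv_auc)       # 0.0: worse than chance despite no information
print(rebalanced_auc)  # 0.5: chance level, as expected for a null model
```

In this toy setting, a model that shrinks fully toward the training mean is penalized by plain LOOCV, which is consistent with the post's point that tuning on LOOCV can favor weaker regularization.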

tkorem.bsky.social

Importantly, we'd love to hear your comments, feedback, and GitHub issues, in particular if there's additional prior work on this topic that we should note.
