Open
Description
Describe the workflow you want to enable
The current implementation of TargetEncoder uses KFold cross-validation (cross fitting) to avoid data leakage. For longitudinal or clustered data, it is desirable to ensure that all rows belonging to the same group or cluster fall in the same fold, so that a row is never encoded with target statistics computed from other rows of its own group.
Describe your proposed solution
This could be achieved by introducing an optional group parameter and using GroupKFold cross-validation when the group is not None.
Describe alternatives you've considered, if relevant
The alternative is to continue ignoring group structure.
Additional context
No response