A very popular/important video:
https://www.coursera.org/learn/competitive-data-science/lecture/LGYQ2/regularization
Mean encoding regularization
CV loop
LOO (leave-one-out). Using the target variable to generate the new feature makes the encoding biased.
Smoothing.
Noise.
Expanding mean.
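The regularization options listed above can be sketched on a toy frame. This is only a minimal illustration, not the lecture's code: the column names `cat` and `target`, the fold assignment, and the smoothing strength `alpha` are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Toy data; the column names "cat" and "target" are hypothetical.
df = pd.DataFrame({
    "cat":    ["a", "a", "a", "b", "b", "c", "a", "b"],
    "target": [1, 0, 1, 0, 0, 1, 1, 1],
})
global_mean = df["target"].mean()

# CV loop: encode each fold with category means estimated on the other folds.
# Folds are assigned round-robin here purely to keep the sketch deterministic.
n_splits = 4
fold = np.arange(len(df)) % n_splits
df["enc_cv"] = np.nan
for k in range(n_splits):
    is_enc = fold == k
    fold_means = df[~is_enc].groupby("cat")["target"].mean()
    df.loc[is_enc, "enc_cv"] = df.loc[is_enc, "cat"].map(fold_means)
# Categories unseen outside the fold fall back to the global mean.
df["enc_cv"] = df["enc_cv"].fillna(global_mean)

# Leave-one-out: category mean computed without the row's own target.
cat_sum = df.groupby("cat")["target"].transform("sum")
cat_cnt = df.groupby("cat")["target"].transform("count")
df["enc_loo"] = ((cat_sum - df["target"]) / (cat_cnt - 1)).fillna(global_mean)

# Smoothing: shrink the category mean toward the global mean.
# alpha acts like a number of "pseudo-rows" carrying the global mean.
alpha = 2.0
stats = df.groupby("cat")["target"].agg(["mean", "count"])
smoothed = (stats["mean"] * stats["count"] + global_mean * alpha) / (stats["count"] + alpha)
df["enc_smooth"] = df["cat"].map(smoothed)

# Expanding mean: for each row, use only the targets of earlier rows
# in the same category (row order matters, so shuffle beforehand if needed).
prev_sum = df.groupby("cat")["target"].cumsum() - df["target"]
prev_cnt = df.groupby("cat").cumcount()
df["enc_expanding"] = (prev_sum / prev_cnt).fillna(global_mean)

print(df)
```

Noise is not shown separately; it would amount to adding a small random perturbation to any of these encoded columns on the training set.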
------------------
Generalizations and extensions of mean encodings: for regression and multiclass tasks.
Many-to-many relations: e.g. classifying users based on the apps installed on their phones. Each user can have multiple apps, and each app can be installed by many users, hence a many-to-many relation.
In this case, convert the data to a long representation, so that each row is <user_id, app_id, target>, like <uid1, app_id1, target(0 or 1)>. Now you can take the mean of the targets for every app. But how to map it back to users?
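One possible answer to the mapping question, sketched below, is to aggregate each user's per-app encodings with simple statistics such as mean, min and max. This is an assumption about how the step could be done, not something stated in these notes. The frame `users` and its columns are hypothetical, and no regularization is applied, to keep the example short.

```python
import pandas as pd

# Hypothetical input: one row per user, with a list of installed apps.
users = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "apps":    [["a1", "a2"], ["a2", "a3"], ["a1"]],
    "target":  [1, 0, 1],
})

# Long representation: one row per (user_id, app_id, target).
long_df = users.explode("apps").rename(columns={"apps": "app_id"})

# Mean target per app (unregularized, for illustration only).
app_means = long_df.groupby("app_id")["target"].mean()
long_df["app_enc"] = long_df["app_id"].map(app_means)

# Map back to users: each user has a vector of app encodings,
# which is summarized with a few statistics.
user_feats = long_df.groupby("user_id")["app_enc"].agg(["mean", "min", "max"])
print(user_feats)
```

In practice the per-app means would be computed with one of the regularization schemes from the first part of these notes, since the plain mean leaks each user's own target.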
Interactions and numerical features -?