Sequential User History Features Not Utilized in Model Input Pipeline in train_amazon_inttower.py #13

CHETAN1KUKREJA · 2024-11-25T00:24:47Z

Brief

@archersama
In IntTower/train_amazon_inttower.py, the user history is computed at line 141, but the feature is never used for prediction. Why?

Current Behavior

The implementation processes user historical interaction data through get_user_feature() and get_var_feature(), generating train_user_hist, but this processed sequential data is not included in the final model input. Currently, only sparse and dense features are passed to the model:

train_model_input = {name: train[name] for name in sparse_features + dense_features}

Expected Behavior

Sequential user history should be incorporated into the model input to leverage historical user-item interactions for better recommendations. The train_user_hist generated from get_var_feature() should be included in the model's input features.
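
To make the request concrete, here is a minimal sketch of what passing the history alongside the other features might look like. It assumes train_user_hist is a zero-padded 2-D integer array with one row per training example; the key name "user_hist" and the toy data are hypothetical, and the model would also need a matching sequence feature declared, which this sketch does not cover.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real training frame and feature lists (hypothetical values).
train = pd.DataFrame({
    "reviewerID": [0, 1, 2],
    "asin": [5, 6, 7],
    "price": [0.1, 0.5, 0.9],
})
sparse_features = ["reviewerID", "asin"]
dense_features = ["price"]

# Assumed shape: (num_rows, max_len), zero-padded item ids for each row.
train_user_hist = np.array([[3, 4, 0], [5, 0, 0], [6, 7, 8]])

# Same construction as the current script ...
train_model_input = {name: train[name] for name in sparse_features + dense_features}
# ... plus the history under a hypothetical key; it would have to match
# whatever sequence feature name the model is configured with.
train_model_input["user_hist"] = train_user_hist

# Sanity check: one history row per training example.
assert len(train_model_input["user_hist"]) == len(train)
```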

Impact

This oversight means the model is currently making predictions based only on:

Static user features (reviewerID, user_mean_rating)
Static item features (asin, categories, item_mean_rating, price)

It's missing the valuable sequential patterns in user behavior that have already been processed but aren't being utilized.

Labels: bug, enhancement, feature-engineering


archersama · 2024-11-25T01:56:15Z

We found that this feature could cause data leakage, so we removed it. As for the results below autoint, can you report both autoint and inttower results?

CHETAN1KUKREJA · 2024-11-25T02:25:07Z

A data leak? How exactly?
Without that feature, wouldn't the recommendations be based only on the user's average rating and the item's categories, item_mean_rating, and price?
That would not be very personalized, right?

archersama · 2024-11-26T11:26:42Z

"User history is important. However, constructing an appropriate user history is crucial. You might try creating an example of user history yourself."
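
As one illustration of constructing a history with leakage in mind (a guess at a common pitfall, not necessarily the leak the maintainers found): a history can leak if it contains the item being predicted or interactions that happened later. The sketch below builds each row's history only from that user's earlier rows. The function name build_prior_history, the column unixReviewTime, and the padding id 0 are assumptions made for this example.

```python
import numpy as np
import pandas as pd

def build_prior_history(df, max_len=3, pad_id=0):
    """Return the time-sorted frame and a (num_rows, max_len) array holding,
    for each row, the user's most recent earlier item ids (right-padded).
    Assumes max_len >= 1 and that real item ids never equal pad_id."""
    df = df.sort_values(["reviewerID", "unixReviewTime"], kind="stable")
    df = df.reset_index(drop=True)
    seen = {}        # user id -> items interacted with so far, in time order
    histories = []
    for user, item in zip(df["reviewerID"], df["asin"]):
        past = seen.setdefault(user, [])
        recent = past[-max_len:]
        histories.append(recent + [pad_id] * (max_len - len(recent)))
        # The current item is added only after its own history is recorded,
        # so a row never sees its own target or anything later.
        past.append(item)
    return df, np.array(histories, dtype=np.int64)

# Toy data (hypothetical values).
reviews = pd.DataFrame({
    "reviewerID": [1, 1, 1, 2],
    "asin": [10, 11, 12, 10],
    "unixReviewTime": [300, 100, 200, 50],
})
sorted_reviews, hist = build_prior_history(reviews)
```

Rows that share a timestamp keep their input order here, so the earlier row in the file is treated as prior; whether that is appropriate depends on the data.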

CHETAN1KUKREJA · 2024-12-01T21:57:46Z

OK, I will try. Could you guide me on this? It would be much easier if you told me what to do, since then I wouldn't have to go through the entire code.

Meanwhile, can you please tell me what the data leak was that you were talking about?
