processing dataset errors #6

Open
xingjinshuo opened this issue Aug 27, 2024 · 5 comments
Comments

@xingjinshuo

Where can I find the Beauty and Toys datasets? I found All_Beauty and Toys_and_Games at https://datarepo.eng.ucsd.edu/mcauley_group/data/amazon_v2/categoryFiles/, but the numbers of users and items I get after processing them with data_preprocess.py do not match the numbers in the paper.

@97z

97z commented Sep 3, 2024

I have the same problem. I use the Amazon Review 2018 data and downloaded both the review data and the metadata. For the video domain my counts match the paper, but for the movie domain they do not: I get 311143 and 86678, whereas the paper reports "297,498 59,944".
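To compare counts like these against the paper, it can help to count distinct users and items directly in the raw review file, before any filtering. The following is a minimal sketch, not code from this repository. It assumes the reviews are stored one JSON object per line with the user under `reviewerID` and the item under `asin`; the function name `count_users_items` is made up for illustration.

```python
import json
from typing import Iterable, Tuple


def count_users_items(
    lines: Iterable[str],
    user_key: str = "reviewerID",  # assumed field name for the user id
    item_key: str = "asin",        # assumed field name for the item id
) -> Tuple[int, int]:
    """Return (number of distinct users, number of distinct items)."""
    users, items = set(), set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        users.add(record[user_key])
        items.add(record[item_key])
    return len(users), len(items)
```

Because the function accepts any iterable of lines, it can be fed an open file handle, for example one returned by `gzip.open(path, "rt")` for a compressed review file.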

@ghdtjr
Owner

ghdtjr commented Sep 7, 2024

@xingjinshuo We used the Luxury_Beauty dataset, not All_Beauty.

The preprocessing thresholds vary by dataset. I don't think I implemented automatic threshold selection based on the dataset, so you should check the threshold value, which is the minimum number of interactions.
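For readers unsure what a minimum-interaction threshold does to the user and item counts, here is a small sketch. It is not the code in data_preprocess.py; the function name `filter_min_interactions`, its parameters, and the `iterate` option are assumptions made for illustration. With `iterate=False` the counts are computed once on the raw data; with `iterate=True` the filter is repeated until nothing more is removed (a strict k-core). The two variants can give different counts, which is one possible source of mismatches.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Interaction = Tuple[str, str]  # (user_id, item_id)


def filter_min_interactions(
    interactions: Sequence[Interaction],
    min_interactions: int,
    iterate: bool = False,
) -> List[Interaction]:
    """Keep interactions whose user and item both meet the threshold."""
    kept = list(interactions)
    while True:
        user_counts = Counter(user for user, _ in kept)
        item_counts = Counter(item for _, item in kept)
        filtered = [
            (user, item)
            for user, item in kept
            if user_counts[user] >= min_interactions
            and item_counts[item] >= min_interactions
        ]
        # Stop after one pass, or once a pass removes nothing.
        if not iterate or len(filtered) == len(kept):
            return filtered
        kept = filtered
```

Running this with several threshold values and counting the surviving users and items shows how sensitive the dataset statistics are to that single setting.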

@xingjinshuo
Author

@ghdtjr I understand, thanks for your answer. Is "Toys" the "Toys and Games" category in the Amazon review dataset?

@ghdtjr
Owner

ghdtjr commented Sep 8, 2024

@xingjinshuo You're right. The "Toys" dataset means the "Toys and Games".

@97z

97z commented Sep 10, 2024

> @xingjinshuo You're right. The "Toys" dataset means the "Toys and Games".

Hello, your paper mentions that you select 30K (the paper says 3K, which may be a typo), but the whole dataset after 4-core filtering is larger than 30K. Do you select the users randomly, or in some other way? Thank you for your reply!
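The thread does not say how the subset was chosen. If one wants to try the random-selection hypothesis, a reproducible way to draw a fixed number of users is sketched below. The function name `sample_users`, its parameters, and the default seed are all hypothetical and are not taken from the repository or the paper.

```python
import random
from typing import Iterable, List


def sample_users(user_ids: Iterable[str], n_users: int, seed: int = 0) -> List[str]:
    """Draw n_users distinct users uniformly at random with a fixed seed."""
    # Sort the unique ids so the result does not depend on input order.
    unique = sorted(set(user_ids))
    if n_users >= len(unique):
        return unique
    rng = random.Random(seed)  # local generator, leaves global state untouched
    return rng.sample(unique, n_users)
```

Interactions would then be restricted to the sampled users before recounting users and items.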
