warn supported dataset checks instead of throw #260

Merged
wanchaol merged 2 commits into main from dataset_warn
Apr 24, 2024

Conversation

wanchaol
Contributor

No description provided.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Apr 24, 2024
@wanchaol wanchaol merged commit be432e1 into main Apr 24, 2024
4 checks passed
@wanchaol wanchaol deleted the dataset_warn branch April 24, 2024 06:40
@@ -84,6 +79,13 @@ def __init__(
logger.info(f"Preparing {dataset_name} dataset from HuggingFace")
# Setting `streaming=True` works for large dataset, but is slightly
# slower and unstable.
if dataset_name not in _supported_datasets:
Contributor

This is not quite right because of L99. I think we can & should only bypass the check when dataset_path is specified.
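
A minimal sketch of what this suggestion could look like, not the code that was merged. The function name `resolve_dataset_source`, the placeholder contents of `_supported_datasets`, and the exact messages are assumptions made for illustration; only `dataset_name`, `dataset_path`, `_supported_datasets`, and `logger` come from the diff and the comment above.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Placeholder mapping for illustration only; the real contents of
# `_supported_datasets` are not shown in this pull request.
_supported_datasets = {"example_dataset": "example-org/example_dataset"}


def resolve_dataset_source(dataset_name: str, dataset_path: Optional[str] = None) -> str:
    """Return the location to load `dataset_name` from (hypothetical helper).

    An unsupported name is tolerated, with a warning, only when the caller
    supplies an explicit `dataset_path`. Without a path there is nothing to
    fall back on, so the check still raises.
    """
    if dataset_name not in _supported_datasets:
        if dataset_path is None:
            raise ValueError(
                f"Dataset {dataset_name} is not supported and no dataset_path "
                f"was given. Supported datasets: {list(_supported_datasets)}"
            )
        logger.warning(
            f"Dataset {dataset_name} is not tested or verified; "
            f"loading it from {dataset_path}"
        )
        return dataset_path

    # Supported name: an explicit path wins, otherwise use the default source.
    return dataset_path or _supported_datasets[dataset_name]
```

With this shape, a later lookup such as `_supported_datasets[dataset_name]` is reached only for supported names, which appears to be the concern behind the reference to L99.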

tianyu-l pushed a commit to tianyu-l/torchtitan_intern24 that referenced this pull request Aug 16, 2024
philippguevorguian pushed a commit to YerevaNN/YNNtitan that referenced this pull request Aug 17, 2024
4 participants