Algorithm to randomly collect YouTube videos #169

Closed
longshuicy opened this issue Jun 13, 2024 · 0 comments · Fixed by ncsa/standalone-smm-smile-graphql#31 or #171

longshuicy commented Jun 13, 2024

  1. Prefix Generation:

    • Generate a set of random prefixes, which are combinations of letters (lowercase and uppercase) and numbers.
    • Restrict prefixes to 3-5 characters in length for a higher probability of results.
  2. Search Function Modification:

    • Modify the video search endpoint to accept each prefix as a query of the form "watch?v=" + prefix.
    • Ensure the search can handle both case-sensitive and case-insensitive inputs.
  3. Sampling Method:

    • Search each generated prefix to retrieve YouTube videos with IDs starting with those prefixes followed by a dash.
    • Continue searching with different prefixes until a desired number of videos (e.g., 1000) are accumulated.
    • Note: Prefixes containing a dash should be excluded as they do not yield results.
  4. Quota Management:

    • Include a warning in the UI that the sampling process involves multiple searches and could use up the daily quota.
    • Provide a link to the article explaining the method used for transparency.
  5. UI Integration:

    • Add a button in the search function to generate a random sample of videos.
    • Display progress and results of the sampling process in the UI.
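
A minimal sketch of step 1 and the query string from step 2, in Python for illustration only. The function names (`random_prefix`, `generate_prefixes`, `build_query`) are hypothetical, and the `watch?v=` query shape is an assumption based on the usual YouTube watch URL, so it should be checked against the paper before use.

```python
import random
import string

# Letters and digits only. Dashes are left out because the issue notes
# that prefixes containing a dash do not yield results.
ALPHABET = string.ascii_letters + string.digits


def random_prefix(rng, min_len=3, max_len=5):
    """Return one random prefix of min_len..max_len characters."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def generate_prefixes(count, seed=None):
    """Return `count` distinct random prefixes, sorted for reproducibility."""
    rng = random.Random(seed)
    prefixes = set()
    while len(prefixes) < count:
        prefixes.add(random_prefix(rng))
    return sorted(prefixes)


def build_query(prefix, query_head="watch?v="):
    """Build the search string for one prefix (query_head is an assumption)."""
    return query_head + prefix
```

Passing a `seed` makes a sample reproducible, which may help when debugging prefixes that return nothing.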

Additional Notes:

  • If the initial implementation faces issues or if there is a more efficient way to achieve the same goal, consider alternative methods.
  • Test thoroughly to handle cases where prefixes yield no results.
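
One possible shape for the sampling loop, sketched in Python with the search call injected so it can be tested without touching the real API. Everything here is an assumption for illustration: `collect_sample`, `matches_prefix`, the `max_searches` budget, and the idea that `search(prefix)` returns a list of video ID strings are not part of the existing code.

```python
def matches_prefix(video_id, prefix, case_sensitive=True):
    """True if video_id starts with prefix followed by a dash."""
    head = video_id[: len(prefix) + 1]
    want = prefix + "-"
    if not case_sensitive:
        head, want = head.lower(), want.lower()
    return head == want


def collect_sample(prefixes, search, target=1000, max_searches=100,
                   case_sensitive=True):
    """Collect up to `target` unique IDs using at most `max_searches` calls.

    `search` is a caller-supplied function: prefix -> list of video IDs.
    """
    collected = set()
    empty_prefixes = []
    searches_used = 0
    for prefix in prefixes:
        if len(collected) >= target or searches_used >= max_searches:
            break
        if "-" in prefix:
            continue  # skipped without spending a search
        searches_used += 1
        hits = [v for v in search(prefix)
                if matches_prefix(v, prefix, case_sensitive)]
        if not hits:
            empty_prefixes.append(prefix)  # recorded so the UI can report it
            continue
        for video_id in hits:
            if len(collected) >= target:
                break
            collected.add(video_id)
    return {
        "video_ids": sorted(collected),
        "searches_used": searches_used,
        "empty_prefixes": empty_prefixes,
    }
```

The `searches_used` and `empty_prefixes` fields are there so the UI could show progress and warn about quota; the real per-search quota cost is not modeled.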

Need to read this paper:
https://dl.acm.org/doi/pdf/10.1145/2068816.2068851?casa_token=Th41gtcnR8QAAAAA:FSvkJV2qTg3Zc0I3LfTvgigKa3oppxhi_YLmZPG_aN7rowRrktZ1uK59LRhHadiAJJ0yoR3pkDJy
