Some tweets have their links or media skipped (unified cards) #79
Comments
Yeah, Twitter v2 API's response for the example tweet you provided (1512123194785898503) doesn't seem to include the link anywhere (even with all the expansions set):
It looks like the only way to obtain info about the cards is through the Twitter Ads API. That would require applying for and creating an additional Twitter Ads API application (with a separate token, etc.) 😖 |
Wow, that's nasty! No wonder nitter is forced to use the "unofficial API" aka web scraping. zedeus/nitter@111927a |
Funnily enough, I'm able to get some card metadata with the endpoints used by guest tokens. You can try it out on |
On 04/12/22 22:34, robertoszek wrote:
Keep in mind it will only work when using guest tokens (either by omitting the `twitter_token` mapping or adding `guest: true` in your config).
Will the usual tokens still be used for the rest of the calls? If not I
guess I should use this only for the accounts which have this issue.
|
No, if a user in your config is marked as "guest", the guest token is used for all the calls associated with that user. I've been working a bit more on it to get this feature ready for the next stable release: So the current limitations are listed here: The inability to obtain protected tweets makes sense, as that will never work with a guest token. So the only main difference between using regular Twitter tokens and guest ones is the limit of 20 tweets per user, and I'm going to try to find a way around it. |
I figured out how to force it to paginate using guest tokens: I've managed to gather more than 4000 tweets for a user using this method; I'm not sure whether it has a hard limit (apart from hitting rate limits). That commit is included in version `1.1.1rc49`. |
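The thread does not show how pagination was actually forced, so the following is only a generic sketch of a cursor-based loop of the kind the comment describes. The names `collect_all_tweets` and `fetch_page`, and the `(tweets, next_cursor)` return shape, are assumptions made for illustration and are not taken from pleroma-bot.

```python
def collect_all_tweets(fetch_page, max_pages=1000):
    """Follow cursors until a page comes back empty or no cursor is returned.

    fetch_page is a hypothetical callable: it takes a cursor (None for the
    first page) and returns a (tweets, next_cursor) tuple.
    """
    tweets = []
    cursor = None
    # max_pages is a safety cap so a misbehaving cursor cannot loop forever.
    for _ in range(max_pages):
        page, cursor = fetch_page(cursor)
        tweets.extend(page)
        if not page or cursor is None:
            break
    return tweets
```

In practice the loop would also stop on rate limiting, which is the only limit mentioned in the comment above.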
On 05/12/22 02:50, robertoszek wrote:
That commit is included in version `1.1.1rc49`.
I might be doing something wrong but it gives me a bunch of
✖ 2022-12-05 12:35:03,356 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:707)
Traceback (most recent call last):
  File "/home/7/federico/mastodon/bot/lib/python3.9/site-packages/pleroma_bot/cli.py", line 549, in main
    user = User(user_item, config, base_path, posts_ids)
  File "/home/7/federico/mastodon/bot/lib/python3.9/site-packages/pleroma_bot/cli.py", line 278, in __init__
    self._get_twitter_info()
  File "/home/7/federico/mastodon/bot/lib/python3.9/site-packages/pleroma_bot/_twitter.py", line 169, in _get_twitter_info
    self._get_twitter_info_guest()
  File "/home/7/federico/mastodon/bot/lib/python3.9/site-packages/pleroma_bot/_twitter.py", line 149, in _get_twitter_info_guest
    self.pinned_tweet_id = user_twitter["pinned_tweet_ids_str"][0]
IndexError: list index out of range
|
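The traceback points at indexing an empty `pinned_tweet_ids_str` list, presumably for an account with no pinned tweet. A minimal sketch of a defensive lookup, assuming `user_twitter` is a plain dict shaped like the one in the traceback; the helper name `first_pinned_tweet_id` is hypothetical and this is not necessarily how pleroma-bot fixed it:

```python
from typing import Optional


def first_pinned_tweet_id(user_twitter: dict) -> Optional[str]:
    """Return the first pinned tweet ID, or None when nothing is pinned."""
    # The key may be missing or map to an empty list; both cases
    # yield None instead of raising IndexError or KeyError.
    pinned_ids = user_twitter.get("pinned_tweet_ids_str") or []
    return pinned_ids[0] if pinned_ids else None
```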
Hmm... Does running version `1.1.1rc52` make any difference? |
On 05/12/22 12:54, robertoszek wrote:
Does running version `1.1.1rc52` make any difference?
Will try.
For now I'm getting a bunch of HTTP 403 errors (these are not protected accounts), like:
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://api.twitter.com/1.1/statuses/show.json?id=1599735346824183808&include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweet=true&query_source=typed_query&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel
|
Oh wait, that was *with* the token. The errors seem to have vanished
(for now) after commenting out the token in the config.
|
Weird, 1597718716837335040 seems to only show up on the search API endpoint, doing the same query here: doesn't seem to include it on the results. You would think when using I've added another pass to filter out any tweets that don't originate from the mirrored user, just in case. Regarding the 404s, I tried replicating them on my end to no avail (a reply to a deleted tweet, a reply to a tweet that quotes a deleted tweet, and a retweet of a deleted tweet didn't trigger it for me). Both commits are included on |
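The extra filtering pass mentioned above could look roughly like the sketch below. It assumes each tweet is a dict carrying an `author_id` field, as in Twitter API v2 tweet objects; the guest endpoints may name this field differently, and `keep_own_tweets` is a hypothetical helper, not pleroma-bot's actual code.

```python
def keep_own_tweets(tweets, mirrored_user_id):
    """Drop tweets that were not authored by the mirrored user."""
    # Tweets without an author field are dropped too, since their
    # origin cannot be confirmed.
    return [t for t in tweets if t.get("author_id") == mirrored_user_id]
```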
Oh, and the weird 403's you were getting when providing the token should be fixed on |
Oh, I forgot to mention I added some retries for cases when an It was included in the latest stable release, There's not much else we can do other than retry a few times; Twitter's API usually starts returning 503 if their servers are overloaded or over capacity at the time of the request. |
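As an illustration of retrying on 503 responses, here is a minimal sketch. The thread does not say how many attempts are made or how long the bot waits between them, so the exponential backoff, the defaults, and the name `call_with_retries` are all assumptions; `fetch` stands for any zero-argument callable returning an object with a `status_code` attribute (for example a `requests` response).

```python
import time


def call_with_retries(fetch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-503 response or attempts run out."""
    response = None
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code != 503:
            return response
        # Back off before the next try, but do not sleep after the last one.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Every attempt returned 503; hand the last response back to the caller.
    return response
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays.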
Some fancy accounts seem to be using some Twitter feature which pleroma-bot doesn't support yet.
This is typically spotted in tweets that follow the trend of containing a mere "↓" as a warning that the main content of the update is actually somewhere else, like this: https://respublicae.eu/@EU_Commission/108092396818818757 https://nitter.eu/EU_Commission/status/1512123194785898503 which is just a link to https://ec.europa.eu/commission/presscorner/detail/en/statement_22_2331 . These tweets look like any other tweet whose main URL has been "eaten" by Twitter and shown only as an attached "card", but they seem to be different.
Others are more complicated, like https://respublicae.eu/@EU_Commission/108103776666586079 https://nitter.eu/EU_Commission/status/1512777762909655043 which contains a "broadcast": https://nitter.eu/i/broadcasts/1BRJjnyZoZdJw . I guess there isn't much to do about these, other than documenting it somewhere so that people make informed decisions about the `nitter` and `signature` configs.