
Implement recovering mechanism from network interruption #116

Open
ynakao opened this issue Jan 23, 2023 · 3 comments
Labels
enhancement New feature or request

Comments


ynakao commented Jan 23, 2023

I tried to import tweets from twitter-archive.zip into my Pleroma instance, but my network went down for a few seconds while the tweets were being posted. When I re-ran pleroma-bot, it didn't resume properly.

My Twitter archive has more than 20k tweets, so posting them all takes a very long time. It would be great if pleroma-bot had a mechanism for recovering from network interruptions, or a stop/resume mechanism for long-running processes.

  • First run
$ pleroma-bot
ℹ 2023-01-21 13:30:24,684 - pleroma_bot - INFO - config path: /home/ynakao/test/pleroma-bot/pleroma_2/config.yml
ℹ 2023-01-21 13:30:24,684 - pleroma_bot - INFO - tweets temp folder: /home/ynakao/test/pleroma-bot/pleroma_2/tweets
ℹ 2023-01-21 13:30:24,686 - pleroma_bot - INFO - ======================================
ℹ 2023-01-21 13:30:24,686 - pleroma_bot - INFO - Processing user:       pleromatest
ℹ 2023-01-21 13:30:24,686 - pleroma_bot - INFO - It seems like pleroma-bot is running for the first time for this Twitter user: nakaoyuji
ℹ 2023-01-21 13:30:28,977 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account?
ℹ 2023-01-21 13:30:28,977 - pleroma_bot - INFO -
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of
last post in the Fediverse account)]

⚠ 2023-01-21 13:30:30,690 - pleroma_bot - WARNING - Raising max_tweets to the maximum allowed value (_utils.py:599)
ℹ 2023-01-21 13:30:40,262 - pleroma_bot - INFO - tweets gathered:        23497
Processing tweets... : 100%|█████████████████████████████████████████████████████████████| 23497/23497 [09:48<00:00, 39.96it/s]
ℹ 2023-01-21 13:40:28,401 - pleroma_bot - INFO - tweets to post:         23497
Posting tweets... :   8%|█████                                                       | 1988/23497 [1:15:24<13:26:19,  2.25s/it]Warning: Zero outputs support gamma adjustment.
Posting tweets... :   9%|█████▏                                                      | 2039/23497 [1:17:25<19:09:19,  3.21s/it]✖ 2023-01-21 14:58:25,964 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:717)
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/usr/lib/python3.10/site-packages/urllib3/connection.py", line 414, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/usr/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='pleroma.yujinakao.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:997)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/pleroma_bot/cli.py", line 684, in main
    post_id = user.post(
  File "/usr/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 783, in post
    post_id = self.post_pleroma(tweet, poll, sensitive, media, cw=cw)
  File "/usr/lib/python3.10/site-packages/pleroma_bot/_pleroma.py", line 363, in post_pleroma
    response = pleroma_api_request(
  File "/usr/lib/python3.10/site-packages/pleroma_bot/_pleroma.py", line 29, in pleroma_api_request
    response = requests.request(
  File "/usr/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.10/site-packages/requests/adapters.py", line 563, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='pleroma.yujinakao.com', port=443): Max retries exceeded with url: /api/v1/statuses (Caused by SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:997)')))
Posting tweets... :   9%|█████▏                                                      | 2039/23497 [1:17:57<13:40:23,  2.29s/it]
  • Second run
$ pleroma-bot
ℹ 2023-01-21 15:02:38,043 - pleroma_bot - INFO - config path: /home/ynakao/test/pleroma-bot/pleroma_2/config.yml
ℹ 2023-01-21 15:02:38,043 - pleroma_bot - INFO - tweets temp folder: /home/ynakao/test/pleroma-bot/pleroma_2/tweets
ℹ 2023-01-21 15:02:38,045 - pleroma_bot - INFO - ======================================
ℹ 2023-01-21 15:02:38,045 - pleroma_bot - INFO - Processing user:       pleromatest
✖ 2023-01-21 15:02:53,366 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:717)
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/pleroma_bot/cli.py", line 705, in main
    user.check_pinned(posted)
  File "/usr/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 340, in check_pinned
    ).format(str(self.pinned_tweet_id)))
AttributeError: 'User' object has no attribute 'pinned_tweet_id'
  • config.yml
pleroma_base_url: https://pleroma.yujinakao.com
users:
- twitter_username: nakaoyuji
  pleroma_username: pleromatest
  pleroma_token: xxxxx
  archive: /path/to/twitter-archive.zip
  • version info
    pleroma-bot: v1.2.0 installed from AUR
    Pleroma: v2.5.0

tomakun commented Jan 23, 2023

I am seconding this feature request, not only from the network-interruption point of view but for basically any crash that happens after the Posting tweets stage has started. A command that let the bot resume from where it crashed or errored during the posting stage would be great.


dawnerd commented Jan 23, 2023

I had something similar happen (not using archives). Try the version in #113; it helped reduce the timeouts for me.
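Independently of a full resume feature, transient failures like the SSLError above can often be absorbed by retrying the posting call with exponential backoff. A minimal stdlib sketch of that idea (hypothetical helper, not what #113 actually does):

```python
import time
from functools import wraps


def retry_on(exceptions, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry a flaky call with exponential backoff (1s, 2s, 4s, ...).

    `sleep` is injectable so tests don't have to wait for real delays.
    """
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except exceptions:
                    if attempt == attempts - 1:
                        raise  # out of attempts: let the caller see the error
                    sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator
```

Wrapping the status-posting request in something like this would ride out a network blip of a few seconds without losing the whole run, though it doesn't help with crashes for other reasons.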

@robertoszek robertoszek added the enhancement New feature or request label Jan 23, 2023
@robertoszek
Owner

I agree, this would be very useful.
The way I see it, though, it shouldn't apply only to network errors; it should be a feature for resuming a failed run, whatever the reason.

Perhaps we can dump partial progress to disk during execution, and at the beginning of a new run check whether there's partial data to pick up from.
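A rough sketch of that approach (all names here are hypothetical; pleroma-bot has no such helpers today): write a small checkpoint file after every successful post, skip already-posted tweets on the next run, and delete the checkpoint once a run finishes cleanly.

```python
import json
import os
import tempfile


def save_checkpoint(path, last_posted_id):
    """Atomically persist the id of the last successfully posted tweet."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump({"last_posted_id": last_posted_id}, f)
    os.replace(tmp, path)  # atomic, so a crash never leaves a half-written file


def load_checkpoint(path):
    """Return the saved id, or None if this is a fresh run."""
    try:
        with open(path) as f:
            return json.load(f)["last_posted_id"]
    except (FileNotFoundError, json.JSONDecodeError, KeyError):
        return None


def post_all(tweets, post_fn, path):
    """Post tweets in order, skipping everything up to the checkpointed id."""
    resume_after = load_checkpoint(path)
    skipping = resume_after is not None
    for tweet in tweets:
        if skipping:
            if tweet["id"] == resume_after:
                skipping = False  # everything up to here was already posted
            continue
        post_fn(tweet)  # may raise on network failure; checkpoint survives
        save_checkpoint(path, tweet["id"])
    if os.path.exists(path):
        os.remove(path)  # clean finish: next run starts fresh
```

Keeping the checkpoint in the existing per-user tweets temp folder would be a natural fit, and because it records progress after each post rather than only on exit, it covers hard crashes as well as network errors.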
