Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor LocalS3Session to handle botocore sessions #614

Closed
englehardt opened this issue Apr 18, 2020 · 7 comments
Closed

Refactor LocalS3Session to handle botocore sessions #614

englehardt opened this issue Apr 18, 2020 · 7 comments
Labels
task Doesn't change any behaviour

Comments

@englehardt
Copy link
Collaborator

s3fs > 0.4.0 breaks localstack sessions. See https://travis-ci.org/github/mozilla/OpenWPM/jobs/673635080#L2983-L2997 and fsspec/s3fs#309.

in 0.4.1 s3fs was refactored to use botocore rather than boto3, meaning that we need to pass in a botocore session object rather than a boto3 session object on initialization: https://github.com/mozilla/OpenWPM/blob/480dd7daf384ff4a620f64daea89e048f8ee71f1/automation/DataAggregator/S3Aggregator.py#L87-L90.

This alone would be pretty simple, however we overwrite the boto3.DEFAULT_SESSION with our custom LocalS3Session object, both when running the test suite and in crawler.py.

This custom session object is used to seamlessly propagate localstack's configuration across OpenWPM. It's unclear to me exactly what we'll need to change to get this working again; I don't think swapping to the default botocore session will be sufficient. My guess is we'll either have to override that as well (perhaps similar to pytest-localstack), or pass the credentials as kwargs into the botocore session within the S3Aggregator.

In our meeting today we talked about eventually removing our dependency on localstack for a variety of reasons, so I don't think it makes sense to prioritize this work right now (unless we end up needing to update s3fs beyond 0.4.0 for some other reason).

@englehardt englehardt added the task Doesn't change any behaviour label Apr 18, 2020
@birdsarah
Copy link
Contributor

As per now closed #650, if there was a chance to not use boto directly it seems like it would abstract things away easier.

@vringar
Copy link
Contributor

vringar commented May 18, 2020

I just ran into this while trying to test locally. If I run:
python -m pytest -vv -o log_cli=True -k basic_properties I see:

ERROR    openwpm:multiprocess_utils.py:48 Exception in child process.
Traceback (most recent call last):
  File "/run/media/stefan/02e400c2-1bdd-4d46-a0dd-044d6b4f3af4/Projekte/OpenWPM/automation/utilities/multiprocess_utils.py", line 43, in run
    mp.Process.run(self)
  File "/home/stefan/.conda/envs/openwpm/lib/python3.8/site-packages/multiprocess/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/run/media/stefan/02e400c2-1bdd-4d46-a0dd-044d6b4f3af4/Projekte/OpenWPM/automation/DataAggregator/S3Aggregator.py", line 44, in listener_process_runner
    listener = S3Listener(base_params, manager_params, instance_id)
  File "/run/media/stefan/02e400c2-1bdd-4d46-a0dd-044d6b4f3af4/Projekte/OpenWPM/automation/DataAggregator/S3Aggregator.py", line 93, in __init__
    self._fs = s3fs.S3FileSystem(
  File "/home/stefan/.local/lib/python3.8/site-packages/fsspec/spec.py", line 54, in __call__
    obj = super().__call__(*args, **kwargs)
  File "/home/stefan/.local/lib/python3.8/site-packages/s3fs/core.py", line 187, in __init__
    self.s3 = self.connect()
  File "/home/stefan/.local/lib/python3.8/site-packages/s3fs/core.py", line 292, in connect
    self.s3 = self.session.create_client('s3', aws_access_key_id=self.key, aws_secret_access_key=self.secret, aws_session_token=self.token, config=conf, use_ssl=ssl,
AttributeError: 'LocalS3Session' object has no attribute 'create_client'

However I locally also see this error on master while the tests keep passing for it on Travis
@birdsarah do you have any idea how this can happen?
I have my conda activated in my local enviroment and my pytest is also using the conda python interpreter.

@vringar
Copy link
Contributor

vringar commented May 18, 2020

I don't understand why this broke though, given that our s3fs is at 0.4.0 according to conda list

@birdsarah
Copy link
Contributor

@vringar - is this fixed now you removed a global s3fs?

@vringar
Copy link
Contributor

vringar commented May 18, 2020

Yes, I'll remove them or mark them as off topic as appropriate tomorrow

@englehardt
Copy link
Collaborator Author

@vringar I think we can close this now that you've merged #753?

@vringar
Copy link
Contributor

vringar commented Feb 23, 2021

Yes, we have unpinned s3fs and no longer directly use boto/botocore

@vringar vringar closed this as completed Feb 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Doesn't change any behaviour
Projects
None yet
Development

No branches or pull requests

3 participants