
fall back to downloading using the requests Python package (if installed) when urllib2 fails due to SSL error #2538

Conversation

bartoldeman (Contributor):

This fixes #2522 for me, on a new enough CentOS 6 with these packages
installed:
python-requests-2.6.0-4.el6.noarch
pyOpenSSL-0.13.1-2.el6.x86_64

@@ -460,20 +467,31 @@ def download_file(filename, url, path, forced=False):
attempt_cnt = 0

# use custom HTTP header
url_req = urllib2.Request(url, headers={'User-Agent': 'EasyBuild', "Accept" : "*/*"})
headers = {'User-Agent': 'EasyBuild', "Accept" : "*/*"}


whitespace before ':'

except HTTPError as err:
if not HAVE_REQUESTS:
status_code = err.code
if 400 <= status_code <= 499:
Member:

Isn't status_code potentially undefined here?

bartoldeman (Contributor, Author):

No: when HAVE_REQUESTS is set, status_code is assigned as
status_code = url_req.status_code
before
url_req.raise_for_status()
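The ordering bartoldeman describes can be sketched without a live request (this FakeResponse and HTTPError are stand-ins, not the real requests objects): because the assignment happens before raise_for_status() can throw, status_code is always bound inside the exception handler.

```python
class HTTPError(Exception):
    """Stand-in for requests.exceptions.HTTPError."""
    pass

class FakeResponse:
    """Stand-in for a requests.Response carrying a 4xx status."""
    status_code = 404

    def raise_for_status(self):
        if 400 <= self.status_code <= 599:
            raise HTTPError("%s Client Error" % self.status_code)

url_req = FakeResponse()
try:
    status_code = url_req.status_code   # assigned first ...
    url_req.raise_for_status()          # ... then this may raise
except HTTPError:
    pass

print(status_code)  # 404: defined even though the "request" failed
```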

from requests.exceptions import HTTPError
HAVE_REQUESTS = True
except ImportError:
import urllib2
Member:

Does it make sense to add another catch here, to give more useful feedback? Something like "Can't import requests or urllib2; if you are on an old system, please install python-requests".

bartoldeman (Contributor, Author):

urllib2 is always available (standard library). But perhaps the urlopen exception handler can check for something like <urlopen error [Errno 1] _ssl.c:492: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure> and if so, report
"SSL issues with urllib2. If you are using RHEL/CentOS 6.x please install the python-requests and pyOpenSSL RPM packages" and try again?

@boegel boegel added this to the next release milestone Sep 4, 2018
@bartoldeman (Contributor, Author):

I changed the commit so that the requests package is only used if the error occurs, following the principle of least surprise.

boegel (Member) left a comment:

Also a test would be nice, although that may not be trivial (we'd somehow need to force the fallback to requests?)

@@ -50,6 +50,8 @@
import tempfile
import time
import urllib2
from urllib2 import HTTPError
HAVE_REQUESTS = False
Member:

@bartoldeman Please move this down below, right above where _log is created:

try:
    import requests
    from requests.exceptions import HTTPError
    HAVE_REQUESTS = True
except ImportError:
    from urllib2 import HTTPError
    HAVE_REQUESTS = False
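The suggested fallback import degrades gracefully whether or not requests is installed. A runnable sketch, with one adaptation beyond the PR: the inner try/except also covers Python 3, where urllib2's HTTPError lives in urllib.error (the PR itself targets Python 2).

```python
# Fallback import in the style boegel suggests; the urllib.error branch is a
# Python 3 adaptation added here for illustration.
try:
    import requests
    from requests.exceptions import HTTPError
    HAVE_REQUESTS = True
except ImportError:
    try:
        from urllib2 import HTTPError       # Python 2
    except ImportError:
        from urllib.error import HTTPError  # Python 3 equivalent
    HAVE_REQUESTS = False

# either way, an HTTPError class is now bound for except-clauses to use
print(HTTPError.__name__)
```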

@@ -440,6 +442,7 @@ def derive_alt_pypi_url(url):

def download_file(filename, url, path, forced=False):
"""Download a file from the given URL, to the specified path."""
global HAVE_REQUESTS, HTTPError
Member:

no need for this with the import construct above

url_fd = urllib2.urlopen(url_req, timeout=timeout)
_log.debug('response code for given url %s: %s' % (url, url_fd.getcode()))
if HAVE_REQUESTS:
url_req = requests.get(url, headers=headers, stream=True, timeout=timeout)
Member:

move this up where url_req is defined without requests:

if HAVE_REQUESTS:
    url_req = requests.get(url, headers=headers, stream=True, timeout=timeout)
else:
    url_req = urllib2.Request(url, headers=headers)

# urllib2 does the right thing for http proxy setups, urllib does not!
url_fd = urllib2.urlopen(url_req, timeout=timeout)
status_code = url_fd.getcode()
_log.debug('response code for given url %s: %s' % (url, status_code))
Member:

@bartoldeman Hmm, this whole block is becoming a bit much now, especially since download_file is already quite messy... Can we move this up into a dedicated function, something like:

status_code = download_from_url_to(...)

_log.warning("URL %s was not found (HTTP response code %s), not trying again" % (url, err.code))
except HTTPError as err:
if not HAVE_REQUESTS:
status_code = err.code
Member:

This can be done above?

status_code = url_fd.getcode().code

break
else:
_log.warning("HTTPError occurred while trying to download %s to %s: %s" % (url, path, err))
attempt_cnt += 1
except IOError as err:
_log.warning("IOError occurred while trying to download %s to %s: %s" % (url, path, err))
error_re = re.compile(r"<urlopen error \[Errno 1\] _ssl.c:.*: error:.*:"
"SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>")
if error_re.match(str(err)):
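As a minimal sketch, the pattern from the hunk above can be checked against the exact error string quoted earlier in the thread (the CentOS 6 urllib2 SSL handshake failure):

```python
import re

# pattern copied from the PR; the second string is concatenated onto the first
error_re = re.compile(r"<urlopen error \[Errno 1\] _ssl.c:.*: error:.*:"
                      "SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>")

# sample error string as quoted by bartoldeman in this thread
sample = ("<urlopen error [Errno 1] _ssl.c:492: error:14077410:"
          "SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>")

print(bool(error_re.match(sample)))  # True
```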
Member:

This construct is used to switch to trying with requests, but it's a bit messy, especially with the inline import statements...
I think we need a use_requests = True or maybe broken_urllib = True here which can be checked above?

# for backward compatibility, and to avoid relying on 3rd party Python library 'requests'
use_requests = False
while not downloaded and attempt_cnt < max_attempts:
    if use_requests:
        if not HAVE_REQUESTS:
           raise EasyBuildError("...")
        # do requests...
    else:
        # do what we do now
        try:
            ...
        except IOError as err:
            if error_re.match(str(err)):
                use_requests = True
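The control flow sketched above can be made runnable with the download calls stubbed out (EasyBuildError and both download stubs are stand-ins invented here, not the real EasyBuild API): the first attempt fails with the SSL error, flips use_requests, and the second attempt succeeds via the requests branch.

```python
class EasyBuildError(Exception):
    """Stand-in for EasyBuild's own error class."""
    pass

HAVE_REQUESTS = True   # pretend 'requests' imported successfully
max_attempts = 3

def urllib2_download():
    # stand-in for urllib2.urlopen(); simulate the CentOS 6 SSL failure
    raise IOError("<urlopen error [Errno 1] _ssl.c:492: ... "
                  "sslv3 alert handshake failure>")

downloaded = False
use_requests = False
attempt_cnt = 0
while not downloaded and attempt_cnt < max_attempts:
    attempt_cnt += 1
    if use_requests:
        if not HAVE_REQUESTS:
            raise EasyBuildError("SSL issues with urllib2; please install "
                                 "the python-requests package")
        downloaded = True   # stand-in for requests.get(...) succeeding
    else:
        try:
            urllib2_download()
            downloaded = True
        except IOError as err:
            if 'handshake failure' in str(err):
                use_requests = True   # retry via requests on the next pass

print(downloaded, attempt_cnt)  # True 2
```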

@bartoldeman (Contributor, Author):

This is the cleanest I can make it without making a separate function. If you still want a function please let me know.

@bartoldeman (Contributor, Author):

Triggering a Travis rebuild; it seems to have been a glitch...

@bartoldeman bartoldeman closed this Sep 6, 2018
@bartoldeman bartoldeman reopened this Sep 6, 2018
if used_urllib is urllib2:
# urllib2 does the right thing for http proxy setups, urllib does not!
url_fd = urllib2.urlopen(url_req, timeout=timeout)
status_code = url_fd.getcode()
Member:

@bartoldeman add .code here rather than below?

status_code = url_fd.getcode().code

or maybe with

if hasattr(status_code, 'code'):
    status_code = status_code.code

bartoldeman (Contributor, Author):

How can that work? .code is a field of the exception err for urllib2, and getcode() returns an integer.

One can use

if hasattr(err, 'code'):
    status_code = err.code

but I prefer to use use_urllib2 then.

Member:

Sorry, I'm not sure what I was smoking here, I was clearly mixing things up. Thanks for clarifying.

headers = {'User-Agent': 'EasyBuild', 'Accept': '*/*'}
# for backward compatibility, and to avoid relying on 3rd party Python library 'requests'
url_req = urllib2.Request(url, headers=headers)
used_urllib = urllib2
Member:

@bartoldeman I'd prefer using a boolean here (use_urllib2) and handling HTTPError via the import above

bartoldeman (Contributor, Author):

I cannot do from requests.exceptions import HTTPError before it is actually used (because at the first download attempt HTTPError is still urllib2.HTTPError); that is why
I changed to used_urllib. If I do both:

use_urllib2 = True
HTTPError = requests.HTTPError

that's a bit redundant...

Perhaps I should just write a urlopen() emulation using requests (including raising the urllib2.HTTPError exception manually), which might be the cleanest then.

Member:

Right, I overlooked that you first need HTTPError from urllib2...

url_fd = urllib2.urlopen(url_req, timeout=timeout)
status_code = url_fd.getcode()
else:
r = requests.get(url, headers=headers, stream=True, timeout=timeout)
Member:

@bartoldeman don't use single-letter variables outside of list comprehensions please

@boegel boegel changed the title Try to download using the requests Python package, if installed. fall back to downloading using the requests Python package (if installed) when urllib2 fails due to SSL error Sep 7, 2018
@easybuilders easybuilders deleted a comment from boegelbot Sep 7, 2018
@boegel boegel dismissed damianam’s stale review September 9, 2018 07:05

questions answered, suggestions implemented

@boegel boegel removed the change label Sep 9, 2018
boegel previously approved these changes Sep 9, 2018
boegel (Member) commented Sep 9, 2018:

@bartoldeman Looks good to go taking your clarifications into account, but we should add a test to verify that this fallback isn't broken in the future, see bartoldeman#7 (which also fixes a broken URL in test_download_url).

…ackage

add dedicated test for fallback to requests in download_file function