
Bad gateway error when trying to access TROPOMI files #594

Closed
zfasnacht opened this issue Jun 10, 2024 · 13 comments

Labels
question A question needs to be answered to proceed

Comments

@zfasnacht

I'm trying to use the earthaccess tool to read TROPOMI files that are in the GES DISC cloud, but I'm frequently getting the following error:

Traceback (most recent call last):
  File "/panfs/ccds02/home/zfasnach/pace_no2_nn_train.py", line 16, in <module>
    pace_data = grab_pace_data(start_date,end_date)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/panfs/ccds02/home/zfasnach/grab_pace_l1b.py", line 44, in grab_pace_data
    f = h5py.File(filename,'r')  
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/h5py/_hl/files.py", line 562, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/h5py/_hl/files.py", line 235, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
  File "h5py/h5fd.pyx", line 163, in h5py.h5fd.H5FD_fileobj_read
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/fsspec/spec.py", line 1915, in readinto
    data = self.read(out.nbytes)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/fsspec/spec.py", line 1897, in read
    out = self.cache._fetch(self.loc, self.loc + length)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/fsspec/caching.py", line 481, in _fetch
    self.cache = self.fetcher(start, bend)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/fsspec/asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
                ^^^^^^^^^^
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/fsspec/implementations/http.py", line 653, in async_fetch_range
    r.raise_for_status()
  File "/home/zfasnach/.conda/envs/example_tf/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1060, in raise_for_status
    raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 502, message='Bad Gateway', url=URL('https://data.gesdisc.earthdata.nasa.gov/data/S5P_TROPOMI_Level2/S5P_L2__NO2____HiR.2/2024/143/S5P_OFFL_L2__NO2____20240522T000826_20240522T014956_34229_03_020600_20240523T161152.nc')

Any idea how to prevent this from happening, other than a simple try/except?

@mfisher87 added the question label Jun 10, 2024
@chuckwondo
Collaborator

chuckwondo commented Jun 10, 2024

@zfasnacht, have you accepted the EULA? If not, that might be the problem. To accept the EULA, open the URL mentioned in the error message in a browser. It should redirect you to Earthdata Login and then to an EULA (End User License Agreement) page, where you can check the box at the bottom of the page and click the Agree button. Once you do that, you should get past this problem, assuming you haven't already accepted the EULA and that you're using the same credentials with earthaccess as you did when accepting it.

@zfasnacht
Author

Well, it's not happening for the first file, so I'm not sure that's the case. It reads a few files and then the error occurs at random; it might read 2 files fine, or it might read 7.

@zfasnacht
Author

I did go to that link, logged in, and the file downloaded fine, which I think suggests I've already accepted the EULA.

@chuckwondo
Collaborator

Can you share your code? Just enough to show how you're using earthaccess.

@zfasnacht
Author

zfasnacht commented Jun 10, 2024

Of course, thanks for the help!

import earthaccess
import h5py

start_date = '2024-05-22 00:00:00'
end_date = '2024-05-22 23:59:59'

def grab_pace_data(start_date,end_date):
    earthaccess.login(persist=True)

    results = earthaccess.search_data(short_name='S5P_L2__NO2____HiR', cloud_hosted=True,
                                      temporal=(start_date, end_date), count=20,
                                      bounding_box=(-180, -90, 180, 90))
    trop_no2_files = earthaccess.open(results)


    for filename in trop_no2_files:
        print(filename.full_name)                                                                                                                                                                                 
        f = h5py.File(filename,'r')

        data_group = '/PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/'
        product_group = '/PRODUCT/'

        no2_scd = f[data_group+'nitrogendioxide_slant_column_density'][0]
        no2_strat = f[data_group+'nitrogendioxide_stratospheric_column'][0]

@chuckwondo
Collaborator

chuckwondo commented Jun 10, 2024

This might have to do with the async and multi-threading happening under the covers in the fsspec library. Unfortunately, given the way earthaccess.open is currently implemented, this can cause problems when it is used the way you are using it (which is how most people use it, I suspect).

To see if my hunch is correct, try doing the following instead, and let me know if this avoids the issue.

import earthaccess
import h5py

start_date = '2024-05-22 00:00:00'
end_date = '2024-05-22 23:59:59'

def grab_pace_data(start_date,end_date):
    data_group = '/PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/'
    product_group = '/PRODUCT/'

    earthaccess.login(persist=True)

    results = earthaccess.search_data(short_name='S5P_L2__NO2____HiR', cloud_hosted=True,
                                      temporal=(start_date, end_date), count=20,
                                      bounding_box=(-180, -90, 180, 90))

    for result in results:
        with (
            earthaccess.open([result])[0] as trop_no2_file,
            h5py.File(trop_no2_file, 'r') as f,
        ):
            print(trop_no2_file.full_name)

            no2_scd = f[data_group+'nitrogendioxide_slant_column_density'][0]
            no2_strat = f[data_group+'nitrogendioxide_stratospheric_column'][0]

This will cause each file to be opened and closed in sequence. The way most people use earthaccess.open with multiple files, the files are opened concurrently across multiple threads and never closed, causing resource leaks. Further, given some potential issues with the combination of fsspec caching, multi-threading, and h5py, opening (and closing) each file in sequence might just address this issue.

Although I wouldn't normally suspect a "Bad Gateway" error to be a result of such potential caching/threading conflicts, I've certainly seen misleading error messages before.

Alternatively, it might literally be a flaky server causing intermittent "Bad Gateway" errors.

Regardless, I still recommend the "safer" file handling approach I gave above. If it doesn't fix this specific problem, it should at least avoid other potentially gnarly behavior.

@zfasnacht
Author

Oh geez, that's a great point. I'm normally careful about closing files, but it looks like I missed that here, so you might well be right. I'll give that a try.

Thanks for the help!

@zfasnacht
Author

So I'm making sure I close each file now, but it still seems like I get the Bad Gateway error after reading 1-2 files.

Any other possible suggestions to improve this?

@mfisher87
Collaborator

mfisher87 commented Jun 12, 2024

It's a different file every time, right? You mentioned that this is random. Perhaps a retry mechanism that "backs off" by waiting an increasing number of seconds (up to a limit) with each retry would help work around this; see the sketch at the end of this comment. It's possible this explanation from @chuckwondo is the issue:

it might literally be a flaky server causing intermittent "Bad Gateway" errors.

GES DISC may appreciate a heads up about this or be better able to help troubleshoot.
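For example, here is a minimal sketch of such a backoff-and-retry loop around the open/read step. The helper name, the retry limits, and the choice to catch aiohttp's ClientResponseError (the exception in the traceback above) are illustrative, not earthaccess features:

import time

import earthaccess
import h5py
from aiohttp import ClientResponseError

def open_with_retries(result, max_retries=5, base_delay=2):
    # Try to open and read one search result, backing off after each failure.
    # `result` is a single item from earthaccess.search_data(); the limits
    # here are illustrative defaults, not earthaccess settings.
    data_group = '/PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/'

    for attempt in range(max_retries):
        try:
            with (
                earthaccess.open([result])[0] as trop_no2_file,
                h5py.File(trop_no2_file, 'r') as f,
            ):
                return f[data_group + 'nitrogendioxide_slant_column_density'][0]
        except ClientResponseError as err:
            # Wait 2, 4, 8, ... seconds (capped at 60) before the next attempt.
            delay = min(base_delay * 2 ** attempt, 60)
            print(f'Attempt {attempt + 1} failed ({err.status}); retrying in {delay}s')
            time.sleep(delay)
    raise RuntimeError('Giving up after repeated Bad Gateway responses')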

@zfasnacht
Author

I'll give the retries a test. The problem is that it's happening so frequently that I'm not sure how much that will help. Today it seemed like I went 30 minutes to an hour without being able to access a single file.

I sent a message to the Earthdata contact address, but as you suggest, I'll also reach out to the GES DISC folks.

Thanks again for all the help!

@mfisher87
Collaborator

We're happy to help any time! I'm going to close this issue since we have a new issue to track the need for us to implement retries internally, but if you feel there's more to talk about or that the issue should be re-opened, please feel free to continue to post here.

@mfisher87 closed this as not planned Jun 12, 2024
@goodwilj

@zfasnacht I was also having this issue downloading large amounts of TEMPO data (though that data is probably held on different servers than the TROPOMI data), and I came across this issue. The 502 Bad Gateway error would occur randomly with or without the earthaccess API (e.g. with curl as well), so it seems to be a server issue. I reduced the number of threads in the earthaccess.download() function to help avoid overloading the server; the 502 Bad Gateway errors still persisted but were less frequent.
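A minimal sketch of throttling downloads that way, assuming the threads keyword argument of earthaccess.download() controls download concurrency (the search parameters below are illustrative, not the actual TEMPO query):

import earthaccess

earthaccess.login(persist=True)

# Illustrative query; substitute your own short_name and temporal range.
results = earthaccess.search_data(
    short_name='S5P_L2__NO2____HiR',
    temporal=('2024-05-22 00:00:00', '2024-05-22 23:59:59'),
    count=20,
)

# Lower the thread count from the default to reduce concurrent load on the server.
files = earthaccess.download(results, local_path='./data', threads=2)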

@mfisher87
Collaborator

It's clear there's more to discuss here! I'm going to re-open and convert this to a discussion.

@mfisher87 reopened this Jun 12, 2024
@nsidc locked and limited conversation to collaborators Jun 12, 2024
@mfisher87 converted this issue into discussion #601 Jun 12, 2024
