Long query URL gives error in oa_request()
but works in browser
#216
Comments
Wow, this one is really, really weird. The problem isn't even about the length of the query string. Minimal reprex:
query_substr <- "https://api.openalex.org/works?page=1&filter=title_and_abstract.search:%22Agriculture+reform%22+OR+%22ocean+reform%22"
oa_request(query_substr)
#> Warning in oa_request(query_substr): No records found!
#> list()
httr::GET(query_substr)
#> Response [https://api.openalex.org/works?page=1&filter=title_and_abstract.search:%22Agriculture+reform%22+OR+%22ocean+reform%22]
#> Date: 2024-03-08 18:53
#> Status: 200
#> Content-Type: application/json
#> Size: 332 kB
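#> {"meta":{"count":1717,"db_response_time_ms":222,"page":1,"per_page":25,"groups_count":null},"results":[{"id...
This appears to happen because the request is sent with extra query parameters, and httr then re-encodes the existing query string (":" becomes "%3A" and "+" becomes "%2B"):
httr::GET(query_substr, query = list(`per-page` = 1))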
#> Response [https://api.openalex.org/works?page=1&filter=title_and_abstract.search%3A%22Agriculture%2Breform%22%2BOR%2B%22ocean%2Breform%22&per-page=1]
#> Date: 2024-03-08 18:57
#> Status: 200
#> Content-Type: application/json
#> Size: 115 B
#> {"meta":{"count":0,"db_response_time_ms":68,"page":1,"per_page":1,"groups_count":null},"results":[],"group_...
Essentially, instead of this URL from above (bad_url), the request needs one like good_url, with literal ":" and "+" and with each quote preceded by an encoded backslash (%5C%22):
bad_url <- "https://api.openalex.org/works?page=1&filter=title_and_abstract.search%3A%22Agriculture%2Breform%22%2BOR%2B%22ocean%2Breform%22&per-page=1"
good_url <- "https://api.openalex.org/works?page=1&filter=title_and_abstract.search:%5C%22Agriculture+reform%5C%22+OR+%5C%22ocean+reform%5C%22&per-page=1"
httr::GET(good_url)
#> Response [https://api.openalex.org/works?page=1&filter=title_and_abstract.search:%5C%22Agriculture+reform%5C%22+OR+%5C%22ocean+reform%5C%22&per-page=1]
#> Date: 2024-03-08 19:33
#> Status: 200
#> Content-Type: application/json
#> Size: 9.69 kB
#> {"meta":{"count":35789,"db_response_time_ms":338,"page":1,"per_page":1,"groups_count":null},"results":[{"id...
One hacky way around that is to add the backslash character (%5C) before each %22 and make sure the URL is decoded before it reaches httr::GET():
httr::GET(
  URLdecode(gsub("%22", "%5C%22", bad_url))
)
#> Response [https://api.openalex.org/works?page=1&filter=title_and_abstract.search:\"Agriculture+reform\"+OR+\"ocean+reform\"&per-page=1]
#> Date: 2024-03-08 19:30
#> Status: 200
#> Content-Type: application/json
#> Size: 9.69 kB
#> {"meta":{"count":35789,"db_response_time_ms":338,"page":1,"per_page":1,"groups_count":null},"results":[{"id...
So for your reprex, you can reformat your URL:
query_url <- "https://api.openalex.org/works?page=1&filter=title_and_abstract.search:%22Agriculture+reform%22+OR+%22ocean+reform%22+OR+%22energy+reform%22+OR+%22decarbonization%22+OR+%22Eco-friendly+Subsidies%22+OR+%22Green+Subsidies%22+OR+%22Polluter+Pays+Principle%22+OR+%22Environmental+Externalities%22+OR+%22Biodiversity+Offsetting%22+OR+%22Conservation+Finance%22+OR+%22Payment+for+Ecosystem+Services%22+OR+%22Agri-environmental+Schemes%22+OR+%22Cross-compliance%22+OR+%22Eco-taxes%22+OR+%22Sustainable+Agriculture+Incentives%22+OR+%22Carbon+Pricing%22+OR+%22Biodiversity+Credits%22+OR+%22Habitat+Banking%22+OR+%22Rewilding+Incentives%22+OR+%22Green+Bonds%22+OR+%22Ecological+Fiscal+Transfers%22+OR+%22Renewable+Energy+Subsidies%22+OR+%22Water+Quality+Trading%22+OR+%22Sustainable+Fisheries+Subsidies%22+OR+%22Green+Certification+Schemes%22+OR+%22Conservation+Easements%22+OR+%22Environmental+Impact+Bonds%22+OR+%22Climate+Smart+Agriculture%22+OR+%22Natural+Capital+Financing%22+OR+%22Bioenergy%22+OR+%22Forest+Carbon+Credits%22+OR+%22Blue+Carbon+Initiatives%22+OR+%22Green+Public+Procurement%22+OR+%22Integrated+Pest+Management+Incentives%22+%22Wildlife+Corridors+Funding%22+OR+%22Biodiversity+Banking%22+OR+%22Climate+Adaptation+Finance%22+OR+%22Deforestation+Reduction+Programs%22+OR+%22Environmental+Risk+Assessment%22+OR+%22Green+Infrastructure+Investments%22+OR+%22High+Conservation+Value+Incentives%22+OR+%22Landscape+Restoration+Funds%22+OR+%22Marine+Protected+Areas+Support%22+OR+%22Natural+Resource+Management%22+OR+%22Organic+Farming+Subsidies%22+OR+%22Permaculture+Design+Grants%22+OR+%22Pollination+Services+Payments%22+OR+%22Protected+Area+Financing%22+OR+%22Regenerative+Agriculture+Support%22+OR+%22Sustainability+Linked+Loans%22+OR+%22Urban+Greening+Grants%22+OR+%22Wetlands+Restoration+Funding%22+OR+%22Zero+Emission+Vehicle+Incentives%22+OR+%22Adaptive+Management+Practices%22+OR+%22Biodiversity+Informatics%22+OR+%22Climate+Bonds%22+OR+%22Debt-for-Nature+Swap%22+OR+%22Ecosystem-Based+Adaptation%22+OR+%22Forest+Stewardship+Council+Certification%22+OR+%22Greenhouse+Gas+Inventory%22+%22Habitat+Restoration+Grants%22+OR+%22Invasive+Species+Control+Funding%22+OR+%22Land+Degradation+Neutrality+Fund%22+OR+%22Mitigation+Banking%22+OR+%22Non-Timber+Forest+Product+Incentives%22+%22Ocean+Acidification+Research+Grants%22+OR+%22Pollinator+Habitat+Enhancement%22+OR+%22Renewable+Energy+Certificates%22+OR+%22Soil+Health+Improvement+Programs%22+OR+%22Tree+Planting+Campaigns%22+OR+%22Wildlife+Management+Areas%22+OR+%22Biodiversity+Strategy+and+Action+Plans%22+OR+%22Circular+Economy+Initiatives%22+OR+%22Disaster+Risk+Reduction+Funding%22+OR+%22DRR+Funding%22+OR+%22Ecosystem+Valuation%22+OR+%22Fisheries+Improvement+Projects%22+OR+%22Green+Job+Training+Programs%22+OR+%22Holistic+Management+Funding%22+OR+%22Indigenous+Peoples%27+Biodiversity+Conservation%22+OR+%22Landscape+Connectivity+Projects%22+OR+%22Mangrove+Restoration+Initiatives%22+OR+%22Nature-based+Solutions%22+OR+%22Organic+Certification+Cost+Share%22+OR+%22Peatland+Restoration+and+Management%22+OR+%22Quantitative+Easing+for+the+Planet%22+OR+%22Riparian+Buffer+Zones+Support%22+OR+%22Sustainable+Land+Management%22+OR+%22Threatened+Species+Recovery+Plans%22+OR+%22Urban+Biodiversity+Enhancement%22+OR+%22Vertical+Farming+Incentives%22+OR+%22Water+Efficiency+Programs%22+OR+%22Xeriscaping+Rebates%22+OR+%22Youth+Engagement+in+Conservation%22+OR+%22Zero-waste+Strategies%22+OR+%22Agrobiodiversity+Conservation+Subsidies%22+OR+%22Biochar+Production+Incentives%22+OR+%22Climate+Resilience+Building%22+OR+%22Drought+Management+Assistance%22+OR+%22Eco-labeling+Programs%22+OR+%22Functional+Biodiversity+Promotion%22+OR+%22Green+Supply+Chain+Financing%22+OR+%22Hedgerow+Restoration+Support%22+OR+%22Integrated+Water+Resources+Management+Funding%22+OR+%22Jungle+Restoration+Projects%22"
query_url2 <- gsub("%22", "%5C%22", query_url)
This still errors, but now for a different reason: the URL is just genuinely too long.
cat(rawToChar(
httr::GET(query_url2)$content
))
#> <html>
#> <head>
#> <title>Bad Request</title>
#> </head>
#> <body>
#> <h1><p>Bad Request</p></h1>
#> Request Line is too large (4468 > 4094)
#> </body>
#> </html>
Overall I'm completely stumped though. I have no idea why this is an issue or whether it is on our end, OA's end, httr's end, etc.
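The re-encoding seen in the response URLs above can be reproduced offline with base R. This is only a minimal sketch: it assumes that the existing query string is decoded and then fully percent-encoded again once extra `query` parameters are supplied, which is inferred from the URLs in this thread and not from httr's source. The names `filter_value`, `decoded`, `reencoded`, and `bad_filter` are hypothetical.

```r
# Filter value exactly as it appears in the URL that works in the browser.
filter_value <- "title_and_abstract.search:%22Agriculture+reform%22+OR+%22ocean+reform%22"

# Step 1: decode the percent-escapes. URLdecode() leaves "+" untouched.
decoded <- utils::URLdecode(filter_value)

# Step 2: encode again, this time including reserved characters
# such as ":" and "+".
reencoded <- utils::URLencode(decoded, reserved = TRUE)

# Filter portion of bad_url from the reprex above, for comparison.
bad_filter <- "title_and_abstract.search%3A%22Agriculture%2Breform%22%2BOR%2B%22ocean%2Breform%22"

print(decoded)
print(reencoded)
print(identical(reencoded, bad_filter))
```

If the comparison holds, the literal "+" separators have become "%2B", which would explain why the API no longer sees the same search expression.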
Hm. What about using the opportunity to move to httr2? That would exclude one possible culprit. Also, I could try to get somebody from OA to look at it; maybe their log files would help?
Switching over to httr2 would indeed be nice, but it'll require more than just rewriting code and I currently don't have the bandwidth for this. I'll keep the issue in mind, but for now the workaround above should do.
Sorry, just for completeness: what function call generated the long query URL you originally posted? Was it spit out by
I got the URL from the OpenAlex web interface. If I remember correctly, the original search term did not work via openalexR (same symptoms as a URL that is too long, but probably something different; by the way, it would be nice to give a warning if the URL might be too long), so I tried the API to find out by how much. But there it worked. So I copied the API call back into the openalexR call, which is where it did not work.
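Such a warning could look roughly like the sketch below. `warn_if_url_too_long()` is a hypothetical helper, not part of openalexR. The default limit of 4094 is taken from the "Request Line is too large (4468 > 4094)" response earlier in this thread, and the assumption that the server measures a request line of the form `GET <path-and-query> HTTP/1.1` is mine, so the count is only an estimate.

```r
# Hypothetical helper: estimate the HTTP request line length and warn
# when it exceeds the limit reported by the server in this thread.
warn_if_url_too_long <- function(url, limit = 4094L) {
  # Drop scheme and host; only path and query go into the request line.
  path_and_query <- sub("^https?://[^/]+", "", url)
  request_line <- paste("GET", path_and_query, "HTTP/1.1")
  n <- nchar(request_line, type = "bytes")
  if (n > limit) {
    warning(
      sprintf("Request line is probably too long (%d > %d bytes).", n, limit),
      call. = FALSE
    )
  }
  invisible(n)
}

# A short URL passes silently.
short_url <- "https://api.openalex.org/works?page=1&per-page=1"
warn_if_url_too_long(short_url)

# An artificially long URL triggers the warning.
long_url <- paste0(
  "https://api.openalex.org/works?filter=title.search:",
  strrep("a", 5000)
)
tryCatch(
  warn_if_url_too_long(long_url),
  warning = function(w) message(conditionMessage(w))
)
```

Checking before the request is sent would let the caller shorten or split the search expression instead of receiving an HTML "Bad Request" page.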
Could you elaborate? Why do you say that? I agree that a switch to httr2 opens up the possibility of making some breaking changes (openalexR2), but why do you say that is necessary?
Oh, it's not necessary to switch over at all! I just meant that if we were to, it would require quite a bit of work.
I have an extremely long search query which works in the browser.
But when running
Created on 2024-03-08 with reprex v2.1.0