Incorrect default encoding (ISO-8859-1) assumed when submitting SPARQL query as POST request #224

Open
SimonBin opened this issue Apr 20, 2022 · 0 comments


According to the SPARQL 1.1 Protocol (https://www.w3.org/TR/sparql11-protocol/#query-via-post-direct), the body of a query sent directly via POST must be encoded as UTF-8.

Blazegraph reads the request body with the servlet getReader() method:

if (RESTServlet.hasMimeType(req, MIME_SPARQL_QUERY)) {
    // return the body of the POST, see trac 711
    return readFully(req.getReader());
}

When the request does not declare a charset, getReader() falls back to ISO-8859-1, as in Tomcat:

https://github.com/apache/tomcat/blob/7c0dd42ac4e9533d73d4ba50791ab2dda9d79760/java/org/apache/coyote/Constants.java#L30
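The effect of the wrong default can be reproduced without a servlet container. The following is a minimal sketch, not Blazegraph's actual code: `BodyDecodingSketch` and `readBody` are hypothetical names, and the `ByteArrayInputStream` stands in for the stream a servlet would obtain from `req.getInputStream()`. It decodes the same UTF-8 bytes once as ISO-8859-1 (what the default reader effectively does) and once as UTF-8 (what the protocol requires).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BodyDecodingSketch {

    // Hypothetical helper: drains the stream and decodes it with an explicit charset
    // instead of relying on the container's default.
    static String readBody(InputStream in, Charset charset) {
        try {
            return new String(in.readAllBytes(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e7 is the c-cedilla in "Curacao"; the escape keeps this file ASCII-only.
        byte[] body = "BIND('Cura\u00e7ao' AS ?x)".getBytes(StandardCharsets.UTF_8);

        // Decoding as ISO-8859-1 turns the two UTF-8 bytes of the c-cedilla
        // (0xC3 0xA7) into two separate characters.
        String wrong = readBody(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1);

        // Decoding as UTF-8 round-trips the query text unchanged.
        String right = readBody(new ByteArrayInputStream(body), StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1: " + wrong);
        System.out.println("UTF-8:      " + right);
    }
}
```

A server-side fix along these lines would read the raw input stream and decode it as UTF-8 explicitly for `application/sparql-query` bodies. Whether that fits Blazegraph's request handling is an assumption this sketch does not verify.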

This corrupts non-ASCII characters, as the following query shows:

curl -H "Content-Type: application/sparql-query" -d "SELECT ?x { BIND('Curaçao' As ?x) }" https://query.wikidata.org/sparql
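For clients written in Java, the same request can be sketched with the JDK's `java.net.http` client. This is only an illustration: `SparqlPostSketch`, `buildRequest`, and the `--send` switch are hypothetical names. The second content type, with an explicit `charset=utf-8` parameter, is an untested assumption about a possible client-side workaround. It relies on the server taking the reader's encoding from that parameter and on Blazegraph still recognising the media type when a parameter is attached.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SparqlPostSketch {

    // Same query as the curl command; \u00e7 is the c-cedilla.
    static final String QUERY = "SELECT ?x { BIND('Cura\u00e7ao' AS ?x) }";

    // Builds a direct SPARQL POST whose body is always encoded as UTF-8.
    static HttpRequest buildRequest(String endpoint, String contentType) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", contentType)
                .POST(HttpRequest.BodyPublishers.ofString(QUERY, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) throws Exception {
        String endpoint = "https://query.wikidata.org/sparql";
        String[] contentTypes = {
            "application/sparql-query",                // as in the curl command
            "application/sparql-query; charset=utf-8"  // assumed workaround, unverified
        };
        boolean send = args.length > 0 && args[0].equals("--send");
        HttpClient client = HttpClient.newHttpClient();

        for (String contentType : contentTypes) {
            HttpRequest request = buildRequest(endpoint, contentType);
            System.out.println(request.method() + " " + request.uri() + " [" + contentType + "]");
            if (send) {
                // Only contacts the endpoint when --send is passed.
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(response.statusCode());
                System.out.println(response.body());
            }
        }
    }
}
```

Running it with `--send` issues both requests so that the two responses can be compared by eye.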

For example, this problem occurs when Jena queries Wikidata through a SPARQL SERVICE clause; see apache/jena#1259 (comment).

It is most likely also the cause of issue #206.

