Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests for aggregates when the pattern match is empty #61

Merged
merged 6 commits into from
May 9, 2020
Merged

Tests for aggregates when the pattern match is empty #61

merged 6 commits into from
May 9, 2020

Conversation

afs
Copy link
Contributor

@afs afs commented May 2, 2020

This PR replaces the aggregates/agg-empty-group test with 4 new tests: one each for the case of MAX and COUNT with and without GROUP BY.

@gkellogg
Copy link
Member

gkellogg commented May 2, 2020

Without delving into my implementation to far, I get different results for agg-empty-group-1, agg-empty-group-count-01 and egg-empty-group-count-02, but I'm sure your reasoning is correct.

For agg-empty-group-1, you expect no solutions, and I return a single solution with no bindings.

For agg-empty-group-count-01, you expect a single solution with ?C bound to 0, and I get a single solution with no bindings.

For agg-empty-group-count-02, you expect no solutions, and I get a single solution with no bindings.

It would seem the the purpose of the test is to test for these types of differences.

@kasei
Copy link
Contributor

kasei commented May 4, 2020

I agree with the analysis in your writeup. I also fail some of these tests; I think I lost track somewhere along the way of the original reasoning for the (1) grouping expression, and trusted (hoped) that we had gotten the test suite correct.

@ericprud
Copy link
Member

ericprud commented May 4, 2020

SWObjects gets the opposite answers from AFS on -count-0{1,2} but I think AFS's answers make more sense.

for q in agg-empty-group-2.rq agg-empty-group-count-01.rq agg-empty-group-count-02.rq; do echo $q && ~/checkouts/ericprud/swobjects/bin/sparql -8 $q; done
agg-empty-group-2.rq
┌──────┐
│ ?max │
└──────┘
agg-empty-group-count-01.rq
┌────┐
│ ?C │
└────┘
agg-empty-group-count-02.rq
┌────┐
│ ?C │
│  0 │
└────┘

@afs
Copy link
Contributor Author

afs commented May 4, 2020

ARQ has got this wrong multiple times - both with spec misreading and plain old bugs - and not just the "empty match" case.

The principle I hope is acceptable is that "aggregate, no group-by => one row always", "aggregate, with group-by => same number of rows as the group-by" for empty and no-empty matching.

The spec does not handle one case - "no match, aggregate, no group-by" - but I think does say that otherwise.

Copy link
Member

@rubensworks rubensworks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comunica produces the following:

agg-empty-group-2.rq
  error
agg-empty-group-2.rq
  error
agg-empty-group-count-01.rq
  ?c: 0
agg-empty-group-count-02.rq
  ?c: 0

So only for agg-empty-group-count-01, the result is the same.
I do however agree that the tests in this PR make sense, and we should update the implementation on our end.

@gkellogg
Copy link
Member

gkellogg commented May 5, 2020

I see a lot of reports on how implementations handle this, but it would be good to 👍 or 👎 to vote for merging this PR. I'm 👍 .

gkellogg added a commit to ruby-rdf/sparql that referenced this pull request May 6, 2020
…ressions, otherwise (not grouping), return a single solution binding aggregated variables.

The principle is that "aggregate, no group-by => one row always", "aggregate, with group-by => same number of rows as the group-by" for empty and no-empty matching.

See w3c/rdf-tests#61
@lisp
Copy link

lisp commented May 7, 2020

  • the tests names could be chosen to better indicate their respective content. that is, a more indicative would be better than "-1" and "-2". that the max v/s count names are not symmetrical is not advantageous
  • the "max" test files should indicate that in their names
  • somewhere the documents should describe concisely the intent. this should be neither a long account of how the behaviour might be thought to follow from an implementation model, nor lament discrepancies in the recommendation. it should just state the behaviour to be verified.

@gkellogg
Copy link
Member

gkellogg commented May 7, 2020

  • the tests names could be chosen to better indicate their respective content. that is, a more indicative would be better than "-1" and "-2". that the max v/s count names are not symmetrical is not advantageous

The current naming is certainly in line with most other SPARQL test cases. Generally, they could all be more descriptive, but in my experience, this becomes difficult to maintain over time. Descriptions should go in the manifest rdfs:comment, and potentially the mf:name fields. However, I'd support having new tests use more descriptive names.

  • the "max" test files should indicate that in their names
  • somewhere the documents should describe concisely the intent. this should be neither a long account of how the behaviour might be thought to follow from an implementation model, nor lament discrepancies in the recommendation. it should just state the behaviour to be verified.

Which documents? The test suite just has a top-level index (other than the manifest comments). I agree that a future spec could describe more motivation behind unintuitive test behavior, perhaps using more enhanced examples, but that would be for a future WG to decide. The test names and comments could provide some more information on what is being tested, specifically. For example, the manifest entry for the new tests should probably have an rdfs:comment entry describing more about what is being tested.

Another possibility would be to add some description wherever the mf:feature is described. Most (not all) tests use mf:feature sparql:aggregate. This expands to the namespace document, which is empty. Ideally, these terms would be described there, but I don't believe this group has control over that namespace.

We could potentially also use rdfs:seeAlso (which agg-empty-group-2 had, but now seems missing) to point to a discussion, perhaps the related issue in this repo.

@gkellogg
Copy link
Member

gkellogg commented May 7, 2020

@lisp perhaps you could provide a PR into this branch that would satisfy your concerns, or otherwise work with @afs to update this to make it satisfactory.

@lisp
Copy link

lisp commented May 7, 2020

my intent the change would be better if it does not depend on knowledge embedded in other documents. that will matter down the road.

@lisp
Copy link

lisp commented May 7, 2020

as well established as practices may be, that

The current naming is certainly in line with most other SPARQL test cases.

does not recommend them.

@kasei
Copy link
Contributor

kasei commented May 7, 2020

What's being discussed now doesn't seem to have anything to do with whether these tests should be included in the test suite. There seems to be agreement on that. Let's merge it, and anyone who wants to file a new PR to improve naming or descriptions/documentation can do that (as that is a topic that impacts more than just these new tests).

@gkellogg
Copy link
Member

gkellogg commented May 7, 2020

What's being discussed now doesn't seem to have anything to do with whether these tests should be included in the test suite. There seems to be agreement on that. Let's merge it, and anyone who wants to file a new PR to improve naming or descriptions/documentation can do that (as that is a topic that impacts more than just these new tests).

If @afs would like to act on such feedback, it's up to him. Failing that, I suggest we just go ahead and merge.

I may add some suggestions for comments inline to this PR.

Copy link
Member

@gkellogg gkellogg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments along the lines of @afs's explanation in #61 (comment). Please feel free to improve.

sparql11/data-sparql11/aggregates/manifest.ttl Outdated Show resolved Hide resolved
@afs
Copy link
Contributor Author

afs commented May 8, 2020

It seems to have been missed that SPARQL tests have a mf:name which is more descriptive than the filename. It can be output by a test runner (Jena's does).

Thank you to @gkellogg for providing rdfs:comments for the tests.

The file names for the changes to the original MAX cases replace existing tests so retaining names make some sense for discussions to show what is new and what is replacement. Just before merge, I can rename files. The "1" and "2" for the count tests could usefully be swapped.

We need to ensure the PR comments are not too broken by this change so I suggest doing it at the last moment and also recording the change in the PR comments here.

@gkellogg
Copy link
Member

gkellogg commented May 8, 2020

Thanks Andy, IMO, this substantially responds to the objections, and you're clear to go ahead and merge this PR.

@afs
Copy link
Contributor Author

afs commented May 9, 2020

Commit c92f945 tidies up the manifest and puts in consistent naming of files:

Original PR Rename Reason
agg-empty-group-count-02 agg-empty-group-count-1 1,2 always "group, "no group"
agg-empty-group-count-01 agg-empty-group-count-2 1,2 always "group, "no group"
agg-empty-group-2 agg-empty-group-max-2 Put "max" into name
agg-empty-group-2 agg-empty-group-max-2 Put "max" into name

The merge will squash the commits down to one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants