Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #459: reduce timestamptz precision #716

Merged
merged 2 commits into from
Dec 13, 2022

Conversation

sadiqkhoja
Copy link
Contributor

@sadiqkhoja sadiqkhoja commented Dec 8, 2022

Closes #459

Changes:

  • Reduce precision of timestamptz columns to milliseconds across the database

What has been done to verify that this works as intended?

  • Added integration test

Why is this the best possible solution? Were any other approaches considered?

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

NA

Does this change require updates to the API documentation? If so, please update docs/api.md as part of this PR.

NA

Before submitting this PR, please make sure you have:

  • run make test-full and confirmed all checks still pass OR confirm CircleCI build passes
  • verified that any code from external sources are properly credited in comments

@sadiqkhoja
Copy link
Contributor Author

sadiqkhoja commented Dec 8, 2022

There is still a separate but related issue: Only using submissionDate > xxx filter criterion can skip submision(s) in batch processing. Possible solution is to recommend users to use submissionDate > xxx and submissionDate < xxx in ETL and similar workflows

Scenario:
Assume user has ETL proceduce where they are fetching new submissions periodically by using filter submissionDate greater than submissionDate of last known submission. Then consider following sequence of events:

t0: lastTimestamp is t0
t1: new submission(1)
t1: get submissions with submissionDate > t0 -- submission(1) is returned and lastTimestamp is t1
t1: new submission(2)
t2: get submissions with submissionDate > t1 -- no submission is returned

This can always happen no matter what the database timestamp precision is. Following test captures this scenario.

it.only('should not skip submission', testService(async (service, { run }) => {
  const asAlice = await service.login('alice', identity);
  
  let lastTimestamp = '2022-01-01T09:00:00.015Z';
  
  // ****************************************************************************************
  // Simulate concurrent requests
  // Assume following three requests are executed at 2022-01-01T09:00:00.020Z
  // But there is an order at higher precision i.e. microsecond or picosecond
  // ****************************************************************************************
  {
    await asAlice.post('/v1/projects/1/forms/withrepeat/submissions')
    .send(testData.instances.withrepeat.one)
    .set('Content-Type', 'text/xml')
    .expect(200)
    .then(() => run(sql`update submissions set "createdAt"='2022-01-01T09:00:00.020Z'`));
  
    lastTimestamp = await asAlice.get(`/v1/projects/1/forms/withrepeat.svc/Submissions?$filter=__system/submissionDate gt ${lastTimestamp}`)
      .expect(200)
      .then(({ body }) => {
        body.value.length.should.be.eql(1);
        body.value[0].__id.should.be.eql('rone');
        return  body.value[0].__system.submissionDate;
      });
  
    await asAlice.post('/v1/projects/1/forms/withrepeat/submissions')
      .send(testData.instances.withrepeat.two)
      .set('Content-Type', 'text/xml')
      .expect(200)
      .then(() => run(sql`update submissions set "createdAt"='2022-01-01T09:00:00.020Z'`));
  }
  // ****************************************************************************************
  
  
  await asAlice.get(`/v1/projects/1/forms/withrepeat.svc/Submissions?$filter=__system/submissionDate gt ${lastTimestamp}`)
    .expect(200)
    .then(({ body }) => {
      body.value.length.should.be.eql(1);
      body.value[0].__id.should.be.eql('rtwo');
    });
}));

@sadiqkhoja sadiqkhoja marked this pull request as ready for review December 8, 2022 21:46
Copy link
Member

@ktuite ktuite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good, I just feel bad for all those microseconds we'll be saying goodbye to!

We have a way of writing tests for individual migrations and I was trying to think of a useful before/after migration test but I can't think of one right now. The test in this PR covers the main timestamp precision scenario we want to fix.

@sadiqkhoja sadiqkhoja merged commit 3bd2c80 into getodk:master Dec 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OData filter: date/time fields more precise than millisecond
2 participants