data loss when multiple clients update a task within one pull sync interval (15min) #149

Open
bentheadset opened this issue Aug 10, 2022 · 4 comments

Comments

@bentheadset

it doesn't appear there is any attempt made at merge / automatic conflict resolution (ok, that's hard in the general case) - but also no checks to prevent updates from smashing each other?

Since the shortest pull interval is 15 min - this is a massive "race condition". Using two android devices as an example:

  1. device one makes an update to a task list (say, add a task)
  2. this is immediately pushed to the provider on the phone (etesync) which pushes to the server (OK, correct so far)
  3. but then a device that hasn't yet run its background pull sync, and thus has not picked up the change above, does the same. That update appears to completely overwrite the state on the server; it appears to just blindly save the whole new update
  4. now, the device from step 2 hits its auto pull interval ... and the data it added is fully lost
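
The sequence above can be reproduced with a small simulation. This is only a sketch under assumptions: it supposes the server stores each task list as one document and accepts whole-document writes with no version check, which matches the observed behavior but is not taken from EteSync's code. `Server`, `Device`, and `simulate` are hypothetical names made up for illustration.

```python
class Server:
    """Hypothetical server: stores the whole task list, last write wins."""

    def __init__(self):
        self.tasks = []
        self.history = []  # every pushed revision is kept

    def push(self, tasks):
        # No version check: whatever arrives replaces the current state.
        self.history.append(list(tasks))
        self.tasks = list(tasks)

    def pull(self):
        return list(self.tasks)


class Device:
    """Hypothetical client that pushes immediately but pulls only on a timer."""

    def __init__(self, server):
        self.server = server
        self.local = server.pull()

    def add_task(self, text):
        self.local.append(text)
        self.server.push(self.local)  # pushed right away, without pulling first

    def background_sync(self):
        self.local = self.server.pull()


def simulate():
    server = Server()
    server.push(["milk"])
    one, two = Device(server), Device(server)
    one.add_task("eggs")    # steps 1-2: device one adds a task and pushes
    two.add_task("bread")   # step 3: device two still holds the stale list
    one.background_sync()   # step 4: device one pulls and "eggs" is gone
    return one.local, server.history


if __name__ == "__main__":
    local, history = simulate()
    print("device one sees:", local)
    print("server revisions:", history)
```

In this model the "eggs" entry survives only in the revision history, not in the list either device ends up showing.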

I have seen this on both task entries as well as task lists themselves (both are just a text ~file it appears, treated with the same race condition for updates?)

Thank you, paid cloud customer

@tasn
Member

tasn commented Aug 10, 2022

It's not data loss, but it doesn't overwrite. If you change the same task concurrently it will take the most recent edit. There's an edit history that you can use to view changed data.

@bentheadset
Author

> It's not data loss, but it doesn't overwrite. If you change the same task concurrently it will take the most recent edit. There's an edit history that you can use to view changed data.

where is this stored - locally on the device that gets overwritten? or on the server (probably not, setting up a new phone with etesync provider + tasks.org app, changelog doesn't have any history?)

if this is true, then yes, in a sense it's "recoverable", but it requires navigating two apps (one of which is a background "app", the etesync provider, that is intended to be fully transparent to users?)

Does this same hole exist in calendar sync? I think tasks (like grocery list) get updated more by multiple devices, but it's still concerning?

@tasn
Member

tasn commented Aug 10, 2022

On the server, you have the full history for everything in EteSync.

Yeah, it's annoying, I'm not sure what's the right way of solving it to be honest other than trying to have smart diffing and merging.

@bentheadset
Author

bentheadset commented Aug 10, 2022

> On the server, you have the full history for everything in EteSync.
>
> Yeah, it's annoying, I'm not sure what's the right way of solving it to be honest other than trying to have smart diffing and merging.

In the general text document case, i'd agree with you. However, operating with knowledge of a line being a task/todo item, it would seem reasonable to treat task text as identity (eg hash the text), and preserve all entries. This would cause

  • dups if two edits occur - but that's a lot better than "losing" [from the perspective of the user] the data;
  • and it would also not provide for order resolution in the case of adjacent edits (including at the tail) - but again, quite minor compared to the current state
  • could require that the data is tasks/todos somewhere (? I'm not sure what the 'contract' is to store things in the task provider; all I see in my data (likely all created by OpenTasks) is a ~markdown list of -[] the text , and - [*] ... for resolved todos)

both of these DO clearly involve Product Req tradeoffs but critically IMHO:

  • I believe this strictly preserves "single device" behavior in all ways
  • keeps data in a form that is [IMHO] pretty objectively better for the user**

** if i'm adding data to a TODO list, and the data disappears ... yes, technically apparently this is backed up somewhere on the server ... but i would not have added it to the TODO List if my own brain would track the item well ... I've likely actually lost it, as I will never notice it's "gone".
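
To make the proposal concrete, here is a rough sketch of the "hash the text, keep every entry" merge. It assumes a task list is just a sequence of text lines and that a line's identity is the hash of its stripped text; `task_key` and `merge_task_lists` are hypothetical helpers, not part of EteSync or any task provider.

```python
import hashlib


def task_key(line: str) -> str:
    """Identity of a task: SHA-256 of its stripped text (an assumption)."""
    return hashlib.sha256(line.strip().encode("utf-8")).hexdigest()


def merge_task_lists(server_lines, incoming_lines):
    """Union of both lists, server order first, identical lines kept once."""
    merged = []
    seen = set()
    for line in list(server_lines) + list(incoming_lines):
        if not line.strip():
            continue  # skip blank lines
        key = task_key(line)
        if key not in seen:
            seen.add(key)
            merged.append(line)
    return merged


if __name__ == "__main__":
    # Two devices each added one task to the same starting list.
    print(merge_task_lists(["milk", "eggs"], ["milk", "bread"]))
    # The same task edited differently on two devices yields a duplicate.
    print(merge_task_lists(["buy milk"], ["buy oat milk"]))
```

With this merge both additions survive, and concurrent edits to one task show up as two entries, which is the dup tradeoff described above. A plain union like this cannot tell a deleted task from one the other device never saw, so handling deletions would likely need a common base revision to compare against.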
