Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"A debounced event. Does not emit any specific event type on purpose" – please consider adding information about the reasoning, and how to work around this limitation #468

Closed
d4h0 opened this issue Feb 10, 2023 · 7 comments

Comments

@d4h0
Copy link

d4h0 commented Feb 10, 2023

Hi,

The docs of DebouncedEvent say:

A debounced event.

Does not emit any specific event type on purpose, only distinguishes between an any event and a continuous any event.

...however, the docs do not name any reasons for this limitation.

On the Upgrading from notify v4 to v5 page, I found the following:

The old DebouncedEvent is completely removed. notify-debouncer-mini only reports an Any like event (named DebouncedEvent too), as relying on specific kinds (Write/Create/Remove) is very platform specific and #261 #187 #272 to work, relying on a lot of assumptions. In most cases you should check anyway what exactly the state of files is, or probably re-run your application code, not relying on which event happened.

...which is better, but still not very satisfying.

One point that's not 100% clear to me is, why can notify::event::EventKind contain precise information about the event type, but DebouncedEvent can't? My guess is, because debouncing involves several events, which, if true, could be added to the docs.

Also, why can't DebouncedEvent contain a list of events that were involved with the debounced event? If 'avoiding allocations' is the reason, maybe there could be a feature flag that enables an enhanced DebouncedEvent.

Besides the reasoning for the limitation, I believe the docs should have guidelines about how to implement common use-cases (more detailed than what is mentioned in the upgrade instructions).

As an example, I'm implementing code that reloads a config file on changes.

A naive approach would be:

  1. Read the config file at startup
  2. Read and store the time the file was last changed
  3. Set up notify to watch the file (or rather the directory which contains the file, as the docs recommend)
  4. On events that contain the path to the config file, check if the file has changed (i.e. the modified timestamp is bigger than the previously store timestamp)
  5. Store the new timestamp
  6. Read and process the changed config file

...but is that the proper way to do it? Wouldn't reading the modified timestamp (step 4) trigger another event, resulting in an endless loop?

Because of the above, I feel the docs should contain more detail information, including guidelines for common use-cases.

@0xpr03
Copy link
Member

0xpr03 commented Feb 12, 2023

Let me answer you shortly on the two questions. I'll get the other stuff later, I don't think you're wrong that more docs is probably a good idea.

Wouldn't reading the modified timestamp (step 4) trigger another event, resulting in an endless loop?

No

contain precise information about the event type, but DebouncedEvent can't?

The short answer is that mini-debouncer isn't made for advanced requirements and gives you the minimum to reload config files on change, or rebuild assets. You should check for the existence of files anyway, or work with the fact that they may be gone between the event and your interaction with them.

Which leads into the long answer: Events aren't set in stone and you will have to deal with the "reality" of what is actually true for the files anyway. Trying to give you debounce event details is kind of impossible to get right. The easy approach would be to just do stuff like "create + delete + create = change". Problem is that renames, moves and missed events (yes that's possible, if only because moving from/to another directory) will lead to false results. Also don't forget about different editors "editing" files differently. Thus you will have to a) send all events b) in the right order and c) add a fat disclaimer that notify tried to do this earlier, but couldn't get it right, so good luck to anyone trying this by themselves, hope you at least prepare for editors that write into another file and rename afterwards.

@d4h0
Copy link
Author

d4h0 commented Feb 12, 2023

Thanks for your response, @0xpr03!

As someone not familiar with the code of this crate, and file system event APIs in general (let alone for different platforms), it's hard to see why this is difficult, but I obviously have to believe you (who is someone with that knowledge).

That being said, this makes it only even more important to add some examples and guidelines that help people to implement proper file system watching code.

Also, maybe there could be some higher level APIs for common use-cases?

For example, what I described above should be easy to implement. The debouncer (or a layer on top of it) just would have to maintain some state (i.e. the modified timestamps).

@0xpr03
Copy link
Member

0xpr03 commented Feb 12, 2023

Also, why can't DebouncedEvent contain a list of events that were involved with the debounced event

It could, but you're probably just wasting allocations (& frees), for most people.

I guess you have seen the examples folder ? Is there something specifically you're missing ? Just some readme with "look here for config changes"?

My guess is, because debouncing involves several events

The basic idea is something like this: You want to do something when files change. You may not even need to know which files changed, just that something changed. Examples are rebuilding of docs from markdown files, reloading your config or running some linter on a bunch of JS files.
But you don't want this to run after every event, this could involve 2 events per file on a git pull: truncate + append. Or more, depending how it's done (see previous comment). You don't need to know those 2 events, you just need to know that they changed, and you certainly don't want to run your logic on every change event, possibly while the files contain garbage. So that's why you use a debouncer. Most projects don't even check what kind of event happened or for which paths, they just re-run their stuff on any event.

the modified timestamps

But why would you want to store the timestamps when you know the files changed ? Timestamps can also be invalid, most linux hosts boot with relaxed timestamp settings, so no change is recorded for sufficiently fast events - you can even disable all timestamps. And if you're looking at special file systems (/proc, network), any timestamp may be invalid, so notify may internally store a hash of the content.

The gist is that I'm open for contributions to the debouncer, or making a notify-debouncer that has all the bells and whistles you may want, in addition with summarized events, maybe we can get this right for most people - I've just seen what previous people tried, and looked through the issues. We may also want some way to ignore paths, let's say watching the whole folder but ignoring .gitignore stuff and .git, so this has be to generic with some easy way to just "use gitignore". That would also help with all the cases where you have to watch the whole folder, and not only the one file you're interested in.

@d4h0
Copy link
Author

d4h0 commented Feb 12, 2023

But why would you want to store the timestamps when you know the files changed ?

I guess, mostly because I'm not sure how the debouncer works. My assumption was, that the debouncer works with the same event kinds as described in EventKind.

One of those event kinds is EventKind::Access, and my reasoning was "If someone reads the file (e.g. my program after it was notified about a change), then I don't want to reload the config file". Storing the modified timestamp seemed like a good way to prevent needless work (and endless loops).

But from your comments it seems the debouncer doesn't react to EventKind::Access, correct?

In that case, there would only be two scenarios:

  1. The config file was deleted
  2. The configuration most-likely changed because the config file was created, changed, or overwritten

If that is true, then I should just try to read the config file on any event, which would fail if the file was deleted (which I anyway need to handle).

So I guess, the main problem is that the documentation of the debouncer doesn't contain enough information (which is why I opened this issue).

I guess you have seen the examples folder ?

Yes, I've looked at debounced.rs and async_monitor.rs before I started to work on my config file watcher.

Is there something specifically you're missing ? Just some readme with "look here for config changes"?

Yes, I think a config file watcher is a common use-case, and would make a good example. Just to be clear regarding what I mean with "a config file watcher":

I'm talking about a program that reads its config file at startup, and then reloads it when it changes, so that the program does not need to be restarted after a config update. For example, alacritty uses notify for this, as far as I know.

The gist is that I'm open for contributions to the debouncer, or making a notify-debouncer that has all the bells and whistles you may want, in addition with summarized events, maybe we can get this right for most people - I've just seen what previous people tried, and looked through the issues.

I might give it a try at some point, when I need a more advanced debouncer (I also kind of want to learn more about this stuff, but my hands are pretty full at the moment).

I think, it would be useful to collect all the related problems, issues, and information in one place, so it's easier to get started, if someone decides to work on this.

We may also want some way to ignore paths, let's say watching the whole folder but ignoring .gitignore stuff and .git, so this has be to generic with some easy way to just "use gitignore".

It seems, the ignore crate would be useful for this (ignore::gitignore).

@0xpr03
Copy link
Member

0xpr03 commented Feb 12, 2023

doesn't react to EventKind::Access

Capture all events for a path, if no event got fired for that path, for the duration set on debouncer creation, emit an Any Event. If we instead still get an Event, fire AnyContinuous.

then I should just try to read the config file on any event

That's why it's called an "any" event ;) You may choose to ignore continuos events.

@d4h0
Copy link
Author

d4h0 commented Feb 12, 2023

Capture all events for a path, if no event got fired for that path, for the duration set on debouncer creation, emit an Any Event. If we instead still get an Event, fire AnyContinuous.

So what happens in the following scenario:

  1. Program watches ./config.toml
  2. User edits ./config.toml
  3. Notify sends debounced event to program
  4. Program reads ./config.toml

Wouldn't that result in an endless-loop? Step 4 would trigger a new access event, so Notify would send another debounced event, and so on.

@0xpr03
Copy link
Member

0xpr03 commented Feb 12, 2023

Wouldn't that result in an endless-loop

Only if you modify the file. Access doesn't trigger notify events from the file system, we simply don't subscribe to them and for most users they don't even exist (access times on windows don't existsame for windows, on linux only if you mount them like that - which isn't the default). Note: Specifically for linux, watching said file for changes would probably even fire an access event itself, we have to subscribe to each file recursively - and we check for some parts if the file exists, possibly performing an access.

If that were the case, hot-reload-tide would itself trigger that loop.

@0xpr03 0xpr03 closed this as completed Aug 16, 2023
CyriacBr added a commit to CyriacBr/modules-watcher that referenced this issue Sep 11, 2023
BREAKING CHANGES

Relying on FS events was not reliable across all platforms, and some events are outright ignored based on how the file changes (e.g some IDE do not truly "write" to files directly).
See notify-rs/notify#465 and notify-rs/notify#468.

We now use a dumb poll system that computes changes every 250ms. The EntryChange struct was thus improved to provide trigger cause, and this is what the watch callback will now send.

Additional changes:
- Update rust napirs packages and set napi cli to 2.14.8 to avoid duplicate generated content
- Adapt tests to the new watching system
- Update README
CyriacBr added a commit to CyriacBr/modules-watcher that referenced this issue Sep 11, 2023
BREAKING CHANGES

Relying on FS events was not reliable across all platforms, and some events are outright ignored based on how the file changes (e.g some IDE do not truly "write" to files directly).
See notify-rs/notify#465 and notify-rs/notify#468.

We now use a dumb poll system that computes changes every 250ms. The EntryChange struct was thus improved to provide trigger cause, and this is what the watch callback will now send.

Additional changes:
- Update rust napirs packages and set napi cli to 2.14.8 to avoid duplicate generated content
- Adapt tests to the new watching system
- Update README
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants