Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modification events are lost depending on editor - how the file is written - neovim #88

Open
davesann opened this issue Feb 1, 2023 · 7 comments

Comments

@davesann
Copy link

davesann commented Feb 1, 2023

I came a situation where notifications are lost (I think because here: https://github.com/gmethvin/directory-watcher/blob/main/core/src/main/java/io/methvin/watcher/DirectoryWatcher.java#L347) the has is not seen as new.

This appears to arise from the manner in which an editor writes the files. I was using neovim.

I don't know if you will fix this - but the issue may be useful to others

neovim has a couple of ways that it writes when handling backups - the filewatcher will fail if backupcopy=no (or "auto" because then it is sometimes "no")

from neovim help

'backupcopy' 'bkc' string (default: "auto")
global or local to buffer |global-local|
When writing a file and a backup is made, this option tells how it's
done. This is a comma-separated list of words.

The main values are:
"yes" make a copy of the file and overwrite the original one
"no" rename the file and write a new one
"auto" one of the previous, what works best

I also saw this swallowing of events also with xed - probably for similar-ish reasons. And not with nano.

There are exceptions in the stack trace for these file changes that might help diagnose:

This is with neovim

Failing case; backupcopy=no; Final ENTRY_MODIFY is swallowed

2023-02-01 18:27:18 DEBUG DirectoryWatcher:301 - ENTRY_DELETE [/home/.../clerk-demo/notebooks/data_science.clj]
2023-02-01 18:27:18 DEBUG DirectoryWatcher:372 - DirectoryWatcher got an exception while watching!
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /home/.../clerk-demo/notebooks/data_science.clj/?
        at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:121)
        at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:68)
        at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:278)
        at java.base/java.nio.file.Path.of(Path.java:147)
        at java.base/java.nio.file.Paths.get(Paths.java:69)
        at io.methvin.watcher.PathUtils.subMap(PathUtils.java:43)
        at io.methvin.watcher.DirectoryWatcher.runEventLoop(DirectoryWatcher.java:362)
        at io.methvin.watcher.DirectoryWatcher.lambda$watchAsync$1(DirectoryWatcher.java:232)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
        at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1760)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1311)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1841)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1806)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
2023-02-01 18:27:18 DEBUG DirectoryWatcher:301 - ENTRY_CREATE [/home/.../clerk-demo/notebooks/data_science.clj]
2023-02-01 18:27:18 DEBUG DirectoryWatcher:474 - Skipping create event for path [/home/.../data_science.clj]. Path already hashed.
2023-02-01 18:27:18 DEBUG DirectoryWatcher:301 - ENTRY_MODIFY [/home/.../data_science.clj]

Successful case; backupcopy=yes; ENTRY_MODIFY is sent as MODIFY event.

2023-02-01 18:30:19 DEBUG DirectoryWatcher:301 - ENTRY_MODIFY [/home/.../data_science.clj]
2023-02-01 18:30:19 DEBUG DirectoryWatcher:402 - -> MODIFY [/home/.../clerk-demo/notebooks/data_science.clj] (isDirectory: false)
@gmethvin
Copy link
Owner

gmethvin commented Feb 1, 2023

I think the main issue is the InvalidPathException. That prevents us from being able to notify for the ENTRY_DELETE and remove the hashes from the hash map. Then when we get an ENTRY_CREATE event, we ignore it because we see it as a "duplicate" create event since the file already exists.

The exception happens when we do:

pathMap.subMap(treeRoot, Paths.get(treeRoot.toString(), "" + Character.MAX_VALUE));

The Character.MAX_VALUE is used as a hack there to get us all the paths in the subdirectory that start with that particular path. This code basically assumes your system charset is UTF-8, so that character should be representable. Is it possible you're using some other charset? I think you should be able to force UTF-8 everywhere by setting the environment variables like:

LANG=C.UTF-8
LC_ALL=C.UTF-8

In any case, I'm not sure we should be relying on the subMap hack above, so perhaps an alternative solution makes sense there anyway.

@davesann
Copy link
Author

davesann commented Feb 1, 2023 via email

@gmethvin
Copy link
Owner

gmethvin commented Feb 2, 2023

ok, it's probably not worth trying to debug why that specific hack doesn't work. It's easy enough to switch to another method of getting the paths in a subtree.

@gmethvin
Copy link
Owner

gmethvin commented Feb 2, 2023

@davesann try version 0.17.3 and see if that fixes your bug (note I accidentally released 0.17.2 without the fix).

I believe you should get a DELETE followed by a CREATE in the backupcopy=no case.

@davesann
Copy link
Author

davesann commented Feb 2, 2023

Using nvim

Yes. I think this has fixed the issue.

with backupcopy=yes; (copy and modify in place); still works as before; I get:

2023-02-03 09:14:43 DEBUG DirectoryWatcher:301 - ENTRY_MODIFY [/home/.../clerk-demo/notebooks/data_science.clj]
2023-02-03 09:14:43 DEBUG DirectoryWatcher:400 - -> MODIFY [/home/.../clerk-demo/notebooks/data_science.clj] (isDirectory: false)

with backupcopy=no; (rename and create new); works, I get:


2023-02-03 09:16:24 DEBUG DirectoryWatcher:301 - ENTRY_DELETE [/home/.../clerk-demo/notebooks/data_science.clj]
2023-02-03 09:16:24 DEBUG DirectoryWatcher:400 - -> DELETE [/home/.../clerk-demo/notebooks/data_science.clj] (isDirectory: false)
2023-02-03 09:16:24 DEBUG DirectoryWatcher:301 - ENTRY_CREATE [/home/.../clerk-demo/notebooks/data_science.clj]
2023-02-03 09:16:24 DEBUG DirectoryWatcher:400 - -> CREATE [/home/.../clerk-demo/notebooks/data_science.clj] (isDirectory: false)
2023-02-03 09:16:25 DEBUG DirectoryWatcher:301 - ENTRY_MODIFY [/home/.../clerk-demo/notebooks/data_science.clj]

The final entry_modify is not converted to a Modify event - which I think is expected

using xed

This was just out of curiosity (I don't use xed but tried it when first looking at the issue) - no exceptions.

Xed seems to use a tmp file - but this does not appear to generate the final create or modify event for the modified file - so this change does not get notified.


2023-02-03 09:23:24 DEBUG DirectoryWatcher:301 - ENTRY_CREATE [/home/.../clerk-demo/notebooks/.goutputstream-FH1TZ1]
2023-02-03 09:23:24 DEBUG DirectoryWatcher:400 - -> CREATE [/home/.../clerk-demo/notebooks/.goutputstream-FH1TZ1] (isDirectory: false)
2023-02-03 09:23:24 DEBUG DirectoryWatcher:301 - ENTRY_MODIFY [/home/.../notebooks/.goutputstream-FH1TZ1]
2023-02-03 09:23:24 DEBUG DirectoryWatcher:301 - ENTRY_MODIFY [/home/.../notebooks/.goutputstream-FH1TZ1]
2023-02-03 09:23:24 DEBUG DirectoryWatcher:400 - -> MODIFY [/home/.../clerk-demo/notebooks/.goutputstream-FH1TZ1] (isDirectory: false)
2023-02-03 09:23:24 DEBUG DirectoryWatcher:301 - ENTRY_MODIFY [/home/.../clerk-demo/notebooks/.goutputstream-FH1TZ1]
2023-02-03 09:23:24 DEBUG DirectoryWatcher:301 - ENTRY_DELETE [/home/.../clerk-demo/notebooks/.goutputstream-FH1TZ1]
2023-02-03 09:23:24 DEBUG DirectoryWatcher:400 - -> DELETE [/home/dave/.../clerk-demo/notebooks/.goutputstream-FH1TZ1] (isDirectory: false)
2023-02-03 09:23:24 DEBUG DirectoryWatcher:301 - ENTRY_CREATE [/home/..../clerk-demo/notebooks/data_science.clj]
2023-02-03 09:23:24 DEBUG DirectoryWatcher:472 - Skipping create event for path [/home/.../clerk-demo/notebooks/data_science.clj]. Path already hashed.

@gmethvin
Copy link
Owner

gmethvin commented Feb 2, 2023

Interesting... so with xed it appears from the logs that we get an ENTRY_CREATE for the data_science.clj file, but there never was a prior ENTRY_DELETE event for that same file. That suggests that either the file was actually deleted but somehow the ENTRY_DELETE was never received, or that the watch service is reporting an ENTRY_CREATE when there is already an entry for that file.

If it's the latter, then the DirectoryWatcher logic probably needs to be updated to account for that case. It seems odd that we would get an ENTRY_CREATE if the entry already exists, but perhaps some WatchService implementations do that when a file is moved and replaces another entry.

I suppose it might not hurt to just report a CREATE event regardless. One of the goals of directory-watcher's hashing is to prevent duplicate events, but typically those come in the form of a series of ENTRY_MODIFY events rather than duplicate ENTRY_CREATE events.

@gmethvin
Copy link
Owner

gmethvin commented Feb 5, 2023

So after looking at this a bit more, I realize there's a good reason directory-watcher suppresses the duplicate CREATE events.

Basically, when a new directory is created, there several situations in which we don't get ENTRY_CREATE events from subdirectories that already existed, so we may need to traverse the directory structure to generate our own CREATE events. But it's possible new files are created before that traversal is complete, and we do get ENTRY_CREATE events for those later, thus leading to duplicates.

That said, while it's a bad idea to send a duplicate CREATE in that case, it might be reasonable to send a MODIFY event if the hash of the file is actually different when we get the duplicate ENTRY_CREATE event. I believe that would also solve the case with xed where the ENTRY_CREATE is being sent when the file is actually being modified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants