Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add 'watcher' interface to file sync #1365

Merged
merged 15 commits into from
Aug 22, 2024
Merged

Conversation

djosephsen
Copy link
Contributor

Implement fsnotify and os.Stat based watchers

fixes: #1344

This PR

Intent of this PR is to begin a conversation about fixing #1344. The approach taken is to replace the current use of fsontify.Watcher with a local Watcher interface type that describes the fsnotify.Watcher interface.

My original take was to use fsnotify.Watcher directly as an implementation of local Watcher, but fsnotify's Watcher directly exposes its Error and Event channels, making it impossible to describe with an interface, so I had to create a small wrapper for fsnotify.Watcher to satisfy the new Watcher interface (this is fsnotify_watcher.go). From there, we implement the Watcher interface again, this time using os.Stat and fs.FileInfo (this is fileinfo_watcher.go).

Then we change the filepath sync code to use an interface to Watcher, rather than fsnotify.Watcher directly. The new fileinfo watcher plugs right in, and nothing really needs to change in the sync.

  • I have not wired up configs, so the fileinfo watcher has a hard-coded 1-second polling interval, and there is no current means of selecting between them.
  • I've added a couple tests, to demonstrate how unit tests would work in general (we use a configurable os-stat func in the fileinfo watcher, which can be mocked for tests)
  • I don't have a way of testing this on Windows. I'm vaguely aware there's an upstream issue in package fs that may require some work-around boilerplate to make this work on windows at the moment.

If yall are favorable to this approach, I'll finish wiring up configs, and flesh out the tests. I didn't want to go much further without some buy-in or feedback.

Related Issues

Fixes #1344

Notes

See bullet-points above

How to test

go test -v ./...

@djosephsen djosephsen requested a review from a team as a code owner July 22, 2024 03:54
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jul 22, 2024
Copy link

netlify bot commented Jul 22, 2024

Deploy Preview for polite-licorice-3db33c ready!

Name Link
🔨 Latest commit df3f9bc
🔍 Latest deploy log https://app.netlify.com/sites/polite-licorice-3db33c/deploys/66c51ae83276770008e19405
😎 Deploy Preview https://deploy-preview-1365--polite-licorice-3db33c.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@djosephsen djosephsen changed the title Wire up 'watcher' interface for file-watching feat: add 'watcher' interface to file sync Jul 22, 2024
@beeme1mr
Copy link
Member

Hey @djosephsen, this seems like a good approach to me. @toddbaert @thisthat @Kavindu-Dodan do you have any concerns?

@thisthat
Copy link
Member

Thanks @djosephsen for the awesome contribution, I am in favor of having this 👍
Regarding the Windows issue, I think we should add a parallel job for the integration tests and run them on a Windows runner (See https://github.com/open-feature/flagd/blob/main/.github/workflows/build.yaml#L115)

@toddbaert
Copy link
Member

toddbaert commented Jul 23, 2024

I agree with this approach! Thanks @djosephsen .

I think the only thing I'm not sure about is how this configuration should be selected... Should we make it a "global" flag so that all file-watchers use a particular implementation? Should it allow it to be configured per-file-sync object?

I think I favor the former... I think if someone wants to use a certain implementation, they will want to use it for all watchers, so a single global config makes sense. I also wonder if there's some way we can default to use this mode outside of K8s. Is there an "approved" means of knowing if you're running in K8s? Some env-var? The watcher as implemented works great there, since K8s behaves in very predictable ways wrt config-maps, but outside of that I think the new implementation will be more reliable (albeit a bit less performant). cc @thisthat @heckelmann on that last question.

@djosephsen
Copy link
Contributor Author

I agree with this approach! Thanks @djosephsen .

I think the only thing I'm not sure about is how this configuration should be selected... Should we make it a "global" flag so that all file-watchers use a particular implementation? Should it allow it to be configured per-file-sync object?

SGTM. Honestly making it a per-watcher switch hadn't even occurred to me.

I think I favor the former... I think if someone wants to use a certain implementation, they will want to use it for all watchers, so a single global config makes sense. I also wonder if there's some way we can default to use this mode outside of K8s. Is there an "approved" means of knowing if you're running in K8s? Some env-var? The watcher as implemented works great there, since K8s behaves in very predictable ways wrt config-maps, but outside of that I think the new implementation will be more reliable (albeit a bit less performant). cc @thisthat @heckelmann on that last question.

I don't know how officially approved this is, but I've used KUBERNETES_SERVICE_HOST in the past to detect whether the process is running in k8s. I suppose the bullet-proof way to do it would be to use the client lib...

Copy link
Contributor

@Kavindu-Dodan Kavindu-Dodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach and 👍 from my end. We could set a default operation mode (ex:- fsnotify for historical reasons) and gradually move to fileinfo. And document/recommend fileinfo for situations where fsnotify fails.

Copy link
Member

@beeme1mr beeme1mr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is looking great! There's a few minor issues that should be addressed before merging but we're close.

When you get a moment, please:

  • sign-off your commits. This is a requirement from the CNCF. Here's some information on how to easily update the entire branch.
  • double-check the default "file" logic.
  • add the new file syncs to the docs.

Thanks!

core/pkg/sync/builder/syncbuilder.go Show resolved Hide resolved
core/pkg/sync/builder/syncbuilder.go Outdated Show resolved Hide resolved
core/pkg/sync/builder/syncbuilder.go Show resolved Hide resolved
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jul 30, 2024
@djosephsen
Copy link
Contributor Author

Just heads-up before this merges I should really flesh out fileinfo_watcher_test.go.. I feel like this is under-tested

@beeme1mr
Copy link
Member

beeme1mr commented Aug 2, 2024

Just heads-up before this merges I should really flesh out fileinfo_watcher_test.go.. I feel like this is under-tested

Sure, no problem. Please let us know when you're ready so we can rereview. Thanks!

@djosephsen
Copy link
Contributor Author

ok yall.. tests are fleshed out. Sorry that took a minute (writing tests is not my favorite thing). I have a question about how we want to handle the Error and Event channels given by the watchers. Currently we are using an unbuffered fsnotify .Watcher , but there is an upstream buffered option.

Do we want to default to using unbuffered channels in the fileinfo watcher? My personal feeling is that buffered channels with sane defaults (size=32?) are pretty cheap, generally safer and less likely to lock the runtime in unexpected ways, unless you're genuinely holding something wrong. For example, In cases like the tests, where you're just kind of flexing the code-paths, unbuffered channels can create headaches.

ATM I have silly values for both buffer sizes, and will change them to whatever yall think, at which point I think we're ready to merge pending your review.

@djosephsen
Copy link
Contributor Author

hmm, I just noticed I forgot the DCO again. Rebase is giving merge conflicts, and when I resolve them, giving me a commit message that looks like I might be overwriting someone else's history.. Do I need to close this and re-submit?

(sorry my bad. I'll git config --global commit.gpgsign true)

@beeme1mr
Copy link
Member

Hey @djosephsen, you should be able to run the following command:

In your local branch, run: git rebase HEAD~10 --signoff

Force push your changes to overwrite the branch: git push --force-with-lease origin main

We'll squash merge ovce the pr is ready.

@beeme1mr
Copy link
Member

Dco is a bit of a pain. We're looking into switching to a cla to simplify this process.

Implement fsnotify and `os.Stat` based watchers

fixes: open-feature#1344

Signed-off-by: Dave Josephsen <[email protected]>
djosephsen and others added 2 commits August 13, 2024 20:25
Signed-off-by: Dave Josephsen <[email protected]>
@djosephsen
Copy link
Contributor Author

Hey @djosephsen, you should be able to run the following command:

In your local branch, run: git rebase HEAD~10 --signoff

Force push your changes to overwrite the branch: git push --force-with-lease origin main

We'll squash merge ovce the pr is ready.

The first time I tried it went sideways, but the second time I was able to use some switch to skip an upstream merge, and that did it.

@toddbaert
Copy link
Member

toddbaert commented Aug 14, 2024

@djosephsen if you want, you can approve or 👍 👎 here and you'll get an org invite. Then your CI tests will run automatically. There's no obligations involved in joining, but it's the first step on the contributor ladder.

@Kavindu-Dodan
Copy link
Contributor

@djosephsen there're few lint issue due to unwrapped errors [1]. And you can run lint validations locally with make lint, so you know next run will be greet :)

[1] - https://github.com/open-feature/flagd/actions/runs/10406956145/job/28834819193?pr=1365#step:5:871

@djosephsen
Copy link
Contributor Author

@djosephsen there're few lint issue due to unwrapped errors [1]. And you can run lint validations locally with make lint, so you know next run will be greet :)

[1] - https://github.com/open-feature/flagd/actions/runs/10406956145/job/28834819193?pr=1365#step:5:871

Hey hey... I think it might be idiomatically incorrect to wrap those errors. fsnotify_watcher is an interface wrapper, so by definition it doesn't add context to the upstream error.

Totally happy to wrap them if ya want. But I think the correct move is probably to //nolint:wrapcheck those methods. Take a look and lemme know what yall think.

@Kavindu-Dodan
Copy link
Contributor

Kavindu-Dodan commented Aug 16, 2024

@djosephsen there're few lint issue due to unwrapped errors [1]. And you can run lint validations locally with make lint, so you know next run will be greet :)
[1] - https://github.com/open-feature/flagd/actions/runs/10406956145/job/28834819193?pr=1365#step:5:871

Hey hey... I think it might be idiomatically incorrect to wrap those errors. fsnotify_watcher is an interface wrapper, so by definition it doesn't add context to the upstream error.

Totally happy to wrap them if ya want. But I think the correct move is probably to //nolint:wrapcheck those methods. Take a look and lemme know what yall think.

Yeah it doesn't add context to the upstream, but by wrapping we get more context for downstream (ex:- log analysis). I think this section of the checker explains why wrapping is recommended [1].

[1] - https://github.com/tomarrell/wrapcheck?tab=readme-ov-file#why

@djosephsen
Copy link
Contributor Author

@djosephsen there're few lint issue due to unwrapped errors [1]. And you can run lint validations locally with make lint, so you know next run will be greet :)
[1] - https://github.com/open-feature/flagd/actions/runs/10406956145/job/28834819193?pr=1365#step:5:871

Hey hey... I think it might be idiomatically incorrect to wrap those errors. fsnotify_watcher is an interface wrapper, so by definition it doesn't add context to the upstream error.
Totally happy to wrap them if ya want. But I think the correct move is probably to //nolint:wrapcheck those methods. Take a look and lemme know what yall think.

Yeah it doesn't add context to the upstream, but by wrapping we get more context for downstream (ex:- log analysis). I think this section of the checker explains why wrapping is recommended [1].

[1] - https://github.com/tomarrell/wrapcheck?tab=readme-ov-file#why

Okeedokee, pr inbound

Signed-off-by: Dave Josephsen <[email protected]>
Signed-off-by: Dave Josephsen <[email protected]>
@beeme1mr
Copy link
Member

I'm seeing a panic now when trying to run this locally.

Error:

2024-08-19T12:35:15.559-0400    info    file/filepath_sync.go:62        Starting filepath sync notifier {"component": "sync", "sync": "fileinfo"}
panic: assignment to entry in nil map

goroutine 1 [running]:
github.com/open-feature/flagd/core/pkg/sync/file.(*fileInfoWatcher).Add(0xc00021ca80, {0xc0005b0360, 0x2a})
        /home/beemer/code/openfeature/flagd/core/pkg/sync/file/fileinfo_watcher.go:69 +0xfe
github.com/open-feature/flagd/core/pkg/sync/file.(*Sync).Init(0xc0005e1f20, {0x2123020, 0xc0003aa550})
        /home/beemer/code/openfeature/flagd/core/pkg/sync/file/filepath_sync.go:78 +0x22a
github.com/open-feature/flagd/flagd/pkg/runtime.(*Runtime).Start(0xc0005b40f0)
        /home/beemer/code/openfeature/flagd/flagd/pkg/runtime/runtime.go:75 +0x2b5
github.com/open-feature/flagd/flagd/cmd.init.func1(0xc0005a2200?, {0x1df5f60?, 0x4?, 0x1df5f64?})
        /home/beemer/code/openfeature/flagd/flagd/cmd/start.go:146 +0x8e5
github.com/spf13/cobra.(*Command).execute(0x30cb160, {0xc0003a7bf0, 0x3, 0x3})
        /home/beemer/go/pkg/mod/github.com/spf13/[email protected]/command.go:989 +0xab1
github.com/spf13/cobra.(*Command).ExecuteC(0x30cae80)
        /home/beemer/go/pkg/mod/github.com/spf13/[email protected]/command.go:1117 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
        /home/beemer/go/pkg/mod/github.com/spf13/[email protected]/command.go:1041
github.com/open-feature/flagd/flagd/cmd.Execute({0x1df5788?, 0xc0000ac050?}, {0x1df5e24?, 0x21008a0?}, {0x1df9193?, 0x4014e0?})
        /home/beemer/code/openfeature/flagd/flagd/cmd/root.go:37 +0x91
main.main()
        /home/beemer/code/openfeature/flagd/flagd/main.go:30 +0xd4
exit status 2
make: *** [Makefile:55: run-flagd] Error 1

@djosephsen
Copy link
Contributor Author

I saw that too.. I'll get it tracked down

@djosephsen
Copy link
Contributor Author

How we lookin? I can run the binary locally right now, but trying to run the integration tests, I don't have any of the files referenced by those commands.. like test-harness/flags/testing-flags.json and etc... Looks like the integration tests, as run by the CI are working though.

@beeme1mr
Copy link
Member

Hey @djosephsen, I'll manually test it real quick. If all looks good, I think we should merge. FYI @toddbaert

@beeme1mr
Copy link
Member

It works great locally. Thank you very much @djosephsen. 🥳

@toddbaert toddbaert merged commit 61fff43 into open-feature:main Aug 22, 2024
15 checks passed
@github-actions github-actions bot mentioned this pull request Aug 22, 2024
beeme1mr pushed a commit that referenced this pull request Aug 22, 2024
🤖 I have created a release *beep* *boop*
---


<details><summary>flagd: 0.11.2</summary>

##
[0.11.2](flagd/v0.11.1...flagd/v0.11.2)
(2024-08-22)


### 🐛 Bug Fixes

* **deps:** update module buf.build/gen/go/open-feature/flagd/grpc/go to
v1.5.1-20240215170432-1e611e2999cc.1
([#1372](#1372))
([ae24595](ae24595))
* **deps:** update module github.com/open-feature/flagd/core to v0.10.1
([#1355](#1355))
([8fcfb14](8fcfb14))
* **deps:** update module golang.org/x/net to v0.28.0
([#1380](#1380))
([239a432](239a432))
* **deps:** update module golang.org/x/sync to v0.8.0
([#1378](#1378))
([4804c17](4804c17))


### 🧹 Chore

* **deps:** update dependency go to v1.22.6
([#1297](#1297))
([50b92c1](50b92c1))
* **deps:** update golang docker tag to v1.23
([#1382](#1382))
([abb5ca3](abb5ca3))
* improve gRPC sync service shutdown behavior
([#1375](#1375))
([79d9085](79d9085))
</details>

<details><summary>flagd-proxy: 0.6.5</summary>

##
[0.6.5](flagd-proxy/v0.6.4...flagd-proxy/v0.6.5)
(2024-08-22)


### 🐛 Bug Fixes

* **deps:** update module buf.build/gen/go/open-feature/flagd/grpc/go to
v1.5.1-20240215170432-1e611e2999cc.1
([#1372](#1372))
([ae24595](ae24595))
* **deps:** update module github.com/open-feature/flagd/core to v0.10.1
([#1355](#1355))
([8fcfb14](8fcfb14))
* **deps:** update module golang.org/x/net to v0.28.0
([#1380](#1380))
([239a432](239a432))
* **deps:** update module golang.org/x/sync to v0.8.0
([#1378](#1378))
([4804c17](4804c17))


### 🧹 Chore

* **deps:** update dependency go to v1.22.6
([#1297](#1297))
([50b92c1](50b92c1))
* **deps:** update golang docker tag to v1.23
([#1382](#1382))
([abb5ca3](abb5ca3))


### 📚 Documentation

* **flagd-proxy:** removed invalid grpc prefix from uri config
([4911697](4911697))
</details>

<details><summary>core: 0.10.2</summary>

##
[0.10.2](core/v0.10.1...core/v0.10.2)
(2024-08-22)


### 🐛 Bug Fixes

* **deps:** update module buf.build/gen/go/open-feature/flagd/grpc/go to
v1.5.1-20240215170432-1e611e2999cc.1
([#1372](#1372))
([ae24595](ae24595))
* **deps:** update module connectrpc.com/otelconnect to v0.7.1
([#1367](#1367))
([184915b](184915b))
* **deps:** update module
github.com/open-feature/open-feature-operator/apis to v0.2.44
([#1368](#1368))
([0c68726](0c68726))
* **deps:** update module golang.org/x/crypto to v0.26.0
([#1379](#1379))
([05f6658](05f6658))
* **deps:** update module golang.org/x/mod to v0.20.0
([#1377](#1377))
([797d7a4](797d7a4))
* **deps:** update module golang.org/x/sync to v0.8.0
([#1378](#1378))
([4804c17](4804c17))


### ✨ New Features

* add 'watcher' interface to file sync
([#1365](#1365))
([61fff43](61fff43))
* added new grpc sync config option to allow setting max receive message
size. ([#1358](#1358))
([bed077b](bed077b))
* Support blob type sources and GCS as an example of such source.
([#1366](#1366))
([21f2c9a](21f2c9a))


### 🧹 Chore

* **deps:** update dependency go to v1.22.6
([#1297](#1297))
([50b92c1](50b92c1))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Signed-off-by: OpenFeature Bot <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] filepath sync to poll version, in addition to using fsnotify
5 participants