Change default setting for only_publish_changed #127

Closed
legrego opened this issue Jan 4, 2021 · 10 comments · Fixed by #159
Labels: discuss, enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@legrego
Owner

legrego commented Jan 4, 2021

Currently, this integration publishes the current state of all entities every n seconds, regardless of whether an entity underwent a state change.

The intent was to make visualizing this information easier in Kibana, as missing values would not render nicely in line or area charts.

Users who only want to publish state changes need to configure only_publish_changed: true.

Kibana recently added support for fitting functions for area and line charts (elastic/kibana#78154), so the need for us to publish all states should be reduced.

This issue is a proposal to change the default setting for only_publish_changed from false to true.
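
For reference, here is a minimal sketch of the opt-in configuration that users need today. It assumes the integration is configured under a top-level `elasticsearch:` key (inferred from the component name); connection settings are omitted.

```yaml
# configuration.yaml
# Sketch only: the top-level `elasticsearch:` key is an assumption,
# and the existing connection settings are left out for brevity.
elasticsearch:
  # Publish an entity's state only when it has changed.
  # The current default is false; this issue proposes making true the default.
  only_publish_changed: true
```

If the proposed default is adopted, users who still want every state published on each interval would presumably set `only_publish_changed: false` explicitly.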

@legrego legrego added enhancement New feature or request good first issue Good for newcomers discuss labels Jan 4, 2021
@EmilyNerdGirl

Hey legrego! Is this feature enabled yet? I tried enabling it with v0.4.0-beta2 and all data stopped coming in, unless I have a glaring typo. Very much looking forward to the next release.

Also, related to this, I think all the traffic with this set to only_publish_changed: false may be causing some sort of socket issue on the HA side. I noticed that if I had the Elastic plugin enabled, and it's in "fire hose" mode with only_publish_changed: false, my Chromecasts would stop getting notifications from HA after 12-24 hours. Looking in Elastic, it's getting 30-60k events per hour with this setting disabled.

I disabled the plugin for a while, and had no more issues with the Chromecasts. The error I was getting from HA was that it couldn't reach the Chromecasts, which was bizarre, and restarting HA, NOT the Chromecasts, would fix it. I have a hypothesis that all the outgoing sockets may be getting used up on the HA side, so it can't reach out until it reboots. It's only a hypothesis: disabling the plugin seems to resolve it, and it reminds me of something I've seen in the past at work. There's no hard evidence yet that it is this plugin, or maybe a combination of this and other plugins.

@legrego
Owner Author

legrego commented Feb 5, 2021

Hey @mloebl - this hasn't been enabled yet by default, but it should be functional. I won't have time to test my own setup for a couple of days, but I'll see if I can reproduce this.

> Also, related to this, I think all the traffic with this set to only_publish_changed: false may be causing some sort of socket issue on the HA side. I noticed that if I had the Elastic plugin enabled, and it's in "fire hose" mode with only_publish_changed: false, my Chromecasts would stop getting notifications from HA after 12-24 hours. Looking in Elastic, it's getting 30-60k events per hour with this setting disabled.
>
> I disabled the plugin for a while, and had no more issues with the Chromecasts. The error I was getting from HA was that it couldn't reach the Chromecasts, which was bizarre, and restarting HA, NOT the Chromecasts, would fix it. I have a hypothesis that all the outgoing sockets may be getting used up on the HA side, so it can't reach out until it reboots. It's only a hypothesis: disabling the plugin seems to resolve it, and it reminds me of something I've seen in the past at work. There's no hard evidence yet that it is this plugin, or maybe a combination of this and other plugins.

Interesting, thanks for letting me know. Can you adjust the publish_frequency to a higher/lower number to see if that impacts your Chromecasts, or if it changes the amount of time until the Chromecasts stop receiving notifications?
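
As a rough sketch of that experiment, the interval could be raised as shown below. This assumes `publish_frequency` is expressed in seconds and sits under a top-level `elasticsearch:` key; the value 300 is only an example.

```yaml
# configuration.yaml
# Sketch only: the top-level `elasticsearch:` key and the unit (seconds)
# are assumptions; connection settings are omitted.
elasticsearch:
  # Publish every 5 minutes instead of the current interval
  publish_frequency: 300
```

Comparing how long the Chromecasts stay reachable at a few different values should show whether the publish rate is a factor at all.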

It makes me wonder if we aren't closing (or reusing) our connections properly, but I suspect we'd see failure notices from this plugin if that was the case.

If you can, I'd also like to see what this component reports with debug logging enabled for your setup:

# configuration.yaml
logger:
  default: info
  logs:
    custom_components.elasticsearch: debug

@EmilyNerdGirl

@legrego No worries, and thanks for the reply. I've bumped up the publish_frequency to 5 minutes and enabled debug logging. Basically, a few months ago I noticed that the Chromecasts, which had been working great, would fail after 12-24 hours, and I didn't really get any helpful support posts in the forum. Then they magically started working reliably again while I was debugging a memory leak in one of the HA integrations. During this debugging I disabled a bunch of addons and integrations. I thought the issue was resolved, and then the problem started up again. Then I realized I had just re-enabled this integration, hence just a hypothesis at this point. It's entirely possible it's unrelated to this integration, or that it's the combination of this plugin reaching out to Elastic, some restapi sensors running (where I don't believe you can set the query time), and other plugins.

The Chromecasts work solidly outside of HA, and that plugin, I believe, basically just makes an http/https redirect call to them, so not a lot of rocket science there either. HA (or Supervisor/docker?) may have an issue on its side cleaning up stale sockets.

@legrego
Owner Author

legrego commented Feb 12, 2021

@mloebl

> I tried enabling it with v0.4.0-beta2 and all data stopped coming in, unless I have a glaring typo.

I noticed a similar problem on my local install. It appears that we attempted to publish a document before the component was fully initialized. I haven't found the root cause there yet, but I added additional logging in #137, which works around the problem for the time being.

In my particular case, I saw an exception in the logs, and then my es_publish_queue sensor kept growing: it wasn't attempting to publish anything at that point, which sounds consistent with what you experienced.

I published 0.4.0-beta3 with this latest change. Can you see if that resolves the issue for you?

@EmilyNerdGirl

I will give it a shot, thank you!! FWIW I changed publish_frequency to 300s, and no issues yet with the Chromecasts, so it's either a coincidence (very possible, as I have a few networking-related plugins) or a viable solution.

@EmilyNerdGirl

Should I still be seeing "Configuration is handled via yaml, and cannot be controlled via UI" for v0.4.0-beta3? Thanks!

@legrego
Owner Author

legrego commented Feb 12, 2021

That's expected if you started one of these betas with existing Elasticsearch config in your yaml file.

If you want to use the UI, then you'll need to remove the config entry from your yaml and restart. Next, remove the Elasticsearch integration from the integrations screen (don't remove the code or files).

Once that's done, you can add the integration back through the UI. At this point, you'll be able to use the UI to tweak your config.

@EmilyNerdGirl

That worked, thank you! So much easier now to add domains and entities to exclude :D

@EmilyNerdGirl

@legrego TL;DR: just a follow-up, the Chromecast issue was NOT this plugin.

Full explanation: the networking issues I've been having are not related to this integration. To cut a long, complex story short, there was a routing problem on my network where my IP cameras were isolated. HA was constantly trying to hit those, and 50% of the time having a route issue (but no real errors; I found it by accident while debugging frigate nvr). This, believe it or not, led to a memory leak bug in HA that I had been chasing for weeks. The memory leak would cause HA to restart its container every 24-48 hours without me knowing it, making it look like the issue went away when I disabled this plugin.

@legrego
Owner Author

legrego commented Feb 21, 2021

@mloebl I appreciate you following up -- thanks! I'm glad we were able to rule out the Elasticsearch component. I just saw your writeup on reddit about the memory leak, and I never would have guessed a networking problem either!
