Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added anycast traffic routing ability, and lastpoll time for v6 #7302

Merged
merged 16 commits into from
Apr 4, 2023
Merged

added anycast traffic routing ability, and lastpoll time for v6 #7302

merged 16 commits into from
Apr 4, 2023

Conversation

serDrem
Copy link
Contributor

@serDrem serDrem commented Jan 19, 2023

These changes enable TM to poll caches that have the same ip address and are in ecmp on the same router, and direct traffic to them appropriately.

Which Traffic Control components are affected by this PR?

  • Traffic Monitor

What is the best way to verify this PR?

create two caches with the same service v4 and v6 address that are connected to the same router with ecmp enabled on the route to them aka you should be able to get to both caches with repeated curls
Disable keep alives so that TM pick a new port each time they connect to the caches

  • Generate traffic to a single cache, make sure publish/CacheStatsNew properly show the traffic
  • Lower the kbps parameter so that one of the caches is marked unavailable. Make sure the other cache is marked unavailable.
  • Admin down one of the caches in the anycast group. Admin Down cache will remain available, since TM is still polling it, by polling the other caches in the anycast group. Make sure the other cache is available in TM.
  • Disable the advertisements one of the caches. It should remain available on the TM(since its partner caches are replying), but the polling times will stop being updated.

PR submission checklist

updated docs

refactor datareq/crstate.go

updated changelog

update results with hostname from via header, if exists
@codecov
Copy link

codecov bot commented Jan 19, 2023

Codecov Report

Merging #7302 (789303b) into master (9da78ae) will increase coverage by 0.13%.
The diff coverage is 67.39%.

❗ Current head 789303b differs from pull request most recent head 64858fb. Consider uploading reports for the commit 64858fb to get more accurate results

@@             Coverage Diff              @@
##             master    #7302      +/-   ##
============================================
+ Coverage     25.02%   25.15%   +0.13%     
  Complexity       98       98              
============================================
  Files           593      593              
  Lines         73528    73768     +240     
  Branches         90       90              
============================================
+ Hits          18398    18554     +156     
- Misses        53383    53459      +76     
- Partials       1747     1755       +8     
Flag Coverage Δ
golib_unit 53.16% <ø> (+0.14%) ⬆️
grove_unit 4.60% <ø> (ø)
t3c_unit 5.33% <ø> (+<0.01%) ⬆️
traffic_monitor_unit 21.32% <67.39%> (+0.88%) ⬆️
traffic_ops_unit 19.70% <ø> (+0.01%) ⬆️
traffic_stats_unit 10.41% <ø> (ø)
unit_tests 22.69% <67.39%> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
lib/go-rfc/http.go 34.09% <ø> (ø)
lib/go-tc/crstates.go 0.00% <ø> (ø)
traffic_monitor/cache/cache.go 45.51% <0.00%> (-0.90%) ⬇️
traffic_monitor/cache/data.go 46.03% <ø> (ø)
traffic_monitor/datareq/datareq.go 0.00% <0.00%> (ø)
traffic_monitor/manager/manager.go 2.76% <0.00%> (-0.04%) ⬇️
traffic_monitor/cache/astats.go 53.59% <23.07%> (-2.37%) ⬇️
traffic_monitor/cache/stats_over_http.go 34.50% <44.44%> (+0.26%) ⬆️
traffic_monitor/health/cache.go 59.23% <52.17%> (-0.03%) ⬇️
traffic_monitor/datareq/crstate.go 48.75% <79.59%> (+48.75%) ⬆️
... and 3 more

... and 7 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

traffic_monitor/cache/astats.go Fixed Show resolved Hide resolved
traffic_monitor/cache/astats.go Fixed Show fixed Hide fixed
traffic_monitor/cache/stats_over_http.go Fixed Show resolved Hide resolved
@zrhoffman zrhoffman added new feature A new feature, capability or behavior Traffic Monitor related to Traffic Monitor medium impact impacts a significant portion of a CDN, or has the potential to do so labels Jan 20, 2023
Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks on the right track. Thanks!

traffic_monitor/cache/stats_over_http.go Fixed Show resolved Hide resolved
traffic_monitor/cache/stats_over_http.json_new Outdated Show resolved Hide resolved
traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Show resolved Hide resolved
lib/go-tc/crstates.go Outdated Show resolved Hide resolved
docs/source/overview/traffic_monitor.rst Outdated Show resolved Hide resolved
docs/source/overview/traffic_monitor.rst Outdated Show resolved Hide resolved
docs/source/overview/traffic_monitor.rst Outdated Show resolved Hide resolved
@serDrem serDrem requested review from ocket8888 and zrhoffman and removed request for ocket8888 and zrhoffman January 25, 2023 18:48
@serDrem serDrem requested review from zrhoffman and ocket8888 and removed request for ocket8888 and zrhoffman January 25, 2023 18:59
Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks closer, just a few things left. Thanks!

traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Show resolved Hide resolved
traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/todata/todata.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
traffic_monitor/cache/data.go Outdated Show resolved Hide resolved
@zrhoffman zrhoffman self-requested a review February 3, 2023 21:54
@serDrem serDrem requested review from ocket8888 and removed request for zrhoffman February 4, 2023 02:28
Copy link
Contributor

@ocket8888 ocket8888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look good

Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's getting close! What tests can be written for this feature? See the Codecov comments for which lines of code it detects are not covered by any tests.

Unit tests are preferable, if that is an option. That would probably also be simpler than getting 2 servers to have the same IP address in the TM integration tests.

traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/cache/stats_over_http.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
traffic_monitor/todata/todata.go Outdated Show resolved Hide resolved
traffic_monitor/todata/todata_test.go Show resolved Hide resolved
traffic_monitor/todata/todata_test.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate_test.go Outdated Show resolved Hide resolved
Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding tests! Just a few things left to be addressed, I think.

traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/cache/astats.go Outdated Show resolved Hide resolved
traffic_monitor/cache/stats_over_http.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
@mitchell852
Copy link
Member

Sorry I'm so late to the game. This feels like quite a significant change to TM and I worry we don't have the proper test suite in place to verify such a change. I would propose either:

  1. hide this functionality behind a feature flag and leave current functionality alone when feature is turned off. that way we can turn the feature on for some TM production canaries to monitor/test or

  2. put this change into it's own branch (not master) where we can build an RPM from and deploy to a canary to monitor/test and if all looks good we can PR it to master

@serDrem serDrem requested a review from zrhoffman March 1, 2023 15:38
Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Some questions about the Threshold type and the new parameter.

traffic_monitor/todata/todata.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
traffic_monitor/cache/stats_over_http.go Outdated Show resolved Hide resolved
traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
traffic_monitor/todata/todata.go Outdated Show resolved Hide resolved
traffic_monitor/health/cache.go Outdated Show resolved Hide resolved
@serDrem serDrem requested a review from zrhoffman March 13, 2023 23:51
Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking close, mainly we just need to make sure that tm.sameipservers.control being disabled/enabled is observed everywhere where it's relevant.

traffic_monitor/datareq/crstate.go Outdated Show resolved Hide resolved
traffic_monitor/todata/todata.go Outdated Show resolved Hide resolved
@serDrem serDrem requested a review from zrhoffman March 26, 2023 22:22
Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, I have some more feedback based on that.

@@ -79,6 +79,7 @@ traffic_monitor.cfg
- ``health.polling.interval``
- ``peers.polling.interval``
- ``heartbeat.polling.interval``
- ``tm.sameipservers.control`` - When set to true, performs an AND operation on the availability statuses of servers with same ip. Any unavailable server(s) with same ip as other server(s) will cause the other server(s) to be set to unavailable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tm.sameipservers.control is a little ambiguous how about tm.sameipservers.enabled? We use .enabled for parameters elsewhere in the project.

Comment on lines +86 to +96
if val, ok := monitorConfig.Get().Config["tm.sameipservers.control"]; ok && val.(string) == "true" {
localStatesC := updateStatusSameIpServers(localStates, toData)
if !directlyPolledOnly {
return tc.CRStatesMarshall(localStatesC)
}
return tc.CRStatesMarshall(filterDirectlyPolledCaches(localStatesC))
}
if !directlyPolledOnly {
return tc.CRStatesMarshall(localStates.Get())
}
unfiltered := localStates.Get()
return tc.CRStatesMarshall(filterDirectlyPolledCaches(unfiltered))
return tc.CRStatesMarshall(filterDirectlyPolledCaches(localStates.Get()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like this to reduce code duplication?

	var unfiltered tc.CRStates
	if val, ok := monitorConfig.Get().Config["tm.sameipservers.control"]; ok && val.(string) == "true" {
		unfiltered = updateStatusSameIpServers(localStates, toData)
	} else {
		unfiltered = localStates.Get()
	}
	if !directlyPolledOnly {
		return tc.CRStatesMarshall(unfiltered)
	}
	return tc.CRStatesMarshall(unfiltered)

Copy link
Member

@zrhoffman zrhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving and merging now, provided you make the requested changes in a follow-up PR.

@zrhoffman zrhoffman merged commit 0e2b42f into apache:master Apr 4, 2023
zrhoffman added a commit to zrhoffman/trafficcontrol that referenced this pull request Aug 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
medium impact impacts a significant portion of a CDN, or has the potential to do so new feature A new feature, capability or behavior Traffic Monitor related to Traffic Monitor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants