From e446b8a091604840288834e23295c56739019868 Mon Sep 17 00:00:00 2001
From: Christopher Bowman
Date: Wed, 25 Sep 2024 11:13:07 -0400
Subject: [PATCH 1/2] Support a rewrite syntax with form \g{n}

Signed-off-by: Christopher Bowman
---
 Makefile.am          |  1 +
 carbon-c-relay.md    | 21 +++++++++++++++----
 issues/issue462.conf | 28 ++++++++++++++++++++++++++
 router.c             | 12 ++++++++++-
 test/issue462.out    | 48 ++++++++++++++++++++++++++++++++++++++++++++
 test/issue462.tst    |  1 +
 6 files changed, 106 insertions(+), 5 deletions(-)
 create mode 100644 issues/issue462.conf
 create mode 100644 test/issue462.out
 create mode 100644 test/issue462.tst

diff --git a/Makefile.am b/Makefile.am
index e5dde131..d95f821c 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -126,6 +126,7 @@ CRTESTS = \
 	issue357 \
 	issue369 \
 	issue448 \
+	issue462 \
 	server-type \
 	basic \
 	metriclimits \
diff --git a/carbon-c-relay.md b/carbon-c-relay.md
index 988031d7..ac0d0f18 100644
--- a/carbon-c-relay.md
+++ b/carbon-c-relay.md
@@ -486,10 +486,11 @@ needed aggregation instances.
 ### REWRITES
 Rewrite rules take a regular expression as input to match incoming
 metrics, and transform them into the desired new metric name. In the
-replacement,
-backreferences are allowed to match capture groups defined in the input
-regular expression. A match of `server\.(x|y|z)\.` allows to use e.g.
-`role.\1.` in the substitution. A few caveats apply to the current
+replacement, backreferences are allowed to match capture groups
+defined in the input regular expression. A match of `server\.(x|y|z)\.`
+ allows to use e.g. `role.\1.` in the substitution. If needed, a notation
+of `\g{n}` can be used instead of `\n` where the backreference is followed
+by an integer, such as `\g{1}100`. A few caveats apply to the current
 implementation of rewrite rules. First, their location in the config
 file determines when the rewrite is performed. The rewrite is done
 in-place, as such a match rule before the rewrite would match the
@@ -818,6 +819,18 @@ matters. Hence to build on top of the old/new cluster example done
 earlier, the following would store the original metric name in the old
 cluster, and the new metric name in the new cluster:
 
+```
+rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
+    into server.\_1.\2.\3.\3\4
+    ;
+rewrite ^server\.(.+)\.(.+)\.([a-zA-Z]+)([0-9]+)
+    into server.\g{_1}.\g{2}.\g{3}.\g{3}\g{4}
+    ;
+```
+
+The alternate syntax for backreference notation using `\g{n}` instead of `\n`
+notation is shown above. Both rewrite rules are equivalent.
+
 ```
 match * send to old;
 
diff --git a/issues/issue462.conf b/issues/issue462.conf
new file mode 100644
index 00000000..6fd462ef
--- /dev/null
+++ b/issues/issue462.conf
@@ -0,0 +1,28 @@
+cluster default
+    fnv1a_ch replication 1
+        127.0.0.1:2103 proto tcp
+    ;
+
+aggregate ^sys\.somemetric\.([A-Za-z0-9_]+)\.percentile
+    every 60 seconds
+    expire after 75 seconds
+    compute sum write to
+        sys.somemetric.\1sum
+    compute percentile50 write to
+        sys.somemetric.\g{1}50
+    compute percentile75 write to
+        sys.somemetric.\g{1}75
+    compute percentile90 write to
+        sys.somemetric.\g{^1}90
+    compute percentile99 write to
+        sys.somemetric.\g{_1}99
+    send to default
+    stop
+    ;
+
+
+match *
+    send to
+        default
+    stop
+    ;
diff --git a/router.c b/router.c
index cab2c6ef..d432dd3e 100644
--- a/router.c
+++ b/router.c
@@ -2608,6 +2608,7 @@ router_rewrite_metric(
 	const char *t;
 	enum rewrite_case { RETAIN, LOWER, UPPER,
 		RETAIN_DOT, LOWER_DOT, UPPER_DOT } rcase = RETAIN;
+	enum capture_case { NUMMATCH, GMATCH, GMATCH_BRACES } ccase = NUMMATCH;
 
 	assert(pmatch != NULL);
 
@@ -2647,6 +2648,10 @@ router_rewrite_metric(
 					escape = 2;
 					ref *= 10;
 					ref += *p - '0';
+				} else if (escape && ref == 0 && *p == 'g') {
+					ccase = GMATCH;
+				} else if (escape && ref == 0 && ccase == GMATCH && *p == '{') {
+					ccase = GMATCH_BRACES;
 				} else {
 					if (escape) {
 						if (ref > 0 && ref <= nmatch
@@ -2702,9 +2707,14 @@ router_rewrite_metric(
 						}
 						ref = 0;
 					}
-					if (*p != '\\') { /* \1\2 case */
+					if (ccase == GMATCH_BRACES && *p == '}') { /* End case of \g{n} */
 						escape = 0;
 						rcase = RETAIN;
+						ccase = NUMMATCH;
+					} else if (*p != '\\') { /* \1\2 case */
+						escape = 0;
+						rcase = RETAIN;
+						ccase = NUMMATCH;
 						if (s - *newmetric + 1 < sizeof(*newmetric))
 							*s++ = *p;
 					}
diff --git a/test/issue462.out b/test/issue462.out
new file mode 100644
index 00000000..34528fe0
--- /dev/null
+++ b/test/issue462.out
@@ -0,0 +1,48 @@
+listen
+    type linemode
+        2003 proto tcp
+        2003 proto udp
+        /tmp/.s.carbon-c-relay.2003 proto unix
+    ;
+
+statistics
+    submit every 60 seconds
+    prefix with carbon.relays.test_hostname
+    ;
+
+cluster default
+    fnv1a_ch replication 1
+        127.0.0.1:2103
+    ;
+
+aggregate ^sys\.somemetric\.([A-Za-z0-9_]+)\.percentile
+    every 60 seconds
+    expire after 75 seconds
+    timestamp at end of bucket
+    compute sum write to
+        sys.somemetric.\1sum
+    compute percentile50 write to
+        sys.somemetric.\g{1}50
+    compute percentile75 write to
+        sys.somemetric.\g{1}75
+    compute percentile90 write to
+        sys.somemetric.\g{^1}90
+    compute percentile99 write to
+        sys.somemetric.\g{_1}99
+    send to default
+    stop
+    ;
+match *
+    send to default
+    stop
+    ;
+
+aggregation
+    ^sys\.somemetric\.([A-Za-z0-9_]+)\.percentile (regex) -> sys.somemetric.random.percentile
+    sum(sys.somemetric.\1sum) -> sys.somemetric.randomsum
+    percentile50(sys.somemetric.\g{1}50) -> sys.somemetric.random50
+    percentile75(sys.somemetric.\g{1}75) -> sys.somemetric.random75
+    percentile90(sys.somemetric.\g{^1}90) -> sys.somemetric.RANDOM90
+    percentile99(sys.somemetric.\g{_1}99) -> sys.somemetric.random99
+    fnv1a_ch(default)
+    stop
diff --git a/test/issue462.tst b/test/issue462.tst
new file mode 100644
index 00000000..d21f9a50
--- /dev/null
+++ b/test/issue462.tst
@@ -0,0 +1 @@
+sys.somemetric.random.percentile

From 791a56d311e7c0cb1a72a4578428d61bf0bdc30d Mon Sep 17 00:00:00 2001
From: Christopher Bowman
Date: Sat, 28 Sep 2024 07:36:22 -0400
Subject: [PATCH 2/2] Wrap lines at 80 chars

---
 carbon-c-relay.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/carbon-c-relay.md b/carbon-c-relay.md
index ac0d0f18..c111bb9e 100644
--- a/carbon-c-relay.md
+++ b/carbon-c-relay.md
@@ -488,7 +488,7 @@ Rewrite rules take a regular expression as input to match incoming
 metrics, and transform them into the desired new metric name. In the
 replacement, backreferences are allowed to match capture groups
 defined in the input regular expression. A match of `server\.(x|y|z)\.`
- allows to use e.g. `role.\1.` in the substitution. If needed, a notation
+allows to use e.g. `role.\1.` in the substitution. If needed, a notation
 of `\g{n}` can be used instead of `\n` where the backreference is followed
 by an integer, such as `\g{1}100`. A few caveats apply to the current
 implementation of rewrite rules. First, their location in the config
@@ -556,7 +556,7 @@ Given the input expression, the following match groups are available:
 something like `carbon.relays.\_2` for certain scenarios, to always use
 the lowercased short hostname, which following the expression doesn't
 contain a dot. By default, the metrics are submitted every 60 seconds,
-this can be changed using the `submit every seconds` clause.
+this can be changed using the `submit every seconds` clause
 To obtain a more compatible set of values to carbon-cache.py, use the
 `reset counters after interval` clause to make values non-cumulative,
 that is, they will report the change compared to the previous value.
@@ -949,7 +949,8 @@ without the `send to`, the metric name can't be kept its original name,
 for the output now directly goes to the cluster.
 
 
-When configuring cluster you might want to check how the metrics will be routed and hashed. That's what the `-t` flag is for. For the following configuration:
+When configuring cluster you might want to check how the metrics will be routed
+and hashed. That's what the `-t` flag is for. For the following configuration:
 
 ```
 cluster graphite_swarm_odd
@@ -975,7 +976,8 @@ match *
     ;
 ```
 
-Running the command: `echo "my.super.metric" | carbon-c-relay -f config.conf -t`, will result in:
+Running the command:
+`echo "my.super.metric" | carbon-c-relay -f config.conf -t`, will result in:
 
 ```
 [...]
@@ -988,8 +990,9 @@ match
     stop
 ```
 
-You now know that your metric `my.super.metric` will be hashed and arrive on the host03 and host04 machines.
-Adding the `-d` flag will increase the amount of information by showing you the hashring
+You now know that your metric `my.super.metric` will be hashed and arrive on the
+host03 and host04 machines. Adding the `-d` flag will increase the amount of
+information by showing you the hashring
 
 ## STATISTICS
 
@@ -1159,8 +1162,8 @@ provides a multithreaded relay which can address multiple targets and
 clusters for each and every metric based on pattern matches.
 
 There are a couple more replacement projects out there, which
-are [carbon-relay-ng](https://github.com/graphite-ng/carbon-relay-ng) and [graphite-relay](https://github.com/markchadwick/graphite-relay
-).
+are [carbon-relay-ng](https://github.com/graphite-ng/carbon-relay-ng) and
+[graphite-relay](https://github.com/markchadwick/graphite-relay).
 
 Compared to carbon-relay-ng, this project does provide carbon's
 consistent-hash routing. graphite-relay, which does this, however