Document clarity re: sampling_priority configuration #30410

Closed
pierzapin opened this issue Jan 11, 2024 · 10 comments
Labels: closed as inactive, documentation, processor/probabilisticsampler, question, Stale


pierzapin commented Jan 11, 2024

Component(s)

processor/probabilisticsampler

Describe the issue you're reporting

It's not very clear how the sampling_priority configuration in the probabilistic_sampler is intended to work.
The readme bullet point for this processor just above the Hashing heading states:

sampling_priority (default = null, optional): The optional name of a log record attribute used to set a different sampling priority from the sampling_percentage setting. 0 means to never sample the log record, and >= 100 means to always sample the log record.

which is somewhat ambiguous: is this meant to be a string (i.e. an attribute name, as stated) or an int between 0 and 100?

My initial expectation, based on the readme and the example config here, was that this setting works in tandem with from_attribute to provide some sort of override mechanism for the blanket sampling rate, i.e.:

processors:
  probabilistic_sampler/logs:
    sampling_percentage: 15
    attribute_source: record 
    from_attribute: foo
    sampling_priority: 100

This would suggest that if the data included the attribute "foo", it would be sampled at 100%.

However, testdata/config.yaml#L43 implies that another attribute name is used to drive the sampling_priority. I have no idea how this would work if that attribute's value is itself a string.

I'd be happy to contribute wording updates based on your advice about which way this is intended to function.

@pierzapin pierzapin added the needs triage label Jan 11, 2024
@github-actions github-actions bot added the processor/probabilisticsampler label Jan 11, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1 crobert-1 added the documentation and question labels Jan 11, 2024
@crobert-1
Member

The sampling_priority configuration option holds the string name of a log attribute. The processor gets that log attribute's value and uses it as the sampling rate, overriding the value of sampling_percentage. Here's the code for reference.

from_attribute determines which log attribute's value is used as the hashing value; it's unrelated to the sampling rate. The processor gets the attribute value and computes a hash on it, then uses the sampling rate (set either by sampling_percentage or by the value of the log attribute named in sampling_priority) to determine whether the log itself is sampled.

I agree this could be made clearer in the README; a PR would be welcome!
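
To make the distinction concrete, here is a minimal config sketch based on the explanation above and the README wording quoted in the issue. The attribute names `logID` and `priority` are illustrative assumptions, not required names; any log record attribute names could be used.

```yaml
processors:
  probabilistic_sampler/logs:
    # Default sampling rate, expected to apply when a record carries
    # no priority attribute.
    sampling_percentage: 15
    # Look up the hashing value in the log record's attributes.
    attribute_source: record
    # Name of the attribute whose value is hashed (illustrative name).
    from_attribute: logID
    # Name of the attribute whose numeric value overrides
    # sampling_percentage for that record (illustrative name).
    # This is an attribute name, not a percentage.
    sampling_priority: priority
```

Under this sketch, a record carrying `priority: 100` (or higher) would always be sampled, a record carrying `priority: 0` would never be sampled, and a record without the `priority` attribute would be expected to fall back to the 15% rate.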

@crobert-1 crobert-1 removed the needs triage label Jan 23, 2024
dmitryax pushed a commit that referenced this issue Jan 23, 2024
**Description:** 
Originally I was going to just fix the typo `designed`->`designated`,
but the whole comment was hard to read so I re-worded it.

Found while investigating #30410

Co-authored-by: Alex Boten <[email protected]>
@jpkrohling
Member

Should this be closed?

@crobert-1
Member

> Should this be closed?

I don't think so, my PR was mostly unrelated to this. We still need to update the README to close this, from my understanding.

cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this issue Feb 1, 2024
…telemetry#30739)

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Mar 26, 2024
@crobert-1 crobert-1 removed the Stale label Mar 26, 2024
@jmacd
Contributor

jmacd commented May 13, 2024

#31946 (comment)

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jul 15, 2024
@jpkrohling
Member

@kentquirk, would you be able to turn your comment into documentation in the component's README?

@github-actions github-actions bot removed the Stale label Jul 16, 2024
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Sep 16, 2024
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Nov 15, 2024
4 participants