Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(instrumentation-pg): do not split query to determine operation name #2029

Merged
merged 3 commits into from
May 22, 2024

Conversation

Samuron
Copy link
Contributor

@Samuron Samuron commented Mar 21, 2024

Which problem is this PR solving?

OTEL allocates too much

Short description of the changes

To determine span name from query text currently query is split by whitespace. With typical SQL statement being many tens of words long this is wasteful. Replaced the logic with finding the index of first whitespace. This can be optimize further to avoid double slicing for COMMIT case, but this turns out to have a lot of indexing and conditions gymnastics

@@ -81,7 +81,12 @@ export function getQuerySpanName(
}

function parseNormalizedOperationName(queryText: string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your commit message says you're changing how to get the "span name", but your changes are on the "operation name" function (even if that is later used to get the span), so probably worth updating commit message/PR title to reflect that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@maryliag
Copy link
Contributor

curious, do we care about the case WITH ...? Because the name of the operation would be just WITH, which doesn't seem very useful

@Samuron Samuron changed the title perf(instrumentation-pg): do not split query to determine span name perf(instrumentation-pg): do not split query to determine operation name Mar 27, 2024
Copy link

codecov bot commented Mar 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.38%. Comparing base (dfb2dff) to head (81dc59d).
Report is 135 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2029      +/-   ##
==========================================
- Coverage   90.97%   90.38%   -0.60%     
==========================================
  Files         146      147       +1     
  Lines        7492     7506      +14     
  Branches     1502     1573      +71     
==========================================
- Hits         6816     6784      -32     
- Misses        676      722      +46     
Files Coverage Δ
...node/opentelemetry-instrumentation-pg/src/utils.ts 97.22% <100.00%> (+0.25%) ⬆️

... and 42 files with indirect coverage changes

@Samuron
Copy link
Contributor Author

Samuron commented Mar 27, 2024

curious, do we care about the case WITH ...? Because the name of the operation would be just WITH, which doesn't seem very useful

I personally do not find all this logic and span names very helpful in my work, I will create issue, seems better to discuss it via that

@blumamir
Copy link
Member

This issue is discussed in semantic conventions for database spans here:

The span name SHOULD be set to a low cardinality value representing the statement executed on the database. It MAY be a stored procedure name (without arguments), DB statement without variable arguments, operation name, etc. Since SQL statements may have very high cardinality even without arguments, SQL spans SHOULD be named the following way, unless the statement is known to be of low cardinality: <db.operation.name> <db.namespace>.<db.collection.name>, provided that db.operation.name and db.collection.name are available. If db.collection.name is not available due to its semantics, the span SHOULD be named <db.operation.name> <db.namespace>.

It is not recommended to attempt any client-side parsing of db.query.text just to get these properties, they should only be used if the library being instrumented already provides them. When it's otherwise impossible to get any meaningful span name, db.namespace or the tech-specific database name MAY be used.

So I support removing any parsing of the query to be compliant with spec and to not introduce any extensive processing.

Perhaps the span name should be in this case the dbName?

@blumamir blumamir merged commit 816611e into open-telemetry:main May 22, 2024
19 checks passed
@dyladan dyladan mentioned this pull request May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants