From e5c16980cdedbcb3bf052290f7c07709d805481f Mon Sep 17 00:00:00 2001
From: susan
Date: Fri, 26 Apr 2024 11:25:55 -0400
Subject: [PATCH 1/8] Add stage 0 draft

---
 rfcs/text/0000-llm-security-fields.md | 139 ++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)
 create mode 100644 rfcs/text/0000-llm-security-fields.md

diff --git a/rfcs/text/0000-llm-security-fields.md b/rfcs/text/0000-llm-security-fields.md
new file mode 100644
index 0000000000..20f5a46649
--- /dev/null
+++ b/rfcs/text/0000-llm-security-fields.md
@@ -0,0 +1,139 @@
+# 0000: Name of RFC

- Stage: **0 (strawperson)**
- Date: **TBD**

This RFC proposes LLM fields to support the increase in Generative AI and LLM logging. This will benefit our customers and users, allowing them to monitor and protect their LLM/Generative AI deployments.

## Fields

The `llm` fields proposed are: [WIP]

Field | Type | Description / Usage
-- | -- | --
llm.request.content | text | The full text of the user's request to the LLM.
llm.request.token_count | integer | Number of tokens in the user's request.
llm.response.content | text | The full text of the LLM's response.
llm.response.token_count | integer | Number of tokens in the LLM's response.
llm.user.id | keyword | Unique identifier for the user.
llm.user.rn | keyword | Unique resource name for the user.
llm.request.id | keyword | Unique identifier for the LLM request.
llm.response.id | keyword | Unique identifier for the LLM response.
llm.response.error_code | keyword | Error code returned in the LLM response.
llm.response.stop_reason | keyword | Reason the LLM response stopped.
llm.request.timestamp | date | Timestamp when the request was made.
llm.response.timestamp | date | Timestamp when the response was received.
llm.model.name | keyword | Name of the LLM model used to generate the response.
llm.model.version | keyword | Version of the LLM model used to generate the response.
llm.model.id | keyword | Unique identifier for the LLM model.
llm.model.role | keyword | Role of the LLM model in the interaction.
llm.model.type | keyword | Type of LLM model.
llm.model.description | keyword | Description of the LLM model.
llm.model.instructions | text | Custom instructions for the LLM model.
llm.model.parameters | keyword | Parameters used to configure the LLM model.

## Usage

## Source data

## Scope of impact

## Concerns

## People

The following are the people that consulted on the contents of this RFC.

* @Mikaayenson | author
* @susan-shu-c | author
* @

## References

### RFC Pull Requests

* Stage 0: https://github.com/elastic/ecs/pull/NNN

From a05f71129e399b848d7b1fa9b48f90d94ec38eb1 Mon Sep 17 00:00:00 2001
From: susan
Date: Fri, 26 Apr 2024 11:29:36 -0400
Subject: [PATCH 2/8] Add PR link

---
 rfcs/text/0000-llm-security-fields.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rfcs/text/0000-llm-security-fields.md b/rfcs/text/0000-llm-security-fields.md
index 20f5a46649..b389707b8b 100644
--- a/rfcs/text/0000-llm-security-fields.md
+++ b/rfcs/text/0000-llm-security-fields.md
@@ -131,7 +131,7 @@ e.g.:

-* Stage 0: https://github.com/elastic/ecs/pull/NNN
+* Stage 0: https://github.com/elastic/ecs/pull/2337

 - Stage: **0 (strawperson)**

From 339ee27b1346b186bf8a44e33ef3feb7fea6a264 Mon Sep 17 00:00:00 2001
From: Andrew Pease <7442091+peasead@users.noreply.github.com>
Date: Thu, 31 Oct 2024 14:48:12 -0500
Subject: [PATCH 4/8] changed from llm to gen_ai, sync'd w/OTel

---
 ...elds.md => 0000-gen_ai-security-fields.md} | 38 +++++++++++++++++--
 1 file changed, 35 insertions(+), 3 deletions(-)
 rename rfcs/text/{0000-llm-security-fields.md => 0000-gen_ai-security-fields.md} (82%)

diff --git a/rfcs/text/0000-llm-security-fields.md b/rfcs/text/0000-gen_ai-security-fields.md
similarity index 82%
rename from rfcs/text/0000-llm-security-fields.md
rename to rfcs/text/0000-gen_ai-security-fields.md
index c527768342..8eb8118b58 100644
--- a/rfcs/text/0000-llm-security-fields.md
+++ b/rfcs/text/0000-gen_ai-security-fields.md
@@ -1,4 +1,4 @@
-# 0000: LLM fields
+# 0000: GenAI fields

 - Stage: **0 (strawperson)**

This RFC proposes GenAI fields to support the increase in Generative AI and LLM logging. This will benefit our customers and users, allowing them to monitor and protect their LLM/Generative AI deployments.

+The `gen_ai` fields proposed are: [WIP]
+
+Field | Type | Description / Usage
+-- | -- | --
+gen_ai | group | This defines the attributes used to describe telemetry in the context of Generative Artificial Intelligence (GenAI) Models requests and responses.
+gen_ai.system | keyword | The Generative AI product as identified by the client or server instrumentation.
+gen_ai.request.model | keyword |
+gen_ai.request.max_tokens | integer |
+gen_ai.request.temperature | integer |
+gen_ai.request.top_p | integer |
+gen_ai.request.top_k | integer |
+gen_ai.request.stop_sequences | keyword |
+gen_ai.request.frequency_penalty | integer |
+gen_ai.request.presence_penalty | integer |
+gen_ai.response.id | keyword |
+gen_ai.response.model | keyword |
+gen_ai.response.finish_reasons | keyword |
+gen_ai.usage.input_tokens | integer |
+gen_ai.usage.output_tokens | integer |
+gen_ai.token.type | keyword |
+gen_ai.operation.name | keyword |
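
To make the proposed mapping concrete, below is a minimal sketch of what a single event using these fields could look like, expressed as a Python dictionary and serialized to JSON. Every value, and the choice of a chat-style interaction, is purely illustrative and is not taken from any real integration or instrumentation.

```python
import json

# Illustrative event built from the proposed gen_ai.* fields above.
# All values are made up; field names follow the table.
example_event = {
    "gen_ai": {
        "system": "openai",                # GenAI product reported by the instrumentation
        "operation": {"name": "chat"},
        "request": {
            "model": "gpt-4o-mini",
            "max_tokens": 512,
            "temperature": 0,
            "top_p": 1,
            "stop_sequences": ["###"],
        },
        "response": {
            "id": "chatcmpl-abc123",
            "model": "gpt-4o-mini",
            "finish_reasons": ["stop"],
        },
        "usage": {"input_tokens": 97, "output_tokens": 42},
    }
}

print(json.dumps(example_event, indent=2))
```

Flattened to dotted keys, the same document would carry names such as `gen_ai.request.model` and `gen_ai.usage.input_tokens`, matching the OTel attribute names referenced in the reuse list that follows.
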
+
+Reuse fields:
+* AWS - https://www.elastic.co/docs/current/integrations/aws
+* GCP - https://www.elastic.co/docs/current/integrations/gcp
+* Azure - https://www.elastic.co/docs/current/integrations/azure
+* Threat - https://www.elastic.co/guide/en/ecs/current/ecs-threat.html
+* [OpenTelemetry GenAI attributes registry docs](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/attributes-registry/gen-ai.md)
+* [OpenTelemetry GenAI docs](https://github.com/open-telemetry/semantic-conventions/tree/main/docs/gen-ai)
+* [OpenTelemetry GenAI registry YAML](https://github.com/open-telemetry/semantic-conventions/blob/main/model/gen-ai/registry.yaml)
+

### RFC Pull Requests

From ce9966d693fdbd1915573e0be78bab8fe795ae4a Mon Sep 17 00:00:00 2001
From: Andrew Pease <7442091+peasead@users.noreply.github.com>
Date: Tue, 5 Nov 2024 13:52:24 -0600
Subject: [PATCH 6/8] added concerns

---
 rfcs/text/0000-gen_ai-security-fields.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/rfcs/text/0000-gen_ai-security-fields.md b/rfcs/text/0000-gen_ai-security-fields.md
index 5811a968f5..21780ef3db 100644
--- a/rfcs/text/0000-gen_ai-security-fields.md
+++ b/rfcs/text/0000-gen_ai-security-fields.md
@@ -59,8 +59,6 @@ llm.model.parameters | keyword | Parameters used to configure the LLM model.

 The `gen_ai` fields proposed are: [WIP]

-**Note:** a `*` denotes that this field is not currently part of OTel or ECS, but is being used somewhere - most commonly in an Elastic integration
-
 Field | Type | Description / Usage | Example
 -- | -- | -- | --
 gen_ai | nested | This defines the attributes used to describe telemetry in the context of Generative Artificial Intelligence (GenAI) Models requests and responses.
@@ -216,6 +214,12 @@ The goal here is to research and understand the impact of these changes on users

 ## Concerns

+We have begun using OTel fields that were experimental and have since been deprecated. This will lead to a breaking change.
+
+An example is `gen_ai.prompt`. This field has been deprecated by OTel and is handled by `gen_ai.`, but it is being used in the AWS Bedrock integration:
+- AWS Bedrock integration `gen_ai.prompt` being used [source](https://github.com/elastic/integrations/blob/main/packages/aws_bedrock/data_stream/invocation/fields/fields.yml#L64-L66)
+- [OTel deprecated fields](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/attributes-registry/gen-ai.md#deprecated-genai-attributes)
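
To illustrate the consumer-side impact of that breaking change, here is a minimal compatibility sketch. It assumes the deprecated `gen_ai.prompt` may eventually be superseded by an attribute along the lines of `gen_ai.user.message.content`; that name is only a working assumption here, not a settled mapping.

```python
from typing import Any, Mapping, Optional

# Keys to check for the user prompt, newest first. The successor name is an
# assumption; OTel has deprecated gen_ai.prompt, but the exact replacement
# used by integrations is still an open question in this RFC.
PROMPT_KEYS = ("gen_ai.user.message.content", "gen_ai.prompt")


def extract_prompt(event: Mapping[str, Any]) -> Optional[str]:
    """Return the prompt text from a flattened event, preferring the newer
    key and falling back to the deprecated gen_ai.prompt key that the AWS
    Bedrock integration currently populates."""
    for key in PROMPT_KEYS:
        value = event.get(key)
        if isinstance(value, str) and value:
            return value
    return None


# Made-up events shaped like the old and a possible new convention.
legacy_event = {"gen_ai.prompt": "Summarize this invocation log."}
newer_event = {"gen_ai.user.message.content": "Summarize this invocation log."}

assert extract_prompt(legacy_event) == extract_prompt(newer_event)
```

A shim like this only papers over the rename for consumers; producers (integrations and instrumentation) would still need a coordinated migration, which is what makes the deprecation a breaking change.
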

From f14c3288130fdf248f171596a75507b3e7ddd244 Mon Sep 17 00:00:00 2001
From: Andrew Pease <7442091+peasead@users.noreply.github.com>
Date: Tue, 5 Nov 2024 13:54:37 -0600
Subject: [PATCH 7/8] updated ref link

---
 rfcs/text/0000-gen_ai-security-fields.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rfcs/text/0000-gen_ai-security-fields.md b/rfcs/text/0000-gen_ai-security-fields.md
index 21780ef3db..fed0425cae 100644
--- a/rfcs/text/0000-gen_ai-security-fields.md
+++ b/rfcs/text/0000-gen_ai-security-fields.md
@@ -216,7 +216,7 @@ The goal here is to research and understand the impact of these changes on users

 We have begun using OTel fields that were experimental and have since been deprecated. This will lead to a breaking change.

-An example is `gen_ai.prompt`. This field has been deprecated by OTel and is handled by `gen_ai.`, but it is being used in the AWS Bedrock integration:
+An example is `gen_ai.prompt`. This field has been deprecated by OTel and is handled by [`gen_ai.user.message.content`](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-events.md)(?), but it is being used in the AWS Bedrock integration:
 - AWS Bedrock integration `gen_ai.prompt` being used [source](https://github.com/elastic/integrations/blob/main/packages/aws_bedrock/data_stream/invocation/fields/fields.yml#L64-L66)
 - [OTel deprecated fields](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/attributes-registry/gen-ai.md#deprecated-genai-attributes)

From 161ce662dc3adaf2c6a5c8fcd2c72db477cd41d5 Mon Sep 17 00:00:00 2001
From: Andrew Pease <7442091+peasead@users.noreply.github.com>
Date: Tue, 12 Nov 2024 13:19:50 -0600
Subject: [PATCH 8/8] added context for concerns

---
 rfcs/text/0000-gen_ai-security-fields.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/rfcs/text/0000-gen_ai-security-fields.md b/rfcs/text/0000-gen_ai-security-fields.md
index fed0425cae..e7672b0799 100644
--- a/rfcs/text/0000-gen_ai-security-fields.md
+++ b/rfcs/text/0000-gen_ai-security-fields.md
@@ -214,12 +214,18 @@ The goal here is to research and understand the impact of these changes on users

 ## Concerns

-We have begun using OTel fields that were experimental and have since been deprecated. This will lead to a breaking change.
+**Experimental vs. Stable**
+We have begun using OTel fields that were experimental and have since been deprecated. An example is `gen_ai.prompt`.
 This field has been deprecated by OTel and is handled by [`gen_ai.user.message.content`](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-events.md)(?), but it is being used in the AWS Bedrock integration:
 - AWS Bedrock integration `gen_ai.prompt` being used [source](https://github.com/elastic/integrations/blob/main/packages/aws_bedrock/data_stream/invocation/fields/fields.yml#L64-L66)
 - [OTel deprecated fields](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/attributes-registry/gen-ai.md#deprecated-genai-attributes)

+Almost all of the GenAI fields are "Experimental"; if we need to wait for "Stable", we'll probably want to pause this PR and recommend maturity promotion to the OTel team.
+
+**Fields not in OTel**
+Also, some of these fields do not exist in OTel yet, so do they need to be added to OTel before they can be considered for inclusion into ECS?
+
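
One way to keep the "Fields not in OTel" question tractable is to diff the proposed field names against the attributes in the OTel GenAI registry. The sketch below does that with two hardcoded sets; the registry subset shown is an assumption for illustration and would in practice be generated from the registry.yaml linked earlier in this RFC, and the custom field name is a made-up placeholder.

```python
# Proposed gen_ai fields from this RFC (illustrative subset).
proposed_fields = {
    "gen_ai.system",
    "gen_ai.operation.name",
    "gen_ai.request.model",
    "gen_ai.request.max_tokens",
    "gen_ai.response.id",
    "gen_ai.response.finish_reasons",
    "gen_ai.usage.input_tokens",
    "gen_ai.usage.output_tokens",
    "gen_ai.custom.example_field",  # placeholder for any proposed field with no OTel counterpart
}

# Attributes assumed to be in the OTel GenAI registry; in practice this set
# should be parsed from the registry.yaml referenced above rather than
# maintained by hand.
otel_registry_attributes = {
    "gen_ai.system",
    "gen_ai.operation.name",
    "gen_ai.request.model",
    "gen_ai.request.max_tokens",
    "gen_ai.response.id",
    "gen_ai.response.finish_reasons",
    "gen_ai.usage.input_tokens",
    "gen_ai.usage.output_tokens",
}

missing_from_otel = sorted(proposed_fields - otel_registry_attributes)
print("Proposed fields with no OTel counterpart:", missing_from_otel)
```

Fields that show up in the difference are the ones that would either need to be proposed upstream to OTel first or justified as ECS-only extensions.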