Only include necessary information in prompt for GenAI metrics #10698

rmalani-db · 2023-12-14T16:56:22Z

Related Issues/PRs

What changes are proposed in this pull request?

Add an include_input parameter to make_genai_metric() so that custom metrics that need only the output and context can exclude the input, giving a shorter prompt that contains only relevant information. The parameter defaults to True for backward compatibility; pass False to exclude the input.
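
As a rough illustration of the intent (not MLflow's actual implementation), the sketch below assembles a grading prompt from an input, an output, and a context, and drops the input section when include_input is False. The helper name build_grading_prompt and the prompt layout are assumptions made for this example; only the include_input flag and its default of True come from the description above.

```python
from typing import Optional


def build_grading_prompt(
    output: str,
    context: str,
    input: Optional[str] = None,
    include_input: bool = True,
) -> str:
    """Hypothetical helper: assemble a judge prompt, optionally omitting the input."""
    sections = []
    # Keep the input only when the metric asks for it (the default).
    if include_input and input is not None:
        sections.append(f"Input:\n{input}")
    sections.append(f"Output:\n{output}")
    sections.append(f"Context:\n{context}")
    return "\n\n".join(sections)


with_input = build_grading_prompt(
    output="MLflow is an open source platform.",
    context="MLflow docs excerpt",
    input="What is MLflow?",
)
without_input = build_grading_prompt(
    output="MLflow is an open source platform.",
    context="MLflow docs excerpt",
    input="What is MLflow?",
    include_input=False,
)
```

With include_input=False the prompt starts at the output section, so a metric that judges only output and context sends fewer tokens to the judge model.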

How is this PR tested?

Existing unit/integration tests
New unit/integration tests
Manual tests

Does this PR require documentation update?

Release Notes

Is this a user-facing change?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

Interface

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

Language

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

Integrations

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes

github-actions · 2023-12-14T16:56:42Z

Documentation preview for b4e12c7 will be available here when this CircleCI job completes successfully.

More info

Ignore this comment if this PR does not change the documentation.
It takes a few minutes for the preview to be available.
The preview is updated when a new commit is pushed to this PR.
This comment was created by https://github.com/mlflow/mlflow/actions/runs/7293197168.


rmalani-db · 2023-12-19T21:43:47Z

@prithvikannan @annzhang-db Ready for review. Thanks!

sunishsheth2009

@dbczumar Can you help review this as well? I wanted to confirm that the API changes here make sense, and to see whether you have other opinions on the suggestions made.

Thank you @rmalani-db for working through this, and sorry for the back and forth.

sunishsheth2009 · 2023-12-20T06:27:14Z

mlflow/metrics/genai/genai_metric.py

@@ -229,6 +231,7 @@ def process_example(example):
        name,
        definition,
        grading_prompt,
+        include_input,


On line 243, in the args of eval_fn, should we stop requiring inputs to be passed to eval_fn?
That would also make it optional in the signature. Thoughts?

sunishsheth2009 · 2023-12-20T06:31:50Z

mlflow/metrics/genai/base.py

@@ -72,17 +72,23 @@ def _format_grading_context(self):
        else:
            return self.grading_context

-    def __str__(self) -> str:
+    def print(self, include_input: bool = True) -> str:


Is there a reason we are using include_input in the example rather than just not passing in the input, basically making input optional here? Sorry, maybe I missed this decision :(


sunishsheth2009 · 2023-12-20T06:41:07Z

mlflow/metrics/genai/prompts/v1.py

@@ -64,6 +84,7 @@ class EvaluationModel:
    name: str
    definition: str
    grading_prompt: str
+    include_input: bool = True


Do we need to change this here as well? If we just don't pass input to the evaluation_model, it isn't rendered. We can handle it the same way as grading_context_columns. Basically, we can try to avoid adding more APIs.


rmalani-db

I tried to simplify the API surface affected by the optional include_input parameter. In the process, I also simplified how PromptTemplate works to allow for optional variables that are None. Please take another look @sunishsheth2009, thanks.
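
One way to picture a template with optional variables is sketched below. It is an assumption-laden illustration rather than the PromptTemplate code from this PR: the class name OptionalPromptTemplate and the idea of splitting the template into parts are hypothetical. Each part is rendered only if every variable it references was supplied with a non-None value.

```python
import string
from typing import List, Optional


class OptionalPromptTemplate:
    """Hypothetical template made of parts; a part is skipped if any of its
    variables is missing or None."""

    def __init__(self, parts: List[str]):
        self.parts = parts

    @staticmethod
    def _variables(part: str) -> List[str]:
        # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
        return [name for _, name, _, _ in string.Formatter().parse(part) if name]

    def format(self, **kwargs: Optional[str]) -> str:
        # Treat None the same as "not provided".
        provided = {k: v for k, v in kwargs.items() if v is not None}
        rendered = []
        for part in self.parts:
            if all(name in provided for name in self._variables(part)):
                rendered.append(part.format(**provided))
        return "".join(rendered)


template = OptionalPromptTemplate(
    ["Input:\n{input}\n\n", "Output:\n{output}\n\n", "Score the output from 1 to 5."]
)
full_prompt = template.format(input="What is MLflow?", output="A platform.")
short_prompt = template.format(input=None, output="A platform.")
```

Because a None value simply removes the part that mentions it, callers do not need a separate flag on the template itself.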

rmalani-db · 2023-12-20T16:56:50Z

mlflow/metrics/genai/base.py

@@ -72,17 +72,23 @@ def _format_grading_context(self):
        else:
            return self.grading_context

-    def __str__(self) -> str:
+    def print(self, include_input: bool = True) -> str:


We can make input optional here in EvaluationExample, but users can still provide it (for example, when they share examples between different metrics). We should still exclude the input if the metric excludes it. I believe that's the decision in Option C in the linked Jira ticket.
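
The behaviour described in this reply could look roughly like the following sketch. ExampleSketch and its render method are hypothetical stand-ins, not the EvaluationExample class from the diff; the field names and output layout are assumptions. The point shown is that one example can carry an input and still be rendered without it for a metric that excludes the input.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleSketch:
    """Hypothetical stand-in for a grading example; `input` is optional."""

    output: str
    score: int
    justification: str
    input: Optional[str] = None

    def render(self, include_input: bool = True) -> str:
        lines = []
        # The metric's setting wins: a stored input is dropped when
        # include_input is False.
        if include_input and self.input is not None:
            lines.append(f"Example Input:\n{self.input}")
        lines.append(f"Example Output:\n{self.output}")
        lines.append(f"Example score: {self.score}")
        lines.append(f"Example justification: {self.justification}")
        return "\n\n".join(lines)


shared = ExampleSketch(
    input="What is MLflow?",
    output="MLflow is an open source platform.",
    score=4,
    justification="Accurate but brief.",
)
# The same example reused by two metrics with different settings.
for_input_metric = shared.render()
for_output_only_metric = shared.render(include_input=False)
```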

rmalani-db · 2023-12-21T21:10:06Z

mlflow/metrics/genai/genai_metric.py

@@ -280,7 +290,9 @@ def eval_fn(
                )
            grading_payloads.append(
                evaluation_context["eval_prompt"].format(
-                    input=input, output=output, grading_context_columns=arg_string
+                    input=(input if include_input else None),


Now the eval context doesn't need to have an explicit include_input parameter.


rmalani-db marked this pull request as ready for review December 14, 2023 18:41

rmalani-db requested review from prithvikannan and annzhang-db December 14, 2023 21:02

github-actions bot added the rn/none label (List under Small Changes in Changelogs.) Dec 15, 2023

rmalani-db force-pushed the ML-36162 branch from cc264bf to 265ff3b December 19, 2023 20:04

Roshni Malani added 2 commits December 19, 2023 12:05

Only include necessary information in prompt for GenAI metrics

2752830

Signed-off-by: Roshni Malani <[email protected]>

empty

71337e8

Signed-off-by: Roshni Malani <[email protected]>

rmalani-db force-pushed the ML-36162 branch 2 times, most recently from 737d440 to f8eda62 December 19, 2023 21:07

sunishsheth2009 reviewed Dec 20, 2023


Remove example input from the prompt if metric does not require input

1695c78

Signed-off-by: Roshni Malani <[email protected]>

rmalani-db force-pushed the ML-36162 branch 4 times, most recently from 66bfa85 to 0f509bd December 21, 2023 20:15

attempt to simplify method signatures

b4e12c7

Signed-off-by: Roshni Malani <[email protected]>

rmalani-db force-pushed the ML-36162 branch from 0f509bd to b4e12c7 December 21, 2023 21:07

rmalani-db commented Dec 21, 2023


rmalani-db requested a review from sunishsheth2009 January 2, 2024 18:45

sunishsheth2009 approved these changes Jan 2, 2024


rmalani-db merged commit 2aa052e into master Jan 3, 2024
36 checks passed

rmalani-db deleted the ML-36162 branch January 3, 2024 18:00

annzhang-db pushed a commit to annzhang-db/mlflow that referenced this pull request Jan 3, 2024

Only include necessary information in prompt for GenAI metrics (mlflo…

a0537b6

…w#10698) Signed-off-by: Roshni Malani <[email protected]> Co-authored-by: Roshni Malani <[email protected]>

B-Step62 pushed a commit to B-Step62/mlflow that referenced this pull request Jan 9, 2024

Only include necessary information in prompt for GenAI metrics (mlflo…

b23f8e1

…w#10698) Signed-off-by: Roshni Malani <[email protected]> Co-authored-by: Roshni Malani <[email protected]>
