Leverage async batch run for first async-enabled evaluator - FluencyEvaluator (#3542)

# Description

This PR aims to improve the performance of the evaluate API by leveraging async batch run to eliminate the overhead associated with using multiple processes. The key changes include:

- Converting the FluencyEvaluator to an async-based implementation.
- Plumbing work in the BatchEngine to enable async batch runs.

For more details, please check the "Run Evaluators Asynchronously" section in this [document](https://microsoft-my.sharepoint.com/:w:/p/ninhu/ETB_zdMkFrdAuf3Lcg9ssrUB6RVmyuFs5Un1G74O1HlwSA?e=vtfp7w).

**Results:**

- Evaluation with 1 evaluator and 1 row used to take 16 seconds. Now it takes only 2 seconds, an improvement of about 87%.
- The result is very close to a pure thread pool implementation, but with async batch run we also get proper timeout handling.

# All Promptflow Contribution checklist:

- [ ] **The pull request does not introduce [breaking changes].**
- [ ] **CHANGELOG is updated for new features, bug fixes or other significant changes.**
- [ ] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: [suggested workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices

- [ ] Title of the pull request is clear and informative.
- [ ] There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, [see this page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines

- [ ] Pull request includes test coverage for the included changes.
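The move from multiple processes to an async batch run with timeout handling, as described above, can be illustrated with a minimal sketch. This is not the BatchEngine or FluencyEvaluator code from this commit: `score_fluency`, `run_batch`, and the `concurrency` and `timeout_sec` parameters are hypothetical names chosen for illustration, and the scoring logic is a placeholder for a real model call.

```python
import asyncio


async def score_fluency(row: dict) -> dict:
    """Hypothetical stand-in for an async evaluator (e.g. one LLM request per row)."""
    await asyncio.sleep(row.get("latency", 0.01))  # simulate I/O-bound work
    # Placeholder score: word count of the answer, not a real fluency metric.
    return {"id": row["id"], "score": len(row["answer"].split())}


async def run_batch(rows, evaluator, concurrency=4, timeout_sec=1.0):
    """Evaluate all rows on a single event loop, with a per-row timeout."""
    semaphore = asyncio.Semaphore(concurrency)  # cap the number of in-flight rows

    async def run_one(row):
        async with semaphore:
            try:
                # wait_for cancels the evaluator call if it exceeds the timeout
                return await asyncio.wait_for(evaluator(row), timeout=timeout_sec)
            except asyncio.TimeoutError:
                return {"id": row["id"], "error": "timeout"}

    # gather returns results in the same order as the input rows
    return await asyncio.gather(*(run_one(row) for row in rows))


rows = [
    {"id": 0, "answer": "Paris is the capital of France."},
    {"id": 1, "answer": "It is blue.", "latency": 0.5},  # slower than the timeout
]
results = asyncio.run(run_batch(rows, score_fluency, timeout_sec=0.1))
print(results)
```

Because every row runs as a coroutine in one process, there is no process start-up cost per batch, and a slow row is cancelled by `asyncio.wait_for` rather than blocking a worker.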
Showing 15 changed files with 188 additions and 56 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -1,18 +1,20 @@ |
| | | # promptflow-evals package |
| | | # Release History |
| | | |
| | | Please insert change log into "Next Release" ONLY. |
| | | |
| | | ## Next release |
| | | |
| | | ## 0.3.2 |
| | | ## v0.3.2 (Upcoming) |
| | | |
| | | ### Features Added |
| | | - Introduced `JailbreakAdversarialSimulator` for customers who need to do run jailbreak and non jailbreak adversarial simulations at the same time. More info in the README.md in `/promptflow/evals/synthetic/README.md#jailbreak-simulator` |
| | | |
| | | - The `AdversarialSimulator` responds with `category` of harm in the response. |
| | | |
| | | - Large simulation was causing a jinja exception, this has been fixed |
| | | ### Bugs Fixed |
| | | - Large simulation was causing a jinja exception, this has been fixed. |
| | | |
| | | ### Improvements |
| | | - Converted built-in evaluators to async-based implementation, leveraging async batch run for performance improvement. |
| | | - Parity between evals and Simulator on signature, passing credentials. |
| | | - The `AdversarialSimulator` responds with `category` of harm in the response. |
| | | |
| | | ## v0.3.1 (2022-07-09) |
| | | - This release contains minor bug fixes and improvements. |
| | | |
| | | ## 0.0.1 |
| | | - Introduced package |
| | | ## v0.3.0 (2024-05-17) |
| | | - Initial release of promptflow-evals package. |