Adding evals for natural language workflow building. #14417

Open
wants to merge 19 commits into
base: master
877 changes: 877 additions & 0 deletions packages/evals/component_retrieval/eval-test-suite-0-100-filtered.json

Large diffs are not rendered by default.


@@ -0,0 +1,214 @@
{
"evaluationTests": [
{
"query": "When new support tickets come in through Zendesk, analyze sentiment with GPT and prioritize in Linear based on urgency",
"triggers": [
"zendesk-new-ticket"
],
"actions": [
"openai-chat",
"linear-create-issue"
],
"persona": "complex-workflow"
},
{
"query": "I need sales calls recorded in Gong to be transcribed and summarized for the team in Slack",
"triggers": [
"gong-new-call"
],
"actions": [
"openai-create-transcription",
"openai-chat",
"slack-send-message"
],
"persona": "verbose"
},
{
"query": "When leads submit Typeform responses, use GPT to qualify them and update their status in HubSpot",
"triggers": [
"typeform-new-submission"
],
"actions": [
"openai-chat",
"hubspot-create-or-update-contact"
],
"persona": "complex-workflow"
},
{
"query": "Get help from AI to classify and organize our Notion knowledge base",
"triggers": [
"notion-new-page"
],
"actions": [
"openai-chat",
"notion-create-page-from-database"
],
"persona": "task-oriented"
},
{
"query": "When customers message us on Intercom, analyze intent with GPT before creating tickets",
"triggers": [
"intercom-new-conversation"
],
"actions": [
"openai-chat",
"linear-create-issue"
],
"persona": "complex-workflow"
},
{
"query": "Analyze customer feedback from Delighted with AI and update account health in Salesforce",
"triggers": [],
"actions": [
"openai-chat",
"salesforce_rest_api-update-contact"
],
"persona": "task-oriented"
},

⚠️ Potential issue

Add missing triggers for event-driven workflows.

Several test cases have empty trigger arrays despite describing event-driven scenarios:

  1. Delighted customer feedback analysis (lines 60-67)
  2. GitHub issues analysis (lines 81-88)
  3. Canny feature requests categorization (lines 123-130)
  4. Help Scout conversations analysis (lines 164-171)
  5. Salesforce deal closure handling (lines 195-202)

Consider adding an appropriate trigger to each of these cases. For the Delighted case, for example:

 {
   "query": "Analyze customer feedback from Delighted with AI and update account health in Salesforce",
-  "triggers": [],
+  "triggers": ["delighted-new-response"],
   ...
 }

Would you like me to suggest specific triggers for each case?

Also applies to: 81-88, 123-130, 164-171, 195-202
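
The empty `triggers` arrays flagged above can also be found mechanically rather than by eye. The following is a minimal sketch, assuming the suite keeps the `evaluationTests` / `triggers` layout shown in this diff; the helper name `find_empty_triggers` is hypothetical and not part of the PR, and the sample holds two cases copied from the diff and trimmed to the fields used here.

```python
import json


def find_empty_triggers(suite: dict) -> list[str]:
    """Return the query of every test case whose "triggers" list is empty or missing."""
    return [
        case["query"]
        for case in suite.get("evaluationTests", [])
        if not case.get("triggers")
    ]


# Two cases copied from the diff, trimmed to the relevant fields.
sample = json.loads("""
{
  "evaluationTests": [
    {
      "query": "Use AI to analyze Github issues and suggest priority levels in Linear",
      "triggers": [],
      "actions": ["openai-chat", "linear-create-issue"]
    },
    {
      "query": "When new RSS articles mention our company, use AI to analyze sentiment and alert team",
      "triggers": ["rss-new-item-in-feed"],
      "actions": ["openai-chat", "slack-send-message"]
    }
  ]
}
""")

# Print each query that still needs a trigger decision.
for query in find_empty_triggers(sample):
    print(query)
```

Whether an empty array is a mistake or a deliberate "no trigger expected" label is still a judgment call per case; the script only lists the candidates.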

{
"query": "When new videos are uploaded to Zoom, I want them transcribed and summarized for the team",
"triggers": [
"zoom-recording-completed"
],
"actions": [
"openai-create-transcription",
"openai-chat",
"slack-send-message"
],
"persona": "verbose"
},
{
"query": "Use AI to analyze Github issues and suggest priority levels in Linear",
"triggers": [],
"actions": [
"openai-chat",
"linear-create-issue"
],
"persona": "task-oriented"
},
{
"query": "When documents are uploaded to Google Drive, use GPT to generate summaries in Notion",
"triggers": [
"google_drive-new-files-instant"
],
"actions": [
"openai-chat",
"notion-create-page-from-database"
],
"persona": "complex-workflow"
},
{
"query": "Analyze customer churn risk based on Intercom conversations using GPT",
"triggers": [
"intercom-new-conversation"
],
"actions": [
"openai-chat",
"hubspot-create-or-update-contact"
],
"persona": "task-oriented"
},
{
"query": "When new RSS articles mention our company, use AI to analyze sentiment and alert team",
"triggers": [
"rss-new-item-in-feed"
],
"actions": [
"openai-chat",
"slack-send-message"
],
"persona": "complex-workflow"
},
{
"query": "I need GPT to help categorize incoming feature requests from Canny into our product roadmap",
"triggers": [],
"actions": [
"openai-chat",
"notion-create-page-from-database"
],
"persona": "verbose"
},
{
"query": "When customers respond to our Typeform survey, analyze trends with AI and update dashboards",
"triggers": [
"typeform-new-submission"
],
"actions": [
"openai-chat",
"google_sheets-add-single-row"
],
"persona": "complex-workflow"
},
{
"query": "Use GPT to analyze Calendly meeting notes and create action items in Asana",
"triggers": [
"calendly_v2-new-event-scheduled"
],
"actions": [
"openai-chat",
"asana-create-task"
],
"persona": "task-oriented"
},
{
"query": "When support team sends emails in Gmail, let AI check tone and suggest improvements",
"triggers": [
"gmail-new-email"
],
"actions": [
"openai-chat"
],
"persona": "complex-workflow"
},
{
"query": "I need customer conversations from Help Scout to be analyzed for product feedback",
"triggers": [],
"actions": [
"openai-chat",
"notion-create-page-from-database"
],
"persona": "verbose"
},
{
"query": "When new comments appear on our YouTube videos, use AI to moderate and flag issues",
"triggers": [
"youtube_data_api-new-comment-posted"
],
"actions": [
"openai-chat",
"slack-send-message"
],
"persona": "complex-workflow"
},
{
"query": "Use GPT to analyze Twitter mentions and create support tickets when needed",
"triggers": [
"twitter-new-mention-received-by-user"
],
"actions": [
"openai-chat",
"linear-create-issue"
],
"persona": "task-oriented"
},
{
"query": "When deals close in Salesforce, use AI to generate personalized onboarding docs",
"triggers": [],
"actions": [
"openai-chat",
"google_docs-create-document"
],
"persona": "complex-workflow"
},
{
"query": "I want GPT to help write better commit messages for our Github repos",
"triggers": [
"github-new-commit"
],
"actions": [
"openai-chat"
],
"persona": "verbose"
}
]
}
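
The diff does not show how this suite is consumed, so the following is only a sketch of one way a single case could be scored: compare the component keys a retriever returns against the expected `triggers` or `actions` and report set-based precision and recall. The function `score_case` and the predicted list are assumptions made for illustration, not code or output from the PR.

```python
def score_case(expected: list[str], predicted: list[str]) -> dict[str, float]:
    """Set-based precision and recall of predicted component keys against expected ones."""
    expected_set, predicted_set = set(expected), set(predicted)
    hits = len(expected_set & predicted_set)
    # With nothing predicted, precision is reported as 0.0 by convention.
    precision = hits / len(predicted_set) if predicted_set else 0.0
    # With nothing expected, there is nothing to miss, so recall is 1.0.
    recall = hits / len(expected_set) if expected_set else 1.0
    return {"precision": precision, "recall": recall}


# Expected actions taken from the first test case in the diff.
expected_actions = ["openai-chat", "linear-create-issue"]
# Hypothetical retriever output, invented for this example.
predicted_actions = ["openai-chat", "slack-send-message"]

result = score_case(expected_actions, predicted_actions)
print(result)
```

Under this scoring, a case with an empty `triggers` array gets full recall for any prediction, which is one reason the empty arrays are worth deciding on deliberately.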

Check failure on line 214 in packages/evals/component_retrieval/eval-test-suite-ai-focus-filtered.json

GitHub Actions / Lint Code Base

Newline required at end of file but not found
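
The lint failure above is usually fixed by an editor setting or a formatter, but it can also be handled with a few lines of script. This is a minimal sketch, assuming the only problem is the missing final newline; `ensure_trailing_newline` is a hypothetical helper, and the demonstration runs on a temporary file rather than on the real suite.

```python
import tempfile
from pathlib import Path


def ensure_trailing_newline(path: Path) -> bool:
    """Append a newline if the file is non-empty and lacks one. Return True if changed."""
    data = path.read_bytes()
    if not data or data.endswith(b"\n"):
        return False
    path.write_bytes(data + b"\n")
    return True


# Demonstration on a temporary stand-in for the JSON suite.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "suite.json"
    demo.write_bytes(b'{"evaluationTests": []}')
    changed_first = ensure_trailing_newline(demo)   # newline is appended
    changed_second = ensure_trailing_newline(demo)  # already fixed, nothing to do
    final_bytes = demo.read_bytes()

print(changed_first, changed_second)
```

In practice the same function would be pointed at `packages/evals/component_retrieval/eval-test-suite-ai-focus-filtered.json`, the file named in the check annotation.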