Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy2code widget #1782

Merged
merged 69 commits into from
Oct 9, 2024
Merged

Policy2code widget #1782

merged 69 commits into from
Oct 9, 2024

Conversation

leehengpan
Copy link
Collaborator

@leehengpan leehengpan commented Sep 11, 2024

Fixes #1749
Fixes #430
Fixes #1851
Fixes #508
Fixes #1855
Fixes #1856

This PR creates a new endpoint /tracer_analysis, that enables household tracer analysis output to be analyzed via Claude. It also modifies PolicyEngineCountry.calculate to create a local tracer and saves that into a new purpose-built database table. It creates a function to parse the resulting tracer output for a particular variable's trace.

It also renames /analysis to /simulation_analysis to represent the fact that we now have two AI-driven endpoints. It refactors the former /analysis endpoint to make the function we use to interface with Claude more modular, and it re-enables streaming on this endpoint by returning a ReadableStream instead of a series of textual messages.

Due to the fact that this renames the /analysis endpoint, this must be merged at the same time as PolicyEngine/policyengine-app#2010. This will be kept in draft until that is ready.

Copy link
Collaborator

@anth-volk anth-volk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leehengpan Thanks for this. Everything looks good so far, all that remains is actually writing to the local database so that we're able to pull the output on the front end.

@PavelMakarchuk
Copy link
Collaborator

@leehengpan Thanks for this. Everything looks good so far, all that remains is actually writing to the local database so that we're able to pull the output on the front end.

I took a stab at this for time purposes - not sure if the test etc. is implemented correctly - lmk what you think

Copy link
Collaborator

@anth-volk anth-volk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @leehengpan. I've noted some places where I think edits would be necessary.

Also, I want to note that we'll have to do two things at the end:

  • Update the added test
  • Delete tracer_output_outer_function.json, though it may be helpful as we create the parsing function

@@ -117,6 +118,10 @@

app.route("/<country_id>/user_profile", methods=["PUT"])(update_user_profile)

app.route("/<country_id>/tracer_analysis", methods=["GET"])(
trigger_tracer_analysis
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is a valid name, it doesn't follow standard naming conventions. Could you instead call this get_tracer_analysis?

# write a recursive function here that, when there is an adds and/or a subtracts, calls get_all_variables on that next tier downward, until eventually you hit some marker of there being no more levels.


def get_all_variables(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the below function (and its comments) should be deleted

country_id VARCHAR(3) NOT NULL,
api_version VARCHAR(10) NOT NULL,
tracer_output JSON NOT NULL,
-- variable_name VARCHAR(255) NOT NULL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can fully delete this line

country_id VARCHAR(3) NOT NULL,
api_version VARCHAR(10) NOT NULL,
tracer_output JSON NOT NULL,
-- variable_name VARCHAR(255) NOT NULL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can delete here, as well

policyengine_api/endpoints/tracer_analysis.py Show resolved Hide resolved
Copy link
Collaborator

@anth-volk anth-volk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @leehengpan. I've suggested one edit and briefly made another, and I can explain the one I made more in detail if you'd like.

tracer_output = simulation.tracer.computation_log
log_lines = tracer_output.lines(aggregate=False, max_depth=10)
log_json = json.dumps(log_lines)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an if statement here to make sure that household_id and policy_id exist before trying to write to the db? You'll notice they're optional params in the function signature above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

{anthropic.AI_PROMPT}"""

# get prompt_id
prompt_id = local_database.query(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this piece of code won't work. What you would have to do is first check if the prompt is already stored in the db. If it is, then use that prompt_id; if not, store it, then fetch the prompt_id.

I think the best resolution will be to modify get_analysis as follows:

  • Add an optional param to get_analysis that is the prompt, with a default value of None
  • If the prompt is passed as an arg, treat that as the prompt, else if it's within the request args, use that, else it's equal to None

Basically, we're adding one additional conditional way of passing the prompt to the controller. Then, you can remove the code getting the prompt_ID value and just add the prompt to the code as follows:
analysis = get_analysis(country_id, prompt=prompt)

tracer_segment = parse_tracer_output(row["tracer_output"], variable)

# TODO: Add the parsed tracer output to the prompt
prompt = tracer_analysis_prompt.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty of moving the prompt into a new folder and using a formatting trick to pass the values into a f-string. This improves separation of concerns and makes it easier to modify the prompt itself, if necessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Thank you, Anthony.

# TODO: Call get_analysis with the complete prompt
analysis = get_analysis(country_id, prompt_id)

if row is not None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand from here down is legacy code, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, except for the get_analysis and prompt_id lines.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, my bad, I meant for that comment to be two lines lower

@anth-volk anth-volk self-requested a review October 4, 2024 17:11
@anth-volk anth-volk marked this pull request as ready for review October 8, 2024 17:46
Copy link
Collaborator

@nikhilwoodruff nikhilwoodruff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after one minor comment

changelog_entry.yaml Show resolved Hide resolved
@anth-volk
Copy link
Collaborator

This should be good. Will merge after tests pass.

@anth-volk
Copy link
Collaborator

Tests have passed. Merging. Once this launches, the front-end changes will also be merged.

@anth-volk anth-volk merged commit 5862b17 into master Oct 9, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants