Redesigning Design Methods with Bayesian Optimisation #1762

m-bone · 2024-06-13T16:29:35Z

Redesigning method.py to allow for key usability improvements, namely:

- User should be able to change sampling/optimising method without changing the sim_creation function
- User should be able to run single simulations or batches of simulations without changing the sim_creation function

Included within this is the addition of Bayesian optimisation and the creation of a MethodOptimise subclass to enable rapid development of additional optimisation methods e.g. genetic algorithms.

This will require refactoring to achieve this setup without significantly compromising the current user experience. The principal change will be the requirement of pre_fn and post_fn functions for creating and processing simulated data.

tylerflex

Didn't get a ton of time to fully digest it today, but we can discuss tomorrow. So far, it's looking good though!

tidy3d/plugins/design/method.py

tylerflex

Hey @m-bone just went through. Looks like it's getting pretty close. A few things:

I think I still feel like it would make more sense to handle fn_post after fn_mid. In my mind, if we have fn_mid return something dict-like (ie either a dict or a BatchData) then we can simply iterate through it and feed values to fn_post, this would retain the memory saving of BatchData.items(), while I think simplifying the code. Let me know if you see any problem with that?
I think we still need to nail down some kind of well defined acceptable output type -> input type relationship for fn_pre and fn_post. The current code implicitly seems to handle

| fn_pre return type | fn_post arg type | notes |
| --- | --- | --- |
| Simulation | SimulationData | line 170 |
| dict[Any, Simulation] | dict[Any, SimulationData] | line 191 |
| list[Simulation] | list[SimulationData] | line 213 |
| Any | Any | line 217 |

I think we need to consider a few things:
A. What if fn_pre returns a mixed dict or list, say some Simulation and some float? For example, they want to pass some computed value to their fn_post. Do we allow this? How do we handle it more generally?
B. We should be able to trivially handle fn_pre returning a Batch or Job directly, I think.
C. I recall now that in the original design plugin, a dict output of fn_pre actually gets input as kwargs to fn_post and a list output gets passed as positional args. See after cell [11]: https://docs.flexcompute.com/projects/tidy3d/en/latest/notebooks/Design.html
I think we just need to decide how to fill out this table once and for all. Here's my proposal:

in a general sense

| fn_pre return type | fn_post arg type | notes |
| --- | --- | --- |
| list | *args | |
| dict | **kwargs | |
| other | other | |

and then within individual data, basically

| fn_pre return type | fn_post arg type | notes |
| --- | --- | --- |
| Simulation | SimulationData | |
| Batch | BatchData | |
| Job | SimulationData | |
| anything else | itself | |

Then we can say that if there's some special cases we handle to speed things along..

if a list or dict of stuff, any Simulation-like data gets combined into one batch run, but then passed to fn_post using the tables above.

That's pretty complex but I'll comment out some specifics to discuss
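The "general sense" part of the proposal above can be sketched in a few lines. This is only an illustration of the calling convention, not code from the PR: `call_fn_post` and `post_sum` are hypothetical names, and no tidy3d objects are involved.

```python
from typing import Any, Callable


def call_fn_post(fn_post: Callable[..., Any], pre_output: Any) -> Any:
    """Call fn_post following the proposed convention (hypothetical helper)."""
    if isinstance(pre_output, dict):
        # dict -> keyword arguments (keys must be valid string names)
        return fn_post(**pre_output)
    if isinstance(pre_output, list):
        # list -> positional arguments
        return fn_post(*pre_output)
    # anything else -> passed through as a single argument
    return fn_post(pre_output)


def post_sum(a, b=0):
    """Toy post-processing function used only for this example."""
    return a + b


scalar_result = call_fn_post(post_sum, 1.0)
list_result = call_fn_post(post_sum, [1, 2])
dict_result = call_fn_post(post_sum, {"a": 2, "b": 5})
```

One consequence worth noting: a dict returned by fn_pre needs string keys that match the parameter names of fn_post, otherwise the keyword call fails.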

tidy3d/plugins/design/design.py

tylerflex · 2024-06-20T12:38:02Z

| `fn_pre` return | `fn_mid` does | `fn_post` call |
| --- | --- | --- |
| `1.0` | nothing | `fn_post(1.0)` |
| `[1,2,3]` | nothing | `fn_post(1,2,3)` |
| `{'a': 2, 'b': 'hi'}` | nothing | `fn_post(a=2, b='hi')` |
| `Sim()` | `web.run()` | `fn_post(SimData())` |
| `Batch()` | `web.Batch.run()` | `fn_post(BatchData())` |
| `[Sim(), Sim()]` | `web.Batch().run()` | `fn_post(SimData(), SimData())` |
| `[Sim(), 1.0]` | `web.run()` | `fn_post(SimData(), 1.0)` |
| `[Sim(), Batch()]` | `web.Batch().run()` | `fn_post(SimData(), BatchData())` |
| `{'a': Sim(), 'b': Batch(), 'c': 2.0}` | `web.Batch().run()` | `fn_post(a=SimData(), b=BatchData(), c=2.0)` |

How does this sound?
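A rough sketch of the `fn_mid` column in the table above, under heavy assumptions: `FakeSim`, `FakeSimData`, and `fake_run_batch` are stand-ins invented for this example (they are not tidy3d classes or web calls), and the `Batch()` rows are left out. The idea shown is only that simulation-like entries are gathered, run together once, and swapped back into place while other values pass through untouched.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FakeSim:
    """Stand-in for a simulation object (not the tidy3d class)."""
    name: str


@dataclass(frozen=True)
class FakeSimData:
    """Stand-in for the data produced by running a FakeSim."""
    name: str


def fake_run_batch(sims: dict) -> dict:
    """Pretend to run every simulation in a single batch."""
    return {key: FakeSimData(sim.name) for key, sim in sims.items()}


def fn_mid(pre_output: Any) -> Any:
    """Replace each FakeSim in pre_output with its FakeSimData."""
    if isinstance(pre_output, dict):
        items = list(pre_output.items())
    elif isinstance(pre_output, list):
        items = list(enumerate(pre_output))
    else:
        items = [(None, pre_output)]

    # Gather simulation-like entries, keyed by position, so they run together.
    to_run = {
        str(i): val for i, (_, val) in enumerate(items) if isinstance(val, FakeSim)
    }
    results = fake_run_batch(to_run) if to_run else {}

    # Put results back where the simulations were; other values pass through.
    swapped = [(key, results.get(str(i), val)) for i, (key, val) in enumerate(items)]

    if isinstance(pre_output, dict):
        return dict(swapped)
    if isinstance(pre_output, list):
        return [val for _, val in swapped]
    return swapped[0][1]


list_out = fn_mid([FakeSim("s1"), 1.0])
dict_out = fn_mid({"a": FakeSim("s2"), "c": 2.0})
plain_out = fn_mid(1.0)
```

The output of `fn_mid` keeps the container type of the `fn_pre` output, so the list and dict unpacking rules for `fn_post` can be applied afterwards without special cases.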

m-bone · 2024-07-03T09:47:08Z

Noticed another change for optimisers compared to samplers: the user may want to return the raw output data from a simulation run whilst the optimiser will only take a single float value. I think it would be sensible to allow the user to output the raw data as a list/dict and we combine it within the results df, whilst the optimiser just gets passed the float. The two output cases would be:

def td_post(sim_data):
    mnt1 = sim_data["field1"]
    mnt2 = sim_data["field2"]
    avg_field = (mnt1 + mnt2) / 2

    return avg_field

def td_post_raw_out(sim_data):
    mnt1 = sim_data["field1"]
    mnt2 = sim_data["field2"]
    avg_field = (mnt1 + mnt2) / 2

    return [avg_field, [mnt1, mnt2]]

We can include a MethodOptimise method that can deal with float or list output from the function, and let the user know that the function must output either a float or a list whose first element (list[0]) is the float the optimiser needs. We can then add the raw output to the results df, and where the user outputs a dict we can set the column labels as the keys.

Is this too much pressure on the user? I feel this output is something people will want and it fits well within our existing API design.
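To make the float-or-list requirement concrete, here is a minimal sketch of how such an output might be split. `split_post_output` and `build_row` are hypothetical helpers, and the column names `output`, `raw`, and `raw_<i>` are assumptions made for this example rather than anything defined in the plugin.

```python
from numbers import Real
from typing import Any


def _is_float_like(value: Any) -> bool:
    """True for real numbers, excluding booleans."""
    return isinstance(value, Real) and not isinstance(value, bool)


def split_post_output(output: Any) -> tuple:
    """Return (objective value, raw data) from an fn_post output."""
    if _is_float_like(output):
        return float(output), None
    if isinstance(output, list) and output and _is_float_like(output[0]):
        # A two-item list is treated as [value, raw]; longer lists keep the tail.
        raw = output[1] if len(output) == 2 else output[1:]
        return float(output[0]), raw
    raise ValueError(
        "Optimisers need a float, or a list whose first element is a float."
    )


def build_row(value: float, raw: Any) -> dict:
    """Build one results row; dict keys become column labels."""
    row = {"output": value}
    if isinstance(raw, dict):
        row.update(raw)
    elif isinstance(raw, (list, tuple)):
        row.update({f"raw_{i}": item for i, item in enumerate(raw)})
    elif raw is not None:
        row["raw"] = raw
    return row


value_only = split_post_output(0.5)
value_and_raw = split_post_output([0.5, [1.0, 2.0]])
labelled_row = build_row(0.5, {"mnt1": 1.0, "mnt2": 2.0})
```

Raising the error at the point where the output is first split is one way to fail early for unsupported outputs, before any optimiser step consumes the value.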

tylerflex · 2024-07-03T23:29:02Z

Hey Matt, I think this is a good thing to consider. The way that it’s handled in jax and autograd is to support a has_aux : bool = False flag. When has_aux is False, the objective function is assumed to simply return a float. However, when has_aux is True, the objective function is assumed to return a tuple of length 2 where the first value is assumed to be the value being optimized (float), and the 2nd item can be Anything the user wants. We could treat it like regular MethodSample output and somehow provide those Results at the end of the optimization run?

So yea what makes sense to me is to consider these two possible output possibilities to the MethodOptimize? (Where the aux part is handled like it would be in any other method). And I think we could consider putting this has_aux (or whatever we decide to call it) as a pydantic.Field inside of MethodOptimize if you agree?
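As a rough illustration of the `has_aux` idea, the toy class below evaluates an objective that returns either a float or a `(value, aux)` tuple depending on the flag. `ToyOptimise` is a plain dataclass invented for this sketch; it is not the plugin's MethodOptimize, it does not use pydantic, and the exhaustive search over candidates merely stands in for a real optimiser.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Sequence


@dataclass
class ToyOptimise:
    """Toy optimiser with a has_aux flag (illustrative only)."""

    has_aux: bool = False
    aux_log: list = field(default_factory=list)

    def evaluate(self, fn: Callable[[float], Any], x: float) -> float:
        """Return the float to optimise, storing any auxiliary output."""
        out = fn(x)
        if not self.has_aux:
            return float(out)
        if not (isinstance(out, tuple) and len(out) == 2):
            raise ValueError(
                "With has_aux=True the objective must return (value, aux)."
            )
        value, aux = out
        self.aux_log.append(aux)  # kept so it can be reported with the results
        return float(value)

    def maximise(self, fn: Callable[[float], Any], candidates: Sequence[float]) -> float:
        """Pick the candidate with the largest objective value."""
        return max(candidates, key=lambda x: self.evaluate(fn, x))


def objective(x: float):
    """Toy objective returning the value and some auxiliary data."""
    return -(x - 2.0) ** 2, {"x_squared": x * x}


opt = ToyOptimise(has_aux=True)
best = opt.maximise(objective, [0.0, 1.0, 2.0, 3.0])
```

With an explicit flag the optimiser never has to guess from the output type, which keeps the error message for a malformed return value simple.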

m-bone · 2024-07-05T08:12:11Z

With regards to the table I've implemented tests that validate all cases except

| `fn_pre` return | `fn_mid` does | `fn_post` call |
| --- | --- | --- |
| `Sim()` | `web.run()` | `fn_post(SimData())` |

For Sim() I think the correct result is to run web.Batch.run() and handle as many sims as is appropriate for the sampler/optimiser. This is the auto-batching that we've baked into this new API.

I'm also not convinced on allowing users to run their own batches using pre-post function inputs. Fn_mid doesn't currently support this so would need to be expanded. Creating a Batch feels like an unnecessary extra step because the user can just give us the dict they used if they wanted us to handle the batch. If we flatten the batch to take advantage of running sims in parallel, we'd likely lose whatever the user has specified as batch parameters. If the user really does want to manage their own batches it can be done with a single function.
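The auto-batching described for the `Sim()` case can be pictured roughly as follows. This is a sketch under assumptions: `run_auto_batched`, `toy_run_batch`, and the `task_<i>` naming are made up for the example, and plain dicts stand in for simulations and their data.

```python
from typing import Any, Callable


def run_auto_batched(
    fn_pre: Callable[..., Any],
    fn_post: Callable[[Any], Any],
    run_batch: Callable[[dict], dict],
    arg_list: list,
) -> list:
    """Build one simulation per parameter set, run them together, post-process each."""
    sims = {f"task_{i}": fn_pre(**kwargs) for i, kwargs in enumerate(arg_list)}
    data = run_batch(sims)  # one submission covering every parameter set
    return [fn_post(data[f"task_{i}"]) for i in range(len(arg_list))]


batch_sizes = []  # records how many simulations each batch call received


def toy_run_batch(sims: dict) -> dict:
    """Stand-in for a batch run: doubles the radius and calls it flux."""
    batch_sizes.append(len(sims))
    return {name: {"flux": sim["radius"] * 2} for name, sim in sims.items()}


results = run_auto_batched(
    fn_pre=lambda radius: {"radius": radius},
    fn_post=lambda sim_data: sim_data["flux"],
    run_batch=toy_run_batch,
    arg_list=[{"radius": 1.0}, {"radius": 2.0}, {"radius": 3.0}],
)
```

The user-facing functions only ever see a single simulation and a single result, while the runner decides how many parameter sets go into each batch.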

tylerflex · 2024-07-05T18:06:37Z

Just to respond to this officially / for reference (we kind of already discussed earlier):

Creating a Batch feels like an unnecessary extra step because the user can just give us the dict they used if they wanted us to handle the batch.

This is true except for the memory advantage of BatchData.

If we flatten the batch to take advantage of running sims in parallel, we'd likely lose whatever the user has specified as batch parameters.

Yea this is true too.

If the user really does want to manage their own batches it can be done with a single function.

Yes, but if the single function is used, then we can't take advantage of the parallelism over the arg_list (eg if each individual in the genetic algorithm population runs a batch, we'd have to run each batch sequentially).

At some point, it might be worth building in some special cases to allow Batches of Batches.. with BatchData argument types to fn_post. But only under specific cases? we could also build this functionality directly into Batch itself 🤔

m-bone · 2024-07-08T15:42:21Z

Hi @tylerflex, I think this is now ready for the first round of review.

Sampler now ignores requirements for specific outputs and will take any object, whilst optimisers will now reliably throw an error for unsupported outputs early on in the execution.

Testing has been expanded and is now covering >95% of the code. I believe I've implemented all the things we've discussed, as well as some extras so fire away if there's any questions!

tylerflex · 2024-07-08T19:15:01Z

Thanks @m-bone ! I'm flying home today but will review as much as I can and maybe will have a review by tomorrow (my) morning or early afternoon.

tylerflex

Thanks @m-bone . I added my initial comments. Overall the structure is looking pretty good! my comments are mainly details. Let's just discuss each of them and iterate a few times on it.

tests/test_plugins/test_design.py

tidy3d/plugins/design/design.py

tidy3d/plugins/design/method.py

m-bone · 2024-07-17T12:50:49Z

@tylerflex when you're ready I think this is good for your review. I've implemented all changes and added estimate_cost, logging and QoL features. The DesignSpace delete method will be coming next, but I want to review the existing design wrt. updating state and then go from there.

I've removed the pre/2.8 commits that were included (again), so this is all just my design update

tylerflex · 2024-07-17T13:08:16Z

Thanks @m-bone , enjoy your trip!

tylerflex

Hey @m-bone , thanks for this. Looking pretty good! I had a lot of suggestions but I think it's getting there.

My main concern at a high level is that I have trouble following every detail as things have gotten pretty complicated. I think a lot of this can't be avoided because we're trying to do a lot of things with this package. But in general if we can be really clear and explicit in the docstrings, descriptions, and errors, that will go a long way. This especially pertains to when there are edge cases that we handle (eg behavior A if given type a, behavior B when given type b). We should be super clear otherwise I worry it will be confusing to users. But overall I think it is pretty good. Thanks!

tests/test_plugins/test_design.py

tidy3d/plugins/design/design.py

tidy3d/plugins/design/method.py

tylerflex

Thanks @m-bone This is looking great. 80% of my comments were minor proofreading things. I think it's almost done. Only slight concerns right now are:

Some of these methods are a bit complex, which is not necessarily bad, but could make them a bit hard to maintain in the long run. If you see any way to simplify them, please take the opportunity now while it's still fresh.
I think docs will be very important here for usability, so please flesh out some of the docstrings for the main class with as much info, diagrams, links as you can, including links to the tutorial notebooks.

Other than that and some minor things, looks really good, think this will be a great addition!

tests/test_plugins/test_design.py

pyproject.toml

tidy3d/plugins/design/parameter.py

tidy3d/plugins/design/method.py

tidy3d/plugins/design/design.py

tylerflex

Looks good to me, just not sure about the task_name default. Also just wondering if you checked how the docs look when compiled?

tidy3d/plugins/design/design.py

tylerflex

looks good! just a few final things:

Please add one or more items in the CHANGELOG.md. For example, I think: a sentence under "changed" summarizing the major changes to design. 1 or more bullet points under "added" discussing the new optimization methods. Basically this will be our mini advertisement that the tool has these new features
Could you squash all of these commits into 1 (or at most, a handful)? The reason is to avoid an onslaught of new commits in the public branch. Just one or a few indicating the major changes should be easier. To squash, I usually rebase and "fixup" everything but the top commit, which I "reword" and call something general, eg ("design plugin overhaul and added optimization methods")
After this we can merge the notebook PR

Nice work Matt! really excited to see this in production

tylerflex · 2024-09-17T14:34:18Z

CHANGELOG.md

@@ -18,6 +18,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Added convenience functions `from_terminal_positions` and `from_circular_path` to simplify setup of `VoltageIntegralAxisAligned` and `CustomCurrentIntegral2D`, respectively.
 - Added `axial_ratio` to `DirectivityData.axial_ratio` for the `DirectivityMonitor`, defined as the ratio of the major axis to the minor axis of the polarization ellipse. 
 `ComponentModeler.batch_data` convenience property to access the `BatchData` corresponding to the component modeler run.
+- Added optimization methods to the Design plugin. The plugin has been expanded to include Bayesian optimization, genetic algorithms and particle swarm optimization. Explanations of these methods are available in new and updated notebooks.
+- Added new support functions for the Design plugin: automated batching of `Simulation` objects, and summary functions with `DesignSpace.estimate_cost` and `DesignSpace.summarize`.

 ### Changed


actually could you add a change item just to let users know of any API changes they might need?

- Added Bayesian optimization, genetic algorithms, and particle swarm optimization methods
- Removed MethodRandom and MethodRandomCustom as they had become redundant
- Expanded DesignSpace with new support methods estimate_cost and summarize
- Redesigned interface with plugin to make it easier for users to batch simulations for different design search methods
- Automated batching of simulations when user provides pre and post processing functions
- Expanded testing for all new features

m-bone requested a review from tylerflex June 13, 2024 16:29

tylerflex reviewed Jun 14, 2024

tylerflex reviewed Jun 20, 2024

m-bone changed the base branch from develop to pre/2.8 July 1, 2024 15:30

m-bone force-pushed the matt/bay_opt branch 2 times, most recently from 48b123b to 0533e81 Compare July 2, 2024 13:48

tylerflex reviewed Jul 9, 2024

m-bone force-pushed the matt/bay_opt branch from 280f7a8 to 2e0bcd5 Compare July 17, 2024 12:46

tylerflex reviewed Jul 22, 2024

m-bone force-pushed the matt/bay_opt branch 2 times, most recently from d9a3e7c to b703a24 Compare July 31, 2024 08:18

tylerflex requested changes Aug 14, 2024

m-bone force-pushed the matt/bay_opt branch from 8835ca9 to 7a6f8eb Compare August 26, 2024 13:51

tylerflex self-requested a review September 11, 2024 08:29

tylerflex reviewed Sep 11, 2024

m-bone force-pushed the matt/bay_opt branch from 7a6f8eb to 00adae8 Compare September 13, 2024 15:19

m-bone marked this pull request as ready for review September 17, 2024 13:23

tylerflex self-requested a review September 17, 2024 13:28

tylerflex reviewed Sep 17, 2024

m-bone force-pushed the matt/bay_opt branch from 415efd0 to 8e5af1f Compare September 17, 2024 14:02

tylerflex self-requested a review September 17, 2024 14:33

tylerflex approved these changes Sep 17, 2024

tylerflex reviewed Sep 17, 2024

m-bone force-pushed the matt/bay_opt branch from 8e5af1f to b280a6d Compare September 17, 2024 14:47

m-bone merged commit 931a403 into pre/2.8 Sep 17, 2024
15 checks passed

m-bone deleted the matt/bay_opt branch September 17, 2024 15:57
