Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A better fitting interface #22

Closed
kmnhan opened this issue Apr 15, 2024 · 3 comments · Fixed by #23
Closed

A better fitting interface #22

kmnhan opened this issue Apr 15, 2024 · 3 comments · Fixed by #23
Assignees
Labels
enhancement New feature or request

Comments

@kmnhan
Copy link
Owner

kmnhan commented Apr 15, 2024

We removed guess_fit since fitting is not magic, and users should be aware of what the initial parameters are. “Explicit is better than implicit.”

Arguably, this introduces a little bit of inconvenience to the fitting workflow since we need to specify independent vars.

Another shortcoming is that the PyARPES approach of creating Xarray objects containing ModelResults has its advantages, but placing non-picklable objects into NetCDF-like structures is counterintuitive and could be misleading, but I can’t think of a better alternative…

@kmnhan kmnhan self-assigned this Apr 15, 2024
@kmnhan kmnhan added the enhancement New feature or request label Apr 16, 2024
@kmnhan kmnhan mentioned this issue Apr 16, 2024
4 tasks
@kmnhan
Copy link
Owner Author

kmnhan commented Apr 16, 2024

Maybe add a callable accessor named lmfit or qfit that closely follows DataArray.curvefit syntax but takes an lmfit model. I think the best pythonic approach would be to use apply_ufunc, but we'll have to see how performant it is when conducting parallel fits.

kmnhan added a commit that referenced this issue Apr 17, 2024
Add a `Dataset.modelfit` and `DataArray.modelfit` accessor with similar syntax and output as `Dataset.curvefit`. Closes #22
@kmnhan
Copy link
Owner Author

kmnhan commented Apr 17, 2024

An initial version of a callable accessor based on apply_ufunc has been added with e06982d as modelfit. Slower than joblib parallelization but faster than expected, should return in a few seconds for couple hundred well-conditioned fits.

It is very versatile but not as easy to use as I thought. Returns the best fit coeffs, their stderr, and goodness of fit statistics. I tried to make it return the covariance matrix and the number of variables and the initial parameters, but this is difficult since they may differ for each fit. One idea is to write them in terms of params, and leave the unrelated variables as NaN... should see how ambiguous this is to the user.

On the other hand, it should be feasible to store the y-values, try to implement that. Maybe apply_ufunc has a nice way of handling it.

kmnhan added a commit that referenced this issue Apr 18, 2024
Add a `Dataset.modelfit` and `DataArray.modelfit` accessor with similar syntax and output as `Dataset.curvefit`. Closes #22
@kmnhan
Copy link
Owner Author

kmnhan commented Apr 20, 2024

Should close with 0f7a1e0.
Added covariance matrix and modelresult object (optional) to output. Parallelization is implemented by converting to dataset and parallelizing over data_vars, may not be the most efficient way but it works!

@kmnhan kmnhan linked a pull request Apr 20, 2024 that will close this issue
4 tasks
kmnhan added a commit that referenced this issue Apr 22, 2024
Added new interface for fitting, see #22 for discussions.
Made loader argument optional for `erlab.io.loader_context` so it can be used to just change the data directory.
Momentum conversion has been rewritten using `xarray.apply_ufunc`, and is now dask-compatible. It also automatically determines the current energy axis (kinetic or binding).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant