-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Polynomial1D model not fitting at all, and other models are not matching gbtidl #86
Comments
GBTIDL fitting code: /home/gbtidl/2.10.1/gbtidl/pro/toolbox/ortho_fit.pro `;+ This code came from Tom Bania's GBT_IDL work. Local
; ; ; @param xx {in}{required} The x-values to use in the fit. ; @param yy {in}{required} The data to be fit at xx. ; @param nfit {in}{required}{type=integer} The order of the polynomial to fit. ; @param cfit {out}{required} On return, cfit has coefficients of ; the polynomial ;
; because of round-off using this form gives unreliable results ; above about order 15 even for double precision. See <a ; href="ortho_poly.html">ortho_poly for a better discussion on the ; contents of cfit. ; @param rms {out}{required} The rms error for each polynomial up to ; order nfit. ; ; @returns polyfit array: contains information for fitting to arbitrary ; points using the recursion relations ; ; @examples ; fit a 3rd order polynomial to the data in !g.s[0] ;
; ; @Version $Id$ ;- function ortho_fit,xx,yy,nfit,cfit,rms
end` |
I've verified that specutils is treating masked parts of the spectrum differently. After recreating ortho_fit in my jupyter notebook, I can confirm that it produces the same outputs as gbtidl - for an unmasked spectrum. I can produce a match between dysh and gbtidl for ABGT17A_404 scan 19 if there are no regions selected excluded in either - and the differences are no greater than 1e-9. If I just use one region to fit (channels 0-750 selected in gbtidl, 751-819 excluded in dysh), it produces a match with no differences greater than 1e-4. If I go back to using the full region selection before, the differences reach <1e-2, with <1e-3 in the selected regions of the spectrum. In this case, I'm selecting the (100, 380), (450,750) regions in gbtidl and excluding [(0,99),(379,449),(749,819)] in dysh. The difference between the spectra is functional, shown by the green line below. Changing the dysh exclude regions by +/- 1 channel on either side (e.g. (381,451 instead of 379,449 and any permutation) doesn't result in any large difference, and certainly doesn't let us match gbtidl. Also, the Polynomial1D errors cropping up above can be solved by using the LevMarLSQFitter or LMLSQFitter options. |
This is all very interesting as well as disturbing. Have you done the test where you create a spectrum with a known polynomial baseline added plus gaussian noise and see if both GBTIDL and dysh recover the polynomial? And repeat test with and without channel exclusion. |
It appears that I was using the wrong boundaries to fit in dysh compared to GBTIDL. After fixing this, it looks like numpy polyfit returns the same values as the python implementation of ortho_fit.pro and the GBTIDL result. I suggest using numpy polyfit as a baseline fitting tool. |
Use numpy.polyfit for polynomial models, astropy modeling functions for the rest. |
Double check that GBTIDL is accurate by using synthetic data. |
This code:
returns total errors on order 1e-6 (1.6e-9 per channel, and baselined spectrum has 0.018 standard deviation) for each of the model cases above. This is for the AGBT17A_404_01 scan 19 test data set. |
*note, you must pass the "window = [0,xx[-1]]" argument to the polynomial fit function to map the returned series' domain correctly. This will return the same coefficients as the GBTIDL format (domain = channel index instead of [-1,1]). Currently working on testing fake polynomials. |
I am finding good agreement with polynomial fitting, with a (small) range of polynomial degrees and selection ranges. Chebyshev polynomial fitting is also agreeing, although I am seeing a different on order 1e-(2*n), where n is the order of the given polynomial coefficient. I have not compared the actual spectra to each other since this is starting to get quite time intensive (generating identical arrays in GBTIDL vs. Python, porting spectra between the two platforms, etc.), but they are showing good fits by eye. Also, the numpy polynomial fit routine translates the dataset to the domain [-1, 1], which may be useful for higher-order fits when the x-axis has a large magnitude (chan number, Hz) |
The problem is indeed the specutils inclusion region error as found by issue 378. With that fix, the specutils baseline fitting returns the same precision as the numpy and python-based gbtidl ortho_fit implementation, in all cases except when using the Polynomial model, which still gives the "Fit may be poorly conditioned" warning. |
A possibility, that may not require modifying |
First issue: Near misses
Using 'legendre' baseline model option fits nearly the same baseline as gbtidl, with most deviation at the edges of the baseline. The attached plot shows there is a difference in at least the third order between Dysh legendre and gbtidl, shown by the red line. (This is example dataset AGBT17A_404_...)
Comparing the coefficients themselves is apples to oranges since dysh is using a Legendre model while gbtidl is using some form of basic polynomial sum(m) a_m x^m. Using the Polynomial1D model from the dysh side results in the second issue.
Second Issue: Polynomial1D fits are not working
A 2nd or 3rd order polynomial fit was applied to the data below, respectively. The 2nd order fit seemed to just find an average (indeed, all coefficients after the first are 0) and the 3rd order fit applied a straight, sloped line. Trying higher orders resulted in no change. The vertical black lines denote the regions of the fit.
The text was updated successfully, but these errors were encountered: