Optimize find bins fix issue 993 #1000

iprafols · 2023-05-16T09:56:37Z

This PR fixes issue #993 and optimizes the function find_bins to use a direct calculation instead of a search. The results differ from master because I changed the binning of the wavelength grids in the computation of the expected flux, so I updated the test suite. Below I provide a comparison of the results of this branch with those from master for a fugu run.

Timing comparison (over 3 runs):

Variance statistics comparison:

Delta stack comparison:

Quasar continuum comparison:

Fits metadata comparison:

Lya autocorrelation comparison:

iprafols · 2023-05-16T14:26:17Z

Note that the tests are failing due to some updates of numba and/or numpy on the GitHub Actions server; the failures are unrelated to the changes here.

Waelthus · 2023-05-17T07:27:18Z

I think it's #1001. I'll have a look through all positions where this happens and will open another PR

Waelthus

Looks good overall; see the comments regarding the definition of the rest-frame grid. We will still need to work on fixing the version issues, see #1002.

Waelthus · 2023-05-17T07:43:00Z

py/picca/delta_extraction/astronomical_objects/forest.py

        w1 &= log_lambda >= log_lambda_grid[0] - half_pixel_step
        w1 &= log_lambda < log_lambda_grid[-1] + half_pixel_step
        w1 &= (log_lambda - np.log10(1. + z) >=
-               log_lambda_rest_frame_grid[0] - half_pixel_step_rest_frame)
+               log_lambda_rest_frame_grid[0])


Did we change the definition of the rest-frame grid from pixel centers to edges or vice versa? Or is it an effect of the different find_bins implementation?
Or is this part of the reason we're seeing slightly different outputs?
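To make the question concrete, here is a minimal sketch of how the removed and added lower-bound conditions treat pixels near the first rest-frame grid value. The grid, the step and the pixel positions are hypothetical values chosen for illustration, not taken from picca, and the sketch does not settle which definition the code intends.

```python
import numpy as np

# Hypothetical rest-frame grid in log10(wavelength); illustrative values only.
step = 0.005
log_lambda_rest_frame_grid = 3.0 + step * np.arange(4)
half_pixel_step_rest_frame = step / 2

# Hypothetical rest-frame pixel positions around the first grid value.
log_lambda_rest = np.array([2.9960, 2.9980, 3.0000, 3.0020])

# Removed line: accept pixels down to half a pixel below the first grid value,
# which is what one would write if grid values are pixel centers.
old_mask = (log_lambda_rest >=
            log_lambda_rest_frame_grid[0] - half_pixel_step_rest_frame)

# Added line: accept only pixels at or above the first grid value,
# which is what one would write if the first grid value is a lower edge.
new_mask = log_lambda_rest >= log_lambda_rest_frame_grid[0]

print(old_mask)
print(new_mask)
```

In this toy case the pixel at 2.9980 passes the removed condition but fails the added one, so the two versions select different pixels at the lower boundary.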

Waelthus · 2023-05-17T08:13:06Z

py/picca/delta_extraction/utils.py

-            else:
-                break
+    step = grid_array[1] - grid_array[0]
+    found_bin = ((original_array - grid_array[0]) / step + 0.5).astype(np.int64)


Wouldn't this give an array that is displaced by exactly half a pixel and then cast to int? Is this for avoiding numerical issues from the division? Why not just use (original - grid) // step in that case?
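The difference between the two expressions can be sketched on a toy grid. The helper names `nearest_bin` and `floor_bin` and all values below are hypothetical and only illustrate the arithmetic; they are not picca functions. Adding 0.5 before truncating assigns each value to the nearest grid point (grid values act as pixel centers), whereas floor division assigns it to the grid point at or below it (grid values act as lower edges).

```python
import numpy as np


def nearest_bin(original_array, grid_array):
    """Index of the nearest grid point, as in the expression under review."""
    step = grid_array[1] - grid_array[0]
    # astype truncates toward zero, which equals floor only for non-negative
    # inputs, so values must not lie more than half a step below grid_array[0].
    return ((original_array - grid_array[0]) / step + 0.5).astype(np.int64)


def floor_bin(original_array, grid_array):
    """Index of the grid point at or below each value (floor division)."""
    step = grid_array[1] - grid_array[0]
    return ((original_array - grid_array[0]) // step).astype(np.int64)


# Hypothetical uniform grid and sample values.
grid = 1.0 + 0.5 * np.arange(5)  # 1.0, 1.5, 2.0, 2.5, 3.0
values = np.array([1.0, 1.2, 1.3, 1.74, 1.76, 2.9])

nearest = nearest_bin(values, grid)
floored = floor_bin(values, grid)

# Brute-force reference: nearest grid point by absolute distance.
reference = np.abs(values[:, None] - grid[None, :]).argmin(axis=1)

print(nearest)  # [0 0 1 1 2 4]
print(floored)  # [0 0 0 1 1 3]
```

The two results agree only for values in the lower half of a pixel, so swapping one expression for the other changes the bin assignment by one index for roughly half of the inputs.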

Waelthus · 2023-05-17T08:14:57Z

py/picca/tests/delta_extraction/astronomical_object_tests.py

-            3.01953334, 3.02453334, 3.02953334, 3.03453334, 3.03953334,
-            3.04453334, 3.04953334, 3.05453334, 3.05953334, 3.06453334,
-            3.06953334, 3.07453334
+            3.01703334, 3.02203334, 3.02703334, 3.03203334, 3.03703334,


This is a change of exactly 0.0025, i.e. half a pixel. Did we want to change the definition? See the comment on the other file.
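The half-pixel figure can be checked directly from the numbers in the diff. The sketch below copies the first five removed and added values; pairing them element by element is an assumption, since the excerpt shows only part of the array.

```python
import numpy as np

# Values copied from the diff excerpt above (first five of each side).
old_values = np.array([3.01953334, 3.02453334, 3.02953334, 3.03453334, 3.03953334])
new_values = np.array([3.01703334, 3.02203334, 3.02703334, 3.03203334, 3.03703334])

pixel_step = np.diff(old_values).mean()  # spacing between consecutive values
shift = old_values - new_values          # offset between old and new values

print(pixel_step, shift)
print(np.allclose(shift, pixel_step / 2))
```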

iprafols added 14 commits April 11, 2023 12:55

replaced search by direct computation

a099468

undid variable name change

3a5f083

fixed test suite to reflect change

727a5b6

replaced fix for out-of-bounds indices

e4f12f7

removed floor functions

61dfe4c

updated numba version as otherwise np.clip crashes numba compilation

277ec71

fixed issue

18bc9c6

changed to integer division to avoid casting

0da9eb0

Merge branch 'optimize_find_bins' into optimize_find_bins_fix_issue_993

7addb4c

optimized find bins, fixed problems with rebinning

7f722eb

linted code

0004990

updated test suite

20373c7

Merge branch 'master' into optimize_find_bins_fix_issue_993

b3eba55

fixed merge issues

88292db

iprafols requested a review from Waelthus May 16, 2023 14:26

Waelthus reviewed May 17, 2023

Waelthus approved these changes May 18, 2023

This was referenced May 19, 2023

fix deprecated numpy/numba usages #1002

Merged

fixed issue 993 #994

Closed

Optimize find bins #992

Closed

iprafols added 2 commits May 29, 2023 10:23

Merge branch 'master' into optimize_find_bins_fix_issue_993

6ff0c12

Merge branch 'master' into optimize_find_bins_fix_issue_993

80570bf

iprafols merged commit 677ec37 into master May 29, 2023

iprafols deleted the optimize_find_bins_fix_issue_993 branch May 29, 2023 13:19

Waelthus mentioned this pull request Jan 22, 2024

Large performance issue in find_bins #978

Closed
