
Fix and enable multithreading #32

Merged: 14 commits into automation, Dec 7, 2023

Conversation

@timothyfrankdavies (Collaborator) commented Nov 20, 2023

The 'multithread' config caused the program to freeze on OzStar. I've made one change to fix the freeze, plus three other changes to optimize a little.

The Freeze

The freeze was caused by creating a multiprocessing.Pool and submitting jobs with imap_unordered, but only consuming the results after the pool was closed.

The results from imap_unordered are produced lazily: nothing is guaranteed to have finished until the iterator is consumed, and results are yielded in whatever order they complete. Consuming the iterator only after the pool has closed means it waits forever.

See https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.imap (with imap_unordered just below it) for more info.

I've moved the code that consumes the results into its own function (which still needs some renaming), and that solves the freeze.
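For illustration, a minimal sketch of the broken and fixed patterns (process_row and rows are hypothetical stand-ins, not the pipeline's actual names):

```python
import multiprocessing

def process_row(row):
    return row * 2  # stand-in for the real per-row work

def run_broken(rows):
    with multiprocessing.Pool() as pool:
        results = pool.imap_unordered(process_row, rows)
    # Too late: leaving the 'with' block terminates the workers, so
    # iterating here waits on results that never arrive.
    return list(results)

def run_fixed(rows):
    with multiprocessing.Pool() as pool:
        # Consume the lazy iterator while the pool is still alive.
        return list(pool.imap_unordered(process_row, rows))
```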

Other changes

The other changes are:

  1. Use os.sched_getaffinity(0) instead of multiprocessing.cpu_count() to get the number of CPUs available to the job, rather than the number of CPUs on the machine.
  2. Set a chunksize, which controls how many jobs are handed to (and returned from) each worker process at a time.
  3. Set multithread to true by default in the json configs. In a follow-up, we can change it programmatically instead.

chunksize is a bit of a complicated issue. I'm inclined to do what I've done here: set an optimistic chunksize with some small tests to back it up, and revisit later if needed (see the sketch after the list below). There were comments suggesting a similar approach.

Here are the considerations:

  1. A chunksize of 1 on a large list adds a large number of system calls, as each CPU requests jobs individually.
  2. A chunksize that divides all jobs evenly between CPUs causes fewer system calls, but it can be a problem if jobs take very different times to complete; e.g. if one job takes longer than every other job in the list combined, you'd rather give it a CPU of its own.
  3. A larger chunksize can also increase memory usage, as each worker has to hold the arguments for its whole chunk while working through it.
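For reference (not necessarily what this PR does): when no chunksize is given, Pool.map splits the work into roughly four chunks per worker, while imap_unordered defaults to a chunksize of 1, so a similar heuristic has to be applied by hand. A sketch with illustrative names:

```python
def pick_chunksize(num_jobs, num_workers, chunks_per_worker=4):
    """Middle ground between the extremes above: a handful of chunks per
    worker keeps per-task overhead low without letting one slow job's
    chunk monopolise a whole worker's share of the list."""
    chunksize, extra = divmod(num_jobs, num_workers * chunks_per_worker)
    if extra:
        chunksize += 1
    return max(1, chunksize)

# Hypothetical usage with the pipeline's pool:
# chunksize = pick_chunksize(len(rows), num_processes)
# for result in pool.imap_unordered(process_row, rows, chunksize=chunksize):
#     ...
```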

Tests

The pipeline already has an overall debug timer in the output file; search for "All done in".

With multithreading, the run completes thousands of seconds faster than running single-threaded, and hundreds of seconds faster again when running with additional cores.

With the chunksize set to divide jobs evenly, I saw a ~40 second improvement; I'm not sure whether that's beyond the margin of error.

There are small differences in the logs before and after, but I think they're insignificant (e.g. differences from adding numbers in different orders). It might be worth running a test and plotting the results to be sure.

Commits

Processes results within scope of multiprocessing pool
Sets chunksize to address possible performance concerns
Requests only available CPUs
Remove now redundant comments
Remove unused parameters, use consistent functions for imap_unordered.
Rename intermediate variables.
@timothyfrankdavies timothyfrankdavies changed the base branch from master to automation November 23, 2023 02:04
@timothyfrankdavies (Collaborator, Author) commented:

I've merged the fixes from #33 and repeated the tests.

I realised that my old tests hadn't configured the single-core runs correctly.

In the new tests, running with a single CPU or with multithread disabled takes roughly the same time, and running with extra cores reduces the processing time substantially; how much depends on the data being processed.

Results are now consistent except for the very last step:
[screenshot of results comparison omitted]

The last file runs with multithread=false, and is the only one with different values.

The differences here don't look significant, so I'd guess it's down to the order in which the floats are operated on, rather than a wrong calculation being done.

@CMartinezLombilla (Collaborator) left a comment:

Apparently all is okay. No need to refactor anything else for the moment, so we keep the same logic for naming the variables and the functions; this could be done at a later stage if we have time.
The chunksize seems to do a reasonable job.
I'll take a look at the extracted spectra and let you know if we're fine to go ahead.

reduction_scripts/params_ns.json (review thread, outdated and resolved)
```diff
@@ -922,7 +915,7 @@ def xcorr_shift_all( packaged_args ):
         this_ref_array[numpy.where(this_init_x_array==item)] = \
             numpy.max(final_lam_bes[loc-2:loc+3])
 
-    return [this_row,this_init_x_array,this_ref_array]
+    return [this_row, this_ref_array]
```

@CMartinezLombilla (Collaborator) commented:

The new function and the refactoring look fine to me. Also, I've run some tests and they return what's expected. I'll take a look at the extracted spectra to double-check that all is okay, since some changes affect the wavelength solution and calibration.

@timothyfrankdavies (Collaborator, Author) replied:

Perfect. If there are any issues, I can try disabling different parts of the 'multithread' config in the json to track down where the difference occurs.

@timothyfrankdavies (Collaborator, Author) commented Dec 4, 2023

I've undone "1. Use os.sched_getaffinity(0) instead of multiprocessing.cpu_count()", as it caused a crash on macOS. I've created a follow-up issue for that, though it may be low priority: #36.
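For the follow-up, one portable option could look like the sketch below (just a possibility; #36 may settle on something different). os.sched_getaffinity is only available on some platforms, not macOS, which is presumably why the call crashed there:

```python
import multiprocessing
import os

def available_cpu_count():
    """CPUs available to this job where the platform can tell us (Linux),
    falling back to the machine-wide count elsewhere (e.g. macOS)."""
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return multiprocessing.cpu_count()
```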

@timothyfrankdavies (Collaborator, Author) commented:

There are a few more changes needed before enabling multithreading, but they're a little too much work for one PR, and we've hit the end of the year.

For this PR, I'm disabling multithreading again. A little more work is needed in each section before enabling it:

  1. wave_soln should only run multithreaded if it's faster than single-threaded.
    1. Users should be able to set a max number of threads, and we'd then use num_processes = min(max_threads, multiprocessing.cpu_count()) for the pool.
  2. The cosmic_rays multithreaded path should run in memory instead of using temporary files, and generally follow the same process as wave_soln.
  3. cube_gen causes a small difference in results. It should be traced and fixed.
  4. Currently, any system using the 'spawn' start method for subprocesses is very slow. We either need to disable multithreading when multiprocessing.get_start_method() == 'spawn', or investigate and fix the issue (see the sketch after this list).
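A rough sketch of how points 1.1 and 4 could combine when choosing the worker count (the function name and max_threads are hypothetical, not part of the current config):

```python
import multiprocessing

def choose_num_processes(max_threads, multithread=True):
    """Pick a worker count for the pool: honour the user-set cap, and fall
    back to single-threaded where the 'spawn' start method is in use, since
    that path is currently very slow."""
    if not multithread:
        return 1
    if multiprocessing.get_start_method() == 'spawn':
        # 'spawn' starts each worker from a fresh interpreter, which is the
        # slow case described in point 4; run single-threaded until fixed.
        return 1
    return min(max_threads, multiprocessing.cpu_count())
```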

@timothyfrankdavies (Collaborator, Author) commented:

Quick update: (3) is now fixed, so the pipeline produces identical results whether running single-threaded or multi-threaded. The issue was in the headers loaded by cube_gen.

I'm still inclined to leave multithreading disabled by default until we address the other points.

This PR should be good to go.

@timothyfrankdavies timothyfrankdavies merged commit 906b38b into automation Dec 7, 2023
@timothyfrankdavies timothyfrankdavies deleted the tdavies__fix_multithreading branch December 7, 2023 08:36