corrupted size vs. prev_size #1162

Open
lyw8120 opened this issue Jul 13, 2024 · 0 comments
lyw8120 commented Jul 13, 2024

This is a template for reporting bugs. Please fill in as much information as you can.

Describe your problem

RELION crashed during 3D auto-refine on a subset of the data selected by 3D classification. The crash appears to be memory-related and is hard to debug.
The same data can be refined as a whole dataset and can be classified, but refinement of the subset fails.

Environment:

  • OS: [Red Hat Enterprise Linux Server 7.5]
  • MPI runtime: [e.g. OpenMPI 4.1.6]
  • RELION version [e.g. RELION-5.0-beta-3-commit-7d79f3 (please see the title bar of the GUI)]
  • Memory: [e.g. 500 GB]
  • GPU: [e.g. GTX 1080Ti]

Dataset:

  • Box size: [e.g. 300 px]

  • Pixel size: [e.g. 3.4487 Å/px]

  • Number of particles: [e.g. 3,398]

  • Description: [e.g. A complex protein of about 5 MDa in total]

`which relion_refine_mpi` --o Refine3D/job101/run --auto_refine --split_random_halves --i Class3D/job092/c2.star --tomograms Tomograms/job020/tomograms.star --ref Class3D/job092/run_it019_class002.mrc --ini_high 30 --dont_combine_weights_via_disc --scratch_dir /gpu_temp --pool 10 --pad 1 --ctf --particle_diameter 650 --flatten_solvent --zero_mask --solvent_mask Reference/it5bin1_clean_flipz_mask.mrc --oversampling 1 --healpix_order 3 --auto_local_healpix_order 4 --offset_range 10 --offset_step 2 --sym C1 --low_resol_join_halves 40 --norm --scale --j 2 --gpu "0:1:2:3" --maxsig 9999 --pipeline_control Refine3D/job101/


**Error message:**


*** Error in `/programs/relion/buildv5/bin/relion_refine_mpi': corrupted size vs. prev_size: 0x0000000005417ef0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7f574)[0x7f5336729574]
/lib64/libc.so.6(+0x8166b)[0x7f533672b66b]
/lib64/libcuda.so.1(+0x22047b)[0x7f5318de747b]
/lib64/libcuda.so.1(+0x2f254a)[0x7f5318eb954a]
/programs/relion/buildv5/bin/relion_refine_mpi[0x8a871e]
/programs/relion/buildv5/bin/relion_refine_mpi[0x8d8ec8]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN19CudaCustomAllocator5AllocD1Ev+0x4c)[0x5b4d30]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN19CudaCustomAllocator5_freeEPNS_5AllocE+0xbf)[0x5b511b]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN19CudaCustomAllocator16_freeReadyAllocsEv+0x8e)[0x5b4ee0]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN19CudaCustomAllocator5allocEm+0x34)[0x5e4ab6]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN6AccPtrIfE11deviceAllocEv+0x8b)[0x845b7f]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN6AccPtrIfE8accAllocEv+0x3c)[0x85f5ec]
/programs/relion/buildv5/bin/relion_refine_mpi(_Z27getFourierTransformsAndCtfsI15MlOptimiserCudaEvlR21OptimisationParamtersR18SamplingParametersP11MlOptimiserPT_13AccPtrFactoryi+0x3105)[0x84d52e]
/programs/relion/buildv5/bin/relion_refine_mpi(_Z27accDoExpectationOneParticleI15MlOptimiserCudaEvPT_mi13AccPtrFactory+0x427)[0x847284]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN15MlOptimiserCuda32doThreadExpectationSomeParticlesEi+0x118)[0x836f18]
/programs/relion/buildv5/bin/relion_refine_mpi(_Z36globalThreadExpectationSomeParticlesPvi+0x6f)[0x775841]
/programs/relion/buildv5/bin/relion_refine_mpi[0x7d00f9]
/share/base/gcc/9.1.0/lib64/libgomp.so.1(+0x19cb6)[0x7f5336ca8cb6]
/lib64/libpthread.so.0(+0x7dd5)[0x7f53377afdd5]
/lib64/libc.so.6(clone+0x6d)[0x7f53367a8b3d]
======= Memory map: ========
00400000-00d00000 r-xp 00000000 00:2b 102271568 /programs/relion/buildv5/bin/relion_refine_mpi
00f00000-00f04000 r--p 00900000 00:2b 102271568 /programs/relion/buildv5/bin/relion_refine_mpi
00f04000-00f44000 rw-p 00904000 00:2b 102271568 /programs/relion/buildv5/bin/relion_refine_mpi
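
For anyone triaging the backtrace above, the following is a minimal sketch of how the frames could be split into module and symbol so the named functions are easier to pick out. It is an illustration only: `FRAME_RE`, `parse_frames`, and `SAMPLE` are hypothetical names, and the sample lines are copied from the first frames of the trace in this report.

```python
import re

# Hypothetical helper: parses glibc-style backtrace frames of the form
#   module(symbol+0xoffset)[0xaddress]   or   module[0xaddress]
FRAME_RE = re.compile(
    r"^(?P<module>[^(\[]+)"                               # path of the binary or library
    r"(?:\((?P<symbol>[^)+]*)(?:\+0x[0-9a-f]+)?\))?"      # optional (symbol+0xoffset)
    r"\[0x[0-9a-f]+\]$"                                   # absolute address
)

def parse_frames(lines):
    """Return a list of (module, symbol_or_None) for each recognizable frame."""
    frames = []
    for line in lines:
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group("module"), m.group("symbol") or None))
    return frames

# First frames of the trace in this report (frames without a symbol omitted for brevity).
SAMPLE = """\
/lib64/libc.so.6(+0x7f574)[0x7f5336729574]
/lib64/libcuda.so.1(+0x22047b)[0x7f5318de747b]
/programs/relion/buildv5/bin/relion_refine_mpi[0x8a871e]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN19CudaCustomAllocator5AllocD1Ev+0x4c)[0x5b4d30]
/programs/relion/buildv5/bin/relion_refine_mpi(_ZN19CudaCustomAllocator5_freeEPNS_5AllocE+0xbf)[0x5b511b]
"""

if __name__ == "__main__":
    for module, symbol in parse_frames(SAMPLE.splitlines()):
        print(module.rsplit("/", 1)[-1], symbol or "<no symbol>")
```

The mangled names can then be passed through a demangler such as `c++filt` from binutils; the first named frame should correspond to the destructor of `CudaCustomAllocator::Alloc`, which suggests the heap check fired while the custom GPU allocator was releasing a block inside the CUDA driver library.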