Bug fixes for 0.3.3 #61

Merged
87 commits
dc3b606
Post-release developer version numbering
drowenhorst-nrl Feb 2, 2024
73a9eea
Merge branch 'main' into develop
drowenhorst-nrl Feb 2, 2024
c3edccf
Update to work better with rectangular or odd shaped detectors.
drowenhorst-nrl Feb 10, 2024
cb83221
Fix type error
drowenhorst-nrl Feb 12, 2024
4aa6d2e
Fix type error
drowenhorst-nrl Feb 12, 2024
811c7bf
Attempt to fix hung processes, should be more robust.
drowenhorst-nrl Feb 26, 2024
6ffa95a
Code cleanup
drowenhorst-nrl Apr 18, 2024
a230d07
Make default to read ubyte/uint16 and convert to float32 on GPU.
drowenhorst-nrl Apr 18, 2024
de16f75
Fix ray issue with making a copy of the pattern array.
drowenhorst-nrl Apr 18, 2024
6dcbedf
Added dependabot to update actions.
drowenhorst-nrl Apr 18, 2024
8111b7c
Update tests.yml
drowenhorst-nrl Apr 18, 2024
39c4ca7
Update tests.yml
drowenhorst-nrl Apr 18, 2024
4800c19
Merge commit '307e94beb748518d385aa1989e2a57554294da91' into develop
drowenhorst-nrl Apr 19, 2024
a653e97
Moved transpose of images for GPU pipeline to GPU processing.
drowenhorst-nrl Apr 19, 2024
62549c3
Fixed typo
drowenhorst-nrl Apr 23, 2024
00cacd8
Begin wrapping bandindexing in larger loops
drowenhorst-nrl Apr 23, 2024
f18bf5f
Fix Typo
drowenhorst-nrl Apr 23, 2024
40eb17c
Wrapping band indexing into larger loops
drowenhorst-nrl Apr 23, 2024
822a8f8
Wrap bandindexing in to larger loops. Checkpoint.
drowenhorst-nrl Apr 24, 2024
88a55aa
Wrapped band indexing in to bigger loops.
drowenhorst-nrl Apr 25, 2024
6f690bf
Include requirement for numba >= 0.53
drowenhorst-nrl Apr 28, 2024
b500a96
Merge branch 'main' into develop
drowenhorst-nrl Apr 28, 2024
4fd7ff0
Update tests for newer action versions.
drowenhorst-nrl Apr 29, 2024
840b767
Update dependabot.yml
drowenhorst-nrl Apr 29, 2024
8fc8f6c
Update dependabot.yml
drowenhorst-nrl Apr 29, 2024
99af42c
Hard code 3-element sort in numba functions for more speed.
drowenhorst-nrl Apr 29, 2024
d7a51cb
Changed Numba cache location, should be more robust, survive system r…
drowenhorst-nrl Apr 30, 2024
d86568e
Removed band_vote as it is now part of tripletvote
drowenhorst-nrl Apr 30, 2024
345d93b
Update pairvote function ... it's fast but less accurate. There for c…
drowenhorst-nrl Apr 30, 2024
7e06911
First attempt at chunker
drowenhorst-nrl May 2, 2024
cb7eafc
Fixed error in chunker
drowenhorst-nrl May 2, 2024
ccdafa7
Start of sigma calc cl
drowenhorst-nrl May 3, 2024
6ecd3df
Checkpoint
drowenhorst-nrl May 3, 2024
d4e5904
checkpoint
drowenhorst-nrl May 3, 2024
f647b43
chechpoint
drowenhorst-nrl May 3, 2024
2f615dd
First working version of openCL calc sigma function
drowenhorst-nrl May 7, 2024
5b249f6
First working GPU NLPAR
drowenhorst-nrl May 8, 2024
dcf12e6
First working version of GPU optimize lam
drowenhorst-nrl May 8, 2024
0bd0656
Cleanup
drowenhorst-nrl May 9, 2024
3a75ee7
Fixed chunking in nlpar_cl
drowenhorst-nrl May 10, 2024
30034d4
Use clflush to make sure GPU is ready for calculations.
drowenhorst-nrl May 10, 2024
3ef5287
release queues.
drowenhorst-nrl May 10, 2024
a54476b
Code cleanup
drowenhorst-nrl May 10, 2024
d1ce5c7
First attempt at distributed GPU NPPAR
drowenhorst-nrl May 11, 2024
74ebbf7
Checkpoint
drowenhorst-nrl May 11, 2024
b8dda87
Checkpoint
drowenhorst-nrl May 11, 2024
c8a2389
Attempt fallback for when on integrated graphics to non-parallel opencl
drowenhorst-nrl May 12, 2024
4813839
Checkpoint
drowenhorst-nrl May 12, 2024
6cb5474
checkpoint
drowenhorst-nrl May 12, 2024
df9c31f
Checkpoint
drowenhorst-nrl May 12, 2024
d0ec7f5
Checkpoint
drowenhorst-nrl May 12, 2024
b117a9e
Checkpoint
drowenhorst-nrl May 12, 2024
27f825d
Optimize lambda working for opencl.
drowenhorst-nrl May 13, 2024
e73288f
Code cleanup
drowenhorst-nrl May 13, 2024
a93aa3b
Fixed memory allocations.
drowenhorst-nrl May 13, 2024
ebefcae
Rebalance memory for distributed.
drowenhorst-nrl May 13, 2024
09eeff0
Proper row/col order
drowenhorst-nrl May 13, 2024
73002c9
More user friendly function calling.
drowenhorst-nrl May 13, 2024
b843494
Maintain backwards compatibility with tutorials.
drowenhorst-nrl May 13, 2024
e125c25
Checkpoint
drowenhorst-nrl May 14, 2024
9cdd9c6
First attempt at distributed sigma calculation.
drowenhorst-nrl May 14, 2024
a958f5d
Sub in distributed sigma calc
drowenhorst-nrl May 14, 2024
87ca0ab
Correct method calling.
drowenhorst-nrl May 14, 2024
1f35742
Speed up lambda opt for large scans by sampling.
drowenhorst-nrl May 14, 2024
4fbf081
Improved reporting on NLPAR_CL
drowenhorst-nrl May 18, 2024
3d15720
Refactor gpuid --> gpu_id to match rest of package.
drowenhorst-nrl May 20, 2024
ba4c02e
Correct sigma reporting
drowenhorst-nrl May 20, 2024
cd8e3d7
Implement auto-check of ray/pyopencl for NLPAR.
drowenhorst-nrl May 22, 2024
932aea8
Update license.
drowenhorst-nrl May 22, 2024
4575d2c
Repair typos
drowenhorst-nrl May 22, 2024
e988a6f
Perform full check of getting GPU at import.
drowenhorst-nrl May 22, 2024
c2bf8bb
Correct gpu test exception.
drowenhorst-nrl May 22, 2024
057efe2
Try to fix again.
drowenhorst-nrl May 22, 2024
f66bc24
Check
drowenhorst-nrl May 22, 2024
3c03d8c
Need to look at this more later.
drowenhorst-nrl May 22, 2024
d2cf69e
Merge branch 'main' into develop
drowenhorst-nrl May 22, 2024
7da8c0d
Merge branch 'main' into develop
drowenhorst-nrl May 23, 2024
a24f7a6
Merge branch 'main' into develop
drowenhorst-nrl May 24, 2024
63155d3
Merge branch 'main' into develop
drowenhorst-nrl May 30, 2024
dfb2273
Merge branch 'main' into develop
drowenhorst-nrl May 31, 2024
fe7a803
Bug fix
drowenhorst-nrl Jun 1, 2024
2444523
Fix IPF color so that the number of columns/rows are automatically
drowenhorst-nrl Jun 5, 2024
3a73165
Fix to nlpar chunker for edge case with larger overlaps.
drowenhorst-nrl Jun 6, 2024
323a802
More conservative memory allocation.
drowenhorst-nrl Jun 6, 2024
dc01ab0
Reduce number of GPU queues.
drowenhorst-nrl Jun 6, 2024
65de924
Another memory adjustment
drowenhorst-nrl Jun 7, 2024
bfe1ae3
Prepare for 0.3.3 release.
drowenhorst-nrl Jun 7, 2024
11 changes: 11 additions & 0 deletions CHANGELOG.rst
@@ -5,6 +5,17 @@ Changelog
All notable changes to PyEBSDIndex will be documented in this file. The format is based
on `Keep a Changelog <https://keepachangelog.com/en/1.1.0>`_.

0.3.3 (2024-06-07)
==================

Fixed
-----
- Fixed edge case for NLPAR chunking of scans that would lead to a crash.
- Fixed issue where PyEBSDIndex would not use all GPUs by default.
- ``IPFColor.makeipf()`` will now automatically read the number of columns/rows in the scan from the file defined in the indexer object.



0.3.2 (2024-05-31)
==================

474 changes: 62 additions & 412 deletions doc/tutorials/ebsd_index_demo.ipynb

Large diffs are not rendered by default.

25 changes: 17 additions & 8 deletions pyebsdindex/EBSDImage/IPFcolor.py
@@ -50,17 +50,26 @@ def makeipf(ebsddata, indexer, vector=np.array([0,0,1.0]), xsize = None, ysize =

if xsize is not None:
xsize = int(xsize)
if ysize is None:
ysize = int(npoints // xsize + np.int64((npoints % xsize) > 0))
#if ysize is None:
#print(ysize)
else:
xsize = int(npoints)
ysize = 1
xsize = indexer.fID.nCols
#xsize = int(npoints)
#ysize = 1

npts = int(npoints)
if int(xsize*ysize) < npoints:
npts = int(xsize*ysize)
ipf_out = ipfout[0:npts,:].reshape(ysize, xsize,3)
if ysize is not None:
ysize = int(ysize)
else:
ysize = int(npoints // xsize + np.int64((npoints % xsize) > 0))


ipf_out = np.zeros((ysize, xsize,3), dtype=np.float32)
ipf_out = ipf_out.flatten()
npts = min(int(npoints), int(xsize*ysize))
# if int(xsize*ysize) < npoints:
# npts = int(xsize*ysize)
ipf_out[0:npts*3] = ipfout[0:npts,:].flatten()
ipf_out = ipf_out.reshape(ysize, xsize, 3)
return ipf_out
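
The reshaping logic in this hunk can be sketched in isolation. The following is a minimal sketch, not the package's code: `reshape_ipf` is a hypothetical helper name, and the column count is passed in directly rather than read from `indexer.fID.nCols`. It shows how a flat list of per-point colors is packed into a zero-padded `(ysize, xsize, 3)` image when the point count is not an exact multiple of the row width.

```python
import numpy as np


def reshape_ipf(ipf_flat, xsize, ysize=None):
    """Pack an (npoints, 3) color array into a (ysize, xsize, 3) image.

    Hypothetical helper for illustration; trailing pixels are zero-padded.
    """
    npoints = ipf_flat.shape[0]
    xsize = int(xsize)
    if ysize is None:
        # Round up so a partially filled last row still gets a row.
        ysize = npoints // xsize + int(npoints % xsize > 0)
    ysize = int(ysize)

    out = np.zeros(ysize * xsize * 3, dtype=np.float32)
    # Copy only as many points as fit in the requested grid.
    npts = min(npoints, xsize * ysize)
    out[: npts * 3] = ipf_flat[:npts, :].ravel()
    return out.reshape(ysize, xsize, 3)


colors = np.arange(30, dtype=np.float32).reshape(10, 3)  # 10 points
img = reshape_ipf(colors, xsize=4)  # 10 points on a 4-wide grid -> 3 rows
```

Allocating the padded buffer first, then copying, avoids the reshape error that a direct `reshape(ysize, xsize, 3)` would raise when `npoints` is smaller than `xsize * ysize`.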


2 changes: 1 addition & 1 deletion pyebsdindex/__init__.py
@@ -7,7 +7,7 @@
]
__description__ = "Python based tool for Radon based EBSD indexing"
__name__ = "pyebsdindex"
__version__ = "0.3.2"
__version__ = "0.3.3"


# Try to import only once - also will perform check that at least one GPU is found.
1 change: 1 addition & 0 deletions pyebsdindex/_ebsd_index_parallel.py
@@ -294,6 +294,7 @@ def index_pats_distributed(
else:
if ngpu is None:
ngpu = len(clparam.gpu)
gpu_id = np.arange(ngpu, dtype=int)
cudagpuvis = ''
for cdgpu in range(len(clparam.gpu)):
cudagpuvis += str(cdgpu)+','
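
A rough sketch of the default GPU selection this hunk appears to implement. Because indentation is lost in this view, the exact nesting of the added `gpu_id` line is an assumption; `default_gpu_selection` and `n_available` are hypothetical names standing in for the surrounding function and `len(clparam.gpu)`.

```python
import numpy as np


def default_gpu_selection(n_available, ngpu=None):
    """Return the GPU ids to use and a comma-separated visibility string.

    Hypothetical helper: when ngpu is not given, every detected GPU is used.
    """
    if ngpu is None:
        ngpu = n_available
    gpu_id = np.arange(ngpu, dtype=int)
    # One entry per detected device, e.g. "0,1,2".
    visible = ",".join(str(i) for i in range(n_available))
    return gpu_id, visible


gpu_id, visible = default_gpu_selection(3)
```

Unlike the loop in the diff, this sketch builds the string with `join`, so it has no trailing comma.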
77 changes: 39 additions & 38 deletions pyebsdindex/nlpar_cpu.py
@@ -389,7 +389,7 @@ def calcnlpar(self, chunksize=0, searchradius=None, lam = None, dthresh = None,
rescale = False

nthreadpos = numba.get_num_threads()
#numba.set_num_threads(36)
#numba.set_num_threads(18)
colstartcount = np.asarray([0,ncols],dtype=np.int64)
if verbose >= 1:
print("lambda:", self.lam, "search radius:", self.searchradius, "dthresh:", self.dthresh)
@@ -756,50 +756,51 @@ def _calcchunks(self, patdim, ncol, nrow, target_bytes=2e9, col_overlap=0, row_o
rowstepov = min(rowstep + 2 * row_overlap, nrow)

# colchunks = np.round(np.arange(ncolchunks+1)*ncol/ncolchunks).astype(int)
colchunks = np.zeros((ncolchunks, 2), dtype=int)
colchunks[:, 0] = (np.arange(ncolchunks) * colstep).astype(int)
colchunks[:, 1] = colchunks[:, 0] + colstepov - int(col_overlap)
colchunks[:, 0] -= col_overlap
colchunks[0, 0] = 0;

for i in range(ncolchunks - 1):
if colchunks[i + 1, 0] >= ncol:
colchunks = colchunks[0:i + 1, :]

ncolchunks = colchunks.shape[0]
# colchunks = np.zeros((ncolchunks, 2), dtype=int)
# colchunks[:, 0] = (np.arange(ncolchunks) * colstep).astype(int)
# colchunks[:, 1] = colchunks[:, 0] + colstepov - int(col_overlap)
# colchunks[:, 0] -= col_overlap
# colchunks[0, 0] = 0;

colchunks = []
col_overlap = int(col_overlap)
for c in range(ncolchunks):
cchunk = [int(c * colstep) - col_overlap, int(c * colstep + colstepov) - col_overlap]
colchunks.append(cchunk)
if cchunk[1] > ncol:
break

ncolchunks = len(colchunks)
colchunks = np.array(colchunks, dtype=int)
colchunks[0, 0] = 0
colchunks[-1, 1] = ncol

if ncolchunks > 1:
colchunks[-1, 0] = max(0, colchunks[-2, 1] - col_overlap)

colchunks += col_offset

# colproc = np.zeros((ncolchunks, 2), dtype=int)
# if ncolchunks > 1:
# colproc[1:, 0] = col_overlap
# if ncolchunks > 1:
# colproc[0:, 1] = colchunks[:, 1] - colchunks[:, 0] - col_overlap
# colproc[-1, 1] = colchunks[-1, 1] - colchunks[-1, 0]

# rowchunks = np.round(np.arange(nrowchunks + 1) * nrow / nrowchunks).astype(int)
rowchunks = np.zeros((nrowchunks, 2), dtype=int)
rowchunks[:, 0] = (np.arange(nrowchunks) * rowstep).astype(int)
rowchunks[:, 1] = rowchunks[:, 0] + rowstepov - int(row_overlap)
rowchunks[:, 0] -= row_overlap
rowchunks[0, 0] = 0;

for i in range(nrowchunks - 1):
if rowchunks[i + 1, 0] >= nrow:
rowchunks = rowchunks[0:i + 1, :]

nrowchunks = rowchunks.shape[0]
# for i in range(ncolchunks - 1):
# if colchunks[i + 1, 0] >= ncol:
# colchunks = colchunks[0:i + 1, :]

rowchunks = []
row_overlap = int(row_overlap)
for r in range(nrowchunks):
rchunk = [int(r * rowstep) - row_overlap, int(r * rowstep + rowstepov) - row_overlap]
rowchunks.append(rchunk)
if rchunk[1] > nrow:
break

nrowchunks = len(rowchunks)
rowchunks = np.array(rowchunks, dtype=int)
rowchunks[0, 0] = 0
rowchunks[-1, 1] = nrow

rowchunks += row_offset
if nrowchunks > 1:
rowchunks[-1, 0] = max(0, rowchunks[-2, 1] - row_overlap)

# rowproc = np.zeros((nrowchunks, 2), dtype=int)
# if nrowchunks > 1:
# rowproc[1:, 0] = row_overlap
# if nrowchunks > 1:
# rowproc[0:, 1] = rowchunks[:, 1] - rowchunks[:, 0] - row_overlap
# rowproc[-1, 1] = rowchunks[-1, 1] - rowchunks[-1, 0]
rowchunks += row_offset

return ncolchunks, nrowchunks, colchunks, rowchunks
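
The new column and row loops follow the same pattern, which can be sketched for a single axis. This is a simplified sketch under stated assumptions: `chunk_axis` is a hypothetical name, and the step count is derived here with a plain ceiling division, whereas the real method computes `ncolchunks`, `colstep`, and their row equivalents earlier from `target_bytes` (not shown in this hunk). How the caller trims the overlap regions afterwards is also outside this sketch.

```python
import numpy as np


def chunk_axis(n, step, overlap=0, offset=0):
    """Split range(n) into [start, end) chunks with overlap on each side.

    Hypothetical one-axis version of the loop in the diff.
    """
    overlap = int(overlap)
    max_chunks = max(int(np.ceil(n / step)), 1)  # assumption, see lead-in
    stepov = min(step + 2 * overlap, n)

    chunks = []
    for c in range(max_chunks):
        chunk = [int(c * step) - overlap, int(c * step + stepov) - overlap]
        chunks.append(chunk)
        if chunk[1] > n:
            # This chunk already runs past the end, so it is the last one.
            break

    chunks = np.array(chunks, dtype=int)
    chunks[0, 0] = 0   # first chunk has no leading overlap
    chunks[-1, 1] = n  # clamp the last chunk to the axis length
    if len(chunks) > 1:
        chunks[-1, 0] = max(0, chunks[-2, 1] - overlap)
    return chunks + offset


small_overlap = chunk_axis(10, step=4, overlap=1)
large_overlap = chunk_axis(10, step=4, overlap=3)
```

With the larger overlap the loop stops after two chunks instead of three, which is the kind of case the changelog describes as an edge case with larger overlaps.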

11 changes: 7 additions & 4 deletions pyebsdindex/opencl/nlpar_cl.py
@@ -106,7 +106,7 @@ def loptfunc(lam, d2, tw, dthresh):
lamopt_values = []

sigma, d2, n2 = self.calcsigma(nn=1, saturation_protect=saturation_protect, automask=automask, normalize_d=True,
return_nndist=True)
return_nndist=True, **kwargs)

#sigmapad = np.pad(sigma, 1, mode='reflect')
#d2normcl(d2, n2, sigmapad)
@@ -133,7 +133,7 @@ def loptfunc(lam, d2, tw, dthresh):
return lamopt_values.flatten()


def calcsigma_cl(self,nn=1,saturation_protect=True,automask=True, normalize_d=False, gpu_id = None, **kwargs):
def calcsigma_cl(self,nn=1,saturation_protect=True,automask=True, normalize_d=False, gpu_id = None, verbose = 2, **kwargs):
self.sigmann = nn
if self.sigmann > 7:
print("Sigma optimization search limited to a search radius <= 7")
@@ -222,7 +222,8 @@ def calcsigma_cl(self,nn=1,saturation_protect=True,automask=True, normalize_d=Fa
#count_local = cl.LocalMemory(nnn*npadmx*4)
count_local = cl.Buffer(ctx, mf.READ_WRITE, size=int(mxchunk * nnn * 4))
countchunk = np.zeros((mxchunk, nnn), dtype=np.float32)

ndone = 0
nchunks = int(chunks[1] * chunks[0])
for rowchunk in range(chunks[1]):
rstart = chunks[3][rowchunk, 0]
rend = chunks[3][rowchunk, 1]
@@ -289,7 +290,9 @@ def calcsigma_cl(self,nn=1,saturation_protect=True,automask=True, normalize_d=Fa
countnn[rstart:rend, cstart:cend] = countchunk[0:int(ncolchunk*nrowchunk), :].reshape(nrowchunk, ncolchunk, nnn)
dist[rstart:rend, cstart:cend] = distchunk[0:int(ncolchunk*nrowchunk), :].reshape(nrowchunk, ncolchunk, nnn)
sigma[rstart:rend, cstart:cend] = np.minimum(sigma[rstart:rend, cstart:cend], sigmachunk)

if verbose >= 2:
print("tiles complete: ", ndone, "/", nchunks, sep='', end='\r')
ndone +=1
dist_local.release()
count_local.release()
datapad_gpu.release()
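
The progress counter added in this hunk can be illustrated without any OpenCL. The sketch below is an assumption-laden stand-in: `tile_progress` is a hypothetical function, the tile work is omitted, and the counter is incremented before printing so that the last message reads `n/n`, which differs from the order shown in the diff.

```python
def tile_progress(n_row_chunks, n_col_chunks, verbose=2):
    """Walk a grid of tiles and report progress on a single console line.

    Hypothetical helper; returns the messages so they can be inspected.
    """
    nchunks = int(n_row_chunks * n_col_chunks)
    ndone = 0
    messages = []
    for _row in range(n_row_chunks):
        for _col in range(n_col_chunks):
            # ... per-tile work would happen here ...
            ndone += 1
            msg = f"tiles complete: {ndone}/{nchunks}"
            messages.append(msg)
            if verbose >= 2:
                # '\r' returns to the line start so the next print overwrites it.
                print(msg, end="\r")
    if verbose >= 2:
        print()  # move off the progress line when finished
    return messages


progress = tile_progress(2, 3, verbose=0)
```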
2 changes: 1 addition & 1 deletion pyebsdindex/opencl/nlpar_clray.py
@@ -479,7 +479,7 @@ def calcnlpar_clray(self, searchradius=None, lam = None, dthresh = None, saturat
rescale = rescale,
gpu_id= gpu_id)

target_mem = clparams.gpu[gpu_id].max_mem_alloc_size//3
target_mem = clparams.gpu[gpu_id].max_mem_alloc_size//6
max_mem = clparams.gpu[gpu_id].global_mem_size*0.4
if target_mem*ngpuwrker > max_mem:
target_mem = max_mem/ngpuwrker
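
The memory budget arithmetic in this hunk can be sketched as a standalone function. This is only an illustration: `per_worker_target` and its parameter names are hypothetical, and the device limits are passed in as plain numbers instead of being read from `clparams.gpu[gpu_id]`.

```python
import math


def per_worker_target(max_alloc, global_mem, nworkers,
                      divisor=6, global_fraction=0.4):
    """Return the per-worker memory target in bytes.

    Hypothetical helper mirroring the arithmetic in the diff: start from a
    fraction of the largest single allocation, then shrink it if all workers
    together would exceed a fraction of the device's global memory.
    """
    target = max_alloc // divisor
    max_mem = global_mem * global_fraction
    if target * nworkers > max_mem:
        target = max_mem / nworkers
    return target


GIB = 2 ** 30
few_workers = per_worker_target(4 * GIB, 16 * GIB, nworkers=4)
many_workers = per_worker_target(4 * GIB, 16 * GIB, nworkers=12)
```

With four workers the allocation-based target is used as is; with twelve, the total would exceed 40% of global memory, so the target is reduced to an equal share of that cap.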