
Compute Grid Centerpoint using Welzl's algorithm #811

Open · wants to merge 51 commits into main

Conversation

@rajeeja (Contributor) commented Jun 11, 2024

No description provided.

@rajeeja rajeeja changed the title DRAFT: Compute Grid Centerpoint using Welzl's algorithm Compute Grid Centerpoint using Welzl's algorithm Jun 26, 2024
@rajeeja rajeeja requested a review from philipc2 June 26, 2024 23:10
@philipc2 (Member) commented Jul 2, 2024

I'm not a big fan of the _ctrpt approach to the properties. These values are still, for example, face_lon and face_lat, they simply use a different algorithm to compute them.

Maybe we should consider the following design?

  • Have the default face_xyz or face_latlon values be either what was parsed from a dataset OR the existing Cartesian averaging
  • Introduce a grid-level Grid.populate_face_coordinates() function (similar to the internal ones that we have) to allow the user to re-populate or set the desired algorithm they'd like to use for the construction.

This would make the workflow look something like the following:

# get the value of face_lon without needing to specify an algorithm, will use either the stored value or cart avg
uxgrid.face_lon

# I want to explicitly set the algorithm to be Welzl
uxgrid.populate_face_coordinates(method='welzl')

# value will now be populated using your approach
uxgrid.face_lon

# I want to re-populate again using cartesian averaging
uxgrid.populate_face_coordinates(method='cartesian average')

# value will now be populated using cartesian average
uxgrid.face_lon

This allows us to not need to define any new properties and to better stick to the UGRID conventions. What do you think?
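For concreteness, here is a hypothetical sketch of how such a dispatch could look; the helper names (_populate_face_centroids, _populate_face_centerpoints) and the method strings are placeholders for illustration, not existing uxarray API:

# Hypothetical sketch of the proposed API, not the actual implementation.
def populate_face_coordinates(uxgrid, method="cartesian average"):
    """Recompute face_lon / face_lat (and face_x / face_y / face_z) on ``uxgrid``
    using the requested algorithm, overwriting any stored values."""
    if method == "cartesian average":
        _populate_face_centroids(uxgrid)      # placeholder: existing averaging routine
    elif method == "welzl":
        _populate_face_centerpoints(uxgrid)   # placeholder: Welzl centerpoint routine
    else:
        raise ValueError(f"Unknown method: {method!r}")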

@philipc2 philipc2 linked an issue Jul 2, 2024 that may be closed by this pull request
philipc2 and others added 4 commits July 3, 2024 12:08
…dependency (use with arcs and arcs use coordinates). o Remove new routine in favor of using the existing angle b/w vectors to calculate distance.
@rajeeja (Contributor, Author) commented Jul 9, 2024

> I'm not a big fan of the _ctrpt approach to the properties. [...] This allows us to not need to define any new properties and to better stick to the UGRID conventions. What do you think?

During my testing, and sometimes when testing the face geometry, both the centerpoint and the centroid might be needed. When working with a mesh, I wanted to check how much one deviated from the other and whether one or the other made more sense.

We might be able to get both with the approach you propose as well, but that would require two calls to populate, one for each option; having both available on the grid object at once might be better.

We can find another name for ctrpt; I don't like it either :)

@philipc2 (Member) commented Aug 1, 2024

> I'm not a big fan of the _ctrpt approach to the properties. [...] What do you think?

> During my testing, and sometimes when testing the face geometry, both the centerpoint and the centroid might be needed. [...] We can find another name for ctrpt; I don't like it either :)

My main concern with breaking up the different types of coordinates into separate attributes is that it'll add extra overhead for us to ensure that the coordinates we read match the ones that we want to store, not to mention needing to redefine/extend the UGRID conventions further past what we've already done. Even with this (and, say, some other method down the line), this could end up looking like:

  • Grid.face_lon
  • Grid.face_lon_centerpoint
  • Grid.face_lon_some_other_definition

Consider the case where two UGRID (or any other format) grid files are loaded into UXarray. If we move forward with a split-attribute approach, we'd need to ensure that the coordinates we are reading go into either face_lat/lon or face_lat/lon_centerpoints. There's also no easy way to determine at load time which method each dataset used to compute the centroids without parsing for specific attributes in the file (if they exist), since this is not outlined in the UGRID conventions.

I'm still in favor of keeping face_lon and face_lat as general variables for storing some coordinate that represents the center/midpoint/centroid, etc. of the face. This does limit us to storing only one type of "center" coordinate at a time, but it ensures that we don't lock ourselves into a strict definition of the center.

@paullric @rljacob Is there ever a scenario where we would want to have more than one definition of a "center" coordinate attached to a grid at a time?

@erogluorhan (Member) left a comment

The code looks good to me. Please address just a few simple comments below.

@erogluorhan (Member) left a comment

The code looks good to me. The only thing remaining was the reduced code coverage, but apparently codecov can't track test cases written for the njit-decorated functions. Once we figure out a path forward with that, I am happy to approve this.

Comment on lines 433 to 595
        Cartesian y coordinate
    z: float
        Cartesian z coordinate


    Returns
    -------
    lon : float
        Longitude in radians
    lat: float
        Latitude in radians
    """

    lon = math.atan2(y, x)
    lat = math.asin(z)

    # set longitude range to [0, 2 * pi]
    lon = np.mod(lon, 2 * np.pi)

    z_mask = np.abs(z) > 1.0 - ERROR_TOLERANCE

    lat = np.where(z_mask, np.sign(z) * np.pi / 2, lat)
    lon = np.where(z_mask, 0.0, lon)

    return lon, lat


def _normalize_xyz(
    x: Union[np.ndarray, float],
    y: Union[np.ndarray, float],
    z: Union[np.ndarray, float],
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Normalizes a set of Cartesian coordinates."""
    denom = np.linalg.norm(
        np.asarray(np.array([x, y, z]), dtype=np.float64), ord=2, axis=0
    )

    x_norm = x / denom
    y_norm = y / denom
    z_norm = z / denom
    return x_norm, y_norm, z_norm


@njit(cache=True)
def _lonlat_rad_to_xyz(
    lon: Union[np.ndarray, float],
    lat: Union[np.ndarray, float],
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Converts Spherical lon and lat coordinates into Cartesian x, y, z
    coordinates."""
    x = np.cos(lon) * np.cos(lat)
    y = np.sin(lon) * np.cos(lat)
    z = np.sin(lat)

    return x, y, z


def _xyz_to_lonlat_deg(
    x: Union[np.ndarray, float],
    y: Union[np.ndarray, float],
    z: Union[np.ndarray, float],
    normalize: bool = True,
) -> tuple[np.ndarray, np.ndarray]:
    """Converts Cartesian x, y, z coordinates into Spherical longitude and
    latitude coordinates in degrees.

    Parameters
    ----------
    x : Union[np.ndarray, float]
        Cartesian x coordinates
    y: Union[np.ndarray, float]
        Cartesian y coordinates
    z: Union[np.ndarray, float]
        Cartesian z coordinates
    normalize: bool
        Flag to select whether to normalize the coordinates

    Returns
    -------
    lon : Union[np.ndarray, float]
        Longitude in degrees
    lat: Union[np.ndarray, float]
        Latitude in degrees
    """
    lon_rad, lat_rad = _xyz_to_lonlat_rad(x, y, z, normalize=normalize)

    lon = np.rad2deg(lon_rad)
    lat = np.rad2deg(lat_rad)

    lon = (lon + 180) % 360 - 180
    return lon, lat


@njit
def _normalize_xyz_scalar(x: float, y: float, z: float):
    denom = np.linalg.norm(np.asarray(np.array([x, y, z]), dtype=np.float64), ord=2)
    x_norm = x / denom
    y_norm = y / denom
    z_norm = z / denom
    return x_norm, y_norm, z_norm

Member:

I'm still a bit hesitant about moving these. Can you try moving them back into coordinates.py and provide a bit more info on the cyclical dependency issue?

Contributor Author:

All tests should fail now; please check if there is a better fix. I thought those belonged better in grid/utils.py, since they can run as utilities and are not tied to other infrastructure.

#     _xyz_to_lonlat_rad,
#     _lonlat_rad_to_xyz,
#     _xyz_to_lonlat_deg,
#     _normalize_xyz,

Contributor Author:

> I'm still a bit hesitant about moving these. Can you try moving them back into coordinates.py and provide a bit more info on the cyclical dependency issue?

coordinates.py needs "from uxarray.grid.arcs import _angle_of_2_vectors", and arcs.py needs the following from coordinates.py:

from uxarray.grid.coordinates import (
    _xyz_to_lonlat_rad_scalar,
    _normalize_xyz_scalar,
)
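For illustration, one common way to break a cycle like this is to move the shared low-level helpers into a module that imports from neither of the two, so that both can depend on it without forming a loop; this mirrors the grid/utils.py move described above. A minimal sketch with a hypothetical module name (not the actual uxarray layout):

# geom_utils.py (hypothetical): shared low-level helpers with no imports
# from coordinates.py or arcs.py, so both modules can import from here.
import numpy as np


def angle_of_2_vectors(u, v):
    # Angle between two 3D vectors via atan2(|u x v|, u . v),
    # which stays numerically stable for very small angles.
    cross = np.cross(u, v)
    return np.arctan2(np.linalg.norm(cross), np.dot(u, v))


# coordinates.py:  from geom_utils import angle_of_2_vectors   (no import from arcs.py)
# arcs.py:         from geom_utils import angle_of_2_vectors   (no import from coordinates.py)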

@rajeeja (Contributor, Author) commented Sep 16, 2024

Once we resolve this circular dependency issue, I will disable Numba to check codecov; it might increase the coverage percentage.

@erogluorhan (Member) commented Sep 17, 2024

> Once we resolve this circular dependency issue, I will disable Numba to check codecov; it might increase the coverage percentage.

How about doing it this way: at the beginning of each individual test case that exercises an njit-decorated function, disable Numba, and at the end, enable it again? That way, whenever we see this kind of disable/enable pair, we know right away that the case is testing an njit-decorated function.
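For illustration, another option that avoids toggling Numba globally is to exercise the dispatcher's .py_func attribute in the test, which runs the original pure-Python implementation that coverage tools can trace. The test below is a hypothetical sketch (the import path follows what is quoted earlier in the thread); the documented global switch, the NUMBA_DISABLE_JIT=1 environment variable, would instead need to be set before the decorated modules are imported:

import numpy as np

from uxarray.grid.coordinates import _normalize_xyz_scalar  # njit-decorated


def test_normalize_xyz_scalar_py_func():
    # .py_func is the original, uncompiled Python function, so codecov sees its lines.
    x, y, z = _normalize_xyz_scalar.py_func(3.0, 4.0, 0.0)
    assert np.isclose(np.sqrt(x**2 + y**2 + z**2), 1.0)

    # The compiled version should agree with the pure-Python one.
    assert np.allclose(_normalize_xyz_scalar(3.0, 4.0, 0.0), (x, y, z))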

@rajeeja (Contributor, Author) commented Sep 17, 2024

> How about doing it this way: at the beginning of each individual test case that exercises an njit-decorated function, disable Numba, and at the end, enable it again?

I removed all the Numba decorations on my local copy, and the coverage (85% total and 95% for coordinates.py; see the screenshot below) is considerably higher for SHA dba53dc8, which is from before I introduced the circular dependency.

[Screenshot 2024-09-17 at 17:37: coverage report]

from uxarray.conventions import ugrid

from typing import Union
from uxarray.grid.arcs import _angle_of_2_vectors
Member:

This appears to be the root cause of the circular import.

To avoid this, can you try using the following in this module when calling this function:

uxarray.grid.arcs._angle_of_2_vectors

Contributor Author:

> This appears to be the root cause of the circular import. To avoid this, can you try using uxarray.grid.arcs._angle_of_2_vectors in this module when calling this function?

Doesn't work. Numba doesn't like uxarray...

Comment on lines -7 to -68
def _replace_fill_values(grid_var, original_fill, new_fill, new_dtype=None):
    """Replaces all instances of the current fill value (``original_fill``) in
    (``grid_var``) with (``new_fill``) and converts to the dtype defined by
    (``new_dtype``)

    Parameters
    ----------
    grid_var : np.ndarray
        grid variable to be modified
    original_fill : constant
        original fill value used in (``grid_var``)
    new_fill : constant
        new fill value to be used in (``grid_var``)
    new_dtype : np.dtype, optional
        new data type to convert (``grid_var``) to

    Returns
    ----------
    grid_var : np.ndarray
        grid variable with the updated fill value and dtype
    """

    # locations of fill values
    if original_fill is not None and np.isnan(original_fill):
        fill_val_idx = np.isnan(grid_var)
    else:
        fill_val_idx = grid_var == original_fill

    # convert to new data type
    if new_dtype != grid_var.dtype and new_dtype is not None:
        grid_var = grid_var.astype(new_dtype)

    # ensure fill value can be represented with current integer data type
    if np.issubdtype(new_dtype, np.integer):
        int_min = np.iinfo(grid_var.dtype).min
        int_max = np.iinfo(grid_var.dtype).max
        # ensure new_fill is in range [int_min, int_max]
        if new_fill < int_min or new_fill > int_max:
            raise ValueError(
                f"New fill value: {new_fill} not representable by"
                f" integer dtype: {grid_var.dtype}"
            )

    # ensure non-nan fill value can be represented with current float data type
    elif np.issubdtype(new_dtype, np.floating) and not np.isnan(new_fill):
        float_min = np.finfo(grid_var.dtype).min
        float_max = np.finfo(grid_var.dtype).max
        # ensure new_fill is in range [float_min, float_max]
        if new_fill < float_min or new_fill > float_max:
            raise ValueError(
                f"New fill value: {new_fill} not representable by"
                f" float dtype: {grid_var.dtype}"
            )
    else:
        raise ValueError(
            f"Data type {grid_var.dtype} not supported for grid variables"
        )

    # replace old fill values with the new fill value
    grid_var[fill_val_idx] = new_fill

    return grid_var
Member:

This shouldn't be moved either. Please refer to the above comment if the circular import issues still persist.

Contributor Author:

I checked; this one is not used. The one that is used is in connectivity.

Labels
run-benchmark (Run ASV benchmark workflow)

Successfully merging this pull request may close these issues:
Welzl's algorithm for "face centerpoint"

6 participants