Adapt styling of 3d tiles and web tiles based on values of an attribute of interest #9

Open
julietcohen opened this issue Nov 9, 2022 · 1 comment
Labels
enhancement New feature or request


julietcohen commented Nov 9, 2022

It would be interesting to implement a color palette with a wide range of values that represents a certain attribute of interest, such as lake size, in the 3D tile dataset.

The color palette of the Cesium 3D tiles on the web interface is determined by a Cesium map config, which is currently set to match the colors of the web tiles that represent the same data as the Cesium tiles. The web tiles are static PNG image files that are created from a pre-determined color palette assigned to pixels in the GeoTIFF version of the data. So we would need to change the config for the web tiles in a few ways, before we adjust the Cesium 3D tiles config to match the colors of the web tiles.

Currently, there are 2 statistics in the web tiles config: polygon_count and coverage. Here is an example config for the coverage statistic:

{
    "name": "coverage", # can be changed to anything
    "weight_by": "area",
    "property": "area_per_pixel_area", # must match the name of an actual attribute in the data
    "aggregation_method": "sum",
    "resampling_method": "average",
    "val_range": [0, 1], # coverage is a proportion, which always ranges 0-1
    "nodata_val": 0,
    "nodata_color": "#ffffff00",
    "palette": ["#d9c43f", "#d93fce"]
}
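To make the roles of "val_range" and "palette" concrete, here is a minimal sketch of how a raster value could be mapped to a pixel color. It assumes linear interpolation between evenly spaced palette stops, which may not be what the tiling code actually does. The helper names hex_to_rgb and value_to_color are hypothetical.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def value_to_color(value, val_range, palette):
    """Map a value to a hex color (assumed linear interpolation).

    The palette needs at least 2 colors. Values outside val_range
    are clamped to the nearest end of the palette.
    """
    low, high = val_range
    # Scale the value to 0-1 and clamp it.
    t = min(max((value - low) / (high - low), 0.0), 1.0)
    # Find the pair of palette stops that the value falls between.
    pos = t * (len(palette) - 1)
    i = min(int(pos), len(palette) - 2)
    frac = pos - i
    start, end = hex_to_rgb(palette[i]), hex_to_rgb(palette[i + 1])
    rgb = tuple(round(s + (e - s) * frac) for s, e in zip(start, end))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Palette and range taken from the coverage config above.
palette = ["#d9c43f", "#d93fce"]
print(value_to_color(0.25, [0, 1], palette))
```

With a longer palette (for example ~10 colors), the same function would pick the two neighboring stops for each value, so the number of colors in "palette" controls how many distinct hues appear across "val_range".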

The coverage config template should perhaps be adjusted in the following ways to implement styling based on an attribute:

  • the property should be set to the attribute of interest ("property": "lake_size")
  • "name" should reflect the property ("name": "lake_size")
  • "aggregation_method": "mode" or something similar. Robyn noted that mode might be a good option here because a single pixel may contain several lakes, especially as we zoom out to lower resolution and aggregate more pixels. The color of that pixel should therefore probably represent the most common lake size present in the pixel rather than the average lake size, which might result in a misleading visualization.
  • "resampling_method": "mode", or keep it as "average"
  • The "palette" can be defined as a list of colors, such as ~10 colors ranging from different intensities of red to different intensities of blue. Ingmar suggested a range of color values here for his lake change dataset.
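Put together, the adjusted statistic might look like the sketch below, written as a Python dict. The "val_range" and "palette" values are placeholders chosen for illustration (they are not the colors Ingmar suggested), and "mode" is the proposed aggregation method, which may not be supported yet.

```python
# Hypothetical config for styling by an attribute called "lake_size".
lake_size_stat = {
    "name": "lake_size",
    "weight_by": "area",
    "property": "lake_size",  # must match the name of an actual attribute in the data
    "aggregation_method": "mode",  # proposed; may need new code to support it
    "resampling_method": "mode",  # or keep as "average"
    "val_range": [0, 100],  # placeholder; use the real min and max of the attribute
    "nodata_val": 0,
    "nodata_color": "#ffffff00",
    # Placeholder palette: 10 colors running from dark red to dark blue.
    "palette": [
        "#67001f", "#b2182b", "#d6604d", "#f4a582", "#fddbc7",
        "#d1e5f0", "#92c5de", "#4393c3", "#2166ac", "#053061",
    ],
}
```

Note that "nodata_val": 0 is carried over from the coverage template; if 0 is a valid value for the attribute, a different no-data value would be needed.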

Robyn noted we might need to adjust Raster.py if we wanted to compute more complicated stats such as combining multiple attributes.

Thanks to Ingmar for this suggestion, and to Robyn for explaining how the different file types relate to the configs and the contents of the config shown above.

@julietcohen julietcohen added the enhancement New feature or request label Nov 9, 2022

julietcohen commented Mar 30, 2023

Difference between aggregation and resampling

  • Aggregation applies to the property (attribute) or statistic when we convert the staged vector data into rasters. Because this is the initial conversion into rasters, we are working with the highest resolution z-level. Consider a simple case where the only stat we are calculating is percent coverage and weight_by in the config is area. We create a dataframe where the polygons are sliced along the grid lines of the raster, and polygons are assigned to the grid cell they fall within (see here). This dataframe is then grouped by row and column indices, and the grouped dataframe is aggregated by the statistic specified in the config as aggregation_method. This is "sum" for IWP coverage, but would ideally be "mode" for an attribute like rate of change (which can be positive or negative) for the lake change data. The aggregated dataframe is then rasterized, so each cell within the dataframe becomes a cell within the raster.

    • Statistic - Percent Coverage: If 50% of the cell is covered by one polygon and 20% of the cell is covered by another polygon, the highest z-level raster cell is given the percent coverage value of 70% if aggregation is set to sum, but would yield 35% if aggregation is set to “mean”.
    • Statistic - Number of Polygons: If several polygon centroids (derived from the vector data) fall within the cell, the “sum” of them is used for the “initial” raster (highest z-level raster).
    • Statistic - custom attribute in the data, such as rate of change of lake area: If the rate of change of 1 lake in the cell is +1 (in the units of the change attribute) and the rate of change of 2 other lakes in the cell is -2, the cell gets a value of -2 if we are using mode. However, mode is not an agg() option, so we would need to use a function like statistics.mode(), which returns a single value: the mode, or the first mode encountered if there are several. We could also use "max" instead of mode, but this may be biased towards representing smaller lakes that are changing faster rather than the change rate of the lake(s) that take up the most area.
  • Resampling applies to how we calculate the lower z-level raster cell values as we process them from the higher z-levels, rather than calculating them from the values of the attributes in the geopackages. As we combine 4 cells at a time (as “chunks”) into one cell value, we can use “mean”, “sum”, or “mode”, depending on what is most appropriate for the statistic conceptually.

    • Statistic - Number of Polygons: should be set to “sum” because we are summing all the centroids of the polygons that fall within all 4 cells that are being combined into 1 value (cell).
    • Statistic - Percent Coverage: should be set to “mean” instead of “sum” because the value of the resulting resampled cell cannot exceed 1, and we want the color of the resampled cell to represent the average of the cells that composed it, so the user has an idea of the average percent coverage of that region when they are zoomed out.
    • Statistic - custom attribute in the data, such as rate of change of lake area: should be set to "mode" so that we represent the change rate that covers the most area. If set to "mean" or "sum", we are changing the data itself. For example, if two raster cells are increasing in lake area and the other two are decreasing, and we take the average, we might get a resulting lower z-level raster cell that shows no change at all in those lakes.
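The two steps above can be sketched in plain Python. This is a simplified stand-in for the dataframe groupby and raster resampling, not the actual Raster.py code. The function names aggregate_cells and resample are hypothetical, and coverage is expressed in whole percent to keep the arithmetic exact.

```python
from collections import defaultdict
from statistics import mean, mode

# Reducers keyed by the method names used in the config.
# statistics.mode returns the first mode encountered on a tie (Python 3.8+).
REDUCERS = {"sum": sum, "mean": mean, "mode": mode, "max": max}


def aggregate_cells(records, method):
    """Aggregation: group (row, col, value) records by cell, then reduce."""
    groups = defaultdict(list)
    for row, col, value in records:
        groups[(row, col)].append(value)
    return {cell: REDUCERS[method](values) for cell, values in groups.items()}


def resample(grid, method):
    """Resampling: combine each 2x2 chunk of cells into one cell.

    Assumes grid is a list of equal-length rows with even dimensions.
    """
    reduce_chunk = REDUCERS[method]
    out = []
    for r in range(0, len(grid), 2):
        out_row = []
        for c in range(0, len(grid[r]), 2):
            chunk = [grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1]]
            out_row.append(reduce_chunk(chunk))
        out.append(out_row)
    return out


# Aggregation: two polygons cover 50% and 20% of cell (0, 0).
coverage = [(0, 0, 50), (0, 0, 20)]
print(aggregate_cells(coverage, "sum"), aggregate_cells(coverage, "mean"))

# Aggregation: one lake changing at +1 and two lakes changing at -2.
change = [(0, 0, 1), (0, 0, -2), (0, 0, -2)]
print(aggregate_cells(change, "mode"), aggregate_cells(change, "max"))

# Resampling: two cells increasing and two decreasing average to no change.
print(resample([[1, 1], [-1, -1]], "mean"))
```

In this sketch the mode weights every lake or cell equally. Representing the change rate that covers the most area, as discussed above, would need an area-weighted mode, which is not shown here.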

@julietcohen julietcohen changed the title Adapt styling of 3d tiles based on values of an attribute of interest Adapt styling of 3d tiles and web tiles based on values of an attribute of interest Apr 12, 2023