Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new placement Vindex strategy #77

Closed
wants to merge 3 commits into from

Conversation

jeremycole
Copy link

@jeremycole jeremycole commented Jan 13, 2023

Signed-off-by: Jeremy Cole [email protected]

Description

Add a placement Vindex type for use in regional sharding.

A Vindex which uses a mapping lookup table placement_map to set the first placement_prefix_bytes of the Keyspace ID and another Vindex type placement_sub_vindex_type (which must support Hashing) as a sub-Vindex to set the rest. This is suitable for regional sharding (like region_json or region_experimental) but does not require a mapping file, and can support non-integer types for the sharding column.

All parameters are prefixed with placement_ to avoid conflict, because the params map is passed to the sub-Vindex as well.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

@brendar
Copy link

brendar commented Jan 23, 2023

Sorry I’ve been slow to provide feedback. I wanted to make sure I understood how the various vindexes behaved. I've tried modeling carts using multicol, region_json, region_experimental, and your placement vindexes. I see why region_json and region_experimental won’t work for us (primarily, the non-region column must be numeric). I also tried using multicol, but it looks like there’s a bug with vtgate SQL parsing that causes numeric literals (e.g. 65535) to be parsed as int64 regardless of the table column type (e.g. SMALLINT UNSIGNED), which means the wrong 2 bytes are extracted by the multicol + binary vindex. I can provide more debugging context on that if you're curious.

placement seems to be working as advertised 🎉 , but in terms of overall approach, there might be a few other options to consider:

  1. Modifying region_json to take a sub-vindex type param. e.g. suffix_column_vindex: xxhash so that it can work with non-numeric columns. We could keep the default as hash to avoid breaking current behavior.
  2. Making multicol + binary work correctly with integers less than 64 bits. I'm not sure how difficult this might be. It would also mean we couldn't map from (string) region/country to (int) shard-prefix, but maybe that’s ok? It would mean the shard-prefix is the same as some numeric region/country code.
  3. Adding a generic static mapping vindex similar to numeric_static_map (which maps numbers to numbers via json), but capable of mapping arbitrary types, and using that as part of a multicol vindex, e.g.
    in the vschema:
"carts_multicol": {
  "type": "multicol",
  "params": {
    "column_count": "2",
    "column_bytes": "2,6",
    "column_vindex": "static_map,xxhash",
    "static_map_json_path": "/foo/static_region_map.json",
  }
}

in static_region_map.json:

{
  "key_type": "string",
  "value_type": "uint16",
  "map": {
    "foo-east1": "5",
    "bar-west2": "99"
  }
}

I'm not convinced any of those is definitely a better option, but figured they're worth a thought.

Also, as part of my exploration I put together an example modeling carts using multicol (broken) and placement on this branch: https://github.com/Shopify/vitess/tree/carts-example/examples/carts It might be helpful for further experimentation.

@jeremycole jeremycole force-pushed the 20220112_placement_vindex branch 2 times, most recently from 0c6bafa to 8e9611b Compare February 9, 2023 18:31
@github-actions
Copy link

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

@github-actions github-actions bot added the Stale label Mar 13, 2023
@github-actions
Copy link

This PR was closed because it has been stale for 7 days with no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants