A Unified Table Data Type to specify data communications #63
heyong4725
started this conversation in
Ideas
In this discussion, I suggest we unify the data type exchanged between operators through a Table schema abstraction.
Basically, we would have three main abstractions in a DORA dataflow:
A Table has a schema, which is a list of column descriptors, each consisting of a name and an associated data type. The data type can be structured, semi-structured, or unstructured (e.g., an image).
For example:
[(timerID, int)]
[(cameraID, int), (cameraName, string), (cameraData, image)]
[(lidarID, int), (lidarData, byteStream)]
Core relational operators, such as map, filter, and join, which DORA can provide on top of this table schema.
User-provided operators that implement user-defined functions (e.g., .py, .so, .wasm), which are often ML inference workloads.
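To make the idea more concrete, here is a rough Python sketch of what a schema-carrying table with a few relational operators might look like. Everything in it (`Column`, `Table`, the method names, and pairing a camera with a lidar by equal IDs) is hypothetical and only for illustration; it is not DORA's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical names throughout; this is a sketch, not DORA's API.
Row = dict[str, Any]


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str  # e.g. "int", "string", "image", "byteStream"


@dataclass(frozen=True)
class Table:
    schema: tuple[Column, ...]
    rows: tuple[Row, ...] = ()

    def filter(self, pred: Callable[[Row], bool]) -> Table:
        # Keeps the schema, drops rows that fail the predicate.
        return Table(self.schema, tuple(r for r in self.rows if pred(r)))

    def map(self, out_schema: tuple[Column, ...], fn: Callable[[Row], Row]) -> Table:
        # The caller declares the output schema of the mapped rows.
        return Table(out_schema, tuple(fn(r) for r in self.rows))

    def join(self, other: Table, left_key: str, right_key: str) -> Table:
        # Naive nested-loop inner join; on a name clash the left column wins.
        left_names = {c.name for c in self.schema}
        schema = self.schema + tuple(c for c in other.schema if c.name not in left_names)
        rows = tuple(
            {**r, **l}
            for l in self.rows
            for r in other.rows
            if l[left_key] == r[right_key]
        )
        return Table(schema, rows)


CAMERA_SCHEMA = (Column("cameraID", "int"), Column("cameraName", "string"), Column("cameraData", "image"))
LIDAR_SCHEMA = (Column("lidarID", "int"), Column("lidarData", "byteStream"))

camera = Table(CAMERA_SCHEMA, (
    {"cameraID": 1, "cameraName": "front", "cameraData": b"\x00"},
    {"cameraID": 2, "cameraName": "rear", "cameraData": b"\x01"},
))
lidar = Table(LIDAR_SCHEMA, ({"lidarID": 1, "lidarData": b"\x02\x03"},))

front = camera.filter(lambda r: r["cameraName"] == "front")
# Assumption for the example only: a camera and a lidar with equal IDs belong together.
fused = camera.join(lidar, "cameraID", "lidarID")
```

A user-provided operator would then just be the `fn` passed to `map`, together with the schema it promises to produce.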
In this way, it may also enable us to enforce type verification and encoding.
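As a small illustration of the type verification part, the sketch below checks a row against a schema written in the same (name, type) form as the examples above. The mapping from schema type names to Python types and the function names are my own assumptions, not something DORA defines.

```python
from typing import Any

# Hypothetical mapping from schema type names to Python types.
PY_TYPES = {"int": int, "string": str, "image": bytes, "byteStream": bytes}

Schema = list[tuple[str, str]]

CAMERA: Schema = [("cameraID", "int"), ("cameraName", "string"), ("cameraData", "image")]


def verify_row(schema: Schema, row: dict[str, Any]) -> list[str]:
    """Return a list of human-readable type errors (empty if the row conforms)."""
    errors = []
    for name, dtype in schema:
        if name not in row:
            errors.append(f"missing column {name!r}")
        elif not isinstance(row[name], PY_TYPES[dtype]):
            got = type(row[name]).__name__
            errors.append(f"column {name!r}: expected {dtype}, got {got}")
    # Columns that the schema does not declare are also reported.
    for name in sorted(set(row) - {n for n, _ in schema}):
        errors.append(f"unexpected column {name!r}")
    return errors


def verify_edge(output_schema: Schema, input_schema: Schema) -> bool:
    """An edge is well-typed here only if both ends declare the same schema."""
    return output_schema == input_schema
```

With something like `verify_edge`, a dataflow could be rejected before it starts if an operator's declared output does not match the next operator's declared input.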