Add details on implementation options
rgommers committed Jan 6, 2021
1 parent 5f278b3 commit b465e39
Showing 1 changed file with 31 additions and 0 deletions.
31 changes: 31 additions & 0 deletions protocol/dataframe_protocol_summary.md
@@ -255,8 +255,39 @@ computational graph approach like Dask uses, etc.)._

## Possible direction for implementation

### Rough prototypes

The `cuDFDataFrame`, `cuDFColumn` and `cuDFBuffer` sketched out by @kkraus14
[here](https://github.com/data-apis/dataframe-api/issues/29#issuecomment-685123386)
seem to be a step in the right direction.
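
To make the layering that those three names suggest more concrete, here is a minimal sketch of a dataframe made of columns, each backed by one or more buffers. This is not the code from the linked comment: the class names `SketchBuffer`, `SketchColumn` and `SketchDataFrame`, their attributes, and the use of `memoryview` as the buffer type are all assumptions made for illustration.

```python
from array import array
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchBuffer:
    """Hypothetical: one contiguous block of memory."""
    data: memoryview

    @property
    def nbytes(self) -> int:
        return self.data.nbytes


@dataclass
class SketchColumn:
    """Hypothetical: a typed column backed by one or more buffers."""
    dtype: str                    # struct-style format code, e.g. "d" for float64
    buffers: List[SketchBuffer]

    def __len__(self) -> int:
        # Element count of the first (data) buffer.
        return len(self.buffers[0].data)


@dataclass
class SketchDataFrame:
    """Hypothetical: an ordered mapping of column names to columns."""
    columns: Dict[str, SketchColumn]

    def column_names(self) -> List[str]:
        return list(self.columns)


# Build a one-column dataframe on top of a stdlib array.
values = array("d", [1.0, 2.5, 4.0])
col = SketchColumn(dtype="d", buffers=[SketchBuffer(memoryview(values))])
df = SketchDataFrame(columns={"x": col})
```

Keeping the buffer as the lowest level means that a column with missing values or variable-length strings could, in principle, carry extra buffers without changing the dataframe-level interface.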

[This prototype](https://github.com/wesm/dataframe-protocol/pull/1) by Wes
McKinney was the first attempt, and has some useful features.

TODO: work this out after making sure we're all on the same page regarding requirements.


### Relevant existing protocols

Here are the four most relevant existing protocols, and what requirements they support:

| *supports* | buffer protocol | `__array_interface__` | DLPack | Arrow C Data Interface |
|---------------------|:---------------:|:---------------------:|:------:|:----------------------:|
| Python API | | Y | Y | |
| C API | Y | Y | Y | Y |
| arrays | Y | Y | Y | Y |
| dataframes | | | | |
| chunking | | | | |
| devices | | | Y | |
| bool/int/uint/float | Y | Y | Y | Y |
| missing data | (1) | (2) | (1) | Y |
| string dtype | (3) | (3) | | Y |
| datetime dtypes | | (4) | | Y |
| categoricals | (5) | (5) | (6) | (5) |

1. Can only be done via separate boolean mask arrays.
2. `__array_interface__` has a `mask` attribute, which is a separate boolean array also implementing the `__array_interface__` protocol.
3. Only fixed-length strings, as a sequence of char or unicode.
4. Only NumPy datetime and timedelta, which are limited compared to what the Arrow format offers.
5. No explicit support; however, categoricals can be mapped to either integers or strings.
6. No explicit support; categoricals can only be mapped to integers.
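
As a rough illustration of notes 1, 5 and 6, the sketch below uses only the Python buffer protocol (via the standard library `array` and `memoryview`). It shows missing data carried as a separate mask and categoricals carried as integer codes. The variable names and the convention that `1` means "present" in the mask are assumptions for this example, not part of any of the protocols in the table.

```python
from array import array

# Note 1: missing data via a separate mask buffer.
values = array("d", [1.5, 0.0, 3.0])   # 0.0 is only a placeholder for the missing slot
valid = bytearray([1, 0, 1])           # assumed convention: 1 = present, 0 = missing

values_view = memoryview(values)       # exposes format code "d" plus the raw memory
valid_view = memoryview(valid)         # exposes format code "B" (unsigned bytes)

# The consumer has to combine the two buffers itself; the protocol does not link them.
decoded = [v if ok else None for v, ok in zip(values_view, valid_view)]

# Notes 5 and 6: categoricals as integer codes, with the categories passed separately.
categories = ["low", "mid", "high"]
codes = array("b", [0, 2, 2, 1])       # int8 codes that index into `categories`
labels = [categories[c] for c in memoryview(codes)]
```

In both cases the relationship between the buffers (values and mask, codes and categories) lives outside the protocol, which is why the table does not count this as explicit support.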
