API/ENH: unprotected Categorical #13506
Labels
Categorical
Categorical Data Type
Closing Candidate
May be closeable, needs more eyeballs
Enhancement
xref #8640 #12699 #13361 #13410
There's been discussion of a few overlapping uses of
Categorical
:Option 1 is currently well-supported. Option 2 can be achieved explicitly by
union_categorical
, see #13361 #13410, but will not happen automatically e.g. when setting values or concatenating, see #12699. Option 3 is similar to 2, and has been discussed as a newString
type, see #8640, but may have different semantics to 2.While I completely agree that option 1 should be the default, I'd like to see more support for option 2 if possible; there are cases where I really do want to work with categorical data, but I don't yet know what the categories are, e.g. when concatenating a table together from several different source files.
One option would be to mimic Matlab's 'protected' flag for categorical data. By default, a
Categorical
would be created protected, and throw errors when performing actions which implicitly change the category set. However, the user could choose to declare aCategorical
as unprotected, in which case these operations would be performed as intended.While this behaviour could be achieved with the proposed
String
type, it's unclear whether that type would share theCategorical
API, or be efficiently convertible toCategorical
. Having the behaviour as part ofCategorical
would allow the user to build up aCategorical
iteratively, from multiple sources, and then quickly mark it protected once the category set is known.The text was updated successfully, but these errors were encountered: