v1.13.0

@ebezzi ebezzi released this 03 Apr 22:15
f2d19a7

New embeddings API

Census embeddings can now be accessed using a new, simplified API. Check the notebooks for collaboration and hosted models for more information.

obs columns are now categorical instead of strings

Starting from the 2024-04-01 Census build, a subset of the columns in the obs dataframe are now categorical instead of strings.

For Python users, note that pandas will encode these columns as pandas.Categorical, so some downstream operations may need to be adapted. See this link for more details. In particular:

Series methods like Series.value_counts() will use all categories, even if some categories are not present in the data.

DataFrame methods like sum, groupby, pivot, and value_counts also show “unused” categories when observed=False, which is the default.
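
The two behaviors above can be seen on a small stand-in for the obs dataframe. This is only a sketch: the column names and values below are invented for illustration and are not read from the Census. It shows one way to drop unused categories or convert back to strings if downstream code expects the old behavior.

```python
import pandas as pd

# Hypothetical stand-in for a Census obs dataframe: "cell_type" is categorical
# and declares a category ("neuron") that does not occur in the rows.
obs = pd.DataFrame(
    {
        "cell_type": pd.Categorical(
            ["B cell", "T cell", "B cell"],
            categories=["B cell", "T cell", "neuron"],
        ),
        "n_genes": [100, 200, 300],
    }
)

# value_counts() reports every declared category, including unused ones.
counts = obs["cell_type"].value_counts()

# Dropping unused categories first restores the string-like result.
observed_counts = obs["cell_type"].cat.remove_unused_categories().value_counts()

# groupby keeps unused categories unless observed=True is passed explicitly.
grouped_all = obs.groupby("cell_type", observed=False)["n_genes"].sum()
grouped_observed = obs.groupby("cell_type", observed=True)["n_genes"].sum()

# Converting back to plain strings is another option for existing code.
as_str = obs["cell_type"].astype(str)
```

Passing observed explicitly in groupby calls keeps the result independent of the installed pandas version's default.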

For R users, note that these columns will be encoded as factor, and downstream operations may similarly need to be adapted. See this link for more details.

For Python and R users interfacing with arrow, these columns will be encoded as dictionary. See this link for more details for R and this link for Python.

Additions

Full Changelog: v1.12.0...v1.13.0