Concat on non-identical categories in categorical indexes raises TypeError #17629
Comments
Hmm, that seems like a bug; it should work the same. Can you submit a PR?
Related to #14177 (that's about unioning the columns). We should do this for both at the same time.
Right, but we are now correctly upcasting. #14177 is about providing an option for this (rather than just doing the upcast).
I'll take a deeper look and see if I can put something together to fix this, and report back if I'm running into problems.
Was this fixed? Ran into the same error using
Still an issue in pandas. @postelrich was this a recent version of dask? dask/dask#2963 fixed an issue with
@jorisvandenbossche I think this is related to the discussion a few months ago about _concat_same_type vs _concat_same_dtype. IIRC you had a nice solution to that. Is there anything from that we can use here?
Sorry, I don't recall which discussion you are referring to. Do you have a link? I think the main issue here is that CategoricalIndex still overrides the base class
This looks to work on master now. Could use a test.
Code Sample, a copy-pastable example if possible
Problem description
Concatenating categorical data with non-identical categories is now supported for data columns, by upcasting the data to an appropriate dtype, but it still throws an error when attempting the same thing with categorical indexes.
I think it would be convenient for users if the same behaviour were implemented for both data columns and indexes.
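The difference described above can be sketched roughly as follows. This is not the reporter's original code sample; the frames `df_a` and `df_b` and their values are hypothetical, chosen only so that the two categorical indexes have different categories. The `try`/`except` is there because, per the report, the index case raised a `TypeError` on pandas 0.20.3, while later comments suggest it works on newer versions.

```python
import pandas as pd

# Data columns: two categorical Series with non-identical categories.
# Concatenation upcasts the result to a non-categorical dtype.
col_result = pd.concat(
    [
        pd.Series(["a", "b"], dtype="category"),
        pd.Series(["b", "c"], dtype="category"),
    ],
    ignore_index=True,
)
print(col_result.dtype)

# Indexes: hypothetical frames whose CategoricalIndex objects
# have non-identical categories ({"a", "b"} vs {"b", "c"}).
df_a = pd.DataFrame({"value": [1, 2]}, index=pd.CategoricalIndex(["a", "b"]))
df_b = pd.DataFrame({"value": [3, 4]}, index=pd.CategoricalIndex(["b", "c"]))

try:
    # On the version in this report this is where the TypeError appeared.
    idx_result = pd.concat([df_a, df_b])
    print(type(idx_result.index).__name__, idx_result.index.dtype)
except TypeError as exc:
    idx_result = None
    print("TypeError:", exc)
```

If the index case behaves like the column case, `idx_result` holds all four rows and its index is upcast in the same way as the column.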
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_GB.UTF-8
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.20.3
pytest: 3.2.1
pip: None
setuptools: None
Cython: 0.26
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 0.9.8
lxml: 3.8.0
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None