Modernize categorical plotting and refactor stripplot #2413

Merged 33 commits on Jan 19, 2021
Commits
ef866e9
Proof of principle refactored stripplot passing all tests
mwaskom Jan 3, 2021
c990508
Improve handling of categorical dates
mwaskom Jan 5, 2021
15ec383
Improve automatic categorical orientation with dates
mwaskom Jan 8, 2021
2d3b8b5
Add more continuous datetime variable to long_df fixture
mwaskom Jan 8, 2021
50fec43
Begin updating stripplot tests
mwaskom Jan 10, 2021
f004371
Update more stripplot tests
mwaskom Jan 10, 2021
6d82f56
Add test for single strip, with hue
mwaskom Jan 12, 2021
bfcb246
Fix infer_orient argcheck
mwaskom Jan 12, 2021
c92bc13
Add tests for flat and wide data in stripplot
mwaskom Jan 12, 2021
fee80d3
Refactor hue backcompat into a plotter class method, make optional
mwaskom Jan 12, 2021
677cc57
Enable new default coloring rules in stripplot
mwaskom Jan 12, 2021
172b29f
Update catplot to use new stripplot function
mwaskom Jan 13, 2021
35dab5f
Update assert_plots_equal to test all collections
mwaskom Jan 13, 2021
077067a
Clean up some comments
mwaskom Jan 14, 2021
cfc6b86
Remove old stripplot code
mwaskom Jan 14, 2021
1313433
Fix typo
mwaskom Jan 14, 2021
8a2f088
Add explicit categorical order to VectorPlotter._attach
mwaskom Jan 14, 2021
67615d2
Modify the implementation of categorical data handling to permit unsh…
mwaskom Jan 14, 2021
ee5c930
Improve integration of axis converters with unshared facet grids
mwaskom Jan 15, 2021
f2a2caf
Fix ordering by category dtype
mwaskom Jan 15, 2021
e428432
Fix catplot point sizes
mwaskom Jan 15, 2021
a0d8dc5
Add (un)fixed_scale
mwaskom Jan 16, 2021
18e7165
Fix plot equality assertion
mwaskom Jan 16, 2021
3d918ef
Disable tests that hit matplotlib bug due to incomplete implementation
mwaskom Jan 16, 2021
6c3cad3
Improve test coverage
mwaskom Jan 17, 2021
b2a433f
Move forced/ordered categorical scaling logic to core
mwaskom Jan 18, 2021
406bf9a
Add core-level tests for scale method(s)
mwaskom Jan 18, 2021
a92eefa
Reduce use of special attributes, add formatter and hue_norm
mwaskom Jan 18, 2021
f8e4af5
Update stripplot API examples
mwaskom Jan 18, 2021
35f4f3a
Re-enable kwarg deprecation warning
mwaskom Jan 18, 2021
0b1c692
Fix log scaled stripplot
mwaskom Jan 18, 2021
4a360be
Fixed log-scaled categorical axis
mwaskom Jan 18, 2021
72a2616
Don't jitter single strips
mwaskom Jan 19, 2021
4 changes: 2 additions & 2 deletions doc/docstrings/histplot.ipynb
@@ -461,9 +461,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "seaborn-refactor (py38)",
"display_name": "seaborn-py38-latest",
"language": "python",
"name": "seaborn-refactor"
"name": "seaborn-py38-latest"
},
"language_info": {
"codemirror_mode": {
313 changes: 313 additions & 0 deletions doc/docstrings/stripplot.ipynb
@@ -0,0 +1,313 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"hide"
]
},
"outputs": [],
"source": [
"import seaborn as sns\n",
"sns.set_theme(style=\"whitegrid\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Assigning a single numeric variable shows its univariate distribution with points randomly \"jittered\" on the other axis:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tips = sns.load_dataset(\"tips\")\n",
"sns.stripplot(data=tips, x=\"total_bill\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Assigning a second variable splits the strips of points to compare categorical levels of that variable:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"day\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Show vertically-oriented strips by swapping the assignment of the categorical and numerical variables:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"day\", y=\"total_bill\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Prior to version 0.12, the levels of the categorical variable had different colors. To get the same effect, assign the `hue` variable explicitly:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"day\", hue=\"day\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Or you can assign a distinct variable to `hue` to show a multidimensional relationship:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"day\", hue=\"sex\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"If the `hue` variable is numeric, it will be mapped with a quantitative palette by default (this was not the case prior to version 0.12):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"day\", hue=\"size\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Use `palette` to control the color mapping, including forcing a categorical mapping by passing the name of a qualitative palette:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"day\", hue=\"size\", palette=\"deep\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"By default, the different levels of the `hue` variable are intermingled in each strip, but setting `dodge=True` will split them:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"day\", hue=\"sex\", dodge=True)"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"The random jitter can be disabled by setting `jitter=False`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"day\", hue=\"sex\", dodge=True, jitter=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If plotting in wide-form mode, each column of the dataframe will be mapped to both `x` and `hue`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips)"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"To change the orientation while in wide-form mode, pass `orient` explicitly:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, orient=\"h\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"The `orient` parameter is also useful when both axis variables are numeric, as it will resolve ambiguity about which dimension to group (and jitter) along:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(data=tips, x=\"total_bill\", y=\"size\", orient=\"h\")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"By default, the categorical variable will be mapped to discrete indices with a fixed scale (0, 1, ...), even when it is numeric:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(\n",
" data=tips.query(\"size in [2, 3, 5]\"),\n",
" x=\"total_bill\", y=\"size\", orient=\"h\",\n",
")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"To disable this behavior and use the original scale of the variable, set `fixed_scale=False`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(\n",
" data=tips.query(\"size in [2, 3, 5]\"),\n",
" x=\"total_bill\", y=\"size\", orient=\"h\",\n",
" fixed_scale=False,\n",
")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Further visual customization can be achieved by passing matplotlib keyword arguments:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.stripplot(\n",
" data=tips, x=\"total_bill\", y=\"day\", hue=\"time\",\n",
" jitter=False, s=20, marker=\"D\", linewidth=1, alpha=.1,\n",
")"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"To make a plot with multiple facets, it is safer to use :func:`catplot` than to work with :class:`FacetGrid` directly, because :func:`catplot` will ensure that the categorical and hue variables are properly synchronized in each facet:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sns.catplot(data=tips, x=\"time\", y=\"total_bill\", hue=\"sex\", col=\"day\", aspect=.5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "seaborn-py38-latest",
"language": "python",
"name": "seaborn-py38-latest"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
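
The notebook above describes two positioning rules: by default the categorical variable is mapped to fixed indices (0, 1, ...) even when it is numeric, and `fixed_scale=False` keeps the original values. The final commit in this PR also stops jittering a strip that holds a single observation. The following is a minimal standalone sketch of those rules, not seaborn's implementation. The helper `strip_positions` and its `seed` parameter are hypothetical names introduced only for illustration, and the uniform jitter is an assumption.

```python
import random


def strip_positions(values, fixed_scale=True, jitter=0.1, seed=0):
    """Return one position per observation along the categorical axis.

    Hypothetical helper for illustration only; this is not seaborn code.
    """
    rng = random.Random(seed)
    levels = sorted(set(values))

    # Fixed scale: levels become 0, 1, 2, ... regardless of their values.
    # Otherwise each strip sits at the level's original numeric value.
    centers = {lev: (i if fixed_scale else lev) for i, lev in enumerate(levels)}
    counts = {lev: values.count(lev) for lev in levels}

    positions = []
    for v in values:
        # A strip with a single observation is plotted without jitter.
        offset = rng.uniform(-jitter, jitter) if counts[v] > 1 else 0.0
        positions.append(centers[v] + offset)
    return positions


# Each level appears once, so no jitter is applied.
print(strip_positions([2, 3, 5]))                     # [0.0, 1.0, 2.0]
print(strip_positions([2, 3, 5], fixed_scale=False))  # [2.0, 3.0, 5.0]
```

With the fixed scale, the gap between sizes 3 and 5 disappears because only the rank of each level matters. Without it, the strips keep their numeric spacing.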
4 changes: 4 additions & 0 deletions doc/releases/v0.12.0.txt
@@ -6,6 +6,10 @@ v0.12.0 (Unreleased)

- |Fix| |Enhancement| Improved robustness to missing data, including additional support for the `pd.NA` type (:pr:`2417`).

- TODO function specific categorical enhancements, including:

- In :func:`stripplot`, a "strip" with a single observation will be plotted without jitter (:pr:`2413`)

- Made `scipy` an optional dependency and added `pip install seaborn[all]` as a method for ensuring the availability of compatible `scipy` and `statsmodels` libraries at install time. This has a few minor implications for existing code, which are explained in the Github pull request (:pr:`2398`).

- Following `NEP29 <https://numpy.org/neps/nep-0029-deprecation_policy.html>`_, dropped support for Python 3.6 and bumped the minimally-supported versions of the library dependencies.