[Doc] Add integers to four digit year format example #3218

Merged
merged 4 commits into from
Oct 7, 2023
46 changes: 30 additions & 16 deletions doc/user_guide/encodings/index.rst
@@ -170,19 +170,19 @@ Effect of Data Type on Color Scales
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As an example of this, here we will represent the same data three different ways,
with the color encoded as a *quantitative*, *ordinal*, and *nominal* type,
using three vertically-concatenated charts (see :ref:`vconcat-chart`):
using three horizontally-concatenated charts (see :ref:`hconcat-chart`):

.. altair-plot::

base = alt.Chart(cars).mark_point().encode(
x='Horsepower:Q',
y='Miles_per_Gallon:Q',
).properties(
width=150,
height=150
width=140,
height=140
)

alt.vconcat(
alt.hconcat(
base.encode(color='Cylinders:Q').properties(title='quantitative'),
base.encode(color='Cylinders:O').properties(title='ordinal'),
base.encode(color='Cylinders:N').properties(title='nominal'),
@@ -198,35 +198,49 @@ Effect of Data Type on Axis Scales
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Similarly, for x and y axis encodings, the type used for the data will affect
the scales used and the characteristics of the mark. For example, here is the
difference between a ``quantitative`` and ``ordinal`` scale for a column
difference between an ``ordinal``, ``quantitative``, and ``temporal`` scale for a column
that contains integers specifying a year:

.. altair-plot::

pop = data.population.url
pop = data.population()

base = alt.Chart(pop).mark_bar().encode(
alt.Y('mean(people):Q').title('total population')
alt.Y('mean(people):Q').title('Total population')
).properties(
width=200,
height=200
width=140,
height=140
)

alt.hconcat(
base.encode(x='year:Q').properties(title='year=quantitative'),
base.encode(x='year:O').properties(title='year=ordinal')
base.encode(x='year:O').properties(title='ordinal'),
base.encode(x='year:Q').properties(title='quantitative'),
base.encode(x='year:T').properties(title='temporal')
)

Because quantitative values do not have an inherent width, the bars do not
Because values on quantitative and temporal scales do not have an inherent width, the bars do not
fill the entire space between the values.
This view also makes clear the missing year of data that was not immediately
apparent when we treated the years as categories.
These scales clearly show the missing year of data that was not immediately
apparent when we treated the years as ordinal data,
but the axis formatting is undesirable in both cases.
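
To see why the axis formatting comes out badly in both cases, the following standard-library sketch may help. It is only an analogy, not Altair's or Vega-Lite's actual code: it assumes that a quantitative axis formats labels with a thousands separator and that a temporal scale reads a bare number as milliseconds since the Unix epoch. The ``years`` list is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sample of integer years, standing in for the 'year' column.
years = [1850, 1900, 2000]

# Quantitative analogy: numbers are formatted with a thousands separator.
quantitative_labels = [format(y, ",") for y in years]

# Temporal analogy (assumption): a bare number is read as milliseconds
# since 1970-01-01 UTC, so every "year" lands within seconds of the epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
temporal_values = [epoch + timedelta(milliseconds=y) for y in years]

print(quantitative_labels)
print([t.isoformat() for t in temporal_values])
```

Under these assumptions neither reading treats ``1850`` as the calendar year 1850, which is consistent with the odd axis labels in the two right-hand charts.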

To plot four-digit integers as years with proper axis formatting,
i.e. without a thousands separator,
we recommend converting the integers to strings first
and then specifying a temporal data type in Altair.
While it is also possible to change the axis format with ``.axis(format='i')``,
it is preferable to specify the appropriate data type.

.. altair-plot::

pop['year'] = pop['year'].astype(str)

base.mark_bar().encode(x='year:T').properties(title='temporal')
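
The effect of the string conversion can be sketched with the standard library alone. This is an analogy under a stated assumption, namely that a four-digit string is parsed as January 1 of that year; it does not reproduce Altair's own parsing. The ``years`` list is again hypothetical.

```python
from datetime import datetime

# Hypothetical sample of integer years, standing in for the 'year' column.
years = [1850, 1900, 2000]

# Analogous to pop['year'].astype(str): integers become four-digit strings.
year_strings = [str(y) for y in years]

# Assumption: a four-digit string is parsed as January 1 of that year.
parsed = [datetime.strptime(s, "%Y") for s in year_strings]

# A year-only label derived from the parsed date has no thousands separator.
axis_labels = [str(d.year) for d in parsed]

print([d.date().isoformat() for d in parsed])
print(axis_labels)
```

Once the values are real dates, a year label follows from the data type itself, which is the motivation for preferring the type change over a custom axis format.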

This kind of behavior is sometimes surprising to new users, but it emphasizes
the importance of thinking carefully about your data types when visualizing
data: a visual encoding that is suitable for categorical data may not be
suitable for quantitative data, and vice versa.

suitable for quantitative data or temporal data, and vice versa.

.. _shorthand-description:

7 changes: 5 additions & 2 deletions doc/user_guide/times_and_dates.rst
@@ -50,9 +50,12 @@ example, we'll limit ourselves to the first two weeks of data:
y='temp:Q'
)

(notice that for date/time values we use the ``T`` to indicate a temporal
Notice that for date/time values we use the ``T`` to indicate a temporal
encoding: while this is optional for pandas datetime input, it is good practice
to specify a type explicitly; see :ref:`encoding-data-types` for more discussion).
to specify a type explicitly; see :ref:`encoding-data-types` for more discussion.
If you want Altair to plot four-digit integers as years,
you need to cast them to strings before changing the data type to temporal;
see :ref:`type-axis-scale` for details.

For date-time inputs like these, it can sometimes be useful to extract particular
time units (e.g. hours of the day, dates of the month, etc.).