-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faq pull request #7604
Faq pull request #7604
Conversation
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @harshitha1201 - thank you for opening this PR! You've done the right steps so far.
This looks like a great start - I've added some comments with suggestions of how I think this PR can be improved further.
The reason that the automated ReadTheDocs build in the CI failed is that we have it set up to fail if the docs compilation step raises any warnings at all. Warnings won't prevent you from building the documentation locally, which is presumably what you did.
The warnings I can see are:
/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/7604/doc/getting-started-guide/faq.rst:4: WARNING: Duplicate explicit target name: "pandas".
/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/7604/doc/getting-started-guide/faq.rst:4: WARNING: Duplicate explicit target name: "pandas".
/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/7604/doc/getting-started-guide/faq.rst:4: WARNING: Duplicate explicit target name: "pandas".
/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/7604/doc/getting-started-guide/faq.rst:19: ERROR: Duplicate target name, cannot be used as a unique reference: "pandas".
If you fix these warnings and push the extra commits to this branch you should hopefully find that the ReadTheDocs build succeeds.
doc/getting-started-guide/faq.rst
Outdated
# Open a Zarr store using zarr package | ||
import zarr | ||
store = zarr.open('/path/to/my/store.zarr') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we generally only want to list other methods of opening certain file formats if those methods return xarray objects (i.e. an xarray.Dataset
). No-where else in xarray is the reader really going to use the zarr.store
object returned here, the FAQ should be about "how do I open X file format as an xarray.Dataset?"
doc/getting-started-guide/faq.rst
Outdated
"Excel (.xls, .xlsx)","xarray.open_dataset()","`pandas <https://pandas.pydata.org/>`_, `openpyxl <https://pypi.org/project/openpyxl/>`_ " | ||
"JSON (.json)","xarray.open_dataset()","`json <https://docs.python.org/3/library/json.html>`_, `pandas <https://pandas.pydata.org/>`_" | ||
|
||
To use these IO functions in xarray, you can simply call them with the path to the file(s) you want to read as an argument. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main thing you're missing here is the idea of passing a different engine
to xr.open_dataset
. That's what actually tells xarray what file format you are attempting to load. A lot of the entries in your list below should really include the engine
keyword argument, e.g:
ds = xr.open_dataset('/path/to/file.zarr', engine='zarr')
(There are some weird exceptions like how open_zarr
also exists, but we're possibly going to get rid of those, see #7495.)
In order for the engine
kwarg to work, the corresponding xarray file backend has to be registered, which is normally done by installing some other package. For example I don't think you need to import cfgrib
explicitly to use `engine='cfgrib' (see example), but you do need to have the cfgrib package installed. We should mention this.
doc/getting-started-guide/faq.rst
Outdated
How can I read format X in xarray? | ||
---------------------------------- | ||
|
||
To read format X in xarray, you need to know the `format of the data <https://docs.xarray.dev/en/stable/user-guide/io.html#csv-and-other-formats-supported-by-pandas/>`_ you want to read. If the format is supported, you can use the appropriate IO function provided by xarray. The following table provides links to IO functions for different file formats in xarray, as well as links to other packages that can be used to read these formats: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
There's an error while running the command |
That issue will be fixed by #7619 |
* @TomNicholas please look into the changes. * commit 1 * @TomNicholas please review * @TomNicholas please review * latest commit * Changes done on formatting the code * pre-commit bot changes * code changes * commit_check * formatted function names * passed all checks * checks_commit * changes done_added zarr * minor changes * documentation changes * ready for review * added what I have done to whats-new.rst * updated
* main: (40 commits) Faq pull request (According to pull request pydata#7604 & issue pydata#1285 (pydata#7638) add timeouts for tests (pydata#7657) Pull Request Labeler - Undo workaround sync-labels bug (pydata#7667) [pre-commit.ci] pre-commit autoupdate (pydata#7651) Allow all integer dtypes in `polyval` (pydata#7619) [skip-ci] dev whats-new (pydata#7660) Redo whats-new for 2023.03.0 (pydata#7659) Set copy=False when calling pd.Series (pydata#7642) Pin pandas < 2 (pydata#7650) Whats-new for release 2023.03.0 (pydata#7643) Bump pypa/gh-action-pypi-publish from 1.7.1 to 1.8.1 (pydata#7648) Use more descriptive link texts (pydata#7625) Fix missing 'dim' argument in _get_nan_block_lengths (pydata#7598) Fix `pcolormesh` with str coords (pydata#7612) [skip-ci] Fix groupby binary ops benchmarks (pydata#7603) Remove incomplete sentence in IO docs (pydata#7631) Allow indexing unindexed dimensions using dask arrays (pydata#5873) Bump pypa/gh-action-pypi-publish from 1.6.4 to 1.7.1 (pydata#7618) [pre-commit.ci] pre-commit autoupdate (pydata#7620) add a test for scatter colorbar extend (pydata#7616) ...
* upstream/main: (716 commits) Faq pull request (According to pull request pydata#7604 & issue pydata#1285 (pydata#7638) add timeouts for tests (pydata#7657) Pull Request Labeler - Undo workaround sync-labels bug (pydata#7667) [pre-commit.ci] pre-commit autoupdate (pydata#7651) Allow all integer dtypes in `polyval` (pydata#7619) [skip-ci] dev whats-new (pydata#7660) Redo whats-new for 2023.03.0 (pydata#7659) Set copy=False when calling pd.Series (pydata#7642) Pin pandas < 2 (pydata#7650) Whats-new for release 2023.03.0 (pydata#7643) Bump pypa/gh-action-pypi-publish from 1.7.1 to 1.8.1 (pydata#7648) Use more descriptive link texts (pydata#7625) Fix missing 'dim' argument in _get_nan_block_lengths (pydata#7598) Fix `pcolormesh` with str coords (pydata#7612) [skip-ci] Fix groupby binary ops benchmarks (pydata#7603) Remove incomplete sentence in IO docs (pydata#7631) Allow indexing unindexed dimensions using dask arrays (pydata#5873) Bump pypa/gh-action-pypi-publish from 1.6.4 to 1.7.1 (pydata#7618) [pre-commit.ci] pre-commit autoupdate (pydata#7620) add a test for scatter colorbar extend (pydata#7616) ...
I have added changes according to issue #1285 , please review it