diff --git a/CHANGELOG.md b/CHANGELOG.md
index 6b773a6ed2..007829de78 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,10 +10,76 @@ Changes are grouped as follows
 - `Added` for new features.
 - `Changed` for changes in existing functionality.
 - `Deprecated` for soon-to-be removed features.
+- `Improved` for transparent changes, e.g. better performance.
 - `Removed` for now removed features.
 - `Fixed` for any bug fixes.
 - `Security` in case of vulnerabilities.
+## [5.0.0] - 2022-09-28
+### Improved
+- Greatly increased speed of datapoints fetching, especially when asking for:
+  - Large number of time series (~80+)
+  - Very few time series (1-3)
+  - Any query using a finite `limit`
+  - Any query for `string` datapoints
+- Peak memory consumption is 25-30% lower when using the new `retrieve_arrays` method (with the same number of `max_workers`).
+- Converting fetched datapoints to a Pandas `DataFrame` via `to_pandas()` (or the time saved by using `retrieve_dataframe` directly) has changed from `O(N)` to `O(1)`, i.e., the speedup depends on the size and is typically 4-5 orders of magnitude faster (!) (only applies to `DatapointsArray` and `DatapointsArrayList` as returned by the `retrieve_arrays` method).
+- Individual customization of queries is now available for all retrieve endpoints. Previously only `aggregates` could be customized. Now all parameters can be passed either as top-level or as individual settings. This is now aligned with the API.
+- Documentation for the retrieve endpoints has been overhauled with lots of new usage patterns and (better!) examples, check it out!
+- Vastly better test coverage for the datapoints fetching logic. You can have increased trust in the results from the SDK!
+
+### Added
+- New optional dependency, `numpy`.
+- A new datapoints fetching method, `retrieve_arrays`, that loads data directly into NumPy arrays for improved speed and lower memory usage.
+- These arrays are stored in the new resource types `DatapointsArray` and `DatapointsArrayList`, which offer more efficient memory usage and zero-overhead pandas conversion.
+
+### Changed
+- The default value for `max_workers`, controlling the max number of concurrent threads, has been increased from 10 to 20.
+- The main way to interact with the `DatapointsAPI` has been moved from `client.datapoints` to `client.time_series.data` to align and unify with the `SequenceAPI`. All example code has been updated to reflect this change. Note, however, that `client.datapoints` will continue to work until the next major release, but will issue a `DeprecationWarning` until then.
+- The utility function `datetime_to_ms` no longer issues a `FutureWarning` on missing timezone information. It will now interpret naive `datetime`s as local time, in line with Python's default interpretation.
+- The utility function `ms_to_datetime` no longer issues a `FutureWarning` on returning a naive `datetime` in UTC. It will now return an aware `datetime` object in UTC.
+- All data classes in the SDK that represent a Cognite resource type have a `to_pandas` method. Previously, these had various defaults for the `camel_case` parameter, but they have all been changed to `False`.
+- The method `DatapointsAPI.insert_dataframe` has new default values for `dropna` (now `True`, still being applied on a per-column basis) and `external_id_headers` (now `True`, disincentivizing the use of internal IDs).
+- The previous fetching logic awaited and collected all errors before raising (through the use of an "initiate-and-forget" thread pool). This is great for e.g.
updates/inserts to make sure you are aware of all partial changes. However, when reading datapoints, a better option is to just fail fast (which it now does).
+- `DatapointsAPI.[retrieve/retrieve_arrays/retrieve_dataframe]` no longer requires `start` (default: `0`) and `end` (default: `now`). This is now aligned with the API.
+- All retrieve methods accept a list of full query dictionaries for `id` and `external_id`, giving full flexibility for individual settings like `start` time, `limit`, and `granularity` (to name a few), previously only possible with the `DatapointsAPI.query` endpoint. This is now aligned with the API.
+- Aggregates returned now include the time period(s) (given by the `granularity` unit) that `start` and `end` are a part of (as opposed to only "fully in-between" points). This is now aligned with the API.
+This is also a **bugfix**: Due to the SDK rounding differently than the API, you could supply `start` and `end` (with `start < end`) and still be given an error that `start is not before end`. This can no longer happen.
+- Fetching raw datapoints using `return_outside_points=True` now returns both outside points (if they exist), regardless of the `limit` setting. Previously the total number of points was capped at `limit`, thus typically only returning the first. Now up to `limit+2` datapoints are always returned. This is now aligned with the API.
+- Asking for the same time series any number of times no longer raises an error (from the SDK), which is useful for instance when fetching disconnected time periods. This is now aligned with the API.
+- ...this change also causes the `.get` method of `DatapointsList` and `DatapointsArrayList` to now return a list of `Datapoints` or `DatapointsArray`, respectively, when duplicated identifiers were queried. (For data scientists and others used to `pandas`, this syntax is similar to the slicing logic of `Series` and `DataFrame` when used with non-unique indices).
+There is also a subtle **bugfix** here: since the previous implementation allowed the same time series to be specified by both its `id` and `external_id`, using `.get` to get it would always yield the settings that were specified by the `external_id`. This will now return a `list` as explained above.
+- The datapoints fetching algorithm has changed from one that relied on up-to-date and correct `count` aggregates to be fast (with fallback on serial fetching if missing), to one that recursively (and reactively) splits the time domain into smaller and smaller pieces, depending on the discovered-as-fetched density distribution of datapoints in time and the available threads. The new approach also has the ability to group more than 1 (one) time series per API request (when beneficial) and to short-circuit once a user-given limit has been reached (if/when given). This method is now used for *all types of queries*: numeric raw, string raw, and aggregate datapoints.
+
+#### Change: `retrieve_dataframe`
+- Previously, fetching was restricted to either raw OR aggregate datapoints. This restriction has been lifted and the method now works exactly like the other retrieve methods (with a few extra options relevant only for pandas `DataFrame`s).
+- The method used to fetch time series given by `id` and `external_id` separately; this is no longer the case, which gives a significant additional speedup when both are supplied. See the example below.
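+
+  A minimal sketch of the new, unified call style (the time series identifiers and the chosen settings below are only illustrative):
+
+  ```python
+  from cognite.client import CogniteClient
+
+  client = CogniteClient()
+  df = client.time_series.data.retrieve_dataframe(
+      id=123,  # hypothetical time series, fetched with the top-level settings below
+      external_id={
+          "external_id": "my-ts",  # hypothetical time series with individual settings
+          "aggregates": ["average"],
+          "granularity": "1d",
+      },
+      start="30d-ago",
+      end="now",
+      limit=1000,
+  )
+  ```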
+- The `complete` parameter has been removed and partially replaced by `uniform_index` (bool), which covers a subset of the previous features (with some modifications: it now gives a uniform index all the way from the first given `start` to the last given `end`). Rationale: Weird and unintuitive syntax (passing a string using a comma to separate options).
+- Interpolating, forward-filling, or, more generally, imputation (also controlled via the `complete` parameter) is completely removed, as the resampling logic *really* should be up to the user fetching the data to decide, not the SDK.
+- New parameter `column_names` (as already used in several existing `to_pandas` methods) decides whether to pick `id`s or `external_id`s as the dataframe column names. Previously, when both were supplied, the dataframe ended up with a mix.
+Read more below in the removed section or check out the method's updated documentation.
+
+### Fixed
+- **Critical**: Fetching aggregate datapoints now works properly with the `limit` parameter. In the old implementation, `count` aggregates were first fetched to split the time domain efficiently - but this has little-to-no informational value when fetching *aggregates* with a granularity, as the datapoints distribution can take on "any shape or form". This often led to just a few returned batches of datapoints due to miscounting.
+- Fetching datapoints using `limit=0` now returns zero datapoints, instead of "unlimited". This is now aligned with the API.
+- Removing aggregate names from the columns in a Pandas `DataFrame` in the previous implementation used `Datapoints._strip_aggregate_name()`, but this had a bug: Whenever raw datapoints were fetched, all characters after the last pipe character (`|`) in the tag name would be removed completely. In the new version, the aggregate name is only added when asked for.
+- The method `Datapoints.to_pandas` could return `dtype=object` for numeric time series when all aggregate datapoints were missing, which is not *that* unlikely, e.g., when using the `interpolation` aggregate on an `is_step=False` time series with datapoints spaced more than one hour apart on average. In such cases, an object array only containing `None` would be returned instead of a float array with `NaN`s. The correct dtype is now enforced by an explicit cast.
+- Fixed a rare bug in `DatapointsAPI.query` when no time series was found (`ignore_unknown_ids=True`) and `.get` was used on the empty returned `DatapointsList` object, which would raise an exception because the identifiers-to-datapoints mapping was not defined.
+
+### Fixed: Extended time domain
+- `TimeSeries.[first/count]()` now work with the expanded time domain (the minimum age of datapoints was moved from 1970 to 1900, see [4.2.1]).
+  - `TimeSeries.first()` now considers datapoints before 1970 and after "now".
+  - `TimeSeries.count()` now considers datapoints before 1970 and after "now", and will raise an error for string time series as `count` (or any other aggregate) is not defined.
+- The utility function `ms_to_datetime` no longer raises `ValueError` for inputs from before 1970, but will raise for inputs outside the minimum and maximum timestamps supported by the API.
+**Note**: Support for `datetime`s before 1970 may be limited on Windows.
+
+### Removed
+- All convenience methods related to plotting and the use of `matplotlib`. Rationale: No usage and low utility value; the SDK should not be a data science library.
+- The entire method `DatapointsAPI.retrieve_dataframe_dict`.
Rationale: Due to its slightly confusing syntax and return value, it basically saw no use "in the wild". + +### Other +Evaluation of `protobuf` performance: In its current state, using `protobuf` results in significant performance degradation compared to JSON. Additionally, it adds an extra dependency, which, if installed in its pure-Python distribution, results in earth-shattering performance degradation. + ## [4.9.0] - 2022-10-10 ### Added - Add support for extraction pipeline configuration files @@ -22,7 +88,7 @@ Changes are grouped as follows ## [4.8.1] - 2022-10-06 ### Fixed -- Fix `__str__` function of `TransformationSchedule` +- Fix `__str__` method of `TransformationSchedule` ## [4.8.0] - 2022-09-30 ### Added @@ -31,7 +97,7 @@ Changes are grouped as follows ## [4.7.1] - 2022-09-29 ### Fixed -- Fixed the `FunctionsAPI.create` method for Windows-users by removing +- Fixed the `FunctionsAPI.create` method for Windows-users by removing validation of `requirements.txt`. ## [4.7.0] - 2022-09-28 @@ -55,7 +121,6 @@ Changes are grouped as follows ### Fixed - Fixes the issue when updating transformations with new nonce credentials - ## [4.5.1] - 2022-09-08 ### Fixed - Don't depend on typing_extensions module, since we don't have it as a dependency. @@ -174,8 +239,8 @@ other OAuth flows. - added support for nonce authentication on transformations ### Changed -- if no source or destination credentials are provided on transformation create, an attempt will be made to create a session with the CogniteClient credentials, if it succeeds the aquired nonce will be used. -- if OIDC credentials are provided on transformation create/update, an attempt will be made to create a session with the given credentials, if it succeeds the aquired nonce credentials will replace the given client credentials before sending the request. +- if no source or destination credentials are provided on transformation create, an attempt will be made to create a session with the CogniteClient credentials, if it succeeds, the acquired nonce will be used. +- if OIDC credentials are provided on transformation create/update, an attempt will be made to create a session with the given credentials. If it succeeds, the acquired nonce credentials will replace the given client credentials before sending the request. ## [3.3.0] - 2022-07-21 ### Added diff --git a/README.md b/README.md index fae55cc551..537ff8a3fa 100644 --- a/README.md +++ b/README.md @@ -13,8 +13,8 @@ Cognite Python SDK [![mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black) -This is the Cognite Python SDK for developers and data scientists working with Cognite Data Fusion (CDF). -The package is tightly integrated with pandas, and helps you work easily and efficiently with data in Cognite Data +This is the Cognite Python SDK for developers and data scientists working with Cognite Data Fusion (CDF). +The package is tightly integrated with pandas, and helps you work easily and efficiently with data in Cognite Data Fusion (CDF). ## Refererence documentation @@ -34,11 +34,12 @@ $ pip install cognite-sdk ### With optional dependencies A number of optional dependencies may be specified in order to support a wider set of features. 
The available extras (along with the libraries they include) are: +- numpy `[numpy]` - pandas `[pandas]` - geo `[geopandas, shapely]` - sympy `[sympy]` - functions `[pip]` -- all `[pandas, geopandas, shapely, sympy, pip]` +- all `[numpy, pandas, geopandas, shapely, sympy, pip]` To include optional dependencies, specify them like this with pip: @@ -51,6 +52,11 @@ or like this if you are using poetry: $ poetry add cognite-sdk -E pandas -E geo ``` +### Performance notes +If you regularly need to fetch large amounts of datapoints, consider installing with `numpy` +(or with `pandas`, as it depends on numpy) for best performance, then use the `retrieve_arrays` endpoint. +This avoids building large pure Python data structures, and instead reads data directly into `numpy.ndarrays`. + ### Windows specific On Windows, it is recommended to install `geopandas` and its dependencies using `conda` package manager, diff --git a/cognite/client/_api/annotations.py b/cognite/client/_api/annotations.py index 52c042d872..30af04032c 100644 --- a/cognite/client/_api/annotations.py +++ b/cognite/client/_api/annotations.py @@ -11,9 +11,6 @@ class AnnotationsAPI(APIClient): _RESOURCE_PATH = "/annotations" - def __init__(self, *args: Any, **kwargs: Any) -> None: - super().__init__(*args, **kwargs) - @overload def create(self, annotations: Annotation) -> Annotation: ... diff --git a/cognite/client/_api/datapoint_constants.py b/cognite/client/_api/datapoint_constants.py new file mode 100644 index 0000000000..affe1d23e7 --- /dev/null +++ b/cognite/client/_api/datapoint_constants.py @@ -0,0 +1,87 @@ +from datetime import datetime +from typing import Dict, Iterable, List, Optional, TypedDict, Union + +try: + import numpy as np + import numpy.typing as npt + + NUMPY_IS_AVAILABLE = True +except ImportError: # pragma no cover + NUMPY_IS_AVAILABLE = False + +if NUMPY_IS_AVAILABLE: + NumpyDatetime64NSArray = npt.NDArray[np.datetime64] + NumpyInt64Array = npt.NDArray[np.int64] + NumpyFloat64Array = npt.NDArray[np.float64] + NumpyObjArray = npt.NDArray[np.object_] + +# Datapoints API-limits: +DPS_LIMIT_AGG = 10_000 +DPS_LIMIT = 100_000 +POST_DPS_OBJECTS_LIMIT = 10_000 +FETCH_TS_LIMIT = 100 +RETRIEVE_LATEST_LIMIT = 100 + + +ALL_SORTED_DP_AGGS = sorted( + [ + "average", + "max", + "min", + "count", + "sum", + "interpolation", + "step_interpolation", + "continuous_variance", + "discrete_variance", + "total_variation", + ] +) + + +class CustomDatapointsQuery(TypedDict, total=False): + # No field required + start: Union[int, str, datetime, None] + end: Union[int, str, datetime, None] + aggregates: Optional[List[str]] + granularity: Optional[str] + limit: Optional[int] + include_outside_points: Optional[bool] + ignore_unknown_ids: Optional[bool] + + +class DatapointsQueryId(CustomDatapointsQuery): + id: int # required field + + +class DatapointsQueryExternalId(CustomDatapointsQuery): + external_id: str # required field + + +class CustomDatapoints(TypedDict, total=False): + # No field required + start: int + end: int + aggregates: Optional[List[str]] + granularity: Optional[str] + limit: int + include_outside_points: bool + + +class DatapointsPayload(CustomDatapoints): + items: List[CustomDatapoints] + + +DatapointsTypes = Union[int, float, str] + + +class DatapointsFromAPI(TypedDict): + id: int + externalId: Optional[str] + isString: bool + isStep: bool + datapoints: List[Dict[str, DatapointsTypes]] + + +DatapointsIdTypes = Union[int, DatapointsQueryId, Iterable[Union[int, DatapointsQueryId]]] +DatapointsExternalIdTypes = Union[str, 
DatapointsQueryExternalId, Iterable[Union[str, DatapointsQueryExternalId]]] diff --git a/cognite/client/_api/datapoint_tasks.py b/cognite/client/_api/datapoint_tasks.py new file mode 100644 index 0000000000..9db1521daa --- /dev/null +++ b/cognite/client/_api/datapoint_tasks.py @@ -0,0 +1,1165 @@ +from __future__ import annotations + +import math +import numbers +import operator as op +import warnings +from abc import abstractmethod +from datetime import datetime +from functools import cached_property +from itertools import chain +from threading import Lock +from typing import ( + TYPE_CHECKING, + Any, + Callable, + Dict, + Hashable, + Iterable, + Iterator, + List, + Literal, + NoReturn, + Optional, + Sequence, + Tuple, + TypeVar, + Union, + cast, + overload, +) + +from sortedcontainers import SortedDict, SortedList # type: ignore [import] + +from cognite.client._api.datapoint_constants import ( + DPS_LIMIT, + DPS_LIMIT_AGG, + NUMPY_IS_AVAILABLE, + CustomDatapoints, + CustomDatapointsQuery, + DatapointsExternalIdTypes, + DatapointsFromAPI, + DatapointsIdTypes, + DatapointsQueryExternalId, + DatapointsQueryId, + DatapointsTypes, +) +from cognite.client.data_classes.datapoints import Datapoints, DatapointsArray, DatapointsQuery +from cognite.client.utils._auxiliary import convert_all_keys_to_snake_case, to_camel_case +from cognite.client.utils._identifier import Identifier +from cognite.client.utils._time import ( + align_start_and_end_for_granularity, + granularity_to_ms, + split_time_range, + timestamp_to_ms, +) + +if NUMPY_IS_AVAILABLE: + import numpy as np + + +if TYPE_CHECKING: + import numpy.typing as npt + + +T = TypeVar("T") + + +class _SingleTSQueryValidator: + def __init__(self, user_query: DatapointsQuery) -> None: + self.user_query = user_query + self.defaults: CustomDatapointsQuery = dict( + start=user_query.start, + end=user_query.end, + limit=user_query.limit, + aggregates=user_query.aggregates, + granularity=user_query.granularity, + include_outside_points=user_query.include_outside_points, + ignore_unknown_ids=user_query.ignore_unknown_ids, + ) + + def validate_and_create_single_queries(self) -> List[_SingleTSQueryBase]: + queries = [] + if self.user_query.id is not None: + id_queries = self._validate_multiple_id(self.user_query.id) + queries.extend(id_queries) + if self.user_query.external_id is not None: + xid_queries = self._validate_multiple_xid(self.user_query.external_id) + queries.extend(xid_queries) + if queries: + return queries + raise ValueError("Pass at least one time series `id` or `external_id`!") + + def _validate_multiple_id(self, id: DatapointsIdTypes) -> List[_SingleTSQueryBase]: + return self._validate_id_or_xid(id, "id", numbers.Integral, is_external_id=False) + + def _validate_multiple_xid(self, external_id: DatapointsExternalIdTypes) -> List[_SingleTSQueryBase]: + return self._validate_id_or_xid(external_id, "external_id", str, is_external_id=True) + + def _validate_id_or_xid( + self, + id_or_xid: Union[DatapointsIdTypes, DatapointsExternalIdTypes], + arg_name: str, + exp_type: type, + is_external_id: bool, + ) -> List[_SingleTSQueryBase]: + + if isinstance(id_or_xid, (exp_type, dict)): + # Lazy - we postpone evaluation: + id_or_xid = [id_or_xid] # type: ignore [assignment] + + if not isinstance(id_or_xid, Sequence): + # We use Sequence which requires an odering of its iterable elements + self._raise_on_wrong_ts_identifier_type(id_or_xid, arg_name, exp_type) + + queries = [] + for ts in id_or_xid: + if isinstance(ts, exp_type): + # We merge 'defaults' 
and given ts-dict, ts-dict takes precedence: + ts_dct = {**self.defaults, arg_name: ts} + queries.append(self._validate_and_create_query(ts_dct)) # type: ignore [arg-type] + + elif isinstance(ts, dict): + ts_validated = self._validate_ts_query_dict_keys(ts, arg_name, exp_type) + ts_dct = {**self.defaults, **ts_validated} + queries.append(self._validate_and_create_query(ts_dct)) # type: ignore [arg-type] + else: # pragma: no cover + self._raise_on_wrong_ts_identifier_type(ts, arg_name, exp_type) + return queries + + @staticmethod + def _raise_on_wrong_ts_identifier_type( + id_or_xid: Union[DatapointsIdTypes, DatapointsExternalIdTypes], + arg_name: str, + exp_type: type, + ) -> NoReturn: + raise TypeError( + f"Got unsupported type {type(id_or_xid)}, as, or part of argument `{arg_name}`. Expected one of " + f"{exp_type}, {dict} or a (mixed) list of these, but got `{id_or_xid}`." + ) + + @staticmethod + def _validate_ts_query_dict_keys( + dct: Dict[str, Any], arg_name: str, exp_type: type + ) -> Union[DatapointsQueryId, DatapointsQueryExternalId]: + if arg_name not in dct: + if (arg_name_cc := to_camel_case(arg_name)) not in dct: + raise KeyError(f"Missing required key `{arg_name}` in dict: {dct}.") + # For backwards compatibility we accept identifiers in camel case: (Make copy to avoid side effects + # for user's input). Also means we need to return it. + dct[arg_name] = (dct := dct.copy()).pop(arg_name_cc) + + ts_identifier = dct[arg_name] + if not isinstance(ts_identifier, exp_type): + _SingleTSQueryValidator._raise_on_wrong_ts_identifier_type(ts_identifier, arg_name, exp_type) + + opt_dct_keys = {"start", "end", "aggregates", "granularity", "include_outside_points", "limit"} + bad_keys = set(dct) - opt_dct_keys - {arg_name} + if not bad_keys: + return dct # type: ignore [return-value] + raise KeyError( + f"Dict provided by argument `{arg_name}` included key(s) not understood: {sorted(bad_keys)}. " + f"Required key: `{arg_name}`. Optional: {list(opt_dct_keys)}." 
+ ) + + def _validate_and_create_query( + self, dct: Union[DatapointsQueryId, DatapointsQueryExternalId] + ) -> _SingleTSQueryBase: + limit = self._verify_limit(dct["limit"]) + granularity, aggregates = dct["granularity"], dct["aggregates"] + + if not (granularity is None or isinstance(granularity, str)): + raise TypeError(f"Expected `granularity` to be of type `str` or None, not {type(granularity)}") + + elif not (aggregates is None or isinstance(aggregates, list)): + raise TypeError(f"Expected `aggregates` to be of type `list[str]` or None, not {type(aggregates)}") + + elif aggregates is None: + if granularity is None: + # Request is for raw datapoints: + raw_query = self._convert_parameters(dct, limit, is_raw=True) + if limit is None: + return _SingleTSQueryRawUnlimited(**raw_query) + return _SingleTSQueryRawLimited(**raw_query) + raise ValueError("When passing `granularity`, argument `aggregates` is also required.") + + # Aggregates must be a list at this point: + elif len(aggregates) == 0: + raise ValueError("Empty list of `aggregates` passed, expected at least one!") + + elif granularity is None: + raise ValueError("When passing `aggregates`, argument `granularity` is also required.") + + elif dct["include_outside_points"] is True: + raise ValueError("'Include outside points' is not supported for aggregates.") + # Request is for one or more aggregates: + agg_query = self._convert_parameters(dct, limit, is_raw=False) + if limit is None: + return _SingleTSQueryAggUnlimited(**agg_query) + return _SingleTSQueryAggLimited(**agg_query) + + def _convert_parameters( + self, + dct: Union[DatapointsQueryId, DatapointsQueryExternalId], + limit: Optional[int], + is_raw: bool, + ) -> Dict[str, Any]: + identifier = Identifier.of_either(dct.get("id"), dct.get("external_id")) # type: ignore [arg-type] + start, end = self._verify_time_range(dct["start"], dct["end"], dct["granularity"], is_raw, identifier) + converted = { + "identifier": identifier, + "start": start, + "end": end, + "limit": limit, + "ignore_unknown_ids": dct["ignore_unknown_ids"], + } + if is_raw: + converted["include_outside_points"] = dct["include_outside_points"] + else: + converted["aggregates"] = dct["aggregates"] + converted["granularity"] = dct["granularity"] + return converted + + def _verify_limit(self, limit: Optional[int]) -> Optional[int]: + if limit in {None, -1, math.inf}: + return None + elif isinstance(limit, numbers.Integral) and limit >= 0: # limit=0 is accepted by the API + try: + # We don't want weird stuff like numpy dtypes etc: + return int(limit) + except Exception: # pragma no cover + raise TypeError(f"Unable to convert given {limit=} to integer") + raise TypeError( + "Parameter `limit` must be a non-negative integer -OR- one of [None, -1, inf] to " + f"indicate an unlimited query. 
Got: {limit} with type: {type(limit)}" + ) + + def _verify_time_range( + self, + start: Union[int, str, datetime, None], + end: Union[int, str, datetime, None], + granularity: Optional[str], + is_raw: bool, + identifier: Identifier, + ) -> Tuple[int, int]: + if start is None: + start = 0 # 1970-01-01 + else: + start = timestamp_to_ms(start) + if end is None: + end = "now" + end = timestamp_to_ms(end) + + if end <= start: + raise ValueError( + f"Invalid time range, {end=} must be later than {start=} " + f"(from query: {identifier.as_dict(camel_case=False)})" + ) + if not is_raw: # API rounds aggregate query timestamps in a very particular fashion + start, end = align_start_and_end_for_granularity(start, end, cast(str, granularity)) + return start, end + + +class _SingleTSQueryBase: + def __init__( + self, + *, + identifier: Identifier, + start: int, + end: int, + max_query_limit: int, + limit: Optional[int], + include_outside_points: bool, + ignore_unknown_ids: bool, + ) -> None: + self.identifier = identifier + self.start = start + self.end = end + self.max_query_limit = max_query_limit + self.limit = limit + self.include_outside_points = include_outside_points + self.ignore_unknown_ids = ignore_unknown_ids + + self._is_missing: Optional[bool] = None + self._is_string: Optional[bool] = None + + if self.include_outside_points and self.limit is not None: + warnings.warn( + "When using `include_outside_points=True` with a finite `limit` you may get a large gap " + "between the last 'inside datapoint' and the 'after/outside' datapoint. Note also that the " + "up-to-two outside points come in addition to your given `limit`; asking for 5 datapoints might " + "yield 5, 6 or 7. It's a feature, not a bug ;)", + UserWarning, + ) + + @property + def capped_limit(self) -> int: + if self.limit is None: + return self.max_query_limit + return min(self.limit, self.max_query_limit) + + def override_max_query_limit(self, new_limit: int) -> None: + assert isinstance(new_limit, int) + self.max_query_limit = new_limit + + @property + @abstractmethod + def is_raw_query(self) -> bool: + ... + + @property + @abstractmethod + def ts_task_type(self) -> type[BaseConcurrentTask]: + ... 
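+
+    # NOTE: the `is_missing` and `is_string` flags below start out unknown (None) and are only set
+    # (via their setters) after the first API response for this query has been received; reading
+    # them before that raises a RuntimeError.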
+ + @property + def is_missing(self) -> bool: + if self._is_missing is None: + raise RuntimeError("Before making API-calls the `is_missing` status is unknown") + return self._is_missing + + @is_missing.setter + def is_missing(self, value: bool) -> None: + assert isinstance(value, bool) + self._is_missing = value + + @property + def is_string(self) -> bool: + if self._is_string is None: + raise RuntimeError( + "For queries asking for raw datapoints, the `is_string` status is unknown before " + "any API-calls have been made" + ) + return self._is_string + + @is_string.setter + def is_string(self, value: bool) -> None: + assert isinstance(value, bool) + self._is_string = value + + +class _SingleTSQueryRaw(_SingleTSQueryBase): + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs, max_query_limit=DPS_LIMIT) + self.aggregates = self.aggregates_cc = None + self.granularity = None + + @property + def is_raw_query(self) -> bool: + return True + + def to_payload(self) -> Dict[str, Any]: + return { + **self.identifier.as_dict(), + "start": self.start, + "end": self.end, + "limit": self.capped_limit, + "includeOutsidePoints": self.include_outside_points, + } + + +class _SingleTSQueryRawLimited(_SingleTSQueryRaw): + def __init__(self, *, limit: int, **kwargs: Any) -> None: + super().__init__(limit=limit, **kwargs) + assert isinstance(limit, int) + + @property + def ts_task_type(self) -> type[ParallelLimitedRawTask]: + return ParallelLimitedRawTask + + +class _SingleTSQueryRawUnlimited(_SingleTSQueryRaw): + def __init__(self, *, limit: None, **kwargs: Any) -> None: + super().__init__(limit=limit, **kwargs) + + @property + def ts_task_type(self) -> type[ParallelUnlimitedRawTask]: + return ParallelUnlimitedRawTask + + +class _SingleTSQueryAgg(_SingleTSQueryBase): + def __init__(self, *, aggregates: List[str], granularity: str, **kwargs: Any) -> None: + agg_query_settings = dict(include_outside_points=False, max_query_limit=DPS_LIMIT_AGG) + super().__init__(**kwargs, **agg_query_settings) # type: ignore [arg-type] + self.aggregates = aggregates + self.granularity = granularity + + @property + def is_raw_query(self) -> bool: + return False + + @cached_property + def aggregates_cc(self) -> List[str]: + return list(map(to_camel_case, self.aggregates)) + + def to_payload(self) -> Dict[str, Any]: + return { + **self.identifier.as_dict(), + "start": self.start, + "end": self.end, + "aggregates": self.aggregates_cc, + "granularity": self.granularity, + "limit": self.capped_limit, + "includeOutsidePoints": self.include_outside_points, + } + + +class _SingleTSQueryAggLimited(_SingleTSQueryAgg): + def __init__(self, *, limit: int, **kwargs: Any) -> None: + super().__init__(limit=limit, **kwargs) + assert isinstance(limit, int) + + @property + def ts_task_type(self) -> type[ParallelLimitedAggTask]: + return ParallelLimitedAggTask + + +class _SingleTSQueryAggUnlimited(_SingleTSQueryAgg): + def __init__(self, *, limit: None, **kwargs: Any) -> None: + super().__init__(limit=limit, **kwargs) + + @property + def ts_task_type(self) -> type[ParallelUnlimitedAggTask]: + return ParallelUnlimitedAggTask + + +class DpsUnpackFns: + ts: Callable[[Dict], int] = op.itemgetter("timestamp") + raw_dp: Callable[[Dict], DatapointsTypes] = op.itemgetter("value") + ts_dp_tpl: Callable[[Dict], Tuple[int, DatapointsTypes]] = op.itemgetter("timestamp", "value") + ts_count_tpl: Callable[[Dict], Tuple[int, int]] = op.itemgetter("timestamp", "count") + count: Callable[[Dict], int] = op.itemgetter("count") + + @staticmethod + 
def custom_from_aggregates( + lst: List[str], + ) -> Callable[[List[Dict[str, DatapointsTypes]]], Tuple[DatapointsTypes, ...]]: + return op.itemgetter(*lst) + + +class DefaultSortedDict(SortedDict): + def __init__(self, default_factory: Callable[[], T], /, **kw: Any): + self.default_factory = default_factory + super().__init__(**kw) + + def __missing__(self, key: Hashable) -> T: + self[key] = self.default_factory() + return self[key] + + +def dps_container() -> DefaultSortedDict: + """Initialises a new sorted container for datapoints storage""" + return DefaultSortedDict(list) + + +def subtask_lst() -> SortedList: + """Initialises a new sorted list for subtasks""" + return SortedList(key=op.attrgetter("subtask_idx")) + + +def create_array_from_dps_container(container: DefaultSortedDict) -> npt.NDArray: + return np.hstack(list(chain.from_iterable(container.values()))) + + +def create_aggregates_arrays_from_dps_container(container: DefaultSortedDict, n_aggs: int) -> List[npt.NDArray]: + all_aggs_arr = np.vstack(list(chain.from_iterable(container.values()))) + return list(map(np.ravel, np.hsplit(all_aggs_arr, n_aggs))) + + +def create_list_from_dps_container(container: DefaultSortedDict) -> List: + return list(chain.from_iterable(chain.from_iterable(container.values()))) + + +def create_aggregates_list_from_dps_container(container: DefaultSortedDict) -> Iterator[List[List]]: + concatenated = chain.from_iterable(chain.from_iterable(container.values())) + return map(list, zip(*concatenated)) # rows to columns + + +class BaseDpsFetchSubtask: + def __init__( + self, + start: int, + end: int, + identifier: Identifier, + parent: BaseConcurrentTask, + priority: int, + max_query_limit: int, + n_dps_left: float, + is_raw_query: bool, + ) -> None: + self.start = start + self.end = end + self.identifier = identifier + self.parent = parent + self.priority = priority + self.is_raw_query = is_raw_query + self.max_query_limit = max_query_limit + self.n_dps_left = n_dps_left + + self.is_done = False + + @abstractmethod + def get_next_payload(self) -> Optional[Dict[str, Any]]: + ... + + @abstractmethod + def store_partial_result(self, res: DatapointsFromAPI) -> Optional[List[SplittingFetchSubtask]]: + ... + + +class OutsideDpsFetchSubtask(BaseDpsFetchSubtask): + """Fetches outside points and stores in parent""" + + def __init__(self, **kwargs: Any) -> None: + outside_dps_settings = dict(priority=0, is_raw_query=True, max_query_limit=0, n_dps_left=0) + super().__init__(**kwargs, **outside_dps_settings) # type: ignore [arg-type] + + def get_next_payload(self) -> Optional[Dict[str, Any]]: + if self.is_done: + return None + return self._create_payload_item() + + def _create_payload_item(self) -> Dict[str, Any]: + return { + **self.identifier.as_dict(), + "start": self.start, + "end": self.end, + "limit": 0, # Not a bug; it just returns the outside points + "includeOutsidePoints": True, + } + + def store_partial_result(self, res: DatapointsFromAPI) -> None: + if dps := res["datapoints"]: + self.parent._extract_outside_points(dps) + self.is_done = True + + +class SerialFetchSubtask(BaseDpsFetchSubtask): + """Fetches datapoints serially until complete, nice and simple. 
Stores data in parent""" + + def __init__( + self, + *, + limit: Optional[int], + aggregates: Optional[List[str]], + granularity: Optional[str], + subtask_idx: Tuple[float, ...], + **kwargs: Any, + ) -> None: + n_dps_left = math.inf if limit is None else limit + super().__init__(**kwargs, n_dps_left=n_dps_left) + self.limit = limit + self.aggregates = aggregates + self.granularity = granularity + self.subtask_idx = subtask_idx + self.n_dps_fetched = 0 + self.agg_kwargs = {} + + self.next_start = self.start + if not self.is_raw_query: + self.agg_kwargs = {"aggregates": self.aggregates, "granularity": self.granularity} + + def get_next_payload(self) -> Optional[Dict[str, Any]]: + if self.is_done: + return None + remaining = self.parent.remaining_limit(self) + if self.parent.ts_info is not None and remaining == 0: + # Since last time this task fetched points, earlier tasks have already fetched >= limit dps. + # (If ts_info isn't known, it means we still have to send out a request, happens when given limit=0) + self.is_done, ts_task = True, self.parent + with self.parent.lock: # Keep sorted list `subtasks` from being mutated + _ = ts_task.is_done # Trigger a check of parent task + # Update all subsequent subtasks to "is done": + i_start = 1 + ts_task.subtasks.index(self) + for task in ts_task.subtasks[i_start:]: + task.is_done = True + return None + return self._create_payload_item(math.inf if remaining is None else remaining) + + def _create_payload_item(self, remaining_limit: float) -> Dict[str, Any]: + return { + **self.identifier.as_dict(), + "start": self.next_start, + "end": self.end, + "limit": min(remaining_limit, self.n_dps_left, self.max_query_limit), + **self.agg_kwargs, + } + + def store_partial_result(self, res: DatapointsFromAPI) -> None: + if self.parent.ts_info is None: + # In eager mode, first task to complete gets the honor to store ts info: + self.parent._store_ts_info(res) + + if not (dps := res["datapoints"]): + self.is_done = True + return None + + n, last_ts = len(dps), cast(int, dps[-1]["timestamp"]) + self.parent._unpack_and_store(self.subtask_idx, dps) + self._update_state_for_next_payload(last_ts, n) + if self._is_task_done(n): + self.is_done = True + + def _update_state_for_next_payload(self, last_ts: int, n: int) -> None: + self.next_start = last_ts + self.parent.offset_next # Move `start` to prepare for next query + self.n_dps_left -= n + self.n_dps_fetched += n # Used to quit limited queries asap + + def _is_task_done(self, n: int) -> bool: + return self.n_dps_left == 0 or n < self.max_query_limit or self.next_start == self.end + + +class SplittingFetchSubtask(SerialFetchSubtask): + """Fetches data serially, but splits its time domain ("divide and conquer") based on the density + of returned datapoints. 
Stores data in parent""" + + def __init__(self, *, max_splitting_factor: int = 10, **kwargs: Any) -> None: + super().__init__(**kwargs) + self.max_splitting_factor = max_splitting_factor + self.split_subidx: int = 0 # Actual value doesnt matter (any int will do) + + def store_partial_result(self, res: DatapointsFromAPI) -> Optional[List[SplittingFetchSubtask]]: # type: ignore [override] + self.prev_start = self.next_start + super().store_partial_result(res) + if not self.is_done: + last_ts = res["datapoints"][-1]["timestamp"] + return self._split_self_into_new_subtasks_if_needed(cast(int, last_ts)) + return None + + def _create_subtasks_idxs(self, n_new_tasks: int) -> Iterable[Tuple[float, ...]]: + """Since this task may decide to split itself multiple times, we count backwards to keep order + (we rely on tuple sorting logic). Example using `self.subtask_idx=(4,)`: + - First split into e.g. 3 parts: (4,-3), (4,-2), (4,-1) + - Next, split into 2: (4, -5) and (4, -4). These now sort before the first split.""" + end = self.split_subidx + self.split_subidx -= n_new_tasks + yield from ((*self.subtask_idx, i) for i in range(self.split_subidx, end)) + + def _split_self_into_new_subtasks_if_needed(self, last_ts: int) -> Optional[List[SplittingFetchSubtask]]: + # How many new tasks because of % of time range was fetched? + tot_ms = self.end - (start := self.prev_start) + part_ms = last_ts - start + ratio_retrieved = part_ms / tot_ms + n_new_pct = math.floor(1 / ratio_retrieved) + # How many new tasks because of limit left (if limit)? + n_new_lim = math.inf + if (remaining_limit := self.parent.remaining_limit(self)) is not None: + n_new_lim = math.ceil(remaining_limit / self.max_query_limit) + # We pick strictest criterion: + n_new_tasks = min(cast(int, n_new_lim), n_new_pct, self.max_splitting_factor + 1) # +1 for "self next" + if n_new_tasks <= 1: # No point in splitting; no faster than this task just continuing + return None + # Find a `delta_ms` thats a multiple of granularity in ms (trivial for raw queries): + boundaries = split_time_range(last_ts, self.end, n_new_tasks, self.parent.offset_next) + self.end = boundaries[1] # We shift end of 'self' backwards + static_params = { + "parent": self.parent, + "priority": self.priority, + "identifier": self.identifier, + "aggregates": self.aggregates, + "granularity": self.granularity, + "max_query_limit": self.max_query_limit, + "is_raw_query": self.is_raw_query, + } + split_idxs = self._create_subtasks_idxs(n_new_tasks) + new_subtasks = [ + SplittingFetchSubtask( + start=start, end=end, limit=remaining_limit, subtask_idx=idx, **static_params # type: ignore [arg-type] + ) + for start, end, idx in zip(boundaries[1:-1], boundaries[2:], split_idxs) + ] + self.parent.subtasks.update(new_subtasks) + return new_subtasks + + +class BaseConcurrentTask: + def __init__( + self, + query: Any, # subclasses assert correct type + eager_mode: bool, + use_numpy: bool, + first_dps_batch: Optional[DatapointsFromAPI] = None, + first_limit: Optional[int] = None, + ) -> None: + self.query = query + self.eager_mode = eager_mode + self.use_numpy = use_numpy + self.ts_info = None + self.ts_data = dps_container() + self.dps_data = dps_container() + self.subtasks = subtask_lst() + self.subtask_outside_points: Optional[OutsideDpsFetchSubtask] = None + self.raw_dtype: Optional[type] = None + self._is_done = False + self.lock = Lock() + + self.has_limit = self.query.limit is not None + # When running large queries (i.e. 
not "eager"), all time series have a first batch fetched before + # further subtasks are created. This gives us e.g. outside points (if asked for) and ts info: + if not self.eager_mode: + assert first_limit is not None and first_dps_batch is not None # mypy... + dps = first_dps_batch.pop("datapoints") # type: ignore [misc] + self.ts_info = first_dps_batch # Store just the ts info + self.raw_dtype = self._decide_dtype_from_is_string(first_dps_batch["isString"]) + if not dps: + self._is_done = True + return None + self._store_first_batch(dps, first_limit) + + @property + def n_dps_first_batch(self) -> int: + if self.eager_mode: + return 0 + return len(self.ts_data[(0,)][0]) + + @property + def is_done(self) -> bool: + if self.ts_info is None: + return False + elif self._is_done: + return True + elif self.subtask_outside_points and not self.subtask_outside_points.is_done: + return False + elif self.subtasks: + self._is_done = all(task.is_done for task in self.subtasks) + return self._is_done + + @is_done.setter + def is_done(self, value: bool) -> None: + self._is_done = value + + @property + @abstractmethod + def offset_next(self) -> int: + ... + + @abstractmethod + def get_result(self) -> Union[Datapoints, DatapointsArray]: + ... + + @abstractmethod + def _unpack_and_store(self, idx: Tuple[float, ...], dps: List[Dict[str, DatapointsTypes]]) -> None: + ... + + @abstractmethod + def _extract_outside_points(self, dps: List[Dict[str, DatapointsTypes]]) -> None: + ... + + @abstractmethod + def _find_number_of_subtasks_uniform_split(self, tot_ms: int, n_workers_per_queries: int) -> int: + ... + + def split_into_subtasks(self, max_workers: int, n_tot_queries: int) -> List[SplittingFetchSubtask]: + # Given e.g. a single time series, we want to put all our workers to work by splitting into lots of pieces! + # As the number grows - or we start combining multiple into the same query - we want to split less: + # we hold back to not create too many subtasks: + if self.is_done: + return [] + n_workers_per_queries = max(1, round(max_workers / n_tot_queries)) + subtasks = self._create_uniformly_split_subtasks(n_workers_per_queries) + self.subtasks.update(subtasks) + if self.eager_mode and self.query.include_outside_points: + # In eager mode we do not get the "first dps batch" to extract outside points from: + self.subtask_outside_points = OutsideDpsFetchSubtask( + start=self.query.start, + end=self.query.end, + identifier=self.query.identifier, + parent=self, + ) + # Append the outside subtask to returned subtasks so that it will be queued: + subtasks.append(self.subtask_outside_points) # type: ignore [arg-type] + return subtasks + + def _create_uniformly_split_subtasks(self, n_workers_per_queries: int) -> List[SplittingFetchSubtask]: + start = self.query.start if self.eager_mode else self.first_start + tot_ms = (end := self.query.end) - start + n_periods = self._find_number_of_subtasks_uniform_split(tot_ms, n_workers_per_queries) + boundaries = split_time_range(start, end, n_periods, self.offset_next) + limit = self.query.limit - self.n_dps_first_batch if self.has_limit else None + return [ + SplittingFetchSubtask( + start=start, + end=end, + limit=limit, + subtask_idx=(i,), + parent=self, + priority=i - 1 if self.has_limit else 0, # Prioritise in chrono. 
order + identifier=self.query.identifier, + aggregates=self.query.aggregates_cc, + granularity=self.query.granularity, + max_query_limit=self.query.max_query_limit, + is_raw_query=self.query.is_raw_query, + ) + for i, (start, end) in enumerate(zip(boundaries[:-1], boundaries[1:]), 1) + ] + + def _decide_dtype_from_is_string(self, is_string: bool) -> type: + return np.object_ if is_string else np.float64 + + def _store_ts_info(self, res: DatapointsFromAPI) -> None: + self.ts_info = {k: v for k, v in res.items() if k != "datapoints"} # type: ignore [assignment] + if self.use_numpy: + self.raw_dtype = self._decide_dtype_from_is_string(res["isString"]) + + def _store_first_batch(self, dps: List[Dict[str, DatapointsTypes]], first_limit: int) -> None: + # Set `start` for the first subtask: + self.first_start = cast(int, dps[-1]["timestamp"]) + self.offset_next + self._unpack_and_store((0,), dps) + + # Are we done after first batch? + if self.first_start == self.query.end: + self._is_done = True + elif self.has_limit and len(dps) <= self.query.limit <= first_limit: + self._is_done = True + elif len(dps) < first_limit: + self._is_done = True + + def remaining_limit(self, subtask: BaseDpsFetchSubtask) -> Optional[int]: + if not self.has_limit: + return None + # For limited queries: if the sum of fetched points of earlier tasks have already hit/surpassed + # `limit`, we know for sure we can cancel later/future tasks: + remaining = cast(int, self.query.limit) + with self.lock: # Keep sorted list `subtasks` from being mutated + for task in self.subtasks: + # Sum up to - but not including - given subtask: + if task is subtask or (remaining := remaining - task.n_dps_fetched) <= 0: + break + return max(0, remaining) + + +class ConcurrentLimitedMixin(BaseConcurrentTask): + @property + def is_done(self) -> bool: + if self.ts_info is None: + return False + elif self._is_done: + return True + elif self.subtask_outside_points and not self.subtask_outside_points.is_done: + return False + elif self.subtasks: + # Checking if subtasks are done is not enough; we need to check if the sum of + # "len of dps takewhile is_done" has reached the limit. Additionally, each subtask might + # need to fetch a lot of the time subdomain. 
We want to quit early also when the limit is + # reached in the first (chronologically) non-finished subtask: + i_first_in_progress = True + n_dps_to_fetch = cast(int, self.query.limit) - self.n_dps_first_batch + for i, task in enumerate(self.subtasks): + if not (task.is_done or i_first_in_progress): + break + if i_first_in_progress: + i_first_in_progress = False + + n_dps_to_fetch -= task.n_dps_fetched + if n_dps_to_fetch == 0: + self._is_done = True + # Update all consecutive subtasks to "is done": + for task in self.subtasks[i + 1 :]: + task.is_done = True + break + # Stop forward search as current task is not done, and limit was not reached: + # (We risk that the next task is already done, and will thus miscount) + if not i_first_in_progress: + break + else: + # All subtasks are done, but limit was -not- reached: + self._is_done = True + return self._is_done + + @is_done.setter + def is_done(self, value: bool) -> None: # Kill switch + self._is_done = value + + +class BaseConcurrentRawTask(BaseConcurrentTask): + def __init__(self, **kwargs: Any) -> None: + self.dp_outside_start: Optional[Tuple[int, DatapointsTypes]] = None + self.dp_outside_end: Optional[Tuple[int, DatapointsTypes]] = None + super().__init__(**kwargs) + + @property + def offset_next(self) -> int: + return 1 # 1 ms + + def _create_empty_result(self) -> Union[Datapoints, DatapointsArray]: + if not self.use_numpy: + return Datapoints(**convert_all_keys_to_snake_case(self.ts_info), timestamp=[], value=[]) + return DatapointsArray._load( + { + **cast(dict, self.ts_info), + "timestamp": np.array([], dtype=np.int64), + "value": np.array([], dtype=self.raw_dtype), + } + ) + + def _no_data_fetched(self) -> bool: + return not any((self.ts_data, self.dp_outside_start, self.dp_outside_end)) + + def get_result(self) -> Union[Datapoints, DatapointsArray]: + if self._no_data_fetched(): + return self._create_empty_result() + if self.has_limit: + self._cap_dps_at_limit() + if self.query.include_outside_points: + self._include_outside_points_in_result() + if self.use_numpy: + return DatapointsArray._load( + { + **cast(dict, self.ts_info), + "timestamp": create_array_from_dps_container(self.ts_data), + "value": create_array_from_dps_container(self.dps_data), + } + ) + return Datapoints( + **convert_all_keys_to_snake_case(self.ts_info), + timestamp=create_list_from_dps_container(self.ts_data), + value=create_list_from_dps_container(self.dps_data), + ) + + def _find_number_of_subtasks_uniform_split(self, tot_ms: int, n_workers_per_queries: int) -> int: + # It makes no sense to split beyond what the max-size of a query allows (for a maximally dense + # time series), but that is rarely useful as 100k dps is just 1 min 40 sec... 
we guess an + # average density of points at 1 dp/sec, giving us split-windows no smaller than ~1 day: + return min(n_workers_per_queries, math.ceil((tot_ms / 1000) / self.query.max_query_limit)) + + def _cap_dps_at_limit(self) -> None: + # Note 1: Outside points do not count towards given limit (API spec) + # Note 2: Lock not needed; called after pool is shut down + count = 0 + for i, (subtask_idx, sublist) in enumerate(self.ts_data.items()): + for j, seq in enumerate(sublist): + if count + len(seq) < self.query.limit: + count += len(seq) + continue + end = self.query.limit - count + self.ts_data[subtask_idx][j] = seq[:end] + self.dps_data[subtask_idx][j] = self.dps_data[subtask_idx][j][:end] + # Chop off later arrays (or lists) in same sublist (if any): + self.ts_data[subtask_idx] = self.ts_data[subtask_idx][: j + 1] + self.dps_data[subtask_idx] = self.dps_data[subtask_idx][: j + 1] + # Remove later sublists (if any). We keep using DefaultSortedDicts due to the possibility of + # having to insert/add 'outside points' later: + (new_ts := dps_container()).update(self.ts_data.items()[: i + 1]) # type: ignore [index] + (new_dps := dps_container()).update(self.dps_data.items()[: i + 1]) # type: ignore [index] + self.ts_data, self.dps_data = new_ts, new_dps + return None + + def _include_outside_points_in_result(self) -> None: + for point, idx in zip((self.dp_outside_start, self.dp_outside_end), (-math.inf, math.inf)): + if point: + ts, dp = [point[0]], [point[1]] + if self.use_numpy: + ts = np.array(ts, dtype=np.int64) + dp = np.array(dp, dtype=self.raw_dtype) + self.ts_data[(idx,)].append(ts) + self.dps_data[(idx,)].append(dp) + + def _unpack_and_store(self, idx: Tuple[float, ...], dps: List[Dict[str, DatapointsTypes]]) -> None: + if self.use_numpy: # Faster than feeding listcomp to np.array: + self.ts_data[idx].append(np.fromiter(map(DpsUnpackFns.ts, dps), dtype=np.int64, count=len(dps))) # type: ignore [arg-type] + self.dps_data[idx].append(np.fromiter(map(DpsUnpackFns.raw_dp, dps), dtype=self.raw_dtype, count=len(dps))) # type: ignore [arg-type] + else: + self.ts_data[idx].append(list(map(DpsUnpackFns.ts, dps))) + self.dps_data[idx].append(list(map(DpsUnpackFns.raw_dp, dps))) + + def _store_first_batch(self, dps: List[Dict[str, DatapointsTypes]], first_limit: int) -> None: + if self.query.is_raw_query and self.query.include_outside_points: + self._extract_outside_points(dps) + if not dps: # We might have only gotten outside points + self._is_done = True + return None + super()._store_first_batch(dps, first_limit) + + def _extract_outside_points(self, dps: List[Dict[str, DatapointsTypes]]) -> None: + first_ts = cast(int, dps[0]["timestamp"]) + if first_ts < self.query.start: + # We got a dp before `start`, this should not impact our count towards `limit`: + self.dp_outside_start = DpsUnpackFns.ts_dp_tpl(dps.pop(0)) # Slow pop :( + if dps: + last_ts = cast(int, dps[-1]["timestamp"]) + if last_ts >= self.query.end: # >= because `end` is exclusive + self.dp_outside_end = DpsUnpackFns.ts_dp_tpl(dps.pop()) # Fast pop :) + + +class ParallelUnlimitedRawTask(BaseConcurrentRawTask): + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + # This entire method just to tell mypy: + assert isinstance(self.query, _SingleTSQueryRawUnlimited) + + +class ParallelLimitedRawTask(ConcurrentLimitedMixin, BaseConcurrentRawTask): + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + # This entire method just to tell mypy: + assert isinstance(self.query, 
_SingleTSQueryRawLimited) + + def _find_number_of_subtasks_uniform_split(self, tot_ms: int, n_workers_per_queries: int) -> int: + # We make the guess that the time series has ~1 dp/sec and use this in combination with the given + # limit to not split into too many queries (highest throughput when each request is close to max limit) + n_estimate_periods = math.ceil((tot_ms / 1000) / self.query.max_query_limit) + remaining_limit = self.query.limit - self.n_dps_first_batch + n_periods = max(1, math.ceil(remaining_limit / self.query.max_query_limit)) + # Pick the smallest N from constraints: + return min(n_workers_per_queries, n_periods, n_estimate_periods) + + +class BaseConcurrentAggTask(BaseConcurrentTask): + def __init__(self, *, query: _SingleTSQueryAgg, use_numpy: bool, **kwargs: Any) -> None: + aggregates_cc = query.aggregates_cc + self._set_aggregate_vars(aggregates_cc, use_numpy) + super().__init__(query=query, use_numpy=use_numpy, **kwargs) + + @cached_property + def offset_next(self) -> int: + return granularity_to_ms(self.query.granularity) + + def _set_aggregate_vars(self, aggregates_cc: List[str], use_numpy: bool) -> None: + self.float_aggs = aggregates_cc[:] + self.is_count_query = "count" in self.float_aggs + if self.is_count_query: + self.count_data = dps_container() + self.float_aggs.remove("count") # Only aggregate that is integer, handle separately + + self.has_non_count_aggs = bool(self.float_aggs) + if self.has_non_count_aggs: + self.agg_unpack_fn = DpsUnpackFns.custom_from_aggregates(self.float_aggs) + + self.first_non_count_agg, *others = self.float_aggs + self.single_non_count_agg = not others + + if use_numpy: + if self.single_non_count_agg: + self.dtype_aggs = np.dtype(np.float64) # type: ignore [assignment] + else: # (.., 1) is deprecated for some reason + self.dtype_aggs = np.dtype((np.float64, len(self.float_aggs))) + + def _find_number_of_subtasks_uniform_split(self, tot_ms: int, n_workers_per_queries: int) -> int: + n_max_dps = tot_ms // self.offset_next # evenly divides + return min(n_workers_per_queries, math.ceil(n_max_dps / self.query.max_query_limit)) + + def _create_empty_result(self) -> Union[Datapoints, DatapointsArray]: + if self.use_numpy: + arr_dct = {"timestamp": np.array([], dtype=np.int64)} + if self.is_count_query: + arr_dct["count"] = np.array([], dtype=np.int64) + if self.has_non_count_aggs: + arr_dct.update({agg: np.array([], dtype=np.float64) for agg in self.float_aggs}) + return DatapointsArray._load({**cast(dict, self.ts_info), **arr_dct}) + + lst_dct = {"timestamp": []} + if self.is_count_query: + lst_dct["count"] = [] + if self.has_non_count_aggs: + lst_dct.update({agg: [] for agg in self.float_aggs}) + return Datapoints(**convert_all_keys_to_snake_case({**cast(dict, self.ts_info), **lst_dct})) + + def get_result(self) -> Union[Datapoints, DatapointsArray]: + if not self.ts_data or self.query.limit == 0: + return self._create_empty_result() + if self.has_limit: + self._cap_dps_at_limit() + + if self.use_numpy: + arr_dct = {"timestamp": create_array_from_dps_container(self.ts_data)} + if self.is_count_query: + arr_dct["count"] = create_array_from_dps_container(self.count_data) + if self.has_non_count_aggs: + arr_lst = create_aggregates_arrays_from_dps_container(self.dps_data, len(self.float_aggs)) + arr_dct.update(dict(zip(self.float_aggs, arr_lst))) + return DatapointsArray._load({**cast(dict, self.ts_info), **arr_dct}) + + lst_dct = {"timestamp": create_list_from_dps_container(self.ts_data)} + if self.is_count_query: + lst_dct["count"] = 
create_list_from_dps_container(self.count_data) + if self.has_non_count_aggs: + if self.single_non_count_agg: + lst_dct[self.first_non_count_agg] = create_list_from_dps_container(self.dps_data) + else: + aggs_iter = create_aggregates_list_from_dps_container(self.dps_data) + lst_dct.update(dict(zip(self.float_aggs, aggs_iter))) + return Datapoints(**convert_all_keys_to_snake_case({**cast(dict, self.ts_info), **lst_dct})) + + def _cap_dps_at_limit(self) -> None: + count, to_update = 0, ["ts_data"] + if self.is_count_query: + to_update.append("count_data") + if self.has_non_count_aggs: + to_update.append("dps_data") + + for i, (subtask_idx, sublist) in enumerate(self.ts_data.items()): + for j, arr in enumerate(sublist): + if count + len(arr) < self.query.limit: + count += len(arr) + continue + end = self.query.limit - count + + for attr in to_update: + data = getattr(self, attr) + data[subtask_idx][j] = data[subtask_idx][j][:end] + data[subtask_idx] = data[subtask_idx][: j + 1] + setattr(self, attr, dict(data.items()[: i + 1])) # regular dict (no further inserts) + return None + + def _unpack_and_store(self, idx: Tuple[float, ...], dps: List[Dict[str, DatapointsTypes]]) -> None: + if self.use_numpy: + self._unpack_and_store_numpy(idx, dps) + else: + self._unpack_and_store_basic(idx, dps) + + def _unpack_and_store_numpy(self, idx: Tuple[float, ...], dps: List[Dict[str, DatapointsTypes]]) -> None: + n = len(dps) + self.ts_data[idx].append(np.fromiter(map(DpsUnpackFns.ts, dps), dtype=np.int64, count=n)) # type: ignore [arg-type] + + if self.is_count_query: + try: + arr = np.fromiter(map(DpsUnpackFns.count, dps), dtype=np.int64, count=n) # type: ignore [arg-type] + except KeyError: + # An interval with no datapoints (hence count does not exist) has data from another aggregate... probably + # (step_)interpolation. Since the resulting agg. arrays share timestamp, we would have to cast count to float in + # order to store the missing values as NaNs... We don't want that, so we fill with zeros to keep correct dtype: + arr = np.array([dp.get("count", 0) for dp in dps], dtype=np.int64) + self.count_data[idx].append(arr) + + if self.has_non_count_aggs: + try: # Fast method uses multi-key unpacking: + arr = np.fromiter(map(self.agg_unpack_fn, dps), dtype=self.dtype_aggs, count=n) # type: ignore [arg-type] + except KeyError: # An aggregate is missing, fallback to slower `dict.get(agg)`. + # This can happen when certain aggs. are undefined, e.g. 
`interpolate` at first interval if rounded down + arr = np.array([tuple(map(dp.get, self.float_aggs)) for dp in dps], dtype=np.float64) + self.dps_data[idx].append(arr.reshape(n, len(self.float_aggs))) + + def _unpack_and_store_basic(self, idx: Tuple[float, ...], dps: List[Dict[str, DatapointsTypes]]) -> None: + self.ts_data[idx].append(list(map(DpsUnpackFns.ts, dps))) + + if self.is_count_query: + self.count_data[idx].append([dp.get("count", 0) for dp in dps]) + + if self.has_non_count_aggs: + try: + lst = list(map(self.agg_unpack_fn, dps)) + except KeyError: + if self.single_non_count_agg: + lst = [dp.get(self.first_non_count_agg) for dp in dps] + else: + lst = [tuple(map(dp.get, self.float_aggs)) for dp in dps] + self.dps_data[idx].append(lst) + + +class ParallelUnlimitedAggTask(BaseConcurrentAggTask): + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + # This entire method just to tell mypy: + assert isinstance(self.query, _SingleTSQueryAggUnlimited) + + +class ParallelLimitedAggTask(ConcurrentLimitedMixin, BaseConcurrentAggTask): + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + # This entire method just to tell mypy: + assert isinstance(self.query, _SingleTSQueryAggLimited) + + def _find_number_of_subtasks_uniform_split(self, tot_ms: int, n_workers_per_queries: int) -> int: + remaining_limit = self.query.limit - self.n_dps_first_batch + n_max_dps = min(remaining_limit, tot_ms // self.offset_next) + return max(1, min(n_workers_per_queries, math.ceil(n_max_dps / self.query.max_query_limit))) diff --git a/cognite/client/_api/datapoints.py b/cognite/client/_api/datapoints.py index 8492cf8996..f8babec054 100644 --- a/cognite/client/_api/datapoints.py +++ b/cognite/client/_api/datapoints.py @@ -1,20 +1,558 @@ -import copy +from __future__ import annotations + +import functools +import heapq +import itertools import math -import re as regexp +import statistics +from abc import ABC, abstractmethod +from concurrent.futures import CancelledError, as_completed +from copy import copy from datetime import datetime -from typing import TYPE_CHECKING, Any, Dict, List, Optional, Set, Tuple, Union, cast - -import cognite.client.utils._time -from cognite.client import utils +from itertools import chain +from typing import ( + TYPE_CHECKING, + Any, + Dict, + Iterable, + Iterator, + List, + Literal, + Optional, + Sequence, + Set, + Tuple, + Union, + cast, + overload, +) + +from cognite.client._api.datapoint_constants import ( + DPS_LIMIT, + DPS_LIMIT_AGG, + FETCH_TS_LIMIT, + POST_DPS_OBJECTS_LIMIT, + RETRIEVE_LATEST_LIMIT, + CustomDatapoints, + DatapointsExternalIdTypes, + DatapointsFromAPI, + DatapointsIdTypes, + DatapointsPayload, +) +from cognite.client._api.datapoint_tasks import ( + BaseConcurrentTask, + SplittingFetchSubtask, + _SingleTSQueryBase, + _SingleTSQueryValidator, +) from cognite.client._api.synthetic_time_series import SyntheticDatapointsAPI from cognite.client._api_client import APIClient -from cognite.client.data_classes import Datapoints, DatapointsList, DatapointsQuery -from cognite.client.data_classes.datapoints import DatapointsExternalIdMaybeAggregate, DatapointsIdMaybeAggregate -from cognite.client.exceptions import CogniteAPIError +from cognite.client.data_classes import ( + Datapoints, + DatapointsArray, + DatapointsArrayList, + DatapointsList, + DatapointsQuery, +) +from cognite.client.exceptions import CogniteAPIError, CogniteNotFoundError +from cognite.client.utils._auxiliary import assert_type, local_import, 
split_into_chunks, split_into_n_parts +from cognite.client.utils._concurrency import collect_exc_info_and_raise, execute_tasks_concurrently from cognite.client.utils._identifier import Identifier, IdentifierSequence +from cognite.client.utils._priority_tpe import PriorityThreadPoolExecutor # type: ignore +from cognite.client.utils._time import timestamp_to_ms if TYPE_CHECKING: - import pandas + from concurrent.futures import Future + + import pandas as pd + + +TSQueryList = List[_SingleTSQueryBase] +PoolSubtaskType = Tuple[int, float, float, SplittingFetchSubtask] + + +def dps_fetch_selector( + dps_client: DatapointsAPI, + user_queries: Sequence[DatapointsQuery], +) -> DpsFetchStrategy: + max_workers = dps_client._config.max_workers + if max_workers < 1: # Dps fetching does not use fn `execute_tasks_concurrently`, so we must check: + raise RuntimeError(f"Invalid option for `{max_workers=}`. Must be at least 1") + all_queries, agg_queries, raw_queries = validate_and_split_user_queries(user_queries) + + # Running mode is decided based on how many time series are requested VS. number of workers: + if len(all_queries) <= max_workers: + # Start shooting requests from the hip immediately: + return EagerDpsFetcher(dps_client, all_queries, agg_queries, raw_queries, max_workers) + # Fetch a smaller, chunked batch of dps from all time series - which allows us to do some rudimentary + # guesstimation of dps density - then chunk away: + return ChunkingDpsFetcher(dps_client, all_queries, agg_queries, raw_queries, max_workers) + + +def validate_and_split_user_queries( + user_queries: Sequence[DatapointsQuery], +) -> Tuple[TSQueryList, TSQueryList, TSQueryList]: + split_qs: Tuple[TSQueryList, TSQueryList] = [], [] + all_queries = list( + chain.from_iterable( + query.validate_and_create_single_queries() for query in map(_SingleTSQueryValidator, user_queries) + ) + ) + for query in all_queries: + split_qs[query.is_raw_query].append(query) + return (all_queries, *split_qs) + + +class DpsFetchStrategy(ABC): + def __init__( + self, + dps_client: DatapointsAPI, + all_queries: TSQueryList, + agg_queries: TSQueryList, + raw_queries: TSQueryList, + max_workers: int, + ) -> None: + self.dps_client = dps_client + self.all_queries = all_queries + self.agg_queries = agg_queries + self.raw_queries = raw_queries + self.max_workers = max_workers + self.n_queries = len(all_queries) + + @overload + def fetch_all_datapoints(self, use_numpy: Literal[True]) -> DatapointsArrayList: + ... + + @overload + def fetch_all_datapoints(self, use_numpy: Literal[False]) -> DatapointsList: + ... + + def fetch_all_datapoints(self, use_numpy: bool) -> Union[DatapointsList, DatapointsArrayList]: + with PriorityThreadPoolExecutor(max_workers=self.max_workers) as pool: + ordered_results = self.fetch_all(pool, use_numpy) + return self._finalize_tasks(ordered_results, use_numpy) + + @overload + def _finalize_tasks( + self, + ordered_results: List[BaseConcurrentTask], + use_numpy: Literal[True], + ) -> DatapointsArrayList: + ... + + @overload + def _finalize_tasks( + self, + ordered_results: List[BaseConcurrentTask], + use_numpy: Literal[False], + ) -> DatapointsList: + ... 
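Editorial aside on the two fetch strategies selected above: the short sketch below uses standalone toy names (`ToyQuery`, `split_queries`, `pick_strategy` are illustrative stand-ins, not SDK classes) to mirror the raw/aggregate split idiom from `validate_and_split_user_queries` and the eager-vs-chunking rule from `dps_fetch_selector`:

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class ToyQuery:          # stand-in for _SingleTSQueryBase
        identifier: str
        is_raw_query: bool   # False => aggregate query

    def split_queries(queries: List[ToyQuery]) -> Tuple[List[ToyQuery], List[ToyQuery]]:
        # Same trick as in the diff: a bool indexes a 2-tuple, so False/aggregate
        # queries land in slot 0 and True/raw queries in slot 1.
        split: Tuple[List[ToyQuery], List[ToyQuery]] = [], []
        for q in queries:
            split[q.is_raw_query].append(q)
        return split

    def pick_strategy(n_queries: int, max_workers: int = 20) -> str:
        # Mirrors the rule in `dps_fetch_selector`: when there are no more time series
        # than workers, every query can start fetching immediately (eager); otherwise a
        # chunked first pass is used to estimate datapoint density before fanning out.
        return "eager" if n_queries <= max_workers else "chunking"

    queries = [ToyQuery("ts-1", True), ToyQuery("ts-2", False), ToyQuery("ts-3", True)]
    agg_qs, raw_qs = split_queries(queries)
    assert [q.identifier for q in raw_qs] == ["ts-1", "ts-3"]
    assert pick_strategy(len(queries)) == "eager"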
+ + def _finalize_tasks( + self, + ordered_results: List[BaseConcurrentTask], + use_numpy: bool, + ) -> Union[DatapointsList, DatapointsArrayList]: + lst_class = DatapointsArrayList if use_numpy else DatapointsList + return lst_class( + [ts_task.get_result() for ts_task in ordered_results], + cognite_client=self.dps_client._cognite_client, + ) + + @abstractmethod + def fetch_all(self, pool: PriorityThreadPoolExecutor, use_numpy: bool) -> List[BaseConcurrentTask]: + ... + + @abstractmethod + def _create_initial_tasks(self, pool: PriorityThreadPoolExecutor, use_numpy: bool) -> Tuple[Dict, Dict]: + ... + + +class EagerDpsFetcher(DpsFetchStrategy): + def request_datapoints_jit( + self, + task: SplittingFetchSubtask, + payload: Optional[CustomDatapoints] = None, + ) -> List[Optional[DatapointsFromAPI]]: + # Note: We delay getting the next payload as much as possible; this way, when we count number of + # points left to fetch JIT, we have the most up-to-date estimate (and may quit early): + if (item := task.get_next_payload()) is None: + return [None] + + (payload := copy(payload) or {})["items"] = [item] # type: ignore [typeddict-item] + return self.dps_client._post( + self.dps_client._RESOURCE_PATH + "/list", json=cast(Dict[str, Any], payload) + ).json()["items"] + + def fetch_all(self, pool: PriorityThreadPoolExecutor, use_numpy: bool) -> List[BaseConcurrentTask]: + futures_dct, ts_task_lookup = self._create_initial_tasks(pool, use_numpy) + + # Run until all top level tasks are complete: + while futures_dct: + future = next(as_completed(futures_dct)) + ts_task = (subtask := futures_dct.pop(future)).parent + res = self._get_result_with_exception_handling(future, ts_task, ts_task_lookup, futures_dct) + if res is None: + continue + # We may dynamically split subtasks based on what % of time range was returned: + if new_subtasks := subtask.store_partial_result(res): + self._queue_new_subtasks(pool, futures_dct, new_subtasks) + if ts_task.is_done: # "Parent" ts task might be done before a subtask is finished + if all(parent.is_done for parent in ts_task_lookup.values()): + pool.shutdown(wait=False) + break + if ts_task.has_limit: + # For finished limited queries, cancel all unstarted futures for same parent: + self._cancel_futures_for_finished_ts_task(ts_task, futures_dct) + continue + elif subtask.is_done: + continue + self._queue_new_subtasks(pool, futures_dct, [subtask]) + # Return only non-missing time series tasks in correct order given by `all_queries`: + return list(filter(None, map(ts_task_lookup.get, self.all_queries))) + + def _create_initial_tasks( + self, + pool: PriorityThreadPoolExecutor, + use_numpy: bool, + ) -> Tuple[Dict[Future, SplittingFetchSubtask], Dict[_SingleTSQueryBase, BaseConcurrentTask]]: + futures_dct: Dict[Future, SplittingFetchSubtask] = {} + ts_task_lookup, payload = {}, {"ignoreUnknownIds": False} + for query in self.all_queries: + ts_task = ts_task_lookup[query] = query.ts_task_type(query=query, eager_mode=True, use_numpy=use_numpy) + for subtask in ts_task.split_into_subtasks(self.max_workers, self.n_queries): + future = pool.submit(self.request_datapoints_jit, subtask, payload, priority=subtask.priority) + futures_dct[future] = subtask + return futures_dct, ts_task_lookup + + def _queue_new_subtasks( + self, + pool: PriorityThreadPoolExecutor, + futures_dct: Dict[Future, SplittingFetchSubtask], + new_subtasks: List[SplittingFetchSubtask], + ) -> None: + for task in new_subtasks: + future = pool.submit(self.request_datapoints_jit, task, priority=task.priority) + 
futures_dct[future] = task + + def _get_result_with_exception_handling( + self, + future: Future, + ts_task: BaseConcurrentTask, + ts_task_lookup: Dict[_SingleTSQueryBase, BaseConcurrentTask], + futures_dct: Dict[Future, SplittingFetchSubtask], + ) -> Optional[DatapointsFromAPI]: + try: + return future.result()[0] + except CancelledError: + return None + except CogniteAPIError as e: + if not (e.code == 400 and e.missing and ts_task.query.ignore_unknown_ids): + collect_exc_info_and_raise([e]) + elif ts_task.is_done: + return None + ts_task.is_done = True + del ts_task_lookup[ts_task.query] + self._cancel_futures_for_finished_ts_task(ts_task, futures_dct) + return None + + def _cancel_futures_for_finished_ts_task( + self, ts_task: BaseConcurrentTask, futures_dct: Dict[Future, SplittingFetchSubtask] + ) -> None: + for future, subtask in futures_dct.copy().items(): + # TODO: Change to loop over parent.subtasks? + if subtask.parent is ts_task: + future.cancel() + del futures_dct[future] + + +class ChunkingDpsFetcher(DpsFetchStrategy): + def __init__(self, *args: Any) -> None: + super().__init__(*args) + # To chunk efficiently, we have subtask pools (heap queues) that we use to prioritise subtasks + # when building/combining subtasks into a full query: + self.raw_subtask_pool: List[PoolSubtaskType] = [] + self.agg_subtask_pool: List[PoolSubtaskType] = [] + self.subtask_pools = (self.agg_subtask_pool, self.raw_subtask_pool) + # Combined partial queries storage (chunked, but not enough to fill a request): + self.next_items: List[Dict[str, Any]] = [] + self.next_subtasks: List[SplittingFetchSubtask] = [] + + self.counter = itertools.count() + + def fetch_all(self, pool: PriorityThreadPoolExecutor, use_numpy: bool) -> List[BaseConcurrentTask]: + # The initial tasks are important - as they tell us which time series are missing, + # which are string etc. We use this info when we choose the best fetch-strategy. 
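+        # (These first requests are sent with `ignoreUnknownIds=True`; time series that turn out to be
+        # missing are collected and raised afterwards as a single CogniteNotFoundError, and only for the
+        # queries that were not created with `ignore_unknown_ids=True`.)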
+ ts_task_lookup, missing_to_raise = {}, set() + initial_query_limits, initial_futures_dct = self._create_initial_tasks(pool) + + for future in as_completed(initial_futures_dct): + res = future.result() + chunk_agg_qs, chunk_raw_qs = initial_futures_dct.pop(future) + new_ts_tasks, chunk_missing = self._create_ts_tasks_and_handle_missing( + res, chunk_agg_qs, chunk_raw_qs, initial_query_limits, use_numpy + ) + missing_to_raise.update(chunk_missing) + ts_task_lookup.update(new_ts_tasks) + + if missing_to_raise: + raise CogniteNotFoundError(not_found=[q.identifier.as_dict(camel_case=False) for q in missing_to_raise]) + + if ts_tasks_left := self._update_queries_with_new_chunking_limit(ts_task_lookup): + self._add_to_subtask_pools( + chain.from_iterable( + task.split_into_subtasks(max_workers=self.max_workers, n_tot_queries=len(ts_tasks_left)) + for task in ts_tasks_left + ) + ) + futures_dct: Dict[Future, List[SplittingFetchSubtask]] = {} + self._queue_new_subtasks(pool, futures_dct) + self._fetch_until_complete(pool, futures_dct, ts_task_lookup) + # Return only non-missing time series tasks in correct order given by `all_queries`: + return list(filter(None, map(ts_task_lookup.get, self.all_queries))) + + def _fetch_until_complete( + self, + pool: PriorityThreadPoolExecutor, + futures_dct: Dict[Future, List[SplittingFetchSubtask]], + ts_task_lookup: Dict[_SingleTSQueryBase, BaseConcurrentTask], + ) -> None: + while futures_dct: + future = next(as_completed(futures_dct)) + res_lst, subtask_lst = future.result(), futures_dct.pop(future) + for subtask, res in zip(subtask_lst, res_lst): + # We may dynamically split subtasks based on what % of time range was returned: + if new_subtasks := subtask.store_partial_result(res): + self._add_to_subtask_pools(new_subtasks) + if not subtask.is_done: + self._add_to_subtask_pools([subtask]) + # Check each parent in current batch once if we may cancel some queued subtasks: + if done_ts_tasks := {sub.parent for sub in subtask_lst if sub.parent.is_done}: + self._cancel_subtasks(done_ts_tasks) + + self._queue_new_subtasks(pool, futures_dct) + + if all(task.is_done for task in ts_task_lookup.values()): + pool.shutdown(wait=False) + return None + + def request_datapoints(self, payload: DatapointsPayload) -> List[Optional[DatapointsFromAPI]]: + return self.dps_client._post( + self.dps_client._RESOURCE_PATH + "/list", json=cast(Dict[str, Any], payload) + ).json()["items"] + + def _create_initial_tasks( + self, pool: PriorityThreadPoolExecutor + ) -> Tuple[Dict[_SingleTSQueryBase, int], Dict[Future, Tuple[TSQueryList, TSQueryList]]]: + initial_query_limits: Dict[_SingleTSQueryBase, int] = {} + initial_futures_dct: Dict[Future, Tuple[TSQueryList, TSQueryList]] = {} + # Optimal queries uses the entire worker pool. 
We may be forced to use more (queue) when we + # can't fit all individual time series (maxes out at `FETCH_TS_LIMIT * max_workers`): + n_queries = max(self.max_workers, math.ceil(self.n_queries / FETCH_TS_LIMIT)) + splitter = functools.partial(split_into_n_parts, n=n_queries) + for query_chunks in zip(splitter(self.agg_queries), splitter(self.raw_queries)): + # Agg and raw limits are independent in the query, so we max out on both: + items = [] + for queries, max_lim in zip(query_chunks, [DPS_LIMIT_AGG, DPS_LIMIT]): + maxed_limits = self._find_initial_query_limits( + [q.capped_limit for q in queries], max_lim # type: ignore [attr-defined] + ) + initial_query_limits.update( + chunk_query_limits := dict(zip(queries, maxed_limits)) # type: ignore [arg-type] + ) + items.extend( + [ + {**q.to_payload(), "limit": lim} # type: ignore [attr-defined] + for q, lim in chunk_query_limits.items() + ] + ) + + payload = {"ignoreUnknownIds": True, "items": items} + future = pool.submit(self.request_datapoints, payload, priority=0) + initial_futures_dct[future] = query_chunks # type: ignore [assignment] + return initial_query_limits, initial_futures_dct + + def _create_ts_tasks_and_handle_missing( + self, + results: List[DatapointsFromAPI], + chunk_agg_qs: TSQueryList, + chunk_raw_qs: TSQueryList, + initial_query_limits: Dict[_SingleTSQueryBase, int], + use_numpy: bool, + ) -> Tuple[Dict[_SingleTSQueryBase, BaseConcurrentTask], Set[_SingleTSQueryBase]]: + if len(results) == len(chunk_agg_qs) + len(chunk_raw_qs): + to_raise: Set[_SingleTSQueryBase] = set() + else: + # We have at least 1 missing time series: + chunk_agg_qs, chunk_raw_qs, to_raise = self._handle_missing_ts(results, chunk_agg_qs, chunk_raw_qs) + self._update_queries_is_string(results, chunk_raw_qs) + # Align initial results with corresponding queries and create tasks: + ts_tasks = { + query: query.ts_task_type( + query=query, + eager_mode=False, + use_numpy=use_numpy, + first_dps_batch=res, + first_limit=initial_query_limits[query], + ) + for res, query in zip(results, chain(chunk_agg_qs, chunk_raw_qs)) + } + return ts_tasks, to_raise + + def _add_to_subtask_pools(self, new_subtasks: Iterable[SplittingFetchSubtask]) -> None: + for task in new_subtasks: + # We leverage how tuples are compared to prioritise items. First `priority`, then `payload limit` + # (to easily group smaller queries), then counter to always break ties, but keep order (never use tasks themselves): + limit = min(task.n_dps_left, task.max_query_limit) + new_subtask: PoolSubtaskType = (task.priority, limit, next(self.counter), task) + heapq.heappush(self.subtask_pools[task.is_raw_query], new_subtask) + + def _queue_new_subtasks( + self, pool: PriorityThreadPoolExecutor, futures_dct: Dict[Future, List[SplittingFetchSubtask]] + ) -> None: + qsize = pool._work_queue.qsize() # Approximate size of the queue (number of unstarted tasks) + if qsize > 2 * self.max_workers: + # Each worker has more than 2 tasks already awaiting in the thread pool queue already, so we + # hold off on combining new subtasks just yet (allows better prioritisation as more new tasks arrive). 
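+            # (With the default max_workers=20, for example, this means: hold off while more than 40
+            # requests are still queued; below, a partially filled payload is only flushed once 5 or
+            # fewer remain.)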
+ return None + # When pool queue has few awaiting tasks, we empty the subtasks pool into a partial request: + return_partial_payload = qsize <= min(5, math.ceil(self.max_workers / 2)) + combined_requests = self._combine_subtasks_into_requests(return_partial_payload) + + for payload, subtask_lst, priority in combined_requests: + future = pool.submit(self.request_datapoints, payload, priority=priority) + futures_dct[future] = subtask_lst + + def _combine_subtasks_into_requests( + self, + return_partial_payload: bool, + ) -> Iterator[Tuple[DatapointsPayload, List[SplittingFetchSubtask], float]]: + + while any(self.subtask_pools): # As long as both are not empty + payload_at_max_items, payload_is_full = False, [False, False] + for task_pool, request_max_limit, is_raw in zip( + self.subtask_pools, (DPS_LIMIT_AGG, DPS_LIMIT), [False, True] + ): + if not task_pool: + continue + limit_used = 0 + if self.next_items: # Happens when we continue building on a previous "partial payload" + limit_used = sum( # Tally up either raw or agg query `limit_used` + item["limit"] + for item, task in zip(self.next_items, self.next_subtasks) + if task.is_raw_query is is_raw + ) + while task_pool: + if len(self.next_items) + 1 > FETCH_TS_LIMIT: + payload_at_max_items = True + break + # Highest priority task is always at index 0 (heap magic): + *_, next_task = task_pool[0] + next_payload = next_task.get_next_payload() + if next_payload is None or next_task.is_done: + # Parent task finished before subtask and has been marked done already: + heapq.heappop(task_pool) # Pop to remove from heap + continue + next_limit = next_payload["limit"] + if limit_used + next_limit <= request_max_limit: + self.next_items.append(next_payload) + self.next_subtasks.append(next_task) + limit_used += next_limit + heapq.heappop(task_pool) + else: + payload_is_full[is_raw] = True # type: ignore [has-type] + break + + payload_done = ( + payload_at_max_items + or all(payload_is_full) + or (payload_is_full[0] and not self.raw_subtask_pool) + or (payload_is_full[1] and not self.agg_subtask_pool) + or (return_partial_payload and not any(self.subtask_pools)) + ) + if payload_done: + if not len(self.next_subtasks): + # Happens with limited queries as more and more "later" tasks get cancelled. 
+ break + priority = statistics.mean(task.priority for task in self.next_subtasks) + payload: DatapointsPayload = {"items": self.next_items[:]} # type: ignore [typeddict-item] + yield payload, self.next_subtasks[:], priority + + self.next_items, self.next_subtasks = [], [] + break + + def _update_queries_with_new_chunking_limit( + self, ts_task_lookup: Dict[_SingleTSQueryBase, BaseConcurrentTask] + ) -> List[BaseConcurrentTask]: + queries = [query for query, task in ts_task_lookup.items() if not task.is_done] + tot_raw = sum(q.is_raw_query for q in queries) + tot_agg = len(queries) - tot_raw + n_raw_chunk = min(FETCH_TS_LIMIT, math.ceil((tot_raw or 1) / 10)) + n_agg_chunk = min(FETCH_TS_LIMIT, math.ceil((tot_agg or 1) / 10)) + max_limit_raw = math.floor(DPS_LIMIT / n_raw_chunk) + max_limit_agg = math.floor(DPS_LIMIT_AGG / n_agg_chunk) + for query in queries: + if query.is_raw_query: + query.override_max_query_limit(max_limit_raw) + else: + query.override_max_query_limit(max_limit_agg) + return [ts_task_lookup[query] for query in queries] + + def _cancel_subtasks(self, done_ts_tasks: Set[BaseConcurrentTask]) -> None: + for ts_task in done_ts_tasks: + # We do -not- want to iterate/mutate the heapqs, so we mark subtasks as done instead: + for subtask in ts_task.subtasks: + subtask.is_done = True + + @staticmethod + def _find_initial_query_limits(limits: List[int], max_limit: int) -> List[int]: + actual_lims = [0] * len(limits) + not_done = set(range(len(limits))) + while not_done: + part = max_limit // len(not_done) + if not part: + # We still might not have not reached max_limit, but we can no longer distribute evenly + break + rm_idx = set() + for i in not_done: + i_part = min(part, limits[i]) # A query of limit=10 does not need more of max_limit than 10 + actual_lims[i] += i_part + max_limit -= i_part + if i_part == limits[i]: + rm_idx.add(i) + else: + limits[i] -= i_part + not_done -= rm_idx + return actual_lims + + @staticmethod + def _update_queries_is_string(res: List[DatapointsFromAPI], queries: TSQueryList) -> None: + is_string = {("id", r["id"]) for r in res if r["isString"]}.union( + ("externalId", r["externalId"]) for r in res if r["isString"] + ) + for q in queries: + q.is_string = q.identifier.as_tuple() in is_string + + @staticmethod + def _handle_missing_ts( + res: List[DatapointsFromAPI], + agg_queries: TSQueryList, + raw_queries: TSQueryList, + ) -> Tuple[TSQueryList, TSQueryList, Set[_SingleTSQueryBase]]: + missing, to_raise = set(), set() + not_missing = {("id", r["id"]) for r in res}.union(("externalId", r["externalId"]) for r in res) + for query in chain(agg_queries, raw_queries): + # Update _SingleTSQueryBase objects with `is_missing` status: + query.is_missing = query.identifier.as_tuple() not in not_missing + if query.is_missing: + missing.add(query) + # We might be handling multiple simultaneous top-level queries, each with a + # different settings for "ignore unknown": + if not query.ignore_unknown_ids: + to_raise.add(query) + agg_queries = [q for q in agg_queries if not q.is_missing] + raw_queries = [q for q in raw_queries if not q.is_missing] + return agg_queries, raw_queries, to_raise class DatapointsAPI(APIClient): @@ -22,79 +560,223 @@ class DatapointsAPI(APIClient): def __init__(self, *args: Any, **kwargs: Any) -> None: super().__init__(*args, **kwargs) - self._DPS_LIMIT_AGG = 10000 - self._DPS_LIMIT = 100000 - self._POST_DPS_OBJECTS_LIMIT = 10000 - self._RETRIEVE_LATEST_LIMIT = 100 self.synthetic = SyntheticDatapointsAPI( self._config, 
api_version=self._api_version, cognite_client=self._cognite_client ) def retrieve( self, - start: Union[int, str, datetime], - end: Union[int, str, datetime], - id: DatapointsIdMaybeAggregate = None, - external_id: DatapointsExternalIdMaybeAggregate = None, - aggregates: List[str] = None, - granularity: str = None, - include_outside_points: bool = None, - limit: int = None, + *, + id: Optional[DatapointsIdTypes] = None, + external_id: Optional[DatapointsExternalIdTypes] = None, + start: Union[int, str, datetime, None] = None, + end: Union[int, str, datetime, None] = None, + aggregates: Optional[List[str]] = None, + granularity: Optional[str] = None, + limit: Optional[int] = None, + include_outside_points: bool = False, ignore_unknown_ids: bool = False, ) -> Union[None, Datapoints, DatapointsList]: - """`Get datapoints for one or more time series. `_ + """`Retrieve datapoints for one or more time series. `_ + + **Note**: All arguments are optional, as long as at least one identifier is given. When passing aggregates, granularity must also be given. + When passing dict objects with specific parameters, these will take precedence. See examples below. - Note that you cannot specify the same ids/external_ids multiple times. + **Performance hint:**: For better performance and memory usage, consider using `retrieve_arrays(...)` which uses `numpy.ndarrays` for data storage. Args: - start (Union[int, str, datetime]): Inclusive start. - end (Union[int, str, datetime]): Exclusive end. - id (DatapointsIdMaybeAggregate): Id or list of ids. Can also be object - specifying aggregates. See example below. - external_id (DatapointsExternalIdMaybeAggregate): External id or list of external - ids. Can also be object specifying aggregates. See example below. - aggregates (List[str]): List of aggregate functions to apply. - granularity (str): The granularity to fetch aggregates at. e.g. '1s', '2h', '10d'. - include_outside_points (bool): Whether or not to include outside points. - limit (int): Maximum number of datapoints to return for each time series. - ignore_unknown_ids (bool): Ignore IDs and external IDs that are not found rather than throw an exception. + start (Union[int, str, datetime]): Inclusive start. Default: 1970-01-01 UTC. + end (Union[int, str, datetime]): Exclusive end. Default: "now" + id (DatapointsIdTypes): Id, dict (with id) or (mixed) list of these. See examples below. + external_id (DatapointsExternalIdTypes): External id, dict (with external id) or (mixed) list of these. See examples below. + aggregates (List[str]): List of aggregate functions to apply. Default: No aggregates (raw datapoints) + granularity (str): The granularity to fetch aggregates at. e.g. '1s', '2h', '10d'. Default: None. + limit (int): Maximum number of datapoints to return for each time series. Default: None (no limit) + include_outside_points (bool): Whether or not to include outside points. Not allowed when fetching aggregates. Default: False + ignore_unknown_ids (bool): Whether or not to ignore missing time series rather than raising an exception. Default: False Returns: - Union[None, Datapoints, DatapointsList]: A Datapoints object containing the requested data, or a list of such objects. If `ignore_unknown_id` is True, single id is requested and it is not found, the function will return `None`. + Union[None, Datapoints, DatapointsList]: A `Datapoints` object containing the requested data, or a `DatapointsList` if multiple + time series were asked for. 
If `ignore_unknown_ids` is `True`, a single time series is requested and it is not found, the function + will return `None`. The ordering is first ids, then external_ids. Examples: - You can get specify the ids of the datapoints you wish to retrieve in a number of ways. In this example - we are using the time-ago format to get raw data for the time series with id 1:: + You can specify the identifiers of the datapoints you wish to retrieve in a number of ways. In this example + we are using the time-ago format to get raw data for the time series with id=42 from 2 weeks ago up until now:: >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> dps = c.datapoints.retrieve(id=1, start="2w-ago", end="now") + >>> client = CogniteClient() + >>> dps = client.time_series.data.retrieve(id=42, start="2w-ago") - We can also get aggregated values, such as average. Here we are getting daily averages for all of 2018 for + You can also get aggregated values, such as the average. Here we are getting daily averages for all of 2018 for two different time series. Note that we are fetching them using their external ids:: - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> dps = c.datapoints.retrieve(external_id=["abc", "def"], - ... start=datetime(2018,1,1), - ... end=datetime(2019,1,1), - ... aggregates=["average"], - ... granularity="1d") + >>> from datetime import datetime, timezone + >>> utc = timezone.utc + >>> dps = client.time_series.data.retrieve( + ... external_id=["foo", "bar"], + ... start=datetime(2018, 1, 1, tzinfo=utc), + ... end=datetime(2018, 1, 1, tzinfo=utc), + ... aggregates=["average"], + ... granularity="1d") + + Note that all parameters (except `ignore_unknown_ids`) can be individually set if you pass (one or more) dictionaries. + If you also pass top-level parameters, these will be overwritten by the individual parameters (when both exist). You are + free to mix ids and external ids. + + Let's say you want different aggregates and end-times for a few time series: + + >>> dps = client.time_series.data.retrieve( + ... id=[ + ... {"id": 42, "end": "2d-ago", "aggregates": ["average"]}, + ... {"id": 11, "end": "1d-ago", "aggregates": ["min", "max", "count"]}, + ... ], + ... external_id={"external_id": "foo", "aggregates": ["max"]}, + ... start="5d-ago", + ... granularity="1h") + + When requesting multiple time series, an easy way to get the datapoints of a specific one is to use the `.get` method + on the returned `DatapointsList` object, then specify if you want `id` or `external_id`. Note: If you fetch a time series + by using `id`, you can still access it with its `external_id` (and the opposite way around):: + + >>> dps_lst = client.time_series.data.retrieve( + ... id=[42, 43, ..., 500], start="2w-ago") + >>> ts_350 = dps_lst.get(id=350) # `Datapoints` object + + ...but what happens if you request duplicate `id`s or `external_id`s? Let's say you need to fetch data from multiple + disconnected periods, e.g. stock data only from recessions. In this case the `.get` method will return a list of `Datapoints` instead, + (similar to how slicing works with non-unique indexes on Pandas DataFrames): + + >>> dps_lst = client.time_series.data.retrieve( + ... id=[ + ... 42, 43, 44, 45, + ... {"id": 350, "start": datetime(1907, 10, 14, tzinfo=utc), "end": datetime(1907, 11, 6, tzinfo=utc)}, + ... {"id": 350, "start": datetime(1929, 9, 4, tzinfo=utc), "end": datetime(1929, 11, 13, tzinfo=utc)}, + ... 
]) + >>> ts_44 = dps_lst.get(id=44) # Single `Datapoints` object + >>> ts_350_lst = dps_lst.get(id=350) # List of two `Datapoints` objects + + The last example showcases the great flexibility of the `retrieve` endpoint, with a very custom query. If you also want to + specify multiple values for `ignore_unknown_ids`, you'll need to use the `.query` endpoint. + + >>> ts1 = 1337 + >>> ts2 = { + ... "id": 42, + ... "start": -12345, # Overrides `start` argument below + ... "end": "1h-ago", + ... "limit": 1000, # Overrides `limit` argument below + ... "include_outside_points": True + ... } + >>> ts3 = { + ... "id": 11, + ... "end": "1h-ago", + ... "aggregates": ["max"], + ... "granularity": "42h", + ... "include_outside_points": False + ... } + >>> dps = client.time_series.data.retrieve( + ... id=[ts1, ts2, ts3], start="2w-ago", limit=None + ... ) + """ + query = DatapointsQuery( + start=start, + end=end, + id=id, + external_id=external_id, + aggregates=aggregates, + granularity=granularity, + limit=limit, + include_outside_points=include_outside_points, + ignore_unknown_ids=ignore_unknown_ids, + ) + fetcher = dps_fetch_selector(self, user_queries=[query]) + dps_list = fetcher.fetch_all_datapoints(use_numpy=False) + if not query.is_single_identifier: + return dps_list + elif not dps_list and ignore_unknown_ids: + return None + return dps_list[0] + + def retrieve_arrays( + self, + *, + id: Optional[DatapointsIdTypes] = None, + external_id: Optional[DatapointsExternalIdTypes] = None, + start: Union[int, str, datetime, None] = None, + end: Union[int, str, datetime, None] = None, + aggregates: Optional[List[str]] = None, + granularity: Optional[str] = None, + limit: Optional[int] = None, + include_outside_points: bool = False, + ignore_unknown_ids: bool = False, + ) -> Union[None, DatapointsArray, DatapointsArrayList]: + """`Retrieve datapoints for one or more time series. `_ + + **Note**: This method requires `numpy`. + + Args: + start (Union[int, str, datetime]): Inclusive start. Default: 1970-01-01 UTC. + end (Union[int, str, datetime]): Exclusive end. Default: "now" + id (DatapointsIdTypes): Id, dict (with id) or (mixed) list of these. See examples below. + external_id (DatapointsExternalIdTypes): External id, dict (with external id) or (mixed) list of these. See examples below. + aggregates (List[str]): List of aggregate functions to apply. Default: No aggregates (raw datapoints) + granularity (str): The granularity to fetch aggregates at. e.g. '1s', '2h', '10d'. Default: None. + limit (int): Maximum number of datapoints to return for each time series. Default: None (no limit) + include_outside_points (bool): Whether or not to include outside points. Not allowed when fetching aggregates. Default: False + ignore_unknown_ids (bool): Whether or not to ignore missing time series rather than raising an exception. Default: False + + Returns: + Union[None, DatapointsArray, DatapointsArrayList]: A `DatapointsArray` object containing the requested data, or a `DatapointsArrayList` if multiple + time series were asked for. If `ignore_unknown_ids` is `True`, a single time series is requested and it is not found, the function + will return `None`. The ordering is first ids, then external_ids. + + Examples: - If you want different aggregates for different time series specify your ids like this:: + **Note:** For more usage examples, see `DatapointsAPI.retrieve` method (which accepts exactly the same arguments). 
+ + Get weekly `min` and `max` aggregates for a time series with id=42 since the year 2000, then compute the range of values: >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> dps = c.datapoints.retrieve(id=[{"id": 1, "aggregates": ["average"]}, - ... {"id": 1, "aggregates": ["min"]}], - ... external_id={"externalId": "1", "aggregates": ["max"]}, - ... start="1d-ago", end="now", granularity="1h") - """ - fetcher = DatapointsFetcher(client=self) + >>> from datetime import datetime, timezone + >>> client = CogniteClient() + >>> dps = client.time_series.data.retrieve_arrays( + ... id=42, + ... start=datetime(2020, 1, 1, tzinfo=timezone.utc), + ... aggregates=["min", "max"], + ... granularity="7d") + >>> weekly_range = dps.max - dps.min + + Get up-to 2 million raw datapoints for the last 48 hours for a noisy time series with external_id="ts-noisy", + then use a small and wide moving average filter to smooth it out: - _, is_single_id = fetcher._process_ts_identifiers(id, external_id) + >>> import numpy as np + >>> dps = client.time_series.data.retrieve_arrays( + ... external_id="ts-noisy", + ... start="2d-ago", + ... limit=2_000_000) + >>> smooth = np.convolve(dps.value, np.ones(5) / 5) # doctest: +SKIP + >>> smoother = np.convolve(dps.value, np.ones(20) / 20) # doctest: +SKIP + + Get raw datapoints for multiple time series, that may or may not exist, from the last 2 hours, then find the + largest gap between two consecutive values for all time series, also taking the previous value into account (outside point). + + >>> id_lst = [42, 43, 44] + >>> dps_lst = client.time_series.data.retrieve_arrays( + ... id=id_lst, + ... start="2h-ago", + ... include_outside_points=True, + ... ignore_unknown_ids=True) + >>> largest_gaps = [np.max(np.diff(dps.timestamp)) for dps in dps_lst] + + Get raw datapoints for a time series with external_id="bar" from the last 10 weeks, then convert to a `pandas.Series` + (you can of course also use the `to_pandas()` convenience method if you want a `pandas.DataFrame`): + >>> import pandas as pd + >>> dps = client.time_series.data.retrieve_arrays(external_id="bar", start="10w-ago") + >>> series = pd.Series(dps.value, index=dps.timestamp) + """ + local_import("numpy") # Verify that numpy is available or raise CogniteImportError query = DatapointsQuery( start=start, end=end, @@ -102,16 +784,186 @@ def retrieve( external_id=external_id, aggregates=aggregates, granularity=granularity, + limit=limit, include_outside_points=include_outside_points, + ignore_unknown_ids=ignore_unknown_ids, + ) + fetcher = dps_fetch_selector(self, user_queries=[query]) + dps_list = fetcher.fetch_all_datapoints(use_numpy=True) + if not query.is_single_identifier: + return dps_list + elif not dps_list and ignore_unknown_ids: + return None + return dps_list[0] + + def retrieve_dataframe( + self, + *, + id: Optional[DatapointsIdTypes] = None, + external_id: Optional[DatapointsExternalIdTypes] = None, + start: Union[int, str, datetime, None] = None, + end: Union[int, str, datetime, None] = None, + aggregates: Optional[List[str]] = None, + granularity: Optional[str] = None, + limit: Optional[int] = None, + include_outside_points: bool = False, + ignore_unknown_ids: bool = False, + uniform_index: bool = False, + include_aggregate_name: bool = True, + column_names: Literal["id", "external_id"] = "external_id", + ) -> pd.DataFrame: + """Get datapoints directly in a pandas dataframe (convenience method wrapping the `retrieve_arrays` method). 
+ + **Note**: If you have duplicated time series in your query, the dataframe columns will also contain duplicates. + + Args: + start (Union[int, str, datetime]): Inclusive start. Default: 1970-01-01 UTC. + end (Union[int, str, datetime]): Exclusive end. Default: "now" + id (DatapointsIdTypes): Id, dict (with id) or (mixed) list of these. See examples below. + external_id (DatapointsExternalIdTypes): External id, dict (with external id) or (mixed) list of these. See examples below. + aggregates (List[str]): List of aggregate functions to apply. Default: No aggregates (raw datapoints) + granularity (str): The granularity to fetch aggregates at. e.g. '1s', '2h', '10d'. Default: None. + limit (int): Maximum number of datapoints to return for each time series. Default: None (no limit) + include_outside_points (bool): Whether or not to include outside points. Not allowed when fetching aggregates. Default: False + ignore_unknown_ids (bool): Whether or not to ignore missing time series rather than raising an exception. Default: False + uniform_index (bool): If only querying aggregates AND a single granularity is used AND no limit is used, specifying `uniform_index=True` will return a dataframe with an + equidistant datetime index from the earliest `start` to the latest `end` (missing values will be NaNs). If these requirements are not met, a ValueError is raised. Default: False + include_aggregate_name (bool): Include 'aggregate' in the column name, e.g. `my-ts|average`. Ignored for raw time series. Default: True + column_names ("id" | "external_id"): Use either ids or external ids as column names. Time series missing external id will use id as backup. Default: "external_id" + + Returns: + pandas.DataFrame + + Returns: + pandas.DataFrame: A pandas DataFrame containing the requested time series. The ordering of columns is ids first, then external_ids. + For time series with multiple aggregates, they will be sorted in alphabetical order ("average" before "max"). + + Examples: + + Get a pandas dataframe using a single id, and use this id as column name, with no more than 100 datapoints:: + + >>> from cognite.client import CogniteClient + >>> client = CogniteClient() + >>> df = client.time_series.data.retrieve_dataframe( + ... id=12345, + ... start="2w-ago", + ... end="now", + ... limit=100, + ... column_names="id") + + Get the pandas dataframe with a uniform index (fixed spacing between points) of 1 day, for two time series with + individually specified aggregates, from 1990 through 2020:: + + >>> from datetime import datetime, timezone + >>> df = client.time_series.data.retrieve_dataframe( + ... id=[ + ... {"external_id": "foo", "aggregates": ["discrete_variance"]}, + ... {"external_id": "bar", "aggregates": ["total_variation", "continuous_variance"]}, + ... ], + ... granularity="1d", + ... start=datetime(1990, 1, 1, tzinfo=timezone.utc), + ... end=datetime(2020, 12, 31, tzinfo=timezone.utc), + ... uniform_index=True) + + Get a pandas dataframe containing the 'average' aggregate for two time series using a 30 day granularity, + starting Jan 1, 1970 all the way up to present, without having the aggregate name in the column names:: + + >>> df = client.time_series.data.retrieve_dataframe( + ... external_id=["foo", "bar"], + ... aggregates=["average"], + ... granularity="30d", + ... 
include_aggregate_name=False) + """ + _, pd = local_import("numpy", "pandas") # Verify that deps are available or raise CogniteImportError + if column_names not in {"id", "external_id"}: + raise ValueError(f"Given parameter {column_names=} must be one of 'id' or 'external_id'") + + query = DatapointsQuery( + start=start, + end=end, + id=id, + external_id=external_id, + aggregates=aggregates, + granularity=granularity, limit=limit, + include_outside_points=include_outside_points, ignore_unknown_ids=ignore_unknown_ids, ) - dps_list = fetcher.fetch(query) - if is_single_id: - if len(dps_list) == 0 and ignore_unknown_ids is True: - return None - return dps_list[0] - return dps_list + fetcher = dps_fetch_selector(self, user_queries=[query]) + if uniform_index: + grans_given = set(q.granularity for q in fetcher.all_queries) + is_limited = any(q.limit is not None for q in fetcher.all_queries) + if fetcher.raw_queries or len(grans_given) > 1 or is_limited: + raise ValueError( + "Cannot return a uniform index when asking for aggregates with multiple granularities " + f"({grans_given}) OR when (partly) querying raw datapoints OR when a finite limit is used." + ) + df = fetcher.fetch_all_datapoints(use_numpy=True).to_pandas(column_names, include_aggregate_name) + if not uniform_index: + return df + + start = pd.Timestamp(min(q.start for q in fetcher.agg_queries), unit="ms") + end = pd.Timestamp(max(q.end for q in fetcher.agg_queries), unit="ms") + (granularity,) = grans_given + # Pandas understand "Cognite granularities" except `m` (minutes) which we must translate: + return df.reindex(pd.date_range(start=start, end=end, freq=granularity.replace("m", "T"), inclusive="left")) + + @overload + def query( + self, + query: Union[Sequence[DatapointsQuery], DatapointsQuery], + use_numpy: Literal[False], + ) -> DatapointsList: + ... + + @overload + def query( + self, + query: Union[Sequence[DatapointsQuery], DatapointsQuery], + use_numpy: Literal[True], + ) -> DatapointsArrayList: + ... + + def query( + self, + query: Union[Sequence[DatapointsQuery], DatapointsQuery], + use_numpy: bool = False, + ) -> Union[DatapointsList, DatapointsArrayList]: + """Get datapoints for one or more time series by passing query objects directly. + + **Note**: Before version 5.0.0, this method was the only way to retrieve datapoints easily with individual fetch settings. + This is no longer the case: `query` only differs from `retrieve` in that you can specify different values for `ignore_unknown_ids` for the multiple + query objects you pass, which is quite a niche feature. Since this is a boolean parameter, the only real use case is to pass exactly + two queries to this method; the "can be" missing and the "can't be" missing groups. If you do not need this functionality, + stick with the `retrieve` and `retrieve_arrays` endpoint. + + Args: + query (Union[DatapointsQuery, Sequence[DatapointsQuery]): The queries for datapoints + use_numpy (bool): Override fetching method to take advantage of `numpy`. If True, returns `DatapointsArrayList` instead of `DatapointsList`. Default: False. + + Returns: + Union[DatapointsList, DatapointsArrayList]: The requested datapoints. Note that you always get a single list of datapoints objects returned with type dictated + by the `use_numpy` argument. The order is the ids of the first query, then the external ids of the first query, then so on for the next queries. 
+ + Examples: + + This method is useful if one group of one or more time series can be missing AND another, can't be missing:: + + >>> from cognite.client import CogniteClient + >>> from cognite.client.data_classes import DatapointsQuery + >>> c = CogniteClient() + >>> query1 = DatapointsQuery(id=[111, 222], start="2d-ago", end="now", ignore_unknown_ids=False) + >>> query2 = DatapointsQuery(external_id="foo", start=2900, end="now", ignore_unknown_ids=True) + >>> res_lst = c.time_series.data.query([query1, query2]) + + To return datapoints stored in `numpy` arrays, pass the `use_numpy` argument: + + >>> res_arrays = c.time_series.data.query([query1, query2], use_numpy=True) + """ + if isinstance(query, DatapointsQuery): + query = [query] + fetcher = dps_fetch_selector(self, user_queries=query) + return fetcher.fetch_all_datapoints(use_numpy=use_numpy) def retrieve_latest( self, @@ -123,9 +975,9 @@ def retrieve_latest( """`Get the latest datapoint for one or more time series `_ Args: - id (Union[int, List[int]]: Id or list of ids. + id (Union[int, List[int]]): Id or list of ids. external_id (Union[str, List[str]): External id or list of external ids. - before: Union[int, str, datetime]: Get latest datapoint before this time. + before: (Union[int, str, datetime]): Get latest datapoint before this time. ignore_unknown_ids (bool): Ignore IDs and external IDs that are not found rather than throw an exception. Returns: @@ -138,40 +990,38 @@ def retrieve_latest( >>> from cognite.client import CogniteClient >>> c = CogniteClient() - >>> res = c.datapoints.retrieve_latest(id=1)[0] + >>> res = c.time_series.data.retrieve_latest(id=1)[0] You can also get the first datapoint before a specific time:: >>> from cognite.client import CogniteClient >>> c = CogniteClient() - >>> res = c.datapoints.retrieve_latest(id=1, before="2d-ago")[0] + >>> res = c.time_series.data.retrieve_latest(id=1, before="2d-ago")[0] If you need the latest datapoint for multiple time series simply give a list of ids. 
Note that we are using external ids here, but either will work:: >>> from cognite.client import CogniteClient >>> c = CogniteClient() - >>> res = c.datapoints.retrieve_latest(external_id=["abc", "def"]) + >>> res = c.time_series.data.retrieve_latest(external_id=["abc", "def"]) >>> latest_abc = res[0][0] >>> latest_def = res[1][0] """ - before = cognite.client.utils._time.timestamp_to_ms(before) if before else None + before = timestamp_to_ms(before) if before else None id_seq = IdentifierSequence.load(id, external_id) all_ids = id_seq.as_dicts() if before: for id_ in all_ids: - id_.update({"before": before}) + id_["before"] = before tasks = [ { "url_path": self._RESOURCE_PATH + "/latest", "json": {"items": chunk, "ignoreUnknownIds": ignore_unknown_ids}, } - for chunk in utils._auxiliary.split_into_chunks(all_ids, self._RETRIEVE_LATEST_LIMIT) + for chunk in split_into_chunks(all_ids, RETRIEVE_LATEST_LIMIT) ] - tasks_summary = utils._concurrency.execute_tasks_concurrently( - self._post, tasks, max_workers=self._config.max_workers - ) + tasks_summary = execute_tasks_concurrently(self._post, tasks, max_workers=self._config.max_workers) if tasks_summary.exceptions: raise tasks_summary.exceptions[0] res = tasks_summary.joined_results(lambda res: res.json()["items"]) @@ -179,44 +1029,11 @@ def retrieve_latest( return Datapoints._load(res[0], cognite_client=self._cognite_client) return DatapointsList._load(res, cognite_client=self._cognite_client) - def query( - self, query: Union[DatapointsQuery, List[DatapointsQuery]] - ) -> Union[DatapointsList, List[DatapointsList]]: - """Get datapoints for one or more time series - - This method is different from get() in that you can specify different start times, end times, and granularities - for each requested time series. - - Args: - query (Union[DatapointsQuery, List[DatapointsQuery]): List of datapoint queries. - - Returns: - Union[DatapointsList, List[DatapointsList]]: The requested DatapointsList(s). - - Examples: - - This method is useful if you want to get multiple time series, but you want to specify different starts, - ends, or granularities for each. e.g.:: - - >>> from cognite.client import CogniteClient - >>> from cognite.client.data_classes import DatapointsQuery - >>> c = CogniteClient() - >>> queries = [DatapointsQuery(id=1, start="2d-ago", end="now"), - ... DatapointsQuery(external_id="abc", - ... start="10d-ago", - ... end="now", - ... aggregates=["average"], - ... granularity="1m")] - >>> res = c.datapoints.query(queries) - """ - fetcher = DatapointsFetcher(self) - if isinstance(query, DatapointsQuery): - return fetcher.fetch(query) - return fetcher.fetch_multiple(query) - def insert( self, datapoints: Union[ + Datapoints, + DatapointsArray, List[Dict[Union[int, float, datetime], Union[int, float, str]]], List[Tuple[Union[int, float, datetime], Union[int, float, str]]], ], @@ -229,7 +1046,7 @@ def insert( Args: datapoints(Union[List[Dict], List[Tuple],Datapoints]): The datapoints you wish to insert. Can either be a list of tuples, - a list of dictionaries, or a Datapoints object. See examples below. + a list of dictionaries, a Datapoints object or a DatapointsArray object. See examples below. id (int): Id of time series to insert datapoints into. external_id (str): External id of time series to insert datapoint into. 
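A brief aside on the `retrieve_latest` implementation above: the requested identifiers are batched into chunks of at most `RETRIEVE_LATEST_LIMIT` (100 in the previous implementation; now imported from `datapoint_constants`) and the chunks are posted concurrently. A minimal standalone sketch of that chunking step, where `batch` is a hypothetical helper and not the SDK's own `split_into_chunks`:

    from typing import Any, Dict, Iterator, List

    def batch(items: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
        # Hypothetical stand-in for the SDK's `split_into_chunks` utility.
        for i in range(0, len(items), size):
            yield items[i : i + size]

    all_ids = [{"id": n} for n in range(250)]
    payloads = [{"items": chunk, "ignoreUnknownIds": False} for chunk in batch(all_ids, 100)]
    assert [len(p["items"]) for p in payloads] == [100, 100, 50]
    # Each payload is then POSTed to the resource path + "/latest" and the per-chunk
    # results are joined back together in order.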
@@ -243,38 +1060,43 @@ def insert( >>> from cognite.client import CogniteClient + >>> from datetime import datetime, timezone >>> c = CogniteClient() - >>> # with datetime objects - >>> datapoints = [(datetime(2018,1,1), 1000), (datetime(2018,1,2), 2000)] - >>> c.datapoints.insert(datapoints, id=1) - >>> # with ms since epoch + >>> # With datetime objects: + >>> datapoints = [ + ... (datetime(2018,1,1, tzinfo=timezone.utc), 1000), + ... (datetime(2018,1,2, tzinfo=timezone.utc), 2000), + ... ] + >>> c.time_series.data.insert(datapoints, id=1) + >>> # With ms since epoch: >>> datapoints = [(150000000000, 1000), (160000000000, 2000)] - >>> c.datapoints.insert(datapoints, id=2) + >>> c.time_series.data.insert(datapoints, id=2) Or they can be a list of dictionaries:: - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> # with datetime objects - >>> datapoints = [{"timestamp": datetime(2018,1,1), "value": 1000}, - ... {"timestamp": datetime(2018,1,2), "value": 2000}] - >>> c.datapoints.insert(datapoints, external_id="abc") - >>> # with ms since epoch - >>> datapoints = [{"timestamp": 150000000000, "value": 1000}, - ... {"timestamp": 160000000000, "value": 2000}] - >>> c.datapoints.insert(datapoints, external_id="def") + >>> datapoints = [ + ... {"timestamp": 150000000000, "value": 1000}, + ... {"timestamp": 160000000000, "value": 2000}, + ... ] + >>> c.time_series.data.insert(datapoints, external_id="def") - Or they can be a Datapoints object:: + Or they can be a Datapoints or DatapointsArray object (raw datapoints only):: - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> data = c.datapoints.retrieve(external_id="abc",start=datetime(2018,1,1),end=datetime(2018,2,2)) - >>> c.datapoints.insert(data, external_id="def") + >>> data = c.time_series.data.retrieve(external_id="abc", start="1w-ago", end="now") + >>> c.time_series.data.insert(data, external_id="def") """ post_dps_object = Identifier.of_either(id, external_id).as_dict() - if isinstance(datapoints, Datapoints): - datapoints = [(t, v) for t, v in zip(datapoints.timestamp, datapoints.value)] - post_dps_object.update({"datapoints": datapoints}) + if isinstance(datapoints, (Datapoints, DatapointsArray)): + if datapoints.value is None: + raise ValueError( + "When inserting data using a `Datapoints` or `DatapointsArray` object, only raw datapoints are supported" + ) + if isinstance(datapoints, Datapoints): + datapoints = list(zip(datapoints.timestamp, datapoints.value)) # type: ignore [arg-type] + else: + ts = datapoints.timestamp.astype("datetime64[ms]").astype("int64") + datapoints = list(zip(ts, datapoints.value)) # type: ignore [arg-type] + post_dps_object["datapoints"] = datapoints dps_poster = DatapointsPoster(self) dps_poster.insert([post_dps_object]) @@ -294,30 +1116,19 @@ def insert_multiple(self, datapoints: List[Dict[str, Union[str, int, List]]]) -> the value:: >>> from cognite.client import CogniteClient + >>> from datetime import datetime, timezone >>> c = CogniteClient() >>> datapoints = [] - >>> # with datetime objects and id - >>> datapoints.append({"id": 1, "datapoints": [(datetime(2018,1,1), 1000), (datetime(2018,1,2), 2000)]}) + >>> # With datetime objects and id + >>> datapoints.append( + ... {"id": 1, "datapoints": [ + ... (datetime(2018,1,1,tzinfo=timezone.utc), 1000), + ... (datetime(2018,1,2,tzinfo=timezone.utc), 2000) + ... 
]}) >>> # with ms since epoch and externalId >>> datapoints.append({"externalId": 1, "datapoints": [(150000000000, 1000), (160000000000, 2000)]}) - - >>> c.datapoints.insert_multiple(datapoints) - - Or they can be a list of dictionaries:: - - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - - >>> datapoints = [] - >>> # with datetime objects and external id - >>> datapoints.append({"externalId": "1", "datapoints": [{"timestamp": datetime(2018,1,1), "value": 1000}, - ... {"timestamp": datetime(2018,1,2), "value": 2000}]}) - >>> # with ms since epoch and id - >>> datapoints.append({"id": 1, "datapoints": [{"timestamp": 150000000000, "value": 1000}, - ... {"timestamp": 160000000000, "value": 2000}]}) - - >>> c.datapoints.insert_multiple(datapoints) + >>> c.time_series.data.insert_multiple(datapoints) """ dps_poster = DatapointsPoster(self) dps_poster.insert(datapoints) @@ -342,14 +1153,14 @@ def delete_range( >>> from cognite.client import CogniteClient >>> c = CogniteClient() - >>> c.datapoints.delete_range(start="1w-ago", end="now", id=1) + >>> c.time_series.data.delete_range(start="1w-ago", end="now", id=1) """ - start = utils._time.timestamp_to_ms(start) - end = utils._time.timestamp_to_ms(end) + start = timestamp_to_ms(start) + end = timestamp_to_ms(end) assert end > start, "end must be larger than start" - delete_dps_object = Identifier.of_either(id, external_id).as_dict() - delete_dps_object.update({"inclusiveBegin": start, "exclusiveEnd": end}) + identifier = Identifier.of_either(id, external_id).as_dict() + delete_dps_object = {**identifier, "inclusiveBegin": start, "exclusiveEnd": end} self._delete_datapoints_ranges([delete_dps_object]) def delete_ranges(self, ranges: List[Dict[str, Any]]) -> None: @@ -369,7 +1180,7 @@ def delete_ranges(self, ranges: List[Dict[str, Any]]) -> None: >>> c = CogniteClient() >>> ranges = [{"id": 1, "start": "2d-ago", "end": "now"}, ... {"externalId": "abc", "start": "2d-ago", "end": "now"}] - >>> c.datapoints.delete_ranges(ranges) + >>> c.time_series.data.delete_ranges(ranges) """ valid_ranges = [] for range in ranges: @@ -381,8 +1192,8 @@ def delete_ranges(self, ranges: List[Dict[str, Any]]) -> None: id = range.get("id") external_id = range.get("externalId") valid_range = Identifier.of_either(id, external_id).as_dict() - start = utils._time.timestamp_to_ms(range["start"]) - end = utils._time.timestamp_to_ms(range["end"]) + start = timestamp_to_ms(range["start"]) + end = timestamp_to_ms(range["end"]) valid_range.update({"inclusiveBegin": start, "exclusiveEnd": end}) valid_ranges.append(valid_range) self._delete_datapoints_ranges(valid_ranges) @@ -390,234 +1201,7 @@ def delete_ranges(self, ranges: List[Dict[str, Any]]) -> None: def _delete_datapoints_ranges(self, delete_range_objects: List[Union[Dict]]) -> None: self._post(url_path=self._RESOURCE_PATH + "/delete", json={"items": delete_range_objects}) - def retrieve_dataframe( - self, - start: Union[int, str, datetime], - end: Union[int, str, datetime], - aggregates: List[str], - granularity: str, - id: DatapointsIdMaybeAggregate = None, - external_id: DatapointsExternalIdMaybeAggregate = None, - limit: int = None, - include_aggregate_name: bool = True, - complete: str = None, - ignore_unknown_ids: bool = False, - ) -> "pandas.DataFrame": - """Get a pandas dataframe describing the requested data. - - Note that you cannot specify the same ids/external_ids multiple times. - - Args: - start (Union[int, str, datetime]): Inclusive start. 
- end (Union[int, str, datetime]): Exclusive end. - aggregates (List[str]): List of aggregate functions to apply. - granularity (str): The granularity to fetch aggregates at. e.g. '1s', '2h', '10d'. - id (Union[int, List[int], Dict[str, Any], List[Dict[str, Any]]]): Id or list of ids. Can also be object - specifying aggregates. See example below. - external_id (Union[str, List[str], Dict[str, Any], List[Dict[str, Any]]]): External id or list of external - ids. Can also be object specifying aggregates. See example below. - limit (int): Maximum number of datapoints to return for each time series. - include_aggregate_name (bool): Include 'aggregate' in the column name. Defaults to True and should only be set to False when only a single aggregate is requested per id/external id. - complete (str): Post-processing of the dataframe. - ignore_unknown_ids (bool): Ignore IDs and external IDs that are not found rather than throw an exception. - - Pass 'fill' to insert missing entries into the index, and complete data where possible (supports interpolation, stepInterpolation, count, sum, totalVariation). - - Pass 'fill,dropna' to additionally drop rows in which any aggregate for any time series has missing values (typically rows at the start and end for interpolation aggregates). - This option guarantees that all returned dataframes have the exact same shape and no missing values anywhere, and is only supported for aggregates sum, count, totalVariation, interpolation and stepInterpolation. - - Returns: - pandas.DataFrame: The requested dataframe - - Examples: - - Get a pandas dataframe:: - - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> df = c.datapoints.retrieve_dataframe(id=[1,2,3], start="2w-ago", end="now", - ... aggregates=["average","sum"], granularity="1h") - - Get a pandas dataframe with the index regularly spaced at 1 minute intervals, missing values completed and without the aggregate name in the columns:: - - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> df = c.datapoints.retrieve_dataframe(id=[1,2,3], start="2w-ago", end="now", - ... 
aggregates=["interpolation"], granularity="1m", include_aggregate_name=False, complete="fill,dropna") - """ - pd = cast(Any, utils._auxiliary.local_import("pandas")) - - if id is not None: - id_dpl = self.retrieve( - id=id, - start=start, - end=end, - aggregates=aggregates, - granularity=granularity, - limit=limit, - ignore_unknown_ids=ignore_unknown_ids, - ) - if id_dpl is None: - id_dpl = DatapointsList([]) - id_df = id_dpl.to_pandas(column_names="id") - else: - id_df = pd.DataFrame() - id_dpl = DatapointsList([]) - - if external_id is not None: - external_id_dpl = self.retrieve( - external_id=external_id, - start=start, - end=end, - aggregates=aggregates, - granularity=granularity, - limit=limit, - ignore_unknown_ids=ignore_unknown_ids, - ) - if external_id_dpl is None: - external_id_dpl = DatapointsList([]) - external_id_df = external_id_dpl.to_pandas() - else: - external_id_df = pd.DataFrame() - external_id_dpl = DatapointsList([]) - - df = pd.concat([id_df, external_id_df], axis="columns") - - complete_list = [s.strip() for s in (complete or "").split(",")] - if set(complete_list) - {"fill", "dropna", ""}: - raise ValueError("complete should be 'fill', 'fill,dropna' or Falsy") - - if "fill" in complete_list and df.shape[0] > 1: - ag_used_by_id = { - dp.id: [attr for attr, _ in dp._get_non_empty_data_fields(get_empty_lists=True)] - for dpl in [id_dpl, external_id_dpl] - for dp in (dpl.data if isinstance(dpl, DatapointsList) else [dpl]) - } - is_step_dict = { - str(field): bool(dp.is_step) - for dpl in [id_dpl, external_id_dpl] - for dp in (dpl.data if isinstance(dpl, DatapointsList) else [dpl]) - for field in [dp.id, dp.external_id] - if field - } - df = self._dataframe_fill(df, granularity, is_step_dict) - - if "dropna" in complete_list: - self._dataframe_safe_dropna(df, set([ag for id, ags in ag_used_by_id.items() for ag in ags])) - - if not include_aggregate_name: - Datapoints._strip_aggregate_names(df) - - return df - - def _dataframe_fill( - self, df: "pandas.DataFrame", granularity: str, is_step_dict: Dict[str, bool] - ) -> "pandas.DataFrame": - np, pd = utils._auxiliary.local_import("numpy", "pandas") - df = df.reindex( - np.arange( - df.index[0], - df.index[-1] + pd.Timedelta(microseconds=1), - pd.Timedelta(microseconds=cognite.client.utils._time.granularity_to_ms(granularity) * 1000), - ), - copy=False, - ) - df.fillna({c: 0 for c in df.columns if regexp.search(c, r"\|(sum|totalVariation|count)$")}, inplace=True) - int_cols = [c for c in df.columns if regexp.search(c, r"\|interpolation$")] - - def _linear_interpolation_col(col: str) -> str: - match = regexp.match(r"(.*)\|\w+$", col) - assert match - return match.group(1) - - lin_int_cols = [c for c in int_cols if not is_step_dict[_linear_interpolation_col(c)]] - step_int_cols = [c for c in df.columns if regexp.search(c, r"\|stepInterpolation$")] + list( - set(int_cols) - set(lin_int_cols) - ) - if lin_int_cols: - df[lin_int_cols] = df[lin_int_cols].interpolate(limit_area="inside") - df[step_int_cols] = df[step_int_cols].ffill() - return df - - def _dataframe_safe_dropna(self, df: "pandas.DataFrame", aggregates_used: Set[str]) -> None: - supported_aggregates = ["sum", "count", "total_variation", "interpolation", "step_interpolation"] - not_supported = set(aggregates_used) - set(supported_aggregates + ["timestamp"]) - if not_supported: - raise ValueError( - "The aggregate(s) {} is not supported for dataframe completion with dropna, only {} are".format( - [utils._auxiliary.to_camel_case(a) for a in not_supported], - 
[utils._auxiliary.to_camel_case(a) for a in supported_aggregates], - ) - ) - df.dropna(inplace=True) - - def retrieve_dataframe_dict( - self, - start: Union[int, str, datetime], - end: Union[int, str, datetime], - aggregates: List[str], - granularity: str, - id: DatapointsIdMaybeAggregate = None, - external_id: DatapointsExternalIdMaybeAggregate = None, - limit: int = None, - ignore_unknown_ids: bool = False, - complete: str = None, - ) -> Dict[str, "pandas.DataFrame"]: # noqa: F821 - """Get a dictionary of aggregate: pandas dataframe describing the requested data. - - Args: - start (Union[int, str, datetime]): Inclusive start. - end (Union[int, str, datetime]): Exclusive end. - aggregates (List[str]): List of aggregate functions to apply. - granularity (str): The granularity to fetch aggregates at. e.g. '1s', '2h', '10d'. - id (Union[int, List[int], Dict[str, Any], List[Dict[str, Any]]]: Id or list of ids. Can also be object specifying aggregates. - external_id (Union[str, List[str], Dict[str, Any], List[Dict[str, Any]]]): External id or list of external ids. Can also be object specifying aggregates. - limit (int): Maximum number of datapoints to return for each time series. - ignore_unknown_ids (bool): Ignore IDs and external IDs that are not found rather than throw an exception. - complete (str): Post-processing of the dataframe. - - Pass 'fill' to insert missing entries into the index, and complete data where possible (supports interpolation, stepInterpolation, count, sum, totalVariation). - - Pass 'fill,dropna' to additionally drop rows in which any aggregate for any time series has missing values (typically rows at the start and end for interpolation aggregates). - This option guarantees that all returned dataframes have the exact same shape and no missing values anywhere, and is only supported for aggregates sum, count, totalVariation, interpolation and stepInterpolation. - - Returns: - Dict[str,pandas.DataFrame]: A dictionary of aggregate: dataframe. - - Examples: - - Get a dictionary of pandas dataframes, with the index evenly spaced at 1h intervals, missing values completed in the middle and incomplete entries dropped at the start and end:: - - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> dfs = c.datapoints.retrieve_dataframe_dict(id=[1,2,3], start="2w-ago", end="now", - ... aggregates=["interpolation","count"], granularity="1h", complete="fill,dropna") - """ - all_aggregates = aggregates - for queries in [id, external_id]: - if isinstance(queries, list) and queries and isinstance(queries[0], dict): - for it in queries: - for ag in cast(dict, it).get("aggregates", []): - if ag not in all_aggregates: - all_aggregates.append(ag) - - df = self.retrieve_dataframe( - start, - end, - aggregates, - granularity, - id, - external_id, - limit, - include_aggregate_name=True, - complete=complete, - ignore_unknown_ids=ignore_unknown_ids, - ) - return {ag: df.filter(like="|" + ag).rename(columns=lambda s: s[: -len(ag) - 1]) for ag in all_aggregates} - - def insert_dataframe( - self, dataframe: "pandas.DataFrame", external_id_headers: bool = False, dropna: bool = False - ) -> None: + def insert_dataframe(self, df: pd.DataFrame, external_id_headers: bool = True, dropna: bool = True) -> None: """Insert a dataframe. The index of the dataframe must contain the timestamps. The names of the remaining columns specify the ids or external ids of @@ -626,10 +1210,9 @@ def insert_dataframe( Said time series must already exist. 
Args: - dataframe (pandas.DataFrame): Pandas DataFrame Object containing the time series. - external_id_headers (bool): Set to True if the column headers are external ids rather than internal ids. - Defaults to False. - dropna (bool): Set to True to skip NaNs in the given DataFrame, applied per column. + df (pandas.DataFrame): Pandas DataFrame object containing the time series. + external_id_headers (bool): Interpret the column names as external id. Pass False if using ids. Default: True. + dropna (bool): Set to True to ignore NaNs in the given DataFrame, applied per column. Default: True. Returns: None @@ -640,27 +1223,25 @@ def insert_dataframe( >>> import numpy as np >>> import pandas as pd >>> from cognite.client import CogniteClient - >>> from datetime import datetime, timedelta >>> >>> c = CogniteClient() - >>> ts_id = 123 - >>> start = datetime(2018, 1, 1) - >>> x = pd.DatetimeIndex([start + timedelta(days=d) for d in range(100)]) - >>> y = np.random.normal(0, 1, 100) - >>> df = pd.DataFrame({ts_id: y}, index=x) - >>> c.datapoints.insert_dataframe(df) + >>> ts_xid = "my-foo-ts" + >>> idx = pd.date_range(start="2018-01-01", periods=100, freq="1d") + >>> noise = np.random.normal(0, 1, 100) + >>> df = pd.DataFrame({ts_xid: noise}, index=idx) + >>> c.time_series.data.insert_dataframe(df) """ - np = cast(Any, utils._auxiliary.local_import("numpy")) - assert not np.isinf(dataframe.select_dtypes(include=[np.number])).any( - axis=None - ), "Dataframe contains Infinity. Remove them in order to insert the data." + np = cast(Any, local_import("numpy")) + if np.isinf(df.select_dtypes(include=[np.number])).any(axis=None): + raise ValueError("Dataframe contains one or more (+/-) Infinity. Remove them in order to insert the data.") if not dropna: - assert not dataframe.isnull().any( - axis=None - ), "Dataframe contains NaNs. Remove them or pass `dropna=True` in order to insert the data." + if df.isnull().any(axis=None): + raise ValueError( + "Dataframe contains one or more NaNs. Remove or pass `dropna=True` in order to insert the data." 
+ ) dps = [] - idx = dataframe.index.values.astype("datetime64[ms]").astype(np.int64) - for column_id, col in dataframe.iteritems(): + idx = df.index.to_numpy("datetime64[ms]").astype(np.int64) + for column_id, col in df.iteritems(): mask = col.notna() datapoints = list(zip(idx[mask], col[mask])) if not datapoints: @@ -720,32 +1301,32 @@ def _validate_and_format_datapoints( List[Tuple[Union[int, float, datetime], Union[int, float, str]]], ], ) -> List[Tuple[int, Any]]: - utils._auxiliary.assert_type(datapoints, "datapoints", [list]) + assert_type(datapoints, "datapoints", [list]) assert len(datapoints) > 0, "No datapoints provided" - utils._auxiliary.assert_type(datapoints[0], "datapoints element", [tuple, dict]) + assert_type(datapoints[0], "datapoints element", [tuple, dict]) valid_datapoints = [] if isinstance(datapoints[0], tuple): - valid_datapoints = [(cognite.client.utils._time.timestamp_to_ms(t), v) for t, v in datapoints] + valid_datapoints = [(timestamp_to_ms(t), v) for t, v in datapoints] elif isinstance(datapoints[0], dict): for dp in datapoints: dp = cast(Dict[str, Any], dp) assert "timestamp" in dp, "A datapoint is missing the 'timestamp' key" assert "value" in dp, "A datapoint is missing the 'value' key" - valid_datapoints.append((cognite.client.utils._time.timestamp_to_ms(dp["timestamp"]), dp["value"])) + valid_datapoints.append((timestamp_to_ms(dp["timestamp"]), dp["value"])) return valid_datapoints def _bin_datapoints(self, dps_object_list: List[Dict[str, Any]]) -> List[List[Dict[str, Any]]]: for dps_object in dps_object_list: - for i in range(0, len(dps_object["datapoints"]), self.client._DPS_LIMIT): + for i in range(0, len(dps_object["datapoints"]), DPS_LIMIT): dps_object_chunk = {k: dps_object[k] for k in ["id", "externalId"] if k in dps_object} - dps_object_chunk["datapoints"] = dps_object["datapoints"][i : i + self.client._DPS_LIMIT] + dps_object_chunk["datapoints"] = dps_object["datapoints"][i : i + DPS_LIMIT] for bin in self.bins: if bin.will_fit(len(dps_object_chunk["datapoints"])): bin.add(dps_object_chunk) break else: - bin = DatapointsBin(self.client._DPS_LIMIT, self.client._POST_DPS_OBJECTS_LIMIT) + bin = DatapointsBin(DPS_LIMIT, POST_DPS_OBJECTS_LIMIT) bin.add(dps_object_chunk) self.bins.append(bin) binned_dps_object_list = [] @@ -757,7 +1338,7 @@ def _insert_datapoints_concurrently(self, dps_object_lists: List[List[Dict[str, tasks = [] for dps_object_list in dps_object_lists: tasks.append((dps_object_list,)) - summary = utils._concurrency.execute_tasks_concurrently( + summary = execute_tasks_concurrently( self._insert_datapoints, tasks, max_workers=self.client._config.max_workers ) summary.raise_compound_exception_if_failed_tasks( @@ -772,341 +1353,3 @@ def _insert_datapoints(self, post_dps_objects: List[Dict[str, Any]]) -> None: self.client._post(url_path=self.client._RESOURCE_PATH, json={"items": post_dps_objects}) for it in post_dps_objects: del it["datapoints"] - - -class _DPWindow: - def __init__(self, start: int, end: int, limit: int = cast(int, float("inf"))) -> None: - self.start = start - self.end = end - self.limit = limit - - def __eq__(self, other: Any) -> bool: - return [self.start, self.end, self.limit] == [other.start, other.end, other.limit] - - -class _DPTask: - def __init__( - self, - client: DatapointsAPI, - start: Union[int, str, datetime], - end: Union[int, str, datetime], - ts_item: dict, - aggregates: Optional[List[str]], - granularity: Optional[str], - include_outside_points: Optional[bool], - limit: Optional[int], - 
ignore_unknown_ids: Optional[bool], - ): - self.start = cognite.client.utils._time.timestamp_to_ms(start) - self.end = cognite.client.utils._time.timestamp_to_ms(end) - self.aggregates = ts_item.get("aggregates") or aggregates - self.ts_item = {k: v for k, v in ts_item.items() if k in ["id", "externalId"]} - self.granularity = granularity - self.include_outside_points = include_outside_points - self.limit = cast(int, limit or float("inf")) - self.ignore_unknown_ids = ignore_unknown_ids - - self.client = client - self.request_limit = client._DPS_LIMIT_AGG if self.aggregates else client._DPS_LIMIT - self.missing = False - self.results: List[Datapoints] = [] - self.point_before = Datapoints() - self.point_after = Datapoints() - - def next_start_offset(self) -> int: - return cognite.client.utils._time.granularity_to_ms(self.granularity) if self.granularity else 1 - - def store_partial_result(self, raw_data: Dict[str, Any], start: int, end: int) -> Tuple[int, Optional[int]]: - expected_fields = self.aggregates or ["value"] - - if self.include_outside_points and raw_data["datapoints"]: - # assumes first query has full start/end range - copy_data = copy.copy(raw_data) # shallow copy - if raw_data["datapoints"][0]["timestamp"] < start: - if not self.point_before: - copy_data["datapoints"] = raw_data["datapoints"][:1] - self.point_before = Datapoints._load( - copy_data, expected_fields, cognite_client=self.client._cognite_client - ) - raw_data["datapoints"] = raw_data["datapoints"][1:] - if raw_data["datapoints"] and raw_data["datapoints"][-1]["timestamp"] >= end: - if not self.point_after: - copy_data["datapoints"] = raw_data["datapoints"][-1:] - self.point_after = Datapoints._load( - copy_data, expected_fields, cognite_client=self.client._cognite_client - ) - raw_data["datapoints"] = raw_data["datapoints"][:-1] - - self.results.append(Datapoints._load(raw_data, expected_fields, cognite_client=self.client._cognite_client)) - last_timestamp = raw_data["datapoints"] and raw_data["datapoints"][-1]["timestamp"] - return len(raw_data["datapoints"]), last_timestamp - - def mark_missing(self) -> Tuple[int, None]: # for ignore unknown ids - self.missing = True - return 0, None # as in store partial result - - def result(self) -> Datapoints: - def custom_sort_key(x: Datapoints) -> Union[int, float]: - if x.timestamp: - return x.timestamp[0] - return 0 - - dps = self.point_before - for res in sorted(self.results, key=custom_sort_key): - dps._extend(res) - dps._extend(self.point_after) - if len(dps) > self.limit: - dps = cast(Datapoints, dps[: self.limit]) - return dps - - def as_tuple(self) -> Tuple[int, int, dict, Optional[List[str]], Optional[str], Optional[bool], Optional[int]]: - return ( - self.start, - self.end, - self.ts_item, - self.aggregates, - self.granularity, - self.include_outside_points, - self.limit, - ) - - -class DatapointsFetcher: - def __init__(self, client: DatapointsAPI): - self.client = client - - def fetch(self, query: DatapointsQuery) -> DatapointsList: - return self.fetch_multiple([query])[0] - - def fetch_multiple(self, queries: List[DatapointsQuery]) -> List[DatapointsList]: - task_lists = [self._create_tasks(q) for q in queries] - self._fetch_datapoints(sum(task_lists, [])) - return self._get_dps_results(task_lists) - - def _create_tasks(self, query: DatapointsQuery) -> List[_DPTask]: - ts_items, _ = self._process_ts_identifiers(query.id, query.external_id) - tasks = [ - _DPTask( - self.client, - query.start, - query.end, - ts_item, - query.aggregates, - query.granularity, - 
query.include_outside_points, - query.limit, - query.ignore_unknown_ids, - ) - for ts_item in ts_items - ] - self._validate_tasks(tasks) - self._preprocess_tasks(tasks) - return tasks - - @staticmethod - def _validate_tasks(tasks: List[_DPTask]) -> None: - identifiers_seen = set() - for t in tasks: - identifier = utils._auxiliary.unwrap_identifer(t.ts_item) - if identifier in identifiers_seen: - raise ValueError("Time series identifier '{}' is duplicated in query".format(identifier)) - identifiers_seen.add(identifier) - if t.aggregates is not None and t.granularity is None: - raise ValueError("When specifying aggregates, granularity must also be provided.") - if t.granularity is not None and not t.aggregates: - raise ValueError("When specifying granularity, aggregates must also be provided.") - - def _preprocess_tasks(self, tasks: List[_DPTask]) -> None: - for t in tasks: - new_start = cognite.client.utils._time.timestamp_to_ms(t.start) - new_end = cognite.client.utils._time.timestamp_to_ms(t.end) - if t.aggregates: - assert t.granularity - new_start = self._align_with_granularity_unit(new_start, t.granularity) - new_end = self._align_with_granularity_unit(new_end, t.granularity) - t.start = new_start - t.end = new_end - - def _get_dps_results(self, task_lists: List[List[_DPTask]]) -> List[DatapointsList]: - return [ - DatapointsList([t.result() for t in tl if not t.missing], cognite_client=self.client._cognite_client) - for tl in task_lists - ] - - def _fetch_datapoints(self, tasks: List[_DPTask]) -> None: - tasks_summary = utils._concurrency.execute_tasks_concurrently( - self._fetch_dps_initial_and_return_remaining_tasks, - [(t,) for t in tasks], - max_workers=self.client._config.max_workers, - ) - if tasks_summary.exceptions: - raise tasks_summary.exceptions[0] - - remaining_tasks_with_windows = tasks_summary.joined_results() - if len(remaining_tasks_with_windows) > 0: - self._fetch_datapoints_for_remaining_queries(remaining_tasks_with_windows) - - def _fetch_dps_initial_and_return_remaining_tasks(self, task: _DPTask) -> List[Tuple[_DPTask, _DPWindow]]: - ndp_in_first_task, last_timestamp = self._get_datapoints(task, None, True) - if ndp_in_first_task < task.request_limit: - return [] - remaining_user_limit = task.limit - ndp_in_first_task - assert last_timestamp - task.start = last_timestamp + task.next_start_offset() - queries = self._split_task_into_windows(cast(int, task.results[0].id), task, remaining_user_limit) - return queries - - def _fetch_datapoints_for_remaining_queries(self, tasks_with_windows: List[Tuple[_DPTask, _DPWindow]]) -> None: - tasks_summary = utils._concurrency.execute_tasks_concurrently( - self._get_datapoints_with_paging, tasks_with_windows, max_workers=self.client._config.max_workers - ) - if tasks_summary.exceptions: - raise tasks_summary.exceptions[0] - - @staticmethod - def _align_with_granularity_unit(ts: int, granularity: str) -> int: - gms = cognite.client.utils._time.granularity_unit_to_ms(granularity) - if ts % gms == 0: - return ts - return ts - (ts % gms) + gms - - def _split_task_into_windows( - self, id: int, task: _DPTask, remaining_user_limit: int - ) -> List[Tuple[_DPTask, _DPWindow]]: - windows = self._get_windows(id, task, remaining_user_limit) - return [(task, w) for w in windows] - - def _get_windows(self, id: int, task: _DPTask, remaining_user_limit: int) -> List[_DPWindow]: - if remaining_user_limit <= 0: - return [] - if task.start >= task.end: - return [] - count_granularity = "1d" - if task.granularity and 
cognite.client.utils._time.granularity_to_ms( - "1d" - ) < cognite.client.utils._time.granularity_to_ms(task.granularity): - count_granularity = task.granularity - try: - count_task = _DPTask( - self.client, task.start, task.end, {"id": id}, ["count"], count_granularity, False, None, False - ) - self._get_datapoints_with_paging(count_task, _DPWindow(task.start, task.end)) - res = count_task.result() - except CogniteAPIError: - res = Datapoints() - if len(res) == 0: # string based series or aggregates not yet calculated - return [_DPWindow(task.start, task.end, remaining_user_limit)] - assert res.count is not None - counts = list(zip(res.timestamp, res.count)) - windows = [] - total_count = 0 - current_window_count = 0 - window_start = task.start - granularity_ms = cognite.client.utils._time.granularity_to_ms(task.granularity) if task.granularity else None - agg_count = lambda count: int( - min( - math.ceil(cognite.client.utils._time.granularity_to_ms(count_granularity) / cast(int, granularity_ms)), - count, - ) - ) - for i, (ts, count) in enumerate(counts): - if ts < task.start: # API rounds time stamps down, so some of the first day may have been retrieved already - count = 0 - - if i < len(counts) - 1: - next_timestamp = counts[i + 1][0] - next_raw_count = counts[i + 1][1] - next_count = next_raw_count if task.granularity is None else agg_count(next_raw_count) - else: - next_timestamp = task.end - next_count = 0 - current_count = count if task.granularity is None else agg_count(count) - total_count += current_count - current_window_count += current_count - if current_window_count + next_count > task.request_limit or i == len(counts) - 1: - window_end = int(next_timestamp) - if task.granularity: - window_end = self._align_window_end(task.start, int(next_timestamp), task.granularity) - windows.append(_DPWindow(window_start, window_end, remaining_user_limit)) - window_start = window_end - current_window_count = 0 - if total_count >= remaining_user_limit: - break - return windows - - @staticmethod - def _align_window_end(start: int, end: int, granularity: str) -> int: - gms = cognite.client.utils._time.granularity_to_ms(granularity) - diff = end - start - end -= diff % gms - return end - - def _get_datapoints_with_paging(self, task: _DPTask, window: _DPWindow) -> None: - ndp_retrieved_total = 0 - while window.end > window.start and ndp_retrieved_total < window.limit: - ndp_retrieved, last_time = self._get_datapoints(task, window) - if ndp_retrieved < min(window.limit, task.request_limit): - break - window.limit -= ndp_retrieved - assert last_time - window.start = last_time + task.next_start_offset() - - def _get_datapoints( - self, task: _DPTask, window: _DPWindow = None, first_page: bool = False - ) -> Tuple[int, Optional[int]]: - window = window or _DPWindow(task.start, task.end, task.limit) - payload = { - "items": [task.ts_item], - "start": window.start, - "end": window.end, - "aggregates": task.aggregates, - "granularity": task.granularity, - "includeOutsidePoints": task.include_outside_points and first_page, - "ignoreUnknownIds": task.ignore_unknown_ids, - "limit": min(window.limit, task.request_limit), - } - res = self.client._post(self.client._RESOURCE_PATH + "/list", json=payload).json()["items"] - if not res and task.ignore_unknown_ids: - return task.mark_missing() - else: - return task.store_partial_result(res[0], window.start, window.end) - - @staticmethod - def _process_ts_identifiers( - ids: Optional[DatapointsIdMaybeAggregate], external_ids: 
Optional[DatapointsExternalIdMaybeAggregate] - ) -> Tuple[List[Dict], bool]: - is_list = False - items = [] - - if isinstance(ids, List): - is_list = True - for id_item in ids: - items.append(DatapointsFetcher._process_single_ts_item(id_item, False)) - elif ids is not None: - items.append(DatapointsFetcher._process_single_ts_item(ids, False)) - - if isinstance(external_ids, List): - is_list = True - for ext_id_item in external_ids: - items.append(DatapointsFetcher._process_single_ts_item(ext_id_item, True)) - elif external_ids is not None: - items.append(DatapointsFetcher._process_single_ts_item(external_ids, True)) - - return items, not is_list and len(items) == 1 - - @staticmethod - def _process_single_ts_item(item: Union[int, str, dict], external: bool) -> Dict[str, Any]: - item_type = "externalId" if external else "id" - id_type = str if external else int - if isinstance(item, id_type): - return {item_type: item} - elif isinstance(item, Dict): - for key in item: - if key not in [item_type, "aggregates"]: - raise ValueError("Unknown key '{}' in {} dict argument".format(key, item_type)) - if item_type not in item: - raise ValueError( - "When passing a dict to the {} argument, '{}' must be specified.".format(item_type, item_type) - ) - return item - raise TypeError("Invalid type '{}' for argument '{}'".format(type(item), item_type)) diff --git a/cognite/client/_api/diagrams.py b/cognite/client/_api/diagrams.py index d2831c364a..8c61c382b8 100644 --- a/cognite/client/_api/diagrams.py +++ b/cognite/client/_api/diagrams.py @@ -81,7 +81,7 @@ def detect( ) -> DiagramDetectResults: """Detect entities in a PNID. The results are not written to CDF. - Note: All users on this CDF subscription with assets read-all and files read-all capabilities in the project, + **Note**: All users on this CDF subscription with assets read-all and files read-all capabilities in the project, are able to access the data sent to this endpoint. Args: diff --git a/cognite/client/_api/entity_matching.py b/cognite/client/_api/entity_matching.py index d9efdc4c1e..74d3847c65 100644 --- a/cognite/client/_api/entity_matching.py +++ b/cognite/client/_api/entity_matching.py @@ -153,7 +153,7 @@ def fit( external_id: str = None, ) -> EntityMatchingModel: """Fit entity matching model. - Note: All users on this CDF subscription with assets read-all and entitymatching read-all and write-all + **Note**: All users on this CDF subscription with assets read-all and entitymatching read-all and write-all capabilities in the project, are able to access the data sent to this endpoint. Args: @@ -206,7 +206,7 @@ def predict( external_id: Optional[str] = None, ) -> ContextualizationJob: """Predict entity matching. NB. blocks and waits for the model to be ready if it has been recently created. - Note: All users on this CDF subscription with assets read-all and entitymatching read-all and write-all + **Note**: All users on this CDF subscription with assets read-all and entitymatching read-all and write-all capabilities in the project, are able to access the data sent to this endpoint. Args: @@ -235,7 +235,7 @@ def refit( external_id: Optional[str] = None, ) -> EntityMatchingModel: """Re-fits an entity matching model, using the combination of the old and new true matches. 
- Note: All users on this CDF subscription with assets read-all and entitymatching read-all and write-all + **Note**: All users on this CDF subscription with assets read-all and entitymatching read-all and write-all capabilities in the project, are able to access the data sent to this endpoint. Args: diff --git a/cognite/client/_api/synthetic_time_series.py b/cognite/client/_api/synthetic_time_series.py index a692170d9b..71dfa9154e 100644 --- a/cognite/client/_api/synthetic_time_series.py +++ b/cognite/client/_api/synthetic_time_series.py @@ -48,22 +48,18 @@ def query( >>> from cognite.client import CogniteClient >>> c = CogniteClient() - >>> dps = c.datapoints.synthetic.query(expressions="TS{id:123} + TS{externalId:'abc'}", start="2w-ago", end="now") + >>> dps = c.time_series.data.synthetic.query(expressions="TS{id:123} + TS{externalId:'abc'}", start="2w-ago", end="now") Use variables to re-use an expression: - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() >>> vars = {"A": "my_ts_external_id", "B": client.time_series.retrieve(id=1)} - >>> dps = c.datapoints.synthetic.query(expressions="A+B", start="2w-ago", end="now", variables=vars) + >>> dps = c.time_series.data.synthetic.query(expressions="A+B", start="2w-ago", end="now", variables=vars) Use sympy to build complex expressions: - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() >>> from sympy import symbols, cos, sin >>> a = symbols('a') - >>> dps = c.datapoints.synthetic.query([sin(a), cos(a)], start="2w-ago", end="now", variables={"a": "my_ts_external_id"}, aggregate='interpolation', granularity='1m') + >>> dps = c.time_series.data.synthetic.query([sin(a), cos(a)], start="2w-ago", end="now", variables={"a": "my_ts_external_id"}, aggregate='interpolation', granularity='1m') """ if limit is None or limit == -1: limit = cast(int, float("inf")) @@ -82,7 +78,8 @@ def query( "start": cognite.client.utils._time.timestamp_to_ms(start), "end": cognite.client.utils._time.timestamp_to_ms(end), } - query_datapoints = Datapoints(value=[], error=[]) + values: List[float] = [] # mypy + query_datapoints = Datapoints(value=values, error=[]) query_datapoints.external_id = short_expression tasks.append((query, query_datapoints, limit)) diff --git a/cognite/client/_api/time_series.py b/cognite/client/_api/time_series.py index 3442e04cc9..2093251d97 100644 --- a/cognite/client/_api/time_series.py +++ b/cognite/client/_api/time_series.py @@ -1,5 +1,6 @@ from typing import Any, Dict, Iterator, List, Optional, Sequence, Union, cast, overload +from cognite.client._api.datapoints import DatapointsAPI from cognite.client._api_client import APIClient from cognite.client.data_classes import ( TimeSeries, @@ -14,6 +15,10 @@ class TimeSeriesAPI(APIClient): _RESOURCE_PATH = "/timeseries" + def __init__(self, *args: Any, **kwargs: Any) -> None: + super().__init__(*args, **kwargs) + self.data = DatapointsAPI(*args, **kwargs) + def __call__( self, chunk_size: int = None, diff --git a/cognite/client/_api_client.py b/cognite/client/_api_client.py index ee3536caf2..ca2ef4f009 100644 --- a/cognite/client/_api_client.py +++ b/cognite/client/_api_client.py @@ -895,7 +895,7 @@ def _get_response_content_safe(cls, res: Response) -> str: @staticmethod def _sanitize_headers(headers: Optional[Dict]) -> None: if headers is None: - return + return None if "api-key" in headers: headers["api-key"] = "***" if "Authorization" in headers: diff --git a/cognite/client/_cognite_client.py b/cognite/client/_cognite_client.py index 
52b50d9679..975ef292cd 100644 --- a/cognite/client/_cognite_client.py +++ b/cognite/client/_cognite_client.py @@ -1,5 +1,7 @@ +from __future__ import annotations + import warnings -from typing import Any, Dict, Optional +from typing import TYPE_CHECKING, Any, Dict, Optional from requests import Response @@ -7,7 +9,6 @@ from cognite.client._api.annotations import AnnotationsAPI from cognite.client._api.assets import AssetsAPI from cognite.client._api.data_sets import DataSetsAPI -from cognite.client._api.datapoints import DatapointsAPI from cognite.client._api.diagrams import DiagramsAPI from cognite.client._api.entity_matching import EntityMatchingAPI from cognite.client._api.events import EventsAPI @@ -29,6 +30,9 @@ from cognite.client._api_client import APIClient from cognite.client.config import ClientConfig, global_config +if TYPE_CHECKING: + from cognite.client._api.datapoints import DatapointsAPI + class CogniteClient: """Main entrypoint into Cognite Python SDK. @@ -51,7 +55,6 @@ def __init__(self, config: Optional[ClientConfig] = None) -> None: self._config = client_config self.login = LoginAPI(self._config, cognite_client=self) self.assets = AssetsAPI(self._config, api_version=self._API_VERSION, cognite_client=self) - self.datapoints = DatapointsAPI(self._config, api_version=self._API_VERSION, cognite_client=self) self.events = EventsAPI(self._config, api_version=self._API_VERSION, cognite_client=self) self.files = FilesAPI(self._config, api_version=self._API_VERSION, cognite_client=self) self.iam = IAMAPI(self._config, api_version=self._API_VERSION, cognite_client=self) @@ -94,6 +97,19 @@ def delete(self, url: str, params: Dict[str, Any] = None, headers: Dict[str, Any """Perform a DELETE request to an arbitrary path in the API.""" return self._api_client._delete(url, params=params, headers=headers) + @property # TODO (v6.0.0): Delete this whole property + def datapoints(self) -> DatapointsAPI: + if int(self.version.split(".")[0]) >= 6: + raise AttributeError( # ...in case we forget to delete this property in v6... + "'CogniteClient' object has no attribute 'datapoints'. Use 'time_series.data' instead." + ) + warnings.warn( + "Accessing the DatapointsAPI through `client.datapoints` is deprecated and will be removed " + "in major version 6.0.0. Use `client.time_series.data` instead.", + DeprecationWarning, + ) + return self.time_series.data + @property def version(self) -> str: """Returns the current SDK version. 
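The deprecated `client.datapoints` property defined above simply forwards to `client.time_series.data` and emits a `DeprecationWarning` on every access. A minimal sketch of what calling code can expect during the deprecation window, assuming a configured client as in the other examples::

    >>> import warnings
    >>> from cognite.client import CogniteClient
    >>> c = CogniteClient()
    >>> with warnings.catch_warnings(record=True) as caught:
    ...     warnings.simplefilter("always")
    ...     # The old alias resolves to the very same DatapointsAPI instance:
    ...     assert c.datapoints is c.time_series.data
    >>> any(issubclass(w.category, DeprecationWarning) for w in caught)
    True

Existing code keeps working until major version 6, at which point the property raises `AttributeError` as shown in the hunk above.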
diff --git a/cognite/client/_version.py b/cognite/client/_version.py index d615590c5c..35390c764d 100644 --- a/cognite/client/_version.py +++ b/cognite/client/_version.py @@ -1,2 +1,2 @@ -__version__ = "4.9.0" +__version__ = "5.0.0" __api_subversion__ = "V20220125" diff --git a/cognite/client/config.py b/cognite/client/config.py index fe5374a0b6..e00db50a75 100644 --- a/cognite/client/config.py +++ b/cognite/client/config.py @@ -70,13 +70,12 @@ def __init__( file_transfer_timeout: Optional[int] = None, debug: bool = False, ) -> None: - super().__init__() self.client_name = client_name self.project = project self.credentials = credentials self.api_subversion = api_subversion or __api_subversion__ self.base_url = (base_url or "https://api.cognitedata.com").rstrip("/") - self.max_workers = max_workers if max_workers is not None else 10 + self.max_workers = max_workers if max_workers is not None else 20 self.headers = headers or {} self.timeout = timeout or 30 self.file_transfer_timeout = file_transfer_timeout or 600 diff --git a/cognite/client/credentials.py b/cognite/client/credentials.py index bc11af4e08..3b21202c34 100644 --- a/cognite/client/credentials.py +++ b/cognite/client/credentials.py @@ -74,7 +74,7 @@ def __init__(self) -> None: @abstractmethod def _refresh_access_token(self) -> Tuple[str, float]: """This method should return the access_token and expiry time""" - raise NotImplementedError + ... @classmethod def __should_refresh_token(cls, token: Optional[str], expires_at: Optional[float]) -> bool: @@ -123,7 +123,6 @@ class OAuthDeviceCode(_OAuthCredentialProviderWithTokenRefresh, _WithMsalSeriali Examples: >>> from cognite.client.credentials import OAuthInteractive - >>> import os >>> oauth_provider = OAuthDeviceCode( ... authority_url="https://login.microsoftonline.com/xyz", ... client_id="abcd", @@ -185,7 +184,6 @@ class OAuthInteractive(_OAuthCredentialProviderWithTokenRefresh, _WithMsalSerial Examples: >>> from cognite.client.credentials import OAuthInteractive - >>> import os >>> oauth_provider = OAuthInteractive( ... authority_url="https://login.microsoftonline.com/xyz", ... 
client_id="abcd", diff --git a/cognite/client/data_classes/__init__.py b/cognite/client/data_classes/__init__.py index e57c81f9a5..6b7df22e46 100644 --- a/cognite/client/data_classes/__init__.py +++ b/cognite/client/data_classes/__init__.py @@ -164,7 +164,14 @@ TimestampRange, ) -from cognite.client.data_classes.datapoints import Datapoint, Datapoints, DatapointsList, DatapointsQuery # isort: skip +from cognite.client.data_classes.datapoints import ( # isort: skip + Datapoint, + Datapoints, + DatapointsList, + DatapointsQuery, + DatapointsArray, + DatapointsArrayList, +) from cognite.client.data_classes.events import EndTimeFilter, Event, EventFilter, EventList, EventUpdate # isort: skip from cognite.client.data_classes.login import LoginStatus # isort: skip from cognite.client.data_classes.raw import Database, DatabaseList, Row, RowList, Table, TableList # isort: skip diff --git a/cognite/client/data_classes/_base.py b/cognite/client/data_classes/_base.py index 36ba9216e6..2da7067394 100644 --- a/cognite/client/data_classes/_base.py +++ b/cognite/client/data_classes/_base.py @@ -117,10 +117,10 @@ def _load( if hasattr(instance, snake_case_key): setattr(instance, snake_case_key, value) return instance - raise TypeError("Resource must be json str or Dict, not {}".format(type(resource))) + raise TypeError("Resource must be json str or dict, not {}".format(type(resource))) def to_pandas( - self, expand: Sequence[str] = ("metadata",), ignore: List[str] = None, camel_case: bool = True + self, expand: Sequence[str] = ("metadata",), ignore: List[str] = None, camel_case: bool = False ) -> "pandas.DataFrame": """Convert the instance into a pandas DataFrame. @@ -188,6 +188,7 @@ def __init__(self, resources: Collection[Any], cognite_client: "CogniteClient" = ) self._cognite_client = cast("CogniteClient", cognite_client) super().__init__(resources) + self._id_to_item, self._external_id_to_item = {}, {} if self.data: if hasattr(self.data[0], "external_id"): self._external_id_to_item = { @@ -241,7 +242,7 @@ def get(self, id: int = None, external_id: str = None) -> Optional[CogniteResour return self._id_to_item.get(id) return self._external_id_to_item.get(external_id) - def to_pandas(self, camel_case: bool = True) -> "pandas.DataFrame": + def to_pandas(self, camel_case: bool = False) -> "pandas.DataFrame": """Convert the instance into a pandas DataFrame. Returns: diff --git a/cognite/client/data_classes/assets.py b/cognite/client/data_classes/assets.py index ff2e889d7c..54a8a43ac2 100644 --- a/cognite/client/data_classes/assets.py +++ b/cognite/client/data_classes/assets.py @@ -203,7 +203,7 @@ def dump(self, camel_case: bool = False) -> Dict[str, Any]: return result def to_pandas( - self, expand: Sequence[str] = ("metadata", "aggregates"), ignore: List[str] = None, camel_case: bool = True + self, expand: Sequence[str] = ("metadata", "aggregates"), ignore: List[str] = None, camel_case: bool = False ) -> "pandas.DataFrame": """Convert the instance into a pandas DataFrame. 
diff --git a/cognite/client/data_classes/datapoints.py b/cognite/client/data_classes/datapoints.py index 3e592a1af4..54f2697384 100644 --- a/cognite/client/data_classes/datapoints.py +++ b/cognite/client/data_classes/datapoints.py @@ -1,25 +1,79 @@ -import collections +from __future__ import annotations + import json -import re as regexp +import numbers +import operator as op +import warnings +from collections import defaultdict +from dataclasses import dataclass from datetime import datetime -from typing import TYPE_CHECKING, Any, Dict, Generator, List, Optional, Tuple, Union, cast +from functools import cached_property +from typing import ( + TYPE_CHECKING, + Any, + Callable, + Collection, + Dict, + Generator, + Iterable, + Iterator, + List, + Literal, + NoReturn, + Optional, + Sequence, + Tuple, + Type, + TypedDict, + Union, + cast, +) from cognite.client import utils +from cognite.client._api.datapoint_constants import ( + ALL_SORTED_DP_AGGS, + DPS_LIMIT, + DPS_LIMIT_AGG, + CustomDatapointsQuery, + DatapointsExternalIdTypes, + DatapointsIdTypes, + DatapointsQueryExternalId, + DatapointsQueryId, +) from cognite.client.data_classes._base import CogniteResource, CogniteResourceList -from cognite.client.exceptions import CogniteDuplicateColumnsError +from cognite.client.utils._auxiliary import ( + convert_all_keys_to_camel_case, + convert_all_keys_to_snake_case, + find_duplicates, + local_import, + to_camel_case, +) +from cognite.client.utils._time import ( + UNIT_IN_MS, + align_start_and_end_for_granularity, + granularity_to_ms, + timestamp_to_ms, +) if TYPE_CHECKING: + import numpy.typing as npt import pandas from cognite.client import CogniteClient + from cognite.client._api.datapoint_constants import ( + NumpyDatetime64NSArray, + NumpyFloat64Array, + NumpyInt64Array, + NumpyObjArray, + ) class Datapoint(CogniteResource): """An object representing a datapoint. Args: - timestamp (Union[int, float]): The data timestamp in milliseconds since the epoch (Jan 1, 1970). Can be negative to define a date before 1970. Minimum timestamp is 1.1.1900 00:00:00 UTC - value (Union[str, int, float]): The data value. Can be String or numeric depending on the metric + timestamp (Union[int, float]): The data timestamp in milliseconds since the epoch (Jan 1, 1970). Can be negative to define a date before 1970. Minimum timestamp is 1900.01.01 00:00:00 UTC + value (Union[str, float]): The data value. Can be string or numeric average (float): The integral average value in the aggregate period max (float): The maximum value in the aggregate period min (float): The minimum value in the aggregate period @@ -34,8 +88,8 @@ class Datapoint(CogniteResource): def __init__( self, - timestamp: Union[int, float] = None, - value: Union[str, int, float] = None, + timestamp: int = None, + value: Union[str, float] = None, average: float = None, max: float = None, min: float = None, @@ -60,14 +114,16 @@ def __init__( self.discrete_variance = discrete_variance self.total_variation = total_variation - def to_pandas(self, camel_case: bool = True) -> "pandas.DataFrame": # type: ignore[override] + def to_pandas(self, camel_case: bool = False) -> "pandas.DataFrame": # type: ignore[override] """Convert the datapoint into a pandas DataFrame. - camel_case (bool): Convert column names to camel case (e.g. `stepInterpolation` instead of `step_interpolation`) + + Args: + camel_case (bool): Convert column names to camel case (e.g. `stepInterpolation` instead of `step_interpolation`) Returns: - pandas.DataFrame: The dataframe. 
+ pandas.DataFrame """ - pd = cast(Any, utils._auxiliary.local_import("pandas")) + pd = cast(Any, local_import("pandas")) dumped = self.dump(camel_case=camel_case) timestamp = dumped.pop("timestamp") @@ -75,17 +131,184 @@ def to_pandas(self, camel_case: bool = True) -> "pandas.DataFrame": # type: ign return pd.DataFrame(dumped, index=[pd.Timestamp(timestamp, unit="ms")]) +class DatapointsArray(CogniteResource): + """An object representing datapoints using numpy arrays.""" + + def __init__( + self, + id: int = None, + external_id: str = None, + is_string: bool = None, + is_step: bool = None, + unit: str = None, + timestamp: NumpyDatetime64NSArray = None, + value: Union[NumpyFloat64Array, NumpyObjArray] = None, + average: NumpyFloat64Array = None, + max: NumpyFloat64Array = None, + min: NumpyFloat64Array = None, + count: NumpyInt64Array = None, + sum: NumpyFloat64Array = None, + interpolation: NumpyFloat64Array = None, + step_interpolation: NumpyFloat64Array = None, + continuous_variance: NumpyFloat64Array = None, + discrete_variance: NumpyFloat64Array = None, + total_variation: NumpyFloat64Array = None, + ): + self.id = id + self.external_id = external_id + self.is_string = is_string + self.is_step = is_step + self.unit = unit + self.timestamp = timestamp + self.value = value + self.average = average + self.max = max + self.min = min + self.count = count + self.sum = sum + self.interpolation = interpolation + self.step_interpolation = step_interpolation + self.continuous_variance = continuous_variance + self.discrete_variance = discrete_variance + self.total_variation = total_variation + + @property + def _ts_info(self) -> Dict[str, Any]: + return { + "id": self.id, + "external_id": self.external_id, + "is_string": self.is_string, + "is_step": self.is_step, + "unit": self.unit, + } + + @classmethod + def _load( # type: ignore [override] + cls, + dps_dct: Dict[str, Union[int, str, bool, npt.NDArray]], + ) -> DatapointsArray: + # Since pandas always uses nanoseconds for datetime, we stick with the same: + dps_dct["timestamp"] = dps_dct["timestamp"].astype("datetime64[ms]").astype("datetime64[ns]") + return cls(**convert_all_keys_to_snake_case(dps_dct)) + + def __len__(self) -> int: + if self.timestamp is None: + return 0 + return len(self.timestamp) + + def __eq__(self, other: Any) -> bool: + # Override CogniteResource __eq__ which checks exact type & dump being equal. We do not want + # this: comparing arrays with (mostly) floats is a very bad idea; also dump is exceedingly expensive. 
+ return id(self) == id(other) + + def __str__(self) -> str: + return json.dumps(self.dump(convert_timestamps=True), indent=4) + + def _repr_html_(self) -> str: + return self.to_pandas()._repr_html_() + + def __getitem__(self, item: Any) -> Union[Datapoint, DatapointsArray]: + if isinstance(item, slice): + return self._slice(item) + return Datapoint(**{attr: self._dtype_fix(arr[item]) for attr, arr in zip(*self._data_fields())}) + + def _slice(self, part: slice) -> DatapointsArray: + return DatapointsArray( + **self._ts_info, **{attr: arr[part] for attr, arr in zip(*self._data_fields())} # type: ignore [arg-type] + ) + + def __iter__(self) -> Iterator[Datapoint]: + # Let's not create a single Datapoint more than we have too: + attrs, arrays = self._data_fields() + yield from (Datapoint(**dict(zip(attrs, map(self._dtype_fix, row)))) for row in zip(*arrays)) + + @cached_property + def _dtype_fix(self) -> Callable: + if self.is_string: + # Return no-op as array contains just references to vanilla python objects: + return lambda s: s + # Using .item() on numpy scalars gives us vanilla python types: + return op.methodcaller("item") + + def _data_fields(self) -> Tuple[List[str], List[npt.NDArray]]: + data_field_tuples = [ + (attr, arr) + for attr in ("timestamp", "value", *ALL_SORTED_DP_AGGS) + if (arr := getattr(self, attr)) is not None + ] + attrs, arrays = map(list, zip(*data_field_tuples)) + return attrs, arrays # type: ignore [return-value] + + def dump(self, camel_case: bool = False, convert_timestamps: bool = False) -> Dict[str, Any]: + """Dump the datapoints into a json serializable Python data type. + + Args: + camel_case (bool): Use camelCase for attribute names. Default: False. + convert_timestamps (bool): Convert integer timestamps to ISO 8601 formatted strings. Default: False. + + Returns: + List[Dict[str, Any]]: A list of dicts representing the instance. + """ + attrs, arrays = self._data_fields() + if convert_timestamps: + assert attrs[0] == "timestamp" + # Note: numpy does not have a strftime method to get the exact format we want (hence the datetime detour) + # and for some weird reason .astype(datetime) directly from dt64 returns native integer... whatwhyy + arrays[0] = arrays[0].astype("datetime64[ms]").astype(datetime).astype(str) + + if camel_case: + attrs = list(map(to_camel_case, attrs)) + + dumped = {**self._ts_info, "datapoints": [dict(zip(attrs, map(self._dtype_fix, row))) for row in zip(*arrays)]} + if camel_case: + dumped = convert_all_keys_to_camel_case(dumped) + return {k: v for k, v in dumped.items() if v is not None} + + def to_pandas( # type: ignore [override] + self, column_names: Literal["id", "external_id"] = "external_id", include_aggregate_name: bool = True + ) -> "pandas.DataFrame": + assert isinstance(include_aggregate_name, bool) + pd = cast(Any, local_import("pandas")) + identifier_dct = {"id": self.id, "external_id": self.external_id} + if column_names not in identifier_dct: + raise ValueError("Argument `column_names` must be either 'external_id' or 'id'") + + identifier = identifier_dct[column_names] + if identifier is None: # Time series are not required to have an external_id unfortunately... + identifier = identifier_dct["id"] + assert identifier is not None # Only happens if a user has created DatapointsArray themselves + warnings.warn( + f"Time series does not have an external ID, so its ID ({self.id}) was used instead as " + 'the column name in the DataFrame. 
If this is expected, consider passing `column_names="id"` ' + "to silence this warning.", + UserWarning, + ) + if self.value is not None: + return pd.DataFrame({identifier: self.value}, index=self.timestamp, copy=False) + + columns, data = [], [] + for agg in ALL_SORTED_DP_AGGS: + if (arr := getattr(self, agg)) is not None: + data.append(arr) + columns.append(f"{identifier}{include_aggregate_name * f'|{agg}'}") + + # Since columns might contain duplicates, we can't instantiate from dict as only the + # last key (array/column) would be kept: + (df := pd.DataFrame(dict(enumerate(data)), index=self.timestamp, copy=False)).columns = columns + return df + + class Datapoints(CogniteResource): """An object representing a list of datapoints. Args: id (int): Id of the timeseries the datapoints belong to - external_id (str): External id of the timeseries the datapoints belong to (Only if id is not set) + external_id (str): External id of the timeseries the datapoints belong to is_string (bool): Whether the time series is string valued or not. is_step (bool): Whether the time series is a step series or not. unit (str): The physical unit of the time series. - timestamp (List[Union[int, float]]): The data timestamps in milliseconds since the epoch (Jan 1, 1970). Can be negative to define a date before 1970. Minimum timestamp is 1.1.1900 00:00:00 UTC - value (List[Union[int, str, float]]): The data values. Can be String or numeric depending on the metric + timestamp (List[Union[int, float]]): The data timestamps in milliseconds since the epoch (Jan 1, 1970). Can be negative to define a date before 1970. Minimum timestamp is 1900.01.01 00:00:00 UTC + value (Union[List[str], List[float]]): The data values. Can be string or numeric average (List[float]): The integral average values in the aggregate period max (List[float]): The maximum values in the aggregate period min (List[float]): The minimum values in the aggregate period @@ -105,8 +328,8 @@ def __init__( is_string: bool = None, is_step: bool = None, unit: str = None, - timestamp: List[Union[int, float]] = None, - value: List[Union[int, str, float]] = None, + timestamp: List[int] = None, + value: Union[List[str], List[float]] = None, average: List[float] = None, max: List[float] = None, min: List[float] = None, @@ -124,7 +347,7 @@ def __init__( self.is_string = is_string self.is_step = is_step self.unit = unit - self.timestamp = timestamp or [] + self.timestamp = timestamp or [] # Needed in __len__ self.value = value self.average = average self.max = max @@ -189,65 +412,55 @@ def dump(self, camel_case: bool = False) -> Dict[str, Any]: return {key: value for key, value in dumped.items() if value is not None} def to_pandas( # type: ignore[override] - self, column_names: str = "externalId", include_aggregate_name: bool = True, include_errors: bool = False + self, + column_names: str = "external_id", + include_aggregate_name: bool = True, + include_errors: bool = False, ) -> "pandas.DataFrame": """Convert the datapoints into a pandas DataFrame. Args: - column_names (str): Which field to use as column header. Defaults to "externalId", can also be "id". + column_names (str): Which field to use as column header. Defaults to "external_id", can also be "id". For time series with no external ID, ID will be used instead. include_aggregate_name (bool): Include aggregate in the column name include_errors (bool): For synthetic datapoint queries, include a column with errors. Returns: pandas.DataFrame: The dataframe. 
""" - np, pd = utils._auxiliary.local_import("numpy", "pandas") + pd = cast(Any, local_import("pandas")) data_fields = {} timestamps = [] - if column_names == "externalId": + if column_names in ["external_id", "externalId"]: # Camel case for backwards compat identifier = self.external_id if self.external_id is not None else self.id elif column_names == "id": identifier = self.id else: - raise ValueError("column_names must be 'externalId' or 'id'") - for attr, value in self._get_non_empty_data_fields(get_empty_lists=True, get_error=include_errors): + raise ValueError("column_names must be 'external_id' or 'id'") + for attr, data in self._get_non_empty_data_fields(get_empty_lists=True, get_error=include_errors): if attr == "timestamp": - timestamps = value + timestamps = data else: id_with_agg = str(identifier) if attr != "value": - id_with_agg += "|{}".format(utils._auxiliary.to_camel_case(attr)) - data_fields[id_with_agg] = value - df = pd.DataFrame(data_fields, index=pd.DatetimeIndex(data=np.array(timestamps, dtype="datetime64[ms]"))) - if not include_aggregate_name: - Datapoints._strip_aggregate_names(df) - return df + if include_aggregate_name: + id_with_agg += f"|{attr}" + data = pd.to_numeric(data, errors="coerce") # Avoids object dtype for missing aggs + data_fields[id_with_agg] = data - def plot(self, *args: Any, **kwargs: Any) -> None: - """Plot the datapoints.""" - plt = cast(Any, utils._auxiliary.local_import("matplotlib.pyplot")) - self.to_pandas().plot(*args, **kwargs) - plt.show() - - @staticmethod - def _strip_aggregate_names(df: "pandas.DataFrame") -> "pandas.DataFrame": - df.rename(columns=lambda s: regexp.sub(r"\|\w+$", "", s), inplace=True) - if len(set(df.columns)) < df.shape[1]: - raise CogniteDuplicateColumnsError( - [item for item, count in collections.Counter(df.columns).items() if count > 1] - ) - return df + return pd.DataFrame(data_fields, index=pd.to_datetime(timestamps, unit="ms")) @classmethod - def _load( # type: ignore[override] + def _load( # type: ignore [override] cls, dps_object: Dict[str, Any], expected_fields: List[str] = None, cognite_client: "CogniteClient" = None ) -> "Datapoints": - instance = cls() - instance.id = dps_object.get("id") - instance.external_id = dps_object.get("externalId") - instance.is_string = dps_object["isString"] # should never be missing - instance.is_step = dps_object.get("isStep") # NB can be null if isString is true - instance.unit = dps_object.get("unit") + del cognite_client # just needed for signature + instance = cls( + id=dps_object.get("id"), + external_id=dps_object.get("externalId"), + is_string=dps_object["isString"], + is_step=dps_object.get("isStep"), + unit=dps_object.get("unit"), + ) expected_fields = (expected_fields or ["value"]) + ["timestamp"] if len(dps_object["datapoints"]) == 0: for key in expected_fields: @@ -255,7 +468,7 @@ def _load( # type: ignore[override] setattr(instance, snake_key, []) else: for key in expected_fields: - data = [dp[key] if key in dp else None for dp in dps_object["datapoints"]] + data = [dp.get(key) for dp in dps_object["datapoints"]] snake_key = utils._auxiliary.to_snake_case(key) setattr(instance, snake_key, data) return instance @@ -313,9 +526,87 @@ def _repr_html_(self) -> str: return self.to_pandas(include_errors=True)._repr_html_() +class DatapointsArrayList(CogniteResourceList): + _RESOURCE = DatapointsArray + + def __init__(self, resources: Collection[Any], cognite_client: "CogniteClient" = None): + super().__init__(resources, cognite_client) + + # Fix what happens for 
duplicated identifiers: + ids = {x.id: x for x in self.data if x.id is not None} + xids = {x.external_id: x for x in self.data if x.external_id is not None} + dupe_ids, id_dct = find_duplicates(ids), defaultdict(list) + dupe_xids, xid_dct = find_duplicates(xids), defaultdict(list) + + for id, dps_arr in ids.items(): + if id in dupe_ids: + id_dct[id].append(dps_arr) + + for xid, dps_arr in xids.items(): + if xid in dupe_xids: + xid_dct[xid].append(dps_arr) + + self._id_to_item.update(id_dct) + self._external_id_to_item.update(xid_dct) + + def get(self, id: int = None, external_id: str = None) -> Optional[Union[DatapointsArray, List[DatapointsArray]]]: + # TODO: Question, can we type annotate without specifying the function? + return super().get(id, external_id) + + def __str__(self) -> str: + return json.dumps(self.dump(convert_timestamps=True), indent=4) + + def _repr_html_(self) -> str: + return self.to_pandas()._repr_html_() + + def to_pandas( # type: ignore [override] + self, column_names: Literal["id", "external_id"] = "external_id", include_aggregate_name: bool = True + ) -> "pandas.DataFrame": + pd = cast(Any, local_import("pandas")) + if dfs := [arr.to_pandas(column_names, include_aggregate_name) for arr in self.data]: + return pd.concat(dfs, axis="columns") + return pd.DataFrame(index=pd.to_datetime([])) + + def dump(self, camel_case: bool = False, convert_timestamps: bool = False) -> List[Dict[str, Any]]: + """Dump the instance into a json serializable Python data type. + + Args: + camel_case (bool): Use camelCase for attribute names. Default: False. + convert_timestamps (bool): Convert integer timestamps to ISO 8601 formatted strings. Default: False. + + Returns: + List[Dict[str, Any]]: A list of dicts representing the instance. + """ + return [dps.dump(camel_case, convert_timestamps) for dps in self.data] + + class DatapointsList(CogniteResourceList): _RESOURCE = Datapoints + def __init__(self, resources: Collection[Any], cognite_client: "CogniteClient" = None): + super().__init__(resources, cognite_client) + + # Fix what happens for duplicated identifiers: + ids = {x.id: x for x in self.data if x.id is not None} + xids = {x.external_id: x for x in self.data if x.external_id is not None} + dupe_ids, id_dct = find_duplicates(ids), defaultdict(list) + dupe_xids, xid_dct = find_duplicates(xids), defaultdict(list) + + for id, dps in ids.items(): + if id in dupe_ids: + id_dct[id].append(dps) + + for xid, dps in xids.items(): + if xid in dupe_xids: + xid_dct[xid].append(dps) + + self._id_to_item.update(id_dct) + self._external_id_to_item.update(xid_dct) + + def get(self, id: int = None, external_id: str = None) -> Optional[Union[DatapointsList, List[DatapointsList]]]: + # TODO: Question, can we type annotate without specifying the function? + return super().get(id, external_id) + def __str__(self) -> str: item = self.dump() for i in item: @@ -323,81 +614,57 @@ def __str__(self) -> str: return json.dumps(item, default=lambda x: x.__dict__, indent=4) def to_pandas( # type: ignore[override] - self, column_names: str = "externalId", include_aggregate_name: bool = True + self, column_names: str = "external_id", include_aggregate_name: bool = True ) -> "pandas.DataFrame": """Convert the datapoints list into a pandas DataFrame. Args: - column_names (str): Which field to use as column header. Defaults to "externalId", can also be "id". + column_names (str): Which field to use as column header. Defaults to "external_id", can also be "id". 
For time series with no external ID, ID will be used instead. include_aggregate_name (bool): Include aggregate in the column name Returns: pandas.DataFrame: The datapoints list as a pandas DataFrame. """ - pd = cast(Any, utils._auxiliary.local_import("pandas")) + pd = cast(Any, local_import("pandas")) - dfs = [df.to_pandas(column_names=column_names) for df in self.data] + dfs = [ + dps.to_pandas( + column_names=column_names, + include_aggregate_name=include_aggregate_name, + ) + for dps in self.data + ] if dfs: - df = pd.concat(dfs, axis="columns") - if not include_aggregate_name: # do not pass in to_pandas above, so we check for duplicate columns - Datapoints._strip_aggregate_names(df) - return df - + return pd.concat(dfs, axis="columns") return pd.DataFrame() def _repr_html_(self) -> str: return self.to_pandas()._repr_html_() - def plot(self, *args: Any, **kwargs: Any) -> None: - """Plot the list of datapoints.""" - plt = utils._auxiliary.local_import("matplotlib.pyplot") - self.to_pandas().plot(*args, **kwargs) - plt.show() # type: ignore - - -DatapointsIdMaybeAggregate = Union[ - int, List[int], Dict[str, Union[int, List[int]]], List[Dict[str, Union[int, List[int]]]] -] -DatapointsExternalIdMaybeAggregate = Union[ - str, List[str], Dict[str, Union[str, List[str]]], List[Dict[str, Union[str, List[str]]]] -] - +@dataclass class DatapointsQuery(CogniteResource): """Parameters describing a query for datapoints. - Args: - start (Union[str, int, datetime]): Get datapoints after this time. Format is N[timeunit]-ago where timeunit is w,d,h,m,s. Example: '2d-ago' will get everything that is up to 2 days old. Can also send time in ms since epoch. - end (Union[str, int, datetime]): Get datapoints up to this time. The format is the same as for start. - id (Union[int, List[int], Dict[str, Any], List[Dict[str, Any]]]: Id or list of ids. Can also be object - specifying aggregates. See example below. - external_id (Union[str, List[str], Dict[str, Any], List[Dict[str, Any]]]): External id or list of external - ids. Can also be object specifying aggregates. See example below. - limit (int): Return up to this number of datapoints. - aggregates (List[str]): The aggregates to be returned. Use default if null. An empty string must be sent to get raw data if the default is a set of aggregates. - granularity (str): The granularity size and granularity of the aggregates. - include_outside_points (bool): Whether to include the last datapoint before the requested time period,and the first one after the requested period. This can be useful for interpolating data. Not available for aggregates. - ignore_unknown_ids (bool): Ignore IDs and external IDs that are not found rather than throw an exception. Note that in this case the function always returns a DatapointsList even when a single id is requested. + See `DatapointsAPI.retrieve` method for a description of the parameters. 
""" - def __init__( - self, - start: Union[str, int, datetime], - end: Union[str, int, datetime], - id: DatapointsIdMaybeAggregate = None, - external_id: DatapointsExternalIdMaybeAggregate = None, - limit: int = None, - aggregates: List[str] = None, - granularity: str = None, - include_outside_points: bool = None, - ignore_unknown_ids: bool = False, - ): - self.id = id - self.external_id = external_id - self.start = start - self.end = end - self.limit = limit - self.aggregates = aggregates - self.granularity = granularity - self.include_outside_points = include_outside_points - self.ignore_unknown_ids = ignore_unknown_ids + start: Union[int, str, datetime, None] = None + end: Union[int, str, datetime, None] = None + id: Optional[DatapointsIdTypes] = None + external_id: Optional[DatapointsExternalIdTypes] = None + aggregates: Optional[List[str]] = None + granularity: Optional[str] = None + limit: Optional[int] = None + include_outside_points: bool = False + ignore_unknown_ids: bool = False + + @property + def is_single_identifier(self) -> bool: + # No lists given and exactly one of id/xid was given: + return ( + isinstance(self.id, (dict, numbers.Integral)) + and self.external_id is None + or isinstance(self.external_id, (dict, str)) + and self.id is None + ) diff --git a/cognite/client/data_classes/files.py b/cognite/client/data_classes/files.py index 3b42c2e919..077329e096 100644 --- a/cognite/client/data_classes/files.py +++ b/cognite/client/data_classes/files.py @@ -270,11 +270,6 @@ def data_set_id(self) -> _PrimitiveFileMetadataUpdate: def labels(self) -> _LabelFileMetadataUpdate: return FileMetadataUpdate._LabelFileMetadataUpdate(self, "labels") - # TODO: This is left here for backwards compatibility. Should be removed on next major version change - @property - def geoLocation(self) -> _PrimitiveFileMetadataUpdate: - return self.geo_location - @property def geo_location(self) -> _PrimitiveFileMetadataUpdate: return FileMetadataUpdate._PrimitiveFileMetadataUpdate(self, "geoLocation") diff --git a/cognite/client/data_classes/functions.py b/cognite/client/data_classes/functions.py index 080d15118e..1ff1dbabed 100644 --- a/cognite/client/data_classes/functions.py +++ b/cognite/client/data_classes/functions.py @@ -156,7 +156,7 @@ def update(self) -> None: """ latest = self._cognite_client.functions.retrieve(id=self.id) if latest is None: - return + return None for attribute in self.__dict__: if attribute.startswith("_"): diff --git a/cognite/client/data_classes/geospatial.py b/cognite/client/data_classes/geospatial.py index 80807a97b7..bc24f739b1 100644 --- a/cognite/client/data_classes/geospatial.py +++ b/cognite/client/data_classes/geospatial.py @@ -177,7 +177,7 @@ def _to_feature_property_name(property_name: str) -> str: class FeatureList(CogniteResourceList): _RESOURCE = Feature - def to_geopandas(self, geometry: str, camel_case: bool = True) -> "geopandas.GeoDataFrame": # noqa: F821 + def to_geopandas(self, geometry: str, camel_case: bool = False) -> "geopandas.GeoDataFrame": # noqa: F821 """Convert the instance into a GeoPandas GeoDataFrame. 
Args: @@ -205,8 +205,7 @@ def to_geopandas(self, geometry: str, camel_case: bool = True) -> "geopandas.Geo wkt = cast(Any, utils._auxiliary.local_import("shapely.wkt")) df[geometry] = df[geometry].apply(lambda g: wkt.loads(g["wkt"])) geopandas = cast(Any, utils._auxiliary.local_import("geopandas")) - gdf = geopandas.GeoDataFrame(df, geometry=geometry) - return gdf + return geopandas.GeoDataFrame(df, geometry=geometry) @staticmethod def from_geopandas( diff --git a/cognite/client/data_classes/time_series.py b/cognite/client/data_classes/time_series.py index 3c38a01ac0..81a6526540 100644 --- a/cognite/client/data_classes/time_series.py +++ b/cognite/client/data_classes/time_series.py @@ -1,6 +1,5 @@ from typing import TYPE_CHECKING, Any, Dict, List, Optional, Sequence, Union, cast -from cognite.client import utils from cognite.client.data_classes._base import ( CogniteFilter, CogniteLabelUpdate, @@ -14,10 +13,11 @@ ) from cognite.client.data_classes.shared import TimestampRange from cognite.client.utils._identifier import Identifier +from cognite.client.utils._time import MAX_TIMESTAMP_MS, MIN_TIMESTAMP_MS if TYPE_CHECKING: from cognite.client import CogniteClient - from cognite.client.data_classes import Asset, Datapoint, DatapointsList + from cognite.client.data_classes import Asset, Datapoint class TimeSeries(CogniteResource): @@ -75,32 +75,6 @@ def __init__( self.legacy_name = legacy_name self._cognite_client = cast("CogniteClient", cognite_client) - def plot( - self, - start: str = "1d-ago", - end: str = "now", - aggregates: List[str] = None, - granularity: str = None, - id_labels: bool = False, - *args: Any, - **kwargs: Any, - ) -> None: - plt = utils._auxiliary.local_import("matplotlib.pyplot") - identifier = Identifier.load(self.id, self.external_id).as_dict() - dps = self._cognite_client.datapoints.retrieve( - start=start, end=end, aggregates=aggregates, granularity=granularity, **identifier - ) - if id_labels: - dps.plot(*args, **kwargs) - else: - assert self.id is not None - columns: Dict[Union[int, str], Any] = {self.id: self.name} - for agg in aggregates or []: - columns["{}|{}".format(self.id, agg)] = "{}|{}".format(self.name, agg) - df = dps.to_pandas().rename(columns=columns) - df.plot(*args, **kwargs) - plt.show() # type: ignore - def count(self) -> int: """Returns the number of datapoints in this time series. @@ -108,35 +82,45 @@ def count(self) -> int: Returns: int: The number of datapoints in this time series. + + Raises: + ValueError: If the time series is string as count aggregate is only supported for numeric data + + Returns: + int: The total number of datapoints """ + if self.is_string: + raise ValueError("String time series does not support count aggregate.") + identifier = Identifier.load(self.id, self.external_id).as_dict() - dps = self._cognite_client.datapoints.retrieve( - start=0, end="now", aggregates=["count"], granularity="10d", **identifier + dps = self._cognite_client.time_series.data.retrieve( + **identifier, start=MIN_TIMESTAMP_MS, end=MAX_TIMESTAMP_MS, aggregates=["count"], granularity="100d" ) return sum(dps.count) - def latest(self) -> Optional["Datapoint"]: # noqa: F821 - """Returns the latest datapoint in this time series + def latest(self, before: Union[int, str, datetime] = None) -> Optional["Datapoint"]: # noqa: F821 + """Returns the latest datapoint in this time series. If empty, returns None. Returns: Datapoint: A datapoint object containing the value and timestamp of the latest datapoint. 
""" identifier = Identifier.load(self.id, self.external_id).as_dict() - dps = self._cognite_client.datapoints.retrieve_latest(**identifier) - if len(dps) > 0: - return list(dps)[0] + if dps := self._cognite_client.time_series.data.retrieve_latest(**identifier, before=before): + return dps[0] return None def first(self) -> Optional["Datapoint"]: # noqa: F821 - """Returns the first datapoint in this time series. + """Returns the first datapoint in this time series. If empty, returns None. Returns: Datapoint: A datapoint object containing the value and timestamp of the first datapoint. """ identifier = Identifier.load(self.id, self.external_id).as_dict() - dps = self._cognite_client.datapoints.retrieve(**identifier, start=0, end="now", limit=1) - if len(dps) > 0: - return list(dps)[0] + dps = self._cognite_client.time_series.data.retrieve( + **identifier, start=MIN_TIMESTAMP_MS, end=MAX_TIMESTAMP_MS, limit=1 + ) + if dps: + return dps[0] return None def asset(self) -> "Asset": # noqa: F821 @@ -302,32 +286,3 @@ def __init__(self, count: int = None, **kwargs: Any) -> None: class TimeSeriesList(CogniteResourceList): _RESOURCE = TimeSeries - - def plot( - self, - start: str = "1d-ago", - end: str = "now", - aggregates: List[str] = None, - granularity: str = None, - id_labels: bool = False, - *args: Any, - **kwargs: Any, - ) -> None: - plt = utils._auxiliary.local_import("matplotlib.pyplot") - dps = cast( - "DatapointsList", - self._cognite_client.datapoints.retrieve( - id=[ts.id for ts in self.data], start=start, end=end, aggregates=aggregates, granularity=granularity - ), - ) - if id_labels: - dps.plot(*args, **kwargs) - else: - columns = {} - for ts in self.data: - columns[ts.id] = ts.name - for agg in aggregates or []: - columns["{}|{}".format(ts.id, agg)] = "{}|{}".format(ts.name, agg) - df = dps.to_pandas().rename(columns=columns) - df.plot(*args, **kwargs) - plt.show() # type: ignore diff --git a/cognite/client/exceptions.py b/cognite/client/exceptions.py index ec6ab549f5..2bc1318849 100644 --- a/cognite/client/exceptions.py +++ b/cognite/client/exceptions.py @@ -1,4 +1,5 @@ import json +import reprlib from typing import Callable, Dict, List, Sequence @@ -131,9 +132,9 @@ def __init__( super().__init__(successful, failed, unknown, unwrap_fn) def __str__(self) -> str: - msg = "Not found: {}".format(self.not_found) - msg += self._get_multi_exception_summary() - return msg + if len(not_found := self.not_found) > 200: + not_found = reprlib.repr(self.not_found) + return f"Not found: {not_found}{self._get_multi_exception_summary()}" class CogniteDuplicatedError(CogniteMultiException): diff --git a/cognite/client/testing.py b/cognite/client/testing.py index 4aca1e9397..6239fa4009 100644 --- a/cognite/client/testing.py +++ b/cognite/client/testing.py @@ -3,17 +3,31 @@ from unittest.mock import MagicMock from cognite.client import CogniteClient +from cognite.client._api.annotations import AnnotationsAPI from cognite.client._api.assets import AssetsAPI from cognite.client._api.data_sets import DataSetsAPI from cognite.client._api.datapoints import DatapointsAPI +from cognite.client._api.diagrams import DiagramsAPI +from cognite.client._api.entity_matching import EntityMatchingAPI from cognite.client._api.events import EventsAPI +from cognite.client._api.extractionpipelines import ExtractionPipelineRunsAPI, ExtractionPipelinesAPI from cognite.client._api.files import FilesAPI +from cognite.client._api.functions import FunctionCallsAPI, FunctionsAPI, FunctionSchedulesAPI +from 
cognite.client._api.geospatial import GeospatialAPI from cognite.client._api.iam import IAMAPI, APIKeysAPI, GroupsAPI, SecurityCategoriesAPI, ServiceAccountsAPI from cognite.client._api.labels import LabelsAPI from cognite.client._api.login import LoginAPI from cognite.client._api.raw import RawAPI, RawDatabasesAPI, RawRowsAPI, RawTablesAPI from cognite.client._api.relationships import RelationshipsAPI from cognite.client._api.sequences import SequencesAPI, SequencesDataAPI +from cognite.client._api.synthetic_time_series import SyntheticDatapointsAPI +from cognite.client._api.templates import ( + TemplateGroupsAPI, + TemplateGroupVersionsAPI, + TemplateInstancesAPI, + TemplatesAPI, + TemplateViewsAPI, +) from cognite.client._api.three_d import ( ThreeDAPI, ThreeDAssetMappingAPI, @@ -22,6 +36,14 @@ ThreeDRevisionsAPI, ) from cognite.client._api.time_series import TimeSeriesAPI +from cognite.client._api.transformations import ( + TransformationJobsAPI, + TransformationNotificationsAPI, + TransformationsAPI, + TransformationSchedulesAPI, + TransformationSchemaAPI, +) +from cognite.client._api.vision import VisionAPI class CogniteClientMock(MagicMock): @@ -33,34 +55,70 @@ class CogniteClientMock(MagicMock): def __init__(self, *args: Any, **kwargs: Any) -> None: if "parent" in kwargs: super().__init__(*args, **kwargs) - return + return None super().__init__(spec=CogniteClient, *args, **kwargs) - self.time_series = MagicMock(spec_set=TimeSeriesAPI) - self.datapoints = MagicMock(spec_set=DatapointsAPI) + + self.datapoints = MagicMock(spec=DatapointsAPI) + self.datapoints.synthetic = MagicMock(spec_set=SyntheticDatapointsAPI) + + self.time_series = MagicMock(spec=TimeSeriesAPI) + self.time_series.data = self.datapoints + self.assets = MagicMock(spec_set=AssetsAPI) self.events = MagicMock(spec_set=EventsAPI) self.data_sets = MagicMock(spec_set=DataSetsAPI) self.files = MagicMock(spec_set=FilesAPI) self.labels = MagicMock(spec_set=LabelsAPI) self.login = MagicMock(spec_set=LoginAPI) + self.three_d = MagicMock(spec=ThreeDAPI) self.three_d.models = MagicMock(spec_set=ThreeDModelsAPI) self.three_d.revisions = MagicMock(spec_set=ThreeDRevisionsAPI) self.three_d.files = MagicMock(spec_set=ThreeDFilesAPI) self.three_d.asset_mappings = MagicMock(spec_set=ThreeDAssetMappingAPI) + self.iam = MagicMock(spec=IAMAPI) - self.iam.service_accounts = MagicMock(spec=ServiceAccountsAPI) + self.iam.service_accounts = MagicMock(spec_set=ServiceAccountsAPI) self.iam.api_keys = MagicMock(spec_set=APIKeysAPI) self.iam.groups = MagicMock(spec_set=GroupsAPI) self.iam.security_categories = MagicMock(spec_set=SecurityCategoriesAPI) + self.raw = MagicMock(spec=RawAPI) self.raw.databases = MagicMock(spec_set=RawDatabasesAPI) self.raw.tables = MagicMock(spec_set=RawTablesAPI) self.raw.rows = MagicMock(spec_set=RawRowsAPI) + self.relationships = MagicMock(spec_set=RelationshipsAPI) + self.sequences = MagicMock(spec=SequencesAPI) self.sequences.data = MagicMock(spec_set=SequencesDataAPI) + self.entity_matching = MagicMock(spec_set=EntityMatchingAPI) + self.extraction_pipelines = MagicMock(spec_set=ExtractionPipelinesAPI) + self.extraction_pipeline_runs = MagicMock(spec_set=ExtractionPipelineRunsAPI) + self.geospatial = MagicMock(spec_set=GeospatialAPI) + + self.templates = MagicMock(spec=TemplatesAPI) + self.templates.groups = MagicMock(spec_set=TemplateGroupsAPI) + self.templates.versions = MagicMock(spec_set=TemplateGroupVersionsAPI) + self.templates.instances = MagicMock(spec_set=TemplateInstancesAPI) + self.templates.views = 
MagicMock(spec_set=TemplateViewsAPI) + + self.transformations = MagicMock(spec=TransformationsAPI) + self.transformations.jobs = MagicMock(spec_set=TransformationJobsAPI) + self.transformations.schedules = MagicMock(spec_set=TransformationSchedulesAPI) + self.transformations.schema = MagicMock(spec_set=TransformationSchemaAPI) + self.transformations.notifications = MagicMock(spec_set=TransformationNotificationsAPI) + + self.diagrams = MagicMock(spec_set=DiagramsAPI) + self.annotations = MagicMock(spec_set=AnnotationsAPI) + + self.functions = MagicMock(spec=FunctionsAPI) + self.functions.calls = MagicMock(spec_set=FunctionCallsAPI) + self.functions.schedules = MagicMock(spec_set=FunctionSchedulesAPI) + + self.vision = MagicMock(spec_set=VisionAPI) + @contextmanager def monkeypatch_cognite_client() -> Iterator[CogniteClientMock]: diff --git a/cognite/client/utils/_auxiliary.py b/cognite/client/utils/_auxiliary.py index 9171c22dbb..2e8bd971b9 100644 --- a/cognite/client/utils/_auxiliary.py +++ b/cognite/client/utils/_auxiliary.py @@ -15,13 +15,16 @@ import warnings from decimal import Decimal from types import ModuleType -from typing import Any, Dict, List, Sequence, Tuple, Union +from typing import Any, Dict, Hashable, Iterator, List, Sequence, Set, Tuple, TypeVar, Union from urllib.parse import quote import cognite.client from cognite.client.exceptions import CogniteImportError from cognite.client.utils._version_checker import get_newest_version_in_major_release +T = TypeVar("T") +THashable = TypeVar("THashable", bound=Hashable) + @functools.lru_cache(maxsize=128) def to_camel_case(snake_case_string: str) -> str: @@ -35,11 +38,12 @@ def to_snake_case(camel_case_string: str) -> str: return re.sub("([a-z0-9])([A-Z])", r"\1_\2", s1).lower() -def convert_all_keys_to_camel_case(d: dict) -> dict: - new_d = {} - for k, v in d.items(): - new_d[to_camel_case(k)] = v - return new_d +def convert_all_keys_to_camel_case(d: Dict[str, Any]) -> Dict[str, Any]: + return dict(zip(map(to_camel_case, d.keys()), d.values())) + + +def convert_all_keys_to_snake_case(d: Dict[str, Any]) -> Dict[str, Any]: + return dict(zip(map(to_snake_case, d.keys()), d.values())) def json_dump_default(x: Any) -> Any: @@ -125,11 +129,12 @@ def _check_client_has_newest_major_version() -> None: ) -def random_string(size: int = 100) -> str: - return "".join(random.choice(string.ascii_uppercase + string.digits) for _ in range(size)) +def random_string(size: int = 100, sample_from: str = string.ascii_uppercase + string.digits) -> str: + return "".join(random.choices(sample_from, k=size)) class PriorityQueue: + # TODO: Just use queue.PriorityQueue() def __init__(self) -> None: self.__heap: List[Any] = [] self.__id = 0 @@ -146,6 +151,11 @@ def __bool__(self) -> bool: return len(self.__heap) > 0 +def split_into_n_parts(seq: Sequence[T], /, n: int) -> Iterator[Sequence[T]]: + # NB: Chaotic sampling: jumps n for each starting position + yield from (seq[i::n] for i in range(n)) + + def split_into_chunks(collection: Union[List, Dict], chunk_size: int) -> List[Union[List, Dict]]: chunks: List[Union[List, Dict]] = [] if isinstance(collection, list): @@ -173,3 +183,9 @@ def convert_true_match(true_match: Union[dict, list, Tuple[Union[int, str], Unio return true_match else: raise ValueError("true_matches should be a dictionary or a two-element list: found {}".format(true_match)) + + +def find_duplicates(seq: Sequence[THashable]) -> Set[THashable]: + seen: Set[THashable] = set() + add = seen.add # skip future attr lookups for perf + return set(x 
for x in seq if x in seen or add(x)) diff --git a/cognite/client/utils/_concurrency.py b/cognite/client/utils/_concurrency.py index 30739b7669..c973bc323f 100644 --- a/cognite/client/utils/_concurrency.py +++ b/cognite/client/utils/_concurrency.py @@ -33,7 +33,7 @@ def raise_compound_exception_if_failed_tasks( str_format_element_fn: Optional[Callable] = None, ) -> None: if not self.exceptions: - return + return None task_unwrap_fn = (lambda x: x) if task_unwrap_fn is None else task_unwrap_fn if task_list_element_unwrap_fn is not None: successful = [] @@ -110,7 +110,9 @@ def collect_exc_info_and_raise( def execute_tasks_concurrently( func: Callable, tasks: Union[Sequence[Tuple], List[Dict]], max_workers: int ) -> TasksSummary: - assert max_workers > 0, "Number of workers should be >= 1, was {}".format(max_workers) + if max_workers < 1: + raise RuntimeError(f"Number of workers should be >= 1, was {max_workers}") + with ThreadPoolExecutor(max_workers) as p: futures = [] for task in tasks: diff --git a/cognite/client/utils/_identifier.py b/cognite/client/utils/_identifier.py index f4f7187f76..f77b3e7786 100644 --- a/cognite/client/utils/_identifier.py +++ b/cognite/client/utils/_identifier.py @@ -1,5 +1,5 @@ import numbers -from typing import Dict, Generic, Iterable, List, Optional, Sequence, TypeVar, Union, cast, overload +from typing import Dict, Generic, Iterable, List, Optional, Sequence, Tuple, TypeVar, Union, cast, overload T_ID = TypeVar("T_ID", int, str) @@ -25,12 +25,29 @@ def load(cls, id: Optional[int] = None, external_id: Optional[str] = None) -> "I def as_primitive(self) -> T_ID: return self.__value - def as_dict(self) -> Dict[str, T_ID]: + def as_dict(self, camel_case: bool = True) -> Dict[str, T_ID]: if isinstance(self.__value, str): - return {"externalId": self.__value} + if camel_case: + return {"externalId": self.__value} + return {"external_id": self.__value} else: return {"id": self.__value} + def as_tuple(self, camel_case: bool = True) -> Tuple[str, T_ID]: + if isinstance(self.__value, str): + if camel_case: + return ("externalId", self.__value) + return ("external_id", self.__value) + else: + return ("id", self.__value) + + def __str__(self) -> str: + identifier_type, identifier = self.as_tuple(camel_case=False) + return f"{type(self).__name__}({identifier_type}={identifier!r})" + + def __repr__(self) -> str: + return str(self) + class ExternalId(Identifier[str]): ... 
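The `Identifier.as_dict` / `as_tuple` / `__repr__` additions above make the camelCase-versus-snake_case choice explicit at the call site. A minimal doctest-style sketch of the intended behaviour, assuming `Identifier` is imported from the private `cognite.client.utils._identifier` module and that `load` is called positionally, as the SDK code above does:

.. code:: python

    >>> from cognite.client.utils._identifier import Identifier
    >>> xid = Identifier.load(None, "my-ts")  # only an external id is given
    >>> xid.as_dict()  # camelCase by default, i.e. the shape sent to the API
    {'externalId': 'my-ts'}
    >>> xid.as_dict(camel_case=False)
    {'external_id': 'my-ts'}
    >>> xid.as_tuple(camel_case=False)
    ('external_id', 'my-ts')
    >>> Identifier.load(42, None).as_dict()
    {'id': 42}

Since `__str__` and `__repr__` reuse `as_tuple(camel_case=False)`, printing an identifier now yields something like `Identifier(external_id='my-ts')`, with the class name taken from whatever concrete type `load` constructs.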
diff --git a/cognite/client/utils/_priority_tpe.py b/cognite/client/utils/_priority_tpe.py new file mode 100644 index 0000000000..59c4a78abe --- /dev/null +++ b/cognite/client/utils/_priority_tpe.py @@ -0,0 +1,130 @@ +""" +This code has been modified from the original created by Oleg Lupats, 2019, under an MIT license: +project = 'PriorityThreadPoolExecutor' +url = 'https://github.com/oleglpts/PriorityThreadPoolExecutor' +copyright = '2019, Oleg Lupats' +author = 'Oleg Lupats' +release = '0.0.1' + +MIT License + +Copyright (c) 2019 The Python Packaging Authority + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +""" + +import atexit +import inspect +import itertools +import sys +import threading +import weakref +from concurrent.futures.thread import ThreadPoolExecutor, _base, _python_exit, _threads_queues, _WorkItem +from queue import PriorityQueue +from threading import Lock + +NULL_ENTRY = (sys.maxsize, None, _WorkItem(None, None, (), {})) +_SHUTDOWN = False + + +def python_exit(): + global _SHUTDOWN + _SHUTDOWN = True + items = list(_threads_queues.items()) + for thread, queue in items: + queue.put(NULL_ENTRY) + for thread, queue in items: + thread.join() + + +atexit.unregister(_python_exit) +atexit.register(python_exit) + + +def _worker(executor_reference, work_queue): + try: + while True: + priority, _, work_item = work_queue.get(block=True) + if priority != sys.maxsize: + work_item.run() + del work_item + continue + executor = executor_reference() + if _SHUTDOWN or executor is None or executor._shutdown: + work_queue.put(NULL_ENTRY) + return None + del executor + except BaseException: + _base.LOGGER.critical("Exception in worker", exc_info=True) + + +class PriorityThreadPoolExecutor(ThreadPoolExecutor): + """Thread pool executor with queue.PriorityQueue()""" + + def __init__(self, max_workers=None): + super().__init__(max_workers) + self._work_queue = PriorityQueue() + self._lock = Lock() + self._counter = itertools.count() + + def counter(self): + with self._lock: + return next(self._counter) + + def submit(self, fn, *args, **kwargs): + if "priority" in inspect.signature(fn).parameters: + raise TypeError(f"Given function {fn} cannot accept reserved parameter name `priority`") + + with self._shutdown_lock: + if self._shutdown: + raise RuntimeError("Cannot schedule new futures after shutdown") + + priority = kwargs.pop("priority", None) + assert isinstance(priority, int), "`priority` has to be an integer" + + future = _base.Future() + work_item = _WorkItem(future, fn, args, kwargs) + + # `counter` to break ties, 
but keep order: + self._work_queue.put((priority, self.counter(), work_item)) + self._adjust_thread_count() + return future + + def _adjust_thread_count(self): + def weak_ref_cb(_, queue=self._work_queue): + queue.put(NULL_ENTRY) + + if len(self._threads) < self._max_workers: + thread = threading.Thread(target=_worker, args=(weakref.ref(self, weak_ref_cb), self._work_queue)) + thread.daemon = True + thread.start() + self._threads.add(thread) + _threads_queues[thread] = self._work_queue + + def shutdown(self, wait=True): + with self._shutdown_lock: + self._shutdown = True + self._work_queue.put(NULL_ENTRY) + if wait: + for thread in self._threads: + thread.join() + else: + # See: https://gist.github.com/clchiou/f2608cbe54403edb0b13 + self._threads.clear() + _threads_queues.clear() diff --git a/cognite/client/utils/_time.py b/cognite/client/utils/_time.py index 2d19f691e2..8a6dedd582 100644 --- a/cognite/client/utils/_time.py +++ b/cognite/client/utils/_time.py @@ -1,25 +1,25 @@ import numbers import re import time -import warnings -from datetime import datetime, timezone -from typing import Dict, List, Optional, Union +from datetime import datetime, timedelta, timezone +from typing import Dict, List, Optional, Tuple, Union -_unit_in_ms_without_week = {"s": 1000, "m": 60000, "h": 3600000, "d": 86400000} -_unit_in_ms = {**_unit_in_ms_without_week, "w": 604800000} +UNIT_IN_MS_WITHOUT_WEEK = {"s": 1000, "m": 60000, "h": 3600000, "d": 86400000} +UNIT_IN_MS = {**UNIT_IN_MS_WITHOUT_WEEK, "w": 604800000} MIN_TIMESTAMP_MS = -2208988800000 +MAX_TIMESTAMP_MS = 2556143999999 def datetime_to_ms(dt: datetime) -> int: - if dt.tzinfo is None: - warnings.warn( - "Interpreting given naive datetime as UTC instead of local time (against Python default behaviour). " - "This will change in the next major release (4.0.0). Please use (timezone) aware datetimes " - "or convert it yourself to integer (number of milliseconds since epoch, leap seconds excluded).", - FutureWarning, - ) - dt = dt.replace(tzinfo=timezone.utc) + """Converts datetime object to milliseconds since epoch. + + Args: + dt (datetime): Naive or aware datetime object. Naive datetimes are interpreted as local time. + + Returns: + ms: Milliseconds since epoch (negative for time prior to 1970-01-01) + """ return int(1000 * dt.timestamp()) @@ -27,22 +27,16 @@ def ms_to_datetime(ms: Union[int, float]) -> datetime: """Converts milliseconds since epoch to datetime object. Args: - ms (Union[int, float]): Milliseconds since epoch + ms (Union[int, float]): Milliseconds since epoch. Returns: - datetime: Naive datetime object in UTC. - + datetime: Aware datetime object in UTC. """ - if ms < 0: - raise ValueError("ms must be greater than or equal to zero.") + if not (MIN_TIMESTAMP_MS <= ms <= MAX_TIMESTAMP_MS): + raise ValueError(f"`ms` does not satisfy: {MIN_TIMESTAMP_MS} <= ms <= {MAX_TIMESTAMP_MS}") - warnings.warn( - "This function, `ms_to_datetime` returns a naive datetime object in UTC. This is against " - "the default interpretation of naive datetimes in Python (i.e. local time). 
This behaviour will " - "change to returning timezone-aware datetimes in UTC in the next major release (4.0.0).", - FutureWarning, - ) - return datetime.utcfromtimestamp(ms / 1000) + # Note: We don't use fromtimestamp because it typically fails for negative values on Windows + return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=ms) def time_string_to_ms(pattern: str, string: str, unit_in_ms: Dict[str, int]) -> Optional[int]: @@ -56,7 +50,7 @@ def time_string_to_ms(pattern: str, string: str, unit_in_ms: Dict[str, int]) -> def granularity_to_ms(granularity: str) -> int: - ms = time_string_to_ms(r"(\d+)({})", granularity, _unit_in_ms_without_week) + ms = time_string_to_ms(r"(\d+)({})", granularity, UNIT_IN_MS_WITHOUT_WEEK) if ms is None: raise ValueError( "Invalid granularity format: `{}`. Must be on format (s|m|h|d). E.g. '5m', '3h' or '1d'.".format( @@ -75,7 +69,7 @@ def time_ago_to_ms(time_ago_string: str) -> int: """Returns millisecond representation of time-ago string""" if time_ago_string == "now": return 0 - ms = time_string_to_ms(r"(\d+)({})-ago", time_ago_string, _unit_in_ms) + ms = time_string_to_ms(r"(\d+)({})-ago", time_ago_string, UNIT_IN_MS) if ms is None: raise ValueError( "Invalid time-ago format: `{}`. Must be on format (s|m|h|d|w)-ago or 'now'. E.g. '3d-ago' or '1w-ago'.".format( @@ -112,7 +106,7 @@ def timestamp_to_ms(timestamp: Union[int, float, str, datetime]) -> int: def _convert_time_attributes_in_dict(item: Dict) -> Dict: - TIME_ATTRIBUTES = [ + TIME_ATTRIBUTES = { "start_time", "end_time", "last_updated_time", @@ -121,7 +115,7 @@ def _convert_time_attributes_in_dict(item: Dict) -> Dict: "scheduled_execution_time", "source_created_time", "source_modified_time", - ] + } new_item = {} for k, v in item.items(): if k in TIME_ATTRIBUTES: @@ -142,3 +136,31 @@ def convert_time_attributes_to_datetime(item: Union[Dict, List[Dict]]) -> Union[ new_items.append(_convert_time_attributes_in_dict(el)) return new_items raise TypeError("item must be dict or list of dicts") + + +def align_start_and_end_for_granularity(start: int, end: int, granularity: str) -> Tuple[int, int]: + # Note the API always aligns `start` with 1s, 1m, 1h or 1d (even when given e.g. 73h) + remainder = start % granularity_unit_to_ms(granularity) + if remainder: + # Floor `start` when not exactly at boundary + start -= remainder + gms = granularity_to_ms(granularity) + remainder = (end - start) % gms + if remainder: + # Ceil `end` when not exactly at boundary decided by `start + N * granularity` + end += gms - remainder + return start, end + + +def split_time_range(start: int, end: int, n_splits: int, granularity_in_ms: int) -> List[int]: + if n_splits < 1: + raise ValueError(f"Cannot split into less than 1 piece, got {n_splits=}") + tot_ms = end - start + if n_splits * granularity_in_ms > tot_ms: + raise ValueError( + f"Given time interval ({tot_ms=}) could not be split as `{n_splits=}` times `{granularity_in_ms=}` " + "is larger than the interval itself." + ) + # Find a `delta_ms` thats a multiple of granularity in ms (trivial for raw queries). 
+ delta_ms = granularity_in_ms * round(tot_ms / n_splits / granularity_in_ms) + return [*(start + delta_ms * i for i in range(n_splits)), end] diff --git a/docs/source/cognite.rst b/docs/source/cognite.rst index fdacd8d6f3..e4d9409f5e 100644 --- a/docs/source/cognite.rst +++ b/docs/source/cognite.rst @@ -68,46 +68,7 @@ Limits for listing resources default to 25, so the following code will return th >>> from cognite.client import CogniteClient >>> c = CogniteClient() - >>> ts_list = c.time_series.list(include_metadata=False) - -Plot time series ----------------- -There are several ways of plotting a time series you have fetched from the API. The easiest is to call -:code:`.plot()` on the returned :code:`TimeSeries` or :code:`TimeSeriesList` objects. By default, this plots the raw -data points for the last 24 hours. If there are no data points for the last 24 hours, :code:`plot` will throw an exception. - -.. code:: python - - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> my_time_series = c.time_series.retrieve(id=) - >>> my_time_series.plot() - -You can also pass arguments to the :code:`.plot()` method to change the start, end, aggregates, and granularity of the -request. - -.. code:: python - - >>> my_time_series.plot(start="365d-ago", end="now", aggregates=["average"], granularity="1d") - -The :code:`Datapoints` and :code:`DatapointsList` objects that are returned when you fetch data points, also have :code:`.plot()` -methods you can use to plot the data. - -.. code:: python - - >>> from cognite.client import CogniteClient - >>> c = CogniteClient() - >>> my_datapoints = c.datapoints.retrieve( - ... id=[], - ... start="10d-ago", - ... end="now", - ... aggregates=["max"], - ... granularity="1h" - ... ) - >>> my_datapoints.plot() - -.. NOTE:: - To use the :code:`.plot()` functionality you need to install :code:`matplotlib`. + >>> ts_list = c.time_series.list() Create an asset hierarchy ------------------------- @@ -237,26 +198,16 @@ You can use the :code:`.to_pandas()` method on pretty much any object and get a This is particularly useful when you are working with time series data and with tabular data from the Raw API. -Matplotlib integration ----------------------- -You can use the :code:`.plot()` method on any time series or data points result that the SDK returns. The method takes keyword -arguments which are passed on to the underlying matplotlib plot function, allowing you to configure for example the -size and layout of your plots. - -You need to install the matplotlib package manually: - -.. code:: bash - - $ pip install matplotlib - How to install extra dependencies --------------------------------- -If your application requires the functionality from e.g. the :code:`pandas`, :code:`numpy`, or :code:`geopandas` dependencies, -you should install the sdk along with its optional dependencies. The available extras are: - -- pandas -- geo -- sympy +If your application requires the functionality from e.g. the :code:`pandas`, :code:`sympy`, or :code:`geopandas` dependencies, +you should install the SDK along with its optional dependencies. The available extras are: + +- numpy: numpy +- pandas: pandas +- geo: geopanda, shapely +- sympy: sympy +- functions: pip - all (will install dependencies for all the above) These can be installed with the following command: @@ -656,10 +607,6 @@ Retrieve pandas dataframe ^^^^^^^^^^^^^^^^^^^^^^^^^ .. 
automethod:: cognite.client._api.datapoints.DatapointsAPI.retrieve_dataframe -Retrieve pandas dataframes indexed by aggregate -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -.. automethod:: cognite.client._api.datapoints.DatapointsAPI.retrieve_dataframe_dict - Perform data points queries ^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. automethod:: cognite.client._api.datapoints.DatapointsAPI.query @@ -1111,12 +1058,12 @@ Start an asynchronous job to extract information from image files stored in CDF: >>> from cognite.client.data_classes.contextualization import VisionFeature >>> c = CogniteClient() >>> extract_job = c.vision.extract( - ... features=[VisionFeature.ASSET_TAG_DETECTION, VisionFeature.PEOPLE_DETECTION], + ... features=[VisionFeature.ASSET_TAG_DETECTION, VisionFeature.PEOPLE_DETECTION], ... file_ids=[1, 2], ... ) -The returned job object, :code:`extract_job`, can be used to retrieve the status of the job and the prediction results once the job is completed. +The returned job object, :code:`extract_job`, can be used to retrieve the status of the job and the prediction results once the job is completed. Wait for job completion and get the parsed results: .. code:: python @@ -1132,8 +1079,8 @@ Save the prediction results in CDF as `Annotations >> extract_job.save_predictions() -.. note:: - Prediction results are stored in CDF as `Annotations `_ using the :code:`images.*` annotation types. In particular, text detections are stored as :code:`images.TextRegion`, asset tag detections are stored as :code:`images.AssetLink`, while other detections are stored as :code:`images.ObjectDetection`. +.. note:: + Prediction results are stored in CDF as `Annotations `_ using the :code:`images.*` annotation types. In particular, text detections are stored as :code:`images.TextRegion`, asset tag detections are stored as :code:`images.AssetLink`, while other detections are stored as :code:`images.ObjectDetection`. Tweaking the parameters of a feature extractor: @@ -1486,7 +1433,7 @@ Run transformations by id ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. automethod:: cognite.client._api.transformations.TransformationsAPI.run .. automethod:: cognite.client._api.transformations.TransformationsAPI.run_async - + Preview transformations ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. 
automethod:: cognite.client._api.transformations.TransformationsAPI.preview diff --git a/mypy.ini b/mypy.ini index 3cd50754cd..38f4332a06 100644 --- a/mypy.ini +++ b/mypy.ini @@ -8,6 +8,7 @@ namespace_packages = true explicit_package_bases = true show_error_codes = true plugins = numpy.typing.mypy_plugin +exclude = _priority_tpe\.py$ [mypy-msal.*] ignore_missing_imports = true diff --git a/poetry.lock b/poetry.lock index a732aa1ede..cd73ac19f0 100644 --- a/poetry.lock +++ b/poetry.lock @@ -23,10 +23,10 @@ optional = false python-versions = ">=3.5" [package.extras] -dev = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "zope.interface", "furo", "sphinx", "sphinx-notfound-page", "pre-commit", "cloudpickle"] -docs = ["furo", "sphinx", "zope.interface", "sphinx-notfound-page"] -tests = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "zope.interface", "cloudpickle"] -tests_no_zope = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "cloudpickle"] +dev = ["cloudpickle", "coverage[toml] (>=5.0.2)", "furo", "hypothesis", "mypy (>=0.900,!=0.940)", "pre-commit", "pympler", "pytest (>=4.3.0)", "pytest-mypy-plugins", "sphinx", "sphinx-notfound-page", "zope.interface"] +docs = ["furo", "sphinx", "sphinx-notfound-page", "zope.interface"] +tests = ["cloudpickle", "coverage[toml] (>=5.0.2)", "hypothesis", "mypy (>=0.900,!=0.940)", "pympler", "pytest (>=4.3.0)", "pytest-mypy-plugins", "zope.interface"] +tests_no_zope = ["cloudpickle", "coverage[toml] (>=5.0.2)", "hypothesis", "mypy (>=0.900,!=0.940)", "pympler", "pytest (>=4.3.0)", "pytest-mypy-plugins"] [[package]] name = "babel" @@ -53,11 +53,11 @@ webencodings = "*" [package.extras] css = ["tinycss2 (>=1.1.0,<1.2)"] -dev = ["build (==0.8.0)", "flake8 (==4.0.1)", "hashin (==0.17.0)", "pip-tools (==6.6.2)", "pytest (==7.1.2)", "Sphinx (==4.3.2)", "tox (==3.25.0)", "twine (==4.0.1)", "wheel (==0.37.1)", "black (==22.3.0)", "mypy (==0.961)"] +dev = ["Sphinx (==4.3.2)", "black (==22.3.0)", "build (==0.8.0)", "flake8 (==4.0.1)", "hashin (==0.17.0)", "mypy (==0.961)", "pip-tools (==6.6.2)", "pytest (==7.1.2)", "tox (==3.25.0)", "twine (==4.0.1)", "wheel (==0.37.1)"] [[package]] name = "certifi" -version = "2022.6.15" +version = "2022.9.24" description = "Python package for providing Mozilla's CA Bundle." category = "main" optional = false @@ -116,7 +116,7 @@ python-versions = "*" click = ">=4.0" [package.extras] -dev = ["pytest (>=3.6)", "pytest-cov", "wheel", "coveralls"] +dev = ["coveralls", "pytest (>=3.6)", "pytest-cov", "wheel"] [[package]] name = "cligj" @@ -149,7 +149,7 @@ optional = false python-versions = "*" [package.extras] -test = ["hypothesis (==3.55.3)", "flake8 (==3.7.8)"] +test = ["flake8 (==3.7.8)", "hypothesis (==3.55.3)"] [[package]] name = "coverage" @@ -167,7 +167,7 @@ toml = ["tomli"] [[package]] name = "cryptography" -version = "37.0.4" +version = "38.0.1" description = "cryptography is a package which provides cryptographic recipes and primitives to Python developers." 
category = "main" optional = false @@ -178,19 +178,11 @@ cffi = ">=1.12" [package.extras] docs = ["sphinx (>=1.6.5,!=1.8.0,!=3.1.0,!=3.1.1)", "sphinx-rtd-theme"] -docstest = ["pyenchant (>=1.6.11)", "twine (>=1.12.0)", "sphinxcontrib-spelling (>=4.0.1)"] +docstest = ["pyenchant (>=1.6.11)", "sphinxcontrib-spelling (>=4.0.1)", "twine (>=1.12.0)"] pep8test = ["black", "flake8", "flake8-import-order", "pep8-naming"] -sdist = ["setuptools_rust (>=0.11.4)"] +sdist = ["setuptools-rust (>=0.11.4)"] ssh = ["bcrypt (>=3.1.5)"] -test = ["pytest (>=6.2.0)", "pytest-benchmark", "pytest-cov", "pytest-subtests", "pytest-xdist", "pretend", "iso8601", "pytz", "hypothesis (>=1.11.4,!=3.79.2)"] - -[[package]] -name = "cycler" -version = "0.11.0" -description = "Composable style cycles" -category = "dev" -optional = false -python-versions = ">=3.6" +test = ["hypothesis (>=1.11.4,!=3.79.2)", "iso8601", "pretend", "pytest (>=6.2.0)", "pytest-benchmark", "pytest-cov", "pytest-subtests", "pytest-xdist", "pytz"] [[package]] name = "distlib" @@ -249,32 +241,10 @@ munch = "*" six = ">=1.7" [package.extras] -all = ["boto3 (>=1.2.4)", "pytest-cov", "shapely", "pytest (>=3)", "mock"] +all = ["boto3 (>=1.2.4)", "mock", "pytest (>=3)", "pytest-cov", "shapely"] calc = ["shapely"] s3 = ["boto3 (>=1.2.4)"] -test = ["pytest (>=3)", "pytest-cov", "boto3 (>=1.2.4)", "mock"] - -[[package]] -name = "fonttools" -version = "4.37.1" -description = "Tools to manipulate font files" -category = "dev" -optional = false -python-versions = ">=3.7" - -[package.extras] -all = ["fs (>=2.2.0,<3)", "lxml (>=4.0,<5)", "zopfli (>=0.1.4)", "lz4 (>=1.7.4.2)", "matplotlib", "sympy", "skia-pathops (>=0.5.0)", "uharfbuzz (>=0.23.0)", "brotlicffi (>=0.8.0)", "scipy", "brotli (>=1.0.1)", "munkres", "unicodedata2 (>=14.0.0)", "xattr"] -graphite = ["lz4 (>=1.7.4.2)"] -interpolatable = ["scipy", "munkres"] -lxml = ["lxml (>=4.0,<5)"] -pathops = ["skia-pathops (>=0.5.0)"] -plot = ["matplotlib"] -repacker = ["uharfbuzz (>=0.23.0)"] -symfont = ["sympy"] -type1 = ["xattr"] -ufo = ["fs (>=2.2.0,<3)"] -unicode = ["unicodedata2 (>=14.0.0)"] -woff = ["zopfli (>=0.1.4)", "brotlicffi (>=0.8.0)", "brotli (>=1.0.1)"] +test = ["boto3 (>=1.2.4)", "mock", "pytest (>=3)", "pytest-cov"] [[package]] name = "geopandas" @@ -293,7 +263,7 @@ shapely = ">=1.7,<2" [[package]] name = "identify" -version = "2.5.3" +version = "2.5.5" description = "File identification library for Python" category = "dev" optional = false @@ -304,7 +274,7 @@ license = ["ukkonen"] [[package]] name = "idna" -version = "3.3" +version = "3.4" description = "Internationalized Domain Names in Applications (IDNA)" category = "main" optional = false @@ -330,9 +300,9 @@ python-versions = ">=3.7" zipp = ">=0.5" [package.extras] -docs = ["sphinx", "jaraco.packaging (>=9)", "rst.linker (>=1.9)"] +docs = ["jaraco.packaging (>=9)", "rst.linker (>=1.9)", "sphinx"] perf = ["ipython"] -testing = ["pytest (>=6)", "pytest-checkdocs (>=2.4)", "pytest-flake8", "pytest-cov", "pytest-enabler (>=1.3)", "packaging", "pyfakefs", "flufl.flake8", "pytest-perf (>=0.9.2)", "pytest-black (>=0.3.7)", "pytest-mypy (>=0.9.1)", "importlib-resources (>=1.3)"] +testing = ["flufl.flake8", "importlib-resources (>=1.3)", "packaging", "pyfakefs", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=1.3)", "pytest-flake8", "pytest-mypy (>=0.9.1)", "pytest-perf (>=0.9.2)"] [[package]] name = "iniconfig" @@ -354,8 +324,8 @@ python-versions = ">=3.7" more-itertools = "*" [package.extras] 
-docs = ["sphinx", "jaraco.packaging (>=9)", "rst.linker (>=1.9)", "jaraco.tidelift (>=1.4)"] -testing = ["pytest (>=6)", "pytest-checkdocs (>=2.4)", "pytest-flake8", "pytest-cov", "pytest-enabler (>=1.3)", "pytest-black (>=0.3.7)", "pytest-mypy (>=0.9.1)"] +docs = ["jaraco.packaging (>=9)", "jaraco.tidelift (>=1.4)", "rst.linker (>=1.9)", "sphinx"] +testing = ["pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=1.3)", "pytest-flake8", "pytest-mypy (>=0.9.1)"] [[package]] name = "jeepney" @@ -366,8 +336,8 @@ optional = false python-versions = ">=3.7" [package.extras] -test = ["pytest", "pytest-trio", "pytest-asyncio (>=0.17)", "testpath", "trio", "async-timeout"] -trio = ["trio", "async-generator"] +test = ["async-timeout", "pytest", "pytest-asyncio (>=0.17)", "pytest-trio", "testpath", "trio"] +trio = ["async-generator", "trio"] [[package]] name = "jinja2" @@ -385,7 +355,7 @@ i18n = ["Babel (>=2.7)"] [[package]] name = "keyring" -version = "23.9.0" +version = "23.9.3" description = "Store and access your passwords safely." category = "dev" optional = false @@ -399,16 +369,8 @@ pywin32-ctypes = {version = "<0.1.0 || >0.1.0,<0.1.1 || >0.1.1", markers = "sys_ SecretStorage = {version = ">=3.2", markers = "sys_platform == \"linux\""} [package.extras] -docs = ["sphinx", "jaraco.packaging (>=9)", "rst.linker (>=1.9)", "jaraco.tidelift (>=1.4)"] -testing = ["pytest (>=6)", "pytest-checkdocs (>=2.4)", "pytest-flake8", "flake8 (<5)", "pytest-cov", "pytest-enabler (>=1.3)", "pytest-black (>=0.3.7)", "pytest-mypy (>=0.9.1)"] - -[[package]] -name = "kiwisolver" -version = "1.4.4" -description = "A fast implementation of the Cassowary constraint solver" -category = "dev" -optional = false -python-versions = ">=3.7" +docs = ["jaraco.packaging (>=9)", "jaraco.tidelift (>=1.4)", "rst.linker (>=1.9)", "sphinx"] +testing = ["flake8 (<5)", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=1.3)", "pytest-flake8", "pytest-mypy (>=0.9.1)"] [[package]] name = "markupsafe" @@ -418,25 +380,6 @@ category = "dev" optional = false python-versions = ">=3.7" -[[package]] -name = "matplotlib" -version = "3.5.3" -description = "Python plotting package" -category = "dev" -optional = false -python-versions = ">=3.7" - -[package.dependencies] -cycler = ">=0.10" -fonttools = ">=4.22.0" -kiwisolver = ">=1.0.1" -numpy = ">=1.17" -packaging = ">=20.0" -pillow = ">=6.2.0" -pyparsing = ">=2.2.1" -python-dateutil = ">=2.7" -setuptools_scm = ">=4,<7" - [[package]] name = "more-itertools" version = "8.14.0" @@ -454,19 +397,19 @@ optional = true python-versions = "*" [package.extras] -develop = ["pytest (>=4.6)", "pycodestyle", "pytest-cov", "codecov", "wheel"] +develop = ["codecov", "pycodestyle", "pytest (>=4.6)", "pytest-cov", "wheel"] tests = ["pytest (>=4.6)"] [[package]] name = "msal" -version = "1.18.0" +version = "1.19.0" description = "The Microsoft Authentication Library (MSAL) for Python library enables your app to access the Microsoft Cloud by supporting authentication of users with Microsoft Azure Active Directory accounts (AAD) and Microsoft Accounts (MSA) using industry standard OAuth2 and OpenID Connect." 
category = "main" optional = false python-versions = "*" [package.dependencies] -cryptography = ">=0.6,<40" +cryptography = ">=0.6,<41" PyJWT = {version = ">=1.0.0,<3", extras = ["crypto"]} requests = ">=2.0.0,<3" @@ -482,7 +425,7 @@ python-versions = "*" six = "*" [package.extras] -testing = ["pytest", "coverage", "astroid (>=1.5.3,<1.6.0)", "pylint (>=1.7.2,<1.8.0)", "astroid (>=2.0)", "pylint (>=2.3.1,<2.4.0)"] +testing = ["astroid (>=1.5.3,<1.6.0)", "astroid (>=2.0)", "coverage", "pylint (>=1.7.2,<1.8.0)", "pylint (>=2.3.1,<2.4.0)", "pytest"] yaml = ["PyYAML (>=5.1.0)"] [[package]] @@ -521,15 +464,15 @@ python-versions = ">=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.* [[package]] name = "numpy" -version = "1.23.2" +version = "1.23.3" description = "NumPy is the fundamental package for array computing with Python." category = "main" -optional = false +optional = true python-versions = ">=3.8" [[package]] name = "oauthlib" -version = "3.2.0" +version = "3.2.1" description = "A generic, spec-compliant, thorough implementation of the OAuth request-signing logic" category = "main" optional = false @@ -553,7 +496,7 @@ pyparsing = ">=2.0.2,<3.0.5 || >3.0.5" [[package]] name = "pandas" -version = "1.4.4" +version = "1.5.0" description = "Powerful data structures for data analysis, time series, and statistics" category = "main" optional = true @@ -561,10 +504,8 @@ python-versions = ">=3.8" [package.dependencies] numpy = [ - {version = ">=1.18.5", markers = "platform_machine != \"aarch64\" and platform_machine != \"arm64\" and python_version < \"3.10\""}, - {version = ">=1.19.2", markers = "platform_machine == \"aarch64\" and python_version < \"3.10\""}, - {version = ">=1.20.0", markers = "platform_machine == \"arm64\" and python_version < \"3.10\""}, {version = ">=1.21.0", markers = "python_version >= \"3.10\""}, + {version = ">=1.20.3", markers = "python_version < \"3.10\""}, ] python-dateutil = ">=2.8.1" pytz = ">=2020.1" @@ -572,18 +513,6 @@ pytz = ">=2020.1" [package.extras] test = ["hypothesis (>=5.5.3)", "pytest (>=6.0)", "pytest-xdist (>=1.31)"] -[[package]] -name = "pillow" -version = "9.2.0" -description = "Python Imaging Library (Fork)" -category = "dev" -optional = false -python-versions = ">=3.7" - -[package.extras] -docs = ["furo", "olefile", "sphinx (>=2.4)", "sphinx-copybutton", "sphinx-issues (>=3.0.1)", "sphinx-removed-in", "sphinxext-opengraph"] -tests = ["check-manifest", "coverage", "defusedxml", "markdown2", "olefile", "packaging", "pyroma", "pytest", "pytest-cov", "pytest-timeout"] - [[package]] name = "pkginfo" version = "1.8.3" @@ -593,7 +522,7 @@ optional = false python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*" [package.extras] -testing = ["nose", "coverage"] +testing = ["coverage", "nose"] [[package]] name = "platformdirs" @@ -604,8 +533,8 @@ optional = false python-versions = ">=3.7" [package.extras] -docs = ["furo (>=2021.7.5b38)", "proselint (>=0.10.2)", "sphinx-autodoc-typehints (>=1.12)", "sphinx (>=4)"] -test = ["appdirs (==1.4.4)", "pytest-cov (>=2.7)", "pytest-mock (>=3.6)", "pytest (>=6)"] +docs = ["furo (>=2021.7.5b38)", "proselint (>=0.10.2)", "sphinx (>=4)", "sphinx-autodoc-typehints (>=1.12)"] +test = ["appdirs (==1.4.4)", "pytest (>=6)", "pytest-cov (>=2.7)", "pytest-mock (>=3.6)"] [[package]] name = "pluggy" @@ -664,20 +593,21 @@ plugins = ["importlib-metadata"] [[package]] name = "pyjwt" -version = "2.4.0" +version = "2.5.0" description = "JSON Web Token implementation in Python" category = "main" optional = 
false -python-versions = ">=3.6" +python-versions = ">=3.7" [package.dependencies] cryptography = {version = ">=3.3.1", optional = true, markers = "extra == \"crypto\""} +types-cryptography = {version = ">=3.3.21", optional = true, markers = "extra == \"crypto\""} [package.extras] -crypto = ["cryptography (>=3.3.1)"] -dev = ["sphinx", "sphinx-rtd-theme", "zope.interface", "cryptography (>=3.3.1)", "pytest (>=6.0.0,<7.0.0)", "coverage[toml] (==5.0.4)", "mypy", "pre-commit"] -docs = ["sphinx", "sphinx-rtd-theme", "zope.interface"] -tests = ["pytest (>=6.0.0,<7.0.0)", "coverage[toml] (==5.0.4)"] +crypto = ["cryptography (>=3.3.1)", "types-cryptography (>=3.3.21)"] +dev = ["coverage[toml] (==5.0.4)", "cryptography (>=3.3.1)", "pre-commit", "pytest (>=6.0.0,<7.0.0)", "sphinx (>=4.5.0,<5.0.0)", "sphinx-rtd-theme", "types-cryptography (>=3.3.21)", "zope.interface"] +docs = ["sphinx (>=4.5.0,<5.0.0)", "sphinx-rtd-theme", "zope.interface"] +tests = ["coverage[toml] (==5.0.4)", "pytest (>=6.0.0,<7.0.0)"] [[package]] name = "pyparsing" @@ -688,11 +618,11 @@ optional = false python-versions = ">=3.6.8" [package.extras] -diagrams = ["railroad-diagrams", "jinja2"] +diagrams = ["jinja2", "railroad-diagrams"] [[package]] name = "pyproj" -version = "3.3.1" +version = "3.4.0" description = "Python interface to PROJ (cartographic projections and coordinate transformations library)" category = "main" optional = true @@ -734,7 +664,7 @@ python-versions = ">=3.7" pytest = ">=6.1.0" [package.extras] -testing = ["coverage (==6.2)", "hypothesis (>=5.7.1)", "flaky (>=3.5.0)", "mypy (==0.931)", "pytest-trio (>=0.7.0)"] +testing = ["coverage (==6.2)", "flaky (>=3.5.0)", "hypothesis (>=5.7.1)", "mypy (==0.931)", "pytest-trio (>=0.7.0)"] [[package]] name = "pytest-cov" @@ -749,7 +679,7 @@ coverage = {version = ">=5.2.1", extras = ["toml"]} pytest = ">=4.6" [package.extras] -testing = ["virtualenv", "pytest-xdist", "six", "process-tests", "hunter", "fields"] +testing = ["fields", "hunter", "process-tests", "pytest-xdist", "six", "virtualenv"] [[package]] name = "pytest-forked" @@ -797,7 +727,7 @@ name = "python-dateutil" version = "2.8.2" description = "Extensions to the standard Python datetime module" category = "main" -optional = false +optional = true python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,>=2.7" [package.dependencies] @@ -840,7 +770,7 @@ python-versions = ">=3.6" [[package]] name = "readme-renderer" -version = "37.0" +version = "37.2" description = "readme_renderer is a library for rendering \"readme\" descriptions for Warehouse" category = "dev" optional = false @@ -911,7 +841,7 @@ requests = ">=2.0,<3.0" urllib3 = ">=1.25.10" [package.extras] -tests = ["pytest (>=7.0.0)", "coverage (>=6.0.0)", "pytest-cov", "pytest-asyncio", "pytest-localserver", "flake8", "types-mock", "types-requests", "mypy"] +tests = ["coverage (>=6.0.0)", "flake8", "mypy", "pytest (>=7.0.0)", "pytest-asyncio", "pytest-cov", "pytest-localserver", "types-mock", "types-requests"] [[package]] name = "rfc3986" @@ -952,22 +882,6 @@ python-versions = ">=3.6" cryptography = ">=2.0" jeepney = ">=0.6" -[[package]] -name = "setuptools-scm" -version = "6.4.2" -description = "the blessed package to manage your versions by scm tags" -category = "dev" -optional = false -python-versions = ">=3.6" - -[package.dependencies] -packaging = ">=20.0" -tomli = ">=1.0.0" - -[package.extras] -test = ["pytest (>=6.2)", "virtualenv (>20)"] -toml = ["setuptools (>=42)"] - [[package]] name = "shapely" version = "1.8.4" @@ -977,7 +891,7 @@ optional = true 
python-versions = ">=3.6" [package.extras] -all = ["pytest", "pytest-cov", "numpy"] +all = ["numpy", "pytest", "pytest-cov"] test = ["pytest", "pytest-cov"] vectorized = ["numpy"] @@ -997,9 +911,17 @@ category = "dev" optional = false python-versions = "*" +[[package]] +name = "sortedcontainers" +version = "2.4.0" +description = "Sorted Containers -- Sorted List, Sorted Dict, Sorted Set" +category = "main" +optional = false +python-versions = "*" + [[package]] name = "sphinx" -version = "5.1.1" +version = "5.2.1" description = "Python documentation generator" category = "dev" optional = false @@ -1007,16 +929,16 @@ python-versions = ">=3.6" [package.dependencies] alabaster = ">=0.7,<0.8" -babel = ">=1.3" -colorama = {version = ">=0.3.5", markers = "sys_platform == \"win32\""} +babel = ">=2.9" +colorama = {version = ">=0.4.5", markers = "sys_platform == \"win32\""} docutils = ">=0.14,<0.20" -imagesize = "*" -importlib-metadata = {version = ">=4.4", markers = "python_version < \"3.10\""} -Jinja2 = ">=2.3" -packaging = "*" -Pygments = ">=2.0" +imagesize = ">=1.3" +importlib-metadata = {version = ">=4.8", markers = "python_version < \"3.10\""} +Jinja2 = ">=3.0" +packaging = ">=21.0" +Pygments = ">=2.12" requests = ">=2.5.0" -snowballstemmer = ">=1.1" +snowballstemmer = ">=2.0" sphinxcontrib-applehelp = "*" sphinxcontrib-devhelp = "*" sphinxcontrib-htmlhelp = ">=2.0.0" @@ -1026,8 +948,8 @@ sphinxcontrib-serializinghtml = ">=1.1.5" [package.extras] docs = ["sphinxcontrib-websupport"] -lint = ["flake8 (>=3.5.0)", "flake8-comprehensions", "flake8-bugbear", "isort", "mypy (>=0.971)", "sphinx-lint", "docutils-stubs", "types-typed-ast", "types-requests"] -test = ["pytest (>=4.6)", "html5lib", "cython", "typed-ast"] +lint = ["docutils-stubs", "flake8 (>=3.5.0)", "flake8-bugbear", "flake8-comprehensions", "flake8-simplify", "isort", "mypy (>=0.971)", "sphinx-lint", "types-requests", "types-typed-ast"] +test = ["cython", "html5lib", "pytest (>=4.6)", "typed-ast"] [[package]] name = "sphinx-rtd-theme" @@ -1042,7 +964,7 @@ docutils = "<0.18" sphinx = ">=1.6" [package.extras] -dev = ["transifex-client", "sphinxcontrib-httpdomain", "bump2version"] +dev = ["bump2version", "sphinxcontrib-httpdomain", "transifex-client"] [[package]] name = "sphinxcontrib-applehelp" @@ -1053,7 +975,7 @@ optional = false python-versions = ">=3.5" [package.extras] -lint = ["flake8", "mypy", "docutils-stubs"] +lint = ["docutils-stubs", "flake8", "mypy"] test = ["pytest"] [[package]] @@ -1065,7 +987,7 @@ optional = false python-versions = ">=3.5" [package.extras] -lint = ["flake8", "mypy", "docutils-stubs"] +lint = ["docutils-stubs", "flake8", "mypy"] test = ["pytest"] [[package]] @@ -1077,8 +999,8 @@ optional = false python-versions = ">=3.6" [package.extras] -lint = ["flake8", "mypy", "docutils-stubs"] -test = ["pytest", "html5lib"] +lint = ["docutils-stubs", "flake8", "mypy"] +test = ["html5lib", "pytest"] [[package]] name = "sphinxcontrib-jsmath" @@ -1089,7 +1011,7 @@ optional = false python-versions = ">=3.5" [package.extras] -test = ["pytest", "flake8", "mypy"] +test = ["flake8", "mypy", "pytest"] [[package]] name = "sphinxcontrib-qthelp" @@ -1100,7 +1022,7 @@ optional = false python-versions = ">=3.5" [package.extras] -lint = ["flake8", "mypy", "docutils-stubs"] +lint = ["docutils-stubs", "flake8", "mypy"] test = ["pytest"] [[package]] @@ -1112,7 +1034,7 @@ optional = false python-versions = ">=3.5" [package.extras] -lint = ["flake8", "mypy", "docutils-stubs"] +lint = ["docutils-stubs", "flake8", "mypy"] test = ["pytest"] 
[[package]] @@ -1161,9 +1083,17 @@ rfc3986 = ">=1.4.0" rich = ">=12.0.0" urllib3 = ">=1.26.0" +[[package]] +name = "types-cryptography" +version = "3.3.23" +description = "Typing stubs for cryptography" +category = "main" +optional = false +python-versions = "*" + [[package]] name = "types-requests" -version = "2.28.9" +version = "2.28.11" description = "Typing stubs for requests" category = "dev" optional = false @@ -1174,7 +1104,7 @@ types-urllib3 = "<1.27" [[package]] name = "types-urllib3" -version = "1.26.23" +version = "1.26.24" description = "Typing stubs for urllib3" category = "dev" optional = false @@ -1197,13 +1127,13 @@ optional = false python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, <4" [package.extras] -brotli = ["brotlicffi (>=0.8.0)", "brotli (>=1.0.9)", "brotlipy (>=0.6.0)"] -secure = ["pyOpenSSL (>=0.14)", "cryptography (>=1.3.4)", "idna (>=2.0.0)", "certifi", "urllib3-secure-extra", "ipaddress"] +brotli = ["brotli (>=1.0.9)", "brotlicffi (>=0.8.0)", "brotlipy (>=0.6.0)"] +secure = ["certifi", "cryptography (>=1.3.4)", "idna (>=2.0.0)", "ipaddress", "pyOpenSSL (>=0.14)", "urllib3-secure-extra"] socks = ["PySocks (>=1.5.6,!=1.5.7,<2.0)"] [[package]] name = "virtualenv" -version = "20.16.4" +version = "20.16.5" description = "Virtual Python Environment builder" category = "dev" optional = false @@ -1235,40 +1165,120 @@ optional = false python-versions = ">=3.7" [package.extras] -docs = ["sphinx", "jaraco.packaging (>=9)", "rst.linker (>=1.9)", "jaraco.tidelift (>=1.4)"] -testing = ["pytest (>=6)", "pytest-checkdocs (>=2.4)", "pytest-flake8", "pytest-cov", "pytest-enabler (>=1.3)", "jaraco.itertools", "func-timeout", "pytest-black (>=0.3.7)", "pytest-mypy (>=0.9.1)"] +docs = ["jaraco.packaging (>=9)", "jaraco.tidelift (>=1.4)", "rst.linker (>=1.9)", "sphinx"] +testing = ["func-timeout", "jaraco.itertools", "pytest (>=6)", "pytest-black (>=0.3.7)", "pytest-checkdocs (>=2.4)", "pytest-cov", "pytest-enabler (>=1.3)", "pytest-flake8", "pytest-mypy (>=0.9.1)"] [extras] all = ["pandas", "geopandas", "shapely", "sympy", "pip"] functions = ["pip"] geo = ["geopandas", "shapely"] +numpy = ["numpy"] pandas = ["pandas"] sympy = ["sympy"] [metadata] lock-version = "1.1" python-versions = "^3.8" -content-hash = "f1a0caf224c1ef325aec07d36a47703507b482e986e5a454bf69ef5147826a6a" +content-hash = "e407979dc6bc8192c8621760783855da54414a6567e976be97bb54d9adbf44c4" [metadata.files] alabaster = [ {file = "alabaster-0.7.12-py2.py3-none-any.whl", hash = "sha256:446438bdcca0e05bd45ea2de1668c1d9b032e1a9154c2c259092d77031ddd359"}, {file = "alabaster-0.7.12.tar.gz", hash = "sha256:a661d72d58e6ea8a57f7a86e37d86716863ee5e92788398526d58b26a4e4dc02"}, ] -atomicwrites = [] -attrs = [] -babel = [] -bleach = [] +atomicwrites = [ + {file = "atomicwrites-1.4.1.tar.gz", hash = "sha256:81b2c9071a49367a7f770170e5eec8cb66567cfbbc8c73d20ce5ca4a8d71cf11"}, +] +attrs = [ + {file = "attrs-22.1.0-py2.py3-none-any.whl", hash = "sha256:86efa402f67bf2df34f51a335487cf46b1ec130d02b8d39fd248abfd30da551c"}, + {file = "attrs-22.1.0.tar.gz", hash = "sha256:29adc2665447e5191d0e7c568fde78b21f9672d344281d0c6e1ab085429b22b6"}, +] +babel = [ + {file = "Babel-2.10.3-py3-none-any.whl", hash = "sha256:ff56f4892c1c4bf0d814575ea23471c230d544203c7748e8c68f0089478d48eb"}, + {file = "Babel-2.10.3.tar.gz", hash = "sha256:7614553711ee97490f732126dc077f8d0ae084ebc6a96e23db1482afabdb2c51"}, +] +bleach = [ + {file = "bleach-5.0.1-py3-none-any.whl", hash = 
"sha256:085f7f33c15bd408dd9b17a4ad77c577db66d76203e5984b1bd59baeee948b2a"}, + {file = "bleach-5.0.1.tar.gz", hash = "sha256:0d03255c47eb9bd2f26aa9bb7f2107732e7e8fe195ca2f64709fcf3b0a4a085c"}, +] certifi = [ - {file = "certifi-2022.6.15-py3-none-any.whl", hash = "sha256:fe86415d55e84719d75f8b69414f6438ac3547d2078ab91b67e779ef69378412"}, - {file = "certifi-2022.6.15.tar.gz", hash = "sha256:84c85a9078b11105f04f3036a9482ae10e4621616db313fe045dd24743a0820d"}, + {file = "certifi-2022.9.24-py3-none-any.whl", hash = "sha256:90c1a32f1d68f940488354e36370f6cca89f0f106db09518524c88d6ed83f382"}, + {file = "certifi-2022.9.24.tar.gz", hash = "sha256:0d9c601124e5a6ba9712dbc60d9c53c21e34f5f641fe83002317394311bdce14"}, +] +cffi = [ + {file = "cffi-1.15.1-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:a66d3508133af6e8548451b25058d5812812ec3798c886bf38ed24a98216fab2"}, + {file = "cffi-1.15.1-cp27-cp27m-manylinux1_i686.whl", hash = "sha256:470c103ae716238bbe698d67ad020e1db9d9dba34fa5a899b5e21577e6d52ed2"}, + {file = "cffi-1.15.1-cp27-cp27m-manylinux1_x86_64.whl", hash = "sha256:9ad5db27f9cabae298d151c85cf2bad1d359a1b9c686a275df03385758e2f914"}, + {file = "cffi-1.15.1-cp27-cp27m-win32.whl", hash = "sha256:b3bbeb01c2b273cca1e1e0c5df57f12dce9a4dd331b4fa1635b8bec26350bde3"}, + {file = "cffi-1.15.1-cp27-cp27m-win_amd64.whl", hash = "sha256:e00b098126fd45523dd056d2efba6c5a63b71ffe9f2bbe1a4fe1716e1d0c331e"}, + {file = "cffi-1.15.1-cp27-cp27mu-manylinux1_i686.whl", hash = "sha256:d61f4695e6c866a23a21acab0509af1cdfd2c013cf256bbf5b6b5e2695827162"}, + {file = "cffi-1.15.1-cp27-cp27mu-manylinux1_x86_64.whl", hash = "sha256:ed9cb427ba5504c1dc15ede7d516b84757c3e3d7868ccc85121d9310d27eed0b"}, + {file = "cffi-1.15.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:39d39875251ca8f612b6f33e6b1195af86d1b3e60086068be9cc053aa4376e21"}, + {file = "cffi-1.15.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:285d29981935eb726a4399badae8f0ffdff4f5050eaa6d0cfc3f64b857b77185"}, + {file = "cffi-1.15.1-cp310-cp310-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3eb6971dcff08619f8d91607cfc726518b6fa2a9eba42856be181c6d0d9515fd"}, + {file = "cffi-1.15.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:21157295583fe8943475029ed5abdcf71eb3911894724e360acff1d61c1d54bc"}, + {file = "cffi-1.15.1-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5635bd9cb9731e6d4a1132a498dd34f764034a8ce60cef4f5319c0541159392f"}, + {file = "cffi-1.15.1-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2012c72d854c2d03e45d06ae57f40d78e5770d252f195b93f581acf3ba44496e"}, + {file = "cffi-1.15.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:dd86c085fae2efd48ac91dd7ccffcfc0571387fe1193d33b6394db7ef31fe2a4"}, + {file = "cffi-1.15.1-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:fa6693661a4c91757f4412306191b6dc88c1703f780c8234035eac011922bc01"}, + {file = "cffi-1.15.1-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:59c0b02d0a6c384d453fece7566d1c7e6b7bae4fc5874ef2ef46d56776d61c9e"}, + {file = "cffi-1.15.1-cp310-cp310-win32.whl", hash = "sha256:cba9d6b9a7d64d4bd46167096fc9d2f835e25d7e4c121fb2ddfc6528fb0413b2"}, + {file = "cffi-1.15.1-cp310-cp310-win_amd64.whl", hash = "sha256:ce4bcc037df4fc5e3d184794f27bdaab018943698f4ca31630bc7f84a7b69c6d"}, + {file = "cffi-1.15.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:3d08afd128ddaa624a48cf2b859afef385b720bb4b43df214f85616922e6a5ac"}, + {file = 
"cffi-1.15.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:3799aecf2e17cf585d977b780ce79ff0dc9b78d799fc694221ce814c2c19db83"}, + {file = "cffi-1.15.1-cp311-cp311-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a591fe9e525846e4d154205572a029f653ada1a78b93697f3b5a8f1f2bc055b9"}, + {file = "cffi-1.15.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3548db281cd7d2561c9ad9984681c95f7b0e38881201e157833a2342c30d5e8c"}, + {file = "cffi-1.15.1-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:91fc98adde3d7881af9b59ed0294046f3806221863722ba7d8d120c575314325"}, + {file = "cffi-1.15.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:94411f22c3985acaec6f83c6df553f2dbe17b698cc7f8ae751ff2237d96b9e3c"}, + {file = "cffi-1.15.1-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:03425bdae262c76aad70202debd780501fabeaca237cdfddc008987c0e0f59ef"}, + {file = "cffi-1.15.1-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:cc4d65aeeaa04136a12677d3dd0b1c0c94dc43abac5860ab33cceb42b801c1e8"}, + {file = "cffi-1.15.1-cp311-cp311-win32.whl", hash = "sha256:a0f100c8912c114ff53e1202d0078b425bee3649ae34d7b070e9697f93c5d52d"}, + {file = "cffi-1.15.1-cp311-cp311-win_amd64.whl", hash = "sha256:04ed324bda3cda42b9b695d51bb7d54b680b9719cfab04227cdd1e04e5de3104"}, + {file = "cffi-1.15.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:50a74364d85fd319352182ef59c5c790484a336f6db772c1a9231f1c3ed0cbd7"}, + {file = "cffi-1.15.1-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e263d77ee3dd201c3a142934a086a4450861778baaeeb45db4591ef65550b0a6"}, + {file = "cffi-1.15.1-cp36-cp36m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:cec7d9412a9102bdc577382c3929b337320c4c4c4849f2c5cdd14d7368c5562d"}, + {file = "cffi-1.15.1-cp36-cp36m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4289fc34b2f5316fbb762d75362931e351941fa95fa18789191b33fc4cf9504a"}, + {file = "cffi-1.15.1-cp36-cp36m-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:173379135477dc8cac4bc58f45db08ab45d228b3363adb7af79436135d028405"}, + {file = "cffi-1.15.1-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:6975a3fac6bc83c4a65c9f9fcab9e47019a11d3d2cf7f3c0d03431bf145a941e"}, + {file = "cffi-1.15.1-cp36-cp36m-win32.whl", hash = "sha256:2470043b93ff09bf8fb1d46d1cb756ce6132c54826661a32d4e4d132e1977adf"}, + {file = "cffi-1.15.1-cp36-cp36m-win_amd64.whl", hash = "sha256:30d78fbc8ebf9c92c9b7823ee18eb92f2e6ef79b45ac84db507f52fbe3ec4497"}, + {file = "cffi-1.15.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:198caafb44239b60e252492445da556afafc7d1e3ab7a1fb3f0584ef6d742375"}, + {file = "cffi-1.15.1-cp37-cp37m-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5ef34d190326c3b1f822a5b7a45f6c4535e2f47ed06fec77d3d799c450b2651e"}, + {file = "cffi-1.15.1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8102eaf27e1e448db915d08afa8b41d6c7ca7a04b7d73af6514df10a3e74bd82"}, + {file = "cffi-1.15.1-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5df2768244d19ab7f60546d0c7c63ce1581f7af8b5de3eb3004b9b6fc8a9f84b"}, + {file = "cffi-1.15.1-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a8c4917bd7ad33e8eb21e9a5bbba979b49d9a97acb3a803092cbc1133e20343c"}, + {file = "cffi-1.15.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = 
"sha256:0e2642fe3142e4cc4af0799748233ad6da94c62a8bec3a6648bf8ee68b1c7426"}, + {file = "cffi-1.15.1-cp37-cp37m-win32.whl", hash = "sha256:e229a521186c75c8ad9490854fd8bbdd9a0c9aa3a524326b55be83b54d4e0ad9"}, + {file = "cffi-1.15.1-cp37-cp37m-win_amd64.whl", hash = "sha256:a0b71b1b8fbf2b96e41c4d990244165e2c9be83d54962a9a1d118fd8657d2045"}, + {file = "cffi-1.15.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:320dab6e7cb2eacdf0e658569d2575c4dad258c0fcc794f46215e1e39f90f2c3"}, + {file = "cffi-1.15.1-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:1e74c6b51a9ed6589199c787bf5f9875612ca4a8a0785fb2d4a84429badaf22a"}, + {file = "cffi-1.15.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a5c84c68147988265e60416b57fc83425a78058853509c1b0629c180094904a5"}, + {file = "cffi-1.15.1-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:3b926aa83d1edb5aa5b427b4053dc420ec295a08e40911296b9eb1b6170f6cca"}, + {file = "cffi-1.15.1-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:87c450779d0914f2861b8526e035c5e6da0a3199d8f1add1a665e1cbc6fc6d02"}, + {file = "cffi-1.15.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4f2c9f67e9821cad2e5f480bc8d83b8742896f1242dba247911072d4fa94c192"}, + {file = "cffi-1.15.1-cp38-cp38-win32.whl", hash = "sha256:8b7ee99e510d7b66cdb6c593f21c043c248537a32e0bedf02e01e9553a172314"}, + {file = "cffi-1.15.1-cp38-cp38-win_amd64.whl", hash = "sha256:00a9ed42e88df81ffae7a8ab6d9356b371399b91dbdf0c3cb1e84c03a13aceb5"}, + {file = "cffi-1.15.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:54a2db7b78338edd780e7ef7f9f6c442500fb0d41a5a4ea24fff1c929d5af585"}, + {file = "cffi-1.15.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:fcd131dd944808b5bdb38e6f5b53013c5aa4f334c5cad0c72742f6eba4b73db0"}, + {file = "cffi-1.15.1-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:7473e861101c9e72452f9bf8acb984947aa1661a7704553a9f6e4baa5ba64415"}, + {file = "cffi-1.15.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6c9a799e985904922a4d207a94eae35c78ebae90e128f0c4e521ce339396be9d"}, + {file = "cffi-1.15.1-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:3bcde07039e586f91b45c88f8583ea7cf7a0770df3a1649627bf598332cb6984"}, + {file = "cffi-1.15.1-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:33ab79603146aace82c2427da5ca6e58f2b3f2fb5da893ceac0c42218a40be35"}, + {file = "cffi-1.15.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5d598b938678ebf3c67377cdd45e09d431369c3b1a5b331058c338e201f12b27"}, + {file = "cffi-1.15.1-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:db0fbb9c62743ce59a9ff687eb5f4afbe77e5e8403d6697f7446e5f609976f76"}, + {file = "cffi-1.15.1-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:98d85c6a2bef81588d9227dde12db8a7f47f639f4a17c9ae08e773aa9c697bf3"}, + {file = "cffi-1.15.1-cp39-cp39-win32.whl", hash = "sha256:40f4774f5a9d4f5e344f31a32b5096977b5d48560c5592e2f3d2c4374bd543ee"}, + {file = "cffi-1.15.1-cp39-cp39-win_amd64.whl", hash = "sha256:70df4e3b545a17496c9b3f41f5115e69a4f2e77e94e1d2a8e1070bc0c38c8a3c"}, + {file = "cffi-1.15.1.tar.gz", hash = "sha256:d400bfb9a37b1351253cb402671cea7e89bdecc294e8016a707f6d1d8ac934f9"}, ] -cffi = [] cfgv = [ {file = "cfgv-3.3.1-py2.py3-none-any.whl", hash = "sha256:c6a0883f3917a037485059700b9e75da2464e6c27051014ad85ba6aaa5884426"}, 
{file = "cfgv-3.3.1.tar.gz", hash = "sha256:f5a830efb9ce7a445376bb66ec94c638a9787422f96264c98edc6bdeed8ab736"}, ] -charset-normalizer = [] +charset-normalizer = [ + {file = "charset-normalizer-2.1.1.tar.gz", hash = "sha256:5a3d016c7c547f69d6f81fb0db9449ce888b418b5b9952cc5e6e66843e9dd845"}, + {file = "charset_normalizer-2.1.1-py3-none-any.whl", hash = "sha256:83e9a75d1911279afd89352c68b45348559d1fc0506b054b346651b5e7fee29f"}, +] click = [ {file = "click-8.1.3-py3-none-any.whl", hash = "sha256:bb4d8133cb15a609f44e8213d9b391b0809795062913b383c62be0ee95b1db48"}, {file = "click-8.1.3.tar.gz", hash = "sha256:7682dc8afb30297001674575ea00d1814d808d6a36af415a82bd481d37ba7b8e"}, @@ -1285,17 +1295,107 @@ colorama = [ {file = "colorama-0.4.5-py2.py3-none-any.whl", hash = "sha256:854bf444933e37f5824ae7bfc1e98d5bce2ebe4160d46b5edf346a89358e99da"}, {file = "colorama-0.4.5.tar.gz", hash = "sha256:e6c6b4334fc50988a639d9b98aa429a0b57da6e17b9a44f0451f930b6967b7a4"}, ] -commonmark = [] -coverage = [] -cryptography = [] -cycler = [ - {file = "cycler-0.11.0-py3-none-any.whl", hash = "sha256:3a27e95f763a428a739d2add979fa7494c912a32c17c4c38c4d5f082cad165a3"}, - {file = "cycler-0.11.0.tar.gz", hash = "sha256:9c87405839a19696e837b3b818fed3f5f69f16f1eec1a1ad77e043dcea9c772f"}, -] -distlib = [] -docutils = [] -execnet = [] -filelock = [] +commonmark = [ + {file = "commonmark-0.9.1-py2.py3-none-any.whl", hash = "sha256:da2f38c92590f83de410ba1a3cbceafbc74fee9def35f9251ba9a971d6d66fd9"}, + {file = "commonmark-0.9.1.tar.gz", hash = "sha256:452f9dc859be7f06631ddcb328b6919c67984aca654e5fefb3914d54691aed60"}, +] +coverage = [ + {file = "coverage-6.4.4-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:e7b4da9bafad21ea45a714d3ea6f3e1679099e420c8741c74905b92ee9bfa7cc"}, + {file = "coverage-6.4.4-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:fde17bc42e0716c94bf19d92e4c9f5a00c5feb401f5bc01101fdf2a8b7cacf60"}, + {file = "coverage-6.4.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cdbb0d89923c80dbd435b9cf8bba0ff55585a3cdb28cbec65f376c041472c60d"}, + {file = "coverage-6.4.4-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:67f9346aeebea54e845d29b487eb38ec95f2ecf3558a3cffb26ee3f0dcc3e760"}, + {file = "coverage-6.4.4-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:42c499c14efd858b98c4e03595bf914089b98400d30789511577aa44607a1b74"}, + {file = "coverage-6.4.4-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:c35cca192ba700979d20ac43024a82b9b32a60da2f983bec6c0f5b84aead635c"}, + {file = "coverage-6.4.4-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:9cc4f107009bca5a81caef2fca843dbec4215c05e917a59dec0c8db5cff1d2aa"}, + {file = "coverage-6.4.4-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:5f444627b3664b80d078c05fe6a850dd711beeb90d26731f11d492dcbadb6973"}, + {file = "coverage-6.4.4-cp310-cp310-win32.whl", hash = "sha256:66e6df3ac4659a435677d8cd40e8eb1ac7219345d27c41145991ee9bf4b806a0"}, + {file = "coverage-6.4.4-cp310-cp310-win_amd64.whl", hash = "sha256:35ef1f8d8a7a275aa7410d2f2c60fa6443f4a64fae9be671ec0696a68525b875"}, + {file = "coverage-6.4.4-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:c1328d0c2f194ffda30a45f11058c02410e679456276bfa0bbe0b0ee87225fac"}, + {file = "coverage-6.4.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:61b993f3998ee384935ee423c3d40894e93277f12482f6e777642a0141f55782"}, + {file = 
"coverage-6.4.4-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d5dd4b8e9cd0deb60e6fcc7b0647cbc1da6c33b9e786f9c79721fd303994832f"}, + {file = "coverage-6.4.4-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7026f5afe0d1a933685d8f2169d7c2d2e624f6255fb584ca99ccca8c0e966fd7"}, + {file = "coverage-6.4.4-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:9c7b9b498eb0c0d48b4c2abc0e10c2d78912203f972e0e63e3c9dc21f15abdaa"}, + {file = "coverage-6.4.4-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:ee2b2fb6eb4ace35805f434e0f6409444e1466a47f620d1d5763a22600f0f892"}, + {file = "coverage-6.4.4-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:ab066f5ab67059d1f1000b5e1aa8bbd75b6ed1fc0014559aea41a9eb66fc2ce0"}, + {file = "coverage-6.4.4-cp311-cp311-win32.whl", hash = "sha256:9d6e1f3185cbfd3d91ac77ea065d85d5215d3dfa45b191d14ddfcd952fa53796"}, + {file = "coverage-6.4.4-cp311-cp311-win_amd64.whl", hash = "sha256:e3d3c4cc38b2882f9a15bafd30aec079582b819bec1b8afdbde8f7797008108a"}, + {file = "coverage-6.4.4-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:a095aa0a996ea08b10580908e88fbaf81ecf798e923bbe64fb98d1807db3d68a"}, + {file = "coverage-6.4.4-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ef6f44409ab02e202b31a05dd6666797f9de2aa2b4b3534e9d450e42dea5e817"}, + {file = "coverage-6.4.4-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4b7101938584d67e6f45f0015b60e24a95bf8dea19836b1709a80342e01b472f"}, + {file = "coverage-6.4.4-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:14a32ec68d721c3d714d9b105c7acf8e0f8a4f4734c811eda75ff3718570b5e3"}, + {file = "coverage-6.4.4-cp37-cp37m-musllinux_1_1_aarch64.whl", hash = "sha256:6a864733b22d3081749450466ac80698fe39c91cb6849b2ef8752fd7482011f3"}, + {file = "coverage-6.4.4-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:08002f9251f51afdcc5e3adf5d5d66bb490ae893d9e21359b085f0e03390a820"}, + {file = "coverage-6.4.4-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:a3b2752de32c455f2521a51bd3ffb53c5b3ae92736afde67ce83477f5c1dd928"}, + {file = "coverage-6.4.4-cp37-cp37m-win32.whl", hash = "sha256:f855b39e4f75abd0dfbcf74a82e84ae3fc260d523fcb3532786bcbbcb158322c"}, + {file = "coverage-6.4.4-cp37-cp37m-win_amd64.whl", hash = "sha256:ee6ae6bbcac0786807295e9687169fba80cb0617852b2fa118a99667e8e6815d"}, + {file = "coverage-6.4.4-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:564cd0f5b5470094df06fab676c6d77547abfdcb09b6c29c8a97c41ad03b103c"}, + {file = "coverage-6.4.4-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:cbbb0e4cd8ddcd5ef47641cfac97d8473ab6b132dd9a46bacb18872828031685"}, + {file = "coverage-6.4.4-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6113e4df2fa73b80f77663445be6d567913fb3b82a86ceb64e44ae0e4b695de1"}, + {file = "coverage-6.4.4-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:8d032bfc562a52318ae05047a6eb801ff31ccee172dc0d2504614e911d8fa83e"}, + {file = "coverage-6.4.4-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e431e305a1f3126477abe9a184624a85308da8edf8486a863601d58419d26ffa"}, + {file = "coverage-6.4.4-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:cf2afe83a53f77aec067033199797832617890e15bed42f4a1a93ea24794ae3e"}, 
+ {file = "coverage-6.4.4-cp38-cp38-musllinux_1_1_i686.whl", hash = "sha256:783bc7c4ee524039ca13b6d9b4186a67f8e63d91342c713e88c1865a38d0892a"}, + {file = "coverage-6.4.4-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:ff934ced84054b9018665ca3967fc48e1ac99e811f6cc99ea65978e1d384454b"}, + {file = "coverage-6.4.4-cp38-cp38-win32.whl", hash = "sha256:e1fabd473566fce2cf18ea41171d92814e4ef1495e04471786cbc943b89a3781"}, + {file = "coverage-6.4.4-cp38-cp38-win_amd64.whl", hash = "sha256:4179502f210ebed3ccfe2f78bf8e2d59e50b297b598b100d6c6e3341053066a2"}, + {file = "coverage-6.4.4-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:98c0b9e9b572893cdb0a00e66cf961a238f8d870d4e1dc8e679eb8bdc2eb1b86"}, + {file = "coverage-6.4.4-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:fc600f6ec19b273da1d85817eda339fb46ce9eef3e89f220055d8696e0a06908"}, + {file = "coverage-6.4.4-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7a98d6bf6d4ca5c07a600c7b4e0c5350cd483c85c736c522b786be90ea5bac4f"}, + {file = "coverage-6.4.4-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:01778769097dbd705a24e221f42be885c544bb91251747a8a3efdec6eb4788f2"}, + {file = "coverage-6.4.4-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:dfa0b97eb904255e2ab24166071b27408f1f69c8fbda58e9c0972804851e0558"}, + {file = "coverage-6.4.4-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:fcbe3d9a53e013f8ab88734d7e517eb2cd06b7e689bedf22c0eb68db5e4a0a19"}, + {file = "coverage-6.4.4-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:15e38d853ee224e92ccc9a851457fb1e1f12d7a5df5ae44544ce7863691c7a0d"}, + {file = "coverage-6.4.4-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:6913dddee2deff8ab2512639c5168c3e80b3ebb0f818fed22048ee46f735351a"}, + {file = "coverage-6.4.4-cp39-cp39-win32.whl", hash = "sha256:354df19fefd03b9a13132fa6643527ef7905712109d9c1c1903f2133d3a4e145"}, + {file = "coverage-6.4.4-cp39-cp39-win_amd64.whl", hash = "sha256:1238b08f3576201ebf41f7c20bf59baa0d05da941b123c6656e42cdb668e9827"}, + {file = "coverage-6.4.4-pp36.pp37.pp38-none-any.whl", hash = "sha256:f67cf9f406cf0d2f08a3515ce2db5b82625a7257f88aad87904674def6ddaec1"}, + {file = "coverage-6.4.4.tar.gz", hash = "sha256:e16c45b726acb780e1e6f88b286d3c10b3914ab03438f32117c4aa52d7f30d58"}, +] +cryptography = [ + {file = "cryptography-38.0.1-cp36-abi3-macosx_10_10_universal2.whl", hash = "sha256:10d1f29d6292fc95acb597bacefd5b9e812099d75a6469004fd38ba5471a977f"}, + {file = "cryptography-38.0.1-cp36-abi3-macosx_10_10_x86_64.whl", hash = "sha256:3fc26e22840b77326a764ceb5f02ca2d342305fba08f002a8c1f139540cdfaad"}, + {file = "cryptography-38.0.1-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_24_aarch64.whl", hash = "sha256:3b72c360427889b40f36dc214630e688c2fe03e16c162ef0aa41da7ab1455153"}, + {file = "cryptography-38.0.1-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:194044c6b89a2f9f169df475cc167f6157eb9151cc69af8a2a163481d45cc407"}, + {file = "cryptography-38.0.1-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ca9f6784ea96b55ff41708b92c3f6aeaebde4c560308e5fbbd3173fbc466e94e"}, + {file = "cryptography-38.0.1-cp36-abi3-manylinux_2_24_x86_64.whl", hash = "sha256:16fa61e7481f4b77ef53991075de29fc5bacb582a1244046d2e8b4bb72ef66d0"}, + {file = "cryptography-38.0.1-cp36-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:d4ef6cc305394ed669d4d9eebf10d3a101059bdcf2669c366ec1d14e4fb227bd"}, 
+ {file = "cryptography-38.0.1-cp36-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3261725c0ef84e7592597606f6583385fed2a5ec3909f43bc475ade9729a41d6"}, + {file = "cryptography-38.0.1-cp36-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:0297ffc478bdd237f5ca3a7dc96fc0d315670bfa099c04dc3a4a2172008a405a"}, + {file = "cryptography-38.0.1-cp36-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:89ed49784ba88c221756ff4d4755dbc03b3c8d2c5103f6d6b4f83a0fb1e85294"}, + {file = "cryptography-38.0.1-cp36-abi3-win32.whl", hash = "sha256:ac7e48f7e7261207d750fa7e55eac2d45f720027d5703cd9007e9b37bbb59ac0"}, + {file = "cryptography-38.0.1-cp36-abi3-win_amd64.whl", hash = "sha256:ad7353f6ddf285aeadfaf79e5a6829110106ff8189391704c1d8801aa0bae45a"}, + {file = "cryptography-38.0.1-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:896dd3a66959d3a5ddcfc140a53391f69ff1e8f25d93f0e2e7830c6de90ceb9d"}, + {file = "cryptography-38.0.1-pp37-pypy37_pp73-manylinux_2_24_x86_64.whl", hash = "sha256:d3971e2749a723e9084dd507584e2a2761f78ad2c638aa31e80bc7a15c9db4f9"}, + {file = "cryptography-38.0.1-pp37-pypy37_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:79473cf8a5cbc471979bd9378c9f425384980fcf2ab6534b18ed7d0d9843987d"}, + {file = "cryptography-38.0.1-pp38-pypy38_pp73-macosx_10_10_x86_64.whl", hash = "sha256:d9e69ae01f99abe6ad646947bba8941e896cb3aa805be2597a0400e0764b5818"}, + {file = "cryptography-38.0.1-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5067ee7f2bce36b11d0e334abcd1ccf8c541fc0bbdaf57cdd511fdee53e879b6"}, + {file = "cryptography-38.0.1-pp38-pypy38_pp73-manylinux_2_24_x86_64.whl", hash = "sha256:3e3a2599e640927089f932295a9a247fc40a5bdf69b0484532f530471a382750"}, + {file = "cryptography-38.0.1-pp38-pypy38_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:c2e5856248a416767322c8668ef1845ad46ee62629266f84a8f007a317141013"}, + {file = "cryptography-38.0.1-pp38-pypy38_pp73-win_amd64.whl", hash = "sha256:64760ba5331e3f1794d0bcaabc0d0c39e8c60bf67d09c93dc0e54189dfd7cfe5"}, + {file = "cryptography-38.0.1-pp39-pypy39_pp73-macosx_10_10_x86_64.whl", hash = "sha256:b6c9b706316d7b5a137c35e14f4103e2115b088c412140fdbd5f87c73284df61"}, + {file = "cryptography-38.0.1-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b0163a849b6f315bf52815e238bc2b2346604413fa7c1601eea84bcddb5fb9ac"}, + {file = "cryptography-38.0.1-pp39-pypy39_pp73-manylinux_2_24_x86_64.whl", hash = "sha256:d1a5bd52d684e49a36582193e0b89ff267704cd4025abefb9e26803adeb3e5fb"}, + {file = "cryptography-38.0.1-pp39-pypy39_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:765fa194a0f3372d83005ab83ab35d7c5526c4e22951e46059b8ac678b44fa5a"}, + {file = "cryptography-38.0.1-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:52e7bee800ec869b4031093875279f1ff2ed12c1e2f74923e8f49c916afd1d3b"}, + {file = "cryptography-38.0.1.tar.gz", hash = "sha256:1db3d807a14931fa317f96435695d9ec386be7b84b618cc61cfa5d08b0ae33d7"}, +] +distlib = [ + {file = "distlib-0.3.6-py2.py3-none-any.whl", hash = "sha256:f35c4b692542ca110de7ef0bea44d73981caeb34ca0b9b6b2e6d7790dda8f80e"}, + {file = "distlib-0.3.6.tar.gz", hash = "sha256:14bad2d9b04d3a36127ac97f30b12a19268f211063d8f8ee4f47108896e11b46"}, +] +docutils = [ + {file = "docutils-0.15.2-py2-none-any.whl", hash = "sha256:9e4d7ecfc600058e07ba661411a2b7de2fd0fafa17d1a7f7361cd47b1175c827"}, + {file = "docutils-0.15.2-py3-none-any.whl", hash = "sha256:6c4f696463b79f1fb8ba0c594b63840ebd41f059e92b31957c46b74a4599b6d0"}, + {file = "docutils-0.15.2.tar.gz", hash = 
"sha256:a2aeea129088da402665e92e0b25b04b073c04b2dce4ab65caaa38b7ce2e1a99"}, +] +execnet = [ + {file = "execnet-1.9.0-py2.py3-none-any.whl", hash = "sha256:a295f7cc774947aac58dde7fdc85f4aa00c42adf5d8f5468fc630c1acf30a142"}, + {file = "execnet-1.9.0.tar.gz", hash = "sha256:8f694f3ba9cc92cab508b152dcfe322153975c29bda272e2fd7f3f00f36e47c5"}, +] +filelock = [ + {file = "filelock-3.8.0-py3-none-any.whl", hash = "sha256:617eb4e5eedc82fc5f47b6d61e4d11cb837c56cb4544e39081099fa17ad109d4"}, + {file = "filelock-3.8.0.tar.gz", hash = "sha256:55447caa666f2198c5b6b13a26d2084d26fa5b115c00d065664b2124680c4edc"}, +] fiona = [ {file = "Fiona-1.8.21-cp310-cp310-macosx_10_10_x86_64.whl", hash = "sha256:39c656421e25b4d0d73d0b6acdcbf9848e71f3d9b74f44c27d2d516d463409ae"}, {file = "Fiona-1.8.21-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:43b1d2e45506e56cf3a9f59ba5d6f7981f3f75f4725d1e6cb9a33ba856371ebd"}, @@ -1309,20 +1409,34 @@ fiona = [ {file = "Fiona-1.8.21-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:40b4eaf5b88407421d6c9e707520abd2ff16d7cd43efb59cd398aa41d2de332c"}, {file = "Fiona-1.8.21.tar.gz", hash = "sha256:3a0edca2a7a070db405d71187214a43d2333a57b4097544a3fcc282066a58bfc"}, ] -fonttools = [] -geopandas = [] -identify = [] +geopandas = [ + {file = "geopandas-0.11.1-py3-none-any.whl", hash = "sha256:f3344937f3866e52996c7e505d56dae78be117dc840cd1c23507da0b33c0af71"}, + {file = "geopandas-0.11.1.tar.gz", hash = "sha256:f0f0c8d0423d30cf81de2056d853145c4362739350a7f8f2d72cc7409ef1eca1"}, +] +identify = [ + {file = "identify-2.5.5-py2.py3-none-any.whl", hash = "sha256:ef78c0d96098a3b5fe7720be4a97e73f439af7cf088ebf47b620aeaa10fadf97"}, + {file = "identify-2.5.5.tar.gz", hash = "sha256:322a5699daecf7c6fd60e68852f36f2ecbb6a36ff6e6e973e0d2bb6fca203ee6"}, +] idna = [ - {file = "idna-3.3-py3-none-any.whl", hash = "sha256:84d9dd047ffa80596e0f246e2eab0b391788b0503584e8945f2368256d2735ff"}, - {file = "idna-3.3.tar.gz", hash = "sha256:9d643ff0a55b762d5cdb124b8eaa99c66322e2157b69160bc32796e824360e6d"}, + {file = "idna-3.4-py3-none-any.whl", hash = "sha256:90b77e79eaa3eba6de819a0c442c0b4ceefc341a7a2ab77d7562bf49f425c5c2"}, + {file = "idna-3.4.tar.gz", hash = "sha256:814f528e8dead7d329833b91c5faa87d60bf71824cd12a7530b5526063d02cb4"}, +] +imagesize = [ + {file = "imagesize-1.4.1-py2.py3-none-any.whl", hash = "sha256:0d8d18d08f840c19d0ee7ca1fd82490fdc3729b7ac93f49870406ddde8ef8d8b"}, + {file = "imagesize-1.4.1.tar.gz", hash = "sha256:69150444affb9cb0d5cc5a92b3676f0b2fb7cd9ae39e947a5e11a36b4497cd4a"}, +] +importlib-metadata = [ + {file = "importlib_metadata-4.12.0-py3-none-any.whl", hash = "sha256:7401a975809ea1fdc658c3aa4f78cc2195a0e019c5cbc4c06122884e9ae80c23"}, + {file = "importlib_metadata-4.12.0.tar.gz", hash = "sha256:637245b8bab2b6502fcbc752cc4b7a6f6243bb02b31c5c26156ad103d3d45670"}, ] -imagesize = [] -importlib-metadata = [] iniconfig = [ {file = "iniconfig-1.1.1-py2.py3-none-any.whl", hash = "sha256:011e24c64b7f47f6ebd835bb12a743f2fbe9a26d4cecaa7f53bc4f35ee9da8b3"}, {file = "iniconfig-1.1.1.tar.gz", hash = "sha256:bc3af051d7d14b2ee5ef9969666def0cd1a000e121eaea580d4a313df4b37f32"}, ] -"jaraco.classes" = [] +"jaraco.classes" = [ + {file = "jaraco.classes-3.2.2-py3-none-any.whl", hash = "sha256:e6ef6fd3fcf4579a7a019d87d1e56a883f4e4c35cfe925f86731abc58804e647"}, + {file = "jaraco.classes-3.2.2.tar.gz", hash = "sha256:6745f113b0b588239ceb49532aa09c3ebb947433ce311ef2f8e3ad64ebb74594"}, +] jeepney = [ {file = "jeepney-0.8.0-py3-none-any.whl", hash = 
"sha256:c0a454ad016ca575060802ee4d590dd912e35c122fa04e70306de3d076cce755"}, {file = "jeepney-0.8.0.tar.gz", hash = "sha256:5efe48d255973902f6badc3ce55e2aa6c5c3b3bc642059ef3a91247bcfcc5806"}, @@ -1331,8 +1445,10 @@ jinja2 = [ {file = "Jinja2-3.1.2-py3-none-any.whl", hash = "sha256:6088930bfe239f0e6710546ab9c19c9ef35e29792895fed6e6e31a023a182a61"}, {file = "Jinja2-3.1.2.tar.gz", hash = "sha256:31351a702a408a9e7595a8fc6150fc3f43bb6bf7e319770cbc0db9df9437e852"}, ] -keyring = [] -kiwisolver = [] +keyring = [ + {file = "keyring-23.9.3-py3-none-any.whl", hash = "sha256:69732a15cb1433bdfbc3b980a8a36a04878a6cfd7cb99f497b573f31618001c0"}, + {file = "keyring-23.9.3.tar.gz", hash = "sha256:69b01dd83c42f590250fe7a1f503fc229b14de83857314b1933a3ddbf595c4a5"}, +] markupsafe = [ {file = "MarkupSafe-2.1.1-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:86b1f75c4e7c2ac2ccdaec2b9022845dbb81880ca318bb7a0a01fbf7813e3812"}, {file = "MarkupSafe-2.1.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:f121a1420d4e173a5d96e47e9a0c0dcff965afdf1626d28de1460815f7c4ee7a"}, @@ -1375,35 +1491,126 @@ markupsafe = [ {file = "MarkupSafe-2.1.1-cp39-cp39-win_amd64.whl", hash = "sha256:46d00d6cfecdde84d40e572d63735ef81423ad31184100411e6e3388d405e247"}, {file = "MarkupSafe-2.1.1.tar.gz", hash = "sha256:7f91197cc9e48f989d12e4e6fbc46495c446636dfc81b9ccf50bb0ec74b91d4b"}, ] -matplotlib = [] -more-itertools = [] +more-itertools = [ + {file = "more-itertools-8.14.0.tar.gz", hash = "sha256:c09443cd3d5438b8dafccd867a6bc1cb0894389e90cb53d227456b0b0bccb750"}, + {file = "more_itertools-8.14.0-py3-none-any.whl", hash = "sha256:1bc4f91ee5b1b31ac7ceacc17c09befe6a40a503907baf9c839c229b5095cfd2"}, +] mpmath = [ {file = "mpmath-1.2.1-py3-none-any.whl", hash = "sha256:604bc21bd22d2322a177c73bdb573994ef76e62edd595d17e00aff24b0667e5c"}, {file = "mpmath-1.2.1.tar.gz", hash = "sha256:79ffb45cf9f4b101a807595bcb3e72e0396202e0b1d25d689134b48c4216a81a"}, ] -msal = [] +msal = [ + {file = "msal-1.19.0-py2.py3-none-any.whl", hash = "sha256:2206b44a739918b3ba0ee1bacd904e548fc91706cada9f1673b763c8d5f3364e"}, + {file = "msal-1.19.0.tar.gz", hash = "sha256:65e329d69cbfe48bb3dd3236b1ef8e4cc91f869637606c184e227e86d2b0629d"}, +] munch = [ {file = "munch-2.5.0-py2.py3-none-any.whl", hash = "sha256:6f44af89a2ce4ed04ff8de41f70b226b984db10a91dcc7b9ac2efc1c77022fdd"}, {file = "munch-2.5.0.tar.gz", hash = "sha256:2d735f6f24d4dba3417fa448cae40c6e896ec1fdab6cdb5e6510999758a4dbd2"}, ] -mypy = [] +mypy = [ + {file = "mypy-0.961-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:697540876638ce349b01b6786bc6094ccdaba88af446a9abb967293ce6eaa2b0"}, + {file = "mypy-0.961-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:b117650592e1782819829605a193360a08aa99f1fc23d1d71e1a75a142dc7e15"}, + {file = "mypy-0.961-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:bdd5ca340beffb8c44cb9dc26697628d1b88c6bddf5c2f6eb308c46f269bb6f3"}, + {file = "mypy-0.961-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:3e09f1f983a71d0672bbc97ae33ee3709d10c779beb613febc36805a6e28bb4e"}, + {file = "mypy-0.961-cp310-cp310-win_amd64.whl", hash = "sha256:e999229b9f3198c0c880d5e269f9f8129c8862451ce53a011326cad38b9ccd24"}, + {file = "mypy-0.961-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:b24be97351084b11582fef18d79004b3e4db572219deee0212078f7cf6352723"}, + {file = "mypy-0.961-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = 
"sha256:f4a21d01fc0ba4e31d82f0fff195682e29f9401a8bdb7173891070eb260aeb3b"}, + {file = "mypy-0.961-cp36-cp36m-win_amd64.whl", hash = "sha256:439c726a3b3da7ca84a0199a8ab444cd8896d95012c4a6c4a0d808e3147abf5d"}, + {file = "mypy-0.961-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:5a0b53747f713f490affdceef835d8f0cb7285187a6a44c33821b6d1f46ed813"}, + {file = "mypy-0.961-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:0e9f70df36405c25cc530a86eeda1e0867863d9471fe76d1273c783df3d35c2e"}, + {file = "mypy-0.961-cp37-cp37m-win_amd64.whl", hash = "sha256:b88f784e9e35dcaa075519096dc947a388319cb86811b6af621e3523980f1c8a"}, + {file = "mypy-0.961-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:d5aaf1edaa7692490f72bdb9fbd941fbf2e201713523bdb3f4038be0af8846c6"}, + {file = "mypy-0.961-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:9f5f5a74085d9a81a1f9c78081d60a0040c3efb3f28e5c9912b900adf59a16e6"}, + {file = "mypy-0.961-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:f4b794db44168a4fc886e3450201365c9526a522c46ba089b55e1f11c163750d"}, + {file = "mypy-0.961-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:64759a273d590040a592e0f4186539858c948302c653c2eac840c7a3cd29e51b"}, + {file = "mypy-0.961-cp38-cp38-win_amd64.whl", hash = "sha256:63e85a03770ebf403291ec50097954cc5caf2a9205c888ce3a61bd3f82e17569"}, + {file = "mypy-0.961-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:5f1332964963d4832a94bebc10f13d3279be3ce8f6c64da563d6ee6e2eeda932"}, + {file = "mypy-0.961-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:006be38474216b833eca29ff6b73e143386f352e10e9c2fbe76aa8549e5554f5"}, + {file = "mypy-0.961-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:9940e6916ed9371809b35b2154baf1f684acba935cd09928952310fbddaba648"}, + {file = "mypy-0.961-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:a5ea0875a049de1b63b972456542f04643daf320d27dc592d7c3d9cd5d9bf950"}, + {file = "mypy-0.961-cp39-cp39-win_amd64.whl", hash = "sha256:1ece702f29270ec6af25db8cf6185c04c02311c6bb21a69f423d40e527b75c56"}, + {file = "mypy-0.961-py3-none-any.whl", hash = "sha256:03c6cc893e7563e7b2949b969e63f02c000b32502a1b4d1314cabe391aa87d66"}, + {file = "mypy-0.961.tar.gz", hash = "sha256:f730d56cb924d371c26b8eaddeea3cc07d78ff51c521c6d04899ac6904b75492"}, +] mypy-extensions = [ {file = "mypy_extensions-0.4.3-py2.py3-none-any.whl", hash = "sha256:090fedd75945a69ae91ce1303b5824f428daf5a028d2f6ab8a299250a846f15d"}, {file = "mypy_extensions-0.4.3.tar.gz", hash = "sha256:2d82818f5bb3e369420cb3c4060a7970edba416647068eb4c5343488a6c604a8"}, ] -nodeenv = [] -numpy = [] +nodeenv = [ + {file = "nodeenv-1.7.0-py2.py3-none-any.whl", hash = "sha256:27083a7b96a25f2f5e1d8cb4b6317ee8aeda3bdd121394e5ac54e498028a042e"}, + {file = "nodeenv-1.7.0.tar.gz", hash = "sha256:e0e7f7dfb85fc5394c6fe1e8fa98131a2473e04311a45afb6508f7cf1836fa2b"}, +] +numpy = [ + {file = "numpy-1.23.3-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:c9f707b5bb73bf277d812ded9896f9512a43edff72712f31667d0a8c2f8e71ee"}, + {file = "numpy-1.23.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ffcf105ecdd9396e05a8e58e81faaaf34d3f9875f137c7372450baa5d77c9a54"}, + {file = "numpy-1.23.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0ea3f98a0ffce3f8f57675eb9119f3f4edb81888b6874bc1953f91e0b1d4f440"}, + {file = 
"numpy-1.23.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:004f0efcb2fe1c0bd6ae1fcfc69cc8b6bf2407e0f18be308612007a0762b4089"}, + {file = "numpy-1.23.3-cp310-cp310-win32.whl", hash = "sha256:98dcbc02e39b1658dc4b4508442a560fe3ca5ca0d989f0df062534e5ca3a5c1a"}, + {file = "numpy-1.23.3-cp310-cp310-win_amd64.whl", hash = "sha256:39a664e3d26ea854211867d20ebcc8023257c1800ae89773cbba9f9e97bae036"}, + {file = "numpy-1.23.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:1f27b5322ac4067e67c8f9378b41c746d8feac8bdd0e0ffede5324667b8a075c"}, + {file = "numpy-1.23.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:2ad3ec9a748a8943e6eb4358201f7e1c12ede35f510b1a2221b70af4bb64295c"}, + {file = "numpy-1.23.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bdc9febce3e68b697d931941b263c59e0c74e8f18861f4064c1f712562903411"}, + {file = "numpy-1.23.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:301c00cf5e60e08e04d842fc47df641d4a181e651c7135c50dc2762ffe293dbd"}, + {file = "numpy-1.23.3-cp311-cp311-win32.whl", hash = "sha256:7cd1328e5bdf0dee621912f5833648e2daca72e3839ec1d6695e91089625f0b4"}, + {file = "numpy-1.23.3-cp311-cp311-win_amd64.whl", hash = "sha256:8355fc10fd33a5a70981a5b8a0de51d10af3688d7a9e4a34fcc8fa0d7467bb7f"}, + {file = "numpy-1.23.3-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:bc6e8da415f359b578b00bcfb1d08411c96e9a97f9e6c7adada554a0812a6cc6"}, + {file = "numpy-1.23.3-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:22d43376ee0acd547f3149b9ec12eec2f0ca4a6ab2f61753c5b29bb3e795ac4d"}, + {file = "numpy-1.23.3-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a64403f634e5ffdcd85e0b12c08f04b3080d3e840aef118721021f9b48fc1460"}, + {file = "numpy-1.23.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:efd9d3abe5774404becdb0748178b48a218f1d8c44e0375475732211ea47c67e"}, + {file = "numpy-1.23.3-cp38-cp38-win32.whl", hash = "sha256:f8c02ec3c4c4fcb718fdf89a6c6f709b14949408e8cf2a2be5bfa9c49548fd85"}, + {file = "numpy-1.23.3-cp38-cp38-win_amd64.whl", hash = "sha256:e868b0389c5ccfc092031a861d4e158ea164d8b7fdbb10e3b5689b4fc6498df6"}, + {file = "numpy-1.23.3-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:09f6b7bdffe57fc61d869a22f506049825d707b288039d30f26a0d0d8ea05164"}, + {file = "numpy-1.23.3-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:8c79d7cf86d049d0c5089231a5bcd31edb03555bd93d81a16870aa98c6cfb79d"}, + {file = "numpy-1.23.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e5d5420053bbb3dd64c30e58f9363d7a9c27444c3648e61460c1237f9ec3fa14"}, + {file = "numpy-1.23.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d5422d6a1ea9b15577a9432e26608c73a78faf0b9039437b075cf322c92e98e7"}, + {file = "numpy-1.23.3-cp39-cp39-win32.whl", hash = "sha256:c1ba66c48b19cc9c2975c0d354f24058888cdc674bebadceb3cdc9ec403fb5d1"}, + {file = "numpy-1.23.3-cp39-cp39-win_amd64.whl", hash = "sha256:78a63d2df1d947bd9d1b11d35564c2f9e4b57898aae4626638056ec1a231c40c"}, + {file = "numpy-1.23.3-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:17c0e467ade9bda685d5ac7f5fa729d8d3e76b23195471adae2d6a6941bd2c18"}, + {file = "numpy-1.23.3-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:91b8d6768a75247026e951dce3b2aac79dc7e78622fc148329135ba189813584"}, + {file = "numpy-1.23.3-pp38-pypy38_pp73-win_amd64.whl", hash = "sha256:94c15ca4e52671a59219146ff584488907b1f9b3fc232622b47e2cf832e94fb8"}, + {file = 
"numpy-1.23.3.tar.gz", hash = "sha256:51bf49c0cd1d52be0a240aa66f3458afc4b95d8993d2d04f0d91fa60c10af6cd"}, +] oauthlib = [ - {file = "oauthlib-3.2.0-py3-none-any.whl", hash = "sha256:6db33440354787f9b7f3a6dbd4febf5d0f93758354060e802f6c06cb493022fe"}, - {file = "oauthlib-3.2.0.tar.gz", hash = "sha256:23a8208d75b902797ea29fd31fa80a15ed9dc2c6c16fe73f5d346f83f6fa27a2"}, + {file = "oauthlib-3.2.1-py3-none-any.whl", hash = "sha256:88e912ca1ad915e1dcc1c06fc9259d19de8deacd6fd17cc2df266decc2e49066"}, + {file = "oauthlib-3.2.1.tar.gz", hash = "sha256:1565237372795bf6ee3e5aba5e2a85bd5a65d0e2aa5c628b9a97b7d7a0da3721"}, ] packaging = [ {file = "packaging-21.3-py3-none-any.whl", hash = "sha256:ef103e05f519cdc783ae24ea4e2e0f508a9c99b2d4969652eed6a2e1ea5bd522"}, {file = "packaging-21.3.tar.gz", hash = "sha256:dd47c42927d89ab911e606518907cc2d3a1f38bbd026385970643f9c5b8ecfeb"}, ] -pandas = [] -pillow = [] -pkginfo = [] +pandas = [ + {file = "pandas-1.5.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:0d8d7433d19bfa33f11c92ad9997f15a902bda4f5ad3a4814a21d2e910894484"}, + {file = "pandas-1.5.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:5cc47f2ebaa20ef96ae72ee082f9e101b3dfbf74f0e62c7a12c0b075a683f03c"}, + {file = "pandas-1.5.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:8e8e5edf97d8793f51d258c07c629bd49d271d536ce15d66ac00ceda5c150eb3"}, + {file = "pandas-1.5.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:41aec9f87455306496d4486df07c1b98c15569c714be2dd552a6124cd9fda88f"}, + {file = "pandas-1.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c76f1d104844c5360c21d2ef0e1a8b2ccf8b8ebb40788475e255b9462e32b2be"}, + {file = "pandas-1.5.0-cp310-cp310-win_amd64.whl", hash = "sha256:1642fc6138b4e45d57a12c1b464a01a6d868c0148996af23f72dde8d12486bbc"}, + {file = "pandas-1.5.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:171cef540bfcec52257077816a4dbbac152acdb8236ba11d3196ae02bf0959d8"}, + {file = "pandas-1.5.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:a68a9b9754efff364b0c5ee5b0f18e15ca640c01afe605d12ba8b239ca304d6b"}, + {file = "pandas-1.5.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:86d87279ebc5bc20848b4ceb619073490037323f80f515e0ec891c80abad958a"}, + {file = "pandas-1.5.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:207d63ac851e60ec57458814613ef4b3b6a5e9f0b33c57623ba2bf8126c311f8"}, + {file = "pandas-1.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e252a9e49b233ff96e2815c67c29702ac3a062098d80a170c506dff3470fd060"}, + {file = "pandas-1.5.0-cp311-cp311-win_amd64.whl", hash = "sha256:de34636e2dc04e8ac2136a8d3c2051fd56ebe9fd6cd185581259330649e73ca9"}, + {file = "pandas-1.5.0-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:1d34b1f43d9e3f4aea056ba251f6e9b143055ebe101ed04c847b41bb0bb4a989"}, + {file = "pandas-1.5.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:1b82ccc7b093e0a93f8dffd97a542646a3e026817140e2c01266aaef5fdde11b"}, + {file = "pandas-1.5.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:4e30a31039574d96f3d683df34ccb50bb435426ad65793e42a613786901f6761"}, + {file = "pandas-1.5.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:62e61003411382e20d7c2aec1ee8d7c86c8b9cf46290993dd8a0a3be44daeb38"}, + {file = "pandas-1.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fc987f7717e53d372f586323fff441263204128a1ead053c1b98d7288f836ac9"}, + {file = "pandas-1.5.0-cp38-cp38-win32.whl", 
hash = "sha256:e178ce2d7e3b934cf8d01dc2d48d04d67cb0abfaffdcc8aa6271fd5a436f39c8"}, + {file = "pandas-1.5.0-cp38-cp38-win_amd64.whl", hash = "sha256:33a9d9e21ab2d91e2ab6e83598419ea6a664efd4c639606b299aae8097c1c94f"}, + {file = "pandas-1.5.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:73844e247a7b7dac2daa9df7339ecf1fcf1dfb8cbfd11e3ffe9819ae6c31c515"}, + {file = "pandas-1.5.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:e9c5049333c5bebf993033f4bf807d163e30e8fada06e1da7fa9db86e2392009"}, + {file = "pandas-1.5.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:85a516a7f6723ca1528f03f7851fa8d0360d1d6121cf15128b290cf79b8a7f6a"}, + {file = "pandas-1.5.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:947ed9f896ee61adbe61829a7ae1ade493c5a28c66366ec1de85c0642009faac"}, + {file = "pandas-1.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c7f38d91f21937fe2bec9449570d7bf36ad7136227ef43b321194ec249e2149d"}, + {file = "pandas-1.5.0-cp39-cp39-win32.whl", hash = "sha256:2504c032f221ef9e4a289f5e46a42b76f5e087ecb67d62e342ccbba95a32a488"}, + {file = "pandas-1.5.0-cp39-cp39-win_amd64.whl", hash = "sha256:8a4fc04838615bf0a8d3a03ed68197f358054f0df61f390bcc64fbe39e3d71ec"}, + {file = "pandas-1.5.0.tar.gz", hash = "sha256:3ee61b881d2f64dd90c356eb4a4a4de75376586cd3c9341c6c0fcaae18d52977"}, +] +pkginfo = [ + {file = "pkginfo-1.8.3-py2.py3-none-any.whl", hash = "sha256:848865108ec99d4901b2f7e84058b6e7660aae8ae10164e015a6dcf5b242a594"}, + {file = "pkginfo-1.8.3.tar.gz", hash = "sha256:a84da4318dd86f870a9447a8c98340aa06216bfc6f2b7bdc4b8766984ae1867c"}, +] platformdirs = [ {file = "platformdirs-2.5.2-py3-none-any.whl", hash = "sha256:027d8e83a2d7de06bbac4e5ef7e023c02b863d7ea5d079477e722bb41ab25788"}, {file = "platformdirs-2.5.2.tar.gz", hash = "sha256:58c8abb07dcb441e6ee4b11d8df0ac856038f944ab98b7be6b27b2a3c7feef19"}, @@ -1412,7 +1619,10 @@ pluggy = [ {file = "pluggy-1.0.0-py2.py3-none-any.whl", hash = "sha256:74134bbf457f031a36d68416e1509f34bd5ccc019f0bcc952c7b909d06b37bd3"}, {file = "pluggy-1.0.0.tar.gz", hash = "sha256:4224373bacce55f955a878bf9cfa763c1e360858e330072059e10bad68531159"}, ] -pre-commit = [] +pre-commit = [ + {file = "pre_commit-2.20.0-py2.py3-none-any.whl", hash = "sha256:51a5ba7c480ae8072ecdb6933df22d2f812dc897d5fe848778116129a681aac7"}, + {file = "pre_commit-2.20.0.tar.gz", hash = "sha256:a978dac7bc9ec0bcee55c18a277d553b0f419d259dadb4b9418ff2d00eb43959"}, +] py = [ {file = "py-1.11.0-py2.py3-none-any.whl", hash = "sha256:607c53218732647dff4acdfcd50cb62615cedf612e72d1724fb1a0cc6405b378"}, {file = "py-1.11.0.tar.gz", hash = "sha256:51c75c4126074b472f746a24399ad32f6053d1b34b68d2fa41e558e6f4a98719"}, @@ -1421,49 +1631,87 @@ pycparser = [ {file = "pycparser-2.21-py2.py3-none-any.whl", hash = "sha256:8ee45429555515e1f6b185e78100aea234072576aa43ab53aefcae078162fca9"}, {file = "pycparser-2.21.tar.gz", hash = "sha256:e644fdec12f7872f86c58ff790da456218b10f863970249516d60a5eaca77206"}, ] -pygments = [] -pyjwt = [] +pygments = [ + {file = "Pygments-2.13.0-py3-none-any.whl", hash = "sha256:f643f331ab57ba3c9d89212ee4a2dabc6e94f117cf4eefde99a0574720d14c42"}, + {file = "Pygments-2.13.0.tar.gz", hash = "sha256:56a8508ae95f98e2b9bdf93a6be5ae3f7d8af858b43e02c5a2ff083726be40c1"}, +] +pyjwt = [ + {file = "PyJWT-2.5.0-py3-none-any.whl", hash = "sha256:8d82e7087868e94dd8d7d418e5088ce64f7daab4b36db654cbaedb46f9d1ca80"}, + {file = "PyJWT-2.5.0.tar.gz", hash = "sha256:e77ab89480905d86998442ac5788f35333fa85f65047a534adc38edf3c88fc3b"}, +] pyparsing = [ 
{file = "pyparsing-3.0.9-py3-none-any.whl", hash = "sha256:5026bae9a10eeaefb61dab2f09052b9f4307d44aee4eda64b309723d8d206bbc"}, {file = "pyparsing-3.0.9.tar.gz", hash = "sha256:2b020ecf7d21b687f219b71ecad3631f644a47f01403fa1d1036b0c6416d70fb"}, ] pyproj = [ - {file = "pyproj-3.3.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:473961faef7a9fd723c5d432f65220ea6ab3854e606bf84b4d409a75a4261c78"}, - {file = "pyproj-3.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2fef9c1e339f25c57f6ae0558b5ab1bbdf7994529a30d8d7504fc6302ea51c03"}, - {file = "pyproj-3.3.1-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:140fa649fedd04f680a39f8ad339799a55cb1c49f6a84e1b32b97e49646647aa"}, - {file = "pyproj-3.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b59c08aea13ee428cf8a919212d55c036cc94784805ed77c8f31a4d1f541058c"}, - {file = "pyproj-3.3.1-cp310-cp310-win32.whl", hash = "sha256:1adc9ccd1bf04998493b6a2e87e60656c75ab790653b36cfe351e9ef214828ed"}, - {file = "pyproj-3.3.1-cp310-cp310-win_amd64.whl", hash = "sha256:42eea10afc750fccd1c5c4ba56de29ab791ab4d83c1f7db72705566282ac5396"}, - {file = "pyproj-3.3.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:531ea36519fa7b581466d4b6ab32f66ae4dadd9499d726352f71ee5e19c3d1c5"}, - {file = "pyproj-3.3.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:67025e37598a6bbed2c9c6c9e4c911f6dd39315d3e1148ead935a5c4d64309d5"}, - {file = "pyproj-3.3.1-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:aed1a3c0cd4182425f91b48d5db39f459bc2fe0d88017ead6425a1bc85faee33"}, - {file = "pyproj-3.3.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3cc4771403db54494e1e55bca8e6d33cde322f8cf0ed39f1557ff109c66d2cd1"}, - {file = "pyproj-3.3.1-cp38-cp38-win32.whl", hash = "sha256:c99f7b5757a28040a2dd4a28c9805fdf13eef79a796f4a566ab5cb362d10630d"}, - {file = "pyproj-3.3.1-cp38-cp38-win_amd64.whl", hash = "sha256:5dac03d4338a4c8bd0f69144c527474f517b4cbd7d2d8c532cd8937799723248"}, - {file = "pyproj-3.3.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:56b0f9ee2c5b2520b18db30a393a7b86130cf527ddbb8c96e7f3c837474a9d79"}, - {file = "pyproj-3.3.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5f92d8f6514516124abb714dce912b20867831162cfff9fae2678ef07b6fcf0f"}, - {file = "pyproj-3.3.1-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:1ef1bfbe2dcc558c7a98e2f1836abdcd630390f3160724a6f4f5c818b2be0ad5"}, - {file = "pyproj-3.3.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5ca5f32b56210429b367ca4f9a57ffe67975c487af82e179a24370879a3daf68"}, - {file = "pyproj-3.3.1-cp39-cp39-win32.whl", hash = "sha256:aba199704c824fb84ab64927e7bc9ef71e603e483130ec0f7e09e97259b8f61f"}, - {file = "pyproj-3.3.1-cp39-cp39-win_amd64.whl", hash = "sha256:120d45ed73144c65e9677dc73ba8a531c495d179dd9f9f0471ac5acc02d7ac4b"}, - {file = "pyproj-3.3.1-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:52efb681647dfac185cc655a709bc0caaf910031a0390f816f5fc8ce150cbedc"}, - {file = "pyproj-3.3.1-pp38-pypy38_pp73-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5ab0d6e38fda7c13726afacaf62e9f9dd858089d67910471758afd9cb24e0ecd"}, - {file = "pyproj-3.3.1-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:45487942c19c5a8b09c91964ea3201f4e094518e34743cae373889a36e3d9260"}, - {file = "pyproj-3.3.1-pp38-pypy38_pp73-win_amd64.whl", hash = 
"sha256:797ad5655d484feac14b0fbb4a4efeaac0cf780a223046e2465494c767fd1c3b"}, - {file = "pyproj-3.3.1.tar.gz", hash = "sha256:b3d8e14d91cc95fb3dbc03a9d0588ac58326803eefa5bbb0978d109de3304fbe"}, -] -pytest = [] -pytest-asyncio = [] -pytest-cov = [] -pytest-forked = [] -pytest-rerunfailures = [] -pytest-xdist = [] + {file = "pyproj-3.4.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:f343725566267a296b09ee7e591894f1fdc90f84f8ad5ec476aeb53bd4479c07"}, + {file = "pyproj-3.4.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:5816807ca0bdc7256558770c6206a6783a3f02bcf844f94ee245f197bb5f7285"}, + {file = "pyproj-3.4.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e7e609903572a56cca758bbaee5c1663c3e829ddce5eec4f368e68277e37022b"}, + {file = "pyproj-3.4.0-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4fd425ee8b6781c249c7adb7daa2e6c41ce573afabe4f380f5eecd913b56a3be"}, + {file = "pyproj-3.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:954b068136518b3174d0a99448056e97af62b63392a95c420894f7de2229dae6"}, + {file = "pyproj-3.4.0-cp310-cp310-win_amd64.whl", hash = "sha256:4a23d84c5ffc383c7d9f0bde3a06fc1f6697b1b96725597f8f01e7b4bef0a2b5"}, + {file = "pyproj-3.4.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:1f9c100fd0fd80edbc7e4daa303600a8cbef6f0de43d005617acb38276b88dc0"}, + {file = "pyproj-3.4.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:aa5171f700f174777a9e9ed8f4655583243967c0f9cf2c90e3f54e54ff740134"}, + {file = "pyproj-3.4.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9a496d9057b2128db9d733e66b206f2d5954bbae6b800d412f562d780561478c"}, + {file = "pyproj-3.4.0-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:52e54796e2d9554a5eb8f11df4748af1fbbc47f76aa234d6faf09216a84554c5"}, + {file = "pyproj-3.4.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a454a7c4423faa2a14e939d08ef293ee347fa529c9df79022b0585a6e1d8310c"}, + {file = "pyproj-3.4.0-cp311-cp311-win_amd64.whl", hash = "sha256:25a36e297f3e0524694d40259e3e895edc1a47492a0e30608268ffc1328e3f5d"}, + {file = "pyproj-3.4.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:77d5f519f3cdb94b026ecca626f78db4f041afe201cf082079c8c0092a30b087"}, + {file = "pyproj-3.4.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ccb4b70ad25218027f77e0c8934d10f9b7cdf91d5e64080147743d58fddbc3c0"}, + {file = "pyproj-3.4.0-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4e161114bc92701647a83c4bbce79489984f12d980cabb365516e953d1450885"}, + {file = "pyproj-3.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f80adda8c54b84271a93829477a01aa57bc178c834362e9f74e1de1b5033c74c"}, + {file = "pyproj-3.4.0-cp38-cp38-win_amd64.whl", hash = "sha256:221d8939685e0c43ee594c9f04b6a73a10e8e1cc0e85f28be0b4eb2f1bc8777d"}, + {file = "pyproj-3.4.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:d94afed99f31673d3d19fe750283621e193e2a53ca9e0443bf9d092c3905833b"}, + {file = "pyproj-3.4.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:0fff9c3a991508f16027be27d153f6c5583d03799443639d13c681e60f49e2d7"}, + {file = "pyproj-3.4.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3b85acf09e5a9e35cd9ee72989793adb7089b4e611be02a43d3d0bda50ad116b"}, + {file = "pyproj-3.4.0-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:45554f47d1a12a84b0620e4abc08a2a1b5d9f273a4759eaef75e74788ec7162a"}, + {file = 
"pyproj-3.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:12f62c20656ac9b6076ebb213e9a635d52f4f01fef95310121d337e62e910cb6"}, + {file = "pyproj-3.4.0-cp39-cp39-win_amd64.whl", hash = "sha256:65a0bcdbad95b3c00b419e5d75b1f7e450ec17349b5ea16bf7438ac1d50a12a2"}, + {file = "pyproj-3.4.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:14ad113b5753c6057f9b2f3c85a6497cef7fa237c4328f2943c0223e98c1dde6"}, + {file = "pyproj-3.4.0-pp38-pypy38_pp73-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4688b4cd62cbd86b5e855f9e27d90fbb53f2b4c2ea1cd394a46919e1a4151b89"}, + {file = "pyproj-3.4.0-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:47ad53452ae1dc8b0bf1df920a210bb5616989085aa646592f8681f1d741a754"}, + {file = "pyproj-3.4.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl", hash = "sha256:48787962232109bad8b72e27949037a9b03591228a6955f25dbe451233e8648a"}, + {file = "pyproj-3.4.0-pp39-pypy39_pp73-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2cb8592259ea54e7557523b079d3f2304081680bdb48bfbf0fd879ee6156129c"}, + {file = "pyproj-3.4.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:82200b4569d68b421c079d2973475b58d5959306fe758b43366e79fe96facfe5"}, + {file = "pyproj-3.4.0.tar.gz", hash = "sha256:a708445927ace9857f52c3ba67d2915da7b41a8fdcd9b8f99a4c9ed60a75eb33"}, +] +pytest = [ + {file = "pytest-6.2.5-py3-none-any.whl", hash = "sha256:7310f8d27bc79ced999e760ca304d69f6ba6c6649c0b60fb0e04a4a77cacc134"}, + {file = "pytest-6.2.5.tar.gz", hash = "sha256:131b36680866a76e6781d13f101efb86cf674ebb9762eb70d3082b6f29889e89"}, +] +pytest-asyncio = [ + {file = "pytest-asyncio-0.18.3.tar.gz", hash = "sha256:7659bdb0a9eb9c6e3ef992eef11a2b3e69697800ad02fb06374a210d85b29f91"}, + {file = "pytest_asyncio-0.18.3-1-py3-none-any.whl", hash = "sha256:16cf40bdf2b4fb7fc8e4b82bd05ce3fbcd454cbf7b92afc445fe299dabb88213"}, + {file = "pytest_asyncio-0.18.3-py3-none-any.whl", hash = "sha256:8fafa6c52161addfd41ee7ab35f11836c5a16ec208f93ee388f752bea3493a84"}, +] +pytest-cov = [ + {file = "pytest-cov-3.0.0.tar.gz", hash = "sha256:e7f0f5b1617d2210a2cabc266dfe2f4c75a8d32fb89eafb7ad9d06f6d076d470"}, + {file = "pytest_cov-3.0.0-py3-none-any.whl", hash = "sha256:578d5d15ac4a25e5f961c938b85a05b09fdaae9deef3bb6de9a6e766622ca7a6"}, +] +pytest-forked = [ + {file = "pytest-forked-1.4.0.tar.gz", hash = "sha256:8b67587c8f98cbbadfdd804539ed5455b6ed03802203485dd2f53c1422d7440e"}, + {file = "pytest_forked-1.4.0-py3-none-any.whl", hash = "sha256:bbbb6717efc886b9d64537b41fb1497cfaf3c9601276be8da2cccfea5a3c8ad8"}, +] +pytest-rerunfailures = [ + {file = "pytest-rerunfailures-10.2.tar.gz", hash = "sha256:9e1e1bad51e07642c5bbab809fc1d4ec8eebcb7de86f90f1a26e6ef9de446697"}, + {file = "pytest_rerunfailures-10.2-py3-none-any.whl", hash = "sha256:d31d8e828dfd39363ad99cd390187bf506c7a433a89f15c3126c7d16ab723fe2"}, +] +pytest-xdist = [ + {file = "pytest-xdist-2.5.0.tar.gz", hash = "sha256:4580deca3ff04ddb2ac53eba39d76cb5dd5edeac050cb6fbc768b0dd712b4edf"}, + {file = "pytest_xdist-2.5.0-py3-none-any.whl", hash = "sha256:6fe5c74fec98906deb8f2d2b616b5c782022744978e7bd4695d39c8f42d0ce65"}, +] python-dateutil = [ {file = "python-dateutil-2.8.2.tar.gz", hash = "sha256:0123cacc1627ae19ddf3c27a5de5bd67ee4586fbdd6440d9748f8abb483d3e86"}, {file = "python_dateutil-2.8.2-py2.py3-none-any.whl", hash = "sha256:961d03dc3453ebbc59dbdea9e4e11c5651520a876d0f4db161e8674aae935da9"}, ] -python-dotenv = [] -pytz = [] +python-dotenv = [ + {file = 
"python-dotenv-0.20.0.tar.gz", hash = "sha256:b7e3b04a59693c42c36f9ab1cc2acc46fa5df8c78e178fc33a8d4cd05c8d498f"}, + {file = "python_dotenv-0.20.0-py3-none-any.whl", hash = "sha256:d92a187be61fe482e4fd675b6d52200e7be63a12b724abbf931a40ce4fa92938"}, +] +pytz = [ + {file = "pytz-2022.2.1-py2.py3-none-any.whl", hash = "sha256:220f481bdafa09c3955dfbdddb7b57780e9a94f5127e35456a48589b9e0c0197"}, + {file = "pytz-2022.2.1.tar.gz", hash = "sha256:cea221417204f2d1a2aa03ddae3e867921971d0d76f14d87abb4414415bbdcf5"}, +] pywin32-ctypes = [ {file = "pywin32-ctypes-0.2.0.tar.gz", hash = "sha256:24ffc3b341d457d48e8922352130cf2644024a4ff09762a2261fd34c36ee5942"}, {file = "pywin32_ctypes-0.2.0-py2.py3-none-any.whl", hash = "sha256:9dc2d991b3479cc2df15930958b674a48a227d5361d413827a4cfd0b5876fc98"}, @@ -1503,8 +1751,14 @@ pyyaml = [ {file = "PyYAML-6.0-cp39-cp39-win_amd64.whl", hash = "sha256:b3d267842bf12586ba6c734f89d1f5b871df0273157918b0ccefa29deb05c21c"}, {file = "PyYAML-6.0.tar.gz", hash = "sha256:68fb519c14306fec9720a2a5b45bc9f0c8d1b9c72adf45c37baedfcd949c35a2"}, ] -readme-renderer = [] -requests = [] +readme-renderer = [ + {file = "readme_renderer-37.2-py3-none-any.whl", hash = "sha256:d3f06a69e8c40fca9ab3174eca48f96d9771eddb43517b17d96583418427b106"}, + {file = "readme_renderer-37.2.tar.gz", hash = "sha256:e8ad25293c98f781dbc2c5a36a309929390009f902f99e1798c761aaf04a7923"}, +] +requests = [ + {file = "requests-2.28.1-py3-none-any.whl", hash = "sha256:8fefa2a1a1365bf5520aac41836fbee479da67864514bdb821f31ce07ce65349"}, + {file = "requests-2.28.1.tar.gz", hash = "sha256:7c5599b102feddaa661c826c56ab4fee28bfd17f5abca1ebbe3e7f19d7c97983"}, +] requests-oauthlib = [ {file = "requests-oauthlib-1.3.1.tar.gz", hash = "sha256:75beac4a47881eeb94d5ea5d6ad31ef88856affe2332b9aafb52c6452ccf0d7a"}, {file = "requests_oauthlib-1.3.1-py2.py3-none-any.whl", hash = "sha256:2577c501a2fb8d05a304c09d090d6e47c306fef15809d102b327cf8364bddab5"}, @@ -1513,18 +1767,58 @@ requests-toolbelt = [ {file = "requests-toolbelt-0.9.1.tar.gz", hash = "sha256:968089d4584ad4ad7c171454f0a5c6dac23971e9472521ea3b6d49d610aa6fc0"}, {file = "requests_toolbelt-0.9.1-py2.py3-none-any.whl", hash = "sha256:380606e1d10dc85c3bd47bf5a6095f815ec007be7a8b69c878507068df059e6f"}, ] -responses = [] +responses = [ + {file = "responses-0.21.0-py3-none-any.whl", hash = "sha256:2dcc863ba63963c0c3d9ee3fa9507cbe36b7d7b0fccb4f0bdfd9e96c539b1487"}, + {file = "responses-0.21.0.tar.gz", hash = "sha256:b82502eb5f09a0289d8e209e7bad71ef3978334f56d09b444253d5ad67bf5253"}, +] rfc3986 = [ {file = "rfc3986-2.0.0-py2.py3-none-any.whl", hash = "sha256:50b1502b60e289cb37883f3dfd34532b8873c7de9f49bb546641ce9cbd256ebd"}, {file = "rfc3986-2.0.0.tar.gz", hash = "sha256:97aacf9dbd4bfd829baad6e6309fa6573aaf1be3f6fa735c8ab05e46cecb261c"}, ] -rich = [] -secretstorage = [] -setuptools-scm = [ - {file = "setuptools_scm-6.4.2-py3-none-any.whl", hash = "sha256:acea13255093849de7ccb11af9e1fb8bde7067783450cee9ef7a93139bddf6d4"}, - {file = "setuptools_scm-6.4.2.tar.gz", hash = "sha256:6833ac65c6ed9711a4d5d2266f8024cfa07c533a0e55f4c12f6eff280a5a9e30"}, +rich = [ + {file = "rich-12.5.1-py3-none-any.whl", hash = "sha256:2eb4e6894cde1e017976d2975ac210ef515d7548bc595ba20e195fb9628acdeb"}, + {file = "rich-12.5.1.tar.gz", hash = "sha256:63a5c5ce3673d3d5fbbf23cd87e11ab84b6b451436f1b7f19ec54b6bc36ed7ca"}, +] +secretstorage = [ + {file = "SecretStorage-3.3.3-py3-none-any.whl", hash = "sha256:f356e6628222568e3af06f2eba8df495efa13b3b63081dafd4f7d9a7b7bc9f99"}, + {file = "SecretStorage-3.3.3.tar.gz", hash = 
"sha256:2403533ef369eca6d2ba81718576c5e0f564d5cca1b58f73a8b23e7d4eeebd77"}, +] +shapely = [ + {file = "Shapely-1.8.4-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:6702a5df484ca92bbd1494b5945dd7d6b8f6caab13ca9f6240e64034a114fa13"}, + {file = "Shapely-1.8.4-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:79da29fde8ad2ca791b324f2cc3e75093573f69488ade7b524f79d781b042699"}, + {file = "Shapely-1.8.4-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:eac2d08c0a02dccffd7f836901ea1d1b0f8e7ff3878b2c7a45443f0a34e7f087"}, + {file = "Shapely-1.8.4-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:007f0d51d045307dc3addd1c318d18f450c565c8ea96ea41304e020ca34d85b7"}, + {file = "Shapely-1.8.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:04f416aa8ca9480b5cd74d2184fe43d4196a5941046661f7be27fe5c10f89ede"}, + {file = "Shapely-1.8.4-cp310-cp310-win32.whl", hash = "sha256:f6801a33897fb54ce39d5e841214192ecf95f4ddf8458d17e196a314fefe43bb"}, + {file = "Shapely-1.8.4-cp310-cp310-win_amd64.whl", hash = "sha256:e018163500109ab4c9ad51d018ba28abb1aed5b0451476859e189fbb00c46c7b"}, + {file = "Shapely-1.8.4-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:687520cf1db1fac2970cca5eb2ea037c1862b2e6938a514f9f6106c9d4ac0445"}, + {file = "Shapely-1.8.4-cp36-cp36m-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:471ce47f3b221731b3a8fb90c24dd5899140ca892bb78c5df49b340a73da5bd2"}, + {file = "Shapely-1.8.4-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:bb371511269d8320652b980edb044f9c45c87df12ecce00c4bb1d0662d53bdb4"}, + {file = "Shapely-1.8.4-cp36-cp36m-win32.whl", hash = "sha256:20157b20f32eac57a56b5ef5a5a0ffb5288e1554e0172bc9452d3de190965709"}, + {file = "Shapely-1.8.4-cp36-cp36m-win_amd64.whl", hash = "sha256:be731cf35cfd54091d62cd63a4c4d87a97db68c2224408ec6ef28c6333d74501"}, + {file = "Shapely-1.8.4-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:95a864b83857de736499d171785b8e71df97e8cef62d4e36b34f057b5a4dc98c"}, + {file = "Shapely-1.8.4-cp37-cp37m-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:4c10d55a2dfab648d9aeca1818f986e505f29be2763edd0910b50c76d73db085"}, + {file = "Shapely-1.8.4-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:a2cc137d525a2e54557df2f70f7b9d52749840e1d877cf500a8f7f0f77170552"}, + {file = "Shapely-1.8.4-cp37-cp37m-win32.whl", hash = "sha256:6c399712b98fef80ef53748a572b229788650b0af535e6d4c5a3168aabbc0013"}, + {file = "Shapely-1.8.4-cp37-cp37m-win_amd64.whl", hash = "sha256:4f14ea7f041412ff5b277d5424e76638921ba771c43b21b20706abc7900d5ce9"}, + {file = "Shapely-1.8.4-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:1d431ac2bb75e7c59a75820719b2f0f494720d821cb68eeb2487812d1d7bc287"}, + {file = "Shapely-1.8.4-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:2a6e2fb40415cecf67dff1a13844d27a11c09604839b5cfbbb41b80cf97a625c"}, + {file = "Shapely-1.8.4-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:1f071175777f87d9220c24e4576dcf972b14f93dffd05a1d72ee0555dfa2a799"}, + {file = "Shapely-1.8.4-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:7855ac13c5a951bcef1f3834d1affeeacea42a4abd2c0f46b341229b350f2406"}, + {file = "Shapely-1.8.4-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:d7a6fd1329f75e290b858e9faeef15ae76d7ea05a02648fe216fec3c3bed4eb0"}, + {file = "Shapely-1.8.4-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:20c40085835fbd5b12566b9b0a6d718b0b6a4d308ff1fff5b19d7cf29f75cc77"}, + 
{file = "Shapely-1.8.4-cp38-cp38-win32.whl", hash = "sha256:41e1395bb3865e42ca3dec857669ed3ab90806925fce38c47d7f92bd4276f7cd"}, + {file = "Shapely-1.8.4-cp38-cp38-win_amd64.whl", hash = "sha256:34765b0495c6297adb95d7de8fc62790f8eaf8e7fb96260dd644cf11d37b3d21"}, + {file = "Shapely-1.8.4-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:53d453f40e5b1265b8806ac7e5f3ce775b758e5c42c24239e3d8de6e861b7699"}, + {file = "Shapely-1.8.4-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:5f3bf1d985dc8367f480f68f07770f57a5fe54477e98237c6f328db79568f1e2"}, + {file = "Shapely-1.8.4-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:033b9eaf50c9de4c87b0d1ffa532edcf7420b70a329c630431da50071be939d9"}, + {file = "Shapely-1.8.4-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:b1756c28a48a61e5581720171a89d69ae303d5faffc58efef0dab498e16a50f1"}, + {file = "Shapely-1.8.4-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:a352f00637dda1354c549b602d9dcc69a7048d5d64dcdaf3b5e702d0bf5faad2"}, + {file = "Shapely-1.8.4-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b70463ef505f509809b92ffb1202890a1236ce9f21666020de289fed911fdeaf"}, + {file = "Shapely-1.8.4-cp39-cp39-win32.whl", hash = "sha256:5b77a7fd5bbf051a640d25db85fc062d245ef03cd80081321b6b87213a8b0892"}, + {file = "Shapely-1.8.4-cp39-cp39-win_amd64.whl", hash = "sha256:5d629bcf68b45dfdfd85cc0dc37f5325d4ce9341b235f16969c1a76599476e84"}, + {file = "Shapely-1.8.4.tar.gz", hash = "sha256:a195e51caafa218291f2cbaa3fef69fd3353c93ec4b65b2a4722c4cf40c3198c"}, ] -shapely = [] six = [ {file = "six-1.16.0-py2.py3-none-any.whl", hash = "sha256:8abb2f1d86890a2dfb989f9a77cfcfd3e47c2a354b01111771326f8aa26e0254"}, {file = "six-1.16.0.tar.gz", hash = "sha256:1e61c37477a1626458e36f7b1d82aa5c9b094fa4802892072e49de9c60c4c926"}, @@ -1533,7 +1827,14 @@ snowballstemmer = [ {file = "snowballstemmer-2.2.0-py2.py3-none-any.whl", hash = "sha256:c8e1716e83cc398ae16824e5572ae04e0d9fc2c6b985fb0f900f5f0c96ecba1a"}, {file = "snowballstemmer-2.2.0.tar.gz", hash = "sha256:09b16deb8547d3412ad7b590689584cd0fe25ec8db3be37788be3810cbf19cb1"}, ] -sphinx = [] +sortedcontainers = [ + {file = "sortedcontainers-2.4.0-py2.py3-none-any.whl", hash = "sha256:a163dcaede0f1c021485e957a39245190e74249897e2ae4b2aa38595db237ee0"}, + {file = "sortedcontainers-2.4.0.tar.gz", hash = "sha256:25caa5a06cc30b6b83d11423433f65d1f9d76c4c6a0c90e3379eaa43b9bfdb88"}, +] +sphinx = [ + {file = "Sphinx-5.2.1.tar.gz", hash = "sha256:c009bb2e9ac5db487bcf53f015504005a330ff7c631bb6ab2604e0d65bae8b54"}, + {file = "sphinx-5.2.1-py3-none-any.whl", hash = "sha256:3dcf00fcf82cf91118db9b7177edea4fc01998976f893928d0ab0c58c54be2ca"}, +] sphinx-rtd-theme = [ {file = "sphinx_rtd_theme-1.0.0-py2.py3-none-any.whl", hash = "sha256:4d35a56f4508cfee4c4fb604373ede6feae2a306731d533f409ef5c3496fdbd8"}, {file = "sphinx_rtd_theme-1.0.0.tar.gz", hash = "sha256:eec6d497e4c2195fa0e8b2016b337532b8a699a68bcb22a512870e16925c6a5c"}, @@ -1562,7 +1863,10 @@ sphinxcontrib-serializinghtml = [ {file = "sphinxcontrib-serializinghtml-1.1.5.tar.gz", hash = "sha256:aa5f6de5dfdf809ef505c4895e51ef5c9eac17d0f287933eb49ec495280b6952"}, {file = "sphinxcontrib_serializinghtml-1.1.5-py2.py3-none-any.whl", hash = "sha256:352a9a00ae864471d3a7ead8d7d79f5fc0b57e8b3f95e9867eb9eb28999b92fd"}, ] -sympy = [] +sympy = [ + {file = "sympy-1.11.1-py3-none-any.whl", hash = "sha256:938f984ee2b1e8eae8a07b884c8b7a1146010040fccddc6539c54f401c8f6fcf"}, + {file = "sympy-1.11.1.tar.gz", hash = 
"sha256:e32380dce63cb7c0108ed525570092fd45168bdae2faa17e528221ef72e88658"}, +] toml = [ {file = "toml-0.10.2-py2.py3-none-any.whl", hash = "sha256:806143ae5bfb6a3c6e736a764057db0e6a0e05e338b5630894a5f779cabb4f9b"}, {file = "toml-0.10.2.tar.gz", hash = "sha256:b3bda1d108d5dd99f4a20d24d9c348e91c4db7ab1b749200bded2f839ccbe68f"}, @@ -1571,14 +1875,39 @@ tomli = [ {file = "tomli-2.0.1-py3-none-any.whl", hash = "sha256:939de3e7a6161af0c887ef91b7d41a53e7c5a1ca976325f429cb46ea9bc30ecc"}, {file = "tomli-2.0.1.tar.gz", hash = "sha256:de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f"}, ] -twine = [] -types-requests = [] -types-urllib3 = [] -typing-extensions = [] -urllib3 = [] -virtualenv = [] +twine = [ + {file = "twine-4.0.1-py3-none-any.whl", hash = "sha256:42026c18e394eac3e06693ee52010baa5313e4811d5a11050e7d48436cf41b9e"}, + {file = "twine-4.0.1.tar.gz", hash = "sha256:96b1cf12f7ae611a4a40b6ae8e9570215daff0611828f5fe1f37a16255ab24a0"}, +] +types-cryptography = [ + {file = "types-cryptography-3.3.23.tar.gz", hash = "sha256:b85c45fd4d3d92e8b18e9a5ee2da84517e8fff658e3ef5755c885b1c2a27c1fe"}, + {file = "types_cryptography-3.3.23-py3-none-any.whl", hash = "sha256:913b3e66a502edbf4bfc3bb45e33ab476040c56942164a7ff37bd1f0ef8ef783"}, +] +types-requests = [ + {file = "types-requests-2.28.11.tar.gz", hash = "sha256:7ee827eb8ce611b02b5117cfec5da6455365b6a575f5e3ff19f655ba603e6b4e"}, + {file = "types_requests-2.28.11-py3-none-any.whl", hash = "sha256:af5f55e803cabcfb836dad752bd6d8a0fc8ef1cd84243061c0e27dee04ccf4fd"}, +] +types-urllib3 = [ + {file = "types-urllib3-1.26.24.tar.gz", hash = "sha256:a1b3aaea7dda3eb1b51699ee723aadd235488e4dc4648e030f09bc429ecff42f"}, + {file = "types_urllib3-1.26.24-py3-none-any.whl", hash = "sha256:cf7918503d02d3576e503bbfb419b0e047c4617653bba09624756ab7175e15c9"}, +] +typing-extensions = [ + {file = "typing_extensions-4.3.0-py3-none-any.whl", hash = "sha256:25642c956049920a5aa49edcdd6ab1e06d7e5d467fc00e0506c44ac86fbfca02"}, + {file = "typing_extensions-4.3.0.tar.gz", hash = "sha256:e6d2677a32f47fc7eb2795db1dd15c1f34eff616bcaf2cfb5e997f854fa1c4a6"}, +] +urllib3 = [ + {file = "urllib3-1.26.12-py2.py3-none-any.whl", hash = "sha256:b930dd878d5a8afb066a637fbb35144fe7901e3b209d1cd4f524bd0e9deee997"}, + {file = "urllib3-1.26.12.tar.gz", hash = "sha256:3fa96cf423e6987997fc326ae8df396db2a8b7c667747d47ddd8ecba91f4a74e"}, +] +virtualenv = [ + {file = "virtualenv-20.16.5-py3-none-any.whl", hash = "sha256:d07dfc5df5e4e0dbc92862350ad87a36ed505b978f6c39609dc489eadd5b0d27"}, + {file = "virtualenv-20.16.5.tar.gz", hash = "sha256:227ea1b9994fdc5ea31977ba3383ef296d7472ea85be9d6732e42a91c04e80da"}, +] webencodings = [ {file = "webencodings-0.5.1-py2.py3-none-any.whl", hash = "sha256:a0af1213f3c2226497a97e2b3aa01a7e4bee4f403f95be16fc9acd2947514a78"}, {file = "webencodings-0.5.1.tar.gz", hash = "sha256:b36a1c245f2d304965eb4e0a82848379241dc04b865afcc4aab16748587e1923"}, ] -zipp = [] +zipp = [ + {file = "zipp-3.8.1-py3-none-any.whl", hash = "sha256:47c40d7fe183a6f21403a199b3e4192cca5774656965b0a4988ad2f8feb5f009"}, + {file = "zipp-3.8.1.tar.gz", hash = "sha256:05b45f1ee8f807d0cc928485ca40a07cb491cf092ff587c0df9cb1fd154848d2"}, +] diff --git a/pyproject.toml b/pyproject.toml index d6a859f29c..17728d4cb7 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,7 +1,7 @@ [tool.poetry] name = "cognite-sdk" -version = "4.9.0" +version = "5.0.0" description = "Cognite Python SDK" readme = "README.md" @@ -16,14 +16,17 @@ python = "^3.8" requests = "^2" requests_oauthlib = "^1" msal = "^1" 
+sortedcontainers = "^2.2" +numpy = { version = "^1.20", optional = true } sympy = { version = "*", optional = true } -pandas = { version = "*", optional = true } +pandas = { version = "^1.4", optional = true } geopandas = { version = ">=0.10.0", optional = true } shapely = { version = ">=1.7.0", optional = true } pip = { version = ">=20.0.0", optional = true} [tool.poetry.extras] pandas = ["pandas"] +numpy = ["numpy"] geo = ["geopandas", "shapely"] sympy = ["sympy"] functions = ["pip"] @@ -41,7 +44,6 @@ responses = "^0.21.0" pytest-rerunfailures = "^10.2" pytest-asyncio = "^0.18.3" toml = "^0.10.2" -matplotlib = "^3.5.2" python-dotenv = "^0.20.0" pytest-xdist = "^2.5.0" mypy = "^0.961" diff --git a/pytest.ini b/pytest.ini index bd5f3121bd..045d362847 100644 --- a/pytest.ini +++ b/pytest.ini @@ -6,5 +6,3 @@ markers= asyncio_mode=auto filterwarnings= ignore:Authenticated towards inferred project 'test':UserWarning - ignore:Interpreting given naive datetime as UTC instead of local time:FutureWarning - ignore:This function, `ms_to_datetime` returns a naive datetime object in UTC:FutureWarning diff --git a/scripts/create_ts_for_integration_tests.py b/scripts/create_ts_for_integration_tests.py new file mode 100644 index 0000000000..f2c5110797 --- /dev/null +++ b/scripts/create_ts_for_integration_tests.py @@ -0,0 +1,207 @@ +import time + +import numpy as np +import pandas as pd + +from cognite.client import CogniteClient +from cognite.client.data_classes import TimeSeries +from cognite.client.utils._time import UNIT_IN_MS + +NAMES = [ + f"PYSDK integration test {s}" + for s in [ + "001: outside points, numeric", + "002: outside points, string", + *[f"{i:03d}: weekly values, 1950-2000, numeric" for i in range(3, 54)], + *[f"{i:03d}: weekly values, 1950-2000, string" for i in range(54, 104)], + "104: daily values, 1965-1975, numeric", + "105: hourly values, 1969-10-01 - 1970-03-01, numeric", + "106: every minute, 1969-12-31 - 1970-01-02, numeric", + "107: every second, 1969-12-31 23:30:00 - 1970-01-01 00:30:00, numeric", + "108: every millisecond, 1969-12-31 23:59:58.500 - 1970-01-01 00:00:01.500, numeric", + "109: daily values, is_step=True, 1965-1975, numeric", + "110: hourly values, is_step=True, 1969-10-01 - 1970-03-01, numeric", + "111: every minute, is_step=True, 1969-12-31 - 1970-01-02, numeric", + "112: every second, is_step=True, 1969-12-31 23:30:00 - 1970-01-01 00:30:00, numeric", + "113: every millisecond, is_step=True, 1969-12-31 23:59:58.500 - 1970-01-01 00:00:01.500, numeric", + "114: 1mill dps, random distribution, 1950-2020, numeric", + "115: 1mill dps, random distribution, 1950-2020, string", + "116: 5mill dps, 2k dps (.1s res) burst per day, 2000-01-01 12:00:00 - 2013-09-08 12:03:19.900, numeric", + ] +] + + +def create_dense_rand_dist_ts(xid, seed, n=1_000_000): + np.random.seed(seed) + idx = np.sort( + np.random.randint( + pd.Timestamp("1950-01-01").value // int(1e6), + pd.Timestamp("2020-01-01").value // int(1e6), + n, + ) + ) + dupe = idx[:-1] == idx[1:] + idx[1:][dupe] += 1 # fingers crossed + idx = pd.to_datetime(idx, unit="ms").sort_values() + assert idx.is_unique, "not unique idx" + return pd.DataFrame( + {xid: np.arange(n) - 499_999}, + index=idx, + ) + + +def create_bursty_ts(xid, offset): + idxs = [] + for day in pd.date_range(start="2000", periods=5000, freq="D"): + idxs.append(pd.date_range(start=day + pd.Timedelta("12h"), periods=2000, freq="100ms")) + idxs = np.concatenate(idxs) + return pd.DataFrame({xid: idxs.astype(np.int64) // int(1e6) + offset}, index=idxs) + 
+ +def delete_all_time_series(ts_api): + ts_api.delete(external_id=NAMES, ignore_unknown_ids=True) + print(f"Deleted ts: {len(NAMES)=}") + time.sleep(10) + + +def create_all_time_series(ts_api): + ts_add = (ts_lst := []).append + df_add = (df_lst := []).append + ts_add(TimeSeries(name=NAMES[0], external_id=NAMES[0], is_string=False, metadata={"offset": 1, "delta": 10})) + arr = np.arange(-100, 101, 10) + df_add(pd.DataFrame({NAMES[0]: arr + 1}, index=pd.to_datetime(arr, unit="ms"))) + + ts_add(TimeSeries(name=NAMES[1], external_id=NAMES[1], is_string=True, metadata={"offset": 2, "delta": 10})) + df_add(pd.DataFrame({NAMES[1]: (arr + 2).astype(str)}, index=pd.to_datetime(arr, unit="ms"))) + + weekly_idx = pd.date_range(start="1950", end="2000", freq="1w") + arr = weekly_idx.to_numpy("datetime64[ms]").astype(np.int64) + for i, name in enumerate(NAMES[2:53], 3): + ts_add( + TimeSeries(name=name, external_id=name, is_string=False, metadata={"offset": i, "delta": UNIT_IN_MS["w"]}) + ) + df_add(pd.DataFrame({name: arr + i}, index=weekly_idx)) + + for i, name in enumerate(NAMES[53:103], i + 1): + ts_add( + TimeSeries(name=name, external_id=name, is_string=True, metadata={"offset": i, "delta": UNIT_IN_MS["w"]}) + ) + df_add(pd.DataFrame({name: (arr + i).astype(str)}, index=weekly_idx)) + + i = 103 + for is_step in [False, True]: + daily_idx = pd.date_range(start="1965", end="1975", freq="1d") + arr = daily_idx.to_numpy("datetime64[ms]").astype(np.int64) + ts_add( + TimeSeries( + name=NAMES[i], + external_id=NAMES[i], + is_string=False, + is_step=is_step, + metadata={"offset": i + 1, "delta": UNIT_IN_MS["d"]}, + ) + ) + df_add(pd.DataFrame({NAMES[i]: arr + i + 1}, index=daily_idx)) + + hourly_idx = pd.date_range(start="1969-10-01", end="1970-03-01", freq="1h") + arr = hourly_idx.to_numpy("datetime64[ms]").astype(np.int64) + ts_add( + TimeSeries( + name=NAMES[i + 1], + external_id=NAMES[i + 1], + is_string=False, + is_step=is_step, + metadata={"offset": i + 2, "delta": UNIT_IN_MS["h"]}, + ) + ) + df_add(pd.DataFrame({NAMES[i + 1]: arr + i + 2}, index=hourly_idx)) + + minute_idx = pd.date_range(start="1969-12-31", end="1970-01-02", freq="1T") + arr = minute_idx.to_numpy("datetime64[ms]").astype(np.int64) + ts_add( + TimeSeries( + name=NAMES[i + 2], + external_id=NAMES[i + 2], + is_string=False, + is_step=is_step, + metadata={"offset": i + 3, "delta": UNIT_IN_MS["m"]}, + ) + ) + df_add(pd.DataFrame({NAMES[i + 2]: arr + i + 3}, index=minute_idx)) + + second_idx = pd.date_range(start="1969-12-31 23:30:00", end="1970-01-01 00:30:00", freq="1s") + arr = second_idx.to_numpy("datetime64[ms]").astype(np.int64) + ts_add( + TimeSeries( + name=NAMES[i + 3], + external_id=NAMES[i + 3], + is_string=False, + is_step=is_step, + metadata={"offset": i + 4, "delta": UNIT_IN_MS["s"]}, + ) + ) + df_add(pd.DataFrame({NAMES[i + 3]: arr + i + 4}, index=second_idx)) + + millisec_idx = pd.date_range(start="1969-12-31 23:59:58.500", end="1970-01-01 00:00:01.500", freq="1ms") + arr = millisec_idx.to_numpy("datetime64[ms]").astype(np.int64) + ts_add( + TimeSeries( + name=NAMES[i + 4], + external_id=NAMES[i + 4], + is_string=False, + is_step=is_step, + metadata={"offset": i + 5, "delta": 1}, + ) + ) + df_add(pd.DataFrame({NAMES[i + 4]: arr + i + 5}, index=millisec_idx)) + i += 5 + + ts_add( + TimeSeries( + name=NAMES[113], + external_id=NAMES[113], + is_string=False, + metadata={"offset": "n/a", "delta": "uniform random"}, + ) + ) + df_add(create_dense_rand_dist_ts(NAMES[113], seed=42)) + + ts_add( + TimeSeries( + 
name=NAMES[114], + external_id=NAMES[114], + is_string=True, + metadata={"offset": "n/a", "delta": "uniform random"}, + ) + ) + df_add(create_dense_rand_dist_ts(NAMES[114], seed=43).astype(str)) + + ts_add( + TimeSeries( + name=NAMES[115], + external_id=NAMES[115], + is_string=False, + metadata={"offset": 116, "delta": "1 or 86200100000000"}, + ) + ) + df_add(create_bursty_ts(NAMES[115], 116)) + + ts_api.create(ts_lst) + print(f"Created {len(ts_lst)} ts") + time.sleep(5) + + # Concat consumes too much RAM, loop through dfs: + for df in df_lst: + ts_api.data.insert_dataframe( + df, + external_id_headers=True, + dropna=True, + ) + print("Inserted loads of dps!") + + +if __name__ == "__main__": + # To avoid accidental runs, please provide a valid config to CogniteClient: + client = CogniteClient(...) + delete_all_time_series(client.time_series) + create_all_time_series(client.time_series) diff --git a/scripts/generate_code_snippets.py b/scripts/generate_code_snippets.py index 48bac64419..798643415d 100644 --- a/scripts/generate_code_snippets.py +++ b/scripts/generate_code_snippets.py @@ -25,7 +25,7 @@ def collect_apis(obj, done): apis = collect_apis(client, {}) snippets = {"language": "Python", "label": "Python SDK", "operations": defaultdict(str)} -filter_out = ["from cognite.client import CogniteClient", "c = CogniteClient()", ""] +filter_out = ["from cognite.client import CogniteClient", "c = CogniteClient()", "client = CogniteClient()", ""] duplicate_operations = { "listAssets": "getAssets", diff --git a/tests/conftest.py b/tests/conftest.py index 723ebf8dd0..06f1e6836c 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -1,79 +1,14 @@ -from unittest import mock - import dotenv import pytest import responses -from cognite.client import CogniteClient, global_config -from cognite.client._api.assets import AssetsAPI -from cognite.client._api.data_sets import DataSetsAPI -from cognite.client._api.datapoints import DatapointsAPI -from cognite.client._api.entity_matching import EntityMatchingAPI -from cognite.client._api.events import EventsAPI -from cognite.client._api.files import FilesAPI -from cognite.client._api.iam import IAMAPI, APIKeysAPI, GroupsAPI, SecurityCategoriesAPI, ServiceAccountsAPI -from cognite.client._api.login import LoginAPI -from cognite.client._api.raw import RawAPI, RawDatabasesAPI, RawRowsAPI, RawTablesAPI -from cognite.client._api.relationships import RelationshipsAPI -from cognite.client._api.sequences import SequencesAPI, SequencesDataAPI -from cognite.client._api.three_d import ( - ThreeDAPI, - ThreeDAssetMappingAPI, - ThreeDFilesAPI, - ThreeDModelsAPI, - ThreeDRevisionsAPI, -) -from cognite.client._api.time_series import TimeSeriesAPI -from cognite.client._api.vision import VisionAPI +from cognite.client import global_config dotenv.load_dotenv() global_config.disable_pypi_version_check = True -@pytest.fixture -def mock_cognite_client(): - with mock.patch("cognite.client.CogniteClient") as client_mock: - cog_client_mock = mock.MagicMock(spec=CogniteClient) - cog_client_mock.time_series = mock.MagicMock(spec=TimeSeriesAPI) - cog_client_mock.datapoints = mock.MagicMock(spec=DatapointsAPI) - cog_client_mock.assets = mock.MagicMock(spec=AssetsAPI) - cog_client_mock.events = mock.MagicMock(spec=EventsAPI) - cog_client_mock.data_sets = mock.MagicMock(spec=DataSetsAPI) - cog_client_mock.files = mock.MagicMock(spec=FilesAPI) - cog_client_mock.login = mock.MagicMock(spec=LoginAPI) - cog_client_mock.three_d = mock.MagicMock(spec=ThreeDAPI) - 
cog_client_mock.three_d.models = mock.MagicMock(spec=ThreeDModelsAPI) - cog_client_mock.three_d.revisions = mock.MagicMock(spec=ThreeDRevisionsAPI) - cog_client_mock.three_d.files = mock.MagicMock(spec=ThreeDFilesAPI) - cog_client_mock.three_d.asset_mappings = mock.MagicMock(spec=ThreeDAssetMappingAPI) - cog_client_mock.iam = mock.MagicMock(spec=IAMAPI) - cog_client_mock.iam.service_accounts = mock.MagicMock(spec=ServiceAccountsAPI) - cog_client_mock.iam.api_keys = mock.MagicMock(spec=APIKeysAPI) - cog_client_mock.iam.groups = mock.MagicMock(spec=GroupsAPI) - cog_client_mock.iam.security_categories = mock.MagicMock(spec=SecurityCategoriesAPI) - cog_client_mock.sequences = mock.MagicMock(spec=SequencesAPI) - cog_client_mock.sequences.data = mock.MagicMock(spec=SequencesDataAPI) - cog_client_mock.relationships = mock.MagicMock(spec=RelationshipsAPI) - cog_client_mock.vision = mock.MagicMock(spec=VisionAPI) - raw_mock = mock.MagicMock(spec=RawAPI) - raw_mock.databases = mock.MagicMock(spec=RawDatabasesAPI) - raw_mock.tables = mock.MagicMock(spec=RawTablesAPI) - raw_mock.rows = mock.MagicMock(spec=RawRowsAPI) - cog_client_mock.raw = raw_mock - client_mock.return_value = cog_client_mock - yield - - -@pytest.fixture -def mock_cognite_beta_client(mock_cognite_client): - with mock.patch("cognite.client.beta.CogniteClient") as client_mock: - cog_client_mock = mock.MagicMock(spec=CogniteClient) - cog_client_mock.entity_matching = mock.MagicMock(spec=EntityMatchingAPI) - client_mock.return_value = cog_client_mock - yield - - @pytest.fixture def rsps(): with responses.RequestsMock() as rsps: @@ -96,7 +31,7 @@ def pytest_addoption(parser): def pytest_collection_modifyitems(config, items): if config.getoption("--test-deps-only-core"): - return + return None skip_core = pytest.mark.skip(reason="need --test-deps-only-core option to run") for item in items: if "coredeps" in item.keywords: diff --git a/tests/tests_integration/conftest.py b/tests/tests_integration/conftest.py index 9ce11d0246..7299511d09 100644 --- a/tests/tests_integration/conftest.py +++ b/tests/tests_integration/conftest.py @@ -24,9 +24,9 @@ def cognite_client() -> CogniteClient: scopes=os.environ.get("COGNITE_TOKEN_SCOPES", "").split(","), redirect_port=random.randint(53000, 60000), # random port so we can run the test suite in parallel ) - else: raise ValueError("Environment variable LOGIN_FLOW must be set to either 'client_credentials' or 'interactive'") + return CogniteClient( ClientConfig( client_name=os.environ["COGNITE_CLIENT_NAME"], diff --git a/tests/tests_integration/test_api/test_datapoints.py b/tests/tests_integration/test_api/test_datapoints.py index 73be0624e5..b7454776f6 100644 --- a/tests/tests_integration/test_api/test_datapoints.py +++ b/tests/tests_integration/test_api/test_datapoints.py @@ -1,23 +1,118 @@ +""" +Note: If tests related to fetching datapoints are broken, all time series + their datapoints can be + recreated easily by running the file linked below. 
You will need to provide a valid set of + credentials to the `CogniteClient` for the Python SDK integration test CDF project: +>>> python scripts/create_ts_for_integration_tests.py +""" +import itertools +import random import re -from datetime import datetime, timedelta -from unittest import mock +import time +from contextlib import nullcontext as does_not_raise +from datetime import datetime +from random import randint +from unittest.mock import patch -import numpy -import pandas +import numpy as np +import pandas as pd import pytest -from cognite.client import utils -from cognite.client.data_classes import DatapointsList, DatapointsQuery, TimeSeries -from cognite.client.exceptions import CogniteAPIError -from cognite.client.utils._time import MIN_TIMESTAMP_MS, timestamp_to_ms -from tests.utils import set_request_limit +from cognite.client._api.datapoint_constants import ALL_SORTED_DP_AGGS +from cognite.client.data_classes import ( + Datapoints, + DatapointsArray, + DatapointsArrayList, + DatapointsList, + DatapointsQuery, + TimeSeries, +) +from cognite.client.exceptions import CogniteAPIError, CogniteNotFoundError +from cognite.client.utils._time import ( + MAX_TIMESTAMP_MS, + MIN_TIMESTAMP_MS, + UNIT_IN_MS, + align_start_and_end_for_granularity, + granularity_to_ms, + timestamp_to_ms, +) +from tests.utils import ( + random_aggregates, + random_cognite_external_ids, + random_cognite_ids, + random_gamma_dist_integer, + random_granularity, + set_max_workers, +) + +DATAPOINTS_API = "cognite.client._api.datapoints.{}" +WEEK_MS = UNIT_IN_MS["w"] +DAY_MS = UNIT_IN_MS["d"] +YEAR_MS = { + 1950: -631152000000, + 1965: -157766400000, + 1975: 157766400000, + 2000: 946684800000, + 2014: 1388534400000, + 2020: 1577836800000, +} +DPS_TYPES = [Datapoints, DatapointsArray] +DPS_LST_TYPES = [DatapointsList, DatapointsArrayList] + +# To avoid the error "different tests were collected between...", we must make sure all parallel test-runners +# generate the same tests (randomized test data). 
We also want different random values over time (...thats the point), +# so we set seed based on time, but round to have some buffer: +random.seed(round(time.time(), -3)) @pytest.fixture(scope="session") -def test_time_series(cognite_client): - eids = ["test__constant_%d_with_noise" % i for i in range(10)] - ts = cognite_client.time_series.retrieve_multiple(external_ids=eids, ignore_unknown_ids=True) - yield {int(re.match(r"test__constant_(\d+)_with_noise", t.name).group(1)): t for t in ts} +def all_test_time_series(cognite_client): + prefix = "PYSDK integration test" + return cognite_client.time_series.retrieve_multiple( + external_ids=[ + f"{prefix} 001: outside points, numeric", + f"{prefix} 002: outside points, string", + *[f"{prefix} {i:03d}: weekly values, 1950-2000, numeric" for i in range(3, 54)], + *[f"{prefix} {i:03d}: weekly values, 1950-2000, string" for i in range(54, 104)], + f"{prefix} 104: daily values, 1965-1975, numeric", + f"{prefix} 105: hourly values, 1969-10-01 - 1970-03-01, numeric", + f"{prefix} 106: every minute, 1969-12-31 - 1970-01-02, numeric", + f"{prefix} 107: every second, 1969-12-31 23:30:00 - 1970-01-01 00:30:00, numeric", + f"{prefix} 108: every millisecond, 1969-12-31 23:59:58.500 - 1970-01-01 00:00:01.500, numeric", + f"{prefix} 109: daily values, is_step=True, 1965-1975, numeric", + f"{prefix} 110: hourly values, is_step=True, 1969-10-01 - 1970-03-01, numeric", + f"{prefix} 111: every minute, is_step=True, 1969-12-31 - 1970-01-02, numeric", + f"{prefix} 112: every second, is_step=True, 1969-12-31 23:30:00 - 1970-01-01 00:30:00, numeric", + f"{prefix} 113: every millisecond, is_step=True, 1969-12-31 23:59:58.500 - 1970-01-01 00:00:01.500, numeric", + f"{prefix} 114: 1mill dps, random distribution, 1950-2020, numeric", + f"{prefix} 115: 1mill dps, random distribution, 1950-2020, string", + f"{prefix} 116: 5mill dps, 2k dps (.1s res) burst per day, 2000-01-01 12:00:00 - 2013-09-08 12:03:19.900, numeric", + ] + ) + + +@pytest.fixture +def outside_points_ts(all_test_time_series): + return all_test_time_series[:2] + + +@pytest.fixture +def weekly_dps_ts(all_test_time_series): + return all_test_time_series[2:53], all_test_time_series[53:103] + + +@pytest.fixture +def fixed_freq_dps_ts(all_test_time_series): + return all_test_time_series[103:108], all_test_time_series[108:113] + + +@pytest.fixture +def one_mill_dps_ts(all_test_time_series): + return all_test_time_series[113], all_test_time_series[114] + + +@pytest.fixture +def ms_bursty_ts(all_test_time_series): + return all_test_time_series[115] @pytest.fixture(scope="session") @@ -28,280 +123,1042 @@ def new_ts(cognite_client): assert cognite_client.time_series.retrieve(ts.id) is None -def has_duplicates(df: pandas.DataFrame): - return df.duplicated().any() +@pytest.fixture +def retrieve_endpoints(cognite_client): + return [ + cognite_client.time_series.data.retrieve, + cognite_client.time_series.data.retrieve_arrays, + ] -def has_correct_timestamp_spacing(df: pandas.DataFrame, granularity: str): - timestamps = df.index.values.astype("datetime64[ms]").astype("int64") - deltas = numpy.diff(timestamps, 1) - granularity_ms = utils._time.granularity_to_ms(granularity) - return (deltas != 0).all() and (deltas % granularity_ms == 0).all() +def ts_to_ms(ts): + assert isinstance(ts, str) + return pd.Timestamp(ts).value // int(1e6) -@pytest.fixture -def post_spy(cognite_client): - with mock.patch.object(cognite_client.datapoints, "_post", wraps=cognite_client.datapoints._post) as _: - yield +def 
convert_any_ts_to_integer(ts): + if isinstance(ts, int): + return ts + elif isinstance(ts, np.datetime64): + return ts.astype("datetime64[ms]").astype(int) + raise ValueError -class TestDatapointsAPI: - def test_retrieve(self, cognite_client, test_time_series): - ts = test_time_series[0] - dps = cognite_client.datapoints.retrieve(id=ts.id, start="1d-ago", end="now") - assert len(dps) > 0 +def validate_raw_datapoints_lst(ts_lst, dps_lst, **kw): + assert isinstance(dps_lst, (DatapointsList, DatapointsArrayList)), "Datapoints(Array)List not given" + for ts, dps in zip(ts_lst, dps_lst): + validate_raw_datapoints(ts, dps, **kw) - def test_retrieve_before_epoch(self, cognite_client, test_time_series): - ts = test_time_series[0] - dps = cognite_client.datapoints.retrieve(id=ts.id, start=MIN_TIMESTAMP_MS, end="now") - assert len(dps) > 0 - def test_retrieve_unknown(self, cognite_client, test_time_series): - ts = test_time_series[0] - dps = cognite_client.datapoints.retrieve(id=[ts.id] + [42], start="1d-ago", end="now", ignore_unknown_ids=True) - assert 1 == len(dps) +def validate_raw_datapoints(ts, dps, check_offset=True, check_delta=True): + assert isinstance(dps, (Datapoints, DatapointsArray)), "Datapoints(Array) not given" + # Convert both dps types to arrays for simple comparisons: + # (also convert string datapoints - which are also integers) + values = np.array(dps.value, dtype=np.int64) + index = np.array(dps.timestamp, dtype=np.int64) + if isinstance(dps, DatapointsArray): + index = index // int(1e6) + # Verify index is sorted: + assert np.all(index[:-1] < index[1:]) + # Verify the actual datapoint values: + if check_offset: + offset = int(ts.metadata["offset"]) + assert np.all(index == values - offset) + # Verify spacing between points: + if check_delta: + delta = int(ts.metadata["delta"]) + assert np.all(np.diff(values) == delta) - def test_retrieve_all_unknown(self, cognite_client, test_time_series): - dps = cognite_client.datapoints.retrieve( - id=[42], external_id="missing", start="1d-ago", end="now", ignore_unknown_ids=True - ) - assert isinstance(dps, DatapointsList) - assert 0 == len(dps) + return index, values - def test_retrieve_all_unknown_single(self, cognite_client, test_time_series): - dps = cognite_client.datapoints.retrieve( - external_id="missing", start="1d-ago", end="now", ignore_unknown_ids=True - ) - assert dps is None - def test_retrieve_multiple(self, cognite_client, test_time_series): - ids = [test_time_series[0].id, test_time_series[1].id, {"id": test_time_series[2].id, "aggregates": ["max"]}] +PARAMETRIZED_VALUES_OUTSIDE_POINTS = [ + (-100, 100, False, True), + (-99, 100, True, True), + (-99, 101, True, False), + (-100, 101, False, False), +] + + +class TestRetrieveRawDatapointsAPI: + """Note: Since `retrieve` and `retrieve_arrays` endspoints should give identical results, + except for the data container types, all tests run both endpoints. 
+ """ + + @pytest.mark.parametrize("start, end, has_before, has_after", PARAMETRIZED_VALUES_OUTSIDE_POINTS) + def test_retrieve_outside_points_only( + self, retrieve_endpoints, outside_points_ts, start, end, has_before, has_after + ): + # We have 10 ms resolution data between ts = -100 and +100 + for ts, endpoint in itertools.product(outside_points_ts, retrieve_endpoints): + res = endpoint(id=ts.id, limit=0, start=start, end=end, include_outside_points=True) + index, values = validate_raw_datapoints(ts, res, check_delta=False) + assert len(res) == has_before + has_after + + if has_before or has_after: + first_ts, last_ts = index[0].item(), index[-1].item() # numpy bool != py bool + assert (start > first_ts) is has_before + assert (end <= last_ts) is has_after + if has_before: + assert start > first_ts == -100 + if has_after: + assert end <= last_ts == 100 + + @pytest.mark.parametrize("start, end, has_before, has_after", PARAMETRIZED_VALUES_OUTSIDE_POINTS) + def test_retrieve_outside_points_nonzero_limit( + self, retrieve_endpoints, outside_points_ts, start, end, has_before, has_after + ): + for ts, endpoint, limit in itertools.product(outside_points_ts, retrieve_endpoints, [3, None]): + res = endpoint(id=ts.id, limit=limit, start=start, end=end, include_outside_points=True) + index, values = validate_raw_datapoints(ts, res, check_delta=limit is None) + if limit == 3: + assert len(res) - 3 == has_before + has_after + first_ts, last_ts = index[0].item(), index[-1].item() # numpy bool != py bool + assert (start > first_ts) is has_before + assert (end <= last_ts) is has_after + if has_before: + assert start > first_ts == -100 + if has_after: + assert end <= last_ts == 100 + + @pytest.mark.parametrize("start, end, has_before, has_after", PARAMETRIZED_VALUES_OUTSIDE_POINTS) + def test_retrieve_outside_points__query_limit_plusminus1_tests( + self, retrieve_endpoints, outside_points_ts, start, end, has_before, has_after + ): + limit = 3 + for dps_limit in range(limit - 1, limit + 2): + with patch(DATAPOINTS_API.format("DPS_LIMIT"), dps_limit): + for ts, endpoint in itertools.product(outside_points_ts, retrieve_endpoints): + res = endpoint(id=ts.id, limit=limit, start=start, end=end, include_outside_points=True) + index, values = validate_raw_datapoints(ts, res, check_delta=False) + assert len(res) - 3 == has_before + has_after + first_ts, last_ts = index[0].item(), index[-1].item() # numpy bool != py bool + assert (start > first_ts) is has_before + assert (end <= last_ts) is has_after + if has_before: + assert start > first_ts == -100 + if has_after: + assert end <= last_ts == 100 + + @pytest.mark.parametrize( + "start, end, exp_first_ts, exp_last_ts", + # fmt: off + [ + (631670400000 + 1, 693964800000, 631670400000, 693964800000), # noqa: E241 + (631670400000, 693964800000, 631670400000 - WEEK_MS, 693964800000), # noqa: E241 + (631670400000, 693964800000 + 1, 631670400000 - WEEK_MS, 693964800000 + WEEK_MS), # noqa: E241 + (631670400000 + 1, 693964800000 + 1, 631670400000, 693964800000 + WEEK_MS), # noqa: E241 + ], + # fmt: on + ) + def test_retrieve_outside_points__query_chunking_mode( + self, start, end, exp_first_ts, exp_last_ts, cognite_client, retrieve_endpoints, weekly_dps_ts + ): + ts_lst = weekly_dps_ts[0] + weekly_dps_ts[1] # chain numeric & string + limits = [0, 1, 50, int(1e9), None] # None ~ 100 dps (max dps returned) + with set_max_workers(cognite_client, 5), patch(DATAPOINTS_API.format("EagerDpsFetcher")): + # `n_ts` is per identifier (id + xid). 
At least 3, since 3 x 2 > 5 + for n_ts, endpoint in itertools.product([3, 10, 50], retrieve_endpoints): + id_ts_lst, xid_ts_lst = random.sample(ts_lst, k=n_ts), random.sample(ts_lst, k=n_ts) + res_lst = endpoint( + external_id=[{"external_id": ts.external_id, "limit": random.choice(limits)} for ts in xid_ts_lst], + id=[{"id": ts.id, "limit": random.choice(limits)} for ts in id_ts_lst], + start=start, + end=end, + include_outside_points=True, + ) + requested_ts = id_ts_lst + xid_ts_lst + for ts, res in zip(requested_ts, res_lst): + index, values = validate_raw_datapoints(ts, res, check_delta=False) + assert exp_first_ts == index[0] + assert exp_last_ts == index[-1] + + @pytest.mark.parametrize( + "n_ts, identifier", + [ + (1, "id"), + (1, "external_id"), + (2, "id"), + (2, "external_id"), + (5, "id"), + (5, "external_id"), + ], + ) + def test_retrieve_raw_eager_mode_single_identifier_type( + self, cognite_client, n_ts, identifier, retrieve_endpoints, weekly_dps_ts + ): + # We patch out ChunkingDpsFetcher to make sure we fail if we're not in eager mode: + with set_max_workers(cognite_client, 5), patch(DATAPOINTS_API.format("ChunkingDpsFetcher")): + for ts_lst, endpoint, limit in itertools.product(weekly_dps_ts, retrieve_endpoints, [0, 50, None]): + ts_lst = random.sample(ts_lst, n_ts) + res_lst = endpoint( + **{identifier: [getattr(ts, identifier) for ts in ts_lst]}, + limit=limit, + start=MIN_TIMESTAMP_MS, + end=MAX_TIMESTAMP_MS, + include_outside_points=False, + ) + validate_raw_datapoints_lst(ts_lst, res_lst) + exp_len = 2609 if limit is None else limit + for res in res_lst: + assert len(res) == exp_len + + @pytest.mark.parametrize( + "n_ts, identifier", + [ + (3, "id"), + (3, "external_id"), + (10, "id"), + (10, "external_id"), + (50, "id"), + (50, "external_id"), + ], + ) + def test_retrieve_raw_chunking_mode_single_identifier_type( + self, cognite_client, n_ts, identifier, retrieve_endpoints, weekly_dps_ts + ): + # We patch out EagerDpsFetcher to make sure we fail if we're not in chunking mode: + with set_max_workers(cognite_client, 2), patch(DATAPOINTS_API.format("EagerDpsFetcher")): + for ts_lst, endpoint, limit in itertools.product(weekly_dps_ts, retrieve_endpoints, [0, 50, None]): + ts_lst = random.sample(ts_lst, n_ts) + res_lst = endpoint( + **{identifier: [getattr(ts, identifier) for ts in ts_lst]}, + limit=limit, + start=MIN_TIMESTAMP_MS, + end=MAX_TIMESTAMP_MS, + include_outside_points=False, + ) + validate_raw_datapoints_lst(ts_lst, res_lst) + exp_len = 2609 if limit is None else limit + for res in res_lst: + assert len(res) == exp_len + + @pytest.mark.parametrize( + "n_ts, ignore_unknown_ids, mock_out_eager_or_chunk, expected_raise", + [ + (1, True, "ChunkingDpsFetcher", does_not_raise()), + (1, False, "ChunkingDpsFetcher", pytest.raises(CogniteNotFoundError, match=re.escape("Not found: ["))), + (3, True, "ChunkingDpsFetcher", does_not_raise()), + (3, False, "ChunkingDpsFetcher", pytest.raises(CogniteNotFoundError, match=re.escape("Not found: ["))), + (10, True, "EagerDpsFetcher", does_not_raise()), + (10, False, "EagerDpsFetcher", pytest.raises(CogniteNotFoundError, match=re.escape("Not found: ["))), + (50, True, "EagerDpsFetcher", does_not_raise()), + (50, False, "EagerDpsFetcher", pytest.raises(CogniteNotFoundError, match=re.escape("Not found: ["))), + ], + ) + def test_retrieve_unknown__check_raises_or_returns_existing_only( + self, + n_ts, + ignore_unknown_ids, + mock_out_eager_or_chunk, + expected_raise, + cognite_client, + retrieve_endpoints, + all_test_time_series, + 
): + ts_exists = all_test_time_series[0] + with set_max_workers(cognite_client, 9), patch(DATAPOINTS_API.format(mock_out_eager_or_chunk)): + identifier = { + "id": [ts_exists.id] + random_cognite_ids(n_ts), + "external_id": [ts_exists.external_id] + random_cognite_external_ids(n_ts), + } + drop_id = random.choice(["id", "external_id", "keep"]) + if drop_id != "keep": + identifier.pop(drop_id) + for endpoint in retrieve_endpoints: + with expected_raise: + res_lst = endpoint( + **identifier, + start=randint(MIN_TIMESTAMP_MS, 100), + end=randint(101, MAX_TIMESTAMP_MS), + ignore_unknown_ids=ignore_unknown_ids, + limit=5, + ) + exp_len = 2 if drop_id == "keep" else 1 + assert exp_len == len(res_lst) + validate_raw_datapoints_lst([ts_exists] * exp_len, res_lst) + + @pytest.mark.parametrize( + "max_workers, mock_out_eager_or_chunk, ids, external_ids, exp_res_types", + [ + # Single identifier given as base type (int/str) or as dict + (1, "ChunkingDpsFetcher", random_cognite_ids(1)[0], None, [None, None]), + (1, "ChunkingDpsFetcher", None, random_cognite_external_ids(1)[0], [None, None]), + (1, "ChunkingDpsFetcher", {"id": random_cognite_ids(1)[0]}, None, [None, None]), + (1, "ChunkingDpsFetcher", None, {"external_id": random_cognite_external_ids(1)[0]}, [None, None]), + # Single identifier given as length-1 list: + (1, "ChunkingDpsFetcher", random_cognite_ids(1), None, DPS_LST_TYPES), + (1, "ChunkingDpsFetcher", None, random_cognite_external_ids(1), DPS_LST_TYPES), + # Single identifier given by BOTH id and external id: + (2, "ChunkingDpsFetcher", random_cognite_ids(1)[0], random_cognite_external_ids(1)[0], DPS_LST_TYPES), + ( + 2, + "ChunkingDpsFetcher", + {"id": random_cognite_ids(1)[0]}, + {"external_id": random_cognite_external_ids(1)[0]}, + DPS_LST_TYPES, + ), + ( + 2, + "ChunkingDpsFetcher", + {"id": random_cognite_ids(1)[0]}, + random_cognite_external_ids(1)[0], + DPS_LST_TYPES, + ), + ( + 2, + "ChunkingDpsFetcher", + random_cognite_ids(1)[0], + {"external_id": random_cognite_external_ids(1)[0]}, + DPS_LST_TYPES, + ), + (1, "EagerDpsFetcher", random_cognite_ids(1)[0], random_cognite_external_ids(1)[0], DPS_LST_TYPES), + ( + 1, + "EagerDpsFetcher", + {"id": random_cognite_ids(1)[0]}, + {"external_id": random_cognite_external_ids(1)[0]}, + DPS_LST_TYPES, + ), + ( + 1, + "EagerDpsFetcher", + random_cognite_ids(1)[0], + {"external_id": random_cognite_external_ids(1)[0]}, + DPS_LST_TYPES, + ), + (1, "EagerDpsFetcher", {"id": random_cognite_ids(1)[0]}, random_cognite_external_ids(1)[0], DPS_LST_TYPES), + # Multiple identifiers given by single identifier: + (4, "ChunkingDpsFetcher", random_cognite_ids(3), None, DPS_LST_TYPES), + (4, "ChunkingDpsFetcher", None, random_cognite_external_ids(3), DPS_LST_TYPES), + (2, "EagerDpsFetcher", random_cognite_ids(3), None, DPS_LST_TYPES), + (2, "EagerDpsFetcher", None, random_cognite_external_ids(3), DPS_LST_TYPES), + # Multiple identifiers given by BOTH identifiers: + (5, "ChunkingDpsFetcher", random_cognite_ids(2), random_cognite_external_ids(2), DPS_LST_TYPES), + (5, "ChunkingDpsFetcher", random_cognite_ids(2), random_cognite_external_ids(2), DPS_LST_TYPES), + (3, "EagerDpsFetcher", random_cognite_ids(2), random_cognite_external_ids(2), DPS_LST_TYPES), + (3, "EagerDpsFetcher", random_cognite_ids(2), random_cognite_external_ids(2), DPS_LST_TYPES), + ( + 5, + "ChunkingDpsFetcher", + [{"id": id} for id in random_cognite_ids(2)], + [{"external_id": xid} for xid in random_cognite_external_ids(2)], + DPS_LST_TYPES, + ), + ( + 3, + "EagerDpsFetcher", + [{"id": id} 
for id in random_cognite_ids(2)], + [{"external_id": xid} for xid in random_cognite_external_ids(2)], + DPS_LST_TYPES, + ), + ], + ) + def test_retrieve__all_unknown_single_multiple_given( + self, max_workers, mock_out_eager_or_chunk, ids, external_ids, exp_res_types, cognite_client, retrieve_endpoints + ): + with set_max_workers(cognite_client, max_workers), patch(DATAPOINTS_API.format(mock_out_eager_or_chunk)): + for endpoint, exp_res_type in zip(retrieve_endpoints, exp_res_types): + res = endpoint( + id=ids, + external_id=external_ids, + ignore_unknown_ids=True, + ) + if exp_res_type is None: + assert res is None + else: + assert isinstance(res, exp_res_type) + assert len(res) == 0 + + @pytest.mark.parametrize( + "max_workers, mock_out_eager_or_chunk, identifier, exp_res_types", + [ + (1, "ChunkingDpsFetcher", "id", DPS_TYPES), + (1, "ChunkingDpsFetcher", "external_id", DPS_TYPES), + ], + ) + def test_retrieve_nothing__single( + self, + max_workers, + mock_out_eager_or_chunk, + identifier, + exp_res_types, + outside_points_ts, + retrieve_endpoints, + cognite_client, + ): + ts = outside_points_ts[0] + with set_max_workers(cognite_client, max_workers), patch(DATAPOINTS_API.format(mock_out_eager_or_chunk)): + for endpoint, exp_res_type in zip(retrieve_endpoints, exp_res_types): + res = endpoint(**{identifier: getattr(ts, identifier)}, start=1, end=9) + assert isinstance(res, exp_res_type) + assert len(res) == 0 + assert isinstance(res.is_step, bool) + assert isinstance(res.is_string, bool) + + @pytest.mark.parametrize( + "max_workers, mock_out_eager_or_chunk, exp_res_types", + [ + (1, "EagerDpsFetcher", DPS_LST_TYPES), + (3, "ChunkingDpsFetcher", DPS_LST_TYPES), + ], + ) + def test_retrieve_nothing__multiple( + self, max_workers, mock_out_eager_or_chunk, exp_res_types, outside_points_ts, retrieve_endpoints, cognite_client + ): + ts1, ts2 = outside_points_ts + with set_max_workers(cognite_client, max_workers), patch(DATAPOINTS_API.format(mock_out_eager_or_chunk)): + for endpoint, exp_res_type in zip(retrieve_endpoints, exp_res_types): + res = endpoint(id=ts1.id, external_id=[ts2.external_id], start=1, end=9) + assert isinstance(res, exp_res_type) + assert len(res) == 2 + for r in res: + assert len(r) == 0 + assert isinstance(r.is_step, bool) + assert isinstance(r.is_string, bool) + + +class TestRetrieveAggregateDatapointsAPI: + @pytest.mark.parametrize( + "is_step, start, end, exp_start, exp_end, max_workers, mock_out_eager_or_chunk", + ( + (True, MIN_TIMESTAMP_MS, MAX_TIMESTAMP_MS, YEAR_MS[1965], YEAR_MS[1975], 4, "ChunkingDpsFetcher"), + (False, MIN_TIMESTAMP_MS, MAX_TIMESTAMP_MS, YEAR_MS[1965], YEAR_MS[1975], 4, "ChunkingDpsFetcher"), + (True, MIN_TIMESTAMP_MS, MAX_TIMESTAMP_MS, YEAR_MS[1965], YEAR_MS[1975], 1, "EagerDpsFetcher"), + (False, MIN_TIMESTAMP_MS, MAX_TIMESTAMP_MS, YEAR_MS[1965], YEAR_MS[1975], 1, "EagerDpsFetcher"), + (True, YEAR_MS[1965], YEAR_MS[1975], YEAR_MS[1965], YEAR_MS[1975] - DAY_MS, 4, "ChunkingDpsFetcher"), + (False, YEAR_MS[1965], YEAR_MS[1975], YEAR_MS[1965], YEAR_MS[1975] - DAY_MS, 4, "ChunkingDpsFetcher"), + (True, YEAR_MS[1965], YEAR_MS[1975], YEAR_MS[1965], YEAR_MS[1975] - DAY_MS, 1, "EagerDpsFetcher"), + (False, YEAR_MS[1965], YEAR_MS[1975], YEAR_MS[1965], YEAR_MS[1975] - DAY_MS, 1, "EagerDpsFetcher"), + (True, -WEEK_MS, WEEK_MS + 1, -WEEK_MS, WEEK_MS, 4, "ChunkingDpsFetcher"), + (False, -WEEK_MS, WEEK_MS + 1, -WEEK_MS, WEEK_MS, 4, "ChunkingDpsFetcher"), + (True, -WEEK_MS, WEEK_MS + 1, -WEEK_MS, WEEK_MS, 1, "EagerDpsFetcher"), + (False, -WEEK_MS, WEEK_MS + 
1, -WEEK_MS, WEEK_MS, 1, "EagerDpsFetcher"), + (True, -DAY_MS, DAY_MS + 1, -DAY_MS, DAY_MS, 4, "ChunkingDpsFetcher"), + (False, -DAY_MS, DAY_MS + 1, -DAY_MS, DAY_MS, 4, "ChunkingDpsFetcher"), + (True, -DAY_MS, DAY_MS + 1, -DAY_MS, DAY_MS, 1, "EagerDpsFetcher"), + (False, -DAY_MS, DAY_MS + 1, -DAY_MS, DAY_MS, 1, "EagerDpsFetcher"), + ), + ) + def test_sparse_data__multiple_granularities_is_step_true_false( + self, + is_step, + start, + end, + exp_start, + exp_end, + max_workers, + mock_out_eager_or_chunk, + cognite_client, + retrieve_endpoints, + fixed_freq_dps_ts, + ): + # Underlying time series has daily values, we ask for 1d, 1h, 1m and 1s and make sure all share + # the exact same timestamps. Interpolation aggregates tested separately because they return data + # also in empty regions... why:( + (ts_daily, *_), (ts_daily_is_step, *_) = fixed_freq_dps_ts + ts, exclude = ts_daily, {"step_interpolation"} + if is_step: + exclude.add("interpolation") + ts = ts_daily_is_step + + assert ts.is_step is is_step + + with set_max_workers(cognite_client, max_workers), patch(DATAPOINTS_API.format(mock_out_eager_or_chunk)): + for endpoint in retrieve_endpoints: + # Give each "granularity" a random list of aggregates: + aggs = [random_aggregates(exclude=exclude) for _ in range(4)] + res = endpoint( + start=start, + end=end, + id=[ + {"id": ts.id, "granularity": "1d", "aggregates": aggs[0]}, + {"id": ts.id, "granularity": "1h", "aggregates": aggs[1]}, + ], + external_id=[ + {"external_id": ts.external_id, "granularity": "1m", "aggregates": aggs[2]}, + {"external_id": ts.external_id, "granularity": "1s", "aggregates": aggs[3]}, + ], + ) + assert ((df := res.to_pandas()).isna().sum() == 0).all() + assert df.index[0] == pd.Timestamp(exp_start, unit="ms") + assert df.index[-1] == pd.Timestamp(exp_end, unit="ms") + + @pytest.mark.parametrize( + "is_step, aggregates, empty, before_first_dp", + ( + (False, ["interpolation"], True, False), + (False, ["interpolation"], True, True), + (False, ["step_interpolation"], False, False), + (False, ["step_interpolation"], True, True), + (True, ["interpolation"], False, False), + (True, ["interpolation"], True, True), + (True, ["step_interpolation"], False, False), + (True, ["step_interpolation"], True, True), + ), + ) + def test_interpolation_returns_data_from_empty_periods_before_and_after_data( + self, + is_step, + aggregates, + empty, + before_first_dp, + retrieve_endpoints, + fixed_freq_dps_ts, + ): + # Ts: has ms resolution data from: + # 1969-12-31 23:59:58.500 (-1500 ms) to 1970-01-01 00:00:01.500 (1500 ms) + (*_, ts_ms), (*_, ts_ms_is_step) = fixed_freq_dps_ts + ts = ts_ms_is_step if is_step else ts_ms + assert ts.is_step is is_step + + # Pick random start and end in an empty region: + if before_first_dp: + start = randint(MIN_TIMESTAMP_MS, -315619200000) # 1900 -> 1960 + end = randint(start, -31536000000) # start -> 1969 + else: # after last dp + start = randint(31536000000, 2524608000000) # 1971 -> 2050 + end = randint(start, MAX_TIMESTAMP_MS) # start -> (2051 minus 1ms) + granularities = f"{randint(1, 15)}d", f"{randint(1, 50)}h", f"{randint(1, 120)}m", f"{randint(1, 120)}s" + + for endpoint in retrieve_endpoints: + res_lst = endpoint( + start=start, + end=end, + aggregates=aggregates, + id=[ + {"id": ts.id, "granularity": granularities[0]}, + {"id": ts.id, "granularity": granularities[1]}, + ], + external_id=[ + {"external_id": ts.external_id, "granularity": granularities[2]}, + {"external_id": ts.external_id, "granularity": granularities[3]}, + # Verify 
empty with `count`: + {"external_id": ts.external_id, "granularity": granularities[0], "aggregates": ["count"]}, + ], + ) + *interp_res_lst, count_dps = res_lst + assert sum(count_dps.count) == 0 + + for dps, gran in zip(interp_res_lst, granularities): + interp_dps = getattr(dps, aggregates[0]) + assert (len(interp_dps) == 0) is empty - dps = cognite_client.datapoints.retrieve( - id=ids, start="6h-ago", end="now", aggregates=["min"], granularity="1s" + if not empty: + first_ts = convert_any_ts_to_integer(dps.timestamp[0]) + aligned_start, _ = align_start_and_end_for_granularity(start, end, gran) + assert first_ts == aligned_start + + @pytest.mark.parametrize( + "max_workers, n_ts, mock_out_eager_or_chunk", + [ + (1, 1, "ChunkingDpsFetcher"), + (5, 1, "ChunkingDpsFetcher"), + (5, 5, "ChunkingDpsFetcher"), + (1, 2, "EagerDpsFetcher"), + (1, 10, "EagerDpsFetcher"), + (9, 10, "EagerDpsFetcher"), + (9, 50, "EagerDpsFetcher"), + ], + ) + def test_retrieve_aggregates__string_ts_raises( + self, max_workers, n_ts, mock_out_eager_or_chunk, weekly_dps_ts, cognite_client, retrieve_endpoints + ): + _, string_ts = weekly_dps_ts + with set_max_workers(cognite_client, max_workers), patch(DATAPOINTS_API.format(mock_out_eager_or_chunk)): + ts_chunk = random.sample(string_ts, k=n_ts) + for endpoint in retrieve_endpoints: + with pytest.raises(CogniteAPIError) as exc: + endpoint( + granularity=random_granularity(), + aggregates=random_aggregates(), + id=[ts.id for ts in ts_chunk], + ignore_unknown_ids=random.choice((True, False)), + ) + assert exc.value.code == 400 + assert exc.value.message == "Aggregates are not supported for string time series" + + @pytest.mark.parametrize( + "granularity, lower_lim, upper_lim", + ( + ("h", 30, 1000), + ("d", 1, 200), + ), + ) + def test_granularity_invariants(self, granularity, lower_lim, upper_lim, one_mill_dps_ts, retrieve_endpoints): + # Sum of count and sum of sum is independent of granularity + ts, _ = one_mill_dps_ts + for endpoint in retrieve_endpoints: + res = endpoint( + start=MIN_TIMESTAMP_MS, + end=MAX_TIMESTAMP_MS, + id={"id": ts.id, "aggregates": ["count", "sum"]}, + granularity=random_granularity(granularity, lower_lim, upper_lim), + ) + assert sum(res.count) == 1_000_000 + assert sum(res.sum) == 500_000 + + @pytest.mark.parametrize( + "first_gran, second_gran, start", + ( + ("60s", "1m", 86572008555), + ("120s", "2m", 27340402091), + ("60m", "1h", -357464206106), + ("120m", "2h", -150117679983), + ("24h", "1d", 114399466017), + ("48h", "2d", -170931071253), + ("240h", "10d", 366850985031), + ("4800h", "200d", -562661581583), + ), + ) + def test_can_be_equivalent_granularities(self, first_gran, second_gran, start, one_mill_dps_ts, retrieve_endpoints): + ts, _ = one_mill_dps_ts # data: 1950-2020 + gran_ms = granularity_to_ms(first_gran) + for endpoint in retrieve_endpoints: + end = start + gran_ms * random.randint(10, 1000) + start_aligned, end_aligned = align_start_and_end_for_granularity(start, end, second_gran) + res_lst = endpoint( + aggregates=random_aggregates(), + id=[ + # These should return different results: + {"id": ts.id, "granularity": first_gran, "start": start, "end": end}, + {"id": ts.id, "granularity": second_gran, "start": start, "end": end}, + ], + external_id=[ + # These should return identical results (up to float precision): + { + "external_id": ts.external_id, + "granularity": first_gran, + "start": start_aligned, + "end": end_aligned, + }, + { + "external_id": ts.external_id, + "granularity": second_gran, + "start": start_aligned, + 
"end": end_aligned, + }, + ], + ) + dps1, dps2, dps3, dps4 = res_lst + pd.testing.assert_frame_equal(dps3.to_pandas(), dps4.to_pandas()) + with pytest.raises(AssertionError): + pd.testing.assert_frame_equal(dps1.to_pandas(), dps2.to_pandas()) + + @pytest.mark.parametrize( + "max_workers, ts_idx, granularity, exp_len, start, end, exlude_step_interp", + ( + (1, 105, "8m", 81, ts_to_ms("1969-12-31 14:14:14"), ts_to_ms("1970-01-01 01:01:01"), False), + (1, 106, "7s", 386, ts_to_ms("1960"), ts_to_ms("1970-01-01 00:15:00"), False), + (8, 106, "7s", 386, ts_to_ms("1960"), ts_to_ms("1970-01-01 00:15:00"), False), + (2, 107, "1s", 4, ts_to_ms("1969-12-31 23:59:58.123"), ts_to_ms("2049-01-01 00:00:01.500"), True), + (5, 113, "11h", 32_288, ts_to_ms("1960-01-02 03:04:05.060"), ts_to_ms("2000-07-08 09:10:11.121"), True), + (3, 115, "1s", 200, ts_to_ms("2000-01-01"), ts_to_ms("2000-01-01 12:03:20"), False), + (20, 115, "12h", 5_000, ts_to_ms("1990-01-01"), ts_to_ms("2013-09-09 00:00:00.001"), True), + ), + ) + def test_eager_fetcher_unlimited( + self, + max_workers, + ts_idx, + granularity, + exp_len, + start, + end, + exlude_step_interp, + retrieve_endpoints, + all_test_time_series, + cognite_client, + ): + exclude = {"step_interpolation"} if exlude_step_interp else set() + with set_max_workers(cognite_client, max_workers), patch(DATAPOINTS_API.format("ChunkingDpsFetcher")): + for endpoint in retrieve_endpoints: + res = endpoint( + id=all_test_time_series[ts_idx].id, + start=start, + end=end, + granularity=granularity, + aggregates=random_aggregates(exclude=exclude), + limit=None, + ) + assert len(res) == exp_len + assert len(set(np.diff(res.timestamp))) == 1 + + def test_eager_chunking_unlimited(self, retrieve_endpoints, all_test_time_series, cognite_client): + # We run all test from Eager (above) at the same time: + test_setup_data = ( + (105, "8m", 81, ts_to_ms("1969-12-31 14:14:14"), ts_to_ms("1970-01-01 01:01:01"), False), + (106, "7s", 386, ts_to_ms("1960"), ts_to_ms("1970-01-01 00:15:00"), False), + (106, "7s", 386, ts_to_ms("1960"), ts_to_ms("1970-01-01 00:15:00"), False), + (107, "1s", 4, ts_to_ms("1969-12-31 23:59:58.123"), ts_to_ms("2049-01-01 00:00:01.500"), True), + (113, "11h", 32_288, ts_to_ms("1960-01-02 03:04:05.060"), ts_to_ms("2000-07-08 09:10:11.121"), True), + (115, "1s", 200, ts_to_ms("2000-01-01"), ts_to_ms("2000-01-01 12:03:20"), False), + (115, "12h", 5_000, ts_to_ms("1990-01-01"), ts_to_ms("2013-09-09 00:00:00.001"), True), ) - df = dps.to_pandas(column_names="id") - assert "{}|min".format(test_time_series[0].id) in df.columns - assert "{}|min".format(test_time_series[1].id) in df.columns - assert "{}|max".format(test_time_series[2].id) in df.columns - assert 0 < df.shape[0] - assert 3 == df.shape[1] - assert has_correct_timestamp_spacing(df, "1s") - for dpl in dps: - assert dpl.is_step is not None - assert dpl.is_string is not None - - def test_retrieve_nothing(self, cognite_client, test_time_series): - dpl = cognite_client.datapoints.retrieve(id=test_time_series[0].id, start=0, end=1) - assert 0 == len(dpl) - assert dpl.is_step is not None - assert dpl.is_string is not None - - def test_retrieve_multiple_with_exception(self, cognite_client, test_time_series): - with pytest.raises(CogniteAPIError): - cognite_client.datapoints.retrieve( - id=[test_time_series[0].id, test_time_series[1].id, 0], - start="1m-ago", - end="now", - aggregates=["min"], + with set_max_workers(cognite_client, 2), patch(DATAPOINTS_API.format("EagerDpsFetcher")): + ids = [ + { + "id": 
all_test_time_series[idx].id, + "start": start, + "end": end, + "granularity": gran, + "aggregates": random_aggregates(exclude={"step_interpolation"} if exclude else set()), + } + for idx, gran, _, start, end, exclude in test_setup_data + ] + for endpoint in retrieve_endpoints: + res_lst = endpoint(id=ids, limit=None) + for row, res in zip(test_setup_data, res_lst): + exp_len = row[2] + assert len(res) == exp_len + assert len(set(np.diff(res.timestamp))) == 1 + + @pytest.mark.parametrize( + "max_workers, n_ts, mock_out_eager_or_chunk, use_bursty", + ( + (3, 1, "ChunkingDpsFetcher", True), + (3, 1, "ChunkingDpsFetcher", False), + (10, 10, "ChunkingDpsFetcher", True), + (10, 10, "ChunkingDpsFetcher", False), + (1, 2, "EagerDpsFetcher", True), + (1, 2, "EagerDpsFetcher", False), + (2, 10, "EagerDpsFetcher", True), + (2, 10, "EagerDpsFetcher", False), + ), + ) + def test_finite_limit( + self, + max_workers, + n_ts, + mock_out_eager_or_chunk, + use_bursty, + cognite_client, + ms_bursty_ts, + one_mill_dps_ts, + retrieve_endpoints, + ): + if use_bursty: + ts = ms_bursty_ts # data: 2000-01-01 12:00:00 - 2013-09-08 12:03:19.900 + start, end, gran_unit_upper = YEAR_MS[2000], YEAR_MS[2014], 60 + else: + ts, _ = one_mill_dps_ts # data: 1950-2020 ~25k days + start, end, gran_unit_upper = YEAR_MS[1950], YEAR_MS[2020], 120 + with set_max_workers(cognite_client, max_workers), patch(DATAPOINTS_API.format(mock_out_eager_or_chunk)): + for endpoint in retrieve_endpoints: + limits = random.sample(range(2000), k=n_ts) + res_lst = endpoint( + start=start, + end=end, + id=[ + { + "id": ts.id, + "limit": lim, + "aggregates": random_aggregates(1), + "granularity": f"{random_gamma_dist_integer(gran_unit_upper)}{random.choice('smh')}", + } + for lim in limits + ], + ) + for res, exp_lim in zip(res_lst, limits): + assert len(res) == exp_lim + + def test_edge_case_all_aggs_missing(self, one_mill_dps_ts, retrieve_endpoints): + xid = one_mill_dps_ts[0].external_id + for endpoint in retrieve_endpoints: + res = endpoint( + start=ts_to_ms("1970-09-05 12:00:06"), + end=ts_to_ms("1970-09-05 19:49:40"), + external_id=xid, granularity="1s", + aggregates=["average", "interpolation"], ) + # Each dp is more than 1h apart, leading to all-nans for interp. 
agg only: + df = res.to_pandas(include_aggregate_name=True, column_names="external_id") + assert df[f"{xid}|interpolation"].isna().all() # SDK bug v<5, would be all None + assert df[f"{xid}|average"].notna().all() - def test_stop_pagination_in_time(self, cognite_client, test_time_series, post_spy): - lim = 152225 - ts = test_time_series[0] - dps = cognite_client.datapoints.retrieve(id=ts.id, start=0, end="now", limit=lim) - assert lim == len(dps) - # first page 100k, counts 1, paginate 2 windows (+1 potential for 1st day uncertainty) - assert 3 <= cognite_client.datapoints._post.call_count <= 4 - - def test_retrieve_include_outside_points(self, cognite_client, test_time_series): - ts = test_time_series[0] - start = utils._time.timestamp_to_ms("6h-ago") - end = utils._time.timestamp_to_ms("1h-ago") - dps_wo_outside = cognite_client.datapoints.retrieve( - id=ts.id, start=start, end=end, include_outside_points=False - ) - dps_w_outside = cognite_client.datapoints.retrieve(id=ts.id, start=start, end=end, include_outside_points=True) - assert not has_duplicates(dps_w_outside.to_pandas()) - assert len(dps_wo_outside) + 1 <= len(dps_w_outside) <= len(dps_wo_outside) + 2 - - def test_retrieve_include_outside_points_paginate_no_outside(self, cognite_client, test_time_series, post_spy): - ts = test_time_series[0] - start = utils._time.timestamp_to_ms("156w-ago") - end = utils._time.timestamp_to_ms("1h-ago") - test_lim = 250 - dps_non_outside = cognite_client.datapoints.retrieve(id=ts.id, start=start, end=end, limit=1234) - - tmp = cognite_client.datapoints._DPS_LIMIT - cognite_client.datapoints._DPS_LIMIT = test_lim - dps = cognite_client.datapoints.retrieve( - id=ts.id, start=start, end=end, include_outside_points=True, limit=1234 - ) - cognite_client.datapoints._DPS_LIMIT = tmp - assert len(dps) == len(dps_non_outside) - assert not has_duplicates(dps.to_pandas()) - ts_outside = set(dps.timestamp) - set(dps_non_outside.timestamp) - assert 0 == len(ts_outside) - assert set(dps.timestamp) == set(dps_non_outside.timestamp) - - def test_retrieve_include_outside_points_paginate_outside_exists(self, cognite_client, test_time_series, post_spy): - ts = test_time_series[0] - start = utils._time.timestamp_to_ms("12h-ago") - end = utils._time.timestamp_to_ms("1h-ago") - test_lim = 2500 - dps_non_outside = cognite_client.datapoints.retrieve(id=ts.id, start=start, end=end) - tmp = cognite_client.datapoints._DPS_LIMIT - cognite_client.datapoints._DPS_LIMIT = test_lim - dps = cognite_client.datapoints.retrieve(id=ts.id, start=start, end=end, include_outside_points=True) - cognite_client.datapoints._DPS_LIMIT = tmp - ts_outside = set(dps.timestamp) - set(dps_non_outside.timestamp) - assert 2 == len(ts_outside) - for ts in ts_outside: - assert ts < timestamp_to_ms(start) or ts >= timestamp_to_ms(end) - assert set(dps.timestamp) - ts_outside == set(dps_non_outside.timestamp) - - def test_retrieve_dataframe(self, cognite_client, test_time_series): - ts = test_time_series[0] - df = cognite_client.datapoints.retrieve_dataframe( - id=ts.id, start="6h-ago", end="now", aggregates=["average"], granularity="1s" + +class TestRetrieveDataFrameAPI: + """The `retrieve_dataframe` endpoint uses `retrieve_arrays` under the hood, so lots of tests + do not need to be repeated. 
+ """ + + @pytest.mark.parametrize( + "max_workers, n_ts, mock_out_eager_or_chunk, outside, exp_len", + ( + (1, 1, "ChunkingDpsFetcher", True, 524), + (1, 1, "ChunkingDpsFetcher", False, 524 - 2), + (2, 5, "EagerDpsFetcher", True, 524), + (2, 5, "EagerDpsFetcher", False, 524 - 2), + ), + ) + def test_raw_dps(self, max_workers, n_ts, mock_out_eager_or_chunk, outside, exp_len, cognite_client, weekly_dps_ts): + ts_lst_numeric, ts_lst_string = weekly_dps_ts + for exp_dtype, ts_lst in zip([np.float64, object], weekly_dps_ts): + ts_sample = random.sample(ts_lst, k=n_ts) + res_df = cognite_client.time_series.data.retrieve_dataframe( + id=[ts.id for ts in ts_sample], + start=YEAR_MS[1965], + end=YEAR_MS[1975], + include_outside_points=outside, + ) + assert res_df.isna().sum().sum() == 0 + assert res_df.shape == (exp_len, n_ts) + assert res_df.dtypes.nunique() == 1 + assert res_df.dtypes.iloc[0] == exp_dtype + + @pytest.mark.parametrize( + "uniform, exp_n_ts_delta, exp_n_nans_step_interp", + ( + (True, 1, 1), + (False, 2, 0), + ), + ) + def test_agg_uniform_true_false( + self, uniform, exp_n_ts_delta, exp_n_nans_step_interp, cognite_client, one_mill_dps_ts + ): + ts, _ = one_mill_dps_ts + with set_max_workers(cognite_client, 1): + res_df = cognite_client.time_series.data.retrieve_dataframe( + id=ts.id, + start=YEAR_MS[1965], + end=YEAR_MS[1975], + granularity="3h", + aggregates=["step_interpolation", "average"], + uniform_index=uniform, + ) + assert len(set(np.diff(res_df.index))) == exp_n_ts_delta + assert res_df[f"{ts.external_id}|step_interpolation"].isna().sum() == exp_n_nans_step_interp + assert (res_df.count().values == [28994, 29215]).all() + + @pytest.mark.parametrize("limit", (0, 1, 2)) + def test_low_limits(self, limit, cognite_client, one_mill_dps_ts): + ts_numeric, ts_string = one_mill_dps_ts + res_df = cognite_client.time_series.data.retrieve_dataframe( + # Raw dps: + id=[ + ts_string.id, + {"id": ts_string.id, "include_outside_points": True}, + ts_numeric.id, + {"id": ts_numeric.id, "include_outside_points": True}, + ], + # Agg dps: + external_id={ + "external_id": ts_numeric.external_id, + "granularity": random_granularity(upper_lim=120), + # Exclude count (only non-float agg) and (step_)interpolation which might yield nans: + "aggregates": random_aggregates(exclude={"count", "interpolation", "step_interpolation"}), + }, + start=random.randint(YEAR_MS[1950], YEAR_MS[2000]), + end=ts_to_ms("2019-12-01"), + limit=limit, ) - assert df.shape[0] > 0 - assert df.shape[1] == 1 - assert has_correct_timestamp_spacing(df, "1s") + # We have duplicates in df.columns, so to test specific columns, we reset first: + res_df.columns = c1, c2, c3, c4, *cx = range(len(res_df.columns)) + assert res_df[[c1, c2]].dtypes.unique() == [object] + assert res_df[[c3, c4, *cx]].dtypes.unique() == [np.float64] + assert (res_df[[c1, c3, *cx]].count() == [limit] * (len(cx) + 2)).all() + assert (res_df[[c2, c4]].count() == [limit + 2] * 2).all() - def test_retrieve_dataframe_missing(self, cognite_client, test_time_series): - ts = test_time_series[0] - df = cognite_client.datapoints.retrieve_dataframe( + @pytest.mark.parametrize( + "granularity_lst, aggregates_lst, limits", + ( + # Fail because of raw request: + ([None, "1s"], [None, random_aggregates(1)], [None, None]), + # Fail because of multiple granularities: + (["1m", "60s"], [random_aggregates(1), random_aggregates(1)], [None, None]), + # Fail because of finite limit: + (["1d", "1d"], [random_aggregates(1), random_aggregates(1)], [123, None]), + ), + ) + def 
test_uniform_index_fails(self, granularity_lst, aggregates_lst, limits, cognite_client, one_mill_dps_ts): + with pytest.raises(ValueError, match="Cannot return a uniform index"): + cognite_client.time_series.data.retrieve_dataframe( + uniform_index=True, + id=[ + {"id": one_mill_dps_ts[0].id, "granularity": gran, "aggregates": agg, "limit": lim} + for gran, agg, lim in zip(granularity_lst, aggregates_lst, limits) + ], + ) + + @pytest.mark.parametrize( + "include_aggregate_name, column_names", + ( + (True, "id"), + (True, "external_id"), + (False, "id"), + (False, "external_id"), + ), + ) + def test_include_aggregate_name_and_column_names_true_false( + self, include_aggregate_name, column_names, cognite_client, one_mill_dps_ts + ): + ts = one_mill_dps_ts[0] + random.shuffle(aggs := ALL_SORTED_DP_AGGS[:]) + + res_df = cognite_client.time_series.data.retrieve_dataframe( id=ts.id, - external_id="missing", - start="6h-ago", - end="now", - aggregates=["average"], - granularity="1s", - ignore_unknown_ids=True, - ) - assert df.shape[0] > 0 - assert df.shape[1] == 1 - - def test_retrieve_string(self, cognite_client): - dps = cognite_client.datapoints.retrieve(external_id="test__string_b", start="2d-ago", end="now") - assert len(dps) > 100000 - - def test_query(self, cognite_client, test_time_series): - dps_query1 = DatapointsQuery(id=test_time_series[0].id, start="6h-ago", end="now") - dps_query2 = DatapointsQuery(id=test_time_series[1].id, start="3h-ago", end="now") - dps_query3 = DatapointsQuery( - id=test_time_series[2].id, start="1d-ago", end="now", aggregates=["average"], granularity="1h" + limit=5, + granularity=random_granularity(), + aggregates=aggs, + include_aggregate_name=include_aggregate_name, + column_names=column_names, ) + for col, agg in zip(res_df.columns, ALL_SORTED_DP_AGGS): + name = str(getattr(ts, column_names)) + if include_aggregate_name: + name += f"|{agg}" + assert col == name + + def test_column_names_fails(self, cognite_client, one_mill_dps_ts): + with pytest.raises(ValueError, match=re.escape("must be one of 'id' or 'external_id'")): + cognite_client.time_series.data.retrieve_dataframe( + id=one_mill_dps_ts[0].id, limit=5, column_names="bogus_id" + ) - res = cognite_client.datapoints.query([dps_query1, dps_query2, dps_query3]) - assert len(res) == 3 - assert len(res[2][0]) < len(res[1][0]) < len(res[0][0]) - - def test_query_unknown(self, cognite_client, test_time_series): - dps_query1 = DatapointsQuery(id=test_time_series[0].id, start="6h-ago", end="now", ignore_unknown_ids=True) - dps_query2 = DatapointsQuery(id=123, start="3h-ago", end="now", ignore_unknown_ids=True) - dps_query3 = DatapointsQuery( - external_id="missing time series", - start="1d-ago", - end="now", - aggregates=["average"], - granularity="1h", - ignore_unknown_ids=True, + def test_include_aggregate_name_fails(self, cognite_client, one_mill_dps_ts): + with pytest.raises(AssertionError): + cognite_client.time_series.data.retrieve_dataframe( + id=one_mill_dps_ts[0].id, limit=5, include_aggregate_name=None + ) + + +class TestQueryDatapointsAPI: + @pytest.mark.parametrize( + "use_numpy, exp_res_lst_type", + ( + (False, DatapointsList), + (True, DatapointsArrayList), + ), + ) + def test_query_single(self, use_numpy, exp_res_lst_type, cognite_client, one_mill_dps_ts): + ts, _ = one_mill_dps_ts + query = DatapointsQuery(id=ts.id, limit=20) + res_lst = cognite_client.time_series.data.query(query, use_numpy=use_numpy) + assert isinstance(res_lst, exp_res_lst_type) + assert len(res_lst) == 1 + assert 
res_lst.get(id=ts.id).external_id == ts.external_id + + @pytest.mark.parametrize( + "use_numpy, exp_res_lst_type", + ( + (False, DatapointsList), + (True, DatapointsArrayList), + ), + ) + def test_query_multiple(self, use_numpy, exp_res_lst_type, cognite_client, ms_bursty_ts, one_mill_dps_ts): + ts_num, ts_str = one_mill_dps_ts + grans = random_granularity("sm") + aggs = random_aggregates(2) + query1 = DatapointsQuery( + ignore_unknown_ids=False, # The key test ingredient + external_id=ts_str.external_id, + id={"id": ts_num.id, "include_outside_points": True}, + limit=20, + ) + query2 = DatapointsQuery( + ignore_unknown_ids=True, # The key test ingredient + id={"id": ms_bursty_ts.id, "granularity": grans, "aggregates": aggs, "limit": 5}, + external_id=random_cognite_external_ids(3), # should be silently ignored ) - res = cognite_client.datapoints.query([dps_query1, dps_query2, dps_query3]) - assert len(res) == 3 - assert len(res[0]) == 1 - assert len(res[0][0]) > 0 - assert len(res[1]) == 0 - assert len(res[2]) == 0 - - def test_retrieve_latest(self, cognite_client, test_time_series): - ids = [test_time_series[0].id, test_time_series[1].id] - res = cognite_client.datapoints.retrieve_latest(id=ids) + res_lst = cognite_client.time_series.data.query([query1, query2], use_numpy=use_numpy) + assert isinstance(res_lst, exp_res_lst_type) + assert len(res_lst) == 3 + + @pytest.mark.parametrize( + "use_numpy, exp_res_lst_type", + ( + (False, DatapointsList), + (True, DatapointsArrayList), + ), + ) + def test_query_no_ts_exists(self, use_numpy, exp_res_lst_type, cognite_client): + ts_id = random_cognite_ids(1) + query = DatapointsQuery(id=ts_id, ignore_unknown_ids=True) + res_lst = cognite_client.time_series.data.query(query, use_numpy=use_numpy) + assert res_lst.get(id=ts_id[0]) is None # SDK bug v<5, id mapping would not exist + + +@pytest.fixture +def post_spy(cognite_client): + with patch.object(cognite_client.time_series.data, "_post", wraps=cognite_client.time_series.data._post): + yield + + +class TestRetrieveLatestDatapointsAPI: + def test_retrieve_latest(self, cognite_client, all_test_time_series): + ids = [all_test_time_series[0].id, all_test_time_series[1].id] + res = cognite_client.time_series.data.retrieve_latest(id=ids) for dps in res: assert 1 == len(dps) - def test_retrieve_latest_unknown(self, cognite_client, test_time_series): - ids = [test_time_series[0].id, test_time_series[1].id, 42, 1337] - res = cognite_client.datapoints.retrieve_latest(id=ids, ignore_unknown_ids=True) + def test_retrieve_latest_two_unknown(self, cognite_client, all_test_time_series): + ids = [all_test_time_series[0].id, all_test_time_series[1].id, 42, 1337] + res = cognite_client.time_series.data.retrieve_latest(id=ids, ignore_unknown_ids=True) assert 2 == len(res) for dps in res: assert 1 == len(dps) - def test_retrieve_latest_many(self, cognite_client, test_time_series, post_spy): - ids = [ - t.id for t in cognite_client.time_series.list(limit=12) if not t.security_categories - ] # more than one page - assert len(ids) > 10 - tmp = cognite_client.datapoints._RETRIEVE_LATEST_LIMIT - cognite_client.datapoints._RETRIEVE_LATEST_LIMIT = 10 - res = cognite_client.datapoints.retrieve_latest(id=ids, ignore_unknown_ids=True) - cognite_client.datapoints._RETRIEVE_LATEST_LIMIT = tmp - assert set(dps.id for dps in res).issubset(set(ids)) - assert 2 == cognite_client.datapoints._post.call_count + def test_retrieve_latest_many(self, cognite_client, post_spy): + ids = [t.id for t in 
cognite_client.time_series.list(limit=12) if not t.security_categories] + assert len(ids) > 10 # more than one page + + with patch(DATAPOINTS_API.format("RETRIEVE_LATEST_LIMIT"), 10): + res = cognite_client.time_series.data.retrieve_latest(id=ids, ignore_unknown_ids=True) + + assert {dps.id for dps in res}.issubset(set(ids)) + assert 2 == cognite_client.time_series.data._post.call_count # needs post_spy for dps in res: assert len(dps) <= 1 # could be empty - def test_retrieve_latest_before(self, cognite_client, test_time_series): - ts = test_time_series[0] - res = cognite_client.datapoints.retrieve_latest(id=ts.id, before="1h-ago") + def test_retrieve_latest_before(self, cognite_client, all_test_time_series): + ts = all_test_time_series[0] + res = cognite_client.time_series.data.retrieve_latest(id=ts.id, before="1h-ago") assert 1 == len(res) - assert res[0].timestamp < utils._time.timestamp_to_ms("1h-ago") + assert res[0].timestamp < timestamp_to_ms("1h-ago") + +class TestInsertDatapointsAPI: def test_insert(self, cognite_client, new_ts, post_spy): datapoints = [(datetime(year=2018, month=1, day=1, hour=1, minute=i), i) for i in range(60)] - with set_request_limit(cognite_client.datapoints, 30): - cognite_client.datapoints.insert(datapoints, id=new_ts.id) - assert 2 == cognite_client.datapoints._post.call_count + with patch(DATAPOINTS_API.format("DPS_LIMIT"), 30), patch(DATAPOINTS_API.format("POST_DPS_OBJECTS_LIMIT"), 30): + cognite_client.time_series.data.insert(datapoints, id=new_ts.id) + assert 2 == cognite_client.time_series.data._post.call_count # needs post_spy def test_insert_before_epoch(self, cognite_client, new_ts, post_spy): datapoints = [(datetime(year=1950, month=1, day=1, hour=1, minute=i), i) for i in range(60)] - with set_request_limit(cognite_client.datapoints, 30): - cognite_client.datapoints.insert(datapoints, id=new_ts.id) - assert 2 == cognite_client.datapoints._post.call_count + with patch(DATAPOINTS_API.format("DPS_LIMIT"), 30), patch(DATAPOINTS_API.format("POST_DPS_OBJECTS_LIMIT"), 30): + cognite_client.time_series.data.insert(datapoints, id=new_ts.id) + assert 2 == cognite_client.time_series.data._post.call_count # needs post_spy - def test_insert_copy(self, cognite_client, test_time_series, new_ts, post_spy): - data = cognite_client.datapoints.retrieve(id=test_time_series[0].id, start="600d-ago", end="now", limit=100) + def test_insert_copy(self, cognite_client, ms_bursty_ts, new_ts, post_spy): + data = cognite_client.time_series.data.retrieve(id=ms_bursty_ts.id, start=0, end="now", limit=100) assert 100 == len(data) - cognite_client.datapoints.insert(data, id=new_ts.id) - assert 2 == cognite_client.datapoints._post.call_count + cognite_client.time_series.data.insert(data, id=new_ts.id) + assert 2 == cognite_client.time_series.data._post.call_count # needs post_spy + + def test_insert_copy_fails_at_aggregate(self, cognite_client, ms_bursty_ts, new_ts): + data = cognite_client.time_series.data.retrieve( + id=ms_bursty_ts.id, end="now", granularity="1m", aggregates=random_aggregates(1), limit=100 + ) + assert 100 == len(data) + with pytest.raises(ValueError, match="only raw datapoints are supported"): + cognite_client.time_series.data.insert(data, id=new_ts.id) def test_insert_pandas_dataframe(self, cognite_client, new_ts, post_spy): - start = datetime(2018, 1, 1) - x = pandas.DatetimeIndex([start + timedelta(days=d) for d in range(100)]) - y = numpy.random.normal(0, 1, 100) - df = pandas.DataFrame({new_ts.id: y}, index=x) - with 
set_request_limit(cognite_client.datapoints, 50): - cognite_client.datapoints.insert_dataframe(df) - assert 2 == cognite_client.datapoints._post.call_count + df = pd.DataFrame( + {new_ts.id: np.random.normal(0, 1, 30)}, + index=pd.date_range(start="2018", freq="1D", periods=30), + ) + with patch(DATAPOINTS_API.format("DPS_LIMIT"), 20), patch(DATAPOINTS_API.format("POST_DPS_OBJECTS_LIMIT"), 20): + cognite_client.time_series.data.insert_dataframe(df, external_id_headers=False) + assert 2 == cognite_client.time_series.data._post.call_count # needs post_spy def test_delete_range(self, cognite_client, new_ts): - cognite_client.datapoints.delete_range(start="2d-ago", end="now", id=new_ts.id) + cognite_client.time_series.data.delete_range(start="2d-ago", end="now", id=new_ts.id) def test_delete_range_before_epoch(self, cognite_client, new_ts): - cognite_client.datapoints.delete_range(start=MIN_TIMESTAMP_MS, end="now", id=new_ts.id) + cognite_client.time_series.data.delete_range(start=MIN_TIMESTAMP_MS, end=0, id=new_ts.id) def test_delete_ranges(self, cognite_client, new_ts): - cognite_client.datapoints.delete_ranges([{"start": "2d-ago", "end": "now", "id": new_ts.id}]) - - def test_retrieve_dataframe_dict(self, cognite_client, test_time_series): - - dfd = cognite_client.datapoints.retrieve_dataframe_dict( - id=[test_time_series[0].id, 42], - external_id=["missing time series", test_time_series[1].external_id], - aggregates=["count", "interpolation"], - start=0, - end="now", - limit=100, - granularity="1m", - ignore_unknown_ids=True, - ) - assert isinstance(dfd, dict) - assert 2 == len(dfd.keys()) - assert dfd["interpolation"].shape[0] > 0 - assert dfd["interpolation"].shape[1] == 2 - - assert dfd["count"].shape[0] > 0 - assert dfd["count"].shape[1] == 2 + cognite_client.time_series.data.delete_ranges([{"start": "2d-ago", "end": "now", "id": new_ts.id}]) diff --git a/tests/tests_integration/test_api/test_geospatial.py b/tests/tests_integration/test_api/test_geospatial.py index c69b63b8a9..1478a6212e 100644 --- a/tests/tests_integration/test_api/test_geospatial.py +++ b/tests/tests_integration/test_api/test_geospatial.py @@ -472,10 +472,10 @@ def test_list(self, cognite_client, test_feature_type, test_features): assert len(res) == 4 df = res.to_pandas() - assert list(df) == ["externalId"] + assert list(df) == ["external_id"] def test_to_pandas(self, test_feature_type, test_features): - df = test_features.to_pandas() + df = test_features.to_pandas(camel_case=True) assert list(df) == [ "externalId", "position", @@ -488,7 +488,7 @@ def test_to_pandas(self, test_feature_type, test_features): ] def test_to_geopandas(self, test_feature_type, test_features): - gdf = test_features.to_geopandas(geometry="position") + gdf = test_features.to_geopandas(geometry="position", camel_case=True) assert list(gdf) == [ "externalId", "position", diff --git a/tests/tests_integration/test_api/test_raw.py b/tests/tests_integration/test_api/test_raw.py index bc209c8cc6..1e4bf6a39e 100644 --- a/tests/tests_integration/test_api/test_raw.py +++ b/tests/tests_integration/test_api/test_raw.py @@ -1,14 +1,13 @@ import pytest -from cognite.client import utils from cognite.client.data_classes import Row from cognite.client.utils._auxiliary import random_string @pytest.fixture(scope="session") def new_database_with_table(cognite_client): - db_name = "db_" + utils._auxiliary.random_string(10) - table_name = "table_" + utils._auxiliary.random_string(10) + db_name = "db_" + random_string(10) + table_name = "table_" + random_string(10) 
db = cognite_client.raw.databases.create(db_name) table = cognite_client.raw.tables.create(db_name, table_name) yield db, table @@ -31,7 +30,7 @@ def test_list_tables(self, cognite_client): def test_create_and_delete_table(self, cognite_client, new_database_with_table): db, _ = new_database_with_table - table_name = "table_" + utils._auxiliary.random_string(10) + table_name = "table_" + random_string(10) table = cognite_client.raw.tables.create(db.name, table_name) assert table in cognite_client.raw.tables.list(db.name) cognite_client.raw.tables.delete(db.name, table.name) diff --git a/tests/tests_integration/test_api/test_synthetic_time_series.py b/tests/tests_integration/test_api/test_synthetic_time_series.py index 1de22982b0..51a863ff68 100644 --- a/tests/tests_integration/test_api/test_synthetic_time_series.py +++ b/tests/tests_integration/test_api/test_synthetic_time_series.py @@ -20,7 +20,7 @@ def test_time_series(cognite_client): @pytest.fixture def post_spy(cognite_client): with mock.patch.object( - cognite_client.datapoints.synthetic, "_post", wraps=cognite_client.datapoints.synthetic._post + cognite_client.time_series.data.synthetic, "_post", wraps=cognite_client.time_series.data.synthetic._post ) as _: yield @@ -28,31 +28,31 @@ def post_spy(cognite_client): class TestSyntheticDatapointsAPI: def test_query(self, cognite_client, test_time_series, post_spy): query = "ts{id:%d} + ts{id:%d}" % (test_time_series[0].id, test_time_series[1].id) - dps = cognite_client.datapoints.synthetic.query( + dps = cognite_client.time_series.data.synthetic.query( expressions=query, start=datetime(2017, 1, 1), end="now", limit=23456 ) assert 23456 == len(dps) - assert 3 == cognite_client.datapoints.synthetic._post.call_count + assert 3 == cognite_client.time_series.data.synthetic._post.call_count def test_query_with_start_before_epoch(self, cognite_client, test_time_series, post_spy): query = "ts{id:%d} + ts{id:%d}" % (test_time_series[0].id, test_time_series[1].id) - dps = cognite_client.datapoints.synthetic.query( + dps = cognite_client.time_series.data.synthetic.query( expressions=query, start=datetime(1920, 1, 1), end="now", limit=23456 ) assert 23456 == len(dps) - assert 3 == cognite_client.datapoints.synthetic._post.call_count + assert 3 == cognite_client.time_series.data.synthetic._post.call_count def test_query_with_multiple_expressions(self, cognite_client, test_time_series, post_spy): expressions = ["ts{id:%d}" % test_time_series[0].id, "ts{id:%d}" % test_time_series[1].id] - dps = cognite_client.datapoints.synthetic.query( + dps = cognite_client.time_series.data.synthetic.query( expressions=expressions, start=datetime(2017, 1, 1), end="now", limit=23456 ) assert 23456 == len(dps[0]) assert 23456 == len(dps[1]) - assert 6 == cognite_client.datapoints.synthetic._post.call_count + assert 6 == cognite_client.time_series.data.synthetic._post.call_count def test_query_with_errors(self, cognite_client, test_time_series, post_spy): - dps = cognite_client.datapoints.synthetic.query( + dps = cognite_client.time_series.data.synthetic.query( expressions=["A / (B - B)"], start=datetime(2017, 1, 1), end="now", @@ -70,14 +70,14 @@ def test_query_with_errors(self, cognite_client, test_time_series, post_spy): def test_expression_builder_time_series_vs_string(self, cognite_client, test_time_series): from sympy import symbols - dps1 = cognite_client.datapoints.synthetic.query( + dps1 = cognite_client.time_series.data.synthetic.query( expressions=symbols("a"), start=datetime(2017, 1, 1), end="now", limit=100, 
variables={"a": test_time_series[0].external_id}, ) - dps2 = cognite_client.datapoints.synthetic.query( + dps2 = cognite_client.time_series.data.synthetic.query( expressions=[symbols("a"), symbols("b")], start=datetime(2017, 1, 1), end="now", @@ -107,7 +107,7 @@ def test_expression_builder_complex(self, cognite_client, test_time_series): + cos(syms[3] ** (1 + 0.1 ** syms[4])) + sqrt(log(abs(syms[8]) + 1)) ) - dps1 = cognite_client.datapoints.synthetic.query( + dps1 = cognite_client.time_series.data.synthetic.query( expressions=[expression], start=datetime(2017, 1, 1), end="now", diff --git a/tests/tests_integration/test_api/test_templates.py b/tests/tests_integration/test_api/test_templates.py index 5ecf0fe082..caafa71643 100644 --- a/tests/tests_integration/test_api/test_templates.py +++ b/tests/tests_integration/test_api/test_templates.py @@ -51,7 +51,6 @@ def new_template_group_version(cognite_client, new_template_group): version = TemplateGroupVersion(schema) new_version = cognite_client.templates.versions.upsert(ext_id, version=version) yield new_group, ext_id, new_version - print(ext_id, new_version.version) cognite_client.templates.versions.delete(ext_id, new_version.version) @@ -184,23 +183,23 @@ def test_instances_update_set(self, cognite_client, new_template_instance): def test_query(self, cognite_client, new_template_instance): new_group, ext_id, new_version, new_instance = new_template_instance query = """ - { - countryList - { - name, - deaths { - externalId, - datapoints(limit: 2) { - timestamp, value - } - }, - confirmed { - externalId, - datapoints(limit: 2) { - timestamp, value - } - } - } + { + countryList + { + name, + deaths { + externalId, + datapoints(limit: 2) { + timestamp, value + } + }, + confirmed { + externalId, + datapoints(limit: 2) { + timestamp, value + } + } + } } """ res = cognite_client.templates.graphql_query(ext_id, 1, query) diff --git a/tests/tests_integration/test_api/test_transformations/test_jobs.py b/tests/tests_integration/test_api/test_transformations/test_jobs.py index 060f9148e3..79757841b0 100644 --- a/tests/tests_integration/test_api/test_transformations/test_jobs.py +++ b/tests/tests_integration/test_api/test_transformations/test_jobs.py @@ -1,5 +1,4 @@ import asyncio -import random import string import time @@ -12,11 +11,12 @@ TransformationDestination, TransformationJobStatus, ) +from cognite.client.utils._auxiliary import random_string @pytest.fixture def new_transformation(cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) creds = cognite_client.config.credentials assert isinstance(creds, OAuthClientCredentials) transform = Transformation( diff --git a/tests/tests_integration/test_api/test_transformations/test_notifications.py b/tests/tests_integration/test_api/test_transformations/test_notifications.py index 3f7d669abe..1f015d31db 100644 --- a/tests/tests_integration/test_api/test_transformations/test_notifications.py +++ b/tests/tests_integration/test_api/test_transformations/test_notifications.py @@ -1,11 +1,11 @@ -import random import string import pytest from cognite.client.data_classes import Transformation, TransformationDestination, TransformationNotification +from cognite.client.utils._auxiliary import random_string -prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) +prefix = random_string(6, string.ascii_letters) @pytest.fixture diff --git a/tests/tests_integration/test_api/test_transformations/test_schedules.py 
b/tests/tests_integration/test_api/test_transformations/test_schedules.py index ea13d16a8c..b1183ae40b 100644 --- a/tests/tests_integration/test_api/test_transformations/test_schedules.py +++ b/tests/tests_integration/test_api/test_transformations/test_schedules.py @@ -1,4 +1,3 @@ -import random import string import pytest @@ -11,11 +10,12 @@ TransformationSchedule, TransformationScheduleUpdate, ) +from cognite.client.utils._auxiliary import random_string @pytest.fixture def new_transformation(cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) creds = cognite_client.config.credentials assert isinstance(creds, OAuthClientCredentials) transform = Transformation( diff --git a/tests/tests_integration/test_api/test_transformations/test_transformations.py b/tests/tests_integration/test_api/test_transformations/test_transformations.py index b2ecea1224..85da91f887 100644 --- a/tests/tests_integration/test_api/test_transformations/test_transformations.py +++ b/tests/tests_integration/test_api/test_transformations/test_transformations.py @@ -1,4 +1,3 @@ -import random import string import pytest @@ -13,6 +12,7 @@ ) from cognite.client.data_classes.transformations import ContainsAny from cognite.client.data_classes.transformations.common import NonceCredentials, OidcCredentials, SequenceRows +from cognite.client.utils._auxiliary import random_string @pytest.fixture @@ -32,7 +32,7 @@ def new_datasets(cognite_client): @pytest.fixture def new_transformation(cognite_client, new_datasets): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) creds = cognite_client.config.credentials assert isinstance(creds, OAuthClientCredentials) transform = Transformation( @@ -69,7 +69,7 @@ def new_transformation(cognite_client, new_datasets): class TestTransformationsAPI: def test_create_transformation_error(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform_without_name = Transformation( external_id=f"{prefix}-transformation", destination=TransformationDestination.assets() ) @@ -83,7 +83,7 @@ def test_create_transformation_error(self, cognite_client): assert failed def test_create_asset_transformation(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform = Transformation( name="any", external_id=f"{prefix}-transformation", destination=TransformationDestination.assets() ) @@ -91,7 +91,7 @@ def test_create_asset_transformation(self, cognite_client): cognite_client.transformations.delete(id=ts.id) def test_create_raw_transformation(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform = Transformation( name="any", external_id=f"{prefix}-transformation", @@ -102,7 +102,7 @@ def test_create_raw_transformation(self, cognite_client): assert ts.destination == TransformationDestination.raw("myDatabase", "myTable") def test_create_asset_hierarchy_transformation(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform = Transformation( name="any", external_id=f"{prefix}-transformation", destination=TransformationDestination.asset_hierarchy() ) @@ -110,7 +110,7 @@ def 
test_create_asset_hierarchy_transformation(self, cognite_client): cognite_client.transformations.delete(id=ts.id) def test_create_string_datapoints_transformation(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform = Transformation( name="any", external_id=f"{prefix}-transformation", @@ -120,7 +120,7 @@ def test_create_string_datapoints_transformation(self, cognite_client): cognite_client.transformations.delete(id=ts.id) def test_create_transformation_with_tags(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform = Transformation( name="any", external_id=f"{prefix}-transformation", @@ -133,7 +133,7 @@ def test_create_transformation_with_tags(self, cognite_client): @pytest.mark.skip def test_create_dmi_transformation(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform = Transformation( name="any", external_id=f"{prefix}-transformation", @@ -153,7 +153,7 @@ def test_create_dmi_transformation(self, cognite_client): cognite_client.transformations.delete(id=ts.id) def test_create_sequence_rows_transformation(self, cognite_client): - prefix = "".join(random.choice(string.ascii_letters) for i in range(6)) + prefix = random_string(6, string.ascii_letters) transform = Transformation( name="any", external_id=f"{prefix}-transformation", diff --git a/tests/tests_unit/test_api/test_data_sets.py b/tests/tests_unit/test_api/test_data_sets.py index 3150ca7eed..30d21473ca 100644 --- a/tests/tests_unit/test_api/test_data_sets.py +++ b/tests/tests_unit/test_api/test_data_sets.py @@ -151,7 +151,7 @@ def test_datasets_list_to_pandas_empty(self, cognite_client, mock_ds_empty): def test_datasets_to_pandas(self, cognite_client, mock_ds_response): import pandas as pd - df = cognite_client.data_sets.retrieve(id=1).to_pandas() + df = cognite_client.data_sets.retrieve(id=1).to_pandas(camel_case=True) assert isinstance(df, pd.DataFrame) assert "metadata" not in df.columns assert df.loc["writeProtected"].bool() is False diff --git a/tests/tests_unit/test_api/test_datapoints.py b/tests/tests_unit/test_api/test_datapoints.py index bd9535ad8b..11ed2e69c6 100644 --- a/tests/tests_unit/test_api/test_datapoints.py +++ b/tests/tests_unit/test_api/test_datapoints.py @@ -1,22 +1,25 @@ import json import math -from datetime import datetime +import re +from datetime import datetime, timezone from random import random -from typing import List -from unittest import mock +from unittest.mock import patch import pytest -from cognite.client import utils -from cognite.client._api.datapoints import DatapointsBin, DatapointsFetcher, _DPTask, _DPWindow +from cognite.client._api.datapoints import DatapointsBin from cognite.client.data_classes import Datapoint, Datapoints, DatapointsList, DatapointsQuery from cognite.client.exceptions import CogniteAPIError, CogniteDuplicateColumnsError, CogniteNotFoundError -from tests.utils import jsgz_load, set_request_limit +from cognite.client.utils._time import granularity_to_ms +from tests.utils import jsgz_load + +DATAPOINTS_API = "cognite.client._api.datapoints.{}" +DPS_DATA_CLASSES = "cognite.client.data_classes.datapoints.{}" def generate_datapoints(start: int, end: int, aggregates=None, granularity=None): dps = [] - granularity = utils._time.granularity_to_ms(granularity) if granularity 
else 1000 + granularity = granularity_to_ms(granularity) if granularity else 1000 for i in range(start, end, granularity): dp = {} if aggregates: @@ -88,7 +91,7 @@ def request_callback(request): rsps.add_callback( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", callback=request_callback, content_type="application/json", ) @@ -99,7 +102,7 @@ def request_callback(request): def mock_get_datapoints_empty(rsps, cognite_client): rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", status=200, json={ "items": [{"id": 1, "externalId": "1", "isString": False, "isStep": False, "unit": "kPa", "datapoints": []}] @@ -113,7 +116,7 @@ def mock_get_datapoints_include_outside(rsps, cognite_client): # return 100001 datapoints with one beyond 'end' rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", status=200, json={ "items": [ @@ -134,7 +137,7 @@ def mock_get_datapoints_include_outside(rsps, cognite_client): def mock_get_datapoints_one_ts_empty(rsps, cognite_client): rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", status=200, json={ "items": [ @@ -150,7 +153,7 @@ def mock_get_datapoints_one_ts_empty(rsps, cognite_client): ) rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", status=200, json={"items": [{"id": 2, "externalId": "2", "isString": False, "isStep": False, "datapoints": []}]}, ) @@ -160,7 +163,9 @@ def mock_get_datapoints_one_ts_empty(rsps, cognite_client): @pytest.fixture def mock_get_datapoints_one_ts_has_missing_aggregates(rsps, cognite_client): def callback(request): - item = jsgz_load(request.body) + body = jsgz_load(request.body) + assert len(body["items"]) == 1 + item = body["items"][0] if item["aggregates"] == ["average"]: dps = { "id": 1, @@ -193,7 +198,7 @@ def callback(request): rsps.add_callback( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", callback=callback, content_type="application/json", ) @@ -203,7 +208,9 @@ def callback(request): @pytest.fixture def mock_get_datapoints_several_missing(rsps, cognite_client): def callback(request): - item = jsgz_load(request.body) + body = jsgz_load(request.body) + assert len(body["items"]) == 1 + item = body["items"][0] if item["aggregates"] == ["interpolation"]: dps = { "id": 2, @@ -241,7 +248,7 @@ def callback(request): rsps.add_callback( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", callback=callback, content_type="application/json", ) @@ -255,7 +262,7 @@ def callback(request): } rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/byids", + cognite_client.time_series.data._get_base_url_with_base_path() 
+ "/timeseries/byids", status=200, json=response_body, ) @@ -279,27 +286,20 @@ def callback(request): rsps.add_callback( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", callback=callback, content_type="application/json", ) yield rsps -@pytest.fixture -def set_dps_workers(cognite_client): - def set_workers(limit): - cognite_client.datapoints._config.max_workers = limit - - workers_tmp = cognite_client.datapoints._config.max_workers - yield set_workers - cognite_client.datapoints._config.max_workers = workers_tmp - - def assert_dps_response_is_correct(calls, dps_object): datapoints = [] for call in calls: - if jsgz_load(call.request.body)["limit"] > 1 and jsgz_load(call.request.body).get("aggregates") != ["count"]: + body = jsgz_load(call.request.body) + assert len(body["items"]) == 1 + item = body["items"][0] + if item["limit"] > 1 and item.get("aggregates") != ["count"]: dps_response = call.response.json()["items"][0] if dps_response["id"] == dps_object.id and dps_response["externalId"] == dps_object.external_id: datapoints.extend(dps_response["datapoints"]) @@ -314,32 +314,32 @@ def assert_dps_response_is_correct(calls, dps_object): class TestGetDatapoints: def test_retrieve_datapoints_by_id(self, cognite_client, mock_get_datapoints): - dps_res = cognite_client.datapoints.retrieve(id=123, start=1000000, end=1100000) + dps_res = cognite_client.time_series.data.retrieve(id=123, start=1000000, end=1100000) assert isinstance(dps_res, Datapoints) assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res) def test_retrieve_datapoints_500(self, cognite_client, rsps): rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", json={"error": {"code": 500, "message": "Internal Server Error"}}, status=500, ) with pytest.raises(CogniteAPIError): - cognite_client.datapoints.retrieve(id=123, start=1000000, end=1100000) + cognite_client.time_series.data.retrieve(id=123, start=1000000, end=1100000) def test_retrieve_datapoints_by_external_id(self, cognite_client, mock_get_datapoints): - dps_res = cognite_client.datapoints.retrieve(external_id="123", start=1000000, end=1100000) + dps_res = cognite_client.time_series.data.retrieve(external_id="123", start=1000000, end=1100000) assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res) def test_retrieve_datapoints_aggregates(self, cognite_client, mock_get_datapoints): - dps_res = cognite_client.datapoints.retrieve( - id=123, start=1000000, end=1100000, aggregates=["average", "stepInterpolation"], granularity="10s" + dps_res = cognite_client.time_series.data.retrieve( + id=123, start=1000000, end=1100000, aggregates=["average", "step_interpolation"], granularity="10s" ) assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res) def test_retrieve_datapoints_local_aggregates(self, cognite_client, mock_get_datapoints): - dps_res_list = cognite_client.datapoints.retrieve( + dps_res_list = cognite_client.time_series.data.retrieve( external_id={"externalId": "123", "aggregates": ["average"]}, id={"id": 234}, start=1000000, @@ -350,10 +350,13 @@ def test_retrieve_datapoints_local_aggregates(self, cognite_client, mock_get_dat for dps_res in dps_res_list: assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res) + @pytest.mark.dsl # TODO: Revert to old code 
and use `retrieve`, not `retrieve_arrays` def test_retrieve_datapoints_some_aggregates_omitted( self, cognite_client, mock_get_datapoints_one_ts_has_missing_aggregates ): - dps_res_list = cognite_client.datapoints.retrieve( + import numpy as np + + dps_res_list = cognite_client.time_series.data.retrieve_arrays( id={"id": 1, "aggregates": ["average"]}, external_id={"externalId": "def", "aggregates": ["interpolation"]}, start=0, @@ -362,46 +365,22 @@ def test_retrieve_datapoints_some_aggregates_omitted( ) for dps in dps_res_list: if dps.id == 1: - assert dps.average == [0, 1, 2, 3, 4] + np.testing.assert_array_equal(dps.average, [0, 1, 2, 3, 4]) elif dps.id == 2: - assert dps.interpolation == [None, 1, None, 3, None] - - def test_datapoints_paging(self, cognite_client, mock_get_datapoints, set_dps_workers): - set_dps_workers(1) - with set_request_limit(cognite_client.datapoints, 2): - dps_res = cognite_client.datapoints.retrieve( - id=123, start=0, end=10000, aggregates=["average"], granularity="1s" - ) - assert 6 == len(mock_get_datapoints.calls) - assert 10 == len(dps_res) - - def test_datapoints_concurrent(self, cognite_client, mock_get_datapoints): - with set_request_limit(cognite_client.datapoints, 20): - dps_res = cognite_client.datapoints.retrieve( - id=123, start=0, end=100000, aggregates=["average"], granularity="1s" - ) - requested_windows = sorted( - [ - (jsgz_load(call.request.body)["start"], jsgz_load(call.request.body)["end"]) - for call in mock_get_datapoints.calls - ], - key=lambda x: x[0], - ) - assert (0, 100000) == requested_windows[0] - assert [(20000, 100000), (40000, 100000), (60000, 100000), (80000, 100000)] == requested_windows[2:] - assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res) + np.testing.assert_array_equal(dps.interpolation, [np.nan, 1, np.nan, 3, np.nan]) def test_datapoints_paging_with_limit(self, cognite_client, mock_get_datapoints): - with set_request_limit(cognite_client.datapoints, 3): - dps_res = cognite_client.datapoints.retrieve( - id=123, start=0, end=10000, aggregates=["average"], granularity="1s", limit=4 - ) + with patch(DPS_DATA_CLASSES.format("DPS_LIMIT_AGG"), 3): + with patch(DATAPOINTS_API.format("DPS_LIMIT_AGG"), 3): + dps_res = cognite_client.time_series.data.retrieve( + id=123, start=0, end=10000, aggregates=["average"], granularity="1s", limit=4 + ) assert 4 == len(dps_res) def test_retrieve_datapoints_multiple_time_series(self, cognite_client, mock_get_datapoints): ids = [1, 2, 3] external_ids = ["4", "5", "6"] - dps_res_list = cognite_client.datapoints.retrieve(id=ids, external_id=external_ids, start=0, end=100000) + dps_res_list = cognite_client.time_series.data.retrieve(id=ids, external_id=external_ids, start=0, end=100000) assert isinstance(dps_res_list, DatapointsList), type(dps_res_list) for dps_res in dps_res_list: if dps_res.id in ids: @@ -413,43 +392,46 @@ def test_retrieve_datapoints_multiple_time_series(self, cognite_client, mock_get assert 0 == len(external_ids) def test_retrieve_datapoints_empty(self, cognite_client, mock_get_datapoints_empty): - res = cognite_client.datapoints.retrieve(id=1, start=0, end=10000) + res = cognite_client.time_series.data.retrieve(id=1, start=0, end=10000) assert 0 == len(res) def test_retrieve_datapoints_empty_extrafields_set(self, cognite_client, mock_get_datapoints_empty): - res = cognite_client.datapoints.retrieve(id=1, start=0, end=10000) + res = cognite_client.time_series.data.retrieve(id=1, start=0, end=10000) assert "kPa" == res.unit assert res.is_step is False assert 
res.is_string is False def test_aggregate_limits_correct(self, cognite_client, mock_get_datapoints): - cognite_client.datapoints.retrieve(id={"id": 1, "aggregates": ["average"]}, start=0, end=10, granularity="1d") - cognite_client.datapoints.retrieve(id=1, start=0, end=10, granularity="1d", aggregates=["max"]) - cognite_client.datapoints.retrieve(id=1, start=0, end=10) - assert 10000 == jsgz_load(mock_get_datapoints.calls[0].request.body)["limit"] - assert 10000 == jsgz_load(mock_get_datapoints.calls[1].request.body)["limit"] - assert 100000 == jsgz_load(mock_get_datapoints.calls[2].request.body)["limit"] + cognite_client.time_series.data.retrieve( + id={"id": 1, "aggregates": ["average"]}, start=0, end=10, granularity="1d" + ) + cognite_client.time_series.data.retrieve(id=1, start=0, end=10, granularity="1d", aggregates=["max"]) + cognite_client.time_series.data.retrieve(id=1, start=0, end=10) + for i, limit in zip(range(3), [10_000, 10_000, 100_000]): + body = jsgz_load(mock_get_datapoints.calls[i].request.body) + assert len(body["items"]) == 1 + assert limit == body["items"][0]["limit"] class TestQueryDatapoints: def test_query_single(self, cognite_client, mock_get_datapoints): - dps_res = cognite_client.datapoints.query(query=DatapointsQuery(id=1, start=0, end=10000)) + dps_res = cognite_client.time_series.data.query(query=DatapointsQuery(id=1, start=0, end=10000)) assert isinstance(dps_res, DatapointsList) assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res[0]) def test_query_multiple(self, cognite_client, mock_get_datapoints): - dps_res_list = cognite_client.datapoints.query( + dps_res_list = cognite_client.time_series.data.query( query=[ DatapointsQuery(id=1, start=0, end=10000), DatapointsQuery(external_id="2", start=10000, end=20000, aggregates=["average"], granularity="2s"), ] ) - assert isinstance(dps_res_list, List) + assert isinstance(dps_res_list, DatapointsList) for dps_res in dps_res_list: - assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res[0]) + assert_dps_response_is_correct(mock_get_datapoints.calls, dps_res) def test_query_empty(self, cognite_client, mock_get_datapoints_empty): - dps_res = cognite_client.datapoints.query(query=DatapointsQuery(id=1, start=0, end=10000)) + dps_res = cognite_client.time_series.data.query(query=DatapointsQuery(id=1, start=0, end=10000)) assert isinstance(dps_res, DatapointsList) assert 1 == len(dps_res) assert 0 == len(dps_res[0]) @@ -478,7 +460,7 @@ def request_callback(request): rsps.add_callback( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/latest", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/latest", callback=request_callback, content_type="application/json", ) @@ -489,7 +471,7 @@ def request_callback(request): def mock_retrieve_latest_empty(rsps, cognite_client): rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/latest", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/latest", status=200, json={ "items": [ @@ -505,7 +487,7 @@ def mock_retrieve_latest_empty(rsps, cognite_client): def mock_retrieve_latest_with_failure(rsps, cognite_client): rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/latest", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/latest", status=200, json={ "items": [ @@ -516,7 +498,7 @@ def mock_retrieve_latest_with_failure(rsps, 
cognite_client): ) rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/latest", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/latest", status=500, json={"error": {"code": 500, "message": "Internal Server Error"}}, ) @@ -525,54 +507,57 @@ def mock_retrieve_latest_with_failure(rsps, cognite_client): class TestGetLatest: def test_retrieve_latest(self, cognite_client, mock_retrieve_latest): - res = cognite_client.datapoints.retrieve_latest(id=1) + res = cognite_client.time_series.data.retrieve_latest(id=1) assert isinstance(res, Datapoints) assert 10000 == res[0].timestamp assert isinstance(res[0].value, float) def test_retrieve_latest_multiple_ts(self, cognite_client, mock_retrieve_latest): - res = cognite_client.datapoints.retrieve_latest(id=1, external_id="2") + res = cognite_client.time_series.data.retrieve_latest(id=1, external_id="2") assert isinstance(res, DatapointsList) for dps in res: assert 10000 == dps[0].timestamp assert isinstance(dps[0].value, float) def test_retrieve_latest_with_before(self, cognite_client, mock_retrieve_latest): - res = cognite_client.datapoints.retrieve_latest(id=1, before=10) + res = cognite_client.time_series.data.retrieve_latest(id=1, before=10) assert isinstance(res, Datapoints) assert 9 == res[0].timestamp assert isinstance(res[0].value, float) def test_retrieve_latest_multiple_ts_with_before(self, cognite_client, mock_retrieve_latest): - res = cognite_client.datapoints.retrieve_latest(id=[1, 2], external_id=["1", "2"], before=10) + res = cognite_client.time_series.data.retrieve_latest(id=[1, 2], external_id=["1", "2"], before=10) assert isinstance(res, DatapointsList) for dps in res: assert 9 == dps[0].timestamp assert isinstance(dps[0].value, float) def test_retrieve_latest_empty(self, cognite_client, mock_retrieve_latest_empty): - res = cognite_client.datapoints.retrieve_latest(id=1) + res = cognite_client.time_series.data.retrieve_latest(id=1) assert isinstance(res, Datapoints) assert 0 == len(res) def test_retrieve_latest_multiple_ts_empty(self, cognite_client, mock_retrieve_latest_empty): - res_list = cognite_client.datapoints.retrieve_latest(id=[1, 2]) + res_list = cognite_client.time_series.data.retrieve_latest(id=[1, 2]) assert isinstance(res_list, DatapointsList) assert 2 == len(res_list) for res in res_list: assert 0 == len(res) def test_retrieve_latest_concurrent_fails(self, cognite_client, mock_retrieve_latest_with_failure): - with set_request_limit(cognite_client.datapoints, 2): + with patch(DATAPOINTS_API.format("RETRIEVE_LATEST_LIMIT"), 2): with pytest.raises(CogniteAPIError) as e: - cognite_client.datapoints.retrieve_latest(id=[1, 2, 3]) + cognite_client.time_series.data.retrieve_latest(id=[1, 2, 3]) assert e.value.code == 500 @pytest.fixture def mock_post_datapoints(rsps, cognite_client): rsps.add( - rsps.POST, cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data", status=200, json={} + rsps.POST, + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data", + status=200, + json={}, ) yield rsps @@ -581,7 +566,7 @@ def mock_post_datapoints(rsps, cognite_client): def mock_post_datapoints_400(rsps, cognite_client): rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data", status=400, json={"error": {"message": "Ts not found", "missing": [{"externalId": "does_not_exist"}]}}, ) 
@@ -591,7 +576,7 @@ def mock_post_datapoints_400(rsps, cognite_client): class TestInsertDatapoints: def test_insert_tuples(self, cognite_client, mock_post_datapoints): dps = [(i * 1e11, i) for i in range(1, 11)] - res = cognite_client.datapoints.insert(dps, id=1) + res = cognite_client.time_series.data.insert(dps, id=1) assert res is None assert { "items": [{"id": 1, "datapoints": [{"timestamp": int(i * 1e11), "value": i} for i in range(1, 11)]}] @@ -599,7 +584,7 @@ def test_insert_tuples(self, cognite_client, mock_post_datapoints): def test_insert_dicts(self, cognite_client, mock_post_datapoints): dps = [{"timestamp": i * 1e11, "value": i} for i in range(1, 11)] - res = cognite_client.datapoints.insert(dps, id=1) + res = cognite_client.time_series.data.insert(dps, id=1) assert res is None assert { "items": [{"id": 1, "datapoints": [{"timestamp": int(i * 1e11), "value": i} for i in range(1, 11)]}] @@ -607,7 +592,7 @@ def test_insert_dicts(self, cognite_client, mock_post_datapoints): def test_by_external_id(self, cognite_client, mock_post_datapoints): dps = [(i * 1e11, i) for i in range(1, 11)] - cognite_client.datapoints.insert(dps, external_id="1") + cognite_client.time_series.data.insert(dps, external_id="1") assert { "items": [ {"externalId": "1", "datapoints": [{"timestamp": int(i * 1e11), "value": i} for i in range(1, 11)]} @@ -618,12 +603,13 @@ def test_by_external_id(self, cognite_client, mock_post_datapoints): def test_invalid_datapoints_keys(self, cognite_client, ts_key, value_key): dps = [{ts_key: i * 1e11, value_key: i} for i in range(1, 11)] with pytest.raises(AssertionError, match="is missing the"): - cognite_client.datapoints.insert(dps, id=1) + cognite_client.time_series.data.insert(dps, id=1) def test_insert_datapoints_over_limit(self, cognite_client, mock_post_datapoints): dps = [(i * 1e11, i) for i in range(1, 11)] - with set_request_limit(cognite_client.datapoints, 5): - res = cognite_client.datapoints.insert(dps, id=1) + with patch(DATAPOINTS_API.format("DPS_LIMIT"), 5): + with patch(DATAPOINTS_API.format("POST_DPS_OBJECTS_LIMIT"), 5): + res = cognite_client.time_series.data.insert(dps, id=1) assert res is None request_bodies = [jsgz_load(call.request.body) for call in mock_post_datapoints.calls] assert { @@ -635,12 +621,12 @@ def test_insert_datapoints_over_limit(self, cognite_client, mock_post_datapoints def test_insert_datapoints_no_data(self, cognite_client): with pytest.raises(AssertionError, match="No datapoints provided"): - cognite_client.datapoints.insert(id=1, datapoints=[]) + cognite_client.time_series.data.insert(id=1, datapoints=[]) def test_insert_datapoints_in_multiple_time_series(self, cognite_client, mock_post_datapoints): dps = [{"timestamp": i * 1e11, "value": i} for i in range(1, 11)] dps_objects = [{"externalId": "1", "datapoints": dps}, {"id": 1, "datapoints": dps}] - res = cognite_client.datapoints.insert_multiple(dps_objects) + res = cognite_client.time_series.data.insert_multiple(dps_objects) assert res is None request_body = jsgz_load(mock_post_datapoints.calls[0].request.body) assert { @@ -654,40 +640,32 @@ def test_insert_datapoints_in_multiple_time_series_invalid_key(self, cognite_cli dps = [{"timestamp": i * 1e11, "value": i} for i in range(1, 11)] dps_objects = [{"extId": "1", "datapoints": dps}] with pytest.raises(ValueError, match="Invalid key 'extId'"): - cognite_client.datapoints.insert_multiple(dps_objects) + cognite_client.time_series.data.insert_multiple(dps_objects) def test_insert_datapoints_ts_does_not_exist(self, cognite_client, 
mock_post_datapoints_400): with pytest.raises(CogniteNotFoundError): - cognite_client.datapoints.insert(datapoints=[(1e14, 1)], external_id="does_not_exist") + cognite_client.time_series.data.insert(datapoints=[(1e14, 1)], external_id="does_not_exist") def test_insert_multiple_ts__below_ts_and_dps_limit(self, cognite_client, mock_post_datapoints): dps = [{"timestamp": i * 1e11, "value": i} for i in range(1, 2)] dps_objects = [{"id": i, "datapoints": dps} for i in range(100)] - cognite_client.datapoints.insert_multiple(dps_objects) + cognite_client.time_series.data.insert_multiple(dps_objects) assert 1 == len(mock_post_datapoints.calls) request_body = jsgz_load(mock_post_datapoints.calls[0].request.body) for i, dps in enumerate(request_body["items"]): assert i == dps["id"] - @pytest.fixture - def set_post_dps_objects_limit_to_100(self, cognite_client): - tmp = cognite_client.datapoints._POST_DPS_OBJECTS_LIMIT - cognite_client.datapoints._POST_DPS_OBJECTS_LIMIT = 100 - yield - cognite_client.datapoints._POST_DPS_OBJECTS_LIMIT = tmp - - def test_insert_multiple_ts_single_call__below_dps_limit_above_ts_limit( - self, cognite_client, mock_post_datapoints, set_post_dps_objects_limit_to_100 - ): - dps = [{"timestamp": i * 1e11, "value": i} for i in range(1, 2)] - dps_objects = [{"id": i, "datapoints": dps} for i in range(101)] - cognite_client.datapoints.insert_multiple(dps_objects) - assert 2 == len(mock_post_datapoints.calls) + def test_insert_multiple_ts_single_call__below_dps_limit_above_ts_limit(self, cognite_client, mock_post_datapoints): + with patch(DATAPOINTS_API.format("POST_DPS_OBJECTS_LIMIT"), 100): + dps = [{"timestamp": i * 1e11, "value": i} for i in range(1, 2)] + dps_objects = [{"id": i, "datapoints": dps} for i in range(101)] + cognite_client.time_series.data.insert_multiple(dps_objects) + assert 2 == len(mock_post_datapoints.calls) def test_insert_multiple_ts_single_call__above_dps_limit_below_ts_limit(self, cognite_client, mock_post_datapoints): dps = [{"timestamp": i * 1e11, "value": i} for i in range(1, 1002)] dps_objects = [{"id": i, "datapoints": dps} for i in range(10)] - cognite_client.datapoints.insert_multiple(dps_objects) + cognite_client.time_series.data.insert_multiple(dps_objects) assert 2 == len(mock_post_datapoints.calls) @@ -695,7 +673,7 @@ def test_insert_multiple_ts_single_call__above_dps_limit_below_ts_limit(self, co def mock_delete_datapoints(rsps, cognite_client): rsps.add( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/delete", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/delete", status=200, json={}, ) @@ -704,7 +682,9 @@ def mock_delete_datapoints(rsps, cognite_client): class TestDeleteDatapoints: def test_delete_range(self, cognite_client, mock_delete_datapoints): - res = cognite_client.datapoints.delete_range(start=datetime(2018, 1, 1), end=datetime(2018, 1, 2), id=1) + res = cognite_client.time_series.data.delete_range( + start=datetime(2018, 1, 1, tzinfo=timezone.utc), end=datetime(2018, 1, 2, tzinfo=timezone.utc), id=1 + ) assert res is None assert {"items": [{"id": 1, "inclusiveBegin": 1514764800000, "exclusiveEnd": 1514851200000}]} == jsgz_load( mock_delete_datapoints.calls[0].request.body @@ -716,15 +696,15 @@ def test_delete_range(self, cognite_client, mock_delete_datapoints): ) def test_delete_range_invalid_id(self, cognite_client, id, external_id, exception): with pytest.raises(exception): - cognite_client.datapoints.delete_range("1d-ago", "now", id, external_id) 
+ cognite_client.time_series.data.delete_range("1d-ago", "now", id, external_id) def test_delete_range_start_after_end(self, cognite_client): with pytest.raises(AssertionError, match="must be"): - cognite_client.datapoints.delete_range(1, 0, 1) + cognite_client.time_series.data.delete_range(1, 0, 1) def test_delete_ranges(self, cognite_client, mock_delete_datapoints): ranges = [{"id": 1, "start": 0, "end": 1}, {"externalId": "1", "start": 0, "end": 1}] - cognite_client.datapoints.delete_ranges(ranges) + cognite_client.time_series.data.delete_ranges(ranges) assert { "items": [ {"id": 1, "inclusiveBegin": 0, "exclusiveEnd": 1}, @@ -735,10 +715,10 @@ def test_delete_ranges(self, cognite_client, mock_delete_datapoints): def test_delete_ranges_invalid_ids(self, cognite_client): ranges = [{"idz": 1, "start": 0, "end": 1}] with pytest.raises(AssertionError, match="Invalid key 'idz'"): - cognite_client.datapoints.delete_ranges(ranges) + cognite_client.time_series.data.delete_ranges(ranges) ranges = [{"start": 0, "end": 1}] with pytest.raises(ValueError, match="Exactly one of id or external id must be specified"): - cognite_client.datapoints.delete_ranges(ranges) + cognite_client.time_series.data.delete_ranges(ranges) class TestDatapointsObject: @@ -810,6 +790,7 @@ def test_load_string(self, cognite_client): { "id": 1, "externalId": "1", + "isStep": False, "isString": True, "datapoints": [{"timestamp": 1, "value": 1}, {"timestamp": 2, "value": 2}], } @@ -819,7 +800,7 @@ def test_load_string(self, cognite_client): assert [1, 2] == res.timestamp assert [1, 2] == res.value assert res.is_string is True - assert res.is_step is None + assert res.is_step is False assert res.unit is None def test_slice(self, cognite_client): @@ -854,34 +835,13 @@ def test_extend(self, cognite_client): assert d0.sum is None -@pytest.mark.dsl -class TestPlotDatapoints: - @mock.patch("matplotlib.pyplot.show") - @mock.patch("pandas.core.frame.DataFrame.plot") - def test_plot_datapoints(self, pandas_plot_mock, plt_show_mock): - d = Datapoints(id=1, timestamp=[1, 2, 3, 4, 5], value=[1, 2, 3, 4, 5]) - d.plot() - assert 1 == pandas_plot_mock.call_count - assert 1 == plt_show_mock.call_count - - @mock.patch("matplotlib.pyplot.show") - @mock.patch("pandas.core.frame.DataFrame.plot") - def test_plot_datapoints_list(self, pandas_plot_mock, plt_show_mock): - d1 = Datapoints(id=1, timestamp=[1, 2, 3, 4, 5], value=[1, 2, 3, 4, 5]) - d2 = Datapoints(id=2, timestamp=[1, 2, 3, 4, 5], value=[6, 7, 8, 9, 10]) - d = DatapointsList([d1, d2]) - d.plot() - assert 1 == pandas_plot_mock.call_count - assert 1 == plt_show_mock.call_count - - @pytest.mark.dsl class TestPandasIntegration: def test_datapoint(self, cognite_client): import pandas as pd d = Datapoint(timestamp=0, value=2, max=3) - expected_df = pd.DataFrame({"value": [2], "max": [3]}, index=[utils._time.ms_to_datetime(0)]) + expected_df = pd.DataFrame({"value": [2], "max": [3]}, index=[pd.Timestamp(0, unit="ms")]) pd.testing.assert_frame_equal(expected_df, d.to_pandas(), check_like=True) def test_datapoints(self, cognite_client): @@ -889,8 +849,8 @@ def test_datapoints(self, cognite_client): d = Datapoints(id=1, timestamp=[1, 2, 3], average=[2, 3, 4], step_interpolation=[3, 4, 5]) expected_df = pd.DataFrame( - {"1|average": [2, 3, 4], "1|stepInterpolation": [3, 4, 5]}, - index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3]], + {"1|average": [2, 3, 4], "1|step_interpolation": [3, 4, 5]}, + index=pd.to_datetime(range(1, 4), unit="ms"), ) pd.testing.assert_frame_equal(expected_df, 
d.to_pandas()) @@ -898,9 +858,9 @@ def test_datapoints_no_names(self, cognite_client): import pandas as pd d = Datapoints(id=1, timestamp=[1, 2, 3], average=[2, 3, 4]) - expected_df = pd.DataFrame({"1": [2, 3, 4]}, index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3]]) + expected_df = pd.DataFrame({"1": [2, 3, 4]}, index=pd.to_datetime(range(1, 4), unit="ms")) pd.testing.assert_frame_equal(expected_df, d.to_pandas(include_aggregate_name=False)) - expected_df = pd.DataFrame({"1|average": [2, 3, 4]}, index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3]]) + expected_df = pd.DataFrame({"1|average": [2, 3, 4]}, index=pd.to_datetime(range(1, 4), unit="ms")) pd.testing.assert_frame_equal(expected_df, d.to_pandas(include_aggregate_name=True)) def test_id_and_external_id_set_gives_external_id_columns(self, cognite_client): @@ -908,8 +868,8 @@ def test_id_and_external_id_set_gives_external_id_columns(self, cognite_client): d = Datapoints(id=0, external_id="abc", timestamp=[1, 2, 3], average=[2, 3, 4], step_interpolation=[3, 4, 5]) expected_df = pd.DataFrame( - {"abc|average": [2, 3, 4], "abc|stepInterpolation": [3, 4, 5]}, - index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3]], + {"abc|average": [2, 3, 4], "abc|step_interpolation": [3, 4, 5]}, + index=pd.to_datetime(range(1, 4), unit="ms"), ) pd.testing.assert_frame_equal(expected_df, d.to_pandas()) @@ -927,12 +887,12 @@ def test_datapoints_list(self, cognite_client): expected_df = pd.DataFrame( { "1|average": [2, 3, 4], - "1|stepInterpolation": [3, 4, 5], + "1|step_interpolation": [3, 4, 5], "2|max": [2, 3, 4], - "2|stepInterpolation": [3, 4, 5], + "2|step_interpolation": [3, 4, 5], "3": [1, None, 3], }, - index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3]], + index=pd.to_datetime(range(1, 4), unit="ms"), ) pd.testing.assert_frame_equal(expected_df, dps_list.to_pandas(), check_freq=False) @@ -943,7 +903,7 @@ def test_datapoints_list_names(self, cognite_client): d2 = Datapoints(id=3, timestamp=[1, 3], average=[1, 3]) dps_list = DatapointsList([d1, d2]) expected_df = pd.DataFrame( - {"2|max": [2, 3, 4], "3|average": [1, None, 3]}, index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3]] + {"2|max": [2, 3, 4], "3|average": [1, None, 3]}, index=pd.to_datetime(range(1, 4), unit="ms") ) pd.testing.assert_frame_equal(expected_df, dps_list.to_pandas(), check_freq=False) expected_df.columns = [c[:1] for c in expected_df.columns] @@ -957,7 +917,7 @@ def test_datapoints_list_names_dup(self, cognite_client): dps_list = DatapointsList([d1, d2]) expected_df = pd.DataFrame( {"2|max": [2, 3, 4], "2|average": [1, None, 3]}, - index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3]], + index=pd.to_datetime(range(1, 4), unit="ms"), columns=["2|max", "2|average"], ) pd.testing.assert_frame_equal(expected_df, dps_list.to_pandas(), check_freq=False) @@ -974,7 +934,7 @@ def test_datapoints_list_non_aligned(self, cognite_client): expected_df = pd.DataFrame( {"1": [1, 2, 3, None, None], "2": [None, None, 3, 4, 5]}, - index=[utils._time.ms_to_datetime(ms) for ms in [1, 2, 3, 4, 5]], + index=pd.to_datetime(range(1, 6), unit="ms"), ) pd.testing.assert_frame_equal(expected_df, dps_list.to_pandas(), check_freq=False) @@ -983,7 +943,7 @@ def test_datapoints_list_empty(self, cognite_client): assert dps_list.to_pandas().empty def test_retrieve_dataframe(self, cognite_client, mock_get_datapoints): - df = cognite_client.datapoints.retrieve_dataframe( + df = cognite_client.time_series.data.retrieve_dataframe( id=[1, {"id": 2, "aggregates": ["max"]}], 
external_id=["123"], start=1000000, @@ -1000,31 +960,26 @@ def test_retrieve_datapoints_some_aggregates_omitted( ): import pandas as pd - df = cognite_client.datapoints.retrieve_dataframe( + df = cognite_client.time_series.data.retrieve_dataframe( id={"id": 1, "aggregates": ["average"]}, external_id={"externalId": "def", "aggregates": ["interpolation"]}, start=0, end=1, aggregates=[], granularity="1s", + column_names="external_id", ) expected_df = pd.DataFrame( - {"1|average": [0, 1, 2, 3, 4], "def|interpolation": [None, 1, None, 3, None]}, - index=[utils._time.ms_to_datetime(i) for i in range(5)], + {"abc|average": [0.0, 1, 2, 3, 4], "def|interpolation": [None, 1, None, 3, None]}, + index=pd.to_datetime(range(5), unit="ms"), ) pd.testing.assert_frame_equal(df, expected_df) - def test_retrieve_datapoints_last_beyond_end(self, cognite_client, mock_get_datapoints_include_outside): - dpt = cognite_client.datapoints.retrieve( - id=1, include_outside_points=True, start=1000000000, end=1000000000 + 100000 - ) - assert 100001 == len(dpt) - def test_retrieve_dataframe_several_missing(self, cognite_client, mock_get_datapoints_several_missing): import pandas as pd - df = cognite_client.datapoints.retrieve_dataframe( + df = cognite_client.time_series.data.retrieve_dataframe( id=[ {"id": 1, "aggregates": ["average"]}, {"id": 2, "aggregates": ["interpolation"]}, @@ -1034,205 +989,24 @@ def test_retrieve_dataframe_several_missing(self, cognite_client, mock_get_datap start=0, end=1, granularity="1s", + column_names="id", ) expected_df = pd.DataFrame( {"1|average": [11, 22, 44, None], "2|interpolation": [None, 1, 3, None], "3|count": [None, 2, 4, 5]}, - index=[utils._time.ms_to_datetime(i * 1000) for i in [0, 1, 3, 4]], + index=pd.to_datetime([0, 1, 3, 4], unit="s"), ) pd.testing.assert_frame_equal(df, expected_df) - def test_retrieve_dataframe_complete_single_isstep(self, cognite_client, mock_get_datapoints_single_isstep): - import pandas as pd - - df = cognite_client.datapoints.retrieve_dataframe( - id=[{"id": 3, "aggregates": ["interpolation"]}], - aggregates=[], - start=0, - end=1, - granularity="1s", - complete="fill", - include_aggregate_name=False, - ) - - expected_df = pd.DataFrame( - {"3": [1.0, 1.0, 3.0]}, index=[utils._time.ms_to_datetime(i * 1000) for i in [1, 2, 3]] - ) - pd.testing.assert_frame_equal(df, expected_df) - - def test_retrieve_dataframe_several_missing_complete(self, cognite_client, mock_get_datapoints_several_missing): - import pandas as pd - - df = cognite_client.datapoints.retrieve_dataframe( - id=[ - {"id": 1, "aggregates": ["average"]}, - {"id": 2, "aggregates": ["interpolation"]}, - {"id": 3, "aggregates": ["count"]}, - ], - aggregates=[], - start=0, - end=1, - granularity="1s", - complete="fill", - ) - - expected_df = pd.DataFrame( - { - "1|average": [11.0, 22.0, None, 44.0, None], - "2|interpolation": [None, 1.0, 2.0, 3.0, None], - "3|count": [0.0, 2.0, 0.0, 4.0, 5.0], - }, - index=[utils._time.ms_to_datetime(i * 1000) for i in range(5)], - ) - pd.testing.assert_frame_equal(df, expected_df) - - def test_retrieve_dataframe_dict_empty(self, cognite_client, mock_get_datapoints_empty): - dfd = cognite_client.datapoints.retrieve_dataframe_dict( - id=1, - aggregates=["count", "interpolation", "stepInterpolation", "totalVariation"], - start=0, - end=1, - granularity="1s", - ) - assert isinstance(dfd, dict) - assert 4 == len(dfd) - - def test_retrieve_dataframe_dict_empty_single_aggregate(self, cognite_client, mock_get_datapoints_empty): - dfd = 
cognite_client.datapoints.retrieve_dataframe_dict( - id=1, aggregates=["count"], start=0, end=1, granularity="1s" - ) - assert isinstance(dfd, dict) - assert ["count"] == list(dfd.keys()) - assert dfd["count"].empty - - def test_retrieve_dataframe_complete_all(self, cognite_client, mock_get_datapoints): - import pandas as pd - - df = cognite_client.datapoints.retrieve_dataframe( - id=[1, 2], - aggregates=["count", "sum", "average", "totalVariation"], - start=0, - end=1, - granularity="1s", - complete="fill", - ) - assert isinstance(df, pd.DataFrame) - assert 8 == df.shape[1] - - def test_retrieve_dataframe_dict(self, cognite_client, mock_get_datapoints_several_missing): - import pandas as pd - - dfd = cognite_client.datapoints.retrieve_dataframe_dict( - id=[ - {"id": 1, "aggregates": ["average"]}, - {"id": 2, "aggregates": ["interpolation"]}, - {"id": 3, "aggregates": ["count"]}, - ], - aggregates=[], - start=0, - end=1, - granularity="1s", - ) - assert isinstance(dfd, dict) - assert 3 == len(dfd) - - expected_dict = { - "average": pd.DataFrame( - {"1": [11.0, 22.0, 44.0, None]}, index=[utils._time.ms_to_datetime(i * 1000) for i in [0, 1, 3, 4]] - ), - "count": pd.DataFrame( - {"3": [None, 2, 4, 5]}, index=[utils._time.ms_to_datetime(i * 1000) for i in [0, 1, 3, 4]] - ), - "interpolation": pd.DataFrame( - {"2": [None, 1, 3, None]}, index=[utils._time.ms_to_datetime(i * 1000) for i in [0, 1, 3, 4]] - ), - } - for k in expected_dict: - pd.testing.assert_frame_equal(expected_dict[k], dfd[k]) - - def test_retrieve_dataframe_dict_complete(self, cognite_client, mock_get_datapoints_several_missing): - import pandas as pd - - dfd = cognite_client.datapoints.retrieve_dataframe_dict( - id=[{"id": 2, "aggregates": ["interpolation"]}, {"id": 3, "aggregates": ["count"]}], - aggregates=[], - start=0, - end=1, - granularity="1s", - complete="fill,dropna", - ) - assert isinstance(dfd, dict) - assert 2 == len(dfd) - - expected_dict = { - "count": pd.DataFrame( - {"3": [2.0, 0.0, 4.0]}, index=[utils._time.ms_to_datetime(i * 1000) for i in [1, 2, 3]] - ), - "interpolation": pd.DataFrame( - {"2": [1.0, 2.0, 3.0]}, index=[utils._time.ms_to_datetime(i * 1000) for i in [1, 2, 3]] - ), - } - - for k in expected_dict: - pd.testing.assert_frame_equal(expected_dict[k], dfd[k]) - - with pytest.raises(ValueError, match="is not supported for dataframe completion"): - dfd = cognite_client.datapoints.retrieve_dataframe_dict( - id=[{"id": 1, "aggregates": ["average"]}], - aggregates=[], - start=0, - end=1, - granularity="1s", - complete="fill,dropna", - ) - - def test_retrieve_dataframe_id_and_external_id_requested(self, cognite_client, rsps): - rsps.add( - rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", - status=200, - json={ - "items": [ - { - "id": 1, - "externalId": "abc", - "isString": False, - "isStep": False, - "datapoints": [{"timestamp": 0, "average": 1}], - } - ] - }, - ) - rsps.add( - rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", - status=200, - json={ - "items": [ - { - "id": 2, - "externalId": "def", - "isString": False, - "isStep": False, - "datapoints": [{"timestamp": 0, "average": 1}], - } - ] - }, - ) - res = cognite_client.datapoints.retrieve_dataframe( - start=0, end="now", id=1, external_id=["def"], aggregates=["average"], granularity="1m" - ) - assert {"1|average", "def|average"} == set(res.columns) - def test_insert_dataframe(self, cognite_client, mock_post_datapoints): import pandas as pd timestamps = 
[1500000000000, 1510000000000, 1520000000000, 1530000000000] df = pd.DataFrame( {"123": [1, 2, 3, 4], "456": [5.0, 6.0, 7.0, 8.0]}, - index=[utils._time.ms_to_datetime(ms) for ms in timestamps], + index=pd.to_datetime(timestamps, unit="ms"), ) - res = cognite_client.datapoints.insert_dataframe(df) + res = cognite_client.time_series.data.insert_dataframe(df, external_id_headers=False) assert res is None request_body = jsgz_load(mock_post_datapoints.calls[0].request.body) assert { @@ -1254,9 +1028,9 @@ def test_insert_dataframe_external_ids(self, cognite_client, mock_post_datapoint timestamps = [1500000000000, 1510000000000, 1520000000000, 1530000000000] df = pd.DataFrame( {"123": [1, 2, 3, 4], "456": [5.0, 6.0, 7.0, 8.0]}, - index=[utils._time.ms_to_datetime(ms) for ms in timestamps], + index=pd.to_datetime(timestamps, unit="ms"), ) - res = cognite_client.datapoints.insert_dataframe(df, external_id_headers=True) + res = cognite_client.time_series.data.insert_dataframe(df, external_id_headers=True) assert res is None request_body = jsgz_load(mock_post_datapoints.calls[0].request.body) assert { @@ -1278,10 +1052,10 @@ def test_insert_dataframe_with_nans(self, cognite_client): timestamps = [1500000000000, 1510000000000, 1520000000000, 1530000000000] df = pd.DataFrame( {"123": [1, 2, None, 4], "456": [5.0, 6.0, 7.0, 8.0]}, - index=[utils._time.ms_to_datetime(ms) for ms in timestamps], + index=pd.to_datetime(timestamps, unit="ms"), ) - with pytest.raises(AssertionError, match="contains NaNs"): - cognite_client.datapoints.insert_dataframe(df) + with pytest.raises(ValueError, match="contains one or more NaNs"): + cognite_client.time_series.data.insert_dataframe(df, dropna=False) def test_insert_dataframe_with_dropna(self, cognite_client, mock_post_datapoints): import pandas as pd @@ -1289,9 +1063,9 @@ def test_insert_dataframe_with_dropna(self, cognite_client, mock_post_datapoints timestamps = [1500000000000, 1510000000000, 1520000000000, 1530000000000] df = pd.DataFrame( {"123": [1, 2, None, 4], "456": [5.0, 6.0, 7.0, 8.0]}, - index=[utils._time.ms_to_datetime(ms) for ms in timestamps], + index=pd.to_datetime(timestamps, unit="ms"), ) - res = cognite_client.datapoints.insert_dataframe(df, dropna=True) + res = cognite_client.time_series.data.insert_dataframe(df, external_id_headers=False, dropna=True) assert res is None request_body = jsgz_load(mock_post_datapoints.calls[0].request.body) assert { @@ -1313,21 +1087,20 @@ def test_insert_dataframe_single_dp(self, cognite_client, mock_post_datapoints): import pandas as pd timestamps = [1500000000000] - df = pd.DataFrame({"a": [1.0], "b": [2.0]}, index=[utils._time.ms_to_datetime(ms) for ms in timestamps]) - res = cognite_client.datapoints.insert_dataframe(df, external_id_headers=True) + df = pd.DataFrame({"a": [1.0], "b": [2.0]}, index=pd.to_datetime(timestamps, unit="ms")) + res = cognite_client.time_series.data.insert_dataframe(df, external_id_headers=True) assert res is None def test_insert_dataframe_with_infs(self, cognite_client): - import numpy as np import pandas as pd timestamps = [1500000000000, 1510000000000, 1520000000000, 1530000000000] df = pd.DataFrame( - {"123": [1, 2, np.inf, 4], "456": [5.0, 6.0, 7.0, 8.0], "xyz": ["a", "b", "c", "d"]}, - index=[utils._time.ms_to_datetime(ms) for ms in timestamps], + {"123": [1, 2, math.inf, 4], "456": [5.0, 6.0, 7.0, 8.0], "xyz": ["a", "b", "c", "d"]}, + index=pd.to_datetime(timestamps, unit="ms"), ) - with pytest.raises(AssertionError, match="contains Infinity"): - 
cognite_client.datapoints.insert_dataframe(df) + with pytest.raises(ValueError, match=re.escape("contains one or more (+/-) Infinity")): + cognite_client.time_series.data.insert_dataframe(df) def test_insert_dataframe_with_strings(self, cognite_client, mock_post_datapoints): import pandas as pd @@ -1335,20 +1108,20 @@ def test_insert_dataframe_with_strings(self, cognite_client, mock_post_datapoint timestamps = [1500000000000, 1510000000000, 1520000000000, 1530000000000] df = pd.DataFrame( {"123": ["a", "b", "c", "d"], "456": [5.0, 6.0, 7.0, 8.0]}, - index=[utils._time.ms_to_datetime(ms) for ms in timestamps], + index=pd.to_datetime(timestamps, unit="ms"), ) - cognite_client.datapoints.insert_dataframe(df) + cognite_client.time_series.data.insert_dataframe(df) def test_retrieve_datapoints_multiple_time_series_correct_ordering(self, cognite_client, mock_get_datapoints): ids = [1, 2, 3] external_ids = ["4", "5", "6"] - dps_res_list = cognite_client.datapoints.retrieve(id=ids, external_id=external_ids, start=0, end=100000) + dps_res_list = cognite_client.time_series.data.retrieve(id=ids, external_id=external_ids, start=0, end=100000) assert list(dps_res_list.to_pandas().columns) == ["1", "2", "3", "4", "5", "6"], "Incorrect column ordering" def test_retrieve_datapoints_one_ts_empty_correct_number_of_columns( self, cognite_client, mock_get_datapoints_one_ts_empty ): - res = cognite_client.datapoints.retrieve(id=[1, 2], start=0, end=10000) + res = cognite_client.time_series.data.retrieve(id=[1, 2], start=0, end=10000) assert 2 == len(res.to_pandas().columns) @@ -1361,15 +1134,15 @@ def request_callback(request): end = payload["end"] assert payload["aggregates"] == ["count"] - assert utils._time.granularity_to_ms(payload["granularity"]) >= utils._time.granularity_to_ms("1d") + assert granularity_to_ms(payload["granularity"]) >= granularity_to_ms("1d") - dps = [{"timestamp": i, "count": 1000} for i in range(start, end, utils._time.granularity_to_ms(granularity))] + dps = [{"timestamp": i, "count": 1000} for i in range(start, end, granularity_to_ms(granularity))] response = {"items": [{"id": 0, "externalId": "bla", "isStep": False, "isString": False, "datapoints": dps}]} return 200, {}, json.dumps(response) rsps.add_callback( rsps.POST, - cognite_client.datapoints._get_base_url_with_base_path() + "/timeseries/data/list", + cognite_client.time_series.data._get_base_url_with_base_path() + "/timeseries/data/list", callback=request_callback, content_type="application/json", ) @@ -1402,185 +1175,5 @@ def test_datapoints_bin_will_fit__above_dps_limit_above_ts_limit(self, cognite_c assert not bin.will_fit(1) -gms = utils._time.granularity_to_ms - - class TestDataFetcher: - @pytest.mark.parametrize( - "q, exc, message", - [ - ( - DatapointsQuery(start=1, end=2, id=1, aggregates=["average"]), - ValueError, - "granularity must also be provided", - ), - (DatapointsQuery(start=1, end=2, id=1, granularity="1d"), ValueError, "aggregates must also be provided"), - ( - DatapointsQuery(start=1, end=2, id=[1, 1], granularity="1d", aggregates=["average"]), - ValueError, - "identifier '1' is duplicated in query", - ), - ( - DatapointsQuery( - start=1, end=2, id=[1, {"id": 1, "aggregates": ["max"]}], granularity="1d", aggregates=["average"] - ), - ValueError, - "identifier '1' is duplicated in query", - ), - ], - ) - def test_validate_query(self, cognite_client, q, exc, message): - with pytest.raises(exc, match=message): - DatapointsFetcher(cognite_client.datapoints).fetch(q) - - @pytest.mark.parametrize( - "fn", - [ - 
lambda cognite_client: ( - [_DPTask(cognite_client.datapoints, 1, 2, {}, None, None, None, None, False)], - [_DPTask(cognite_client.datapoints, 1, 2, {}, None, None, None, None, False)], - ), - lambda cognite_client: ( - [ - _DPTask( - cognite_client.datapoints, - datetime(2018, 1, 1), - datetime(2019, 1, 1), - {}, - None, - None, - None, - None, - False, - ) - ], - [_DPTask(cognite_client.datapoints, 1514764800000, 1546300800000, {}, None, None, None, None, False)], - ), - lambda cognite_client: ( - [_DPTask(cognite_client.datapoints, gms("1h"), gms(("25h")), {}, ["average"], "1d", None, None, False)], - [_DPTask(cognite_client.datapoints, gms("1d"), gms("2d"), {}, ["average"], "1d", None, None, False)], - ), - ], - ) - def test_preprocess_tasks(self, cognite_client, fn): - q, expected_q = fn(cognite_client) - DatapointsFetcher(cognite_client.datapoints)._preprocess_tasks(q) - for actual, expected in zip(q, expected_q): - assert expected.start == actual.start - assert expected.end == actual.end - - @pytest.mark.parametrize( - "ts, granularity, expected_output", - [ - (gms("1h"), "1d", gms("1d")), - (gms("23h"), "10d", gms("1d")), - (gms("24h"), "5d", gms("1d")), - (gms("25h"), "3d", gms("2d")), - (gms("1m"), "1h", gms("1h")), - (gms("90s"), "10m", gms("2m")), - (gms("90s"), "1s", gms("90s")), - ], - ) - def test_align_with_granularity_unit(self, cognite_client, ts, granularity, expected_output): - assert expected_output == DatapointsFetcher._align_with_granularity_unit(ts, granularity) - - @pytest.mark.parametrize( - "start, end, granularity, request_limit, user_limit, expected_output", - [ - (0, gms("20d"), "10d", 2, None, [_DPWindow(0, 1728000000)]), - (0, gms("20d"), "10d", 1, None, [_DPWindow(0, 864000000), _DPWindow(864000000, 1728000000)]), - ( - 0, - gms("6d"), - "1s", - 2000, - None, - [_DPWindow(0, gms("2d")), _DPWindow(gms("2d"), gms("4d")), _DPWindow(gms("4d"), gms("6d"))], - ), - ( - 0, - gms("3d"), - None, - 1000, - None, - [_DPWindow(0, gms("1d")), _DPWindow(gms("1d"), gms("2d")), _DPWindow(gms("2d"), gms("3d"))], - ), - (0, gms("1h"), None, 2000, None, [_DPWindow(0, 3600000)]), - (0, gms("1s"), None, 1, None, [_DPWindow(0, 1000)]), - (0, gms("1s"), None, 1, None, [_DPWindow(0, 1000)]), - (0, gms("3d"), None, 1000, 500, [_DPWindow(0, gms("1d"))]), - ], - ) - def test_get_datapoints_windows( - self, cognite_client, start, end, granularity, request_limit, user_limit, expected_output, mock_get_dps_count - ): - user_limit = user_limit or float("inf") - task = _DPTask( - client=cognite_client.datapoints, - start=start, - end=end, - ts_item={}, - granularity=granularity, - aggregates=[], - limit=None, - include_outside_points=False, - ignore_unknown_ids=False, - ) - task.request_limit = request_limit - res = DatapointsFetcher(cognite_client.datapoints)._get_windows( - id=0, task=task, remaining_user_limit=user_limit - ) - for w in expected_output: - w.limit = user_limit - assert expected_output == res - - @pytest.mark.parametrize( - "start, end, granularity, expected_output", - [ - (0, 10001, "1s", 10000), - (0, 10000, "1m", 0), - (0, 110000, "1m", gms("1m")), - (0, gms("10d") - 1, "2d", gms("8d")), - (0, gms("10d") - 1, "1m", gms("10d") - gms("1m")), - (0, 10000, "1s", 10000), - ], - ) - def test_align_window_end(self, cognite_client, start, end, granularity, expected_output): - assert expected_output == DatapointsFetcher._align_window_end(start, end, granularity) - - @pytest.mark.parametrize( - "ids, external_ids, expected_output", - [ - (1, None, ([{"id": 1}], True)), - (None, 
"1", ([{"externalId": "1"}], True)), - (1, "1", ([{"id": 1}, {"externalId": "1"}], False)), - ([1], ["1"], ([{"id": 1}, {"externalId": "1"}], False)), - ([1], None, ([{"id": 1}], False)), - ({"id": 1, "aggregates": ["average"]}, None, ([{"id": 1, "aggregates": ["average"]}], True)), - ({"id": 1}, {"externalId": "1"}, ([{"id": 1}, {"externalId": "1"}], False)), - ( - [{"id": 1, "aggregates": ["average"]}], - [{"externalId": "1", "aggregates": ["average", "sum"]}], - ([{"id": 1, "aggregates": ["average"]}, {"externalId": "1", "aggregates": ["average", "sum"]}], False), - ), - ], - ) - def test_process_time_series_input_ok(self, cognite_client, ids, external_ids, expected_output): - assert expected_output == DatapointsFetcher._process_ts_identifiers(ids, external_ids) - - @pytest.mark.parametrize( - "ids, external_ids, exception, match", - [ - (1.0, None, TypeError, "Invalid type ''"), - ([1.0], None, TypeError, "Invalid type ''"), - (None, 1, TypeError, "Invalid type ''"), - (None, [1], TypeError, "Invalid type ''"), - ({"wrong": 1, "aggregates": ["average"]}, None, ValueError, "Unknown key 'wrong'"), - (None, [{"externalId": 1, "wrong": ["average"]}], ValueError, "Unknown key 'wrong'"), - (None, {"id": 1, "aggregates": ["average"]}, ValueError, "Unknown key 'id'"), - ({"externalId": 1}, None, ValueError, "Unknown key 'externalId'"), - ], - ) - def test_process_time_series_input_fail(self, cognite_client, ids, external_ids, exception, match): - with pytest.raises(exception, match=match): - DatapointsFetcher._process_ts_identifiers(ids, external_ids) + pass # TODO(haakonvt): Get to it! diff --git a/tests/tests_unit/test_api/test_datapoints_tasks.py b/tests/tests_unit/test_api/test_datapoints_tasks.py new file mode 100644 index 0000000000..36cfa0acc9 --- /dev/null +++ b/tests/tests_unit/test_api/test_datapoints_tasks.py @@ -0,0 +1,215 @@ +import math +import random +import re +from datetime import datetime, timezone +from decimal import Decimal + +import pytest + +from cognite.client._api.datapoint_tasks import _SingleTSQueryValidator, dps_container, subtask_lst +from cognite.client.data_classes import DatapointsQuery +from cognite.client.utils._auxiliary import random_string +from cognite.client.utils._time import timestamp_to_ms +from tests.utils import random_aggregates, random_cognite_ids, random_gamma_dist_integer, random_granularity + + +class TestSingleTSQueryValidator: + @pytest.mark.parametrize( + "ids, xids", + ( + (None, None), + ([], None), + (None, []), + ([], []), + ), + ) + def test_no_identifiers_raises(self, ids, xids): + with pytest.raises(ValueError, match=re.escape("Pass at least one time series `id` or `external_id`!")): + _SingleTSQueryValidator(DatapointsQuery(id=ids, external_id=xids)).validate_and_create_single_queries() + + @pytest.mark.parametrize( + "ids, xids, exp_attr_to_fail", + ( + ({123}, None, "id"), + (None, {"foo"}, "external_id"), + ({123}, {"foo"}, "id"), + ), + ) + def test_wrong_identifier_type_raises(self, ids, xids, exp_attr_to_fail): + err_msg = f"Got unsupported type {type(ids or xids)}, as, or part of argument `{exp_attr_to_fail}`." 
+ + with pytest.raises(TypeError, match=re.escape(err_msg)): + _SingleTSQueryValidator(DatapointsQuery(id=ids, external_id=xids)).validate_and_create_single_queries() + + @pytest.mark.parametrize( + "ids, xids, exp_attr_to_fail", + ( + ({"iid": 123}, None, "id"), + (None, {"extern-id": "foo"}, "external_id"), + ([{"iid": 123}], None, "id"), + (None, [{"extern-id": "foo"}], "external_id"), + ({"iid": 123}, {"extern-id": "foo"}, "id"), + ), + ) + def test_missing_identifier_in_dict_raises(self, ids, xids, exp_attr_to_fail): + err_msg = f"Missing required key `{exp_attr_to_fail}` in dict:" + + with pytest.raises(KeyError, match=re.escape(err_msg)): + _SingleTSQueryValidator(DatapointsQuery(id=ids, external_id=xids)).validate_and_create_single_queries() + + @pytest.mark.parametrize( + "ids, xids, exp_wrong_type, exp_attr_to_fail", + ( + ({"id": "123"}, None, str, "id"), + ({"id": 42 + 0j}, None, complex, "id"), + ({"id": Decimal("123")}, None, Decimal, "id"), + (None, {"external_id": 123}, int, "external_id"), + (None, {"externalId": ["foo"]}, list, "external_id"), + ([{"id": "123"}], None, str, "id"), + (None, [{"external_id": 123}], int, "external_id"), + (None, [{"externalId": b"foo"}], bytes, "external_id"), + ({"id": None}, {"external_id": "foo"}, type(None), "id"), + ), + ) + def test_identifier_in_dict_has_wrong_type(self, ids, xids, exp_wrong_type, exp_attr_to_fail): + err_msg = f"Got unsupported type {exp_wrong_type}, as, or part of argument `{exp_attr_to_fail}`." + + with pytest.raises(TypeError, match=re.escape(err_msg)): + _SingleTSQueryValidator(DatapointsQuery(id=ids, external_id=xids)).validate_and_create_single_queries() + + @pytest.mark.parametrize("identifier_dct", ({"id": 123}, {"external_id": "foo"}, {"externalId": "bar"})) + def test_identifier_dicts_has_wrong_keys(self, identifier_dct): + good_keys = random.choices( + ["start", "end", "aggregates", "granularity", "include_outside_points", "limit"], + k=random.randint(0, 6), + ) + bad_keys = [random_string(20) for _ in range(random.randint(1, 3))] + identifier_dct.update(dict.fromkeys(good_keys + bad_keys)) + if "id" in identifier_dct: + identifier = "id" + query = DatapointsQuery(id=identifier_dct, external_id=None) + else: + identifier = "external_id" + query = DatapointsQuery(id=None, external_id=identifier_dct) + + with pytest.raises( + KeyError, match=re.escape(f"Dict provided by argument `{identifier}` included key(s) not understood") + ): + _SingleTSQueryValidator(query).validate_and_create_single_queries() + + @pytest.mark.parametrize("limit, exp_limit", [(0, 0), (1, 1), (-1, None), (math.inf, None), (None, None)]) + def test_valid_limits(self, limit, exp_limit): + ts_query = _SingleTSQueryValidator(DatapointsQuery(id=0, limit=limit)).validate_and_create_single_queries() + assert len(ts_query) == 1 + assert ts_query[0].limit == exp_limit + + @pytest.mark.parametrize("limit", (-2, -math.inf, math.nan, ..., "5000")) + def test_limits_not_allowed_values(self, limit): + with pytest.raises(TypeError, match=re.escape("Parameter `limit` must be a non-negative integer -OR-")): + _SingleTSQueryValidator(DatapointsQuery(id=0, limit=limit)).validate_and_create_single_queries() + + @pytest.mark.parametrize( + "granularity, aggregates, outside, exp_err, exp_err_msg_idx", + ( + (4000, ["min"], None, TypeError, 0), + ("4h", {"min"}, None, TypeError, 1), + ("4h", None, None, ValueError, 2), + ("4h", [], None, ValueError, 3), + (None, ["min"], None, ValueError, 4), + ("4h", ["min"], True, ValueError, 5), + ), + ) + def 
test_function_validate_and_create_query(self, granularity, aggregates, outside, exp_err, exp_err_msg_idx): + err_msgs = [ + f"Expected `granularity` to be of type `str` or None, not {type(granularity)}", + f"Expected `aggregates` to be of type `list[str]` or None, not {type(aggregates)}", + "When passing `granularity`, argument `aggregates` is also required.", + "Empty list of `aggregates` passed, expected at least one!", + "When passing `aggregates`, argument `granularity` is also required.", + "'Include outside points' is not supported for aggregates.", + ] + user_query = DatapointsQuery( + id=0, granularity=granularity, aggregates=aggregates, include_outside_points=outside + ) + with pytest.raises(exp_err, match=re.escape(err_msgs[exp_err_msg_idx])): + _SingleTSQueryValidator(user_query).validate_and_create_single_queries() + + @pytest.mark.parametrize( + "start, end", + ( + (None, None), + (None, 123), + (123, None), + (-123, "now"), + (-123, -12), + ("now", 2 * timestamp_to_ms("now")), + (1, datetime.now()), + (1, datetime.now(timezone.utc)), + (datetime.utcnow(), 2 * timestamp_to_ms("now")), + ), + ) + def test_function__verify_time_range__valid_inputs(self, start, end): + gran_dct = {"granularity": random_granularity(), "aggregates": random_aggregates()} + for kwargs in [{}, gran_dct]: + user_query = DatapointsQuery(id=0, start=start, end=end, **kwargs) + ts_query = _SingleTSQueryValidator(user_query).validate_and_create_single_queries() + assert isinstance(ts_query[0].start, int) + assert isinstance(ts_query[0].end, int) + + @pytest.mark.parametrize( + "start, end", + ( + (0, -1), + (50, 50), + (-50, -50), + (None, 0), + ("now", -123), + ("now", 123), + (2 * timestamp_to_ms("now"), "now"), + (datetime.now(), 123), + (datetime.now(timezone.utc), 123), + ), + ) + def test_function__verify_time_range__raises(self, start, end): + gran_dct = {"granularity": random_granularity(), "aggregates": random_aggregates()} + for kwargs in [{}, gran_dct]: + user_query = DatapointsQuery(id=0, start=start, end=end, **kwargs) + with pytest.raises(ValueError, match="Invalid time range"): + _SingleTSQueryValidator(user_query).validate_and_create_single_queries() + + def test_retrieve_aggregates__include_outside_points_raises(self): + id_dct_lst = [ + {"id": ts_id, "granularity": random_granularity(), "aggregates": random_aggregates()} + for ts_id in random_cognite_ids(10) + ] + # Only one time series is configured wrong and will raise: + id_dct_lst[-1]["include_outside_points"] = True + + user_query = DatapointsQuery(id=id_dct_lst, include_outside_points=False) + with pytest.raises(ValueError, match="'Include outside points' is not supported for aggregates."): + _SingleTSQueryValidator(user_query).validate_and_create_single_queries() + + +@pytest.fixture +def create_random_int_tuples(): + return [ + tuple(random.choices(range(-50, 50), k=random.randint(1, 5))) for _ in range(random_gamma_dist_integer(100)) + ] + + +class TestSortedContainers: + def test_dps_container(self, create_random_int_tuples): + container = dps_container() + for k in create_random_int_tuples: + container[k] = None + assert list(container.keys()) == sorted(create_random_int_tuples) + + def test_subtask_lst(self, create_random_int_tuples): + class Foo: + def __init__(self, idx): + self.subtask_idx = idx + + random_foos = [Foo(tpl) for tpl in create_random_int_tuples] + container = subtask_lst() + container.update(random_foos) + assert list(container) == sorted(random_foos, key=lambda foo: foo.subtask_idx) diff --git 
a/tests/tests_unit/test_api/test_events.py b/tests/tests_unit/test_api/test_events.py index 5babee2f11..e6a8a6a9f8 100644 --- a/tests/tests_unit/test_api/test_events.py +++ b/tests/tests_unit/test_api/test_events.py @@ -241,7 +241,7 @@ def test_event_list_to_pandas_empty(self, cognite_client, mock_events_empty): def test_event_to_pandas(self, cognite_client, mock_events_response): import pandas as pd - df = cognite_client.events.retrieve(id=1).to_pandas() + df = cognite_client.events.retrieve(id=1).to_pandas(camel_case=True) assert isinstance(df, pd.DataFrame) assert "metadata" not in df.columns assert [1] == df.loc["assetIds"][0] diff --git a/tests/tests_unit/test_api/test_files.py b/tests/tests_unit/test_api/test_files.py index b9bb7c11ee..8eed9ce4b8 100644 --- a/tests/tests_unit/test_api/test_files.py +++ b/tests/tests_unit/test_api/test_files.py @@ -591,7 +591,7 @@ def test_file_list_to_pandas_empty(self, cognite_client, mock_files_empty): def test_file_to_pandas(self, cognite_client, mock_files_response): import pandas as pd - df = cognite_client.files.retrieve(id=1).to_pandas() + df = cognite_client.files.retrieve(id=1).to_pandas(camel_case=True) assert isinstance(df, pd.DataFrame) assert "metadata" not in df.columns assert [1] == df.loc["assetIds"][0] diff --git a/tests/tests_unit/test_api/test_geospatial.py b/tests/tests_unit/test_api/test_geospatial.py index 60178f2e86..eec673b070 100644 --- a/tests/tests_unit/test_api/test_geospatial.py +++ b/tests/tests_unit/test_api/test_geospatial.py @@ -1,6 +1,6 @@ +import math import uuid -import numpy as np import pytest from cognite.client import utils @@ -69,19 +69,19 @@ def test_features(test_feature_type): class TestGeospatialAPI: @pytest.mark.dsl def test_to_pandas(self, test_feature_type, test_features): - df = test_features.to_pandas() + df = test_features.to_pandas(camel_case=True) assert set(list(df)) == {"externalId", "dataSetId", "position", "volume", "temperature", "pressure", "assetIds"} @pytest.mark.dsl def test_to_geopandas(self, test_feature_type, test_features): - gdf = test_features.to_geopandas(geometry="position") + gdf = test_features.to_geopandas(geometry="position", camel_case=True) assert set(gdf) == {"externalId", "dataSetId", "position", "volume", "temperature", "pressure", "assetIds"} geopandas = utils._auxiliary.local_import("geopandas") assert type(gdf.dtypes["position"]) == geopandas.array.GeometryDtype @pytest.mark.dsl def test_from_geopandas(self, test_feature_type, test_features): - gdf = test_features.to_geopandas(geometry="position") + gdf = test_features.to_geopandas(geometry="position", camel_case=True) fl = FeatureList.from_geopandas(test_feature_type, gdf) assert type(fl) == FeatureList assert len(fl) == 4 @@ -118,7 +118,7 @@ def test_from_geopandas_nan_values(self, test_feature_type): "temperature": 0.0, "pressure": 1.0, "volume": 11.0, - "weight": np.nan, + "weight": math.nan, "description": "string", "assetIds": [1, 2], }, diff --git a/tests/tests_unit/test_api/test_synthetic_time_series.py b/tests/tests_unit/test_api/test_synthetic_time_series.py index db2e618383..a62dae300d 100644 --- a/tests/tests_unit/test_api/test_synthetic_time_series.py +++ b/tests/tests_unit/test_api/test_synthetic_time_series.py @@ -45,7 +45,7 @@ def request_callback(request): rsps.add_callback( rsps.POST, - cognite_client.datapoints.synthetic._get_base_url_with_base_path() + "/timeseries/synthetic/query", + cognite_client.time_series.data.synthetic._get_base_url_with_base_path() + "/timeseries/synthetic/query", 
callback=request_callback, content_type="application/json", ) @@ -57,7 +57,8 @@ def mock_get_datapoints_empty(rsps, cognite_client): rsps.add( rsps.POST, re.compile( - re.escape(cognite_client.datapoints.synthetic._get_base_url_with_base_path()) + "/timeseries/synthetic/.*" + re.escape(cognite_client.time_series.data.synthetic._get_base_url_with_base_path()) + + "/timeseries/synthetic/.*" ), status=200, json={"items": [{"isString": False, "datapoints": []}]}, @@ -67,7 +68,7 @@ def mock_get_datapoints_empty(rsps, cognite_client): class TestSyntheticQuery: def test_query(self, cognite_client, mock_get_datapoints): - dps_res = cognite_client.datapoints.synthetic.query( + dps_res = cognite_client.time_series.data.synthetic.query( expressions='TS{externalID:"abc"} + TS{id:1}', start=1000000, end=1100001 ) assert isinstance(dps_res, Datapoints) @@ -75,7 +76,7 @@ def test_query(self, cognite_client, mock_get_datapoints): assert 11 == len(mock_get_datapoints.calls) def test_query_limit(self, cognite_client, mock_get_datapoints): - dps_res = cognite_client.datapoints.synthetic.query( + dps_res = cognite_client.time_series.data.synthetic.query( expressions=['TS{externalID:"abc"}', "TS{id:1}"], start=1000000, end=1100001, limit=20000 ) assert 20000 == len(dps_res[0]) @@ -83,7 +84,7 @@ def test_query_limit(self, cognite_client, mock_get_datapoints): assert 4 == len(mock_get_datapoints.calls) def test_query_empty(self, cognite_client, mock_get_datapoints_empty): - dps_res = cognite_client.datapoints.synthetic.query( + dps_res = cognite_client.time_series.data.synthetic.query( expressions=['TS{externalID:"abc"} + TS{id:1}'], start=1000000, end=1100001 ) assert isinstance(dps_res[0], Datapoints) @@ -94,22 +95,22 @@ def test_query_empty(self, cognite_client, mock_get_datapoints_empty): def test_expression_builder(self, cognite_client): from sympy import symbols - assert ("ts{externalId:'x'}", "a") == cognite_client.datapoints.synthetic._build_expression( + assert ("ts{externalId:'x'}", "a") == cognite_client.time_series.data.synthetic._build_expression( symbols("a"), {"a": "x"} ) assert ( "ts{externalId:'x',aggregate:'average',granularity:'1m'}", "a", - ) == cognite_client.datapoints.synthetic._build_expression( + ) == cognite_client.time_series.data.synthetic._build_expression( symbols("a"), {"a": "x"}, aggregate="average", granularity="1m" ) assert ( "(ts{externalId:'x'}+ts{externalId:'y'}+ts{externalId:'z'})", "(a+b+c)", - ) == cognite_client.datapoints.synthetic._build_expression( + ) == cognite_client.time_series.data.synthetic._build_expression( symbols("a") + symbols("b") + symbols("c"), {"a": "x", "b": "y", "c": "z"} ) - assert ("(1/ts{externalId:'a'})", "(1/a)") == cognite_client.datapoints.synthetic._build_expression( + assert ("(1/ts{externalId:'a'})", "(1/a)") == cognite_client.time_series.data.synthetic._build_expression( 1 / symbols("a"), {"a": "a"} ) @@ -120,13 +121,13 @@ def test_expression_builder_variables_missing(self, cognite_client): with pytest.raises( ValueError, match="sympy expressions are only supported in combination with the `variables` parameter" ): - cognite_client.datapoints.synthetic.query([symbols("a")], start=0, end="now") + cognite_client.time_series.data.synthetic.query([symbols("a")], start=0, end="now") @pytest.mark.dsl def test_expression_builder_unsupported_missing(self, cognite_client): from sympy import cot, symbols with pytest.raises(ValueError, match="Unsupported sympy class cot"): - cognite_client.datapoints.synthetic.query( + 
cognite_client.time_series.data.synthetic.query( [symbols("a") + cot(symbols("a"))], start=0, end="now", variables={"a": "a"} ) diff --git a/tests/tests_unit/test_api/test_time_series.py b/tests/tests_unit/test_api/test_time_series.py index 12b09418b2..15b97bba4a 100644 --- a/tests/tests_unit/test_api/test_time_series.py +++ b/tests/tests_unit/test_api/test_time_series.py @@ -1,5 +1,4 @@ import re -from unittest import mock import pytest @@ -198,77 +197,6 @@ def test_update_object(self): ) -@pytest.mark.dsl -class TestPlotTimeSeries: - @pytest.fixture - def mock_get_dps(self, cognite_client, rsps): - rsps.add( - rsps.POST, - cognite_client.time_series._get_base_url_with_base_path() + "/timeseries/data/list", - status=200, - json={ - "items": [ - { - "id": 0, - "externalId": "string1", - "isString": False, - "isStep": False, - "datapoints": [{"timestamp": i * 10000, "average": i} for i in range(5000)], - } - ] - }, - ) - - @mock.patch("matplotlib.pyplot.show") - @mock.patch("pandas.core.frame.DataFrame.rename") - def test_plot_time_series_name_labels( - self, pandas_rename_mock, plt_show_mock, mock_ts_response, mock_get_dps, cognite_client - ): - res = cognite_client.time_series.retrieve(id=0) - df_mock = mock.MagicMock() - pandas_rename_mock.return_value = df_mock - res.plot(aggregates=["average"], granularity="1d") - - assert {0: "stringname", "0|average": "stringname|average"} == pandas_rename_mock.call_args[1]["columns"] - assert 1 == df_mock.plot.call_count - assert 1 == plt_show_mock.call_count - - @mock.patch("matplotlib.pyplot.show") - @mock.patch("pandas.core.frame.DataFrame.plot") - def test_plot_time_series_id_labels( - self, pandas_plot_mock, plt_show_mock, mock_ts_response, mock_get_dps, cognite_client - ): - res = cognite_client.time_series.retrieve(id=0) - res.plot(id_labels=True, aggregates=["average"], granularity="1s") - - assert 1 == pandas_plot_mock.call_count - assert 1 == plt_show_mock.call_count - - @mock.patch("matplotlib.pyplot.show") - @mock.patch("pandas.core.frame.DataFrame.rename") - def test_plot_time_series_list_name_labels( - self, pandas_rename_mock, plt_show_mock, mock_ts_response, mock_get_dps, cognite_client - ): - res = cognite_client.time_series.retrieve_multiple(ids=[0]) - df_mock = mock.MagicMock() - pandas_rename_mock.return_value = df_mock - res.plot(aggregates=["average"], granularity="1h") - assert {0: "stringname", "0|average": "stringname|average"} == pandas_rename_mock.call_args[1]["columns"] - assert 1 == df_mock.plot.call_count - assert 1 == plt_show_mock.call_count - - @mock.patch("matplotlib.pyplot.show") - @mock.patch("pandas.core.frame.DataFrame.plot") - def test_plot_time_series_list_id_labels( - self, pandas_plot_mock, plt_show_mock, mock_ts_response, mock_get_dps, cognite_client - ): - res = cognite_client.time_series.retrieve_multiple(ids=[0]) - res.plot(id_labels=True) - - assert 1 == pandas_plot_mock.call_count - assert 1 == plt_show_mock.call_count - - @pytest.fixture def mock_time_series_empty(rsps, cognite_client): url_pattern = re.compile(re.escape(cognite_client.time_series._get_base_url_with_base_path()) + "/.+") @@ -298,7 +226,7 @@ def test_time_series_list_to_pandas_empty(self, cognite_client, mock_time_series def test_time_series_to_pandas(self, cognite_client, mock_ts_response): import pandas as pd - df = cognite_client.time_series.retrieve(id=1).to_pandas() + df = cognite_client.time_series.retrieve(id=1).to_pandas(camel_case=True) assert isinstance(df, pd.DataFrame) assert "metadata" not in df.columns assert [0] == 
df.loc["securityCategories"][0] diff --git a/tests/tests_unit/test_base.py b/tests/tests_unit/test_base.py index 57510a13c5..ffb1017e62 100644 --- a/tests/tests_unit/test_base.py +++ b/tests/tests_unit/test_base.py @@ -167,7 +167,7 @@ def __init__(self, a_list, ob, ob_expand, ob_ignore, prim, prim_ignore): expected_df.loc["md_key"] = ["md_value"] res = SomeResource([1, 2, 3], {"x": "y"}, {"md_key": "md_value"}, {"bla": "bla"}, "abc", 1) - actual_df = res.to_pandas(expand=["obExpand"], ignore=["primIgnore", "obIgnore"]) + actual_df = res.to_pandas(expand=["obExpand"], ignore=["primIgnore", "obIgnore"], camel_case=True) pd.testing.assert_frame_equal(expected_df, actual_df, check_like=True) res.to_pandas() @@ -209,7 +209,7 @@ def test_to_pandas(self): resource_list = MyResourceList([MyResource(1), MyResource(2, 3)]) expected_df = pd.DataFrame({"varA": [1, 2], "varB": [None, 3]}) - pd.testing.assert_frame_equal(resource_list.to_pandas(), expected_df) + pd.testing.assert_frame_equal(resource_list.to_pandas(camel_case=True), expected_df) @pytest.mark.dsl def test_to_pandas_no_camels(self): diff --git a/tests/tests_unit/test_data_classes/test_time_series.py b/tests/tests_unit/test_data_classes/test_time_series.py index cad72d4444..472c1c942b 100644 --- a/tests/tests_unit/test_data_classes/test_time_series.py +++ b/tests/tests_unit/test_data_classes/test_time_series.py @@ -1,7 +1,7 @@ import pytest -from cognite.client import utils from cognite.client.data_classes import Asset, Datapoint +from cognite.client.utils._time import MAX_TIMESTAMP_MS, MIN_TIMESTAMP_MS from tests.utils import jsgz_load @@ -104,12 +104,20 @@ def mock_get_first_dp_in_ts(mock_ts_by_ids_response, cognite_client): class TestTimeSeries: - def test_get_count(self, cognite_client, mock_count_dps_in_ts): - now = utils.timestamp_to_ms("now") - assert 15 == cognite_client.time_series.retrieve(id=1).count() - assert "count" == jsgz_load(mock_count_dps_in_ts.calls[1].request.body)["aggregates"][0] - assert 0 == jsgz_load(mock_count_dps_in_ts.calls[1].request.body)["start"] - assert now <= jsgz_load(mock_count_dps_in_ts.calls[1].request.body)["end"] + def test_get_count__numeric(self, cognite_client, mock_count_dps_in_ts): + ts = cognite_client.time_series.retrieve(id=1) + ts.is_string = False # TODO: This is not elegant + assert 15 == ts.count() + body = jsgz_load(mock_count_dps_in_ts.calls[1].request.body) + assert len(body["items"]) == 1 + item = body["items"][0] + assert ["count"] == item["aggregates"] + assert MIN_TIMESTAMP_MS == item["start"] + assert MAX_TIMESTAMP_MS < item["end"] # agg dps, end ts is rounded up + + def test_get_count__string_raises(self, cognite_client, mock_ts_by_ids_response): + with pytest.raises(ValueError, match="String time series does not support count aggregate"): + cognite_client.time_series.retrieve(id=1).count() def test_get_latest(self, cognite_client, mock_get_latest_dp_in_ts): res = cognite_client.time_series.retrieve(id=1).latest() @@ -117,13 +125,15 @@ def test_get_latest(self, cognite_client, mock_get_latest_dp_in_ts): assert Datapoint(timestamp=1, value=10) == res def test_get_first_datapoint(self, cognite_client, mock_get_first_dp_in_ts): - now = utils.timestamp_to_ms("now") res = cognite_client.time_series.retrieve(id=1).first() assert isinstance(res, Datapoint) assert Datapoint(timestamp=1, value=10) == res - assert 0 == jsgz_load(mock_get_first_dp_in_ts.calls[1].request.body)["start"] - assert now <= jsgz_load(mock_get_first_dp_in_ts.calls[1].request.body)["end"] - assert 1 == 
jsgz_load(mock_get_first_dp_in_ts.calls[1].request.body)["limit"] + body = jsgz_load(mock_get_first_dp_in_ts.calls[1].request.body) + assert len(body["items"]) == 1 + item = body["items"][0] + assert MIN_TIMESTAMP_MS == item["start"] + assert MAX_TIMESTAMP_MS == item["end"] # raw dps, no ts rounding + assert 1 == item["limit"] def test_asset(self, cognite_client, mock_ts_by_ids_response, mock_asset_by_ids_response): asset = cognite_client.time_series.retrieve(id=1).asset() diff --git a/tests/tests_unit/test_docstring_examples.py b/tests/tests_unit/test_docstring_examples.py index 18eaa052b5..d295f44a2a 100644 --- a/tests/tests_unit/test_docstring_examples.py +++ b/tests/tests_unit/test_docstring_examples.py @@ -1,5 +1,6 @@ import doctest from unittest import TextTestRunner +from unittest.mock import patch import pytest @@ -18,6 +19,7 @@ three_d, time_series, ) +from cognite.client.testing import CogniteClientMock # this fixes the issue with 'got MagicMock but expected Nothing in docstrings' doctest.OutputChecker.__check_output = doctest.OutputChecker.check_output @@ -32,7 +34,7 @@ def run_docstring_tests(module): assert 0 == len(s.failures) -@pytest.mark.usefixtures("mock_cognite_client") +@patch("cognite.client.CogniteClient", CogniteClientMock) class TestDocstringExamples: def test_time_series(self): run_docstring_tests(time_series) @@ -72,8 +74,5 @@ def test_sequences(self): def test_relationships(self): run_docstring_tests(relationships) - -@pytest.mark.usefixtures("mock_cognite_beta_client") -class TestDocstringExamplesBeta: def test_entity_matching(self): run_docstring_tests(entity_matching) diff --git a/tests/tests_unit/test_utils/test_auxiliary.py b/tests/tests_unit/test_utils/test_auxiliary.py index 3e5c9f64bf..100c6fd0f5 100644 --- a/tests/tests_unit/test_utils/test_auxiliary.py +++ b/tests/tests_unit/test_utils/test_auxiliary.py @@ -1,22 +1,34 @@ import json +import math from decimal import Decimal +from itertools import zip_longest import pytest -from cognite.client import utils from cognite.client.exceptions import CogniteImportError +from cognite.client.utils._auxiliary import ( + assert_type, + find_duplicates, + interpolate_and_url_encode, + json_dump_default, + local_import, + split_into_chunks, + split_into_n_parts, + to_camel_case, + to_snake_case, +) class TestCaseConversion: def test_to_camel_case(self): - assert "camelCase" == utils._auxiliary.to_camel_case("camel_case") - assert "camelCase" == utils._auxiliary.to_camel_case("camelCase") - assert "a" == utils._auxiliary.to_camel_case("a") + assert "camelCase" == to_camel_case("camel_case") + assert "camelCase" == to_camel_case("camelCase") + assert "a" == to_camel_case("a") def test_to_snake_case(self): - assert "snake_case" == utils._auxiliary.to_snake_case("snakeCase") - assert "snake_case" == utils._auxiliary.to_snake_case("snake_case") - assert "a" == utils._auxiliary.to_snake_case("a") + assert "snake_case" == to_snake_case("snakeCase") + assert "snake_case" == to_snake_case("snake_case") + assert "a" == to_snake_case("a") class TestLocalImport: @@ -24,35 +36,35 @@ class TestLocalImport: def test_local_import_single_ok(self): import pandas - assert pandas == utils._auxiliary.local_import("pandas") + assert pandas == local_import("pandas") @pytest.mark.dsl def test_local_import_multiple_ok(self): import numpy import pandas - assert (pandas, numpy) == utils._auxiliary.local_import("pandas", "numpy") + assert (pandas, numpy) == local_import("pandas", "numpy") def test_local_import_single_fail(self): with 
diff --git a/tests/tests_unit/test_docstring_examples.py b/tests/tests_unit/test_docstring_examples.py
index 18eaa052b5..d295f44a2a 100644
--- a/tests/tests_unit/test_docstring_examples.py
+++ b/tests/tests_unit/test_docstring_examples.py
@@ -1,5 +1,6 @@
 import doctest
 from unittest import TextTestRunner
+from unittest.mock import patch
 
 import pytest
@@ -18,6 +19,7 @@
     three_d,
     time_series,
 )
+from cognite.client.testing import CogniteClientMock
 
 # this fixes the issue with 'got MagicMock but expected Nothing in docstrings'
 doctest.OutputChecker.__check_output = doctest.OutputChecker.check_output
@@ -32,7 +34,7 @@ def run_docstring_tests(module):
     assert 0 == len(s.failures)
 
 
-@pytest.mark.usefixtures("mock_cognite_client")
+@patch("cognite.client.CogniteClient", CogniteClientMock)
 class TestDocstringExamples:
     def test_time_series(self):
         run_docstring_tests(time_series)
@@ -72,8 +74,5 @@ def test_sequences(self):
     def test_relationships(self):
         run_docstring_tests(relationships)
 
-
-@pytest.mark.usefixtures("mock_cognite_beta_client")
-class TestDocstringExamplesBeta:
     def test_entity_matching(self):
         run_docstring_tests(entity_matching)
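The switch from the `mock_cognite_client` fixture to patching with `CogniteClientMock` can be used outside these tests as well; a small sketch of the same pattern (module paths as in the diff):

```python
# Sketch: run code that instantiates CogniteClient without credentials or network access.
from unittest.mock import patch

import cognite.client
from cognite.client.testing import CogniteClientMock

with patch("cognite.client.CogniteClient", CogniteClientMock):
    client = cognite.client.CogniteClient()  # MagicMock-backed client
    client.time_series.retrieve(id=1)        # call is recorded on the mock; no request is sent
```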
types", [("1", "var", [int, float]), ((1,), "var2", [dict, list])]) def test_assert_type_fail(self, var, var_name, types): with pytest.raises(TypeError, match=str(types)): - utils._auxiliary.assert_type(var, var_name, types) + assert_type(var, var_name, types) + + +class TestFindDuplicates: + @pytest.mark.parametrize("inp", ("abc", (1, 2, 3), [1.0, 1.1, 2], range(3), {1: 2, 2: 3}, set([1, 1, 1]))) + def test_no_duplicates(self, inp): + assert set() == find_duplicates(inp) + + @pytest.mark.parametrize( + "inp, exp_duplicate", + ( + ("abca", {"a"}), + ("x" * 20, {"x"}), + ((1, 2, 2.0, 3), {2}), + ([-0, 0.0, 1.0, 1.1], {0}), + ([math.nan, math.nan], {math.nan}), # Hmmm + ([None, int, print, lambda s: s, print], {print}), # Hmmm again + ([frozenset((1,)), frozenset((1,)), frozenset((1, 3))], {frozenset((1,))}), + ), + ) + def test_has_duplicates(self, inp, exp_duplicate): + assert exp_duplicate == find_duplicates(inp) + + @pytest.mark.parametrize( + "inp", + ( + ([1], [1], [1, 2]), + [set((1,)), set((1,)), set((1, 3))], + [{1: 2}, {1: 2}, {1: 2, 2: 3}], + ), + ) + def test_raises_not_hashable(self, inp): + with pytest.raises(TypeError, match="unhashable type:"): + find_duplicates(inp) + + +class TestSplitIntoNParts: + @pytest.mark.parametrize( + "inp, n, exp_out", + ( + ("abcd", 2, ("ac", "bd")), + ("abcd", 3, ("ad", "b", "c")), + (list("abcd"), 1, (list("abcd"),)), + (list("abcd"), 2, (["a", "c"], ["b", "d"])), + ((1, None, "a"), 2, ((1, "a"), (None,))), + (range(10), 3, (range(0, 10, 3), range(1, 10, 3), range(2, 10, 3))), + ), + ) + def test_normal_split(self, inp, n, exp_out): + exp_type = type(inp) + res = split_into_n_parts(inp, n) + for r, res_exp in zip_longest(res, exp_out, fillvalue=math.nan): + assert type(r) is exp_type + assert r == res_exp + + @pytest.mark.parametrize( + "inp, n, exp_out", + ( + ("abc", 4, ("a", "b", "c", "")), + (list("abc"), 4, (["a"], ["b"], ["c"], [])), + ((1, None), 5, ((1,), (None,), (), (), ())), + (range(1), 3, (range(0, 1, 3), range(1, 1, 3), range(2, 1, 3))), + ), + ) + def test_split_into_too_many_pieces(self, inp, n, exp_out): + exp_type = type(inp) + res = split_into_n_parts(inp, n) + for r, res_exp in zip_longest(res, exp_out, fillvalue=math.nan): + assert type(r) is exp_type + assert r == res_exp + + @pytest.mark.parametrize("inp", (set(range(5)), None)) + def test_raises_not_subscriptable(self, inp): + res = split_into_n_parts(inp, n=2) + with pytest.raises(TypeError, match="object is not subscriptable"): + next(res) diff --git a/tests/tests_unit/test_utils/test_time.py b/tests/tests_unit/test_utils/test_time.py index 77aff28616..6a6997c6a9 100644 --- a/tests/tests_unit/test_utils/test_time.py +++ b/tests/tests_unit/test_utils/test_time.py @@ -1,66 +1,85 @@ -from datetime import datetime -from time import sleep +import platform +import time +from datetime import datetime, timezone from unittest import mock import pytest -from cognite.client import utils -from cognite.client.utils._time import MIN_TIMESTAMP_MS +from cognite.client.utils._time import ( + MAX_TIMESTAMP_MS, + MIN_TIMESTAMP_MS, + align_start_and_end_for_granularity, + convert_time_attributes_to_datetime, + datetime_to_ms, + granularity_to_ms, + granularity_unit_to_ms, + ms_to_datetime, + split_time_range, + timestamp_to_ms, +) +from tests.utils import tmp_set_envvar class TestDatetimeToMs: - def test_datetime_to_ms(self): - from datetime import datetime - - assert utils._time.datetime_to_ms(datetime(2018, 1, 31)) == 1517356800000 - assert utils._time.datetime_to_ms(datetime(2018, 1, 
diff --git a/tests/tests_unit/test_utils/test_time.py b/tests/tests_unit/test_utils/test_time.py
index 77aff28616..6a6997c6a9 100644
--- a/tests/tests_unit/test_utils/test_time.py
+++ b/tests/tests_unit/test_utils/test_time.py
@@ -1,66 +1,85 @@
-from datetime import datetime
-from time import sleep
+import platform
+import time
+from datetime import datetime, timezone
 from unittest import mock
 
 import pytest
 
-from cognite.client import utils
-from cognite.client.utils._time import MIN_TIMESTAMP_MS
+from cognite.client.utils._time import (
+    MAX_TIMESTAMP_MS,
+    MIN_TIMESTAMP_MS,
+    align_start_and_end_for_granularity,
+    convert_time_attributes_to_datetime,
+    datetime_to_ms,
+    granularity_to_ms,
+    granularity_unit_to_ms,
+    ms_to_datetime,
+    split_time_range,
+    timestamp_to_ms,
+)
+from tests.utils import tmp_set_envvar
 
 
 class TestDatetimeToMs:
-    def test_datetime_to_ms(self):
-        from datetime import datetime
-
-        assert utils._time.datetime_to_ms(datetime(2018, 1, 31)) == 1517356800000
-        assert utils._time.datetime_to_ms(datetime(2018, 1, 31, 11, 11, 11)) == 1517397071000
-        assert utils._time.datetime_to_ms(datetime(100, 1, 31)) == -59008867200000
-        with pytest.raises(AttributeError):
-            utils._time.datetime_to_ms(None)
+    @pytest.mark.skipif(platform.system() == "Windows", reason="Overriding timezone is too much hassle on Windows")
+    @pytest.mark.parametrize(
+        "local_tz, expected_ms",
+        [
+            ("Europe/Oslo", 1517353200000),
+            ("UTC", 1517356800000),
+            ("America/Los_Angeles", 1517385600000),
+        ],
+    )
+    def test_naive_datetime_to_ms(self, local_tz, expected_ms):
+        with tmp_set_envvar("TZ", local_tz):
+            time.tzset()
+            assert datetime_to_ms(datetime(2018, 1, 31, tzinfo=None)) == expected_ms
+            assert timestamp_to_ms(datetime(2018, 1, 31, tzinfo=None)) == expected_ms
 
     def test_aware_datetime_to_ms(self):
-        from datetime import datetime, timezone
-
         # TODO: Starting from PY39 we should also add tests using:
         # from zoneinfo import ZoneInfo
         # datetime(2020, 10, 31, 12, tzinfo=ZoneInfo("America/Los_Angeles"))
         utc = timezone.utc
-        assert utils._time.datetime_to_ms(datetime(2018, 1, 31, tzinfo=utc)) == 1517356800000
-        assert utils._time.datetime_to_ms(datetime(2018, 1, 31, 11, 11, 11, tzinfo=utc)) == 1517397071000
-        assert utils._time.datetime_to_ms(datetime(100, 1, 31, tzinfo=utc)) == -59008867200000
+        assert datetime_to_ms(datetime(2018, 1, 31, tzinfo=utc)) == 1517356800000
+        assert datetime_to_ms(datetime(2018, 1, 31, 11, 11, 11, tzinfo=utc)) == 1517397071000
+        assert datetime_to_ms(datetime(100, 1, 31, tzinfo=utc)) == -59008867200000
 
     def test_ms_to_datetime(self):
-        from datetime import datetime
-
-        assert utils._time.ms_to_datetime(1517356800000) == datetime(2018, 1, 31)
-        assert utils._time.ms_to_datetime(1517397071000) == datetime(2018, 1, 31, 11, 11, 11)
+        utc = timezone.utc
+        assert ms_to_datetime(1517356800000) == datetime(2018, 1, 31, tzinfo=utc)
+        assert ms_to_datetime(1517397071000) == datetime(2018, 1, 31, 11, 11, 11, tzinfo=utc)
         with pytest.raises(ValueError, match="greater than"):
-            utils._time.ms_to_datetime(-59008867200000)
+            ms_to_datetime(-59008867200000)
         with pytest.raises(TypeError):
-            utils._time.ms_to_datetime(None)
+            ms_to_datetime(None)
 
 
 class TestTimestampToMs:
     @pytest.mark.parametrize("t", [None, [], {}])
     def test_invalid_type(self, t):
         with pytest.raises(TypeError, match="must be"):
-            utils._time.timestamp_to_ms(t)
+            timestamp_to_ms(t)
 
     def test_ms(self):
-        assert 1514760000000 == utils._time.timestamp_to_ms(1514760000000)
-        assert 1514764800000 == utils._time.timestamp_to_ms(1514764800000)
-        assert -1514764800000 == utils._time.timestamp_to_ms(-1514764800000)
+        assert 1514760000000 == timestamp_to_ms(1514760000000)
+        assert 1514764800000 == timestamp_to_ms(1514764800000)
+        assert -1514764800000 == timestamp_to_ms(-1514764800000)
 
+    @pytest.mark.skipif(platform.system() == "Windows", reason="Overriding timezone is too much hassle on Windows")
     def test_datetime(self):
-        assert 1514764800000 == utils._time.timestamp_to_ms(datetime(2018, 1, 1))
-        assert 1546300800000 == utils._time.timestamp_to_ms(datetime(2019, 1, 1))
-        assert MIN_TIMESTAMP_MS == utils._time.timestamp_to_ms(datetime(1900, 1, 1))
+        # Note: See also `TestDatetimeToMs.test_naive_datetime_to_ms`
+        with tmp_set_envvar("TZ", "UTC"):
+            time.tzset()
+            assert 1514764800000 == timestamp_to_ms(datetime(2018, 1, 1))
+            assert 1546300800000 == timestamp_to_ms(datetime(2019, 1, 1))
+            assert MIN_TIMESTAMP_MS == timestamp_to_ms(datetime(1900, 1, 1))
+            assert MAX_TIMESTAMP_MS == timestamp_to_ms(datetime(2051, 1, 1)) - 1
 
     def test_float(self):
-        assert 1514760000000 == utils._time.timestamp_to_ms(1514760000000.0)
-        assert 1514764800000 == utils._time.timestamp_to_ms(1514764800000.0)
-        assert -1514764800000 == utils._time.timestamp_to_ms(-1514764800000.0)
+        assert 1514760000000 == timestamp_to_ms(1514760000000.0)
+        assert 1514764800000 == timestamp_to_ms(1514764800000.0)
+        assert -1514764800000 == timestamp_to_ms(-1514764800000.0)
 
     @mock.patch("cognite.client.utils._time.time.time")
     @pytest.mark.parametrize(
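A short, self-contained illustration of the naive-vs-aware behaviour these parametrized cases cover (values taken directly from the tests; only the local-offset remark is an assumption):

```python
# Aware datetimes convert exactly; naive datetimes are now interpreted as local time,
# so the difference between the two equals the local UTC offset in milliseconds.
from datetime import datetime, timezone

from cognite.client.utils._time import datetime_to_ms

aware_ms = datetime_to_ms(datetime(2018, 1, 31, tzinfo=timezone.utc))
assert aware_ms == 1517356800000  # independent of the local timezone

naive_ms = datetime_to_ms(datetime(2018, 1, 31))  # e.g. 1517353200000 when TZ=Europe/Oslo
local_utc_offset_ms = aware_ms - naive_ms
```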
@@ -82,27 +101,27 @@ def test_float(self):
     def test_time_ago(self, time_mock, time_ago_string, expected_timestamp):
         time_mock.return_value = 10**9
 
-        assert utils._time.timestamp_to_ms(time_ago_string) == expected_timestamp
+        assert timestamp_to_ms(time_ago_string) == expected_timestamp
 
     @pytest.mark.parametrize("time_ago_string", ["1s", "4h", "13m-ag", "13m ago", "bla"])
     def test_invalid(self, time_ago_string):
         with pytest.raises(ValueError, match=time_ago_string):
-            utils._time.timestamp_to_ms(time_ago_string)
+            timestamp_to_ms(time_ago_string)
 
     def test_time_ago_real_time(self):
         expected_time_now = datetime.now().timestamp() * 1000
-        time_now = utils._time.timestamp_to_ms("now")
+        time_now = timestamp_to_ms("now")
         assert abs(expected_time_now - time_now) < 10
 
-        sleep(0.2)
+        time.sleep(0.2)
 
-        time_now = utils._time.timestamp_to_ms("now")
+        time_now = timestamp_to_ms("now")
         assert abs(expected_time_now - time_now) > 190
 
     @pytest.mark.parametrize("t", [MIN_TIMESTAMP_MS - 1, datetime(1899, 12, 31), "100000000w-ago"])
     def test_negative(self, t):
         with pytest.raises(ValueError, match="must represent a time after 1.1.1900"):
-            utils._time.timestamp_to_ms(t)
+            timestamp_to_ms(t)
 
 
 class TestGranularityToMs:
@@ -120,12 +139,12 @@ class TestGranularityToMs:
         ],
     )
     def test_to_ms(self, granularity, expected_ms):
-        assert utils._time.granularity_to_ms(granularity) == expected_ms
+        assert granularity_to_ms(granularity) == expected_ms
 
     @pytest.mark.parametrize("granularity", ["2w", "-3h", "13m-ago", "13", "bla"])
     def test_to_ms_invalid(self, granularity):
         with pytest.raises(ValueError, match=granularity):
-            utils._time.granularity_to_ms(granularity)
+            granularity_to_ms(granularity)
 
 
 class TestGranularityUnitToMs:
@@ -143,12 +162,12 @@ class TestGranularityUnitToMs:
         ],
     )
     def test_to_ms(self, granularity, expected_ms):
-        assert utils._time.granularity_unit_to_ms(granularity) == expected_ms
+        assert granularity_unit_to_ms(granularity) == expected_ms
 
     @pytest.mark.parametrize("granularity", ["2w", "-3h", "13m-ago", "13", "bla"])
    def test_to_ms_invalid(self, granularity):
         with pytest.raises(ValueError, match="format"):
-            utils._time.granularity_unit_to_ms(granularity)
+            granularity_unit_to_ms(granularity)
 
 
 class TestObjectTimeConversion:
@@ -171,4 +190,65 @@ class TestObjectTimeConversion:
         ],
     )
     def test_convert_time_attributes_to_datetime(self, item, expected_output):
-        assert expected_output == utils._time.convert_time_attributes_to_datetime(item)
+        assert expected_output == convert_time_attributes_to_datetime(item)
+
+
+class TestSplitTimeDomain:
+    def test_split_time_range__zero_splits(self):
+        with pytest.raises(ValueError, match="Cannot split"):
+            split_time_range(-100, 100, 0, 1)
+
+    def test_split_time_range__too_large_delta_ms(self):
+        with pytest.raises(ValueError, match="is larger than the interval itself"):
+            split_time_range(-100, 100, 1, 201)
+
+    @pytest.mark.parametrize(
+        "n_splits, expected",
+        [
+            (1, [-100, 100]),
+            (2, [-100, 0, 100]),
+            (3, [-100, -33, 34, 100]),
+            (4, [-100, -50, 0, 50, 100]),
+        ],
+    )
+    def test_split_time_range__raw_granularity(self, n_splits, expected):
+        assert expected == split_time_range(-100, 100, n_splits, 1)
+
+    @pytest.mark.parametrize(
+        "n_splits, granularity, expected",
+        [
+            (24, "5s", 39600000),
+            (12, "4m", 79200000),
+            (11, "3h", 86400000),
+            (1, "1d", 950400000),
+        ],
+    )
+    def test_split_time_range__agg_granularity(self, n_splits, granularity, expected):
+        one_day_ms, gran_ms = 86_400_000, granularity_to_ms(granularity)
+        res = split_time_range(-2 * one_day_ms, 9 * one_day_ms, n_splits, gran_ms)
+        assert n_splits == len(res) - 1
+        (single_diff,) = set(next - prev for next, prev in zip(res[1:], res[:-1]))
+        assert expected == single_diff
+        assert all(val % gran_ms == 0 for val in res)
+
+
+class TestAlignToGranularity:
+    @pytest.mark.parametrize("granularity", ["2s", "3m", "4h", "5d"])
+    def test_exactly_on_granularity_boundary(self, granularity):
+        gran_ms = granularity_to_ms(granularity)
+        start, end = gran_ms, 2 * gran_ms
+        assert (start, end) == align_start_and_end_for_granularity(start, end, granularity)
+
+    @pytest.mark.parametrize(
+        "granularity, expected",
+        [
+            ("2s", (1000, 5000)),
+            ("3m", (120000, 480000)),
+            ("4h", (10800000, 39600000)),
+            ("5d", (345600000, 1209600000)),
+        ],
+    )
+    def test_start_not_on_granularity_boundary(self, granularity, expected):
+        gran_ms = granularity_to_ms(granularity)
+        start, end = gran_ms - 1, 2 * gran_ms
+        assert expected == align_start_and_end_for_granularity(start, end, granularity)
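To make the expected tuples above less magic, here is one worked case. This is a sketch of the relationship the tests assert, not a description of the implementation itself:

```python
# Worked example for the "3m" row: start is floored to the granularity *unit* (1m),
# and end is pushed out so the window spans a whole number of 3m buckets.
from cognite.client.utils._time import align_start_and_end_for_granularity, granularity_to_ms

gran_ms = granularity_to_ms("3m")      # 180_000 ms
start, end = gran_ms - 1, 2 * gran_ms  # (179_999, 360_000)
assert align_start_and_end_for_granularity(start, end, "3m") == (120_000, 480_000)
```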
diff --git a/tests/utils.py b/tests/utils.py
index 02e749772f..0c0783ba90 100644
--- a/tests/utils.py
+++ b/tests/utils.py
@@ -2,8 +2,70 @@
 import functools
 import gzip
 import json
+import math
+import os
+import random
 from contextlib import contextmanager
 
+from cognite.client._api.datapoint_constants import ALL_SORTED_DP_AGGS
+from cognite.client.utils._auxiliary import random_string
+
+
+def random_cognite_ids(n):
+    # Returns list of random, valid Cognite internal IDs:
+    return random.choices(range(1, 9007199254740992), k=n)
+
+
+def random_cognite_external_ids(n, str_len=50):
+    # Returns list of random, valid Cognite external IDs:
+    return [random_string(str_len) for _ in range(n)]
+
+
+def random_granularity(granularities="smhd", lower_lim=1, upper_lim=100000):
+    gran = random.choice(granularities)
+    upper = {"s": 120, "m": 120, "h": 100000, "d": 100000}
+    unit = random.choice(range(max(lower_lim, 1), min(upper_lim, upper[gran]) + 1))
+    return f"{unit}{gran}"
+
+
+def random_aggregates(n=None, exclude=None):
+    """Return n random aggregates in a list - or random (at least 1) if n is None.
+    Accepts a container object of aggregates to `exclude`
+    """
+    agg_lst = ALL_SORTED_DP_AGGS
+    if exclude:
+        agg_lst = [a for a in agg_lst if a not in exclude]
+    n = n or random.randint(1, len(agg_lst))
+    return random.sample(agg_lst, k=n)
+
+
+def random_gamma_dist_integer(inclusive_max, max_tries=100):
+    # "Smaller integers are more likely"
+    for _ in range(max_tries):
+        i = 1 + math.floor(random.gammavariate(1, inclusive_max * 0.3))
+        if i <= inclusive_max:  # rejection sampling
+            return i
+    raise RuntimeError(f"Max tries exceeded while generating a random integer in range [1, {inclusive_max}]")
+
+
+@contextmanager
+def set_max_workers(cognite_client, new):
+    old = cognite_client._config.max_workers
+    cognite_client._config.max_workers = new
+    yield
+    cognite_client._config.max_workers = old
+
+
+@contextmanager
+def tmp_set_envvar(envvar: str, value: str):
+    old = os.getenv(envvar)
+    os.environ[envvar] = value
+    yield
+    if old is None:
+        del os.environ[envvar]
+    else:
+        os.environ[envvar] = old
+
 
 def jsgz_load(s):
     return json.loads(gzip.decompress(s).decode())
@@ -36,7 +98,6 @@ def set_request_limit(client, limit):
         "_UPDATE_LIMIT",
         "_DELETE_LIMIT",
         "_DPS_LIMIT",
-        "_DPS_LIMIT_AGG",
         "_POST_DPS_OBJECTS_LIMIT",
         "_RETRIEVE_LATEST_LIMIT",
     ]
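Finally, a usage sketch for the new `tmp_set_envvar` helper, mirroring how the TZ-dependent tests above use it (`time.tzset()` is POSIX-only, hence the Windows skips):

```python
# Sketch: temporarily override an environment variable; the previous value
# (or its absence) is restored when the block exits.
import os
import time

from tests.utils import tmp_set_envvar  # helper added in this diff

with tmp_set_envvar("TZ", "UTC"):
    time.tzset()  # POSIX-only: make the running process pick up the new TZ
    assert os.environ["TZ"] == "UTC"
time.tzset()  # back to whatever TZ was set (or unset) before the block
```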