Commit

Check for the whole protocol including the delimiter (<scheme>://) when j… (#2715)

* Check for the whole protocol, including the delimiter (<scheme>://), when joining the path for partitions in a partitioned data set

Signed-off-by: Julius Hetzel <[email protected]>

* Add bugfix description to release notes

Signed-off-by: Julius Hetzel <[email protected]>

---------

Signed-off-by: Julius Hetzel <[email protected]>
juliushetzel authored Jun 23, 2023
1 parent 160fd6b commit fd8162d
Showing 3 changed files with 6 additions and 4 deletions.
1 change: 1 addition & 0 deletions RELEASE.md
@@ -3,6 +3,7 @@
 ## Major features and improvements

 ## Bug fixes and other changes
+* Compare for protocol and delimiter in `PartitionedDataSet` to be able to pass the protocol to partitions which paths starts with the same characters as the protocol (e.g. `s3://s3-my-bucket`).

 ## Breaking changes to the API

7 changes: 4 additions & 3 deletions kedro/io/partitioned_dataset.py
@@ -263,10 +263,11 @@ def _list_partitions(self) -> list[str]:
         ]

     def _join_protocol(self, path: str) -> str:
-        if self._path.startswith(self._protocol) and not path.startswith(
-            self._protocol
+        protocol_prefix = f"{self._protocol}://"
+        if self._path.startswith(protocol_prefix) and not path.startswith(
+            protocol_prefix
         ):
-            return f"{self._protocol}://{path}"
+            return f"{protocol_prefix}{path}"
         return path

     def _partition_to_path(self, path: str):
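
To illustrate why the prefix comparison matters, here is a minimal standalone sketch of the two checks. The free functions `join_protocol_old` and `join_protocol_new` and the example paths are hypothetical and written only for this illustration; they mirror the logic of `_join_protocol` before and after this commit but are not part of Kedro. The sketch assumes the partition path comes back from a listing without its scheme.

```python
def join_protocol_old(base_path: str, protocol: str, path: str) -> str:
    """Pre-fix logic: compares against the bare protocol name only."""
    if base_path.startswith(protocol) and not path.startswith(protocol):
        return f"{protocol}://{path}"
    return path


def join_protocol_new(base_path: str, protocol: str, path: str) -> str:
    """Post-fix logic: compares against the protocol plus the '://' delimiter."""
    protocol_prefix = f"{protocol}://"
    if base_path.startswith(protocol_prefix) and not path.startswith(
        protocol_prefix
    ):
        return f"{protocol_prefix}{path}"
    return path


# Hypothetical inputs: the bucket name begins with the same characters
# as the protocol, and the partition path carries no scheme.
base = "s3://s3-my-bucket/data"
partition = "s3-my-bucket/data/part-0.csv"

old_result = join_protocol_old(base, "s3", partition)
new_result = join_protocol_new(base, "s3", partition)

print(old_result)  # s3-my-bucket/data/part-0.csv (protocol not prepended)
print(new_result)  # s3://s3-my-bucket/data/part-0.csv
```

With the old check, the bucket name `s3-my-bucket` is mistaken for a path that already carries the `s3` protocol, so the scheme is never prepended. Comparing against `s3://` removes that ambiguity.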
2 changes: 1 addition & 1 deletion tests/io/test_partitioned_dataset.py
@@ -401,7 +401,7 @@ def test_dataset_creds(self, pds_config, expected_ds_creds, global_creds):
         assert pds._credentials == global_creds


-BUCKET_NAME = "fake_bucket_name"
+BUCKET_NAME = "s3_fake_bucket_name"
 S3_DATASET_DEFINITION = [
     "pandas.CSVDataSet",
     "kedro.extras.datasets.pandas.CSVDataSet",