[SPARK-48310][PYTHON][CONNECT] Cached properties must return copies
### What changes were proposed in this pull request?

When a consumer mutates the value returned by a cached property, it also mutates the value held in the cache.

Before:

```python
df_columns = df.columns
for col in ['id', 'name']:
    df_columns.remove(col)
assert len(df_columns) == len(df.columns)
```

This is wrong. This patch fixes it so that cached properties return copies:

```python
df_columns = df.columns
for col in ['id', 'name']:
    df_columns.remove(col)
assert len(df_columns) != len(df.columns)
```

### Why are the changes needed?

Correctness of the API.

### Does this PR introduce _any_ user-facing change?

No, this makes the code consistent with Spark classic.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46621 from grundprinzip/grundprinzip/SPARK-48310.

Authored-by: Martin Grund <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
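The behavior described above can be illustrated without a Spark session. The following is a minimal sketch only: `FakeDataFrame` and `_cached_columns` are hypothetical names invented for illustration and are not the Spark Connect implementation. It assumes the cached value is a plain list and that a shallow copy is enough to protect it.

```python
import copy
from functools import cached_property


class FakeDataFrame:
    """Hypothetical stand-in for a DataFrame; not the Spark Connect code."""

    def __init__(self, column_names):
        self._column_names = list(column_names)

    @cached_property
    def _cached_columns(self):
        # Computed on first access, then stored on the instance.
        return list(self._column_names)

    @property
    def columns(self):
        # Return a copy so callers cannot mutate the cached list.
        return copy.copy(self._cached_columns)


df = FakeDataFrame(["id", "name", "age"])
df_columns = df.columns
for col in ["id", "name"]:
    df_columns.remove(col)

# The caller's list shrank, but the cached value is untouched.
assert len(df_columns) != len(df.columns)
```

Returning the cached list directly from `columns` would make the final assertion fail, because `df_columns` and the cache would be the same object.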