"sdk_moniker" key error #424

Closed
benrutter opened this issue Sep 13, 2023 · 9 comments

Comments

@benrutter
Contributor

This error originates in azure-sdk-for-python, as of yesterday's azure-storage-blob release (12.18.0).

It happens at initialisation of AzureBlobFileSystem() if the azure-storage-blob version is 12.18.0. Here's the traceback:

File /databricks/python/lib/python3.10/site-packages/adlfs/spec.py:222, in AzureBlobFileSystem()
    211 protocol = "abfs"
    213 def __init__(
    214     self,
    215     account_name: str = None,
    216     account_key: str = None,
    217     connection_string: str = None,
    218     credential: str = None,
    219     sas_token: str = None,
    220     request_session=None,
    221     socket_timeout=_SOCKET_TIMEOUT_DEFAULT,
--> 222     blocksize: int = create_configuration(storage_sdk="blob").max_block_size,
    223     client_id: str = None,
    224     client_secret: str = None,
    225     tenant_id: str = None,
    226     anon: bool = True,
    227     location_mode: str = "primary",
    228     loop=None,
    229     asynchronous: bool = False,
    230     default_fill_cache: bool = True,
    231     default_cache_type: str = "bytes",
    232     version_aware: bool = False,
    233     assume_container_exists: Optional[bool] = None,
    234     max_concurrency: Optional[int] = None,
    235     **kwargs,
    236 ):
    237     super_kwargs = {
    238         k: kwargs.pop(k)
    239         for k in ["use_listings_cache", "listings_expiry_time", "max_paths"]
    240         if k in kwargs
    241     }  # pass on to fsspec superclass

File /databricks/python/lib/python3.10/site-packages/azure/storage/blob/_shared/base_client.py:415, in create_configuration(**kwargs)
    414 config.headers_policy = StorageHeadersPolicy(**kwargs)
--> 415 config.user_agent_policy = UserAgentPolicy(sdk_moniker=kwargs.pop('sdk_moniker'), **kwargs)
    416 config.retry_policy = kwargs.get("retry_policy") or ExponentialRetry(**kwargs)

KeyError: 'sdk_moniker'

Seems like the issue is this line:

config.user_agent_policy = UserAgentPolicy(sdk_moniker=kwargs.pop('sdk_moniker'), **kwargs)

which effectively makes "sdk_moniker" a required keyword argument.
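
To illustrate why that line makes the key mandatory, here is a minimal sketch using plain functions. The names `required_moniker` and `optional_moniker`, and the default value, are hypothetical stand-ins and not part of azure-storage-blob:

```python
def required_moniker(**kwargs):
    # dict.pop without a default raises KeyError when the key is absent,
    # so callers are forced to pass sdk_moniker.
    return kwargs.pop("sdk_moniker")


def optional_moniker(**kwargs):
    # Supplying a default (a made-up value here) keeps the argument optional.
    return kwargs.pop("sdk_moniker", "storage-blob")


try:
    required_moniker(storage_sdk="blob")
    raised = False
except KeyError:
    raised = True

fallback = optional_moniker(storage_sdk="blob")
```

The second form is only one possible way to keep the argument optional; it is not necessarily how the library handles it.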

It's a new issue, so I don't know whether azure-storage-blob plans to fix it on its side. Would it make sense to pass an "sdk_moniker" defensively here? I think the fix would essentially just be something like this:

blocksize: int = create_configuration(storage_sdk="blob", sdk_moniker="ADLFS").max_block_size
@daavoo
Contributor

daavoo commented Sep 13, 2023

Can confirm. I am getting the same error after upgrading azure-storage-blob

@mateusz91t

Hi, same issue since today. It worked yesterday.
adlfs==2023.8.0
I use this snippet of code to load data with pandas from Azure Blob Storage.
[screenshot: 2023-09-13_10h57_59]

To run the same code today I have to downgrade azure-storage-blob from 12.18.0 to 12.17.0.

adlfs still wants to install the newer version of azure-storage-blob, which is incompatible.
This is the change in azure-storage-blob that breaks the code:
[screenshot: 2023-09-13_11h05_17]

A workaround is to install azure-storage-blob==12.17.0 alongside adlfs==2023.8.0.

@benrutter
Contributor Author

I just raised and closed a PR - looks like this needs to be fixed in azure-storage-blob.

azure-storage-blob pre 12.18.0 looks like this:

config.user_agent_policy = UserAgentPolicy(
    sdk_moniker=f"storage-{kwargs.pop('storage_sdk')}/{VERSION}", **kwargs
)

and it now looks like this:

config.user_agent_policy = UserAgentPolicy(sdk_moniker=kwargs.pop('sdk_moniker'), **kwargs)

Effectively, if you pass in sdk_moniker in a version before 12.18.0, you'll get both a dynamically generated sdk_moniker and the one you've specified, which throws an error.

If you don't pass one in for version 12.18.0, you'll get an error.
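
The first failure mode can be reproduced without the SDK. This is a sketch only: `user_agent_policy` is a hypothetical stand-in for UserAgentPolicy, `VERSION` is a placeholder string, and `old_create_configuration` just mirrors the pre-12.18.0 call shape quoted above:

```python
VERSION = "12.17.0"  # placeholder version string for this sketch


def user_agent_policy(sdk_moniker=None, **kwargs):
    # Hypothetical stand-in for UserAgentPolicy: returns the moniker it was given.
    return sdk_moniker


def old_create_configuration(**kwargs):
    # Same call shape as the pre-12.18.0 line: the moniker is generated here,
    # and whatever is left in kwargs is forwarded as well.
    return user_agent_policy(
        sdk_moniker=f"storage-{kwargs.pop('storage_sdk')}/{VERSION}", **kwargs
    )


generated = old_create_configuration(storage_sdk="blob")

try:
    # sdk_moniker now arrives twice: once generated, once from the caller.
    old_create_configuration(storage_sdk="blob", sdk_moniker="ADLFS")
    error = None
except TypeError as exc:
    error = str(exc)
```

With this stand-in, the second call fails with a TypeError because the same keyword is supplied twice.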

So aside from doing something like figuring out which version of azure-storage-blob is installed and then acting conditionally on that, I don't think it's possible to use create_configuration in a way that's consistent across versions.

If they haven't already spotted this, I'll let the azure-storage-blob project know. I'm assuming it's an accident.
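
For reference, the version-conditional approach mentioned above might look roughly like the sketch below. It is an illustration under assumptions, not a recommended fix: `parse_version`, `installed_blob_version` and `configuration_kwargs` are hypothetical helpers, the "ADLFS" value is taken from the fix proposed at the top of this issue, and only 12.18.0 is treated as needing the extra key because that is the only release reported as affected here.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn a string like '12.18.0' into (12, 18, 0), ignoring suffixes such as 'b1'."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '12.18'
    return tuple(parts)


def installed_blob_version():
    """Return the installed azure-storage-blob version string, or None if absent."""
    try:
        return metadata.version("azure-storage-blob")
    except metadata.PackageNotFoundError:
        return None


def configuration_kwargs(blob_version):
    """Hypothetical helper: keyword arguments to pass to create_configuration."""
    kwargs = {"storage_sdk": "blob"}
    # Assumption: only 12.18.0 requires sdk_moniker; earlier releases reject it.
    if parse_version(blob_version) == (12, 18, 0):
        kwargs["sdk_moniker"] = "ADLFS"
    return kwargs
```

A caller would then do something like `create_configuration(**configuration_kwargs(installed_blob_version()))` after checking that a version was actually found. Since create_configuration is a private API, this kind of branching is fragile at best.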

@benrutter
Contributor Author

benrutter commented Sep 13, 2023

I've put this PR in to azure-storage-blob, which I think should fix the issue.

@nosterlu

Hi, same issue since today. It worked yesterday. adlfs==2023.8.0 I use this snippet of code to load data with pandas from Azure Blob Storage. [screenshot: 2023-09-13_10h57_59]

To run the same code today I have to downgrade azure-storage-blob from 12.18.0 to 12.17.0.

adlfs still wants to install the newer version of azure-storage-blob, which is incompatible. This is the change in azure-storage-blob that breaks the code: [screenshot: 2023-09-13_11h05_17]

A workaround is to install azure-storage-blob==12.17.0 alongside adlfs==2023.8.0.

azure-storage-blob==12.17.0 👍

@jalauzon-msft

The fix for azure-storage-blob is here: Azure/azure-sdk-for-python#32071

This should be released as a patch later today or tomorrow. Thanks.

@TomAugspurger
Contributor

Thanks for following up!

As you mentioned in Azure/azure-sdk-for-python#32056 (comment), adlfs shouldn't be using private APIs from azure.storage.blob (there are a handful of such uses).

I've opened #426 to track that.

@jalauzon-msft

Thanks @TomAugspurger! I commented on that issue with some suggestions :).

As for this issue, azure-storage-blob version 12.18.1 was just released which should contain the fix.

@benrutter
Contributor Author

I've tested and this is all working as before with 12.18.1 - thanks @jalauzon-msft for sorting!

6 participants