Impyla + Dask : How to setup the impala URI ? #556

Open
frbelotto opened this issue Oct 1, 2024 · 0 comments

I currently have a working Impala connection using impyla.

import pandas as pd
import sqlalchemy
from impala.dbapi import connect

# host, port, path, user and pwd are defined elsewhere
def conn():
    return connect(host=host,
                   port=port,
                   database='default',
                   timeout=60,
                   use_ssl=True,
                   auth_mechanism="LDAP",
                   use_http_transport=True,
                   http_path=path,
                   user=user,
                   password=pwd)

# The engine delegates connection creation to conn(), so the URL stays empty
engine = sqlalchemy.create_engine('impala://', creator=conn)
query = '''select * from HIVE_ENI.NVG_USU_CNL_DGTL as t1 '''
pd.read_sql(query, engine, index_col='horario')

It works, despite a deprecation warning:

SADeprecationWarning: The dbapi() classmethod on dialect classes has been renamed to import_dbapi().  Implement an import_dbapi() classmethod directly on class <class 'impala.sqlalchemy.ImpalaDialect'> to remove this warning; the old .dbapi() classmethod may be maintained for backwards compatibility.
  engine = sqlalchemy.create_engine('impala://', creator=conn)

But as my database is getting bigger, I am trying to move from pandas to Dask. The issue is that Dask requires a connection string instead of an engine:

import dask.dataframe as dd
dd.read_sql(query, engine, index_col='horario')
`TypeError: con must be of type str, not <class 'sqlalchemy.engine.base.Engine'>Note: Dask does not support SQLAlchemy connectables here`

These might be basic questions, but:

1) How can I resolve the deprecation warning raised when the engine is created?
2) How do I build the connection URI for my server, given the parameters shown above, so that it can be used with Dask?
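
For the second question, one possible starting point is to assemble the URI as a plain string with the standard library. The sketch below is untested against a real server. `build_impala_uri` is a hypothetical helper, not part of impyla, and the host, credentials and `http_path` values are placeholders. It assumes the impala dialect forwards URL query parameters to `connect()` and accepts string values such as `"True"` and `"60"`, which has not been verified here.

```python
from urllib.parse import quote, urlencode


def build_impala_uri(user, password, host, port, database="default", **options):
    """Assemble an impala:// URI string (hypothetical helper, not part of impyla)."""
    # Percent-encode credentials so characters like '@' or '/' do not break the URI
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    uri = f"impala://{credentials}@{host}:{port}/{database}"
    if options:
        # Assumption: the dialect passes these query parameters on to connect()
        uri += "?" + urlencode(options)
    return uri


# Placeholder values; replace with the real host, port, path, user and password
uri = build_impala_uri(
    user="my_user",
    password="p@ss/word",
    host="impala.example.com",
    port=443,
    auth_mechanism="LDAP",
    use_ssl=True,
    use_http_transport=True,
    http_path="cliservice",
    timeout=60,
)
print(uri)
```

If the dialect does not convert the string values back to booleans and integers, the connection may still fail, in which case the URI approach would need support from impyla itself.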
