Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python SQLAlchemy Inspector Error ( MySQL Dialect ) with greptime - TypeError: NullType() takes no arguments #4537

Closed
atul-r opened this issue Aug 9, 2024 · 16 comments
Labels
C-bug Category Bugs

Comments

@atul-r
Copy link

atul-r commented Aug 9, 2024

What type of bug is this?

Incorrect result

What subsystems are affected?

Standalone mode

Minimal reproduce step

I am encountering an issue when using SQLAlchemy's inspector to retrieve column details from a Greptime database using the MySQL dialect. Below is the code snippet I am using:

from sqlalchemy import create_engine, text, inspect

conn_string = "mysql://root:[email protected]:4002/somedb"

engine = create_engine(conn_string)

inspector = inspect(engine)
columns = inspector.get_columns('app_logs', 'somedb')
for column in columns:
    print(column)

When I run this code, I receive the following error:

TypeError: NullType() takes no arguments

Analysis:
During reflection, SQLAlchemy issues a SHOW CREATE TABLE query to retrieve the table schema. It then parses the returned string using regex to extract the table name and column details.

Greptime returns the following CREATE TABLE statement for the table:

CREATE TABLE IF NOT EXISTS `app_logs` (
  `ts` TIMESTAMP(3) NOT NULL,
  `host` STRING NULL,
  `api_path` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
  `log_level` STRING NULL,
  `log` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
  TIME INDEX (`ts`),
  PRIMARY KEY (`host`, `log_level`)
)
ENGINE=mito
WITH(
  append_mode = 'true'
)

The issue seems to stem from the data types being returned in uppercase (e.g., STRING, TIMESTAMP) also some data types are different for example STRING, there is no data type of STRING in mysql. The MySQL dialect in SQLAlchemy does not recognize these uppercase data types and expects them in lowercase. As a result, it raises a TypeError.

I manually changed the data types to lowercase during debugging, and the reflection worked without any issues. This suggests that the data type casing is the root cause of the problem.

Request:
Is there a way to modify Greptime so that it returns the data types in lowercase, or provide an output consistent with what the MySQL dialect in SQLAlchemy expects? This would enhance compatibility and prevent errors during schema reflection.

What did you expect to see?

{'name': 'ts', 'type': TIMESTAMP(fsp=3), 'default': None, 'comment': None, 'nullable': False}
{'name': 'host', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}
{'name': 'api_path', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}
{'name': 'log_level', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}
{'name': 'log', 'type': VARCHAR(), 'default': None, 'comment': None, 'nullable': True}

What did you see instead?

TypeError: NullType() takes no arguments

What operating system did you use?

greptime/greptimedb:v0.9.1,

What version of GreptimeDB did you use?

0.9.1

Relevant log output and stack trace

c:\Users\user\Documents\project\others\alchemy-greptime\app.py:43: SAWarning: Did not recognize type 'TIMESTAMP' of column 'ts'
  columns = inspector.get_columns('app_logs', 'supersetexample')
Traceback (most recent call last):
  File "c:\Users\user\Documents\project\others\alchemy-greptime\app.py", line 43, in <module>
    columns = inspector.get_columns('app_logs', 'supersetexample')
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\engine\reflection.py", line 859, in get_columns
    col_defs = self.dialect.get_columns(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 2, in get_columns
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\engine\reflection.py", line 97, in cache
    ret = fn(self, con, *args, **kw)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\base.py", line 2966, in get_columns
    parsed_state = self._parsed_state_or_create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\base.py", line 3226, in _parsed_state_or_create
    return self._setup_parser(
           ^^^^^^^^^^^^^^^^^^^
  File "<string>", line 2, in _setup_parser
  File "<string>", line 2, in _setup_parser
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\engine\reflection.py", line 97, in cache     
    ret = fn(self, con, *args, **kw)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\base.py", line 3262, in _setup_parser     _parser
    return parser.parse(sql, charset)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                      rse
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\reflection.py", line 48, in parse                                                                                                                                                                        parse_column
    self._parse_column(line, state)
  File "C:\Users\user\AppData\Local\pypoetry\Cache\virtualenvs\alchemy-greptime-7gNpWF2k-py3.11\Lib\site-packages\sqlalchemy\dialects\mysql\reflection.py", line 284, in _parse_column
    type_instance = col_type(*type_args, **type_kw)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: NullType() takes no arguments
@atul-r atul-r added the C-bug Category Bugs label Aug 9, 2024
@evenyag
Copy link
Contributor

evenyag commented Aug 9, 2024

Maybe we can create a PR to address this in SQLAlchemy. I found that the PostgreSQL dialect lowers the string before getting the schema.
https://github.com/sqlalchemy/sqlalchemy/blob/6cf5e2a188fc5e337d22a098a5fe9a9fe10cc7e7/lib/sqlalchemy/dialects/postgresql/base.py#L3756

I'm also considering if we can provide a way to display the data type in lowercase.

@atul-r
Copy link
Author

atul-r commented Aug 9, 2024

Maybe we can create a PR to address this in SQLAlchemy. I found that the PostgreSQL dialect lowers the string before getting the schema. https://github.com/sqlalchemy/sqlalchemy/blob/6cf5e2a188fc5e337d22a098a5fe9a9fe10cc7e7/lib/sqlalchemy/dialects/postgresql/base.py#L3756

I'm also considering if we can provide a way to display the data type in lowercase.

@evenyag But there are non matching data types as well. For example STRING, there is no data type of STRING in MySQL. possibly VARCHAR. This needs to come from greptime

@v0y4g3r
Copy link
Contributor

v0y4g3r commented Aug 9, 2024

Maybe we can create a PR to address this in SQLAlchemy. I found that the PostgreSQL dialect lowers the string before getting the schema. https://github.com/sqlalchemy/sqlalchemy/blob/6cf5e2a188fc5e337d22a098a5fe9a9fe10cc7e7/lib/sqlalchemy/dialects/postgresql/base.py#L3756
I'm also considering if we can provide a way to display the data type in lowercase.

But there are non matching data types as well. For example STRING, there is no data type of STRING in MySQL. possibly VARCHAR. This needs to come from greptime

Since PostgreSQL does not support SHOW statements, we can format the output to match MySQL

@evenyag
Copy link
Contributor

evenyag commented Aug 9, 2024

Can we add a new method display_mysql_dialect() to the CreateTable and Column?

impl Display for CreateTable {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(f, "CREATE ")?;
if self.engine == FILE_ENGINE {
write!(f, "EXTERNAL ")?;
}
write!(f, "TABLE ")?;
if self.if_not_exists {
write!(f, "IF NOT EXISTS ")?;
}
writeln!(f, "{} (", &self.name)?;
writeln!(f, "{},", format_list_indent!(self.columns))?;

We need to map some data types to MySQL's types in Column.

impl Display for Column {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", self.column_def)?;
if let Some(fulltext_options) = &self.extensions.fulltext_options {
if !fulltext_options.is_empty() {
let options = fulltext_options.kv_pairs();
write!(f, " FULLTEXT WITH({})", format_list_comma!(options))?;
} else {
write!(f, " FULLTEXT")?;
}
}
Ok(())
}
}

Another way is to map the data type while creating the Column struct.

data_type: statements::concrete_data_type_to_sql_data_type(&column_schema.data_type)
.with_context(|_| ConvertSqlTypeSnafu {
datatype: column_schema.data_type.clone(),
})?,

@sunng87
Copy link
Member

sunng87 commented Aug 12, 2024

@atul-r I'm not quite familiar with SQLAlchemy, I wonder if this inspect function a must-have to use SQLAlchemy with GreptimeDB? Is this error blocking you from running SQL query in SQLAlchemy?

@atul-r
Copy link
Author

atul-r commented Aug 12, 2024

@atul-r I'm not quite familiar with SQLAlchemy, I wonder if this inspect function a must-have to use SQLAlchemy with GreptimeDB? Is this error blocking you from running SQL query in SQLAlchemy?

@sunng87 I am using apache superset to create charts from the data in greptime. Superset uses SQLAlchemy to connect to greptime. When connecting to any database, superset uses the inspect call to get the table details. This is where I am getting the issue. So it is a blocker

@sunng87
Copy link
Member

sunng87 commented Aug 12, 2024

@atul-r Thank you for the information. That's a blocking issue for Superset.

@atul-r
Copy link
Author

atul-r commented Aug 12, 2024

@atul-r Thank you for the information. That's a blocking issue for Superset.

The MySQL endpoint of greptime should send responses that is inline with MySQL. This needs to be handled in greptime. If not any MySQL client will face issues with greptime

There is another issue that I have created as well (#4542)

@sunng87
Copy link
Member

sunng87 commented Aug 12, 2024

Yes. That's we need to address in GreptimeDB for support of Superset.

@evenyag
Copy link
Contributor

evenyag commented Aug 12, 2024

For #4542, most date/time functions should have an alternative in datafusion, which is compatible with PostgreSQL.
https://datafusion.apache.org/user-guide/sql/scalar_functions.html#time-and-date-functions

Will it be easier to support PostgreSQL's driver than MySQL? We are working on #3560

What do you think? @sunng87

@sunng87
Copy link
Member

sunng87 commented Aug 12, 2024

@evenyag sounds good. I will give Superset's postgres datasource a try to see the gap.

@sunng87
Copy link
Member

sunng87 commented Aug 13, 2024

I tried postgresql data source. It seems Superset doesn't offer configuration for psycopg's autocommit option so the postgres client will send statements like BEGIN which is not supported by us.

@sunng87
Copy link
Member

sunng87 commented Aug 14, 2024

After a deep dive into postgres superset connector, even with #3560 and #4553 , there are still some functions like pg_catalog.pg_get_serial_sequence, json_build_object and etc. to be supported. We don't see this in near future.

Perhaps we need to build our own plugin or connector for superset, as well as metabase.

@sunng87
Copy link
Member

sunng87 commented Sep 6, 2024

@atul-r I just created a greptimedb connector for sqlalchemy and superset. You can add greptimedb-sqlalchemy to docker/requirements-local.txt (if you are using docker-compose) or use pip install greptimedb-sqlalchemy if you installed superset in your system.

There will be a GreptimeDB option when you are adding database connection.

This is an early release of the greptimedb native connector. Let me know if you find any issue with it.

@atul-r
Copy link
Author

atul-r commented Sep 10, 2024

@atul-r I just created a greptimedb connector for sqlalchemy and superset. You can add greptimedb-sqlalchemy to docker/requirements-local.txt (if you are using docker-compose) or use pip install greptimedb-sqlalchemy if you installed superset in your system.

There will be a GreptimeDB option when you are adding database connection.

This is an early release of the greptimedb native connector. Let me know if you find any issue with it.

@sunng87 I am getting the below error

ERROR: (builtins.NoneType) None
[SQL: Failed to plan SQL: Error during planning: Invalid function 'pg_catalog.version'.
Did you mean 'APPROX_MEDIAN'?
]
(Background on this error at: https://sqlalche.me/e/14/dbapi)

My connection string is greptimedb://root:[email protected]:4003/supersetexample

Created a detailed issue here

@sunng87
Copy link
Member

sunng87 commented Oct 14, 2024

This has been resolved by our superset driver, see sunng87/greptimedb-sqlalchemy#1 and our docs https://docs.greptime.com/nightly/user-guide/integrations/superset

@sunng87 sunng87 closed this as completed Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category Bugs
Projects
None yet
Development

No branches or pull requests

4 participants