MySQL Boolean autogenerate comparson to TINYINT #605

johaven · 2019-10-01T08:52:56Z

Alembic version: 1.2.1
SQL: MariaDB

Each upgrade (with compare_type enabled) duplicate constraints :

paid = db.Column(db.Boolean)
satisfaction = db.Column(db.Boolean)

  CONSTRAINT `CONSTRAINT_1` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_2` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_3` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_4` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_5` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_6` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_7` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_8` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_9` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_10` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_11` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_12` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_13` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_14` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_15` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_16` CHECK (`satisfaction` in (0,1)),
  CONSTRAINT `CONSTRAINT_17` CHECK (`paid` in (0,1)),
  CONSTRAINT `CONSTRAINT_18` CHECK (`satisfaction` in (0,1))

zzzeek · 2019-10-01T13:26:32Z

this is because compare_type does not detect a MySQL TINYINT column as the same as Boolean, so you will need to adjust these migrations manually or use a custom type comparator.

there is then a second issue that when "changing" the type of the column, it does not drop the old constraint.

unfortunately there is no good way to automate these things and I have no means of fixing this issue in the near future.

zzzeek · 2019-10-01T13:28:19Z

this is not specifically "enums" but this is the same area of issue.

johaven · 2019-10-01T13:36:43Z

I do not understand totally because logically when doing the upgrade there is a conversion to the SQL language based on the type of database, no? Starting from this idea, it is possible to compare what has been generated in SQL and the structure of the table (for example via a "show create table"). I am wrong ?

zzzeek · 2019-10-01T13:47:02Z

MySQL has a datatype called "TINYINT". You can use this datatype directly in your schema, but it is not a SQL standard type; only MySQL has it. SQLAlchemy has a datatype called "Boolean" that generates any number of different datatypes on databases, depending on what capabilities they have. On SQL Server, it uses BIT; on Postgresql, BOOLEAN, on MySQL, TINYINT.

The thing is you can use TINYINT / BIT / etc. without them corresponding to SQLAlchemy's Boolean.

Alembic therefore does not, at the moment, make the assumption that if a metadata model has Boolean, and the target database has TINYINT, that this TINYINT is the same datatype.

This is 100% a bug, because even if the user did use TINYINT(display_width=1), and then changed their column to Boolean, that should not produce a change in datatype.

That part of this is not too difficult to fix but needs a comprehensive approach and tests throughout the library to fully release.

However, with all of that, the situation gets much worse. SQLAlchemy's Boolean and Enuim datatypes also generate CHECK constraints, which is a decision I would never have made today if I could have seen these issues coming. There's no autogenerate support for CHECK constraints either, and this would be very difficult to do generically. However, again in the case of Boolean / Enum, we should be taking a bit of a guess that there likely is a CHECK constraint on the table when these datatypes are used. However, that is also quite complicated because these CHECK constraints have no particular name (they are unnamed by default and as you can see MySQL just generates a name), and we'd have to basically search for them and parse their SQL text on a per-database basis in order to identify the ones that we think are relevant to our particular database column.

So we can solve your immediate issue here with the first part of this but there's a longer term problem in locating and finding these CHECK constraints, such as if you changed your database type from Boolean to INTEGER, we'd need to DROP that constraint.

johaven · 2019-10-01T14:38:25Z

I understand the difficulty now, thank you very much for the explanation.

The best solution (for now) would be to disable the creation of constraints when desired.
There is an option that allows this when declaring a field like Boolean (https://docs.sqlalchemy.org/en/13/core/type_basics.html#sqlalchemy.types.Boolean).

I tested it on my fields:

paid = db.Column(db.Boolean, create_constraint=False)
satisfaction = db.Column(db.Boolean, create_constraint=False)

But this does not work currently in alembic, the constraints are still created :(
Independently i have this error when i use the create_constraint option:

/../sqlalchemy/sql/base.py:299: SAWarning: Can't validate argument 'create_constraint'; can't locate any SQLAlchemy dialect named 'create'
  % (k, dialect_name)

zzzeek · 2019-10-01T16:58:07Z

that's an argument of boolean:

Boolean(create_constraint=False)

johaven · 2019-10-01T16:58:18Z

Sorry, i am tired 😅 , it works with correct initialization:

paid = db.Column(db.Boolean(create_constraint=False))
satisfaction = db.Column(db.Boolean(create_constraint=False))

johaven · 2019-10-01T16:59:06Z

that's an argument of boolean:
Boolean(create_constraint=False)

Faster than me :) thanks anyway !

pbecotte · 2019-10-07T21:36:06Z

I have been looking at this a bit. It kind of feels like the issue is with the adapt api in sqlalchemy. While we could put special cases directly in alembic to handle this, there are other places it could be useful. My thought was that we could use the native attributes like supports_native_boolean here. So the Boolean class would return Boolean if you check the affinity/adapt from Postgres, but would report an Integer affinity in Mysql where that variable is False.

I was thinking could move a bit of logic out to the type system- instead of type_api.adapt_type walking the Inheritance chain, have it call a method on the type that will do it instead. Maybe walk up the chain until the first type that has the method defined? Then for Enum, Bool, Sequence etc it could be overridden with the multiple possibilities.

Does it sound like that is a useful path to explore, or am I thinking about this all wrong?

zzzeek · 2019-10-07T22:00:04Z

I think it's a valid idea to explore having Alembic use the type adaptation system to do the compare, but as far as the "move type_api.adapt_type" part I think you need to look at the bigger dialect-level part of this that should already do what you're looking for, which would be the dialect.type_descriptor() method - that gives you the type which the dialect uses, given a generic type.

however, this won't work for the types we care about right now, e.g. boolean and enum, because there is no adaption used with these, the way Boolean becomes TINYINT is the MySQL compiler renders "BOOL" for it, and the way Enum becomes CHAR is that it just renders CHAR.

were you proposing a change on the SQLAlchemy side?

pbecotte · 2019-10-08T00:13:34Z

Yeah, my thought was to have the generic type come out as Integer instead of Boolean when the supports_native_boolean flag was false. Not at all sure how to make that happen- but wanted to check if it sounded reasonable before I dug in any further. If you think it's worth looking at, I'll explore a little more and see what the actual change would look like.

zzzeek · 2019-10-08T02:16:50Z

I think we'd have to rethink the approach on the SQLAlchemy side, since adaptation is only used to a limited extent, not really for DDL. A different architecture might not be a bad thing but I dont know if I have the appetite for that right now as that would be a huge change. The type comparison on the Alembic side is not really the biggest problem, a few hardcoded types is not the end of the world and it can be changed later. the bigger issue is the check constraint thing, and also the extra mechanics for Postgresql ENUMs. and all the tests for all of it.

pbecotte · 2019-10-08T13:21:17Z

Okay- I'll work on the alembic only solution. Is there a way to introspect WHICH of those integer columns (TINYINT, BIT, etc) is used on a particular dialect, or do I just need to stick a switch statement of some sort (or an attribute on the dialect?)

zzzeek · 2019-10-08T14:35:29Z

yeah...nothing too great, it's like visit_BOOLEAN -> BOOL in code like this: https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/mysql/base.py#L2057 there's a class called TypeCompiler that delivers the string representation of the type for each dialect. For Enum->CHAR it's in the base, because there is no enum type: https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/sql/compiler.py#L3398 then for mysql it switches at https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/mysql/base.py#L2018 .

for table reflection, there's this whole "let's render the DDL for a type" but in reverse, it goes back to the more specific type, e.g. in this mapping: https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/mysql/base.py#L1145 . this is what table reflection uses to get back the reflected types, in the case of MySQL BOOL we get TINYINT.

so you can see the disconnect, the python-> DDL and DDL->python mappings are not symmetrical in special cases like this. this particular version of the typing system came along in version 0.6 or so of sqlalchemy. prior to that it might have worked in more of a class-hierarchy / class-adaptation system kind of way, the TypeCompiler thing came in to provide more fluidity to the system and to get rid of lots of extra classes and such.

So there is, in fact, a way you could automate the comparison here. you could run the TypeCompiler and compare the strings:

>>> from sqlalchemy import Enum
>>> from sqlalchemy.dialects import mysql
>>> mysql.dialect().type_compiler.process(Enum('x', 'y', 'z', native_enum=False))
'VARCHAR(1)'
>>> mysql.dialect().type_compiler.process(mysql.VARCHAR(1))
'VARCHAR(1)'

>>> mysql.dialect().type_compiler.process(Enum('x', 'y', 'z'))
"ENUM('x','y','z')"
>>> mysql.dialect().type_compiler.process(mysql.ENUM('x', 'y', 'z'))
"ENUM('x','y','z')"

here it is doing a tricky one for Postgresql:

>>> from sqlalchemy.dialects import postgresql
>>> from sqlalchemy import DateTime
>>> postgresql.dialect().type_compiler.process(DateTime())
'TIMESTAMP WITHOUT TIME ZONE'
>>> postgresql.dialect().type_compiler.process(postgresql.TIMESTAMP())
'TIMESTAMP WITHOUT TIME ZONE'

still, the mysql impl needs to know that BOOL and TINYINT are synonymous:

>>> from sqlalchemy import Boolean
>>> mysql.dialect().type_compiler.process(Boolean())
'BOOL'
>>> mysql.dialect().type_compiler.process(mysql.TINYINT())
'TINYINT'

I would probably have a short translation dicttionary in the alembic MySQL impl class for things like "BOOL" -> "TINYINT".

this would also be a way that the type comparison could pick up on really small things like changes in length and things like that.

i haven't had breakfast yet and wondering if I'm overlooking something really obviously not-working about this, because it seems like it might solve most of the problem

pbecotte · 2019-10-08T22:51:17Z

Was looking at the compare_columns function in DefaultImpl. Something like this-

  def compare_type(self, inspector_column, metadata_column):
        # snip
        if conn_type._compare_type_affinity(metadata_impl):
            comparator = _type_comparators.get(conn_type._type_affinity, None)
            return comparator and comparator(metadata_impl, conn_type)

        return self.compare_non_native_types(inspector_column, metadata_column)

    def compare_non_native_types(self, inspector_column, metadata_column):
        comparator = self.non_native_comparator.get(metadata_column.type._type_affinity)
        if comparator:
            return comparator(inspector_column)
        return True

And then in MySqlImpl-

class MySQLImpl(DefaultImpl):
    __dialect__ = "mysql"

    transactional_ddl = False
    non_native_comparator = {
        sqltypes.Boolean: lambda col: not (isinstance(col.type, TINYINT) and col.type.display_width == 1),
    }

That code gets the simple tests I added passing, but... not familiar enough with the code base to know if that's a good or bad way of sticking that logic in there haha. Will also have to go through and look at the various database implementations to flesh out that mapping, but again wanted to run that through.

I also noticed that the current implementation does not show changes between TINYINT() and TINYINT(display_width=1) - is this intentional, or should this be something that we are hoping to update as well? Last question- I was thinking adding one more step there- check if ANY check constraint(1, 0) exists on that column- if it doesn't then return True so that it gets added. Would involve a more introspection queries though.

zzzeek · 2019-10-09T14:21:39Z

so sure, that looks like a start where we just have some rules for these specific types, that is fine. But I wonder if making a more dramatic leap and just comparing all types on the string representation like in my example above returned by the dialect would solve everything at once ? as far as comparing the elements inside the types, yes that is also something that "doesn't work" etiher right now. the string comparison approach would likely help with that also but might also produce false positives.

pbecotte · 2019-11-08T01:21:42Z

I put in a first draft PR for this. As you guessed, using the type compiler gets you 90% of the way there- the only things left to handle are default precisions and the couple weird cases we already mentioned. All feedback welcome (am going to look at the constraint issue as well - have some ideas)

sqla-tester · 2020-01-18T15:45:34Z

Paul Becotte has proposed a fix for this issue in the master branch:

Update the type comparison code used for schema autogeneration. Compare https://gerrit.sqlalchemy.org/1561

zzzeek · 2020-02-04T18:26:14Z

split out the CHECK part of this, which I'm fairly pessimistic anything is going to really happen for a long time, into #652.

sqla-tester · 2020-02-04T19:01:50Z

Paul Becotte has proposed a fix for this issue in the master branch:

Update the type comparison code used for schema autogeneration. Compare https://gerrit.sqlalchemy.org/1561

zzzeek added question usage and API questions bug Something isn't working mysql and removed question usage and API questions labels Oct 1, 2019

zzzeek added the autogenerate - detection label Oct 1, 2019

zzzeek changed the title ~~Constraints are duplicated~~ MySQL Boolean autogenerate comparson to TINYINT; auto-constraints not dropped for MySQL change type Oct 1, 2019

zzzeek added data types autogenerate for enums a long term subject, tagging issues related to this labels Oct 1, 2019

zzzeek mentioned this issue Nov 6, 2019

Documentation surrounding compare_type context param. #613

Open

pbecotte mentioned this issue Nov 8, 2019

Update the type comparison code used for schema autogeneration. Compare #619

Closed

zzzeek changed the title ~~MySQL Boolean autogenerate comparson to TINYINT; auto-constraints not dropped for MySQL change type~~ MySQL Boolean autogenerate comparson to TINYINT Feb 4, 2020

zzzeek mentioned this issue Feb 4, 2020

CHECK constraints not dropped if one migrates from Boolean or Enum to another type #652

Open

sqlalchemy-bot closed this as completed in 3ddf82e Feb 4, 2020

This was referenced Feb 12, 2020

add some kind of option/recipe to turn TinyInt + length 1 into a boolean for mysql #84

Closed

detect changes in type parameters #180

Closed

jvanasco mentioned this issue May 27, 2020

MySQL 8.0.0 ER_DEPENDENT_BY_CHECK_CONSTRAINT error #699

Closed

This was referenced Jan 16, 2021

implement type detection change for MySQL ENUM #779

Open

Enum value changes are not detected in sqlite or mysql (non-native enum setting) #329

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MySQL Boolean autogenerate comparson to TINYINT #605

MySQL Boolean autogenerate comparson to TINYINT #605

johaven commented Oct 1, 2019 •

edited

Loading

zzzeek commented Oct 1, 2019

zzzeek commented Oct 1, 2019

johaven commented Oct 1, 2019

zzzeek commented Oct 1, 2019

johaven commented Oct 1, 2019 •

edited

Loading

zzzeek commented Oct 1, 2019

johaven commented Oct 1, 2019

johaven commented Oct 1, 2019

pbecotte commented Oct 7, 2019

zzzeek commented Oct 7, 2019

pbecotte commented Oct 8, 2019

zzzeek commented Oct 8, 2019

pbecotte commented Oct 8, 2019

zzzeek commented Oct 8, 2019

pbecotte commented Oct 8, 2019 •

edited

Loading

zzzeek commented Oct 9, 2019

pbecotte commented Nov 8, 2019

sqla-tester commented Jan 18, 2020

zzzeek commented Feb 4, 2020

sqla-tester commented Feb 4, 2020

MySQL Boolean autogenerate comparson to TINYINT #605

MySQL Boolean autogenerate comparson to TINYINT #605

Comments

johaven commented Oct 1, 2019 • edited Loading

zzzeek commented Oct 1, 2019

zzzeek commented Oct 1, 2019

johaven commented Oct 1, 2019

zzzeek commented Oct 1, 2019

johaven commented Oct 1, 2019 • edited Loading

zzzeek commented Oct 1, 2019

johaven commented Oct 1, 2019

johaven commented Oct 1, 2019

pbecotte commented Oct 7, 2019

zzzeek commented Oct 7, 2019

pbecotte commented Oct 8, 2019

zzzeek commented Oct 8, 2019

pbecotte commented Oct 8, 2019

zzzeek commented Oct 8, 2019

pbecotte commented Oct 8, 2019 • edited Loading

zzzeek commented Oct 9, 2019

pbecotte commented Nov 8, 2019

sqla-tester commented Jan 18, 2020

zzzeek commented Feb 4, 2020

sqla-tester commented Feb 4, 2020

johaven commented Oct 1, 2019 •

edited

Loading

johaven commented Oct 1, 2019 •

edited

Loading

pbecotte commented Oct 8, 2019 •

edited

Loading