Skip to content

Latest commit

 

History

History
568 lines (289 loc) · 21.6 KB

CHANGELOG.md

File metadata and controls

568 lines (289 loc) · 21.6 KB

Changelog

v4.1.2 (2022-07-29)

Release Notes


Bug Fixes

  • pangres was not running on Python 3.10 because I never added packaging in requirements.txt. I fixed it by using pkg_resources from the standard library instead to avoid the additional dependency (see commit 89d3679)

    Note: the tests were running fine because we use pytest which uses packaging (so I never saw the missing dependency)


Full Changelog

Closed issues:

  • Cannot insert into column "id" #60

v4.1.1 (2022-03-13)

Release Notes


Bug Fixes

  • fixed bug where I used a synchronous method instead of its asynchronous variant (UpsertQuery.execute instead of UpsertQuery.aexecute in pangres.aupsert). This has no repercussions for the end user

Documentation

  • fix illogic code in example for pangres.aupsert (using engine instead of connection in contexts) and commit which I had forgotten!
  • added changelog

Testing

  • overhaul of the tests. asynchronous and synchronous tests have been separated
  • module test_upsert_end_to_end has been renamed to test_core

Full Changelog

Fixed bugs:

  • Fix bad strategy of temporarily replacing async engines with synchronous engines in tests #57

Closed issues:

  • Changelog (releases) #58

Merged pull requests:

v4.1 (2022-01-21)

Release Notes


New Features

  • Added async support with function pangres.aupsert 🚀 ! Tested using aiosqlite for SQlite, asyncpg for PostgreSQL and aiomysql for MySQL. See documentation in dedicated wiki page

Full Changelog

Implemented enhancements:

  • [FEAT] Add pangres.aupsert (async variant) #55 (ThibTrip)

Closed issues:

  • Add async support? #47

v4.0.2 (2022-01-17)

Release Notes


This patches an important bug with MySQL. We recommend that all users upgrade to this version.

Bug Fixes

  • Fixed bug where tables in MySQL where created with auto increment on the primary key (see #56)

Full Changelog

Closed issues:

  • MySQL creating undesired auto incrementing primary keys #56
  • Get rid of the LooseVersion warning #54

v4.0.1 (2022-01-13)

Release Notes


Bug Fixes

  • removed warning due to deprecated code when checking versions of other libraries in Python >= 3.10 (see issue #54)

Full Changelog

v4.0 (2022-01-12)

Release Notes


There have been important changes in the main function pangres.upsert:

Breaking changes

  1. The first argument engine has been be renamed to con and now accepts engines and connections
  2. The argument chunksize now defaults to None. Like in pandas.DataFrame.to_sql we will attempt to insert all rows by default. Previously the default was 10000 rows.
  3. There will be no more automatic adjustments to the chunksize the user passes in pangres.upsert even if we can predict that it will raise an Exception due to database limitations.

E.g. inserting 100000 rows at once in a SQlite database with pangres will necessarily raise an Exception down the line because we need to pass NUMBER_OF_ROWS * NUMBER_OF_COLUMNS parameters and the maximum of parameters allowed in SQLite is 32766 (for version >= 3.32.0, otherwise it is 999).

I have made a new utility function pangres.adjust_chunksize that you can use before calling pangres.upsert if you want to make sure the chunksize is not too big.

New Features

Now that pangres.upsert accept connections objects this will give you more control when using it when it comes to connections and transactions.

See transaction control demo notebook.


Full Changelog

Closed issues:

  • Set default chunksize to None #51
  • Option to Disable commit in execute() #44

Merged pull requests:

v3.1 (2022-01-07)

Release Notes


Bug Fixes

  • fixed wrong version number for SQLite when checking how many parameters are allowed (see commit a60c61e)

Improvements

  • when using pangres.upsert all operations will be done within a transaction
  • when using pangres.upsert all operations will be done using a single connection

Full Changelog

Closed issues:

  • Regarding limit of Upserting #49
  • Using a transaction with only one connection #46

Merged pull requests:

  • [FEAT] Run all of pangres' operations at connection level and within a transaction #48 (ThibTrip)

v3.0 (2021-12-06)

Release Notes


New Features

  • added option create_table to pangres.upsert for disabling table creation/check (CREATE TABLE IF NOT EXISTS... statement) that was always issued by this function. This can speed things up a little bit if you are sure that the target SQL table already exists
  • pangres should already be compatible with the future sqlalchemy version 2 (unless something in the API changes in the future of course). You can try this by using the flag future=True in sqlalchemy.create_engine. For instance create_engine('sqlite:///', future=True)

Improvements

  • allowed more SQL parameters for newer versions of SQLite (>=3.22.0): from 999 to 32766 (see issue #43)
  • improved error messages (e.g. showing duplicated labels)

Bug Fixes

  • when using parameter yield_chunks=True in pangres.upsert with an empty DataFrame we will now return an empty generator instead of None in order to ensure data type consistency
  • fixed problem with log levels not being respected when pangres logs something (see commit #c494c95)
  • fixed problem with the logger not filtering properly when the environment variable "PANGRES_LOG_LEVEL" is set (see commit #c494c95)

Breaking Changes

These changes are all related to new exceptions in pangres. If you weren't catching specific exceptions from pangres before this should not change anything for you.

  • when create_schema=True, pangres.upsert will now raise the custom exception pangres.exceptions.HasNoSchemaSystemException if given database does not support schemas (AFAIK this only exist in PostgreSQL) but a schema was provided (schema is not None)
  • in the presence of problematic column names (e.g. column names with "(", ")" or "%" for PostgreSQL) pangres.upsert will now raise the custom exception pangres.exceptions.BadColumnNamesException
  • in the presence of duplicated labels (duplicates amongst columns/index levels or both) pangres.upsert and pangres.fix_psycopg2_bad_cols will now raise the custom exception pangres.exceptions.DuplicateLabelsException
  • in the presence of unnamed index levels pangres.upsert and pangres.fix_psycopg2_bad_cols will now raise the custom exception pangres.exceptions.UnnamedIndexLevelsException
  • in the presence of duplicates amongst the index values pangres.upsert will now raise the custom exception pangres.exceptions.DuplicateValuesInIndexException
  • in pangres.upsert, when add_missing_columns is True but one of the columns to be added in the SQL table is part of the df's index, pangres will now raise the custom exception pangres.exceptions.MissingIndexLevelInSqlException

Documentation

  • added notes on logging
  • added a demo in the form of a notebook

Development

  • rewrote the upsert module and separated the creation of the queries from their execution. The code should be easier to maintain and understand

Testing

  • made it possible to test one or more databases types instead of always all of them
  • added testing with future=True flag
  • further improved coverage thanks to many new tests, ignoring coverage for tests (some functionalities of pytest caused coverage to be missed) and ignoring some lines
  • added ids for tests to better read parameters used in pytest
  • added context managers to ensure tables don't exist before and after tests in the database

Full Changelog

Closed issues:

  • SQLite3 Limit & Performance #43
  • Does not upsert #39
  • Upsert raises Index Error : #38
  • [Question] Is there any plan for testing upsert with sqlalchemy 2.0 API? #37
  • Update Wiki link to Pandas getting started guide #36

Merged pull requests:

  • [FIX] allow more SQL parameters for sqlite >= 3.22.0 #45 (ThibTrip)
  • [FEAT] Make the library compatible with sqlalchemy v2 #41 (ThibTrip)
  • [CORE] New structure in preparation for Sqlalchemy v2.0 and asynchronous upsert #40 (ThibTrip)

v2.3.1 (2021-06-15)

Release Notes


Bugfixes

  • Having a column named values will not raise errors anymore when using pangres.upsert. See issue #34

Full Changelog

Closed issues:

  • [Bug] db field name is keywords will raise AttributeError: 'function' object has no attribute 'translate' #34

Merged pull requests:

  • [BUGFIX] Make it possible to use a column named values #35 (ThibTrip)

v2.3 (2021-06-01)

Release Notes


New Features

  • Added yield_chunks parameter to the main function pangres.upsert. When True, this will yield the result of inserted chunks (sqlalchemy.engine.cursor.LegacyCursorResult objects). This allows you to notably count upserted rows for each chunk. See issue #32.

Improvements

  • The context manager for the connection to the database to make upserts was being called unnecessarily early. I have now put it at the very last step to minimize the connection time. See commit 4573588

Bugfixes

  • The value of the chunksize parameter was mistakenly being modified when upserting to a SQlite database because of a wrong indentation in the code. See commit da2e9aa

Full Changelog

Closed issues:

  • Metric number of inserted rows #32

Merged pull requests:

  • [FEAT] Add yield chunks parameter (for getting information on upserted chunks) #33 (ThibTrip)

v2.2.4 (2021-05-01)

Release Notes


Bugfixes

  • pangres is now also compatible with sqlalchemy>=1.4. Important: make sure to get pandas>=1.2.4 as well to avoid deprecation warnings

Full Changelog

Closed issues:

  • Does the data frame index have to be the same as the primary key column? #29
  • Issue with "update" and "ignore" in Pangres #28
  • Error when trying to write a SQL database in MS Azure #25

Merged pull requests:

v2.2.3 (2020-12-15)

Release Notes


Bugfixes

  • Fixed case where upsert was not possible with DataFrames that have no columns (only index) (see #26)
  • Fixes failed release 2.2.2 (version commit was done afterwards)

Full Changelog

Closed issues:

  • how to write to a table with only one column #26

Full Changelog

Closed issues:

  • [Question] Write DataFrame index as a column argument #24

Merged pull requests:

  • [BUGFIX] Fix upsert with df with no columns (only index) #27 (ThibTrip)

v2.2.1 (2020-10-11)

Release Notes


Bugfixes

  • fixed logging issue where pangres' logging formatting would take over once imported (it would override your own configuration, see #20)
  • fixed logging issue where pangres' logs would not be written to file (see #18)
  • fixed conda environment file (the library npdoc_to_md has to be installed via pip, see commit 44d5423)

Improvements

  • Added version attribute (see PR #22):
import pangres
pangres.__version__
2.2.1

Full Changelog

Closed issues:

  • Pangres does something to the root logger that now everything identifies as pangres #20
  • Logging set up in utils.py can conflict with logging profiles in calling script #18

Merged pull requests:

  • [ENH]: add __version__ attribute #22 (ThibTrip)
  • [BUGFIX]: Second attempt at fixing logging #21 (ThibTrip)
  • [BUGFIX] Configure the logger instead of configuring logging #19 (ThibTrip)

v2.2 (2020-08-22)

Release Notes


Breaking changes

  • Removal of previously deprecated function pangres.pg_upsert (use pangres.upsert instead)
  • In function pangres.upsert the arguments create_schema and add_new_columns have been set to False by default (they were previously both set to True by default) as this would be the common expectation (i.e. failing on missing schema or missing column). See #15. Thanks to @lsloan and @rajrohan.

Full Changelog

Closed issues:

  • make add_new_columns default to False #15
  • It is actually compatible with unique constraint #12
  • Primary key requirement #11

Merged pull requests:

  • [CORE] Remove pg upsert as planned (deprecated) #17 (ThibTrip)
  • [CORE] set create_schema and add_new_columns to False by default #16 (ThibTrip)
  • [DOC] Indicate that we can use unique keys and not just PKs #14 (ThibTrip)
  • [TEST] Add tests that show pangres also works with unique keys #13 (ThibTrip)

v2.1 (2020-04-11)

Release Notes


Bugfixes

  • Fixed bug where the "ON...CONFLICT" statement would be repeated leading to a syntax error in SQL insert statements

Changes for developers

  • Testing instructions changed
  • Added necessary tools to generate the documentation (see folder pangres/docs)

Full Changelog

Fixed bugs:

  • BUG: fix coverage #7

Merged pull requests:

  • TEST: Add testing via doctest #10 (ThibTrip)
  • BUGFIX: Improve coverage #9 (ThibTrip)
  • BUGFIX: fix issue where "ON CONFLICT" clause appears twice in the insert query #8 (ThibTrip)
  • BUGFIX: fix bugs in tests #6 (ThibTrip)
  • DOC: Fix broken anchor link #5 (ThibTrip)

v2 (2020-04-04)

Release Notes


New features

  • Added support for MySQL and SQlite 🎉 !
  • Completely SQL injection safe (everything is escaped or parameterized including schema, table and column names)
  • Documentation improved
  • Logo added ! (I am not much of a graphic designer but I did my best 🙈)

Deprecations

  • pangres.pg_upsert became pangres.upsert (to reflect the fact that pangres can now handle other databases than postgres). pangres.pg_upsert will be removed in the next version!
  • the argument "if_exists" of the old function pangres.pg_upsert was removed in the new pangres.upsert in favor of the argument "if_row_exists" whose functionnality is clearer. The equivalent of if_exists="upsert_overwrite" is now if_row_exists="update" and if_exists="upsert_keep" is now if_row_exists="ignore"

Breaking changes

  • Contrary to the old function pangres.pg_upsert the new pangres.upsert function does not clean "bad" column names automatically for postgres. An error will be raised if any column contains "%" or "(" or ")" but you can use pangres.fix_psycopg2_bad_cols to fix such problems.

Changes for developers

  • Testing improved! You can provide a connection string in the command line directly 👍 !

Full Changelog

Merged pull requests:

  • DOC: Update README.md #4 (ThibTrip)
  • CORE: Add support for other databases (MySQL, SQlite and potentially others) #3 (ThibTrip)

v1.3.1 (2020-02-16)

Release Notes


  • Drastically improved speed for small datasets (<= 100 rows and <= 20 columns)

Full Changelog

Merged pull requests:

v1.3 (2020-01-31)

Release Notes



Full Changelog

* This Changelog was automatically generated by github_changelog_generator