v4.1.2 (2022-07-29)
Release Notes
Bug Fixes
-
pangres
was not running on Python 3.10 because I never addedpackaging
inrequirements.txt
. I fixed it by usingpkg_resources
from the standard library instead to avoid the additional dependency (see commit 89d3679)Note: the tests were running fine because we use
pytest
which usespackaging
(so I never saw the missing dependency)
Closed issues:
- Cannot insert into column "id" #60
v4.1.1 (2022-03-13)
Release Notes
Bug Fixes
- fixed bug where I used a synchronous method instead of its asynchronous variant (
UpsertQuery.execute
instead ofUpsertQuery.aexecute
inpangres.aupsert
). This has no repercussions for the end user
Documentation
- fix illogic code in example for
pangres.aupsert
(usingengine
instead ofconnection
in contexts) andcommit
which I had forgotten! - added changelog
Testing
- overhaul of the tests. asynchronous and synchronous tests have been separated
- module
test_upsert_end_to_end
has been renamed totest_core
Fixed bugs:
- Fix bad strategy of temporarily replacing async engines with synchronous engines in tests #57
Closed issues:
- Changelog (releases) #58
Merged pull requests:
v4.1 (2022-01-21)
Release Notes
New Features
- Added async support with function
pangres.aupsert
🚀 ! Tested usingaiosqlite
for SQlite,asyncpg
for PostgreSQL andaiomysql
for MySQL. See documentation in dedicated wiki page
Implemented enhancements:
Closed issues:
- Add async support? #47
v4.0.2 (2022-01-17)
Release Notes
This patches an important bug with MySQL. We recommend that all users upgrade to this version.
Bug Fixes
- Fixed bug where tables in MySQL where created with auto increment on the primary key (see #56)
Closed issues:
v4.0.1 (2022-01-13)
Release Notes
Bug Fixes
- removed warning due to deprecated code when checking versions of other libraries in Python >= 3.10 (see issue #54)
v4.0 (2022-01-12)
Release Notes
There have been important changes in the main function pangres.upsert
:
Breaking changes
- The first argument
engine
has been be renamed tocon
and now accepts engines and connections - The argument
chunksize
now defaults toNone
. Like inpandas.DataFrame.to_sql
we will attempt to insert all rows by default. Previously the default was10000
rows. - There will be no more automatic adjustments to the
chunksize
the user passes inpangres.upsert
even if we can predict that it will raise an Exception due to database limitations.
E.g. inserting 100000 rows at once in a SQlite database with pangres
will necessarily raise an Exception down the line because we need to pass NUMBER_OF_ROWS * NUMBER_OF_COLUMNS parameters and the maximum of parameters allowed in SQLite is 32766 (for version >= 3.32.0, otherwise it is 999).
I have made a new utility function pangres.adjust_chunksize
that you can use before calling pangres.upsert
if you want to make sure the chunksize
is not too big.
New Features
Now that pangres.upsert
accept connections objects this will give you more control when using it when it comes to connections and transactions.
See transaction control demo notebook.
Closed issues:
Merged pull requests:
- [CORE] Directly move to v4 #53 (ThibTrip)
- [CORE] Default
chunksize
toNone
#52 (ThibTrip) - [FEAT] Add transaction and connection control #50 (ThibTrip)
v3.1 (2022-01-07)
Release Notes
Bug Fixes
- fixed wrong version number for SQLite when checking how many parameters are allowed (see commit a60c61e)
Improvements
- when using
pangres.upsert
all operations will be done within a transaction - when using
pangres.upsert
all operations will be done using a single connection
Closed issues:
Merged pull requests:
v3.0 (2021-12-06)
Release Notes
New Features
- added option
create_table
topangres.upsert
for disabling table creation/check (CREATE TABLE IF NOT EXISTS...
statement) that was always issued by this function. This can speed things up a little bit if you are sure that the target SQL table already exists pangres
should already be compatible with the futuresqlalchemy
version 2 (unless something in the API changes in the future of course). You can try this by using the flagfuture=True
insqlalchemy.create_engine
. For instancecreate_engine('sqlite:///', future=True)
Improvements
- allowed more SQL parameters for newer versions of SQLite (>=3.22.0): from 999 to 32766 (see issue #43)
- improved error messages (e.g. showing duplicated labels)
Bug Fixes
- when using parameter
yield_chunks=True
inpangres.upsert
with an empty DataFrame we will now return an empty generator instead of None in order to ensure data type consistency - fixed problem with log levels not being respected when pangres logs something (see commit #c494c95)
- fixed problem with the logger not filtering properly when the environment variable "PANGRES_LOG_LEVEL" is set (see commit #c494c95)
Breaking Changes
These changes are all related to new exceptions in pangres
. If you weren't catching specific exceptions from pangres
before this should not change anything for you.
- when
create_schema=True
,pangres.upsert
will now raise the custom exceptionpangres.exceptions.HasNoSchemaSystemException
if given database does not support schemas (AFAIK this only exist in PostgreSQL) but a schema was provided (schema
is not None) - in the presence of problematic column names (e.g. column names with "(", ")" or "%" for PostgreSQL)
pangres.upsert
will now raise the custom exceptionpangres.exceptions.BadColumnNamesException
- in the presence of duplicated labels (duplicates amongst columns/index levels or both)
pangres.upsert
andpangres.fix_psycopg2_bad_cols
will now raise the custom exceptionpangres.exceptions.DuplicateLabelsException
- in the presence of unnamed index levels
pangres.upsert
andpangres.fix_psycopg2_bad_cols
will now raise the custom exceptionpangres.exceptions.UnnamedIndexLevelsException
- in the presence of duplicates amongst the index values
pangres.upsert
will now raise the custom exceptionpangres.exceptions.DuplicateValuesInIndexException
- in
pangres.upsert
, whenadd_missing_columns
is True but one of the columns to be added in the SQL table is part of the df's index,pangres
will now raise the custom exceptionpangres.exceptions.MissingIndexLevelInSqlException
Documentation
- added notes on logging
- added a demo in the form of a notebook
Development
- rewrote the
upsert
module and separated the creation of the queries from their execution. The code should be easier to maintain and understand
Testing
- made it possible to test one or more databases types instead of always all of them
- added testing with
future=True
flag - further improved coverage thanks to many new tests, ignoring coverage for tests (some functionalities of pytest caused coverage to be missed) and ignoring some lines
- added ids for tests to better read parameters used in pytest
- added context managers to ensure tables don't exist before and after tests in the database
Closed issues:
- SQLite3 Limit & Performance #43
- Does not upsert #39
- Upsert raises Index Error : #38
- [Question] Is there any plan for testing upsert with sqlalchemy 2.0 API? #37
- Update Wiki link to Pandas getting started guide #36
Merged pull requests:
- [FIX] allow more SQL parameters for sqlite >= 3.22.0 #45 (ThibTrip)
- [FEAT] Make the library compatible with sqlalchemy v2 #41 (ThibTrip)
- [CORE] New structure in preparation for Sqlalchemy v2.0 and asynchronous upsert #40 (ThibTrip)
v2.3.1 (2021-06-15)
Release Notes
Bugfixes
- Having a column named
values
will not raise errors anymore when usingpangres.upsert
. See issue #34
Closed issues:
- [Bug] db field name is keywords will raise
AttributeError: 'function' object has no attribute 'translate'
#34
Merged pull requests:
v2.3 (2021-06-01)
Release Notes
New Features
- Added
yield_chunks
parameter to the main functionpangres.upsert
. When True, this will yield the result of inserted chunks (sqlalchemy.engine.cursor.LegacyCursorResult objects). This allows you to notably count upserted rows for each chunk. See issue #32.
Improvements
- The context manager for the connection to the database to make upserts was being called unnecessarily early. I have now put it at the very last step to minimize the connection time. See commit 4573588
Bugfixes
- The value of the
chunksize
parameter was mistakenly being modified when upserting to a SQlite database because of a wrong indentation in the code. See commit da2e9aa
Closed issues:
- Metric number of inserted rows #32
Merged pull requests:
v2.2.4 (2021-05-01)
Release Notes
Bugfixes
pangres
is now also compatible withsqlalchemy>=1.4
. Important: make sure to getpandas>=1.2.4
as well to avoid deprecation warnings
Closed issues:
- Does the data frame index have to be the same as the primary key column? #29
- Issue with "update" and "ignore" in Pangres #28
- Error when trying to write a SQL database in MS Azure #25
Merged pull requests:
v2.2.3 (2020-12-15)
Release Notes
Bugfixes
- Fixed case where upsert was not possible with DataFrames that have no columns (only index) (see #26)
- Fixes failed release 2.2.2 (version commit was done afterwards)
Closed issues:
- how to write to a table with only one column #26
Closed issues:
- [Question] Write DataFrame index as a column argument #24
Merged pull requests:
v2.2.1 (2020-10-11)
Release Notes
Bugfixes
- fixed logging issue where pangres' logging formatting would take over once imported (it would override your own configuration, see #20)
- fixed logging issue where pangres' logs would not be written to file (see #18)
- fixed conda environment file (the library
npdoc_to_md
has to be installed viapip
, see commit 44d5423)
Improvements
- Added version attribute (see PR #22):
import pangres
pangres.__version__
2.2.1
Closed issues:
- Pangres does something to the root logger that now everything identifies as pangres #20
- Logging set up in utils.py can conflict with logging profiles in calling script #18
Merged pull requests:
- [ENH]: add __version__ attribute #22 (ThibTrip)
- [BUGFIX]: Second attempt at fixing logging #21 (ThibTrip)
- [BUGFIX] Configure the logger instead of configuring logging #19 (ThibTrip)
v2.2 (2020-08-22)
Release Notes
Breaking changes
- Removal of previously deprecated function
pangres.pg_upsert
(usepangres.upsert
instead) - In function
pangres.upsert
the argumentscreate_schema
andadd_new_columns
have been set to False by default (they were previously both set to True by default) as this would be the common expectation (i.e. failing on missing schema or missing column). See #15. Thanks to @lsloan and @rajrohan.
Closed issues:
- make
add_new_columns
default toFalse
#15 - It is actually compatible with unique constraint #12
- Primary key requirement #11
Merged pull requests:
- [CORE] Remove pg upsert as planned (deprecated) #17 (ThibTrip)
- [CORE] set create_schema and add_new_columns to False by default #16 (ThibTrip)
- [DOC] Indicate that we can use unique keys and not just PKs #14 (ThibTrip)
- [TEST] Add tests that show pangres also works with unique keys #13 (ThibTrip)
v2.1 (2020-04-11)
Release Notes
Bugfixes
- Fixed bug where the "ON...CONFLICT" statement would be repeated leading to a syntax error in SQL insert statements
Changes for developers
- Testing instructions changed
- Added necessary tools to generate the documentation (see folder pangres/docs)
Fixed bugs:
- BUG: fix coverage #7
Merged pull requests:
- TEST: Add testing via doctest #10 (ThibTrip)
- BUGFIX: Improve coverage #9 (ThibTrip)
- BUGFIX: fix issue where "ON CONFLICT" clause appears twice in the insert query #8 (ThibTrip)
- BUGFIX: fix bugs in tests #6 (ThibTrip)
- DOC: Fix broken anchor link #5 (ThibTrip)
v2 (2020-04-04)
Release Notes
New features
- Added support for MySQL and SQlite 🎉 !
- Completely SQL injection safe (everything is escaped or parameterized including schema, table and column names)
- Documentation improved
- Logo added ! (I am not much of a graphic designer but I did my best 🙈)
Deprecations
- pangres.pg_upsert became pangres.upsert (to reflect the fact that pangres can now handle other databases than postgres). pangres.pg_upsert will be removed in the next version!
- the argument "if_exists" of the old function pangres.pg_upsert was removed in the new pangres.upsert in favor of the argument "if_row_exists" whose functionnality is clearer. The equivalent of
if_exists="upsert_overwrite"
is nowif_row_exists="update"
andif_exists="upsert_keep"
is nowif_row_exists="ignore"
Breaking changes
- Contrary to the old function pangres.pg_upsert the new pangres.upsert function does not clean "bad" column names automatically for postgres. An error will be raised if any column contains "%" or "(" or ")" but you can use pangres.fix_psycopg2_bad_cols to fix such problems.
Changes for developers
- Testing improved! You can provide a connection string in the command line directly 👍 !
Merged pull requests:
- DOC: Update README.md #4 (ThibTrip)
- CORE: Add support for other databases (MySQL, SQlite and potentially others) #3 (ThibTrip)
v1.3.1 (2020-02-16)
Release Notes
- Drastically improved speed for small datasets (<= 100 rows and <= 20 columns)
Merged pull requests:
v1.3 (2020-01-31)
Release Notes
* This Changelog was automatically generated by github_changelog_generator