Databases Docs Update (#339)
* Overview + MySQL

* Finish our DB

- Add Databases Overview
- Postgres Overview/Auth/Quickstart
- DBSync additional overview.

* Move DBSync to its own framework.
jburchard authored Aug 10, 2020
1 parent 66c307b commit 2ddacd5
Showing 5 changed files with 124 additions and 44 deletions.
1 change: 1 addition & 0 deletions docs/aws.rst
@@ -31,6 +31,7 @@ API
:inherited-members:
:members:
.. _redshift:
********
Redshift
********
112 changes: 68 additions & 44 deletions docs/databases.rst
@@ -1,81 +1,105 @@
Databases
=========

********
Overview
********

Parsons offers support for a variety of popular SQL database dialects. The functionality is focused on the ability to query and upload data to SQL databases. Each database class also includes the ability to infer datatypes and data schemas from a Parsons table and automatically create new tables.

Similar to other classes in Parsons, the query methods for databases all return a :ref:`parsons-table`, which allows the results to be easily converted to other data types.

There is also support for synchronization of tables between databases as part of the :doc:`dbsync` framework.
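The datatype inference mentioned above can be pictured as a routine that scans a column's values and widens the inferred SQL type as it goes. The following is a standalone illustration of the idea (a hypothetical `infer_sql_type` helper, not Parsons' actual implementation):

```python
def infer_sql_type(values):
    """Widen the column type as values are scanned: SMALLINT -> FLOAT -> VARCHAR."""
    rank = {'SMALLINT': 0, 'FLOAT': 1, 'VARCHAR': 2}
    current = 'SMALLINT'
    for v in values:
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            candidate = 'VARCHAR'  # anything non-numeric falls back to text
        elif isinstance(v, float):
            candidate = 'FLOAT'
        else:
            candidate = 'SMALLINT'
        if rank[candidate] > rank[current]:
            current = candidate  # only ever widen, never narrow
    return current

print(infer_sql_type([1, 2, 3]))    # SMALLINT
print(infer_sql_type([1, 2.5]))     # FLOAT
print(infer_sql_type([1, 'two']))   # VARCHAR
```

A single string anywhere in the column forces the whole column to a text type, which is why a mixed column uploads as `VARCHAR` rather than failing.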

***************
Google BigQuery
***************

See :doc:`google` for documentation.


.. _my-sql:

*****
MySQL
*****

MySQL is the world's most popular open source database. The Parsons class leverages the `mysql <https://github.com/farcepest/MySQLdb1>`_ python package.

===========
Quick Start
===========

**Authentication**

.. code-block:: python

   from parsons import MySQL

   # Instantiate MySQL from environmental variables
   mysql = MySQL()

   # Instantiate MySQL from passed variables
   mysql = MySQL(username='me', password='secret', host='mydb.com', db='dev', port=3306)

**Quick Start**

.. code-block:: python

   # Query database
   tbl = mysql.query('select * from my_schema.secret_sauce')

   # Copy data to database
   tbl = Table.from_csv('my_file.csv')  # Load from a CSV or other source.
   mysql.copy(tbl, 'my_schema.winning_formula')

.. autoclass:: parsons.MySQL
   :inherited-members:
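The `copy` call above relies on schema inference: a `CREATE TABLE` statement is generated from the table's columns before the data is loaded. A rough standalone sketch of that step (a hypothetical `create_table_ddl` helper with a simplified type map, not Parsons' actual code):

```python
def create_table_ddl(table_name, rows):
    """Generate a CREATE TABLE statement from the first row's Python types."""
    type_map = {int: 'INTEGER', float: 'DOUBLE', str: 'VARCHAR(1024)'}
    cols = ', '.join(
        f"{col} {type_map.get(type(val), 'VARCHAR(1024)')}"
        for col, val in rows[0].items()
    )
    return f"CREATE TABLE {table_name} ({cols});"

ddl = create_table_ddl('my_schema.winning_formula',
                       [{'id': 1, 'score': 9.5, 'name': 'bbq'}])
print(ddl)
# CREATE TABLE my_schema.winning_formula (id INTEGER, score DOUBLE, name VARCHAR(1024));
```

The real implementation samples many rows rather than just the first, so mixed-type columns widen safely instead of failing on load.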

.. _postgres:
********
Postgres
********

Postgres is a popular open source SQL database dialect. The Parsons class leverages the `psycopg2 <https://www.psycopg.org/>`_ python package.

===========
Quick Start
===========

**Authentication**

.. code-block:: python

   from parsons import Postgres

   # Instantiate Postgres from environmental variables
   pg = Postgres()

   # Instantiate Postgres from passed variables
   pg = Postgres(username='me', password='secret', host='mydb.com', db='dev', port=5432)

   # Instantiate Postgres from a ~/.pgpass file
   pg = Postgres()

**Quick Start**

.. code-block:: python

   # Query database
   tbl = pg.query('select * from my_schema.secret_sauce')

   # Copy data to database
   tbl = Table.from_csv('my_file.csv')  # Load from a CSV or other source.
   pg.copy(tbl, 'my_schema.winning_formula')

.. autoclass:: parsons.Postgres
   :inherited-members:

********
Redshift
********

See the :doc:`aws` section for documentation.
53 changes: 53 additions & 0 deletions docs/dbsync.rst
@@ -0,0 +1,53 @@
*************
Database Sync
*************

The database sync framework allows tables to be synced between two databases with just a few lines of code. Currently supported
database types are:

* :ref:`Google Big Query <gbq>`
* :ref:`MySQL <my-sql>`
* :ref:`Postgres <postgres>`
* :ref:`Redshift <redshift>`

The ``DBSync`` class is not a connector itself, but rather a class that joins two database classes together and moves data seamlessly between them.

===========
Quick Start
===========

**Full Sync Of Tables**

Copy all data from a source table to a destination table.

.. code-block:: python

   # Create source and destination database objects
   source_rs = Redshift()
   destination_pg = Postgres()

   # Create the db sync object and run the sync
   db_sync = DBSync(source_rs, destination_pg)
   db_sync.table_sync_full('parsons.source_data', 'parsons.destination_data')

**Incremental Sync of Tables**

Copy only new data in the table. Use this method for tables with
a distinct primary key.

.. code-block:: python

   # Create source and destination database objects
   source_pg = Postgres()
   destination_pg = Postgres()

   # Create the db sync object and run the sync
   db_sync = DBSync(source_pg, destination_pg)
   db_sync.table_sync_incremental('parsons.source_data', 'parsons.destination_data', 'myid')
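Under the hood, an incremental sync amounts to finding the destination table's highest primary key and copying only the source rows above it. A minimal standalone sketch of that idea, using plain Python dicts in place of real tables (not Parsons' actual implementation):

```python
def incremental_rows(source_rows, dest_rows, pk):
    """Return the source rows whose primary key exceeds the destination's max."""
    max_pk = max((r[pk] for r in dest_rows), default=None)
    if max_pk is None:
        return list(source_rows)  # destination empty: full copy
    return [r for r in source_rows if r[pk] > max_pk]

src = [{'myid': 1}, {'myid': 2}, {'myid': 3}]
dst = [{'myid': 1}]
print(incremental_rows(src, dst, 'myid'))  # [{'myid': 2}, {'myid': 3}]
```

This is also why the method requires a distinct, monotonically increasing primary key: rows updated in place (with an unchanged key) are never re-copied.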
===
API
===

.. autoclass:: parsons.DBSync
:inherited-members:
1 change: 1 addition & 0 deletions docs/google.rst
@@ -4,6 +4,7 @@ Google
Google Cloud services utilize a credentials JSON file for authentication. If you are the administrator of your Google Cloud account,
you can generate them in the `Google Cloud Console APIs and Services <https://console.cloud.google.com/apis/credentials/serviceaccountkey?_ga=2.116342342.-1334320118.1565013288>`_.

.. _gbq:
********
BigQuery
********
1 change: 1 addition & 0 deletions docs/index.rst
@@ -208,6 +208,7 @@ Indices and tables
:caption: Framework
:name: framework

dbsync
table
notifications

