Database Sync
The database sync framework lets you sync tables between two databases with just a few lines of code. Currently supported database types include Postgres, Redshift, and MySQL.
The DBSync class is not a connector. Instead, it takes two database connector objects and moves data seamlessly between them.
Quick Start
Full Sync of Tables
Copy all data from a source table to a destination table.
from parsons import DBSync, Postgres, Redshift

# Create source and destination database objects
source_rs = Redshift()
destination_pg = Postgres()
# Create the DBSync object and run the full sync
db_sync = DBSync(source_rs, destination_pg)
db_sync.table_sync_full('parsons.source_data', 'parsons.destination_data')
Incremental Sync of Tables
Copy only the rows that are new in the source table. Use this method for tables with distinct, incrementing primary keys.
from parsons import DBSync, Postgres

# Create source and destination database objects
source_pg = Postgres()
destination_pg = Postgres()
# Create the DBSync object and run the incremental sync on the 'myid' primary key
db_sync = DBSync(source_pg, destination_pg)
db_sync.table_sync_incremental('parsons.source_data', 'parsons.destination_data', 'myid')
API
- class parsons.DBSync(source_db, destination_db, chunk_size=100000)
Sync tables between databases. Works with Postgres, Redshift, and MySQL databases.
- Args:
- source_db: Database connection object
A database object.
- destination_db: Database connection object
A database object.
- chunk_size: int
The number of rows copied per transaction when syncing a table. The default value is 100,000 rows.
- Returns:
A DBSync object.
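To illustrate what chunk_size controls, here is a minimal sketch of copying a table in fixed-size chunks, one transaction per chunk. It is not the DBSync implementation: it assumes two in-memory SQLite databases as stand-ins for the source and destination, and copy_in_chunks and the people table are hypothetical names introduced for the example.

```python
import sqlite3


def copy_in_chunks(source, destination, table, chunk_size=100000):
    """Copy a two-column (id, name) table, committing one chunk per transaction."""
    cursor = source.execute(f"SELECT id, name FROM {table} ORDER BY id")
    chunks = 0
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        # `with destination` commits the chunk on success and rolls back on error.
        with destination:
            destination.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
        chunks += 1
    return chunks


# Two in-memory SQLite databases stand in for the real source and destination.
source = sqlite3.connect(":memory:")
destination = sqlite3.connect(":memory:")
for conn in (source, destination):
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
with source:
    source.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [(i, f"person_{i}") for i in range(1, 11)],
    )

# Ten rows with a chunk size of four are copied as chunks of 4, 4, and 2 rows.
chunks_copied = copy_in_chunks(source, destination, "people", chunk_size=4)
copied_rows = destination.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

A smaller chunk size keeps each transaction short at the cost of more round trips; a larger one does the opposite.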
- table_sync_full(source_table, destination_table, if_exists='drop', **kwargs)
Full sync of table from a source database to a destination database. This will wipe all data from the destination table.
- Args:
- source_table: str
Full table path (e.g. my_schema.my_table).
- destination_table: str
Full table path (e.g. my_schema.my_table).
- if_exists: str
If the destination table exists, either drop or truncate it. Truncate is useful when there are dependent views associated with the table.
- **kwargs: args
Optional copy arguments for destination database.
- Returns:
None
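The difference between the two if_exists modes can be sketched as follows. This is an illustration under stated assumptions, not how DBSync is implemented: full_sync, the people table, and the people_names view are hypothetical, and SQLite stands in for the real databases. SQLite has no TRUNCATE statement, so the sketch uses DELETE to empty the table in place.

```python
import sqlite3


def full_sync(source, destination, table, if_exists="drop"):
    """Replace the contents of a two-column (id, name) table in destination."""
    if if_exists not in ("drop", "truncate"):
        raise ValueError("if_exists must be 'drop' or 'truncate'")
    rows = source.execute(f"SELECT id, name FROM {table}").fetchall()
    with destination:
        if if_exists == "drop":
            # Drop and recreate: the table definition is rebuilt from scratch.
            destination.execute(f"DROP TABLE IF EXISTS {table}")
            destination.execute(
                f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)"
            )
        else:
            # Empty the table in place, so objects that depend on it stay valid.
            destination.execute(f"DELETE FROM {table}")
        destination.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return len(rows)


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO people VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
source.commit()

# The destination already holds stale data and a view that depends on the table.
destination = sqlite3.connect(":memory:")
destination.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
destination.execute("INSERT INTO people VALUES (99, 'stale row')")
destination.execute("CREATE VIEW people_names AS SELECT name FROM people")
destination.commit()

full_sync(source, destination, "people", if_exists="truncate")
names_via_view = [
    row[0]
    for row in destination.execute("SELECT name FROM people_names ORDER BY name")
]
```

After the truncate-style sync, the stale row is gone and the dependent view reads the freshly copied rows without being recreated.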
- table_sync_incremental(source_table, destination_table, primary_key, distinct_check=True, **kwargs)
Incremental sync of table from a source database to a destination database using an incremental primary key.
- Args:
- source_table: str
Full table path (e.g. my_schema.my_table).
- destination_table: str
Full table path (e.g. my_schema.my_table).
- if_exists: str
If the destination table exists, either drop or truncate it. Truncate is useful when there are dependent views associated with the table.
- primary_key: str
The name of the primary key. This must be the same for the source and destination table.
- distinct_check: bool
Check that the source table primary key is distinct prior to running the sync. If it is not, an error will be raised.
- **kwargs: args
Optional copy arguments for destination database.
- Returns:
None
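One plausible way to read the incremental sync and its distinct_check is sketched below. This is an assumption about the general technique, not the DBSync source: incremental_sync and the events table are hypothetical, SQLite stands in for the real databases, and "new" rows are taken to be those whose primary key exceeds the largest key already in the destination.

```python
import sqlite3


def incremental_sync(source, destination, table, primary_key, distinct_check=True):
    """Copy rows whose key is greater than the largest key already in destination."""
    if distinct_check:
        total, distinct = source.execute(
            f"SELECT COUNT(*), COUNT(DISTINCT {primary_key}) FROM {table}"
        ).fetchone()
        if total != distinct:
            raise ValueError(f"{primary_key} is not distinct in source table {table}")
    # MAX() returns NULL (None in Python) when the destination table is empty.
    max_key = destination.execute(
        f"SELECT MAX({primary_key}) FROM {table}"
    ).fetchone()[0]
    if max_key is None:
        new_rows = source.execute(f"SELECT * FROM {table}").fetchall()
    else:
        new_rows = source.execute(
            f"SELECT * FROM {table} WHERE {primary_key} > ?", (max_key,)
        ).fetchall()
    if new_rows:
        placeholders = ", ".join("?" for _ in new_rows[0])
        with destination:
            destination.executemany(
                f"INSERT INTO {table} VALUES ({placeholders})", new_rows
            )
    return len(new_rows)


source = sqlite3.connect(":memory:")
destination = sqlite3.connect(":memory:")
for conn in (source, destination):
    conn.execute("CREATE TABLE events (myid INTEGER, label TEXT)")
source.executemany(
    "INSERT INTO events VALUES (?, ?)", [(i, f"event_{i}") for i in range(1, 6)]
)
source.commit()
destination.executemany(
    "INSERT INTO events VALUES (?, ?)", [(i, f"event_{i}") for i in range(1, 4)]
)
destination.commit()

first_run = incremental_sync(source, destination, "events", "myid")   # copies keys 4 and 5
second_run = incremental_sync(source, destination, "events", "myid")  # nothing new to copy
```

Because only keys above the destination maximum are copied, this approach misses updates to existing rows and rows inserted with lower keys, which is why it suits tables with distinct, ever-increasing primary keys.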