Opinionated Python ORM for DynamoDB
I wanted an ORM that fit my workflow. Coming from a Django background, I like tools that feel like a framework. DocB is also built to be the ORM for the Capless framework.
DocB is opinionated because it makes a lot of decisions for you, including the choice of partition (HASH) key and range (RANGE) key:
- HASH - _doc_type - Autogenerated string based on the Document class name
- RANGE - _id - Autogenerated unique string (a unique primary key of sorts). It is the md5 hash of the document dict with autogenerated _date (datetime.datetime.now()) and _uuid (uuid.uuid4()) values injected.
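The _id scheme described above can be sketched roughly as follows. This is only an illustration, not DocB's actual code: the function name generate_id and the choice of sorted-key JSON as the serialization are assumptions, and DocB may serialize or shorten the digest differently.

```python
import datetime
import hashlib
import json
import uuid


def generate_id(doc):
    """Return an md5 hex digest of the document dict plus injected _date and _uuid."""
    payload = dict(doc)  # copy so the caller's dict is not mutated
    payload["_date"] = datetime.datetime.now().isoformat()
    payload["_uuid"] = str(uuid.uuid4())
    # Assumption: the dict is serialized as sorted-key JSON before hashing.
    serialized = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.md5(serialized.encode("utf-8")).hexdigest()


# Two documents with identical fields still get different ids,
# because the injected _uuid (and _date) differ on every call.
first = generate_id({"name": "Kev", "state": "NC"})
second = generate_id({"name": "Kev", "state": "NC"})
```

Because a fresh uuid4 is injected on every call, the hash acts as a unique identifier rather than a content fingerprint.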
DocB should work on Python 3.5 and higher.
pip install docb
Example: loading.py
from docb.loading import DocbHandler
docb_handler = DocbHandler({
    'dynamodb': {
        'connection': {
            'table': 'your-dynamodb-table'
        },
        'config': {
            'endpoint_url': 'http://localhost:8000'
        },
        'documents': ['docb.testcase.BaseTestDocumentSlug', 'docb.testcase.DynamoTestCustomIndex'],
        'table_config': {
            'write_capacity': 2,
            'read_capacity': 3,
            'secondary_write_capacity': 2,
            'secondary_read_capacity': 3
        }
    }
})
This specifies the table name and, optionally, the endpoint URL.
DocB allows you to use one table for all Document classes, use one table per Document class, or a mixture of the two.
The documents key specifies which Document classes and indexes are used for each table. It is only used for CloudFormation deployment; specifying handler in the Meta class of the Document class is still required.
If you want to specify one table per Document class and there are different capacity requirements for each table you should specify those capacities in the Meta class (see example below).
from docb import (Document, CharProperty, DateTimeProperty,
                  DateProperty, BooleanProperty, IntegerProperty,
                  FloatProperty)
from .loading import docb_handler


class TestDocument(Document):
    name = CharProperty(required=True, unique=True, min_length=3, max_length=20)
    last_updated = DateTimeProperty(auto_now=True)
    date_created = DateProperty(auto_now_add=True)
    is_active = BooleanProperty(default_value=True, index=True, key_type='HASH')
    city = CharProperty(required=False, max_length=50)
    state = CharProperty(required=True, global_index=True, max_length=50)
    no_subscriptions = IntegerProperty(default_value=1, global_index=True, min_value=1, max_value=20)
    gpa = FloatProperty(global_index=True, key_type='RANGE')

    def __unicode__(self):
        return self.name

    class Meta:
        use_db = 'dynamodb'
        handler = docb_handler
        config = {  # This is optional (read above)
            'write_capacity': 2,
            'read_capacity': 2,
            'secondary_write_capacity': 2,
            'secondary_read_capacity': 2
        }
Specify the capacity in the handler if you want to use one table for multiple classes.
IMPORTANT: This will not work yet if you need different
from docb.loading import DocbHandler
docb_handler = DocbHandler({
    'dynamodb': {
        'connection': {
            'table': 'your-dynamodb-table',
        },
        'config': {  # This is optional (read below)
            'write_capacity': 2,
            'read_capacity': 2,
            'secondary_write_capacity': 2,
            'secondary_read_capacity': 2
        }
    }
})
Example: models.py
from docb import (Document, CharProperty, DateTimeProperty,
                  DateProperty, BooleanProperty, IntegerProperty,
                  FloatProperty)
from .loading import docb_handler


class TestDocument(Document):
    name = CharProperty(required=True, unique=True, min_length=3, max_length=20)
    last_updated = DateTimeProperty(auto_now=True)
    date_created = DateProperty(auto_now_add=True)
    is_active = BooleanProperty(default_value=True, global_index=True)
    city = CharProperty(required=False, max_length=50)
    state = CharProperty(required=True, global_index=True, max_length=50)
    no_subscriptions = IntegerProperty(default_value=1, global_index=True, min_value=1, max_value=20)
    gpa = FloatProperty()

    def __unicode__(self):
        return self.name

    class Meta:
        use_db = 'dynamodb'
        handler = docb_handler
>>> from .models import TestDocument
>>> kevin = TestDocument(name='Kev',is_active=True,no_subscriptions=3,state='NC',gpa=3.25)
>>> kevin.save()
>>> kevin.name
'Kev'
>>> kevin.is_active
True
>>> kevin.pk
'ec640abfd6'
>>> kevin.id
'ec640abfd6'
>>> kevin._id
'ec640abfd6:id:s3redis:testdocument'
>>> george = TestDocument(name='George',is_active=True,no_subscriptions=3,gpa=3.25,state='VA')
>>> george.save()
>>> sally = TestDocument(name='Sally',is_active=False,no_subscriptions=6,gpa=3.0,state='VA')
>>> sally.save()
IMPORTANT: This is a query (not a scan) of all of the documents with _doc_type of the Document you're using. So if you're using one table for multiple document types you will only get back the documents that fit that query.
>>> TestDocument.objects().all()
[<TestDocument: Kev:ec640abfd6>,<TestDocument: George:aff7bcfb56>,<TestDocument: Sally:c38a77cfe4>]
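To illustrate the query-versus-scan point, here is a rough sketch of the request parameters such a query could use with the low-level boto3 DynamoDB client. The helper name build_doc_type_query and the 'testdocument' value for _doc_type are assumptions for illustration; the exact request DocB issues may differ.

```python
def build_doc_type_query(table_name, doc_type):
    """Build keyword arguments for a DynamoDB Query on the _doc_type partition key."""
    return {
        "TableName": table_name,
        # A key condition makes this a Query (reads one partition) rather than
        # a Scan (reads the whole table).
        "KeyConditionExpression": "#dt = :dt",
        # A placeholder is used for the attribute name because it starts with an underscore.
        "ExpressionAttributeNames": {"#dt": "_doc_type"},
        "ExpressionAttributeValues": {":dt": {"S": doc_type}},
    }


# Assumption: the _doc_type value for TestDocument is the lowercased class name.
query_kwargs = build_doc_type_query("your-dynamodb-table", "testdocument")
# With AWS credentials or DynamoDB Local available, this could be passed as:
#   boto3.client("dynamodb").query(**query_kwargs)
```

Only items whose partition key equals the given document type are read, which is why other document types sharing the table are never returned.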
# Faster: uses pk or _id to perform a DynamoDB get_item
>>> TestDocument.get('ec640abfd6')
<TestDocument: Kev:ec640abfd6>
# Uses a DynamoDB query and raises an error if more than one result is found.
>>> TestDocument.objects().get({'state':'NC'})
<TestDocument: Kev:ec640abfd6>
>>> TestDocument.objects().filter({'state':'VA'})
[<TestDocument: George:aff7bcfb56>,<TestDocument: Sally:c38a77cfe4>]
>>> TestDocument.objects().filter({'no_subscriptions':3})
[<TestDocument: Kev:ec640abfd6>,<TestDocument: George:aff7bcfb56>]
>>> TestDocument.objects().filter({'no_subscriptions':3,'state':'NC'})
[<TestDocument: Kev:ec640abfd6>]
This is just like the filter method, but it uses a Global Secondary Index as the key instead of the main index.
>>> TestDocument.objects().gfilter({'state':'VA'}, index_name='state-index')  # index_name is not required; it is only provided for when you want to query on multiple attributes that are GSIs.
[<TestDocument: George:aff7bcfb56>,<TestDocument: Sally:c38a77cfe4>]
DocB supports the following DynamoDB conditions. Specify a condition by appending a double underscore (__) and the condition suffix to the attribute name. For example, for GreaterThan you would use the_attribute_name__gt.
Full List of Conditions:
- Equals - __eq (default filter, so it is not necessary to specify)
- NotEquals - __ne
- LessThan - __lt
- LessThanEquals - __lte
- GreaterThan - __gt
- GreaterThanEqual - __gte
- In - __in
- Between - __between
- BeginsWith - __begins
- Contains - __contains
- AttributeType - __attr_type
- AttributeExists - __attr_exists
- AttributeNotExists - __attr_not_exists
>>> TestDocument.objects().filter({'no_subscriptions__gt':3})
[<TestDocument: Sally:c38a77cfe4>]
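As a rough illustration of the double-underscore convention, the sketch below splits a filter key into an attribute name and a condition suffix, falling back to eq when no suffix is given. The names split_filter_key and parse_filters are hypothetical and this is not DocB's own parser.

```python
# Condition suffixes taken from the list of supported conditions.
CONDITION_SUFFIXES = {
    "eq", "ne", "lt", "lte", "gt", "gte", "in", "between",
    "begins", "contains", "attr_type", "attr_exists", "attr_not_exists",
}


def split_filter_key(key):
    """Split 'attr__suffix' into (attr, suffix); default to 'eq' without a known suffix."""
    attr, sep, suffix = key.rpartition("__")
    if sep and suffix in CONDITION_SUFFIXES:
        return attr, suffix
    return key, "eq"


def parse_filters(filters):
    """Turn a filter dict into a list of (attribute, condition, value) tuples."""
    return [split_filter_key(key) + (value,) for key, value in filters.items()]


parsed = parse_filters({"no_subscriptions__gt": 3, "state": "NC"})
```

Splitting on the last double underscore keeps attribute names that contain single underscores, such as no_subscriptions, intact.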
Limits the number of records returned from the query.
>>> TestDocument.objects().filter({'no_subscriptions__gt':3}, limit=5)
Sorts the records returned from the query.
WARNING: This feature only sorts the results that are returned. It is not an official DynamoDB feature, and therefore if you use it with the limit argument your results may not be accurate.
>>> TestDocument.objects().filter({'no_subscriptions__gt':3}, sort_attr='state', sort_reverse=True)
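The warning about combining sorting with limit can be seen in a small pure-Python sketch. It assumes, as the warning implies, that the limit is applied before the returned records are sorted; the sample records and function names are made up for illustration.

```python
records = [
    {"name": "Ann", "state": "VA"},
    {"name": "Bob", "state": "NC"},
    {"name": "Cy", "state": "WA"},
    {"name": "Di", "state": "TX"},
]


def limit_then_sort(items, limit, sort_attr, reverse=False):
    """Mimic sorting only the records that came back from a limited query."""
    limited = items[:limit]
    return sorted(limited, key=lambda item: item[sort_attr], reverse=reverse)


def sort_then_limit(items, limit, sort_attr, reverse=False):
    """What a true 'top N by attribute' over the whole result set would return."""
    ordered = sorted(items, key=lambda item: item[sort_attr], reverse=reverse)
    return ordered[:limit]


client_side = [r["state"] for r in limit_then_sort(records, 2, "state", reverse=True)]
true_top = [r["state"] for r in sort_then_limit(records, 2, "state", reverse=True)]
# client_side is ['VA', 'NC'] while true_top is ['WA', 'VA']
```

The two lists differ because the limited query never saw the records that would have sorted first.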
The chain filters feature is only available for Redis and S3/Redis backends.
>>> TestDocument.objects().filter({'no_subscriptions':3}).filter({'state':'NC'})
[<TestDocument: Kev:ec640abfd6>]
Bulk save documents with DynamoDB's batch writer.
doc_list = [
    TestDocument(name='George', is_active=True, no_subscriptions=3, gpa=3.25, state='VA'),
    TestDocument(name='Sally', is_active=False, no_subscriptions=6, gpa=3.0, state='VA'),
]
TestDocument().bulk_save(doc_list)
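For readers unfamiliar with the batch writer, here is a minimal sketch of the underlying pattern. The bulk_put function and the InMemoryTable stand-in are hypothetical; with a real boto3 Table resource, batch_writer() and put_item(Item=...) are the calls being imitated. How bulk_save is implemented inside DocB is not shown here.

```python
def bulk_put(table, items):
    """Write items through the table's batch writer and return how many were queued."""
    # With boto3, the batch writer buffers puts and sends them in batches on exit.
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
    return len(items)


class InMemoryTable:
    """Stand-in for a boto3 Table so the sketch runs without AWS (hypothetical helper)."""

    def __init__(self):
        self.items = []

    def batch_writer(self):
        return self

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions

    def put_item(self, Item):
        self.items.append(Item)


table = InMemoryTable()
written = bulk_put(table, [
    {"_doc_type": "testdocument", "_id": "aff7bcfb56", "name": "George"},
    {"_doc_type": "testdocument", "_id": "c38a77cfe4", "name": "Sally"},
])
```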
BaseProperty is the property class that all other property classes are based on.
from docb.properties import BaseProperty
BaseProperty(default_value=None,required=False,global_index=False,index_name=None,unique=False,write_capacity=None,
             read_capacity=None,key_type='HASH',validators=[])
- default_value (optional) - Specifies the default value for the property (default: None)
- required (optional) - Specifies whether the property is required to save the document (default: False)
- global_index (optional) - Specifies whether the property is a Global Secondary Index (default: False)
- index_name (optional) - If the global_index argument is True, you have the option to set the index name. (default: None)
- unique (optional) - Specifies whether this property's value should be unique in the table. (default: False)
- write_capacity (optional) - If the global_index argument is True, you have the option to set the index's write capacity (default: None)
- read_capacity (optional) - If the global_index argument is True, you have the option to set the index's read capacity (default: None)
- key_type (optional) - Specifies the type of key. Choices are HASH and RANGE. (default: HASH)
- validators (optional) - Specifies what extra validator classes should be used. (default: None)
The other property classes are documented as taking the same arguments as BaseProperty, except for the date properties.
DateProperty takes the BaseProperty arguments plus:
- auto_now (optional) - Specifies whether the date should be autogenerated on update (default: False)
- auto_now_add (optional) - Specifies whether the date should be autogenerated just on first save (default: False)
DateTimeProperty takes the same arguments as DateProperty.
DocB features two ways to deploy tables to AWS (only one works with DynamoDB Local though).
Deploying a SAM template through CloudFormation is the preferred method for deploying production and development workloads on AWS.
from docb.loading import DocbHandler
handler = DocbHandler({
    'dynamodb': {
        'connection': {
            'table': 'school'
        },
        'documents': ['docb.testcase.Student'],
        'table_config': {
            'write_capacity': 2,
            'read_capacity': 3
        }
    }
})
# Build the SAM template
sam = handler.build_cf_template('resource_name', 'table_name', 'db_label')
# Deploys the SAM template to AWS via CloudFormation
sam.publish('stack_name')
Creating the table directly through the AWS API is the method used for our unit tests, and we suggest using it for testing code locally (with Jupyter Notebooks and such).
from docb.loading import DocbHandler
from docb import (Document, CharProperty, IntegerProperty,
                  DateTimeProperty, BooleanProperty, FloatProperty,
                  DateProperty)

handler = DocbHandler({
    'dynamodb': {
        'connection': {
            'table': 'school'
        },
        'config': {
            'endpoint_url': 'http://localhost:8000'
        },
        'documents': ['docb.testcase.Student'],
        'table_config': {
            'write_capacity': 2,
            'read_capacity': 3
        }
    }
})


class Student(Document):
    first_name = CharProperty(required=True)
    last_name = CharProperty(required=True)
    slug = CharProperty(required=True, unique=True)
    email = CharProperty(required=True, unique=True)
    gpa = FloatProperty(global_index=True)
    hometown = CharProperty(required=True)
    high_school = CharProperty()

    class Meta:
        use_db = 'dynamodb'
        handler = handler

# Creates the table via AWS API
Student().create_table()
- Table name should be between 3 and 255 characters long. (A-Z,a-z,0-9,_,-,.)
- Primary key (partition key) should be equal to _doc_type and range should be _id.
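The two table rules above can be expressed as a short sketch: a check for the table name and the key schema in the shape that DynamoDB's CreateTable call expects. The helper name is_valid_table_name is hypothetical, and typing both keys as strings ("S") is an assumption based on both being autogenerated strings.

```python
import re

# 3 to 255 characters drawn from A-Z, a-z, 0-9, underscore, hyphen and dot.
TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_.-]{3,255}")


def is_valid_table_name(name):
    """Return True if the whole name satisfies the table naming rule."""
    return TABLE_NAME_RE.fullmatch(name) is not None


# Key schema and attribute definitions in the shape used by CreateTable.
KEY_SCHEMA = [
    {"AttributeName": "_doc_type", "KeyType": "HASH"},
    {"AttributeName": "_id", "KeyType": "RANGE"},
]
ATTRIBUTE_DEFINITIONS = [
    {"AttributeName": "_doc_type", "AttributeType": "S"},  # assumption: string type
    {"AttributeName": "_id", "AttributeType": "S"},  # assumption: string type
]
```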
If you want to make filter() queries, you should create an index for every attribute that you want to filter by.
- Primary key should be equal to attribute name.
- Index name should be equal to the attribute name followed by "-index" (AWS will fill it in automatically). For example, for attribute "city": Primary key = "city" and index name = "city-index".
- Index name can be directly specified by the index_name argument:
name = CharProperty(required=True,unique=True,min_length=5,max_length=20,index_name='name_index')
- IMPORTANT: In other words, if your indexed attribute is named city and you didn't specify the index_name argument, then your index name should be city-index.
- Projected attributes: All.
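The index rules above could be captured in a small sketch like the following, which resolves the index name and builds a Global Secondary Index definition in the shape DynamoDB's CreateTable call expects. The function names and the default capacity values are assumptions for illustration, not part of DocB.

```python
def resolve_index_name(attr_name, index_name=None):
    """Use the explicit index_name if given, otherwise '<attribute>-index'."""
    return index_name or "{}-index".format(attr_name)


def build_gsi(attr_name, index_name=None, read_capacity=2, write_capacity=2):
    """Build a Global Secondary Index definition keyed on the attribute."""
    return {
        "IndexName": resolve_index_name(attr_name, index_name),
        "KeySchema": [{"AttributeName": attr_name, "KeyType": "HASH"}],
        "Projection": {"ProjectionType": "ALL"},  # projected attributes: All
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_capacity,
            "WriteCapacityUnits": write_capacity,
        },
    }


city_index = build_gsi("city")
name_index = build_gsi("name", index_name="name_index")
```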
Use the docker-compose file, Dockerfile, and the requirements.txt from the repo.
docker-compose up
Easily back up or restore your model locally or from S3. The backup method creates a JSON backup file.
IMPORTANT: These are only appropriate for small datasets.
TestDocument().backup('test-backup.json')
TestDocument().backup('s3://your-bucket/kev/test-backup.json')
TestDocument().restore('test-backup.json')
TestDocument().restore('s3://your-bucket/kev/test-backup.json')
Twitter: @brianjinwright GitHub: bjinwright
GitHub: armicron