Custom Type Example
Warning
The following examples document a deprecated feature. The
SONManipulator API has limitations as a technique for transforming your
data. Instead, it is more flexible and straightforward to transform
outgoing documents in your own code before passing them to PyMongo, and
transform incoming documents after receiving them from PyMongo.
Thus the add_son_manipulator() method is deprecated. PyMongo 3’s new
CRUD API does not apply SON manipulators to documents passed to
bulk_write(), insert_one(), insert_many(), update_one(), or
update_many(). SON manipulators are not applied to documents returned by
the new methods find_one_and_delete(), find_one_and_replace(), and
find_one_and_update().
This is an example of using a custom type with PyMongo. The example
here is a bit contrived, but shows how to use a SONManipulator to
manipulate documents as they are saved to or retrieved from MongoDB.
More specifically, it shows a couple of different mechanisms for working
with custom datatypes in PyMongo.
Setup
We’ll start by getting a clean database to use for the example:
>>> from pymongo.mongo_client import MongoClient
>>> client = MongoClient()
>>> client.drop_database("custom_type_example")
>>> db = client.custom_type_example
Since the purpose of the example is to demonstrate working with custom
types, we’ll need a custom datatype to use. Here we define the aptly
named Custom class, which has a single method, x():
>>> class Custom(object):
...     def __init__(self, x):
...         self.__x = x
...
...     def x(self):
...         return self.__x
...
>>> foo = Custom(10)
>>> foo.x()
10
When we try to save an instance of Custom with PyMongo, we’ll get an
InvalidDocument exception:
>>> db.test.insert({"custom": Custom(5)})
Traceback (most recent call last):
InvalidDocument: cannot convert value of type <class 'Custom'> to bson
Manual Encoding
One way to work around this is to manipulate our data into something
we can save with PyMongo. To do so we define two functions,
encode_custom() and decode_custom():
>>> def encode_custom(custom):
...     return {"_type": "custom", "x": custom.x()}
...
>>> def decode_custom(document):
...     assert document["_type"] == "custom"
...     return Custom(document["x"])
...
We can now manually encode and decode Custom instances and use them
with PyMongo:
>>> db.test.insert({"custom": encode_custom(Custom(5))})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
>>> decode_custom(db.test.find_one()["custom"])
<Custom object at ...>
>>> decode_custom(db.test.find_one()["custom"]).x()
5
Automatic Encoding and Decoding
Needless to say, that was a little unwieldy. Let’s make this a bit
more seamless by creating a new SONManipulator. SONManipulator
instances allow you to specify transformations to be applied
automatically by PyMongo:
>>> from pymongo.son_manipulator import SONManipulator
>>> class Transform(SONManipulator):
...     def transform_incoming(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, Custom):
...                 son[key] = encode_custom(value)
...             elif isinstance(value, dict): # Make sure we recurse into sub-docs
...                 son[key] = self.transform_incoming(value, collection)
...         return son
...
...     def transform_outgoing(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, dict):
...                 if "_type" in value and value["_type"] == "custom":
...                     son[key] = decode_custom(value)
...                 else: # Again, make sure to recurse into sub-docs
...                     son[key] = self.transform_outgoing(value, collection)
...         return son
...
Now we add our manipulator to the Database:
>>> db.add_son_manipulator(Transform())
After doing so we can save and restore Custom instances seamlessly:
>>> db.test.remove() # remove whatever has already been saved
{...}
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
If we get a new Database instance we’ll clear out the SONManipulator
instance we added:
>>> db = client.custom_type_example
This allows us to see what was actually saved to the database:
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
which is the same format that we encode to with our encode_custom()
function!
Binary Encoding
We can take this one step further by encoding to binary, using a
user-defined subtype. This allows us to identify what to decode without
resorting to tricks like the _type field used above.
We’ll start by defining the functions to_binary() and from_binary(),
which convert Custom instances to and from Binary instances:
Note
You could just pickle the instance and save that. What we do here is a little more lightweight.
>>> from bson.binary import Binary
>>> def to_binary(custom):
...     return Binary(str(custom.x()), 128)
...
>>> def from_binary(binary):
...     return Custom(int(binary))
...
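The functions above rely on Python 2, where str is a byte string. On
Python 3, Binary expects bytes, so the number has to be encoded
explicitly. The following is a minimal sketch of only the byte-level
conversion, assuming the payload is the ASCII decimal form of x.
to_payload() and from_payload() are hypothetical names, not part of
PyMongo, and the Custom class is repeated so the block runs on its own.
The manipulator that follows still uses to_binary() and from_binary()
as defined in the doctest above.

```python
class Custom(object):
    """Same toy class as in the Setup section, repeated for a standalone run."""

    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


def to_payload(custom):
    # Hypothetical helper: the ASCII decimal digits of x, as bytes.
    return str(custom.x()).encode("ascii")


def from_payload(data):
    # Inverse of to_payload(); accepts bytes or any bytes subclass.
    return Custom(int(bytes(data).decode("ascii")))


restored = from_payload(to_payload(Custom(5)))
print(restored.x())
```

Under these assumptions, Binary(to_payload(custom), 128) would build the
stored value, and from_payload() could be applied directly to the Binary
read back, since Binary is a bytes subclass on Python 3.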
Next we’ll create another SONManipulator, this time using the
functions we just defined:
>>> class TransformToBinary(SONManipulator):
...     def transform_incoming(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, Custom):
...                 son[key] = to_binary(value)
...             elif isinstance(value, dict):
...                 son[key] = self.transform_incoming(value, collection)
...         return son
...
...     def transform_outgoing(self, son, collection):
...         for (key, value) in son.items():
...             if isinstance(value, Binary) and value.subtype == 128:
...                 son[key] = from_binary(value)
...             elif isinstance(value, dict):
...                 son[key] = self.transform_outgoing(value, collection)
...         return son
...
Now we’ll empty the test collection and add the new manipulator to the
Database:
>>> db.test.remove()
{...}
>>> db.add_son_manipulator(TransformToBinary())
After doing so we can save and restore Custom instances seamlessly:
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
We can see what’s actually being saved to the database (and verify
that it is using a Binary instance) by clearing out the manipulators
and repeating our find_one():
>>> db = client.custom_type_example
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': Binary('5', 128)}
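The warning at the top of this page recommends transforming documents in
your own code rather than through a SONManipulator. The following is a
minimal sketch of that approach, not an API provided by PyMongo:
encode_document() and decode_document() are hypothetical helpers that
walk a document recursively, and the Custom class and the _type
convention are repeated from the Manual Encoding section so the block
runs on its own. Unlike the manipulators above, the sketch also descends
into lists, and it builds new containers instead of modifying the
caller's document. Both are choices made for this illustration.

```python
class Custom(object):
    """Same toy class as in the Setup section, repeated for a standalone run."""

    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


def encode_document(value):
    """Return a copy of value with Custom instances replaced by plain dicts."""
    if isinstance(value, Custom):
        return {"_type": "custom", "x": value.x()}
    if isinstance(value, dict):
        return {key: encode_document(item) for key, item in value.items()}
    if isinstance(value, list):
        return [encode_document(item) for item in value]
    return value


def decode_document(value):
    """Return a copy of value with tagged sub-documents turned back into Custom."""
    if isinstance(value, dict):
        if value.get("_type") == "custom":
            return Custom(value["x"])
        return {key: decode_document(item) for key, item in value.items()}
    if isinstance(value, list):
        return [decode_document(item) for item in value]
    return value


original = {"custom": Custom(5), "nested": {"items": [Custom(1), Custom(2)]}}
stored = encode_document(original)   # the form that would be sent to MongoDB
restored = decode_document(stored)   # the form the application works with

print(stored["custom"])
print(restored["nested"]["items"][1].x())
```

With a live collection this would become
db.test.insert_one(encode_document(doc)) on the way in and
decode_document(db.test.find_one()) on the way out. Because no
manipulator is involved, the same helpers work with every method of the
new CRUD API.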