This post is part of the Implementing Event Sourcing series. It consists of code snippets, thoughts and practical advice on how to implement ES in your own project. The contents of this post will probably make the most sense if you also read the other parts. Then you should be ready to use it in your own projects.

What is Event Sourcing, actually?

Recall any entity/model/being from a piece of software you recently worked on. You probably thought of User, but let’s try a bit harder. Consider an e-commerce Order. It might hold the current status (new, confirmed, shipped, etc.) and summaries: total price, shipping and taxes. Naturally, an Order does not exist on its own. We usually wire it to another entity, OrderLine, that refers to a single ordered product along with its quantity. This structure could be represented in a relational database in the following way:
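A minimal sketch of such a structure, using an in-memory SQLite database so it can be run as-is. The table and column names are my own assumptions for illustration, not a prescribed schema:

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'new',
    total_price NUMERIC NOT NULL DEFAULT 0,
    shipping NUMERIC NOT NULL DEFAULT 0,
    taxes NUMERIC NOT NULL DEFAULT 0
);
CREATE TABLE order_lines (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders (id),
    product_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL
);
""")

# One order with a single line: two units of product 7.
conn.execute("INSERT INTO orders (id, user_id) VALUES (1, 42)")
conn.execute(
    "INSERT INTO order_lines (order_id, product_id, quantity) VALUES (1, 7, 2)"
)

# Reading the current state is a single cheap lookup.
current_status = conn.execute(
    "SELECT status FROM orders WHERE id = 1"
).fetchone()[0]
```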

By storing data this way we can always cheaply get the CURRENT state of our Order. We store a dump of the serialized object after the latest changes. Changing anything, for example switching the status from new to shipped, overwrites data. We irreversibly lose the old state. What if we need to track all changes? Let’s see how that fits in another database table:
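One possible shape of such a table, again sketched with in-memory SQLite. The orders_history columns and the change_status helper are assumptions made up for this illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: column names are illustrative assumptions.
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'new'
);
CREATE TABLE orders_history (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders (id),
    changed_at TEXT NOT NULL,
    field TEXT NOT NULL,
    old_value TEXT,
    new_value TEXT
);
""")


def change_status(order_id, new_status, changed_at):
    """Overwrite the current status and record the change in the history table."""
    old_status = conn.execute(
        "SELECT status FROM orders WHERE id = ?", (order_id,)
    ).fetchone()[0]
    conn.execute(
        "UPDATE orders SET status = ? WHERE id = ?", (new_status, order_id)
    )
    conn.execute(
        "INSERT INTO orders_history (order_id, changed_at, field, old_value, new_value) "
        "VALUES (?, ?, 'status', ?, ?)",
        (order_id, changed_at, old_status, new_status),
    )


conn.execute("INSERT INTO orders (id, user_id) VALUES (1, 42)")
change_status(1, "confirmed", "2018-01-01T10:00:00")
change_status(1, "shipped", "2018-01-02T09:30:00")

history = conn.execute(
    "SELECT old_value, new_value FROM orders_history WHERE order_id = 1 ORDER BY id"
).fetchall()
```

Here the orders row still holds only the latest status, while history keeps every transition in order.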

Such a representation lets us tell with certainty what was changed and when. But this OrdersHistory table plays second fiddle. It is merely an auxiliary record of Order, added just to fulfill some business requirement. In all other scenarios we still reach for the original Orders table when we want to know the exact state of any Order.

Note that OrdersHistory is just as good as the Orders table when we need the current Order state. We just have to fetch all entries for the given Order and ‘replay’ them from the start. In the end we’ll get exactly the same information that is saved in the Orders table. So should we still treat the Orders table as our source of truth? Event Sourcing says no. We can safely get rid of the table, or at least stop relying on it in any situation that would actually mutate an Order.
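To make the ‘replay’ idea concrete, here is a small sketch in plain Python. The shape of the history entries (field / new_value) is an assumption for illustration, and it presumes that every field change, including the initial values, was recorded:

```python
# Hypothetical history entries for a single order, oldest first.
history = [
    {"order_id": 1, "field": "user_id", "new_value": 42},
    {"order_id": 1, "field": "status", "new_value": "new"},
    {"order_id": 1, "field": "status", "new_value": "confirmed"},
    {"order_id": 1, "field": "status", "new_value": "shipped"},
]


def replay(entries):
    """Rebuild the current state by applying every recorded change in order."""
    state = {}
    for entry in entries:
        state[entry["field"]] = entry["new_value"]
    return state


current_state = replay(history)
```

Replaying only a prefix of the list gives the state as it was at that earlier point in time.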

To sum up, Event Sourcing comes down to:

  • Keeping your business objects (called aggregates) as a series of replayable events. This series is often called an event stream
  • Never deleting any events from the system, only appending new ones
  • Using events as the only reliable way of telling what state a given aggregate is in
  • If you need to query data or present it in a table-like format, keeping a copy of it in a denormalized format. This is called a projection
  • Designing your aggregates to protect certain vital business invariants, for example an Order encapsulating its cost summary. A good rule of thumb is to keep aggregates as small as possible

If any point sounds a bit unclear, don’t worry. It will all be clarified in the next few paragraphs and code snippets.

In return you get:

  • A complete history of what was changed, when, and by whom (if you include such information in an event)
  • Time-travel debugging, allowing you to recreate the state of the system at any given moment
  • The possibility of creating specialized read models of your data for high performance
  • An append-only write model that is also easier to scale

I strongly recommend watching Greg Young’s talk if you have not seen it yet.

Talk is cheap – show me some code!

Consider the following Order class:
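A minimal sketch of what such a class might look like. The exact set of allowed statuses and the error message are assumptions:

```python
class Order:
    # Assumed set of statuses, for illustration only.
    ALLOWED_STATUSES = ("new", "confirmed", "shipped")

    def __init__(self, user_id, status="new"):
        self.user_id = user_id
        self.status = status

    def set_status(self, new_status):
        # Validate the input, then overwrite the field directly.
        if new_status not in self.ALLOWED_STATUSES:
            raise ValueError(f"{new_status} is not a correct status!")
        self.status = new_status
```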

To create an Order we have to provide a user_id. The status is new by default. The only thing we can do is change the status. This may look like a trivial example, but I kept it simple on purpose. Let’s rewrite the class using Event Sourcing. First, we need events that will represent all state mutations:
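A sketch of the two events as plain classes. The attribute names user_id and new_status are my assumption:

```python
class OrderCreated:
    """Starts the event stream of an order."""

    def __init__(self, user_id):
        self.user_id = user_id


class StatusChanged:
    """Records a single change of the status field."""

    def __init__(self, new_status):
        self.new_status = new_status
```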

In such a simple example there are only two events. The first one, OrderCreated, is a standard way of starting any event stream. We know that the status will be new, so there is no point in adding such a field to the OrderCreated event. The second event, StatusChanged, represents any mutation of the status field. Again, we need just one field to fully represent what’s going on. Consider the following order mutations:
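For instance, with a simple non-event-sourced Order (redefined here in compact form so the snippet runs on its own, with an assumed set of statuses), the mutations might be:

```python
class Order:
    ALLOWED_STATUSES = ("new", "confirmed", "shipped")  # assumed statuses

    def __init__(self, user_id, status="new"):
        self.user_id = user_id
        self.status = status

    def set_status(self, new_status):
        if new_status not in self.ALLOWED_STATUSES:
            raise ValueError(f"{new_status} is not a correct status!")
        self.status = new_status


# An order is created for user 1, then confirmed, then shipped.
order = Order(user_id=1)
order.set_status("confirmed")
order.set_status("shipped")
```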

The corresponding event stream looks like this:
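Assuming an order that was created for user 1 and then moved to confirmed and shipped, the stream could be sketched as a plain list of events (event classes repeated so the snippet is self-contained):

```python
class OrderCreated:
    def __init__(self, user_id):
        self.user_id = user_id


class StatusChanged:
    def __init__(self, new_status):
        self.new_status = new_status


# Oldest event first; new events are only ever appended at the end.
event_stream = [
    OrderCreated(user_id=1),
    StatusChanged(new_status="confirmed"),
    StatusChanged(new_status="shipped"),
]
```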

So now we need a way to restore the order’s state using these events…
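A sketch of an event-sourced Order that does this. The numbered comments correspond to the numbered notes that follow; the statuses and error messages are assumptions:

```python
class OrderCreated:
    def __init__(self, user_id):
        self.user_id = user_id


class StatusChanged:
    def __init__(self, new_status):
        self.new_status = new_status


class Order:
    ALLOWED_STATUSES = ("new", "confirmed", "shipped")  # assumed statuses

    def __init__(self, events):  # 1
        for event in events:
            self.apply(event)  # 2
        self.changes = []  # 3

    def apply(self, event):  # 4
        if isinstance(event, OrderCreated):
            self.user_id = event.user_id
            self.status = "new"
        elif isinstance(event, StatusChanged):
            self.status = event.new_status
        else:
            raise ValueError("Unknown event!")

    def set_status(self, new_status):  # 5
        if new_status not in self.ALLOWED_STATUSES:
            raise ValueError(f"{new_status} is not a correct status!")
        event = StatusChanged(new_status)  # 6
        self.apply(event)
        self.changes.append(event)  # 7
```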

  1. Now the only way to instantiate an Order is to give it a list of events it should be initialized with
  2. Inside __init__, we apply every event, which mutates the state
  3. We need to keep an append-only list of state mutations made after Order initialization, because to save changes we only need to persist the new events. Old ones are already saved and we will never delete them
  4. The heart of an event-sourced aggregate is the apply method. This is where we mutate state. I will show a more clever implementation later, without the if-elif-else block
  5. The crucial change is inside the set_status method. We still validate the input, but instead of modifying the object’s fields directly…
  6. …we prepare a StatusChanged event, put it through apply…
  7. …and finally append the new event to our changes list

Wait, how do I create a new Order then? We cannot simply construct OrderCreated and pass it to a newly created Order, because that would not include OrderCreated in the changes list. I use a classmethod that encapsulates what is going on:
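A sketch of such a factory method. The name create is my assumption, and the Order is reduced to the bare minimum needed to show the idea:

```python
class OrderCreated:
    def __init__(self, user_id):
        self.user_id = user_id


class Order:
    def __init__(self, events):
        for event in events:
            self.apply(event)
        self.changes = []

    @classmethod
    def create(cls, user_id):
        # Build the initial event, replay it to set the state, and make sure
        # it is also recorded as a pending change that still has to be saved.
        initial_event = OrderCreated(user_id)
        instance = cls([initial_event])
        instance.changes = [initial_event]
        return instance

    def apply(self, event):
        if isinstance(event, OrderCreated):
            self.user_id = event.user_id
            self.status = "new"
        else:
            raise ValueError("Unknown event!")
```

With this in place, Order.create(user_id=1) returns an order whose changes list already contains the OrderCreated event.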

Is testing aggregates hard? Not really. You basically create an Order from events, perform some actions and check whether the expected events were appended to changes. Some examples:
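A few pytest-style sketches of such tests. To keep the snippet runnable on its own, events are defined as namedtuples purely as a stand-in that gives value equality; the test names and the create factory are assumptions:

```python
from collections import namedtuple

# Stand-in events with value equality, so lists of events can be compared.
OrderCreated = namedtuple("OrderCreated", ["user_id"])
StatusChanged = namedtuple("StatusChanged", ["new_status"])


class Order:
    ALLOWED_STATUSES = ("new", "confirmed", "shipped")  # assumed statuses

    def __init__(self, events):
        for event in events:
            self.apply(event)
        self.changes = []

    @classmethod
    def create(cls, user_id):
        initial_event = OrderCreated(user_id)
        instance = cls([initial_event])
        instance.changes = [initial_event]
        return instance

    def apply(self, event):
        if isinstance(event, OrderCreated):
            self.user_id = event.user_id
            self.status = "new"
        elif isinstance(event, StatusChanged):
            self.status = event.new_status
        else:
            raise ValueError("Unknown event!")

    def set_status(self, new_status):
        if new_status not in self.ALLOWED_STATUSES:
            raise ValueError(f"{new_status} is not a correct status!")
        event = StatusChanged(new_status)
        self.apply(event)
        self.changes.append(event)


def test_creating_order_emits_order_created():
    order = Order.create(user_id=1)
    assert order.changes == [OrderCreated(user_id=1)]


def test_changing_status_emits_status_changed():
    order = Order([OrderCreated(user_id=1)])
    order.set_status("confirmed")
    assert order.changes == [StatusChanged(new_status="confirmed")]


def test_invalid_status_emits_nothing():
    order = Order([OrderCreated(user_id=1)])
    try:
        order.set_status("lost")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    assert order.changes == []
```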

Of course for this to work, we need to support __eq__ in our events:
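A sketch of hand-written equality for the two events. Returning NotImplemented for foreign types is my choice here; it lets Python fall back to its default comparison, so events of different types never compare equal:

```python
class OrderCreated:
    def __init__(self, user_id):
        self.user_id = user_id

    def __eq__(self, other):
        if not isinstance(other, OrderCreated):
            return NotImplemented
        return self.user_id == other.user_id


class StatusChanged:
    def __init__(self, new_status):
        self.new_status = new_status

    def __eq__(self, other):
        if not isinstance(other, StatusChanged):
            return NotImplemented
        return self.new_status == other.new_status
```

Keep in mind that a class defining __eq__ without __hash__ becomes unhashable, so these events cannot be put in a set or used as dict keys.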

That’s basically everything about event-sourced aggregates. Now let’s see how we can implement them in a bit more clever way.

Implementation improvements

Events

We had to write the classes manually and implement __eq__ ourselves. We could use namedtuples or the attrs library instead:
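A sketch using namedtuple from the standard library, chosen here only because it needs no extra dependency; an attrs version would look similar but requires installing the third-party package:

```python
from collections import namedtuple

# Each event becomes a small immutable record with named fields.
OrderCreated = namedtuple("OrderCreated", ["user_id"])
StatusChanged = namedtuple("StatusChanged", ["new_status"])

created = OrderCreated(user_id=1)
changed = StatusChanged(new_status="shipped")
```

One caveat with namedtuples: they compare like plain tuples, so two events of different types holding the same values are considered equal. If that matters for your tests, prefer a solution whose equality also checks the class.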

We’ve got __init__, __eq__ and several other goodies for free. Please take a look at the attrs docs for more information.

Data Classes, introduced in Python 3.7, are also worth mentioning here.
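A sketch of the same events as data classes, assuming Python 3.7 or newer. The type annotations and frozen=True are my choices; freezing makes the events immutable, which suits facts that have already happened:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderCreated:
    user_id: int


@dataclass(frozen=True)
class StatusChanged:
    new_status: str
```

The generated __eq__ only treats instances of the same class as comparable, so events of different types never compare equal.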

Aggregate’s apply method

Some of you have probably already heard about singledispatch, which was introduced in Python 3.4.
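As a quick refresher, here is a sketch of functools.singledispatch on a plain function. The describe function and its messages are made up for illustration:

```python
from collections import namedtuple
from functools import singledispatch

OrderCreated = namedtuple("OrderCreated", ["user_id"])
StatusChanged = namedtuple("StatusChanged", ["new_status"])


@singledispatch
def describe(event):
    # Fallback used when no implementation is registered for the type.
    raise ValueError("Unknown event!")


@describe.register(OrderCreated)
def _(event):
    return f"order created for user {event.user_id}"


@describe.register(StatusChanged)
def _(event):
    return f"status changed to {event.new_status}"
```

The implementation is picked based on the type of the first argument, which is exactly what the if-elif-else block did by hand.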

Unfortunately, singledispatch does not support methods on objects out of the box. But there is a workaround:
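One commonly used workaround is a small decorator that dispatches on the second positional argument instead of the first, which for a method is self. The method_dispatch name below is an assumption of this sketch, not part of the standard library. On Python 3.8 and newer, functools.singledispatchmethod covers the same need.

```python
from collections import namedtuple
from functools import singledispatch, update_wrapper

OrderCreated = namedtuple("OrderCreated", ["user_id"])
StatusChanged = namedtuple("StatusChanged", ["new_status"])


def method_dispatch(func):
    """Hypothetical helper: singledispatch that looks at args[1], skipping self."""
    dispatcher = singledispatch(func)

    def wrapper(*args, **kwargs):
        # args[0] is self, args[1] is the event we dispatch on.
        return dispatcher.dispatch(args[1].__class__)(*args, **kwargs)

    wrapper.register = dispatcher.register
    update_wrapper(wrapper, func)
    return wrapper


class Order:
    def __init__(self, events):
        for event in events:
            self.apply(event)
        self.changes = []

    @method_dispatch
    def apply(self, event):
        # Fallback for event types without a registered handler.
        raise ValueError("Unknown event!")

    @apply.register(OrderCreated)
    def _(self, event):
        self.user_id = event.user_id
        self.status = "new"

    @apply.register(StatusChanged)
    def _(self, event):
        self.status = event.new_status
```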

As you can see, we split one apply method into three much smaller (and cleaner!) ones.

This is the end of the first part devoted to implementing Event Sourcing in Python. New week = new post. Hold tight, guys!
