Showing posts with label software design. Show all posts

Saturday, January 13, 2018

Event Sourced Aggregates Part 6: Smarter Events

In this 6th and final post in my series about event sourced aggregates, I will make the events a little bit smarter. So far events have been completely dumb objects that just carry some data around but have no behavior. In this post I give every event a little bit of behavior by moving the 'When' methods from the aggregate to the events. The result is that the aggregate becomes anemic (that is, no behavior, just data) and no longer violates the Open/Closed principle, and that the events become more self-contained.

In the 5th post I made an attempt at moving the 'When' methods out of the aggregate to get to a design where the aggregate does not violate Open/Closed. I did so by introducing a new abstraction - an aggregate projector - but that just ended up with the same Open/Closed violation that the aggregate suffered from originally. Therefore I take another approach in this post.

Smarter events

Let's, once again, see how the aggregate looks after domain logic was moved from the aggregate to the command handlers in post 4:


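The original post showed the class as a screenshot; here is a minimal sketch of what the aggregate plausibly looked like at this point (the second event type and the property names are assumptions):

    public class UserAggregate : Aggregate
    {
      public string Username { get; private set; }
      public string Email { get; private set; }

      // All that is left on the aggregate: one 'When' method per event type,
      // each applying the state change that event implies.
      private void When(UserCreatedEvent e)
      {
        Username = e.Username;
        Email = e.Email;
      }

      private void When(UsernameChangedEvent e)
      {
        Username = e.NewUsername;
      }
    }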
And the 'UsernameChangedEvent' looks like this:


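A sketch of the event as a plain data carrier (the property name is an assumption):

    public class UsernameChangedEvent : Event
    {
      public string NewUsername { get; set; }
    }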
In order to keep the aggregate from growing and growing as features are added to the system, I will move the 'When(UsernameChangedEvent e)' method to a 'When' method on the event itself, like this:


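A sketch of the event with the 'When' method moved onto it; note it needs to be able to set state on the aggregate, e.g. through internal setters (property names are assumptions):

    public class UsernameChangedEvent : Event
    {
      public string NewUsername { get; set; }

      // The event applies the state change it represents to the aggregate.
      public void When(UserAggregate aggregate)
      {
        aggregate.Username = NewUsername;
      }
    }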
Now the event holds the data of the event and is responsible for applying the state changes to the aggregate that the event implies. That is: A domain event - like the UsernameChangedEvent - is an indication that something happened to the aggregate, i.e. there was a state change to the aggregate. The 'When' method on the event applies that state change to the aggregate that it gets passed in as an argument.
When moving the 'When' method from the aggregate to the event, the signature of the method must change a bit to 'void When(UserAggregate aggregate)'. Notice that this signature is not specific to the 'UsernameChangedEvent'; it will be the same for all events. That turns out to be quite a handy side effect of moving the 'When' methods. More on that in the next section. Since the 'When' signature is the same for all events, I'll go ahead and add it to the event interface:


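A sketch of that interface; in this example system all events target the 'UserAggregate', so the signature can mention it directly:

    public interface Event
    {
      void When(UserAggregate aggregate);
    }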
Before the 'Event' interface was just a marker interface. Now it shows that all events have a 'When' method.

Event replay revisited

Moving the 'When' methods impacts how event replay is done. Remember from the first post that what is stored in the database is the events. When we need an aggregate, all the events emitted on that aggregate are read from the database, and then the 'When' method for each one is called. Since the 'When' methods each apply the state change to the aggregate implied by the event, the end result is an aggregate object in the correct current state of that aggregate. The replay process goes like this:


Where the event replay is done by the 'Replay' method on the abstract class 'Aggregate'. The special sauce is in the implementation of the 'Play' method, which - as shown in the first post of the series - involves using reflection to find a suitable 'When' method. This becomes a lot simpler now that all events implement an interface with a 'When' method on it:


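A sketch of the simplified 'Play' method, assuming a single aggregate type (a generic base class would remove the cast):

    protected void Play(Event e)
    {
      // No reflection needed anymore: every event knows how to apply itself.
      e.When((UserAggregate)this);
    }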
This simplification was not the goal of the refactoring done in this post, but a nice side effect, and as such an indication that this is indeed a good road to follow.

The anemic aggregate

With the 'When' methods moved to the events, the aggregate has become anemic - and that is a good thing in this case. The UserAggregate is reduced to this:


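A sketch of the anemic aggregate (internal setters, so the events can apply state changes; property names are assumptions):

    public class UserAggregate : Aggregate
    {
      public string Username { get; internal set; }
      public string Email { get; internal set; }
    }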
which is simply a representation of the current state of a user. That is the essence of what the 'UserAggregate' is. It only changes if we have reason to change how the current state of a user looks, which I consider a good reason for that class to change. Moreover the 'UserAggregate' no longer violates the Open/Closed principle, since new domain behavior can be added by adding new commands, command handlers and events - without changing the 'UserAggregate'.
Often an anemic domain model is seen as an anti-pattern in object oriented code. I agree. But only when looking at the domain model as a whole, not when looking at just one class - like the 'UserAggregate'. My point here is that looking at the domain model as a whole includes the commands, command handlers and events. In that perspective the domain model is not anemic - only the user aggregate is.

Wrapping up

In the first post of this series I outlined what I see as a typical C# implementation of event sourced aggregates. I also argued that that implementation leads to the aggregates violating the Open/Closed principle. In the third and fourth posts I solved half of the problem by making the command handlers smarter, and in this post I solved the second half by making the events smarter.

The final code is on Github, as are the intermediate steps (see other branches on that same repository).

Wednesday, November 22, 2017

Event Sourced Aggregates Part 5: Anemic aggregate

In this 5th part in my series on event sourced aggregates I continue moving code out of the aggregate. In the last post I moved domain logic out of the aggregate and into the command handlers, making them smart enough to actually handle commands by themselves. In this post I will continue along the path of moving stuff out of the aggregate: I will move the remaining methods - the 'When' methods - out of the aggregate and into a new abstraction, an aggregate projector. While this achieves the goal set out in the first post of getting past the aggregate growing and growing over time, the new aggregate projector unfortunately suffers from the same problem. In the next post I will take another approach to moving the 'When' methods out and arrive at a better design, but first let's follow what I think is the most obvious path and see why it leads to a bad design.

The aggregate is a projection

Taking a step back, what is the aggregate? At the datastore level it is a list of events. Together the events represent everything that has happened to the aggregate and, as we have seen, the current state of the aggregate can be recreated from the list of events. At the code level the aggregate is an object - it has some state and it has some behavior. At this point the only behavior it has left is the 'When' methods. The important bit is that the aggregate is an object. It's just an object. Likewise, in the code, different read models are just objects that result from projections over events. In that sense the aggregate is no different from a read model: It is an object that is the result of a projection over events.

Introducing the aggregate projector

Before I start refactoring, let's take a look at how the aggregate looks right now:



The aggregate has some state, represented by two properties, and then some 'When' methods that make up the logic needed to perform the projection from the aggregate's events to its current state.

Seeing that a new 'When' method will be added to the aggregate every time a new event is introduced - and new features usually result in new events, in my experience - the aggregate still has the problem of growing big and unwieldy over time. So let's introduce another class that can do the projections:



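The original code was shown as an image; a sketch of what such a projector could look like (whether the aggregate is passed as an argument or held as a field is an assumption):

    public class UserAggregateProjector
    {
      public void When(UserCreatedEvent e, UserAggregate aggregate)
      {
        aggregate.Username = e.Username;
        aggregate.Email = e.Email;
      }

      public void When(UsernameChangedEvent e, UserAggregate aggregate)
      {
        aggregate.Username = e.NewUsername;
      }
    }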
This doesn't just work as-is. First off, the new 'UserAggregateProjector' cannot set the properties on the aggregate. That can be fixed by giving the aggregate internal setters, allowing the projector to access the setters while disallowing access from outside the project containing the 'UserAggregate' - which I expect to mean anything beyond commands, command handlers and events.
Furthermore the event replay done when fetching an aggregate must also change from calling 'When' methods on the aggregate to calling them on the 'UserAggregateProjector'. That means changing the 'Aggregate' base class to this:



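A sketch of the reworked base class (names and reflection details are assumptions):

    using System.Collections.Generic;

    public abstract class Aggregate
    {
      public void Replay(IEnumerable<Event> events)
      {
        foreach (var e in events)
          Play(e);
      }

      protected void Play(Event e)
      {
        var projector = GetProjector();
        // Reflection now goes over the projector instead of the aggregate.
        var when = projector.GetType()
          .GetMethod("When", new[] { e.GetType(), GetType() });
        when.Invoke(projector, new object[] { e, this });
      }

      protected abstract object GetProjector();
    }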
The changes are the introduction of the 'GetProjector' method and the use of that new method in the 'Play' method, which now reflects over the projector class to find the 'When' methods instead of over the aggregate. The end result is the same: An aggregate object with the current state of the aggregate recreated by replaying all events.

Moving the 'When' methods has obviously also changed the aggregate, which now only contains state:



This is what is known as an anemic domain model, because it has no behavior. That's usually considered an anti-pattern, but I don't necessarily agree that it is: as argued above, the aggregate is essentially a projection of the events, so I do not see why that object has to be where the domain behavior goes. As we saw in the 4th post of the series, command handlers are a nice place to put domain behavior.

The projector violates Open/Closed principle


As I stated at the beginning of this post, the design I've arrived at now is not good: The new 'UserAggregateProjector' suffers just as much from perpetual growth as the aggregate did before I moved the 'When' methods out of it. In other words the new projector violates the Open/Closed principle, which is what I am trying to get away from. So I have not solved anything - I've just moved the problem to a new abstraction :( Seems like I need to take another iteration, which I will in the next post.

The code for this post is in this branch.

Tuesday, November 14, 2017

Event Sourced Aggregates Part 4: Smart command handlers

In this fourth post in my series about event sourced aggregates I will continue work on the command handlers. In the 3rd post I did some cleaning up of the command handlers, cutting them down to very little code. In this post I will make the command handlers smart by moving domain logic from the aggregate to the command handlers.

Motivation

In the first and second posts of the series I outlined a typical C# implementation of event sourced aggregates and how that style of implementation leads to ever growing aggregates - every added feature adds (at least) two methods to the aggregate: 
  • One for the domain logic. E.g. a 'ChangeUsername' method, that has whatever business logic there is around changing the username. If and when these methods decide a change to the aggregate state is needed they emit a domain event.
  • A 'When' method for any new events. The 'When' methods perform all state changes on the aggregate.
The pattern I see in the implementations I come across is that there is a one-to-one correspondence between commands, command handlers and public domain logic methods on the aggregate: for a 'ChangeUsernameCommand' there is a 'ChangeUsernameCommandHandler' class and a 'ChangeUsername' method on the aggregate. I took almost all the plumbing out of the command handler in the last post and essentially left it at this:


which invokes the 'helper.Handle' method to get all the plumbing done and then calls the 'ChangeUsername' method to get the domain logic done. So in essence the command handler just delegates to the aggregate - but isn't it the responsibility of the command handler to ... handle the command? I think it is. And handling the command means running the domain logic, so let's move that logic from the aggregate to the command handler.

Smart command handlers

In the second post I introduced the 'ChangeUsernameCommand' and the associated command handler and methods on the aggregate. In particular this 'ChangeUsername' method on the aggregate:



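A sketch of that method; the actual business rule was shown as an image, so the non-empty check here is just an assumed stand-in:

    public void ChangeUsername(string newUsername)
    {
      if (string.IsNullOrWhiteSpace(newUsername))
        throw new ArgumentException("Username must not be empty");

      Emit(new UsernameChangedEvent { NewUsername = newUsername });
    }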
which implements the domain logic for changing username. That is the logic I want to move to the command handler.
Moving the domain logic straight over to the command handler changes the 'Handle' method on the command handler to this:



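A sketch of the smart handler; the 'CommandHandlerHelper' shape and the stand-in business rule are assumptions:

    public class ChangeUsernameCommandHandler
    {
      private readonly CommandHandlerHelper helper;

      public ChangeUsernameCommandHandler(CommandHandlerHelper helper)
      {
        this.helper = helper;
      }

      public void Handle(ChangeUsernameCommand command)
      {
        helper.Handle(command, aggregate =>
        {
          // The domain logic now lives in the handler.
          if (string.IsNullOrWhiteSpace(command.NewUsername))
            throw new ArgumentException("Username must not be empty");

          // Assumes 'Emit' has been made accessible to handlers at this point.
          aggregate.Emit(new UsernameChangedEvent { NewUsername = command.NewUsername });
        });
      }
    }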
Now the command handler contains the logic for handling the command. Note that the command handler now also emits domain events. This makes sense since this is still event sourced, so changes to the aggregate state are still done through events. The rest of the mechanics around events remain unchanged: The 'Emit' method on the base aggregate still calls the 'When' method for the event and adds the event to the list of new events on the aggregate. Saving the aggregate still means appending the list of new events to the event store, and getting the aggregate from the 'AggregateRepository' still means reading all the aggregate's events from the event store and replaying each one.

Having moved the domain logic out of the aggregate I have a slimmer aggregate that only has the state of the aggregate and the 'When' methods. In the next two posts I slim down the aggregate even further by moving the 'When' methods out.

The complete code for this post is in this branch.

Monday, November 6, 2017

Event Sourced Aggregates Part 3: Clean up command handlers

This is the 3rd part of my series about event sourced aggregates. In the first post I outlined a typical implementation of event sourced aggregates, and in the second post I showed how that implementation leads to aggregates that grow bigger and bigger as features are added to the system. In this post I will clean up the command handlers from the previous post by moving some repeated code out of them into a new abstraction. In the coming posts I will refactor between the command handlers, aggregates and event handlers to arrive at a design where the aggregate does not grow.

Repeated code in command handlers

In the last post we looked briefly at this implementation of a command handler:


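The handler was shown as an image; a sketch of its shape (the repository and dispatcher abstractions are assumptions):

    public class ChangeUsernameCommandHandler
    {
      private readonly IAggregateRepository repository;
      private readonly IEventDispatcher dispatcher;

      public ChangeUsernameCommandHandler(
        IAggregateRepository repository, IEventDispatcher dispatcher)
      {
        this.repository = repository;
        this.dispatcher = dispatcher;
      }

      public void Handle(ChangeUsernameCommand command)
      {
        var aggregate = repository.Get<UserAggregate>(command.UserId);  // infrastructure
        aggregate.ChangeUsername(command.NewUsername);                  // the actual handling
        repository.Save(aggregate);                                     // infrastructure
        dispatcher.Dispatch(aggregate.NewEvents);                       // infrastructure
      }
    }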
Looking at the above, the line calling 'ChangeUsername' is - in a sense - where the ChangeUsernameCommand is handled, because that is the only line that is about changing a username. All the other code in the command handler is about infrastructure: loading the aggregate, saving the aggregate and dispatching events. Moreover the code for loading and saving aggregates, as well as the code for dispatching events, will be repeated in every command handler.

Introducing a helper

To get past that repetitiveness and to cut back on the amount of infrastructure code in the command handler, we introduce this helper, where the loading of the aggregate, the saving of the aggregate and the dispatching of events is done:


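A sketch of such a helper (the exact signatures, including the 'Command' base type, are assumptions):

    public class CommandHandlerHelper
    {
      private readonly IAggregateRepository repository;
      private readonly IEventDispatcher dispatcher;

      public CommandHandlerHelper(
        IAggregateRepository repository, IEventDispatcher dispatcher)
      {
        this.repository = repository;
        this.dispatcher = dispatcher;
      }

      public void Handle(Command command, Action<UserAggregate> handlerFunc)
      {
        var aggregate = repository.Get<UserAggregate>(command.AggregateId);
        handlerFunc(aggregate);   // the business logic bit
        repository.Save(aggregate);
        dispatcher.Dispatch(aggregate.NewEvents);
      }
    }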
The idea behind the CommandHandlerHelper is that the concrete command handler calls the Handle method with a handlerFunc that does the business logic bit of the command handler. The handlerFunc is called between loading and saving the aggregate, so the helper makes sure the infrastructure code is done in the right order in relation to the business logic.

Cleaner command handlers

With the CommandHandlerHelper in place the ChangeUsernameCommandHandler can be rewritten to use it like this:

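A sketch of the cleaned-up handler:

    public class ChangeUsernameCommandHandler
    {
      private readonly CommandHandlerHelper helper;

      public ChangeUsernameCommandHandler(CommandHandlerHelper helper)
      {
        this.helper = helper;
      }

      public void Handle(ChangeUsernameCommand command)
      {
        helper.Handle(command,
          aggregate => aggregate.ChangeUsername(command.NewUsername));
      }
    }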
This is a good deal simpler than the command handler code at the start of the post.


That's it for now. With this clean-up in place we are set for the next steps.

Tuesday, October 31, 2017

Event Sourced Aggregates Part 2: Where the mess starts

In the first post in this series I outlined a typical C# implementation of event sourced aggregates. In this post we add a second feature to the example from the first post. In doing so I wish to illustrate how that typical implementation leads to violation of the Open/Closed principle and ever growing aggregates.

The second feature

Once we have the code from the first post in place - that is: the infrastructure for sending commands to aggregates, raising events from aggregates, saving events and replaying events - and need to add a second feature, the way to do it is (to some extent) as outlined in the first post: 
  1. Send a new type of command
  2. Implement a command handler for the new command
  3. Implement a new method on the aggregate with the new domain logic
  4. Emit any new events needed
  5. Implement new When methods for any new event types
Let's say we want to be able to change a user's username.
The command and the sending of that command look like this:



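A sketch of the command and of sending it (MediatR-style wiring is assumed; property names are assumptions):

    public class ChangeUsernameCommand : IRequest
    {
      public Guid UserId { get; set; }
      public string NewUsername { get; set; }
    }

    // At the edge of the system, e.g. in an HTTP endpoint:
    mediator.Send(new ChangeUsernameCommand { UserId = userId, NewUsername = newName });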
That's pretty straightforward and not too interesting, so let's move on to the command handler:



That's also pretty straightforward, but a little more interesting: Most of the command handler code is something that will be repeated in all command handlers. We will deal with this in the next post, where that repetition is pulled out into a helper class.
The next step is the changes to the aggregate, where this is added:



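A sketch of the two methods added to the aggregate (the actual business rules were in the original image, so none are shown here):

    public void ChangeUsername(string newUsername)
    {
      // Domain logic for changing the username goes here; when a state
      // change is decided on, it is expressed as an event.
      Emit(new UsernameChangedEvent { NewUsername = newUsername });
    }

    private void When(UsernameChangedEvent e)
    {
      Username = e.NewUsername;
    }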
This is still pretty straightforward. In fact everything needed to add this new feature was straightforward, so that is a good thing. The problem lies in the last two methods - the ones added to the aggregate.

Problem: An Open/Closed Principle violation

The aggregate just grew. In order to support changing the username we added two methods to the aggregate. That violates the Open/Closed principle, which indicates that it is a potential problem. In my experience it quickly becomes a real problem, because the aggregate grows relatively quickly and eventually becomes big and complicated, just like any other class that grows monotonically.

That's it for now. The next posts will:
  1. Make the command handlers smarter and clean up some repetitiveness
  2. Make the aggregate anemic in a naive way, leaving a stable aggregate, but introducing a new Open/Closed violation
  3. Make the aggregate anemic, events (a little) smart, and solve the Open/Closed violation

Tuesday, October 24, 2017

Event Sourced Aggregates Part 1: Outline of a typical implementation

This is the first post in a series of posts that takes its starting point in a design problem I've encountered repeatedly with event sourced aggregates: They grow every time a feature is added. Almost nothing is removed from them, so over time they grow very big and gnarly. Why does this happen? Because typical implementations of event sourced aggregates violate the Open/Closed principle.
Through this series of posts, I will show how event sourced aggregates violate Open/Closed and - as a consequence - tend to grow monotonically, and then show how we can address that by refactoring away from the big aggregate towards a small and manageable one. Cliffhanger and spoiler: The aggregate will become anemic, and I think that is a good thing.

The complete code from the series is on GitHub, where there is a branch with the code for the first two posts.

Event sourced aggregates

The context of what I am exploring in this series of posts is systems based on Domain Driven Design, where some or all of the aggregates are stored using event sourcing. Often these systems also use CQRS - very much inspired by Greg Young's talks and writings.
Using event sourcing for storing the aggregates means that the aggregate code does not change the state of the aggregate directly; instead it emits an event. The event is applied to the aggregate - which is where changes to the state of the aggregate happen - but the event is also stored in a data store. Since aggregate state is only changed when an event is applied, the current aggregate state can be recreated by reading all the events for a given aggregate from the data store and applying each one to the aggregate. The benefits of this approach are many (when applied to a suitable domain) and described elsewhere, so I won't go into them here.

A typical C# implementation

Typical implementations of all this follow a structure where requests coming in from the outside - be it through a client making a request to an HTTP endpoint, a message on a queue from some other service or something else - result in a chain that goes like this:
  1. A command is sent, asking the domain to perform a certain task 
  2. A command handler picks up that command, fetches the appropriate aggregate and triggers appropriate domain behavior. Fetching the aggregate involves replaying all the aggregate's events (event sourcing!)
  3. A method on an aggregate is called, and that is where the actual domain logic is implemented
  4. Whenever the domain logic needs to change some state it emits an event (event sourcing, again) of a specific type
  5. A 'when' method for the specific event type on the aggregate is called and updates the state of the aggregate
  6. After the aggregate is done, all the events emitted during execution of the domain logic are dispatched to any number of event handlers that cause side effects, like updating view models or sending messages to other services.
To put this whole process into code, let's think of an example: Creating a user in some imaginary system. The first step is to send the create user command:


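A sketch of the command and of sending it (names are assumptions):

    public class CreateUserCommand : IRequest
    {
      public string Username { get; set; }
    }

    mediator.Send(new CreateUserCommand { Username = "jane" });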
The next step is the command handler for the create user command. Note that in this example I use the MediatR library to connect the sending of a command to the handler for the command.


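A sketch of the handler's shape; the MediatR plumbing is elided and the repository/dispatcher abstractions are assumptions:

    public class CreateUserCommandHandler
    {
      private readonly IAggregateRepository repository;
      private readonly IEventDispatcher dispatcher;

      public CreateUserCommandHandler(
        IAggregateRepository repository, IEventDispatcher dispatcher)
      {
        this.repository = repository;
        this.dispatcher = dispatcher;
      }

      public void Handle(CreateUserCommand command)
      {
        var aggregate = new UserAggregate();
        aggregate.CreateUser(command.Username);   // call into the aggregate
        repository.Save(aggregate);
        dispatcher.Dispatch(aggregate.NewEvents);
      }
    }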
Note that most of what is going on here is the same as for handlers of other commands: Pick the right aggregate and, after executing the domain logic, save that aggregate and then dispatch whatever events were emitted.

In the handler we call into the aggregate. The code in the aggregate looks like this:


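A sketch of the aggregate method (any real domain rules are elided):

    public void CreateUser(string username)
    {
      // Domain logic would go here; the state change itself is
      // expressed as an event rather than set directly.
      Emit(new UserCreatedEvent { Username = username });
    }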
Here we call the Emit method. This is how most implementations I've seen work, and typically that Emit method is part of the Aggregate base class and looks something like this:


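A sketch of such a base class (the reflection details are assumptions):

    using System.Collections.Generic;
    using System.Reflection;

    public abstract class Aggregate
    {
      private readonly List<Event> newEvents = new List<Event>();
      public IReadOnlyList<Event> NewEvents => newEvents;

      protected void Emit(Event e)
      {
        newEvents.Add(e);
        Play(e);
      }

      private void Play(Event e)
      {
        // Find the private 'When' overload matching the event type and call it.
        var when = GetType().GetMethod(
          "When",
          BindingFlags.Instance | BindingFlags.NonPublic,
          null, new[] { e.GetType() }, null);
        when.Invoke(this, new object[] { e });
      }
    }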
Notice how Emit calls Play, which uses reflection to find a When method on the aggregate itself and call it. The When method is supposed to update the state of the aggregate and is also the method that gets called during event replay - more on that below. For now let's see the When method:


That's pretty much it, though there are a few things I have skipped over a bit quickly: how the aggregate is fetched, how it is saved and how events are dispatched. I will not go into the event dispatching, since it is not relevant to the point I am making in this series, but the code is on Github if you want to look. As for the other two bits - fetching and saving aggregates - let's start with how aggregates are saved:


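A sketch of the save operation; 'eventStore' and the 'Id' property stand in for whatever abstractions are actually used:

    public void Save(Aggregate aggregate)
    {
      // Saving just means appending the aggregate's new events to
      // the stream of events already stored for it.
      eventStore.Append(aggregate.Id, aggregate.NewEvents);
    }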
As you can see, saving the aggregate essentially means saving a list of events. The list should contain all the events that have ever been emitted by the aggregate. That is the essence of event sourcing. When it comes to fetching the aggregate, the list of events is read, and each one is replayed on a new clean aggregate object - that is, the When method for each event is called in turn. Since only the When methods update the state of the aggregate, the result is an aggregate object in the right state. The Get method on the aggregate repository (which does the fetching) looks like this:


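A sketch of the fetch-and-replay (the event store abstraction is an assumption):

    public T Get<T>(Guid id) where T : Aggregate, new()
    {
      var events = eventStore.ReadEventsFor(id);
      var aggregate = new T();
      aggregate.Replay(events);   // re-applies every event via the 'When' methods
      return aggregate;
    }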
And the Replay method just runs through the list of events and plays each one of them in turn, like this:


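A sketch:

    public void Replay(IEnumerable<Event> events)
    {
      foreach (var e in events)
        Play(e);
    }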
That pretty much outlines the implementations of event sourced aggregates I come across.

That's it for now. The next posts will:

  1. Add a second feature and see how the aggregate starts to violate Open/Closed principle
  2. Make the command handlers smarter and clean up some repetitiveness
  3. Make the aggregate anemic in a naive way, leaving a stable aggregate, but introducing a new Open/Closed violation
  4. Make the aggregate anemic, events (a little) smart, and solve the Open/Closed violation

Monday, November 7, 2016

Sharing and Caching Assets Across Single-Page BFFs

Over the last couple of posts I showed the single-page BFFs pattern (a variation of the Backend-For-Frontend aka BFF pattern where the BFF serves only one page in a web application), and how you will likely need to place single-page BFFs behind a reverse proxy.

In this post I will outline an approach to handling shared assets - like CSS or Javascript which is used on all or most of the pages in a web application built using a number of single-page BFFs.

When building a web application consisting of a number of pages, there are likely both some look-and-feel and some behavior that we want to share across all pages - buttons should look and behave the same, menus should be on all pages, colors and fonts should be consistent etc. This calls for sharing some CSS and some Javascript across the pages. When thinking about handling these shared things across a number of single-page BFFs there are some competing requirements to take into consideration:

  • Clients - i.e. browsers - should be able to cache the CSS and Javascript as the user moves through the web application. The alternative is that the client downloads the same CSS and JS over and over as the user navigates the web application. That would be too wasteful in most cases.
  • Each single-page BFF should be individually deployable. That is, we should be able to deploy a new version of one single-page BFF without having to deploy any other services. If not, the system soon becomes unmanageable. Furthermore developers should be able to run and test each single-page BFF by itself without having to spin up anything else.
  • I do not want to have to write the same CSS and JS in each single-page BFF. I want to share it.

In this post I will show an approach that allows me to achieve all three of the above.

First notice that the obvious solution to the first bullet - allowing clients to cache shared CSS and JS - is to put the shared CSS and the shared JS in two bundles - a CSS bundle and a JS bundle - and put the bundles at a known place. That way each single-page BFF can just address the bundle at the known place, as shown below. Since all pages refer to the same bundles the client can cache them.


The problem with this is that it violates bullet 3: Any given single-page BFF only works when both the global bundles are available, so for a developer to work on a single-page BFF, they need not only the BFF, but also something to serve the bundles. If we are not careful, this approach can also violate bullet 2: Since the single-page BFFs depend on the two global bundles but do not contain the bundles themselves, each one has a hard dependency on another service - the one that serves the bundles. That means we can easily end up having to deploy the service that serves the bundles at the same time as a new version of one of the single-page BFFs.

Looking instead at bullet three first suggests that we should put the shared CSS and Javascript into a package - an NPM package, that is - and include that package in each single-page BFF. That means each single-page BFF has the bundles it needs, so developers can work with each single-page BFF in isolation. It also means that each single-page BFF continues to be individually deployable: when deployed they bring along the bundles they need, so there is no dependency on another service to serve the bundles. The problem now becomes how to allow clients to cache the bundles: when the bundles are included in each single-page BFF, each page will use different URLs for the bundles - completely defeating HTTP cache headers.


Luckily, as discussed in the last post, the single-page BFFs are already behind a reverse proxy. If we put a bit of smarts into the reverse proxy, we can include the bundles in each single-page BFF and still let every page use the same URLs for the bundles, like so:


The trick to implementing this is to have the reverse proxy look at the referrer of any request for one of the bundles: if the referrer is the frontpage, the request is proxied to the frontpage single-page BFF; if the referrer is the my account page, the request is proxied to the MyAccount single-page BFF. That means the bundles loaded on the frontpage are the bundles deployed in the frontpage single-page BFF. Likewise the bundles loaded on the my account page come from the MyAccount single-page BFF. But from the client's perspective the bundles are at the same URLs - i.e. the client is able to cache the bundles based on HTTP cache headers.
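The routing decision could be sketched in C# like this; in practice it would more likely be configuration in the proxy itself (nginx, HAProxy etc.), and the page paths and upstream addresses here are assumptions:

    public Uri ResolveBundleUpstream(Uri referrer)
    {
      if (referrer != null && referrer.AbsolutePath.StartsWith("/myaccount"))
        return new Uri("http://myaccount-bff");

      // Default: let the frontpage single-page BFF serve the bundles.
      return new Uri("http://frontpage-bff");
    }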

To sum up: Putting shared CSS and JS bundles in an NPM package that all the single-page BFFs use, and making the reverse proxy use the referrer on requests for the bundles to decide where to proxy them, allows clients to cache the bundles, allows the single-page BFFs to be individually deployable, and allows developers to work with each single-page BFF in isolation.

Saturday, June 6, 2015

Shipping Error Logs from an Angular App

By popular demand1 I'm writing this post describing how I've gone about implementing log shipping in an Angular app. It's what worked for me. YMMV.

Why?

The reasons for logging messages in an SPA are the same as in any other application: To gain insight into what happens in the app at run time.
The problem with client side logging is that it is hard to get at once the app is in production - that is, once it is running on a machine you (the developer) do not have access to. Collecting the log messages on the client side and sending them back to the server lets you put the client side log messages into a log store on the server, where they are available for analysis.

Client Side

In my situation the SPA is written in Angular, which provides a logging service called $log. The $log service can be used anywhere in application code and is also used by Angular itself to log stuff. So the messages written to $log are messages I would like to ship back to the server. Out of the box $log simply logs to the console. Therefore I wrap $log in a decorator and tell Angular to inject my decorated $log wherever $log would otherwise be injected. My decorator stores all log messages along with the log level - error, warning, information, debug - in an array and then delegates to the default $log implementation. That way log messages are both stored for shipping at a later point in time and logged to the console immediately. The decorator also adds a method called shipLogs that sends the current array of log messages to the server and then clears the array on success. The decorator is set up like this:



The shipping of the logs can be done periodically - in my case every 10 seconds - using Angular's $interval service:


This is not a foolproof process: There is no handling of situations where shipping takes more than 10 seconds, there is no guarantee that the shipping won't happen in parallel with other requests and thus compete for bandwidth, and there is no guarantee that all log messages are shipped before the user leaves the app. But it works well enough for my particular case.

Server Side

The server side is exceedingly simple.
The only point of interest is deciding how to log the client side logs on the server. I've decided to start out by degrading all client side log messages to information level on the server, regardless of the level on the client. This is done because certain things - e.g. lost network - will cause error situations on the client, like failed ajax requests, even though they are not really errors seen from the server side: we cannot do anything about clients losing network. This is contextual and might be completely different in your situation. Furthermore I decided to log each batch of client side log messages in one go, in order to avoid spamming my server side log. Both of these decisions might change at a later stage once we have more insight into the characteristics of the logs we get back from clients.
As I said, the server side is simple; it boils down to a single endpoint:


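The endpoint was shown as an image; a sketch of what it could look like in ASP.NET Web API (the framework choice, names and logger are assumptions):

    using System;
    using System.Linq;
    using System.Web.Http;

    public class LogMessage
    {
      public string Level { get; set; }
      public string Message { get; set; }
    }

    public class ClientLogsController : ApiController
    {
      public IHttpActionResult Post(LogMessage[] messages)
      {
        // Log the whole batch in one go, degraded to information level.
        var batch = string.Join(Environment.NewLine,
          messages.Select(m => "[client " + m.Level + "] " + m.Message));
        logger.Info(batch);   // 'logger' stands in for whatever server side logger is used
        return Ok();
      }
    }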

That's it. Again YMMV.


1. OK 2 tweets doesn't constitute popular demand. But hey.

Wednesday, November 26, 2014

Slides from "OWIN and Composable Web Apps" at Campus Days Cph

Here are the OWIN slides from the talk I did at Campus Days Cph today. Enjoy :)



Saturday, May 31, 2014

Breaking Domain Layer Dependency on Data Layer: Use Events for Writes

In this final part of my series (part I, II and III) on reversing the dependency from Domain Model to Data Access code found in many traditional layered code bases I will show how raising Domain Events when Domain Objects are updated can further decouple the Data Access component from the other components.

Recap

To recap the initial situation is this:



The two previous posts show how to reverse the dependency between Domain Model and Data Access, either by letting the Presentation Layer talk to the Data Access code or by invoking Dependency Inversion.

Leaking Data Access Concerns?
Reversing the dependency by letting the Presentation component depend directly on the Data Access component works, but leaves the responsibility of making the Data Access component save entities to the Presentation component. E.g. in this snippet of Presentation Layer code there is a call to wishListRepository.Save(wishList);


This may seem innocent enough, but saving is not a presentation concern, which means presentation code should not be making that .Save call.

Reversing the dependency by Dependency Inversion also works, and alleviates the Presentation code from calling .Save. On the other hand the Domain Model code has to make the Data Access code save entities. Doing this without the compile time dependency pointing from Domain Model to Data Access is achieved by introducing interfaces like this in the Domain Model:


Again this seems innocent enough, but what about the Save method? On the one hand it does not reveal anything in particular about the underlying Data Access code or database. On the other hand it does imply that WishList objects can be saved - presumably to a database of some kind. Furthermore the Domain Model has to decide when to call .Save, which is not really a Domain Model concern. The Domain Model should be responsible for the Domain logic. The 'what' and the 'how' of persistence are the responsibility of the Data Access component. The 'when' of persistence is also the responsibility of the Data Access component in simple cases, and of Infrastructure code in more complicated cases.

Domain Events Enable Further Decoupling
If the Presentation component is not to call .Save and the Domain Model is not to call .Save, then what? - The Domain Model will raise events when Domain Objects are updated, as done in this:



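A sketch of a Domain Model method raising an event (all names are assumptions; the eventing mechanism itself is sketched further down):

    public void AddItem(Item item)
    {
      items.Add(item);
      DomainEvents.Raise(new WishListChangedEvent(this));
    }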
The actual saving of the WishList is done in an event handler, which can be something along the lines of the following - notice that this class is part of the Data Access component and uses the Domain Model:



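A sketch of such a handler (the subscriber interface and event type are assumptions; 'ISession' is NHibernate's session):

    public class SaveWishListOnChange : ISubscriber<WishListChangedEvent>
    {
      private readonly ISession session;

      public SaveWishListOnChange(ISession session)
      {
        this.session = session;
      }

      public void Handle(WishListChangedEvent evt)
      {
        // The Data Access component decides what, how and when to persist.
        session.Save(evt.WishList);
      }
    }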
Now neither the Domain Model nor the Presentation component makes any decisions about what, how or when to persist. The Presentation layer only retrieves Domain Objects as needed and interacts with them. The Domain Model just implements the Domain Logic and raises events whenever there are state changes. It is the full responsibility of the Data Access component to handle the 'what', 'how' and 'when' of persistence.

A Caveat
It is not always good enough to simply persist state changes right away as Domain Events are raised. Sometimes a Unit of Work is needed. In such cases the Domain Event handlers should not persist data directly, but rather interact with the Unit of Work to record changes. Whether the Unit of Work eventually ends up being committed or not is not the concern of the Domain Event handlers. They can rely on Infrastructure code to take care of persisting or discarding the changes recorded by the Unit of Work.

Simple Eventing
Implementing a simple mechanism for raising and handling Domain Events amounts to iterating over all Subscribers, calling them one after another when a Domain Event is raised, and to registering all Subscribers. That is, having this:



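A sketch of such a mechanism (a static holder is used for brevity; the names are assumptions):

    using System.Collections.Generic;
    using System.Linq;

    public interface ISubscriber<T>
    {
      void Handle(T domainEvent);
    }

    public static class DomainEvents
    {
      private static readonly List<object> subscribers = new List<object>();

      public static void Register<T>(ISubscriber<T> subscriber)
      {
        subscribers.Add(subscriber);
      }

      public static void Raise<T>(T domainEvent)
      {
        foreach (var subscriber in subscribers.OfType<ISubscriber<T>>())
          subscriber.Handle(domainEvent);
      }
    }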
and registering all Subscribers with that class in the Composition Root. This can be done by explicitly listing all Subscribers, by discovering them by scanning assemblies, or - in case the application uses an IoC/DI Container - by asking the Container to resolve Subscribers.

Conclusion
Raising Domain Events further decouples the Data Access component from the rest of the components. It frees other components of the responsibility of deciding what and when to save, and it allows both for simply saving immediately and for delegating to a Unit of Work.
Raising Domain Events also allows for similar decoupling of other components - e.g. integrations to third parties.

Tuesday, May 13, 2014

Breaking Domain Layer Dependency on Data Layer: Dependency Inversion

This 3rd post in the series about moving from an architecture where Domain Model code depends on Data Access code to an architecture where that dependency is reversed shows how that can be accomplished using Dependency Inversion.

To recap the situation we come from is this:


and we want to get to this:


Before
While the dependency points from the Domain Model to the Data Access component we might have code like this in the Domain Model:


which depends on this interface in the Data Access Component:



Reversing the Dependency
Let's take a look at what the Dependency Inversion Principle as formulated by Uncle Bob tells us:

"Abstractions should not depend upon details. Details should depend upon abstractions"

This tells us that the Data Access code - a detail - should not depend on the Domain Model - an abstraction - but vice versa. In other words the Domain Model component should be changed to include this interface:


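A sketch of that interface; note that it speaks only in Domain types (the member names are assumptions, though a 'Save' method is discussed elsewhere in the series):

    public interface WishListRepository
    {
      WishList GetByUserName(string userName);
      void Save(WishList wishList);
    }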
which should be used by the Domain Model code:


That new DomainModel.WishListRepository interface should be defined in the Domain Model and implemented in the Data Access component:


And that inverts the dependency.

Conclusion
We have inverted the dependency between Domain Model and Data Access, such that the Data Access component now depends on the Domain Model, while the Domain Model has one dependency less.
As a side effect the type WishListDTO disappeared, so we end up with less code and a better separation of concerns.

Tuesday, May 6, 2014

Breaking Domain Dependency on Data Access: Let Presentation Depend on Data Access

This is the second post about not letting domain layers depend on data access layers. This post explores the (maybe) most obvious way to reverse the dependency between domain model and data access in classic three-layer architectures,

which is basically to turn this



into this



Notice the arrow from the Data Access component to the Domain Model? Why is that there?
To answer that let's take a quick look at an example. Say we have a web enabled e-commerce system with an endpoint that lets clients add an item to a logged in user's wishlist (in other words, a simplistic version of something like Amazon's wishlist). To do that the code for the endpoint needs to:
  • Get the user the call is made on behalf of
  • Retrieve that user's wish list from the data access component
  • Add the item to the wish list
  • Save the updated wishlist back through the data access component
Which is accomplished with code along the lines of:



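A sketch of that endpoint code (web framework specifics elided; the two repository calls are as quoted below):

    var wishList = wishListRepository.GetByUserName(Context.CurrentUser.UserName);
    wishList.AddItem(item);
    wishListRepository.Save(wishList);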
This code looks almost like it would have looked in a classic 3-Layer Architecture. So how does this code justify the dependency arrow from the Data Access component to the Domain component? It does so because:
  • The WishListRepository is found in the Data Access component
  • The call wishListRepository.GetByUserName(Context.CurrentUser.UserName) returns a Domain object - an instance of the WishList Domain class.
  • The call wishListRepository.Save(wishList) accepts a Domain object
So there you have it: Allowing the Presentation component to talk to the Data Access component allows you to cut the dependency from Domain to Data Access, but introduces the reverse dependency. For the reasons argued in the first post (and in the resources referenced there) this is a much better position to be in than the one in the topmost diagram.

An Added Benefit 
 Notice that the data access component now returns Domain objects when queried. That is possible because the dependency points from Data Access to Domain. It used to be the other way around. So ... what did the Data Access code return on queries before? Most likely some object containing the relevant data from the database. These types of objects come under different names: WishListDTO, WishListEntity (as in an NHibernate or Entity Framework entity), DataModel etc. Regardless of the name, their main purpose is to move data from the database into the layer on top of the Data Access layer (i.e. the Domain layer in the topmost diagram). But now that the Data Access code returns Domain objects directly, these are no longer needed to move data into the rest of the system. In the cases where the mapping from the database to objects is not complicated, these DTOs/Entities/DataModels have become superfluous and can be deleted. This can turn out to be a lot of code that can be deleted.

Downsides 
The major downside here is that the endpoint code has been given slightly more responsibility, since it now has to fetch Domain objects - which it also did before - and also has to save them again after interacting with them. Depending on the state of affairs before breaking the Domain to Data Access dependency, the endpoint may or may not have had to explicitly save changes; that may have been handled by a combination of dirty tracking and infrastructure code.
In the 4th post of this series I will show a way to remove this extra responsibility from the presentation layer again.

Monday, April 28, 2014

Domain Models Should Not Depend on Data Layers

This is the first of 4 posts about how to go about moving from an architecture where the Domain Model depends on the Data Access component to an architecture where that dependency is reversed.

In this post I outline the problem that I propose solutions for in the next posts.

Why Write About This?
Because I see this problem repeatedly in code that I get to work with. So I might as well write down what I tend to talk to clients about in those situations.

Why Do So Many Domain Models Depend on a Data Access Component?
I think there are several reasons, but the most important one it seems to me is that for many this seems to be the default for any server side software:


So that's how many systems start out. Or so it seems from my experience. At some point in the lifetime of the system the Business Logic component is repurposed as a Domain Model - this may happen in an attempt to get away from Transaction Scripts, or simply because having a Domain Model is seen as The Right Thing To Do (™).

Why Reverse the Dependency?
As long as the Domain Model depends on the Data Access component both are hampered. Albeit for different reasons.

Having the Domain Model depend on the Data Access layer is a violation of the Dependency Inversion Principle, because the Domain Model is at a higher level of abstraction than the Data Access component. The consequence is that the Domain Model is more tightly coupled to the details of Data Access than it should be, leading to poorer maintainability and poorer testability. In fact Domain Models should be POCOs with no other dependencies than the standard library.

Having all other components depend - directly or indirectly - on the Data Access component makes the Data Access code harder and more risky to change, because changes potentially affect everything else. The Data Access component is at the edge of the system, and just like the entire system should not depend on the Presentation component (another component at the edge), the entire system should not depend on the Data Access component.

Therefore this is preferable:


For deeper explanations of this I recommend Alistair Cockburn's article on Hexagonal Architecture and the GOOS book by Nat Pryce and Steve Freeman.

How To Break the Dependency?
The upcoming posts cover these tactics for breaking the dependency:
  1. Allow presentation layer to talk to data layer
  2. Dependency Inversion
  3. Use domain events for writes
BTW. This applies to other stuff too. E.g. integration components or hardware access components.

Saturday, March 8, 2014

In Search of Maybe in C# - Part III

Following up on the last two posts [1, 2], this final post in my little series on Maybe/Option gives a third implementation of Option. This time around I'll drop the idea of letting Option implement IEnumerable, and instead focus on implementing the function needed to make my Option a monad (which it actually wasn't in the last post).

First off, let's reiterate the basic implementation of Option - a simple class that can be in one of two states, 'Some' or 'None', where Options in the 'Some' state contain a value and Options in the 'None' state don't.

I also want to maintain the matching functionality that I had in the other posts, so I'm keeping this:

While even this is somewhat useful as is, it becomes better if we add the remaining part of the monad: the FlatMap function (the unit function is the Some(T value) method, so only FlatMap aka Bind aka SelectMany is missing). So we add this to the OptionHelpers class:

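A sketch of the FlatMap extension, assuming an Option<T> exposing an 'IsSome' flag and a 'Value' (the member names are assumptions):

    public static class OptionHelpers
    {
      public static Option<TOut> FlatMap<TIn, TOut>(
        this Option<TIn> option, Func<TIn, Option<TOut>> f)
      {
        return option.IsSome ? f(option.Value) : Option<TOut>.None();
      }
    }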
This is nice because now we are once again able to chain operations without having to worry much about null ref exceptions.
Just for convenience I'll throw in a Filter function in the OptionHelpers as well:


To exemplify how Option can flow through a series of operations, take a look at the following, where there is a value, so the Some case will flow through:

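Something along these lines (the Option API shape sketched above is assumed):

    var result = Option<string>.Some("some@example.com")
      .Filter(s => s.Contains("@"))
      .FlatMap(s => Option<string>.Some(s.ToUpperInvariant()));
    // result is Some("SOME@EXAMPLE.COM")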
In the case of no value, the None case will flow through, as in this example:


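Again assuming the API sketched above:

    var result = Option<string>.Some("no at sign")
      .Filter(s => s.Contains("@"))
      .FlatMap(s => Option<string>.Some(s.ToUpperInvariant()));
    // result is None: Filter returned None, and FlatMap passed it through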

Towards the end of the first post in the series I mentioned that it's nice to be able to treat Option as a collection. That led to Option implementing IEnumerable in the second post, thus enabling passing Options into existing code working on IEnumerables, which is quite useful when introducing Option into an existing code base - especially so if it's a code base that uses LINQ a lot. In such code you're likely to find functions accepting an IEnumerable and returning an IEnumerable; an Option type implementing IEnumerable fits right in. On the other hand, if you don't have existing LINQ based code to fit into, the Option version in this post is maybe a bit simpler to work with, because it has a smaller API surface while still providing the benefit of allowing chaining.

Thursday, February 20, 2014

In Search of Maybe in C# - Part II

In my last post I wrote that I'd like to have an implementation of the Maybe monad in C#, and I explored how to use the F# implementation - FSharpOption in C#.

In this post I'll show a quick implementation of Maybe in C# - I'll call it Option. As was noted in some of the comments on the last post, this turns out to be quite simple. I left off last time mentioning that it's nice when Option can act somewhat like a collection. That turns out to be easy too. Just watch as I walk through it.

Implementing basic Option
First of all the Option type is a type that can either be Some or None. That is simply:


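A sketch of such a type (the exact member names are assumptions):

    public class Option<T>
    {
      private readonly T value;
      public bool IsSome { get; private set; }
      public T Value { get { return this.value; } }

      private Option(T value, bool isSome)
      {
        this.value = value;
        IsSome = isSome;
      }

      public static Option<T> Some(T value) { return new Option<T>(value, true); }
      public static Option<T> None() { return new Option<T>(default(T), false); }
    }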
which allows for creating options like this:


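For example (assuming the sketch above):

    var some = Option<string>.Some("Christian");
    var none = Option<string>.None();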
So far so good. This is all there is to a basic Maybe monad in C#.

Simulate pattern matching on Option
As argued in the last post, I'd like to be able to do something similar to pattern matching over my Option type. It's not going to be nearly as strong as F#'s or Scala's pattern matching, but it's pretty easy to support these scenarios:


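Something like this (the names are assumptions):

    name.Match(
      some: n => Console.WriteLine("Hello " + n),
      none: () => Console.WriteLine("Hello stranger"));

    var greeting = name.Match(
      some: n => "Hello " + n,
      none: () => "Hello stranger");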
All that takes is adding two simple methods to the Option type that each check the value of the Option instance and call the right callback. In other words, just add these two methods to the Option class above:


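A sketch of the two methods (shapes are assumptions):

    public void Match(Action<T> some, Action none)
    {
      if (IsSome) some(this.value);
      else none();
    }

    public TOut Match<TOut>(Func<T, TOut> some, Func<TOut> none)
    {
      return IsSome ? some(this.value) : none();
    }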
That's matching sorted.

Act like a collection
Now I also want Option to be usable as though it were a collection. I want that in order to be able to write chains of LINQ transformations and then just pass Options through them. To illustrate, I want to be able to do this:


This means that Option must now implement IEnumerable<T>, which is done like so:


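A sketch of the implementation, assuming the class declaration is changed to 'Option<T> : IEnumerable<T>':

    using System.Collections;
    using System.Collections.Generic;

    public IEnumerator<T> GetEnumerator()
    {
      // A Some acts like a one-element collection, a None like an empty one.
      if (IsSome)
        yield return this.value;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
      return GetEnumerator();
    }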
And to be able to get back an Option from an IEnumerable with 0 or 1 elements:


which is useful in order to be able to do the matching from above.

All together
Putting it all together the Option type becomes:


Not too hard.

Wednesday, January 29, 2014

In Search of Maybe in C#

One of the things working in Scala has taught me about C# is that the maybe monad is nicer than null.

In case you're not sure what the maybe monad is: It's a generic type that either contains a value or not. In Scala as well as in F# this type is called Option and can be either Some - in which case it contains a value - or None - in which case it doesn't. This allows you to represent the absence of a value in an explicit way visible to the compiler, since Option<T> is a separate type from T. This turns out to be a whole lot stronger than relying on null to represent the absence of a value. I'll point you to an article about F#'s Option for more explanation on why.

Functional languages will usually let you pattern match over Option, making it very easy to clearly differentiate the case of Some from the case of None. In F# this looks like so:


In C# we don't have this. The closest thing is Nullable<T>, but that doesn't work over reference types. But is Nullable<T> really the closest thing in C# to Option? - NO. As was pointed out to me in a recent email exchange, the F# Option type is available in C# too.

Let's see if we can redo the above F# code in C#. First the declaration of the name variable:


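Something along these lines, assuming a reference to FSharp.Core (the value is an assumption):

    var name = FSharpOption<string>.Some("Christian");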
The F# Option type is part of the FSharp.Core assembly and the type is actually called FSharpOption. Hence the line above.
On to the nameOrApology variable:


Not nearly as clear as the F# counterpart: which outcome is handled in the if part and which in the else part is not nearly as clear to me as the pattern matching in F#.
We can do better, using this little helper extension function:


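A sketch of such an extension ('OptionModule.IsSome' is part of FSharp.Core):

    public static TOut Match<TIn, TOut>(
      this FSharpOption<TIn> option, Func<TIn, TOut> some, Func<TOut> none)
    {
      return OptionModule.IsSome(option) ? some(option.Value) : none();
    }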
Which lets us initialize nameOrApology like this:


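Something like:

    var nameOrApology = name.Match(
      some: n => n,
      none: () => "I don't know your name");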
Now the two outcomes - a value inside the Option or no value - are much more clear to me.

This is OK, although the name, FSharpOption, is not too nice. Some of the method names - e.g. get_IsSome - aren't nice either. More importantly: Another thing Scala taught me is that it's nice if Option can be used in place of a collection. But that's for another post.

Sunday, January 19, 2014

Slides from Warm Crocodile Talk

At the awesome Warm Crocodile Developers Conference I did a talk under the title "Does Your Architecture Enable Flow?". These are the slides from my talk. As always the slides make less sense without the words, so YMMV.



Sunday, September 1, 2013

From Fact to Theory

I often find that the tests I write can be slowly generalized as the feature I happen to be working on is fleshed out. In xUnit.net terms I find that a fair number of the tests tend to move through these four stages:
  1. Start out as a single [Fact], working on hardcoded input data
  2. Pull out the hardcoded data, turning the tests into a [Theory, AutoData]
  3. Add edge cases, evolving the test into [Theory, AutoData, InlineData, InlineData]
  4. Add cases representative of equivalence classes, thus adding more InlineData attributes to the test
Each of the stages generalizes the test a bit more making it more potent in several ways. As the test moves through each stage it gains greater coverage in terms of the possible inputs, making it a better regression guard. But there is more at play. Along with the generalization (and multiplication) of the inputs the test can often be refactored towards being more about the behavior of the SUT and less about its implementation. So alongside the four stages above I find tests tend to move through these stages as well:
  1. While being a [Fact] the test is named something very specific - a specific example of how the SUT should behave
  2. During stage 2 or 3 the test name becomes something that expresses a general trait of the SUT
  3. The assertions in the test become less dependent on implementation details
I am not always able to make my tests follow either of these progressions, but when I am I find it very rewarding.

Thursday, October 11, 2012

Code and Lean


I've been toying around with an idea for a while. It's sort of half baked, but inspired by some good conversations about it at AAOSConf and GOTO I'm going to go out on a limb and sketch it out below.

Lean
Lean software development (LSD) has been around for a while. The Poppendiecks' book took the teachings of Lean Manufacturing and turned them into a set of principles and tools for thought that make sense for software development processes. In contrast to most writings on software development processes, the "LSD" book does not prescribe (or even describe) a particular process. Rather it describes a number of principles and a number of tools that can be used to analyse your current process in order to find out where it can be improved. LSD provides a framework for developing your own process that is efficient in your particular situation. The point is to continuously optimize the process towards faster delivery and shorter cycle times.

Software Architecture
What is a good software architecture? That's a hard question to answer. The following is an answer I've used for a few years now in connection with an architecture training course:

"A good architecture shortens the time between understanding user needs and delivering a solution. It does this by eliminating rework and by providing a de-facto standard framework based on the end user domain model." --James O. Coplien.

It's from Cope's Lean Architecture book, and it's about optimizing for cycle time and for fast delivery. But how? How does a software architecture help optimize for fast delivery? I think some of those tools for thought from LSD can help us out not only if we apply them to process, but also to the software architecture and to the code.

Value Stream Mapping in Code
First and foremost I think we can learn a lot about what makes a good software architecture by applying value stream mapping to our code bases. What I mean by this is: Map out what needs to be done in the code for a typical feature/user story/use case - whichever is your unit of work - to be implemented. Then take a look at which steps take which amount of effort.
Note that when I say code here I include config files, tests, deployment scripts, migration scripts and so on. Mapping this out gives us insight into which parts of the architecture need to become better: Is most of the time spent dealing with a particularly hairy integration? Maybe that needs to be better isolated. Is a lot of time spent writing and rewriting data migration scripts? Maybe we aren't using the right persistence solution. Is it very time consuming to build out the GUI? Maybe that part of the code isn't testable enough. These are just made up examples. The point is that I think value stream mapping is a good tool to find those hot spots within the software architecture.

Seeing Waste
Another tool I think we can apply to the code is seeing waste: Basically this just says that we should be on the lookout for anything that creates waste - in terms of effort or in terms of resources. Applying this to software architecture means asking questions like: Do we have methods or even whole classes or layers that just delegate to other parts of the code without any additional logic? 'coz that's just wasteful. It takes time to write, it takes time to debug through, and it takes (a little bit of) time to execute. All of that is waste. We then need to ask ourselves if that waste is justified by some other goal fulfilled by those delegations. If not, they need to go.
Another useful question is whether there are parts of the architecture that are only really warranted for a few features in the system but impact all features. E.g. if 80% of the features in an application are basic CRUD, but we do all data access through repositories because there are 20% of cases where more than CRUD is needed - then isn't it wasteful to funnel everything through those repositories? Even simple code takes time to write and test. Any code that isn't clearly justified is wasteful.

Options Thinking
Options thinking - in my mind - applies pretty directly to code. Options thinking is about keeping options open while not just stalling. It's about finding ways to postpone decisions until more information is available without slowing down progress. This - in code - is done pretty directly with well designed abstractions. That's the obvious one. I also think techniques like feature toggles and A/B testing, along with looking for minimal viable implementations, support options thinking. These techniques can help with trying out more than one direction in parallel and getting real world feedback. If this is done quickly - e.g. by focusing on only doing minimal implementations - it allows us to postpone deciding between the different directions a little while and gather more information in the meantime.

Pull Systems
Coding pull style: TDD. That's it.

Summing up: these are not fully formed ideas. But now they are out there. Any feedback will be much appreciated.

Friday, February 24, 2012

Repositories and Single Responsibility from the Trenches - Part II

In my last post I wrote about how we swapped a few ever-growing repository classes for a lot of small focused request classes, that each take care of one request to our database. That post showed the gist of how to implement these request classes. This post will focus on how to use them, test them and mock them.

First I'll reiterate how one of these request classes looks (NB. compared to the code from the last post the base class has been renamed):

public class DeleteDeviceWriteRequest : DataStoreWriteRequest
{
  private readonly Device device;

  public DeleteDeviceWriteRequest(Device device)
  {
    this.device = device;
  }

  protected override void DoExecute()
  {
    // NHibernate trickery cut out for brevity
    Session.Delete(device);
  }
}

And below I show how to use, test and mock this class.

Usage
Usage of these requests is really simple: You just new one up, and ask your local data store object to execute it:

var deleteDeviceRequest = new DeleteDeviceWriteRequest(device);
dataStore.Execute(deleteDeviceRequest);

Say what? I'm newing up an object that uses NHibernate directly. Gasp. That's a hard coupling to the ORM and to the database, isn't it? Well, kind of. But that is where the data store object comes into play: The request can only be executed through that object, because the request's only public method is its constructor and because its base class 'DataStoreWriteRequest' has no public methods. The interface for that data store is:

public interface IDataStore
{
  void Execute(DataStoreWriteRequest req);
  T Execute<T>(DataStoreReadRequest<T> req);
  T Get<T>(long id) where T : class;
  T Get<T>(Expression<Func<T,bool>> where) where T : class;
  void Add<T>(T entity) where T : class;
  void Delete<T>(T entity) where T : class;
  int Count<T>(Expression<Func<T, bool>> where = null) where T : class;
}

That could be implemented towards any database/ORM. In our case it's implemented against NHibernate, and is pretty standard except maybe for the two Execute methods - but then again they turn out to be really straightforward as well:

public void Execute(DataStoreWriteRequest req)
{
  WithTryCatch(() => req.ExecuteWith(Session));
}

public T Execute<T>(DataStoreReadRequest<T> req)
{
  return WithTryCatch(() => req.ExecuteWith(Session));
}

private void WithTryCatch(Action operation)
{
  WithTryCatch(() => { operation(); return 0; });
}

private TResult WithTryCatch<TResult>(Func<TResult> operation)
{
  try
  {
    return operation();
  }
  catch (Exception)
  {
    Dispose(); // ISession must be disposed
    throw;
  }
}

Notice the calls to ExecuteWith? Those are calls to internal methods on the abstract DataStoreReadRequest and DataStoreWriteRequest classes. In fact those internal methods are the reason that DataStoreReadRequest and DataStoreWriteRequest exist. Using a template method declared internal, they provide inheritors - the concrete database requests - a way to get executed, while hiding everything but the constructors from client code. Only our NHibernate implementation of IDataStore ever calls the ExecuteWith methods. Code outside our data access assembly cannot even see those methods. As it turns out this is really simple code as well:

public abstract class DataStoreWriteRequest
{
  protected ISession Session { get; private set; }

  internal void ExecuteWith(ISession session)
  {
    Session = session;
    DoExecute();
  }

  protected abstract void DoExecute();
}

To sum up: the client code just news up the requests it needs, and then hands them off to the data store object. Simple. Requests only expose constructors to the client code, nothing else. Simple.

Testing the Requests

Testing the requests individually is as simple as using them. This is no surprise since tests - as we know - are just the first clients. The tests do whatever setup of the database they need, then new up the request and the data store, ask the data store to execute the request, and then assert. Simple. Just like the client production code.

In fact one of the big wins with this design over our old repositories is that the tests become a whole lot simpler: Although you can split up test classes in many ways, the reality for us (as I suspect it is for many others too) is that we tend to have one test class per production class. Sometimes two, but almost never three or more. Since the repository classes grew and grew, so did the corresponding test classes, resulting in some quite hairy setup code. With the new design each test class tests just one very specific request, leading to much, much more cohesive test code.

To illustrate, here is the first simple test for the above DeleteDeviceWriteRequest - note that the UnitOfWork objects in this test implement IDataStore:

[Test]
public void ShouldDeleteDeviceWithNoRelations()
{
  var device = new Device();
  using (var arrangeUnitOfWork = CreateUnitOfWork())
  {
    arrangeUnitOfWork.Add(device);
  }

  using (var actUnitOfWork = CreateUnitOfWork())
  {
    var sut = new DeleteDeviceWriteRequest(device);
    actUnitOfWork.Execute(sut);
  }

  using (var assertUnitOfWork = CreateUnitOfWork())
  {
    Assert.That(assertUnitOfWork.Get<Device>(device.Id), Is.Null);
  }
}

Mocking the Request

The other part of testing is testing the code that uses these requests; testing the client code. For those tests we don't want the requests to be executed, since we don't want those tests slowed down by hitting the database. No, we want to mock the requests out completely. But there is a catch: The code under test - like the code in the first snippet in the Usage section above - news up the request. That's a hard compile-time coupling to the concrete request class. There is no seam allowing us to swap the implementation. We sidestep this problem by mocking the data store object instead. That allows us to redefine what executing the request means: Our mock data store never executes any of the requests it's asked to execute; it just records that it was asked to execute a certain request, and in the case of read requests returns whatever object we set it up to return. So the data store is our seam. The data store is never newed up directly in production code, it's always injected through constructors. Either by the IoC/DI container or by tests as here:


[Test]
public void DeleteDeviceRestCallExecutesDeleteOnDevice()
{
  var dataStore = mock.StrictMock<IDataStore>();
  var sut = new RestApi(dataStore);

  var device = new Device { Id = 123 };

  Expect.Call(dataStore.Get<Device>(device.Id)).Return(device);
  Expect.Call(() =>
    dataStore.Execute(
      Arg<DeleteDeviceWriteRequest>
      .Is.Equal(new DeleteDeviceWriteRequest(device))));

  mock.ReplayAll();

  sut.DeleteDevice(device.Id.ToString());

  mock.VerifyAll();
}


(The above code uses Rhino.Mocks to mock out the data store, but that could have been done quite simply by hand as well, or with any other mocking library.)
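One detail the snippet glosses over: for the 'Is.Equal' constraint to match, the request class needs value equality. The post doesn't show that code, but an override along these lines on DeleteDeviceWriteRequest would do - a sketch, assuming a delete request is fully defined by the device it targets:

// Two delete requests are equal when they target the same device;
// this is what lets the mock match the expected argument.
public override bool Equals(object obj)
{
  var other = obj as DeleteDeviceWriteRequest;
  return other != null && Equals(device, other.device);
}

public override int GetHashCode()
{
  return device != null ? device.GetHashCode() : 0;
}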

That's it. :-)