Sunday, January 18, 2009

Layered Architecture with LINQ to SQL (Part 2)

In my last post I mentioned two popular approaches for structuring your data access logic when choosing the domain model: Active Record and the pure domain model. In this post I want to explain how I would implement a data access layer with LINQ to SQL for a pure domain model. There are two reasons why I prefer this to Active Record:
  • I like the idea of persistence ignorance: clean, ordinary classes where you focus on the business problem. I'm not too dogmatic about that, though. For example, I don't mind the LINQ to SQL mapping attributes I have to put on my domain classes.
  • I don't want to run most of my unit tests against the database. The Repository pattern helps a lot in achieving this.
Repositories, Unit Of Work and Entities with LINQ to SQL

The central object of LINQ to SQL is the DataContext object. It tracks changes to all retrieved entities. It implements the Unit of Work and the Identity Map patterns and also provides query functionality on a per-table basis. It's similar to NHibernate's Session object. Too bad that Microsoft didn't define an interface for this class (as the NHibernate team did with the ISession interface). Such an interface is important for providing a stubbed implementation during unit testing. So let's define our own interface and name it IDataContext:
public interface IDataContext: IDisposable
{
void Commit();

void DeleteOnSubmit<T>(T entity) where T: class;

ChangeSet GetChanges();

IQueryable<T> GetTable<T>() where T: class;

IQueryable<T> GetTable<T>(Expression<Func<T, bool>> predicate) where T: class;

void InsertOnSubmit<T>(T entity) where T: class;
}

The class that implements this interface is just an adapter for the DataContext class. The code is straightforward:
public class LinqToSqlDataContextAdapter: IDataContext
{
private readonly DataContext _dataContext;
private bool _disposed;

public LinqToSqlDataContextAdapter(IDbConnectionConfiguration connectionConfiguration): this(new DataContext(connectionConfiguration.ConnectionString))
{

}

protected LinqToSqlDataContextAdapter(DataContext dataContext)
{
_dataContext = dataContext;
}

public void Commit()
{
_dataContext.SubmitChanges();
}

public void DeleteOnSubmit<T>(T entity) where T: class
{
_dataContext.GetTable<T>().DeleteOnSubmit(entity);
}
//... more adapter code
}
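For unit tests, the same interface can be backed by plain in-memory lists instead of a database. The following is only a minimal sketch of that idea, not part of the original code: IDataContextSketch is a trimmed-down, hypothetical copy of IDataContext (GetChanges is left out to keep the example self-contained), and InMemoryDataContext and CommitCount are names I made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical, trimmed-down copy of IDataContext (no GetChanges).
public interface IDataContextSketch : IDisposable
{
    void Commit();
    void DeleteOnSubmit<T>(T entity) where T : class;
    IQueryable<T> GetTable<T>() where T : class;
    IQueryable<T> GetTable<T>(Expression<Func<T, bool>> predicate) where T : class;
    void InsertOnSubmit<T>(T entity) where T : class;
}

// In-memory stub for tests. Unlike the real DataContext it applies
// inserts and deletes immediately instead of waiting for Commit().
public class InMemoryDataContext : IDataContextSketch
{
    private readonly Dictionary<Type, List<object>> _tables =
        new Dictionary<Type, List<object>>();

    // Lets a test verify that the code under test committed its work.
    public int CommitCount { get; private set; }

    private List<object> TableFor(Type type)
    {
        List<object> rows;
        if (!_tables.TryGetValue(type, out rows))
        {
            rows = new List<object>();
            _tables[type] = rows;
        }
        return rows;
    }

    public void Commit()
    {
        CommitCount++;
    }

    public void DeleteOnSubmit<T>(T entity) where T : class
    {
        TableFor(typeof(T)).Remove(entity);
    }

    public IQueryable<T> GetTable<T>() where T : class
    {
        // Copy the rows so callers can modify the table while iterating.
        return TableFor(typeof(T)).Cast<T>().ToList().AsQueryable();
    }

    public IQueryable<T> GetTable<T>(Expression<Func<T, bool>> predicate) where T : class
    {
        return GetTable<T>().Where(predicate);
    }

    public void InsertOnSubmit<T>(T entity) where T : class
    {
        TableFor(typeof(T)).Add(entity);
    }

    public void Dispose()
    {
    }
}
```

A test would create an InMemoryDataContext, pass it to the code under test and afterwards inspect the tables and CommitCount.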

Let's continue with the Repository Pattern. According to Fowler a Repository "provides a layer of abstraction over the mapping layer where query construction code is concentrated", to "minimize duplicate query logic". A Repository usually provides a set of query operations for an Entity. In addition to that objects can be added to and removed from the Repository. My interface for a generic Repository looks like this:
public interface IRepository<T> where T: IGuidIdentityPersistence
{
void Add(T entity);

long Count();

long Count(Expression<Func<T, bool>> predicate);

void Delete(T entity);

bool Exists();

bool Exists(Expression<Func<T, bool>> predicate);

T FindFirst(Expression<Func<T, bool>> predicate);

T Find(object id);

IQueryable<T> FindAll();

IQueryable<T> FindAll(Expression<Func<T, bool>> predicate);
}

I decided to use IQueryable instead of a collection as the return type of the FindAll methods. This makes the usage very flexible: an IQueryable is a deferred query, so clients can add filters as needed before it is executed. For more specialized methods I think it's better to return a collection than a query.
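To illustrate what "deferred" means here, the following sketch uses an in-memory list wrapped with AsQueryable() as a stand-in for a table. The class and method names are made up for the example; with LINQ to SQL the starting query would come from FindAll() instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredQueryDemo
{
    public static List<string> Run()
    {
        // Stand-in for a table (assumption: in-memory data instead of a database).
        var titles = new List<string> { "Refactoring", "Domain-Driven Design" };
        IQueryable<string> all = titles.AsQueryable();

        // The client adds a filter and an ordering. Nothing is executed yet.
        IQueryable<string> query = all
            .Where(t => t.StartsWith("D", StringComparison.Ordinal))
            .OrderBy(t => t);

        // Added after the query was built, but still part of the result,
        // because the query only runs when it is enumerated.
        titles.Add("Design Patterns");

        // ToList() enumerates the query and therefore executes it.
        return query.ToList();
    }
}
```

The same composition works on the IQueryable returned by a Repository; there the combined filter is translated into a single SQL statement when the query is enumerated.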

The generic implementation of IRepository<T> goes here:
public class Repository<T>: IRepository<T> where T: class, IGuidIdentityPersistence
{
private readonly IDataContext _dataContext;

public Repository(IDataContext dataContext)
{
_dataContext = dataContext;
}

public Repository()
{
_dataContext = UnitOfWork.Current;
}

private IDataContext DataContext
{
get { return _dataContext; }
}

public void Add(T entity)
{
DataContext.InsertOnSubmit(entity);
}

public long Count()
{
return DataContext.GetTable<T>().Count();
}

public long Count(Expression<Func<T, bool>> predicate)
{
return DataContext.GetTable(predicate).Count();
}

public void Delete(T entity)
{
DataContext.DeleteOnSubmit(entity);
}

public bool Exists()
{
// Any() only asks whether at least one row exists instead of counting all rows
return DataContext.GetTable<T>().Any();
}

public bool Exists(Expression<Func<T, bool>> predicate)
{
return DataContext.GetTable(predicate).Any();
}

public T FindFirst(Expression<Func<T, bool>> predicate)
{
return FindAll(predicate).FirstOrDefault();
}

public T Find(object id)
{
return DataContext.GetTable<T>().Where(e => e.Id.Equals(id)).FirstOrDefault();
}

/// <summary>
/// Returns a query for all objects in the table for type T
/// </summary>
public IQueryable<T> FindAll()
{
return DataContext.GetTable<T>();
}

/// <summary>
/// Returns a query for all objects in the table for type T that match the predicate
/// </summary>
public IQueryable<T> FindAll(Expression<Func<T, bool>> predicate)
{
return DataContext.GetTable<T>(predicate);
}
}

I made the class concrete on purpose. As the Repository class already defines a lot of helpful methods, it can be used directly in situations where you don't need a custom Repository. Below is an example where a generic Repository is used in an ASP.NET MVC Controller:
public class BookController: Controller
{
private IRepository<Location> _locationRepository;

public BookController(IRepository<Location> locationRepository)
{
_locationRepository = locationRepository;
}

//...
[AcceptVerbs("Post")]
public ActionResult Edit(Guid id, string title, string author, Guid locationId)
{
Book book = GetBook(id);
UpdateModel(book, new string[] {"Title", "Author"});
Location selectedLocation = _locationRepository.Find(locationId);
book.AddLocation(selectedLocation, GetCurrentUserName());
UnitOfWork.Current.Commit();
return View("Show", book);
}
}

A custom Repository could look like this:
public class BookRepository: Repository<Book>, IBookRepository
{
public BookRepository(IDataContext dataContext)
: base(dataContext)
{}

public BookRepository()
{}

public IEnumerable<Book> FindByTitle(string title)
{
return FindAll().Where(b => b.Title.Contains(title)).ToList();
}
}

The implementation of the Repository pattern I showed in this post adds a layer of abstraction on top of LINQ to SQL and results in a more decoupled architecture. As a side effect, the design simplifies database-independent testing. The generic Repository provides easy and flexible usage for simpler situations.

Layered Architecture with LINQ to SQL (Part 1)

As with many other Microsoft technologies, LINQ to SQL mainly supports a RAD (rapid application development) style of development. You can drag and drop tables into the LINQ to SQL Designer and Visual Studio will define a simple 1:1 mapping and generate domain classes and a typed DataContext for your database. All set, ready to hack! Let's create a Windows Forms form, drag some controls onto it, code some LINQ to SQL queries in a click event handler and bind the result to a grid. Sounds easy, right? Well, it is easy and you can do it, but consider the following problems:
  • Writing database queries in your presentation layer is not as bad as using SQL code in it, but it’s still a questionable practice. If you do it, you’re mixing data access code with business logic and presentation code. This means that you’ve decided to abandon the benefits of the separation of concerns.
  • LINQ to SQL queries are scattered around in your business logic or presentation code. Sooner or later you run into a maintenance problem.
  • There is no concept for code reuse. Queries that need to be used at different places have to be duplicated.
But that doesn't mean that LINQ to SQL is not suitable for building layered enterprise applications. There are a lot of resources available that can help you to structure your data access logic. Most of them are documented in Martin Fowler's Patterns of Enterprise Application Architecture (PEAA) book. If you choose the Domain Model approach there are mainly two options for your DAL:
  • Active Record. Active Record puts data access logic directly in the domain object. In most implementations you can see a set of static finder methods on the domain objects and instance methods for operations like save, update and delete.
  • Pure Domain Model with a Unit Of Work and Repositories. This approach separates the data access logic from the domain objects. With the combination of these patterns, different objects exist for different concerns: the unit of work is responsible for change tracking, repositories for data access and domain objects for domain logic.
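The difference between the two options is easiest to see in the shape of the classes. The following is only a rough sketch with made-up names (ActiveRecordBook, PlainBook, PlainBookRepository) and an in-memory list standing in for the database; it leaves out the unit of work and is not how either pattern has to be implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Active Record style: the domain object knows how to save and find itself.
public class ActiveRecordBook
{
    // Stand-in for the database table.
    private static readonly List<ActiveRecordBook> Table = new List<ActiveRecordBook>();

    public string Title { get; set; }

    // Instance method for persistence.
    public void Save()
    {
        if (!Table.Contains(this))
        {
            Table.Add(this);
        }
    }

    // Static finder method on the domain object.
    public static ActiveRecordBook FindByTitle(string title)
    {
        return Table.FirstOrDefault(b => b.Title == title);
    }
}

// Pure domain model style: the domain object is a plain class...
public class PlainBook
{
    public string Title { get; set; }
}

// ...and a separate repository object is responsible for data access.
public class PlainBookRepository
{
    // Stand-in for the database table.
    private readonly List<PlainBook> _table = new List<PlainBook>();

    public void Add(PlainBook book)
    {
        _table.Add(book);
    }

    public PlainBook FindByTitle(string title)
    {
        return _table.FirstOrDefault(b => b.Title == title);
    }
}
```

In the second variant PlainBook has no knowledge of how it is stored, which is what makes it possible to swap the repository for a stub in tests.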
In my next post I want to show you how I would apply the pure Domain Model patterns in the context of LINQ to SQL.

Sunday, December 7, 2008

ThoughtWorks Cruise: First Impressions

We use CruiseControl.NET as the continuous integration server in our company. CruiseControl.NET is a great tool and has helped us a lot in adopting continuous integration. Over the years a lot of CruiseControl.NET projects have accumulated. Currently we have 8 build servers with altogether over 100 projects. Most of the time some servers do nothing while others are under heavy load. So we're looking for a solution to use our hardware more efficiently.

What we really need is a tool that can distribute builds to different machines. Fortunately most of the commercial continuous integration servers do support this kind of feature. Most of them use some sort of a server/agent based concept to distribute builds. Thoughtworks provides a good overview of several CI products that are available on the market. I decided to have a deeper look at Cruise, the enterprise version of Cruise Control from Thoughtworks.

So I downloaded the free edition of Cruise 1.1 and installed it on my laptop. Cruise consists of a server (Cruise Server) and several agents (Cruise Agents) that receive work delegated from the server. A build can therefore be processed in parallel by several Cruise agents. What I really liked about Cruise is its implementation of the concept of the deployment pipeline. Cruise allows monitoring of changes to an application as they progress from initial check-in to functional testing, performance testing, user acceptance testing, staging, and release. A pipeline consists of one or more stages, each of which consists of jobs. You can think of a pipeline configuration as the workflow of a particular release process. In that way Cruise goes beyond the basic continuous integration that many products support these days.

What's interesting is that not every stage transition has to be triggered automatically. Some stages may require manual approval before the release process can go further. For instance, in my current project most of the tests are performed manually by domain experts outside of the development team. During an iteration or at the end of an iteration we provide builds for manual testing. In Cruise, this process could be implemented by a stage that needs manual approval.

When I hit Cruise's dashboard for the first time I was disappointed by how few features the product seemed to provide. The main page has just four tabs: one for viewing the current activity, one each for managing pipelines and agents, and an administration tab. Everything looked so minimalistic. But when I began to use it, I was surprised that I could do most of the things I wanted to do.

But there are some negative points too. The current version 1.1 only supports Subversion, Mercurial, Git and Perforce for source control integration. Some other features are missing in comparison to CruiseControl.NET. There is no support for visualizing reports from tools like FxCop or NCover; only NUnit is supported. Cruise doesn't provide a plugin mechanism like CruiseControl does, so there is currently no way to contribute extensions or write your own.

To summarize, Cruise looks very promising to me. I really liked the concept of pipelines and the minimalistic user interface. From the perspective of a CruiseControl.NET user, some key features are missing in the current version 1.1. But I'm sure that the agile guys from Thoughtworks will soon come out with a new version that may include those missing features.

Saturday, October 18, 2008

Are Aggregates Practical?

My coworker Jörg Jenni from GARAIO reflects in his recent post on how his team is implementing aggregates. I worked with him and his team for a few months, so I know what their current implementation looks like. To be honest, I was one of the developers who had problems following the rules that come with aggregates.

From Evans, the rules we need to enforce include:
  • The root Entity has global identity and is responsible for checking invariants
  • Root Entities have global identity. Entities inside the boundary have local identity, unique only within the Aggregate.
  • Nothing outside the Aggregate boundary can hold a reference to anything inside, except to the root Entity.
  • Objects within the Aggregate can hold references to other Aggregate roots.
  • A delete operation must remove everything within the Aggregate boundary all at once
  • When a change to any object within the Aggregate boundary is committed, all invariants of the whole Aggregate must be satisfied.
The "do not hold a reference from the outside to an object inside the aggregate" rule in particular is very strict. Jörg suggests in his post implementing aggregate internals as private classes in the root class. With this approach you can no longer violate the rule, as the compiler enforces it. The problem is that at some point you will need access to these objects even if they are identified as conceptually internal. For example, as a client I need a reference to these internals to show some information on a screen, to test the logic of the internal object or to execute some business logic that needs knowledge of the data and behavior of objects in the aggregate.
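To make the trade-off concrete, here is a minimal sketch of the private nested class approach. Order, OrderLine and all members are hypothetical names chosen for illustration; this is not the code from Jörg's post or from his team.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Aggregate root (hypothetical example).
public class Order
{
    // Aggregate internal: invisible outside Order, so the compiler
    // prevents any outside object from holding a reference to it.
    private class OrderLine
    {
        public string Product;
        public int Quantity;
        public decimal UnitPrice;
    }

    private readonly List<OrderLine> _lines = new List<OrderLine>();

    public void AddLine(string product, int quantity, decimal unitPrice)
    {
        // The root checks the invariants for the whole aggregate.
        if (quantity <= 0)
        {
            throw new ArgumentOutOfRangeException("quantity");
        }
        _lines.Add(new OrderLine { Product = product, Quantity = quantity, UnitPrice = unitPrice });
    }

    public decimal Total
    {
        get { return _lines.Sum(l => l.Quantity * l.UnitPrice); }
    }

    // Because OrderLine is private, a screen can only get at line data
    // through extra members like this one that copy values out.
    public IEnumerable<string> DescribeLines()
    {
        return _lines.Select(l => l.Quantity + " x " + l.Product).ToList();
    }
}
```

Every piece of information a client needs from a line requires another member like DescribeLines on the root, and OrderLine cannot be unit tested on its own. That is exactly the friction described above.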

Don't get me wrong, I really would love to apply this pattern. The essence of aggregates, identifying a cluster of objects that can be conceptually thought of as one unit, is very important for managing complexity. But is it practical in today's programming model? On the domain driven design discussion board I found an interesting discussion about aggregate boundaries between DDD adopters and Eric Evans. From the discussion I see that we're not the only ones who have problems with the implementation of aggregates. And what's interesting is that in the same discussion Mr. Evans wrote that he would try to add a concrete example and come up with some useful new insight, but he didn't return to the discussion.

Does anybody have a real-world example of aggregates? One that enforces all the rules?

Saturday, September 27, 2008

Let's Do Practices

This week Ivar Jacobson gave a talk at the Microsoft Regional Architect Forum in Zurich. Ivar Jacobson is one of the founders of UML, RUP and use cases. In his talk "Getting Good Software, Quickly and at Low Cost" he spoke about his practice-based approach to software development. At the beginning he mentioned the main problems of software processes. According to him, every process tries to be complete, process documentation is never read, and the way people live the process is soon out of sync with the original process definition. He then suggested a more practice-oriented approach for finding the right process. He described what a practice is and how practices can be composed to build custom processes.

The ideas he talked about made a lot of sense to me. I was very surprised by the lightweightness of his approach. As an example he proposed to document a practice by a set of index cards on which you only describe the essentials of the practice.

He also talked about agile software development. For him Agile is "a box" of the good stuff that has existed for many years, with some social additions. He sees SCRUM as a practice for project management. I always saw SCRUM as a process, but I think he's right. It's a good way to do project management in the field of software development. But with SCRUM alone you don't deliver good software. To be successful you'll need good engineering practices, testing practices and other practices for capturing requirements or releasing software. I have focused very much on SCRUM for some time now. Together with my colleagues I put a lot of effort into adopting SCRUM for the projects we were involved in and also into the company-wide adoption of SCRUM. Maybe it's time for me to address some other things. Back to good old engineering stuff. There are a lot of practices to explore over at Ivar Jacobson International and also from the Eclipse Process Framework.

Sunday, July 20, 2008

Linq To Sql and Value Objects

Unfortunately Linq to SQL lacks support for Value Objects. This is a big limitation when you do Domain Driven Design, as we do on a project that I'm currently involved in. We hadn't used this pattern so far, but last week there was no way around it. I had to implement a feature based on the stock of monetary units. I identified an object MonetaryUnitStock that represents the stock for a monetary unit such as 10 Swiss francs. I didn't want to care about identity for that object and I wanted it to be immutable.

For this feature I had no requirements for querying, so the implementation with Linq to SQL was straightforward. Here is the code in C#:

public struct MonetaryUnitStock: IEquatable<MonetaryUnitStock>
{
private readonly int _numberOfUnit;
private readonly MonetaryUnitType _type;

public MonetaryUnitStock(MonetaryUnitType type, int numberOfUnit)
{
_type = type;
_numberOfUnit = numberOfUnit;
}

public int NumberOfUnit
{
get { return _numberOfUnit; }
}

public MonetaryUnitType MonetaryUnitType
{
get { return _type; }
}

public bool Equals(MonetaryUnitStock other)
{
return MonetaryUnitType.Equals(other.MonetaryUnitType) && this.NumberOfUnit.Equals(other.NumberOfUnit);
}

public override bool Equals(object obj)
{
if(obj is MonetaryUnitStock)
{
return Equals((MonetaryUnitStock)obj);
}
return false;
}


public override int GetHashCode()
{
return MonetaryUnitType.GetHashCode() ^ NumberOfUnit.GetHashCode();
}
}

partial class MoneyList
{
private MonetaryUnitStock? _monetaryUnitStockUnit100;

public MonetaryUnitStock MonetaryUnitStockUnit100
{
get
{
if (_monetaryUnitStockUnit100 == null)
{
_monetaryUnitStockUnit100 = new MonetaryUnitStock(MonetaryUnitType.Unit100, NumberOfUnit100);
}
return _monetaryUnitStockUnit100.Value;
}
set
{
if(value.MonetaryUnitType != MonetaryUnitType.Unit100)
{
throw new ArgumentException("MonetaryUnitStockUnit100 expects MonetaryUnitType.Unit100");
}
_monetaryUnitStockUnit100 = value;
_NumberOfUnit100 = _monetaryUnitStockUnit100.Value.NumberOfUnit;
}
}
}


I used the DBML designer for creating this simplified example. In the designer I marked the NumberOfUnit100 attribute as private to expose only the Value Object to the outside. So clients can only set and get MonetaryUnitStockUnit100 and have no access to the underlying simple type.

The reason I implemented the MonetaryUnitStockUnit100 property with lazy loading is that Linq to SQL provides no interception point when reconstituting an object from persistence. The OnCreated() method generated by the designer is not suitable, as it is called before the actual values are set.

Monday, March 24, 2008

Using LINQ with Text Files

I'm currently reading the book LINQ in Action, which I can highly recommend to all who want to get started with LINQ. In one of the first chapters there is a nice example of how you can use LINQ with text files (originally posted by Eric White). Rather than reading all lines of the file into memory and then querying them, the example uses deferred execution with an extension method on the class StreamReader.

Inspired by this example I wrote a simple file reader that can read structured non-XML data from text files. As you can see in the Process method in the code below the text file is queried and processed line by line. Objects are created on the fly, as you loop through the results. This technique allows you to work even with huge files.

public class RecordReader
{
private Dictionary<string, IReaderStrategy> _strategies = new Dictionary<string, IReaderStrategy>();
private const char Comment = '#';
private int _typeFieldLength;

public RecordReader(int typeFieldLength)
{
_typeFieldLength = typeFieldLength;
}

public IList Process(StreamReader input)
{
var result =
from line in input.Lines()
where !IsBlank(line)
where !IsComment(line)
select GetStrategy(line).Process(line);
return result.ToList();
}

private IReaderStrategy GetStrategy(string line)
{
string typeCode = GetTypeCode(line);
if(!_strategies.ContainsKey(typeCode))
{
throw new NoStrategyDefinedException("No strategy defined for code " + typeCode);
}
return _strategies[typeCode];
}

private static bool IsComment(string line)
{
return line[0] == Comment;
}

private static bool IsBlank(string line)
{
return string.IsNullOrEmpty(line);
}

private string GetTypeCode(string line)
{
return line.Substring(0, _typeFieldLength);
}

public void AddStrategy(IReaderStrategy arg)
{
_strategies[arg.Code] = arg;
}
}
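The Lines() extension method used in Process is not shown above. A minimal sketch of how such a method could look is given below; it is my own reconstruction, not the code from the book or from Eric White's post. I declared it on TextReader (the base class of StreamReader) so that it also works with a StringReader in tests, and the class name StreamReaderExtensions is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StreamReaderExtensions
{
    // Returns the lines of the reader one at a time. No line is read
    // until the caller starts enumerating the result.
    public static IEnumerable<string> Lines(this TextReader source)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }
        return LinesIterator(source);
    }

    // Separate iterator method so that the null check above runs
    // immediately instead of on first enumeration.
    private static IEnumerable<string> LinesIterator(TextReader source)
    {
        string line;
        while ((line = source.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
```

Because the iterator only reads the next line when the query asks for it, at most one line of the file has to be held in memory at a time while the results are being enumerated.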

You can download the full source code from here