Monday, March 24, 2008

Using LINQ with Text Files

I'm currently reading LINQ in Action, which I can highly recommend to anyone who wants to get started with LINQ. One of the first chapters has a nice example of how to use LINQ with text files (originally posted by Eric White). Rather than reading all lines of the file into memory and then querying them, the example uses deferred execution through an extension method on the StreamReader class.
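The extension method itself is not reproduced in this post. As a minimal sketch of the idea, assuming it does nothing more than wrap ReadLine in an iterator (the class name StreamReaderExtensions is my own, and the version in the book or in Eric White's post may differ), it could look like this:

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class; only the method name Lines() comes from the post.
public static class StreamReaderExtensions
{
    // Iterator method: no line is read until the caller starts enumerating,
    // and only one line is held in memory at a time.
    public static IEnumerable<string> Lines(this StreamReader source)
    {
        string line;
        while ((line = source.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
```

Because the method is an iterator, a query such as "from line in reader.Lines() where ..." pulls lines from the file only as its results are consumed.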

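Inspired by this example, I wrote a simple file reader that can read structured non-XML data from text files. As you can see in the Process method in the code below, the text file is queried and processed line by line: each line is read, filtered, and handed to the strategy registered for its type code, which creates the corresponding object on the fly. The text of the file is never held in memory as a whole, so this technique works even with huge files. Note that Process collects the resulting objects with ToList(); if the objects themselves are too many to keep around, return the query instead and loop over it.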

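    using System.Collections;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    public class RecordReader
    {
        private const char Comment = '#';

        // Maps a record type code to the strategy that parses lines of that type.
        private readonly Dictionary<string, IReaderStrategy> _strategies =
            new Dictionary<string, IReaderStrategy>();
        private readonly int _typeFieldLength;

        public RecordReader(int typeFieldLength)
        {
            _typeFieldLength = typeFieldLength;
        }

        public IList Process(StreamReader input)
        {
            // The query is lazy: lines are read and converted one at a time.
            // ToList() is what finally enumerates it and collects the objects.
            var result =
                from line in input.Lines()
                where !IsBlank(line)
                where !IsComment(line)
                select GetStrategy(line).Process(line);
            return result.ToList();
        }

        private IReaderStrategy GetStrategy(string line)
        {
            string typeCode = GetTypeCode(line);
            IReaderStrategy strategy;
            if (!_strategies.TryGetValue(typeCode, out strategy))
            {
                throw new NoStrategyDefinedException("No strategy defined for code " + typeCode);
            }
            return strategy;
        }

        private static bool IsComment(string line)
        {
            // Safe to index: blank lines are filtered out before this check runs.
            return line[0] == Comment;
        }

        private static bool IsBlank(string line)
        {
            return string.IsNullOrEmpty(line);
        }

        private string GetTypeCode(string line)
        {
            // Throws ArgumentOutOfRangeException if the line is shorter than the type field.
            return line.Substring(0, _typeFieldLength);
        }

        // Registers a strategy under its type code, replacing any earlier one.
        public void AddStrategy(IReaderStrategy arg)
        {
            _strategies[arg.Code] = arg;
        }
    }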
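The listing above relies on IReaderStrategy and NoStrategyDefinedException, which are not shown in this post. The sketch below is only a guess at their shape, inferred from how RecordReader uses them (a string Code property and a Process method that takes a line). The Customer class, the CustomerStrategy class, and the fixed-width record layout are hypothetical and exist purely for illustration; the downloadable source may look quite different.

```csharp
using System;

// Assumed shape, inferred from the calls arg.Code and Process(line) in RecordReader.
public interface IReaderStrategy
{
    string Code { get; }
    object Process(string line);
}

// Assumed to be a plain exception type that carries a message.
public class NoStrategyDefinedException : Exception
{
    public NoStrategyDefinedException(string message) : base(message)
    {
    }
}

// Hypothetical record type, for illustration only.
public class Customer
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical strategy for lines such as "CUST0042Alice Smith":
// a 4-character type code, a 4-character id, then the name.
public class CustomerStrategy : IReaderStrategy
{
    public string Code
    {
        get { return "CUST"; }
    }

    public object Process(string line)
    {
        // Substring throws if the line is shorter than the assumed layout.
        return new Customer
        {
            Id = line.Substring(4, 4),
            Name = line.Substring(8).Trim()
        };
    }
}
```

With these in place, a reader for this made-up format would be created with new RecordReader(4), given the strategy through AddStrategy(new CustomerStrategy()), and then handed a StreamReader in Process. Each record type gets its own strategy, so supporting a new kind of line means adding a class rather than touching the reader.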

You can download the full source code from here