08/26/2008

LINQ Distinct, a DataTable and the IEqualityComparer

In a recent situation I was trying to pull some aggregates out of a DataTable using LINQ. I needed to get the rows of the DataTable with a Distinct clause, but my aggregates would be on other columns of the row. The problem is that when you call LINQ’s Distinct() extension method with no arguments, it uses the default IEqualityComparer. For a DataRow, that default compares object references, so every row counts as distinct. Distinct() does work if you first use the Select() extension method to return only the column you want the distinct on, because simple values such as int compare by value. That works great, unless you need more columns from the DataTable.
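To make the difference concrete, here is a minimal sketch. The table layout (a "PersonID" and an "Amount" column), the sample values, and the DefaultDistinctDemo class with its method names are all made up for illustration; only the "PersonID" column name comes from the example in this post.

```csharp
using System.Data;
using System.Linq;

public static class DefaultDistinctDemo
{
    // Hypothetical sample table: three rows, two of which share PersonID 1.
    public static DataTable BuildTable()
    {
        var table = new DataTable();
        table.Columns.Add("PersonID", typeof(int));
        table.Columns.Add("Amount", typeof(decimal));
        table.Rows.Add(1, 10m);
        table.Rows.Add(1, 25m);
        table.Rows.Add(2, 5m);
        return table;
    }

    // Distinct() on whole rows: the default comparer uses reference equality,
    // so no row is ever treated as a duplicate of another.
    public static int CountDistinctRows(DataTable table)
    {
        return table.AsEnumerable().Distinct().Count();
    }

    // Distinct() on a single projected column: ints compare by value,
    // but the other columns are no longer part of the result.
    public static int CountDistinctIds(DataTable table)
    {
        return table.AsEnumerable()
                    .Select(row => row.Field<int>("PersonID"))
                    .Distinct()
                    .Count();
    }
}
```

With this sample data, CountDistinctRows() still sees all three rows, while CountDistinctIds() sees two IDs but has thrown away the "Amount" column.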

The solution here is simple. Write a custom DataRow comparer that compares two DataRows by the column you want the distinct on. Here is an example:

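using System.Collections.Generic;
using System.Data;

// Treats two DataRows as equal when their "PersonID" values match.
public class PersonDataRowComparer : IEqualityComparer<DataRow>
{
    #region IEqualityComparer<DataRow> Members

    public bool Equals(DataRow x, DataRow y)
    {
        return x.Field<int>("PersonID") == y.Field<int>("PersonID");
    }

    // Hash the same field that Equals() compares. DataRow.ToString() returns
    // the type name for every row, so hashing it puts all rows in one bucket.
    public int GetHashCode(DataRow obj)
    {
        return obj.Field<int>("PersonID").GetHashCode();
    }

    #endregion
}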

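Once we implement IEqualityComparer<T> (T being the type we want to do the comparison on), all we do is fill in the Equals() and GetHashCode() methods. In Equals(), we compare the "PersonID" field of the two DataRows and return whether they are equal. GetHashCode() must return the same value for any two rows that Equals() treats as equal, because Distinct() only calls Equals() on rows whose hash codes match. Together, the two methods tell LINQ whether a DataRow is distinct or not.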
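For completeness, here is a rough sketch of how the comparer might be passed to Distinct(). The "Name" column, the sample rows, and the DistinctUsageDemo class are assumptions for illustration. The comparer is repeated in compact form, hashing on "PersonID", only so that the sketch compiles on its own.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public class PersonDataRowComparer : IEqualityComparer<DataRow>
{
    public bool Equals(DataRow x, DataRow y)
    {
        return x.Field<int>("PersonID") == y.Field<int>("PersonID");
    }

    public int GetHashCode(DataRow obj)
    {
        return obj.Field<int>("PersonID").GetHashCode();
    }
}

public static class DistinctUsageDemo
{
    // Hypothetical sample table: PersonID 1 appears twice.
    public static DataTable BuildTable()
    {
        var table = new DataTable();
        table.Columns.Add("PersonID", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "Ann");
        table.Rows.Add(1, "Ann");
        table.Rows.Add(2, "Bob");
        return table;
    }

    // One full DataRow per PersonID, with every column still available.
    public static List<DataRow> DistinctPeople(DataTable table)
    {
        return table.AsEnumerable()
                    .Distinct(new PersonDataRowComparer())
                    .ToList();
    }
}
```

Because the result is still a sequence of DataRows, the other columns (here "Name") remain available for whatever aggregate you need afterwards.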

Hope this helps!

