2009 12 23 any all and aggregate
Published on December 23rd, 2009 at 15:15
For the last few weeks, I’ve been very occupied with re-store, the O/R mapping part of re-motion. Still, there was some time left for three minor re-linq features Steve Strong asked me to implement: support for the All, Any, and Aggregate result operators.
Any can be used to check whether a sequence (or query result) contains any elements satisfying a given predicate, or whether it contains any elements at all:
// got any students called "Garcia"?
var result2 = students.Any (s => s.Last == "Garcia");
// got any students at all?
var result1 = students.Any ();
Both forms of Any are represented by re-linq via instances of AnyResultOperator. Remember: result operators are query operators that act on the whole result set of a query, calculating a single value from the result set or transforming it into a completely different set of items. In this case, a boolean value is calculated from the result set.
Like all result operators, instances of AnyResultOperator can be analyzed by checking QueryModel.ResultOperators, or handled in a visitor by implementing IQueryModelVisitor.VisitResultOperator or overriding QueryModelVisitorBase.VisitResultOperator.
You may notice that the predicate that can be passed to Any has the same semantics as a where clause; i.e., the following two queries are semantically equivalent:
students.Any (s => s.Last == "Garcia");
students.Where (s => s.Last == "Garcia").Any ();
This is similar to other result operators taking optional predicates, such as First or Count, and as with those result operators, re-linq represents such a predicate as a WhereClause. This means that the AnyResultOperator will never hold a predicate; the respective WhereClause can be found in the BodyClauses of the QueryModel holding the result operator.
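The equivalence of the two forms can be checked against plain LINQ to Objects. The following is only a minimal sketch: the Student class and the AnyEquivalenceDemo helper are hypothetical stand-ins invented for illustration, and nothing here touches re-linq itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the items of the students query source.
public class Student
{
    public string First { get; set; } = "";
    public string Last { get; set; } = "";
}

public static class AnyEquivalenceDemo
{
    // Any with a predicate ...
    public static bool AnyWithPredicate (IEnumerable<Student> students, string last)
    {
        return students.Any (s => s.Last == last);
    }

    // ... and the same check written as a where filter followed by a parameterless Any.
    public static bool WhereThenAny (IEnumerable<Student> students, string last)
    {
        return students.Where (s => s.Last == last).Any ();
    }

    public static void Main ()
    {
        var students = new List<Student>
        {
            new Student { First = "Ana", Last = "Garcia" },
            new Student { First = "Bob", Last = "Smith" }
        };

        // Both lines print the same value for any given name.
        Console.WriteLine (AnyWithPredicate (students, "Garcia"));
        Console.WriteLine (WhereThenAny (students, "Garcia"));
    }
}
```

Because both forms give the same answer, a provider only has to understand where clauses and a predicate-free Any to handle either of them.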
All can be used to check whether all elements in a sequence satisfy a given predicate:
// are all students named "Garcia"?
students.All (s => s.Last == "Garcia");
In this case, the predicate is not optional, and it also doesn’t denote a filter with Where semantics. Therefore, the corresponding AllResultOperator created by re-linq always holds the predicate, in its resolved form. “Resolved” means that the LambdaExpression has been simplified by substituting the parameter representing the incoming items (s in the sample above) with an expression describing those items.
In the example, AllResultOperator.Predicate will hold the following: [s].Last == "Garcia". The [s] part is a QuerySourceReferenceExpression that points to the MainFromClause representing the students query source. re-linq always resolves LambdaExpressions that way, so this is the same expression you would see in a SelectClause or WhereClause.
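To make the idea of “resolving” a lambda more concrete, here is a rough sketch of parameter substitution using the standard System.Linq.Expressions.ExpressionVisitor. This is not re-linq’s actual implementation: ParameterReplacer and ResolveDemo are hypothetical names, and a ParameterExpression named "[s]" merely stands in for the QuerySourceReferenceExpression that re-linq would insert.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the items of the students query source.
public class Student
{
    public string Last { get; set; } = "";
}

// Replaces every occurrence of one lambda parameter with another expression.
public class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression _parameter;
    private readonly Expression _replacement;

    public ParameterReplacer (ParameterExpression parameter, Expression replacement)
    {
        _parameter = parameter;
        _replacement = replacement;
    }

    protected override Expression VisitParameter (ParameterExpression node)
    {
        return node == _parameter ? _replacement : base.VisitParameter (node);
    }
}

public static class ResolveDemo
{
    // Returns the lambda body with its first parameter replaced by itemExpression.
    public static Expression Resolve (LambdaExpression lambda, Expression itemExpression)
    {
        var replacer = new ParameterReplacer (lambda.Parameters[0], itemExpression);
        return replacer.Visit (lambda.Body);
    }

    public static void Main ()
    {
        Expression<Func<Student, bool>> predicate = s => s.Last == "Garcia";

        // Assumption: a plain parameter named "[s]" plays the role of the query source reference.
        var querySourceStandIn = Expression.Parameter (typeof (Student), "[s]");

        Expression resolved = Resolve (predicate, querySourceStandIn);
        Console.WriteLine (resolved); // the body no longer refers to the lambda parameter s
    }
}
```

The point of the substitution is that the resulting expression no longer depends on the lambda it came from; it refers directly to whatever describes the incoming items.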
Aggregate is also a result operator; it can be used to accumulate (or … aggregate) all the incoming items into a single value:
students
  .Select (s => s.Kids.Count)
  .Aggregate ((total, kidCount) => total + kidCount);
In this example, the aggregate operator combines all counts by adding them together, much as if I had used Sum instead. But Aggregate is not restricted to additions:
students
  .Aggregate ("", (nameString, s) => nameString + " " + s.Last);
This example concatenates all last names into a single name string. It uses a different overload of Aggregate, one that takes an initial seed value. This overload also allows the aggregated value to be of a different type than the incoming items, so no select clause is needed.
Note that this overload has quite different semantics than the one I used before; compare the following two queries:
students
  .Select (s => s.Last)
  .Aggregate ((nameString, last) => nameString + " " + last);

students
  .Aggregate ("", (nameString, s) => nameString + " " + s.Last);
Apart from the fact that I needed a select clause for the first query, but not for the second one, the result of the second query will include a leading space, whereas the result of the first query won’t. The first result does not hold a leading space because that Aggregate overload uses the first incoming item as the seed value, and the aggregating function is only called for the remaining items. For the second query, the seed is given, so the aggregating function is called for each of the items, including the first one. And this causes one space per item to be included in the second result.
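The difference is easy to reproduce with LINQ to Objects. The sketch below uses a plain string array instead of the students source, and AggregateSeedDemo is a hypothetical helper class made up for this illustration.

```csharp
using System;
using System.Linq;

public static class AggregateSeedDemo
{
    // No seed: the first item becomes the initial value, and the function
    // runs only for the remaining items.
    public static string JoinWithoutSeed (string[] lastNames)
    {
        return lastNames.Aggregate ((nameString, last) => nameString + " " + last);
    }

    // Explicit seed: the function runs for every item, including the first one.
    public static string JoinWithSeed (string[] lastNames)
    {
        return lastNames.Aggregate ("", (nameString, last) => nameString + " " + last);
    }

    public static void Main ()
    {
        var lastNames = new[] { "Garcia", "Smith", "Miller" };

        Console.WriteLine ("[" + JoinWithoutSeed (lastNames) + "]"); // prints [Garcia Smith Miller]
        Console.WriteLine ("[" + JoinWithSeed (lastNames) + "]");    // prints [ Garcia Smith Miller]
    }
}
```

The two overloads also differ for empty sequences in LINQ to Objects: without a seed, Aggregate throws an InvalidOperationException, whereas the seeded overload simply returns the seed.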
Because of this semantic discrepancy, I decided to implement two different result operator classes within re-linq: AggregateResultOperator and AggregateFromSeedResultOperator; the former representing the first overload, the latter the second one. There’s also a third overload, which takes an additional result selector (a LambdaExpression transforming the aggregated value one last time before it is returned). This is semantically identical to the second overload, so it is also represented by the AggregateFromSeedResultOperator (which therefore offers an OptionalResultSelector property).
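As a small illustration of why the third overload fits into the same operator class, the following sketch compares it with the seeded overload followed by a manual transformation, again using LINQ to Objects only. AggregateResultSelectorDemo and its method names are hypothetical and not part of re-linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateResultSelectorDemo
{
    // Third overload: seed, aggregating function, and a result selector
    // that transforms the aggregated value once at the very end.
    public static string JoinWithSelector (IEnumerable<string> lastNames)
    {
        return lastNames.Aggregate (
            "",
            (nameString, last) => nameString + " " + last,
            nameString => nameString.Trim ());
    }

    // Second overload, with the same transformation applied by hand afterwards.
    public static string JoinThenTransform (IEnumerable<string> lastNames)
    {
        var aggregated = lastNames.Aggregate ("", (nameString, last) => nameString + " " + last);
        return aggregated.Trim ();
    }

    public static void Main ()
    {
        var lastNames = new[] { "Garcia", "Smith", "Miller" };

        Console.WriteLine ("[" + JoinWithSelector (lastNames) + "]");  // prints [Garcia Smith Miller]
        Console.WriteLine ("[" + JoinThenTransform (lastNames) + "]"); // prints [Garcia Smith Miller]
    }
}
```

Since the result selector only post-processes the final value, the aggregation itself behaves exactly as in the seeded overload, which is what makes an optional property sufficient.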
Both AggregateResultOperator and AggregateFromSeedResultOperator hold the aggregating function in resolved form; i.e., for the first Aggregate example above, the result operator would hold the following expression: total => total + [s].Kids.Count, with [s] being the reference expression that points back to the MainFromClause representing the students query source.
Note that most LINQ providers will probably have difficulty supporting Aggregate in its entirety, simply because it’s so flexible. You can use Aggregate to do sums, products, divisions, string concatenations, list building, and much, much more. But for some scenarios, it will still be handy to have support for it in re-linq, and like all result operators, AggregateResultOperator and AggregateFromSeedResultOperator both have an ExecuteInMemory method to run the operator on an in-memory sequence if desired.