How to recognize if a method is a query operator
Published on December 20th, 2011 at 9:29
On the users’ mailing list, Alex Norcliffe, Lead Architect of Umbraco 5, describes a problem they are currently facing with re-linq. As an illustration, consider the following two queries, especially the sub-queries within the `where` clauses:
var query1 = from c in QuerySource
where c.Assistants.Any ()
select c;
var query2 = from c in QuerySource
where c.GetAssistants().Any()
select c;
When re-linq analyzes those sub-queries within the `where` clauses, the first query will produce a `SubQueryExpression` with a `QueryModel` whose `MainFromClause` has the following expression: `[c].Assistants`. In other words, the items produced by the sub-query are those identified by the `Assistants` property.
The second query, however, will produce an exception:
Remotion.Linq.Parsing.ParserException : Cannot parse expression 'c' as it has an unsupported type. Only query sources (that is, expressions that implement IEnumerable) and query operators can be parsed.
----> Remotion.Linq.Utilities.ArgumentTypeException : Expected a type implementing IEnumerable, but found 'Remotion.Linq.UnitTests.Linq.Core.TestDomain.Cook'.
Why’s that?
re-linq assumes that all methods occurring in a query operator call chain should be treated like query operators (`Where`, `Select`, etc.). This means that for the sub-query within `query2`, re-linq regards `GetAssistants()` as a query operator and therefore `c` as the start of the query operator chain. And, since `c`’s type does not implement `IEnumerable<T>`, it throws the exception shown above. Even if `c`’s type implemented `IEnumerable<T>`, an exception would be thrown stating that `GetAssistants()` “is currently not supported”, unless one registers a custom node parser for that method.
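To make the difference between the two sub-queries concrete, here is a minimal sketch that only uses `System.Linq.Expressions` and does not involve re-linq at all. The `Cook` class is a hypothetical stand-in for the test domain type named in the exception message, and `ShapeDemo` is an illustrative helper. It shows that the source of `Any()` is a `MemberExpression` in the first case and a `MethodCallExpression` in the second, which is why a parser that treats every method call in the chain as a query operator keeps walking past `GetAssistants()`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the Cook type from the re-linq unit tests.
public class Cook
{
    public IEnumerable<Cook> Assistants { get; set; } = new List<Cook>();
    public IEnumerable<Cook> GetAssistants() => Assistants;
}

public static class ShapeDemo
{
    // Returns the node type of the expression that Any() is applied to.
    public static ExpressionType SourceNodeType(Expression<Func<Cook, bool>> predicate)
    {
        var anyCall = (MethodCallExpression) predicate.Body;
        // Any() is a static extension method, so its source is argument 0.
        return anyCall.Arguments[0].NodeType;
    }

    public static void Main()
    {
        Console.WriteLine(SourceNodeType(c => c.Assistants.Any()));      // MemberAccess
        Console.WriteLine(SourceNodeType(c => c.GetAssistants().Any())); // Call
    }
}
```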
Of course, what Alex actually wanted was for re-linq to treat `query1` and `query2` in an equivalent way, i.e., to produce a `SubQueryExpression` with a `QueryModel` whose `MainFromClause` has the following expression: `[c].GetAssistants()`.
There is an easy workaround for now (see the mailing list), but I'm wondering how we could change re-linq to produce this result out of the box. I can think of two possibilities, both of which have certain drawbacks:
1 - Have re-linq treat `MethodCallExpression`s the same way as `MemberExpression`s. I.e., if the method has a registered node parser, treat it as a query operator. Otherwise, treat it (and all expression parts of the call chain before it) as the start of the query.
This would work quite well in the scenario shown above, and it would be nicely symmetric to how `MemberExpression`s work in re-linq.
However, it would be a significant breaking change with regard to diagnostics. Consider this example, in which `CustomOperator` is actually a custom query operator:
`source.Where(...).CustomOperator().Select(...)`
Currently, re-linq will throw an exception that it can't parse `CustomOperator()` if one forgets to register the respective parser, and the LINQ provider backend won't even get a `QueryModel` to process.
If we change this behavior, the frontend will no longer throw an exception, and the backend will suddenly get a `MainFromClause` with a `FromExpression` of `[source].Where(...).CustomOperator()`. I think it would be difficult for LINQ provider implementers to understand why exactly this occurs. I can even imagine people believing this must be "right" (as no exception occurred) and starting to parse the `Where(...)` and `CustomOperator()` calls manually, effectively reimplementing logic from re-linq…
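The following is a rough sketch of the rule described in option 1, not re-linq's actual implementation. It assumes a hypothetical registry keyed by method name (`RegisteredOperators`), whereas real node parser registration in re-linq is more involved, and the `Cook` and `CustomExtensions` types exist only for illustration. It walks the call chain from the outside in and stops at the first method without a registered parser.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical domain type and custom operator, for illustration only.
public class Cook
{
    public IEnumerable<Cook> Assistants { get; set; } = new List<Cook>();
    public IEnumerable<Cook> GetAssistants() => Assistants;
}

public static class CustomExtensions
{
    public static IEnumerable<T> CustomOperator<T>(this IEnumerable<T> source) => source;
}

public static class Option1Sketch
{
    // Assumption: a simplified registry keyed by method name.
    public static readonly HashSet<string> RegisteredOperators =
        new HashSet<string> { "Where", "Select", "Any" };

    // Returns the expression at which the query would start under option 1.
    public static Expression FindQueryStart(Expression expression)
    {
        while (expression is MethodCallExpression call
               && RegisteredOperators.Contains(call.Method.Name))
        {
            // Instance methods carry their source in Object, extension methods in argument 0.
            var source = call.Object ?? call.Arguments.FirstOrDefault();
            if (source == null)
                break;
            expression = source;
        }
        return expression;
    }

    public static void Main()
    {
        Expression<Func<Cook, bool>> viaProperty = c => c.Assistants.Any();
        Expression<Func<Cook, bool>> viaMethod = c => c.GetAssistants().Any();
        Console.WriteLine(FindQueryStart(viaProperty.Body)); // c.Assistants
        Console.WriteLine(FindQueryStart(viaMethod.Body));   // c.GetAssistants()

        // The diagnostics drawback: the unregistered CustomOperator() silently
        // becomes part of the query start instead of causing an error.
        Expression<Func<IEnumerable<Cook>, IEnumerable<Cook>>> chain =
            source => source.Where(c => true).CustomOperator().Select(c => c);
        Console.WriteLine(FindQueryStart(chain.Body));
    }
}
```

In this sketch, both sub-queries from the beginning of the post resolve symmetrically, but the last call returns the whole chain up to and including `CustomOperator()` without any complaint, which is the behavior the paragraph above warns about.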
2 - Have re-linq only treat `MethodCallExpression`s called on enumerables as query operators. Otherwise, treat them (and all expression parts of the call chain before the method call) as the start of the query.
This would also work in the given scenario, and it has the advantage of still providing good diagnostics when methods taking `IEnumerable<T>` have no associated expression node parser. However, it's still a heuristic way of parsing, and it is asymmetric (both with `MemberExpression`s and in itself). Consider the following three examples:
`instanceImplementingEnumerable.StartQueryHere().Where(...)
instanceNotImplementingEnumerable.StartQueryHere().Where(...)
instanceImplementingEnumerable.StartQueryHere.Where(...)`
re-linq would parse the `StartQueryHere` method in the first example as a query operator (and throw an exception if there isn't an expression node parser registered for it). The `StartQueryHere` method and property in the second and third examples, on the other hand, would parse just fine. I believe this would be just as difficult to understand.
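The asymmetry of option 2 can be illustrated with a small sketch along the same lines. Again, this is not re-linq code: the name-based `RegisteredOperators` set and the three sample classes are assumptions made purely for illustration. A method call counts as a query operator only when the expression it is invoked on implements `IEnumerable`.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical types matching the three examples above.
public class EnumerableWithMethod : List<int>
{
    public IEnumerable<int> StartQueryHere() => this;
}

public class PlainWithMethod
{
    public IEnumerable<int> StartQueryHere() => new[] { 1, 2, 3 };
}

public class EnumerableWithProperty : List<int>
{
    public IEnumerable<int> StartQueryHere => this;
}

public static class Option2Sketch
{
    // Assumption: a simplified registry keyed by method name.
    public static readonly HashSet<string> RegisteredOperators =
        new HashSet<string> { "Where", "Select", "Any" };

    // Returns the expression at which the query would start under option 2.
    public static Expression FindQueryStart(Expression expression)
    {
        while (expression is MethodCallExpression call)
        {
            var source = call.Object ?? call.Arguments.FirstOrDefault();
            // A call that is not made on an enumerable starts the query.
            if (source == null || !typeof(IEnumerable).IsAssignableFrom(source.Type))
                break;
            // A call on an enumerable is a query operator and needs a parser.
            if (!RegisteredOperators.Contains(call.Method.Name))
                throw new NotSupportedException(
                    $"No node parser registered for '{call.Method.Name}'.");
            expression = source;
        }
        return expression;
    }

    public static void Main()
    {
        Expression<Func<EnumerableWithMethod, IEnumerable<int>>> first =
            s => s.StartQueryHere().Where(i => i > 0);
        Expression<Func<PlainWithMethod, IEnumerable<int>>> second =
            s => s.StartQueryHere().Where(i => i > 0);
        Expression<Func<EnumerableWithProperty, IEnumerable<int>>> third =
            s => s.StartQueryHere.Where(i => i > 0);

        try { FindQueryStart(first.Body); }
        catch (NotSupportedException ex) { Console.WriteLine(ex.Message); }
        Console.WriteLine(FindQueryStart(second.Body)); // s.StartQueryHere()
        Console.WriteLine(FindQueryStart(third.Body));  // s.StartQueryHere
    }
}
```

Only the first expression fails in this sketch, even though all three look nearly identical at the call site.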
What do other people think of these two options? If you want to see this scenario supported out of the box, please give me some feedback about it on the developer mailing list: http://groups.google.com/group/re-motion-dev/t/f9f6198bbbecd796.
- Fabian