-
Notifications
You must be signed in to change notification settings - Fork 58
2009 04 29 is re linqs transformation model powerful enough
Published on April 29th, 2009 at 16:30
As I explained in my last post, the idea of re-linq is to transform ugly IQueryable-based ASTs into nice query expression ASTs (from … select).
A discussion with Frans Bouma, distinguished LINQ tamer, brought up a point we discussed for quite a while internally: does this approach limit our ability to express every conceivable LINQ query? And if not, will it still provide value, or will this result in ASTs that are just as hard to use as IQueryable ASTs, only different?
Frans brings up the following example:
from c in md.Customer.Where(x=>x.Country=="Germany")
join o in md....
select c.Orders;
Initially we were thinking that re-linq would just lead to a 80+% solution, because using Queriable’s extension methods just lets you do things that cannot be expressed directly using query expressions. (That was good enough for us anyway, since all we wanted to do back then was to get serious – but not necessarily complete – LINQ support for re-store, our own ORM. So we just went ahead.)
When we thought about turning re-linq into a much more generic beast, we revisited this assumption and discovered that for any direct use of Queryable there is not only a way to express the same using query expressions (from … select notation). It is also easier to process these expressions, because when we get down to it, using a full LINQ provider is only useful if we transform the query to a powerful enough query language, like SQL, HQL or Entity SQL. And those languages usually support sub-queries, but none of them supports anything even remotely similar than direct invocations on Where(), SelectMany() etc.
For Frans’ example, that would be:
from c in (from x in md.Customer where x.Country=="Germany" select x)
join o in md....
select c.Orders;
The transformation result might not be shorter, but it’s much easier to process and transform because
- it is structurally similar to the output model,
- you can use existing transformations in a modular way for sub-queries, and
- everything strange (like transparent identifiers) has already been dealt with at that point.
A more intelligent transformation could transform the whole thing to just
from c in md.Customer
join o in md....
where x.Country=="Germany"
select c.Orders;
These queries are semantically identical (i.e., there should only be a difference if you execute these statements imperatively using LINQ to objects). This is a very interesting transformation that does a nice optimization. For any simple SQL-emitting back-end, this would also lead to much cleaner SQL queries without the back-end even being aware of this particular optimization).
At the end of the day, this is a matter of how many special cases you want to support. LINQ to SQL is known to go out of its way to produce the simplest SQL possible even for frightening LINQ queries. But some optimizations might be even easier to achieve using re-linq, because the patterns are more easily recognizable.
Either way, moving this kind of optimizations into a shared OSS project sounds more promising to me than solving it for every single provider. It doesn’t only reduce the overall effort, it also means that all optimizations that are done on the query model itself can be shared between providers. Everyone who thinks that re-linq cannot do some trick that they absolutely need is welcome to contribute to re-linq. (For instance, we here at rubicon are not particularly interested in optimizing LINQ joins, because joins are usually implicit in ORMs, and a good DB like SQL server would recognize the pattern anyway and come up with identical execution plans for both queries. But if someone else builds this optimization (and it won’t cost us an arm and a leg in terms of performance), we will gladly use it.)
Now we’re not there yet, and I cannot promise that we’re not going to discover situations that are more difficult to solve than we anticipated, but we’re confident that this is not a dead-end approach that will just fail for more complex scenarios. The design behind re-linq’s transformations is not trivial, and we are quite confident that it is a solid ground to build on. I’m sure Fabian will find some time to go into internals when he’s back at work.
- Stefan