We have moved to GitHub Issues
Created by Shannon Deminick 24 Oct 2016, 12:19:30 Updated by Stephan 28 Oct 2016, 15:00:18
Subtask of: U4-9085
Analyze the performance of our Lambda expression tree parsers at our service layer - often times (like in NH) these can prove to be very expensive!
If they are bottlenecks, we need to fix this - could be by caching the parser result (or similar)
4 hours and report back on whether or not these perform badly.
Some notes about this:
Parsing expression trees is not super fast and there's probably not a lot we can do to improve the speed of this without getting into some really nitty gritty details. The bottleneck here is how .Net itself compiles parts of the expression tree. Some interesting articles:
We could look into caching expression tree compilation but I think this is above and beyond where we should go, but some interesting stuff here:
... but I'm not expression tree expert and it seems the slowest part is the Compile() method of a LambdaExpression instance and in some cases people say you should cache the result ... but the way we create these in the parser varies and I'm not sure for what 'key' we'd cache the result against. So I'm not going to pursue changing these BUT I would like to know if anyone has experience with this to look through the
BaseExpressionHelper to determine if we can optimize these calls to
What I've looked into instead was what our expression tree parser does:
This means that we really don't need to do the first part if it's already been done once because this SQL structure and placeholder parameters will always be the same for the same expression. So I've done some benchmarking with using a new class called CachedExpression which our parsers will check for. If this object is used and it's already been processed, then no string manipulation will be done to produce the SQL statement, but the visitors will still execute to produce the list of SQL parameters to use. This saves roughly 25% of the overall performance:
|| Method || Median || StdDev || Scaled || Scaled-SD || Gen 0 || Gen 1 || Gen 2 || Bytes Allocated/Op || | WithNonCached | 35.1592 ms | 3.4485 ms | 1.00 | 0.00 | 46.44 | 14.78 | - | 558,702.44 | |WithCachedExpression | 26.3088 ms | 0.4446 ms | 0.73 | 0.06 | 37.00 | 18.00 | - | 524,566.87 |
So we can certainly implement this change, a 25% performance improvement isn't nothing.
We then need to look at all of the places in the Core that uses this, it is anywhere that a
Query<T>.Builder.Where method is used. We use this quite a lot: 155 usages in Umbraco.Core and 1 in UmbracoExamine and of course lots in tests.
We can actually replace some of the usages of these with static/singleton references for them because the parameters never changes. For example:
Query<IContent>.Builder.Where(x => x.Published == true);
We shouldn't be re-generating this every time we want to use it, it should be created once and re-used.
These expression tree parsers are also used in the strongly typed
PetaPocoSqlExtensions usages of SQL, such as:"
var subQuery = new Sql() .Select("umbracoNode.id as nodeId") .From<ContentDto>(_syntaxProvider) .InnerJoin<NodeDto>(_syntaxProvider) .On<ContentDto, NodeDto>(_syntaxProvider, left => left.NodeId, right => right.NodeId) .WhereIn<ContentDto>(dto => dto.ContentTypeId, contentTypeIds, SqlSyntax) .Where<NodeDto>(x => x.NodeObjectType == NodeObjectTypeId);
Each one of those Where
PocoToSqlExpressionHelper so there's also overhead in producing these queries (along with the reflection used for the strongly typed joins, etc...). There's over 140 instances of
Where<T> being used in the core that does this.
Another alternative for some of our queries is to just not use Expressions to build the SQL and instead just use SQL. We definitely need to keep using the expression trees when passing around an IQuery
So moving forward, if we want to use the CachedExpression mentioned above, I think we'd need to create a new
IQuery<T> interface called
IQuery2<T> (temporary until v8). The implementation of
Query<T>) could have a static method:
Build(CachedExpression) (instead of the current
Query<T>.Builder method). This would allow the passed in CachedExpression instance to be passed to the creation of
ModelToSqlExpressionHelper and then wrap the
predicate. Then we can create CachedExpression instances for any query that needs to be executed at the service or repo level.
We could then create an overload for the
Where<T> PetaPoco extension method to pass in an instance of CachedExpression for it to be used when creating a
... this would give us 25% performance improvement when generating these queries but it's a little bit of work.
PR is here https://github.com/umbraco/Umbraco-CMS/pull/1553 for:
New tasks should be created if we want to do the above
NOTE: We should defo implement the above changes for 7.6 when we have UmbracoDeploy using our internal queries since Deploy will be querying quite a lot (i.e. like Courier does) so this will have an good performance impact on that.
Just to be sure, the new CachedExpression thing is not used anywhere, so it's just provision for future usage? This, and... the expression would still be visited on every execution, in order to populate the SQL parameters, only the SQL string would not be rebuilt?
think I got it, all good
Backwards Compatible: True
Due in version: 7.5.5
Sprint: Sprint 45
Story Points: 1