We have moved to GitHub Issues
Created by Dennis Boonk 13 Dec 2017, 09:23:50 Updated by Sebastiaan Janssen 18 Apr 2018, 11:53:47Tags: Gold partner
Relates to: U4-10833
Relates to: U4-10854
Relates to: U4-11208
we're currently building a website with a lot of articles and when we've imported a lot of articles last week (8000+ articles) our website became pretty slow. Some pages took more than 8 seconds to render, and we didn't have these problems when we had only around 100 articles. We looked at the problem and found out the following:
takes about 40ms to execute, if it’s a guid of a node near the end of the umbraco.config. This is because in the core the following xpath query is executed:
"//* [@isDoc and @key=$guid]”
This way it is traversing the complete tree to find the content with a given key/guid.
If we try to get the same content, but look it up via the id (instead of the UDI) is takes less than 1 ms:
If we use the Umbraco.TypedContent for 100 items the difference is 100ms (100x1ms) with ids vs 4000ms (100x40ms) with guids/udis.
So the loadtime by searching by UDI is almost 40 times as slow!
In our case we've used a MultiNodeTreePicker, and TypedContent is used on the output of that. The given site has a complex structure with a lot of multilingual nodes and MultiNodeTreePickers.
In our opinion this is something that should be fixed in Umbraco Core, but I hope you can shine your light on it!
Kind regards, Dennis
What happens is... all we have is this big Xml document. When you get-by-id (integer id) we can use the find-node-by-id feature of Xml documents which is somewhat optimized (no idea how, but ids are a special thing in Xml so I can imagine the document having some sort of index). When you get-by-guid there is nothing we can use, so we rely on an XPath query which has to be slower, yes.
It is definitively something that you would expect Core to manage.
We would need to build an additional cross-reference map of guid-to-id somewhere in the content cache, above the Xml document. So... I'd say feature request, which makes total sense.
Hi @zpqrtbnk, thanks for your input. But a feature request sounds like 'not soon in core'. Could you recommend anything for a shortterm hack/fix/work-around?
Only proper solution I can see is to implement your own guid-to-id map, which means hook into "published" events and register guid-to-id in some sort of dictionary. Conversely, handle "unpublished" events and clear the dictionary. Then, instead of getting-by-guid, do a quick dictionary lookup and then get-by-id.
@zpqrtbnk What about v8? With no XML cache, I'd imagine things will look different...
v8 will have both an id-to-content and a guid-to-content lookup indexes, so perfs should be identical
Hi @zpqrtbnk, Thanks for your input. Is it possible that we can inject our own implementation of the umbracohelper so we can override the TypedContent method and use a guid-to-id-map? Therefore, every property value converter and piece of code who uses the umbracohelper uses the new class. So we can tackle the problem at one place.
Alternatively, inject by default a custom ContentQuery in the umbracohelper. I am hoping to fix the problem at one place for every piece of code that uses the guid to content method. A fix for umbraco, property value converters, razor and our custom code.
What do you think is the best approach to fix the performance problem?
@zpqrtbnk Since all built-in property editors now use UDIs and thus suffer from this issue, would it not make sense to implement this guid-to-id map in Core as soon as possible in v7? This is a serious performance issue, which is definitely noticable whenever anything is done with lists of IPublishedContent where each item references additional IPublishedContent items through e.g. MultiNodeTreePicker properties, as that triggers many lookups with GUIDs for each item through the new property value converters. While v8 seems really cool so far (also enjoyed your demo at UK Fest this year), it might still take a while to be released and a fix for v7 in the meantime would be greatly appreciated :)
Agree with @dknippers, we are currently at a standstill and have had to implement some terrible bandaid-caching to get us by until we begin the build of the suggested guid-to-id mapping solution. I would raise the priority of this to Critical. Also worthy to note, waiting and upgrading to v8 may introduce a load of other bugs that we may not be ready for just yet.
This issue is the primary performance bottleneck on a website that I've worked on and recently launched. It makes heavy use of multinode tree pickers to reference multiple pieces of content from another piece of content.
We've had to implement our own caching to mask the issue. Which is turn has created more issues, due to cache invalidation and other considerations.
I am very keen to see this addressed within 7.x
A quick chime on this one (from a current XML perspective), I spent some time this morning looking into if the
<!ATTLIST ... could support multiple
ID attributes, but unfortunately it can not.
Sharing in case useful - for anyone else looking to work-around the issue or if any use for core: https://github.com/AndyButland/UmbracoUdiToIdCache/
Have put this together working with Peter who commented above - currently untested more than a quick check over with a local starter kit, so if you do want to use it please do so your own risk, and check carefully on larger sites if it's actually helpful, due to the fact that it'll use up memory and will have a start-up hit due to the database call. Wouldn't normally put something out with so little testing but, if you take that caveat, it may help if someone is running into urgent performance issues and is happy to take on the testing task.
@leekelleher funny - I did the same.
We will definitely prioritise this for an upcoming v7.minor during this (Q1 2018) quarter
An update: We'll be rolling out a small performance fix for next patch that should improve performance by up to 30%. Then we'll be implementing a GUID->ID lookup table for 7.9 that should really increase performance (7.8 is being wrapped up for release so it won't be able to make it for that release).
Thanks @hartvig, great news! Is there an ETA on the patch?
The patch is a part of vLatest. 7.9 is end of Q1.
Added a related issue for the 7.7.9 release notes U4-10833
FWIW, I've proposed a temporary solution here: https://github.com/umbraco/Umbraco-CMS/pull/2398
I've no expectations of it being merged in as is - feel free to use it as a discussion point.
Unfortunately we didn't have enough time to put this into 7.9, I'm moving this to 7.10 as that is the first opportunity to add new database tables.
Hi @sebastiaan, that is indeed really really unfortunate :(... Do you have any clue about the release date of 7.10?
So... let's get this done.
Umbraco already has an id/guid cache which was added for Deploy because Deploy runs exclusively with guids and needs to do a lot of id/guid mapping. The idea is to reuse this cache. Which, technically, is not very different from the package @abutland proposed. And, it has the benefit of working for documents and medias.
Indeed, we will need to deal with medias. The solution by @leekelleher which was merged in 7.7 and relies on the Xml cache, only supports documents;
Until now the cache started empty, and each cache miss triggered a database hit. I have modified the cache to fully load every know id/guid pairs for documents and medias on the first query. New documents and medias still cause a cache miss and a database query.
So, compared to @leekelleher solution, this might be slower on the first query. But it deals with medias, which we cannot do with Xml and XPath only. Things will be different in v8 where the cache knows about published and unpublished documents, and medias.
So I guess this is a Request for Comment: what do ppl think about the solution?
@zpqrtbnk Looks like a fine solution to me! :-)
review: on any site, ideally with some content, create a view that does TypedContent(
So the big difference betweeen @leekelleher's approach and this one is that this one requires a DB hit whereas Lee's doesn't since it just uses the already cached values from the config file.
I understand that the PR didn't address media, however media is a totally different story in the first place and would be much faster than the problem that exists for Content. The Content performance problem is due to the XPath query being slow. Media on the other hand already used the IdKMap (lazily) and then queries against lucene.
My concern here is that now the content lookups always require a DB hit whereas with Lee's PR: https://github.com/umbraco/Umbraco-CMS/pull/2398/files this is not required.
I would like to see what is faster. I suppose a rudimentary test would be to have a debug trace for the PopulateLocked method.
What i'm also very confused about is that Lee's PR was merged in this commit: https://github.com/umbraco/Umbraco-CMS/commit/41c96e6e2636c521511565456c77f0b4e9984420 ... but these changes are not actually part of the dev-v7 branch !? where did these changes go?
@Shandem Very weird about the changed disappearing in the dev-v7 branch. I've been looking over my local repo to spot when it happened, (as GitHub didn't show me much), pinpointed it to this commit https://github.com/umbraco/Umbraco-CMS/commit/19ec26fef997347877126e418bc7b3c1870236df (again GitHub doesn't show the actual revert, but local GitExtensions does). I can't really comment on the Git merging strategy here - maybe @sebastiaan knows more?
Yea @zpqrtbnk has figured it out, somewhere in branch merging it's disappeared but think the fix is still in releases (i'm not totally sure which ones)
It was a huge merge and it looks like I've made a mistake in the manual merge there, it has not been applied to 7.8 and 7.9, I suggest we apply Lee's interim fix to the next 7.9 again so that it's available in the next patch release.
revisiting the PR - have unearthed other places where perfs would be slow - trying to make sense of it all
Soooo I have updated the PR - we don't do pre-loads from database anymore, but the content cache pre-populates the idkMap with those id/keys it knows - and the rest comes from database. Back into review.
@zpqrtbnk I don't think you pushed you code, see https://github.com/umbraco/Umbraco-CMS/pull/2505/commits
All looks good, I've made some comments on the PR and pushed one change.
I've tested and stepped through the code:
PublishedContentCache.GetKeyForIdget their values from the pre-populated cache
Going to merge and mark as fixed but @zpqrtbnk please let me know if there's any changes I've done that you don't like :)
@Shandem Should the dependency on
Running 7.10.0 on Windows Server 2008 with .net452 causes a YSOD due to missing dependency, we've resolved by manually installing package as we can't update to > .net46 in this environment.
We are migrating site to to Azure where I guess additional dependency not needed?
Ack, yes, yes it should! I'll update asap
Type: Feature (request)
Backwards Compatible: False
Affected versions: 7.7.6, 7.7.7
Due in version: 7.10.0
Sprint: Sprint 81
Story Points: 3