U4-9494 - Left outer join query for Documents is very inefficient when querying for multiple nodes

Created by Shannon Deminick 06 Feb 2017, 23:28:33 Updated by Sebastiaan Janssen 09 Feb 2017, 20:02:29

Relates to: U4-9450

When querying for document data we need to retrieve both:

  • The 'newest' document version information
  • The 'published' document version information

In some cases, this the same version information but if there are pending changes then these will be different. The query we run to fetch a document uses a Left Outer Join to join the cmsDocument to cmsDocument to fetch both version information pieces. This is ok when fetching a single document and saves us an extra query to get the published document version data, however when fetching multiple document's, we need to sort this result and when performing this left outer join + the sorting of the results spikes SQL server CPU activity immensely, this is very inefficient.

Instead, what we do is:

  • For single document queries, we keep the outer join, when we build the entity from the dto we use a single dto which contains version information for both newest and published
  • For multi document queries, we remove the outer join which means the first query only contains 'newest' version info, we then run a second query to fetch all published version information for the current dataset, then when building the entity from the dto, we use 2x dtos, one for the main document and one for the published version information

This change is a huge performance improvement especially for scenarios where we query multiple content items:

  • Rebuilding Indexes
  • List Views
  • Content pickers
  • (probably lots more)


Shannon Deminick 06 Feb 2017, 23:28:43

The PR for this is combined with http://issues.umbraco.org/issue/U4-9450

Priority: Normal

Type: Bug

State: Fixed


Difficulty: Normal


Backwards Compatible: True

Fix Submitted:

Affected versions:

Due in version: 7.5.10

Sprint: Sprint 52

Story Points: