U4-9545 - EntityRepository performs poorly for Media due to unnecessary outer joins

Created by Shannon Deminick 20 Feb 2017, 04:46:06 Updated by Claus Jensen 27 Feb 2017, 11:06:25

Relates to: U4-9546

Relates to: U4-9547

Relates to: U4-9470

Subtask of: U4-9548

Using the EntityService should generally be faster than using the MediaService for returning simple entities. Media however is treated slightly differently with the EntityService because a media item's property values are required to fill it's metadata - for things like the URL of the media item. The SQL for media therefore returns it's property collection, so this will be generally slower than for content or members. However, there is a problem with the media query, it is also performing some outer joins on cmsDocument to get the newest/published versions, but media doesn't actually have this data so these joins are irrelevant and costly.

Comments

Shannon Deminick 20 Feb 2017, 05:01:19

There is code here: http://issues.umbraco.org/issue/U4-9470 that already does this so we need to extract it.


Shannon Deminick 20 Feb 2017, 07:15:36

PR: https://github.com/umbraco/Umbraco-CMS/pull/1763


Shannon Deminick 20 Feb 2017, 07:33:10

Testing:

  • Review code, ensure tests pass
  • Ensure media section works with folders at several levels, ensure tree + list views + grid views all work
  • Ensure content tree works as expected
  • Ensure content/media/member pickers work as expected
  • When paging in the media section, make sure that REST calls to GetChildFolders are not being made


Mads Rasmussen 21 Feb 2017, 18:33:27

The reason why we have the getChildFolders end point is so we are able to get the folders across all pages of a node. I don’t know if there is more performance friendly approach to do this but after the “fix” we now only see the folders for the current page. This means that if I want to navigate my media section, using the dashboard, I have to guess on which page the folder I want to navigate to is.

Other than that, I have made the following tests:

I have made the following tests:

  • Media section ** Upload works ** Pagination works ** Data type paging settings works ** Folders in folders work ** Delete, Recycle bin + empty recycle bin works ** Both media grid and media list works (list view layouts)
  • Content tree ** Nested nodes works ** List views work
  • Pickers ** Tree picker (content, media member) ** I can pick nodes + nested nodes in several levels ** I can pick list views ** I can pick nodes from list views ** I can pick folders + nested folders
  • Media picker ** Upload files works ** I can navigate media library ** I can select images + folders in several levels


Shannon Deminick 22 Feb 2017, 02:32:47

madsrasmussen where have you made these changes, there is nothing in git?

Ok, so the getChildFolders is the whole cause for the media section poorly performing, we will need to refactor how this works in c# then - though there's still nothing stopping people from having 1000 folders under a single node, so I don't know how you want to handle that? We surely don't want to show 1000 folders?

I'll have to revert my change and then refactor the c# side of things, however we need to change the JS because it re-requests all of these folders on every page change when it only needs to request this once.


Shannon Deminick 22 Feb 2017, 06:25:50

@madsrasmussen Ok, after a lot of work today, I've got this going. I'll spare you the details of what I did and then ended up doing (it's all in the git commit notes). The important part is this:

  • The call to getChildFolders is back, however it only makes this call once per parent node instead of everytime you change a page
  • The call will return a max of 500 folders, if people have more than that they won't be returned, if we want to allow this then the UI needs to be changed to page the folders - which we support now
  • The getChildFolders call is optimized now, so it won't totally slow down the media section, the folder query is now querying with SQL instead of filtering all results in memory

If you can test that would be great!


Mads Rasmussen 22 Feb 2017, 13:16:23

I have tested again and now everything works as expected 👍


Shannon Deminick 22 Feb 2017, 13:19:03

Great! Maybe get Emil or Stephen to have a quick look at the c# and get it pulled in ;) hope you saw my notes about the 500 Max?


Shannon Deminick 22 Feb 2017, 13:19:24

And about only loading the folders once?


Mads Rasmussen 22 Feb 2017, 18:00:17

Yes, only loading the folders once makes a lot of sense 😃 and it works as described. I think if we add a limit to the folders we also need to handle if anybody exceeds that limit. Event though 500 folders for one node sounds like an edge case, the folders, including the images, would “disappear” if you had more. Has the pagination options been added to the getChildFolders end point yet?


Shannon Deminick 23 Feb 2017, 04:47:26

@madsrasmussen Yes the paging is already available, in my code notes I mentioned that if we want to allow that then the UI will have to change to support it. Do you want to create a new task for that?


Shannon Deminick 23 Feb 2017, 04:48:07

@zpqrtbnk can you have a quick look at this PR c# code and complete this task if everything is ok?


Mads Rasmussen 23 Feb 2017, 08:15:52

I have made this tasks for the 500 folders limit: http://issues.umbraco.org/issue/U4-9561


Claus Jensen 27 Feb 2017, 11:06:20

I've reviewed the C# part of this and it seems good to me - basically removing a lot of the extra stuff for content not needed for media hogging the performance. Unit tests run and in my tests it works in the backoffice too - to confirm that you haven't removed "too much".

I'm a bit surprised about the folder check just being a alias check if the contenttype ends with "Folder", but that's apparently not something new :)


Priority: Normal

Type: Bug

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions:

Due in version: 7.5.11

Sprint: Sprint 53

Story Points:

Cycle: