U4-7312 - GetBigThumbnail is slow when working with large media folders

Created by Wincent 27 Oct 2015, 10:20:18 Updated by Wincent 02 Feb 2016, 15:45:26

If you work with media folders that have more than a few hundred images uploaded the thumbnails take long time to download.

This affects media picker and media library in umbraco backoffice.

If I do a test where everything runs locally it takes about 10 seconds to open a media library folder which contains ~700 images

2 Attachments

Comments

Wincent 31 Oct 2015, 12:15:56

I ran a more realistic test on this by throttling the bandwidth in chrome network inspector to 2mbit/s Loading the same media folder takes about 3 minutes with this setting.

The problem is that the output from GetBigThumbnail is not cached so the user has to wait another 3 minutes to open the same media folder again.

Would it be possible to use ImageProcessor for this task and enable cache headers?


Wincent 31 Oct 2015, 15:46:38

Made a pull request with proposed fix: https://github.com/umbraco/Umbraco-CMS/pull/875

Thumbnail paths are replaced in mediahelper.sevice.js and fileupload.controller.js

Please note that canvasdesigner/editors/background.js and propertyeditors/imagecropper/imagecropper.controller.js is not changed.


Wincent 31 Oct 2015, 15:48:09

Ran the same test again and once the thumbnails are cached by image processor this is the result.

10 seconds instead of 3 minutes :)


Nicholas Westby 31 Oct 2015, 17:39:04

@Wincent Some questions. Does the new method reload images if they've changed (e.g., if you upload a new file (with the same file name) for a media item)? My guess is not (not sure if the old method did, but my guess is yes). Also, does this work OK (as in, not exploding) for non-image items (e.g., PDF's)?


Wincent 01 Nov 2015, 08:39:54

@Knickerbocker - Ran som quick tests

Does the new method reload images if they've changed (e.g., if you upload a new file (with the same file name) for a media item)?

  • Tried uploading a new image with the same name on an existing umbraco file using media library.
  • It does not refresh
  • How can the cache be invalidated?
  • Would it help to use the GetCropUrl() method instead?

Also, does this work OK (as in, not exploding) for non-image items (e.g., PDF's)?

  • Tried uploading a pdf file
  • Yes, pdf file gets a generic document thumbnail in media library and media picker


Wincent 01 Nov 2015, 12:46:20

It seems that the media thumbnails gets invalidated on upload here: https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web/PropertyEditors/FileUploadPropertyValueEditor.cs#L151

Is this a good place to also invalidate ImageProcessor cache?

"There are only two hard things in Computer Science: cache invalidation and naming things." -- Phil Karlton


Wincent 01 Nov 2015, 12:50:35

It also seems that @Shandem agrees that it would be a good idea to replace the media thumbnail generation with the same process as is used by frontend web :)

https://github.com/umbraco/Umbraco-CMS/blob/389724cba5a1576c65e447d44989b3e19eaaa690/src/Umbraco.Web/Editors/ImagesController.cs#L107


Wincent 07 Nov 2015, 13:36:27

@Knickerbocker - Did som further digging into this Actually, ImageProcessor already handles invalidation of cache if the source image is changed as is documented here: http://imageprocessor.org/imageprocessor-web/imageprocessingmodule/#caching Since ImageProcessor will set a rather long max-age the thumbnail will not be refreshed unless the user makes a reload.

I could go ahead and fix the cache invalidation problems but I would need some input first. A possible solution would be to change the editor value of FileUploadPropertyValueEditor from a single string to a JSON value that stores the image path and last write time of the source image. This value could be passed as a query string parameter to ImageProcessor.

What do you think?


Nicholas Westby 09 Nov 2015, 07:19:56

@Wincent You could make a request to the server to get metadata about the image (such as last update time). I believe the media item itself (rather than the actual file) should have that information. Then, as you say, you could pass that to ImageProcessor (as a query string, to bust the cache).

To speed things up, you could request that information in batches (i.e., make one biggish request rather than several hundred small requests).

I would not recommend updating the storage format of specific property editors (too invasive).


Wincent 09 Nov 2015, 08:34:30

@Knickerbocker - Sounds like a good idea to keep the format backwards compatible.

Will post here when a fix is ready.


Wincent 09 Nov 2015, 21:00:38

@Knickerbocker - Made media panel and media library generate ImageProcessor urls that include media update date. Also made GetBigThumnbail redirect to the new url format to keep everything backwards compatible.

Pull request updated here: https://github.com/umbraco/Umbraco-CMS/pull/875


Wincent 09 Nov 2015, 21:05:56

Also noticed that the thumbnail does not get updated if I upload a new image with the same name on an existing media in media library.

Perhaps this is beyond the scope of this update?


Nicholas Westby 09 Nov 2015, 21:14:38

@Wincent Would be nice if the thumbnail updated, but I just checked on 7.3.1, and it does not currently update the thumbnails when the media item is updated with a different image that has the same file name. So, it seems like an existing issue that is beyond the scope of this update (though it would be nice if you could fit that in).


Wincent 10 Nov 2015, 20:17:48

@Knickerbocker - A possible solution would be to do something like this: https://github.com/Phosworks/Umbraco-CMS/compare/U4-7312...Phosworks:U4-7312-optional?expand=1

Also, it should be safe to remove generation of _thumb and _big-thumb on every upload, since these files will not be used anyway.

Should I add these fixes to the existing pull request?


Nicholas Westby 11 Nov 2015, 15:43:14

@Wincent Not sure I follow those commits. Looks like you are adding a query string parameter ("index"). My assumption is that you are using that to bust any caching that may be in place.

If that's the case, I'd say you could do similar for other places displaying media. In fact, that's what I thought you were already doing with the "rnd" query string set to the modified date of the media item. Even if the file is the same, so long as it has a different modified date, it ought to avoid caching.


Wincent 11 Nov 2015, 20:32:10

@Knickerbocker - yes it is a bit complicated :)

The 'file.thumbnail = thumbnailUrl + "&index=" + index' exercise is there just to make ng-repeat understand that ng-src could be different. If the new ng-src is the same as the previous value it will just not create a new img element. The reason that ng-src does not change is that it uses the GetBigThumbnail method that will redirect to the new thumnbnail url

A more clean solution would be possible if that particular context had access to entity.updateDate. However, I couldn't figure out how to do that without making big changes to the codebase.


Wincent 21 Nov 2015, 14:18:39

Maybe there should be a new issue for this, since the original problem is fixed by the pull request?


Nicholas Westby 22 Nov 2015, 20:06:19

@Wincent Makes sense to me :-)


Shannon Deminick 25 Nov 2015, 09:38:14

If there is a different issue that this PR doesn't fix, please create it and link it here.


Shannon Deminick 25 Nov 2015, 09:44:33

Apart from that, this looks like a good fix. The main issue of performance here being the SQL queries required which is certainly not ideal. I will pull in this PR once i hear back from a comment I've posted there.


Shannon Deminick 25 Nov 2015, 19:27:21

Hi all,

I've reverted this PR. We are releasing 7.3.2 very soon and unfortunately this won't be able to make it in there due to issues.

It would be much apprecated to submit another PR for the next version but please be sure to test:

  • The standard image upload property editor
  • The grid
  • the image cropper
  • any other place where a media image is being rendered.


Wincent 26 Nov 2015, 12:17:57

Ok, what is the expected schedule for next release?

I have created a new issue for the refresh-bug here: http://issues.umbraco.org/issue/U4-7463


Shannon Deminick 26 Nov 2015, 12:54:21

7.4 beta should be out in a couple weeks, with any luck 7.4 will be out before the end of the year.


Wincent 03 Dec 2015, 17:00:06

Made a new pull request: https://github.com/umbraco/Umbraco-CMS/pull/937


Shannon Deminick 28 Dec 2015, 11:02:38

I've added a comment to the PR that needs fixing


Stephen Adams 08 Jan 2016, 23:10:39

@Shandem is there any possibility that this will be included in a 7.3.x release? 7.4 looks to be quite far out, and we're running into severe performance issues on the backend because of this (our site is using images HEAVILY as well as leveraging AWS S3). Problem w/ 7.4 is with any major release, we're probably better off waiting for 7.4.1 and that could be months out. Thoughts about a fix for 7.3.x?


Wincent 11 Jan 2016, 07:42:53

The fix is based on the 7.3 codebase so I guess it should be ok.

@Shandem - I have removed the System.IO dependency. Unless I hear something else I will assume everything is good to go.


Sebastiaan Janssen 12 Jan 2016, 13:24:38

Thanks @davidwincent! I added the generation of thumbnails back as there's quite a few people that rely on the myimage_thumb.jpg to still work but the media section now uses ImageProcessor to display them and doesn't need to go to the database for each image, which is awesome and much faster!

Will be included in 7.3.5 and higher!


Wincent 13 Jan 2016, 08:15:10

@sebastiaan - Great!

Yes, makes sense to have backwards compapility over a small improvement in upload speeds.


Stephen Adams 14 Jan 2016, 22:09:44

@Wincent and @sebastiaan should we get similar performance gains for those of us using S3 for storage instead of the file system?


Stephen Adams 28 Jan 2016, 19:19:52

@davidwincent looks like there's an issue when clicking on the media itself when using S3. I get this exception:

{"Message":"An error has occurred.","ExceptionMessage":"A relative URI cannot be created because the 'uriString' parameter represents an absolute URI.","ExceptionType":"System.UriFormatException","StackTrace":" at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)\r\n at Umbraco.Web.Editors.ImagesController.GetResized(String imagePath, Int32 width, String suffix)\r\n at lambda_method(Closure , Object , Object[] )\r\n at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass10.b__9(Object instance, Object[] methodParameters)\r\n at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync(HttpControllerContext controllerContext, IDictionary`2 arguments, CancellationToken cancellationToken)\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Controllers.ApiControllerActionInvoker.d__0.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.ActionFilterAttribute.d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Web.Http.Filters.ActionFilterAttribute.d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.ActionFilterAttribute.d__0.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Controllers.ActionFilterResult.d__2.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Filters.AuthorizationFilterAttribute.d__2.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at System.Web.Http.Dispatcher.HttpControllerDispatcher.d__1.MoveNext()"}

This is because of the URL: http://testing.swappedsite.com/umbraco/backoffice/UmbracoApi/Images/GetBigThumbnail?originalImagePath=http%3A%2F%2Fcdn.swappedsite.com%2Fmedia%2F10979%2Faed_protected_image_name.jpg

I swapped out the URL and the file name as it had client information in it.

Any ideas on what can be done to fix this?


Wincent 29 Jan 2016, 09:04:53

interesting, how do you set up s3 storage for media?

Does GetCropUrl() work when using s3?


Stephen Adams 29 Jan 2016, 16:09:13

@davidwincent FileSystemProviders. The type being Umbraco.Storage.S3.BucketFileSystem, Umbraco.Storage.S3.

So far the media list works fine through the media section and the media picker. They're all pulling directly through the URL.

Would the file upload controller need to be edited to not use GetBigThumbnail if possible? There seem to be a few remnants of GetBigThumbnail, particular in services, controllers and canvasdesigner. Not a lot of instances, but certainly a few.


Stephen Adams 29 Jan 2016, 16:11:59

Also, admittedly, not sure if GetCropUrl works when using S3. On the front end we're using ImageResizer. On save of a piece of content, if that content has crops, those crops are saved to S3. This does bypass GetCropUrl.


Stephen Adams 29 Jan 2016, 17:02:55

So inside of the file upload controller in the initialize method, I added this check:

        var thumbnailUrl = umbRequestHelper.getApiUrl(
                    "imagesApiBaseUrl",
                    "GetBigThumbnail",
                    [{ originalImagePath: file.file }]);

        if (isValidURL(file.file))
            file.thumbnail = file.file + "?width=500&mode=max";
        else
            file.thumbnail = thumbnailUrl;

I have an isValidURL helper method that merely check to see if the file is a URL which should indicate that something like S3 or Azure Blob Storage is being used. So far this is working. I need to test more to see any far reaching effects. If it does work, I'll add isValidURL to an appropriate helper and make the necessary adjustments. I want to comb through the rest of the UI to see if there are any other problem children. If not, I'll add a pull request.

Do you see any issues, @davidwincent, with this approach?

FWIW, it's important to note that, say, cdn.swappedsite.com maps to our site which then pipes into image resizer. So this might not work for CDNs that don't support that functionality. Which essentially means the images wouldn't be resized.


Wincent 02 Feb 2016, 15:45:26

@StephenPAdams - yes, as long as the cdn-server has ImageResizer I guess it should work.

There will probably be issues with cache invalidation. The main reason why I kept the GetBigThumbnail api method was that I couldn't figure out a fast and reliable way to get the rnd part in the url using javascript.


Priority: Normal

Type: Performance Problem

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted: Pull request

Affected versions: 7.3.0, 7.3.1, 7.3.2, 7.3.3

Due in version: 7.3.5

Sprint: Sprint 6

Story Points:

Cycle: