U4-8856 - Getting non-existing media from cache can flood the log

Created by Stephan 16 Aug 2016, 13:32:53 Updated by Lee Kelleher 02 Nov 2016, 12:51:38

Tags: Unscheduled

Is duplicated by: U4-9006

Relates to: U4-7823

Getting a non-existing media item from the media cache makes the cache log one line, "Could not retrieve media...". On a site with media pickers, where a good number of the picked media items have been deleted, this floods the log. We should only log a warning if the media item could not be retrieved from Examine yet exists in the DB - and even then, we should do it only once.

Comments

Stephan 16 Aug 2016, 13:45:07

PR https://github.com/umbraco/Umbraco-CMS/pull/1436

When a media item is not found in Examine, we first try to load it from the DB before logging anything. If it is not in the DB either, then all is well and we return. If it is in the DB, then we have an issue and we log a warning stating that the Examine index is probably corrupted.

Actually, we only log after a given number of failures (10), and we only log once (until the app restarts).
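
The "log once, after a number of failures" behaviour can be illustrated with a minimal sketch. This is not the code from the PR: the class name `ExamineMissReporter`, the `ReportMiss` method, the injected logging delegate and the warning text are all hypothetical, and the sketch assumes the warning is written when the count reaches the threshold.

```csharp
using System;
using System.Threading;

// Hypothetical helper, NOT the PR's code: counts lookups that missed the index
// but were found in the DB, and writes a single warning per process lifetime.
public sealed class ExamineMissReporter
{
    private readonly int _threshold;
    private readonly Action<string> _logWarning;
    private int _misses;
    private int _logged; // 0 = not logged yet, 1 = already logged

    public ExamineMissReporter(int threshold, Action<string> logWarning)
    {
        _threshold = threshold;
        _logWarning = logWarning;
    }

    // Call when a media item is missing from the index but present in the DB.
    public void ReportMiss(int mediaId)
    {
        // Once the warning has been written there is nothing left to do.
        if (Volatile.Read(ref _logged) == 1) return;

        var misses = Interlocked.Increment(ref _misses);
        if (misses < _threshold) return;

        // CompareExchange lets exactly one caller flip the flag, so the warning
        // is written once even when several requests fail concurrently.
        if (Interlocked.CompareExchange(ref _logged, 1, 0) != 0) return;

        _logWarning("Had to load " + misses + " media items from the DB (latest id: "
            + mediaId + "); the index may be corrupted.");
    }
}

public static class MissReporterDemo
{
    public static void Main()
    {
        var reporter = new ExamineMissReporter(10, Console.WriteLine);
        // 25 failures produce a single line, written on the 10th failure.
        for (var id = 1; id <= 25; id++) reporter.ReportMiss(id);
    }
}
```

Because the flag lives in memory, it resets when the app restarts, which matches the "until the app restarts" behaviour described above.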

Review: create a page that gets an existing media item, e.g. media = Umbraco.TypedMedia(1234), then clear the Examine index and reload that page more than 10 times => should see 1 log line about it.


Claus Jensen 17 Aug 2016, 08:54:41

Looking good - it's merged!

Note - the test needs 10 different media items to fail: the result is cached, so the miss count only increases when a different id is requested (I've talked to @zpqrtbnk about this, and it seems like the best way to do the logging).
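
To illustrate why reloading the same page does not move the counter, here is a small single-threaded sketch. It assumes a per-id result cache sitting in front of the fallback lookup; `CachingMediaLookup`, `MissCount` and the string stand-in for a media item are hypothetical names used only for illustration, not the actual cache implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the media cache (single-threaded sketch): the
// result is stored per id, so the fallback and its miss counter only run on
// the first request for a given id.
public sealed class CachingMediaLookup
{
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();
    private readonly Func<int, string> _loadFromDb;

    public int MissCount { get; private set; }

    public CachingMediaLookup(Func<int, string> loadFromDb)
    {
        _loadFromDb = loadFromDb;
    }

    public string Get(int id)
    {
        string cached;
        if (_cache.TryGetValue(id, out cached)) return cached; // no new miss

        MissCount++; // counted once per distinct id
        var loaded = _loadFromDb(id);
        _cache[id] = loaded;
        return loaded;
    }
}

public static class CacheDemo
{
    public static void Main()
    {
        var lookup = new CachingMediaLookup(id => "media-" + id);

        // Reloading the same item 12 times counts as a single miss.
        for (var i = 0; i < 12; i++) lookup.Get(1234);
        Console.WriteLine(lookup.MissCount); // 1

        // Ten different ids add ten more misses.
        for (var id = 1; id <= 10; id++) lookup.Get(id);
        Console.WriteLine(lookup.MissCount); // 11
    }
}
```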


Eric Schrepel 01 Sep 2016, 04:27:37

Seeing this in 7.4.2. In the meantime, is there a way to tell which nodes/partials/macros those media calls might be coming from? I'd try to resolve them, but since the media folders themselves are of course missing, it's hard to know what node or code is looking for them.


Claus Jensen 01 Sep 2016, 06:16:48

@Eric.Schrepel Not really - there's nothing in the logs telling exactly what requested those specific items - it could also be any kind of custom code running in your site, utilizing the APIs without the request actually coming from a visit to a node.

I guess the only way would be to either load your pages one at a time, see whether that triggers anything in the logs, and go from there - or create a script that runs through all your items and checks for missing references.
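
As a rough sketch of the script idea, the check itself can be kept independent of Umbraco. Everything here is an assumption for illustration: the `MissingMediaScan` class, the input shape (node id mapped to the media ids picked on that node) and the `mediaExists` delegate. In a real site the delegate could presumably be backed by a MediaService lookup, and building the input depends on how your pickers store their values, which is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MissingMediaScan
{
    // pickedMediaByNode: node id -> media ids picked on that node (hypothetical shape).
    // mediaExists: returns true when the media id still resolves.
    // Yields (node id, media id) pairs for every reference that no longer resolves.
    public static IEnumerable<KeyValuePair<int, int>> FindMissing(
        IDictionary<int, int[]> pickedMediaByNode, Func<int, bool> mediaExists)
    {
        foreach (var node in pickedMediaByNode)
        {
            foreach (var mediaId in node.Value.Distinct())
            {
                if (!mediaExists(mediaId))
                    yield return new KeyValuePair<int, int>(node.Key, mediaId);
            }
        }
    }
}

public static class ScanDemo
{
    public static void Main()
    {
        // Made-up ids, purely for demonstration.
        var picked = new SortedDictionary<int, int[]>
        {
            { 1050, new[] { 2001, 2002 } },
            { 1051, new[] { 2003 } }
        };
        var existing = new HashSet<int> { 2001 };

        foreach (var hit in MissingMediaScan.FindMissing(picked, existing.Contains))
            Console.WriteLine("Node " + hit.Key + " references missing media " + hit.Value);
        // Node 1050 references missing media 2002
        // Node 1051 references missing media 2003
    }
}
```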


Lee Kelleher 02 Nov 2016, 12:03:04

@zpqrtbnk Apologies if I should post this elsewhere, I wasn't sure where was the best place. I have a question about the GetUmbracoMediaCacheValues logic.

https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web/PublishedCache/XmlPublishedCache/PublishedMediaCache.cs#L193-L242

So the logic goes...

  1. Query Examine for media item - making sure that it hasn't been trashed (checking path for "-1,-21")
  2. If the media item isn't in Examine, then get from the MediaService and return

My question is: when querying Examine we check whether the item is trashed, but we don't do the same when querying the MediaService. Wouldn't this mean that it will always return a trashed media item?


Stephan 02 Nov 2016, 12:34:41

Methinks you're right, and it should read

  var media = ApplicationContext.Current.Services.MediaService.GetById(id);
  if (media == null || media.Trashed) return null; // not found, ok

Correct? Will fix.


Lee Kelleher 02 Nov 2016, 12:36:58

@zpqrtbnk Sounds great! Thanks Stephan!

Did you want me to create a separate ticket for this? So there's a reference chain on the tracker/release notes?


Stephan 02 Nov 2016, 12:43:16

sure if you have a min that'd be great


Lee Kelleher 02 Nov 2016, 12:51:38

No problem-o, created issue #U4-9137


Priority: Normal

Type: Bug

State: Fixed

Difficulty: Normal

Backwards Compatible: True

Affected versions: 7.5.0

Due in version: 7.5.1

Sprint: Sprint 40
