U4-3145 - Cache file periodically becomes corrupted: "Oops: this document is published but is not in the cache (internal error)"

Created by Funka! 14 Oct 2013, 19:31:51 Updated by Shannon Deminick 28 Aug 2014, 05:22:18

Why is this happening more frequently now for me after moving from v4 to v6? No, I don't have a way of reproducing it, but when the whole site breaks I have downloaded the umbraco.config cache file and looked through it, and what I find is certain nodes (identified by their ID) are appearing in there twice for some reason. Republishing the site does not fix this, so what I have to do is stop the site in IIS manager, delete the umbraco.config cache file, then restart the site. Magically everything works again..... for a while.

According to some ideas in this thread (http://our.umbraco.org/forum/getting-started/installing-umbraco/39296-Oops-this-document-is-published-but-is-not-in-the-cache-%28internal-error%29?p=1), some people experience this under distributed environments, some people experience it with "publish at" functionality, and people like myself who aren't doing any of this seem to experience it too. This happens to me on "very simple" sites, and is very disconcerting. I was surprised not to find this in the issue tracker, so here ya go. And when the whole site turns into a YSOD because an expected node or even entire sections of the site are suddenly nowhere to be found, I would consider this not just a "critical" bug but rather a "show-stopper", especially since it requires a server administrator to fix. I really wish I had a way of reproducing this reliably.

1 Attachment: UmbracoTraceLog.txt.zip

Comments

Sebastiaan Janssen 14 Oct 2013, 19:57:06

Version number please?


Funka! 14 Oct 2013, 21:00:27

6.1.3. I updated the issue to report as much, but the Our forum thread I linked above seems to indicate a variety of versions, which is why I didn't set this originally. Thank you!


Stephan 15 Oct 2013, 09:27:05

Happened to me once. I think it has to do with something bad happening when publishing, i.e. the node is inserted in the cache but not removed. The error or exception is swallowed so you don't know about it. I think I figured it out by running under the VS debugger. Unfortunately that's all I can remember at the moment. Will try to think harder.


Sebastiaan Janssen 15 Oct 2013, 09:37:53

Just a tip in the meantime: UptimeRobot is a free tool that lets you monitor your sites, so it's not the client that needs to call you: http://uptimerobot.com/


Funka! 15 Oct 2013, 18:27:39

I wonder if an app pool reset (such as someone touching web.config) during a publishing activity may be something to investigate. I tried doing this manually a handful of times just now (publishing a branch of nodes and trying to touch web.config at the same time) but wasn't able to reproduce it.


Rich Green 15 Oct 2013, 20:02:23

We have the same error and are trying to debug it now (v6.1.6).


Rich Green 16 Oct 2013, 10:06:01

For anyone experiencing this, check that your parent nodes are published correctly. We're hoping to work out the fix soon, as we can replicate this.


Allan Kirk 22 Nov 2013, 08:39:31

Any news on this one? We are facing the same issue on a site we are working on at the moment.


Funka! 22 Nov 2013, 22:37:24

Thinking back on the times we experienced this, I think they tended to coincide with times that we were actively working on the site, whether that was updating document types, publishing nodes, saving templates, installing a package, or restarting the app pool --- not sure exactly which --- so we would tend to notice this error pretty soon afterwards, usually within a day or two of happening (because, sometimes only a certain node or branch is affected and you don't notice it right away).

I'm not sure I remember running into this error randomly at some other time when no one was working on the site, so my guess would be some kind of concurrency or race bug in the code happening ... I dunno ... somewhere. When it does happen though, at least it is "fixable" by following the steps I mentioned in the issue description. Noticing the problem in the first place is our biggest challenge, but I'm sure this is something maybe someone more clever than myself could figure out how to automatically detect (and hopefully fix!?)
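One way to detect the duplicate node IDs mentioned in the issue description could look like the following. This is a minimal sketch, not part of Umbraco: the class and method names are hypothetical, and it assumes that every content node in umbraco.config is an XML element carrying an "id" attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not an Umbraco API.
public static class CacheFileCheck
{
    // Returns every "id" attribute value that appears on more than one element.
    // Assumption: each node in the cache file is an element with an "id" attribute.
    public static List<string> FindDuplicateIds(XDocument cache)
    {
        return cache.Descendants()
            .Select(e => (string)e.Attribute("id"))   // null when the attribute is absent
            .Where(id => id != null)
            .GroupBy(id => id)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }
}
```

It could be called with something like CacheFileCheck.FindDuplicateIds(XDocument.Load(pathToCacheFile)), where pathToCacheFile is wherever umbraco.config lives on the server. A non-empty result would indicate the corruption described in this issue.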


Funka! 20 Dec 2013, 01:46:00

I've narrowed this problem down and find it occurring pretty regularly after we install packages. Just simple packages that contain document types and templates, maybe a data type (using one of the stock or maybe uComponents render controls), but no other binaries or post-install actions.

There may certainly be other actions that cause this to happen, but package installation seems to be a pretty regular trigger.

(We've also found on development sites that aren't receiving any traffic that you don't need to log into the server, you just need to delete umbraco.config and then touch web.config.)


Douglas Robar 28 Dec 2013, 10:58:52

I've seen this quite often. The most reliable way to make it happen (though perhaps not the only way) is to install the ezSearch package. The search page will be there and appear to be published, but actually it will have this error.

(ezSearch is simply an example package that demonstrates the problem as Funka! mentioned... there's nothing wrong with ezSearch itself)


Darren Ferguson 15 Jan 2014, 12:30:28

In my case it wasn't related to installing a package.

We'd just done an upgrade.

I wrote a hack to traverse the tree and update the document cache - which seems to sort it out.

However a restart of this site - and removing umbraco.config puts us back in the same situation.

A cache file is produced but no nodes are in the cache.


Darren Ferguson 15 Jan 2014, 12:42:53

I've noted that in the cache file (umbraco.config), when I start Umbraco I get an empty doctype, e.g.:

....

As I publish nodes, the doctypes get added e.g:

]> .....

So I'm guessing the cache file is produced but Umbraco can't load it because the elements aren't declared.


Darren Ferguson 15 Jan 2014, 15:52:35

Furthermore, I seem to have fixed this by doing the following in DocumentType.cs:

var documentTypes = contentTypes.Where(x => x != null).Select(x => new DocumentType(x));

instead of:

var documentTypes = contentTypes.Select(x => new DocumentType(x));

This seemed to be the issue preventing the DTD from being generated. It is inside a massive try/catch, which made it hard to find.


Shannon Deminick 17 Feb 2014, 04:42:32

Thanks Darren!

I've looked into the code and first need to determine why there are null items in the returned 'documentTypes' collection. So the easy fix is in RepositoryBase.GetAll - we aren't filtering out any null items in the returned items from the sub classes which we always should be since we should never have null items in a returned collection. So that should fix this and it works across the board for all repos.

The reason why a null item might have been returned for a content type: For the sake of simplicity when the new 6.x API layer was created, the PerformGetAll on the ContentTypeRepository simply calls a "yield return Get(int id);" for each id found in the db for a content type. This then goes and fetches an item from the cache if it is there but perhaps the cache has been invalidated and thus could return null.
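The mechanism described above can be sketched in isolation. This is a simplified illustration under stated assumptions, not Umbraco's actual code: ContentTypeStub and ContentTypeRepositorySketch are hypothetical stand-ins, and Get here only looks in an in-memory dictionary to mimic a cache that has been invalidated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a content type entity.
public class ContentTypeStub
{
    public int Id { get; set; }
}

// Hypothetical stand-in for a repository; not Umbraco's RepositoryBase.
public class ContentTypeRepositorySketch
{
    private readonly int[] _idsInDb;
    private readonly Dictionary<int, ContentTypeStub> _cache;

    public ContentTypeRepositorySketch(int[] idsInDb, Dictionary<int, ContentTypeStub> cache)
    {
        _idsInDb = idsInDb;
        _cache = cache;
    }

    // Returns null when the id is not (or no longer) in the cache.
    public ContentTypeStub Get(int id)
    {
        ContentTypeStub item;
        return _cache.TryGetValue(id, out item) ? item : null;
    }

    // Mirrors the "yield return Get(id)" pattern: may yield null entries.
    protected IEnumerable<ContentTypeStub> PerformGetAll()
    {
        foreach (var id in _idsInDb)
            yield return Get(id);
    }

    // Base-level guard: callers never receive null items.
    public IEnumerable<ContentTypeStub> GetAll()
    {
        return PerformGetAll().Where(x => x != null);
    }
}
```

Putting the filter in the base method rather than at each call site means a caller such as the Select(x => new DocumentType(x)) line quoted earlier no longer needs its own null check.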


Shannon Deminick 17 Feb 2014, 04:44:32

I've also updated this null check on the GetByQuery underlying method.

All updated in 82ba1a24a3492ad399e3f5389743368ab199c8be


Stephan 17 Feb 2014, 07:31:25

This is great!


Rich Green 17 Feb 2014, 08:28:26

Great work guys


Funka! 17 Feb 2014, 22:22:03

Shannon, excellent work! Did you happen to correlate the problem when this happens to the installation of a package? Stepping through might reveal why, as you mentioned, "null items [are coming back] in the returned 'documentTypes' collection"... Thanks again!


Biagio Paruolo 05 May 2014, 08:16:32

So is this problem solved in version 6.2.0? I have the problem on 6.1.6 with content scheduled for future publishing (with a "Publish at" value set).


Dan Booth 07 May 2014, 08:00:12

I'm not convinced this has been solved. I've upgraded a site to 6.2.0 and EVERY scheduled publish I run I get the message, "Oops: this document is published but is not in the cache (internal error)". I've removed all custom events in case it was related, but that made no difference. All I see in the trace log is, "WARN Umbraco.Web.Routing.DefaultUrlProvider - [Thread 31] Couldn't find any page with nodeId=4498. This is most likely caused by the page not being published."


Shannon Deminick 07 May 2014, 08:03:25

@Dan.Booth can you please log another bug? This might be related but there was no mention this was caused by a scheduled publish previously so that is not what was tested to fix this specific issue.


Dan Booth 07 May 2014, 08:06:25

@Shandem Yeah, I will, just trying to figure out more info. For instance, on a vanilla install of a clean 6.2.0 site scheduled publish works. But on the site I've upgraded it doesn't - just double-checking every config file and trying to figure out what might make the difference so I can give you more info.


Biagio Paruolo 07 May 2014, 08:21:55

This is a big problem for scheduled publishing of content! My customer is a bit angry about this. I use 6.1.6. To refresh the cache, I need to unpublish and publish the content. In v4.8.x I did not have this issue.


Biagio Paruolo 07 May 2014, 08:23:04

@Dan.Booth I have the same problem.


Biagio Paruolo 07 May 2014, 08:29:46

This is my log file zip.


Stephan 07 May 2014, 09:04:21

To all: one thing worth noting is that the "oops: this document..." message is not an issue per se. It only properly reports that the cache has become corrupted, due to something else going wrong. So it'd be good to create proper issues for the various cases.

E.g. here (Biagio's case) we seem to have an issue where scheduled publishing does publish the content but fails to refresh the cache. The cause seems to be that UrlTracker throws when handling the publish/unpublish event, because it calls umbraco.library.NiceUrl(nodeId), which requires a current UmbracoContext, and none seems to be present.

Dan, that would be why it works on vanilla but not on your site: you have something in the way that kills the publish before the cache is refreshed.

We would need to create an issue: "there's no current UmbracoContext during scheduled publishing".
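The failure mode described here can be modelled with a small sketch. Everything in it is hypothetical and simplified: PublishPipelineSketch is not Umbraco's publishing code, the swallowed exception is an assumption based on the earlier comment in this thread, and the getContext delegate merely stands in for "is there a current UmbracoContext?".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of "event handlers run first, cache refresh runs last".
public static class PublishPipelineSketch
{
    // Returns true only if the cache refresh step was reached.
    public static bool Publish(int nodeId, IEnumerable<Action<int>> handlers, ISet<int> cache)
    {
        try
        {
            foreach (var handler in handlers)
                handler(nodeId);      // an exception here skips the refresh below
            cache.Add(nodeId);        // the cache refresh
            return true;
        }
        catch (Exception)
        {
            return false;             // swallowed: content is published, cache is stale
        }
    }

    // Wraps a handler so that it does nothing when no context is available.
    public static Action<int> Guarded(Func<object> getContext, Action<int> inner)
    {
        return nodeId =>
        {
            if (getContext() == null) return;
            inner(nodeId);
        };
    }
}
```

In this model a handler that needs a request context aborts the publish when none exists, which would match "works on vanilla, fails on a site with extra event handlers". Guarding the handler is one possible workaround on the handler's side; it does not address the missing context itself.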


Biagio Paruolo 07 May 2014, 09:34:05

So how do I correct the error? Is it a UrlTracker issue? I use UrlTracker to 301-redirect some nodes that I use only as parent nodes in the content tree.


Damian Green 20 May 2014, 16:05:06

I am getting this error in 7.1.3

I can't publish a document.

Also, any changes I make to a document type are not picked up by the backend UI unless I recycle the app pool. Not sure if it's related.


Stephan 20 May 2014, 16:08:25

@Damian: see my post above. This is not an "error" per se. Something else must be wrong, so in the end the cache is corrupted. The errors that you see when publishing must be related. Try to figure out those errors and create a new issue.


Dan Booth 21 May 2014, 14:11:54

Stephan - there is a separate issue for the UmbracoContext (HttpContext.Current) not being available - see U4-2724


Dan Booth 21 May 2014, 14:25:46

Stephan - I've also added an issue for scheduled publish in load-balanced environment, as this can lead to this error - see U4-581


Denise del Bando 08 Jul 2014, 14:44:19

The state of this bug is Fixed and the due version is 6.2.0, but I am still experiencing this error in v6.2.1. In our situation, we unpublish the root node while the site is in development. Then, when we publish it for UAT, changes, or QA, we have to go through each and every page and publish it individually, because checking the "Publish [page name] and all its subpages" and "including unpublished pages" options causes the error mentioned in this issue. Sometimes this function works, sometimes it doesn't.


Shannon Deminick 09 Jul 2014, 00:54:57

Any chance you are getting validation errors on child documents, which is leading to them not being publishable? (I.e. you have a required field that is not filled in?)


Biagio Paruolo 09 Jul 2014, 08:02:53

This is a big issue for me and my customer with Umbraco v6.1.6 (Assembly version: 1.0.5021.24867), and I have no possibility of upgrading to 6.2.x because UBlogsy is not compatible (I tried to do this but the test site didn't work). How can I publish at a scheduled time? I didn't have this problem with the old version!


Shannon Deminick 09 Jul 2014, 08:08:33

@Biagio: there is a lot of information in this thread and people have various issues that are 'related' to this. We don't know what your specific issue is; can you please describe it? What is your 'old' version? Did you read my comment above? Your original post said you don't know how to reproduce the error. Is that still the case, or can you give us a way to reproduce it?


Denise del Bando 09 Jul 2014, 13:51:05

@Shandem All child pages are valid, because when I publish them one at a time I do not have to correct any validation errors.


Richard Thompson 21 Jul 2014, 12:21:34

I've experienced this issue after upgrading from 6.1.6 as described here http://our.umbraco.org/forum/getting-started/installing-umbraco/54739-Upgrading-616-to-621-issues and republishing all nodes.

I tried publishing from the top level node but all children now have a YSOD:

ContentTypeService failed to find a content type with alias "LandingPageDevelopmentSearch".


Biagio Paruolo 26 Aug 2014, 20:53:27

I found another issue... I put the value 2014-07-09 10:40 into the "Publish at" field. If I then click Save and Publish, the system shows this message: "Credits (2151) could not be published because these properties: did not pass validation rules." Also, the date format of the value in the Last Edited field is different: 09/07/2014 10:35


Biagio Paruolo 26 Aug 2014, 20:54:58

@Shandem I have a post. If I set a future date and time to publish, afterwards the Link to document field shows: Oops: this document is published but is not in the cache (internal error).


Shannon Deminick 28 Aug 2014, 05:22:18

@Biagio.Paruolo This issue is closed. If you have a reproducible issue and are using the latest version of Umbraco, please create a new issue with steps to reproduce.


Priority: Critical

Type: Bug

State: Fixed

Assignee: Shannon Deminick

Difficulty: Normal

Backwards Compatible: True

Affected versions: 6.1.3, 6.1.6

Due in version: 7.1.0, 6.2.0
