U4-3787 - Unpublishing a node with descendants and publishing it again causes a xml cache error.

Created by Mads Krohn 04 Dec 2013, 13:47:33 Updated by Martin Fenton 08 Sep 2015, 14:59:44

Relates to: U4-2108

Relates to: U4-7076

When you unpublish a page that has descendants and afterwards want to publish that page again, the descendants will not be re-added to the xml cache, instead showing the message "Oops: this document is published but is not in the cache (internal error)". The odd thing is, that this is only true for descendants, not immediate children. It seems somewhat similar to an error I reported back in version 6.0.4 -> http://issues.umbraco.org/issue/U4-2108 The xml cache can be re-synced by either publishing all documents again or by publishing the entire site from the root content node.

Comments

Stefan Kip 02 Apr 2014, 10:16:47

Experiencing this issue too with v6.1.6


Stephan 18 Apr 2014, 12:23:07

Reproduced on 7.1.1.


Stephan 18 Apr 2014, 18:12:04

Debug notes:

In Umbraco.Web.Strategies.Publishing.UpdateCacheAfterPublish we do hit PublishingStrategy_Published. If we manually change e.IsAllRepublished to true, then descendants are back in the cache, else they are not.

So it's down do DistributedCache.Instance.RefreshPageCache(...) not doing its job? Goes to DefaultServerMessenger.PerformRefresh with zero servers, a proper refresher, and some content instances. Goes to DefaultServerMessenger.MessageSeverForManyObjects... to InvokeMethodOnRefresherInstance with dispatchType RefreshById, refresher is a strongly typed refresher, call Refresh(instance)... Goes to PageCacheRefresher.Refresh(instance)...

Ends up in content.Instance.UpdateDocumentCache... and there, PublishNodeDo does NOT add the node to the XML.

Because it cannot find the parent in the XML? Aha! If we're re-publishing A, in A>B>C, then the instances we try to refresh are C,B in that order, so C's parent (B) is not in the XML so it all fails. Instances from from ToArray in UpdateCacheAfterPublish.UpdateMultipleContentCache, which applies to an IEnumerable which is the result of ContentService.GetPublishedDescendants(content).

Which ends in ContentRepository.GetByPublishedVersion() which orders by version date, sort order BUT not by path => fails.

Modify:

private void UpdateMultipleContentCache(IEnumerable content) { DistributedCache.Instance.RefreshPageCache(content.OrderBy(x => x.Path).ToArray());
}

And it works.


Morten Christensen 18 Apr 2014, 19:56:53

The Path ordering should be done after descendants are found, right? It sounds like you want to sort by path in the repository, but that doesn't make sense as its fetching the version of a single content item. The sorting before publishing/cache update sounds right though :)


Morten Christensen 18 Apr 2014, 20:01:50

I actually thought ContentService.GetPublishedDescendants was doing this type of ordering as it would make sense to get an ordered list of descendants returned from that method. Maybe that is where the regressw lies?


Stephan 19 Apr 2014, 06:44:34

@Morten: think you're right, we should make sure that ContentService.GetPublishedDescendants returns the descendants in the proper "xpath" order. That's what I wanted to do but: it relies on repository.GetByPublishedVersion(query) which sorts by (VersionDate(desc), SortOrder). I'd rather modify that method. Its output is used by GetAllPublished (would make sense to get the tree in the proper "xpath tree" order too), GetPublishedContentOfContentType (same) and that's all. OK, going to change GetByPublishedVersion returned order - if you know of some place where we expect it to return the "most recent" nodes first, let me know.


Stephan 19 Apr 2014, 07:42:09

Real "document order" is depth first, with the children of a node being sorted by sortOrder. Can do in SQL with: ORDER BY substring(path, 1, len(path) - charindex(',', reverse(path))), sortOrder

  • For this very issue, we can return nodes sorted by level, to ensure parents are returned first, and that would be enough.
  • Sorting by level,sortOrder would ensure that parents come first, and siblings come in the right order, but it's not depth-first.
  • Sorting by path would be slower but depth-first, though not respecting sort-order.
  • Sorting by "document order" using the expression above would be idea but slower, plus I have no idea how to do it within our persistence framework.

For the sake of simplicity I'd vote for (level, sortOrder). Votes?


Morten Christensen 19 Apr 2014, 10:12:17

Doesn't GetByPublishedVersion return the specific (published) version of a single document? I'm not sure if I'm missing what you are trying to say, but sorting by Path within GetByPublishedVersion just doesn't make sense to me when we are dealing with a single document. However, sorting within GetPublishedDescendants makes perfect sense to me.

I dont remember anything about additional usages of the top of my head, so will have to get to that on Monday or Tuesday.

Level, SortOrder sounds reasonable.


Morten Christensen 19 Apr 2014, 10:20:59

Okay, just looked it up and realized I was confusing the repository.GetByPublishedVersion with service.GetByVersion. Sorry!! My bad.

The repository method should at least have the Level included in the sorting. I'm not sure what impact the Path will have?


Morten Christensen 19 Apr 2014, 10:23:37

I can't really think of anything that Path would give us that we can't archive with sorting by Level...?


Stephan 19 Apr 2014, 10:43:00

Yes at one moment you got me confused ;-) Going with level+sortOrder then.


Mads Krohn 19 Apr 2014, 12:12:41

Thank you for looking at this issue guys #h5yr !


Stephan 19 Apr 2014, 12:34:48

Pushed 7ae522b511ba2e768f06e7f45512aa7c441207e1 in 7.1.2 and backported as 3bf1041041487d56dbd2b4fb24ef719d44efaa19 in 6.2.0.


Stefan Kip 19 Apr 2014, 14:16:32

Awesome work!


Priority: Normal

Type: Bug

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions: 6.1.5, 6.1.6, 7.1.1

Due in version: 6.2.0, 7.1.2

Sprint:

Story Points:

Cycle: