U4-9936 - PageCacheRefresher CacheUpdated event is occurring before the cache has updated in v7.6.1 on front end servers

Created by Jeavon Leopold 19 May 2017, 12:15:42 Updated by Shannon Deminick 24 May 2017, 06:15:17

Subtask of: U4-9609

The PageCacheRefresher CacheUpdated event is occurring before the cache has updated on Umbraco v7.6.1 (it's behaving like an Updating event) but only for front end servers. This is different to the behaviour on Umbraco v7.5.

This was hard to pin down, using the following cache event test code:

private void CachePageEvents_CacheUpdated(PageCacheRefresher sender, CacheRefresherEventArgs e) { var id = 0; if (e.MessageType == MessageType.RefreshByInstance || e.MessageType == MessageType.RemoveByInstance) if (e.MessageType == MessageType.RefreshById || e.MessageType == MessageType.RemoveById) { id = (int) e.MessageObject; }

		var cachedContent = UmbracoContext.Current.ContentCache.GetById(id);
		var siteTitle = cachedContent.GetProperty("siteTitle").DataValue;

		LogHelper.Info<UmbracoEvents>(string.Format("siteTitle: {0} On Server:{1} AppDomain:{2}",siteTitle, Environment.MachineName, System.AppDomain.CurrentDomain.FriendlyName));
	}

After changing the property "siteTitle" in Umbraco to "Fanoe - Umbraco Starter Kit 100" in the logs we can see that the front end server is behind.

Front end server: 2017-05-19 13:09:03,049 [P5504/D15/T39] INFO BearTheLoad.Logic.Events.UmbracoEvents - siteTitle: Fanoe - Umbraco Starter Kit 11 On Server:JEAVONS-T470P AppDomain:/LM/W3SVC/5/ROOT-14-131396687454934893

Umbraco server: 2017-05-19 13:08:57,548 [P13876/D23/T56] INFO BearTheLoad.Logic.Events.UmbracoEvents - siteTitle: Fanoe - Umbraco Starter Kit 100 On Server:JEAVONS-T470P AppDomain:/LM/W3SVC/4/ROOT-17-131396687370961393

Whats strange here is that both servers have the same code and the same event is occurring but the cache is different...?

This is a significant problem as these events are used often to clear output caching but with this issue the output cache is regenerated with the stale content on front end servers.

Comments

Sebastiaan Janssen 19 May 2017, 13:04:17

Thanks @crumpled_jeavon ! Let's see what @Shandem can figure out about this! :-)


Jeavon Leopold 19 May 2017, 13:46:35

I wanted to double check so I have just executed exactly the same code on Umbraco v7.5.4 and as I thought both log files entries were exactly the same.

As an aside, in Umbraco v7.6.1 which has a newer version of log4net if you are running two sites from the same files (i.e. simulating synced load balancing locally) it seems that only one site can write to the log file (which wasn't the case in v7.5), this isn't a problem in a live environment as it wouldn't happen but for local testing I've been adjusting the log4net.config to include the appDomain in the file name so it creates two files.


Shannon Deminick 22 May 2017, 05:47:54

@crumpled_jeavon I think the log thing should never have been possible in the first place :)

For front-end servers, they will process the cache refresher events based on the instructions in the database table based on active requests coming in. Are you saying that the front-end server never gets updated?

Can you check if there's instructions being written to the table?


Jeavon Leopold 22 May 2017, 07:38:53

@Shandem yes the entries are in the table, it all works generally, the issue is that this event is occurring as if it was an Updating event rather than a Updated event but only on the front end servers on the publishing server the event occurs as expected as a Updated event


Shannon Deminick 22 May 2017, 08:55:18

Ah i think i see what you are saying.

This must be an issue with how we are deduplicating events and merging latest entities into the event args.

I'll see what i can do to replicate/unit test. I don't know if this will make any sense to you but these are where tests for this are:

During a Scope, all events that were raised are intercepted and queued and then raised at the end of the Scope, but these events are both deduplicated and the entity is merged between event args so that it's the exact same entity used. The Scope thing is hard to explain but it's essentially a transaction that wraps service methods.


Shannon Deminick 22 May 2017, 08:58:29

Actually re-reading this I'm unsure that's the issue. If it works on the Master server, those same event args should be used to serialize the instructions so I don't think it's that.

I think maybe the problem might be that the cache just isn't actually updated when you do this: var cachedContent = UmbracoContext.Current.ContentCache.GetById(id);

I'll need to run some tests and check with @zpqrtbnk


Sebastiaan Janssen 22 May 2017, 09:07:21

FYI, this issue was prompted by this tweet: https://twitter.com/crumpled_jeavon/status/865151287440138241

Does indeed sound like the cache is not updated immediately.


Jeavon Leopold 22 May 2017, 09:14:21

@Shandem that's the odd thing, it's the same event occurring on the master and slave servers but that cache is in a different state. I wrote the code using UmbracoContext.Current.ContentCache.GetById(id) to try to work out what was going on. In 7.5 UmbracoContext.Current.ContentCache.GetById(id) will always be up to date on both Master and Slave in this event while in 7.6 it is only up to date on the Master, the Slave will be one version behind.

I noticed this issue as the PageCacheRefresher CacheUpdated event is used often to clear Output cache on all servers and this has worked perfectly until 7.6. With 7.6 the Output Cache is cleared on both Master and Slave however the Slave will re-cache with stale content while the Master will re-cache with correct content.


Shannon Deminick 22 May 2017, 09:29:08

@crumpled_jeavon So just need to confirm, is the slave server always stuck in a previous version or does it eventually get into the correct version? (i.e. after refreshing a few times or something)


Jeavon Leopold 22 May 2017, 09:40:50

@Shandem no it's not stuck at all, the Slave servers will get the correct version in cache, the issue is that this event is occurring before the cache has updated, if this event was named CacheUpdating it would be correct although only on the Slaves, the master is still correctly named CacheUpdated


Jeavon Leopold 22 May 2017, 09:41:59

@Shandem I can make a video if it would be helpful?


Shannon Deminick 22 May 2017, 09:57:29

@crumpled_jeavon ok, than something funky with the cache is happening that maybe it's somehow postponing it's cache updating. This is the code that executes before the event that you are binding to fires on slaves: https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web/Cache/PageCacheRefresher.cs#L61

The call to base.Refresh(id); is actually responsible for emitting that event.

There is a subtle difference with the way it executes on the master - which does so with the real object: https://github.com/umbraco/Umbraco-CMS/blob/dev-v7/src/Umbraco.Web/Cache/PageCacheRefresher.cs#L84

I'll need @zpqrtbnk to have a look at this


Jeavon Leopold 22 May 2017, 10:34:52

@Shandem interesting, I can see nothing has changed in either of those methods so as you say must be something funky further down with the cache. @zpqrtbnk anything you need let me know, I can make a demo video if helpful, just let me know...


Stephan 23 May 2017, 06:58:32

Looking into this issue now. Pretty sure it's linked to scopes deferring events in some way.


Stephan 23 May 2017, 09:22:53

Mmm... actually, it might "just" be due to a silly bug in PageCacheRefresher that should be fixed in:

PR https://github.com/umbraco/Umbraco-CMS/pull/1960

@crumpled_jeavon any chance you can test?


Jeavon Leopold 23 May 2017, 09:26:19

@zpqrtbnk sure thing


Jeavon Leopold 23 May 2017, 09:49:35

@zpqrtbnk I'm afraid that change didn't solve the issue :(


Stephan 23 May 2017, 11:02:08

duh


Stephan 23 May 2017, 11:50:42

well, reproduced, that's a start


Jeavon Leopold 23 May 2017, 11:52:37

@zpqrtbnk phew, I was willing to get on the Eurostar to show you :)


Stephan 23 May 2017, 12:19:48

think I got this right this time, want to try again?


Jeavon Leopold 23 May 2017, 12:36:34

@zpqrtbnk of course :)


Jeavon Leopold 23 May 2017, 13:11:23

@zpqrtbnk bingo! Working beautifully again :)


Stephan 23 May 2017, 14:06:48

cool, moving issue to review so someone else can merge thanks for your time & helpful debugging infos sorry for the inconvenience!


Shannon Deminick 24 May 2017, 06:15:14

Code looks good! merged :)


Priority: Normal

Type: Bug

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions: 7.6.0, 7.6.1

Due in version: 7.6.2

Sprint:

Story Points:

Cycle: 1