Created by Shannon Deminick 16 Aug 2013, 00:36:24 Updated by Shannon Deminick 16 Sep 2014, 00:59:33
Is duplicated by: U4-877
Relates to: U4-3189
Relates to: U4-5491
Relates to: U4-5492
Whenever we need to make a cache refresher call to other servers in an LB scenario, we should collect all of the calls that need to be made during a request and then transmit the combined data in one request to each server at the end of the request. This should save on a lot of inter-server communications especially when performing bulk operations.
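To make the proposal concrete, here is a minimal sketch of per-request batching. It is not the Umbraco implementation: `RefreshCall`, `RequestBatch`, and the `send` callback are hypothetical names invented for this illustration, and the real transport (a web service call between servers) is replaced by a plain function.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RefreshCall:
    """One cache refresher instruction (hypothetical shape)."""
    refresher_id: str  # which cache refresher to run
    action: str        # e.g. "refresh" or "remove"
    item_id: int       # id of the affected item


@dataclass
class RequestBatch:
    """Collects refresh calls for the lifetime of one request."""
    servers: list
    pending: list = field(default_factory=list)

    def queue(self, call: RefreshCall) -> None:
        # Skip exact duplicates so a bulk operation does not repeat itself.
        if call not in self.pending:
            self.pending.append(call)

    def flush(self, send) -> int:
        """Send all queued calls to every server, one message per server.

        `send(server, calls)` stands in for the real transport (assumption).
        Returns the number of messages sent.
        """
        if not self.pending:
            return 0
        calls = list(self.pending)
        self.pending.clear()
        for server in self.servers:
            send(server, calls)
        return len(self.servers)
```

With two servers and a bulk operation touching 100 items, calling `queue` 100 times and `flush` once at the end of the request produces two messages rather than 200.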
This is all working in revs 672672d0aa9fc6635bc473212d16a6b47f1aa4d2 and beb1979b4655a529b703ceec094a50f6a3dd8602, contained in a custom POC branch.
This definitely needs to get into the next releases in both 6 and 7. I've been doing a lot of work with the variants branch and have a load balanced local setup going, and the number of unnecessary calls to refresh the cache is staggering. This will all be fixed with this change.
Great stuff, Shannon!
So glad this has been done and is going to make it into the next releases for both 6 and 7. Well done :)
OK, so I have found an issue with this now. It's not a huge issue or anything but will probably be annoying to some users.
The problem is that since we are batching all of these calls at the very end of the request on all servers, including the current server, the cache has not actually been 'refreshed' on the current server during the request. What people will see when they first publish a document is the "this document is published but is not in the cache (internal error)" message where the URL for the content item should be in the back office. The URL has in fact been generated and is stored in the cache, but because the UI's request ended before the cache refreshing was done, it is not reflected there.
This isn't going to cause problems but of course some admins will be asking questions.
To fix this I guess there are a couple of things we could do:
For now, I'm going to implement this with option #2 since it's the easiest and requires no explicit config.
Great work, Shannon. I'd agree Option 2 seems like the simplest and I can't imagine there would be any measurable overhead in clearing the cache on the master server twice.
Any chance the server could send the "originating server name" (thinking... IP address or machine name of some sort) as part of the request (or maybe this would break all existing requests), so a server could ignore a request that originates from itself? Or is this a silly idea?
We could do that but in some cases people will be load balancing on the same server - mostly for testing. This is why we added the attributes appId or serverName to the server section: http://our.umbraco.org/documentation/Installation/load-balancing#Correctconfigforscheduledpublishing&tasks
So we'd have to use the HttpRuntime.AppDomainAppId instead. This could definitely be possible to prevent the duplicate cache refreshes! Since the web service for bulk distributed calls is new, it wouldn't be a breaking change. I'll look into it :)
Debug info like that is gold dust when trying to work out what the heck is going wrong. Good call :)
OK, this works well. We now pass the appId with the batched distributed web service call, so cache refreshing no longer occurs twice on the source server.
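The self-origin check can be sketched roughly as follows. This is an illustration only: `BatchMessage`, `Server`, and `origin_app_id` are hypothetical names, and the sketch assumes the source server has already refreshed its own cache before the batch is sent. In the actual change the identifier would come from HttpRuntime.AppDomainAppId, as discussed above.

```python
from dataclasses import dataclass


@dataclass
class BatchMessage:
    origin_app_id: str  # hypothetical field carrying the sender's app id
    calls: list         # the batched refresh instructions


class Server:
    def __init__(self, app_id: str):
        self.app_id = app_id
        self.refreshed = []  # records every refresh applied on this server

    def refresh_local(self, calls: list) -> None:
        # Stand-in for running the real cache refreshers.
        self.refreshed.extend(calls)

    def receive(self, message: BatchMessage) -> bool:
        """Apply a batch unless this server sent it. Returns True if applied."""
        if message.origin_app_id == self.app_id:
            # The batch came from this app domain, so skip the second refresh.
            return False
        self.refresh_local(message.calls)
        return True
```

Comparing an app id rather than a machine name or IP address keeps the check working when several load balanced sites run on the same machine, which is the case raised earlier in the thread.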
Assignee: Shannon Deminick
Backwards Compatible: True
Affected versions: 6.1.0, 6.1.1, 6.1.2, 6.1.3
Due in version: 7.1.5, 6.2.2