U4-3189 - Umbraco 6 distributed call, can't ping itself on startup

Created by Arthur Berthier 17 Oct 2013, 19:52:00 Updated by Shannon Deminick 22 May 2014, 01:04:14

Relates to: U4-2633

We are currently using Umbraco 6.1.1 in a load-balanced environment. We have one authoring server and two production servers.

The umbracoSettings.config is set like this on the authoring server. The production servers don't have distributed calls enabled.

<user>0</user>
<servers>
  <server>web1</server>
  <server>web2</server>
</servers>

It works fine this way. The problem is that once I add the authoring server (auth) to the server list, the authoring server can no longer be restarted.

After investigation, the process appears to freeze at this point:

INFO Umbraco.Core.Sync.DefaultServerMessenger - [Thread 51] Submitting calls to distributed servers

At that point it is trying to access the page

/umbraco/webservices/CacheRefresher.asmx

but this page is not accessible because it is waiting for the startup process to finish, which is waiting for the page to be available...

We found a solution by modifying the source code of this file: Umbraco.Core\Sync\DefaultServerMessenger.cs. The changes are attached; look for "//Changes AB" (there are 4 changes).

The idea is to always refresh the authoring server's cache locally, whether or not distributed calls are enabled, and to stop the authoring server from sending a distributed call to itself by removing it from the list of distributed call targets.
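
To illustrate the idea only, here is a minimal language-neutral sketch in Python. It is not the attached C# change: `refresh_all`, `remote_targets`, `refresh_local` and `send_remote` are hypothetical names, and the local server name is assumed to be supplied explicitly rather than detected.

```python
from typing import Callable, Iterable, List


def remote_targets(servers: Iterable[str], local_server: str) -> List[str]:
    """Return the configured servers with the local one removed (case-insensitive)."""
    local = local_server.strip().lower()
    return [s for s in servers if s.strip().lower() != local]


def refresh_all(
    servers: Iterable[str],
    local_server: str,
    refresh_local: Callable[[], None],  # hypothetical in-process cache refresh
    send_remote: Callable[[str], None],  # hypothetical call to a remote server
) -> List[str]:
    """Refresh the local cache in-process, then notify every other server."""
    # The local refresh never goes over HTTP, so it cannot wait on our own startup.
    refresh_local()
    remotes = remote_targets(servers, local_server)
    for server in remotes:
        send_remote(server)
    return remotes
```

With the list "auth, web1, web2" and "auth" as the local server, only web1 and web2 would receive a distributed call, while auth refreshes its own cache directly.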

1 Attachments

Download DefaultServerMessenger.cs

Comments

Tristan Thompson 29 Oct 2013, 10:48:07

Just to add to this - I have the same issue, however, this fix didn't work.

My symptom is that when I turn distributed calls on, the site just hangs on startup and I have to kill the IIS process

The above code change didn't work for me.


Arthur Berthier 29 Oct 2013, 19:42:09

Where is your process stuck, if you look at umbracoTrace?


Chris Hopkins 05 Mar 2014, 14:56:52

I was having this exact same issue. I implemented the change above and it still didn't work. However, I hadn't removed the Admin server from the umbracoSettings.config file; once I did this it worked perfectly.

Thanks for spotting this


Matt Salmon 05 Mar 2014, 15:19:00

The //TODO statement on line 395 of the DefaultServerMessenger.cs file implies there would be a performance gain from implementing this fix, as well as resolving the issue that some of us are experiencing: https://github.com/umbraco/Umbraco-CMS/blob/6.1.6/src/Umbraco.Core/Sync/DefaultServerMessenger.cs


Shannon Deminick 24 Apr 2014, 04:58:13

Fixing the TODO statement in there is tricky, which is why it hasn't been done yet: we cannot reliably detect which server in the list matches the server that is currently executing. In some scenarios we can, in others we cannot.
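
As a rough illustration of why that detection is unreliable, here is a small Python sketch (not Umbraco code). `find_local_entry` and `local_identities` are hypothetical names; the set of names and addresses the machine knows itself by is assumed to be gathered elsewhere.

```python
from typing import Iterable, Optional, Set


def find_local_entry(servers: Iterable[str], local_identities: Set[str]) -> Optional[str]:
    """Return the one configured entry that names this machine, or None if unclear."""
    known = {name.strip().lower() for name in local_identities}
    matches = [s for s in servers if s.strip().lower() in known]
    # No match: the config uses a name this machine does not know itself by.
    # Several matches: more than one entry points here, so picking one is a guess.
    return matches[0] if len(matches) == 1 else None
```

If the config lists a name that only resolves through a load balancer or external DNS, the function returns None and the caller cannot tell which entry is itself.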

In the meantime can I get some more background info on this - Can you tell me why a distributed call is being made on server startup? AFAIK the Umbraco core doesn't perform CRUD operations during startup so distributed calls will not be made. Are you running some custom code on startup that is performing any CRUD?


Shannon Deminick 24 Apr 2014, 04:58:41

Doh! By CRUD I actually just mean the CUD part ..


Arthur Berthier 16 May 2014, 00:01:56

On server startup, Umbraco pings the distributed servers to check which ones are alive and should receive a distributed call.


Shannon Deminick 16 May 2014, 00:03:54

Is this something custom that you've written? There's nothing in the core that pings the distributed servers on startup.


Arthur Berthier 16 May 2014, 00:14:23

No, it is the default behavior. The custom code I wrote was to work around the problem.


Shannon Deminick 16 May 2014, 00:18:10

Ok, so a different question - do you have any event handlers running on startup? i.e. to publish/save/update content, etc... ?

Are you also able to tell me what kind of distributed call is being made that fails (i.e. content, media, etc...) during startup?

Again, we don't 'ping' servers on startup; the only time a distributed call is made is when an entity is saved/updated/deleted/etc...


Tristan Thompson 20 May 2014, 15:51:51

So, to be clear about the issue I am having - I have 2 web servers, "web1" and "web2", with no separate authoring server. I enter both server names into the umbraco.config (the file is replicated across both servers, so it needs to stay the same).

As soon as I enable the distributed call (enabled="true") and restart the app pool, the site just hangs (same symptoms as the main issue, just slightly different configuration). This wasn't an issue in Umbraco v4 as we've had this setup for years.

The only handlers I have are on publish/new doc...nothing on-startup...the same handlers I hook into my v4 sites.


Shannon Deminick 22 May 2014, 01:04:04

The only case I can think of where this might occur is if there is no xml cache file on the server, and during startup it gets generated and potentially sends a distributed call. I'll need to check, will have a look on Monday.

In any case, I realize the underlying problem: if a distributed call is made during startup, it cannot complete, because the server is waiting for a response from itself but hasn't finished initializing yet. I'm fairly sure this would fix this issue: U4-2633


Priority: Normal

Type: Bug

State: Open

Assignee: Shannon Deminick

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions: 6.1.1

Due in version:

Sprint:

Story Points:

Cycle: