U4-9101 - fix up how cache refresh instructions are read or written and to prevent duplicate processing

Created by Shannon Deminick 24 Oct 2016, 12:16:29 Updated by Stephan 26 Oct 2016, 17:26:57

Relates to: U4-7673

Subtask of: U4-9085

We need to fix up how cache refresh instructions are read or written. In many cases duplicate cache instructions are processed, and in some of those cases they are processed thousands of times. We should be able to de-duplicate these instructions so that we do not run them more often than they need to be run. This will require some investigation to see what is possible: depending on the order the instructions are in, we might not be able to 'just' take the latest duplicate instruction.

Comments

Shannon Deminick 26 Oct 2016, 10:36:05

* De-duplicating the instructions that are being processed - we don't need to process duplicate instructions, which will save some processing power
* Ensuring that only the MaxProcessingInstructionCount number of instructions are ever written to a json instruction blob per row - so that there are never too many instructions being processed at once at startup
* During syncing (instruction processing), the request used to start instruction processing will only query for the top 100 db rows instead of every row
* During instruction writing, the maximum number of instructions written to any db row will be MaxProcessingInstructionCount
* During instruction processing we will check if the app domain has been signaled to shut down and, if so, we'll exit the processing loop to allow a graceful/quick shutdown. This of course means the current instruction batch will need to be reprocessed, but that is ok.
* When the app domain shutdown is waiting for the instruction processing to terminate, if it does not exit within 5 seconds, the lock will be cleared and the app will continue shutting down
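
To make the batching and shutdown points above concrete, here is a minimal Python sketch of the idea. It is not the code from the PR: `chunk_instructions`, `process_batch`, the `handle` callback and the `shutting_down` callback are hypothetical names used only for illustration, and `max_count` stands in for MaxProcessingInstructionCount.

```python
from typing import Any, Callable, Dict, List

Instruction = Dict[str, Any]  # hypothetical shape; the real instructions are json blobs


def chunk_instructions(instructions: List[Instruction], max_count: int) -> List[List[Instruction]]:
    """Split instructions so that no batch (one db row) holds more than max_count."""
    if max_count < 1:
        raise ValueError("max_count must be at least 1")
    return [instructions[i:i + max_count] for i in range(0, len(instructions), max_count)]


def process_batch(
    batch: List[Instruction],
    handle: Callable[[Instruction], None],
    shutting_down: Callable[[], bool],
) -> bool:
    """Process one batch, checking for shutdown before each instruction.

    Returns False if processing was interrupted. The caller should then leave
    the batch unmarked so that the whole batch is reprocessed on next start.
    """
    for instruction in batch:
        if shutting_down():
            return False
        handle(instruction)
    return True
```

Under these assumptions, a writer would store each list returned by `chunk_instructions` as its own row, and a reader would only record a row as processed when `process_batch` returns True.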

Some notes regarding the de-duplication of instructions. We need to consider whether not re-processing a duplicate instruction could cause data inconsistencies. Two instructions are considered identical only if ALL of their information matches.
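
As a rough illustration of "identical if all information matches", the following Python sketch removes exact duplicates while preserving order. It is an assumption-laden example, not the PR's implementation: the function name `dedupe_instructions` and the instruction fields used in the usage note are hypothetical, and keeping the first occurrence (rather than the latest) is just one possible choice, given the ordering concern raised in the description.

```python
import json
from typing import Any, Dict, List


def dedupe_instructions(instructions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop instructions whose every field matches an earlier instruction.

    Assumption: the first occurrence is kept and the relative order of the
    remaining instructions is preserved.
    """
    seen = set()
    unique = []
    for instruction in instructions:
        # Serialize with sorted keys so field order does not affect equality.
        key = json.dumps(instruction, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(instruction)
    return unique
```

For example, given two identical `{"refresher": "page", "type": "refresh", "id": 1}` entries followed by a `"remove"` instruction for the same id, the sketch keeps one refresh and the remove, because the remove differs in at least one field.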

* PageCacheRefresher - This clears some runtime caches and updates the item (which is resolved from the database) into the content xml cache or removes the item from the xml cache. There should be no reason why re-processing a duplicate instruction would have a different effect on the runtime cache or the xml cache.
* UnpublishedPageCacheRefresher - This clears some runtime caches and updates any sort order for an item (which is resolved from the database) in the xml cache. There should be no reason why re-processing a duplicate instruction would have a different effect on the runtime cache or the xml cache.
* MediaCacheRefresher - This clears some runtime cache. There should be no reason why re-processing a duplicate instruction would have a different effect on the runtime cache or the xml cache.

All of the above also invoke Examine to re-index data. When this occurs the data is also looked up from the database and converted to XML to store in Lucene. There should be no reason why re-processing a duplicate instruction would have a different effect on the index value. When a cache refresh remove instruction is executed, Examine will check if it needs to remove the index item AND its descendants - this is done by querying the state of the index for descendants ... so the question is, would removing duplicate instructions create an issue for doing this?


Shannon Deminick 26 Oct 2016, 12:26:03

PR for review: https://github.com/umbraco/Umbraco-CMS/pull/1550


Shannon Deminick 26 Oct 2016, 12:26:20

This PR is also for http://issues.umbraco.org/issue/U4-7673


Stephan 26 Oct 2016, 16:02:17

Seems ok, merged.


Priority: Normal

Type: Bug

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions:

Due in version: 7.5.5

Sprint: Sprint 45

Story Points: 3

Cycle: