U4-9136 - XML Data integrity fix doesn't work on Umbraco Cloud

Created by Sebastiaan Janssen 02 Nov 2016, 08:30:59 Updated by Shannon Deminick 02 Mar 2017, 15:35:28

Tags: Unscheduled

Relates to: U4-9587

When fixing the XML data integrity of media items on Umbraco Cloud with a large number of media items (about 60.000), the fix tool can't complete because SQL Server times out at some point. This is going to be an issue for more people running SQLAzure (even outside of Umbraco Cloud).

The error: 2016-11-02 07:11:25,891 [P8188/D12/T32] ERROR Umbraco.Core.Persistence.UmbracoDatabase - Database exception occurred System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. This failure occurred while attempting to connect to the routing destination. The duration spent while attempting to connect to the original server was - [Pre-Login] initialization=4; handshake=18; [Login] initialization=0; authentication=0; [Post-Login] complete=3; ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error) at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer() at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource1 completion, Int32 timeout, Task& task, Boolean asyncWrite) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean asyncWrite) at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at StackExchange.Profiling.Data.ProfiledDbCommand.ExecuteNonQuery() at Umbraco.Core.Persistence.PetaPocoCommandExtensions.<>c__DisplayClass1.<ExecuteNonQueryWithRetry>b__0() at Umbraco.Core.Persistence.FaultHandling.RetryPolicy.ExecuteAction[TResult](Func1 func) at Umbraco.Core.Persistence.PetaPocoCommandExtensions.ExecuteNonQueryWithRetry(IDbCommand command, RetryPolicy cmdRetryPolicy, RetryPolicy conRetryPolicy) at Umbraco.Core.Persistence.PetaPocoCommandExtensions.ExecuteNonQueryWithRetry(IDbCommand command, RetryPolicy retryPolicy) at Umbraco.Core.Persistence.PetaPocoCommandExtensions.ExecuteNonQueryWithRetry(IDbCommand command) at Umbraco.Core.Persistence.Database.Execute(String sql, Object[] args) ClientConnectionId:e2a9f31a-b71b-45cb-a0a1-230e5ce770bf Error Number:-2,State:0,Class:11 ClientConnectionId before routing:98551529-dc87-40ac-b539-0f18de3560f3 Routing Destination:a83baef09cf0.tr33.westeurope1-a.worker.database.windows.net,11084 2016-11-02 07:11:43,273 [P8188/D12/T144] INFO Umbraco.Core.PluginManager - Starting resolution types of Umbraco.Web.Trees.TreeController 2016-11-02 07:11:43,273 [P8188/D12/T144] INFO Umbraco.Core.PluginManager - Completed resolution of types of Umbraco.Web.Trees.TreeController, found 0 (took 0ms) 2016-11-02 07:12:21,107 [P8188/D12/T188] ERROR Umbraco.Core.Persistence.UmbracoDatabase - Database exception occurred System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. This failure occurred while attempting to connect to the routing destination. The duration spent while attempting to connect to the original server was - [Pre-Login] initialization=17; handshake=25; [Login] initialization=0; authentication=0; [Post-Login] complete=2; ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error) at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer() at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource1 completion, Int32 timeout, Task& task, Boolean asyncWrite) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean asyncWrite) at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at StackExchange.Profiling.Data.ProfiledDbCommand.ExecuteNonQuery() at Umbraco.Core.Persistence.PetaPocoCommandExtensions.<>c__DisplayClass1.<ExecuteNonQueryWithRetry>b__0() at Umbraco.Core.Persistence.FaultHandling.RetryPolicy.ExecuteAction[TResult](Func1 func) at Umbraco.Core.Persistence.PetaPocoCommandExtensions.ExecuteNonQueryWithRetry(IDbCommand command, RetryPolicy cmdRetryPolicy, RetryPolicy conRetryPolicy) at Umbraco.Core.Persistence.PetaPocoCommandExtensions.ExecuteNonQueryWithRetry(IDbCommand command, RetryPolicy retryPolicy) at Umbraco.Core.Persistence.PetaPocoCommandExtensions.ExecuteNonQueryWithRetry(IDbCommand command) at Umbraco.Core.Persistence.Database.Execute(String sql, Object[] args) ClientConnectionId:b4646c61-86c4-4253-9ca8-395f3de0a2eb Error Number:-2,State:0,Class:11 ClientConnectionId before routing:a2ef0264-5b00-4993-be2c-846dabd24a93 Routing Destination:a83baef09cf0.tr33.westeurope1-a.worker.database.windows.net,11084

Comments

Sebastiaan Janssen 02 Nov 2016, 08:31:56

Proposed fix: don't do this in a giant transaction but replace xml rows 1 by 1 instead of deleting everything and rebuilding.


Stephan 02 Nov 2016, 11:46:26

PR: https://github.com/umbraco/Umbraco-CMS/pull/1562

Before: in 1 big transaction,

  • delete everything Xml
  • in groups of 5000 items, ordered by path (why?) -- load the items, generate the Xml -- bulk insert the Xml rows

Now: in groups of 200 items, ordered by id

  • load the items, generate the Xml
  • insert or update each Xml row

It potentially can be slower. But, it should use less memory and (and that's the whole point) be much more resilient as it should not keep a giant transaction running that Azure will eventually time out.

Test: need to test on the Azure site we see is failing.


Priority: Major

Type: Bug

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions: 7.5.4

Due in version: 7.5.5

Sprint: Sprint 46

Story Points: 1

Cycle: