SqlAzureRecoveryStrategy

obzekt
User
Posts: 60
Joined: 29-Apr-2004
# Posted on: 15-Sep-2021 09:01:07   

Hello. We have deployed our app, built with LLBLGen Pro 5.7.1, in Azure. It uses a recovery strategy and (optionally) AAD authentication to Azure SQL:

    internal class AzureAdapter : DataAccessAdapter
    {
        private bool _acquireAADToken;

        internal AzureAdapter(string connectionString, bool keepConnectionOpen, bool acquireToken) : base(connectionString, keepConnectionOpen)
        {
            _acquireAADToken = acquireToken;
            ActiveRecoveryStrategy = new SD.LLBLGen.Pro.ORMSupportClasses.SqlAzureRecoveryStrategy();
        }

        protected override DbConnection CreateNewPhysicalConnection(string connectionString)
        {
            SqlConnection conn = (SqlConnection)base.CreateNewPhysicalConnection(connectionString);

            if (_acquireAADToken)
            {
                // https://docs.microsoft.com/en-us/dotnet/api/overview/azure/service-to-service-authentication
                string tokenConnString = (string.IsNullOrEmpty(AzureUtils.AzureIdentityClientID)) ? null : string.Format("RunAs=App;AppId={0}", AzureUtils.AzureIdentityClientID);
                conn.AccessToken = (new AzureServiceTokenProvider(tokenConnString)).GetAccessTokenAsync(AzureUtils.SqlResourceID).Result;
            }

            return conn;
        }
    }

We notice some intermittent errors, in the form of Win32Exception:

An exception was caught during the execution of a retrieval query: A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.). Check InnerException, QueryExecuted and Parameters of this exception to examine the cause of this exception. A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.) The semaphore timeout period has expired

Here's the call stack:

SD.LLBLGen.Pro.ORMSupportClasses.ORMQueryExecutionException:
   at SD.LLBLGen.Pro.ORMSupportClasses.RetrievalQuery.Execute (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.DataAccessAdapterCore.ExecuteSingleRowRetrievalQuery (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
...
   at SD.LLBLGen.Pro.ORMSupportClasses.RetrievalQuery.TagAndExecuteCommand (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.RetrievalQuery.Execute (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
Inner exception System.ComponentModel.Win32Exception handled at System.Data.SqlClient.SqlConnection.OnError:

It seems SqlAzureRecoveryStrategy is only inspecting SqlExceptions. Shouldn't the above network error be considered transient too?

Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 15-Sep-2021 09:09:48   

The strategy inspects SqlExceptions because it can then determine whether they are non-fatal. Losing a connection at the SqlConnection level, for example, can be retried, but if a query fails you can't retry it at the strategy level; you have to do that at a higher level. Other exceptions can be caused by anything, so at the strategy level we can't decide whether to retry for all exceptions, only for the ones that are recoverable at that level. Any other exception is considered fatal, and the operation has to be restarted at a higher level.
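To illustrate what restarting "at a higher level" could look like, here is a minimal sketch of a generic retry wrapper around a whole unit of work. `RetryHelper`, its parameters and the default delay are hypothetical names chosen for this example; none of them are part of LLBLGen Pro, and the sketch assumes the wrapped operation is safe to run more than once.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not an LLBLGen Pro API): reruns a whole unit of work
// when the caller-supplied predicate classifies the failure as retryable.
public static class RetryHelper
{
    public static T Execute<T>(
        Func<T> operation,
        Func<Exception, bool> isRetryable,
        int maxAttempts = 3,
        TimeSpan? delay = null)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (isRetryable == null) throw new ArgumentNullException(nameof(isRetryable));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        // Assumed default; tune it to the outage length you observe.
        TimeSpan wait = delay ?? TimeSpan.FromSeconds(2);

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            // The filter is false on the last attempt or for non-retryable
            // errors, so those exceptions propagate to the caller unchanged.
            catch (Exception ex) when (attempt < maxAttempts && isRetryable(ex))
            {
                Thread.Sleep(wait);
            }
        }
    }
}
```

The operation passed in would typically create its own adapter and run the complete fetch, so that each attempt starts with a fresh connection rather than reusing one that may be broken.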

The exception you posted looks like an exception inside the SqlConnection class. Does the inner exception's stack trace reveal anything about where it crashes? If you're using the new SqlClient (Microsoft.Data.. ), are you on the latest version? (It has had some bug fixes since release.) In other words, this might be an error we can recover from at the strategy level that SqlClient doesn't surface as such (a Win32 error suggests it might have a bug).

By the way, you can adjust the strategy class to check for this exception as well and retry as if it were a recoverable exception.

Frans Bouma | Lead developer LLBLGen Pro
obzekt
User
Posts: 60
Joined: 29-Apr-2004
# Posted on: 15-Sep-2021 09:53:55   

Thanks, Frans, for the quick reply. We may indeed have to customize the recovery strategy to include that Win32Exception if it persists. By the way, this DB exception came right after a SocketFailure and a timeout connecting to Redis, so it must be a networking/infrastructure issue, some kind of swap taking place, not triggered by us. The outage lasted ~30 sec.

The app uses System.Data.SqlClient.dll (4.700.20.37001) and this is the full stack trace with the inner SqlException:

SD.LLBLGen.Pro.ORMSupportClasses.ORMQueryExecutionException:
   at SD.LLBLGen.Pro.ORMSupportClasses.RetrievalQuery.Execute (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.EntityMaterializerBase.Materialize (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.DataAccessAdapterCore.ExecuteMultiRowRetrievalQuery (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.DataAccessAdapterCore.FetchEntityCollectionInternal (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.DataAccessAdapterCore.FetchEntityCollection (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.DataAccessAdapterBase+<>c__DisplayClass10_0.<FetchEntityCollection>b__0 (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.RecoveryStrategyBase+<>c__DisplayClass7_0.<Execute>b__0 (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.RecoveryStrategyBase.Execute (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
...
   at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke (System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin (System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5 (System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage11 (System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.ServiceModel.Dispatcher.MessageRpc.Process (System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
Inner exception System.Data.SqlClient.SqlException handled at SD.LLBLGen.Pro.ORMSupportClasses.RetrievalQuery.Execute:
   at System.Data.SqlClient.SqlConnection.OnError (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniError (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.TdsParserStateObject.TryReadByte (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.TdsParser.TryRun (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlDataReader.get_MetaData (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlCommand.ExecuteReader (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at SD.LLBLGen.Pro.ORMSupportClasses.RetrievalQuery.TagAndExecuteCommand (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
   at SD.LLBLGen.Pro.ORMSupportClasses.RetrievalQuery.Execute (SD.LLBLGen.Pro.ORMSupportClasses, Version=5.7.0.0, Culture=neutral, PublicKeyToken=ca73b74ba4e3ff27)
Inner exception System.ComponentModel.Win32Exception handled at System.Data.SqlClient.SqlConnection.OnError:
Otis
LLBLGen Pro Team
Posts: 39614
Joined: 17-Aug-2003
# Posted on: 16-Sep-2021 09:07:23   

It indeed looks like you have to anticipate it in the strategy if you want to retry at that level. It's tricky, as a server going down likely has the same effect and you can't tell the difference. Then again, if the server is down, retries at any level will fail.

You can take the source of the strategy, adjust it to add your exceptions, and assign an instance of that adjusted strategy to the adapter instead.
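As a rough sketch of the extra check such an adjusted strategy could perform, the helper below walks the exception chain and looks for a `Win32Exception` with native error code 121 (`ERROR_SEM_TIMEOUT`, "The semaphore timeout period has expired"). `TransportErrorDetector` and `IsSemaphoreTimeout` are hypothetical names, and where exactly to call it inside the strategy is an assumption to verify against the strategy source for your version.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper (not part of LLBLGen Pro): detects the Win32
// "semaphore timeout" error anywhere in an exception chain.
public static class TransportErrorDetector
{
    // Win32 ERROR_SEM_TIMEOUT: "The semaphore timeout period has expired."
    private const int ErrorSemTimeout = 121;

    public static bool IsSemaphoreTimeout(Exception toCheck)
    {
        // In the trace above the chain is ORMQueryExecutionException ->
        // SqlException -> Win32Exception, so every inner exception is visited.
        for (Exception current = toCheck; current != null; current = current.InnerException)
        {
            if (current is Win32Exception win32 && win32.NativeErrorCode == ErrorSemTimeout)
            {
                return true;
            }
        }
        return false;
    }
}

// Assumption: the copied strategy class has one place where it decides whether
// an exception is transient; that decision could additionally return true
// when TransportErrorDetector.IsSemaphoreTimeout(exception) is true.
```

Matching on the native error code rather than the message text keeps the check independent of the OS language. Whether retrying this error is safe for a given query is still a judgment call, as the same error can appear when the server is genuinely unreachable.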

Frans Bouma | Lead developer LLBLGen Pro