Query cancellation in “async_network_io” state

Posts   
 
    
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 17-Sep-2019 12:27:34   

One of our customers has a very high performance server with SQL Server 2012, capable of processing large amounts of data in a very short time. Unfortunately, the network does not seem to be as efficient. As a client, users can set filters and query a view using LLBLGen. The view has many fields but is a simple set of INNER JOIN. Over long periods it returns large amounts of data. Client side the operator can also manually cancel the query in progress. Everything works properly under normal conditions.

The particular problem is when a query is in progress from the client and, manually, the operator decides to interrupt it: in these conditions the SQL process remains alive and blocks the tables until the client application is closed. In detail the condition is that: - The query goes into "async_network_io" within a few seconds and stays there for tens of seconds. In other words, processing is much faster than transferring information. - If the query is canceled BEFORE it goes into the "async_network_io" state, the deletion is successful. - If the query is canceled DURING the "async_network_io" state, the process seems to go to block.

We are able to systematically re-create it, but we have no idea how to intervene and whether it is a SQL, network or LLBLGen problem. We also tried to update the project's .NET Framework (the original was with .NET 2.0), but the result seems the same. We also tried to update the project's LlLBLGen references (the original was 3.5), but the result seems the same.

Any advice on this?

Thanks

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39774
Joined: 17-Aug-2003
# Posted on: 17-Sep-2019 14:08:19   

What llblgen pro version are you using? I understand from your post that you're not using Async queries on .NET 4.5+ ? (in which case you should use the cancellationtoken option)

You use adapter or selfservicing?

Frans Bouma | Lead developer LLBLGen Pro
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 17-Sep-2019 14:14:15   

We use: - .NET 3.5 - LLBLGen 3.5 We use DataAccessAdapter and we load a TypedView.

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 18-Sep-2019 08:25:40   

If you use .Net3.5 then you are not using the async keywork, Is that correct? Could you please share the code and details about the repro case? That way we can isolate the problem and investigate further. If you could create a standard repro case using Northwin DB or something like that it would be ever better, if it's only possible using your code and db, please provide the most basic elements that reproduce the scenario.

David Elizondo | LLBLGen Support Team
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 18-Sep-2019 10:05:09   

The code is really trivial: it is a simple view (ElencoServiziTypedView) filtered by a bucket (composed on the client side). This code is executed on a secondary thread, which can be interrupted (Abort).

Unfortunately we can't even reproduce it locally: unlike our client's server, we don't have enough data (320Gb) nor sufficient processing power (locally we would have timeout, on our client's server it processes data quickly).

I point out that, in these special conditions, the query lasts even more than 30 seconds but does NOT time out because it was processed on the server. From our observations, the fact that the query is interrupted during the "async_network_io" state seems to cause this block.

Question: when the dispose of your DataAccessAdapter is executed, is the query "canceled" in any way?

I am attaching the simple code we use for this query. Thanks in advance.


ElencoServiziTypedView viewElenco = null; using (DataAccessAdapter adapter = new DataAccessAdapter(_dbFactory.stringaConnessione)) { // Imposto il catalog name affinchè LLBLGen generi le query // con il nome del catalog corretto adapter.CatalogNameUsageSetting = CatalogNameUsage.ForceName; adapter.CatalogNameToUse = _dbFactory.getConnection().Database;

viewElenco = new ElencoServiziTypedView();

if (bucket == null)
{ bucket = new RelationPredicateBucket(); }

adapter.FetchTypedView(viewElenco, bucket, true);

} return viewElenco;

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39774
Joined: 17-Aug-2003
# Posted on: 18-Sep-2019 10:30:32   

mcarugati wrote:

The code is really trivial: it is a simple view (ElencoServiziTypedView) filtered by a bucket (composed on the client side). This code is executed on a secondary thread, which can be interrupted (Abort).

Aborting a thread isn't a good way to do this. See below how to do this. Aborting threads shouldn't be done in .NET. Instead implement a way to simply end the thread.

Unfortunately we can't even reproduce it locally: unlike our client's server, we don't have enough data (320Gb) nor sufficient processing power (locally we would have timeout, on our client's server it processes data quickly).

I point out that, in these special conditions, the query lasts even more than 30 seconds but does NOT time out because it was processed on the server. From our observations, the fact that the query is interrupted during the "async_network_io" state seems to cause this block.

This happens because a synchronous operation in SqlClient still uses async io nowadays. You simply kill the parent thread, but the async io works with callbacks under the hood (IO completion ports in windows) so they're orphaned due to the thread abort and you get this error. There's a way around it, see below.

Question: when the dispose of your DataAccessAdapter is executed, is the query "canceled" in any way?

I am attaching the simple code we use for this query. Thanks in advance.


ElencoServiziTypedView viewElenco = null; using (DataAccessAdapter adapter = new DataAccessAdapter(_dbFactory.stringaConnessione)) { // Imposto il catalog name affinchè LLBLGen generi le query // con il nome del catalog corretto adapter.CatalogNameUsageSetting = CatalogNameUsage.ForceName; adapter.CatalogNameToUse = _dbFactory.getConnection().Database;

viewElenco = new ElencoServiziTypedView();

if (bucket == null)
{ bucket = new RelationPredicateBucket(); }

adapter.FetchTypedView(viewElenco, bucket, true);

} return viewElenco;

In the code above, the adapter isn't disposed till the FetchTypedView returns. Having another thread dispose this object isn't recommended so there's no code in place to deal with this: an adapter is supposed to be used by a single thread, so if it operates on a query, the calling thread for that query is executing the query so it can't also stop it.

What you should do is the following. You have to create a partial class of DataAccessAdapter and override ExecuteMultiRowDataTableRetrievalQuery. In the override you store the IRetrievalQuery instance in a local variable, call the base method and if that one succeeds, simply exit.

To the partial class you also add methods to make sure you can cancel the command and connection. In our ORM Profiler application we have a panel where the user can execute the query to obtain the resultset. the user can cancel this. We do this in the following way.

  • we execute the query on a backgroundworker. This basically executes the job on a threadpool thread and it offers some extras:
private void CancelBackgroundWorker()
{
    //...
    if(_queryExecutorBW.IsBusy)
    {
        _queryExecutorBW.CancelAsync();
    }
    if(_retriever != null)
    {
        try
        {
            _retriever.AbortRetrievalAction();
            _retriever = null;
        }
        catch
        {
            // ignore. 
        }
    }
    //...
}

Here is the code which stops the action on the background worker. (_queryExecutorBW). This is thus ran on another thread (the main thread). It's similar to your situation.

The '_retriever' is an object which executes the query. To abort a DbCommand in flight, we use the following trick:


/// <summary>
/// Aborts the current execution of a command.
/// </summary>
public void AbortRetrievalAction()
{
    if(_currentConnection == null)
    {
        return;
    }
    try
    {
        if(_currentCommand != null)
        {
            _currentCommand.Cancel();
            _currentCommand.Dispose();
        }
    }
    catch
    {
        // swallow all, can't deal with it anyway.
    }
    finally
    {
        _currentCommand = null;
        AbortCurrentConnection();
    }
}


/// <summary>
/// Aborts the current connection.
/// </summary>
private void AbortCurrentConnection()
{
    if(_currentConnection == null)
    {
        return;
    }
    try
    {
        if(_currentConnection.State != ConnectionState.Closed)
        {
            _currentConnection.Close();
        }
        var clearPoolMethod = _currentConnection.GetType().GetMethod("ClearPool", BindingFlags.Public | BindingFlags.Static);
        if(clearPoolMethod != null)
        {
            clearPoolMethod.Invoke(null, new object[] { _currentConnection });
        }
        _currentConnection.Dispose();
    }
    catch
    {
        // swallow all. Nothing we can do
    }
    finally
    {
        _currentConnection = null;
    }
}

We also use this in LLBLGen Pro v5.x when you connect to a db and there's no response for 2-3 seconds. We kill it off so the user doesn't have to sit through 30 seconds till the timeout hits and they can change the settings they filled in wrong. simple_smile

Hope this helps. And you really should move to .NET 4.5.2+ .NET 3.5 isn't supported by MS anymore (well, it is sort of, but not recommended).

Frans Bouma | Lead developer LLBLGen Pro
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 18-Sep-2019 12:08:35   

Thanks Otis for these interesting ideas.

I suppose the way you interrupt the thread (Abort or CancelAsync) doesn't affect the SQL side block problem, correct? I suppose the important thing is to delete the query and close the connection, which from what I understand is NOT managed automatically when the DataAccessAdapter is disposed.

By temporarily leaving the thread termination issue (Abort vs. CancelAsync): the following code should work correctly against the SQL issue? It simply contains the call to your AbortRetrievalAction method in case of Abort. Or does termination of the thread through Abort instead of CancelAsync in any way affect it?


ElencoServiziTypedView viewElenco = null; using (DataAccessAdapter adapter = new DataAccessAdapter(_dbFactory.stringaConnessione)) { // Imposto il catalog name affinchè LLBLGen generi le query // con il nome del catalog corretto adapter.CatalogNameUsageSetting = CatalogNameUsage.ForceName; adapter.CatalogNameToUse = _dbFactory.getConnection().Database;

try
{   
    viewElenco = new ElencoServiziTypedView();

    if (bucket == null)
    { bucket = new RelationPredicateBucket(); }

    adapter.FetchTypedView(viewElenco, bucket, true);
}
catch (System.Threading.ThreadAbortException ex)
{
    try
    {
        // In caso di abort (annullamento manuale) forzo la cancellazione della query
        adapter.AbortRetrievalAction();
    }
    catch (Exception exAbort)
    {
        Log.Error("Errore in exAbort", exAbort);
    }
    // In ogni caso propago l'eccezione iniziale per mantenere gli stessi flussi
    throw ex;
}

} return viewElenco;

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39774
Joined: 17-Aug-2003
# Posted on: 19-Sep-2019 10:19:33   

mcarugati wrote:

Thanks Otis for these interesting ideas.

I suppose the way you interrupt the thread (Abort or CancelAsync) doesn't affect the SQL side block problem, correct?

Yes it does, it basically kills the command nicely without error.

I suppose the important thing is to delete the query and close the connection, which from what I understand is NOT managed automatically when the DataAccessAdapter is disposed.

When the adapter is disposed, connections are closed (disposed as well, so closed implicitly), but to do that, you need another thread, so the behavior is undefined.

Hence, you need something else: use the code I posted above to make sure the active command is stopped first. The code above works. After that, the method you called to fetch the typedview will end and everything will terminate gracefully without the necessity for thread abort.

By temporarily leaving the thread termination issue (Abort vs. CancelAsync): the following code should work correctly against the SQL issue? It simply contains the call to your AbortRetrievalAction method in case of Abort. Or does termination of the thread through Abort instead of CancelAsync in any way affect it?


ElencoServiziTypedView viewElenco = null; using (DataAccessAdapter adapter = new DataAccessAdapter(_dbFactory.stringaConnessione)) { // Imposto il catalog name affinchè LLBLGen generi le query // con il nome del catalog corretto adapter.CatalogNameUsageSetting = CatalogNameUsage.ForceName; adapter.CatalogNameToUse = _dbFactory.getConnection().Database;

try
{   
    viewElenco = new ElencoServiziTypedView();

    if (bucket == null)
    { bucket = new RelationPredicateBucket(); }

    adapter.FetchTypedView(viewElenco, bucket, true);
}
catch (System.Threading.ThreadAbortException ex)
{
    try
    {
        // In caso di abort (annullamento manuale) forzo la cancellazione della query
        adapter.AbortRetrievalAction();
    }
    catch (Exception exAbort)
    {
        Log.Error("Errore in exAbort", exAbort);
    }
    // In ogni caso propago l'eccezione iniziale per mantenere gli stessi flussi
    throw ex;
}

} return viewElenco;

No, you should start another thread OR call from the main thread, the adapter.AbortRetrievalAction(). Using thread.abort means things are already in teh aborted stage, the executing thread is killed off and this likely will still cause the error.

Like I said above, we call the abortretrievalaction from the main thread.

Frans Bouma | Lead developer LLBLGen Pro
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 13-Jan-2020 16:28:39   

Hello,

After some time this problem is worsening. Two questions:   - In the case of queries executed through Linq it seems that the methods to be overridden are NOT called. Do we need to make any further changes if Linq is used?   - To solve the root problem: with newer versions of Llbl (we have 4.2) and .NET is this problem solved?

Thank you, Regards

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39774
Joined: 17-Aug-2003
# Posted on: 14-Jan-2020 09:44:36   

mcarugati wrote:

Hello,

After some time this problem is worsening. Two questions:   - In the case of queries executed through Linq it seems that the methods to be overridden are NOT called. Do we need to make any further changes if Linq is used?   - To solve the root problem: with newer versions of Llbl (we have 4.2) and .NET is this problem solved?

Thank you, Regards

Which methods did you override? The problem you ran into is due to the thread abort as I described above.

Frans Bouma | Lead developer LLBLGen Pro
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 14-Jan-2020 10:00:08   

I'm afraid I haven't explained myself well: the ExecuteMultiRowDataTableRetrievalQuery method we mentioned earlier with the TypedView seems not to be called if Linq is used.

Which is the correct method to override if we Linq instead of TypedView?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39774
Joined: 17-Aug-2003
# Posted on: 14-Jan-2020 11:42:25   

Could you give the query code? The method you mention is indeed only used when you fetch a typed view into a datatable (nowadays the default is that a typedview is mapped to a poco class).

Frans Bouma | Lead developer LLBLGen Pro
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 14-Jan-2020 12:30:15   

They are very trivial queries on Poco views. Thanks

IQueryable<ViewExamplePOCORow> query = _metadata.ViewExamplePOCO.Select(d => d);

return _dao.LinqToOrmSelectList(query);
// *****************************************************************************************************************
        /// <summary>
        /// Provvede ad eseguire la query LinqToOrm fornita come parametro.
        /// </summary>
        /// <typeparam name="T"></typeparam>
        /// <param name="query">query</param>
        /// <param name="transaction">transaction</param>
        /// <returns>
        /// Resttiuisce una lista degli oggetti previsti dalla query.
        /// </returns>
        // *****************************************************************************************************************
        internal List<T> LinqToOrmSelectList<T>(IQueryable<T> query, TransactionManager transaction = null)
        {
            if (transaction != null)
            {
                _transaction = transaction;
            }
            DataAccessAdapter adapter = GetAdapter();
            try
            {
                ((LLBLGenProProvider2)query.Provider).AdapterToUse = adapter;

                return query.ToList();
            }
            finally
            {
                DisposeAdapter(adapter);
            }
        }
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39774
Joined: 17-Aug-2003
# Posted on: 15-Jan-2020 10:45:14   

Ah simple_smile These end up in FetchProjection. In v5 you have to override:

public virtual void FetchProjection(List<IDataValueProjector> valueProjectors, IGeneralDataProjector projector, QueryParameters parameters)

Will check if this is the same in v4

(edit) In v4 this is the same method. Queries with a custom projection also end up in that method.

Frans Bouma | Lead developer LLBLGen Pro
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 17-Jan-2020 09:28:42   

Hi Otis,

thanks a lot for the information. I confirm that following your suggestions we have finally managed to find this possible very important solution.

However, in our context, this remains an extremely burdensome solution to be introduced:   - our software uses multiple windows forms   - each form loads data on a background thread in order not to block the interface   - the data loading logics are encoded in dedicated business libraries, also shared between different projects (services, ...). Windows Forms don't even have direct access to low-level objects like Adapters, as they are layered by libraries. In such a scenario it becomes difficult to massively introduce this approach.

Can you confirm that also with the latest versions of Llbl (we are currently at 4.2 with .NET 4.0) this problem has remained unchanged?

Thanks for the support

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39774
Joined: 17-Aug-2003
# Posted on: 17-Jan-2020 09:56:19   

mcarugati wrote:

Hi Otis,

thanks a lot for the information. I confirm that following your suggestions we have finally managed to find this possible very important solution.

However, in our context, this remains an extremely burdensome solution to be introduced:   - our software uses multiple windows forms   - each form loads data on a background thread in order not to block the interface   - the data loading logics are encoded in dedicated business libraries, also shared between different projects (services, ...). Windows Forms don't even have direct access to low-level objects like Adapters, as they are layered by libraries. In such a scenario it becomes difficult to massively introduce this approach.

You can add this override as a partial class of DataAccessAdapter in the generated code, which should then be compiled in the generated code project and with that it should be transparent for all the projects?

In any case: from what I understand the problem is that you abort a thread which causes problems. This will stay a problem as aborting a thread isn't OK, as I've explained above.

Can you confirm that also with the latest versions of Llbl (we are currently at 4.2 with .NET 4.0) this problem has remained unchanged? Thanks for the support

you mean it will occur then too? Yes because you abort a thread. We can't fix that for you. However the code you override is the same as in v5.x

However there's another solution: using async. This is already available in v4.2. You could try to use an async load on the foreground thread instead of a sync load on a background thread. For winforms, you need an event handler to start the async await call chain as these can be async void. This might be a lot of work, so you might want to look into introducing async calls only to those areas where you will run into this problem.

Frans Bouma | Lead developer LLBLGen Pro
mcarugati
User
Posts: 17
Joined: 17-Sep-2019
# Posted on: 17-Jan-2020 10:19:32   

You have been very clear and kind. Thank you very much