In(query) produces different SQL

Posts   
 
    
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 22-Jul-2013 17:40:35   

Me again. smile

I have a 'identifier' query which simply returns a list of IDs that meet some criteria.

I want to use the result of this 'identifier' query in other queries in a simple WHERE <ID> IN ('identifier' query).

I have found that the SQL created for this identifier query is different depending on how 'identifier' query is 'initialized'.

This unit test fails with an exception: "An exception was caught during the execution of a retrieval query: Each GROUP BY expression must contain at least one column that is not an outer reference.."

        [Test]
        public static void XXX()
        {
            var qf = new QueryFactory();

            var grossAmountField = new Expression(AccountsTransactionFields.Amount, ExOp.Add, AccountsTransactionFields.VATAmount.CoalesceToZero());

            var overAllocationQuery = qf.Create()
                .From(qf.AccountsTransaction
                    .InnerJoin(AccountsTransactionEntity.Relations.AllocationEntityUsingTargetAccountsTransactionID)
                )
                .GroupBy(AccountsTransactionFields.ID, grossAmountField)
                .Having(AllocationFields.Amount.Sum().GreaterThan(grossAmountField))
                .Select(() => new
                              {
                                  ID = AccountsTransactionFields.ID.ToValue<int>(),
                                 // OverAllocation = (AllocationFields.Amount.Sum() - grossAmountField).As("Overallocated").ToValue<decimal>()
                              });

            //Debug.WriteLine(new DataAccessAdapter().FetchQuery(overAllocationQuery).Count);

            var query = qf.Create()
                .From(qf.Allocation
                    .InnerJoin(AllocationEntity.Relations.AccountsTransactionEntityUsingTargetAccountsTransactionID)
                    .InnerJoin(AllocationEntity.Relations.MoneyTransferEntityUsingSourceMoneyTransferID)
                )
                .Where(AccountsTransactionFields.ID.In(overAllocationQuery))
                //.Where(AccountsTransactionFields.ID.In(512))
                .Select(() => new
                              {
                                  CommissionStatementID = AccountsTransactionFields.ID.ToValue<int>(),
                                  ProviderID = AccountsTransactionFields.LegalBodyID.ToValue<int>(),
                                  ProviderName = RefDataSources.LegalBodies.GetDisplayName(AccountsTransactionFields.LegalBodyID.As("P2").ToValue<int>()),
                                  MoneyTransferID = MoneyTransferFields.ID.As("M2").ToValue<int>(),
                                  ReceiptSourceID = MoneyTransferFields.LegalBodyID.As("RE2").ToValue<int>(),
                                  ReceiptSourceName = RefDataSources.LegalBodies.GetDisplayName(MoneyTransferFields.LegalBodyID.As("R2").ToValue<int>()),
                              });


            var result = new DataAccessAdapter().FetchQuery(query);
            Debug.WriteLine(result.Count);
        }

However, if I uncomment that first Debug.WriteLine, then the query gives the expected results.

The failing SQL In clause is:

SELECT
  [LPA_L1].[ID] 
FROM
  ( [AccountsTransaction]  
INNER JOIN
  [Allocation] ON [AccountsTransaction].[ID] = [Allocation].[TargetAccountsTransactionID]) 
GROUP BY
  [LPA_L1].[ID],
  ([LPA_L1].[Amount] + COALESCE([LPA_L1].[VATAmount],
  '0')) 
HAVING
  ( SUM([LPA_L2].[Amount]) > ([LPA_L1].[Amount] + COALESCE([LPA_L1].[VATAmount],
  '0')))) [LPA_L6])))

whereas the working SQL In clause is:

SELECT
  [LPA_L7].[ID] 
FROM
  (( [AccountsTransaction] [LPA_L7]  
INNER JOIN
  [Allocation] [LPA_L9] ON [LPA_L7].[ID] = [LPA_L9].[TargetAccountsTransactionID]) 
INNER JOIN
  [AccountsTransaction] [LPA_L8] ON [LPA_L8].[ID] = [LPA_L9].[ID]) 
GROUP BY
  [LPA_L7].[ID],
  ([LPA_L7].[Amount] + COALESCE([LPA_L7].[VATAmount],
  '0')) 
HAVING
  ( SUM([LPA_L8].[Amount]) > ([LPA_L7].[Amount] + COALESCE([LPA_L7].[VATAmount],
  '0')))) [LPA_L6])))

To my amateur eye, it would seem that the failing SQL version was generated with QuerySpec attempting to merge and maybe optimize table/field usage from the main query.

I am aware it may be possible to workaround this by aliases everything in an 'identifier' type query but that is a lot of work and, to my mind, makes the query hard to read.

Also, I see the Query in an In(Query) as being completely standalone - only the final output should be seen by the main query, and since the correct results can be obtained by 'poking' the 'identifier' query before it is used in the main query then I am guessing (hoping!) this is a bug.

I am running v4.0.13.0708

Walaa avatar
Walaa
Support Team
Posts: 14995
Joined: 21-Aug-2005
# Posted on: 23-Jul-2013 00:09:35   

Not sure if I understand your last remark. Do you want this to run separately, or within the main query?

simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 23-Jul-2013 08:23:57   

Walaa wrote:

Not sure if I understand your last remark. Do you want this to run separately, or within the main query?

It will run within the main query but should not be merged with any of the tables of the main query. Its sole purpose is to provide a list of IDs from which the main query will limit its output.

Eventually, I will move the definition into its own method and this (and others like it) will be used from multiple 'main' queries.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39912
Joined: 17-Aug-2003
# Posted on: 24-Jul-2013 12:00:50   

The problem is I think that there's no way to know whether you did mean to reference the outer query or not, without aliases, as both outer query and the IN clause query refer to the same entity. So as there are no aliases on the outer query's entities, they're getting artificial aliases, as they're inheritance entities (this is necessary, as they refer to the same tables for example). As the IN query has no aliases mentioned as well, the engine will simply lookup which artificial aliases are 'in-scope'. As a query inside an IN clause can refer to elements in the outer scope, it will re-use these aliases. To avoid this, alias the elements in the IN clause query there. The only thing that needs to happen is that the elements in the query in the IN clause know that they're referencing elements in their own query, not elements outside the IN clause. So the alias used is only known in the query added to the IN clause (and if you want to make this a method, that's fine, the method creating the query only knows the alias).

As I've your model and DB still on my hdd from the bugfix 2 days ago (I kept it to jump back in if the fix failed) I'll try your code out and this alias suggestion.

Frans Bouma | Lead developer LLBLGen Pro
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39912
Joined: 17-Aug-2003
# Posted on: 24-Jul-2013 12:30:40   

Yep:


public static void QueryInINClause()
{
    var qf = new QueryFactory();

    var grossAmountField = new Expression(AccountsTransactionFields.Amount.Source("AT"), ExOp.Add, AccountsTransactionFields.VATAmount);
    var overAllocationQuery = qf.Create()
        .From(qf.AccountsTransaction.As("AT")
                .InnerJoin(AccountsTransactionEntity.Relations.AllocationEntityUsingTargetAccountsTransactionID, "AT", "A"))
        .GroupBy(AccountsTransactionFields.ID.Source("AT") , grossAmountField)
        .Having(AllocationFields.Amount.Source("A").Sum().GreaterThan(grossAmountField))
        .Select(() => new
        {
            ID = AccountsTransactionFields.ID.Source("AT").ToValue<int>(),
            // OverAllocation = (AllocationFields.Amount.Sum() - grossAmountField).As("Overallocated").ToValue<decimal>()
        });

    //Debug.WriteLine(new DataAccessAdapter().FetchQuery(overAllocationQuery).Count);

    var query = qf.Create()
        .From(qf.Allocation
            .InnerJoin(AllocationEntity.Relations.AccountsTransactionEntityUsingTargetAccountsTransactionID)
            .InnerJoin(AllocationEntity.Relations.MoneyTransferEntityUsingSourceMoneyTransferID)
        )
        .Where(AccountsTransactionFields.ID.In(overAllocationQuery))
        //.Where(AccountsTransactionFields.ID.In(512))
        .Select(() => new
        {
            CommissionStatementID = AccountsTransactionFields.ID.ToValue<int>(),
            ProviderID = AccountsTransactionFields.LegalBodyID.ToValue<int>(),
            MoneyTransferID = MoneyTransferFields.ID.As("M2").ToValue<int>(),
            ReceiptSourceID = MoneyTransferFields.LegalBodyID.As("RE2").ToValue<int>(),
        });


    var result = new DataAccessAdapter().FetchQuery(query);
    Console.WriteLine(result.Count);
}

query:


SELECT [LPA_L1].[ID],
       [LPA_L1].[LegalBodyID],
       [LPA_L5].[ID]          AS [M2],
       [LPA_L4].[LegalBodyID] AS [RE2]
FROM   (((( [TIPS].[dbo].[AccountsTransaction] [LPA_L2]
            INNER JOIN [TIPS].[dbo].[Allocation] [LPA_L3]
                ON [LPA_L2].[ID] = [LPA_L3].[ID])
          INNER JOIN [TIPS].[dbo].[AccountsTransaction] [LPA_L1]
              ON [LPA_L1].[ID] = [LPA_L3].[TargetAccountsTransactionID])
         INNER JOIN [TIPS].[dbo].[MoneyTransfer] [LPA_L5]
             ON [LPA_L5].[ID] = [LPA_L3].[SourceMoneyTransferID])
        INNER JOIN [TIPS].[dbo].[AccountsTransaction] [LPA_L4]
            ON [LPA_L4].[ID] = [LPA_L5].[ID])
WHERE  (([LPA_L1].[ID] IN
         (SELECT [LPA_L6].[ID]
          FROM   (SELECT [LPA_A7].[ID]
                  FROM   (( [TIPS].[dbo].[AccountsTransaction] [LPA_A7]
                            INNER JOIN [TIPS].[dbo].[Allocation] [LPA_A9]
                                ON [LPA_A7].[ID] = [LPA_A9].[TargetAccountsTransactionID])
                          INNER JOIN [TIPS].[dbo].[AccountsTransaction] [LPA_A8]
                              ON [LPA_A8].[ID] = [LPA_A9].[ID])
                  GROUP  BY [LPA_A7].[ID],
                            ([LPA_A7].[Amount] + [LPA_L1].[VATAmount])
                  HAVING (SUM([LPA_A8].[Amount]) > ([LPA_A7].[Amount] + [LPA_L1].[VATAmount]))) [LPA_L6]))) 

I removed the coalesce but that's not important. It illustrates what's happening. It also shows that the elements for the IN clause query are selfcontained, so you can make it a method and don't have to worry about passing an alias into it. The only thing you have to make sure is that you alias it uniquely. This is a requirement which is unavoidable, as it has to use aliases and it otherwise can't know what your intend is: do you want to refer to the outer elements (which are the same, as both have no alias!) or do you want to reference the elements in its own query. The only way to express that intend is by using an alias.

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 24-Jul-2013 15:55:54   

Thanks for the comprehensive answer and test.

As a query inside an IN clause can refer to elements in the outer scope, it will re-use these aliases.

I didn't realise that - I thought that that outer query would effectively just see the results and nothing else. I've now read up on the SQL Server definition of a subquery and indeed, it will use an Outer Query tablename but only if that tablename is not already present within the inner query. In fact it is described as a caution rather than a feature - I can't think of a use for it personally.

So if I understand it now for QuerySpec:- 1) All queries will have their tables aliased one way or another - for those not done manually, Queryspec will give them an automatic one at execution time. 2) A query used in an IN clause is not treated specially 3) My running the inner query manually first just let it get its aliases set first and so when they were reused with the outer query they were seen as 'manually' set.

Disclaimer: I'm not expecting or asking you to change how LLBLGEN works, so don't shout at me smile But I would like to ask a few questions about how Queryspec works so that I can customize mine to be more like SQL Server which is the only DB I use:-

1) Does Queryspec have a method/property that says don't use aliases for this query - just output the table names directly? 2) Is there a method I can call for a query which will auto-assign its aliases but without having to execute the query? 3) Is CreateSubQuery/CreateNewAliasScope() very low level stuff (I haven't looked in detail) or could I use this to scope the inner query as independent.

I'm thinking along the lines of adding a bool on an overloaded In clause which when true, will do 'something' to that subquery that prevents it getting merged with the outer query.

(Yes, I know I should just add the aliases and be done with it but I find it much more difficult to read)

Cheers Simon

Walaa avatar
Walaa
Support Team
Posts: 14995
Joined: 21-Aug-2005
# Posted on: 24-Jul-2013 22:45:29   

1) Does Queryspec have a method/property that says don't use aliases for this query - just output the table names directly?

No.

2) Is there a method I can call for a query which will auto-assign its aliases but without having to execute the query?

Just manmually assign the aliases as in Frans' example.

3) Is CreateSubQuery/CreateNewAliasScope() very low level stuff (I haven't looked in detail) or could I use this to scope the inner query as independent.

Low level indeed, and I really recommend setting the aliases manually. Unless you want to derive from and change a lot in the runtime libraries classes.

simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 25-Jul-2013 07:54:24   

Just manmually assign the aliases as in Frans' example.

Well, thanks for that. I'll work it out myself.

simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 25-Jul-2013 09:41:03   

For anyone else interested, this did the trick for me:

    public static class ExtensionMethods
    {
        public static IPredicate InStandaloneQuery(this IEntityFieldCore field, DynamicQuery query)
        {
            ArgumentVerifier.CantBeNull(query, "query");

            var adapter = new DataAccessAdapter();

            ReflectionHelper.CallNonPublicMethod(query, "PrepareForExecution", adapter.FunctionMappings);

            var queryParameters = ReflectionHelper.CallNonPublicMethod<QueryParameters>(query, "GetQueryParameters");
            queryParameters.FieldsForQuery = ReflectionHelper.CallNonPublicMethod<IEntityFields2>(query, "GetProjectionAsEntityFields2");

            ReflectionHelper.CallNonPublicMethod(typeof(DataAccessAdapterBase), adapter, "CreateQueryFromElements", queryParameters);

            return field.In(query);
        }
        
    }

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39912
Joined: 17-Aug-2003
# Posted on: 25-Jul-2013 11:25:04   

Some more info:

simmotech wrote:

Thanks for the comprehensive answer and test.

As a query inside an IN clause can refer to elements in the outer scope, it will re-use these aliases.

I didn't realise that - I thought that that outer query would effectively just see the results and nothing else. I've now read up on the SQL Server definition of a subquery and indeed, it will use an Outer Query tablename but only if that tablename is not already present within the inner query. In fact it is described as a caution rather than a feature - I can't think of a use for it personally.

Correlation might be a use-case but it is indeed shaky ground.

So if I understand it now for QuerySpec:- 1) All queries will have their tables aliased one way or another - for those not done manually, Queryspec will give them an automatic one at execution time. 2) A query used in an IN clause is not treated specially 3) My running the inner query manually first just let it get its aliases set first and so when they were reused with the outer query they were seen as 'manually' set.

Correct on all 3. Aliases are needed because if you have a Subtype1 and a Subtype2 both inheriting from SuperType in your query, they both have the table SuperType is mapped on in the SQL query but obviously the second one needs an alias. To make sure the fields of the subtype which table got aliased refer to the right table, everything is aliased so there's no misunderstanding. This is the case in the low-level API as well as in Linq as well. In Linq we auto-alias everything as well.

I'm thinking along the lines of adding a bool on an overloaded In clause which when true, will do 'something' to that subquery that prevents it getting merged with the outer query. (Yes, I know I should just add the aliases and be done with it but I find it much more difficult to read)

I wouldn't use a bool for that, simply always alias the stuff. It's a bit difficult to read perhaps due to the .Source(...) calls, though there's little else that can be done. Using the bool is error prone as you might pass in 'false' at some point, but change the outer query later on and things start to fail but you don't remember why.

Btw, that 'InStandAloneQuery' method isn't something you should use. It ties your code to our code but we might change things over time internally and your code will fall apart. Please opt for the alias specification. It's clearer and it works every time.

Frans Bouma | Lead developer LLBLGen Pro