Counting rows in joined entities and beyond

Posts   
 
    
neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 14-Sep-2012 11:19:13   

3.1.11.0706 Final (SD.LLBLGen.Pro.DQE.SqlServer.NET20.dll) 3.1.12.0806 (SD.LLBLGen.Pro.ORMSupportClasses.NET20.dll 3.1.12.0507 (SD.LLBLGen.Pro.LinqSupportClasses.NET35) DotNet 4.0 vs2010 project Adapter template

I have reformulated a previous question (http://www.llblgen.com/tinyforum/Messages.aspx?ThreadID=21316) to use Northwind.

I want to select all customers with a Region and have orders that ShipVia Speedy (ShipVia == 1). Then I want to create counts for:

  • Total orders
  • Total of employees that took these orders
  • Sub total of employees that took orders from this set where the order was delivered before the employee was hired. (fraud detection maybe?)

The code I come up with looks like this using a DateTime.Now instead of the o.ShipDate and runs but produces wrong results:

             var q = from c in metaData.Customer
                    join o in metaData.Order on c.CustomerId equals o.CustomerId
                    where o.ShipVia == 1 && c.Region != null
                    group c by new {c.CustomerId, c.Region}
                    into g
                    select new
                               {
                                   g.Key.CustomerId,
                                   g.Key.Region,
                                   TotalOrders = g.Count(),
                                   EmployeeCount =
                        g.SelectMany(h => h.Orders).Where(i => i.CustomerId == g.Key.CustomerId).Select(j => j.Employee)
                        .Count(),
                                   EmployeeCount2 =
                        g.SelectMany(h => h.Orders).Where(i => i.CustomerId == g.Key.CustomerId).Select(j => j.Employee)
                        .Count(k => k.HireDate > DateTime.Now),
                               };

The counts for Employees are for all orders, not just the set of orders selected in the parent select.

How can I correlate the orders used in the counts with those used in the top level join when I don't get access to the joined entity once I had done the group by? Also, how do I get access to the ShipDate of the joined order to test against the employee's HireDate?

Walaa avatar
Walaa
Support Team
Posts: 14995
Joined: 21-Aug-2005
# Posted on: 14-Sep-2012 11:56:07   

Semantically I see your queries are centered around Orders not Customers, so maybe you can try to reformulate the Query to Select Orders rather than Customers, take it from there to get the Employee Counts.

Otherwise, you will have to reuse the Order Filter in the sub-Queries.

neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 14-Sep-2012 12:01:54   

The application produces a list of customers with these counts so maybe I need to wrap that around a revised linq that is focused on orders as you suggest. In the real application, the filtering is so complex that adding it to each count (there are 9 of them!) adds tens of seconds to the processing time. I try your suggestion. Thanks.

neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 14-Sep-2012 14:55:10   

Putting the filtering sql into each count in my real application takes the db processing time from 1.5 seconds to 14 seconds. The counts are now correct though!

The main performance problem is the repetitive running of the filtering on each count. So instead of 1.5 seconds , it takes roughly that for each count.

Here is the equivalent Northwind code (which runs quickly of course as there is almost not data):

            var q = from c in metaData.Customer
                    join o in metaData.Order on c.CustomerId equals o.CustomerId
                    where o.ShipVia == 1 && c.Region != null
                    group c by new {c.CustomerId, c.Region}
                    into g
                    select new
                               {
                                   g.Key.CustomerId,
                                   g.Key.Region,
                                   TotalOrders = g.Count(),
                                   EmployeeCount = (from c in metaData.Customer
                                                    join o in metaData.Order on c.CustomerId equals o.CustomerId
                                                    join e in metaData.Employee on o.EmployeeId equals In the meantime, e.EmployeeId
                                                    where o.ShipVia == 1 && c.Region != null && c.CustomerId == g.Key.CustomerId
                                                    select e).Count(),
                                   EmployeeCount2 = (from c in metaData.Customer
                                                     join o in metaData.Order on c.CustomerId equals o.CustomerId
                                                     join e in metaData.Employee on o.EmployeeId equals e.EmployeeId
                                                     where o.ShipVia == 1 && c.Region != null && c.CustomerId == g.Key.CustomerId
                                                     select e).Count(k => k.HireDate > DateTime.Now),
                               };

Is there a way to get this filtering code to run once or maybe twice only then the counts applied to the result?

In the meantime, I will try reframing the linq to be based on Orders not Customers.

neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 14-Sep-2012 16:39:22   

I tried reframing the link to select orders. It works to count employees unfiltered, but not when I want to apply a filter to the Employees table:

            var q = from  o in metaData.Order
                    join e in metaData.Employee on o.EmployeeId equals e.EmployeeId
                    where o.ShipVia == 1 && o.Customer.Region != null
                    group e by new{o.CustomerId}
                        into g
                        select new
                        {
                            CustomerId = g.Key.CustomerId,
                            EmployeeCount = g.Count(),
                            EmployeeCount2 = g.Count(h => h.HireDate > DateTime.Now)
                        };

Although the above compiles, it crashes as it can't bind the HireDate. I didn't get as far as comparing the HireDate with the order ShipDate as that wouldn't compile.

I can't help thinking there is something really simple am missing:-(

neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 15-Sep-2012 12:42:27   

I still can't get there. The basic problem for me is that I can't see how to use fields from a joined entity at the top level in the group by projection for filtering counts. Is this actually possible?

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 15-Sep-2012 20:33:35   

How would look the SQL of what you want to achieve? Would it look like this?

SELECT 
    c.CustomerID, 
    c.Region, 
    COUNT(*) OrdersCount,
    COUNT(DISTINCT o.EmployeeID) EmployeesCount,    
    COUNT(DISTINCT e.EmployeeID) EmployeesFraudDetectionCount
FROM Orders o
    INNER JOIN Customers c ON o.CustomerID = c.CustomerID
    LEFT JOIN Employees e ON o.EmployeeID = e.EmployeeID AND o.ShippedDate > e.HireDate
WHERE o.ShipVia =1 
GROUP BY c.CustomerID, c.Region, o.EmployeeID
ORDER BY c.CustomerID
David Elizondo | LLBLGen Support Team
neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 16-Sep-2012 00:14:34   

Pretty much, apart from the grouping by EmployeeID. I see the approach and have added another count to be sure

SELECT DISTINCT
c.CustomerID, 
c.Region, 
COUNT(*) OrdersCount,
COUNT(DISTINCT o.EmployeeID) EmployeesCount,
COUNT(DISTINCT e.EmployeeID) EmployeesFraudDetectionCount,
COUNT (DISTINCT e1.EmployeeID) EmployeesWithShortFirstName
FROM Orders o
INNER JOIN Customers c ON o.CustomerID = c.CustomerID
LEFT JOIN Employees e ON o.EmployeeID = e.EmployeeID AND o.ShippedDate < e.HireDate
LEFT JOIN Employees e1 ON o.EmployeeID = e1.EmployeeID AND LEN(e1.FirstName) < 5
WHERE o.ShipVia =1
GROUP BY c.CustomerID, c.Region
--, o.EmployeeID
ORDER BY c.CustomerID

I'll try that in Linq, but if you know how, I would love to see.

neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 16-Sep-2012 08:04:50   

Not much luck. my first attempt compiles but crashes unable to understand my aliasing for the grouping

from c in metaData.Customer
                join o in metaData.Order on c.CustomerId equals o.CustomerId
                join e in metaData.Employee on o.EmployeeId equals e.EmployeeId into e1
                from e2 in e1.DefaultIfEmpty()
                where e2.HireDate > o.ShippedDate
                join f in metaData.Employee on o.EmployeeId equals f.EmployeeId into e3
                from e4 in e3.DefaultIfEmpty()
                where e4.FirstName.Length < 5
                where o.ShipVia == 1
                group new
                          {
                              c.CustomerId,
                              Employees = o.EmployeeId,
                              EmployeesFraudDetection = e2.EmployeeId,
                              EmployeesWithShortFirstName = e4.EmployeeId
                          } by new {c.CustomerId, c.Region}
                into g
                select new
                           {
                               g.Key.CustomerId,
                               g.Key.Region,
                               TotalOrders = g.Count(),
                               EmployeesCount = g.Select(h => h.Employees).Distinct().Count(),
                               EmployeesFraudDetectionCount =
                    g.Select(h => h.EmployeesFraudDetection).Distinct().Count(),
                               EmployeesWithShortFirstnameCount =
                    g.Select(h => h.EmployeesWithShortFirstName).Distinct().Count(),
                           }

It does however run in Linq to SQL although it doesn't give the right results.

neilx
User
Posts: 267
Joined: 02-Nov-2007
# Posted on: 16-Sep-2012 20:54:10   

The main reason I have had so much trouble with this Linq is that the version of LLB I am using (see head post) has a bug when trying to do more than one filtered count in a group by.

I have reported it in bugs and issues (http://www.llblgen.com/TinyForum/Messages.aspx?ThreadID=21324).

At the moment, the generated SQL simply filters the result by both sets of criteria at the same time returning no results at all. I tried it in Linq to SQL and that works as expected.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39910
Joined: 17-Aug-2003
# Posted on: 17-Sep-2012 10:36:03   

This gets very confusing. I'll close this thread and continue in the other one, as there are apparently now 3 threads about the same problem.

Frans Bouma | Lead developer LLBLGen Pro