Paging CTE and Recordcount

Posts   
 
    
Bart
User
Posts: 3
Joined: 21-Nov-2007
# Posted on: 21-Nov-2007 10:12:01   

I'm using LLBL for a while now and i must say... it works great! One of the features i use is paging through SQL 2005 CTE.

That works great also... except that in some circumstances the GetDBCount results in slowing down everything(sometimes the DISTINCT is killing!). So i went looking for some options on this one. I was wondering how all of you think about the different methods i found on the internet and present below (method 1 is used by LLBL if i'm correct).

BTW: I did not do extensive testing but at first glance method 2 seems to have the lowest costs BTW2: This query is about selecting streets where the city is amsterdam. From a total of around 240.000 streets 4500 are in Amsterdam.


/*
Method 1
 - first get your records
 - than count total results
*/
WITH cte_Straten AS
(
    SELECT ROW_NUMBER()
        OVER (ORDER BY Straatnaam ASC) AS RowNum
        , Straatnaam
    FROM Straat S

    INNER JOIN Plaats P
        ON S.Plaatsid = P.PlaatsId

    WHERE P.PlaatsNaam = 'Amsterdam'
)

SELECT RowNum, Straatnaam
    FROM cte_Straten 
    WHERE RowNum BETWEEN (20) and (40)
    ORDER BY RowNum;

SELECT DISTINCT count(*) as TotalRowCount FROM Straat S INNER JOIN Plaats P ON S.Plaatsid = P.PlaatsId WHERE P.PlaatsNaam = 'Amsterdam';







/*
Method 2
    - Select records and determine rowcount in the CTE and return that as an additional column
*/
WITH cte_Straten AS
(
    SELECT ROW_NUMBER()
        OVER (ORDER BY Straatnaam ASC) AS RowNum
        , COUNT(*) OVER(PARTITION BY NULL) AS TotalRowCount
        , Straatnaam
    FROM Straat S

    INNER JOIN Plaats P
        ON S.Plaatsid = P.PlaatsId

    WHERE P.PlaatsNaam = 'Amsterdam'
)
SELECT rownum, TotalRowCount, Straatnaam
    FROM cte_Straten
    WHERE RowNum BETWEEN (20) and (40)
    ORDER BY RowNum;






/*
Method 3
    - Build CTE
    - Determine TotalRowcount with count on the CTE and return as additional column
*/
WITH cte_Straten AS
(
    SELECT ROW_NUMBER()
        OVER (ORDER BY Straatnaam ASC) AS RowNum
        , Straatnaam
    FROM Straat S

    INNER JOIN Plaats P
        ON S.Plaatsid = P.PlaatsId

    WHERE P.PlaatsNaam = 'Amsterdam'
)
SELECT RowNum, (SELECT COUNT(*) FROM cte_Straten) AS TotalRowCount, Straatnaam
    FROM cte_Straten 
    WHERE RowNum BETWEEN (20) and (40)
    ORDER BY RowNum;


Bart
User
Posts: 3
Joined: 21-Nov-2007
# Posted on: 21-Nov-2007 10:43:02   

forgot this one: offcourse method 2 and 3 have the big disadvantage that all rows contain the total rowcount

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 21-Nov-2007 11:57:36   

simple_smile

Frans Bouma | Lead developer LLBLGen Pro
Bart
User
Posts: 3
Joined: 21-Nov-2007
# Posted on: 21-Nov-2007 12:19:38   

What is the meaning of the smiley Otis...

If i understand your smiley correct than you mean that my previous post with the disadvantage explains why LLBL uses method 1.

BUT... all though all records return the total rowcount i could still prefer using that methods because of their performance benefits avoiding the distinct

greets!

Bart (that is still looking for some opinions on this one!)

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 21-Nov-2007 12:39:39   

Bart wrote:

What is the meaning of the smiley Otis...

If i understand your smiley correct than you mean that my previous post with the disadvantage explains why LLBL uses method 1.

That's indeed the reason simple_smile

btw, LLBLGen Pro uses a slightly different CTE nowadays, it doesn't place the ordering in the ROW_COUNT function anymore as that can lead to wrong results sometimes.

BUT... all though all records return the total rowcount i could still prefer using that methods because of their performance benefits avoiding the distinct

Bart (that is still looking for some opinions on this one!)

The main reason is that any extra action determining the count is potentially hurting performance because it has to traverse the complete set. This can be avoided in every subsequential page retrieved. So you first determine the count, then you page through it. Page 2, 3 etc. don't have to determine the page count anymore.

What's also the case is that in a lot of cases, you can use a simpler query to determine the total # of elements in the set, i.e. only index fetches, which leads to faster performance, using a scalar query. If you determine the set size every time, this is potentially slower: the CTE isn't likely bringing the complete set into memory at once. With the count it has to (in some cases).

Frans Bouma | Lead developer LLBLGen Pro