
psandler
User
Posts: 540
Joined: 22-Feb-2005
# Posted on: 18-Mar-2005 18:31:20   

Not sure if this has been discussed before. I couldn't find it via a keyword search. It's possible this functionality exists and I don't know about it.

I was wondering if there were any plans for including a rowcount or total page return value in the paging queries, so you could display something like "Page 5 of 10" in your applications. You would also then be able to do things like determine whether a "next page" button was appropriate or not.

As far as the generated query goes, it should be simple (of course, I say this with little knowledge of how the SQL is actually generated). simple_smile

When I have done paging in SQL Server in the past, I simply did it using an output parameter and @@rowcount. The calculation for IfNextPage can be done in the application code.

Again, I don't know how the SQL generator works, so I don't want to assume that implementing this would be "easy". simple_smile

Sample of how it could work:

LLBL Generated paging code:


CREATE TABLE
#TempTable ([__rowcnt][int] IDENTITY (1,1) NOT NULL,[ClientID][Int] NULL,[ClientName][VarChar](50) NULL);

INSERT INTO #TempTable ([ClientID],[ClientName])
SELECT [SECURITY].[dbo].[client].[client_id] AS [ClientID],[SECURITY].[dbo].[client].[client_name] AS [ClientName] FROM [SECURITY].[dbo].[client];
SELECT [ClientID],[ClientName] FROM #TempTable WHERE [__rowcnt] > 2 AND [__rowcnt] <= 4;
DROP TABLE #TempTable

Code with @@rowcount variable:


--this would be an output parameter
DECLARE @therowcount int 

CREATE TABLE
#TempTable ([__rowcnt][int] IDENTITY (1,1) NOT NULL,[ClientID][Int] NULL,[ClientName][VarChar](50) NULL);

INSERT INTO #TempTable ([ClientID],[ClientName])
SELECT [SECURITY].[dbo].[client].[client_id] AS [ClientID],[SECURITY].[dbo].[client].[client_name] AS [ClientName] FROM [SECURITY].[dbo].[client];
--
SELECT @therowcount = @@rowcount
--
SELECT [ClientID],[ClientName] FROM #TempTable WHERE [__rowcnt] > 2 AND [__rowcnt] <= 4;
DROP TABLE #TempTable

From a few simple tests, it looks like there is no cost involved in getting the @@rowcount.
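To illustrate the application-side calculation mentioned above, here is a rough sketch (Python used purely for illustration; the helper name is made up and isn't part of any generated code):

```python
import math

def page_info(total_rows: int, page_size: int, page_number: int):
    """Derive paging metadata from the total row count returned
    by the query's output parameter. Illustrative only."""
    total_pages = math.ceil(total_rows / page_size) if total_rows else 0
    has_next = page_number < total_pages
    return total_pages, has_next

# e.g. 45 rows at 10 per page -> 5 pages; page 5 has no next page
```

The point being that once the row count comes back with the same round trip, "Page 5 of 10" and the next-page button state are just arithmetic.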

If this functionality already exists, please ignore this post. simple_smile

Frans, thanks for the lightning-fast help on my predicate question!

Posts: 33
Joined: 05-Feb-2005
# Posted on: 19-Mar-2005 05:25:01   

You can get the number of rows by using the CountRow aggregate. Just use the same predicate you use to actually fetch the entities. Here's a snippet from the documentation:


// C#
IRelationPredicateBucket filter = new RelationPredicateBucket();
filter.PredicateExpression.Add(PredicateFactory.CompareValue(
    CustomerFieldIndex.Country, ComparisonOperator.Equal, "France"));
filter.RelationCollection.Add(OrderEntity.Relations.CustomerEntityUsingCustomerId);
DataAccessAdapter adapter = new DataAccessAdapter();
int amount = (int)adapter.GetScalar(    
    EntityFieldFactory.Create(OrderFieldIndex.OrderId), 
    null, AggregateFunction.CountRow, 
    filter.PredicateExpression, null, filter.RelationCollection);

Jason

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39828
Joined: 17-Aug-2003
# Posted on: 19-Mar-2005 11:27:44   

Thanks for the answer, Jason! simple_smile

Frans Bouma | Lead developer LLBLGen Pro
psandler
User
Posts: 540
Joined: 22-Feb-2005
# Posted on: 19-Mar-2005 15:56:21   

Jason wrote:

You can get the number of rows by using the CountRow aggregate. Just use the same predicate you use to actually fetch the entities. Here's a snippet from the documentation:


// C#
IRelationPredicateBucket filter = new RelationPredicateBucket();
filter.PredicateExpression.Add(PredicateFactory.CompareValue(
    CustomerFieldIndex.Country, ComparisonOperator.Equal, "France"));
filter.RelationCollection.Add(OrderEntity.Relations.CustomerEntityUsingCustomerId);
DataAccessAdapter adapter = new DataAccessAdapter();
int amount = (int)adapter.GetScalar(    
    EntityFieldFactory.Create(OrderFieldIndex.OrderId), 
    null, AggregateFunction.CountRow, 
    filter.PredicateExpression, null, filter.RelationCollection);

Jason

I appreciate your answer, but . . .

This isn't really what I was suggesting. This additional action has costs: the database has to process an additional query, and the application has to make an additional call across the wire and wait for a response.

Using the @@rowcount method accomplishes the exact same thing via a call and query that you are already performing, and its only cost is an output parameter.

Frans, can you put this on your list of suggestions?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39828
Joined: 17-Aug-2003
# Posted on: 19-Mar-2005 16:57:11   

Rowcount doesn't work, as it will stop processing rows when it hits the counter, so if you specify an ORDER BY, your resultset will be different. In 1.0.2004.2 I've optimized the query a great deal so that it only reads (pagesize*pagenumber)+1 rows into the temptable, which should greatly reduce the amount of data processed, but nevertheless the query has to be executed twice (and again for each page).

Frans Bouma | Lead developer LLBLGen Pro
psandler
User
Posts: 540
Joined: 22-Feb-2005
# Posted on: 19-Mar-2005 18:50:09   

Otis wrote:

Rowcount doesn't work, as it will stop processing rows when it hits the counter, so if you specify an order by, your resultset will be different.

I might be misunderstanding you, so if I am I apologize.

No matter what, when you use paging, you insert the entire dataset into a temp table, right? Immediately after that insert, you can grab the @@rowcount global variable and put it into a local variable to get the total number of records processed.

The ORDER BY should have nothing to do with it--the ORDER BY clause will not affect the number of records inserted into the temp table, which is what @@rowcount will be returning.

See my original post for an example. Again, if I am misunderstanding you, I apologize.

Otis wrote:

In 1.0.2004.2 I've optimized the query a great deal so that it only reads (pagesize*pagenumber)+1 rows into the temptable, which should greatly reduce the amount of data processed, but nevertheless the query has to be executed twice (and again for each page).

This new method would obviously affect what I wrote above. simple_smile

There is still a tradeoff, however. This new method would reduce the number of records inserted into the temp table, but the entire set would still have to be processed (not inserted, but still scanned and sorted) if an ORDER BY clause was being used.

My "SQL sense" (which is not always right wink ) tells me that the new method will only truly save processing when there is no ORDER BY clause. Maybe when processing large result sets, too.

Also, if developers mostly don't care whether there is a "next page" and don't need the total number of records, then grabbing the record count using the COUNT aggregate is fine. However, I think when you are using paging in an application, you almost always want that information.

In addition, doing it all in one call is, IMHO, a better practice. That COUNT(field) aggregate is (or at least can be) a relatively expensive operation. Especially if you are passing it "difficult" predicates (LIKE, etc.).

Sorry to babble on about this, but I find it intriguing. simple_smile

If I weren't heading off on vacation, I would probably have my development machine out and be running a slew of tests. frowning

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39828
Joined: 17-Aug-2003
# Posted on: 19-Mar-2005 21:12:04   

psandler wrote:

Otis wrote:

Rowcount doesn't work, as it will stop processing rows when it hits the counter, so if you specify an order by, your resultset will be different.

I might be misunderstanding you, so if I am I apologize.

No matter what, when you use paging, you insert the entire dataset into a temp table, right? Immediately after that insert, you can grab the @@rowcount global variable and put it into a local variable to get the total number of records processed.

The ORDER BY should have nothing to do with it--the ORDER BY clause will not affect the number of records inserted into the temp table, which is what @@rowcount will be returning.

See my original post for an example. Again, if I am misunderstanding you, I apologize.

oooooh! smile We are talking about two different things; I'm sorry, my bad. You can also do paging with ROWCOUNT tricks, which fail with ORDER BY and in other situations, so I thought you were talking about that.

There is a problem with your proposal: queries which have to use client-side paging. The code switches to client-side paging if multiple entities are used in the query and distinct can't be used.

For example, you want all employees who have entered orders of a given type. As employee contains a blob for the picture, DISTINCT can't be used, but the join with orders results in a lot of duplicate rows. In that situation the code switches to client-side paging: it doesn't use a temptable and simply skips n objects till the first object of the page is read, and reads pagesize-1 objects after that.
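The skip-and-read loop described here can be sketched roughly as follows (Python standing in for the C# data-reader loop; purely illustrative, and simplified to read a full page rather than pagesize-1 objects):

```python
def client_side_page(reader, page_number: int, page_size: int):
    """Client-side paging sketch: skip the rows belonging to earlier
    pages one at a time, then keep the next page_size rows.
    `reader` is any iterator of rows, standing in for a
    forward-only data reader."""
    skip = (page_number - 1) * page_size
    page = []
    for i, row in enumerate(reader):
        if i < skip:
            continue          # discard rows before the page starts
        page.append(row)
        if len(page) == page_size:
            break             # stop reading; the rest is never materialized
    return page

# e.g. 10 rows, page 2 of size 3 -> rows 3, 4, 5 (0-based)
```

Note that because the reader is consumed lazily and abandoned after the page is filled, the total set size is never known, which is why this path needs a separate count query.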

That situation requires a special count query for the number of real objects in the object set: SELECT COUNT(DISTINCT employeeid) FROM ... .

There are other situations in which this is also required. In other words: there is, sadly enough, no 'one query to rule them all'. simple_smile

Otis wrote:

In 1.0.2004.2 I've optimized the query a great deal so that it only reads (pagesize*pagenumber)+1 rows into the temptable, which should greatly reduce the amount of data processed, but nevertheless the query has to be executed twice (and again for each page).

This new method would obviously affect what I wrote above. simple_smile

There is still a tradeoff, however. This new method would reduce the number of records inserted into the temp table, but the entire set would still have to be processed (not inserted, but still scanned and sorted) if an ORDER BY clause was being used.

My "SQL sense" (which is not always right wink ) tells me that the new method will only truly save processing when there is no ORDER BY clause. Maybe when processing large result sets, too.

The new method increases performance as it doesn't add a lot of rows to the temptable.

Also, if developers mostly don't care whether there is a "next page" and don't need the total number of records, then grabbing the record count using the COUNT aggregate is fine. However, I think when you are using paging in an application, you almost always want that information.

It's not determinable in all cases, as I've explained above simple_smile

In addition, doing it all in one call is, IMHO, a better practice. That COUNT(field) aggregate is (or at least can be) a relatively expensive operation. Especially if you are passing it "difficult" predicates (LIKE, etc.).

It's also not always possible to do that (but I have to test). There is another problem: where to return the total set size? In the case of client-side paging (Access uses solely client-side paging), the total set size can't be obtained from the same query; you need a special query for that.

Sorry to babble on about this, but I find it intriguing. simple_smile

If I weren't heading off on vacation, I would probably have my development machine out and be running a slew of tests. frowning

have a great vacation! simple_smile

Frans Bouma | Lead developer LLBLGen Pro
psandler
User
Posts: 540
Joined: 22-Feb-2005
# Posted on: 19-Mar-2005 23:34:12   

Otis wrote:

There is a problem with your proposal: queries which have to use client-side paging. The code switches to client-side paging if multiple entities are used in the query and distinct can't be used.

For example you want all employees who have entered orders of a given type. As employee contains a blob for the picture, distinct can't be used but the join with orders results in a lot of duplicate rows.

Hmm, not sure I follow. Wouldn't you just subquery in this case? (I don't have Query Analyzer handy, so consider this pseudocode):


SELECT employee_id, picture_blob
FROM employee
WHERE employee_id IN 
(SELECT DISTINCT employee_id FROM orders WHERE order_type = @order_type)

Again, I don't have the schema (I assume it's Northwind) or my development machine handy. I also might be misunderstanding what you mean.

Maybe this case is easier than applying the rule dynamically, so please don't think I'm trying to imply that it would be "easy" to implement. simple_smile

Otis wrote:

In that situation, the code switches to client-side paging: it doesn't use a temptable and simply skips n objects till the first object of the page is read, and reads pagesize-1 objects after that.

frowning

Eek, I would be worried about how this would perform. All of the records would have to come across the wire, and all of them would have to be loaded into memory (at least briefly).

Otis wrote:

That situation requires a special count query for the number of real objects in the object set: SELECT COUNT(DISTINCT employeeid) FROM ... .

There are other situations in which this also is required. In other words: there is, sadly enough, no 'one query to rule them all'.. simple_smile

I can't disagree with this statement. simple_smile But I do think the way you are currently doing it (with a temp table) is the best method, and can be used in almost all cases.

Otis wrote:

There is still a tradeoff, however. This new method would reduce the number of records inserted into the temp table, but the entire set would still have to be processed (not inserted, but still scanned and sorted) if an ORDER BY clause was being used.

My "SQL sense" (which is not always right wink ) tells me that the new method will only truly save processing when there is no ORDER BY clause. Maybe when processing large result sets, too.

The new method increases performance as it doesn't add a lot of rows to the temptable.

I think having to re-query the database for a COUNT(field) is more expensive than inserting everything into the temp table. I would have to test to be sure. Again, if you are using an ORDER BY, you still have to scan the whole table/index and sort it before doing the partial insert. The scan/sort is more expensive than the insert (I would have to test this to be sure, too).

Otis wrote:

In addition, doing it all in one call is, IMHO, a better practice. That COUNT(field) aggregate is (or at least can be) a relatively expensive operation. Especially if you are passing it "difficult" predicates (LIKE, etc.).

It's also not always possible to do that (but I have to test). There is another problem: where to return the total set size? In the case of client-side paging (Access uses solely client-side paging), the total set size can't be obtained from the same query; you need a special query for that.

I can't really answer the Access question, as I'm not that familiar with it. Does it not allow temp tables?

But for more robust databases, I would suggest an output parameter.

Otis wrote:

have a great vacation! simple_smile

Thanks! Have a great vacation from my posts! simple_smile

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39828
Joined: 17-Aug-2003
# Posted on: 20-Mar-2005 00:14:22   

psandler wrote:

Otis wrote:

There is a problem with your proposal: queries which have to use client-side paging. The code switches to client-side paging if multiple entities are used in the query and distinct can't be used.

For example you want all employees who have entered orders of a given type. As employee contains a blob for the picture, distinct can't be used but the join with orders results in a lot of duplicate rows.

Hmm, not sure I follow. Wouldn't you just subquery in this case? (I don't have Query Analyzer handy, so consider this pseudocode):

That would solve it, but a query with a relation and a filter uses a join, not a subquery.

Otis wrote:

In that situation, the code switches to client-side paging: it doesn't use a temptable and simply skips n objects till the first object of the page is read, and reads pagesize-1 objects after that.

frowning

Eek, I would be worried about how this would perform. All of the records would have to come across the wire, and all of them would have to be loaded into memory (at least briefly).

nooo simple_smile . I use a datareader, so the resultset is only sent to the client in small parts. I read one row at a time till I have enough. This is actually not that bad.

Otis wrote:

There is still a tradeoff, however. This new method would reduce the number of records inserted into the temp table, but the entire set would still have to be processed (not inserted, but still scanned and sorted) if an ORDER BY clause was being used.

My "SQL sense" (which is not always right wink ) tells me that the new method will only truly save processing when there is no ORDER BY clause. Maybe when processing large result sets, too.

The new method increases performance as it doesn't add a lot of rows to the temptable.

I think having to re-query the database for a COUNT(field) is more expensive than inserting everything into the temp table. I would have to test to be sure. Again, if you are using an ORDER BY, you still have to scan the whole table/index and sort it before doing the partial insert. The scan/sort is more expensive than the insert (would have to test this to be sure too).

Paging is always expensive, at least on SqlServer (on other databases there are tricks to do it very easily, though not on SqlServer). Temp table usage can be slow when the tempdb is not large enough. On SqlServer 2000, a table variable is faster for smaller resultsets (someone actually tested all kinds of paging techniques; it's somewhere at codeproject.com), so there is some tweaking possible, though it requires a long list of checks to see which situation the query is in. What's way more efficient is to limit the actual resultset being paged as much as possible with filters. It's useless to page through 10,000 rows if you just want to see the first 3 pages.

Otis wrote:

In addition, doing it all in one call is, IMHO, a better practice. That COUNT(field) aggregate is (or at least can be) a relatively expensive operation. Especially if you are passing it "difficult" predicates (LIKE, etc.).

It's also not always possible to do that (but I have to test). There is another problem: where to return the total set size? In the case of client-side paging (Access uses solely client-side paging), the total set size can't be obtained from the same query; you need a special query for that.

I can't really answer the Access question, as I'm not that familiar with it. Does it not allow temp tables?

Nothing simple_smile

But for more robust databases, I would suggest an output parameter.

Yes, but where to put the result of that output parameter? wink Also, if some situations require different count queries to get the actual count, it's a mixed situation: sometimes this, sometimes that. I don't like that; I think it's best to just make the developer use consistent logic, so the way the count is determined is always the same.

True, it can be slow sometimes, but the resultset should really be limited beforehand. Paging isn't something you'd do through a massive resultset: paging is a GUI-related feature, which ties it to a human, and no human can handle 10,000 rows.

Frans Bouma | Lead developer LLBLGen Pro
psandler
User
Posts: 540
Joined: 22-Feb-2005
# Posted on: 20-Mar-2005 08:04:06   

I sent you a reply via email, since most people on this forum probably don't care about any of this. simple_smile

Marcus avatar
Marcus
User
Posts: 747
Joined: 23-Apr-2004
# Posted on: 09-May-2005 17:53:12   

psandler wrote:

I sent you a reply via email, since most people on this forum probably don't care about any of this. simple_smile

Do you think so!!! wink I was quite enjoying this thread! stuck_out_tongue_winking_eye

MacDennis avatar
MacDennis
User
Posts: 50
Joined: 03-May-2005
# Posted on: 10-May-2005 10:20:33   

Otis wrote:

On Sqlserver 2000, a table variable is faster in smaller resultsets (someone actually tested all kinds of paging techniques, it's somewhere at codeproject.com), so there is some tweaking possible, though it requires a long list of checks to see which situation the query is in.

You can find it here: http://www.codeproject.com/aspnet/PagingLarge.asp

Dennis

psandler
User
Posts: 540
Joined: 22-Feb-2005
# Posted on: 10-May-2005 15:27:33   

MacDennis wrote:

You can find it here: http://www.codeproject.com/aspnet/PagingLarge.asp

Dennis

Not a good article, in my humble opinion. It seems riddled with false premises and unscientific conclusions. He (she?) also immediately discards temp tables as a solution, without any testing or real reason to ignore that solution (which is, IMHO, the best solution more often than not).

(edited to correct spelling and punctuation)

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39828
Joined: 17-Aug-2003
# Posted on: 10-May-2005 16:01:17   

psandler wrote:

MacDennis wrote:

You can find it here: http://www.codeproject.com/aspnet/PagingLarge.asp

Dennis

Not a good article, in my humble opinion. It seems riddled with false premises and unscientific conclusions. He (she?) also immediately discards temp tables as a solution, without any testing or real reason to ignore that solution (which is, IMHO, the best solution more often than not).

The customer who originally wrote the addition for paging on SqlServer did some testing on a very large database, and temp tables were faster in all cases (so the choice was made pretty easily, also because it had to work on SqlServer 7, of course). There is a point where table variables are a bit faster, but as that's with rather small resultsets, it's not that important.

Frans Bouma | Lead developer LLBLGen Pro