Should I De-Normalize my table??

Posts   
 
    
MarcoP avatar
MarcoP
User
Posts: 270
Joined: 29-Sep-2004
# Posted on: 04-Aug-2010 16:13:22   

my site has a daily deal and I show the user the total number of sales for that deal for the day. I fetch this with a simple query using a count() aggregate.

i also have a deal history page, where i also show the count using the same aggregate. now to my question, is it an expensive call to keep callling the aggregate in my queries or should i de-normalize my table to store the calculate count?

it just seems i need an additional "moving part" that calculates and updates the denormalized table and keeps it in sync.

Thoughts?

thanks!

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39769
Joined: 17-Aug-2003
# Posted on: 04-Aug-2010 16:47:50   

why not do what we do on the forum as well? When you store a sale, you increase a counter. That counter is the # of sales. so the deal is an entity and it gets a counter which is updated. The user doesn't know if the sales figure is too low/high.

Anyway, if you have 100 sales a day, and the count has to aggregate over a table with less than a thousand rows, I wouldn't bother. It's when you run an aggregate many times a day that things get slow.

Frans Bouma | Lead developer LLBLGen Pro
MarcoP avatar
MarcoP
User
Posts: 270
Joined: 29-Sep-2004
# Posted on: 04-Aug-2010 16:53:20   

Otis wrote:

why not do what we do on the forum as well? When you store a sale, you increase a counter. That counter is the # of sales. so the deal is an entity and it gets a counter which is updated. The user doesn't know if the sales figure is too low/high.

Anyway, if you have 100 sales a day, and the count has to aggregate over a table with less than a thousand rows, I wouldn't bother. It's when you run an aggregate many times a day that things get slow.

Yeah, I would have to aggregate over the Transaction table, which definetly has well over a thousand rows in the table.

Would I need to perform some sort of database locking so when two users update the "counter" at the same time the correct value is written? or possibly the following would guard against that:


entity.counter = (total sold for deal) + 1

// instead of
entity.counter ++;

MarcoP avatar
MarcoP
User
Posts: 270
Joined: 29-Sep-2004
# Posted on: 04-Aug-2010 22:04:32   

bump (for frans) simple_smile

MTrinder
User
Posts: 1461
Joined: 08-Oct-2008
# Posted on: 05-Aug-2010 21:31:52   

Yes, you need to ensure that the process which updates the counter value occurs in a serializable transaction, to ensure that no other process can update it at the same time. You also need to use some sort of optimistic concurrency on the record so that you can ensure that the value has not been updated since you retrieved it.

Matt

MTrinder
User
Posts: 1461
Joined: 08-Oct-2008
# Posted on: 05-Aug-2010 21:35:19   

Another option would be to use a materialized (calculated) view (assuming your db supports them - most do) which included the aggregated counter column. This would be slighlty slower on inserts as it the db would need to recalculate the value, but avoids all of the concurrency problems of storing the value.

You would be able to use a LLBLGen Typed view to retrieve this value.

Matt

MarcoP avatar
MarcoP
User
Posts: 270
Joined: 29-Sep-2004
# Posted on: 05-Aug-2010 22:11:16   

MTrinder wrote:

Yes, you need to ensure that the process which updates the counter value occurs in a serializable transaction, to ensure that no other process can update it at the same time. You also need to use some sort of optimistic concurrency on the record so that you can ensure that the value has not been updated since you retrieved it.

Matt

Isolation Levels always trip me up. So, if I did an UpdateEntitiesDirectly and set the Counter column to an expression that does the calculation (select count(transId) from transaction where dealid = xx), do I still need to mess with the Isolation level or can I leave it as its default isolation level?

Is the serialized level to guard against as I perform the 'select count(...' in the expression, no other transactions can insert records into the transaction table?

Also, are serializable transactins dangerous (deadlocks, etc.)? It should be a quick update to one table

MarcoP avatar
MarcoP
User
Posts: 270
Joined: 29-Sep-2004
# Posted on: 06-Aug-2010 15:03:57   

With the following command, is there ever a chance the transaction row count and total sales count would not be in sync (not including deleting rows)? Does the default isolation level work for this? Thanks!

INSERT INTO [transaction] ...

UPDATE deal SET TotalSales = (select count(*) from [transaction] where dealid = @dealId) WHERE dealId = @dealId

MTrinder
User
Posts: 1461
Joined: 08-Oct-2008
# Posted on: 10-Aug-2010 21:55:50   

Pretty much the only way to guarantee that they will not get out of sync is to use Serializable as the transaction level - this is the safest, but also provides the biggest likleyhood of database contention and poor performance, as the DB will basically not allow any other transactions to even read the rows that are used in your transaction. Whether this is acceptable is something that only you can decide...!

There is a pretty good description of the way the different isolation levels work on technet

http://technet.microsoft.com/es-es/library/ms173763.aspx

Matt