Performance issue with varbinary

peljam
User
Posts: 6
Joined: 28-Jan-2009
# Posted on: 21-Apr-2009 02:49:04   

Did a quick search and couldn't find this issue mentioned anywhere. Hopefully you can help...

Anyhow here's the scenario:

- .NET 2.0
- Using SQL 2005, LLBL 2.0
- Need to store documents in a table.
- Have a 10 megabyte document stored in a varbinary(MAX) field in a database table called 'Document'

Reading this document from the table, and writing it (from scratch) to the table, are both reasonably fast. The issue is in updating. To update, I call code something like this:

    DocumentEntity entity = new DocumentEntity(ID);
    entity.Document = document; // this is a 10 meg byte[]
    entity.Save();

on the line:

    entity.Document = document;

There is a huge amount of overhead. Here is a trace from DotTrace:

43.87 % set_Document - 177173 ms - 10 calls - REIA.Data.EntityClasses.DocumentEntity.set_Document(Byte [])
43.87 % SetValue - 177173 ms - 10 calls - SD.LLBLGen.Pro.ORMSupportClasses.EntityBase.SetValue(Int32, Object, Boolean)
43.87 % SetValue - 177173 ms - 10 calls - SD.LLBLGen.Pro.ORMSupportClasses.EntityBase.SetValue(Int32, Object, Boolean, Boolean)
43.87 % DetermineIfFieldShouldBeSet - 177173 ms - 10 calls - SD.LLBLGen.Pro.ORMSupportClasses.FieldUtilities.DetermineIfFieldShouldBeSet(IEntityFieldCore, Boolean, Object)
43.87 % ValuesAreEqual - 177173 ms - 10 calls - SD.LLBLGen.Pro.ORMSupportClasses.FieldUtilities.ValuesAreEqual(Object, Object)
43.87 % CheckArraysAreEqual - 177173 ms - 10 calls - SD.LLBLGen.Pro.ORMSupportClasses.FieldUtilities.CheckArraysAreEqual(Array, Array)
24.18 % GetValue - 97636 ms - 200000000 calls - System.Array.GetValue(Int32)
4.55 % Equals - 18377 ms - 100000000 calls - System.Byte.Equals(Object)

I'm updating the document 10 times in this trace to get a good average, so you can see it eventually boils down to calling GetValue() and Equals() 20 million and 10 million times respectively per update. I suspect it is checking, byte by byte, whether anything changed within the full 10 megs of data for this document. In this case the 10 updates take 177 seconds. Since my thread is only using 50% of the processor, CheckArraysAreEqual() ends up using about 88% of my actual processing in this example.
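
Judging by the trace, I assume the check does something like this internally (my own reconstruction, not the actual LLBLGen source, and the method name is made up):

    // My guess, based on the trace, at what the equality check boils down
    // to. Array.GetValue boxes every byte into an object, and
    // Byte.Equals(Object) then has to unbox it again, which would explain
    // where the time goes.
    static bool CheckArraysAreEqualSketch(Array arrayA, Array arrayB)
    {
        if (arrayA.Length != arrayB.Length)
        {
            return false;
        }
        for (int i = 0; i < arrayA.Length; i++)
        {
            object left = arrayA.GetValue(i);  // boxing allocation per element
            object right = arrayB.GetValue(i); // boxing allocation per element
            if (!left.Equals(right))           // unbox + compare
            {
                return false;
            }
        }
        return true;
    }

That's two boxed objects per byte, so for a 10 meg array each comparison would mean roughly 20 million boxing allocations, which lines up with the call counts above.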

It's kinda funny too that I'm running a 32 bit processor only to be using one 8 bit segment of one register to compare values simple_smile Back in the day, if you had to compare all the bytes in a large array to see if something had changed, one simple optimization was to compare in the largest chunks possible based on register size. I.e. if I'm on a 32 bit processor I should be able to compare in 32 bit chunks up to the boundary of the array, where I might have to drop back down to 16 or 8 bit. That alone would improve performance of this particular problem by a factor of 4 on 32 bit processors and 8 on 64 bit. Anyhow, I realize you're not building a low level driver here, but it's just kinda funny to find a hardware limitation from 20 years ago limiting performance under all those layers of abstraction...
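
To illustrate what I mean, a chunked compare would look something like this (just a sketch with made-up names, comparing 8 bytes at a time; it needs the /unsafe compiler switch):

    // Sketch of the chunked-compare idea: walk both arrays 8 bytes at a
    // time through long pointers, then finish the remaining tail bytes
    // one by one. Requires compiling with /unsafe.
    static unsafe bool ArraysEqualChunked(byte[] a, byte[] b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null || a.Length != b.Length) return false;
        if (a.Length == 0) return true;
        fixed (byte* pa = a, pb = b)
        {
            long* la = (long*)pa;
            long* lb = (long*)pb;
            int longCount = a.Length / 8;
            for (int i = 0; i < longCount; i++)
            {
                if (la[i] != lb[i]) return false; // 8 bytes per comparison
            }
            for (int i = longCount * 8; i < a.Length; i++)
            {
                if (pa[i] != pb[i]) return false; // 0-7 leftover bytes
            }
        }
        return true;
    }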

Anyhow, to get back on topic: if there is a better way to update large fields in the database, please let me know. Maybe I can turn the checking off? In this case I already know I have to update the table...

Thanks!

Otis
LLBLGen Pro Team
Posts: 39616
Joined: 17-Aug-2003
# Posted on: 21-Apr-2009 10:11:44   

You know, you're the first to mention this. simple_smile Indeed, the check is there to see if an array has changed and it compares every byte. It has to, as one byte change should result in an update of the value.

If you already know this is the case, you can take a shortcut:

    entity.Fields[documentFieldIndex or name].CurrentValue = document;
    entity.Fields[documentFieldIndex or name].IsChanged = true;

Of course, make sure the entity's IsDirty flag is set as well. You then won't see the performance impact of the check.
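
Putting it together, the update then looks roughly like this (I'm using the 'Document' field name from your entity as an example; adjust it to your own field):

    // Update without the per-byte equality check. 'Document' is the field
    // name from the entity above.
    DocumentEntity entity = new DocumentEntity(ID);
    entity.Fields["Document"].CurrentValue = document; // set the value directly
    entity.Fields["Document"].IsChanged = true;        // flag the field as changed
    entity.IsDirty = true;                             // flag the entity for saving
    entity.Save();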

To come back to the byte compares vs. long compares: the array is a byte array, and as .NET is a managed framework, we can't simply (as in C/C++ wink ) treat the array pointer as a long* and read the values as longs (4 bytes in a row), as they might not be stored next to each other. So we have to read every individual byte. Even if we then build a long out of those bytes, you won't see the performance increase, as the reading is the slow part.

I do find the read speed pretty slow though. One would think that reading a byte from a byte array would be lightning fast.

Frans Bouma | Lead developer LLBLGen Pro