Loading Entity Collection Takes too long

Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 11-Jul-2005 02:17:25   

Hi,

I am using a serialized EntityCollection (Adapter) to sync data between the desktop and a PDA. I am finding a problem when reading the data back into the EntityCollection on the PDA: it just takes too long.

I am reading two records, where one of the records contains about 20 sub-records. The information in these records is limited. Reading this XML (which is about 140 KB) into an EntityCollection takes about 4 minutes. This is obviously way too long.

Is there a reason for the EntityCollection taking such a long time? Is there a way to improve this performance?

As you would agree, this is way too long for the application to be usable. I have avoided using a SQL CE database by providing a custom data store (XML files). All components are in place. The only real issue now is that loading the XML into the EntityCollection (ReadXml) takes way too long.

I have also tried serializing on the device, which takes equally long.

Any ideas/help would be greatly appreciated. Thanks Hameed.

Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 11-Jul-2005 08:10:40   

Do you use compact or normal XML export? The problem, I think, is that during deserialization it uses Activator.CreateInstance to create new instances for the graph, and that might be an expensive operation on the CF.NET...

Frans Bouma | Lead developer LLBLGen Pro
Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 12-Jul-2005 10:21:30   

Hi Frans,

I am using compact XML, thinking this would improve things, but it has not had much effect. The performance I am getting right now is very poor and not really usable.

What could I do to improve the performance?

What options do I have, given that I have spent much of my time developing the architecture to support this?

Any idea would be helpful.

Thanks.

Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 12-Jul-2005 10:50:35   

I'm not really sure what causes the poor performance... reading in the XML can't be it, I use .NET XML objects to do that.

It does use some reflection to re-instantiate factories for the collections. I'm not sure if that happens a lot in your particular graph (as in: if you have 100 customer entities all with order objects in their collections, you'll get 100 times the order factory reinstantiation, though that should be slow the first time and fast all the other times, IMHO).
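
To illustrate the "slow the first time, fast all the other times" expectation: a minimal sketch of resolving a factory by name once and reusing it afterwards. It is written in Python rather than C# to keep it short, and it is not LLBLGen Pro's actual code. The names `OrderEntityFactory`, `KNOWN_FACTORIES` and `FactoryCache` are hypothetical, and the dictionary lookup merely stands in for a reflection call such as Activator.CreateInstance.

```python
class OrderEntityFactory:
    """Hypothetical stand-in for a generated entity factory."""

    def create(self):
        return {"entity": "Order"}


# Hypothetical name-to-class table; stands in for the reflection lookup.
KNOWN_FACTORIES = {"OrderEntityFactory": OrderEntityFactory}


class FactoryCache:
    """Resolves a factory by type name once and reuses the instance afterwards."""

    def __init__(self):
        self._instances = {}
        self.lookups = 0  # how many times the expensive by-name path ran

    def get(self, type_name):
        factory = self._instances.get(type_name)
        if factory is None:
            self.lookups += 1
            factory = KNOWN_FACTORIES[type_name]()  # the "expensive" step
            self._instances[type_name] = factory
        return factory


cache = FactoryCache()
# 100 customers that each need an order factory for their Orders collection.
factories = [cache.get("OrderEntityFactory") for _ in range(100)]
print(cache.lookups)  # prints 1: only the first request paid the lookup cost
```

If a deserializer re-resolves the factory for every collection instead of caching it like this, the by-name cost is paid once per collection in the graph, which is the scenario worth ruling out.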

Do you see a drop in performance just with this piece of XML or with all XML you send? I find everything slow on the CF.NET, so I don't know what to expect in performance when I test it here. Knowing some of the characteristics of the XML you want to re-instantiate into objects is crucial, I think, to see what might be the culprit, as profiling on a PDA is very hard (if possible at all).

Frans Bouma | Lead developer LLBLGen Pro
Marcus
User
Posts: 747
Joined: 23-Apr-2004
# Posted on: 12-Jul-2005 11:00:17   

Hameed wrote:

Any idea would be helpful. Thanks.

Have you tried writing a custom deserializer? This obviously means "hardcoding" the schema, but I have seen substantial performance increases from doing this. If your app is relatively small and only contains a few entities (and especially if they are read-only), then you can simply read the XML and create hard entity objects yourself...

Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 12-Jul-2005 11:50:56   

I am also loading some DataTables (static data that will not change), which I just pass as DataTables (XML). On the client side I read these back and they seem to load quickly. In addition, I am loading about 2000 records into another DataTable. This takes a while and performance is quite poor, but this I can get around. My greatest concern is that I am serializing an EntityCollection with Patient as the main parent entity, with 2 records, and one of these records has about 20 sub-records in its Contacts collection. Loading these takes several minutes, which is a real problem.

Marcus wrote:

Have you tried writing a custom deserializer? This obviously means "hardcoding" the schema, but I have seen substantial performance increases from doing this. If your app is relatively small and only contains a few entities (and especially if they are read-only), then you can simply read the XML and create hard entity objects yourself...

How would I achieve this? Could you please elaborate?

Marcus
User
Posts: 747
Joined: 23-Apr-2004
# Posted on: 12-Jul-2005 12:23:28   

Hameed wrote:

How would I achieve this? Could you please elaborate?

You have to parse the XML manually and create the object graph yourself. The advantage of this is that you already know the structure of the schema and hence what to expect next when parsing the tree. You can therefore optimize the instantiation of new objects as you traverse the XML tree.

Be careful, however, as you will incur an additional maintenance cost: you are effectively hardcoding the schema in the deserialization code.

I have seen this method used in some high-traffic SOA implementations in order to increase message throughput. They achieved a 50% increase in throughput by eliminating the generic XmlSerializer.

I'm not sure what the overhead of deserializing is on the CF, but my guess is that this is where your bottleneck is...
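
As a rough illustration of the hand-written approach, here is a minimal sketch in Python (chosen for brevity; the same shape would apply in C#). The XML layout, the element and attribute names, and the `Patient` / `Contact` classes are all assumptions made up for this example. They are not the format LLBLGen Pro emits.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Hypothetical schema: patients with a nested Contacts collection.
SAMPLE_XML = """
<Patients>
  <Patient Id="1" Name="Patient A">
    <Contacts>
      <Contact Id="10" Note="first visit"/>
      <Contact Id="11" Note="follow-up"/>
    </Contacts>
  </Patient>
  <Patient Id="2" Name="Patient B"/>
</Patients>
""".strip()


@dataclass
class Contact:
    contact_id: int
    note: str


@dataclass
class Patient:
    patient_id: int
    name: str
    contacts: List[Contact] = field(default_factory=list)


def read_patients(xml_text: str) -> List[Patient]:
    """Build the object graph directly; the schema is hardcoded in this function."""
    root = ET.fromstring(xml_text)
    patients = []
    for p in root.findall("Patient"):
        # Objects are constructed directly, with no by-name type lookup.
        patient = Patient(int(p.get("Id")), p.get("Name"))
        for c in p.findall("Contacts/Contact"):
            patient.contacts.append(Contact(int(c.get("Id")), c.get("Note")))
        patients.append(patient)
    return patients


patients = read_patients(SAMPLE_XML)
print(len(patients), len(patients[0].contacts))
```

The trade-off is the one described above: every schema change has to be mirrored by hand in `read_patients`.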

Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 12-Jul-2005 12:34:39   

Marcus wrote:

Hameed wrote:

How would I achieve this? Could you please elaborate?

You have to parse the XML manually and create the object graph yourself. The advantage of this is that you already know the structure of the schema and hence what to expect next when parsing the tree. You can therefore optimize the instantiation of new objects as you traverse the XML tree.

Be careful, however, as you will incur an additional maintenance cost: you are effectively hardcoding the schema in the deserialization code.

I have seen this method used in some high-traffic SOA implementations in order to increase message throughput. They achieved a 50% increase in throughput by eliminating the generic XmlSerializer.

Deserializing entities isn't using the XmlSerializer but a routine inside the entitycollection/entity. The XmlSerializer can't deal with cyclic references and interface based types which is the reason why the custom xml deserializer code is there.

The XmlSerializer generates C# code which in fact contains that hard-coded schema reading code you mentioned :-) .

The compact XML reader is quite straightforward though: it only creates a collection using reflection when it has to, but this might take a while on the CF. I'm not sure, which is the hard part: if you're unsure what the bottleneck is, it's impossible to optimize.

So what I wondered was: if the deserialization of a single collection of entities is quite fast (and on the emulator here it is OK) but a graph isn't, then it has something to do with the graph structure. So if a single entity with a 2-entity collection inside already takes ages, it's definitely somewhere in the graph reconstruction code, which is then somewhat easier to track down.

Frans Bouma | Lead developer LLBLGen Pro
Marcus
User
Posts: 747
Joined: 23-Apr-2004
# Posted on: 12-Jul-2005 12:40:28   

Otis wrote:

Deserializing entities isn't using the XmlSerializer but a routine inside the entitycollection/entity. The XmlSerializer can't deal with cyclic references and interface based types which is the reason why the custom xml deserializer code is there.

The XmlSerializer generates C# code which in fact contains that hard-coded schema reading code you mentioned :-) .

Ahhh.. I think they wrote custom serializers and deserializers to achieve their performance boost, which eliminated the reflection etc... I can find out exactly what they did if you are interested.

Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 12-Jul-2005 13:45:09   

Frans, you are right. The single-level serialization is fine. The graph is the issue. I have not tested this recently, but I did when I was checking whether the design would work. At that time I was only serializing the first layer (not the graph), which loaded fine. Now that I think about it, I think the performance issue has to do with deserializing the graph.

I will do more tests today to determine this but I think the problem does lie in the graph.

Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 12-Jul-2005 14:33:00   

Ok, to be sure: could you test if a deserialization of 1 entity with a single filled collection with just 1 or 2 entities in that collection would give slowdowns as well? (as that would be just 3 entities to deserialize which should be very fast).

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 12-Jul-2005 14:34:05   

Marcus wrote:

Otis wrote:

Deserializing entities isn't using the XmlSerializer but a routine inside the entitycollection/entity. The XmlSerializer can't deal with cyclic references and interface based types which is the reason why the custom xml deserializer code is there.

The XmlSerializer generates C# code which in fact contains that hard-coded schema reading code you mentioned :-) .

Ahhh.. I think they wrote custom serializers and deserializers to achieve their performance boost, which eliminated the reflection etc... I can find out exactly what they did if you are interested.

Any info on how they did it on the CF.NET platform is welcome :-) , thanks. The XmlSerializer on .NET 1.x generates code behind the scenes using reflection (which is slow the first time, but very fast the times after that). However, this requires CodeDom and a compiler, which I'm sure they don't have on a PDA ;-) .

(offtopic: you did win that smartphone at TechEd? And bummer I didn't run into you again, as I wanted to discuss some things :-) )

Frans Bouma | Lead developer LLBLGen Pro
Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 12-Jul-2005 14:54:11   

I have run some tests:

For a collection with only 2 entities it took ~14 seconds to load. (This takes into account checking the integrity of the file and deserializing an entity that holds user details.)

For a collection with 2 entities, with one of the entities having sub-entities (graph), it took about 1 min and 10 seconds.

As you can see the time difference is huge.

Hameed

Marcus
User
Posts: 747
Joined: 23-Apr-2004
# Posted on: 12-Jul-2005 17:00:47   

Otis wrote:

(offtopic: you did win that smartphone at TechEd? And bummer I didn't run into you again, as I wanted to discuss some things :-) )

Oh yes... :-) I got the Smartphone this year... ;-P

Here she is:

...I tried to meet up with you a couple more times at AskTheExperts but I think we kept missing each other... I didn't have your timetable or your mobile number and I was too hungover from the previous night (each day) to think of something clever like emailing you!!! Doooh

I'll send you my IM offline if you like so that we can have a chat, as I also wanted to discuss a couple of things (since we got cut short by our guest ;-) ).

Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 12-Jul-2005 17:35:59   

I ran some more tests. This time I extracted the DataTable for each of the entities using the SelfServicing template.

I then added the 3 DataTables to a DataSet and exported the XML. I then loaded the XML back into a DataSet on the PPC client, changed my code to use data from the DataTables, and then loaded the lists.

The difference was amazing. The two entities with a graph that took 1 min and 10 sec to load this time took only 10 sec.

This is a saving of a minute.

I am now thinking (always a dangerous task :-) ): could I export all data as DataTables and load the data into a SQL CE database (creating it on the fly), and then use the LLBLGen objects? I have seen a third-party tool that does this called SSCEDirect.

My first preference is of course to use my in-memory data which I export to the device, but performance would need to improve.
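
The flat-table approach described above can be sketched as follows, in Python for brevity. This is only an illustration under assumptions: the table and column names (`PatientId`, `ContactId`, `Note`) are hypothetical, and plain lists of dictionaries stand in for the DataTables loaded from the DataSet XML.

```python
from collections import defaultdict

# Hypothetical flat "tables": one list of rows per entity type,
# linked only by a foreign key column instead of nested objects.
patients = [
    {"PatientId": 1, "Name": "Patient A"},
    {"PatientId": 2, "Name": "Patient B"},
]
contacts = [
    {"ContactId": 10, "PatientId": 1, "Note": "first visit"},
    {"ContactId": 11, "PatientId": 1, "Note": "follow-up"},
]


def attach_contacts(patient_rows, contact_rows):
    """Rebuild the parent/child graph by grouping child rows on the foreign key."""
    by_patient = defaultdict(list)
    for row in contact_rows:
        by_patient[row["PatientId"]].append(row)
    # Each patient gets a copy of its row plus the matching contact rows.
    return [
        dict(p, Contacts=by_patient.get(p["PatientId"], []))
        for p in patient_rows
    ]


graph = attach_contacts(patients, contacts)
print(len(graph[0]["Contacts"]))  # prints 2
```

The relations are rebuilt in a single pass over the child rows, so the cost grows with the number of rows and does not depend on how deeply the graph is nested.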

Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 12-Jul-2005 20:53:21   

Marcus wrote:

Otis wrote:

(offtopic: you did win that smartphone at TechEd? And bummer I didn't run into you again, as I wanted to discuss some things :-) )

Oh yes... :-) I got the Smartphone this year... ;-P

Here she is:

...I tried to meet up with you a couple more times at AskTheExperts but I think we kept missing each other... I didn't have your timetable or your mobile number and I was too hungover from the previous night (each day) to think of something clever like emailing you!!! Doooh

haha :-) no problem :-)

I'll send you my IM offline if you like so that we can have a chat, as I also wanted to discuss a couple of things (since we got cut short by our guest ;-) ).

;-) Cool, I should have msn somewhere.. :-)

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 12-Jul-2005 20:55:55   

Hameed wrote:

I ran some more tests. This time I extracted the DataTable for each of the entities using the SelfServicing template.

I then added the 3 DataTables to a DataSet and exported the XML. I then loaded the XML back into a DataSet on the PPC client, changed my code to use data from the DataTables, and then loaded the lists.

The difference was amazing. The two entities with a graph that took 1 min and 10 sec to load this time took only 10 sec.

This is a saving of a minute.

I am now thinking (always a dangerous task :-) ): could I export all data as DataTables and load the data into a SQL CE database (creating it on the fly), and then use the LLBLGen objects? I have seen a third-party tool that does this called SSCEDirect.

My first preference is of course to use my in-memory data which I export to the device, but performance would need to improve.

I think Activator.CreateInstance() is then very, very slow. I'm not sure, but this could be the problem.

I have to do some tests with Activator.CreateInstance to see what the speed is, though I think that's it. The main difference between my code and the DataSet deserializer is that mine uses Activator.CreateInstance for creating factories for the collections. For the rest it uses similar code (as in: interpreting nodes in an XML document using XPath-like functions...)

I don't expect my XPath routines to be the culprit, as then a single collection should be slow as well.

Frans Bouma | Lead developer LLBLGen Pro
Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 13-Jul-2005 10:10:15   

Thanks Frans,

In that case I will await your analysis and hopefully, prayerfully, have a resolution.

;-)

Hameed.

Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 13-Jul-2005 11:08:22   

I hope to have a solution later today :-) .

I'll torture the emulator with a loop of Activator.CreateInstance calls, to see how slow it really is. We'll get there :-)

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 13-Jul-2005 13:29:16   

Hameed wrote:

I have run some tests:

For a collection with only 2 entities it took ~14 seconds to load. (This takes into account checking the integrity of the file and deserializing an entity that holds user details.)

When I deserialize using compact XML on the PDA emulator, it takes less than a second to deserialize 4 entities in a collection (without related entities). When I use verbose XML it takes about 1 second. The verbose case uses Activator.CreateInstance for the entity instantiation, so that's not slow.

I now wonder if the PDA emulator is simply not useful for real-life testing, as it might be that the PDA is way slower in real life than the emulator shows.

For a collection with 2 entities, with one of the entities having sub-entities (graph), it took about 1 min and 10 seconds.

As you can see the time difference is huge.

Hameed

I'll now try to emulate this, also with a larger entity set.

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 13-Jul-2005 13:51:17   

Hmm... I get fabulous speeds. :-S

I have created 2 XML files, one compact and one verbose. I've created a Customer object, 1 address object (customer-address 1:1, two times), 10 orders, 10 order lines per order, each of them referring to 1 product (so 10 different products).

The compact XML file is 178 KB, the verbose one is 317 KB. The compact XML file loads (thus the complete hierarchy) in 3 seconds, the verbose one in 5 seconds.

I'm really confused now. I'll now pack the test project and send it to you, as I don't have a real PDA, just the emulator. Perhaps the emulator is really fast on my box and doesn't match real-world speed...

Frans Bouma | Lead developer LLBLGen Pro
Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 13-Jul-2005 14:16:06   

That's really confusing. I also thought that it may be the device and hence tried the emulator, but for me that was way too slow. The memory allocated to the emulator was too low and I am unsure how to increase it.

Anyway, I look forward to the project and will run it on the PDA that I have.

My PDA is : iPAQ H4350 PDA (400MHz, 64MB, SD/MMC Card, Pocket PC 2003)

I believe this is at the upper end of PDA devices.

Thanks

Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 13-Jul-2005 14:38:42   

Ok. :-)

I did a soft reset on the emulator and wiped my SQL CE db with that (aaaaargggg). I've now been fiddling for an hour to get the SQL on the PDA in a format it can work with. I'm almost there, and will then send you the project. I finally figured out how to copy/paste a file on the emulator (right-alt -> ctrl -> c / p! :-) )

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39933
Joined: 17-Aug-2003
# Posted on: 13-Jul-2005 15:20:57   

File sent. If you don't receive it, please let me know.

Frans Bouma | Lead developer LLBLGen Pro
Hameed
User
Posts: 34
Joined: 02-May-2005
# Posted on: 13-Jul-2005 15:51:35   

Thanks, Frans, for the files.

I have run them on my device and, as you stated, it loads the XML files reasonably fast. The customerhierarchycompact file took about 15 seconds to load. The 4 entities took a couple of seconds to load.

There does not appear to be much difference between what you are doing and what I have done.

I will need to look into this further.

Hameed
