FetchEntityCollection/PrefetchPath issue

Drew
User
Posts: 4
Joined: 13-Nov-2007
# Posted on: 13-Nov-2007 21:24:13   

.NET Framework 2.0 LLBLGen Pro 2.0

Hi -

I'm using FetchEntityCollection() to return all "Treatment" objects from our database. A Treatment has a FK to "TreatmentType", which I populate on the Treatment object using a prefetch path:


DataAccessAdapter adapter = new DataAccessAdapter(_connectionString);
                
EntityCollection<TreatmentEntity> treatments = new EntityCollection<TreatmentEntity>();
                
IPrefetchPath2 prefetchPath = new PrefetchPath2((int)EntityType.TreatmentEntity);
prefetchPath.Add(TreatmentEntity.PrefetchPathTreatmentType);
                
adapter.FetchEntityCollection(treatments, null, prefetchPath);


If I look at an individual Treatment object, I see that its TreatmentType object is populated as expected. However, what I did not expect to see was that the TreatmentType object has its own EntityCollection<TreatmentEntity> called "Treatment", which appears to contain all Treatments within the retrieved EntityCollection that share the same TreatmentType.

Is there a way to prevent the Treatment.TreatmentType.Treatment collection from being populated?

Thanks!

Walaa
Support Team
Posts: 14995
Joined: 21-Aug-2005
# Posted on: 14-Nov-2007 10:20:49   

Is there a way to prevent the Treatment.TreatmentType.Treatment collection from being populated?

What's wrong with this? Why don't you need them?

No extra query is executed for this. When a TreatmentType entity is assigned to a Treatment's TreatmentType property, the Treatment entity is automatically added to the TreatmentType's Treatment collection.

Drew
User
Posts: 4
Joined: 13-Nov-2007
# Posted on: 14-Nov-2007 15:43:17   

The purpose of getting all of the Treatments at once is that they are cached on the server for performance reasons. When a client requests a specific Treatment, it's retrieved from this cache. This is occurring in a .NET remoting environment, so the problem is that because the Treatment.TreatmentType.Treatment collection is populated with the "related" treatments, serialization of the Treatment object is taking much longer than if this collection was not being populated.

Populating this collection by default bloats the object, which is impacting our performance. If I have a collection of Treatments and want to know for a given Treatment in that collection which other Treatments have the same TreatmentType, I could simply use FindMatches. I wouldn't expect (or need) a collection of those Treatments to be hanging off of each Treatment's TreatmentType object.

Walaa
Support Team
Posts: 14995
Joined: 21-Aug-2005
# Posted on: 14-Nov-2007 16:41:25   

Well I don't think this will enlarge the transferred object, as these are references not copies of objects. Would you please check the size of each of those collections? And see if it's greater than one, or not?

A workaround that just came to my mind: you may use a SubPath from TreatmentType back to Treatment with a predicate that is always false. That way, I think the collection might not get filled.

Drew
User
Posts: 4
Joined: 13-Nov-2007
# Posted on: 14-Nov-2007 17:55:28   

Here's some more code that may explain what's happening a bit better:


DataAccessAdapter adapter = new DataAccessAdapter(_connectionString);

IPrefetchPath2 prefetchPath = new PrefetchPath2((int)EntityType.TreatmentEntity);
prefetchPath.Add(TreatmentEntity.PrefetchPathTreatmentType);

System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();

//
// 1. Get Treatment 3090 using FetchEntityCollection() (this represents our caching scenario)
//

EntityCollection<TreatmentEntity> treatments = new EntityCollection<TreatmentEntity>();

adapter.FetchEntityCollection(treatments, null, prefetchPath);

TreatmentEntity treatment1 = treatments[treatments.FindMatches(new FieldCompareValuePredicate(TreatmentFields.TreatmentID, null, ComparisonOperator.Equal, 3090))[0]];

// *** Returns 852 ***
int count1 = treatment1.TreatmentType.Treatment.Count;

System.IO.MemoryStream ms1 = new System.IO.MemoryStream();
bf.Serialize(ms1, treatment1);

// *** Returns 2365225 ***
long length1 = ms1.Length;

//
// 2. Get Treatment 3090 using FetchEntity()
//

TreatmentEntity treatment2 = new TreatmentEntity(3090);

adapter.FetchEntity(treatment2, prefetchPath);

// *** Returns 1 ***
int count2 = treatment2.TreatmentType.Treatment.Count;

System.IO.MemoryStream ms2 = new System.IO.MemoryStream();
bf.Serialize(ms2, treatment2);

// *** Returns 16279 ***
long length2 = ms2.Length;


As you can see, treatment1 is much larger than treatment2, even though they represent exactly the same Treatment, simply because treatment1 was retrieved in the context of other Treatments using FetchEntityCollection(). As I mentioned, the Treatment object is being passed back to a client via .NET remoting, meaning that we are not talking about a simple object reference being passed back, but are in fact talking about the entire object having to be serialized and passed back, then deserialized on the client.
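
To illustrate why the size differs so much, here is a minimal sketch that counts the objects reachable from a single treatment. The `Treatment`, `TreatmentType`, and `GraphDemo` classes are hypothetical stand-ins written for this example, not the LLBLGen Pro generated entities, and the traversal only approximates what a reference-following serializer such as BinaryFormatter visits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the generated entities (assumption, not LLBLGen Pro code).
class TreatmentType
{
    public readonly List<Treatment> Treatments = new List<Treatment>();
}

class Treatment
{
    public TreatmentType Type;
}

static class GraphDemo
{
    // Counts the distinct objects a reference-following serializer would visit from root.
    public static int CountReachable(Treatment root)
    {
        HashSet<object> seen = new HashSet<object>();
        Stack<object> pending = new Stack<object>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            object current = pending.Pop();
            if (current == null || !seen.Add(current)) continue;

            Treatment t = current as Treatment;
            if (t != null) pending.Push(t.Type);

            TreatmentType tt = current as TreatmentType;
            if (tt != null)
            {
                foreach (Treatment child in tt.Treatments) pending.Push(child);
            }
        }
        return seen.Count;
    }

    // Builds one type shared by 'count' treatments and returns the first treatment.
    public static Treatment Build(int count)
    {
        TreatmentType type = new TreatmentType();
        for (int i = 0; i < count; i++)
        {
            Treatment t = new Treatment();
            t.Type = type;
            type.Treatments.Add(t);
        }
        return type.Treatments[0];
    }

    static void Main()
    {
        Console.WriteLine(CountReachable(Build(852))); // 853: every sibling plus the type
        Console.WriteLine(CountReachable(Build(1)));   // 2: the treatment and its type
    }
}
```

With 852 treatments sharing one type, serializing any one of them drags the other 851 along, which matches the difference between the two stream lengths above in direction if not in exact bytes.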

Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 14-Nov-2007 18:55:47   

One way to do this is to HIDE the relation TreatmentType - Treatment in the designer (keep Treatment - TreatmentType). This makes the collection go away in TreatmentType, which you don't need anyway, and it will solve your problem.

Frans Bouma | Lead developer LLBLGen Pro

Drew
User
Posts: 4
Joined: 13-Nov-2007
# Posted on: 14-Nov-2007 21:42:28   

That may be a workaround, but it doesn't seem like a comprehensive solution. Going into the designer is a manual step that someone will have to remember to do. If, for example, another table is added to the database to which the Treatment table will have a FK, we'll have to remember to go back into the designer and hide that new relation. Because our Treatment table has FKs to a good number of other tables in our database, this must be repeated for each of those, as well as for any other objects (besides Treatment) that we want to avoid this on. In our case that is all of them, since we do not need any of our objects to be "fleshed out" in this way, and it is in fact something that we actively want to avoid.

IMHO, adding a collection that may potentially be very large onto an object, like the Treatment collection that is being added onto the TreatmentType object, should have to be an explicit step. That way, nothing is added to your object that you haven't asked for. It seems that would adhere to the same philosophy taken with the PrefetchPath design, and I'm not sure why it should differ for this case.

I feel this should at least be an option that can easily be set "off" at some level in future versions, seeing its potential for negatively impacting performance, especially in a .NET remoting scenario such as I've presented.

Nonetheless, thanks for your suggestions. Are there no other possible workarounds? (I'll look into the SubPath idea Walaa mentioned.)

Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 15-Nov-2007 11:59:59   

Drew wrote:

That may be a workaround, but it doesn't seem like a comprehensive solution. Going into the designer is a manual step that someone will have to remember to do. If, for example, another table is added to the database to which the Treatment table will have a FK, we'll have to remember to go back into the designer and hide that new relation. Because our Treatment table has FKs to a good number of other tables in our database, this must be repeated for each of those, as well as for any other objects (besides Treatment) that we want to avoid this on. In our case that is all of them, since we do not need any of our objects to be "fleshed out" in this way, and it is in fact something that we actively want to avoid.

It's part of the entity management features LLBLGen Pro offers: if you do myOrder.Customer = myCustomer; it makes sure that myCustomer.Orders.Add(myOrder); is done as well, and vice versa.

If you hide the PK->FK relation, the statement myCustomer.Orders.Add(myOrder); isn't executed, because there's no Orders collection. This graph management is a mandatory feature, as it frees developers from babysitting their graphs of entities.
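
As a rough illustration of that synchronization, here is a minimal sketch with hand-written classes. `Customer`, `Order`, and `SyncDemo` are hypothetical names for this example only; the real generated entities do considerably more, and this is not how the runtime implements it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins, not LLBLGen Pro generated classes.
class Customer
{
    public readonly List<Order> Orders = new List<Order>();
}

class Order
{
    private Customer _customer;

    public Customer Customer
    {
        get { return _customer; }
        set
        {
            if (ReferenceEquals(_customer, value)) return;
            // Leave the old parent's collection, if there was one.
            if (_customer != null) _customer.Orders.Remove(this);
            _customer = value;
            // Join the new parent's collection: this is the back reference.
            if (_customer != null) _customer.Orders.Add(this);
        }
    }
}

static class SyncDemo
{
    // Assigns two orders to one customer and returns the size of its Orders collection.
    public static int Run()
    {
        Customer myCustomer = new Customer();
        Order first = new Order();
        Order second = new Order();
        first.Customer = myCustomer;
        second.Customer = myCustomer;
        return myCustomer.Orders.Count;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 2
    }
}
```

No call to Orders.Add appears in the calling code, yet the collection holds both orders. Without an Orders collection on the parent there is nothing for the setter to fill.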

In edge cases like the one you're describing, having graph management isn't what you want: the graph might be there, but sending parts of the graph over the wire should be possible without sending the rest of the graph over the wire.

IMHO, adding a collection that may potentially be very large onto an object, like the Treatment collection that is being added onto the TreatmentType object, should have to be an explicit step. That way, nothing is added to your object that you haven't asked for. It seems that would adhere to the same philosophy taken with the PrefetchPath design, and I'm not sure why it should differ for this case.

In fact, prefetch paths rely on the feature I detailed above: you don't have to specify the path Order-Customer if you've specified Customer-Order already, the relation from order to customer is already done for you.

Switching that off by default will make a lot of code become cumbersome to write because you have to make sure the backreferences are done as well. It has its advantages in edge cases though.

I feel this should at least be an option that can easily be set "off" at some level in future versions, seeing its potential for negatively impacting performance, especially in a .NET remoting scenario such as I've presented.

Fully agreed.

Nonetheless, thanks for your suggestions. Are there no other possible workarounds? (I'll look into the SubPath idea Walaa mentioned.)

There's a more hacky way. In v2.5 we added a feature which allows you to track entities for deletion. This can be done in a collection inside an entity collection. To prevent the hierarchies of these entities-which-should-be-deleted from being sent over remoting, we added a flag, which is accessible via a protected property 'MarkedForDeletion'. If that flag is set, collections contained by the entity aren't serialized over the wire.

This is pretty hacky, and not really recommended.

There's another way as well (assuming fast serialization is used):
- use the adapter 2-class scenario
- in the derived entity class for TreatmentType ('MyTreatmentType'), override AddToMemberEntityCollectionsQueue
- don't implement any code in the override

No collections will be serialized from MyTreatmentType.

This isn't ideal either.

So hiding the relation is IMHO the best thing. The TreatmentType is a lookup entity anyway, I think, so 1:n relations from it aren't that necessary. About the other relations from Treatment: if you don't fetch these, they're not in the graph. If these are lookup entities as well: hide the 1:n relations from the lookups. You don't need them in most cases.

About adding an option to disable entity management on that relation: that comes down to performing an action in the designer anyway, just like hiding the relation.

Isolating a subgraph for serialization is something else though. It's an interesting idea, I'll add it to the v3 list of things to research.

Frans Bouma | Lead developer LLBLGen Pro