Determine if Entity.Collection is Fetched
Joined: 01-Feb-2006
Otis wrote:
So say it says 'true', so the call was made. When was that call made? You can't say. Also, what if the call was made but the filter used didn't match any entities, however you now want to see all related entities without a filter... you can't determine which filter was used based on the flag.
To answer these questions, you therefore have to KNOW what kind of prefetch path would be used otherwise and when. This alone is enough info to fill in if the data was fetched or not.
I think we are agreed on that.
It's one of those "can't prove a negative" type things. But if you reverse the logic and have an "IsDefaultCollection" or "BackingFieldWasNullAtEntityCreationTime" flag, then a true result has full meaning whereas a false result doesn't (at least not generally; it's then down to the developer, the app context or whatever to make sense of it, but at the developer's own risk).
(Actually I suppose I'm assuming that a prefetch that is filtered and returns 0 records will still produce a 0 count member collection - is that the case or will it just leave it as null?)
Cheers Simon
Joined: 17-Aug-2003
simmotech wrote:
Otis wrote:
So say it says 'true', so the call was made. When was that call made? You can't say. Also, what if the call was made but the filter used didn't match any entities, however you now want to see all related entities without a filter... you can't determine which filter was used based on the flag.
To answer these questions, you therefore have to KNOW what kind of prefetch path would be used otherwise and when. This alone is enough info to fill in if the data was fetched or not.
I think we are agreed on that.
It's one of those "can't prove a negative" type things. But if you reverse the logic and have an "IsDefaultCollection" or "BackingFieldWasNullAtEntityCreationTime" flag, then a true result has full meaning whereas a false result doesn't (at least not generally; it's then down to the developer, the app context or whatever to make sense of it, but at the developer's own risk).
Sure, those kinds of flags have a meaning when they're true, but (and perhaps that's just me) I'm not convinced that even a 'true' value is really useful, because you can write a perfectly solid routine without the flag.
(Actually I suppose I'm assuming that a prefetch that is filtered and returns 0 records will still produce a 0 count member collection - is that the case or will it just leave it as null?)
They're null as there's no call being made to the collection. The routine SetRelatedEntityProperty is the one which is used to set the entity fetched by a prefetch path. If there's no entity, nothing is done, so the collection member should be null.
Joined: 02-Dec-2004
Otis wrote:
So say it says 'true', so the call was made. When was that call made? You can't say. Also, what if the call was made but the filter used didn't match any entities, however you now want to see all related entities without a filter... you can't determine which filter was used based on the flag.
That's why just a "flag" is far too simple, and I think the concept of one boolean flag that tells you if a collection was prefetched or not is throwing you off. That's not enough history to go on.
I want complete information. Was it prefetched? With what filters? With what sort order? When was it fetched? The more data you can provide, the better.
Joined: 02-Dec-2004
Otis wrote:
*** Example *** Let me conclude with a concrete example in pseudo code (easier to read) to show you one very common scenario for me.
Public Property SomePropertyThatExtendsAGeneratedEntity
    // Check prerequisites on FirstEntityCollection
    If (Not this.FirstEntityCollectionToWorkOn.SyncState.ContainsEntitiesWhere(FirstEntityFields.IsEnabled == True))
        #if ThrowOnMissingData
        Throw new EntityStateException("Not all required data has been prefetched. Please tell your programmers to be more careful.")
        #endif
        #if FetchMissingData
        this.FetchMissingFirstEntityDataWhere(FirstEntityFields.IsEnabled == True)
        #endif
    End If

    // Check prerequisites on SecondEntity
    If (Not this.SecondEntitySyncState.IsFetchedUsing(SecondEntityFields.IsEnabled == True))
        #if ThrowOnMissingData
        Throw new EntityStateException("Not all required data has been prefetched. Please tell your programmers to be more careful.")
        #endif
        #if FetchMissingData
        this.FetchMissingSecondEntityDataWhere(SecondEntityFields.IsEnabled == True)
        #endif
    End If

    // Return results
    Return this.DoSomethingThatNeedsAllEnabledFirstEntitiesAndAnEnabledSecondEntity(this.FirstEntities, this.SecondEntity)
End Property
And then if you want to eliminate the possibility of invalid data, you either dispose of this object before the end of the transaction, or else you change the prerequisite-check predicate to the following:
FirstEntityFields.IsEnabled == True && FirstEntityFields.TombstoneStartTime <= this.FetchTime && (FirstEntityFields.TombstoneEndTime == Null || FirstEntityFields.TombstoneEndTime > this.FetchTime)
That's all cool, but you also know it's impossible to build this. The simple fact alone to match prefetch paths + node filters + predicates is showing this isn't going to work. Furthermore, it breaks a simple rule: the entity itself shouldn't know how it is fetched, as it doesn't fetch itself, the repository, manager or whatever you call it, does that. By adding that code to the entity, you leak an abstraction into the entities.
Don't think of it as the entity knowing how it was fetched. Think of it as the entity knowing what data is available to it so that it can make informed decisions, such as "I don't have enough data, I'll throw an exception."
So your code needs info about that related entity's fetch rule to do something. But is that the right place to do that? If you say "yes", you indeed need a history as you don't know anything about the caller.
Very much yes. I think of the "history" feature as providing context to the data. Without it, it's never known from within a callee if data is right or wrong.
I know it's a difficult matter, but there's no reliable way other than "start the process, fetch the data, process the data, persist results", and if you do that in 1 routine or scattered over 1000 objects, it doesn't matter: if you want to fragment 'fetch the data' during the phase 'process the data', you then run into the situation where you need to make decisions you can't make as there's no info at that level where you likely will make the decision, namely inside callee's.
It's not just about making decisions, either. It's about knowing that the data you have is the data you think it is.
Joined: 17-Aug-2003
joshmouch wrote:
Otis wrote:
So say it says 'true', so the call was made. When was that call made? You can't say. Also, what if the call was made but the filter used didn't match any entities, however you now want to see all related entities without a filter... you can't determine which filter was used based on the flag.
That's why just a "flag" is far too simple, and I think the concept of one boolean flag that tells you if a collection was prefetched or not is throwing you off. That's not enough history to go on.
I want complete information. Was it prefetched? With what filters? With what sort order? When was it fetched? The more data you can provide, the better.
Ok, though then you come into an area where you need to match all query elements. You could in theory do this, however in practise this is IMHO a hell to do, and the API is likely very tough to use (if there's one which is usable).
joshmouch wrote:
Otis wrote:
.... That's all cool, but you also know it's impossible to build this. The simple fact alone to match prefetch paths + node filters + predicates is showing this isn't going to work. Furthermore, it breaks a simple rule: the entity itself shouldn't know how it is fetched, as it doesn't fetch itself, the repository, manager or whatever you call it, does that. By adding that code to the entity, you leak an abstraction into the entities.
Don't think of it as the entity knowing how it was fetched. Think of it as the entity knowing what data is available to it so that it can make informed decisions, such as "I don't have enough data, I'll throw an exception."
You can't do that from within the entity. Let me give you an example. Say you have an Order entity, and in process X it has enough data if it has order line entities and in process Y it has enough data if it has a customer AND order lines.
To be able to say "I have enough data", the order entity INTERNALLY has to know if it's in X or in Y. Though, you don't want this knowledge to leak into the entity, as that would make the code fragile: what if you add a process Z which requires yet another state of Order? Then you have to alter the Order entity, and it locks Order into X, Y and Z, so using it in another process A would require yet another piece of code in Order.
This is typical for the violation of the OO rule "Tell, don't ask".
Let's be drastic here and say: "Order can assume the data it has is valid in the context it is used in, as the owner of the context knows what data it needs to load."
This is reasonable, because what you're doing is testing if the process which consumes the entity has loaded the right information into the entity. But that's not the spot to do so. That should be done in tests which test the process, not the entity inside.
When you look at it THAT way, you'll see that you don't need the history, as you already know what you did, as the ACTIONS which would be in your history to check are part of that process!
So your code needs info about that related entity's fetch rule to do something. But is that the right place to do that? If you say "yes", you indeed need a history as you don't know anything about the caller.
Very much yes. I think of the "history" feature as providing context to the data. Without it, it's never known from within a callee if data is right or wrong.
True, but only for 1 particular use case. In another use case, you can't test from within the entity if all the data is correct unless you KNOW the use case INSIDE the entity. IMHO that's not the way to go and also why LLBLGen Pro has pluggable validators etc.
I know it's a difficult matter, but there's no reliable way other than "start the process, fetch the data, process the data, persist results", and if you do that in 1 routine or scattered over 1000 objects, it doesn't matter: if you want to fragment 'fetch the data' during the phase 'process the data', you then run into the situation where you need to make decisions you can't make as there's no info at that level where you likely will make the decision, namely inside callee's.
It's not just about making decisions, either. It's about knowing that the data you have is the data you think it is.
And why do you have to look at a tracker's history list to know what you did in the process you are in? Isn't the process itself a list of actions, so you know what the actions are and can therefore simply look at yourself and say "if the data is there, it's there; if not, it's not there"? (That sounds pretty dumb, but it's IMHO that simple.)
If you want to squeeze tests like "Do I have the right data and if not, I'll throw exceptions", INSIDE the entity, the entity must know what the right data is for THAT particular context at that particular spot inside the process.
IMHO the entity is then the most unlikely place you want to place that logic, simply because it requires information you don't want to store inside the entity. So what do you do then? Either assume the data is OK (as it is used inside a process you can test as well, e.g. with mocks if you want to) or call out to the process to check if the data in the entity is correct, or better formulated: call a method which will validate you as that method knows the process, the state an entity should have and therefore can check if the entity meets that state. Which IMHO comes down to a meaningless exercise as it would be asking the process if the statements it has are indeed right...
Joined: 02-Dec-2004
Otis wrote:
IMHO the entity is then the most unlikely place you want to place that logic, simply because it requires information you don't want to store inside the entity.
I disagree.
Say I have an entity called HumanEntity, and that entity has a property called LifeGoals that is a union of the entity collections PersonalGoals, FinancialGoals, RelationshipGoals with certain filters applied.
HumanEntity should be the ONLY object that knows how to calculate LifeGoals. This is dirtied by reality, in that limited resources require that some outside manager (DataAccessAdapter) pre-load all of the required data to calculate LifeGoals, which implicitly requires that it at least somewhat knows what to load.
In other words, encapsulation is a design goal of OO.
Do you disagree?
So now you want to unit test HumanEntity (in this case, a permanent internal test). If HumanEntity is just to assume that any data it currently has is the correct data then there is absolutely no way, whatsoever to test that LifeGoals is correct.
In other words, unit testing on a single entity is not possible without that entity knowing its inputs (which is the case with a property that is a context-less entity or entity collection).
Right?
So now you want to drop HumanEntity into the real world. It doesn't know a whole lot about its environment, as it was designed to be as self-contained as possible. Now we introduce CounselorEntity, which doesn't know a whole lot about HumanEntity because it, too, was designed to be very self-contained. CounselorEntity wants to know HumanEntity.LifeGoals. The only thing that HumanEntity can do is assume that it has all of the data that it needs, and not too much.
But this assumption is more likely to cause an error than to not. I'll provide an example to show this.
For this example, we will say the WorldManager object, which is in charge of setting up all of the objects in this world, made a mistake and gave HumanEntity wrong data. Maybe it provided an extra RelationshipGoalEntity that belongs to an unrelated DogEntity, an extra PersonalGoal that has too low a priority to make it into the LifeGoal list, and it forgot to provide any FinancialGoalEntity's at all.
So CounselorEntity reads Human.LifeGoals. However, as it turns out, LifeGoals is missing all of the data from FinancialGoals, and includes incorrect data in PersonalGoals and RelationshipGoals.
Make sense?
So HumanEntity is, at this point, at a completely invalid state, and there is NOT ONE SINGLE OBJECT IN THE ENTIRE UNIVERSE that knows that except for the programmer that put it together. You can't create complete unit tests for HumanEntity to verify it was written correctly, and you can't verify the outside influences to verify that they are using HumanEntity correctly.
In other words, it is not possible to create a 100% reliable HumanEntity easily.
Right?
That is putting WAAAAAY too much faith in a programmer. If you want to talk about making brittle spaghetti code, then talk about asking a programmer to make some changes to an application without having any way to verify that he did not affect HumanEntity in any way.
HumanEntity is doomed.............................
Joined: 17-Aug-2003
Well, I'm not going to repeat myself, I thought I've expressed what I wanted to say. You may disagree, that's perfectly fine by me, however you seem to fail to grasp that there are more ways to do things.
To give you an example: some people find it utterly stupid that there is a Save() method on an entity, while others find it utterly stupid that there ISN'T a save method on the entity but on a separate class, while there is a Load method on a File class (for example).
The thing is: I've learned long ago that discussions about what's good OO are simply a waste of time: not only are these discussions not giving any new insight, but they're also about stupid arguments: as if writing software using methodology A isn't possible but only possible with methodology B.
I found and find this whole discussion pretty uncomfortable, and I therefore won't continue it. Moved to architecture.
Joined: 02-Dec-2004
Otis wrote:
The thing is: I've learned long ago that discussions about what's good OO are simply a waste of time
Well, I'm actually just trying to make a case for a feature request. We use LlblGen for everything, and absolutely love it to death... this just feels like something that's missing.
Otis wrote:
I found and find this whole discussion pretty uncomfortable, and I therefore won't continue it.
I'm not sure why, but sorry!
Joined: 31-May-2005
Frans -
I didn't see a more recent thread on this topic, so I just wanted to see if there was an update. We are building a very large app using LLBL (will likely be ~800 tables when complete), and the lack of this feature is a huge problem.
Consider this simple method:
public decimal GetSumOfOpenOrders(EmployeeEntity emp)
{
    decimal amt = 0m;
    // If emp.Order was never prefetched, this loop runs zero times and 0 is returned.
    foreach (OrderEntity order in emp.Order)
    {
        if (order.Status == Enums.OrderStatus.Open)
            amt += order.Amt;
    }
    return amt;
}
If this was only called in a few places, it would be easy to ensure we prefetched the right data. But this particular piece of code is being called from close to a hundred different spots, many of which are called within functions other than the ones where the data is first fetched.
As a result, our reports and some screens will often show $0 for Open Orders, when in fact there are orders. There's no way we can tell if the data was prefetched. I don't want to change the LLBL source to do this (since I'd like to buy the upgrade), but we've had so many bugs as a result I don't know what else to do.
Edit:
Self-servicing didn't seem like an option since we needed it to access thousands of identical databases (one per company) and multiple RDBMSs ... in the end all we're looking for is a flag on an entity collection indicating that it was populated via a prefetch path operation.
Joined: 17-Aug-2003
You are aware of the fact that the flag is context bound, so only valid right after the fetch? (as in: fetch data, check flag, take action). If the flag is true, the checking code has to have knowledge about the pipeline the data came from and how it was fetched. If a filter was used which resulted in 0 rows in the child nodes of the path, the flag will signal that the prefetch was performed, but not whether a filter was used. Concluding that there are no related rows is therefore a delicate matter in this case: there might be related entities, you just didn't fetch them with your filter.
I do understand that you have problems in some reports/forms where you need this information and the datastructure at hand (the entity graph) doesn't give it to you at the moment (0 rows might indeed mean: nothing was fetched, or a fetch attempt was made but 0 rows were found).
Did you consider doing an aggregate query for obtaining the info? Or could that query become too expensive (slow) ?
To avoid a long debate again: no, we can't keep the filter used with the prefetch path inside the collection, as you can do that too: with the collection you can also keep the prefetch path which fetched it.
Joined: 31-May-2005
Otis wrote:
You are aware of the fact that the flag is context bound, so only valid right after the fetch? (as in: fetch data, check flag, take action). If the flag is true, the checking code has to have knowledge about the pipeline the data came from and how it was fetched. If a filter was used which resulted in 0 rows in the child nodes of the path, the flag will signal that the prefetch was performed, but not whether a filter was used. Concluding that there are no related rows is therefore a delicate matter in this case: there might be related entities, you just didn't fetch them with your filter.
The same can be said about a pre-filled object graph. By your logic, I shouldn't pass around a filled object graph either since the latter methods may not know how it was populated.
The problem is that many users want the option to manually lazy load (or at least verify that a prefetch occurred), but must use adapters for multi-database or multiple-RDBMS support. The lack of a flag in a collection stating whether or not the collection was prefetched (or just fetched to begin with) makes this impossible, unless I'm missing something.
If the option or another workaround is not there, I guess I'll modify the source in v2.6. I had been hoping to buy licenses of v3 when it came out, which is why I was originally reluctant.
I'm not trying to re-open old battle wounds, I'm just trying to use the prefetch feature in a meaningful way. However, it sounds like you're taking a pretty hard stance on something for philosophical reasons, rather than practical ones. Your position seems to be "I don't want to allow users to shoot themselves in the foot", but the reality is that the entity collections pulled using prefetch paths without this property are more dangerous than those pulled with it.
Joined: 17-Aug-2003
JoshLindenmuth wrote:
Otis wrote:
You are aware of the fact that the flag is context bound, so only valid right after the fetch? (as in: fetch data, check flag, take action). If the flag is true, the checking code has to have knowledge about the pipeline the data came from and how it was fetched. If a filter was used which resulted in 0 rows in the child nodes of the path, the flag will signal that the prefetch was performed, but not whether a filter was used. Concluding that there are no related rows is therefore a delicate matter in this case: there might be related entities, you just didn't fetch them with your filter.
The same can be said about a pre-filled object graph. By your logic, I shouldn't pass around a filled object graph either since the latter methods may not know how it was populated.
Absolutely right, same thing indeed.
The problem is that many users want the option to manually lazy load (or at least verify that a prefetch occurred), but must use adapters for multi-database or multiple-RDBMS support. The lack of a flag in a collection stating whether or not the collection was prefetched (or just fetched to begin with) makes this impossible, unless I'm missing something.
Fetching a collection through 'FetchEntityCollection' which then would result in 0 rows fetched, could be tracked, as well as the prefetch path one, however, as argued also in other threads and this one: the flag really is context bound and a leaky abstraction: it requires (deep) knowledge of how the entity graph got fetched to determine the meaning of the value of the flag. So I just wanted to be sure you understood that.
If the option or another workaround is not there, I guess I'll modify the source in v2.6. I had been hoping to buy licenses of v3 when it came out, which is why I was originally reluctant.
I'm not trying to re-open old battle wounds, I'm just trying to use the prefetch feature in a meaningful way. However, it sounds like you're taking a pretty hard stance on something for philosophical reasons, rather than practical ones. Your position seems to be "I don't want to allow users to shoot themselves in the foot", but the reality is that the entity collections pulled using prefetch paths without this property are more dangerous than those pulled with it.
I don't think it's a hard stance. It's more a matter of whether a flag on an object can become invalid without the developer knowing it, and at the same time: could we prevent developers from using the flag to write code which fails in some odd situations (i.e. the ones where the flag is actually not valid anymore) and then coming here to ask us what can be wrong? Also, Josh, please do understand that even though some people want to write software in a given way, we are not obligated to 'thus' add code to bend our framework towards their way of working. I, for one, won't add flags to objects which don't reflect what the user thinks they do, just because some person wants me to. I do want to think about a solution to your problem with the information not being available to your graph-consuming code, but not about flags. (see below as well)
FWIW, I disagree with using collection with prefetch paths without the flag being more dangerous, for the following reason: if in a callee method you need information about how a datastructure is filled by the caller, the caller should tell the callee that, the callee shouldn't have to ask for it. The flag might help, but only in the situation where the data structure is fetched right before consumption. That's not always the case. Also, when will the flag be reset in your application? As the objects in the graph might be merged with some other graph. This will require you to write code based on the graph visitor in the runtime to traverse the graph and reset the flags. If you don't do that, some code will check the flag and will do things you don't expect, which is very very hard to test.
The developer knows when which collections are fetched, as the code building the prefetch path knows this. The developer therefore, if required, can tell a callee which elements are fetched.
There's another problem for entity references. What if I fetch employees and their related departments (m:1, for simplicity), and 1 employee doesn't have a department. Its department reference is null. But it IS fetched. So this requires a flag in the entity, which requires a property. Sorry, but I really won't go there: not everyone uses this, plus some entities in some projects (trust me) have many many relationships which results in many many extra properties.
Ok, bottom line, as you have to finish a project: this is a problem which has only crappy solutions, especially if we add it as a bunch of flags and I won't do that. But there are other options perhaps, for example in the area where the prefetch path is passed to the method which consumes the graph. The prefetch path is now not the best navigational data structure for this, but this could be changed. But before spending time on this, I have to know whether you want to work into this direction or that you just want the flags and everything else is a dead end.
The problem is very complex and not solved with a flag, even if we add one. If this only occurs in some forms, please look into how you could pass info to the form, to the callee so the callee can make a better decision whether related data is there or not.
So in short: I'm not going to add flags, but I'm willing to look into ways where you can obtain the information you need at a given moment for a given object graph fetched with a given set of elements. However it will require help in how you would see this, what would work, in which scenarios it has to work etc.
Perhaps you want calls to your own code so you can build an information object, which might be easier to work with than a flag (as you can toss it away, so it doesn't become stale, the big problem with this flag hack). That's all possible, we just have to know how to make changes.
We're late in v3.0's development cycle (far in beta, the whole development cycle took almost 2 years, so you're a little late but you couldn't know that, so I don't blame you). Microsoft with its 2 free frameworks forced the O/R mapper vendors for .NET to move from a framework oriented approach to other approaches. We moved to a multi-framework designer. This still means we'll keep on supporting (and extending!) our own framework, but the 3.0 release will be mainly about the designer. This means that a change for the runtime might come in the next 3.x version and not in 3.0. Whether we can squeeze it into 3.0 depends on the change, so I can't promise anything.
Joined: 14-Dec-2003
I'll just chime in. How this is solved isn't as important to me as getting it solved. If a flag isn't the right solution, give us the best solution you can come up with. I've been asking for a solution to this for much longer than v3.0 has been in development. So count me as someone who cares.
Joined: 17-Aug-2003
arschr wrote:
I'll just chime in. How this is solved isn't as important to me as getting it solved. If a flag isn't the right solution, give us the best solution you can come up with. I've been asking for a solution to this for much longer than v3.0 has been in development. So count me as someone who cares.
You as the developer know which query is executed, so you can use that to check whether things are fetched or not. I could think of a method in the adapter which is called after each prefetch path node fetch with the parent entity type, the property name and the number of elements fetched. You can then build a data structure and use that in a callee.
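To make the suggestion above concrete, here is a minimal sketch of the kind of data structure a callee could consult. It assumes some callback exists that reports the parent entity type, the property name and the number of elements fetched for each prefetch path node; no such callback is confirmed in this thread, and `PrefetchLog` and its members are hypothetical names, not part of LLBLGen Pro.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical information object; none of these names are LLBLGen Pro API.
public sealed class PrefetchLog
{
    // Key: (parent entity type, property name); value: number of elements fetched.
    private readonly Dictionary<(Type, string), int> _nodes =
        new Dictionary<(Type, string), int>();

    // Intended to be called once per prefetch path node, right after that node is fetched.
    public void RecordNodeFetch(Type parentType, string propertyName, int fetchedCount)
    {
        _nodes[(parentType, propertyName)] = fetchedCount;
    }

    // True if a fetch was recorded for this node, even when it returned 0 elements.
    public bool WasFetched(Type parentType, string propertyName)
    {
        return _nodes.ContainsKey((parentType, propertyName));
    }

    // Number of elements recorded for the node, or null if no fetch was recorded.
    public int? FetchedCount(Type parentType, string propertyName)
    {
        return _nodes.TryGetValue((parentType, propertyName), out int count)
            ? count
            : (int?)null;
    }
}
```

Because the log is a separate object rather than state on the collection, it can be thrown away together with the fetch it describes. It still records only that a fetch happened, not which filter was used, so the caveat raised earlier in the thread applies.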
Joined: 31-May-2005
I am certainly not attached to the idea of a flag, that was the first "easy" solution I could think of doing this. Adding a WasFetched flag to the EntityCollection class, and populating it within the FetchEntityCollectionInternal method seemed like it would fit our needs without changing hundreds of service methods. But changing source code for software that I hope to upgrade makes me nervous ... if I'm not here when the upgrade happens, we're at risk.
The developer knows when which collections are fetched, as the code building the prefetch path knows this. The developer therefore, if required, can tell a callee which elements are fetched.
Yes, but when you're dealing with many developers who are using a common set of service calls, it is far more likely that they'll not pass in a correctly loaded object graph. Currently I have no way to prevent this other than literally writing thousands of integration tests. Lazy loading, or at least throwing exceptions when not loaded correctly, eliminates these considerations, and knowing what was/wasn't prefetched gives us the best of both worlds.
There's another problem for entity references. What if I fetch employees and their related departments (m:1, for simplicity), and 1 employee doesn't have a department. Its department reference is null.
You already have a flag for this - it's called DepartmentId. If we prefetch a single related entity, we know whether or not it was prefetched ... the foreign key would be non-null while the related entity is null. We do these checks all over the place for pseudo lazy loading of single entities. There is no reason to put a flag on the Entity, just the EntityCollection.
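A minimal sketch of the foreign-key check described above. `Employee` and `Department` are simplified stand-in classes written for this illustration; real generated entity classes look different.

```csharp
// Stand-in types for illustration only; real generated entities look different.
public class Department
{
    public int Id { get; set; }
}

public class Employee
{
    public int? DepartmentId { get; set; }      // foreign key field, nullable
    public Department Department { get; set; }  // related entity, null until loaded
}

public static class RelatedEntityCheck
{
    // A non-null FK with a null reference suggests the related entity exists
    // in the database but has not been loaded into this graph.
    public static bool DepartmentLooksUnfetched(Employee e)
    {
        return e.DepartmentId.HasValue && e.Department == null;
    }
}
```

This only works on the side of the relationship that holds the foreign key, which is why it covers single related entities but not collections.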
Lazy loading, or at least contextual awareness of data retrieved, is something that is important to a lot of developers ... but self-servicing doesn't work in many enterprise level cases. LLBL is far more powerful and bug free than EF or L2S, and far easier to use than NH, but all 3 offer some type of lazy loading with multi-database support (2/3 with multiple-RDBMS).
I wasn't trying to take up a lot of your time, so I apologize if this did.
Joined: 17-Aug-2003
JoshLindenmuth wrote:
I am certainly not attached to the idea of a flag, that was the first "easy" solution I could think of doing this. Adding a WasFetched flag to the EntityCollection class, and populating it within the FetchEntityCollectionInternal method seemed like it would fit our needs without changing hundreds of service methods. But changing source code for software that I hope to upgrade makes me nervous ... if I'm not here when the upgrade happens, we're at risk.
Setting a flag after entity collection fetch (so not using a path) is not that hard. Add partial class to DataAccessAdapter, and override OnFetchEntityCollectionComplete, and in there set the flag in the EntityCollection class. It's best in that case to simply define an interface in a separate assembly which is referenced by both the dbgeneric as the db specific project, and you implement that interface in a partial class of the EntityCollection<T> class in the dbgeneric project. In the override of OnFetchEntityCollectionComplete, you utilize that interface to set the flag on the EntityCollection<T> object.
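The shape of that arrangement can be sketched as follows. To keep the example self-contained, it uses stand-in classes instead of the real `EntityCollection<T>` and `DataAccessAdapter`; `IFetchTracked`, `TrackedCollection<T>`, `AdapterBase` and `OnFetchCollectionComplete` are hypothetical names, and the actual signature of `OnFetchEntityCollectionComplete` should be taken from the runtime library's reference documentation.

```csharp
using System.Collections.Generic;

// Would live in a small shared assembly referenced by both projects.
public interface IFetchTracked
{
    bool WasFetched { get; set; }
}

// Stand-in for the partial class of EntityCollection<T> in the db-generic project.
public class TrackedCollection<T> : List<T>, IFetchTracked
{
    public bool WasFetched { get; set; }
}

// Stand-in for an adapter base class exposing a post-fetch hook.
public abstract class AdapterBase
{
    public void FetchCollection<T>(TrackedCollection<T> target, IEnumerable<T> rows)
    {
        target.AddRange(rows);             // simulate the actual fetch
        OnFetchCollectionComplete(target); // hook runs even when 0 rows came back
    }

    protected virtual void OnFetchCollectionComplete(object collection) { }
}

// Stand-in for the partial class of DataAccessAdapter in the db-specific project.
public class TrackingAdapter : AdapterBase
{
    protected override void OnFetchCollectionComplete(object collection)
    {
        // Only the shared interface is needed here, not the concrete collection type.
        if (collection is IFetchTracked tracked)
        {
            tracked.WasFetched = true;
        }
    }
}
```

The interface is what lets the db-specific code set the flag without referencing the collection's concrete type.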
It's the prefetch paths which are a different story.
There's another problem for entity references. What if I fetch employees and their related departments (m:1, for simplicity), and 1 employee doesn't have a department. Its department reference is null.
You already have a flag for this - it's called DepartmentId. If we prefetch a single related entity, we know whether or not it was prefetched ... the foreign key would be non-null while the related entity is null. We do these checks all over the place for pseudo lazy loading of single entities. There is no reason to put a flag on the Entity, just the EntityCollection.
Oh indeed! Should have thought of that!
Ok, about prefetch paths and tapping into that: that's not really that easy at the moment, as there's no method called per node, and the merge method calls entity.SetRelatedEntityProperty, which is a method in the generated code that isn't easily overridable. You could, however, create an override in a partial class of the CommonEntityBase class and have it call a new virtual method you create. You then override that method in a partial class of the entity in question (you could generate that with a template, so that's no extra work per entity) and in there set the flag in the collection if the property is about a collection.
It will take some work, but it's doable and you can add it today. You can also do it differently, namely create a class which recursively traverses a prefetch path and collects which properties are fetched in the path and flattens that in a list. You then can use that as a lookup to verify whether a prefetch path fetched that property or not.
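The "flatten the path into a lookup" idea could look something like the sketch below. It deliberately does not use the LLBLGen runtime types: `PathNode` and `PrefetchPathFlattener` are hypothetical stand-ins, and in real code the property name and nested path would presumably come from members such as `element.Relation.MappedFieldName` and `element.SubPath`, which are used elsewhere in this thread.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for one node of a prefetch path: the property it
// fills on the parent entity plus any nested nodes below it.
public class PathNode
{
    public string PropertyName { get; private set; }
    public List<PathNode> SubPath { get; private set; }

    public PathNode(string propertyName, params PathNode[] subPath)
    {
        PropertyName = propertyName;
        SubPath = new List<PathNode>(subPath);
    }
}

public static class PrefetchPathFlattener
{
    // Returns every property reachable through the path as a dotted name,
    // e.g. "Orders" and "Orders.OrderDetails".
    public static HashSet<string> Flatten(IEnumerable<PathNode> path)
    {
        var result = new HashSet<string>();
        Collect(path, "", result);
        return result;
    }

    private static void Collect(IEnumerable<PathNode> path, string prefix, HashSet<string> result)
    {
        if (path == null)
            return;
        foreach (PathNode node in path)
        {
            string name = prefix.Length == 0 ? node.PropertyName : prefix + "." + node.PropertyName;
            result.Add(name);
            Collect(node.SubPath, name, result);
        }
    }
}
```

The caller keeps the returned set next to the fetched graph and asks `fetched.Contains("Orders.OrderDetails")` instead of reading a flag on the collection itself, so the knowledge of what was fetched stays with the code that did the fetch.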
Another way to do it is to use the ObjectGraphUtils classes to traverse an object graph and per entity obtain the collections of the entities in the graph by calling for each entity the 'GetMemberEntityCollections()' method and set all flags in these collections (e.g. by using the interface I talked about earlier).
Lazy loading, or at least contextual awareness of data retrieved, is something that is important to a lot of developers ... but self-servicing doesn't work in many enterprise level cases. LLBL is far more powerful and bug free than EF or L2S, and far easier to use than NH, but all 3 offer some type of lazy loading with multi-database support (2/3 with multiple-RDBMS).
Context awareness of what's fetched is indeed important, but it's a problem the developer has to deal with. Simple example: fetch a couple of customer entities with a prefetch path on 'Orders' with a filter: all orders from December 2009. Not all customers have such orders, yet the flag is still set. Then you add those customers to a collection of customer entities which were fetched with a different prefetch path. What does the flag mean now? That's really unclear, but it's a problem the developer has to solve, as the developer knows what is fetched when and which code should be aware of that. That's why this problem is so hard and flags won't cut it, even though it might look like they do. So I'd pick the external data structure option over any flag option.
(Thanks btw for the kind words. Multi-catalog support in any of those 3 is not that great btw: EF doesn't support it at all, and there's no overwriting of names, for example.)
Joined: 31-May-2005
I finally had a chance to look at this last night. I'm pretty sure I got it working with partial classes of EntityCollection and DataAccessAdapter (and a bit of reflection). At least my tests passed. I ended up using a combination of your ideas, and put them in templates/tasks for all of our projects. I loved having the ability to add this to code gen without breaking the ability to upgrade - VERY cool feature!
Here's my approach, Frans - I'd love your thoughts on whether I screwed something up (edit - updated based on your comments):
In a common project, I added IEntityCollectionCustom.cs:
public interface IEntityCollectionCustom
{
    bool WasFetched { get; set; }
}
In DatabaseGeneric\HelperClasses\EntityCollectionCustom.cs:
public partial class EntityCollection<TEntity> : EntityCollectionBase2<TEntity>, IEntityCollectionCustom
    where TEntity : EntityBase2, IEntity2
{
    public bool WasFetched { get; set; }
}
In DatabaseSpecific\DataAccessAdapterCustom.cs:
public partial class DataAccessAdapter
{
    protected override void OnFetchEntityCollectionComplete(IRetrievalQuery selectQuery, IEntityCollection2 entityCollectionToFetch)
    {
        //This is to make sure WasFetched is set when there's no prefetch
        var collectionAsIEntityCollectionCustom = entityCollectionToFetch as Common.LLBL.IEntityCollectionCustom;
        if (collectionAsIEntityCollectionCustom != null)
            collectionAsIEntityCollectionCustom.WasFetched = true;
        base.OnFetchEntityCollectionComplete(selectQuery, entityCollectionToFetch);
    }

    //After the normal fetch, walk the prefetch path and flag the related collections it covers
    public override void FetchEntityCollection(IEntityCollection2 collectionToFill, IRelationPredicateBucket filterBucket, int maxNumberOfItemsToReturn, ISortExpression sortClauses, IPrefetchPath2 prefetchPath, ExcludeIncludeFieldsList excludedIncludedFields, int pageNumber, int pageSize)
    {
        base.FetchEntityCollection(collectionToFill, filterBucket, maxNumberOfItemsToReturn, sortClauses, prefetchPath, excludedIncludedFields, pageNumber, pageSize);
        Common.LLBL.PrefetchPathWalker.TraversePath(prefetchPath, collectionToFill);
    }

    //Same as above, for a single entity fetch
    public override bool FetchEntity(IEntity2 entityToFetch, IPrefetchPath2 prefetchPath, Context contextToUse, ExcludeIncludeFieldsList excludedIncludedFields)
    {
        bool status = base.FetchEntity(entityToFetch, prefetchPath, contextToUse, excludedIncludedFields);
        Common.LLBL.PrefetchPathWalker.TraversePath(prefetchPath, entityToFetch);
        return status;
    }
}
Then in my Common.LLBL project, I added PrefetchPathWalker.cs:
using System.Reflection;
using SD.LLBLGen.Pro.ORMSupportClasses;

namespace Common.LLBL
{
    public static class PrefetchPathWalker
    {
        public static void TraversePath(IPrefetchPath2 path, IEntityCollection2 collection)
        {
            if ((path == null) || (collection == null) || (collection.Count == 0))
                return;
            foreach (IPrefetchPathElement2 element in path)
            {
                //all entities in a collection are of the same type, only get prop once
                PropertyInfo prop = collection[0].GetType().GetProperty(element.Relation.MappedFieldName);
                foreach (IEntity2 e in collection)
                    TraverseEntity(element, e, prop);
            }
        }

        public static void TraversePath(IPrefetchPath2 path, IEntity2 entity)
        {
            //a null path (fetch without prefetch) or a null entity means there is nothing to walk
            if ((path == null) || (entity == null))
                return;
            foreach (IPrefetchPathElement2 element in path)
            {
                PropertyInfo prop = entity.GetType().GetProperty(element.Relation.MappedFieldName);
                TraverseEntity(element, entity, prop);
            }
        }

        private static void TraverseEntity(IPrefetchPathElement2 element, IEntity2 entity, PropertyInfo prop)
        {
            if (prop == null) //no property with the mapped field name on this type
                return;
            if (typeof(IEntityCollectionCustom).IsAssignableFrom(prop.PropertyType)) //related prop is an entity collection
            {
                var relatedCollection = prop.GetValue(entity, null) as IEntityCollection2;
                var relatedCollectionAsCustom = relatedCollection as IEntityCollectionCustom;
                if (relatedCollectionAsCustom != null)
                {
                    relatedCollectionAsCustom.WasFetched = true;
                    if ((relatedCollection.Count > 0) && (element.SubPath != null))
                    {
                        TraversePath(element.SubPath, relatedCollection);
                    }
                }
            }
            else //related prop is an entity
            {
                var relatedEntity = (IEntity2)prop.GetValue(entity, null);
                if ((relatedEntity != null) && (element.SubPath != null))
                {
                    TraversePath(element.SubPath, relatedEntity);
                }
            }
        }
    }
}
Thoughts are welcome.
Joined: 17-Aug-2003
Looks good as far as I can see. Couple of things:
Instead of:
    var typeCheck = entityCollectionToFetch.GetType();
    if (typeCheck.IsGenericType)
        ((Common.LLBL.IEntityCollectionCustom)entityCollectionToFetch).WasFetched = true;
do:
    var collectionAsIEntityCollectionCustom = entityCollectionToFetch as IEntityCollectionCustom;
    if(collectionAsIEntityCollectionCustom!=null)
    {
        collectionAsIEntityCollectionCustom.WasFetched = true;
    }
No type fiddling, as you already have the type at hand.
You can use the same with the prefetch path traverser and use GetValue() instead of GetGetMethod and invoke:
private static void TraverseEntity(IPrefetchPathElement2 element, IEntity2 entity, PropertyInfo prop)
{
    if(typeof(IEntityCollectionCustom).IsAssignableFrom(prop.PropertyType)) //related prop is an entity collection
    {
        var relatedCollection = prop.GetValue(entity, null) as IEntityCollection2;
        var relatedCollectionAsCustom = relatedCollection as IEntityCollectionCustom;
        if(relatedCollectionAsCustom!=null)
        {
            relatedCollectionAsCustom.WasFetched = true;
            //Count and TraversePath need the IEntityCollection2 view of the same object
            if ((relatedCollection.Count > 0) && (element.SubPath != null))
            {
                TraversePath(element.SubPath, relatedCollection);
            }
        }
    }
    else //related prop is an entity
    {
        var relatedEntity = (IEntity2)prop.GetValue(entity, null);
        if ((relatedEntity != null) && (element.SubPath != null))
        {
            TraversePath(element.SubPath, relatedEntity);
        }
    }
}
You could also think about pre-generating code to avoid the reflection if that's too slow (as it can be a performance hit if you fetch many many objects). Or even use IL generation, as the properties to get are always the same after a while: http://www.codeproject.com/KB/cs/Dynamic_Code_Generation.aspx?msg=2687388
It takes very little code and can get a great performance increase. We use similar code like the one in the article above for setting ado.net provider specific enum types at runtime in every parameter, and avoiding the reflection hit.
Of course, you then have the problem of building the cache along the way, which might require some locking, so it's not ideal either.
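One way to get a cached, non-reflective getter without hand-written locking is sketched below. It is an assumption-laden illustration rather than what the linked article or the LLBLGen runtime does: it swaps raw IL emit for compiled expression trees (`System.Linq.Expressions`, .NET 4 and later) and leans on `ConcurrentDictionary` for thread safety. `PropertyGetterCache` and `GetGetter` are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Reflection;

public static class PropertyGetterCache
{
    // Key: (type the property is looked up on, property name).
    // ConcurrentDictionary handles the locking; under a race a getter may be
    // compiled twice, but only one delegate ends up stored.
    private static readonly ConcurrentDictionary<Tuple<Type, string>, Func<object, object>> _getters =
        new ConcurrentDictionary<Tuple<Type, string>, Func<object, object>>();

    public static Func<object, object> GetGetter(Type type, string propertyName)
    {
        return _getters.GetOrAdd(Tuple.Create(type, propertyName), key => Compile(key.Item1, key.Item2));
    }

    private static Func<object, object> Compile(Type type, string propertyName)
    {
        PropertyInfo prop = type.GetProperty(propertyName);
        if (prop == null)
            throw new ArgumentException("No public property '" + propertyName + "' on " + type.FullName);

        // Builds: (object instance) => (object)((T)instance).Property
        ParameterExpression instance = Expression.Parameter(typeof(object), "instance");
        Expression body = Expression.Convert(
            Expression.Property(Expression.Convert(instance, type), prop),
            typeof(object));
        return Expression.Lambda<Func<object, object>>(body, instance).Compile();
    }
}
```

In the path walker this would replace the `GetProperty`/`GetValue` pair with something like `PropertyGetterCache.GetGetter(entity.GetType(), element.Relation.MappedFieldName)(entity)`; the type check on the property would then have to be done on the returned value instead of on `prop.PropertyType`.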
Joined: 31-May-2005
Thanks Frans - I updated my previous post to reflect your thoughts and made a small bug fix to my code too (for null PrefetchPaths).
Most of our tables/entities are big enough that the cost to instantiate/populate an entity is at least an order of magnitude greater than the reflection, so I think your suggestion is sufficient since we're only running this on Fetch (particularly since the round-trip to the database will dwarf any local processing).
Thanks again, this is already showing us benefits, Josh