How to dispose of a large entity graph in memory?

hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 14-May-2009 18:18:49   

Hi all,

I have a large entity graph in memory. The root entity is an AssessmentFileEntity, or a collection of those (EntityCollection&lt;AssessmentFileEntity&gt;).

The graph is quite big and holds blob info etc.

Example graph: AssessmentFileEntity -->AssessmentFilePictures -->AssessmentFileDamages -->AssessmentFileDamagePictures -->AssessmentFileDamagePrices ...

Now my question is: how do I dispose of this graph in a proper/simple way?

I use this graph in a page to view a report.
1) I have my page with the button to open the reportViewer (memory of process = 100,000 K).
2) I click the open button; the entity graph is filled (memory of process = 120,000 K; the big size is due to the pictures).
3) I close the report and expect the memory to drop by approx. 20,000 K, but this is not the case (memory of process = 118,000 K).

I have a model which holds the entity graphs in memory:
- View.Model.AssessmentFile (the AssessmentFile/entity graph used for the currently active report)
- View.Model.AssessmentFiles (the collection of AssessmentFiles in case there is more than one; the end-user can browse the AssessmentFiles)

So I tried this code to dispose of the graph; I just set the entity to null, because I can't find a Dispose method on a single instance: View.Model.AssessmentFile = null; Does this mean that there are no more references to the sub-entities? Will they be cleaned up sooner or later?

On a collection I call Dispose: View.Model.AssessmentFiles.Dispose();

When I stress test this opening and closing of the report, I notice that sometimes large portions of the memory are freed-up.

1) So am I correct to assume that with my code I removed the references to the entities, so that the GC can do the cleanup afterwards?
2) Is there a way to instantly remove the large graph from memory? Assume a large graph with a lot of pictures: the memory will go from 100,000 K to 200,000 K ... 300,000 K ... 500,000 K (with large free-ups once in a while). I want more control over this; is that possible?

Thanks for your knowledge sharing on this one wink

Kind regards, Wim

MTrinder
User
Posts: 1461
Joined: 08-Oct-2008
# Posted on: 14-May-2009 21:59:47   

Have you tried calling GC.Collect yourself after setting the references to null ?

Matt

hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 14-May-2009 23:49:51   

Hi Matt,

I have not yet tried GC.Collect, and will definitely try this in my memory profiling tests tomorrow. The thing is that I actually don't want to take that action, because it is a very intensive process, and the hardware is not that high-end (the app runs on tablet PCs).

Could you share your observations or knowledge on my other questions about the entities:

1: To dispose of a single entity, I just set it to null, because I can't find a Dispose method on a single instance. Does this mean that there are no more references to the sub-entities? Will they be cleaned up sooner or later? Is this the proper way to dispose of the entity graph?

2: View.Model.AssessmentFile = null; On a collection I call Dispose: View.Model.AssessmentFiles.Dispose();

So am I correct to assume that with my code I removed the references to the entities (especially to the picture entities that lie 2 or 3 levels deep as sub-entities), so that the GC can do the cleanup afterwards?

3: Are there any other ways to immediately remove the entities from memory without calling GC.Collect?

Kind regards, Wim

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39859
Joined: 17-Aug-2003
# Posted on: 15-May-2009 10:11:50   

What's the LLBLGen Pro version and runtime library build number?

If you have a large graph in memory, simply dereference the root object and the graph will be garbage collected by the GC when it runs. GC.Collect is in general not recommended for production usage.

The main things to check:
- Do you reference elements in the graph after the report has been run? If so, set these references to null so the graph is dereferenced.
- Do you fetch things in an adapter which you keep alive? This can be problematic in v2.6, as you get a growing string cache.
- Do you need the PK -> FK relation to be present? Say you have a customer entity with thousands of orders. When you fetch this graph, every order has a reference to the customer, but the customer also has references to all the orders through its Orders collection. If you set an order's Customer property to the customer, the order is added to the Orders collection. If you keep the Customer around, it also keeps the Orders around.
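That last point can be sketched with plain, hypothetical Customer/Order stand-ins (this is not the generated LLBLGen code, just an illustration of why a back-reference keeps a graph alive):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins to illustrate the PK -> FK back-reference.
class Customer
{
    public readonly List<Order> Orders = new List<Order>();
}

class Order
{
    private Customer customer;

    public Customer Customer
    {
        get { return customer; }
        set
        {
            customer = value;
            // Similar to graph syncing: setting Order.Customer also
            // adds this order to Customer.Orders.
            if (value != null && !value.Orders.Contains(this))
            {
                value.Orders.Add(this);
            }
        }
    }
}

class Program
{
    static void Main()
    {
        var customer = new Customer();
        var order = new Order { Customer = customer };

        order = null; // dereference the order...

        // ...but customer.Orders still holds a reference to it,
        // so the GC cannot collect the order while customer lives.
        Console.WriteLine(customer.Orders.Count); // prints 1
    }
}
```

So dereferencing one side of the relation is not enough if the other side is still reachable.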

The garbage collector reclaims memory every once in a while, but it's not deterministic. The CLR also doesn't hand memory back to the OS (even if the objects are already destroyed), so it can create new objects faster: it doesn't have to ask the OS for memory it already has.

Frans Bouma | Lead developer LLBLGen Pro
hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 15-May-2009 10:36:28   

Hi Otis,

LLBLGen 2.6, runtime from one DLL: 2.6.9.327 (in the readme I see: Release date: 06-Apr-2009).

This is what I do:

1: I fetch a graph from the database (using adapter, with a using statement so that the DataAccessAdapter is disposed of).

2: The graph is set on a Model class which is used by the reportViewer.

3: When I close the reportViewer, I dispose of the graphs set on the model by setting the root entity of the graph to null (is this correct?)

On a collection of entity graphs I just dispose of them by calling EntityGraphCollection.Dispose() (is this correct?)

Should I also call DiscardSavedFields() on an entity graph? Is this useful? If so, should I iterate through the collection, call it on each entity first, and dispose of the collection afterwards?

About your main things to check:
1: Do I reference elements in the graph after the report? Well, I'm not quite sure, but I don't think so. The graph is used to build up the report, but it is not really referenced afterwards, imo.

2: Do I fetch things in an adapter which I keep alive? I don't think so, because I fetched the graph within a using block, so the adapter is already disposed of, I guess.

I placed my questions in blue so you can notice them between all my blabbering wink

EDIT: 3: Do I need the PK -> FK relation to be present? I also have these references. If I understand you correctly, setting the Customer to null would cause the Orders to be cleaned up afterwards. So in my case I can just set the root entity to null, and this will cause their direct sub-entities to be cleaned up too? Because those also have sub-entities, etc.

The Garbage collector recollects memory every once in a while, but it's not deterministic. The CLR also doesn't free up memory to the OS (even if the object is already destroyed) as it can then create new objects faster as it doesn't have to ask memory from the OS, it already has that memory. --> Nice info to know! wink

Kind regards, Wim

hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 15-May-2009 12:30:59   

Hmm, with the opening and closing of the reportViewer it seems to work, but when I stay on the reportViewer and Sign/Unsign the report multiple times, the memory just keeps rising :s

EDIT:

I used dotTrace to take memory dumps between each Sign and Unsign action. During each of these actions I use the same code to dispose of the entities. But it seems like the entities are being kept alive, because the number of PictureEntity objects just keeps rising.

This is probably caused by something still referencing these entities; am I right to assume that?

I'm going to do some weekend work to get to the bottom of this (any observations on my in-text questions are always welcome wink).

I'll keep you informed on my observations.

Kind regards, Wim

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39859
Joined: 17-Aug-2003
# Posted on: 16-May-2009 11:54:39   

hypo wrote:

Hi Otis,

LLBlgen 2.6, runtime from one dll: 2.6.9.327 (in the readme I see: Release date: 06-apr-2009)

This is what I do:

1: I fetch a graph from the database (using Adapter, with a using statement so that the dataAccessAdapter is disposed off.)

2: The graph is set on a Model class which is used by the reportViewer.

3: When I close the reportViewer, I am disposing of the graphs set on the model. I am doing this by setting the root entity of the graph to NULL (is this correct?)

.NET always works in the same way: if some object has a reference to another object, it's kept in memory. So if you clean up all references to the objects in a graph, the graph is no longer kept in memory.

On a collection of entity(graphs) I just dispose of them by calling EntityGraphCollection.Dispose() (is this correct?)

If you don't reference the entities anywhere else, just clear the collection or let the collection go out of scope. Dispose doesn't clear the collection.
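A minimal, self-contained sketch of that point (a plain List stands in for the LLBLGen EntityCollection; GC.Collect is used here only to make the demo observable, not something to put in production code):

```csharp
using System;
using System.Collections.Generic;

class BigEntity
{
    public byte[] Blob = new byte[1000000]; // stand-in for picture data
}

class Program
{
    static void Main()
    {
        var collection = new List<BigEntity> { new BigEntity() };
        var weak = new WeakReference(collection[0]);

        // Clearing the collection drops its references to the entities;
        // if nothing else references them, they become collectible.
        collection.Clear();

        GC.Collect(); // demo only: forces a collection so we can observe it
        GC.WaitForPendingFinalizers();

        // Likely False: the entity has been reclaimed.
        Console.WriteLine(weak.IsAlive);
    }
}
```

Disposing the collection instead of clearing it would not have this effect, since Dispose doesn't remove the contained entities.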

Should I also call DiscardSavedFields() on an entityGraph? Is this usefull? If so should I iterate through the entityGraphCollection and call it first and afterwards dispose the collection?

You do get memory reductions, so the memory IS reclaimed, just not when you want it to happen; that's the GC. You don't need to remove these inner objects; all you need to do is make sure the graph isn't referenced anywhere anymore.

EDIT: 3: - do you need the PK-> FK relation to be present? I also have these references. But If I understand you correctly: Setting the Customer to NULL would cause the Orders to be disposed off afterwards. So in my case I can just set the EntityRoot to NULL, this will cause for their direct subentities to disposed off, is that possible? Because they also have subentities etc etc.

See my answer above: if setting the root reference to null means there are no other references OUTSIDE the graph to the graph's elements, the graph is orphaned and will be cleaned up when the GC runs. Also, if your reference to the graph goes out of scope when the report is done, you don't have to set the reference to null, as that doesn't make a difference.

Please see the garbage collection docs in the .NET documentation.

Keep in mind the following as well: myOrder.Customer = myCustomer; means that myOrder is part of myCustomer.Orders as well. If you keep myCustomer around after the report, myOrder is thus also present after the report.

So if you add objects to a graph in a method A, they're not removed from the graph after A is done. Make sure that what you're caching is indeed not altered, so the graphs you're caching aren't altered either (as they're kept around and therefore shouldn't change).

So it's always the same answer: if there are references still pointing to objects in the graph, the graph is kept in memory.

Frans Bouma | Lead developer LLBLGen Pro
hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 16-May-2009 19:01:05   

Ok otis,

I think I'm completely following.

I just have to look for anything that is still referencing the graph. It's the weekend now, but I think I have a lead to follow with the Signing and Unsigning ....

Because I never close the page, the local objects I create (which hold a reference to the graph) never go out of scope. This did happen when I closed the page and opened it again. So I'm going to look into that.

Good explanation of the whole way the GC works. I should indeed take some docs and start studying them (but I'm one of those Y-generation kids that learns by example stuck_out_tongue_winking_eye )

I'll keep you informed of my observations, but I definitely got enough info from your side.

Kind regards, Wim

hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 18-May-2009 11:22:59   

Otis wrote:

So if you add objects to a graph in a method A, they're not removed from the graph after A's done, make sure that what you're caching is indeed not altered, so also the graphs you're caching aren't altered (as they're kept around and therefore can't change).

Could you explain this a bit more in detail please? Or with a simple example? (Because I think I'm not quite following)

I still notice memory increases during the sign and unsign, but I don't see what's holding a reference to the graph that keeps it alive.

I use code like this, but based on what you said I'm not sure anymore if this is correct (the Unsign is almost the same, except for the status change). So I do the following:
1: Change the entity
2: Save the entity
3: Refetch the entity
4: Publish the updated entity so that the other modules stay in sync

(1)
//Set the status to signed
mAssessmentService.SetStatus(View.Model.AssessmentFile, AssessmentFileStatus.Signed);

(2)
//Save the AssessmentFile
mAssessmentService.SaveAssessmentFile(View.Model.AssessmentFile, aConnectionString);

(3)
//refetch the AssessmentFile (to prevent out-of-sync in the entities)
View.Model.AssessmentFile = mAssessmentService.GetAssessmentFile(View.Model.AssessmentFile.Id, aConnectionString);

(4)
// Publish the event that the AssessmentFile has been updated
mEventAggregator.GetEvent<AssessmentFileUpdatedEvent>().Publish(View.Model.AssessmentFile);

Kind regards, Wim

Walaa avatar
Walaa
Support Team
Posts: 14993
Joined: 21-Aug-2005
# Posted on: 18-May-2009 17:11:17   

Could you explain this a bit more in detail please? Or with a simple example? (Because I think I'm not quite following)

I still notice memory increase in the sign and unsign, but I don't see what's holding/keeping a reference to the graph so that it stays alive

Wouldn't a code profiler help here?

hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 18-May-2009 17:25:29   

Walaa wrote:

Could you explain this a bit more in detail please? Or with a simple example? (Because I think I'm not quite following)

I still notice memory increase in the sign and unsign, but I don't see what's holding/keeping a reference to the graph so that it stays alive

Wouldn't a code profiler help here?

Do you mean something like dotTrace?

I'm doing some memory profiling with that tool, and after each Sign/Unsign (the code above) a graph is created by calling the GetAssessmentFile method. I'm just searching like crazy for why it stays there ........

I just wanted to know if my code in any way creates a reference that I don't see? But I don't think so simple_smile

And with dotTrace I don't find any incoming references on the root of the graph, so I think my next task is to check each entity in the graph for INCOMING references that are NOT from the graph itself, because these are probably the ones that keep the graph ALIVE. ==> Can anyone confirm whether my thoughts are right here?

Kind regards, Wim

Walaa avatar
Walaa
Support Team
Posts: 14993
Joined: 21-Aug-2005
# Posted on: 19-May-2009 07:58:53   

I'm not sure, but you seem to be on the right track. We'll be waiting for your findings.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39859
Joined: 17-Aug-2003
# Posted on: 19-May-2009 11:00:18   

With dotTrace or Ants, for example, you can profile memory: after the method that creates the report has been run, the objects used in that method which aren't passed into it or cached elsewhere should be gone. If you still find references to them in memory with dotTrace or Ants, you can check which element keeps them in memory. But as things are cleaned up eventually, as you said, I think the memory you are seeing being consumed is what I said earlier: the CLR still has the memory pages in its possession and will give them up when the OS needs pages due to memory pressure.

Frans Bouma | Lead developer LLBLGen Pro
hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 19-May-2009 11:49:36   

If the CLR is indeed holding on to the memory, then I still need to find a solution for it, because stress testing the app often causes an OutOfMemoryException.

I can test this with GC.Collect, right? Will this force the CLR to free up the memory? (I haven't done this before, so I can test this just in DEBUG.)

@Otis: I am just a bit worried about my code, lines 3 and 4. In line 3 I re-assign my object on my model; imo this destroys the reference the model had to the old one, no?

In line 4 I publish the change of the AssessmentFile entity. As other modules catch this publish, they will update their models with entity information provided by the entity I pass through.

Could this be a possible source of references to the graph that keep it alive?

I think I made a design mistake here, and that I should publish the ID of the changed entity instead, so that other modules can request info from a repository based on that ID and then update their models with the fetched info.

Can anyone give some advice on my findings here?
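The idea above in a self-contained sketch (hypothetical names; a List stands in for the subscribers' caches behind the event aggregator; GC.Collect is demo-only): publishing the entity pins the whole graph, publishing just the ID does not.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the entity with its heavy blob data.
class AssessmentFile
{
    public int Id;
    public byte[] Pictures = new byte[5000000]; // stand-in for picture blobs
}

class Program
{
    // Stand-in for what subscribers keep after a Publish.
    static readonly List<object> subscriberCache = new List<object>();

    static void Main()
    {
        var file = new AssessmentFile { Id = 42 };
        var weak = new WeakReference(file);

        // Publishing the entity itself: a subscriber's cache now pins
        // the whole graph (pictures included) in memory.
        subscriberCache.Add(file);
        file = null;
        GC.Collect(); // demo only
        Console.WriteLine("entity published, alive: " + weak.IsAlive); // True

        // Publishing only the ID: the graph becomes collectible.
        subscriberCache.Clear();
        subscriberCache.Add(42); // just the ID
        GC.Collect(); // demo only
        Console.WriteLine("id published, alive: " + weak.IsAlive); // likely False
    }
}
```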

EDIT:

Otis wrote:

With dotTrace or Ants, for example, you can profile memory: after the method that creates the report has been run, the objects used in that method which aren't passed into it or cached elsewhere should be gone. If you still find references to them in memory with dotTrace or Ants, you can check which element keeps them in memory. But as things are cleaned up eventually, as you said, I think the memory you are seeing being consumed is what I said earlier: the CLR still has the memory pages in its possession and will give them up when the OS needs pages due to memory pressure.

Things do get cleaned up when I open and close the reportViewer. This is probably because the view, presenter and model are disposed of. But with the Sign/Unsign action on the same view, I don't see any memory decrease. That's why I was/am a bit worried about my Sign/Unsign code, as explained above.

Kind regards, Wim

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39859
Joined: 17-Aug-2003
# Posted on: 20-May-2009 10:09:00   

hypo wrote:

If the CLR is indeed holding on to the memory, then I still need to find a solution for it, because stress testing the app often causes an OutOfMemoryException.

What's 'stress testing'? A real-world scenario? The GC doesn't run all the time; it's easy to eat up all the memory before it kicks in. That doesn't mean a real-world scenario will do the same.

I can test this with GC.Collect right? Will this force the CLR to free up the memory? (I haven't done this before so I can test this just in DEBUG)

MS advises against putting GC.Collect in your code, and it also isn't guaranteed that it directly collects memory.

@Otis: I am just a bit worried about my code: Line 3 and 4: In line 3 I re-assign my object on my model. IMO this destroys the references the model had to the old one no?

If no other object refers to it, it indeed does. Please keep in mind what I said earlier: if you assign myCustomer to myOrder.Customer, it also means myOrder is placed in myCustomer.Orders, which means that if you keep myCustomer around and reassign myOrder to a different order, the original object is still kept in memory.

Also keep in mind that this is really a .NET related issue: object references, the CLR and the GC are building blocks you have to know about if you want to avoid memory leaks in .NET.

In line 4: Here I publish het change of the AssessmentFile entity. As other modules catch this publish, they will update their models with entity-information provided by the entity I pass through. Could this be a possible cause for references to the graph so that it is kept alive?

I think that here I made a design mistake, and that I should publish the ID of the changed entity so that other modules can request info from a repository based on that ID and then update their models based on the fetched info.

Can anyone give some advice on my findings here?

Let's stop this series of questions, as they won't solve your problem. You should use a profiler, check what references what at a given time, and then check whether that's indeed OK or whether you have forgotten something. The basics of how things work are described in the MSDN documentation and also briefly in this thread.

EDIT:

Otis wrote:

With dotTrace or Ants, for example, you can profile memory: after the method that creates the report has been run, the objects used in that method which aren't passed into it or cached elsewhere should be gone. If you still find references to them in memory with dotTrace or Ants, you can check which element keeps them in memory. But as things are cleaned up eventually, as you said, I think the memory you are seeing being consumed is what I said earlier: the CLR still has the memory pages in its possession and will give them up when the OS needs pages due to memory pressure.

Things do get cleaned up when I open and close the reportViewer. This is probably because the view, presenter and model are disposed of. But with the Sign/Unsign action on the same View, I don't see any memory decreases. That's why I was/am a bit worried about my Sign/Unsign code as I explained above.

Then use a profiler and check what references the objects after you've called Unsign on a view. It's the same with 'my code is slow' (or worse: "YOUR code is slow"): that's never going to get sorted out without measuring. Measure, find the real cause of the problem you're seeing, and examine the results. A profiler is the tool for that.

Frans Bouma | Lead developer LLBLGen Pro
hypo
User
Posts: 34
Joined: 14-Oct-2008
# Posted on: 20-May-2009 10:47:42   

Yep the stress testing is like the production environment stuck_out_tongue_winking_eye

Otis wrote:

Also keep in mind that this is really a .NET related issue: object references, the CLR and the GC are building blocks you have to know about if you want to avoid memory leaks in .NET.

True indeed! The root of my problem probably lies in a lack of knowledge about object references. The root of my solution lies in gaining that knowledge and using profiler tools.

So I will go and gain knowledge on those topics wink

Kind regards, Wim

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39859
Joined: 17-Aug-2003
# Posted on: 20-May-2009 10:51:29   

Good simple_smile Timing is essential for profiling. So here's a tip: start the profiler, but don't start profiling itself yet. Start the app from the profiler; then, in your app, place a halting statement, like popping up a message box, or, if it's a console app, a ReadLine. When the application reaches that point, start profiling in the profiler and continue execution by, for example, closing the message box. Then stop profiling by taking a snapshot. This snapshot shows how things are at the time you started profiling.

This is simpler than starting profiling from the beginning, as that gives you a lot of info you don't want, making it harder to look at the specific info you're after.
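For a console app, that halting statement could look like this sketch (RunReport is a hypothetical placeholder for whatever code you want to investigate):

```csharp
using System;

class Program
{
    static void Main()
    {
        // Pause so you can start the profiler's recording first.
        Console.WriteLine("Start profiling now, then press Enter...");
        Console.ReadLine();

        RunReport(); // hypothetical: the code under investigation

        // Pause again so you can take the snapshot before the app exits.
        Console.WriteLine("Take a snapshot now, then press Enter to exit...");
        Console.ReadLine();
    }

    static void RunReport()
    {
        // Placeholder for the real report-building work.
    }
}
```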

Frans Bouma | Lead developer LLBLGen Pro