Recursive saves

Posts   
 
    
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 25-Oct-2012 09:11:41   

Imagine I create a list of Invoice entities (they come from various sources so I Linq Union them together and sort by their Date).

Each Invoice has child SalesItems so I need to use a recursive save.

I do a SaveEntityCollection and look in the database and they are seemingly in random order - ie not their order in the list.

I try saving them one at time and get the same result.

So I step through and notice that saving the first also saves item 7 in the list. The second also saves items 8 & 9.

Realisation dawns! The recursive save of one entity will also pick up any others that share the Customer entity and save those at the same time.

In this particular case, I can store the CustomerID rather than the Customer (since they are not new customers, just reference data) and everything gets saved in the required order.

This got me thinking, is there a way of doing a 'downwards' only save? In this case an Invoice gets saved; the contents of its SalesItem collection should be saved but a Customer entity on the invoice doesn't recurse to find all the invoices on which it is the customer; and the Product entity on a SalesItem doesn't find all the SalesItems in which it is used.

Cheers Simon

Walaa avatar
Walaa
Support Team
Posts: 14995
Joined: 21-Aug-2005
# Posted on: 25-Oct-2012 20:44:53   

Why is it important to save entities in a collection in their index order? What you did is a good workaround.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39910
Joined: 17-Aug-2003
# Posted on: 26-Oct-2012 14:00:36   

Problem is: it's not a tree, but a directed graph. This means there is no 'down' or 'up': the graph is traversed to find saveable entities, and which aren't necessary to be saved are skipped. This might mean that an entity referenced by an entity to save references is not needed to be saved but does reference an entity which is. This is basic topological sorting of a directed graph.

If Customer references another entity 'Address' and Customer isn't changed nor new, but Address is changed, and you save 'Invoices', the address will never be saved as it's ... reached through an entity which isn't saved? As there's no up/down, how is Address in this case related to the entities which are saved?

The ordering in which things are saved shouldn't matter as well, as the data in the database is order-less: it's a set, not an ordered set.

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 29-Oct-2012 10:07:21   

@Walaa If a users enters a list of invoices in a grid then they would expect them to be saved in that order, otherwise the next time they display those items, they will appear jumbled up. The grid sort order would usually be Invoice Date then ID (PK) which retains that entry order. (assume Invoice date doesn't store time)

@Frans Yes, I see that as a general case and it is useful in most cases to just be able to set recurse on and let LLBLGen take care of everything.

But this relatively simple example shows that this isn't always the desired behaviour.

In this particular case, as well as being a 'directed graph' (I wasn't much of a scholar so I didn't study this type of stuff so forgive my ignorance), there is a clear hierarchy: a list of new, unrelated invoices each with a list of new unrelated items on it. It is only the use of actual Entities (rather than just their IDs) that changes this into a graph (I 'll admit I have no idea what a basic topological is sort is and I'm guessing that a 'directed' graph is one where a list of nodes is provided as a starting point).

I also agree with you that it is an unordered set to a database (and LLBLGEN) but it has an implicit ordering within the client-side list and it is actually possible to make saves in the desired order as long as the links between nodes are not present (using IDs not navigators) or not traversed (ignoring certain navigators).

OK, so if I wanted to write a custom recursive save, how could I go about it? I was thinking along the lines of duplicating what is there now more or less but having entities from the collection that started the recursion being resorted in the generated list-to-save to maintain their order. Or maybe being able to specify a list of types like CustomerEntity which become deadends and don't propagate further nodes to check.

Does a UOW simply work out what needs to be saved; then the order in which those saves need to be made and then perform the save sequentially?

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 30-Oct-2012 05:50:44   

IMHO, and ordering which depends on PK is a sign of a wrong choice of ID. The ordering in which you show the saved rows should be up to your BL, not the order in which they arrived the DB. Generally this is ruled by the PKs (i.e: OrderId, ProductId), or a good choice will be to show them ordered by some usual field (i.e.: product.Name).

I understand your point though. However LLBLGen cannot make assumptions on that and neither can it assume that the DB is an ordered set.

Said that, your workaround is a good one if you want save them in your own preferred order. UOW works the same way, the difference is that you can add multiple graphs to a UOW, for instance:

uow.AddForSave(customer1);
uow.AddForSave(customer2);

... where customer1 and customer2 and two separate graphs to be saved. Then they will be saved recursively in a single transaction when you do uow.Commit().

Another option you already saw is to save them in your own loop, example:

adapter.StartTransaction(...);
// save addition dependent objects

// save the list
for(int i=0; i< myOrdersList.Count; i++)
{
     adapter.SaveEntity(myOrdersList, false, false);
}
adapter.Commit();

That is not that hard if your graph is not that deep, otherwise your workaround looks good in this case.

David Elizondo | LLBLGen Support Team
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 26-Nov-2012 09:41:42   

I've been thinking more about this.

We are all agreed that a fetch from a database is a set with no ordering.

With regard to using a PK for sorting, I partly with Daelmo. But its not the PK itself per se that I want to sort against but rather the order that items were added to the database. Its just that an auto-PK happens to do that for free. The BL can control how saved rows are displayed of course but 'addition order' is still valuable as a tie-breaker. e.g. Date then AdditionOrder. Adding a DisplayOrder column is possible but can be difficult to maintain for some scenarios.

What I would like is to have a little more control over what 'recurse' does. To me, recurse means start at the top (each entity in the list) and process each child item until you can drill down no further. To LLBLGEN, it means 'build a graph involving every entity and work out what to do' But, the source is not an unordered set like a fetch, it is an ordered list whose meaning is lost under certain circumstances (ie when a related entity is used rather than just the PK of that related entity) when recursion is switched on.

I don't know how all this works internally but I'm guessing that after the graph is built, it is translated into an ordered list of items to save.

Is there any way to get control over that list of items to save so that I can perhaps reorder it to reflect the source list before the saving starts?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39910
Joined: 17-Aug-2003
# Posted on: 27-Nov-2012 09:08:38   

There are a lot of graph search algorithms, and I'll point you to two: Depth-first search (which is what we use): http://en.wikipedia.org/wiki/Depth_first_search and Breadth first search (which is what you suggest): http://en.wikipedia.org/wiki/Breadth-first_search

For the sake of traversing the graph, it doesn't really matter, as both will end up getting what you want: a graph traversal and the information collected along the way.

In the context of recursive saves however, there's one technical detail that's important: the way to order the dependencies is done with a well known algorithm: topological sort: http://en.wikipedia.org/wiki/Topological_sorting This is important because recursive save has two tasks to do: 1) traverse the graph to see what should be saved and 2) determine the right order in which this is possible.

This algorithm relies on depth-first search, as it simply traverses the directed, acyclic graph in depth-first order and collects which nodes are visited along the way and in the end you simply traverse from back to forth and you have the order. simple_smile Breadth first search doesn't give you that.

Your proposal also has a problem with contained collections which also have to preserve an order, or when an object is referenced by multiple objects. In v1 we had a self-designed algorithm for recursive saves, which did more or less what you propose, but this lead to problems in edge-cases, as you could have graphs which were not saveable or which lead to an ordering which was incorrect (so it lead to errors).

As topological sorting is proven correct, we like to stick with that. simple_smile . E.g. even if we save the elements in a collection all first, it might not be correct: 2 elements in a collection might not be saveable in that order due to their references/dependencies. It might very well be that first element 2 has to be saved, then a bunch of other entities and then entity 1: you can't 'assume' in all cases that the entities next to each other in a collection are thus in the right order.

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 27-Nov-2012 19:06:51   

Otis wrote:

As topological sorting is proven correct, we like to stick with that. simple_smile . E.g. even if we save the elements in a collection all first, it might not be correct: 2 elements in a collection might not be saveable in that order due to their references/dependencies. It might very well be that first element 2 has to be saved, then a bunch of other entities and then entity 1: you can't 'assume' in all cases that the entities next to each other in a collection are thus in the right order.

Fair enoughsimple_smile

However stuck_out_tongue_winking_eye , The topological sort you describe comes out very differently when just related entity IDs are used rather than the related entities themselves. If the topological sorting could be'tweaked' with an option to pretend that any related entities (other than new ones) are not there then you would have the best of both worlds - the existing order and a 'deterministic' version.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39910
Joined: 17-Aug-2003
# Posted on: 28-Nov-2012 13:27:28   

The edges in the DAG are the references. These are followed, recursively. We're not going to change that as the current algorithm is exactly what it should do: depth-first search over the references of the in-memory graph. Changing that will open us again to the bugs we had in v1 where edge-case graphs might cause problems.

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 30-Nov-2012 11:00:44   

After a little research, I've come to the conclusion that sorting on an auto-generated ID is perfectly normal and acceptable.

So given my goal of saving a (new) entity list in their list order, I could:

1) Carefully rewrite my code to only use IDs for reference data fields rather than entities. 2) Introduce a DisplayOrder field for every table that needs to remember entry-order. 3) Traverse a collection just before save and somehow replace 'reference' entities with their ID 4) Save a collection one entity at a time performing manual recursion. 5) Pester you to make a seemingly simple filter option so that discovered references to entities that are not new are ignored (thus emulating option 1 or 3). Its still a depth-first search but only sees New entities as edges. 6) Pester you to make ObjectGraphUtils an interface with a default implementation so it can be changed.

I can't see 5 or 6 happening but I thought I'd try before sticking with 1.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39910
Joined: 17-Aug-2003
# Posted on: 01-Dec-2012 11:49:05   

I'm not sure what you want me to say, but as we had a major problem with our previous home-grown algorithm for dependency sorting and because there is a well-known proven algorithm (which is currently implemented) which always works, I'm not going to change any of it, sorry simple_smile

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 02-Dec-2012 11:30:30   

We seem to be miscommunicating again.

What I am suggesting does not involve going back to the previous home-grown algorithm. It also does not involve changing the well-known proven algorithm. It just involves tweaking the data provided to that algorithm when an option is set.

If I could simply override a method and get access to the dependency sorting, I would do it myself and stop pestering you. But in this case I can't because the code is deeply embedded.

So in ProduceAdjacencyLists, where you get the Depending/Dependant entities, if this magic option is set, you simple filter them so they are ignored if not new. That's all. This would give exactly the same result as if I had provided just the ID for a related entity rather than the related entity itself. Exactly the same entities get updated in exactly the same way but the existing algorithm itself happens to order them differently. There is no manual ordering here, your existing algorithm is the only thing deciding that - its just that this option will happen to coincidentally order them in the order it was given them - thus determinate.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39910
Joined: 17-Aug-2003
# Posted on: 02-Dec-2012 12:36:13   

Why not: - save the collection, setting 'recurse' to false in the call, this saves the collection in the order it is set - save the collection AGAIN, this time with recurse is true. As the entities in the collection have been saved already, they're skipped, the rest is saved in the right order.

Do this by starting a transaction manually calling StartTransaction(), then do the two calls and commit afterwards.

simmotech wrote:

We seem to be miscommunicating again.

What I am suggesting does not involve going back to the previous home-grown algorithm. It also does not involve changing the well-known proven algorithm. It just involves tweaking the data provided to that algorithm when an option is set.

If I could simply override a method and get access to the dependency sorting, I would do it myself and stop pestering you. But in this case I can't because the code is deeply embedded.

So in ProduceAdjacencyLists, where you get the Depending/Dependant entities, if this magic option is set, you simple filter them so they are ignored if not new. That's all.

What I propose above does this: the second time you save the collection, the entities in the collection are not saved, as they're not changed, however the referenced entities are.

Frans Bouma | Lead developer LLBLGen Pro
simmotech
User
Posts: 1024
Joined: 01-Feb-2006
# Posted on: 03-Dec-2012 08:07:18   

I would have to be careful about multiple levels e.g. a List of Invoices each with a list of Sale Items but that might work.

Thanks Frans