Versioning Entities

Posts   
 
    
EdDotNet
User
Posts: 31
Joined: 30-Mar-2005
# Posted on: 04-Apr-2005 19:32:38   

We are seriously considering using LLBLGen Pro for a series of applications. We are able to generate a very nice data framework from the tool based on a simple version of our datamodel.

However, we need to introduce versioning of entities per our business requirements. In other words, we need to be able to support entities in various states, lets say "Working" and "Published." What is the recommended data modeling approach that will work well with LLBLGenPro? We are using SQL Server 2000.


Let's say we had a table called Content.  The "simple" approach would look something like this:

ContentID : int, PK, IDENTITY
ContentData


So far our experimentation has made it difficult to use LLBLGenPro with a versioning model.

Any help would be greatly appreciated.

Thanks

Devildog74
User
Posts: 719
Joined: 04-Feb-2004
# Posted on: 05-Apr-2005 03:31:08   

Could you elaborate a bit? I use llblgen to get data from my database that is culture specific by simply using a filter / predicate on the current culture id. So for example, if some dataa is stored in the db and it was created by a korean author, the data is saved in the table with an identifier for the korean culture. An editor might come along and create an entry in the same table for the english dialect, so that people browsing the site using the english culture would get the english version of the same data that was originally created in korean.

That is one way of versioning an entity.

I have also implemented seperate databases using the same generated entities, i.e. I have a development DB, a QC db, and a Production DB. As long as all of the schema is the same, I can simply change the connection string and the generated entities work the same in each system.

cmartinbot
User
Posts: 147
Joined: 08-Jan-2004
# Posted on: 05-Apr-2005 07:46:13   

I may be wrong. But, I'm thinking you need to introduce some sort of source control into your projects (which BTW has nothing to do with LLBLGen but with all code in general). After that, you can have a production branch and a development branch of your code.

For the last year I've been a big fan of Subversion [http://subversion.tigris.org] and TortoiseSVN [http://TortoiseSVN.tigris.org].

Good luck and the answer is 42!

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39794
Joined: 17-Aug-2003
# Posted on: 05-Apr-2005 11:05:37   

EdDotNet wrote:

We are seriously considering using LLBLGen Pro for a series of applications. We are able to generate a very nice data framework from the tool based on a simple version of our datamodel.

However, we need to introduce versioning of entities per our business requirements. In other words, we need to be able to support entities in various states, lets say "Working" and "Published." What is the recommended data modeling approach that will work well with LLBLGenPro? We are using SQL Server 2000.


Let's say we had a table called Content.  The "simple" approach would look something like this:

ContentID : int, PK, IDENTITY
ContentData

So far our experimentation has made it difficult to use LLBLGenPro with a versioning model. Any help would be greatly appreciated.

Versioning in general on entities is done by adding a 'Version' field to the PK. This means that when you save an entity, it's always a new entity. This way you can read back a previous version if you'd like. Though that's not what you'd like to do I think. What you'd like to do is simply state management for an entity, so that when you want to fetch all items which are published, you can do that.

As Devildog described, this is done mostly by adding a field to the table which reflects the state. That is, if the object can be in only one state at a time. Looking at the example you gave I have the feeling you're working on some sort of content management system. In a CMS it can be an item is in a couple of states at the same time: there is a published version and there is a version which is edited but not yet published (waits for approval).

In the situation in which you'll have more than one state for a given item, you'd need per state buckets, for example: a 'published' table and a 'working' table. In there you'd create items in there which are a copy of the root item in the content table, which represent the item in the state represented by the table.

But that's just an idea how it can be solved, there are more ways. Either way, it is possible to get the data out of the database into the objects you want. What you shouldn't do is try to cram this into an inheritance tree, like an item will suddenly become a different type: publishedItem. That doesn't work, as type changes of objects is semantically a bad thing, though perfectly possible with data in a database.

Could you elaborate a bit on what you exactly want to achieve?

Frans Bouma | Lead developer LLBLGen Pro
EdDotNet
User
Posts: 31
Joined: 30-Mar-2005
# Posted on: 05-Apr-2005 16:21:30   

Otis wrote:

EdDotNet wrote:

We are seriously considering using LLBLGen Pro for a series of applications. We are able to generate a very nice data framework from the tool based on a simple version of our datamodel.

However, we need to introduce versioning of entities per our business requirements. In other words, we need to be able to support entities in various states, lets say "Working" and "Published." What is the recommended data modeling approach that will work well with LLBLGenPro? We are using SQL Server 2000.


Let's say we had a table called Content.  The "simple" approach would look something like this:

ContentID : int, PK, IDENTITY
ContentData

So far our experimentation has made it difficult to use LLBLGenPro with a versioning model. Any help would be greatly appreciated.

Versioning in general on entities is done by adding a 'Version' field to the PK. This means that when you save an entity, it's always a new entity. This way you can read back a previous version if you'd like. Though that's not what you'd like to do I think. What you'd like to do is simply state management for an entity, so that when you want to fetch all items which are published, you can do that.

As Devildog described, this is done mostly by adding a field to the table which reflects the state. That is, if the object can be in only one state at a time. Looking at the example you gave I have the feeling you're working on some sort of content management system. In a CMS it can be an item is in a couple of states at the same time: there is a published version and there is a version which is edited but not yet published (waits for approval).

In the situation in which you'll have more than one state for a given item, you'd need per state buckets, for example: a 'published' table and a 'working' table. In there you'd create items in there which are a copy of the root item in the content table, which represent the item in the state represented by the table.

But that's just an idea how it can be solved, there are more ways. Either way, it is possible to get the data out of the database into the objects you want. What you shouldn't do is try to cram this into an inheritance tree, like an item will suddenly become a different type: publishedItem. That doesn't work, as type changes of objects is semantically a bad thing, though perfectly possible with data in a database.

Could you elaborate a bit on what you exactly want to achieve?

Thanks for the reply. To be clear I am not looking into Source Control as another reader implied.

Yes we are working on content management. We need to be able to store entities in two states, "Pending" and "Approved." I have gone ahead and modeled it like this and have had some success using LLBLGenPro to work with it:

Table ContentItem ContentItemId int PK IDENTITY : This is unique for every row (version) ObjectId int : This is the unique identifier for a content entity from a business perspective. ContentData text : the content itself Approved bit : true = approved, false = pending PendingContentItemId = this is a link to a pending copy. the way we modeled this is that we should always link outside this table to the approved version. we'll have to do some manual management of this, but it will always be clear. we also avoid having to have two tables with the same data.

Our concerns: - Haven't yet dealt with linking this data in m:n relationships. - Requires extra ".GetMulti()" calls to figure out what is what. We had some problems with making these calls within an existing transaction using TransactionManager.

Here is a sample of a test we generated that uses the generated classes:


[Test]
public void ApprovePendingVersionOfContentItem() 
{
    //TODO: wrap in transaction!

    //in a full test, we would create a new entity that is approved
    //edit pending version -- this will create a new pending copy if none exists
    this.EditPendingVersionOfContentItem();

    int entityJawns = 1; //objectId of the entity; this should already be in database
    ContentItemEntity approvedContentItem = null;
    ContentItemEntity pendingContentItem = null;
    

    //TODO: Move this to ContentManager
    //create expression to find pending objects
    IPredicateExpression findObjectId = new PredicateExpression();
    findObjectId.Add( PredicateFactory.CompareValue( ContentItemFieldIndex.ObjectId, ComparisonOperator.Equal, entityJawns ) );
                                
    //create a collection to filter by
    ContentItemCollection contentItems = new ContentItemCollection();
    contentItems.GetMulti(findObjectId);
    
    if ( contentItems.Count > 0 ) 
    {
        //find the approved / pending contentitems. there should be a better way to do this.
        foreach ( ContentItemEntity contentItem in contentItems )
        {
            if ( contentItem.Approved )
                approvedContentItem = contentItem;
            else 
                pendingContentItem = contentItem;
        }

    }
    else
        throw new ApplicationException("Could not find matching jawns.");

    //remove the existing approved item - here is where it could be moved to an archive/audit table
    approvedContentItem.Delete();

    pendingContentItem.ContentData = "I am the new approved content.";
    pendingContentItem.Approved = true;
    
    //now save the pending item
    pendingContentItem.Save();

}

Any thoughts on this? I am curios on how our model would look specifically per your suggestion on state management. Having copy tables really isn't desirable for us since there will be actually many content-managed data tables out there.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39794
Joined: 17-Aug-2003
# Posted on: 06-Apr-2005 11:41:39   

EdDotNet wrote:

Yes we are working on content management. We need to be able to store entities in two states, "Pending" and "Approved." I have gone ahead and modeled it like this and have had some success using LLBLGenPro to work with it:

Table ContentItem ContentItemId int PK IDENTITY : This is unique for every row (version) ObjectId int : This is the unique identifier for a content entity from a business perspective. ContentData text : the content itself Approved bit : true = approved, false = pending PendingContentItemId = this is a link to a pending copy. the way we modeled this is that we should always link outside this table to the approved version. we'll have to do some manual management of this, but it will always be clear. we also avoid having to have two tables with the same data.

This is limiting your design for future enhancements, for example if it has to be modified to have a more sophisticated workflow system. You now opted for the concept that an item can only be in one single state, which is reflected in the way you added the state information into the actual content object. I assume you use a pending copy to have the ability to edit an already published item?

I don't know how you use the content items, but say you use them inside containers. So you have a given container with a set of items. It depends on the context in which the container is used which state/version of every item in the container is shown. This runs into problems if you have the model you're using I think (I've been there, our CMS uses somewhat the same system: 2 states: published and current, so if you alter the text, you can preview it which shows the current, otherwise the published version is shown, so I've seen this problem before wink ).

So you could also opt for this: have per state a table. Initially you've 2: pending and approved. Then you have a root table for the definition of the item. That will be the records you'll relate to containers. Depending on the context of the container's usage, you'll then either fetch the Pending version of an item or the approved version of an item, which comes down to (for viewing only so only approved items shown)

select * from approvedItems where contentitemID in (select contentitemid from ContainerContentItems where ContainerID = @containerid)

The problem is that once you approve a content item's contents, it's live. But, what happens when you alter it or want to make a small change? It has to be re-approved of course, though you don't want to pull the current contents. So the item actually is in 2 states at once.

Our concerns: - Haven't yet dealt with linking this data in m:n relationships. - Requires extra ".GetMulti()" calls to figure out what is what. We had some problems with making these calls within an existing transaction using TransactionManager.

Could you elaborate a bit on the problems? Reads in a transaction aren't recommended, as they're slower and don't really need to be inside a transaction and you run into deadlock problems perhaps on sqlserver. Only place save/delete actions in a transaction.

Here is a sample of a test we generated that uses the generated classes:


[Test]
public void ApprovePendingVersionOfContentItem() 
{
    //TODO: wrap in transaction!

    //in a full test, we would create a new entity that is approved
    //edit pending version -- this will create a new pending copy if none exists
    this.EditPendingVersionOfContentItem();

    int entityJawns = 1; //objectId of the entity; this should already be in database
    ContentItemEntity approvedContentItem = null;
    ContentItemEntity pendingContentItem = null;
    

    //TODO: Move this to ContentManager
    //create expression to find pending objects
    IPredicateExpression findObjectId = new PredicateExpression();
    findObjectId.Add( PredicateFactory.CompareValue( ContentItemFieldIndex.ObjectId, ComparisonOperator.Equal, entityJawns ) );
                                
    //create a collection to filter by
    ContentItemCollection contentItems = new ContentItemCollection();
    contentItems.GetMulti(findObjectId);
    
    if ( contentItems.Count > 0 ) 
    {
        //find the approved / pending contentitems. there should be a better way to do this.
        foreach ( ContentItemEntity contentItem in contentItems )
        {
            if ( contentItem.Approved )
                approvedContentItem = contentItem;
            else 
                pendingContentItem = contentItem;
        }

    }
    else
        throw new ApplicationException("Could not find matching jawns.");

If you place a unique constraint on PendingContentItemId the relation between them will be 1:1, and you'll be able to fetch the related one without using the forloop (there are at most 2 objects there.) I'd also simply fetch the approved one if I'd want that, or grab the pending one, not both.

//remove the existing approved item - here is where it could be moved to an archive/audit table
approvedContentItem.Delete();

But, what's shown now for this item during the pending state?

Any thoughts on this? I am curios on how our model would look specifically per your suggestion on state management. Having copy tables really isn't desirable for us since there will be actually many content-managed data tables out there.

If you use content state tables for ContentItem, you'd have 2 extra tables. It might be you don't need the 2 tables, one per state, that is if the viewed content is written to fixed files for example, so the 'approved' can be deleted without losing content for viewing.

Frans Bouma | Lead developer LLBLGen Pro
EdDotNet
User
Posts: 31
Joined: 30-Mar-2005
# Posted on: 06-Apr-2005 17:03:21   

This is limiting your design for future enhancements, for example if it has to be modified to have a more sophisticated workflow system. You now opted for the concept that an item can only be in one single state, which is reflected in the way you added the state information into the actual content object. I assume you use a pending copy to have the ability to edit an already published item?

I don't know how you use the content items, but say you use them inside containers. So you have a given container with a set of items. It depends on the context in which the container is used which state/version of every item in the container is shown. This runs into problems if you have the model you're using I think (I've been there, our CMS uses somewhat the same system: 2 states: published and current, so if you alter the text, you can preview it which shows the current, otherwise the published version is shown, so I've seen this problem before wink ).

I'm not sure how our model causes us to run into problems with this. If we are in a "Pending" context, we show the pending copy or the approved copy if there is no pending copy. If we are in the "Approved" context we show the approved copy. I agree that we are limiting our possibilties in terms of _ number of states _ because we have a single bit column for approved only, but I assume we could turn this into an int (lookup) later if the requirement came up.

Also to be clear PendingContentItemId is a FK to ContentItemId (on the same table), so there can be any number of copies of an item, in either an Approved (business logic will enforce that there would be only one valid approved copy) or Pending state.

So you could also opt for this: have per state a table. Initially you've 2: pending and approved. Then you have a root table for the definition of the item. That will be the records you'll relate to containers. Depending on the context of the container's usage, you'll then either fetch the Pending version of an item or the approved version of an item, which comes down to (for viewing only so only approved items shown)

select * from approvedItems where contentitemID in (select contentitemid from ContainerContentItems where ContainerID = @containerid)

The problem is that once you approve a content item's contents, it's live. But, what happens when you alter it or want to make a small change? It has to be re-approved of course, though you don't want to pull the current contents. So the item actually is in 2 states at once.

Would you mind posted a DDL for this approach? I am interested in it.

In regards to the transaction issue, I am sure it is because we were reading. We'll move those calls out of the transaction block.

If you place a unique constraint on PendingContentItemId the relation between them will be 1:1, and you'll be able to fetch the related one without using the forloop (there are at most 2 objects there.) I'd also simply fetch the approved one if I'd want that, or grab the pending one, not both.

Can you elaborate on how exactly I can fetch the approved one directly? I am using the Self-Service model (also wanted to know what your thoughts were on which model to use for this- Adapter or Self-Service). Are you saying that if we add a unique contraint on ContentItemId,ObjectId,and PendingContentItemId, LLBLGen will pick up on this 1:1 relationship and allow us to do this: ContentItemEntity myPendingContentItem = myApprovedContentItem.PendingContentItem ?

I am starting to see how using natural keys will allow us to harness more of LLBLGen's power.

//remove the existing approved item - here is where it could be moved to an archive/audit table
approvedContentItem.Delete();

But, what's shown now for this item during the pending state?

In this test case, we are ApprovingPendingContent. So the process is (and we would wrap this in a transaction): 1- Remove the old approved content (perhaps move it to an archive) 2- Promote the pending content to approved

So if this is in a transaction, we shouldn't have any problems with viewing approved copy because it will update in an atomic fashion.

If you use content state tables for ContentItem, you'd have 2 extra tables. It might be you don't need the 2 tables, one per state, that is if the viewed content is written to fixed files for example, so the 'approved' can be deleted without losing content for viewing.

I think that if we wrap our approval process in a transation we'll be ok. Alternatively, we could "unapprove" the Approved row, approve the pending row, remove the link from the approved row to the pending row and then commit. This would be similar to the two table situation, because at some point the Approved/Pending Id's would have to be swapped out right?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39794
Joined: 17-Aug-2003
# Posted on: 07-Apr-2005 23:17:57   

I'll get back to you on Saturday. I just post this to let you know your post isn't forgotten simple_smile

Frans Bouma | Lead developer LLBLGen Pro
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39794
Joined: 17-Aug-2003
# Posted on: 10-Apr-2005 14:00:19   

Saturday, Sunday, what's in a day! smile

EdDotNet wrote:

Also to be clear PendingContentItemId is a FK to ContentItemId (on the same table), so there can be any number of copies of an item, in either an Approved (business logic will enforce that there would be only one valid approved copy) or Pending state.

Ok, but be aware that in a single-table hierarchy construction, finding all pending changes is not that great as you can't do a single select.

So you could also opt for this: have per state a table. Initially you've 2: pending and approved. Then you have a root table for the definition of the item. That will be the records you'll relate to containers. Depending on the context of the container's usage, you'll then either fetch the Pending version of an item or the approved version of an item, which comes down to (for viewing only so only approved items shown)

select * from approvedItems where contentitemID in (select contentitemid from ContainerContentItems where ContainerID = @containerid)

The problem is that once you approve a content item's contents, it's live. But, what happens when you alter it or want to make a small change? It has to be re-approved of course, though you don't want to pull the current contents. So the item actually is in 2 states at once.

Would you mind posted a DDL for this approach? I am interested in it.

Well, the core definition of an item is in the item table. Though there is no content defined there, that's in the item in the various stage tables. For example the CMS of our website uses Containers and items on these containers. An item can be added to multiple containers, and for each container it can use a different viewer. So per container-item combination there is different 'published' data, which is stored with that combination. Though the item itself is stored somewhere else, so if a person alters the data of the item, the item itself isn't published, nor is the published item changed. Only when the item is published, but that can be PER container, which simply creates a new version of the item data in teh context of the container.

It's a bit problematic to cough up ddl which matches your situation completely, but it's not that hard really: see it as core data with different views and per view, the data presented by such a view is static, i.e. has to be preserved, or non-static. In the case of a published item, it's static, and in case of a pending item, it's not.

In regards to the transaction issue, I am sure it is because we were reading. We'll move those calls out of the transaction block.

that's indeed a better choice simple_smile

If you place a unique constraint on PendingContentItemId the relation between them will be 1:1, and you'll be able to fetch the related one without using the forloop (there are at most 2 objects there.) I'd also simply fetch the approved one if I'd want that, or grab the pending one, not both.

Can you elaborate on how exactly I can fetch the approved one directly? I am using the Self-Service model (also wanted to know what your thoughts were on which model to use for this- Adapter or Self-Service). Are you saying that if we add a unique contraint on ContentItemId,ObjectId,and PendingContentItemId, LLBLGen will pick up on this 1:1 relationship and allow us to do this: ContentItemEntity myPendingContentItem = myApprovedContentItem.PendingContentItem ?

From my understanding of your model, it should yes.

If you use content state tables for ContentItem, you'd have 2 extra tables. It might be you don't need the 2 tables, one per state, that is if the viewed content is written to fixed files for example, so the 'approved' can be deleted without losing content for viewing.

I think that if we wrap our approval process in a transation we'll be ok. Alternatively, we could "unapprove" the Approved row, approve the pending row, remove the link from the approved row to the pending row and then commit. This would be similar to the two table situation, because at some point the Approved/Pending Id's would have to be swapped out right?

I think the hassle more or less comes from the fact that you mix the 'view' result of placing an item and its contents in a given context with the item's actually content.

A good example is always a news item, which consists of just a headline and a piece of text. Say I want a headline section at the top of the page and a list of the news items with headline below it. Now, I just add one news item, but to two containers: the headline container and the main container with the news items. (this could be done automatically for example). Now, I add the headline and the newsitem's body text.

In the headline container I have a viewer which only displays the headline and produces some in-page url link. The normal container has a viewer for newsitems which produces a complete different look. It's the same item in 2 different contexts. Still, I have 1 item with content, and 2 viewer results. If I alter the headline and the body because I made 2 typo's, and I place it in the pending state, I altered the item's contents, not the published result. Though when I publish it in the headline container, I get a new version of the viewer result THERE, but not in the other container, I have to publish it in all containers then, which effectively means: produce a new published output using the viewers for the contexts the item is in.

If you do that, you won't have to mix version and active item contents.

Frans Bouma | Lead developer LLBLGen Pro