SelfServicing and transactions
Joined: 14-Feb-2008
Hello!
I've recently started work on a new, small project, and we're using LLBL v2.5 as our ORM. Not sure if it matters for this discussion, but the DB is SQL Server 2005. We've used Adapter templates before and thought about giving SelfServicing a chance because we hoped it might make business code much simpler. And so far it is mostly very good.
We are, however, bothered by the fact that once we create a transaction we have to manually add objects into it for them to take part in it!
Here are my objections (a sketch of the manual pattern I'm referring to follows the list):
(1) If someone accidentally forgets to add an entity to an ongoing transaction, the faulty code might still succeed in performing its job. This could even be very difficult to notice.
(2) The more common outcome of the mistake above would be more of a joke than a real problem - code that deadlocks itself. Not really the functionality we usually need.
(3) If an entity fetches another entity "manually" (not via its LLBL-generated properties for related entities), the fetching entity must check whether it is participating in a transaction and include what it fetched in that transaction as well. This is something that's too easy to forget.
(4) Inconsistency - if an entity fetches another entity "automatically", simply by accessing it through one of its LLBL-generated properties, the fetched entity will in this case be automatically included in the transaction the fetcher is participating in. That's the impression I got, anyway. This is inconsistent with the story under (3). If it is not automatically included in the transaction, SelfServicing is of absolutely no use to us.
(5) Generally, it becomes necessary to add an extra Transaction argument to almost every relevant method, and pass the open transaction around. This is error prone also.
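For reference, here is a minimal sketch of the manual pattern these objections refer to, using the v2.5 SelfServicing calls discussed in this thread (Transaction, Add(), Save()); the entity, field and parameter names are made up:

```csharp
// Minimal sketch of today's manual pattern (SelfServicing, v2.5-era API).
// UserCreditEntity / CurrentCredit / amountToReturn are hypothetical names.
using System.Data;

public void ReturnCredits(int userId, int amountToReturn)
{
    Transaction trans = new Transaction(IsolationLevel.ReadCommitted, "ReturnCredits");
    try
    {
        UserCreditEntity credit = new UserCreditEntity(userId); // fetched by PK, outside the transaction
        trans.Add(credit);   // forget this line and the Save() below runs on its own
                             // connection, outside the transaction (objection 1)
        credit.CurrentCredit += amountToReturn;
        credit.Save();
        trans.Commit();
    }
    catch
    {
        trans.Rollback();
        throw;
    }
}
```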
We'd prefer it if, once a transaction is opened on a thread, all entities would assume they're supposed to do their work in it. Just out of curiosity, why is this not already so?
I'd like to solve this myself, by writing a special class for managing transactions: a class with a ThreadStatic field holding the LLBL Transaction that's currently open. But for things to be added to it automatically, it seems I need to override certain methods in entities, collections, DAO objects, stored procedure callers, and so on. It seems like a lot of work, and if I miss something, I'll be making things even worse. What do you think the best approach would be? Is there a single point where I could intercept them all?
Thank you in advance!
Josip Bakić
P.S. I found a very interesting sentence in the documentation. In the article on Transactions, under "Using the generated code" > "SelfServicing", in a part of the text dealing with the Add() method of a Transaction, it says:
You can only add objects that can execute data modification statements like an entity object or an entity collection object, however you can't add a typed list or typed view since these objects can only read data and for reading data, transactions are not interesting.
This is one seriously false statement! It also has potential scary implications for what I wrote above, under (4).
Joined: 17-Aug-2003
josip.bakic wrote:
Hello!
I've recently started work on a new, small project, and we're using LLBL v2.5 as our ORM. Not sure if it matters for this discussion, but the DB is SQL Server 2005. We've used Adapter templates before and thought about giving SelfServicing a chance because we hoped it might make business code much simpler. And so far it is mostly very good.
We are, however, bothered by the fact that once we create a transaction we have to manually add objects into it for them to take part in it!
Here are my objections: (1) If someone accidentally forgets to add an entity to an ongoing transaction, the faulty code might still succeed in performing its job. This could even be very difficult to notice.
There's unfortunately no other way.
(2) The more common outcome of the mistake above would be more of a joke than a real problem - code that deadlocks itself. Not really the functionality we usually need. (3) If an entity fetches another entity "manually" (not via its LLBL-generated properties for related entities), the fetching entity must check whether it is participating in a transaction and include what it fetched in that transaction as well. This is something that's too easy to forget.
Yes, though that's part of the game: there's no 'global' transaction, so you have to add the objects to the transaction that's valid. Otherwise, how would the code know which transaction to add them to? You could say 'use a [ThreadStatic] attributed member variable somewhere', but that's not really going to work, because when should it be rolled back or committed?
(4) Inconsistency - if an entity fetches another entity "automatically", simply by accessing it through one of its LLBL-generated properties, the fetched entity will in this case be automatically included in the transaction the fetcher is participating in. That's the impression I got, anyway. This is inconsistent with the story under (3). If it is not automatically included in the transaction, SelfServicing is of absolutely no use to us.
It is automatically included: the code knows which transaction is active for the entity the property is called on, and as things are fetched/deleted/updated automatically (SelfServicing!), it adds the objects to the right transaction.
It's not inconsistent with (3): there the code can't decide which transaction to add an object to, because the transaction isn't a global resource. In Adapter, the transaction to use lives in the same object you do everything with, so it's always known; in SelfServicing that's not the case.
(5) Generally, it becomes necessary to add an extra Transaction argument to almost every relevant method, and pass the open transaction around. This is error prone also.
Not necessarily: you don't need to use a transaction everywhere.
We'd prefer it if, once a transaction is opened on a thread, all entities would assume they're supposed to do their work in it. Just out of curiosity, why is this not already so?
[ThreadStatic] doesn't always work in ASP.NET, and you still have to decide when to commit/rollback, so in the scope where you decide that you can just as well start the transaction and add the objects.
I'd like to solve this myself, by writing a special class for managing transactions: a class with a ThreadStatic field holding the LLBL Transaction that's currently open. But for things to be added to it automatically, it seems I need to override certain methods in entities, collections, DAO objects, stored procedure callers, and so on. It seems like a lot of work, and if I miss something, I'll be making things even worse. What do you think the best approach would be? Is there a single point where I could intercept them all?
Lazy loading, deletes, etc. are already added to the transaction automatically, so you only have to deal with adding entities to transactions in the places where you do things yourself.
For entities, you could look into overriding Save and Delete in the CommonEntityBase class (in a partial class of it). There, check whether the entity is already in a transaction; if not, add it to the transaction held in the [ThreadStatic] attributed global variable.
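Something along these lines (a rough sketch; this assumes the generated CommonEntityBase is a partial class, that Save()/Delete() are virtual, that the entity exposes its current transaction via the Transaction property, and that ThreadTransaction is the hypothetical [ThreadStatic] holder class discussed above - a sketch of it appears later in the thread; verify the exact signatures against your generated code):

```csharp
// Sketch only: enlist an entity in the ambient [ThreadStatic] transaction before
// Save()/Delete(). Member names are assumptions, check the generated base classes.
public partial class CommonEntityBase
{
    public override bool Save()
    {
        EnlistInAmbientTransaction();
        return base.Save();
    }

    public override bool Delete()
    {
        EnlistInAmbientTransaction();
        return base.Delete();
    }

    private void EnlistInAmbientTransaction()
    {
        // ThreadTransaction.Current is the hypothetical [ThreadStatic] holder class.
        if (this.Transaction == null && ThreadTransaction.Current != null)
        {
            ThreadTransaction.Current.Add(this);
        }
    }
}
```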
P.S. I found a very interesting sentence in the documentation. In the article on Transactions, under "Using the generated code" > "SelfServicing", in a part of the text dealing with the Add() method of a Transaction, it says:
You can only add objects that can execute data modification statements like an entity object or an entity collection object, however you can't add a typed list or typed view since these objects can only read data and for reading data, transactions are not interesting.
This is one seriously false statement! It also has potential scary implications for what I wrote above, under (4).
Reading data in a transaction places read locks on the rows read in, for example, SQL Server. So if you need to read data, do that BEFORE the transaction starts: the transaction is then kept alive for a shorter time (all fetches take place before the transaction starts) and therefore locks are held for a shorter time, which overall makes the application more responsive.
Joined: 14-Feb-2008
We'd prefer it if, once a transaction is opened on a thread, all entities would assume they're supposed to do their work in it. Just out of curiosity, why is this not already so?
[ThreadStatic] doesn't always work in ASP.NET, and you still have to decide when to commit/rollback, so in the scope where you decide that you can just as well start the transaction and add the objects.
I didn't know that about ASP.NET and [ThreadStatic]. However, judging by some reading I've done on the issue, it's of no concern to me - I do not plan to open a transaction and keep it hanging across separate events in a request's lifetime - any method that opens it will also close it. It's just handier to have a ThreadStatic handler class for this, to enable automatic entity and collection inclusion and "careless" nesting (i.e. a method deep in the call chain just starts its own, and the class figures out whether a transaction is already started, with no need for whoever codes the method to bother), and to guarantee a rollback in the case of exceptions. The plan would be to pass an anonymous worker method to the transaction handling class; if the method completes without throwing, the handler commits, and if it throws, the handler rolls back and re-throws. No possibility of a hanging transaction, just as there is no need for one, and no possibility of ASP.NET switching threads on us during the execution of the anonymous method.
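Something like this is what I have in mind (a sketch; the class, delegate and method names are made up, Transaction/ITransaction are the SelfServicing types discussed in this thread, and you'd add a using for the namespace where your generated Transaction class lives):

```csharp
// Sketch of the handler class described above: a [ThreadStatic] current transaction,
// "careless" nesting, commit on success, rollback + rethrow on exception.
using System;
using System.Data;
using SD.LLBLGen.Pro.ORMSupportClasses;

public delegate void WorkUnit();

public static class ThreadTransaction
{
    [ThreadStatic]
    private static ITransaction _current;

    // Read by the CommonEntityBase override sketched earlier in the thread.
    public static ITransaction Current
    {
        get { return _current; }
    }

    public static void Run(WorkUnit work)
    {
        if (_current != null)
        {
            // Already inside a transaction started higher up the call chain: just join it.
            work();
            return;
        }

        // Isolation level hard-coded for brevity; pass it in as a parameter in real code.
        Transaction trans = new Transaction(IsolationLevel.ReadCommitted, "ThreadTransaction");
        _current = trans;
        try
        {
            work();
            trans.Commit();
        }
        catch
        {
            trans.Rollback();
            throw;
        }
        finally
        {
            _current = null;
            trans.Dispose();   // assuming the generated Transaction class is disposable
        }
    }
}
```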
For entities, you could look into overriding Save and Delete in the CommonEntityBase class (in a partial class of it). There, check whether the entity is already in a transaction; if not, add it to the transaction held in the [ThreadStatic] attributed global variable.
Thanks! But what about reads? For CommonEntityBase inheritors I guess I could override OnInitialized() and check for a transaction in progress. Collections, however, don't have a base class accessible to me for extending like that, right?
Reading data in a transaction places read locks on the rows read in, for example, SQL Server. So if you need to read data, do that BEFORE the transaction starts: the transaction is then kept alive for a shorter time (all fetches take place before the transaction starts) and therefore locks are held for a shorter time, which overall makes the application more responsive.
Well, that is true, but generally useless. I need those read locks, and that need is quite common in my experience. Also, we're generally dealing with transactions expected to complete relatively quickly. But in any case, speed is not primary, consistency is.
A banal example - if a service we offer fails, we return "credits" to our user. I'll read the user's credit data, increment a certain field containing the current credit, and save it. Once I've read it, no other thread is allowed to read that data until I'm done with it. I think it's pretty clear why. We will, of course, keep double records, some sort of a list of all previous events, enabling us to detect any errors, but we cannot accept an implementation where such errors are expected behaviour, and we cannot casually risk losing the user's trust.
What baffles me most right now is that if this really is your policy, why does Transaction require an isolation level to be specified as a constructor argument? Read lock behaviour is an essential distinction between different levels of isolation, and giving that argument creates, apparently, just an illusion that I'm in control of when a lock will be made, and when it won't. If I say Serialisable, I want to lock everything!
Anyway, I don't really care about TypedLists or Views, we won't be using either (and given this discussion, I'll probably override some of their methods just to make them throw exceptions, to make sure nobody uses them, if that's possible). But say I have an entity collection, and I've already started a serializable transaction. I add that collection to the transaction and do a GetMulti(). Is the SELECT performed in the Transaction?
Joined: 17-Aug-2003
josip.bakic wrote:
We'd prefer it if, once a transaction is opened on a thread, all entities would assume they're supposed to do their work in it. Just out of curiosity, why is this not already so?
[ThreadStatic] doesn't always work in ASP.NET, and you still have to decide when to commit/rollback, so in the scope where you decide that you can just as well start the transaction and add the objects.
I didn't know that about ASP.NET and [ThreadStatic]. However, judging by some reading I've done on the issue, it's of no concern to me - I do not plan to open a transaction and keep it hanging across separate events in a request's lifetime - any method that opens it will also close it. It's just handier to have a ThreadStatic handler class for this, to enable automatic entity and collection inclusion and "careless" nesting (i.e. a method deep in the call chain just starts its own, and the class figures out whether a transaction is already started, with no need for whoever codes the method to bother), and to guarantee a rollback in the case of exceptions.
That's OK, but anything we add to the framework is valid for everyone, so we went a different route.
For entities, you could look into overriding Save and Delete in the CommonEntityBase class (in a partial class of it). There, check whether the entity is already in a transaction; if not, add it to the transaction held in the [ThreadStatic] attributed global variable.
Thanks! But what about reads? For CommonEntityBase inheritors I guess I could override OnInitialized() and check for a transaction in progress. Collections, however, don't have a base class accessible to me for extending like that, right?
Everything is done by DaoBase anyway. I've looked into the code, and what you can do is the following: In the generated Dao classes there's a method called 'DetermineConnectionToUse'. This gets the ITransaction available on participating objects, or null if none found. If you change this method in the Dao template (use a copy of the dao template with the changes, create a templatebindings file with the same templateid binding but now referring to your own copy) to obtain the [ThreadStatic] transaction instead of the passed in one, it should be rather easy to do.
Reading data in a transaction places read locks on the rows read in, for example, SQL Server. So if you need to read data, do that BEFORE the transaction starts: the transaction is then kept alive for a shorter time (all fetches take place before the transaction starts) and therefore locks are held for a shorter time, which overall makes the application more responsive.
Well, that is true, but generally useless. I need those read locks, and that need is quite common in my experience. Also, we're generally dealing with transactions expected to complete relatively quickly. But in any case, speed is not primary, consistency is.
Our code is generic for all databases. SQL Server places read locks; Oracle doesn't, as it uses MVCC, to give you an example. Read locks block writers, but not readers, so others can still read rows that carry a read lock. If you want to implement lock-based concurrency, I'd advise you not to do that: the locks are placed by the database, first on rows, but if it decides it's more efficient to lock a table page or the whole table, it will do so. This could hurt performance badly. Not only that, the locks could stay behind due to a crash in your client code, or due to unforeseen control-flow interruption (e.g. you think there's no lock present and ask for UI input, which could stall for a long time, keeping the locks on the rows).
A banal example - if a service we offer fails, we return "credits" to our user. I'll read the user's credit data, increment a certain field containing the current credit, and save it. Once I've read it, no other thread is allowed to read that data until I'm done with it. I think it's pretty clear why. We will, of course, keep double records, some sort of a list of all previous events, enabling us to detect any errors, but we cannot accept an implementation where such errors are expected behaviour, and we cannot casually risk losing the user's trust.
Still, a timestamp-based concurrency approach is much more efficient. What you're after is not locking rows in the DB, it's reserving data. Use feature locking for that, or a reservation system (typical textbook example: there's 1 product left in the inventory and 2 users arrive on the website, both wanting to buy it).
What baffles me most right now is that if this really is your policy, why does Transaction require an isolation level to be specified as a constructor argument? Read lock behaviour is an essential distinction between different levels of isolation, and giving that argument creates, apparently, just an illusion that I'm in control of when a lock will be made, and when it won't. If I say Serialisable, I want to lock everything!
You are not in control of the locks; the DB is. If your action affects a complete table due to a table scan, all rows are locked. If your table access is so diverse that the DB decides to lock the complete table instead, that too is a complete lock of all the rows. You don't control that, the DB does. The isolation level is about who can see your changes, and the result of a given isolation level depends on which DB you use. In Oracle, no one blocks anyone, due to its MVCC usage. On SQL Server, readers block writers and vice versa.
You'll now say "I'm not on Oracle", but the code isn't designed solely for your own project; it's designed for all projects on all databases, hence the way it is set up.
Anyway, I don't really care about TypedLists or Views, we won't be using either (and given this discussion, I'll probably override some of their methods just to make them throw exceptions, to make sure nobody uses them, if that's possible). But say I have an entity collection, and I've already started a serializable transaction. I add that collection to the transaction and do a GetMulti(). Is the SELECT performed in the Transaction?
Yes.
In fact, you have to do that to avoid deadlocks on SQL Server. When you add the collection to the transaction for a fetch, it will use the same connection as the transaction, so the locks already placed by that transaction don't block the fetch. If you don't add the collection to the transaction, a NEW connection will be created, and as it's a new connection, all locks already set on the rows touched are obeyed by the RDBMS (with respect to the isolation level, of course).
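For example (a sketch; the collection, field and filter names are made up, the calls are the SelfServicing ones discussed here):

```csharp
// Add the collection to the running transaction BEFORE fetching, so the SELECT uses
// the transaction's connection and isn't blocked by locks that transaction already holds.
Transaction trans = new Transaction(IsolationLevel.Serializable, "CreditWork");
try
{
    UserCreditCollection credits = new UserCreditCollection();
    trans.Add(credits);                                   // same connection as the transaction
    credits.GetMulti(UserCreditFields.UserId == userId);  // SELECT runs inside the transaction

    // ... modify and save the fetched entities here; they're already participating ...
    trans.Commit();
}
catch
{
    trans.Rollback();
    throw;
}
```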
Joined: 14-Feb-2008
Otis wrote:
That's OK, but anything we add to the framework is valid for everyone, so we went a different route.
I can see your point. Though I'm still convinced my approach is far more common, and should perhaps have been the default behaviour, one you could explicitly avoid if required. Particularly since TransactionScope already works like this, and it's really wonderfully simple to use...
Everything is done by DaoBase anyway. I've looked into the code, and what you can do is the following: In the generated Dao classes there's a method called 'DetermineConnectionToUse'. This gets the ITransaction available on participating objects, or null if none found. If you change this method in the Dao template (use a copy of the dao template with the changes, create a templatebindings file with the same templateid binding but now referring to your own copy) to obtain the [ThreadStatic] transaction instead of the passed in one, it should be rather easy to do.
Thank you! I'm not sure if I want to get into that. I think I'll just drop my complaining and do it your way; it's a small project, hopefully it won't hurt too much. But in the end, this is the only thing I've found in SelfServicing that makes it unusable for something more complex, at least IMO. Perhaps you could add an easier way of doing this in some future version (I prefer partial classes to tampering with templates...)?
Essentially, everything said here about what to do with the Transaction object applies equally to the DataAccessAdapter. All of it. The only difference is that you cannot load or save entities in Adapter mode without it, so you really can't forget to use it...
What comes below this point is really kinda off-topic, right? I'm starting to forget how we got to this.
Our code is generic for all databases. SQL Server places read locks; Oracle doesn't, as it uses MVCC, to give you an example. Read locks block writers, but not readers, so others can still read rows that carry a read lock. If you want to implement lock-based concurrency, I'd advise you not to do that: the locks are placed by the database, first on rows, but if it decides it's more efficient to lock a table page or the whole table, it will do so. This could hurt performance badly. Not only that, the locks could stay behind due to a crash in your client code, or due to unforeseen control-flow interruption (e.g. you think there's no lock present and ask for UI input, which could stall for a long time, keeping the locks on the rows).
Wikipedia informs me SQL Server 2005 also uses MVCC. Interesting thing that MVCC, I didn't know things work that way in the DB.
Anyway, I'm out of my depth here, I've never learned so much about the inner workings of databases. I think I can guarantee that user interaction will not occur (we're creating a windows service working with a web app front-end, neither works with a user interactively... I'm not sure if this is the best word, but I think you know what I mean) and as for some sort of catastrophic crash - well, I'm going to use write locks anyway, I see no reason to let that possibility be a significant influence on my decisions here. A lock can always be left hanging, though that should happen really rarely, since the DB can tell the connection broke, no?
Still, a timestamp-based concurrency approach is much more efficient. What you're after is not locking rows in the DB, it's reserving data. Use feature locking for that, or a reservation system (typical textbook example: there's 1 product left in the inventory and 2 users arrive on the website, both wanting to buy it).
What do you mean exactly? I know we've gone off-topic, but could you perhaps recommend some reading on this? Thanks! (This discussion is turning out to be an educational experience for me.)
You are not in control of the locks; the DB is. If your action affects a complete table due to a table scan, all rows are locked. If your table access is so diverse that the DB decides to lock the complete table instead, that too is a complete lock of all the rows. You don't control that, the DB does. The isolation level is about who can see your changes, and the result of a given isolation level depends on which DB you use. In Oracle, no one blocks anyone, due to its MVCC usage. On SQL Server, readers block writers and vice versa.
Let's assume the "worst" - Oracle. So, what you're saying is that thread A will read credit=99, and thread B will, if luck serves us, manage to read the same. Thread A will then save credit=100, thread B will attempt the same but get an exception? And then I'll have to, what, retry? I was kinda hoping that after thread A reads it, thread B waits.
Joined: 08-Oct-2008
http://msdn.microsoft.com/en-us/library/aa0416cz%28VS.80%29.aspx gives a fairly good breakdown of how timestamp based optimistic concurrency works.
And then I'll have to, what, retry? I was kinda hoping that after thread A reads it, thread B waits.
Either retry or fail gracefully, yes. The problem with making one thread's reads wait on another thread's writes is that it is a nightmare for scalability and performance.
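For completeness, a generic sketch of the timestamp/rowversion-based optimistic concurrency the linked article describes, in plain ADO.NET (table, column and connection-string names are made up; LLBLGen's own concurrency support gives you the same idea through the ORM):

```csharp
// Read the row plus its rowversion, then update only if the version hasn't changed.
// Returns false when someone else modified the row first (caller retries or fails gracefully).
using System.Data.SqlClient;

static bool TryReturnCredits(string connectionString, int userId, int delta)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();

        int credit;
        byte[] version;
        using (SqlCommand read = new SqlCommand(
            "SELECT Credit, RowVersion FROM UserCredit WHERE UserId = @id", conn))
        {
            read.Parameters.AddWithValue("@id", userId);
            using (SqlDataReader r = read.ExecuteReader())
            {
                if (!r.Read()) return false;
                credit = r.GetInt32(0);
                version = (byte[])r[1];
            }
        }

        using (SqlCommand write = new SqlCommand(
            "UPDATE UserCredit SET Credit = @newCredit " +
            "WHERE UserId = @id AND RowVersion = @version", conn))
        {
            write.Parameters.AddWithValue("@newCredit", credit + delta);
            write.Parameters.AddWithValue("@id", userId);
            write.Parameters.AddWithValue("@version", version);
            return write.ExecuteNonQuery() == 1;   // 0 rows => someone else won the race
        }
    }
}
```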
Matt
Joined: 14-Feb-2008
@Otis - you were right, they don't queue up at all; I don't know what I was thinking. Both sessions succeed in reading a row, however, no changes could be made after that. In the end we delegated this to a stored procedure which performs a direct UPDATE on the one field whose value we need to increment. Everything else is safe to be executed on a thread with optimistic concurrency control.
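The essence of what that stored procedure does is a single atomic UPDATE; sketched here as parameterized SQL in plain ADO.NET (table/column names are made up, and our real code calls a stored procedure instead):

```csharp
// Let the database do the increment in one statement: concurrent callers serialize
// on the row for the duration of the UPDATE only, no long-lived transaction needed.
using System.Data.SqlClient;

static void ReturnCredits(string connectionString, int userId, int delta)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand(
        "UPDATE UserCredit SET Credit = Credit + @delta WHERE UserId = @id", conn))
    {
        cmd.Parameters.AddWithValue("@delta", delta);
        cmd.Parameters.AddWithValue("@id", userId);
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}
```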
In all other situations we almost never read in a transaction, and always use optimistic concurrency, via the standard LLBL method of doing it.
So, all is well, though I'm still not happy about the Transaction class.
Thank you for your help!
Joined: 17-Aug-2003
josip.bakic wrote:
@Otis - you were right, they don't queue up at all; I don't know what I was thinking. Both sessions succeed in reading a row, however, no changes could be made after that. In the end we delegated this to a stored procedure which performs a direct UPDATE on the one field whose value we need to increment. Everything else is safe to be executed on a thread with optimistic concurrency control.
In all other situations we almost never read in a transaction, and always use optimistic concurrency, via the standard LLBL method of doing it.
Serialized updates on a value... let me guess... a sequence?
So, all is well, though I'm still not happy about the Transaction class. Thank you for your help!
To be honest, I don't like the mechanism that much either, but it's the 'best' we could come up with. Adapter is my personal favorite, although I wish we had designed it before SelfServicing, so SelfServicing would be a wrapper around Adapter. Oh well....
Joined: 14-Feb-2008
Otis wrote:
Serialized updates on a value... let me guess... a sequence?
I don't see what you're getting at?..
Joined: 17-Aug-2003
josip.bakic wrote:
Otis wrote:
Serialized updates on a value... let me guess... a sequence?
I don't see what you're getting at?..
Serialized updates on a value are often used to get a 'sequence' of unique values, like an identity or sequence (Oracle). It's recommended not to use such a construct and to leave sequence numbers to the RDBMS.
Joined: 14-Feb-2008
Come on, I know that!
It's not a sequence. It's a field containing a credit balance. Individual worker threads have to return their credit value to the user if they fail, but individual credit values are not necessarily 1. And it is possible for the value to be reduced should the user use the web interface to order more work.