Hi,
I am running a worker service inside a Docker container. The service saves data to a SQL Server database that is hosted outside the container. I noticed that the worker service stops working every time it tries to save an entity to the database.
While debugging this issue, I found that calling SaveEntity(IEntity2 entityToSave, bool refetchAfterSave) on a DataAccessAdapter blocks indefinitely. It works fine when I run the worker service outside of the Docker container, so I probably just cannot reach the SQL Server database from within the container. I have not looked into the cause of that yet.
What worries me is that the call seems to block execution when the SQL Server database cannot be reached. I don't want to have to restart the Docker container every time the database is unavailable.
I did find a workaround for this issue by changing the recovery strategy, using async and a CancellationToken, but that does not feel right.
The following code does not work:
public bool SaveNetMessages(IEnumerable<NetmessageEntity> netMessages)
{
    var success = true;
    using var adaptor = new DataAccessAdapter();
    foreach (var netMessage in netMessages)
    {
        if (!adaptor.SaveEntity(netMessage, false))
        {
            success = false;
        }
    }
    return success;
}
It blocks indefinitely on the adaptor.SaveEntity(netMessage, false) call.
Changing it to this code solves the blocking problem:
public bool SaveNetMessages2(IEnumerable<NetmessageEntity> netMessages)
{
    var success = true;
    using var adaptor = new DataAccessAdapter();
    // Allow at most one retry, with a 1-second base delay.
    adaptor.ActiveRecoveryStrategy = new SimpleRetryRecoveryStrategy(1, new RecoveryDelay(TimeSpan.FromSeconds(1), 1, RecoveryStrategyDelayType.Linear));
    foreach (var netMessage in netMessages)
    {
        // Request cancellation if the save has not completed within 5 seconds.
        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
        var task = adaptor.SaveEntityAsync(netMessage, false, cts.Token);
        // Blocks the calling thread until the task completes or faults.
        var succeeded = task.Result;
        if (!succeeded)
        {
            success = false;
        }
    }
    return success;
}
This will throw an exception with the message 'Recovery failed: Maximum number of retries of 1 exceeded.' (RuntimeBuild 5.6.1_Netstandard2x), which is what I expect.
Since I am planning to use the async API anyway, I could just refactor the second piece of code into production code and move on. But it does not feel right that, to make this work, I have to configure a specific recovery strategy and also put a timeout on a cancellation token, tuned to fit the recovery strategy settings.
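For illustration, the awaited version I have in mind would look roughly like the sketch below. It is deliberately not LLBLGen-specific: SaveHelper, SaveAllAsync and the saveAsync delegate are hypothetical names, with the delegate standing in for a call such as adaptor.SaveEntityAsync(netMessage, false, token). It assumes the save call actually honors the token, and the per-item timeout is an arbitrary value.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SaveHelper
{
    // Saves each item with its own timeout and awaits the result instead of
    // blocking on Task.Result. saveAsync is a hypothetical stand-in for a call
    // such as adaptor.SaveEntityAsync(item, false, token).
    public static async Task<bool> SaveAllAsync<T>(
        IEnumerable<T> items,
        Func<T, CancellationToken, Task<bool>> saveAsync,
        TimeSpan perItemTimeout,
        CancellationToken outerToken = default)
    {
        var success = true;
        foreach (var item in items)
        {
            // Link to the caller's token so a shutdown request also cancels the save.
            using var cts = CancellationTokenSource.CreateLinkedTokenSource(outerToken);
            cts.CancelAfter(perItemTimeout);

            // Exceptions thrown by saveAsync (including cancellation) propagate
            // to the caller unwrapped, rather than inside an AggregateException.
            if (!await saveAsync(item, cts.Token).ConfigureAwait(false))
            {
                success = false;
            }
        }
        return success;
    }
}
```

A call site would then be something like await SaveHelper.SaveAllAsync(netMessages, (m, token) => adaptor.SaveEntityAsync(m, false, token), TimeSpan.FromSeconds(5)). This only tidies up the workaround; it does not explain why the synchronous SaveEntity call blocks.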
Any idea what could be wrong here?
Cheers,
Robert-Jan