Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 25 #28

Closed
wants to merge 6 commits into from
Closed

Fix issue 25 #28

wants to merge 6 commits into from

Conversation

ndrwrbgs
Copy link
Collaborator

@ndrwrbgs ndrwrbgs commented Nov 5, 2017

Fixes #25

@JSkimming I'm not familiar with .NET Core style projects. I see that the original code has some NETStandard2_0 conditional compilation sections that I think this new code will need similar mechanisms for, but I wasn't able to get the tests targeting that with xUnit to test my changes - perhaps you could help update the change to work with 2.0?

This approach has the problem of having a .Wait(), but I believe that since Castle.Core only provides synchronous interception today this may be inevitable - but this doesn't make it any less perilous.

Finally, while coverage will say this is fully covered, it's not functionally fully covered. Probably needs proper tests before check in to mainline - but it does make #26 pass :)

@codecov-io
Copy link

codecov-io commented Nov 5, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@b897860). Click here to learn what that means.
The diff coverage is 91.66%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #28   +/-   ##
=========================================
  Coverage          ?   99.17%           
=========================================
  Files             ?        5           
  Lines             ?      243           
  Branches          ?        3           
=========================================
  Hits              ?      241           
  Misses            ?        2           
  Partials          ?        0
Impacted Files Coverage Δ
....AsyncInterceptor/AsyncDeterminationInterceptor.cs 100% <ø> (ø)
...stle.Core.AsyncInterceptor/AsyncInterceptorBase.cs 97.05% <91.66%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b897860...6b3ec0d. Read the comment docs.

@ndrwrbgs
Copy link
Collaborator Author

Will review later, at latest tomorrow, just FYI

@JSkimming
Copy link
Owner

Hi @ndrwrbgs.

I've been tying myself in knots trying to get to the bottom of this one.

It boils down to this. It still doesn't work if an asynchronous operation is performed prior to calling proceed(). Simply demonstrated by your change in #26,

protected override async Task InterceptAsync(
    IInvocation invocation,
    Func<IInvocation, Task> proceed)
{
    try
    {
        _log.Add($"{invocation.Method.Name}:StartingVoidInvocation");

        await Task.Yield(); // <---!!!! HERE !!!!
        await proceed(invocation).ConfigureAwait(false);

        if (_msDeley > 0)
            await Task.Delay(_msDeley).ConfigureAwait(false);

        _log.Add($"{invocation.Method.Name}:CompletedVoidInvocation");
    }
    catch (Exception e)
    {
        _log.Add($"{invocation.Method.Name}:VoidExceptionThrown:{e.Message}");
        throw;
    }
}

When the tests run (don't bother using R#, it can't cope), either two things happen, a stack overflow due to a race condition, or deadlock.

The race condition occurs here:

// Do not return from this method until proceed is called or the implementor is finished
TaskCompletionSource<object> canReturnTcs = new TaskCompletionSource<object>();
Task ProceedWrapper(IInvocation innerInvocation)
{
    // "until proceed is called"
    canReturnTcs.TrySetResult(null); // !!! By completing canReturnTcs before calling proceed !!!
    return proceed(innerInvocation); // !!! It causes a race condition. !!!
}

// Will block until implementorTask's first await
Task implementorTask = InterceptAsync(invocation, ProceedWrapper);

// "or the implementor is finished"
implementorTask.ContinueWith(_ => canReturnTcs.TrySetResult(null));

canReturnTcs.Task.Wait();
return implementorTask; // !!! This can still return before proceed is called. !!!

Since canReturnTcs.TrySetResult(null); is called just before return proceed(innerInvocation);, it's still liable to the issue with AbstractInvocation, where it's returned before proceed() is called and the counter is decremented, this then loops forever resulting in either a stack overflow, or deadlock (see next).

To prevent the race condition, I tweaked ProceedWrapper to be this:

// "until proceed is called"
Task task = proceed(innerInvocation); // !!! Call proceed before setting canReturnTcs. !!!
canReturnTcs.TrySetResult(null);
return task;

With this change, it's deadlock all the way, and I believe I understand why.

Essentially by not returning the implementorTask to get around the issue with AbstractInvocation it never gets back to the calling code, and as such, the work never continues.

ProceedWrapper never gets called so canReturnTcs doesn't complete, similarly implementorTask never completes, as it's never returned to the calling code, therefore preventing the second-way canReturnTcs can complete.

Conclusion

I'm done for the weekend I'm afraid, family duty calls. I'm not sure we can get this working, or if we can it'll be quite a performance overhead (spinning up new threads).

I think we'll need to change AbstractInvocation. Instead of it keeping a counter of the next interceptor, which is ultimately where this is falling apart, it should create a chain of interceptors, that is walked by calling proceed.

Do you get my meaning?

It's maybe worth submitting a PR to Castle Core if we can produce a nice and simple reproduction. What do you think?

I'd love to be wrong about not being able to fix it in this library :-)

@ndrwrbgs
Copy link
Collaborator Author

Thanks for looking! Do you know what target profile was deadlocking for you? I was getting passes when targeting .NET 4.6, if I can reproduce the failure I can look a bit more.

Also just to make sure, it was failing in one of the tests right? Not anything attributed STAThread?

@ndrwrbgs
Copy link
Collaborator Author

RE the Castle.Core suggestion, I’d love to have the support there. Just from original skimming of their repo though it looks like they’re not convinced of the need for it yet. I’ll investigate more on both fronts.

@ndrwrbgs
Copy link
Collaborator Author

Oh sheesh! So sorry @JSkimming yeah this didn't fix 25 at all. I'm not sure what I was doing, I thought I had validated it but evidently I did not!

@ndrwrbgs
Copy link
Collaborator Author

ndrwrbgs commented Dec 2, 2017

Any updates on this @JSkimming ?

@JSkimming
Copy link
Owner

@ndrwrbgs Did your last changes fix the issue or just experimentation?

I've been busy with work of late, and I've only had the chance to work on less taxing (for my brain anyway) open source projects. I'm planning on returning to this over the Chrismas break.

My thinking is I'll investigate whether we can provide a stateless equivalent to AbstractInvocation, one that constructs a chain of interceptors rather than keeping an index into an array.

What're your thoughts?

@ndrwrbgs
Copy link
Collaborator Author

ndrwrbgs commented Dec 3, 2017

I believe it to be fixed now.
I have a hunch they didn't use the chain of interceptors for performance reasons, but I cannot be sure.

@JSkimming
Copy link
Owner

JSkimming commented Dec 3, 2017

@ndrwrbgs I've cherry-picked the crucial change you originally added in PR #26, specifically commit c88b1fd.

It passes in Travis, but hangs in AppVeyor. if you run the following command, it just hangs in deadlock

dotnet test -c Release -f netcoreapp2.0 test\Castle.Core.AsyncInterceptor.Tests\Castle.Core.AsyncInterceptor.Tests.csproj

@JSkimming
Copy link
Owner

Fixed with #54

@JSkimming JSkimming closed this Jan 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How would I retry a failed method?
3 participants