Possible bug with ThreadLocal.asContextElement
#985
Comments
It is not exactly a bug, but a side-effect of the way If you write Also note, then when you use Does it help? |
Hey @elizarov thank you for your reply & the tip for I changed the test like this:
It prints Are you basically saying that, between the outermost How about after the coroutine? Is it guaranteed to be restored? |
Consider also this case where the value depends on previous value (not uncommon):

    val tInt: ThreadLocal<Int> = ThreadLocal.withInitial { 0 }

    @Test
    fun testContext4() {
        println(tInt.get())
        runBlocking {
            println(tInt.get())
            withContext(tInt.asContextElement(tInt.get() + 1)) {
                println(tInt.get())
                // Comment for other behavior
                delay(Random.nextLong(100))
            }
            println(tInt.get())
            withContext(tInt.asContextElement(tInt.get() + 1)) {
                println(tInt.get())
                // Comment for other behavior
                delay(Random.nextLong(100))
            }
            println(tInt.get())
        }
        println(tInt.get())
    }

So now in the second |
If I understand correctly, the only workaround is to remember to set as a context element every time a coroutine is started. But what if that's happening at 100 places (not uncommon, consider the codebase which is migrating to coroutines) |
The design principle of You don't need to remember to always set it if you use the structured concurrency approach and always launch your coroutines as children of other coroutines. Once you set the corresponding element in the context of your top-level scope, it is going to be inherited by all your coroutines. |
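As a minimal sketch of the inheritance described above: the thread-local `requestId` and the function name are hypothetical, and the element is set once on the top-level `runBlocking` scope. Both children read the same value, including the one that runs on a different dispatcher.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.async
import kotlinx.coroutines.runBlocking

// Hypothetical thread-local used only for this illustration.
val requestId: ThreadLocal<String?> = ThreadLocal()

fun valuesSeenByChildren(): List<String?> =
    // The element is installed once, on the top-level scope.
    runBlocking(requestId.asContextElement(value = "req-1")) {
        // Child on the same thread: inherits the element from the parent context.
        val onParentThread = async { requestId.get() }
        // Child on a worker thread: the dispatcher changes, the element is still inherited.
        val onWorkerThread = async(Dispatchers.Default) { requestId.get() }
        listOf(onParentThread.await(), onWorkerThread.await())
    }

fun main() {
    println(valuesSeenByChildren()) // prints [req-1, req-1]
}
```

Children started outside this scope (for example with a fresh `runBlocking` on another thread) would not see the element unless it is added to their context as well.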
What I was referring to is usage in a servlet container. In this case, each request will start outside of a coroutine (as a separate thread), and then a coroutine will be started at some point via So in this scenario, does the implementation guarantee that the value will be restored? In my example, outside of I hope the answer is yes. But even so, you should realize the inconsistency here:
Isn't it well reasoned to expect 2 and 3 to act the same? Outside of a coroutine there is no context at all. It's troublesome that the behavior is different. I am sure that someone will get bitten by this. |
@alamothe The value of thread-local is guaranteed to be restored when you go outside of the scope of any of the coroutines. If you write the following code:
|
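A minimal sketch of the restoration guarantee described above, assuming a hypothetical thread-local `outerLocal` with an initial value of `"outer"`. The element is present in the coroutine's own context, so the value is well defined inside the coroutine and is back to the original once `runBlocking` returns.

```kotlin
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical thread-local used only for this illustration.
val outerLocal: ThreadLocal<String> = ThreadLocal.withInitial { "outer" }

fun observeRestoration(): List<String> {
    val seen = mutableListOf<String>()
    seen += outerLocal.get() // before the coroutine
    runBlocking(outerLocal.asContextElement(value = "inner")) {
        seen += outerLocal.get() // inside, before suspension
        delay(10)
        seen += outerLocal.get() // inside, after resumption
    }
    seen += outerLocal.get() // after the coroutine has completed
    return seen
}

fun main() {
    println(observeRestoration()) // prints [outer, inner, inner, outer]
}
```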
Got it. The way I solved it is to explicitly add to context every time I start a coroutine like that, and even implement as a helper function. I still consider this behavior unintuitive, and suggest adding a warning to the docs, stating that "the value of a thread-local is undefined inside a coroutine which doesn't have it in its context", with some examples. |
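One way such a helper could look is sketched below. The thread-locals `userId` and `traceId` and the helper `runBlockingWithThreadLocals` are hypothetical names, not part of kotlinx.coroutines; the only library calls used are `asContextElement`, `runBlocking`, and `withContext`.

```kotlin
import kotlin.coroutines.CoroutineContext
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical thread-locals that request-handling code might populate.
val userId: ThreadLocal<String?> = ThreadLocal()
val traceId: ThreadLocal<String?> = ThreadLocal()

// Captures the calling thread's current values as context elements.
fun capturedThreadLocals(): CoroutineContext =
    userId.asContextElement() + traceId.asContextElement()

// Hypothetical helper: the single entry point from thread-based code into coroutines.
fun <T> runBlockingWithThreadLocals(block: suspend CoroutineScope.() -> T): T =
    runBlocking(capturedThreadLocals(), block)

fun handleRequest(): Pair<String?, String?> {
    userId.set("u1")
    traceId.set("t1")
    return runBlockingWithThreadLocals {
        // Runs on a worker thread, yet still observes the captured values.
        withContext(Dispatchers.Default) { userId.get() to traceId.get() }
    }
}

fun main() {
    println(handleRequest()) // prints (u1, t1)
}
```

Keeping the list of thread-locals in one function means call sites do not need to know which elements exist, although every coroutine entry point still has to go through the helper.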
It would also be nice to add the warning to the documentation of And it would be much better to have thread state restored outside |
The

    import kotlinx.coroutines.Dispatchers
    import kotlinx.coroutines.slf4j.MDCContext
    import kotlinx.coroutines.withContext
    import org.slf4j.LoggerFactory

    private val logger = LoggerFactory.getLogger("MainKt")

    suspend fun main() {
        logger.info("Before operation")
        withContext(MDCContext(mapOf("operationId" to "123"))) {
            withContext(Dispatchers.IO) {
                logger.info("Inside operation")
            }
        }
        logger.info("After operation")
    }

Produces the following output:
Here the expression in square brackets is Logback's As you can see, MDC is not cleaned up after exiting from Even when this is possible, it is not convenient at all to put an "empty" value of thread-local state in all root coroutines to ensure that it will not be left undefined after running some code. Could you please fix it? Should I create a separate issue and is there a chance that this design of |
@frost13it Thanks for a use-case. I've reopened this way. Let's see if it would be somehow possible to fix it (ideally, in an efficient way). |
@frost13it I've looked at this problem again. Can you, please, elaborate on your actual use-case? In particular, you are giving an example where thread local value leaks into Note, that you can wrap your top-level code into |
Unfortunately it is not a good solution for me.
Don't you think that |
We assume that Unfortunately, this does not currently affect What we can do to improve this situation is to give integration modules like P.S. The downside of this approach to consider is that anyone who adds |
I think it would be sufficient in my case. It would be great to have such an ability in a library. |
Note that this fix has a potentially severe performance impact, since it always walks up the coroutine completion stack in search of a parent UndispatchedCoroutine. TODO: figure out how to optimize it. Fixes #985
Are there any updates on this issue? |
Thanks @elizarov for your work and everyone else in the discussion. We have solved this by making sure every time we start a coroutine, we put all thread locals that we care about in its context. But I totally understand how this doesn't work for libraries. |
Hello, I am experiencing the same issue even with the code base that is not huge and is mostly under our control. Our services use MDC logging, opencensus tracing, thread locals, and GRPC contexts. It has been a huge challenge to wire everything up correctly to work with coroutine context propagation. For the code under our control, we need to remember to seed all the different context elements every time somebody starts a coroutine. This leads to tight coupling, every piece of code that wants to start a coroutine needs to be aware of all the thread context elements in the entire codebase. And of course, it is very error-prone. The problem is exacerbated by GRPC launching coroutines without giving everyone the opportunity to override the behavior. I imagine other libraries wanting to do the same. The described behavior is surprising, to say the least. I don't totally understand why global injection is needed to fix this. In the examples above, there are clear scope/context entry and exit points. Could you explain why the |
One more request: https://youtrack.jetbrains.com/issue/KT-42582 |
This fix solves the problem of restoring thread-context when returning to another context in undispatched way. It impacts suspend/resume performance of coroutines that use ThreadContextElement since we have to walk up the coroutine completion stack in search for parent UndispatchedCoroutine. However, there is a fast-path to ensure that there is no performance impact in cases when ThreadContextElement is not used by a coroutine. Fixes #985
Is there a decision on the #1577? |
Yes, we are going to release the fix with the next release |
Magnificently! |
This fix solves the problem of restoring thread-context when returning to another context in undispatched way. It impacts suspend/resume performance of coroutines that use ThreadContextElement and undispatched coroutines. The kotlinx.coroutines code poisons the context with special 'UndispatchedMarker' element and linear lookup is performed only when the marker is present. The code also contains description of an alternative approach in order to save a linear lookup in complex coroutines hierarchies. Fast-path of coroutine resumption is slowed down by a single context lookup. Fixes #985 Co-authored-by: Roman Elizarov <elizarov@gmail.com>
Thanks for fixing this! I've been pulling my hair trying to understand why my MDC values were screwed up for months now without coming across this thread. Is there an ETA on the next release? |
Until there's a new release, it would be great to have a clear example of what this "expected usage" looks like. Specifically, the documentation for |
I'm seeing a strange side-effect which I believe is the result of this change. Using

Without

    runBlocking {
        MDC.put("key", "value")
        println(MDC.copyOfContextMap)
        delay(50)
        println(MDC.copyOfContextMap)
    }
    # {key=value}
    # {key=value}

With

    runBlocking(MDCContext()) {
        MDC.put("key", "value")
        println(MDC.copyOfContextMap)
        delay(50)
        println(MDC.copyOfContextMap)
    }
    # {key=value}
    # null

|
Yes, this is the existing limitation of Quoting our KDoc:
This is a by-design limitation that becomes much clearer when more than one coroutine is launched. E.g.:
do you expect mutated MDC to be available in |
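The limitation can be sketched with a plain `ThreadLocal` standing in for MDC (an assumption made to keep the example free of a logging backend; `label` and `observeMutation` are hypothetical names). A direct mutation survives only until the next suspension, after which the value captured in the context element is reinstalled. Wrapping the block in `withContext` with a new element is what makes an update stick.

```kotlin
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical thread-local standing in for MDC state.
val label: ThreadLocal<String?> = ThreadLocal()

fun observeMutation(): List<String?> =
    runBlocking(label.asContextElement(value = "captured")) {
        val seen = mutableListOf<String?>()
        label.set("mutated")  // direct mutation, not recorded in the context
        seen += label.get()   // still visible before the next suspension
        delay(10)
        seen += label.get()   // the captured value was reinstalled on resumption
        withContext(label.asContextElement(value = "updated")) {
            delay(10)
            seen += label.get() // the new element survives suspension
        }
        seen
    }

fun main() {
    println(observeMutation()) // prints [mutated, captured, updated]
}
```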
It seems like there are some strange issues related to it - https://youtrack.jetbrains.com/issue/KT-46552 |
Hello,
I'm observing a strange behavior with the following code:
The result is:
- testContext1 prints aba - expected
- testContext2 prints abbb - It looks like the value is not restored for some reason. Is this a bug? What's the explanation?
- testContext3 prints abba - expected

Please help!
I'm using org.jetbrains.kotlinx:kotlinx-coroutines-core:1.1.1