Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

+ Monix Task and BIO support and Cats IO improvements #879

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

bogdanromanx
Copy link

  • kamon-cats-io: Added a Tracing object with an operationName function that allows creating named child spans
  • kamon-monix: Added a new module that supports monix-eval Task and monix-bio IO effect types

The span nesting does not work as evidenced by the test AbstractCatsEffectInstrumentationSpec.nest spans correctly that is failing. Any pointers are much appreciated.

The purpose of this PR is to add support for Monix as a separate module, while enhancing the Cats Effect support.

@bogdanromanx bogdanromanx changed the title Cats IO improvements and Monix Task and BIO support + Monix Task and BIO support and Cats IO improvements Oct 21, 2020
@bogdanromanx
Copy link
Author

@ivantopo, @dpsoft, do you have any ideas why the spans are not nested correctly if the context propagation is supposed to work (according to the other cats/monix test)?
Could you please give me some pointers on how I can further investigate the problem?

Comment on lines 67 to 73
// - root
// - 1 (value = 1)
// - 2 (value = 2)
// - 3 (value = 3)
val rootSpan = for {
root <- F.delay(Kamon.spanBuilder("root").start())
_ <- (1L to 3L)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would the spans be nested in a tree?
I thinks this test just creates unrelated spans.
Maybe try creating the other spans as children of the first one, or using Kamon.runWithSpan

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the test all operations are sequenced as follows:

  • a root span is created and started without being finished
  • 3 additional spans that are started and finished, executed one after another (.map(...).sequence)
  • the root span is finished

The named function that is applied to the F.delay(idx) is actually a call to the operationName in the Tracing object that does the following:

  • creates and starts a new span with the provided name and tags that sets as parent span the result of Kamon.currentSpan()
  • executes the F[_] effect, which in this case is F.delay(idx)
  • finishes the span

Assuming the operation sequence defined above, the three effects F.delay(idx) should be wrapped in spans that use the root span as a parent, because that's the current span (the started and not finished).

I don't understand what part of this test is "iffy". :) ... but please let me know if I'm missing something.

The assertion at the end is obviously wrong because it also considers the root span for the equality check, but that's not the problem here as I've manually printed the span ids when they are created.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the kamon-scala-future module, it looks like it doesn't just create a span, but explicitly stores it in the context storage and removes it afterwards. I assumed that simply starting and finishing a span is sufficient. I'll try to do the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I've put down some println's and Kamon.currentSpan() give Span.Empty both in the span creation code and in the test code.
Adding something like Kamon.runWithSpan(root) to the test should correctly nest it in the test code.

Copy link
Contributor

@SimunKaracic SimunKaracic Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you figured it out! 🎉

Feel free to use me as a rubber duck, while I try and wrap my head around Monix code 😅

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the problem, I needed to store the span in the context storage...
But now I've hit another issue where every now and again the test fails due to the fact that the fibers are interleaved. I think this is a more fundamental issue with my PR, I may have to use TaskLocal and IOLocal to get named spans working correctly. Still investigating.

@SimunKaracic
Copy link
Contributor

Hey man, sorry for the delay, I'll go on a deeper dive here soon!
For now, the test is kinda iffy, but I've been known to be wrong.
If the tests are correct, you most likely need to figure out the correct place to use Kamon.storeContext!

P.S. you can configure your editor to add newlines at the end of files, so github stops putting red marks everywhere!

@jatcwang
Copy link
Contributor

jatcwang commented Mar 15, 2021

Hi so this was an issue I hit a brick wall with when developing https://github.com/jatcwang/kamon-cats-effect

After staring at it for a while with @ivantopo I think we've found the problem.

The gist of the problem is that we're calling scope.close() in a different thread than the original thread where the Scope was created.

See this line

override def store(newContext: Context): Scope = {
val ref = tls.get()
val previousContext = ref(0)
ref(0) = newContext
new Scope {
override def context: Context = newContext
override def close(): Unit = ref(0) = previousContext
}

Note how in def close(), we have a reference to the array thread local. This is fine if everything happens in one thread, but if close() is called from a different thread, we have disastrous consequences such as context contamination.

The solution (for this problem) is to resolve the thread local again in close()

override def close(): Unit = tls.get()(0) = previousContext

If there are benchmarks we should make sure this doesn't have significant performance impact, since the array trick was originally done for performance reasons. (We should also try removing the array trick altogether if we decide to "fix" this, since at this point it's really not doing anything)

@bogdanromanx
Copy link
Author

Thanks for this @jatcwang, I'll have a look.

@dpsoft
Copy link
Contributor

dpsoft commented Mar 17, 2021

Hi @jatcwang!

If there are benchmarks we should make sure this doesn't have significant performance impact, since the array trick was originally done for performance reasons. (We should also try removing the array trick altogether if we decide to "fix" this, since at this point it's really not doing anything)

Please take a look: https://github.com/kamon-io/Kamon/blob/master/core/kamon-core-bench/src/main/scala/kamon/bench/ThreadLocalStorageBenchmark.scala if you want measure the performance impact.

@jatcwang
Copy link
Contributor

jatcwang commented Mar 18, 2021

Ok this one is interesting, the "old" (supposedly) non-optimized version is faster when I run the benchmark.
(I didn't run that many iterations).

Some unscientific benchmarks I ran with my laptop (Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, 16GB)

On GraalVM:

# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM GraalVM CE 21.0.0.2, 25.282-b07-jvmci-21.0-b06
# VM invoker: /usr/lib/jvm/java-8-graalvm/jre/bin/java
# VM options: -XX:MaxInlineLevel=24 -XX:MaxInlineSize=270
# Warmup: 3 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 2 threads, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: kamon.bench.ThreadLocalStorageBenchmark.fastThreadLocal
# Run progress: 50.00% complete, ETA 00:01:20
# Fork: 1 of 1

Benchmark                                       Mode  Cnt   Score   Error  Units
ThreadLocalStorageBenchmark.currentThreadLocal  avgt    5  16.441 ± 4.799  ns/op
ThreadLocalStorageBenchmark.fastThreadLocal     avgt    5  13.701 ± 1.391  ns/op

and on OpenJDK:

# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk/jre/bin/java
# VM options: -XX:MaxInlineLevel=24 -XX:MaxInlineSize=270
# Warmup: 3 iterations, 10 s each
# Measurement: 5 iterations, 10 s each

Benchmark                                       Mode  Cnt   Score   Error  Units
ThreadLocalStorageBenchmark.currentThreadLocal  avgt    5  17.482 ± 0.717  ns/op
ThreadLocalStorageBenchmark.fastThreadLocal     avgt    5  12.808 ± 0.683  ns/op

Note that "currentThreadLocal" is the naive, non-optimized version. So yeah the optimized version is consistently faster but for my use case (Mainly future/IO instead of akka) I'd say the impact is negligible.

@SimunKaracic
Copy link
Contributor

Great work @jatcwang!
If you think this implementation would be useful to you, feel free to add it.
Just add a (well documented :D) setting that lets users choose which implementation they'd like to use, with the current one as default!

@jatcwang
Copy link
Contributor

@SimunKaracic PR up in #961

* @return the same effect wrapped within a named span
*/
def operationName[F[_] : Sync, A](name: String, tags: Map[String, Any] = Map.empty, takeSamplingDecision: Boolean = true)(fa: F[A]): F[A] = {
val F = implicitly[Sync[F]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we take (implicit F: Sync[F]) like buildSpan?


implicit def timer: Timer[F]

private val customExecutionContext = ExecutionContext.fromExecutorService(Executors.newCachedThreadPool())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go with a fixed ThreadPool to ensure the test are running with multiple threads

val contextTag = F.toIO(contextTagF).unsafeRunSync()
contextTag shouldEqual "value"
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should test cancellation and errors?

Would be good to run multiple IOs in parallel (with shift inserted in between) to ensure everything works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants