Add the Transformer#apply for intuitive implicit summoning #594

Open · wants to merge 2 commits into master

Conversation

@danicheg (Contributor) commented Sep 1, 2024

This way of summoning is commonly used in the projects I've worked on. In conjunction with #591, it could make writing Transformers a great experience:

object CommonTransformers {
  implicit val strIntTransformer: Transformer[String, Int] = _.length
}

case class Id(value: String)

object DomainTransformers {
  import CommonTransformers._

  implicit val idTransformer: Transformer[Id, Int] = Transformer[String, Int].contramap(_.value)
}

codecov bot commented Sep 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.56%. Comparing base (b3c3df6) to head (b63eb50).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #594      +/-   ##
==========================================
- Coverage   86.57%   86.56%   -0.01%     
==========================================
  Files         154      154              
  Lines        5995     6007      +12     
  Branches      544      543       -1     
==========================================
+ Hits         5190     5200      +10     
- Misses        805      807       +2     


*
* @since 1.5.0
*/
def apply[From, To](implicit t: Transformer[From, To]): Transformer[From, To] = t
@danicheg (Contributor, Author) commented Sep 1, 2024

It's obvious that having a counterpart method for PartialTransformer would be beneficial. However, since there is already a defined method PartialTransformer#apply, adding the counterpart method leads to a rather strange issue in Scala 2 (although it works well in Scala 3). See the minimized repro in Scastie — https://scastie.scala-lang.org/danicheg/eM7sBvzAS0q66azXmvayfg/6. Unfortunately, I haven't come up with a workaround yet, so I've left it as is.
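Below is a minimal sketch of the clash, with a hypothetical TC standing in for PartialTransformer (which already defines a function-lifting apply); the Scastie above shows the real error:

trait TC[A, B] { def run(a: A): B }

object TC {
  // the pre-existing overload, lifting a plain function into the type class
  // (mirroring PartialTransformer.apply)
  def apply[A, B](f: A => B): TC[A, B] = a => f(a)

  // the proposed summoning overload
  def apply[A, B](implicit tc: TC[A, B]): TC[A, B] = tc
}

object Demo {
  implicit val strLen: TC[String, Int] = _.length

  // Scala 3 resolves this call to the summoning overload;
  // Scala 2's overload resolution trips over it instead
  val summoned: TC[String, Int] = TC[String, Int]
}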

@MateuszKubuszok (Member) commented Sep 1, 2024

Well, to be honest, we didn't add this apply on purpose. Cats/Circe have it because:

  • you almost always summon type classes defined by someone else, to access some of their methods (those not exposed through extension methods)
  • you hardly ever need to provide your own type classes, so calling new Typeclass[A] { ... } is the most common way of defining them
  • so it's justified that the API's apply summons the type class

Chimney does the opposite.

  • we almost always automatically derive or provide type classes
  • we hardly ever call any method on them besides transform - which is called via transformInto
  • so apply would be focused on creating an instance ad hoc

Chimney's philosophy is also more on the side of Jsoniter Scala than Circe:

  • avoiding intermediate type class instances
  • recursive derivation without relying on autoderivation
  • treating cases where users define their own type class as a last resort, for when the library failed to do it OOTB, etc.

so any need to do Transformer[From, To].contramap(f) is almost always a failure on the API's part:

  • it might mess up the error paths generated by PartialTransformer
  • it prevents users from using withFieldConst(_.path.to.nested.field, value) - the implicit for such overrides would have to be ignored, allowing its usage only in the absence of overrides
  • it's most likely a case that should have been handled with one of the integrations' APIs, and if that's impossible, it indicates that an integration is missing.

In cases like the example it would make more sense to introduce some "higher-kinded Transformer" (something necessary for e.g. #569), which would handle the external transformation while the macro handles the internal one, so something like:

package integrations

// this could support e.g. `withFieldConst(_.everyItem, ...)`
trait TotallyMappable[From, To, InnerFrom, InnerTo] {
  def transform(src: From, f: InnerFrom => InnerTo): To
}
// would allow withFieldConst(_.everyItem, 0) if necessary
given TotallyMappable[Id, Int, String, Int] with {
  def transform(src: Id, f: String => Int): Int = f(src.value)
}

would potentially work much more reliably.

I'd be reluctant to promote this summon/map/contramap style of defining Transformers, as it's way too easy to write underperforming code and throw all the goodies out of the window. The fact that there is no place in the docs which defines one implicit Transformer using another is not an oversight - we did it on purpose, as we consider it a potential antipattern (e.g. moving from 0.7.5 to 0.8.0 we changed how implicits are defined, to avoid the exponential cost of derivation, and every implicit def method(implicit t: Transformer[A, B] /* should be autoderived */) broke).

I would probably let it pass if it was consistent:

  • two apply methods, both creating a new instance
  • maybe something else for summoning, like Transformer.summon[From, To], with a documentation warning that whatever is being attempted is most likely discouraged (sketched below)
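For example, reusing the signature from this PR's diff under a different name:

/** Summons an already available implicit instance.
  *
  * Note: building Transformers on top of summoned instances is most likely discouraged.
  */
def summon[From, To](implicit t: Transformer[From, To]): Transformer[From, To] = t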

If PartialTransformer#apply was not already a thing, I might have seen it differently... but now consistency and stability take priority.

@lbialy (Collaborator) commented Sep 1, 2024

I kind of hope for more reasoning like this in the Scala ecosystem - stability and consistency first. 👏

@danicheg (Contributor, Author) commented Sep 2, 2024

Wow, I didn't expect such a strong reaction to that inoffensive PR. Let me clarify my reasoning on this matter:

  • First of all, I didn't intend to propose any fundamental shift in the concepts related to Chimney. If we examine Chimney's codebase, there are several cases where implicitly[Transformer[From, To]] is used, and it's even mentioned several times on the documentation site (which might be considered a form of promotion). My PR is solely about making this implicit summoning approach more user-friendly and intuitive (I'll elaborate on that point below).

  • If we consider Transformer as a type class (which, according to the Scaladoc, it is), then implicit summoning is a standard practice widely accepted by the community — even within the Scala compiler sources themselves!

  • Contrary to what @lbialy said, using this approach for implicit summoning is stable and consistent across the ecosystem. I don't quite understand the opposing viewpoint and would appreciate some clarification on that matter.

Going further, I'd like to discuss the approach of defining common instances of Transformer in general (which is outside the original scope of the PR, in my humble opinion).

Chimney does the opposite.

  • we almost always automatically derive or provide type classes

Chimney's philosophy is also more on the side of Jsoniter Scala than Circe:

  • avoiding intermediate type class instances
  • recursive derivation without relying on autoderivation

I understand this might be considered an 'Argumentum ad verecundiam,' but with over eight years of experience working with Scala, I have optimized compilation times for various codebases ranging from 50k to 1M LOC across different companies many times. I've even contributed to the Scala ecosystem by reviving a tool for profiling the compilation times of Scala 2 projects — https://github.com/scalacenter/scalac-profiling/.

Given all of my experience with Scala Macro and automated derivations, I have a STRONG OPINION that there is no worse advice from upstream libraries than to use auto-derivation to the maximum extent, without paying any attention to the consequences. The negative impact on DX in projects larger than 10 KLOC, plagued by automated recursive derivations of type class instances by Macros, without consideration of the underlying costs, is significant and frustrating.

This approach heavily contributes to the perception that "Scala is slow" or that "compilation in Scala is slow" in the broader community. Therefore, I would really like to hear @lbialy's perspective on this matter. What are we aiming to achieve? Higher velocity in writing code, or ensuring that large enterprise projects can thrive over the long term?

To elaborate a bit, earlier this year, I worked on optimizing the compilation times of a quite mature and large (> 0.5MLoC) monorepo at $work, which employed automated recursive derivations of instances of certain type classes. I’m not sure if you've encountered something similar, but I was genuinely surprised to find that the derivation of an instance of TypeClass[Foo] could take up to 70 seconds. Fortunately, we were able to fix this and reduce the derivation time to acceptable levels by manually supplying the intermediate instances that were being recursively derived by the compiler over and over again. To be honest, we have very few alternative approaches to resolving such issues without throwing away automated derivations at once.

The fact that there are no places in docs which define one implicit Transformer using another is not an overlook - we did it on purpose, as we consider it a potential antipattern

Honestly, I'm not seeing how that point relates to the chimney-protobufs package and similar, which consists of a bunch of predefined instances of Transformers. In my humble opinion, providing users with a more pleasant experience is good rather than bad.

object CommonTransformers extends ProtobufTransformerImplicits {
  implicit val foo: Transformer[Domain, Duration] = 
    Transformer[protobuf.Duration, scala.Duration].contramap(_.value)
}

vs

object CommonTransformers extends ProtobufTransformerImplicits {
  implicit val foo: Transformer[Domain, Duration] = 
    (duration: Domain) => totalTransformerFromDurationToJavaDurationInstance.transform(duration.value)
}

@lbialy (Collaborator) commented Sep 2, 2024

If I could ask: what did you find to be a strong reaction in Mateusz's post? I feel answering that would help us understand each other without unnecessary tension.

I do understand your point @danicheg, but I think the core of Mateusz's reasoning is that library authors want to manage how the library is used through the incentives and ergonomics of its API. A missing summoning apply is such a hint - it's meant to steer the user towards the recursively deriving macro. This in turn is done because Chimney is usually used in hot paths and is expected to have performance comparable with hand-written, manual data-transfer functions. Recursive derivation in the macro allows Chimney to select speedier implementations that are invalidated by pre-existing instances of Transformer/PartialTransformer, as such interfaces create a knowledge barrier - you can't know what happens inside an instance, but if it's missing and you have knowledge from the context, you can do things faster.

This emphasis on doing the fast thing by default is treated seriously here - I would know, I contributed the benchmarking pipeline that allows Mateusz and Piotr to detect perf regressions. Now I think it's quite important to note two things. First, if you design a library without an emphasis on perf in the first place, it's usually impossible to improve performance later without massive breakage. Second, Chimney does not disallow custom instances and in fact does use them if they are available, as it's understood that the user is aware this may come at the cost of peak performance.

So there is a clear direction in which the library pushes the user: allow us to do things fast for you first; if you have a subsequent problem, e.g. compilation-time issues, there's an escape hatch for the informed engineer tasked with optimizations. While circe's auto does not do anything besides lowering the amount of boilerplate at the cost of compile times, both chimney and jsoniter-scala do indeed increase performance at that cost, and both offer a way to mitigate that impact if necessary.

I do indeed think of the DX in larger projects, given that I have mostly worked on larger projects. Maybe the Chimney docs should have a section on what to do when the impact on compile times becomes noticeable (which is probably to move towards semi-auto for large datatypes - I'll leave this to Mateusz, who understands the problem better).

My comment regarding stability and consistency was not in regard to the wider ecosystem but to the fact that Chimney has had a 1.0 release and is now considered a stable library with a well-thought-out approach to doing things. I do admire the resolve not to change things for the end user after the 1.0 release, as I feel we have had - and still have - enough breaking changes (even if they are just new ways of doing things!) in our community. Wouldn't you agree?

I've checked whether your PR was preceded by a discussion on whether it's something the library maintainers would want, and I can't see anything. Did you hold a conversation about this on some Discord, perchance?

@MateuszKubuszok (Member) commented Sep 2, 2024

Given all of my experience with Scala Macro and automated derivations, I have a STRONG OPINION that there is no worse advice from upstream libraries than to use auto-derivation to the maximum extent, without paying any attention to the consequences. The negative impact on DX in projects larger than 10 KLOC, plagued by automated recursive derivations of type class instances by Macros, without consideration of the underlying costs, is significant and frustrating.

I am perfectly aware of why recursive autoderivation is bad - I advised against it in the past myself! It spawns thousands of intermediate type classes, which need to be instantiated, which kills the runtime; it explodes the number of classes that need to be read by the classloader; and all of them go into the implicit scope, expanding the time needed for compilation, etc. It's a mess for the final performance of the generated code and a heavy workload for the compiler.

It is also absolutely not how Chimney works.

The best way I can show it is through this Scastie example (please hover over each line to see how we derived the code and what the result was):

case class FooInner(a: Int, b: String)
case class FooOuter(inner: FooInner)

case class BarInner(a: Int, b: String)
case class BarOuter(inner: BarInner)

import io.scalaland.chimney.Transformer
import io.scalaland.chimney.dsl.*
import scala.compiletime.codeOf

transparent inline given TransformerConfiguration[?] =
  TransformerConfiguration.default.enableMacrosLogging

// 1 new Transformer
FooOuter(FooInner(10, "bb")).transformInto[BarOuter]

// 0 new Transformers
FooOuter(FooInner(10, "bb")).into[BarOuter].transform

locally {
  given Transformer[FooInner, BarInner] = Transformer.derive

  // 2 new Transformers
  FooOuter(FooInner(10, "bb")).transformInto[BarOuter]

  // 1 new Transformer
  FooOuter(FooInner(10, "bb")).into[BarOuter].transform
}

One of our goals ever since we rewrote the code from Shapeless into macros was performance. It was mostly about runtime performance, but in 0.8.0 we changed how the implicit search works, which should also improve compilation times:

  • we separated Transformer - intended to be provided only by users - from Transformer.AutoDerived - which is only available as a fallback when there is no Transformer
  • that avoids situations where the macro calls itself through the implicit search; the only Transformers in scope should be user-provided ones, in their final form, with nothing to add to them
  • in the absence of a Transformer we recursively generate an expression, which does not need to create new Transformers - it happens in the same macro expansion, puts nothing new into the implicit scope, and creates no more anonymous classes than necessary
    • even if the macro were to ask the implicit scope for Transformers in the process of derivation, it should be rather cheap, since that scope is completely empty by default

I can confidently say that, aside from Jsoniter Scala, Chimney is the only library in the Scala ecosystem which tries so hard to avoid allocations and intermediate type classes. No internal anonymous type classes that nobody asked for (only the one you asked for via .transformInto or Transformer.derive/.define.buildTransformer), no premature lifting to partial.Result when building a partial transformation, no HLists or Coproducts to traverse, no Mirror proxies when instantiating the type - all the things that make automatic derivation slow both at runtime and at compile time.

(There is only one overhead that Chimney has that we are aware of - the DSL used as a builder has to collect the overrides as runtime values stored in a Vector, and we have to build it and extract from it.)

There is only one thing that kills this inlining - forcing the macro engine to call new Transformer where it could inline the whole expression, adding more implicits to the scope, and multiplying the number of expansions - and that is implicit def foo = Transformer.derive. Which means that the typical Circe advice:

object Foo {
  implicit val fooAB = Transformer.derive[A, B]

  implicit val fooA1B1 = Transformer.derive[A1, B1]

  implicit val fooA2B2 = Transformer.derive[A2, B2]
}

is the best way to kill the optimizations. We put a lot of work into making sure that the issues present in Circe autoderivation did not carry over into Chimney.

To elaborate a bit, earlier this year, I worked on optimizing the compilation times of a quite mature and large (> 0.5MLoC) monorepo at $work, which employed automated recursive derivations of instances of certain type classes. I’m not sure if you've encountered something similar, but I was genuinely surprised to find that the derivation of an instance of TypeClass[Foo] could take up to 70 seconds. Fortunately, we were able to fix this and reduce the derivation time to acceptable levels by manually supplying the intermediate instances that were being recursively derived by the compiler over and over again. To be honest, we have very few alternative approaches to resolving such issues without throwing away automated derivations at once.

The initial version of Chimney was developed using Shapeless. I had an example where Transformer compilation took over 2 minutes in a module made of 3 files, 50 LOC altogether, with a single derivation. We rewrote the derivation to macros with recursion, and the module compiled in under 2 seconds. Currently the core module's test suite compiles in around 25s (Scala 2) or 45s (Scala 3) on a cold JVM - it has about 2000 derivations, so each takes about 0.01s (Scala 2) to 0.02s (Scala 3) on average.

In virtually every larger code base I was able to improve things tremendously by replacing Shapeless-based derivation with Magnolia-based derivation; it helped much more than replacing autoderivation with semiautomatic derivation everywhere (the latter helped more with making sure that there were no two derivations for the same types with different behavior).

Every performance-related issue with Shapeless-based derivation originated in the fact that: the derivation implementer has to convert between a case class/sealed hierarchy and Generic; this Generic is returned by a whitebox macro; there are n+1 nested implicits summoned - one for each ::/:+: and one for HNil/CNil; and each of these implicits has to resolve the actual implementation for the field/subtype, which at best requires 2n+1 implicits for a flat case class/sealed trait, and more for nested autoderivation (and if we assume that the compiler is not caching intermediate instances, we are heading towards exponential times).

None of that is true if the recursion is performed within the macro, with the macro actively trying to avoid calling itself. But it's probably done this way only by a handful of libraries, since the Shapeless-based approach was way too easy compared to complex macro maintenance.

Honestly, I'm not seeing how that point relates to the chimney-protobufs package and similar, which consists of a bunch of predefined instances of Transformers. In my humble opinion, providing users with a more pleasant experience is good rather than bad.

implicit transformer:

  • as a leaf of the tree
  • without any generics or implicit anotherTransformer parameters
  • for types where you would never have to customize the derivation from the level of their parent to the level of their children
  • especially for types for which no built-in rule would be able to generate the transformation

is perfectly fine. However, introducing API changes which would make people think that Chimney works like Circe - when it does not - would make them swamp us with issues like:

  • why is my implicit ignored? (because you overrode it with the DSL, and never read the error message telling you why it had to be ignored)
  • why, after using traverse, did partial.Result lose all information about which index had the wrong value? (because the expression we would generate handles that, but with traverse you'd have to do it yourself, and you didn't)
  • why did my code suddenly slow down after I customized it with an implicit def? (because it used to be a single inlined expression, allocating only as much as necessary, but you forced it to instantiate tons of new Transformers, which would have been avoided by generating the whole code at once and customizing it with the .withFieldXXX and .withSubtypeXXX methods)

which cannot be avoided by writing more docs, because nobody reads the docs; instead, people let the API encourage or discourage them from certain things. This is what we do - we hint that the proper way to use Chimney is to let the macro deal with 99% of cases.

And we want to discourage people from using implicit def something(implicit transformer: Transformer[X, Y]): Transformer[X0, Y0] = ....

In my humble opinion, providing users with a more pleasant experience is good rather than bad.

It is our opinion as well. However, it will be our problem to explain to every user:

  • why one apply does lifting and another does summoning (which can be EASILY avoided, as I wrote in my previous comment - just give it some other name, like def summon)
  • why one type class has map and another contramap, while both of them could/should be a part of both interfaces (or neither)
  • why map/contramap etc. used with PartialTransformers create unreasonable error-value paths

apply/map/contramap on their own are not the issue - the issue is that this is a stable library (we cannot arbitrarily change conventions anymore), with a large user base (who have developed expectations and rely on preserving the current behavior), which has been committed to providing the best performance we can achieve for a few years now (so we have to be very transparent about where users can shoot themselves in the foot). These methods, while useful, have to be provided in such a way that:

  • it would be consistent with the existing API
  • it would make sure that whoever uses them is aware of the differences between Chimney and other libraries

because blindly translating conventions from Circe, or from other Shapeless/Magnolia/Mirror-based libraries which borrowed Circe's approach (conventions that are very convenient but have also tripped up countless people since their inception), might easily backfire.

@MateuszKubuszok (Member) commented Sep 2, 2024

I do understand your point @danicheg, but I think the core of Mateusz's reasoning is that library authors want to manage how the library is used through the incentives and ergonomics of its API. A missing summoning apply is such a hint - it's meant to steer the user towards the recursively deriving macro. This in turn is done because Chimney is usually used in hot paths and is expected to have performance comparable with hand-written, manual data-transfer functions. Recursive derivation in the macro allows Chimney to select speedier implementations that are invalidated by pre-existing instances of Transformer/PartialTransformer, as such interfaces create a knowledge barrier - you can't know what happens inside an instance, but if it's missing and you have knowledge from the context, you can do things faster. This emphasis on doing the fast thing by default is treated seriously here - I would know, I contributed the benchmarking pipeline that allows Mateusz and Piotr to detect perf regressions. Now I think it's quite important to note two things. First, if you design a library without an emphasis on perf in the first place, it's usually impossible to improve performance later without massive breakage. Second, Chimney does not disallow custom instances and in fact does use them if they are available, as it's understood that the user is aware this may come at the cost of peak performance. So there is a clear direction in which the library pushes the user: allow us to do things fast for you first; if you have a subsequent problem, e.g. compilation-time issues, there's an escape hatch for the informed engineer tasked with optimizations. While circe's auto does not do anything besides lowering the amount of boilerplate at the cost of compile times, both chimney and jsoniter-scala do indeed increase performance at that cost, and both offer a way to mitigate that impact if necessary.

Well put together. 👏

Indeed, the "prejudice" Chimney had to address in the past was:

  • claims that handwritten code is faster and easier to maintain - while it is quite subjective what is easy to maintain and in what context, we had to provide a way of previewing the macros and their logic to make them debuggable, and micro-optimize the generated code, so that in many cases only very fine-tuned (and often unreadable) handwritten code would be equally fast or faster
  • complaints about compile-time speed - we heard a lot of them, and the way we currently derive in macros seems to be as close to optimal as we could get, avoiding recursive type class instantiations and wrapping. And we have had benchmarks since 0.7.0 to track regressions.

Additionally, as far as I can tell, quite a lot of our users are on Spark (this I know through private conversations), and very few use the Cats module. These stats don't show it perfectly, but one can still see that while the Cats integration is the most popular one... it's used by less than 10% of our users. Most of them need only very basic features:

  • as a matter of fact, more detailed Sonatype stats and GitHub searches for how people use Chimney in real projects show that people were perfectly happy with what it offered 2+ years ago (and the cause is not the removal of TransformerFs, as hardly anyone used them)
  • over that time no one reported issues or started discussions (on GH or Gitter) about missing APIs for building type classes - only about more things that Chimney could do out of the box, about avoiding some allocations, or about defaults that should have been opt-in

The scraps of information I could get my hands on paint a picture of a userbase which uses Chimney for ad-hoc transformations, provides implicit Transformers only when needed, hardly ever shares them, wants the generated code to be fast and unsurprising, and expects everything to be provided OOTB or via OSS integrations maintained by someone else (and even that is quite rare). Perhaps this image is wrong, but there are no indications of that. It works quite well with Chimney's current design, while it could be an antipattern if Chimney were designed after all the conventions that Circe uses.

I do indeed think of the DX in larger projects, given that I have mostly worked on larger projects. Maybe the Chimney docs should have a section on what to do when the impact on compile times becomes noticeable (which is probably to move towards semi-auto for large datatypes - I'll leave this to Mateusz, who understands the problem better).

We have a section about the performance impact of the default way of derivation with import dsl.* vs the more Circe-like way with import auto.*, semiauto, and import syntax.* (and why it degrades performance rather than improving it - but if someone wants to use Chimney like Circe, they are free to do so). We could expand it with compile-time impact and how to measure it (enableMacrosLogging measures the time of a macro expansion). If someone wanted to reuse/test transformers, the best way would be to semiautomatically derive the outermost transformation, skipping the intermediate transformers and allowing the inlining of all the internal transformations.

@danicheg (Contributor, Author) commented Sep 2, 2024

Wow, it was a great decision to open this pull request — so many interesting insights into Chimney's development and evolution. Thanks for writing all of this. I'll do my best to keep up with the growing context.

library authors want to manage how the library is used through the incentives and ergonomics of its API

Sure, it wasn't disputed.

@MateuszKubuszok so, to sum up your position: you want to promote auto-derivation because it works faster than providing custom intermediate instances of Transformers. That's a very interesting point. Are you suggesting that you re-implemented how implicit search works using macros? I mean in this case:

case class FooInner(a: Int, b: String)
case class FooOuter(inner: FooInner, qux: QuxInner)

case class BarInner(a: Int, b: String)
case class BarOuter(inner: BarInner, qux: QuxOuter)

case class QuxInner(qux: Int)
case class QuxOuter(qux: Int)

case class QuuxInner(qux: QuxInner)
case class QuuxOuter(qux: QuxOuter)

import io.scalaland.chimney.Transformer
import io.scalaland.chimney.dsl.*
import scala.compiletime.codeOf

transparent inline given TransformerConfiguration[?] =
  TransformerConfiguration.default.enableMacrosLogging

locally {
  FooOuter(FooInner(10, "bb"), QuxInner(42)).transformInto[BarOuter]

  QuuxInner(QuxInner(42)).transformInto[QuuxOuter]
}

Does the fact that Transformer[QuxInner, QuxOuter] would be derived twice (or its inlined version — a regular function QuxInner => QuxOuter) outperform having an implicit val quxTransformer: Transformer[QuxInner, QuxOuter] in scope? Am I following the reasoning correctly? If so, it truly sounds revolutionary! But what are the constraints of that approach? I mean, what if we speak about dozens of probable intermediate instances?
Also, I'm curious about the mentioned magic of not deriving redundant instances in your example. Does it apply only to the local scope? Specifically, when the same instances of Transformer are derived in:

  • the same local scope;
  • the same file scope;
  • the same SBT submodule scope;
  • the same SBT project scope?

@MateuszKubuszok (Member) commented Sep 2, 2024

I didn't introduce any new way of resolving implicits so much as I sat down to understand what the compiler actually does and what the consequences of other libraries' design choices are, and then made different choices.

Implicit search starts by looking at definitions which are available in the current scope: via imports, inherited from another class, or defined in the current scope or a scope it is nested in. These are sorted by "closeness" to the derivation call site (however, most of the time they have the same distance, so instead of priorities we get ambiguity). Then there is a fallback to implicits defined in companion objects - the compiler looks at all the types that make up the summoned type (the "outer" type, its type parameters, the type parameters of its type parameters, etc.) and sorts them as well. Then it looks for "eligible implicits" - starting from the highest-priority implicit, it tries to use it; if that implicit has implicit parameters of its own, the compiler has to attempt their derivation, and only when that fails does it move on to the next implicit of a lower priority.
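For illustration, a minimal example of that two-stage lookup (with a hypothetical Show type class):

trait Show[A]

object Show {
  // implicit (companion) scope - found only as a fallback
  implicit val fromCompanion: Show[Int] = new Show[Int] {}
}

object Use {
  // lexical scope - higher priority than the companion's instance
  implicit val local: Show[Int] = new Show[Int] {}

  val resolved = implicitly[Show[Int]] // picks `local`, not `fromCompanion`
}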

Quite a lot of the (Scala 2) libraries are based on Shapeless. As I wrote elsewhere it:

  • provides a Generic[YourType] in the form of a whitebox macro - it creates something like Generic.Aux[YourType, Field1 :: Field2 :: ... :: HNil] or Generic.Aux[YourType, Subtype1 :+: Subtype2 :+: ... :+: CNil]
  • but that is just the beginning: you have to translate to/from the generic representation, so the value you pass into your type class has to be converted into a list (in O(n) time for the size of your case class/sealed trait) and/or back from a list (same)
  • but having an HList/Coproduct is not enough: you have to recursively build up your type class for the generic representation by doing something like implicit def appendTC[Head, Tail <: HList](implicit head: TypeClass[Head], tail: TypeClass[Tail]): TypeClass[Head :: Tail] = ... - this happens at compile time, and for each such nesting the compiler has to both compare types and try to expand the implicit to see if it's "eligible"
  • this continues until it reaches HNil/CNil, which requires a dedicated implicit

Of course, for that to work these implicits have to be available when we need them. Autoderivation just makes them all available, either via an import like import library.auto.* or by putting them into the type class's companion.

This might be a problem: how would you then do something like implicit val typeclass = summon[TypeClass[F]] without a circular dependency during initialization?

So there was another invention - a separate type class, DerivedTypeClass, whose derivation would use TypeClass instances but never another DerivedTypeClass:

trait TypeClass[A]
trait DerivedTypeClass[A]

implicit def generic[A, AGen](implicit gen: Generic.Aux[A, AGen], hlist: DerivedTypeClass[AGen]): DerivedTypeClass[A] = ...
implicit def hcons[Head, Tail <: HList](implicit head: TypeClass[Head], tail: DerivedTypeClass[Tail]): DerivedTypeClass[Head :: Tail] = ...
implicit val hnil: DerivedTypeClass[HNil] = ...

def deriveSemi[A](implicit derived: DerivedTypeClass[A]): TypeClass[A] = convertDerivedToNormal(derived)

Automatic derivation needed something like 2n+1 implicits for a flat case class/sealed trait; if it was LabelledGeneric, that would be more like 3n+1 (we would also have to summon a Witness for each field). I am assuming that we are only combining type classes in a naive way - if it were traversing the list to decide whether to go one way or another... that would go into O(n^2).

If you have semiautomatic derivation... it does not solve that issue. It addresses the fact that the compiler does not have to cache the intermediate implicits (it might try to, but it is not guaranteed), so derivation can easily take exponential time. Semiautomatic derivation for intermediate types is basically the user caching intermediate results, so that the compiler spends CPU on lookup rather than on generating the code it already generated a moment ago.
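Continuing the DerivedTypeClass sketch above, that caching looks like this (Inner/Outer are placeholder types):

case class Inner(a: Int)
case class Outer(inner: Inner)

// the user caches the intermediate instance...
implicit val innerTC: TypeClass[Inner] = deriveSemi[Inner]

// ...so deriving Outer finds innerTC via a (cheap) implicit lookup instead
// of re-deriving TypeClass[Inner] from scratch at every use site
implicit val outerTC: TypeClass[Outer] = deriveSemi[Outer]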

This whole scheme generates a lot of overhead. Magnolia tries to address it by having one macro expansion for each level, so that even if you are not caching the instances, at least you are only looking up n implicits and combining them with a single function. That improves things both at compile time and at runtime.

And that is still not what Chimney or Jsoniter do. From the POV of the approach described above, Chimney/Jsoniter with recursion enabled have neither true autoderivation nor true semiautomatic derivation.

Chimney separates Transformer from Transformer.AutoDerived similarly to how Circe separates Encoder and DerivedEncoder (and the same for Decoder). This makes sure that, when deriving things recursively, it will not attempt to look up things that users didn't provide themselves: automatic derivation can return a Transformer.AutoDerived, but the macro is only allowed to look up Transformers.

But this is where things are different. OOTB, there is no implicit Transformer in scope for any type. So the list of eligible implicits is empty unless the user explicitly adds some implicit Transformer. The macro also does not turn intermediate results into Transformers just to instantiate them and pass the value into anonymous.transform - it generates just the expression. It has no issue with whether to cache instances or not, since it generates no intermediate instances the user didn't provide, and with everything happening inside a single macro expansion - all types instantiated, all symbols available, etc. - the cost of generating a similar expression twice has so far been negligible.

So it's basically recursive semiautomatic derivation. Automatic derivation is achieved through a fallback: if there is no Transformer available for transformInto, a Transformer.AutoDerived will be generated (and this implicit is invisible to the macro, so it cannot trigger it). It was achieved simply with the relation Transformer[A, B] <: Transformer.AutoDerived[A, B] and some tricks with the companion object that we will have to adjust for Scala 3.7.
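A simplified sketch of that shape (names shortened, the macro machinery stubbed out):

trait AutoDerived[From, To] {
  def transform(src: From): To
}

// every user-provided Transformer is also usable as the weaker type
trait Transformer[From, To] extends AutoDerived[From, To]

object AutoDerived {
  // the fallback - a macro expansion in the real library, and invisible
  // to the derivation macro itself, so it cannot be triggered recursively
  implicit def derived[From, To]: AutoDerived[From, To] = ???
}

// a transformInto-like entry point demands only the weaker type, so a
// user-provided Transformer wins when present; otherwise the fallback fires
def transformInto[From, To](src: From)(implicit t: AutoDerived[From, To]): To =
  t.transform(src)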

Jsoniter has a similar approach, but it removed automatic derivation completely, so it has only semiautomatic derivation, with optional recursion, which picks up implicits if available, or generates inlined expressions if not.

TL;DR

It isn't a reimplementation of how implicits work - more of a divorce from the convention of how they were used by virtually every library that was based on Shapeless and drew its design from Circe (because Circe is responsible for promoting these easy but suboptimal patterns).

And it was really tricky to come up with something that works this way but is easy to use 90% of the time.

Does the fact that Transformer[QuxInner, QuxOuter] would be derived twice (or its inlined version — a regular function QuxInner => QuxOuter) outperform the case of having implicit val quxTransformer: Transformer[QuxInner, QuxOuter] in scope?

If you:

  • had to use the very same transformation twice
  • could cache it in a val

then semiautomatic derivation with Transformer.derive would work better. What is important is that:

  • an implicit is like a closed world
  • you cannot look inside it
  • you cannot modify or customize it

so the macro loses the ability to:

  • avoid wrapping with a type class - it has to create a new instance of a Transformer; if it is stored in a val it has to do so only once, while with defs it would allocate every time we call it
  • avoid wrapping with partial.Result - if it is a PartialTransformer which didn't need that partiality, derivation of the whole tree would defer that wrapping as long as possible; the need to return a partial.Result forces the macro to wrap the value, and when it's summoned it has to be treated as something that can only be mapped/parMapped/etc. rather than a value which can be passed into a constructor
  • customize - if we want to override an inner value with the DSL (withFieldConst(_.foo.everyItem.baz.matchingSome.baz.everyMapValue, value) is possible!), the implicit has to be ignored, because we cannot inject that logic into an existing value (see the sketch below)
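To make the customization point concrete, a small example with hypothetical types (the nested-path override is the kind quoted above):

import io.scalaland.chimney.dsl._

case class Inner(a: Int)
case class Foo(inner: Inner)

case class InnerOut(a: Int, extra: Int)
case class Bar(inner: InnerOut)

// the macro injects the override into the expression it generates; an opaque
// implicit Transformer[Inner, InnerOut], had one been in scope, could not be
// customized like this and would have to be ignored for the overridden path
val bar: Bar = Foo(Inner(1)).into[Bar].withFieldConst(_.inner.extra, 0).transform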

If we don't have to reuse a transformation, we might just use automatic derivation or do into.transform, which inlines the transformation (but has the builder DSL overhead of creating a Vector for possible overrides).

I'd say that if one wanted to really improve performance (and compilation times) as much as possible, then something like:

object ModuleForDerivation {

  // No implicits here: each macro expansion derives the transformation only
  // for the outer types (the ones actually transformed), not for inner types,
  // whose transformations are inlined. (From1/To1 etc. stand for concrete types.)

  val transformer1 = Transformer.derive[From1, To1]

  val transformer2 = Transformer.derive[From2, To2]

  val transformer3 = Transformer.derive[From3, To3]
}

object ModuleForImporting {
  
  // exposing the results of the derivation as implicits

  implicit val transformer1 = ModuleForDerivation.transformer1

  implicit val transformer2 = ModuleForDerivation.transformer2

  implicit val transformer3 = ModuleForDerivation.transformer3
}

combined with

import ModuleForImporting._ // transformers

import io.scalaland.chimney.syntax._
// syntax._ only summons Transformers - it sees no Transformer.AutoDerived -
// so every transformation has to be manually derived and stored somewhere

would work best - it would make sure that each Transformer has an inlined body, and that each transformation reuses the same instance cached in a val. But it would be a PITA to work with.

Of course one has to use benchmarks to be sure - at some point inlining is detrimental to the JVM's performance rather than helping it, so there would be cases when one should avoid inlining.

@danicheg (Contributor, Author) commented Sep 2, 2024

If you:
had to use the very same transformation twice
then semiautomatic derivation with Transformer.derive would work better.

To keep things brief and statement-like: even with Chimney's approach of building an accumulated Expr in the macro context, if the same Transformers are going to be used multiple times (either derived explicitly or inlined in different macro expansions), then caching those Transformer instances in vals is a recommended way to improve compilation time, right? I'm asking because that was precisely my original point, considering the root case mentioned in #594 (comment).

Anyway, even with your detailed explanation, I still don't see the reasoning behind why manually written instances are worse (in terms of compilation time) than using Transformer.derive. If I understand correctly, you're still performing an implicit search (through the macro universe capabilities) for the needed instance and doing in-place inlining in the not-found case, rather than performing another macro expansion for the derivation (like many other libraries do).

@MateuszKubuszok (Member) commented

Hey, sorry for the late reply, I didn't forget about your question. I just had to attend to other things the last several days.

To keep things brief and statement-like: even with Chimney's approach of building an accumulated Expr in the macro context, if the same Transformers are going to be used multiple times (either derived explicitly or inlined in different macro expansions), then caching those Transformer instances in vals is a recommended way to improve compilation time, right? I'm asking because that was precisely my original point, considering the root case mentioned in #594 (comment).

I cannot say anything that covers every case; I started doing benchmarks to be able to answer such questions, and as always: it depends. I will show my findings at some conference soon.

Anyway, even with your detailed explanation, I still don't see the reasoning behind why manually written instances are worse (in terms of compilation time) than using Transformer.derive. If I understand correctly, you're still performing an implicit search (through the macro universe capabilities) for the needed instance and doing in-place inlining in the not-found case, rather than performing another macro expansion for the derivation (like many other libraries do).

Basically, this jsoniter-like approach... well, it doesn't necessarily play along with patterns that work in Circe or other "standard auto vs semiauto" libraries:

  • Transformer.apply[From, To](implicit t: Transformer[From, To]): Transformer[From, To] = t sounds like a nice idea for building on top of existing transformers...
  • ...until you find out that users would automatically expect that Transformer[A, A].map(f: A => B) should also work, because why not?
  • and then they find out that:
    • there is no identity Transformer in scope
    • nor is there a Transformer for converting between collections or arrays
    • nor for wrapping/unwrapping/rewrapping Options
    • nor Eithers
    • as a matter of fact, there are 0 implicit Transformers in scope
    • and they should have been using Transformer.AutoDerived half the time
  • then, this approach would not work if someone does not want to import dsl._ (our default approach) and instead goes for the import syntax._ approach, where they would either import auto._ or use Transformer.derive for semiauto
  • which basically means that depending on which import you pick, you'll work with different conventions, and putting one Transformer.apply in place for everyone would not work with one of them
    • this suggests that instead of Transformer.apply there should be some def summonTransformer defined in dsl AND in syntax, one taking Transformer.AutoDerived and the other taking Transformer, to avoid nasty surprises
  • well, if you consider that most people assume that built-in types are supported by putting some default instances in scope, Transformer.apply would surely surprise semiautomatic derivation users, who would discover that it requires a manual implicit even for things supported OOTB...

    // if Transformer.apply takes Transformer instead of Transformer.AutoDerived,
    // then this is required:
    implicit val ls2vs: Transformer[List[String], Vector[String]] = Transformer.derive
    // or this would fail:
    Transformer[List[String], Vector[String]].map(...)

    at which point they might as well use Transformer.derive directly, getting the instance derived without summoning:

    Transformer.derive[List[String], Vector[String]].map(...)

Long story short, Transformer.apply(implicit Transformer) only works if you focus on a single case and ignore the big picture and the conventions existing in other libraries, which would not translate well into Chimney's approach.
