Tooling :: Interop story for .NET libraries using C# source generators for high performance #14300

Open
T-Gro opened this issue Nov 11, 2022 · 9 comments
Labels
Area-Compiler-ImportAndInterop import of .NET DLLs and interop Area-ProjectsAndBuild MSBuild tasks, project files, framework resolution Feature Request

@T-Gro
Member

T-Gro commented Nov 11, 2022

I want to open the discussion on consuming .NET libraries built with C# source generators: https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/source-generators-overview

With every .NET version, more APIs use them, and they are part of the reason for the unprecedented performance boosts of the .NET platform for various computing tasks. A few examples include:

And I think it is only a matter of time before database libraries or structured loggers (incl. telemetry) have it as well.

I do not believe that F# needs its own clone of source generators; there are concepts like type providers or Myriad which allow accomplishing similar goals. Therefore I did not continue this as a comment on fsharp/fslang-suggestions#864 , which I believe has different aspirations.

I do want to reopen the discussion from the .NET library consumption perspective, especially around libraries/frameworks that are massively backed and invested in (like aspnetcore), and are expensive to replicate and part of the big performance wins.
F# is a "succinct, robust and performant language for .NET", and the ability to consume the fastest of .NET's libraries IMO comes with this motto.

The latest resolution I could find on the topic is "use C# project" fsharp/fslang-suggestions#864 (comment) , which makes sense in the short term (< 3 years) perspective.

However, the broader the usage of C# code gen in well-established libraries, the more slices would have to be made in a project to separate the F# pieces (where F# programmers want to write) from the C# parts (needed simply because libraries require them). It is important to note that this might spread across multiple layers of the application, and therefore isn't just "1 F# and 1 C# project", but rather an interleaved sandwich depending on the level of the stack a library is targeting. The cognitive complexity of seeing a project (which has turned into a solution by now) like this is objectively bigger, and the typical display in Solution Explorer does not make the dependency order visible at first glance.

Which brings me to the tooling topic: what can we do better in order to support a smooth workflow using such libraries in a project that would otherwise want to be F#-only?

There is an older suggestion about mixed projects fsharp/fslang-suggestions#117 , which was correctly resolved as being a tooling issue and not a change to F# language itself.

My current view is that the user-facing side of this feature could look like embedding a single standalone C# file into the middle of an F# project.
That C# file would have access to all project/package dependencies (this is where the source gen stuff is) and to the F# files before it, but NOT to the files after it, and it would only be accessible by F# files coming after it.
If this eliminates any worries, I think this would be handy even if always restricted to 1-C#-file scenarios only.
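
To make the proposed visibility rule concrete, here is a minimal sketch. It assumes a project is nothing more than an ordered list of file names; `visibleTo` and the file names are hypothetical and do not correspond to any existing compiler or MSBuild API.

```fsharp
// Hypothetical model: a project is an ordered list of source files.
// A file may reference only the files listed before it, regardless of language.
let visibleTo (projectFiles: string list) (file: string) : string list =
    projectFiles |> List.takeWhile (fun f -> f <> file)

let project = [ "Domain.fs"; "Generated.cs"; "App.fs" ]

// The C# file sees the F# file before it, but not the one after it.
let seenByCSharp = visibleTo project "Generated.cs"   // ["Domain.fs"]

// The last F# file sees both earlier files, including the C# one.
let seenByApp = visibleTo project "App.fs"            // ["Domain.fs"; "Generated.cs"]
```

The sketch only models ordering; package and project references, which every file would see, are left out.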

From the IDE side, I could imagine this being a "lightweight project within project", with a .cs file sitting inside the .fsproj and the F# compiler knowing to split the project into multiple compilation units, invoke Roslyn underneath, and put the results together in the right order.

I will wait for someone more knowledgeable to assess whether merging the produced C# and F# IL together is even theoretically possible, or whether this would have to be independent .dlls on the output.

This is an XXL item

@T-Gro T-Gro added Feature Request Area-ProjectsAndBuild MSBuild tasks, project files, framework resolution Area-Compiler-ImportAndInterop import of .NET DLLs and interop labels Nov 11, 2022
@github-actions github-actions bot added this to the Backlog milestone Nov 11, 2022
@vzarytovskii vzarytovskii moved this to Not Planned in F# Compiler and Tooling Nov 11, 2022
@KathleenDollard

This is a really cool idea.

I wonder if a first step in this could be to force the .cs files to be first, but that is largely due to a specific concern, so I will explain it.

Due to how Roslyn source gen works, you cannot separate source gen from compilation. Because of this, I believe we will have to hydrate a C# project, and in VS it would have to be hydrated in the workspace and remain part of the design-time build for the project. From the VS perspective, I think that the C# project would have to appear real. Happy to have those wiser on VS correct me.

Project and package dependencies are giving me a bit of a headache right now, so let's just assume that is solvable.

Since F# is order dependent in the project file, I do not have my head around a way to do this with partial dependencies - the ones that would be available part way through the F# compilation. That is why I suggested that at least for a first cut, we have the C# dependencies isolated. If that makes the feature worthless, let's know that up front.

The alternative, and maybe this is what you have in mind, is that the F# compiler would maintain a separate transient C# project for every non-sequential set of .cs files. I do not see how this could work, because the C# project would view the F# project via project dependencies, and a) that would be a circular reference, which is not supported, and b) what F# project is available halfway through an F# compilation?

I look forward to feedback on this!

Adding @chsienki in hopes he can look at this question from a different perspective.

PS. Some of the examples listed seem relatively timeless, such that we could do work to provide the same code in a different way, with a potentially different gesture, in F#: JSON and RegEx. gRPC is probably pretty stable too, but there are many areas that are not. I agree that if we do nothing F# will be disadvantaged in these scenarios, and I am interested in how important getting the performance in the C# way is to people. Could we work with the community to find more F# answers for the scenarios that matter, learning from the work in C#? Happily, source generators make it easy to understand what C# is doing to gain performance (or as easy as possible in the case of RegEx ;-)

@vzarytovskii
Member

vzarytovskii commented Nov 11, 2022

One problem is that many C# code generators rely on partial assemblies/types, which means we either have to:

  • Support those as "augmentations" to types in F#.
  • Allow C# files to be part of an F# project, and use them with an artificial implicit project to run the source generator and produce an assembly which we then use.
    • Multiple problems with this: generators won't really have access to any F# types.
  • Probably a bunch of other options, like cross-compiling F# to C# or something wild like this.

That said, I think we should take TOP libraries which use source gen and see their use-cases, and what's needed from us to support those.

@T-Gro
Member Author

T-Gro commented Nov 11, 2022

Indeed, the solution I had in mind was splitting the user-visible project and doing a separate compilation unit for each block, treating a change of language as a switch into a new unit.
So in this case, there would be 5 (!) compilation units, each referencing its predecessors as separate assemblies.
That would also mean that in the context of an F# project, the .cs files would NOT see each other bidirectionally, and the visibility would follow the project order as it does with F# files.

After those 5 separate compilation units are done, it would of course be good to put them back together into a single .dll. Whether that is doable, I do not know (e.g. whether a .dll created this way, containing output from two different compilers, could cause issues somewhere down the road when consumed).

It might look crazy to do 5 different compilation units, but in the end this is what users do today when separating those into projects manually.
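
The splitting described above can be sketched roughly as follows. This is only an illustration under simple assumptions: files are identified by name, the language is inferred from the extension, and `Language`, `languageOf` and `compilationUnits` are hypothetical names. How each unit would actually be compiled, and whether adjacent .cs files inside one block would see each other, is not modeled.

```fsharp
// Hypothetical sketch: split an ordered file list into compilation units,
// starting a new unit whenever the language changes.
type Language =
    | FSharp
    | CSharp

let languageOf (path: string) : Language =
    if path.EndsWith(".cs", System.StringComparison.OrdinalIgnoreCase) then CSharp else FSharp

/// Groups consecutive files of the same language, preserving project order.
let compilationUnits (files: string list) : (Language * string list) list =
    let folder (file: string) (acc: (Language * string list) list) =
        let lang = languageOf file
        match acc with
        | (l, fs) :: rest when l = lang -> (l, file :: fs) :: rest
        | _ -> (lang, [ file ]) :: acc
    List.foldBack folder files []

let units =
    compilationUnits [ "A.fs"; "B.fs"; "Gen1.cs"; "C.fs"; "Gen2.cs"; "Gen3.cs"; "D.fs" ]
// units:
//   (FSharp, ["A.fs"; "B.fs"]); (CSharp, ["Gen1.cs"]); (FSharp, ["C.fs"]);
//   (CSharp, ["Gen2.cs"; "Gen3.cs"]); (FSharp, ["D.fs"])
```

Each unit would then reference all units before it, which is the same layering users build by hand today with separate projects.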

image

@En3Tho
Contributor

En3Tho commented Nov 14, 2022

One of the ways is maybe trying to embed a C# code piece to F#
Many simple but useful things like LibraryImportGenerator or RegexGenerator only use single partial method and an attribute to flag source generation. I guess they can be a goal for a start?

F# code ...
```csharp // like an md for example
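public partial class FastRegex
{
    // GeneratedRegex is the attribute name shipped in .NET 7 (RegexGenerator in earlier previews)
    [GeneratedRegex("WowF#")]
    public static partial Regex MyCoolRegex();
}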
``` //

if FastRegex.MyCoolRegex().IsMatch(...) then

Pros:

  1. It sorta has a natural fit to F#, in the sense that code above won't know about MyCoolRegex and code below will (at least this is the idea).
  2. You don't need to make a dedicated file for this.

Cons:

  1. Looks out of place.
  2. All the ceremony with files is still there: we need to think about how to extract this code bit to a dedicated file, pass it to Roslyn, import it back, place breakpoints, etc., and also how to restrict accessibility.
  3. Need strict rules about where such code can be placed (I guess inside a namespace only, or inside a module, but a namespace feels easier to do).

One of the options is trying to revive F# -> Roslyn interop.
But as @vzarytovskii stated, F# needs to have support for "partial" at least.

With the new "file" modifier I believe some of the complexity is gone, because generators do not need to scan assemblies for similar type names, resolve conflicts, etc. This might be easier to do now.

Pros:

  1. Do not need .cs files at all (at this stage at least), feels very natural to F#

Cons:

  1. Need to create both export and import to Roslyn / from Roslyn: export F# AST => C# AST for SG, wait for SG, import C# AST => F# AST (virtually or in other ways)
type Regex with
    [<RegexGenerator("WowF#")>]
    static member MyCoolRegex() = partial // keyword?

if Regex.MyCoolRegex().IsMatch(...) ...

The main idea behind these ideas is to make the generated stuff visible to the code right below it, so as not to introduce a "hard" split in the code and its logic. I believe this might be one of the hardest things?

@vzarytovskii
Member

One of the options is trying to revive F# -> Roslyn interop.

This would be an extremely fragile solution and would require constant changes to adapt to all Roslyn changes.

@dsyme
Contributor

dsyme commented Nov 15, 2022

This is a big topic, and I like your framing @T-Gro. Above all it's very important to approach anything in this space from the perspective of "how are we going to implement this", including in the IDE. Anything here requires very deep changes to how compilation and analysis proceed and needs very close attention to detail.

On the whole I'm going to stay out of this directly - it's important, but not my battle :) I'll jot a few notes which might be useful.

  • The framing you have is good - "think of it as a single C# file in an F# compilation" - as are the subsequent discussions about projects etc.

  • There is an existing mechanism to inject arbitrary .NET content mid-way through the F# compilation process - generative type providers. Generated types are provided by handing over an assembly and the types are rewritten at the IL level to become part of the output assembly. It's worth noting we added this for very similar reasons - .NET 1-3.x libraries were using C# code generation tools and we wanted that available in F#.

    The handover is actually pretty simple - the generative type provider reports DLLs to incorporate (via provided types that have a different assembly), and the types are rewritten and renamed as part of compilation. There is no integration with projects or build (so the TP must detect changes in inputs - e.g. DB schema - and report invalidation), and the TP has no access to types generated in the current assembly.

    I believe these could today be used to host arbitrary C# code in a CSharpProvider<" cs code ">.

    Is this a useful starting point? I'm not sure. At the high level there's no reason the TP architecture couldn't be modified a bit to allow that input to be in a source generator file instead (and if necessary adjusted so no explicit declaration is even needed in source code). Does that get close enough that you could extend and modify the mechanism to host source generators? I'm not sure. Maybe. It's worth thinking about.

    1. Certainly F# RFC FS-1023, type providers generating types from types (https://github.com/fsharp/fslang-design/blob/main/RFCs/FS-1023-type-providers-generate-types-from-types.md), is necessary. That would be a very powerful addition to F# in any case.

    2. Some holes may need to be fixed in what TPs can provide.

    3. The C# code may provide partial types, as mentioned above. But perhaps the TP architecture could be adjusted to allow merging of types into F# types.

    4. Regarding IDE builds, dependencies, projects and so on: the TP architecture would need to be adjusted/enriched to allow the TP to actively host a design-time build for the C# project. Or else the TP would simply be re-run in some non-incremental mode. I'm not sure.

One advantage of using an extensibility point is that the code generator and Roslyn compilation would be held "at arm's length", hosted in the TP. Further, you could version that component separately. In principle you could alternatively design a different extensibility point that achieved a similar thing. I've got a feeling it would look a lot like generative TPs.

Anyway, on the whole I'd recommend having a good think about factoring things this way. That is, via an extension point, rather than direct integration. Maybe F#-for-.NET would then come with a RoslynSourceGenerator TP thing with all the build logic automagically hooked up. Maybe not. But decoupling may be very valuable here.

If you did go down the route of extending the existing TP mechanism, other good things could potentially drop out, e.g.

Some general comments - I personally think F#'s future existence is firmly rooted in being both a Javascript language and .NET language - and we should assess everything we do from this perspective. We must also focus on F#'s own existence as its own set of libraries and ecosystem, rather than always being downstream from .NET change and churn - most of which is now frankly treating .NET as a single-language ecosystem.

To put that in perspective, in the past 90% of our efforts have been to interoperate with .NET assets. While that's been great for properly-designed truly cross-language core libraries, it's often not turned out to be very fruitful for anything that involves complex compilation (e.g. IDE tooling using code generation, likewise database and service generators). We can burn a lot of time and energy to interoperate with these libraries, and doing so can suck us into very deep dependencies on C# both technically and culturally. So I recommend looking for an approach to this that is fundamentally F#-first, where what you want drops out as an instance of a more generic capability.

@En3Tho
Contributor

En3Tho commented Nov 24, 2022

One of the problems I recently hit when trying to make a wrapper around Blazor (as thin as it could possibly be) is that it's currently impossible to inline AST or make a type partial. There is Myriad, and it's a good tool I guess, but it suffers from this limitation too. I can actually imagine partial modules. That should be the option with the fewest limitations and obstacles. Not sure about partial types though. @dsyme can you please share whether you have ever given a thought to partial modules/types?

To illustrate the situation:
Consider we have a component like this:

type HelloWorldFSharp() =
    inherit ComponentBase()

    [<Parameter; EditorRequired>]
    member val Name = "" with get, set

    [<Parameter>]
    member val Name2 = "F#" with get, set

    override this.BuildRenderTree(builder) =
        builder.Render(blazor {
            h1 {
                $"Hello, {this.Name} from {this.Name2}!"
            }
        })

The obstacle is that Name and Name2 are set via RenderTreeBuilder, meaning not just directly as Name = ... and Name2 = ... So I've decided that codegen is the best thing I can do here:

[<AutoOpen>]
module HelloWorldFSharp__Import =
    open FSharpComponents
    open System

    type [<Struct; IsReadOnly>] HelloWorldFSharp__Import(builder: BlazorBuilderCore) =

        member this.Name2 with set(value: String) =
            builder.AddAttribute("Name2", value)

        interface IComponentImport with
            member _.Builder = builder

    type HelloWorldFSharp with
        static member inline Render(builder: BlazorBuilderCore, name: String) =
            builder.OpenComponent<HelloWorldFSharp>()
            builder.AddAttribute("Name", name)
            HelloWorldFSharp__Import(builder)

And then import:

type Importer() =
    inherit ComponentBase()
    override this.BuildRenderTree(builder) =
        builder.Render(blazor {
            fun b -> HelloWorldFSharp.Render(b, "C#", Name2 = "VB")
        })

The problem is that this generated import and the Render extension should live right below the HelloWorldFSharp type. Currently that is impossible unless you write this thingy by hand.

@jkone27
Contributor

jkone27 commented Sep 30, 2024

linking also Orleans (as sdk requires code generation in C#) dotnet/orleans#5772. i think F# has extra interest for BEAM/erlang/elixir devs eventually checking out .NET "beam" alternatives as code is much closer for list pattern matching etc, and has |> pipeline op as well

@T-Gro
Member Author

T-Gro commented Sep 30, 2024

If I am looking correctly, Orleans does support F# types referenced from a C# project acting as an intermediate for the Orleans codegen.
See e.g. here: https://github.com/dotnet/orleans/blob/ad8d22d3e6427ebb6af5ffe32a5c3b911b7595ec/src/Orleans.CodeGenerator/SyntaxGeneration/FSharpUtils.cs

I cannot find much else though; it would be better to find someone who has used Orleans with F# to share tips.
