Figure out which LLVM optimisation passes are worth enabling #595

Open
yorickpeterse opened this issue Jul 19, 2023 · 5 comments
Labels
accepting contributions Issues that are suitable to be worked on by anybody, not just maintainers compiler Changes related to the compiler
Milestone

Comments

@yorickpeterse
Collaborator

Right now the only optimisation pass we enable is the mem2reg pass, because that's pretty much a requirement for non-insane machine code. We deliberately don't use the O2/O3 options as they enable far too many optimisation passes, and don't give you the ability to opt-out of some of them (Swift takes a similar approach).

We should start collecting a list of what passes are worth enabling, and ideally what the compile time cost is versus the runtime improvement. The end goal is to basically enable the passes that give a decent amount of runtime performance improvements, but without slowing down compile times too much.
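One way to frame that trade-off is as two ratios per candidate set of passes: how much slower compilation gets, and how much faster the resulting program runs. The following is only a sketch. `Measurement` and `compare` are hypothetical names, not part of Inko's compiler, and the timings would have to come from real builds and runs.

```rust
use std::time::Duration;

/// Timings for one compiler configuration (hypothetical type, for illustration).
pub struct Measurement {
    /// Wall-clock time spent compiling the program.
    pub compile: Duration,
    /// Wall-clock time spent running the compiled program.
    pub run: Duration,
}

/// Returns `(compile-time slowdown, runtime speedup)` of `candidate`
/// relative to `baseline`. Both values are plain ratios, so 1.0 means
/// "no change". Zero durations are not handled and yield inf or NaN.
pub fn compare(baseline: &Measurement, candidate: &Measurement) -> (f64, f64) {
    let slowdown = candidate.compile.as_secs_f64() / baseline.compile.as_secs_f64();
    let speedup = baseline.run.as_secs_f64() / candidate.run.as_secs_f64();
    (slowdown, speedup)
}
```

A pass set would then be worth enabling by default when its speedup is clearly larger than its slowdown, though where exactly to draw that line is a judgement call.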

@yorickpeterse yorickpeterse added accepting contributions Issues that are suitable to be worked on by anybody, not just maintainers compiler Changes related to the compiler labels Jul 19, 2023
@yorickpeterse
Collaborator Author

From jinyus/related_post_gen#440 (comment): using OptimizationLevel::Aggressive can have a big impact on the performance compared to None. In itself this isn't surprising, because of course optimizations are beneficial. I however would like to know (somehow) which optimizations are worth enabling, rather than just enabling something as opaque as -O3.

Perhaps as a starting point we can just set that option when using inko build --aggressive, then figure out which ones to explicitly enable for regular builds.

yorickpeterse added a commit that referenced this issue Nov 17, 2023
When using `inko build --opt=aggressive`, we now set LLVM's optimization
level to "aggressive", which is the equivalent of -O3 for clang. This
gives users the ability to have their code optimized at least somewhat,
provided they're willing to deal with the significant increase in
compile times. For example, Inko's test suite takes about 3 seconds to
compile without optimizations, while taking just under 10 seconds when
using --opt=aggressive.

The option --opt=balanced still doesn't apply optimizations as we've yet
to figure out which ones we want to explicitly opt-in to.

See #595 for more details.

Changelog: performance
@yorickpeterse
Collaborator Author

1a30de9 changes inko build such that --opt=aggressive applies the equivalent of clang's -O3. This significantly increases compile times, but it's better than nothing until we come up with our own list of passes to enable.

@yorickpeterse yorickpeterse modified the milestones: 0.18.0, 0.19.0 Oct 22, 2024
@yorickpeterse
Collaborator Author

At least the following passes are worth looking into more, based on playing around with them to see what effect they have:

  • instcombine
  • gvn
  • sroa (gets rid of redundant alloca instructions and their loads/stores)
  • simplifycfg (simplifies the CFG, mostly useful for debugging I think)

@yorickpeterse
Collaborator Author

Worth adding: even with --opt=aggressive, certain methods such as Int.% aren't performing very well by the looks of it. For example, take this snippet (based on https://github.com/bddicken/languages):

import std.env (arguments)
import std.int (Format)
import std.rand (Random)
import std.stdio (Stdout)

class async Main {
  fn async main {
    let out = Stdout.new
    let rand = Random.new
    let n = Int.parse(arguments.get(0), Format.Decimal).get
    let r = rand.int_between(0, 10_000)
    let a = Array.filled(with: 0, times: 10_000)
    let mut i = 0

    while i < 10_000 {
      let mut j = 0

      while j < 100_000 {
        a.set(i, a.get(i) + (j % n))
        j += 1
      }

      a.set(i, a.get(i) + r)
      i += 1
    }

    let _ = out.print(a.get(r).to_string)
  }
}

On my laptop this takes 24 seconds to run, with about 80% of the time being spent in the code of Int.%. Oddly enough, even if I just reduce that to _INKO.int_rem() it still takes more or less the same amount of time.

I'm not sure how on earth this code is that slow, given that Rust does it in about 2.5 seconds.
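For reference, the hot loop of the Inko snippet above translates to Rust roughly as follows. This is a sketch written for this comparison, not necessarily the exact program that was timed: the loop bounds are parameters so it can be exercised with small inputs, and `r` is passed in instead of being drawn from a random number generator. Rust's `%` is a truncating remainder; whether Inko's `Int.%` does the same amount of work per call is not established here.

```rust
/// Runs the nested-loop benchmark and returns the resulting array.
///
/// The Inko version corresponds to `outer = 10_000` and `inner = 100_000`,
/// with `n` read from the command line and `r` chosen at random.
/// Panics if `n` is zero, because `%` by zero panics in Rust.
pub fn run(n: i64, r: i64, outer: usize, inner: i64) -> Vec<i64> {
    let mut a = vec![0_i64; outer];

    for i in 0..outer {
        for j in 0..inner {
            // Same update as `a.set(i, a.get(i) + (j % n))`.
            a[i] += j % n;
        }

        // Same update as `a.set(i, a.get(i) + r)`.
        a[i] += r;
    }

    a
}
```

Calling `run(n, r, 10_000, 100_000)` and printing element `r` of the result mirrors what the Inko program does.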

@yorickpeterse
Collaborator Author

Curiously, the above program finishes in only 3.68 seconds on my desktop. Perhaps the Intel CPU on my laptop is just really terrible at this code for some reason?

yorickpeterse added a commit that referenced this issue Nov 29, 2024
Depending on how LLVM decides to optimize things, these attributes may
help improve code generation, though it's difficult to say for certain
how much at this stage.

See #595 for more details.

Changelog: performance

1 participant