regression: dead heap allocations aren't optimized out anymore #24194

oli-obk · 2015-04-08T14:59:25Z

#22159 was closed after an LLVM update (#22526). It used to work (I remember trying it out), but now it no longer does.

fn main() {
    let _ = Box::new(42);
}

http://is.gd/Wekr7w

LLVM-IR:

; ModuleID = 'rust_out.0.rs'
target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: uwtable
define internal void @_ZN4main20h657e6a8d1dc11120eaaE() unnamed_addr #0 {
entry-block:
  %0 = tail call i8* @je_mallocx(i64 4, i32 0)
  %1 = icmp eq i8* %0, null
  br i1 %1, label %then-block-57-.i.i, label %"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit"

then-block-57-.i.i:                               ; preds = %entry-block
  tail call void @_ZN3oom20he7076b57c17ed7c6HYaE()
  unreachable

"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit": ; preds = %entry-block
  %2 = bitcast i8* %0 to i32*
  store i32 42, i32* %2, align 4
  %3 = icmp eq i8* %0, inttoptr (i64 2097865012304223517 to i8*)
  br i1 %3, label %"_ZN14Box$LT$i32$GT$8drop.86517h1e7c6ecb62969b35E.exit", label %cond.i

cond.i:                                           ; preds = %"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit"
  tail call void @je_sdallocx(i8* %0, i64 4, i32 0)
  br label %"_ZN14Box$LT$i32$GT$8drop.86517h1e7c6ecb62969b35E.exit"

"_ZN14Box$LT$i32$GT$8drop.86517h1e7c6ecb62969b35E.exit": ; preds = %"_ZN5boxed12Box$LT$T$GT$3new19h823444268625886800E.exit", %cond.i
  ret void
}

define i64 @main(i64, i8**) unnamed_addr #1 {
top:
  %2 = tail call i64 @_ZN2rt10lang_start20he050f8de3bcc02b7VRIE(i8* bitcast (void ()* @_ZN4main20h657e6a8d1dc11120eaaE to i8*), i64 %0, i8** %1)
  ret i64 %2
}

declare i64 @_ZN2rt10lang_start20he050f8de3bcc02b7VRIE(i8*, i64, i8**) unnamed_addr #1

declare noalias i8* @je_mallocx(i64, i32) unnamed_addr #1

; Function Attrs: cold noinline noreturn
declare void @_ZN3oom20he7076b57c17ed7c6HYaE() unnamed_addr #2

declare void @je_sdallocx(i8*, i64, i32) unnamed_addr #1

attributes #0 = { uwtable "split-stack" }
attributes #1 = { "split-stack" }
attributes #2 = { cold noinline noreturn "split-stack" }

oli-obk · 2015-04-08T15:02:07Z

Hmm, I'm not so sure it's a regression. It still works for vectors: http://is.gd/yNAFF6. Maybe it never worked for boxes?

steveklabnik · 2016-05-24T21:16:00Z

So, where do we go with this ticket? Was/is this a regression? Is it worth tracking?

dotdash · 2016-05-24T23:23:52Z

@steveklabnik Probably a regression from when we moved from actual zeroing-drop to using the non-zero drop pattern. The check for that pattern basically says "leak this if the address matches the drop pattern", so LLVM cannot remove the allocation because that would change semantics. Dynamic drop will fix this, and I think we should keep this open to track that, just to make sure.
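As a rough illustration of that check, the constant compared against in the IR from the original report (`inttoptr (i64 2097865012304223517 to i8*)`) is the byte `0x1d` repeated across a 64-bit word. The sketch below only models the branch; the names `DROP_PATTERN` and `glue_would_free` are made up for illustration and are not taken from rustc.

```rust
/// Assumed name for illustration: the byte 0x1d repeated across a 64-bit word.
const DROP_PATTERN: u64 = 0x1d1d_1d1d_1d1d_1d1d;

/// Hypothetical model of the branch in the IR above: the box is freed
/// unless its address happens to equal the drop pattern.
fn glue_would_free(addr: u64) -> bool {
    addr != DROP_PATTERN
}

fn main() {
    // The decimal literal in the IR is exactly this pattern.
    assert_eq!(DROP_PATTERN, 2097865012304223517);
    assert_eq!(DROP_PATTERN, u64::from_ne_bytes([0x1d; 8]));

    // An ordinary address is freed; the pattern itself would be leaked.
    assert!(glue_would_free(0x1000));
    assert!(!glue_would_free(DROP_PATTERN));
}
```

Because the "leak" branch is observable in principle, removing the `je_mallocx`/`je_sdallocx` pair would change what the function does on that path, which matches the explanation above.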

wesleywiser · 2017-03-04T20:31:14Z

This appears to be fixed: https://is.gd/kT0Gs9 Is this worth creating some kind of regression test for?

oli-obk · 2017-03-04T22:49:45Z

We could test it by adding a run pass test that uses a custom allocator (which doesn't allocate, but panics).
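A minimal sketch of that idea on current stable Rust, with one deliberate change: the allocator counts calls instead of panicking, since the `GlobalAlloc` documentation says unwinding out of a global allocator is undefined behavior. `Counting`, `ALLOCS` and `allocations_during` are names invented for this sketch, and whether the dead box is actually elided depends on the optimization level and compiler version, so the sketch prints the count rather than asserting it.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: forwards to the system allocator and counts calls.
struct Counting;

static ALLOCS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: Counting = Counting;

/// Number of allocation calls observed while `f` runs.
/// Other threads allocating at the same time would add noise.
fn allocations_during(f: impl FnOnce()) -> usize {
    let before = ALLOCS.load(Ordering::Relaxed);
    f();
    ALLOCS.load(Ordering::Relaxed) - before
}

fn main() {
    // The case from the report: the box is never used.
    let dead = allocations_during(|| {
        let _ = Box::new(42);
    });
    // A box the optimizer is asked to treat as used.
    let live = allocations_during(|| {
        std::hint::black_box(Box::new(42));
    });
    println!("dead box: {dead} allocation(s), live box: {live} allocation(s)");
}
```

Comparing the two counts in a build with optimizations enabled would show whether the dead allocation survived.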

bluss · 2017-03-04T23:20:43Z

tentatively needstest, then. Maybe a src/test/codegen test?

nagisa · 2017-05-22T19:01:11Z

Seems to not work again on nightly.

est31 · 2017-05-22T21:50:25Z

I've used @Mark-Simulacrum's bisection tool to find the regressing commit. It was the LLVM 4.0 upgrade, commit 0777c75.

nagisa · 2017-06-01T16:24:28Z

Assigning myself just so it stays on my to-do list. Not immediately working on it, though. Feel free to take over if you want.

alexcrichton · 2017-06-01T16:26:03Z

Historically this optimization was done by rust-lang/llvm@4daef48, but that commit was not carried forward to our current branch when the 4.0 upgrade was done because it no longer applies cleanly (IIRC).

nagisa · 2017-06-03T15:25:09Z

Sadly, I couldn’t make the patch above work. In fact, some testing seems to indicate that the responsibility for eliding allocations has been moved out of LLVM into clang.

Namely, code like this

#include<stdlib.h>
void wowzers(int y) {
    void *z = malloc(y);
    if(z != NULL) free(z);
}

when compiled with clang test.c -emit-llvm -S -O3 -fno-builtin (clang version being 4.0) still has the allocations in the produced IR:

; Function Attrs: nounwind sspstrong uwtable
define void @wowzers(i32) local_unnamed_addr #0 {
  %2 = zext i32 %0 to i64
  %3 = tail call noalias i8* @malloc(i64 %2) #2
  %4 = icmp eq i8* %3, null
  br i1 %4, label %6, label %5
; <label>:5:                                      ; preds = %1
  tail call void @free(i8* nonnull %3) #2
  br label %6
; <label>:6:                                      ; preds = %1, %5
  ret void
}

whereas if the special handling of the built-ins is not disabled (i.e. without -fno-builtin), it compiles to a ret void even with -O0.

This seems to suggest to me that, unless we do some serious patchwork on LLVM (pretty sure we don’t want to do that), it falls onto rustc to optimise out heap allocations now.

This could also very well be a bug. I have a test case on hand that does optimise out on 3.9 but not on 4.0. I might do a bisection.

nagisa · 2017-06-03T16:10:54Z

Okay, never mind. I got it to work.

nagisa · 2017-06-15T16:14:42Z

The LLVM upgrade reintroduces the optimisation for __rust_allocate, but the optimisation for __rust_allocate_zeroed was backed out due to weird UB-like behaviour with bitvec iterators in rustc_data_structures.

Should investigate eventually. Assigning myself.

nagisa · 2017-06-15T16:15:04Z

cc @arielb1 you were interested in this yesterday.

alexcrichton · 2017-07-23T15:31:40Z

I believe this is no longer a regression, so untagging as a regression.

nox · 2018-04-02T11:37:29Z

It's not very clear to me whether this is still actionable. The original snippet doesn't seem to be misoptimised anymore. Can someone produce a new snippet that demonstrates the continued existence of the issue? Should the issue be closed?

Cc @rust-lang/wg-codegen

nagisa · 2018-04-03T09:13:40Z

This should stay open, as only a handful of allocating functions are currently handled (i.e. malloc equivalent but not realloc IIRC).

nagisa · 2018-04-03T09:15:03Z

Or was it calloc-equivalent?

nox · 2018-04-03T09:25:41Z

@nagisa Any snippet demonstrating the issue?

nagisa · 2018-04-03T10:24:59Z

#![feature(allocator_api)]
use std::heap::{Heap, Layout, Alloc};

pub unsafe fn alloc_zeroed_doesnt_optimise() {
    let _ = Heap.alloc_zeroed(Layout::from_size_align_unchecked(4, 8));
}

extern "C" {
    fn foo(x: *mut u8);
}

pub unsafe fn alloc_zeroed_should_optimise_rezeroing() {
    let a = Heap.alloc_zeroed(Layout::from_size_align_unchecked(16, 8)).unwrap();
    let slc = ::std::slice::from_raw_parts_mut(a, 16);
    for i in slc.iter_mut() {
        *i = 0;
    }
    foo(slc.as_mut_ptr());
}

shepmaster · 2018-07-05T15:07:48Z

For searchability purposes, the functions are now called __rust_alloc_zeroed and __rust_dealloc.
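For readers on current stable Rust, the following is a rough equivalent of the earlier snippet using the free functions in `std::alloc`; as far as I know these are the entry points that end up calling the symbols named above. The function name `dead_zeroed_alloc` is made up for this sketch. The re-check of the zeroed bytes is exactly the kind of redundant work one would hope the optimizer removes together with the allocation.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

/// Allocates a zeroed 16-byte block, verifies it is zeroed, and frees it.
/// Returns `false` only if the allocation itself failed.
pub fn dead_zeroed_alloc() -> bool {
    let layout = Layout::from_size_align(16, 8).expect("valid layout");
    unsafe {
        let p = alloc_zeroed(layout);
        if p.is_null() {
            return false;
        }
        // Redundant by construction: the block was just handed out zeroed.
        let all_zero = std::slice::from_raw_parts(p, layout.size())
            .iter()
            .all(|&b| b == 0);
        dealloc(p, layout);
        all_zero
    }
}

fn main() {
    assert!(dead_zeroed_alloc());
}
```

Inspecting the optimized IR or assembly for `dead_zeroed_alloc` is one way to see whether the allocation and the check were folded away on a given toolchain.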

shepmaster · 2018-07-05T15:09:17Z

This was noticed in the wild on Stack Overflow.

nikic · 2018-11-05T15:18:00Z

It would be nice if we didn't have to patch LLVM to support custom allocation functions. Unfortunately the last discussion on that topic didn't really end up anywhere: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093625.html

oli-obk · 2018-11-05T15:31:25Z

The only way around that is to guarantee this optimization via MIR optimizations. But our MIR optimization story is nowhere near the level required for that.

arthurprs · 2018-11-05T15:34:08Z

Couldn't this be generalized by having a "pure, no side-effects" unsafe attribute?

This obviates the patch that teaches LLVM internals about _rust_{re,de}alloc functions by putting annotations directly in the IR for the optimizer. The sole test change is required to anchor FileCheck to the body of the `box_uninitialized` method, so it doesn't see the `allocalign` on `__rust_alloc` and get mad about the string `alloca` showing up. Since I was there anyway, I added some checks on the attributes to prove the right attributes got set. While we're here, we also emit allocator attributes on __rust_alloc_zeroed. This should allow LLVM to perform more optimizations for zeroed blocks, and probably fixes rust-lang#90032. [This comment](rust-lang#24194 (comment)) mentions "weird UB-like behaviour with bitvec iterators in rustc_data_structures" so we may need to back this change out if things go wrong. The new test cases require LLVM 15, so we copy them into LLVM 14-supporting versions, which we can delete when we drop LLVM 14.

steveklabnik added the A-codegen Area: Code generation label Apr 9, 2015

Gankra mentioned this issue Aug 10, 2015

RFC: remove weak pointers rust-lang/rfcs#1232

Closed

bluss added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Mar 4, 2017

nagisa mentioned this issue May 22, 2017

Use simple io::Error for &[u8] Read and Write impl #42156

Closed

brson added I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. labels Jun 1, 2017

nagisa self-assigned this Jun 1, 2017

nagisa mentioned this issue Jun 3, 2017

Upgrade LLVM #42410

Merged

wesleywiser mentioned this issue Jun 9, 2017

Box::new() performance regression between 1.18 and 1.19 beta #42562

Closed

bors closed this as completed in 8938269 Jun 16, 2017

nagisa reopened this Jun 16, 2017

Mark-Simulacrum added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jul 22, 2017

alexcrichton removed the regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. label Jul 23, 2017

nagisa mentioned this issue Sep 13, 2021

String cloning is not optimized the same way as String construction #88905

Closed

nikic mentioned this issue Oct 19, 2021

Missed optimization for unused zero-initialized vectors #90032

Closed

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023