Missed optimization for unused zero-initialized vectors #90032

Closed
Herohtar opened this issue Oct 18, 2021 · 4 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@Herohtar

Herohtar commented Oct 18, 2021

Unused zero-initialized vectors don't seem to get optimized away.

If I write a function that initializes a Vec with a non-zero value and call it twice from main(), then building with --release (opt-level=3) optimizes everything away, leaving effectively a no-op:

Code

fn allocate() {
    let _a = vec![1; 100];
}

pub fn main() {
    allocate();
    allocate();
}

Output

example::main:
        ret

Godbolt link

However, if I instead initialize it with zero, the result is drastically different:

Code

fn allocate() {
    let _a = vec![0; 100];
}

pub fn main() {
    allocate();
    allocate();
}

Output

example::main:
        push    rax
        mov     edi, 400
        mov     esi, 4
        call    qword ptr [rip + __rust_alloc_zeroed@GOTPCREL]
        test    rax, rax
        je      .LBB0_2
        mov     esi, 400
        mov     edx, 4
        mov     rdi, rax
        call    qword ptr [rip + __rust_dealloc@GOTPCREL]
        mov     edi, 400
        mov     esi, 4
        call    qword ptr [rip + __rust_alloc_zeroed@GOTPCREL]
        test    rax, rax
        je      .LBB0_2
        mov     esi, 400
        mov     edx, 4
        mov     rdi, rax
        pop     rax
        jmp     qword ptr [rip + __rust_dealloc@GOTPCREL]
.LBB0_2:
        mov     edi, 400
        mov     esi, 4
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

Godbolt link
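To make the leftover assembly easier to read, here is a rough hand-written equivalent of one allocate/free pair in Rust. It is only a sketch: the helper names `buffer_layout` and `alloc_zeroed_then_free` are made up for illustration, and the code approximates what the output does (a 400-byte, 4-byte-aligned zeroed allocation, a null check that ends in `handle_alloc_error`, then a dealloc). It is not what rustc emits internally.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};

/// Layout matching the `mov edi, 400` / `mov esi, 4` arguments: 100 x i32.
/// (Hypothetical helper, not part of the issue.)
fn buffer_layout() -> Layout {
    Layout::array::<i32>(100).expect("layout for 100 i32 values")
}

/// Performs one zeroed allocation and frees it again.
/// Returns true if every byte of the buffer was zero.
fn alloc_zeroed_then_free() -> bool {
    let layout = buffer_layout();
    unsafe {
        // Corresponds to `call __rust_alloc_zeroed`.
        let ptr = alloc_zeroed(layout);
        // Corresponds to `test rax, rax` / `je .LBB0_2`.
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        let all_zero = std::slice::from_raw_parts(ptr, layout.size())
            .iter()
            .all(|&byte| byte == 0);
        // Corresponds to `call __rust_dealloc`.
        dealloc(ptr, layout);
        all_zero
    }
}

fn main() {
    // main() in the issue calls allocate() twice, hence two pairs.
    for _ in 0..2 {
        assert!(alloc_zeroed_then_free());
    }
}
```

Since the buffer is never observed in the original code, both pairs are dead work that one would hope the optimizer removes, as it does in the non-zero case.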

Meta

rustc --version --verbose:

rustc 1.57.0-nightly (4e89811b4 2021-10-16)
binary: rustc
commit-hash: 4e89811b46323f432544f9c4006e40d5e5d7663f
commit-date: 2021-10-16
host: x86_64-unknown-linux-gnu
release: 1.57.0-nightly
LLVM version: 13.0.0

I tried out a few different rustc versions on Godbolt and it seems the behavior started in 1.18.0; before that, the zero-initialized code got optimized away as well.

@Herohtar Herohtar added the C-bug Category: This is a bug. label Oct 18, 2021
@MSxDOS

MSxDOS commented Oct 19, 2021

Reduced:

pub fn main() {
    vec![0; 100];
}

Replacing it with Vec::from([0; 100]) optimizes properly.
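For reference, a small sketch showing that the two spellings build the same vector, so the difference is purely in the generated code. The function names `via_macro` and `via_from_array` are invented for this example. My guess is that the array version avoids the zeroed-allocation path, but that is an assumption and nothing in this thread confirms it.

```rust
/// Builds the vector with the `vec!` macro, as in the reduced example.
fn via_macro() -> Vec<i32> {
    vec![0; 100]
}

/// Builds the same vector from a fixed-size array instead.
fn via_from_array() -> Vec<i32> {
    Vec::from([0; 100])
}

fn main() {
    let a = via_macro();
    let b = via_from_array();
    // Same length and same contents; only the construction path differs.
    assert_eq!(a.len(), 100);
    assert_eq!(a, b);
}
```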

@nikic
Contributor

nikic commented Oct 19, 2021

I believe this is because we don't teach LLVM that __rust_alloc_zeroed is an allocator function. Doing so caused miscompiles in the past, but that was a very long time ago and we should probably try this again.
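One way to see which allocator entry point each form reaches is a counting global allocator. This is a sketch, not a statement about how std is specified: `CountingAlloc`, the two counters and the helper functions are hypothetical names, and the fact that `vec![0; n]` goes through `alloc_zeroed` is a current implementation detail of the standard library (consistent with the `__rust_alloc_zeroed` call in the assembly above). `std::hint::black_box` is used so that the vectors are not optimized out.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counters for the two allocation entry points.
static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
static ZEROED_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Forwards to the system allocator and counts each kind of request.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        ZEROED_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc_zeroed(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of `alloc_zeroed` calls observed while building `vec![0; 100]`.
fn zeroed_calls_for_zero_fill() -> usize {
    let before = ZEROED_CALLS.load(Ordering::Relaxed);
    let v = black_box(vec![0i32; 100]);
    drop(v);
    ZEROED_CALLS.load(Ordering::Relaxed) - before
}

/// Number of plain `alloc` calls observed while building `vec![1; 100]`.
fn alloc_calls_for_one_fill() -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    let v = black_box(vec![1i32; 100]);
    drop(v);
    ALLOC_CALLS.load(Ordering::Relaxed) - before
}

fn main() {
    println!("alloc_zeroed calls for vec![0; 100]: {}", zeroed_calls_for_zero_fill());
    println!("alloc calls for vec![1; 100]: {}", alloc_calls_for_one_fill());
}
```

If the zero-filled vector shows up under `alloc_zeroed` and the non-zero one under `alloc`, that matches the explanation here: only the entry point LLVM does not recognize as an allocator is left behind.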

@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Oct 19, 2021
@durin42
Contributor

durin42 commented Oct 19, 2021

Do you remember anything specific about the miscompiles? I'm knee-deep in that part of LLVM already in an attempt to let us use attributes instead of a static list of functions for allocator function identification anyway...

@nikic
Contributor

nikic commented Oct 19, 2021

@durin42 The issue is mentioned in #24194 (comment), but I don't think anyone ever looked into what the problem was.
