This repository was archived by the owner on Apr 4, 2022. It is now read-only.

Runtime Error in `/llvm-project/build_repo/bin/clang-tblgen': double free or corruption (out): 0x00007fe778296350 #102

Open
MaggieYingYi opened this issue Jul 7, 2020 · 11 comments
Labels: bug (Something isn't working)

Comments

@MaggieYingYi

After updating LLVM to LLVM 10 and merging the new hash algorithm back into the master branch, the Wiki page (https://github.com/SNSystems/llvm-project-prepo/wiki/Compile-Faster-with-the-Program-Repository-and-ccache) needs to be updated correspondingly. When I built upstream llvm-project using the repo compiler, I hit a double-free memory error.

This comparison was run with the Repo release compiler (llvm-project-prepo commit: [000ebc2] and pstore commit: [13eacb6b]).

The upstream llvm-project in Release configuration is used to demonstrate the performance improvement. It is compiled at several points in its commit history: for seven consecutive days, the first commit of each day is selected to measure the compilation time. Each of those commits is built in turn, keeping the cache contents and the Repo database between builds. I selected seven consecutive days starting from 97c0232 (Sun Jun 21 2020).
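The commit selection described above could be scripted along these lines. This is only a sketch: the helper name first_commit_per_day is hypothetical, and it assumes that the committer date, in the form printed by git log --date=short, decides which day a commit belongs to. That may not match how the commits were actually picked.

```shell
# Hypothetical helper: keep the first commit seen for each calendar day.
# stdin: lines of the form "YYYY-MM-DD <hash>", oldest commit first.
# $1:    number of days to keep (defaults to 7).
first_commit_per_day() {
  days=${1:-7}
  # seen[date] is 0 the first time a date appears, so only the first
  # line for each date is printed; head then limits the number of days.
  awk '!seen[$1]++ { print $1, $2 }' | head -n "$days"
}

# Possible usage inside an llvm-project checkout (assumption: committer
# date, oldest first, starting at the commit named in the issue):
#   git log --reverse --date=short --format='%cd %h' 97c02326^..HEAD \
#     | first_commit_per_day 7
```

The helper only reads text from standard input, so it can be tried on a hand-written list before pointing it at a real repository.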

After updating llvm-project to commit a822ec7, running the generated clang-tblgen crashes with a double-free memory error.

@MaggieYingYi
Author

Run the following commands to reproduce the error.

$ cd llvm-project
// Starting from commit 97c02326 (no database). 
$ git checkout 97c02326
$ mkdir build_repo
$ cd build_repo
$ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_TOOL_CLANG_TOOLS_EXTRA_BUILD=OFF  -DCMAKE_TOOLCHAIN_FILE=/home/maggie/github/llvm-project-prepo/llvm/utils/repo/repo.cmake  -DCMAKE_CXX_COMPILER=/home/maggie/github/llvm-project-prepo/build/bin/clang++ -DCMAKE_C_COMPILER=/home/maggie/github/llvm-project-prepo/build/bin/clang -Dutils_dir=/home/maggie/github/llvm-project-prepo/llvm/utils/repo ../llvm
$ ninja
// Build is fine.

// Updating commit to 792786e3 (with existing database). 
$ git checkout 792786e3
$ ninja
// Build is fine

// Updating commit to a822ec75 (with existing database). 
$ git checkout a822ec75
$ ninja
// The build crashes with the following error.
 Error in `/home/maggie/github/llvm-project/build_repo/bin/clang-tblgen': double free or corruption (out): 0x00007fe778296350 

You can reproduce the error directly with the following command.

cd /home/maggie/github/llvm-project/build_repo && /home/maggie/github/llvm-project/build_repo/bin/clang-tblgen -gen-arm-sve-builtins -I /home/maggie/github/llvm-project/clang/include/clang/Basic -I /home/maggie/github/llvm-project/clang/include -I /home/maggie/github/llvm-project/build_repo/tools/clang/include -I /usr/include/libxml2 -I /home/maggie/github/llvm-project/build_repo/include -I /home/maggie/github/llvm-project/llvm/include /home/maggie/github/llvm-project/clang/include/clang/Basic/arm_sve.td --write-if-changed -o tools/clang/include/clang/Basic/arm_sve_builtins.inc -d tools/clang/include/clang/Basic/arm_sve_builtins.inc.d

@MaggieYingYi
Author

If llvm-project (commit: a822ec7) is compiled with no existing database, both the first and the second build succeed.

$ git checkout a822ec75
$ cd build_repo
$ ninja clean
$ rm clang.db
$ ninja
// The first build (no database) is fine.

$ ninja clean
$ ninja
// The second build (with existing database) is fine.

@MaggieYingYi
Author

The crash occurs when clang-tblgen is run with the option -gen-arm-sve-builtins to generate arm_sve_builtins.inc for clang. This functionality is implemented in SveEmitter.cpp. Compared with the correctly generated SveEmitter.o file, the faulty SveEmitter.o contains an additional function named "_ZSt16__insertion_sortIPSt10unique_ptrIN12_GLOBAL__N_19IntrinsicESt14default_deleteIS2_EEN9__gnu_cxx5__ops15_Iter_comp_iterIZNS1_10SVEEmitter17createRangeChecksERN4llvm11raw_ostreamEE3$3EEEvT_SG_T0", which causes the crash.
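One way to spot an extra symbol like this is to diff the symbol lists of the two object files. The sketch below assumes each list has already been dumped to a text file with one symbol name per line. The tool used to produce those lists is left open, and the helper name extra_symbols is hypothetical.

```shell
# Hypothetical helper: print symbols that appear only in the second list.
# $1: symbol list of the known-good object file (one name per line)
# $2: symbol list of the suspect object file (one name per line)
extra_symbols() {
  good_sorted=$(mktemp)
  bad_sorted=$(mktemp)
  # comm needs both inputs sorted with the same collation as itself,
  # so the C locale is forced for sort and comm alike.
  LC_ALL=C sort -u "$1" > "$good_sorted"
  LC_ALL=C sort -u "$2" > "$bad_sorted"
  # -13 suppresses lines unique to the first file and lines common to
  # both, leaving only the lines unique to the second file.
  LC_ALL=C comm -13 "$good_sorted" "$bad_sorted"
  rm -f "$good_sorted" "$bad_sorted"
}

# Possible usage (file names are placeholders):
#   extra_symbols SveEmitter.good.syms SveEmitter.bad.syms
```

Swapping the two arguments lists symbols that are missing from the suspect file instead.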

@paulhuggett added the bug label on Jul 13, 2020
@MaggieYingYi
Author

Analysis:

The _ZSt16__insertion_sortIPSt10unique_ptrIN12_GLOBAL__N_19IntrinsicESt14default_deleteIS2_EEN9__gnu_cxx5__ops15_Iter_comp_iterIZNS1_10SVEEmitter17createRangeChecksERN4llvm11raw_ostreamEE3$3EEEvT_SG_T0 is a discardable GO.

When compiling SveEmitter.cpp without an existing database, the function exists during the RepoMetadataPass but is removed by a later optimisation pass (one that runs after the RepoMetadataPass).

When compiling SveEmitter.cpp with an existing database (clang.db) that already contains _ZSt16__insertion_sortIPSt10unique_ptrIN12_GLOBAL__N_19IntrinsicESt14default_deleteIS2_EEN9__gnu_cxx5__ops15_Iter_comp_iterIZNS1_10SVEEmitter17createRangeChecksERN4llvm11raw_ostreamEE3$3EEEvT_SG_T0, the function is pruned. A pruned GO is always emitted in the repo object file. Therefore, the function is emitted into the object file.

I think this issue is a duplicate of #55.

@MaggieYingYi
Author

A possible solution is mentioned in #55 (comment): functions/variables that have internal, private or linkonce linkage are only emitted into the compilation file if they are referenced.

I have implemented this possible solution in the branch https://github.com/SNSystems/llvm-project-prepo/tree/issue102_ObjectWriter. However, _ZSt16__insertion_sortIPSt10unique_ptrIN12_GLOBAL__N_19IntrinsicESt14default_deleteIS2_EEN9__gnu_cxx5__ops15_Iter_comp_iterIZNS1_10SVEEmitter17createRangeChecksERN4llvm11raw_ostreamEE3$3EEEvT_SG_T0 is not removed and is still in the ticket file.

Reason:

I used the print-after-all flag to print the IR after each pass.
Since _ZSt16__insertion_sortIPSt10unique_ptrIN12_GLOBAL__N_19IntrinsicESt14default_deleteIS2_EEN9__gnu_cxx5__ops15_Iter_comp_iterIZNS1_10SVEEmitter17createRangeChecksERN4llvm11raw_ostreamEE3$3EEEvT_SG_T0 is pruned (with the existing database), its linkage type is changed from internal to available_externally. This linkage change prevents the other passes from optimising the function away: it is not removed and is still called by other functions. Therefore, it still exists in the ticket file.

Since we want to generate the same ticket file when compiling the same source file with or without an existing database, the above solution does not work.
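A simple check for this requirement is to compare the outputs of the two compilations byte for byte. This is a sketch under the assumption that a byte-for-byte comparison is the right notion of "same" here. The helper name check_same_output and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper: report whether two output files are identical.
# $1: output produced without an existing database
# $2: output produced with an existing database
check_same_output() {
  # cmp -s prints nothing and only sets the exit status:
  # 0 if the files are identical, non-zero otherwise.
  if cmp -s "$1" "$2"; then
    echo "identical"
  else
    echo "different"
    return 1
  fi
}

# Possible usage (file names are placeholders):
#   check_same_output no_db/SveEmitter.o with_db/SveEmitter.o
```

The non-zero return status on a mismatch lets the check be used directly in a build or test script.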

@MaggieYingYi
Author

Another possible solution: discardable functions/variables are pruned only if they are referenced by pruned GOs. I have put the implementation in #109 for code review.
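The rule can be illustrated with a toy decision table. It is not the implementation in #109: the input format, the helper name prune_decisions and the idea of passing "referenced by a pruned GO" as a precomputed yes/no flag are all assumptions made for illustration.

```shell
# Hypothetical toy model of the proposed pruning rule.
# stdin: one GO per line with four fields:
#   <name> <discardable|external> <in_db: yes|no> <referenced_by_pruned: yes|no>
# stdout: "<name> prune" or "<name> compile"
prune_decisions() {
  awk '{
    discardable = ($2 == "discardable")
    in_db       = ($3 == "yes")
    referenced  = ($4 == "yes")
    # A GO already in the database is pruned, except that a discardable
    # GO is pruned only when some pruned GO references it.
    if (in_db && (!discardable || referenced))
      print $1, "prune"
    else
      print $1, "compile"
  }'
}

# Possible usage: x is discardable, in the database, and unreferenced.
#   printf 'x discardable yes no\n' | prune_decisions
```

Under this model an unreferenced discardable GO such as x is compiled as usual even though the database already contains it, so later passes remain free to drop it.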

@paulhuggett

A possible solution is mentioned in #55 (comment): The functions/variables, which have internal, private or linkonce linkage type, are only emitted into the compilation if they are referenced.

I have implemented this possible solution in the https://github.com/SNSystems/llvm-project-prepo/tree/issue102_ObjectWriter. However, the _ZSt16__insertion_sortIPSt10unique_ptrIN12_GLOBAL__N_19IntrinsicESt14default_deleteIS2_EEN9__gnu_cxx5__ops15_Iter_comp_iterIZNS1_10SVEEmitter17createRangeChecksERN4llvm11raw_ostreamEE3$3EEEvT_SG_T0 is not removed and still in the ticket file.

I’m going to call _ZSt16__insertion_… “x”; the full mangled name is just way too long to be comprehensible. I assume you mean “compilation” rather than “ticket file” here (#8).

Since x is pruned (with the existing database), its linkage type changed from internal to available_externally. The linkage type change prevents the other passes’ optimisation. x is not removed and still called by other functions. Therefore, it still exists in the ticket file.

(s/ticket file/compilation).

Since we want to generate the same ticket file when compiled the same source file with or without existing database, the above solution does not work

This information should probably be added to #55. Should this be closed as a duplicate of that bug?

@paulhuggett

paulhuggett commented Jul 22, 2020

Analysis:

x is a discardable GO.

When compiling SveEmitter.cpp without an existing database, it exists during the RepoMetadataPass but is removed by an optimization. When compiling the same source file with an existing database which contains x, the function is pruned. A pruned GO will always be emitted to the repo compilation.

I think this issue is duplicate issue of #55.

(I edited the quoted text for clarity.) It’s not yet clear to me why this process results in the program crashing. Could you explain?

@MaggieYingYi
Author

This information should probably be added to #55.

I have added the information to #55 (comment).

@paulhuggett

I think this issue is duplicate issue of #55.

Can this be closed as a duplicate of bug #55?

@MaggieYingYi
Author

Can this be closed as a duplicate of bug #55?

I can't answer your question "It’s not yet clear to me why this process results in the program crashing. Could you explain?". I only know that once issue #55 is solved (PR #109), the crash is solved.
This is the reason why I didn't close this issue.
