Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eigen Alignment #653

Closed
morrisfranken opened this issue Oct 31, 2018 · 15 comments
Closed

Eigen Alignment #653

morrisfranken opened this issue Oct 31, 2018 · 15 comments

Comments

@morrisfranken
Copy link
Contributor

Hi, I've been experiencing a number of random segfaults while executing functions like RegistrationColoredICP and WritePoseGraph. After debugging a bit, it turns out that these segfaults are coming from incorrect use of Eigen structures. For example, the following code will produce a segfault (compiled in release with gcc5.4 on Ubuntu 16.04, Intel I7-4712MQ)

std::vector<Eigen::Matrix4d> poses;
poses.emplace_back(Eigen::Matrix4d::Identity());
poses.emplace_back(Eigen::Matrix4d::Identity());

Similar to what #88 has found, Eigen structures required a custom allocator when being used in containers like std::vector or std::map (https://eigen.tuxfamily.org/dox/group__TopicStlContainers.html) . In addition,- when defining Eigen variables within a class, these classes must be aligned as well (https://eigen.tuxfamily.org/dox/group__TopicStructHavingEigenMembers.html)

To solve these issues I've created a new fork of Open3D with Eigen Alignment Support: https://github.com/morrisfranken/Open3D . While I can confirm that this branch resolves the segfaults that I was experiencing before, it may break other peoples code.

More over- the Eigen alignment issue stretches beyond simple segfaults within the library. By exposing the Open3D API with Eigen elements, it requires all libraries that use Open3D to be compiled with the same optimisation options as Open3D itself. Otherwise the sizes (in bytes) of Open3D classes may differ compared to a library that is using Open3D. For example,- suppose the following class is exposed in a header of Open3D:

class PoseGraphNode {
public:
    bool unimportant_variable;
    Eigen::Matrix4d mat;

    EIGEN_MAKE_ALIGNED_OPERATOR_NEW
};

void compute(PoseGraphNode &node);

The size of this class will vary based on the compiler flags used:

Compiler Flags sizeof(PoseGraphNode)
-O2/-O3 144
-O2/-O3 -march=native 160
-DEIGEN_MAX_ALIGN_BYTES=0 136

When compiling Open3D with its default settings, the library expects the size of PoseGraphNode to be 144. But when I compile an executable that depends on Open3D with the flag -march=native, and I call the function compute(node), that executable will pass 160 bytes to the function where Open3D only expects 144 which causes undefined behaviour and most likely segfault somewhere down the line.

At this moment I don't see any other way to prevent this behaviour entirely other than disabeling alignment in exposed structs by using Eigen::DontAlign.

@qianyizh
Copy link
Collaborator

qianyizh commented Nov 2, 2018

@morrisfranken any suggestions for us to solve this issue?
@syncle @yxlao we should try to address this. I think multiple people have reported this.

@qianyizh
Copy link
Collaborator

qianyizh commented Nov 2, 2018

I see it in #654 .
This is great! Let me followup with the discussion there.

@qianyizh
Copy link
Collaborator

qianyizh commented Nov 2, 2018

@morrisfranken I tried on a few Ubuntus and still cannot reproduce the error? Can you help us reproduce the error? Maybe package a docker container with the exact environment you have so we can exactly reproduce it. @takanokage can help you set up a docker if you are not familiar with it.

@morrisfranken
Copy link
Contributor Author

I will be able to create a docker this weekend, but can you confirm that the following code runs fine on your machine?

//main.cpp
#include <iostream>
#include <vector>
#include <Eigen/Core>

int main() {
    std::cout << "starting..." << std:: endl;
    
    std::vector<Eigen::Matrix4d> poses;
    poses.emplace_back(Eigen::Matrix4d::Identity());
    poses.emplace_back(Eigen::Matrix4d::Identity());
    
    std::cout << "done!" << std:: endl;
    return 0;
}
#CMakeLists.txt
cmake_minimum_required(VERSION 3.0)
project(test_eigen)

set(CMAKE_CXX_STANDARD 11)

add_compile_options(-O3 -march=native)
add_executable(${PROJECT_NAME} main.cpp)
    
find_package(Eigen3 REQUIRED)
include_directories(${PROJECT_NAME} ${EIGEN3_INCLUDE_DIR})

For me it produces the following output:

$ ./test_eigen
starting...
test_eigen: /usr/include/eigen3/Eigen/src/Core/DenseStorage.h:128: Eigen::internal::plain_array<T, Size, MatrixOrArrayOptions, 32>::plain_array() [with T = double; int Size = 16; int MatrixOrArrayOptions = 0]: Assertion `(reinterpret_cast<size_t>(eigen_unaligned_array_assert_workaround_gcc47(array)) & (31)) == 0 && "this assertion is explained here: " "http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html" " **** READ THIS WEB PAGE !!! ****"' failed.
Aborted (core dumped)

@takanokage
Copy link
Contributor

I confirm that the issue is reproducible with the above test.

The fix for the above test, as suggested in #654 is to replace
std::vector<Eigen::Matrix4d> poses;
with
std::vector<Eigen::Matrix4d, Eigen::aligned_allocator<Eigen::Matrix4d>> poses;

@takanokage
Copy link
Contributor

However I'd also like to point out that the issue is not reproducible if instead of emplace_back we use push_back.

@takanokage
Copy link
Contributor

I might be mistaken but in the past not allocating aligned memory has led to performance issues only and not errors/crashes.

@morrisfranken Could you please tell us more about what you were working on when you encountered this error? I've search the whole Open3D and emplace_back is only used by pybind11 source code.

@takanokage
Copy link
Contributor

@morrisfranken another question: where you able to reproduce the issue without emplace_back?

@takanokage
Copy link
Contributor

This is the code I used:

#include <iostream>
#include <vector>
#include <Eigen/Core>

int main() {
    std::cout << "starting..." << std:: endl;

    // // v1, un-aligned, emplace_back, FAIL
    // std::vector<Eigen::Matrix4d> poses;

    // poses.emplace_back(Eigen::Matrix4d::Identity());
    // poses.emplace_back(Eigen::Matrix4d::Identity());
    // // v1, un-aligned, emplace_back, FAIL

    // // v2, aligned, emplace_back, PASS
    // std::vector<Eigen::Matrix4d, Eigen::aligned_allocator<Eigen::Matrix4d>> poses;

    // poses.emplace_back(Eigen::Matrix4d::Identity());
    // poses.emplace_back(Eigen::Matrix4d::Identity());
    // // v2, aligned, emplace_back, PASS

    // // v3, un-aligned, push_back, PASS
    // std::vector<Eigen::Matrix4d> poses;

    // poses.push_back(Eigen::Matrix4d::Identity());
    // poses.push_back(Eigen::Matrix4d::Identity());
    // // v3, un-aligned, push_back, PASS

    std::cout << "done!" << std:: endl;
    return 0;
}

Uncomment one section at a time (v1/v2/v3) in order to build/run the code.

@morrisfranken
Copy link
Contributor Author

These are a few more cases in which is segfaults for me without emplace_back

#include <iostream>
#include <vector>
#include <unordered_map>
#include <Eigen/Core>

int main() {
    std::cout << "starting..." << std:: endl;
    
    std::unordered_map<int, Eigen::Matrix4d> posemap;
    posemap[0] = Eigen::Matrix4d::Identity();
    posemap[1] = Eigen::Matrix4d::Identity();
    posemap[2] = posemap[0];
        
    std::vector<Eigen::Matrix4d> poses(3);
    poses[0] = Eigen::Matrix4d::Identity();
    poses[1] = Eigen::Matrix4d::Identity();
    poses[2] = poses[0];
    
    std::cout << "done!" << std:: endl;
    return 0;
}

I'm however not an expert on Eigen and don't know exactly why it produces a segfault in the examples above. But it seems we are entering the realm of undefined behaviour, and even if it does not produce a segfault right away,- it may appear later in the program out of nowhere.

The Eigen documentation does not indicate the consequences of not using the allocator, but they are clear on that you have to use it (in my case, even causing the program to crash due to an assert):

The situation with std::vector was even worse (explanation below) so we had to specialize it for the Eigen::aligned_allocator type. In practice you must use the Eigen::aligned_allocator (not another aligned allocator), and #include <Eigen/StdVector>.
(http://eigen.tuxfamily.org/dox-devel/group__TopicStlContainers.html)

To answer your question earlier, I was mainly working with RegistrationColoredICP. But it did not always produce a segfault (still more than half the time though). By addressing the eigen alignment this problem disappeared.

@takanokage
Copy link
Contributor

Using STL Containers with Eigen

According to the documentation found at the above link:

Using STL containers on fixed-size vectorizable Eigen types, or classes having members of such types, requires taking the following two steps:

  • A 16-byte-aligned allocator must be used. Eigen does provide one ready for use: aligned_allocator.
  • If you want to use the std::vector container, you need to #include <Eigen/StdVector>.

These issues arise only with fixed-size vectorizable Eigen types and structures having such Eigen objects as member. For other Eigen types, such as Vector3f or MatrixXd, no special care is needed when using STL containers.

@takanokage
Copy link
Contributor

It would appear that we really need to make the change.

@morrisfranken
Copy link
Contributor Author

morrisfranken commented Nov 6, 2018

Eigen::Matrix4d is such a type among others

Eigen::Vector2d
Eigen::Vector4d
Eigen::Vector4f
Eigen::Matrix2d
Eigen::Matrix2f
Eigen::Matrix4d
Eigen::Matrix4f
Eigen::Affine3d
Eigen::Affine3f
Eigen::Quaterniond
Eigen::Quaternionf
(https://eigen.tuxfamily.org/dox/group__TopicFixedSizeVectorizable.html)

@takanokage
Copy link
Contributor

Well, all this said I'm not sure we really need to make this change everywhere in Open3D.
Could you first of all fix just the particular piece of code that causes the crash? Once that is fixed we can revisit making the change throughout Open3D.

qianyizh pushed a commit that referenced this issue Jan 13, 2019
* Eigen alignment support

* remove white spacing

* Fixing mac-build by adding const

* Removing Eigen alignment for types that do not require it.

Removing the `EIGEN_MAKE_ALIGNED_OPERATOR_NEW` from classes, and replace
Eigen types within classes with their unaligned counterpart where required.
This will prevent segfaults due to different compiler flags used for Open3D
and a library that depends on it. (#653)

* Eigen alignment for merge with IntelVCL/Open3D

Removing alignment from `pybind_eigen_vector_of_vector` since it is conflicting with pybind and because none of the templated types require alignment.

* Trimming whitespace and added functions `EigenMatrix4dToJsonArray` and `EigenMatrix6dToJsonArray` with unaligned support

* Update for merging with original repo

* reduced line width to 80 cols or less.

* added unit tests for testing the conversion of unaligned Eigen::Matrix4d/6d to and from JsonArray.
added unit tests comments.

* fix path.

* match master gitmodules.

* test fix submodule issue that was breaking the build.
@gitouni
Copy link

gitouni commented May 24, 2023

Hi, I've been experiencing a number of random segfaults while executing functions like RegistrationColoredICP and WritePoseGraph. After debugging a bit, it turns out that these segfaults are coming from incorrect use of Eigen structures. For example, the following code will produce a segfault (compiled in release with gcc5.4 on Ubuntu 16.04, Intel I7-4712MQ)

std::vector<Eigen::Matrix4d> poses;
poses.emplace_back(Eigen::Matrix4d::Identity());
poses.emplace_back(Eigen::Matrix4d::Identity());

Similar to what #88 has found, Eigen structures required a custom allocator when being used in containers like std::vector or std::map (https://eigen.tuxfamily.org/dox/group__TopicStlContainers.html) . In addition,- when defining Eigen variables within a class, these classes must be aligned as well (https://eigen.tuxfamily.org/dox/group__TopicStructHavingEigenMembers.html)

To solve these issues I've created a new fork of Open3D with Eigen Alignment Support: https://github.com/morrisfranken/Open3D . While I can confirm that this branch resolves the segfaults that I was experiencing before, it may break other peoples code.

More over- the Eigen alignment issue stretches beyond simple segfaults within the library. By exposing the Open3D API with Eigen elements, it requires all libraries that use Open3D to be compiled with the same optimisation options as Open3D itself. Otherwise the sizes (in bytes) of Open3D classes may differ compared to a library that is using Open3D. For example,- suppose the following class is exposed in a header of Open3D:

class PoseGraphNode {
public:
    bool unimportant_variable;
    Eigen::Matrix4d mat;

    EIGEN_MAKE_ALIGNED_OPERATOR_NEW
};

void compute(PoseGraphNode &node);

The size of this class will vary based on the compiler flags used:

Compiler Flags sizeof(PoseGraphNode)
-O2/-O3 144
-O2/-O3 -march=native 160
-DEIGEN_MAX_ALIGN_BYTES=0 136
When compiling Open3D with its default settings, the library expects the size of PoseGraphNode to be 144. But when I compile an executable that depends on Open3D with the flag -march=native, and I call the function compute(node), that executable will pass 160 bytes to the function where Open3D only expects 144 which causes undefined behaviour and most likely segfault somewhere down the line.

At this moment I don't see any other way to prevent this behaviour entirely other than disabeling alignment in exposed structs by using Eigen::DontAlign.

It should be not that Eigen::aligned_allocator does not require to be explicitly assigned in C++ 17 Standard (as the CMakeLists.txt declared below):

Open3D/CMakeLists.txt

Lines 293 to 295 in a5be78c

if (BUILD_SYCL_MODULE)
set(CMAKE_CXX_STANDARD 17)
else()

It is reported in Eigen's doc that C++ 17 or later version can take care of alligned_allocator well through compilation.
See Eigen doc for more details:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants