ARROW-185: Make padding and alignment for all buffers be 64 bytes #74
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,15 +36,20 @@ class Status; | |
// Buffer classes | ||
|
||
// Immutable API for a chunk of bytes which may or may not be owned by the | ||
// class instance | ||
// class instance. Buffers have two related notions of length: size and | ||
// capacity. Size is the number of bytes that might have valid data. | ||
// Capacity is the number of bytes that were allocated for the buffer in | ||
// total. | ||
// The following invariant is always true: Size <= Capacity | ||
class Buffer : public std::enable_shared_from_this<Buffer> { | ||
public: | ||
Buffer(const uint8_t* data, int64_t size) : data_(data), size_(size) {} | ||
Buffer(const uint8_t* data, int64_t size) : data_(data), size_(size), capacity_(size) {} | ||
virtual ~Buffer(); | ||
|
||
// An offset into data that is owned by another buffer, but we want to be | ||
// able to retain a valid pointer to it even after other shared_ptr's to the | ||
// parent buffer have been destroyed | ||
// TODO(emkornfield) how will this play with 64 byte alignment/padding? | ||
Inevitably, alignment and padding aren't always going to be guaranteed for in-memory data (of course, when data is moved for IPC purposes, they will need to be guaranteed). I suppose then that buffers will need to be able to communicate their alignment/padding for algorithm selection (i.e. can we use the spiffy AVX512 function or not?).

I think we need to see how use cases play out. Given the current spec, it seems that most slicing operations in the general case will need memory allocation anyway. We could likely guarantee alignment/padding by providing a utility method that creates a slice if it can keep the contract and otherwise allocates new underlying data. For now I will put a warning here.
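As an illustration of how a buffer might report its alignment and padding for algorithm selection, here is a minimal sketch. The helpers `IsAligned` and `IsAlignedAndPadded` are hypothetical names introduced only for this example; they are not part of the patch or of the Buffer API shown in the diff.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not in the patch): true when `ptr` lies on an
// `alignment`-byte boundary. `alignment` is assumed to be a power of two.
inline bool IsAligned(const void* ptr, std::size_t alignment = 64) {
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical helper (not in the patch): true when a region starts on an
// `alignment`-byte boundary and its capacity is a whole multiple of
// `alignment`, so a kernel may read full `alignment`-byte blocks.
inline bool IsAlignedAndPadded(const std::uint8_t* data, std::int64_t capacity,
                               std::size_t alignment = 64) {
  return IsAligned(data, alignment) &&
         capacity % static_cast<std::int64_t>(alignment) == 0;
}
```

A caller could test a sliced buffer's data pointer and capacity with such a check and fall back to a scalar code path when it fails.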
||
Buffer(const std::shared_ptr<Buffer>& parent, int64_t offset, int64_t size); | ||
|
||
std::shared_ptr<Buffer> get_shared_ptr() { return shared_from_this(); } | ||
|
@@ -63,6 +68,7 @@ class Buffer : public std::enable_shared_from_this<Buffer> { | |
(data_ == other.data_ || !memcmp(data_, other.data_, size_))); | ||
} | ||
|
||
int64_t capacity() const { return capacity_; } | ||
const uint8_t* data() const { return data_; } | ||
|
||
int64_t size() const { return size_; } | ||
|
@@ -76,6 +82,7 @@ class Buffer : public std::enable_shared_from_this<Buffer> { | |
protected: | ||
const uint8_t* data_; | ||
int64_t size_; | ||
int64_t capacity_; | ||
|
||
// nullptr by default, but may be set | ||
std::shared_ptr<Buffer> parent_; | ||
|
@@ -113,10 +120,7 @@ class ResizableBuffer : public MutableBuffer { | |
virtual Status Reserve(int64_t new_capacity) = 0; | ||
|
||
protected: | ||
ResizableBuffer(uint8_t* data, int64_t size) | ||
: MutableBuffer(data, size), capacity_(size) {} | ||
|
||
int64_t capacity_; | ||
ResizableBuffer(uint8_t* data, int64_t size) : MutableBuffer(data, size) {} | ||
}; | ||
|
||
// A Buffer whose lifetime is tied to a particular MemoryPool | ||
|
@@ -125,8 +129,8 @@ class PoolBuffer : public ResizableBuffer { | |
explicit PoolBuffer(MemoryPool* pool = nullptr); | ||
virtual ~PoolBuffer(); | ||
|
||
virtual Status Resize(int64_t new_size); | ||
virtual Status Reserve(int64_t new_capacity); | ||
Status Resize(int64_t new_size) override; | ||
Status Reserve(int64_t new_capacity) override; | ||
|
||
private: | ||
MemoryPool* pool_; | ||
|
@@ -138,10 +142,11 @@ class BufferBuilder { | |
public: | ||
explicit BufferBuilder(MemoryPool* pool) : pool_(pool), capacity_(0), size_(0) {} | ||
|
||
// Resizes the buffer to the nearest multiple of 64 bytes per Layout.md | ||
Status Resize(int32_t elements) { | ||
if (capacity_ == 0) { buffer_ = std::make_shared<PoolBuffer>(pool_); } | ||
capacity_ = elements; | ||
RETURN_NOT_OK(buffer_->Resize(capacity_)); | ||
RETURN_NOT_OK(buffer_->Resize(elements)); | ||
capacity_ = buffer_->capacity(); | ||
data_ = buffer_->mutable_data(); | ||
return Status::OK(); | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ | |
|
||
#include "arrow/util/memory-pool.h" | ||
|
||
#include <stdlib.h> | ||
#include <cstdlib> | ||
#include <mutex> | ||
#include <sstream> | ||
|
@@ -44,14 +45,22 @@ class InternalMemoryPool : public MemoryPool { | |
}; | ||
|
||
Status InternalMemoryPool::Allocate(int64_t size, uint8_t** out) { | ||
constexpr size_t kAlignment = 64; | ||
std::lock_guard<std::mutex> guard(pool_lock_); | ||
*out = static_cast<uint8_t*>(std::malloc(size)); | ||
if (*out == nullptr) { | ||
// TODO(emkornfield) find something compatible with windows | ||
Looks like

Speaking of Windows, I wonder if we can find a kind soul to set up AppVeyor CI for this repo and get the Windows C++ build passing.

Yep. I'll do the refactoring, but I won't add the Windows code because I don't have any way of testing it. I'll open a JIRA to add Windows support via AppVeyor.
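For reference, one way the platform split discussed here could look is sketched below. This is an assumption-laden illustration, not the code that was merged: `AllocateAligned` and `FreeAligned` are hypothetical names, and the Windows branch relies on the MSVC runtime functions `_aligned_malloc` and `_aligned_free`, which this thread does not confirm as the chosen approach.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free
#if defined(_WIN32)
#include <malloc.h>  // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: returns `size` bytes aligned to `alignment`
// (a power of two that is a multiple of sizeof(void*)), or nullptr on failure.
inline std::uint8_t* AllocateAligned(std::size_t size, std::size_t alignment = 64) {
#if defined(_WIN32)
  return static_cast<std::uint8_t*>(_aligned_malloc(size, alignment));
#else
  void* out = nullptr;
  // posix_memalign returns 0 on success and an error code otherwise.
  if (posix_memalign(&out, alignment, size) != 0) { return nullptr; }
  return static_cast<std::uint8_t*>(out);
#endif
}

// Hypothetical helper: releases memory obtained from AllocateAligned.
// Memory from _aligned_malloc must be released with _aligned_free.
inline void FreeAligned(std::uint8_t* ptr) {
#if defined(_WIN32)
  _aligned_free(ptr);
#else
  free(ptr);
#endif
}
```

Keeping the allocation and the matching release behind one pair of functions means the pool only needs a single call site per operation, which is where a Windows branch could later be tested under CI.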
||
const int result = posix_memalign(reinterpret_cast<void**>(out), kAlignment, size); | ||
if (result == ENOMEM) { | ||
std::stringstream ss; | ||
ss << "malloc of size " << size << " failed"; | ||
return Status::OutOfMemory(ss.str()); | ||
} | ||
|
||
if (result == EINVAL) { | ||
std::stringstream ss; | ||
ss << "invalid alignment parameter: " << kAlignment; | ||
return Status::Invalid(ss.str()); | ||
} | ||
|
||
bytes_allocated_ += size; | ||
|
||
return Status::OK(); | ||
|
Should use round_to here. I'm also pretty sure there is something clever we could do to avoid the condition here, but at the moment I'm blanking on it.

Does this do it?
(num + multiple_bitmask) & ~multiple_bitmask

That looks right to me, although the performance gains are probably moot given the other condition for overflow.
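The bitmask expression from this thread can be sketched as a small function. `RoundUpToMultipleOf64` is a hypothetical name used only for this example (the signature of the `round_to` helper mentioned above is not shown in the thread), and the trick assumes the multiple is a power of two. The overflow guard corresponds to the "other condition" noted in the last comment.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: rounds `num` up to the next multiple of 64 without a
// branch on the remainder. Returns false if `num` is negative or if adding
// the mask would overflow int64_t; `*out` is left untouched in that case.
inline bool RoundUpToMultipleOf64(std::int64_t num, std::int64_t* out) {
  constexpr std::int64_t kMask = 64 - 1;  // low six bits set
  if (num < 0 || num > std::numeric_limits<std::int64_t>::max() - kMask) {
    return false;
  }
  // Adding the mask carries into the next multiple unless `num` is already
  // one; clearing the low bits then lands exactly on the multiple.
  *out = (num + kMask) & ~kMask;
  return true;
}
```

For example, an input of 65 yields 128, while an input of 64 stays 64.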