-
Notifications
You must be signed in to change notification settings - Fork 2.7k
[Arm64/Unix] WIP-DoNotMerge-Draft 64K static support #10891
Conversation
src/gc/gcsvr.cpp
Outdated
@@ -4,6 +4,7 @@ | |||
|
|||
|
|||
|
|||
#define OS_PAGE_SIZE 0x1000 | |||
#include "common.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates two definitions for OS_PAGE_SIZE in the same project. Needs to be actual current OS_PAGE_SIZE.
src/gc/gcwks.cpp
Outdated
@@ -4,6 +4,7 @@ | |||
|
|||
|
|||
|
|||
#define OS_PAGE_SIZE 0x1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
@@ -76,7 +76,7 @@ | |||
#define USE_UPPER_ADDRESS 0 | |||
|
|||
#elif defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) | |||
#define PAGE_SIZE 0x1000 | |||
#define PAGE_SIZE 0x10000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Page size will become dynamic
@@ -58,7 +58,7 @@ enum VIRTUAL_CONSTANTS | |||
VIRTUAL_EXECUTE, | |||
VIRTUAL_EXECUTE_READ, | |||
|
|||
VIRTUAL_PAGE_SIZE = 0x1000, | |||
VIRTUAL_PAGE_SIZE = 0x10000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Virtual page size will become dynamic an so will VIRTUAL_PAGE_SIZE_MASK
@@ -551,6 +551,8 @@ LPVOID ClrVirtualAllocAligned(LPVOID lpAddress, SIZE_T dwSize, DWORD flAllocatio | |||
|
|||
#else // !FEATURE_PAL | |||
|
|||
if(alignment < PAGE_SIZE) alignment = PAGE_SIZE; | |||
|
|||
// UNIXTODO: Add a specialized function to PAL so that we don't have to waste memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
64K page size will increase the amount of wasted memory
@@ -1547,7 +1547,7 @@ void ZapImage::OutputTables() | |||
|
|||
#if defined(FEATURE_PAL) | |||
// PAL library requires native image sections to align to page bounaries. | |||
SetFileAlignment(0x1000); | |||
SetFileAlignment(0x10000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Section alignment to match maximum supported page size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this will make native images a lot bigger. I am not sure whether this would be acceptable.
How are the native binary formats like ELF dealing with the 64k pages? Do they have the sections aligned at 64k boundaries as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas Based on what I read in golang/go#10180 (cited by @vielmetti). I think ELF will be aligning the sections to 64K page boundaries. However the ELF file size will not be padded to fill 64K, it will truncated as needed.
It is possible we could support file with 4K alignment, but PAL loader would need to be updated. This seems feasible since we have RELOC_PAGE_SIZE set to 4K.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sdmaclea what do you mean by RELOC_PAGE_SIZE? I don't see such a symbol defined anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@janvorli src/zap/zaprelocs.h:#define RELOCATION_PAGE_SIZE 0x1000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas For the smallest files the size is growing about the same amount
12288 4KAligned/System.Xml.XPath.ni.dll
196608 64KAligned/System.Xml.XPath.ni.dll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas I'll see if I can fix the native image loader to not require 64K alignment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may to tweak the file format so that we can get the page alignment, without the padding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are the native binary formats like ELF dealing with the 64k pages? Do they have the sections aligned at 64k boundaries as well?
@jkotas We objdumped an elf. The sections are only aligned to something like 8 byte boundaries. The VA of the section is incremented by a multiple of the maximum page size. I think for Arm64 is should be 16byte boundaries.
I took a look at "Visual Studio, Microsoft Portable Executable and Common Object File Format Specification Revision 9.3 – December 29, 2015" the following statement appears in the text.
If the SectionAlignment is less than the architecture’s page size, then
FileAlignment must match SectionAlignment.
I think this would allow the PE to behave similarly to ELF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I created PR #10959 with the ZapWriter Loader rework to prevent this bloat.
@@ -55,7 +55,7 @@ void ZapWriter::Initialize() | |||
m_FileAlignment = 0x200; | |||
} | |||
|
|||
#define SECTION_ALIGNMENT 0x1000 | |||
#define SECTION_ALIGNMENT 0x10000 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Section alignment to match maximum supported page size.
#else | ||
#define card_size ((size_t)(DT_OS_PAGE_SIZE/card_word_width)) | ||
#define card_size ((size_t)(4096/card_word_width)) | ||
#endif //_TARGET_WIN64_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to stop duplicating this GC code here.
@@ -190,6 +190,6 @@ struct DT_RTL_USER_PROCESS_PARAMETERS | |||
|
|||
#endif // !FEATURE_PAL | |||
|
|||
#define DT_OS_PAGE_SIZE 4096 | |||
#define DT_OS_PAGE_SIZE 0x10000 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be complications here for remote debugging with target having separate size as host.
Also this may need a different mechanism to determine the os page size dynamically.
#else | ||
#define card_size ((size_t)(OS_PAGE_SIZE/card_word_width)) | ||
#define card_size ((size_t)(4096/card_word_width)) | ||
#endif // BIT64 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These may be better converted to hard numbers. Since they are supposedly empirical.
#if defined (_TARGET_WIN64_)
#define card_size 0x200
#else
#define card_size 0x100
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are empirical, yes, but they are also tied to GetWriteWatch
operating at page granularity. If we change these, we'll also need to change the software write watch tables and card bundle tables (cc @adityamandaleeka)
@@ -6306,7 +6306,7 @@ void gc_heap::make_c_mark_list (uint8_t** arr) | |||
static const size_t card_bundle_word_width = 32; | |||
|
|||
// How do we express the fact that 32 bits (card_word_width) is one uint32_t? | |||
static const size_t card_bundle_size = (size_t)(OS_PAGE_SIZE / (sizeof(uint32_t)*card_bundle_word_width)); | |||
static const size_t card_bundle_size = (size_t)(4096 / (sizeof(uint32_t)*card_bundle_word_width)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have a name for this magic number. Suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GC_PAGE_SIZE
@jkotas PTAL at OS_PAGE_SIZE --> 4K changes and provide further direction. |
cc @Maoni0 @adityamandaleeka @swgillespie for the GC changes |
@@ -18786,13 +18786,13 @@ void gc_heap::fix_card_table () | |||
time_stop - time_start, tot_cycles); | |||
#endif //TIME_WRITE_WATCH | |||
|
|||
assert( ((card_size * card_word_width)&(OS_PAGE_SIZE-1))==0 ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Windows, this really is the true operating system page size and not an arbitrary constant. The array of pointers that we receive from GetWriteWatch
(used to track writes on the heap during concurrent GCs) are page-sized. When we're looping over this array, we're looking at single OS pages and setting cards for it.
I don't think that GetWriteWatch
supports anything other than 4k pages, so it's probably fine to replace these uses with 4096, but perhaps we should assert somewhere that if we are using GetWriteWatch
, OS_PAGE_SIZE
is what we expect.
#else | ||
#define card_size ((size_t)(OS_PAGE_SIZE/card_word_width)) | ||
#define card_size ((size_t)(4096/card_word_width)) | ||
#endif // BIT64 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are empirical, yes, but they are also tied to GetWriteWatch
operating at page granularity. If we change these, we'll also need to change the software write watch tables and card bundle tables (cc @adityamandaleeka)
No description provided.