-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable dumping corrupt WAL segments #145
Conversation
Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why do we need to change pg_waldump to be able to dump first WAL segment at compute node? It seems to me that explicit specification of first valid record (and it is address of checkpoint record which is printed in pageserver.log and pg.log) also works.
Also vanilla pg_waldump supports skipping WAL from specified position til first valid record:
pg_waldump -s 0/016BC330 000000010000000000000001
first record is after 0/16BC330, at 0/16BC340, skipping over 16 bytes
rmgr: Transaction len (rec/tot): 34/ 34, tx: 1025, lsn: 0/016BC340, prev 0/016BC300, desc: COMMIT 2022-03-22 12:09:59.594747 MSK
But if we really want to change pg_wal dump utility, then I think we should also add handling of files with *.partial suffix (added to the current segment by pageserver). Because right now files with such prefix are not recognized by pg_waldump even if file is explicitly specified) and it is necessary to rename file file.
@@ -729,7 +786,10 @@ usage(void) | |||
printf(_(" -b, --bkp-details output detailed information about backup blocks\n")); | |||
printf(_(" -e, --end=RECPTR stop reading at WAL location RECPTR\n")); | |||
printf(_(" -f, --follow keep retrying after reaching end of WAL\n")); | |||
printf(_(" -F, --file=FNAME dump log records from a single file\n")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why do we need --file option while it is possible to specify log file with for standar pg_waldump:
pg_waldump 000000010000000000000001
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there're implications on the file name (something this PR is addressing), try running POPS pg_waldump on 000000010000000000000001.partial
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But may be just restrict log file name checking in case of presence of --skip options?
printf(_(" -n, --limit=N number of records to display\n")); | ||
printf(_(" -o, --offset=OFFSET offset of the first record to in a file to dump\n")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need --offser option if it is possible to specify startig point using -s option:
pg_waldump -s 0/016BC340 000000010000000000000001
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-s sets an LSN, LSN is an offset within whole WAL, while offset is specific to a file.
changing code to allow -s to serve dual purpose makes the code more confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it seems to be quite easy to map relative offset to global LSN and visa versa. If it is somehow needed...
How do you know offset of valid record? And LSN of valid record you can know from logs.
@@ -279,6 +279,14 @@ XLogReadRecord(XLogReaderState *state, char **errormsg) | |||
bool gotheader; | |||
int readOff; | |||
|
|||
#define SKIP_INVALID_RECORD(rec_ptr) do { \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if rec_ptr should be aligned here.
Is there any warranty that it is maxaligned (8-bytes aligned) in all places where SKIP_INVALID_RECORD is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's sort of a defacto guarantee, with Page Header (both short and log) multiples of 8 and log records are maxaligned explicitly. Is it possible that log record in a corrupt or alternatively formatted file are not aligned, so I'm a little on the fence on how I want to advance. but records from normally running system are always maxaligned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, each wal record is 8-bytes aligned. But if parsing of record is terminated because of error somewhere in the middle, there is no warranty that RecPtr is aligned on 8.
I have not checked the code precisely, so I am not sure if such scenario is really possible.
But if for some reasons RecPtr is really not aligned, than going to skip_invalid: label can cause unaligned access fault.
requires knowing where does first log record start (which is not necessary hard, just adds an unnecessary extra step) and valid segment and page headers (much more involved step as page header includes "position" which depends on the file name)
I had a change to allow .partial files (still requires most of the changes from this PR) I removed it as 'follow' option makes the code pretty ugly |
Sorry, what do you maen by "POPS" version? In any case. Concerning proposing this patch to community, I afraid that it will be difficult to explain why somebody needs to analyze corrupted WAL segment.May be I am wrong, but the problem we are solving with not completely filled WAL segment is very zenith specific. |
First impression: This is a pretty invasive patch. I hoped we could come up with something with a much smaller diff footprint :-(. How do you use this? I tried this, but didn't work:
|
How about something like this: https://github.com/zenithdb/postgres/tree/heikki-pg_waldump ? Much smaller diff footprint |
yeah, the patch had to be invasive in order to properly integrate with Postgres way of reading WAL. regarding parameters, -i was specifically added to handle files with format errors (like missing pages headers and misformatted log records) and is independent from -F parameter. try |
Less intrusive but only covers some functionality intended:
|
@@ -279,6 +279,14 @@ XLogReadRecord(XLogReaderState *state, char **errormsg) | |||
bool gotheader; | |||
int readOff; | |||
|
|||
#define SKIP_INVALID_RECORD(rec_ptr) do { \ | |||
rec_ptr += MAXALIGN(1); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we first add 1, and then MAXALIGN here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
de facto it will be equivalent, but I agree that would make the code a little more robust
I agree it is more invasive than what we really need. It allows to dump whatever valid records contained -- whether they are after zero space hole, corrupted ones and so on, trying each single (maxaligned) byte to detect the valid record. We need only to skip zeros in the beginning + probably adjust for the initial header. But since the code is already there... well. But let's explicitly document what it does. I also don't see value in separate --file and --offset options. They confusingly conflict with existing ones (specifying segment name as positional argument -- yes, we should teach to accept .partial suffix -- for the former and -s for the latter). If I specify both -s and --offset, what offset is used? And really you more often now LSN than the offset. It would be nice to have such resilient waldump of course. |
supporting .partial makes pg_waldump code even uglier (check https://github.com/zenithdb/postgres/blob/main/src/bin/pg_waldump/pg_waldump.c#L299) and would require code invasive changes to filename handling (partially https://github.com/zenithdb/postgres/blob/main/src/include/access/xlog_internal.h#L165)
The code will error out at parameters validation. start is for WAL stream and offset only meaningful within a file (in addition to the difference in format)
with safekeeper files, currently our lowest LSN (after init db) is 0/01696620 which lands at 0x0696620 in segment 000000010000000000000001; This change is invasive (and I'm not crazy happy about that either) mostly due to the peculiar design of WAL in Postgres (and the fact that pg_waldump is not an integral part of the infrastructure) (edit: GitHub ate part of the response) |
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
* Enable dumping corrupt WAL segments Add ability to dump WAL segment with corrupt page headers and recrods skips over missing/broken page headers skips over misformatted log recrods allows dumping log record from a particular file starting from an optional offset (without a need of carefully crafted input)
Add ability to dump WAL segment with corrupt page headers and recrods
skips over missing/broken page headers
skips over misformatted log recrods
allows dumping log record from a particular file (without a need of
carefully crafted input) starting from an optional offset
specifically allows dumping safekeeper log file like this:
or
file doesn't need to be complete (but then it will produce some bogus error at the end):