-
Notifications
You must be signed in to change notification settings - Fork 180
Reduce index file size caused by symbol references by using indicies rather than offsets #262
Conversation
docs/format/index.md
Outdated
│ │ len(str_n) <uvarint> │ str_1 <bytes> │ │ | ||
│ └──────────────────────┴───────────────┘ │ | ||
│ ├──────────────────┬───────────────────┤ │ | ||
│ │ n <uvarint> │ str_1 <bytes> │ │ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(str)
was actually the number of bytes we have to subsequently read to know where the string ends. The field was not the offset.
Looking at the code actually does not serilize the index (correctly) so the update of the docs here seems to be revertible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay I misread this then. We don't want to include a field for the index in this diagram?
index/index.go
Outdated
w.symbols[s] = uint32(w.pos) + headerSize + uint32(w.buf2.len()) | ||
if w.Version == 2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this change I suppose but do we really want the writer to be subject to versioning. I think we are fine with always have the writer write the most recent format version and just having the reader supporting multiple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is dependent on how we intend to support different index file versions in the future. Is the intention to only allow older versions for read purposed with the intention that they can load in an old tsdb database with a newer version of the code and it would then be written to an index file again with the new format?
If we don't intend to allow writing of index files with old versions then I'll remove the version checks from the writer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to write older versions.
Thanks! |
fixes #249
Left comments in sections where depending on whether we're using v1 or v2 all that would change is the var name (
offset
vsindex
) but not any of the functionality.Wasn't entirely sure if we want to change this struct:
cc: @gouthamve @fabxc