Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix check offsets #1511

Merged
merged 1 commit into from
Jul 5, 2021
Merged

Fix check offsets #1511

merged 1 commit into from
Jul 5, 2021

Conversation

benjeffery
Copy link
Member

@benjeffery benjeffery commented Jun 22, 2021

Fixes #1509

Turns out this was wrong in every call to check_offset_overflow.

Tested by twiddling the internal state of the table struct. We could probably make large columns, but this seemed simpler.

@benjeffery benjeffery marked this pull request as ready for review June 22, 2021 15:41
c/tskit/tables.c Outdated Show resolved Hide resolved
c/tskit/tables.c Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Jun 22, 2021

Codecov Report

Merging #1511 (73f1a67) into main (2c897a3) will increase coverage by 0.04%.
The diff coverage is 97.95%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1511      +/-   ##
==========================================
+ Coverage   93.70%   93.74%   +0.04%     
==========================================
  Files          27       27              
  Lines       22759    22620     -139     
  Branches     1076     1076              
==========================================
- Hits        21326    21205     -121     
+ Misses       1399     1381      -18     
  Partials       34       34              
Flag Coverage Δ
c-tests 92.09% <97.95%> (+0.06%) ⬆️
lwt-tests 92.97% <ø> (ø)
python-c-tests 95.29% <ø> (ø)
python-tests 98.80% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
c/tskit/tables.c 90.23% <97.95%> (+0.07%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c897a3...73f1a67. Read the comment docs.

@jeromekelleher
Copy link
Member

I think it's probably worth abstracting out an expand_ragged_column here which tsk_individual_table_expand_parents etc could call, as there's a lot of duplicated code which we could get rid of by using a few pointers. Otherwise we'll be making the same changes a dozen or more times here.

I guess the call would look something like

static int
tsk_individual_table_expand_parents(
    tsk_individual_table_t *self, tsk_size_t additional_length)
{
    return expand_ragged_column(self->max_parents_length_increment, &self->max_parents_length,
        &self->parents);
}

@benjeffery
Copy link
Member Author

Yep, will do that. Also reduces test duplication which was getting boring!

@jeromekelleher
Copy link
Member

What's the status of this @benjeffery? I think it's worth adding the expand_ragged_column infrastructure, as it'll be handy in anycase for doing things like setting individual rows.

@benjeffery
Copy link
Member Author

It's waiting for me to get back around to it - finishing up an tsconvert PR then will get to it today.

@benjeffery benjeffery force-pushed the fix_check_offsets branch 3 times, most recently from 0ef2eb0 to bbdbac9 Compare July 1, 2021 12:52
@benjeffery
Copy link
Member Author

benjeffery commented Jul 1, 2021

Ok, after some head-scratching this is done. A few queries:

  • See the comment in expand_ragged_columns. Instead of erroring when max_length+increment overflow we could/should just allocate to the max size - however this can't practically be tested.
  • All the tsk_TABLE_table_expand_RAGGEDCOLUMN functions are now one-liners that could be removed - I kept them in as expand_ragged_column has 6 arguments and the existing calls look cleaner in the higher level methods.
  • Shouldn't tsk_TABLE_table_set_max_COLUMN_length_increment be tsk_TABLE_table_set_min_COLUMN_length_increment??!!

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just need to double-check we're getting all the ragged columns.

c/tskit/tables.c Show resolved Hide resolved
@jeromekelleher
Copy link
Member

See the comment in expand_ragged_columns. Instead of erroring when max_length+increment overflow we could/should just allocate to the max size - however this can't practically be tested.

Yeah - let's not worry about it though. I guess it'll be become more pressing at some point when we implement the doubling behaviour (#5), when increment will become large quickly. I think it's fine for now, though.

All the tsk_TABLE_table_expand_RAGGEDCOLUMN functions are now one-liners that could be removed - I kept them in as expand_ragged_column has 6 arguments and the existing calls look cleaner in the higher level methods.

Yeah, let's keep then. More readable this way as you say.

Shouldn't tsk_TABLE_table_set_max_COLUMN_length_increment be tsk_TABLE_table_set_min_COLUMN_length_increment??!!

I guess tsk_TABLE_table_set_COLUMN_length_increment would be better again?

@benjeffery benjeffery force-pushed the fix_check_offsets branch from bbdbac9 to faf45dd Compare July 1, 2021 13:16
@benjeffery
Copy link
Member Author

Shouldn't tsk_TABLE_table_set_max_COLUMN_length_increment be tsk_TABLE_table_set_min_COLUMN_length_increment??!!

I guess tsk_TABLE_table_set_COLUMN_length_increment would be better again?

Yes, it is a public method, but undocumented so we could change it. However if #5 is planned then not worth it as it will be obsolete then.

@jeromekelleher
Copy link
Member

jeromekelleher commented Jul 1, 2021

However if #5 is planned then not worth it as it will be obsolete then.

I think the idea for #5 is to support both the current behaviour, and also doubling. If size_increment == 0, then increment = 2 * len(table); otherwise, increment = size_increment. So, we can do it simply enough in the current framework, while keeping the flexibility.

Probably a single size_increment per table is sufficient, having different increments for the number of rows and the ragged data columns is overboard.

@benjeffery
Copy link
Member Author

@jeromekelleher I think this is ready to go

@benjeffery benjeffery added the AUTOMERGE-REQUESTED Ask Mergify to merge this PR label Jul 2, 2021
@jeromekelleher
Copy link
Member

This needs some attention @benjeffery - some compiler warnings popping up

@benjeffery benjeffery force-pushed the fix_check_offsets branch from faf45dd to 73f1a67 Compare July 5, 2021 09:57
@mergify mergify bot merged commit d10e1a5 into tskit-dev:main Jul 5, 2021
@mergify mergify bot removed the AUTOMERGE-REQUESTED Ask Mergify to merge this PR label Jul 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

metadata overflow causes crash
3 participants