Update a single row #1545

Closed
benjeffery opened this issue Jun 30, 2021 Discussed in #1543 · 8 comments · Fixed by #1600
Labels
Python API: Issue is about the Python API

Comments

@benjeffery
Member

Discussed in #1543

Originally posted by hyanwong June 30, 2021
At the moment I think adding to the metadata arrays (e.g. on populations, individuals, or nodes) is quite intricate, and I have to look it up every time. I think it's something like this:

import msprime

ts = msprime.sim_ancestry(10)
tables = ts.dump_tables()
# Decode every population's metadata into a list of dicts.
pop_metadatas = [p.metadata for p in ts.populations()]  # Why is there no tables.populations.get_metadatas() method?
pop_metadatas[0]['name'] += "(the only population)"
# Re-encode each row with the table's schema and overwrite the whole metadata column.
tables.populations.packset_metadata(
    [tables.populations.metadata_schema.validate_and_encode_row(r) for r in pop_metadatas])
new_ts = tables.tree_sequence()
print([p for p in new_ts.populations()])

(It took me about 10 minutes to look up how to do that, and the information is scattered all over the docs. I don't think a beginner could do it, TBH.)

Should we have some convenience methods to help with this? I quite often want to edit metadata on just a single entry in a table: e.g. label a particular individual, flag up a single node, etc.

We should also give an example of doing this in the metadata tutorial.

@benjeffery added the C API and Python API labels Jun 30, 2021
@benjeffery added this to the C API 1.0.0 milestone Jun 30, 2021
@benjeffery
Member Author

I've added this to the C 1.0 milestone as we'll need something like: tsk_update_char_ragged_array(tsk_size_t *offset_to_modify, char *array_to_modify, tsk_size_t *row_indexes, tsk_size_t *new_offset, char *new_array)
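
To make the ragged-array idea concrete, here is a minimal pure-Python sketch of replacing one row in a ragged column stored as a flat byte string plus an offsets list. The function name `update_ragged_row` and its signature are hypothetical and are not part of tskit or its C API; the sketch only assumes the usual layout where row `i` occupies `data[offsets[i]:offsets[i + 1]]`.

```python
from typing import List, Tuple


def update_ragged_row(
    data: bytes, offsets: List[int], row: int, new_value: bytes
) -> Tuple[bytes, List[int]]:
    """Return a new (data, offsets) pair with one row replaced.

    Hypothetical helper for illustration only, not a tskit function.
    Row i is stored as data[offsets[i]:offsets[i + 1]].
    """
    num_rows = len(offsets) - 1
    if not 0 <= row < num_rows:
        raise IndexError(f"row {row} out of range for {num_rows} rows")
    start, end = offsets[row], offsets[row + 1]
    # Change in length of this row; every later offset shifts by this amount.
    delta = len(new_value) - (end - start)
    new_data = data[:start] + new_value + data[end:]
    new_offsets = offsets[: row + 1] + [o + delta for o in offsets[row + 1:]]
    return new_data, new_offsets


# Three rows: b"ab", b"" and b"cde".
data, offsets = b"abcde", [0, 2, 2, 5]
new_data, new_offsets = update_ragged_row(data, offsets, 1, b"XYZ")
```

Every offset after the edited row has to be shifted, which is why a single-row edit on a ragged column costs time proportional to the size of the column.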

@jeromekelleher
Member

I'm not sure this is something we want in the public API. Wouldn't it be better to support updating a single row? If people are going to be updating multiple rows, then they should work column-wise anyway.

@benjeffery
Member Author

Gah, of course, forgetting the C milestone is for the public API. Was thinking this would just be used by the CPython code.

@jeromekelleher
Member

Mind if we generalise this to "update single row" @benjeffery? No point in just focusing on the ragged columns here, we want an x_table_set_row as discussed here

@benjeffery changed the title from "Set ragged array contents on a single row" to "Update a single row" Jul 5, 2021
@jeromekelleher
Member

Removing C API tag as it's done there.

@jeromekelleher removed the C API label Jul 23, 2021
@benjeffery
Member Author

benjeffery commented Jul 28, 2021

The proposed syntax here is tables.nodes[0] = tables.nodes[0].replace(flags=tskit.NODE_IS_SAMPLE) where the RHS can be any object that has the necessary attributes for the table on the LHS.
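
As a rough illustration of the semantics proposed above, the following toy sketch mimics row assignment with `__setitem__` and a `replace` method on an immutable row. `ToyNodeRow` and `ToyNodeTable` are hypothetical stand-ins written for this example and are not tskit classes; the constant `NODE_IS_SAMPLE` is defined locally with the same value as `tskit.NODE_IS_SAMPLE` so that the block runs without tskit installed.

```python
import dataclasses
from types import SimpleNamespace

NODE_IS_SAMPLE = 1  # local copy; same value as tskit.NODE_IS_SAMPLE


@dataclasses.dataclass(frozen=True)
class ToyNodeRow:
    """Hypothetical immutable row with a small subset of node columns."""

    flags: int = 0
    time: float = 0.0
    metadata: bytes = b""

    def replace(self, **kwargs):
        # Return a copy of this row with the given fields changed.
        return dataclasses.replace(self, **kwargs)


class ToyNodeTable:
    """Hypothetical table supporting table[i] and table[i] = row."""

    COLUMNS = ("flags", "time", "metadata")

    def __init__(self):
        self._rows = []

    def add_row(self, flags=0, time=0.0, metadata=b""):
        self._rows.append(ToyNodeRow(flags, time, metadata))
        return len(self._rows) - 1

    def __getitem__(self, index):
        return self._rows[index]

    def __setitem__(self, index, row):
        # Duck typing: any object with the needed attributes is accepted.
        values = {name: getattr(row, name) for name in self.COLUMNS}
        self._rows[index] = ToyNodeRow(**values)


nodes = ToyNodeTable()
nodes.add_row(time=1.5)
nodes.add_row(time=2.5)
# Same shape as the proposed syntax: read a row, change one field, write it back.
nodes[0] = nodes[0].replace(flags=NODE_IS_SAMPLE)
# The right-hand side does not have to be a ToyNodeRow.
nodes[1] = SimpleNamespace(flags=0, time=3.0, metadata=b"x")
```

Because the row is immutable, `replace` leaves the original row object untouched, and the table only changes when the new row is assigned back.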

@jeromekelleher
Member

Can this be closed now?

@benjeffery
Member Author

Closed by #1600
