Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Method update_ids won't "update" ids to the same name (raises incorrect error) #633

Closed
ElDeveloper opened this issue May 31, 2015 · 7 comments
Labels

Comments

@ElDeveloper
Copy link
Member

When using update_ids, you need to provide new identifiers, otherwise the method will raise a spurious error. The example below shows how using the example table you can reproduce the error by creating a dictionary where all key and value pairs are equal:

In [7]: from biom import example_table as et

In [8]: id_map = {i: i for i in et.ids()}

In [9]: id_map
Out[9]: {'S1': 'S1', 'S2': 'S2', 'S3': 'S3'}

In [10]: et.update_ids(id_map)
---------------------------------------------------------------------------
TableException                            Traceback (most recent call last)
<ipython-input-10-3d0c303209a3> in <module>()
----> 1 et.update_ids(id_map)

/Users/yoshikivazquezbaeza/.virtualenvs/qiime-191/lib/python2.7/site-packages/biom/table.pyc in update_ids(self, id_map, axis, strict, inplace)
    990                     "Mapping not provided for %s identifier: %s. If this "
    991                     "identifier should not be updated, pass strict=False."
--> 992                     % (axis, old_id))
    993             updated_ids[idx] = new_id
    994 

TableException: Mapping not provided for sample identifier: S1. If this identifier should not be updated, pass strict=False.

The error indicates that the mapping for S1 was not provided, even though it is in the id_map dictionary. This error goes away as soon as you change the mapping so the values of the dictionary are different from the keys:

In [11]: id_map = {i: i+'-:L' for i in et.ids()}

In [12]: et.update_ids(id_map)
Out[12]: 2 x 3 <class 'biom.table.Table'> with 5 nonzero entries (83% dense)
@wasade
Copy link
Member

wasade commented May 31, 2015

That's an odd one... thanks for finding it!

On Sun, May 31, 2015 at 4:09 PM, Yoshiki Vázquez Baeza <
notifications@github.com> wrote:

When using update_ids, you need to provide new identifiers, otherwise the
method will raise a spurious error. The example below shows how using the
example table you can reproduce the error by creating a dictionary where
all key and value pairs are equal:

In [7]: from biom import example_table as et

In [8]: id_map = {i: i for i in et.ids()}

In [9]: id_map
Out[9]: {'S1': 'S1', 'S2': 'S2', 'S3': 'S3'}

In [10]: et.update_ids(id_map)---------------------------------------------------------------------------
TableException Traceback (most recent call last) in ()----> 1 et.update_ids(id_map)
/Users/yoshikivazquezbaeza/.virtualenvs/qiime-191/lib/python2.7/site-packages/biom/table.pyc in update_ids(self, id_map, axis, strict, inplace)
990 "Mapping not provided for %s identifier: %s. If this "
991 "identifier should not be updated, pass strict=False."--> 992 % (axis, old_id))
993 updated_ids[idx] = new_id
994

TableException: Mapping not provided for sample identifier: S1. If this identifier should not be updated, pass strict=False.

The error indicates that the mapping for S1 was not provided, even though
it is in the id_map dictionary. This error goes away as soon as you
change the mapping so the values of the dictionary are different from the
keys:

In [11]: id_map = {i: i+'-:L' for i in et.ids()}

In [12]: et.update_ids(id_map)
Out[12]: 2 x 3 <class 'biom.table.Table'> with 5 nonzero entries (83% dense)


Reply to this email directly or view it on GitHub
#633.

@wasade
Copy link
Member

wasade commented Jun 3, 2015

@ElDeveloper, did you try setting strict=False?

@josenavas
Copy link
Member

I hit this error yesterday and it went away when I passed strict=False. I think it is not a bug, and I think it is ok to have strict=True by default.

@wasade
Copy link
Member

wasade commented Jun 3, 2015

...to add, I don't believe it is that the IDs are the same by value, but that the IDs you're using in your example are the same by reference. The confusion here is that the way strict=True works is intended to work is that it will complain if the map doesn't overlap 100% with the IDs on an axis. The issue triggered here is a caveat (bug?) for how the check is performed. Or, in other words, this is the reason that the code is bailing:

>>> a = 'foo'
>>> b = 'foo'
>>> a is b
True

@ElDeveloper
Copy link
Member Author

@josenavas, this is not directly related to the strict argument, if
you see my example, I provide a map for all the identifiers, so the
error message is incorrect.

@wasade, that makes sense, but the message is still very misleading.

On (Jun-03-15|10:18), Daniel McDonald wrote:

...to add, I don't believe it is that the IDs are the same by value, but that the IDs you're using in your example are the same by reference. The confusion here is that the way strict=True works is intended to work is that it will complain if the map doesn't overlap 100% with the IDs on an axis. The issue triggered here is a caveat (bug?) for how the check is performed. Or, in other words, this is the reason that the code is bailing:

>>> a = 'foo'
>>> b = 'foo'
>>> a is b
True

Reply to this email directly or view it on GitHub:
#633 (comment)

@wasade
Copy link
Member

wasade commented Jun 3, 2015

For this case, yes. Issuing a PR to add clarity

@wasade wasade mentioned this issue Jun 3, 2015
@wasade
Copy link
Member

wasade commented Jun 3, 2015

@ElDeveloper, can you review #634 please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants