Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle buffers in Widget._should_send_property #1595

Merged
merged 9 commits into from
Aug 9, 2017
9 changes: 5 additions & 4 deletions docs/source/changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,15 +37,16 @@ Major user-visible changes in ipywidgets 7.0 include:

Major changes developers should be aware of include:

- Custom serializers in either Python or Javascript can now return a structure which contains binary buffers. If a binary buffer is in the serialized data structure, the message will be synced in binary, which is much more efficient. ([#1194](https://github.com/jupyter-widgets/ipywidgets/pull/1194))
- On the python/kernel side:
- The python `@register` decorator for widget classes no longer takes a string argument, but registers a widget class using the `_model_*` and `_view_*` traits in the class. Using the decorator as `@register('name')` is deprecated and should be changed to just `@register`. [#1228](https://github.com/jupyter-widgets/ipywidgets/pull/1228), [#1276](https://github.com/jupyter-widgets/ipywidgets/pull/1276)
- On the Python/kernel side:
- The Python `@register` decorator for widget classes no longer takes a string argument, but registers a widget class using the `_model_*` and `_view_*` traits in the class. Using the decorator as `@register('name')` is deprecated and should be changed to just `@register`. [#1228](https://github.com/jupyter-widgets/ipywidgets/pull/1228), [#1276](https://github.com/jupyter-widgets/ipywidgets/pull/1276)
- Widgets will now need correct `_model_module` and `_view_module` Unicode traits defined.
- Selection widgets now sync the index of the selected item, rather than the label. ([#1262](https://github.com/jupyter-widgets/ipywidgets/pull/1262))
- The python `ipywidget.domwidget.LabeledWidget` is now `ipywidget.widget_description.DescriptionWidget`, and there is a new `ipywidget.widget_description.DescriptionStyle` that lets the user set the CSS width of the description.
- The Python `ipywidget.domwidget.LabeledWidget` is now `ipywidget.widget_description.DescriptionWidget`, and there is a new `ipywidget.widget_description.DescriptionStyle` that lets the user set the CSS width of the description.
- Custom serializers can now return a structure that contains binary objects (`memoryview`, `bytearray`, or Python 3 `bytes` object). In this case, the sync message will be a binary message, which is much more efficient for binary data than base64-encoding. ([#1194](https://github.com/jupyter-widgets/ipywidgets/pull/1194), [#1595](https://github.com/jupyter-widgets/ipywidgets/pull/1595))
- On the Javascript side:
- The `jupyter-js-widgets` Javascript package has been split into `@jupyter-widgets/base` package (containing base widget classes, the DOM widget, and the associated layout and style classes), and the `@jupyter-widgets/controls` package (containing the rest of the Jupyter widgets controls). Authors of custom widgets will need to update to depend on `@jupyter-widgets/base` instead of `jupyter-js-widgets` (if you use a class from the controls package, you will also need to depend on `@jupyter-widgets/controls`). See the [cookie cutter](https://github.com/jupyter-widgets/widget-cookiecutter) to generate a simple example custom widget using the new packages.
- Custom serializers in Javascript are now synchronous, and should return a snapshot of the widget state. The default serializer makes a copy of JSONable objects. ([#1270](https://github.com/jupyter-widgets/ipywidgets/pull/1270))
- Custom serializers can now return a structure that contains binary objects (`ArrayBuffer`, `DataView`, or a typed array such as `Int8Array`, `Float64Array`, etc.). In this case, the sync message will be a binary message, which is much more efficient for binary data than base64-encoding. ([#1194](https://github.com/jupyter-widgets/ipywidgets/pull/1194))
- A custom serializer is given the widget instance as its second argument, and a custom deserializer is given the widget manager as its second argument.
- The Javascript model `.id` attribute has been renamed to `.model_id` to avoid conflicting with the Backbone `.id` attribute. ([#1410](https://github.com/jupyter-widgets/ipywidgets/pull/1410))
- Regarding widget managers and the syncing message protocol:
Expand Down
125 changes: 125 additions & 0 deletions ipywidgets/widgets/tests/test_set_state.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,125 @@
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.

from ipython_genutils.py3compat import PY3

import nose.tools as nt

from traitlets import Bool, Tuple, List, Instance

from .utils import setup, teardown

from ..widget import Widget

#
# First some widgets to test on:
#

# A widget with simple traits (list + tuple to ensure both are handled)
class SimpleWidget(Widget):
a = Bool().tag(sync=True)
b = Tuple(Bool(), Bool(), Bool(), default_value=(False, False, False)).tag(sync=True)
c = List(Bool()).tag(sync=True)



# A widget where the data might be changed on reception:
def transform_fromjson(data, widget):
# Switch the two last elements when setting from json, if the first element is True
# and always set first element to False
if not data[0]:
return data
return [False] + data[1:-2] + [data[-1], data[-2]]

class TransformerWidget(Widget):
d = List(Bool()).tag(sync=True, from_json=transform_fromjson)



# A widget that has a buffer:
class DataInstance():
def __init__(self, data=None):
self.data = data

def mview_serializer(instance, widget):
return { 'data': memoryview(instance.data) if instance.data else None }

def bytes_serializer(instance, widget):
return { 'data': bytearray(memoryview(instance.data).tobytes()) if instance.data else None }

def deserializer(json_data, widget):
return DataInstance( memoryview(json_data['data']).tobytes() if json_data else None )

class DataWidget(SimpleWidget):
d = Instance(DataInstance).tag(sync=True, to_json=mview_serializer, from_json=deserializer)

# A widget that has a buffer that might be changed on reception:
def truncate_deserializer(json_data, widget):
return DataInstance( json_data['data'][:20].tobytes() if json_data else None )

class TruncateDataWidget(SimpleWidget):
d = Instance(DataInstance).tag(sync=True, to_json=bytes_serializer, from_json=truncate_deserializer)


#
# Actual tests:
#

def test_set_state_simple():
w = SimpleWidget()
w.set_state(dict(
a=True,
b=[True, False, True],
c=[False, True, False],
))

nt.assert_equal(w.comm.messages, [])


def test_set_state_transformer():
w = TransformerWidget()
w.set_state(dict(
d=[True, False, True]
))
# Since the deserialize step changes the state, this should send an update
nt.assert_equal(w.comm.messages, [((), dict(
buffers=[],
data=dict(
buffer_paths=[],
method='update',
state=dict(d=[False, True, False])
)))])


def test_set_state_data():
w = DataWidget()
data = memoryview(b'x'*30)
w.set_state(dict(
a=True,
d={'data': data},
))
nt.assert_equal(w.comm.messages, [])


def test_set_state_data_truncate():
w = TruncateDataWidget()
data = memoryview(b'x'*30)
w.set_state(dict(
a=True,
d={'data': data},
))
# Get message for checking
nt.assert_equal(len(w.comm.messages), 1) # ensure we didn't get more than expected
msg = w.comm.messages[0]
# Assert that the data update (truncation) sends an update
buffers = msg[1].pop('buffers')
nt.assert_equal(msg, ((), dict(
data=dict(
buffer_paths=[['d', 'data']],
method='update',
state=dict(d={})
))))

# Sanity:
nt.assert_equal(len(buffers), 1)
nt.assert_equal(buffers[0], data[:20].tobytes())
8 changes: 7 additions & 1 deletion ipywidgets/widgets/tests/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,17 @@

class DummyComm(Comm):
comm_id = 'a-b-c-d'
kernel = 'Truthy'

def __init__(self, *args, **kwargs):
super(DummyComm, self).__init__(*args, **kwargs)
self.messages = []

def open(self, *args, **kwargs):
pass

def send(self, *args, **kwargs):
pass
self.messages.append((args, kwargs))

def close(self, *args, **kwargs):
pass
Expand All @@ -33,6 +38,7 @@ def teardown_test_comm():
delattr(Widget, attr)
else:
setattr(Widget, attr, value)
_widget_attrs.clear()

def setup():
setup_test_comm()
Expand Down
44 changes: 38 additions & 6 deletions ipywidgets/widgets/widget.py
Original file line number Diff line number Diff line change
Expand Up @@ -50,9 +50,9 @@ def _json_to_widget(x, obj):
}

if PY3:
_binary_types = (memoryview, bytes)
_binary_types = (memoryview, bytearray, bytes)
else:
_binary_types = (memoryview, buffer)
_binary_types = (memoryview, bytearray)

def _put_buffers(state, buffer_paths, buffers):
"""The inverse of _remove_buffers, except here we modify the existing dict/lists.
Expand Down Expand Up @@ -115,6 +115,8 @@ def _separate_buffers(substate, path, buffer_paths, buffers):
def _remove_buffers(state):
"""Return (state_without_buffers, buffer_paths, buffers) for binary message parts

A binary message part is a memoryview, bytearray, or python 3 bytes object.

As an example:
>>> state = {'plain': [0, 'text'], 'x': {'ar': memoryview(ar1)}, 'y': {'shape': (10,10), 'data': memoryview(ar2)}}
>>> _remove_buffers(state)
Expand Down Expand Up @@ -483,6 +485,30 @@ def _compare(self, a, b):
else:
return a == b

def _buffer_list_equal(self, a, b):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have this as an extension point, keeping the key could be very informative as to which kind of comparison to perform. If anything, it's a problem with the current _compare that it does not have it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. for certain keys you might want to do a floating point comparison with a given precision, while for another you need exact equality. Without the key, you will not be able to determine which comparison to apply.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this being an extension point? I don't understand how you're thinking of this as an extension point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean extension point as in "something an inheriting class might want to override".

Copy link
Member

@jasongrout jasongrout Aug 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this comparison depends on the top-level put/remove buffer function implementations, so I factored out the comparison to also be a top-level function.

Copy link
Member

@jasongrout jasongrout Aug 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean extension point as in "something an inheriting class might want to override".

Ah. Since it depends on the implementation of the top-level functions (i.e., what exactly can be in the buffers list), I think it makes more sense to not override it, and instead put it at the top level too.

Copy link
Member Author

@vidartf vidartf Aug 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation works for me, but if someone uses a serializer/deserialize set that might introduce noise when data is passed through a loop (a != serializer(deserialize(a))), e.g. floating point inaccuracies, they will want to override this comparison.

It is still possible to do that by overriding the _should_send_property function, but it will be more brittle as it needs to take into account the internals of that method (e.g. _property_lock).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I might be over-engineering this one. Other fields in the state other than buffers might also have this issue, so there is no obvious reason to give buffers preferential treatment.

"""Compare two lists of buffers for equality.

Used to decide whether two sequences of buffers (memoryviews,
bytearrays, or python 3 bytes) differ, such that a sync is needed.

Returns True if equal, False if unequal
"""
if len(a) != len(b):
return False
if a == b:
return True
for ia, ib in zip(a, b):
# Check byte equality:
# NOTE: Simple ia != ib does not always work as intended, as
# e.g. memoryview(np.frombuffer(ia, dtype='float32')) !=
# memoryview(np.frombuffer(b)), since the format info differs.
# However, since we only transfer bytes, we use `tobytes()`.
ia_bytes = ia.tobytes() if isinstance(ia, memoryview) else ia
ib_bytes = ib.tobytes() if isinstance(ib, memoryview) else ib
if ia_bytes != ib_bytes:
return False
return True

def set_state(self, sync_data):
"""Called when a state is received from the front-end."""
# The order of these context managers is important. Properties must
Expand Down Expand Up @@ -597,10 +623,16 @@ def _should_send_property(self, key, value):
# A roundtrip conversion through json in the comparison takes care of
# idiosyncracies of how python data structures map to json, for example
# tuples get converted to lists.
if (key in self._property_lock
and jsonloads(jsondumps(to_json(value, self))) == self._property_lock[key]):
return False
elif self._holding_sync:
if key in self._property_lock:
# model_state, buffer_paths, buffers
split_value = _remove_buffers({ key: to_json(value, self)})
split_lock = _remove_buffers({ key: self._property_lock[key]})
# Compare state and buffer_paths
if (jsonloads(jsondumps(split_value[0])) == jsonloads(jsondumps(split_lock[0]))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather pass sort_keys=True to dumps rather than pass it back through loads, but it is not critical and I have not profiled the two.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I also factored out this comparison into a top-level function.

and split_value[1] == split_lock[1]
and self._buffer_list_equal(split_value[2], split_lock[2])):
return False
if self._holding_sync:
self._states_to_send.add(key)
return False
else:
Expand Down