uploader: inline graph filtering from dataclass_compat #3510
Conversation
Summary: We initially used `dataclass_compat` to perform filtering of large graphs as a stopgap mechanism. This commit moves that filtering into the uploader, which is the only surface in which it’s actually used. As a result, `dataclass_compat` no longer takes extra arguments and so can be moved into `EventFileLoader` in a future change.

Test Plan: Unit tests added to the uploader for the small graph, large graph, and corrupt graph cases.

wchargin-branch: uploader-graph-filtering
wchargin-source: 00af6ebe200fe60aacbc23f468eb44126d2d33f7
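For orientation, the filtering being moved boils down to: parse each serialized `GraphDef`, drop oversized attribute values, and reserialize. A minimal sketch under assumed names (`_MAX_ATTR_BYTES` and `_filtered_graph_bytes` are hypothetical; the real code lives in `tensorboard/uploader/uploader.py`):

```python
from google.protobuf import message
from tensorboard.compat.proto import graph_pb2

_MAX_ATTR_BYTES = 1024  # hypothetical per-attr size limit


def _filtered_graph_bytes(graph_bytes):
    """Parses graph bytes, strips large attr values, and reserializes."""
    try:
        graph_def = graph_pb2.GraphDef.FromString(graph_bytes)
    except message.DecodeError:
        return None  # unparseable graphs get skipped (see the thread below)
    for node in graph_def.node:
        # Collect keys first so the map isn't mutated while iterating.
        large = [k for k, v in node.attr.items() if v.ByteSize() > _MAX_ATTR_BYTES]
        for key in large:
            del node.attr[key]
    return graph_def.SerializeToString()
```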
wchargin-branch: uploader-graph-filtering wchargin-source: 9179acc76158380d9f254df1d0a0aae1ec82df6c
```python
for dataclass_event in dataclass_events:
    if dataclass_event.summary:
        for value in dataclass_event.summary.value:
events = dataclass_compat.migrate_event(v2_event)
```
Please don't shadow `events` (and `event` below) -- that is confusing and error-prone. `dataclass_events` was fine imho.
In my experience, this form of shadowing is less error-prone than the alternative, because it is never possible to accidentally refer to the old value. Case in point: prior to this change, the code was subtly incorrect, because the `yield` expression referred to `event` rather than `dataclass_event`. (This would be observable if a dataclass transformation were to affect the wall time or step—certainly allowed, and perfectly conceivable for conceptually run-level summaries like graphs or hparams.)

Shadowing makes that kind of error structurally impossible, and, as you correctly note, this shadowing is entirely lexical.
Just for what it's worth, all of this goes away in the subsequent PR anyway (https://github.com/tensorflow/tensorboard/pull/3511/files#diff-64ad30888691c0bf32cae63247f4ca5c).
I remain opposed to any shadowing whatsoever, but don't need to argue this case since it's going away in #3511. (Eep, the former bug does suck, though; thanks for the fix.)
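To make the tradeoff concrete, here is a toy version of the two styles (hypothetical function names, not the PR's code):

```python
def migrate_all_distinct(events, migrate):
    # Distinct names: the pre-migration value stays in scope, so the
    # typo `yield event` (instead of `yield dataclass_event`) runs
    # fine and silently discards the migration's changes.
    for event in events:
        for dataclass_event in migrate(event):
            yield event  # BUG: refers to the un-migrated event


def migrate_all_shadowed(events, migrate):
    # Shadowing: reusing the name makes the old value unreachable past
    # the inner loop header, so the same mistake cannot be written.
    for event in events:
        for event in migrate(event):
            yield event  # can only be the migrated event
```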
```python
    != graphs_metadata.PLUGIN_NAME
):
    continue
if v.tag == graphs_metadata.RUN_GRAPH_NAME:
```
This seems unnecessarily restrictive. In practice, for now, we have only run-level graphs anyway. But IIRC we expect to allow `"__op_graph__/foo"` and `"__conceptual_graph__/bar"` also, and those should also be filtered. NBD for now but maybe add a comment or TODO to address this later.

(I know this is a straight refactor of code that already ignored those cases, but still)
As you note, I’m porting existing code. Op graphs and conceptual graphs do not actually have plugin name `graphs`; instead, they use plugin names `graph_run_metadata_graph` and `graph_keras_model`, so clearly at least some special considerations will need to be taken, even before we consider whether the semantics should be the same. Attempting to anticipate those now seems premature to me.
Oh right, I forgot about the different plugin names. So this condition is protecting us from data that is associated with the graph plugin, but under a different tag. There is currently no such thing; if something arises in the future, we can't predict whether it should be filtered or not-- so now I agree the condition is correct.
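Assembled from the excerpt above, the guard under discussion looks roughly like this (a reconstruction, not the verbatim diff; `_filtered_graph_bytes` is an assumed helper name):

```python
for v in event.summary.value:
    if (
        v.metadata.plugin_data.plugin_name
        != graphs_metadata.PLUGIN_NAME
    ):
        continue  # not owned by the graphs plugin at all
    # Op graphs and conceptual graphs use the plugin names
    # graph_run_metadata_graph and graph_keras_model, so today only
    # run-level graphs can reach this branch.
    if v.tag == graphs_metadata.RUN_GRAPH_NAME:
        v.tensor.string_val[:] = [
            _filtered_graph_bytes(b) for b in v.tensor.string_val
        ]
```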
```python
# Three graphs: one short, one long, one corrupt.
bytes_0 = _create_example_graph_bytes(123)
bytes_1 = _create_example_graph_bytes(9999)
bytes_2 = b"\x0a\x7fbogus"  # invalid (truncated) proto
```
Please add a comment about what these values mean and why they are necessary for the test. `"bogus"` alone was sufficient to trigger `DecodeError`. Is this the case that sometimes produces `RuntimeWarning`?
b"bogus"
is much less clearly bogus. I prefer not decoding ASCII in my
head when I can get away with it, so while b"b"
is clearly 0x62
(and
thus clearly a valid start byte) I’m not eager to decode b"o"
to
figure out what the string length is.
The example here b"\x0a\x7fbogus"
is just what it says: a truncated
proto, because the field is asserted to be of length 0x7f but is
actually only of length 5. So this is just much easier to read, at least
for me.
I don’t feel super strongly about this, so I can change it if you do,
but in general it feels unwise to me to suggest via the code that
ASCII bytestrings have the same semantics in proto as in English. A
proto that said bytes_3 = b"valid"
wouldn’t be valid… right? :-)
I have no idea what the RuntimeWarning
business is about—I couldn’t
figure it out from the comments in the code or even from the comments on
the linked bug, which is Google-internal.
That's all good; I'm just saying that you can't expect a reader to recognize and decode the proto wire format on sight. A comment would solve that:

```python
# Declare a proto field of length 127 bytes (`\x7f`), but provide only 5 bytes.
```

Similarly explaining the `\x0a` would be nice.
Okay, added an explicit comment per your suggestion.
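For the record, the two magic bytes decode as follows: `\x0a` is the tag for field number 1 with wire type 2 (length-delimited), and `\x7f` is a varint length of 127; only the 5 bytes of `bogus` follow. A quick self-contained check (assuming TensorBoard's bundled protos are importable):

```python
from google.protobuf import message
from tensorboard.compat.proto import graph_pb2

try:
    graph_pb2.GraphDef.FromString(b"\x0a\x7fbogus")
except message.DecodeError as e:
    # The parser reads a header announcing 127 payload bytes, finds
    # only 5 left in the buffer, and gives up.
    print("DecodeError:", e)
```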
tensorboard/uploader/uploader.py (Outdated)
```python
logger.warning(
    "Could not parse GraphDef of size %d.", len(graph_bytes),
)
return graph_bytes
```
Previously this effectively returned `None`, leading to the event being skipped entirely. I think that's correct: the philosophy here is that we only bother uploading a graph that we will later be able to render. Matching that will require adjusting the logic in `_filter_graph_defs` above, because uploading an empty blob is different from dropping the event.
OK, I’ll defer to you here. Done. Please note that this both slows and complicates the code because we have to deal with changes to the size and dimension of the tensor.

Yes, uploading an empty blob is certainly different from dropping the event; for what it’s worth, the uploader doesn’t appear to upload empty blobs correctly (it never finalizes them).
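Sketching the agreed-upon behavior (assumed helper names; this is the size-and-dimension bookkeeping referred to above, not the exact diff):

```python
def _prune_unparseable_graphs(v):
    # Parse failures now yield None rather than the original bytes.
    filtered = [_filtered_graph_bytes(b) for b in v.tensor.string_val]
    kept = [b for b in filtered if b is not None]
    v.tensor.string_val[:] = kept
    # Dropping values changes the tensor's size, so its recorded shape
    # must be recomputed as well.
    if v.tensor.tensor_shape.dim:
        v.tensor.tensor_shape.dim[0].size = len(kept)
```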
```python
    except message.DecodeError:
        actual_graph_defs.append(None)

with self.subTest("small graphs should pass through unchanged"):
```
"graphs with small attr values..."
Done.
```python
    expected_graph_def_0 = graph_pb2.GraphDef.FromString(bytes_0)
    self.assertEqual(actual_graph_defs[0], expected_graph_def_0)

with self.subTest("large graphs should be filtered"):
```
"large attr values should be filtered out"
Done.
```python
    requests = list(mock_client.WriteBlob.call_args[0][0])
    self.assertEqual(actual_graph_defs[1], expected_graph_def_1)

with self.subTest("corrupt graphs should be passed through unchanged"):
```
"...should be skipped", per previous
Done.
```python
@@ -355,6 +356,65 @@ def test_upload_skip_large_blob(self):
        self.assertEqual(0, mock_rate_limiter.tick.call_count)
        self.assertEqual(1, mock_blob_rate_limiter.tick.call_count)

    def test_filter_graphs(self):
```
Thanks for cleaning up and simplifying these tests! However, I am a little skeptical of this design because the tests are no longer independent. Can you factor out some common setup but keep the tests separate (while retaining most of the simplicity)?

Also, why did you remove the explicit tests for `RuntimeWarning` vs. `DecodeError` that @caisq had just requested in #3508?

Finally FYI #3509 is outstanding. NP, I'll merge it, assuming you submit first. The behavior of the empty graph case may actually change, depending on how you address the 'corrupt graph' case above.
> Thanks for cleaning up and simplifying these tests! However, I am a little skeptical of this design because the tests are no longer independent.

The tests do fail independently. Python `subTest` is roughly like the JS pattern of having a `describe` with setup followed by multiple `it`s. The setup runs once, and the cases run independently. Thus:
```python
import unittest


class Test(unittest.TestCase):
    def test(self):
        x = 2
        with self.subTest("should be one"):
            self.assertEqual(x, 1)
        with self.subTest("but also two"):
            self.assertEqual(x, 2)
        with self.subTest("and somehow three"):
            self.assertEqual(x, 3)


if __name__ == "__main__":
    unittest.main()
```
```
======================================================================
FAIL: test (__main__.Test) [should be one]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/test.py", line 8, in test
    self.assertEqual(x, 1)
AssertionError: 2 != 1

======================================================================
FAIL: test (__main__.Test) [and somehow three]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/test.py", line 12, in test
    self.assertEqual(x, 3)
AssertionError: 2 != 3

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=2)
```
It’s true that any side effects in one sub-test would be visible in the
others, but all the potentially effectful logic clearly happens in the
setup, not the assertions.
I’m not sure that I fully understand your concern; is this satisfactory?
> Also, why did you remove the explicit tests for `RuntimeWarning` vs. `DecodeError` that @caisq had just requested in #3508?
Mock-based tests are fragile, and the one proposed in #3508 is no exception. The code under test could well be refactored to use `ParseFromString` or `MergeFromString` instead of `FromString`, in which case the test would spuriously fail. I have spent far too much sanity over the past day, month, and year debugging and working around brittle mocks to be comfortable adding more of them without clear need.

The proposed test covers that the code does what we expect it to if a `RuntimeWarning` is raised from a certain place. That’s not really a contract-level spec. As far as I have been able to tell, we have no idea how to actually trigger the offending case! (The internal bug has no comments from anyone on the TensorBoard team, and I haven’t been able to find a reference to why we’re even talking about this, even with the resources available to me as a Googler.) The code as written obviously satisfies the proposed test case—it’s right there in the `except`—so the test serves only to protect against people accidentally deleting the `except` clause, which is already protected by the attached comment.

It’s always tradeoffs in the end. It is my opinion that the test does not justify the substantial additional complexity that it would incur.
> Finally FYI #3509 is outstanding. NP, I'll merge it, assuming you submit first. The behavior of the empty graph case may actually change, depending on how you address the 'corrupt graph' case above.
Thanks.
OK for now for expediency. For the record, my concern is that the setup code is not generic; it includes setup specific to the tests (e.g., the values and order of the bytes examples). If I wanted to add another subtest--perhaps the one from #3509--I'd have to change the setup code here. I can see this both ways (i.e., I could also add more tests that use the same examples, without changing the setup). The fact that currently parts of the setup correspond 1:1 with specific tests makes it look more entangled than it necessarily is.
wchargin-branch: uploader-graph-filtering wchargin-source: b89f6da9cd740080aefecec49194752b0de85dd4
Thanks; updated; PTAL.
Summary: Fixes an oversight in #3507: we can’t assert that the raw bytes are what was expected because the code under test does a proto serialization roundtrip, which is permitted to permute keys.

Test Plan: Tests still pass, and this should fix an internal sync error.

wchargin-branch: uploader-test-proto-equal
wchargin-branch: uploader-graph-filtering wchargin-source: 30f0764698d66cdaf481c797dc56bf6a34ae3ae0
tensorboard/uploader/uploader.py (Outdated)
```python
    # a combination of mysterious circumstances.
    except (message.DecodeError, RuntimeWarning):
        logger.warning(
            "Could not parse GraphDef of size %d.", len(graph_bytes),
```
Reinstate "Skipping" or similar in this warning, to clarify that the failure to parse means we don't upload.
Meant to; thanks. Done.
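The reinstated wording would read along these lines (the exact message is a guess; only the "Skipping" prefix was requested):

```python
logger.warning(
    "Skipping GraphDef of size %d that could not be parsed.",
    len(graph_bytes),
)
```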
LGTM mod a small remaining change or two
wchargin-branch: uploader-graph-filtering
wchargin-source: 63adea8ca2f0a85bd9f061aee72435f04d1d56d8
# Conflicts:
#   tensorboard/uploader/uploader_test.py
wchargin-branch: uploader-graph-filtering wchargin-source: 63adea8ca2f0a85bd9f061aee72435f04d1d56d8
Test sync passes.