Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[devtool] make ETDumpGen use datasink #8499

Open
wants to merge 6 commits into
base: gh/gasoonjia/2/base
Choose a base branch
from

Conversation

Gasoonjia
Copy link
Contributor

@Gasoonjia Gasoonjia commented Feb 14, 2025

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)

[ghstack-poisoned]
Copy link

pytorch-bot bot commented Feb 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8499

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 4328491 with merge base cc3974f (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 14, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

Gasoonjia added a commit that referenced this pull request Feb 14, 2025
This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)

ghstack-source-id: 266455642
Pull Request resolved: #8499
@Gasoonjia
Copy link
Contributor Author

@pytorchbot label "topic: not user facing"

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)

[ghstack-poisoned]
Gasoonjia added a commit that referenced this pull request Feb 16, 2025
Pull Request resolved: #8499

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r
ghstack-source-id: 266706035
@exported-using-ghexport

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

2 similar comments
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

Gasoonjia added a commit that referenced this pull request Feb 17, 2025
Pull Request resolved: #8499

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r
ghstack-source-id: 266774207
@exported-using-ghexport

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)
This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

Gasoonjia added a commit that referenced this pull request Feb 17, 2025
Pull Request resolved: #8499

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r
ghstack-source-id: 266868050
@exported-using-ghexport

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)
*
* @param[in] buffer A Span object representing the buffer where data will be
* stored.
*/
explicit BufferDataSink(::executorch::runtime::Span<uint8_t> buffer)
: debug_buffer_(buffer), offset_(0) {}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there internal users who need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not right now but introducing this constructor may make our friends life easier and I'd like to make it as a primary entrance in the future

If our user contructs BufferDataSink from Span:

void* ptr = malloc(2048)
Span<uint8_t> span= Span(ptr, 2048)
BufferDataSink bds(span)

However if from the new constructor:

void* ptr = malloc(2048)
BufferDataSink bds(ptr, 2048)

More straightforward and sturcure, user doesn't need to take care about span or anything else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the Span ctor would work with syntax like

void* ptr = malloc(2048);
BufferDataSink bds({ptr, 2048});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Program syntactically yes, but user need to figure out what's span is, how to construct it, and how to make the logic shorter, which introduces lots of ramp up cost.
Put into user's shoe, they don't need to knwo what span is. A buffer data sink should be directly constructed on the user-provided memory space, and all type transforming should be happened under-the-neath.

@@ -497,27 +499,15 @@ ETDumpResult ETDumpGen::get_etdump_data() {
}

void ETDumpGen::set_debug_buffer(Span<uint8_t> buffer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove the set_debug_buffer api. Is there any reason to keep it around? Makes it a little confusing for users i think to have these 2 around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for BC. Currently all our users use set_debug_buffer to set the memory for debug data, and their work will be impacted if we remove that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we should keep this API around i think. Only users who want to do their custom thing should know about data sinks, otherwise most users should just be able to call set_debug_buffer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User only need standard data dumping pipeline can use both datasink and set_debug_buffer, since we have provided BufferDataSink which is exactly the same as debug buffer pipeline. But from API design prospective, I would like to deprecate set_debug_buffer api and only focus on data sink for better structure.

@@ -497,27 +499,15 @@ ETDumpResult ETDumpGen::get_etdump_data() {
}

void ETDumpGen::set_debug_buffer(Span<uint8_t> buffer) {
debug_buffer_ = buffer;
data_sink_ = std::make_shared<BufferDataSink>(buffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why make_shared here. This is only owned internally by this class right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just the underlying data_sink_ is a shared_ptr;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our user may still want to use theirdata_sink after ETDumpGen lifecycle, so that I make data_sink_ as shared_ptr.

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

Gasoonjia added a commit that referenced this pull request Feb 18, 2025
Pull Request resolved: #8499

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r
ghstack-source-id: 267038962
@exported-using-ghexport

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)
}

long ETDumpGen::write_tensor_or_raise_error(Tensor tensor) {
ET_CHECK_MSG(data_sink_, "Must set data sink before writing data\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit : don't need \n.

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D69647096

Gasoonjia added a commit that referenced this pull request Feb 19, 2025
Pull Request resolved: #8499

This diff enables customized debug data pipeline by making ETDumpGen leverage user-provided datasink.

Details can be found in https://docs.google.com/document/d/1y_m32mKdj-OgLcLUz9TKhBW3PC3bBDYSBbeAH544EfM/edit?tab=t.0#heading=h.jlkcrurw482r
ghstack-source-id: 267135769
@exported-using-ghexport

Differential Revision: [D69647096](https://our.internmc.facebook.com/intern/diff/D69647096/)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported topic: not user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants