Add performance benchmark config: MPS 8da4w #8461

Status: Open (4 commits to be merged into main)

manuelcandales
Contributor

Adds a new performance benchmark config to track performance on the MPS backend when running Llama 3.2 1B inference with 8da4w quantization.


pytorch-bot bot commented Feb 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8461

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Cancelled Job

As of commit b00fce1 with merge base 0222074 (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 13, 2025
@manuelcandales manuelcandales temporarily deployed to upload-benchmark-results February 13, 2025 17:41 — with GitHub Actions Inactive
@guangy10
Contributor

Added a link to the Benchmark project here: #8473

@guangy10
Contributor

Looks good! Please schedule an on-demand benchmark job to test this new config on your PR before merging.

@manuelcandales manuelcandales had a problem deploying to upload-benchmark-results February 13, 2025 21:39 — with GitHub Actions Failure
Bump up timeout threshold
Contributor

@guangy10 guangy10 left a comment

The job was cancelled due to a timeout (after running for 120 minutes). Temporarily bumping the threshold to 240 minutes to see whether the run can actually finish. Debugging the slow run can be done later.

@guangy10 guangy10 temporarily deployed to upload-benchmark-results February 19, 2025 01:10 — with GitHub Actions Inactive
@guangy10
Contributor

@huydhn @yangw-dev do you have any idea why this benchmark job is running forever? The previous attempt timed out after 2 hours, and I can't find any info on why it's taking so long. It looks like a cancelled job doesn't have any logs? I temporarily bumped the timeout to 4 hours to see if it can finish successfully, but per @manuelcandales the model shouldn't run that slowly.

https://github.com/pytorch/executorch/actions/runs/13402426162/job/37438073261

@huydhn
Contributor

huydhn commented Feb 19, 2025

From what I see in the previous run https://us-west-2.console.aws.amazon.com/devicefarm/home#/mobile/projects/02a2cf0f-6d9b-45ee-ba1a-a086587469e6/runs/4f1fcf14-2a4c-4364-ad4a-b9c7ecc0a783 and the current run https://us-west-2.console.aws.amazon.com/devicefarm/home#/mobile/projects/02a2cf0f-6d9b-45ee-ba1a-a086587469e6/runs/255ff74c-ad69-43b1-9014-4317e949d9ed, it's always iOS 18 that hangs. So maybe this has something to do with the OS.

In other cases, the test failed with this error: Assertion failed: (0 && "unexpected MPSDataType"), function getMLIRElementType, file MPSGraphUtilities.mm, line 149.

[DEVICEFARM] ########### Entering phase test ###########
 
[DeviceFarm] xcodebuild test-without-building -destination id=$DEVICEFARM_DEVICE_UDID -xctestrun $DEVICEFARM_TEST_PACKAGE_PATH/*.xctestrun -derivedDataPath $DEVICEFARM_LOG_DIR
Command line invocation:
    /Applications/Xcode_15.app/Contents/Developer/usr/bin/xcodebuild test-without-building -destination id=00008120-001A43462E52201E -xctestrun /tmp/devicefarm-workspace/execution-sfquu6rh/test-package-o6owemum/Benchmark_Tests_iphoneos17.5-arm64.xctestrun -derivedDataPath /tmp/devicefarm-workspace/execution-sfquu6rh/logs-e3ncx7ui

User defaults from command line:
    IDEDerivedDataPathOverride = /tmp/devicefarm-workspace/execution-sfquu6rh/logs-e3ncx7ui
    IDEPackageSupportUseBuiltinSCM = YES

2025-02-13 11:45:39.157 xcodebuild[1142:14060]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError
Test Suite 'All tests' started at 2025-02-13 11:45:40.111.
Test Suite 'Tests.xctest' started at 2025-02-13 11:45:40.111.
Test Suite 'GenericTests' started at 2025-02-13 11:45:40.111.
Test Case '-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]' started.
Assertion failed: (0 && "unexpected MPSDataType"), function getMLIRElementType, file MPSGraphUtilities.mm, line 149.
2025-02-13 11:46:02.103 xcodebuild[1142:14065]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError

Restarting after unexpected exit, crash, or test timeout in -[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]; summary will include totals from previous launches.

Test Suite 'Selected tests' started at 2025-02-13 11:46:02.478.
Test Suite 'Tests.xctest' started at 2025-02-13 11:46:02.479.
Test Suite 'GenericTests' started at 2025-02-13 11:46:02.479.
Test Case '-[GenericTests test_load_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]' started.
Assertion failed: (0 && "unexpected MPSDataType"), function getMLIRElementType, file MPSGraphUtilities.mm, line 149.
2025-02-13 11:46:24.513 xcodebuild[1142:14110]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError

Restarting after unexpected exit, crash, or test timeout in -[GenericTests test_load_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]; summary will include totals from previous launches.

Test Suite 'Selected tests' started at 2025-02-13 11:46:24.981.
Test Suite 'Tests.xctest' started at 2025-02-13 11:46:24.981.
Test Suite 'GenericTests' started at 2025-02-13 11:46:24.981.
Test Suite 'GenericTests' failed at 2025-02-13 11:46:24.981.
	 Executed 2 tests, with 2 failures (0 unexpected) in 0.000 (0.000) seconds
Test Suite 'LLaMATests' started at 2025-02-13 11:46:24.982.
Test Case '-[LLaMATests test_generate_llama_3_2_1b_llama3_mps_8da4w_pte_tokenizer_model_iOS_17_2_1_iPhone15_4]' started.
Assertion failed: (0 && "unexpected MPSDataType"), function getMLIRElementType, file MPSGraphUtilities.mm, line 149.
2025-02-13 11:46:46.595 xcodebuild[1142:14109]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError

Restarting after unexpected exit, crash, or test timeout in -[LLaMATests test_generate_llama_3_2_1b_llama3_mps_8da4w_pte_tokenizer_model_iOS_17_2_1_iPhone15_4]; summary will include totals from previous launches.

Test Suite 'Selected tests' started at 2025-02-13 11:46:46.995.
Test Suite 'Tests.xctest' started at 2025-02-13 11:46:46.996.
Test Suite 'LLaMATests' started at 2025-02-13 11:46:46.996.
Test Suite 'LLaMATests' failed at 2025-02-13 11:46:46.996.
	 Executed 1 test, with 1 failure (0 unexpected) in 0.000 (0.000) seconds
Test Suite 'Tests.xctest' failed at 2025-02-13 11:46:46.996.
	 Executed 3 tests, with 3 failures (0 unexpected) in 0.000 (0.000) seconds
Test Suite 'Selected tests' failed at 2025-02-13 11:46:46.996.
	 Executed 3 tests, with 3 failures (0 unexpected) in 0.000 (0.001) seconds
2025-02-13 11:46:53.135 xcodebuild[1142:13639] [MT] IDETestOperationsObserverDebug: 176.335 elapsed -- Testing started completed.
2025-02-13 11:46:53.135 xcodebuild[1142:13639] [MT] IDETestOperationsObserverDebug: 0.000 sec, +0.000 sec -- start
2025-02-13 11:46:53.135 xcodebuild[1142:13639] [MT] IDETestOperationsObserverDebug: 176.335 sec, +176.335 sec -- end

Test session results, code coverage, and logs:
	/tmp/devicefarm-workspace/execution-sfquu6rh/logs-e3ncx7ui/Logs/Test/Test-Benchmark-2025.02.13_11-43-56--0800.xcresult

Failing tests:
	-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]
	-[GenericTests test_load_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]
	-[LLaMATests test_generate_llama_3_2_1b_llama3_mps_8da4w_pte_tokenizer_model_iOS_17_2_1_iPhone15_4]

** TEST EXECUTE FAILED **
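
For triaging logs like the one above, a small script can attribute each crash to the test that was running when it happened. This is only a rough sketch: the regular expressions are written against the xcodebuild lines quoted in this thread, and the `summarize` helper and `SAMPLE` excerpt are hypothetical names introduced for illustration, not part of the benchmark tooling.

```python
import re

# Patterns written against the xcodebuild/XCTest lines quoted in this thread.
STARTED = re.compile(r"Test Case '-\[(\w+) (\w+)\]' started\.")
FINISHED = re.compile(r"Test Case '-\[(\w+) (\w+)\]' (passed|failed)")
ASSERTION = re.compile(r"Assertion failed: .*")
RESTART = "Restarting after unexpected exit, crash, or test timeout"


def summarize(log_text):
    """Map each test name to 'passed', 'failed', 'crashed[: <assertion>]' or 'no result'."""
    results = {}
    current = None    # test that is currently running, if any
    assertion = None  # last assertion message seen while it was running
    for line in log_text.splitlines():
        match = STARTED.search(line)
        if match:
            current = match.group(2)
            assertion = None
            results[current] = "no result"  # overwritten if an outcome shows up
            continue
        match = FINISHED.search(line)
        if match:
            results[match.group(2)] = match.group(3)
            current = None
            continue
        match = ASSERTION.search(line)
        if match and current:
            assertion = match.group(0)
            continue
        if RESTART in line and current:
            results[current] = f"crashed: {assertion}" if assertion else "crashed"
            current = None
    return results


# Short excerpt copied from the iOS 17 log above.
SAMPLE = """\
Test Case '-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]' started.
Assertion failed: (0 && "unexpected MPSDataType"), function getMLIRElementType, file MPSGraphUtilities.mm, line 149.

Restarting after unexpected exit, crash, or test timeout in -[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]; summary will include totals from previous launches.

Test Case '-[GenericTests test_load_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_17_2_1_iPhone15_4]' started.
"""

if __name__ == "__main__":
    for name, outcome in summarize(SAMPLE).items():
        print(name, "->", outcome)
```

A test that started but never reported an outcome (for example because the job was terminated) is left as "no result", which makes a hang easy to tell apart from a crash.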

@huydhn
Contributor

huydhn commented Feb 19, 2025

Also, here is the test output from the iOS 18 run that hung:

[DEVICEFARM] ########### Entering phase test ###########
 
[DeviceFarm] xcodebuild test-without-building -destination id=$DEVICEFARM_DEVICE_UDID -xctestrun $DEVICEFARM_TEST_PACKAGE_PATH/*.xctestrun -derivedDataPath $DEVICEFARM_LOG_DIR
Command line invocation:
    /Applications/Xcode_16.app/Contents/Developer/usr/bin/xcodebuild test-without-building -destination id=00008120-00123D4E0CD1A01E -xctestrun /tmp/devicefarm-workspace/execution-_pntzxas/test-package-rgq2hkis/Benchmark_Tests_iphoneos17.5-arm64.xctestrun -derivedDataPath /tmp/devicefarm-workspace/execution-_pntzxas/logs-arz44yax

User defaults from command line:
    IDEDerivedDataPathOverride = /tmp/devicefarm-workspace/execution-_pntzxas/logs-arz44yax
    IDEPackageSupportUseBuiltinSCM = YES

2025-02-13 11:45:15.426 xcodebuild[1206:11246]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError
Test Suite 'All tests' started at 2025-02-13 21:45:16.588.
Test Suite 'Tests.xctest' started at 2025-02-13 21:45:16.589.
Test Suite 'GenericTests' started at 2025-02-13 21:45:16.589.
Test Case '-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_18_0_iPhone15_4]' started.
2025-02-13 21:45:17.981382+0200 Benchmark[528:12959] fopen failed for data file: errno = 2 (No such file or directory)
2025-02-13 21:45:17.981422+0200 Benchmark[528:12959] Errors found! Invalidating cache...
2025-02-13 21:45:18.102307+0200 Benchmark[528:12959] fopen failed for data file: errno = 2 (No such file or directory)
2025-02-13 21:45:18.102337+0200 Benchmark[528:12959] Errors found! Invalidating cache...
2025-02-13 21:45:18.403760+0200 Benchmark[528:13239] Invalid layer: Tensor dimensions N1D1C128256H1W2048 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
/Users/runner/work/executorch/executorch/pytorch/executorch/extension/benchmark/apple/Benchmark/Tests/GenericTests.mm:90: Test Case '-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_18_0_iPhone15_4]' measured [Memory Peak Physical, kB] average: 910610.813, relative standard deviation: 0.064%, values: [909624.496000, 909755.568000, 909903.024000, 910034.096000, 910099.632000, 910165.168000, 910197.936000, 910329.008000, 910394.544000, 910476.464000, 910623.920000, 910722.224000, 910853.296000, 911000.752000, 911115.440000, 911213.744000, 911262.896000, 911344.816000, 911492.272000, 911606.960000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/runner/work/executorch/executorch/pytorch/executorch/extension/benchmark/apple/Benchmark/Tests/GenericTests.mm:90: Test Case '-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_18_0_iPhone15_4]' measured [Memory Physical, kB] average: 105.677, relative standard deviation: 40.570%, values: [131.072000, 114.688000, 163.840000, 131.072000, 49.152000, -16.384000, 114.688000, 131.072000, 49.152000, 114.688000, 147.456000, 81.920000, 147.456000, 131.072000, 114.688000, 49.152000, 81.920000, 114.688000, 131.072000, 131.072000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/runner/work/executorch/executorch/pytorch/executorch/extension/benchmark/apple/Benchmark/Tests/GenericTests.mm:90: Test Case '-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_18_0_iPhone15_4]' measured [Clock Monotonic Time, s] average: 0.133, relative standard deviation: 25.002%, values: [0.048478, 0.058375, 0.069154, 0.098666, 0.141264, 0.145176, 0.147966, 0.146008, 0.149075, 0.150631, 0.146958, 0.146468, 0.152349, 0.154769, 0.151167, 0.149614, 0.149809, 0.147247, 0.148472, 0.149282], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
Test Case '-[GenericTests test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_18_0_iPhone15_4]' passed (5.926 seconds).
Test Case '-[GenericTests test_load_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_18_0_iPhone15_4]' started.
2025-02-13 21:45:24.187354+0200 Benchmark[528:13246] Invalid layer: Tensor dimensions N1D1C128256H1W2048 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
2025-02-13 21:45:25.706109+0200 Benchmark[528:13247] Invalid layer: Tensor dimensions N1D1C128256H1W2048 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
2025-02-13 11:45:47.867 xcodebuild[1206:11246]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError

Restarting after unexpected exit, crash, or test timeout; summary will include totals from previous launches.

Test Suite 'Selected tests' started at 2025-02-13 21:45:48.794.
Test Suite 'Tests.xctest' started at 2025-02-13 21:45:48.794.
Test Suite 'GenericTests' started at 2025-02-13 21:45:48.794.
Test Suite 'GenericTests' passed at 2025-02-13 21:45:48.794.
	 Executed 0 tests, with 0 failures (0 unexpected) in 0.000 (0.000) seconds
Test Suite 'LLaMATests' started at 2025-02-13 21:45:48.794.
Test Case '-[LLaMATests test_generate_llama_3_2_1b_llama3_mps_8da4w_pte_tokenizer_model_iOS_18_0_iPhone15_4]' started.
2025-02-13 21:45:49.959863+0200 Benchmark[533:14145] Invalid layer: Tensor dimensions N1D1C128256H1W2048 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
, there was an old man who wanted to do some housework. He had two small sons, and they would not let him do the housework because he was old and they were small. One day, the old man sent
PyTorchObserver {"prompt_tokens":4,"generated_tokens":45,"model_load_start_ms":0,"model_load_end_ms":0,"inference_start_ms":1739475949665,"inference_end_ms":1739475956980,"prompt_eval_end_ms":1739475950188,"first_token_ms":1739475950188,"aggregate_sampling_time_ms":189,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
 a big fish was swallowed by a little fish and then it was swallowed by a little frog. This happened so many times that the little frog was very hungry. So he thought of a plan. He wanted to eat as many fishes
PyTorchObserver {"prompt_tokens":4,"generated_tokens":45,"model_load_start_ms":0,"model_load_end_ms":0,"inference_start_ms":1739475956999,"inference_end_ms":1739475964462,"prompt_eval_end_ms":1739475957591,"first_token_ms":1739475957591,"aggregate_sampling_time_ms":381,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
, Thomas Eggar wrote a book called The Black Book of the South Seas. The author was a journalist who set out to find the truth about the slave trade and the fate of the enslaved.
The book was published in 199
PyTorchObserver {"prompt_tokens":4,"generated_tokens":45,"model_load_start_ms":0,"model_load_end_ms":0,"inference_start_ms":1739475964481,"inference_end_ms":1739475971928,"prompt_eval_end_ms":1739475965067,"first_token_ms":1739475965067,"aggregate_sampling_time_ms":595,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
 there was a man who was rich and had two sons. When the man died the sons inherited his possessions and fortune. The older son left home and entered the church and was ordained. The younger son went to a monastery and lived
PyTorchObserver {"prompt_tokens":4,"generated_tokens":45,"model_load_start_ms":0,"model_load_end_ms":0,"inference_start_ms":1739475971946,"inference_end_ms":1739475979364,"prompt_eval_end_ms":1739475972539,"first_token_ms":1739475972539,"aggregate_sampling_time_ms":789,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
 in the Holy Land
Pil** BUILD INTERRUPTED **
Terminated: 15
[DEVICEFARM] ########### Stop received, exit testspec execution ###########
[DEVICEFARM] ########### Finish executing testspec ###########
 
[DEVICEFARM] ########### Setting upload permissions ###########
 
 
[DEVICEFARM] Tearing down your device. Your tests report will come shortly.

From what I see:

  • test_forward_llama_3_2_1b_llama3_mps_8da4w_pte_iOS_18_0_iPhone15_4 passed in 6s
  • test_generate_llama_3_2_1b_llama3_mps_8da4w_pte_tokenizer_model_iOS_18_0_iPhone15_4 was the one that hung?
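
The PyTorchObserver lines in the log above carry enough timing data to check whether generation itself was slow before the job was terminated. The sketch below is a rough illustration, not part of the benchmark tooling: the helper names `observer_records` and `throughput` are hypothetical, and the interpretation of the fields (time to first token measured from `inference_start_ms`, decode rate as `generated_tokens` divided by the time after `prompt_eval_end_ms`) is an assumption based on the field names.

```python
import json

PREFIX = "PyTorchObserver "


def observer_records(log_text):
    """Yield the JSON payload of every PyTorchObserver line in the log."""
    for line in log_text.splitlines():
        if line.startswith(PREFIX):
            yield json.loads(line[len(PREFIX):])


def throughput(record):
    """Derive timing figures from one record (field meanings are assumed)."""
    scale = record["SCALING_FACTOR_UNITS_PER_SECOND"]  # 1000 means timestamps are in ms
    ttft_s = (record["first_token_ms"] - record["inference_start_ms"]) / scale
    decode_s = (record["inference_end_ms"] - record["prompt_eval_end_ms"]) / scale
    return {
        "time_to_first_token_s": ttft_s,
        "decode_tokens_per_s": record["generated_tokens"] / decode_s,
    }


# First PyTorchObserver line from the iOS 18 log above, copied verbatim.
SAMPLE = (
    'PyTorchObserver {"prompt_tokens":4,"generated_tokens":45,'
    '"model_load_start_ms":0,"model_load_end_ms":0,'
    '"inference_start_ms":1739475949665,"inference_end_ms":1739475956980,'
    '"prompt_eval_end_ms":1739475950188,"first_token_ms":1739475950188,'
    '"aggregate_sampling_time_ms":189,"SCALING_FACTOR_UNITS_PER_SECOND":1000}'
)

if __name__ == "__main__":
    for record in observer_records(SAMPLE):
        print(throughput(record))
```

Under these assumptions, the first record works out to roughly half a second to the first token and a decode rate between 6 and 7 tokens per second, which would suggest each generate iteration completes in a few seconds and that the hang is in the repetition or termination of the test rather than in a single slow generation.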

@guangy10 guangy10 had a problem deploying to upload-benchmark-results February 19, 2025 04:57 — with GitHub Actions Failure
@guangy10
Contributor

Weird. When I checked the log earlier, I couldn't find the per-device sections. Now I can see them.

@manuelcandales I think this pointer should help you debug the issue: #8461 (comment). Do you expect this benchmark config to run on both iOS 17 and 18? If it is meant for 17 only, we should disable the run for 18. Per what Huy pointed out above, generate can produce new tokens but somehow fails to terminate.

Labels: CLA Signed, topic: not user facing