Process files in parallel #32
Hello @kelunik. I'm trying to integrate AMP here for doing some parallel processing but I can't get it to work. The error I get is: First reason
Which points to this line saying However this seems incorrect and hints at a serialization issue. But when I try to serialize/unserialize it manually with the regular PHP functions everything looks fine. Is there any way to debug this? I can't pinpoint where this is happening exactly. Any idea?
@theofidry What do I have to run to reproduce the error?
I can reproduce that with Nevermind, will just catch it in
The exception is not caught: it's thrown at this line and then output in the console. I used breakpoints to look at the details of the exception.
src/Box.php
function (\SplFileInfo $fileInfo): string {
    return $fileInfo->getPathname();
},
iterator_to_array($files)
$files may be an array here.
true, it's not in practice but I should account for that still 👍
Seems like there's some bug in the closure serialization of https://github.com/opis/closure. It works if you map
$args = [];
foreach ($files as $file) {
    $args[] = [$file, $retrieveBasePath, $mapFile, $placeholders, $compactors];
}
Now I get
Completes successfully now. I removed a
diff --git a/src/Box.php b/src/Box.php
index dc69c6c..7d1f73b 100644
--- a/src/Box.php
+++ b/src/Box.php
@@ -157,25 +157,16 @@ final class Box
$compactors = $this->compactors;
$placeholders = $this->placeholders;
- //Debug: the values passed to the $processFile closure seems to be working fine
- $x0 = \serialize($retrieveBasePath);
- $x1 = \serialize($mapFile);
- $x3 = \serialize($placeholders);
- $x4 = \serialize($compactors);
-
- $y0 = \unserialize($x0, []);
- $y1 = \unserialize($x1, []);
- $y3 = \unserialize($x3, []);
- $y4 = \unserialize($x4, []);
-
$files = array_map(
- function (\SplFileInfo $fileInfo): string {
- return $fileInfo->getPathname();
+ function (\SplFileInfo $fileInfo) use ($retrieveBasePath, $mapFile, $compactors, $placeholders): array {
+ return [$fileInfo->getPathname(), $retrieveBasePath, $mapFile, $compactors, $placeholders];
},
iterator_to_array($files)
);
- $processFile = function (string $filePath) use ($retrieveBasePath, $mapFile, $placeholders, $compactors) {
+ $processFile = function ($args) {
+list($filePath, $retrieveBasePath, $mapFile, $compactors, $placeholders) = $args;
+
Assertion::file($filePath);
Assertion::readable($filePath);
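The idea behind the diff above can be illustrated in isolation: rather than capturing values through `use`, every value the callback needs is packed into one plain argument array per file, and the callback unpacks it. The following is only a minimal sketch under stated assumptions: `$basePath`, the placeholder map, the file list and the body of `$processFile` are hypothetical stand-ins and not the actual Box code, and `array_map` is used in place of a parallel runner so the snippet has no dependencies.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for the values the real closure captured via `use`.
$basePath = '/project';
$placeholders = ['@version@' => '1.0.0'];
$files = ['/project/src/A.php', '/project/src/B.php'];

// Build one self-contained argument tuple per file.
$args = [];
foreach ($files as $file) {
    $args[] = [$file, $basePath, $placeholders];
}

// The callback captures nothing: everything arrives through $args.
$processFile = static function (array $args): array {
    [$filePath, $basePath, $placeholders] = $args;

    // Path relative to the base path (illustrative logic only).
    $local = ltrim(substr($filePath, strlen($basePath)), '/');
    // Placeholder replacement on a dummy string instead of real file contents.
    $contents = strtr('version: @version@', $placeholders);

    return [$local, $contents];
};

// The tuples are plain arrays of scalars, so they survive a serialize round trip.
$roundTripped = unserialize(serialize($args));

// Sequential stand-in for the parallel map.
$results = array_map($processFile, $roundTripped);
```

Because the tuples hold only plain data, the only thing left to serialize on the closure side is the closure's own code, which keeps the surface for serialization problems small.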
Indeed, that works for building the PHAR from the source but it still fails in the PHAR:
The error I get is:
Which to be honest I have no idea what this means. Do you also have it on your end? If you want a more automated way to check this you can simply run
I have created opis/closure#18, without the
opis/closure#19 fixes it.
Thanks for the input @kelunik, that's been of great help! Regarding the usage in To be used outside a PHAR I guess I can always have a custom stub which would require the copy instead of autoloading the file from inside the PHAR.
Paves the way for #32. As it stands, #32 requires some functions from `Configuration`, which would require `Configuration` to be serializable. This is a lot of work, so it makes more sense to encapsulate some behaviour in utility classes and leverage them instead. As a result, only those utility classes would need to be serializable, which is trivial. This also prepares a change for the `Box` class, which IMO should handle the file mapping as it already handles the placeholder replacement and compactor processing.
@theofidry My guess is this, which will be inside the PHAR and not directly runnable, or does PHP support running
private static $pharScriptPath;
public function __construct(string $envClassName = BasicEnvironment::class, array $env = []
if (\strpos(self::SCRIPT_PATH, "phar://") === 0) {
if (self::$pharScriptPath) {
$script = self::$pharScriptPath;
} else {
$scriptContent = \file_get_contents(self::SCRIPT_PATH);
self::$pharScriptPath = $script = \tempnam(\sys_get_temp_dir(), "amp-worker-process-");
\file_put_contents(self::$pharScriptPath, $scriptContent);
\register_shutdown_function(static function() {
@\unlink(self::$pharScriptPath);
});
}
} else {
$script = self::SCRIPT_PATH;
}
$script = [$script, $envClassName];
/* ... */
^ This is what I tried in the linked class, but it didn't work.
Hm, that's weird, I thought PHARs would intercept those kinds of things; I'll check it out. I'd prefer to solve the real issue rather than trying to find a workaround.
There's something else strange:
Hm, depends, there's the Php compactor enabled (see
I've pushed a fix (amphp/parallel@d16da46), please verify it before I tag a new version.
We've released v0.2.2 of amphp/parallel.
Thanks, I’ll give another go soonish!
…On Wed 24 Jan 2018 at 08:06, Niklas Keller ***@***.***> wrote:
We've released v0.2.2 of amphp/parallel.
8c3e433 to e79bb42
Update. So I've been doing a lot of work (see the PR description) to pave the way for this PR. Turns out that besides fixing a lot of bugs and hardening the tests, this also provided a speed increase of 30%, which is due to optimising the collection of the files. Instead of having different config entries and adding the files one by one per config entry, they have been aggregated, which allows me to ensure a file is not added twice. As files need to be processed, this turned out to be a bigger perf gain than I expected.

That said, the above is a bonus; another slow part (which will become even slower with #31) is the file processing. So as you can see by the diff, the integration is actually quite trivial now and the perf gain is 25% for now.

Thanks a lot @kelunik and @trowski, it's really cool to see it working well. Also @kelunik, apparently the issue with the usage in the PHAR has been taken care of? I thought I would have to look at it today but it looks to be working fine now.
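For readers following along, the aggregation mentioned in the update (collecting files from several config entries while making sure no file is added twice) could look roughly like the sketch below. This is an assumption-laden illustration, not the code from the PR: the function name `aggregateFiles` and the sample paths are hypothetical, and files are represented as plain path strings.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: merges several lists of file paths into one list
 * without duplicates, keeping the order of first appearance.
 *
 * @param string[] ...$groups One list of paths per config entry.
 *
 * @return string[]
 */
function aggregateFiles(array ...$groups): array
{
    $unique = [];

    foreach ($groups as $group) {
        foreach ($group as $path) {
            // Using the path as the array key makes a repeated path a no-op.
            $unique[$path] = true;
        }
    }

    return array_keys($unique);
}

// Two hypothetical config entries that overlap on one file.
$fromFilesEntry = ['src/A.php', 'src/B.php'];
$fromDirectoriesEntry = ['src/B.php', 'src/C.php'];

$aggregated = aggregateFiles($fromFilesEntry, $fromDirectoriesEntry);
```

Since each file has to be processed before it is added, dropping duplicates up front means the expensive step runs once per file.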
@theofidry Yes, the v0.2.2 release fully fixed the usage inside PHARs. Great to hear about the gain!
The process of adding files with Box works as follows:
Configuration
There are two expensive processes here:
The first one is out of scope of this PR, but this PR addresses the second point, which is reducing the processing time by parallelising this work.
Note for myself: extract from this PR building files from an iterator/array of files which will drastically simplify this PR.
Box:addFiles()
method #44)