Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to configure connect timeout separately #22

Closed
wants to merge 1 commit into from
Closed

Allow to configure connect timeout separately #22

wants to merge 1 commit into from

Conversation

VincentLanglet
Copy link
Contributor

@VincentLanglet VincentLanglet commented Mar 7, 2023

Hi @daniel-jones-deepl,

Currently, Deepl only allow one option: TranslatorOptions::TIMEOUT.
This one is used for both

  • \CURLOPT_CONNECTTIMEOUT
  • \CURLOPT_TIMEOUT_MS

But we might accept a CURLOPT_TIMEOUT_MS of 30s while we want a low CURLOPT_CONNECTTIMEOUT.
In general it doesn't make a lot of sens to have more than a few seconds for the connect timeout.

So it should be better to split both option

  • TranslatorOptions::CONNECT_TIMEOUT
  • TranslatorOptions::EXEC_TIMEOUT

To keep the old behavior, I kept the TranslatorOptions::TIMEOUT option which is used as a default value for both CONNECT_TIMEOUT and EXEC_TIMEOUT

Also, the doc was saying

timeout: the number of seconds used as connection timeout for each HTTP request retry. The default value is 10.0 (10 seconds).

So I changed it to connect_timeout. (Even if timeout will still work for the BC compatibility).

I'm open to any feedback :)

@VincentLanglet
Copy link
Contributor Author

Friendly ping @daniel-jones-deepl, @JanEbbing.
Does someone have time to take a look ? :)

@JanEbbing
Copy link
Member

Hi @VincentLanglet , sorry, Daniel is out at the moment.
For context, we get several requests to configure certain HTTP parameters/behaviours - if we add them all as a member variable of TranslatorOption, I feel things are going to get unwieldy. Hence, I'd rather expose the HttpClient.

I see 2 broad ways to achieve this.

  1. Allow users to configure the $curlOptions variable in HttpClient (we would ignore/overwrite some values, e.g. URL, or anything that is set explicitly as an option, like proxy).
  2. Exposing the entire HttpClient class and allowing users to pass their own HttpClient into Translator. We would still provide the current client as the default and for users to subclass to change certain behaviour (e.g. when to retry, changing the backoff algorithm, etc).

My asks:

  • Do both of these approaches cover your use case for this PR?
  • Do you foresee anything you would want to do that would not be possible with 1) or 2) or both? (e.g. changing the backoff algorithm isn't possible with 1, but I don't think that's a big concern - maybe something like "I want to have multiple translator objects with a common HTTP Client so I can rate-limit how many outgoing requests I make" is something users would run into?)
  • Any comments/reasons you'd prefer one over the other?

@VincentLanglet
Copy link
Contributor Author

Hi @VincentLanglet , sorry, Daniel is out at the moment. For context, we get several requests to configure certain HTTP parameters/behaviours - if we add them all as a member variable of TranslatorOption, I feel things are going to get unwieldy. Hence, I'd rather expose the HttpClient.

Yes, I understand. I tried to introduce as few changes as possible:

I see 2 broad ways to achieve this.

  1. Allow users to configure the $curlOptions variable in HttpClient (we would ignore/overwrite some values, e.g. URL, or anything that is set explicitly as an option, like proxy).
  2. Exposing the entire HttpClient class and allowing users to pass their own HttpClient into Translator. We would still provide the current client as the default and for users to subclass to change certain behaviour (e.g. when to retry, changing the backoff algorithm, etc).

My asks:

  • Do both of these approaches cover your use case for this PR?
  • Do you foresee anything you would want to do that would not be possible with 1) or 2) or both? (e.g. changing the backoff algorithm isn't possible with 1, but I don't think that's a big concern - maybe something like "I want to have multiple translator objects with a common HTTP Client so I can rate-limit how many outgoing requests I make" is something users would run into?)
  • Any comments/reasons you'd prefer one over the other?

For context, I currently use the following code (with this PR)

$client = new Translator($apiKey, [
     TranslatorOptions::DEFAULT_MAX_RETRIES => 0,
     TranslatorOptions::CONNECT_TIMEOUT => 5,
     TranslatorOptions::EXEC_TIMEOUT => 30,
]);
$translations = $client->translateText($payload, $languageFrom, $languageTo, []);
$languages = $client->getTargetLanguages();

For the "better" solution, I would say it depends on the amount of time/work/changes you want to take @JanEbbing.

Allowing to pass the HttpClient to the Translator seems to be the more flexible solution (with a default http implementation) ; but to me it's also the solution which might require the more change in the Deepl codebase.

Even if the HttpClient is now an entry-point, it still could be helpful to allow people to pass some curl options to the HttpClient, in order to not having to rewrite a whole HttpClient if they want to change the timeout. So to me the solutions 1 and 2 are not incompatible. You could have a

class HttpClientFactory
{
     public function static create($authKey, $options)
     {
          // Logic which are inside Translator::construct
     }
}

And people could write either

new Translator(HttpClientFactory::create($authKey, $options));
new Translator($customHttpClient);

Still, this could require a way to override curlOptions with the factory or construct of the HttpClient if I just want to keep all the logic from Deepl except one specific curl options (like the connect timeout).

Something like

class HttpClient
{
     public function __construct($url, ..., $defaultCurlOptions) {
         ...
         $this->defaultCurlOptions = $defaultCurlOptions;
     }

     public function sendRequestWithBackOff(...) {
         ...
         while ($backoff->getNumRetries() <= $this->maxRetries) {
              $response = $this->sendRequest(
                    $method,
                    $url,
                    $backoff->getTimeUntilDeadline(),
                    $headers,
                    $params,
                    $file,
                    $outFile
                );
                ...
         }
     }
     
     private function sendRequest(..., $timeUntilDeadline) {
         $curlOptions = $this->defaultCurlOptions;
         if (isset($curlOptions[\CURLOPT_CONNECTTIMEOUT_MS]) {
             $curlOptions[\CURLOPT_CONNECTTIMEOUT_MS] = max($curlOptions[\CURLOPT_CONNECTTIMEOUT_MS], $timeUntilDeadline * 1000));
         } else {
             $curlOptions[\CURLOPT_CONNECTTIMEOUT] = max($curlOptions[\CURLOPT_CONNECTTIMEOUT], $timeUntilDeadline));
         }
         if (isset($curlOptions[\CURLOPT_TIMEOUT_MS]) {
             $curlOptions[\CURLOPT_TIMEOUT_MS] = max($curlOptions[\CURLOPT_TIMEOUT_MS], $timeUntilDeadline * 1000));
         } else {
             $curlOptions[\CURLOPT_TIMEOUT] = max($curlOptions[\CURLOPT_TIMEOUT], $timeUntilDeadline));
         }
     }
}

Does it answer all your questions ? Do you plan to implements such change alone/with your team or should/can I help ?

@VincentLanglet
Copy link
Contributor Author

Hi @JanEbbing, is there anything to do on my side ?

I would have expected a short answer to know

@JanEbbing
Copy link
Member

Hi Vincent, sorry for the lack of transparency here. We'll ultimately move all development to GitHub, then it's easier to see what we're working on.
For your questions, I'm working on allowing users to configure which HTTP Client is used, using either a PSR-18 compliant one or the symfony client. So, nothing expected from your side.

@VincentLanglet
Copy link
Contributor Author

Hi Vincent, sorry for the lack of transparency here. We'll ultimately move all development to GitHub, then it's easier to see what we're working on. For your questions, I'm working on allowing users to configure which HTTP Client is used, using either a PSR-18 compliant one or the symfony client. So, nothing expected from your side.

Thanks for your quick answer, I'm impatient to try this feature (and stop using a temporary fork).
Feel free to ping me if you need review/beta-testing before an official release :)

@JanEbbing
Copy link
Member

Hi @VincentLanglet , I added support for custom clients in a new branch. We plan to merge this in the next version, please let me know what you think.

@VincentLanglet
Copy link
Contributor Author

Hi @VincentLanglet , I added support for custom clients in a new branch. We plan to merge this in the next version, please let me know what you think.

Hi, this seems an ok implementation.
It should indeed solve our issue, allowing us to configure timeout by ourself.

The only things I see which might be improve (but I'm not certain how), would be to remove the hard dependency to guzzlehttp/psr7.
338c123#diff-d2ab9925cad7eac58e0ff4cc0d251a937ecf49e4b6bf57f8b95aab76648a9d34R15

I saw @nicolas-grekas promoting https://github.com/php-http/discovery which does an auto-discovery of the right psr implementation. The goal would be to use

  • Psr17FactoryDiscovery::findRequestFactory()::createRequest() instead of new Request()
  • Psr17FactoryDiscovery::findStreamFactory()::createStreamFromFile() instead of new MultipartStream()

but I'm not sure how easy it is to use this ; and this can still be done in another PR if needed.

@JanEbbing
Copy link
Member

Thanks, that seems like a great way to avoid a dependency on a PSR-7 implementation!

@daniel-jones-deepl
Copy link
Member

daniel-jones-deepl commented Jun 2, 2023

Another idea: We could also require a PSR-7/PSR-17 HTTP factory to be passed in with the PSR-18 HTTP client. This would mean no direct dependencies(*), but it would make the library interface a little harder to use.

*: edited-to-add: actually there would need to be a psr/http-factory dependency.

@nicolas-grekas
Copy link

nicolas-grekas commented Jun 2, 2023

I would definitely recommend relying on php-http/discovery and using its Psr18Client. It's an adapter that knows how to use whatever PSR-17+PSR-18 implementations are available.

As far as timeout is concerned, this could be considered out of scope from an SDK pov (aka let users that care do the timeout configuration as part of their wiring) or we could also discuss a way to make Psr18Client configurable somehow.

You might want to check helpscout/helpscout-api-php#311 for inspiration.

@JanEbbing
Copy link
Member

@nicolas-grekas Thanks for the code example. We use POST requests with multipart/form-data. Based on the docs I would use this, but using it in the code the StreamFactoryDiscovery::find() is marked as deprecated and to be removed in 2.0 (actually the entire class).

Is there any way around using that?

@nicolas-grekas
Copy link

Instead of this discovery class, you should use https://github.com/php-http/multipart-stream-builder
and use the Psr18Client as a stream factory (it does implement what's needed):

$client = new Psr18Client();
$builder = new MultipartStreamBuilder($client);

@JanEbbing
Copy link
Member

Thanks! I implemented that change and removed the guzzle dependency on the branch.

@VincentLanglet
Copy link
Contributor Author

Thanks! I implemented that change and removed the guzzle dependency on the branch.

Hi @JanEbbing,
I tried your branch (with the recent changes) on my project and it works fine.
Thanks :)

@VincentLanglet
Copy link
Contributor Author

Hi @JanEbbing, what are the next step for your branch ? Is it ready to be reviewed/merged/released ? :)

@JanEbbing
Copy link
Member

Yes, this (with some other fixes) will be released today.

@VincentLanglet
Copy link
Contributor Author

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants