Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHP 7.2 extended string types support for PDO #1018

Closed
gjdanis opened this issue Aug 5, 2019 · 17 comments
Closed

PHP 7.2 extended string types support for PDO #1018

gjdanis opened this issue Aug 5, 2019 · 17 comments

Comments

@gjdanis
Copy link

gjdanis commented Aug 5, 2019

PHP 7.2 added support for specifying different bind types for unicode and non-unicode string data: https://wiki.php.net/rfc/extended-string-types-for-pdo

This is a major improvement over the default (PDO::PARAM_STR) which binds strings as unicode data. In the past, the inability for PHP to handle this has led to performance problems in certain cases. For example:

$statement = $pdo->prepare(
   'SELECT * FROM myTable WHERE myVarcharColumn = :value' -- will bind as N'...'
);

$statement->bindValue(':value', $myVarcharValue);
$statement->execute();

If there's an index on myVarcharColumn SQL won't be able to use the index effectively. More on this problem can be found here: http://blog.sqlgrease.com/are-varchar-or-char-hurting-your-performance-due-to-implicit-conversions/

Does the team have a plan to support this binding in the future?

@david-puglielli
Copy link
Contributor

@gjdanis Thanks for bringing this to our attention. We are looking into supporting the different bind types, but it does appear that this affects only emulate prepared statements. Do you know if there are performance issues outside of emulate prepared statements?

@gjdanis
Copy link
Author

gjdanis commented Aug 6, 2019

Thanks @david-puglielli. If not using emulated prepared statements what is the preferred way to bind PDO parameters for non unicode data? I would imagine as long as the bound value includes the N'..' you'd have performance problems from an implicit conversion (i.e. as long as you're using PDOStatement::bindParam and PDO::PARAM_STR).

Just to be clear, I'd like to know how to bind a value used in a predicate on a VARCHAR column:

$statement = $pdo->prepare(
  'SELECT *
   FROM myTable
   WHERE myVarcharColumn = :myVarcharValue // bound value will be N'my data'
);

$statement->bindValue(':myVarcharValue, 'my data', PDO::PARAM_STR);

@david-puglielli
Copy link
Contributor

david-puglielli commented Aug 7, 2019

You are correct, the performance issue impacts non-emulate prepared statements as well. Right now I do not believe there is a way to bind non-unicode data without an implict conversion in pdo_sqlsrv. We will continue investigating.

@yitam
Copy link
Contributor

yitam commented Sep 24, 2019

Hi @gjdanis

We are looking into the feasibility of implementing this feature request, but from what I see in the pull request, the national character set parameters don't affect true prepared statements in PDO MySQL.

While it's possible for PDO_SQLSRV to support the new extended string param types in implementing PDO::quote() and emulate prepared statements, for individual real prepared statements, the user can specify encoding, either as a statement option or as a parameter encoding.

For an example of parameter encoding, please check the existing documentation for PDOStatement::bindParam(), where the optional $driver_options can be used for this purpose.

In your case, using your example above, you can try this:

$options = array(PDO::SQLSRV_ATTR_ENCODING => PDO::SQLSRV_ENCODING_SYSTEM);
$statement = $pdo->prepare(
  'SELECT *
   FROM myTable
   WHERE myVarcharColumn = :myVarcharValue',  
  $options
);

$statement->bindValue(':myVarcharValue', 'my data', PDO::PARAM_STR);

If using bindParam(), you might choose to do something like this:

$statement = $pdo->prepare(
  'SELECT *
   FROM myTable
   WHERE myVarcharColumn = :myVarcharValue'
);
$p = 'my data';
$statement->bindParam(':myVarcharValue', $p, PDO::PARAM_STR, 0, PDO::SQLSRV_ENCODING_SYSTEM);

@yitam
Copy link
Contributor

yitam commented Oct 1, 2019

hi @gjdanis, just want to follow up and see if you have given our suggestions a try.

@gjdanis
Copy link
Author

gjdanis commented Oct 1, 2019

Hi @yitam it looks like this works. Here's the statement that's prepared:

declare @p1 int
set @p1=1
exec sp_prepexec @p1 output,N'@P1 varchar(8000)',N'SELECT *
   FROM myTable
   WHERE myVarcharColumn = @P1','my data'
select @p1

I'm curious though if this can be better abstracted as developers are required to know the driver that's being used (i.e. they have to write driver-aware SQL).

@yitam
Copy link
Contributor

yitam commented Oct 1, 2019

Thanks @gjdanis for getting back to us. As I mentioned above, the extended string types affect emulated prepared statements only, as shown in one MySQL test in the pull request. To not confuse the users, we should be consistent with MySQL. What do you think?

I'm not sure I understand your question about being better abstracted or writing driver-aware SQL. Please elaborate.

@gjdanis
Copy link
Author

gjdanis commented Oct 1, 2019

@yitam basically developers need to know that they're using sqlsrv to use this feature. The extended string types are more generic (although they are only implemented for a single driver and only affect emulated prepared statements).

I think we can close this issue.

@gjdanis gjdanis closed this as completed Oct 1, 2019
@yitam
Copy link
Contributor

yitam commented Oct 1, 2019

Thanks @gjdanis. We will support the new extended string types introduced in PHP 7.2 for emulated prepared statements, and we will decide whether to support these extended types in real prepared statements.

@gjdanis
Copy link
Author

gjdanis commented Aug 11, 2020

@yitam it looks like this was implemented for emulated prepared statements -- was a decision ever reached on real prepared statements?

Just to reiterate our use case, we often have tables that have VARCHAR columns. But since PDO only supports NVARCHAR bindings, we run into performance problems with some frequency. Supporting these extended types would help us better prevent these problems.

@yitam
Copy link
Contributor

yitam commented Aug 11, 2020

Hi @gjdanis , we stick with our original decision to be consistent with other drivers, as mentioned above:

the extended string types affect emulated prepared statements only, as shown in one MySQL test in the pull request. To not confuse the users, we should be consistent with MySQL.

If you mainly use VARCHAR columns, you can set your statement or connection attribute PDO::SQLSRV_ATTR_ENCODING to PDO::SQLSRV_ENCODING_SYSTEM as explained here

@gjdanis
Copy link
Author

gjdanis commented Aug 12, 2020

@yitam Thanks for getting back to me. We'd need to be careful about this solution, however, as this could break if there is unicode data in the query (i.e. if we need to join two tables together, one with a varchar column that's being filtered on and one with an nvarchar column that's being filtered on).

Can you take this use into consideration?

@yitam
Copy link
Contributor

yitam commented Aug 12, 2020

You're welcome @gjdanis

For pdo_sqlsrv, as you already know, the default encoding is PDO::SQLSRV_ENCODING_UTF8. Yet. similar to the extended string types for PDO, even with a different encoding set as a connection attribute, you can always override it when you set the encoding attribute for the statement or for individual prepare() or bindParam(), etc., depending on your use cases.

@gjdanis
Copy link
Author

gjdanis commented Aug 12, 2020

Thanks @yitam I think the problem though is that we might have something like this where we need to be able to control this setting on the parameter level:

$pdo = new \PDO(...);
$options = [PDO::SQLSRV_ATTR_ENCODING => PDO::SQLSRV_ENCODING_SYSTEM];
$statement = $pdo->prepare(
  'SELECT *
   FROM table t1
   INNER JOIN otherTable t2
      ON t1.id = t2.id 
   WHERE t1.varcharColumn = :varcharValue
     AND t2.nvarcharColumn = :nvarcharValue',
  $options
);
$statement->bindValue(':varcharValue', 'test');
$statement->bindValue(':nvarcharValue', 'Ӕ');

If I understand the documentation the unicode characters would be replaced with question marks, right?

Any multi-byte characters or characters that do not map into this code page are substituted with a single-byte question mark (?) character.

@yitam
Copy link
Contributor

yitam commented Aug 12, 2020

@gjdanis in this case please use bindParam() instead, like this

// $statement->bindValue(':nvarcharValue', 'Ӕ');
$nval = 'Ӕ';
$statement->bindParam(':nvarcharValue', $nval, PDO::PARAM_STR, 0, PDO::SQLSRV_ENCODING_UTF8);

Please check my previous reply as well

@gjdanis
Copy link
Author

gjdanis commented Aug 12, 2020

Thanks for clarifying, @yitam -- I think this should work for us.

@yitam
Copy link
Contributor

yitam commented Aug 12, 2020

Glad to be of help @gjdanis 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants