Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix last insert ID consistency across drivers #3074

Closed
wants to merge 26 commits into from

Conversation

morozov
Copy link
Member

@morozov morozov commented Mar 31, 2018

Reopened #2765 from the personal repo (replaces #2648).

TODO

  • Why should the ID be a string, not an int? Because on a 32-bit system, a numeric ID will be represented as a string by the driver (e.g. mysqli). In a general case, an extremely large ID will not fit the int range anyways.
  • Simplify the test to have a better score on Scrutinizer.
  • Why should we return a zero when the last inserted ID is unavailable, e.g. why not NULL? Because some drivers return 0 when there's no ID. Besides that, for the caller, there's no difference between 0 and NULL, they both mean "none".
  • Why do we need to suppress errors? Let's suppress them for BC. Error reporting will be improved in Statement objects should throw exception in exceptional cases #3005.
  • Address code review concerns.

@morozov morozov force-pushed the fix-last-insert-id-consistency branch from 71b1239 to 547f921 Compare March 31, 2018 18:39
@morozov morozov self-assigned this Mar 31, 2018
@morozov morozov added this to the 2.7.0 milestone Mar 31, 2018
@morozov morozov force-pushed the fix-last-insert-id-consistency branch from 547f921 to f9cf17a Compare April 2, 2018 01:35
@morozov
Copy link
Member Author

morozov commented Apr 2, 2018

Looks like we have a connection leak in general and especially with PDO drivers. Here's the number of MySQL connections obtained from SHOW PROCESSLIST on each test teardown:

MySQLi PDO MySQL
3 4
4 6
41 77

@morozov morozov force-pushed the fix-last-insert-id-consistency branch from d335808 to 6fb67c9 Compare April 2, 2018 02:01
@morozov
Copy link
Member Author

morozov commented Apr 2, 2018

Almost there. Down to one failure on SQL Server.

Ocramius added a commit to Ocramius/dbal that referenced this pull request Apr 2, 2018
@morozov morozov force-pushed the fix-last-insert-id-consistency branch 4 times, most recently from 5746e9f to 8d1ebf7 Compare April 3, 2018 06:36
@morozov morozov requested review from Majkl578 and Ocramius April 3, 2018 07:09
@morozov
Copy link
Member Author

morozov commented Apr 3, 2018

It's green now, @Ocramius, @Majkl578.

Copy link
Contributor

@Majkl578 Majkl578 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the possible BC impact especially around those type changes (making everything string). :|

return $this->value;
}

public function register(string $value) : void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of having this crate mutable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make it immutable. Would it make more sense to rename this to something like with() : self in this case? What about get()? Should it be something like toValue() instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Majkl578, sadly, the immutable approach will not just work since MysqliConnection and MysqliStatement share the same instance of LastInsertId. It's probably done like this to mitigate:

MySQL drivers reset the last insert ID to 0 after each non-insert query

It could be worked around by keeping the reference to the connection from the statement. But this approach stinks too (also implemented in PDO Statement/Connection pair).

@@ -32,6 +31,9 @@
*/
class PDOConnection extends PDO implements Connection, ServerInfoAwareConnection
{
/** @var string */
private $lastInsertId = '0';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default value + doc block don't seem valid since you assign LastInsertId below.

$this->sql .= self::LAST_INSERT_ID_SQL;
$this->lastInsertId = $lastInsertId;
}
$this->lastInsertId = $lastInsertId;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this use Driver\LastInsertId too instead of Driver\SQLSrv\LastInsertId? For BC, Driver\SQLSrv\LastInsertId can exend Driver\LastInsertId and be deprecated.
[same for SQLSrvConnection]


private function trackLastInsertId() : void
{
if (! $this->lastInsertId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=== null

@@ -155,7 +170,7 @@ public function lastInsertId($name = null)
$stmt = $this->query('SELECT @@IDENTITY');
}

return $stmt->fetchColumn();
return $stmt->fetchColumn() ?: '0';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like?: here, it hides possible error values (would that be false, '0' or 0 here?) - is it possible to use !== false for example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the original proposal, lastInsertId() suppresses errors everywhere by design (see the OCI implementation specifically). Not sure I understand and agree with it. We definitely could improve here.

{
if (null === $name) {
return sasql_insert_id($this->connection);
return (string) sasql_insert_id($this->connection);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks BC break-ish, what returned false before would now return ''.

}

return $this->query('SELECT ' . $name . '.CURRVAL')->fetchColumn();
return (string) $this->query('SELECT ' . $name . '.CURRVAL')->fetchColumn();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks BC break-ish, what returned false before would now return ''.

*
* @throws \PDOException
*/
public function trackLastInsertId() : void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole design decision behind this method is a bit shady - making subclasses override/disable this behavior looks a bit weird. Would love to see this extracted into some InsertIdStrategyTracking with multiple providers (no-op/identity/...).

{
if ($name === null) {
return false;
return '0';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks BC break-ish, what returned false before would now return '0'.

}

return (int) $result;
return (string) $stmt->fetchColumn();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks BC break-ish, what returned 0 before would now return '0'.

Ocramius added a commit to Ocramius/dbal that referenced this pull request Apr 3, 2018
morozov pushed a commit to morozov/dbal that referenced this pull request Apr 3, 2018
@morozov
Copy link
Member Author

morozov commented Apr 3, 2018

Given the amount of remaining work which is entirely about the API and BC-related concerns, I propose to move it out of v2.7.0. The pressure of this ticket being a release blocker is demotivating and provocative to taking shortcuts.

@@ -69,6 +69,14 @@ public function exec($statement);
/**
* Returns the ID of the last inserted row or sequence value.
*
* If no sequence name is given, "0" is returned if there currently is no last insert ID available
* in the current session or in case of an error or if the driver does not support this operation.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Majkl578 this is where the error suppression is explicitly defined. It looks dangerous to me that the return value may be different from what's expected because of a runtime issue.

@Ocramius
Copy link
Member

Ocramius commented Apr 3, 2018

@morozov agreed, pushing this back into 3.0.

@Ocramius Ocramius modified the milestones: 2.7.0, 3.0.0 Apr 3, 2018
@morozov morozov force-pushed the fix-last-insert-id-consistency branch 2 times, most recently from 9fa28d2 to 007c327 Compare April 17, 2018 17:06
@morozov morozov force-pushed the fix-last-insert-id-consistency branch from 007c327 to 6894505 Compare April 18, 2018 04:34
@morozov
Copy link
Member Author

morozov commented Apr 18, 2018

I'm closing this as "won't fix" and not a bug. Below is the explanation of what exactly was attempted to get fixed and why it shouldn't.

Inconsistency in when the last inserted ID is available across platforms

This is what the largest part of the code change is about including the introduction of the LastInsertID class: MySQL "forgets" the ID right after the next query following the INSERT one, PostgreSQL "forgets" it after dropping the table, etc.

This issue is being fixed at the cost of creating either unnecessary coupling between the Statement and Connection objects (see MySQL and PDO for example), or code duplication (see SQL Server). At the same time, it allows the caller to be careless about capturing the last inserted ID any time and counting on it always being available. At the same time, if the caller is careless enough, there may be another INSERT happened between the original one and the moment of the capture, so this approach is error-prone by definition.

Instead of fixing this issue, we can explicitly recommend fetching the last insert ID immediately after INSERT. Ironically, Oracle which doesn't support identity columns, provides even better way of fetching the ID using the INSERT ... RETURNING id INTO :id syntax.

Inconsistency in what the last inserted ID looks like

Yes, there is an inconsistency between what an empty ID looks like (all kinds of empty values) and the non-empty ones (integer or a string). But it's natural for a weakly-typed language like PHP.

  1. If the ID is empty, it can be reliably checked as if (!$lastInsertedID) without any knowledge of its type. Furthermore, without the proper error reporting (Statement objects should throw exception in exceptional cases #3005) the puristic approach of if (!$lastInsertedID === '0') is dangerous in the way that it doesn't account for potentially existing bug in the DBAL or underlying drivers (e.g. $lastInsertedID = false as a result of an error). Technically, any empty value means "none" regardless of its type.
  2. If it's not empty, the most common use case is to use it in subsequent INSERT/UPDATE statements. At this point, the type of the ID doesn't matter as long as it can be bound to the statement.

Depite the fact that IDs are mostly integers in the DB internals, we cannot represent them as such due to potential overflow issues. At the same time, declaring them strings on the API level is misleading in the way that they are not just strings (e.g. "xyz"), they are strings representing integers, and PHP is fine with that.

Technically, we can re-iterate on the type consistency later, after #3005 is resolved but besides being theoretically correct, I don't see a practical value in it.

@morozov morozov closed this Apr 18, 2018
@morozov morozov added Won't Fix and removed Bug labels Apr 18, 2018
@morozov morozov deleted the fix-last-insert-id-consistency branch May 27, 2018 18:54
@morozov morozov removed this from the 4.0.0 milestone Nov 17, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants