Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(POC) Add hook_civicrm_postCommit, a less foot-gunny variant of hook_civicrm_post #15338

Merged
merged 1 commit into from
Mar 16, 2020

Conversation

totten
Copy link
Member

@totten totten commented Sep 20, 2019

Overview

So here's a pattern that's occurred a few times: one wants to provide an extra notification or log or correlated-record after something is chanaged or created. So you implement hook_civicrm_post. It sounds simple, but it doesn't work quite as expected - because running within the transaction can have some special implications:

  1. Performing subsequent SQL queries within the same transaction be wonky
  2. Errors in your notification bubble-up and break the transaction.

You can resolve this by deferring your work until the transaction completes (assuming that your work is not a critical/blocker for the success of the original operation). The deferral technique has been discussed in various media over the years; e.g. here's a mention in the dev docs:

https://docs.civicrm.org/dev/en/latest/hooks/hook_civicrm_post/#example-with-transaction-callback

However, I think the problem may be that the default is backwards: there's a larger percentage of use-cases for post hook in which you prefer to run after the transaction commits, but those common use-cases require the special incantation.

This patch is a proof-of-concept in which the system provides two hooks:

  • hook_civicrm_post: Runs immediately after the change is sent to the DB. If there's a SQL transaction, then it runs within the transaction.
  • hook_civicrm_postCommit: Runs after the change is committed to the DB. Runs outside of any SQL transactions.

My theory is that more developers would run their logic at the correct time if they had postCommit available as a hook (and, of course, the downstream code would look tidier). This isn't really a pain-point for me, so I'm not super-motivated to push it, and I haven't looked very hard for systemic side-effects of buffering all post events. However, I think it could provide better DX (making it easier for more folks to get the right timing), so I wanted to share the POC.

Before

/**
 * Hook fired after the INSERT/UPDATE is sent to the DB
 */
function hook_civicrm_post($op, $objectName, $objectId, &$objectRef)

// Optionally, within your hook listener, you may inspect CRM_Core_Transaction
// to see if there is an active transaction and to possibly reschedule work for
// the post-commit phase.
if (CRM_Core_Transaction::isActive()) {
  CRM_Core_Transaction::addCallback(CRM_Core_Transaction::POST_COMMIT_PHASE, 'run_another_function', ...);
}
else {
  run_another_function(...);
}

After

/**
 * Hook fired after the INSERT/UPDATE is sent to the DB
 */
function hook_civicrm_post($op, $objectName, $objectId, &$objectRef);

/**
 * Hook fired after the record is committed to the DB.
 */
function hook_civicrm_postCommit($op, $objectName, $objectId, $objectRef);

Technical Details

  • hook_civicrm_post always runs before hook_civicrm_postCommit
  • If work is done without any active transactions (e.g. in autocommit mode), then the two hooks are basically contemporaneous.

Overview
--------

So here's a pattern that's occurred a few times: one wants to provide an
extra notification or log or correlated-record after something is chanaged
or created.  So you implement `hook_civicrm_post`.  It sounds simple, but it
doesn't work quite as expected - because running within the transaction can
have some special implications:

1. Performing subsequent SQL queries within the same transaction be wonky
2. Errors in your notification bubble-up and break the transaction.

You can resolve this by deferring your work until the transaction completes. The
technique has been discussed in various media over the years; e.g. here's a mention in the dev docs:

https://docs.civicrm.org/dev/en/latest/hooks/hook_civicrm_post/#example-with-transaction-callback

However, I think the problem may be that the default is basically backwards: there are more use-cases
for 'post' hook in which you prefer to run *after the transaction commits*, but doing that case requires
the special incantation.

This patch is a proof-of-concept in which the system provides two hooks:

* `hook_civicrm_post`: Runs immediately after the change is *sent* to the DB.
  If there's a SQL transaction, then it runs *within* the transaction.
* `hook_civicrm_postCommit`: Runs after the change is *committed* to the DB.
  Runs outside of any SQL transactions.

My theory is that more developers would run their logic at the correct time
if they had `postCommit` available as a hook (and, of course, the downstream
code would look tidier).  This isn't really a pain-point for me, so I'm not
super-motivated to push it, and I haven't looked very hard for systemic
side-effects of buffering more posts.  However, I think it could provide
better DX (making it easier for more folks to get the right timing), so I
wanted to share the POC.

Before
------

```php
/**
 * Hook fired after the INSERT/UPDATE is sent to the DB
 */
function hook_civicrm_post($op, $objectName, $objectId, &$objectRef)
```

After
------

```php
/**
 * Hook fired after the INSERT/UPDATE is sent to the DB
 */
function hook_civicrm_post($op, $objectName, $objectId, &$objectRef);

/**
 * Hook fired after the record is committed to the DB.
 * This may be immediate (for non-transactional work) or it may be
 * delayed a few milliseconds (for transactional work).
 */
function hook_civicrm_postCommit($op, $objectName, $objectId, $objectRef);
```
@civibot
Copy link

civibot bot commented Sep 20, 2019

(Standard links)

@civibot civibot bot added the master label Sep 20, 2019
@totten totten changed the title (POC) Add hook_civicrm_postCommit, a variant of hook_civicrm_post (POC) Add hook_civicrm_postCommit, a less foot-gunny variant of hook_civicrm_post Sep 20, 2019
@jackrabbithanna
Copy link
Contributor

This is a good idea and can come in handy, much better DX.

@mattwire
Copy link
Contributor

I've marked this as concept approved as I also agree it would be a useful improvement. However, I think we need to think /discuss a little more before actually merging anything.

  • The callback code you are using is exactly what I use in hook_civicrm_post every time and it always seems a bit silly to have to do that when all I'm trying to do is trigger/modify something after it changed.
  • I don't see any reason to trigger the hook multiple times, others may? My preference would be to make it part of the contract for this hook that it will only fire once and not be triggered multiple times (which just introduces more slowness as code gets executed multiple times).

My final comment, which may be slightly out of scope but is certainly related. It would be really really nice to have a hook that fired when everything had finished, ideally with a list of modified entities. For example, currently hook_civicrm_post might fire on a Contribution, but then Civi continues and creates a membership linked to that contribution. You then get hook_civicrm_post firing for Membership, then MembershipPayment. If we could have a hook that said, "We're done, and we've just created/updated Contribution X, Line Item X, Line Item Y, Membership X, MembershipPayment X" would you like to do anything with all that?"

That would make some things so much easier as you can currently use statics to "collect" entities/IDs and/or add things to the session but you have no way of knowing which is the last one.

@totten
Copy link
Member Author

totten commented Sep 25, 2019

The callback code you are using is exactly what I use in hook_civicrm_post every time and it always seems a bit silly to have to do that when all I'm trying to do is trigger/modify something after it changed.

👍

I don't see any reason to trigger the hook multiple times, others may?

When I started the POC, I had the same thought (hence the comment about $id = $e->entity . chr(0) . $e->action . chr(0) . $e->id;) but walked it back.

Consider a unit-of-work with multiple updates. Here's a small example which runs in cv scr.

Civi::dispatcher()->addListener('hook_civicrm_post', function($e){
  printf("Recieved hook: %s => %s\n\n", 'hook_civicrm_post', json_encode($e, JSON_PRETTY_PRINT));
});

CRM_Core_Transaction::create()->run(function(){
  $c = civicrm_api3('Contact', 'create', [
    'contact_type' => 'Individual',
    'first_name' => 'First',
    'last_name' => 'Last' . time(),
  ]);
  printf("Created %d\n", $c['id']);

  civicrm_api3('Contact', 'create', [
    'id' => $c['id'],
    'do_not_phone' => 1,
  ]);

  civicrm_api3('Contact', 'create', [
    'id' => $c['id'],
    'do_not_sms' => 1,
    'do_not_email' => 0,
  ]);
});

In this case, h_c_post signals two edits on the same object in the same unit-of-work -- and the substance of the edit ($objectRef) is different. If one naively deduped the hook-invocations (like in POC's comment), then the content of $objectRef would be ill-defined.

Of course, $objectRef traditionally had a weird DX anyway, so maybe the answer is to just omit it. h_c_postCommit could provide the object types and IDs without the specific data - that would reduce the amount of data-buffering and make it easier to dedupe.

My preference would be to make it part of the contract for this hook that it will only fire once and not be triggered multiple times (which just introduces more slowness as code gets executed multiple times).

Yeah, I think that's legit.

@mattwire
Copy link
Contributor

@totten I'm happy with this PR if we can get a corresponding docs PR in place. Re my other comments we can use register_shutdown_function to trigger actions that must happen at the end and collect the entities via static variables and the post(commit) hooks.

@jaapjansma
Copy link
Contributor

Betty and I are reviewing PR's and we came across this PR and we read it is merge ready but no docs PR. @totten are you able to provide documentation so this one can be merged?

@totten
Copy link
Member Author

totten commented Mar 16, 2020

@jaapjansma Here's a PR for documentation: civicrm/civicrm-dev-docs#780

@seamuslee001
Copy link
Contributor

I'm going to merge this as we now have the docs PR and there is general agreement on this hook

@seamuslee001 seamuslee001 merged commit 2d82d5a into civicrm:master Mar 16, 2020
@homotechsual
Copy link
Contributor

@seamuslee001 can you clear needs-documentation tag?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants