diff --git a/doc/html/pcre2api.html b/doc/html/pcre2api.html
index b7f64b4f2..65a50fba5 100644
--- a/doc/html/pcre2api.html
+++ b/doc/html/pcre2api.html
@@ -4048,9 +4048,18 @@ <h1>pcre2api man page</h1>
 The <b>pcre2_set_substitution_callout()</b> function can be used to specify a
 callout function for <b>pcre2_substitute()</b>. This information is passed in
 a match context. The callout function is called after each substitution has
-been processed, but it can cause the replacement not to happen. The callout
-function is not called for simulated substitutions that happen as a result of
-the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
+been processed, but it can cause the replacement not to happen.
+</P>
+<P>
+The callout function is not called for simulated substitutions that happen as a
+result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. In this mode, when
+substitution processing exceeds the buffer space provided by the caller,
+processing continues by counting code units. The simulation is unable to
+populate the callout block, and so the simulation is pessimistic about the
+required buffer size. The larger of the sizes needed for accepting or rejecting
+the substitution is reported. Therefore, the returned buffer length may be
+an overestimate (without a substitution callout, it is normally an exact
+measurement).
 </P>
 <P>
 The first argument of the callout function is a pointer to a substitute callout
diff --git a/doc/pcre2.txt b/doc/pcre2.txt
index b10f86028..8cd8eeb4d 100644
--- a/doc/pcre2.txt
+++ b/doc/pcre2.txt
@@ -3893,12 +3893,20 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
        The pcre2_set_substitution_callout() function can be used to specify  a
        callout  function for pcre2_substitute(). This information is passed in
        a match context. The callout function is called after each substitution
-       has been processed, but it can cause the replacement not to happen. The
-       callout function is not called for simulated substitutions that  happen
-       as a result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
+       has been processed, but it can cause the replacement not to happen.
+
+       The callout function is not called  for  simulated  substitutions  that
+       happen  as  a result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. In
+       this mode, when substitution processing exceeds the buffer  space  pro-
+       vided  by  the caller, processing continues by counting code units. The
+       simulation is unable to populate the callout block, and so the  simula-
+       tion is pessimistic about the required buffer size. The larger  of  the
+       sizes  needed  for accepting or rejecting the substitution is reported.
+       Therefore, the returned buffer length may be an overestimate (without a
+       substitution callout, it is normally an exact measurement).
 
        The first argument of the callout function is a pointer to a substitute
-       callout  block structure, which contains the following fields, not nec-
+       callout block structure, which contains the following fields, not  nec-
        essarily in this order:
 
          uint32_t    version;
@@ -3909,34 +3917,34 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
          uint32_t    oveccount;
          PCRE2_SIZE  output_offsets[2];
 
-       The version field contains the version number of the block format.  The
-       current  version  is  0.  The version number will increase in future if
-       more fields are added, but the intention is never to remove any of  the
+       The  version field contains the version number of the block format. The
+       current version is 0. The version number will  increase  in  future  if
+       more  fields are added, but the intention is never to remove any of the
        existing fields.
 
        The subscount field is the number of the current match. It is 1 for the
        first callout, 2 for the second, and so on. The input and output point-
        ers are copies of the values passed to pcre2_substitute().
 
-       The  ovector  field points to the ovector, which contains the result of
+       The ovector field points to the ovector, which contains the  result  of
        the most recent match. The oveccount field contains the number of pairs
        that are set in the ovector, and is always greater than zero.
 
-       The output_offsets vector contains the offsets of  the  replacement  in
-       the  output  string. This has already been processed for dollar and (if
+       The  output_offsets  vector  contains the offsets of the replacement in
+       the output string. This has already been processed for dollar  and  (if
        requested) backslash substitutions as described above.
 
-       The second argument of the callout function  is  the  value  passed  as
-       callout_data  when  the  function was registered. The value returned by
+       The  second  argument  of  the  callout function is the value passed as
+       callout_data when the function was registered. The  value  returned  by
        the callout function is interpreted as follows:
 
-       If the value is zero, the replacement is accepted, and,  if  PCRE2_SUB-
-       STITUTE_GLOBAL  is set, processing continues with a search for the next
-       match. If the value is not zero, the current  replacement  is  not  ac-
-       cepted.  If  the  value is greater than zero, processing continues when
-       PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than  zero
+       If  the  value is zero, the replacement is accepted, and, if PCRE2_SUB-
+       STITUTE_GLOBAL is set, processing continues with a search for the  next
+       match.  If  the  value  is not zero, the current replacement is not ac-
+       cepted. If the value is greater than zero,  processing  continues  when
+       PCRE2_SUBSTITUTE_GLOBAL  is set. Otherwise (the value is less than zero
        or PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is copied
-       to  the  output and the call to pcre2_substitute() exits, returning the
+       to the output and the call to pcre2_substitute() exits,  returning  the
        number of matches so far.
 
    Substitution case callouts
@@ -3946,21 +3954,21 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
          void *callout_data);
 
        The pcre2_set_substitution_case_callout() function can be used to spec-
-       ify a callout function for pcre2_substitute() to  use  when  performing
-       case  transformations.  This does not affect any case insensitivity be-
-       haviour when performing a match, but only the user-visible  transforma-
+       ify  a  callout  function for pcre2_substitute() to use when performing
+       case transformations. This does not affect any case  insensitivity  be-
+       haviour  when performing a match, but only the user-visible transforma-
        tions performed when processing a substitution such as:
 
            pcre2_substitute(..., "\\U$1", ...)
 
-       The  default  case transformations applied by PCRE2 are reasonably com-
+       The default case transformations applied by PCRE2 are  reasonably  com-
        plete, and, in UTF or UCP mode, perform the basic locale-invariant case
-       transformations as specified by Unicode. This is suitable for  the  in-
-       ternal  (invisible)  case-equivalence  procedures  used  during pattern
+       transformations  as  specified by Unicode. This is suitable for the in-
+       ternal (invisible)  case-equivalence  procedures  used  during  pattern
        matching, but an application may wish to use more sophisticated locale-
        aware processing for the user-visible substitution transformations.
 
-       One example implementation of the callout_function using  the  ICU  li-
+       One  example  implementation  of the callout_function using the ICU li-
        brary would be:
 
            static uint32_t icu_case_callout(uint32_t ch, int to, void *)
@@ -3971,15 +3979,15 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
                   : ch;
            }
 
-       The  first argument of the case callout function is the Unicode charac-
+       The first argument of the case callout function is the Unicode  charac-
        ter to transform.
 
-       The  second  argument   is   one   of   the   constants   PCRE2_SUBSTI-
+       The   second   argument   is   one   of   the  constants  PCRE2_SUBSTI-
        TUTE_CASE_LOWER,    PCRE2_SUBSTITUTE_CASE_UPPER,    or    PCRE2_SUBSTI-
        TUTE_CASE_TITLE.
 
-       The third argument is the callout_data  supplied  to  pcre2_set_substi-
-       tute_case_callout(),  and  the  return value is the transformed Unicode
+       The  third  argument  is the callout_data supplied to pcre2_set_substi-
+       tute_case_callout(), and the return value is  the  transformed  Unicode
        character, which may be equal to the input character.
 
 
@@ -3988,56 +3996,56 @@ DUPLICATE CAPTURE GROUP NAMES
        int pcre2_substring_nametable_scan(const pcre2_code *code,
          PCRE2_SPTR name, PCRE2_SPTR *first, PCRE2_SPTR *last);
 
-       When a pattern is compiled with the PCRE2_DUPNAMES  option,  names  for
-       capture  groups  are not required to be unique. Duplicate names are al-
-       ways allowed for groups with the same number, created by using the  (?|
+       When  a  pattern  is compiled with the PCRE2_DUPNAMES option, names for
+       capture groups are not required to be unique. Duplicate names  are  al-
+       ways  allowed for groups with the same number, created by using the (?|
        feature. Indeed, if such groups are named, they are required to use the
        same names.
 
-       Normally,  patterns  that  use duplicate names are such that in any one
-       match, only one of each set of identically-named  groups  participates.
+       Normally, patterns that use duplicate names are such that  in  any  one
+       match,  only  one of each set of identically-named groups participates.
        An example is shown in the pcre2pattern documentation.
 
-       When   duplicates   are   present,   pcre2_substring_copy_byname()  and
-       pcre2_substring_get_byname() return the first  substring  corresponding
-       to  the given name that is set. Only if none are set is PCRE2_ERROR_UN-
-       SET is returned. The  pcre2_substring_number_from_name()  function  re-
-       turns  the error PCRE2_ERROR_NOUNIQUESUBSTRING when there are duplicate
+       When  duplicates   are   present,   pcre2_substring_copy_byname()   and
+       pcre2_substring_get_byname()  return  the first substring corresponding
+       to the given name that is set. Only if none are set is  PCRE2_ERROR_UN-
+       SET  is  returned.  The pcre2_substring_number_from_name() function re-
+       turns the error PCRE2_ERROR_NOUNIQUESUBSTRING when there are  duplicate
        names.
 
-       If you want to get full details of all captured substrings for a  given
-       name,  you  must use the pcre2_substring_nametable_scan() function. The
-       first argument is the compiled pattern, and the second is the name.  If
-       the  third  and fourth arguments are NULL, the function returns a group
+       If  you want to get full details of all captured substrings for a given
+       name, you must use the pcre2_substring_nametable_scan()  function.  The
+       first  argument is the compiled pattern, and the second is the name. If
+       the third and fourth arguments are NULL, the function returns  a  group
        number for a unique name, or PCRE2_ERROR_NOUNIQUESUBSTRING otherwise.
 
        When the third and fourth arguments are not NULL, they must be pointers
-       to variables that are updated by the function. After it has  run,  they
+       to  variables  that are updated by the function. After it has run, they
        point to the first and last entries in the name-to-number table for the
-       given  name,  and the function returns the length of each entry in code
-       units. In both cases, PCRE2_ERROR_NOSUBSTRING is returned if there  are
+       given name, and the function returns the length of each entry  in  code
+       units.  In both cases, PCRE2_ERROR_NOSUBSTRING is returned if there are
        no entries for the given name.
 
        The format of the name table is described above in the section entitled
-       Information  about  a  pattern.  Given all the relevant entries for the
-       name, you can extract each of their numbers,  and  hence  the  captured
+       Information about a pattern. Given all the  relevant  entries  for  the
+       name,  you  can  extract  each of their numbers, and hence the captured
        data.
 
 
 FINDING ALL POSSIBLE MATCHES AT ONE POSITION
 
-       The  traditional  matching  function  uses a similar algorithm to Perl,
-       which stops when it finds the first match at a given point in the  sub-
+       The traditional matching function uses a  similar  algorithm  to  Perl,
+       which  stops when it finds the first match at a given point in the sub-
        ject. If you want to find all possible matches, or the longest possible
-       match  at  a  given  position,  consider using the alternative matching
-       function (see below) instead. If you cannot use the  alternative  func-
+       match at a given position,  consider  using  the  alternative  matching
+       function  (see  below) instead. If you cannot use the alternative func-
        tion, you can kludge it up by making use of the callout facility, which
        is described in the pcre2callout documentation.
 
        What you have to do is to insert a callout right at the end of the pat-
-       tern.   When your callout function is called, extract and save the cur-
-       rent matched substring. Then return 1, which  forces  pcre2_match()  to
-       backtrack  and  try other alternatives. Ultimately, when it runs out of
+       tern.  When your callout function is called, extract and save the  cur-
+       rent  matched  substring.  Then return 1, which forces pcre2_match() to
+       backtrack and try other alternatives. Ultimately, when it runs  out  of
        matches, pcre2_match() will yield PCRE2_ERROR_NOMATCH.
 
 
@@ -4049,27 +4057,27 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
          pcre2_match_context *mcontext,
          int *workspace, PCRE2_SIZE wscount);
 
-       The function pcre2_dfa_match() is called  to  match  a  subject  string
-       against  a  compiled pattern, using a matching algorithm that scans the
+       The  function  pcre2_dfa_match()  is  called  to match a subject string
+       against a compiled pattern, using a matching algorithm that  scans  the
        subject string just once (not counting lookaround assertions), and does
-       not backtrack (except when processing lookaround assertions). This  has
-       different  characteristics to the normal algorithm, and is not compati-
-       ble with Perl. Some of the features of  PCRE2  patterns  are  not  sup-
+       not  backtrack (except when processing lookaround assertions). This has
+       different characteristics to the normal algorithm, and is not  compati-
+       ble  with  Perl.  Some  of  the features of PCRE2 patterns are not sup-
        ported. Nevertheless, there are times when this kind of matching can be
-       useful.  For a discussion of the two matching algorithms, and a list of
+       useful. For a discussion of the two matching algorithms, and a list  of
        features that pcre2_dfa_match() does not support, see the pcre2matching
        documentation.
 
-       The arguments for the pcre2_dfa_match() function are the  same  as  for
+       The  arguments  for  the pcre2_dfa_match() function are the same as for
        pcre2_match(), plus two extras. The ovector within the match data block
        is used in a different way, and this is described below. The other com-
-       mon  arguments  are used in the same way as for pcre2_match(), so their
+       mon arguments are used in the same way as for pcre2_match(),  so  their
        description is not repeated here.
 
-       The two additional arguments provide workspace for  the  function.  The
-       workspace  vector  should  contain at least 20 elements. It is used for
-       keeping track of multiple paths through the pattern  tree.  More  work-
-       space  is needed for patterns and subjects where there are a lot of po-
+       The  two  additional  arguments provide workspace for the function. The
+       workspace vector should contain at least 20 elements. It  is  used  for
+       keeping  track  of  multiple paths through the pattern tree. More work-
+       space is needed for patterns and subjects where there are a lot of  po-
        tential matches.
 
        Here is an example of a simple call to pcre2_dfa_match():
@@ -4089,45 +4097,45 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
 
    Option bits for pcre2_dfa_match()
 
-       The unused bits of the options argument for pcre2_dfa_match()  must  be
-       zero.   The   only   bits   that   may   be   set  are  PCRE2_ANCHORED,
-       PCRE2_COPY_MATCHED_SUBJECT, PCRE2_ENDANCHORED, PCRE2_NOTBOL,  PCRE2_NO-
+       The  unused  bits of the options argument for pcre2_dfa_match() must be
+       zero.  The  only   bits   that   may   be   set   are   PCRE2_ANCHORED,
+       PCRE2_COPY_MATCHED_SUBJECT,  PCRE2_ENDANCHORED, PCRE2_NOTBOL, PCRE2_NO-
        TEOL,   PCRE2_NOTEMPTY,   PCRE2_NOTEMPTY_ATSTART,   PCRE2_NO_UTF_CHECK,
-       PCRE2_PARTIAL_HARD,   PCRE2_PARTIAL_SOFT,    PCRE2_DFA_SHORTEST,    and
-       PCRE2_DFA_RESTART.  All but the last four of these are exactly the same
+       PCRE2_PARTIAL_HARD,    PCRE2_PARTIAL_SOFT,    PCRE2_DFA_SHORTEST,   and
+       PCRE2_DFA_RESTART. All but the last four of these are exactly the  same
        as for pcre2_match(), so their description is not repeated here.
 
          PCRE2_PARTIAL_HARD
          PCRE2_PARTIAL_SOFT
 
-       These have the same general effect as they do  for  pcre2_match(),  but
-       the  details are slightly different. When PCRE2_PARTIAL_HARD is set for
-       pcre2_dfa_match(), it returns PCRE2_ERROR_PARTIAL if  the  end  of  the
+       These  have  the  same general effect as they do for pcre2_match(), but
+       the details are slightly different. When PCRE2_PARTIAL_HARD is set  for
+       pcre2_dfa_match(),  it  returns  PCRE2_ERROR_PARTIAL  if the end of the
        subject is reached and there is still at least one matching possibility
        that requires additional characters. This happens even if some complete
-       matches  have  already  been found. When PCRE2_PARTIAL_SOFT is set, the
-       return code PCRE2_ERROR_NOMATCH is converted  into  PCRE2_ERROR_PARTIAL
-       if  the  end  of  the  subject  is reached, there have been no complete
+       matches have already been found. When PCRE2_PARTIAL_SOFT  is  set,  the
+       return  code  PCRE2_ERROR_NOMATCH is converted into PCRE2_ERROR_PARTIAL
+       if the end of the subject is  reached,  there  have  been  no  complete
        matches, but there is still at least one matching possibility. The por-
-       tion of the string that was inspected when the  longest  partial  match
+       tion  of  the  string that was inspected when the longest partial match
        was found is set as the first matching string in both cases. There is a
-       more  detailed  discussion  of partial and multi-segment matching, with
+       more detailed discussion of partial and  multi-segment  matching,  with
        examples, in the pcre2partial documentation.
 
          PCRE2_DFA_SHORTEST
 
-       Setting the PCRE2_DFA_SHORTEST option causes the matching algorithm  to
+       Setting  the PCRE2_DFA_SHORTEST option causes the matching algorithm to
        stop as soon as it has found one match. Because of the way the alterna-
-       tive  algorithm  works, this is necessarily the shortest possible match
+       tive algorithm works, this is necessarily the shortest  possible  match
        at the first possible matching point in the subject string.
 
          PCRE2_DFA_RESTART
 
-       When pcre2_dfa_match() returns a partial match, it is possible to  call
+       When  pcre2_dfa_match() returns a partial match, it is possible to call
        it again, with additional subject characters, and have it continue with
        the same match. The PCRE2_DFA_RESTART option requests this action; when
-       it  is  set,  the workspace and wscount options must reference the same
-       vector as before because data about the match so far is  left  in  them
+       it is set, the workspace and wscount options must  reference  the  same
+       vector  as  before  because data about the match so far is left in them
        after a partial match. There is more discussion of this facility in the
        pcre2partial documentation.
 
@@ -4135,8 +4143,8 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
 
        When pcre2_dfa_match() succeeds, it may have matched more than one sub-
        string in the subject. Note, however, that all the matches from one run
-       of  the  function  start  at the same point in the subject. The shorter
-       matches are all initial substrings of the longer matches. For  example,
+       of the function start at the same point in  the  subject.  The  shorter
+       matches  are all initial substrings of the longer matches. For example,
        if the pattern
 
          <.*>
@@ -4151,80 +4159,80 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
          <something> <something else>
          <something>
 
-       On  success,  the  yield of the function is a number greater than zero,
-       which is the number of matched substrings.  The  offsets  of  the  sub-
-       strings  are returned in the ovector, and can be extracted by number in
-       the same way as for pcre2_match(), but the numbers bear no relation  to
-       any  capture groups that may exist in the pattern, because DFA matching
+       On success, the yield of the function is a number  greater  than  zero,
+       which  is  the  number  of  matched substrings. The offsets of the sub-
+       strings are returned in the ovector, and can be extracted by number  in
+       the  same way as for pcre2_match(), but the numbers bear no relation to
+       any capture groups that may exist in the pattern, because DFA  matching
        does not support capturing.
 
-       Calls to the convenience functions that extract substrings by name  re-
+       Calls  to the convenience functions that extract substrings by name re-
        turn the error PCRE2_ERROR_DFA_UFUNC (unsupported function) if used af-
-       ter  a  DFA match. The convenience functions that extract substrings by
+       ter a DFA match. The convenience functions that extract  substrings  by
        number never return PCRE2_ERROR_NOSUBSTRING.
 
-       The matched strings are stored in  the  ovector  in  reverse  order  of
-       length;  that  is,  the longest matching string is first. If there were
-       too many matches to fit into the ovector, the yield of the function  is
+       The  matched  strings  are  stored  in  the ovector in reverse order of
+       length; that is, the longest matching string is first.  If  there  were
+       too  many matches to fit into the ovector, the yield of the function is
        zero, and the vector is filled with the longest matches.
 
-       NOTE:  PCRE2's  "auto-possessification" optimization usually applies to
-       character repeats at the end of a pattern (as well as internally).  For
-       example,  the pattern "a\d+" is compiled as if it were "a\d++". For DFA
-       matching, this means that only one possible match is found. If you  re-
+       NOTE: PCRE2's "auto-possessification" optimization usually  applies  to
+       character  repeats at the end of a pattern (as well as internally). For
+       example, the pattern "a\d+" is compiled as if it were "a\d++". For  DFA
+       matching,  this means that only one possible match is found. If you re-
        ally do want multiple matches in such cases, either use an ungreedy re-
-       peat  such as "a\d+?" or set the PCRE2_NO_AUTO_POSSESS option when com-
+       peat such as "a\d+?" or set the PCRE2_NO_AUTO_POSSESS option when  com-
        piling.
 
    Error returns from pcre2_dfa_match()
 
        The pcre2_dfa_match() function returns a negative number when it fails.
-       Many of the errors are the same  as  for  pcre2_match(),  as  described
+       Many  of  the  errors  are  the same as for pcre2_match(), as described
        above.  There are in addition the following errors that are specific to
        pcre2_dfa_match():
 
          PCRE2_ERROR_DFA_UITEM
 
-       This  return  is  given  if pcre2_dfa_match() encounters an item in the
-       pattern that it does not support, for instance, the use of \C in a  UTF
+       This return is given if pcre2_dfa_match() encounters  an  item  in  the
+       pattern  that it does not support, for instance, the use of \C in a UTF
        mode or a backreference.
 
          PCRE2_ERROR_DFA_UCOND
 
-       This  return  is given if pcre2_dfa_match() encounters a condition item
+       This return is given if pcre2_dfa_match() encounters a  condition  item
        that uses a backreference for the condition, or a test for recursion in
        a specific capture group. These are not supported.
 
          PCRE2_ERROR_DFA_UINVALID_UTF
 
-       This return is given if pcre2_dfa_match() is called for a pattern  that
-       was  compiled  with  PCRE2_MATCH_INVALID_UTF. This is not supported for
+       This  return is given if pcre2_dfa_match() is called for a pattern that
+       was compiled with PCRE2_MATCH_INVALID_UTF. This is  not  supported  for
        DFA matching.
 
          PCRE2_ERROR_DFA_WSSIZE
 
-       This return is given if pcre2_dfa_match() runs  out  of  space  in  the
+       This  return  is  given  if  pcre2_dfa_match() runs out of space in the
        workspace vector.
 
          PCRE2_ERROR_DFA_RECURSE
 
        When a recursion or subroutine call is processed, the matching function
-       calls  itself  recursively,  using  private  memory for the ovector and
-       workspace.  This error is given if the internal ovector  is  not  large
-       enough.  This  should  be  extremely  rare, as a vector of size 1000 is
+       calls itself recursively, using private  memory  for  the  ovector  and
+       workspace.   This  error  is given if the internal ovector is not large
+       enough. This should be extremely rare, as a  vector  of  size  1000  is
        used.
 
          PCRE2_ERROR_DFA_BADRESTART
 
-       When pcre2_dfa_match() is called  with  the  PCRE2_DFA_RESTART  option,
-       some  plausibility  checks  are  made on the contents of the workspace,
-       which should contain data about the previous partial match. If  any  of
+       When  pcre2_dfa_match()  is  called  with the PCRE2_DFA_RESTART option,
+       some plausibility checks are made on the  contents  of  the  workspace,
+       which  should  contain data about the previous partial match. If any of
        these checks fail, this error is given.
 
 
 SEE ALSO
 
-       pcre2build(3),    pcre2callout(3),    pcre2demo(3),   pcre2matching(3),
+       pcre2build(3),   pcre2callout(3),    pcre2demo(3),    pcre2matching(3),
        pcre2partial(3), pcre2posix(3), pcre2sample(3), pcre2unicode(3).
 
 
diff --git a/doc/pcre2api.3 b/doc/pcre2api.3
index 52f2cd726..4d01c3ae5 100644
--- a/doc/pcre2api.3
+++ b/doc/pcre2api.3
@@ -4038,9 +4038,17 @@ above).
 The \fBpcre2_set_substitution_callout()\fP function can be used to specify a
 callout function for \fBpcre2_substitute()\fP. This information is passed in
 a match context. The callout function is called after each substitution has
-been processed, but it can cause the replacement not to happen. The callout
-function is not called for simulated substitutions that happen as a result of
-the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
+been processed, but it can cause the replacement not to happen.
+.P
+The callout function is not called for simulated substitutions that happen as a
+result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. In this mode, when
+substitution processing exceeds the buffer space provided by the caller,
+processing continues by counting code units. The simulation is unable to
+populate the callout block, and so the simulation is pessimistic about the
+required buffer size. The larger of the sizes needed for accepting or rejecting
+the substitution is reported. Therefore, the returned buffer length may be
+an overestimate (without a substitution callout, it is normally an exact
+measurement).
 .P
 The first argument of the callout function is a pointer to a substitute callout
 block structure, which contains the following fields, not necessarily in this
diff --git a/src/pcre2_substitute.c b/src/pcre2_substitute.c
index d8d31674a..2f9774c50 100644
--- a/src/pcre2_substitute.c
+++ b/src/pcre2_substitute.c
@@ -341,6 +341,7 @@ PCRE2_SIZE buff_offset, buff_length, lengthleft, fraglength;
 PCRE2_SIZE *ovector;
 PCRE2_SIZE ovecsave[3];
 pcre2_substitute_callout_block scb;
+PCRE2_SIZE sub_start_extra_needed;
 
 /* General initialization */
 
@@ -583,14 +584,16 @@ do
     }
   subs++;
 
-  /* Copy the text leading up to the match (unless not required), and remember
-  where the insert begins and how many ovector pairs are set. */
+  /* Copy the text leading up to the match (unless not required); remember
+  where the insert begins and how many ovector pairs are set; and remember how
+  much space we have requested in extra_needed. */
 
   if (rc == 0) rc = ovector_count;
   fraglength = ovector[0] - start_offset;
   if (!replacement_only) CHECKMEMCPY(subject + start_offset, fraglength);
   scb.output_offsets[0] = buff_offset;
   scb.oveccount = rc;
+  sub_start_extra_needed = extra_needed;
 
   /* Process the replacement string. If the entire replacement is literal, just
   copy it with length check. */
@@ -1148,7 +1151,10 @@ do
   remembered. Do the callout if there is one and we have done an actual
   replacement. */
 
-  if (!overflowed && mcontext != NULL && mcontext->substitute_callout != NULL)
+  if (mcontext == NULL || mcontext->substitute_callout == NULL)
+    {}
+
+  else if (!overflowed)
     {
     scb.subscount = subs;
     scb.output_offsets[1] = buff_offset;
@@ -1172,6 +1178,32 @@ do
       }
     }
 
+  /* In this interesting case, we cannot do the callout, so it's hard to
+  estimate the required buffer size. What callers want is to be able to make
+  two calls to pcre2_substitute(), once with PCRE2_SUBSTITUTE_OVERFLOW_LENGTH to
+  discover the buffer size, and then a second and final call. Older versions of
+  PCRE2 violated this assumption, by proceeding as if the callout had returned
+  zero - but on the second call to pcre2_substitute() it could return non-zero
+  and then overflow the buffer again. Callers probably don't want to keep on
+  looping to incrementally discover the buffer size. */
+
+  else
+    {
+    PCRE2_SIZE newlength = (buff_offset - scb.output_offsets[0]) +
+        (extra_needed - sub_start_extra_needed);
+    PCRE2_SIZE oldlength = ovector[1] - ovector[0];
+
+    /* Be pessimistic: request whichever buffer size is larger out of
+    accepting or rejecting the substitution. */
+
+    if (oldlength > newlength)
+      extra_needed += oldlength - newlength;
+
+    /* Proceed as if the callout did not return a negative. A negative
+    effectively rejects all future substitutions, but we want to examine them
+    pessimistically. */
+    }
+
   /* Save the details of this match. See above for how this data is used. If we
   matched an empty string, do the magic for global matches. Update the start
   offset to point to the rest of the subject string. If we re-used an existing
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index 1e1b91cec..a8669dccc 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -17421,9 +17421,9 @@ Subject length lower bound = 1
 
 /a(b)c/substitute_overflow_length,substitute_callout,replace=[1]12
     abc\=substitute_skip=1
-Failed: error -48: no more memory: 3 code units are needed
+Failed: error -48: no more memory: 4 code units are needed
     abc
-Failed: error -48: no more memory: 3 code units are needed
+Failed: error -48: no more memory: 4 code units are needed
 
 /a(b)c/substitute_overflow_length,substitute_callout,replace=[2]12
     abc\=substitute_skip=1