Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Details on new maintainership #603

Merged
merged 18 commits into from
Dec 11, 2024
Merged

Details on new maintainership #603

merged 18 commits into from
Dec 11, 2024

Conversation

NWilson
Copy link
Member

@NWilson NWilson commented Dec 7, 2024

It's a start!

Main goals:

  • Gratefully acknowledge contributions
  • Inform users of maintainship change
  • Don't offend anyone

I also want to tidy up some files in git. In a future commit I will probably remove some of the auto-generated files (eg some of those made by PrepareRelease) so that we aren't checking in tool outputs onto the main branch.

I will do whatever renaming (or markdown→text) is required to keep the same files in the release tarball.

@PhilipHazel
Copy link
Collaborator

The PrepareRelease script makes the HTML files from the man pages. I was explicitly asked to add these to the repo so that those who want to compile from HEAD get the same information. It was easiest just to do it that way - and try to remember to run "PrepareRelease doc" whenever I updated the man pages.

@zherczeg
Copy link
Collaborator

zherczeg commented Dec 7, 2024

Nice!

@NWilson
Copy link
Member Author

NWilson commented Dec 7, 2024

I agree Philip, the docs should be checked-in I think, so that a master checkout is basically usable as-is, equivalent to a tarball release. There are just one or two other files though that I think I could prune (index.md, for a start!)

@NWilson NWilson force-pushed the user/niwilson/docs-tidy branch 5 times, most recently from 3b892f2 to 734f244 Compare December 8, 2024 19:44
@NWilson NWilson force-pushed the user/niwilson/docs-tidy branch 2 times, most recently from 8cf8fda to fce20d3 Compare December 8, 2024 19:51
LICENCE.md Outdated Show resolved Hide resolved
Comment on lines 230 to 236
perl maint/Detrail \
$files \
cmake/{*.cmake,*.cmake.in} \
m4/* \
src/pcre2_*.{h,c} \
doc/p* \
doc/html/*
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some extra files to Detrail. We've added more source files since 10.44, and some of them have some trailing whitespace errors.

Comment on lines -4 to -5
# processing of the documentation, detrails files, and creates pcre2.h.generic
# and config.h.generic (for use by builders who can't run ./configure).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite the comment at the top saying it creates {config,pcre2}.h.generic, I can't find any evidence that it has ever done so. I have added a CI job to ensure that these files stay up to date, using the existing make recipe in Makefile.am to generate them.

@NWilson NWilson mentioned this pull request Dec 9, 2024
@NWilson
Copy link
Member Author

NWilson commented Dec 9, 2024

I've been putting lots of random dist & tarball related tweaks & fixes on this branch.

I'm going to leave it now. I think it's OK to merge.

Comment on lines +108 to +109
name: GCC -Os, old Autotools
runs-on: ubuntu-20.04
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly... the changes between Automake 1.15 and 1.16 turn out to be somewhat annoying, and it took several attempts to make my changes to Makefile.am. We didn't have an autoconf build on old Ubuntu, which would have helped me to catch the problems.

Comment on lines +304 to +308
echo "Dirty working tree! Affected files:"
git status --porcelain || true
echo ""
echo "Diff:"
git diff || true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll now get build failures if you update documentation files, but forget to sync the HTML or TXT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless we improve the tooling, this sounds a little restrictive.

Truth is that with Philip being the only native English speaker, his "final" touches over the generated documentation were an invaluable part of making sure it is as high quality as it is (including of course, updating the date with the necessary "fixes")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the docs can still be done by me (or other native English speaker). But there seems to be no good use-case for having the different copies of the documentation be out-of-sync with each other. If that ever happens, it's simply because a contributor forgot to sync the HTML or TXT files after updating the man pages.

If you'd prefer, we make it so that the CI job applies the changes and pushes a bot-authored commit to the PR. That seemed like a bit too much infrastructure though, for a simple sanity-check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I saw it was that the "source" files for the documentation were being updated by the developer, and the merge that Philip did which usually included the "documentation fixes" and "generated files" was akin to a compilation and "release" of the documentation.

specially since there is a copy of it that is autogenerated and publicly available in the web site as part of the github.io content.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I had completely forgotten to mention github.io to Nick. Apologies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware of it. It hosts index.md (which is a duplicate of README.md), and the HTML files in docs/.

We could do what some projects do, and have different versions of the docs available (for each library version) so the user has a choice of whether to read the "master" or "10.44" docs.

In a future PR, when I spruce up that website a bit, I'll probably just make it so it shows the docs for the last-released version of PCRE2.

@@ -19,7 +19,7 @@ copy_file(
out = "src/pcre2_chartables.c",
)

# Removed src/pcre2_ucptables.c below because it is #included in
# Removed src/pcre2_ucptables.c below because it is #included in
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensured that Detrail is run on all files that are part of the output tarball (...except the files where we deliberately don't want to run it, eg CRLF files, and any/all tool outputs from autoconf).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I would like to see that ALL our files get detrail, which necessary implies we need to teach it to work also in Windows style line endings.

I can see there might be a few exceptions (ex: test files that require explicit trailing blank characters), but even those could be exempted at the line number level IMHO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just have two nasty old .bat files in the repo with CRLF. They're OK for now (I have a task to convert them to .ps1 files someday - which is vastly better than bat). The test files are treated as "binary" and so I haven't proactively detrailed them. But, all the other files in the repo have now been added to Detrail.

Comment on lines +1262 to +1273
file(GLOB html ${PROJECT_SOURCE_DIR}/doc/html/*.html ${PROJECT_SOURCE_DIR}/doc/html/*.txt)
file(
GLOB txts
${PROJECT_SOURCE_DIR}/doc/*.txt
AUTHORS.md
COPYING
ChangeLog
LICENCE.md
NEWS
README
SECURITY.md
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed discrepancy between cmake --install and ./configure && make install. They should install the same files ... and they do now.

@@ -0,0 +1,368 @@
Installation Instructions
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably edit this file sometime. The default Autoconf one is pretty bland and generic.

I checked it in, in preparation for that, and also to try working around an annoying issue in automake (see other comments on "foreign" flag).

Comment on lines +282 to +286
EXTRA_DIST += \
BUILD.bazel \
MODULE.bazel \
WORKSPACE.bazel \
build.zig
Copy link
Member Author

@NWilson NWilson Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within reason, I feel that users grabbing the tarball should be getting the same experience as users who check out from git (except that the tarball users don't need to run autogen.sh). So I've added some missing files to the tarball.

Copy link
Contributor

@carenas carenas Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Carefully here, since these specific files are usually updated in non backward compatible ways, and are (I presume) used by some external build process that most likely only uses the git files.

Comment on lines -288 to -293
PrepareRelease \
CheckMan \
CleanTxt \
Detrail \
132html \
doc/index.html.src
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel we need to be bundling into the tarball all our maintainer scripts.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed not, nowadays. I think they were originally added way back, certainly before GitHub, probably even before Svn, when tarball was the only way to download anything. Ah yes, its documented as happening in 2007.

Comment on lines +902 to +903
RunTest.bat a Windows batch file for running tests
RunGrepTest.bat a Windows batch file for pcre2grep tests
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm satisfied that the tarball contents are correct; and that the README is up to date now with its "manifest" of the tarball contents.

@@ -25,7 +25,7 @@ m4_define(libpcre2_posix_version, [3:5:0])
AC_PREREQ([2.60])
AC_INIT([PCRE2],pcre2_major.pcre2_minor[]pcre2_prerelease,[],[pcre2])
AC_CONFIG_SRCDIR([src/pcre2.h.in])
AM_INIT_AUTOMAKE([dist-bzip2 dist-zip])
AM_INIT_AUTOMAKE([dist-bzip2 dist-zip foreign])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not ideal... but I ended up adding "foreign" as the path of least resistance.

The problem is, I want to just add some "FOO.md" files to the tarball. That's totally utterly reasonable. And Automake 1.16 endorses this: it special-cases them, according to the fact the latest version of the GNU packaging standards say, "great, use FOO.md instead of FOO files".

But then you have the annoying issue, because Automake 1.15 does NOT special-case them, it turned out to be really awkward to get the same behaviour from both Automake versions, just using the original configure.ac file.

So, this is the route I went down.

Autotools makes me cry (apologies if anyone is a fan...) but there's just so much time wasted every time you try and make modest changes, on every OSS or commercial project I've ever worked on that used Autotools!

@@ -0,0 +1,64 @@
#!/usr/bin/perl
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New file, to stop us checking in files with CRLF endings, or invalid UTF-8, or anything non-ASCII in files that get processed by the C compiler. (Unicode documentation files are fine.)

Just like existing files in maint/, only runs on relevant parts of the tarball (eg. leaves testdata well alone).

@@ -1,470 +0,0 @@
/* src/config.h.in. Generated from configure.ac by autoheader. */
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not checking in to git any of the outputs from ./autogen.sh - except this one? I'm not sure why. We do check in config.h.generic, which is reasonable (to me).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good question. This file always changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's just a historical. It's been in the repo since 2015 - Philip had previous removed it in 2014, then checked it back in again.

Philip added it back alongside src/config.h, which was subsequently removed (again). But config.h.in was left.

I think it should just go. We have config.h.generic for users who want to work with an alternative build system to autoconf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be wrong, but I think the idea was that then you can base your build in this file without having to run auto tools at all.

Agree that is somehow redundant with config.h.generic, but it is because "generic" is a derivative from it (through a Makefile rule)

Copy link
Member Author

@NWilson NWilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some verbose justification for the changes I've made.

@@ -58,7 +58,7 @@ sure both macros are undefined; an emulation function will then be used. */
/* Define this if your compiler supports __attribute__((uninitialized)) */
/* #undef HAVE_ATTRIBUTE_UNINITIALIZED */

/* Define to 1 if you have the 'bcopy' function. */
/* Define to 1 if you have the `bcopy' function. */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a new style?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, yes, I noticed this too.

autotools-mirror/autoconf@64df9b4#diff-f3bb1e17a70d2e4733de29089bdd13e951406a1a15e4b6b0f5688668f5acbb2bR79

I found a change made a few years ago on the ‎lib/autoconf/functions.m4:80 file:

# Prepare the autoheader snippet for FUNCTION.
m4_define([_AH_CHECK_FUNC],
[AH_TEMPLATE(AS_TR_CPP([HAVE_$1]),
-  [Define to 1 if you have the `$1' function.])])
+  [Define to 1 if you have the '$1' function.])])

# _AC_CHECK_FUNCS_ONE_U(FUNCTION)
# -------------------------------

So apparently the 'straight' quotes are the new style, in Autoconf 2.72 (shipped in Ubuntu 24.10 but not 24.04).

I honestly have no idea how the straight quotes came to be in the existing copy of config.h.generic. It's clearly a quite out-of-date file, not touched since the last PCRE2 release. I hypothesise that Philip perhaps had Autoconf 2.72 installed on his machine which last performed the update.

Let's just go with whatever style is produced by the version of Autoconf in Ubuntu 24.04, since that's what is most convenient to use in CI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is because Philip runs with LC_ALL=C to avoid introducing "Unicode" quotes like this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an ASCII ` backtick character.

I tried re-running export LANG=C LC_ALL=C; ./autogen.sh; ./configure.sh; rm src/config.h.generic; make src/config.h.generic all in a totally clean checkout. No change.

It really is due to the commit I linked above - the Autoconf behaviour has changed.

Philip uses Arch Linux I remember now. They updated to Autoconf 2.72, before Ubuntu did. So that's how Philip was using it this summer when he did the 10.44 release.

AUTHORS.md Outdated

The maintainers are grateful for all contributions and participation over the years, whether or not listed here in this recent summary.
This list is includes names up until the PCRE2 10.44 release. New names will be
Copy link
Contributor

@carenas carenas Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "is" is not grammatically correct here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!! Well spotted, thanks.

@NWilson NWilson merged commit ac528f2 into master Dec 11, 2024
23 checks passed
@NWilson NWilson deleted the user/niwilson/docs-tidy branch December 11, 2024 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants