forked from getsentry/self-hosted
feat(cdc): Prepare the self hosted environment for the Change Data Capture pipeline (getsentry#938)

We will use Change Data Capture to stream WAL updates from postgres into clickhouse so that features like issue search will be able to join event data and metadata (from postgres) through Snuba.

This requires the following:
- A logical replication plugin to be installed in postgres (https://github.com/getsentry/wal2json)
- A service that streams from the replication log to Kafka (https://github.com/getsentry/cdc)
- Datasets in Snuba

This PR prepares postgres to stream updates via the replication log. The idea is to:
- download the replication log plugin binary during install.sh
- mount a volume with the binary when starting postgres
- provide a new entrypoint to postgres that ensures everything is correctly configured

There is a difference between how this is set up here and how we do the same in the development environment. In the development environment we download the library from the entrypoint itself and store it in a persistent volume, so we do not have to download it every time. Unfortunately this does not work here: the postgres image here is postgres:9.6, while in development it is postgres:9.6-alpine, and postgres:9.6 does not come with either wget or curl. I don't think installing them in the entrypoint would be a good idea, so the download happens in install.sh. I actually think this way is safer, since postgres never depends on connectivity to start properly.
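
The mount-and-entrypoint step described above could look roughly like the following when written as a plain `docker run` invocation. This is a sketch under assumptions: the helper name `build_postgres_cmd`, the script name `postgres-entrypoint.sh`, the host directory, and the `wal_level=logical` server flag are illustrative and are not taken from the visible diff; only the `/opt/sentry` target directory and the `postgres:9.6` image appear in the source. The function only prints the command, so it can be inspected without starting a container.

```shell
#!/bin/bash
# Sketch only: prints (does not run) a docker command that mounts the plugin
# directory read-only and swaps in a custom entrypoint.
# ASSUMPTIONS: the host directory, the entrypoint file name, and the
# wal_level flag are illustrative, not taken from the PR.
build_postgres_cmd() {
  local host_dir="$1"
  # -v mounts the plugin and scripts; --entrypoint replaces the image default;
  # the trailing arguments are handed to the entrypoint as "$@".
  local cmd=(
    docker run --rm
    -v "$host_dir:/opt/sentry:ro"
    --entrypoint /opt/sentry/postgres-entrypoint.sh
    postgres:9.6
    postgres -c wal_level=logical
  )
  echo "${cmd[*]}"
}

build_postgres_cmd "$PWD/postgres"
```

Because the binary is already on the mounted volume, nothing in this command needs network access at container start, which matches the reasoning in the commit message.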
Showing 8 changed files with 101 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
echo "${_group}Downloading and installing wal2json ..." | ||
|
||
FILE_TO_USE="../postgres/wal2json/wal2json.so" | ||
ARCH=$(uname -m) | ||
FILE_NAME="wal2json-Linux-$ARCH-glibc.so" | ||
|
||
DOCKER_CURL="docker run --rm curlimages/curl" | ||
|
||
if [[ $WAL2JSON_VERSION == "latest" ]]; then | ||
VERSION=$( | ||
$DOCKER_CURL https://api.github.com/repos/getsentry/wal2json/releases/latest | | ||
grep '"tag_name":' | | ||
sed -E 's/.*"([^"]+)".*/\1/' | ||
) | ||
|
||
if [[ ! $VERSION ]]; then | ||
echo "Cannot find wal2json latest version" | ||
exit 1 | ||
fi | ||
else | ||
VERSION=$WAL2JSON_VERSION | ||
fi | ||
|
||
mkdir -p ../postgres/wal2json | ||
if [ ! -f "../postgres/wal2json/$VERSION/$FILE_NAME" ]; then | ||
mkdir -p "../postgres/wal2json/$VERSION" | ||
$DOCKER_CURL -L \ | ||
"https://github.com/getsentry/wal2json/releases/download/$VERSION/$FILE_NAME" \ | ||
> "../postgres/wal2json/$VERSION/$FILE_NAME" | ||
fi | ||
|
||
cp "../postgres/wal2json/$VERSION/$FILE_NAME" "$FILE_TO_USE" | ||
|
||
echo "${_endgroup}" |
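
The version lookup in this script relies on a `grep`/`sed` pipeline over the GitHub releases API response. A minimal sketch of that step in isolation, assuming a pretty-printed JSON payload with one key per line. The function names `extract_tag_name` and `wal2json_url` are hypothetical helpers, not part of the PR, and the sample payload (including the tag `0.0.2`) is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helpers illustrating the version lookup; not part of the PR.

# Read a "releases/latest" JSON payload on stdin and print the tag name.
# Assumes the payload is pretty-printed with one key per line.
extract_tag_name() {
  grep '"tag_name":' | sed -E 's/.*"([^"]+)".*/\1/'
}

# Build the release asset URL for a given version and architecture.
wal2json_url() {
  local version="$1" arch="$2"
  echo "https://github.com/getsentry/wal2json/releases/download/$version/wal2json-Linux-$arch-glibc.so"
}

# Made-up sample payload; a real response has many more fields.
sample='{
  "url": "https://example.invalid/releases/1",
  "tag_name": "0.0.2",
  "name": "sample"
}'

version=$(printf '%s\n' "$sample" | extract_tag_name)
wal2json_url "$version" "x86_64"
```

The `sed` expression keeps the last double-quoted string on the matching line, which is the tag value rather than the key. This would break on a minified single-line response, so a JSON-aware tool may be a safer choice where one is available.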
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
#!/bin/bash | ||
# Initializes the pg_hba file with access permissions to the replication | ||
# slots. | ||
|
||
set -e | ||
|
||
{ echo "host replication all all trust"; } >> "$PGDATA/pg_hba.conf" |
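
The appended record follows the `pg_hba.conf` layout for `host` entries: connection type, database, user, client address, and authentication method. As a small illustration, the sketch below splits the rule into those fields. `describe_hba_rule` is a hypothetical helper introduced only for this example:

```shell
#!/bin/bash
# Split a pg_hba.conf "host" record into its five whitespace-separated fields
# and print them with labels. describe_hba_rule is a hypothetical helper.
describe_hba_rule() {
  local conn_type database user address method
  read -r conn_type database user address method <<< "$1"
  echo "type=$conn_type database=$database user=$user address=$address method=$method"
}

describe_hba_rule "host replication all all trust"
```

Read this way, the rule accepts replication connections from any user and any address without a password (`trust`), so it is only as restrictive as the network the postgres container is reachable from.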
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
#!/bin/bash | ||
# This script replaces the default docker entrypoint for postgres in the | ||
# self-hosted environment. | ||
# Its job is to ensure postgres is properly configured to support the | ||
# Change Data Capture pipeline (by setting access permissions and installing | ||
# the replication plugin we use for CDC). Unfortunately the default | ||
# Postgres image does not allow this level of configurability so we need | ||
# to do it this way in order not to have to publish and maintain our own | ||
# Postgres image. | ||
# | ||
# This then, at the end, transfers control to the default entrypoint. | ||
|
||
set -e | ||
|
||
prep_init_db() { | ||
cp /opt/sentry/init_hba.sh /docker-entrypoint-initdb.d/init_hba.sh | ||
} | ||
|
||
cdc_setup_hba_conf() { | ||
# Ensure pg_hba.conf is properly configured to allow connections | ||
# to the replication slots. | ||
|
||
PG_HBA="$PGDATA/pg_hba.conf" | ||
if [ ! -f "$PG_HBA" ]; then | ||
echo "DB not initialized. Postgres will take care of pg_hba" | ||
elif grep -q -E "^host[[:space:]]+replication" "$PG_HBA"; then | ||
echo "Replication config already present in pg_hba. Not changing anything." | ||
else | ||
# Execute the same script we run on DB initialization | ||
/opt/sentry/init_hba.sh | ||
fi | ||
} | ||
|
||
bind_wal2json() { | ||
# Copy the file in the right place | ||
cp /opt/sentry/wal2json/wal2json.so "$(pg_config --pkglibdir)/wal2json.so" | ||
} | ||
|
||
echo "Setting up Change Data Capture" | ||
|
||
prep_init_db | ||
if [ "$1" = 'postgres' ]; then | ||
cdc_setup_hba_conf | ||
bind_wal2json | ||
fi | ||
exec /docker-entrypoint.sh "$@" |
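
The three branches of `cdc_setup_hba_conf` can be exercised outside a container. The sketch below mirrors that decision against a throwaway directory. `hba_action` and its return strings are hypothetical names chosen for illustration, and the function only reports what would happen instead of modifying anything:

```shell
#!/bin/bash
# Hypothetical helper mirroring the three branches of cdc_setup_hba_conf:
# prints which action the entrypoint would take for a given data directory.
hba_action() {
  local hba_file="$1/pg_hba.conf"
  if [ ! -f "$hba_file" ]; then
    # Fresh data directory: init_hba.sh runs later from docker-entrypoint-initdb.d.
    echo "skip-uninitialized"
  elif grep -q -E '^host[[:space:]]+replication' "$hba_file"; then
    echo "skip-already-configured"
  else
    echo "append-rule"
  fi
}

# Exercise the three cases against a throwaway directory.
demo_dir=$(mktemp -d)
hba_action "$demo_dir"                                  # no pg_hba.conf yet
echo "local all all trust" > "$demo_dir/pg_hba.conf"
hba_action "$demo_dir"                                  # file without the rule
echo "host replication all all trust" >> "$demo_dir/pg_hba.conf"
hba_action "$demo_dir"                                  # rule already present
rm -rf "$demo_dir"
```

The middle check is what keeps the entrypoint idempotent: restarting the container does not append a second copy of the replication rule.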