Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: segfault when sending data to unixgram #251

Closed
mohag opened this issue Jul 23, 2019 · 4 comments
Closed

Regression: segfault when sending data to unixgram #251

mohag opened this issue Jul 23, 2019 · 4 comments
Labels

Comments

@mohag
Copy link

mohag commented Jul 23, 2019

The current statds_exporter crashes with a segmentation fault if data is sent to the unix socket.

I'm using random test data sent with ncat for a quick test.

Error on the bad revision from the bisect: (Line numbers in master might be slightly different)

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x87ce83]

goroutine 65 [running]:
main.(*StatsDUnixgramListener).handlePacket(0xc00000fbc0, 0xc0002bdfb1, 0x1b, 0xffff)
        /home/gertvdb/src/statsd_exporter/exporter.go:548 +0x133
main.(*StatsDUnixgramListener).Listen(0xc00000fbc0)
        /home/gertvdb/src/statsd_exporter/exporter.go:539 +0x8a
created by main.main
        /home/gertvdb/src/statsd_exporter/main.go:232 +0x23e0

git bisect gave me this commit: (A manual test of that vs HEAD~ (relative to that) agrees)

c7e76967c8a192d39fb201c844cd4f42ed70cd52 is the first bad commit
commit c7e76967c8a192d39fb201c844cd4f42ed70cd52
Author: Clayton O'Neill <claytono@github.com>
Date:   Sun May 26 09:08:54 2019 -0400

    Add internal event queuing and flushing

    At high traffic levels, the locking around sending on channels can cause
    a large amount of blocking and CPU usage.  These adds an event queue
    mechanism so that events are queued for short period of time, and
    flushed in batches to the main exporter goroutine periodically.

    The default is is to flush every 1000 events, or every 200ms, whichever
    happens first.

    Signed-off-by: Clayton O'Neill <claytono@github.com>

:100644 100644 749d1a81046def251a122bf1705254597f5c7b41 6a43b2c5c4d763e23a867601311edfcce00fe751 M      README.md
:100644 100644 688d7bd86de4d591ef00358d68e41515593dbd9b e243ed16622d85c30af8a9585a2b4c81014eb9d9 M      bridge_test.go
:000000 100644 0000000000000000000000000000000000000000 258b92304f65668f2076aa49eba8ed2d6263b684 A      event.go
:000000 100644 0000000000000000000000000000000000000000 97a27228e89f5afdba69ff2cbccc90ed9470810f A      event_test.go
:100644 100644 0485394e4b2c57690c07cbc93b26d2934fa8ae22 c252e828bb4969a3b13e2da6208f469d3eb32451 M      exporter.go
:100644 100644 52cf1732a240b35a7379182815ec731de40c39f8 7978ce3c8e32d0a9d27084345bffdf5ab8732486 M      exporter_benchmark_test.go
:100644 100644 747923253ab999cb8aeddf4b3d217603ef698e48 7172004212f6245b24130ab9a10e3ddb8585f6c6 M      exporter_test.go
:100644 100644 26fe6172497e64212dee6c2d7df8081ce1de60d0 0379c50848e7b7dbd7d303a117d585465d4bc604 M      main.go
:100644 100644 c338a4499ab32c0677f43656729613a415aa2d9e 7d77b710cbcd1c01bf64d2a55f5920bdd83667f0 M      telemetry.go
bisect run success

Files added in root of repo copy to run automated bisect: (I ran the test.sh in Docker, since I was testing from WSL, where the Unix sockets does not work properly, so Docker was an easy solution) (On a Linux workstation, test_code.sh can likely just build and run test.sh directly)

Dockerfile-test (Debian used due to the need to install stuff, like ncat)

FROM debian:latest

COPY ./statsd_exporter /bin/statsd_exporter
COPY ./test.sh /bin/test.sh
RUN apt-get update && apt-get install -y procps ncat && chmod +x /bin/test.sh && apt clean

USER        nobody
EXPOSE      9102 9125 9125/udp
ENTRYPOINT  [ "/bin/test.sh" ]

test.sh (this basically works by checking whether the process is still running after sending data to the unix socket)

#!/bin/sh

/bin/statsd_exporter --statsd.listen-unixgram="/tmp/statsd.socket" &

sleep 1

STATSD_PID="$!"

echo "deploys.test.myservice:1|c" | ncat -U --udp /tmp/statsd.socket

sleep 1

ps -f -p $STATSD_PID

if kill $STATSD_PID; then
        exit 0
else
        exit 1
fi

test_code.sh

#!/bin/sh

# Minimise docker build context
echo "*" > .dockerignore
echo "!test.sh" >> .dockerignore
echo "!statsd_exporter" >> .dockerignore

go build

docker build -t statsd-test -f Dockerfile-test .
docker run --rm -it statsd-test
RES=$?

git checkout .dockerignore  # Since I'm messing with it to make the build quicker

exit $RES

Bisect process:

git bisect start
git bisect bad   # on current master
git bisect good v0.10.2
git bisect run ./test_code.sh
@mohag
Copy link
Author

mohag commented Jul 23, 2019

It seems like #235 might be related to this as well. (Possibly a similar oversight for the unigram listener to the TCP one) (The fix seems similar as well)

@matthiasr matthiasr added the bug label Jul 23, 2019
@matthiasr
Copy link
Contributor

Thank you for the report! Would you mind submitting the fix as a PR then?

@mohag
Copy link
Author

mohag commented Jul 23, 2019

@matthiasr #252 submitted... (IT seems to be running some of the build still)

@mohag
Copy link
Author

mohag commented Jul 25, 2019

Fixed in v0.12.2

@mohag mohag closed this as completed Jul 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants