Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SEGFAULT on Repository.LookupBlob #352

Closed
AaronO opened this issue Oct 31, 2016 · 2 comments
Closed

SEGFAULT on Repository.LookupBlob #352

AaronO opened this issue Oct 31, 2016 · 2 comments

Comments

@AaronO
Copy link
Contributor

AaronO commented Oct 31, 2016

Stacktrace

runtime stack:
runtime.throw(0xbd91e0, 0x2a)
        /usr/local/go/src/runtime/panic.go:547 +0x90
runtime.sigpanic()
        /usr/local/go/src/runtime/sigpanic_unix.go:12 +0x5a

goroutine 2654459 [syscall, locked to thread]:
runtime.cgocall(0x8881fd, 0xc823f46eb8, 0xc800000000)
        /usr/local/go/src/runtime/cgocall.go:123 +0x11b fp=0xc823f46e68 sp=0xc823f46e38
github.com/libgit2/git2go._Cfunc_git_object_lookup(0xc8202b2048, 0x7fe296dffa20, 0xc8237b0760, 0x3, 0x0)
        ??:0 +0x41 fp=0xc823f46eb8 sp=0xc823f46e68
github.com/libgit2/git2go.(*Repository).lookupType(0xc822bc2980, 0xc8237b0760, 0x3, 0x0, 0x0, 0x0)
        /go/src/github.com/libgit2/git2go/repository.go:172 +0x288 fp=0xc823f46f60 sp=0xc823f46eb8
github.com/libgit2/git2go.(*Repository).LookupBlob(0xc822bc2980, 0xc8237b0760, 0xc8237b0760, 0x0, 0x0)
        /go/src/github.com/libgit2/git2go/repository.go:203 +0x44 fp=0xc823f46fa8 sp=0xc823f46f60

Calling code

func GetBlob(repo *git.Repository, sha string) (*BlobResponse, error) {
	// Ensure SHA is OID
	oid, err := git.NewOid(sha)
	if err != nil {
		return nil, err
	}

	// Get blob
	blob, err := repo.LookupBlob(oid)
	if err != nil {
		return nil, err
	}

        // More code

        return nil, nil
}

Environment

  • go : 1.6
  • git2go : 37d3c2d9ad4c4e970cac02faec8ad184412c34e6
  • libgit2 : 97e57e8770132d61ff2c36bee2de2c7ac5c9d609
@carlosmn
Copy link
Member

carlosmn commented Jul 7, 2017

This might be another instance of #334, #373 and #356 where Go's GC is a bit too eager for the assumptions the code makes.

@carlosmn
Copy link
Member

carlosmn commented Jul 9, 2017

This should be fixed via #393

@carlosmn carlosmn closed this as completed Jul 9, 2017
navytux added a commit to navytux/git-backup that referenced this issue Feb 14, 2025
Alain reports that lab.nexedi.com backup restoration sometimes fails with error like

    ...
    # file gitlab/misc -> .../srv/backup/backup-gitlab.git/gitlab-backup.Pj0fpp/gitlab_backup/db/database.pgdump/7630.dat/7630.dat.ry main.cmd_restore: main.blob_to_file: write .../srv/backup/backup-gitlab.git/gitlab-backup.Pj0fpp/gitlab_backup/db/database.pgdump/7630.dat/7630.dat.ry: bad address

which means that write system call invoked by writefile at tail of blob_to_file returned EFAULT.

The blob_to_file function is organized approximately as this:

    blob_to_file(blob_sha1, path) {
        blob = ReadObject(blob_sha1, git.ObjectBlob)
        blob_content = blob.Data()
        writefile(path, blob_content)
    }

and getting EFAULT inside writefile means that blob_content points to
some unmapped memory.

How that could be?

The answer is that blob.Data(), as implemented by git2go, returns []byte
that points to Cgo memory owned by blob object, and the blob object has
finalizer that frees that memory, which sometimes leads to libc
allocator to also return freed region completely back to OS by doing
munmap:

    https://github.com/libgit2/git2go/blob/v31.7.9-0-gcbca5b8/odb.go#L345-L359
    https://github.com/libgit2/git2go/blob/v31.7.9-0-gcbca5b8/odb.go#L162-L177
    https://github.com/libgit2/git2go/blob/v31.7.9-0-gcbca5b8/odb.go#L322-L325

and if that happens we see the EFAULT, but if no munmap happens we can
be saving corrupt data to restored file.

The OdbObject.Data even has comment about that - that one needs to keep
the object alive until retrieved data is used:

    // Data returns a slice pointing to the unmanaged object memory. You must make
    // sure the object is referenced for at least as long as the slice is used.
    func (object *OdbObject) Data() (data []byte) {

but this comment was added in 2017 in libgit2/git2go@55a109614151
as part of libgit2/git2go#393 while doing
"KeepAlive all the things" to fix segmentation faults and other
misbehaviours.

I missed all that because we switched blob_to_file from `git cat-file`
to git2go in 2016 in fbd72c0 (Switch file_to_blob() and blob_to_file()
to work without spawning Git subprocesses) and we never actively worked
on that part of code anymore. For the reference the git2go introduction
to git-backup happened on that same day in 2016 in 624393d (Hook in
git2go  (cgo bindings to libgit2)).

The problem of memory corruption inside blob_to_file can be reliably
reproduced via injecting the following patch

    blob_to_file(blob_sha1, path) {
        blob = ReadObject(blob_sha1, git.ObjectBlob)
        blob_content = blob.Data()
   +    runtime.GC()
        writefile(path, blob_content)
    }

which leads to e.g. the following test failure at every test run:

    === RUN   TestPullRestore
    ...
    # file b1	-> /tmp/t-git-backup2575257088/1/symlink.file
        git-backup_test.go:109: git-backup_test.go:297: lab.nexedi.com/kirr/git-backup.cmd_restore: lab.nexedi.com/kirr/git-backup.blob_to_file: symlink ^D<80><8c>þ�^@^@^2h space + α /tmp/t-git-backup2575257088/1/symlink.file: invalid argument

and the memory corruption can be fixed reliably by adding proper
runtime.KeepAlive so that the blob object assuredly stays alive during
writefile call:

    blob_to_file(blob_sha1, path) {
        blob = ReadObject(blob_sha1, git.ObjectBlob)
        blob_content = blob.Data()
        writefile(path, blob_content)
   +    runtime.KeepAlive(blob)
    }

However going through git2go code it could be seen that it is full of
Go <-> C interactions and given that there is a track records of catching
many crashes due to not getting lifetime management right (see
e.g. libgit2/git2go#352, libgit2/git2go#334,
libgit2/git2go#553, libgit2/git2go#513,
libgit2/git2go#373, libgit2/git2go#387
and once again libgit2/git2go#393) there is no
guarantee that no any other similar issue is there anywhere else
besides OdbObject.Data().

With that we either need to put a lot of runtime.KeepAlive after every
interaction with git2go, and put it properly, switch back to `git
cat-file` and similar things reverting fbd72c0 and friends, or do
something else.

As fbd72c0 explains switching back to `git cat-file` will slowdown
files restoration by an order of magnitude. Putting runtime.KeepAlive is
also not practical because it is hard to see all the places where we
interact with git2go, even indirectly, and so it is easy to make mistakes.

-> Thus let's keep the code that interacts with git2go well localized
   (done by previous patch), and let's make a copy over every string or
   []byte object we receive from git2go with adding careful
   runtime.KeepAlive post after that.

This fixes the problem of blob_to_file data corruption and it should fix
all other potential memory corruption problems we might ever have with
git2go due to potentially improper usage on git-backup side.

The copy cost is smaller compared to the cost of either spawning e.g. `git
cat-file` for every object, or interacting with `git cat-file --batch`
server spawned once, but still spending context switches on every request
and still making the copy on socket or pipe transfer. But most of all the
copy cost is negligible to the cost of catching hard to reproduce crashes or
data corruptions in the production environment.

For the reference the time it takes to restore "files" part of
lab.nexedi.com backup was ~ 1m51s before this patch, and became ~ 1m55s
after this patch indicating ~ 3.5% slowdown for that part. Which could be
said as noticeable but not big, and since most of the time is spent
during git pack restoration, taking much more time than files, those
several seconds of slowdown become completely negligible.

/reported-by @alain.takoudjou, @tomo
/reported-at https://www.erp5.com/group_section/forum/Gitlab-backup-zDVMZqaMAK/view?list_start=15&reset=1#2074747282
/cc @jerome, @rafael
/reviewed-on https://lab.nexedi.com/kirr/git-backup/-/merge_requests/12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants