Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/vet: detect non-64-bit-aligned arguments to sync/atomic funcs #11891

Closed
akavel opened this issue Jul 27, 2015 · 19 comments
Closed

cmd/vet: detect non-64-bit-aligned arguments to sync/atomic funcs #11891

akavel opened this issue Jul 27, 2015 · 19 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@akavel
Copy link
Contributor

akavel commented Jul 27, 2015

Discussion at https://groups.google.com/d/topic/golang-nuts/CN_rTywuD2U/discussion.

Per sync/atomic "BUG" note:

On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a global variable or in an allocated struct or slice can be relied upon to be 64-bit aligned.

It would be highly desirable if go/vet could detect when this rule is violated and warn about such cases. This note is easy to overlook, and non-trivial to satisfy. It's also not obvious how it applies in more complicated scenarios - e.g.:

  • "If I have an array of structs, are the first words of each and every structs in the array automatically 64-bit aligned or just the first one?"
    • davecheney: "Potentially just the first one, it depends if the size of each element, plus padding, is a multiple of 8."
  • "Do I need to know the size of a word for each build target in order to make it work correctly?"
    • davecheney: "Potentially, you may have to special case // +build arm, 386"
  • "The piece of doc I quoted is so confusing. Is a struct in an allocated array considered allocated struct? If so, should all the structs in the array have its first word 64-bit aligned?"
    • iant: "The first word in a struct or slice that is allocated via new or via a composite literal will be 64-bit aligned. A struct inside a slice or array is not allocated by itself, and no particular alignment is guaranteed beyond that of unsafe.AlignOf. "

etc. (see the discussion linked above)

@arl
Copy link
Contributor

arl commented Oct 27, 2018

At my company we have been bitten multiple times by non-64bits-aligned pointers used with sync/atomic functions on arm. Thanks to the fact that it panics at runtime, it only ever happened during tests/development. Developer were not aware of the BUG section at the bottom of the sync/atomic godoc page.

Each time it happened to us, the atomically accessed pointer was always a struct field and the struct was never embedded in another one, nor was it in a slice. I think this use case is the most common one, it is relatively rare to see a slice of structs having fields that are atomically accessed, but it might happen.

I'd like to work on a CL adding a vet check to cover for that case, that is also probably the simplest to detect, that is: detecting non-64bits alignment of a 64bit variable, when the variable is a field of a non-embedded struct.

This check would not cover all the cases, but I believe it would cover the most common one. Also it would be better than having no check at all, and could be a basis on which to improve upon.

@zigo101
Copy link

zigo101 commented Oct 27, 2018

related: #599

@arl could you post a code snippet to show how the struct variable (edit: is declared)?

There are some ways to guarantee 64-bit alignments for struct fileds.

@arl
Copy link
Contributor

arl commented Oct 27, 2018

@go101 for example you can define your struct such as the atomically accessed field is the first struct field. In that case it's guaranteed it will be 64bits aligned, even on 32bits architectures.

package main

import (
	"sync/atomic"
)

type A struct {
	b int64 // b is accessed via a sync/atomic function
	c int32
	d int64 // d should NOT be accessed via a sync/atomic function as it's not 64bits aligned.
}

func main() {
	a := A{}
	// Atomically increment a.b. This is valid, b is the first field of the struct, so it's
	// guaranteed to be 64-bits aligned.
	atomic.AddInt64(&a.b, 1)
}

edit: updated the example to a complete one

@mvdan mvdan added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 27, 2018
@mvdan
Copy link
Member

mvdan commented Oct 27, 2018

I briefly spoke with @arl this past week, and adding a simple version of this check sounds fine to me. It could be expanded to add support for the more complex cases later, such as slices of structs.

As far as I can tell, the check would pass the correctness and precision criteria; the issue causes runtime panics, and it should be possible to have zero false positives. Frequency is a bit more subjective, as the check would simply save developer's time. Any piece of software that is properly tested shouldn't have this bug shipped out, as it should be caught by a panic at run time.

Labelling as NeedsDecision, but this gets a thumbs up from me. /cc @alandonovan @robpike @josharian

@arl
Copy link
Contributor

arl commented Oct 27, 2018

@go101 just saw your edit.
In our case moving the variable at the top of the struct solved the problem. We didn't have to rely on any other technique like padding because the variable, and the struct, were one of a kind (no slices involved).

@arl
Copy link
Contributor

arl commented Oct 30, 2018

The same code using sync/atomic can either be totally valid for, say, amd64, while invalid for arm32 (triggering a runtime panic). That means this vet check should somewhat be aware of the target architecture, otherwise the majority of users would see false positives.

The only way I know to avoid false positives:

  • only enable that check if the problematic struct field declaration is covered by an arm32 build tag. That would make the check mostly useless for the majority of cases where users crosscompile the same code for various platforms (one of which being arm32)
  • enabling the check if GOHOST is arm32.
  • making this an experimental vet check, and thus an opt-in check.

Thoughts ? CC @ianlancetaylor @alandonovan

@ianlancetaylor
Copy link
Member

I would be in favor of an opt-in check, enabled with an option or with -all.

@josharian
Copy link
Contributor

We already have an arch-aware vet checks, asmdecl. It works by checking the code in play assuming GOOS/GOARCH. So if you only ever compile for amd64, you’ll never see the vet check fire, but arguably, you also don’t have a problem. As long as you run your tests at least once for any target architecture, vet will check that target architecture, and catch the problem.

I haven’t thought much about this particular check, I just wanted to point out that we have a good arch-dependent vet mechanism.

@arl
Copy link
Contributor

arl commented Oct 31, 2018

@josharian @ianlancetaylor It looks like using GOOS/GOARCH/GOARM could also be a good solution, with some false positives only in a very specific case (i.e the user is running vet on arm32 but never had the intention to run their code from arm32).

Compared to hiding the vet check behind an experimental flag, this solution has the advantage of covering users that follow vet best practices (running it once per arch)

@rasky
Copy link
Member

rasky commented Oct 31, 2018

Notice that, if they’re compiling on arm32, I think it’s safe to assume that they would try to run at least once on arm32, if only for development purposes.

@rasky rasky closed this as completed Oct 31, 2018
@rasky rasky reopened this Oct 31, 2018
@mvdan
Copy link
Member

mvdan commented Oct 31, 2018

I hadn't realised we did have arch-dependent checks in vet; Josh's suggestion sounds great. I agree that if someone runs go vet or go test on a 32-bit architecture, this check should run. We shouldn't have false positives there.

We might have false negatives if a package is compiled for 32-bit, but only tested on 64-bit. I presume said package will have bigger issues than atomic parameter alignment, though :)

@arl
Copy link
Contributor

arl commented Jan 12, 2019

I have been working on a new vet check for that issue, the code was initially based off the standard Go tree and has now been rebased off of x/tools/go/analysis because as you know that's where the vet code has been moved.

The new check comes under the form of a new vet flag named atomicalign, it only runs on 32 bits archs and is a no-op everywhere else.

Basically the code does the following:
If vet is ran on one of the target platform (arm32 and x86) it will start to look for function calls of sync/atomic funcs acting on 64bits variables. When argument to atomic functions are struct fields, it then compute its alignment in order to issue vet errors in case the field is not aligned on multiple of 64 bits.

The check currently does not recurse into embedded structs (embedded or not) to know their size, so when that case is encountered, the vet check issues no warning in order to avoid false positives. This could potentially be a useful addition, but I'd like somebody to have a look at my code before adding that.

I added some tests, right now I'm not sure about how to make those tests platform-dependent, that is exclude them on non-targeted platforms?

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/158277 mentions this issue: go/analysis/passes/atomicalign: add atomicalign ckecker

@mvdan
Copy link
Member

mvdan commented Jan 17, 2019

It's been four months since @arl suggested implementing this. All replies have been in favor so far, so I'll replace NeedsDecision with NeedsFix. If Alan has any reservations, he can always -2 the CL itself :)

@mvdan mvdan added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Jan 17, 2019
@mvdan mvdan modified the milestones: Unplanned, Go1.13 Jan 17, 2019
@tmm1
Copy link
Contributor

tmm1 commented Apr 26, 2019

Looks like this was reverted in golang/tools@718ddee and a new CL with tests is required.

@arl
Copy link
Contributor

arl commented Apr 26, 2019

@tmm1

The first CL is still merged. The second one, which is an improvement over the first, has been reverted due to falling tests on some platforms.

@tmm1
Copy link
Contributor

tmm1 commented Jul 24, 2019

@arl Could you elaborate on which cases are currently handled and which ones aren't? Should users simply be able to run go vet in a package to be alerted of atomic alignment problems?

Is the alignment check specific to the host architecture, and if so is there a way to check for alignment issues on a different architecture?

@tmm1
Copy link
Contributor

tmm1 commented Jul 24, 2019

I have been working on a new vet check for that issue, the code was initially based off the standard Go tree and has now been rebased off of x/tools/go/analysis because as you know that's where the vet code has been moved.

How does one build and use vet from x/tools/go/analysis?

The vet included with my go1.12.6 installation doesn't seem to have the atomicalign check:

$ go tool vet help 2>&1 | grep atomicalign
$

@arl
Copy link
Contributor

arl commented Jul 26, 2019

@tmm1
Currently only very simple cases are handled, but with no false positives. The objective was to first handle those, then build up on it in successive CLs.

Only cases handled are atomic variables used inside a global struct, however it fails to recognize when atomic functions are called from inside a struct method (when potentially affected struct is a pointer receiver).

I pushed a CL to handle exactly this some time ago, see https://go-review.googlesource.com/c/tools/+/163997 , but tests fail on mips and x86, they pass on arm though, it didn't receive much love though :).

Strange thing is there are cases where alignment is 64-bit on mips and x86, while, at least to my understanding of the bug note, they shouldn't.
That would mean not all platforms are affected in the same way by the bug note. That's what I'm trying to discover to go further but I'd like a feedback from somebody working on the compiler that may know why it is this way.

To try it yourself, you should run atomicalign on one of the affected architectures, as this check is a no-op on non-affected ones.

@golang golang locked and limited conversation to collaborators Jul 25, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

9 participants