Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix event concurrency flaw #2519

Merged
merged 10 commits into from
Oct 8, 2018
Merged

Conversation

goolAdapter
Copy link
Contributor

@goolAdapter goolAdapter commented Sep 29, 2018

ref #2518

  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Updated CHANGELOG_PENDING.md

@codecov-io
Copy link

codecov-io commented Sep 29, 2018

Codecov Report

Merging #2519 into develop will decrease coverage by 0.07%.
The diff coverage is 78.57%.

@@             Coverage Diff             @@
##           develop    #2519      +/-   ##
===========================================
- Coverage    61.31%   61.24%   -0.08%     
===========================================
  Files          202      202              
  Lines        16710    16718       +8     
===========================================
- Hits         10246    10239       -7     
- Misses        5599     5611      +12     
- Partials       865      868       +3
Impacted Files Coverage Δ
libs/events/events.go 96.07% <78.57%> (-0.74%) ⬇️
libs/clist/clist.go 66.49% <0%> (-1.53%) ⬇️
consensus/reactor.go 71.18% <0%> (-1.43%) ⬇️

@@ -162,6 +166,52 @@ func TestAddDifferentListenerForDifferentEvents(t *testing.T) {
}
}

var stopInputEvent = false
Copy link
Contributor

@melekes melekes Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we not introduce global variables please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for dalayed reply,I'm taking a holiday, so the next relay may also delay.
Yes,this global variables will eliminated.

evl.mtx.Lock()
defer evl.mtx.Unlock()

if evl.removed {
return
return fmt.Errorf("listener was removed.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var ErrListenerRemoved = errors.New("listener was removed")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even a struct for it which holds info about the listener id?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is ok, I'll replace other error message with named error too.

@@ -168,10 +172,15 @@ func (cell *eventCell) RemoveListener(listenerID string) int {

func (cell *eventCell) FireEvent(data EventData) {
cell.mtx.RLock()
var listenerCopy []EventCallback
Copy link
Contributor

@melekes melekes Oct 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we run each callback in a goroutine?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goroutine per callback could work if we have more control for them than just fire-and-forget. We should evaluate lifecycles of all goroutines carefully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this previously can exist a deadlock, when we call add/remove event in callback. but it can work under carefully control,copy is a way to discard the burden.

I prefer the copy, it free the event user.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this previously can exist a deadlock

does the test below uncovers that too? as far as I can see we have a mutex here, so we don't need to make a copy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll create a new test to show the deadlock.

Copy link
Contributor

@xla xla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, just some minor style and language nits.

@@ -168,10 +172,15 @@ func (cell *eventCell) RemoveListener(listenerID string) int {

func (cell *eventCell) FireEvent(data EventData) {
cell.mtx.RLock()
var listenerCopy []EventCallback
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goroutine per callback could work if we have more control for them than just fire-and-forget. We should evaluate lifecycles of all goroutines carefully.

@@ -58,7 +59,7 @@ func (evsw *eventSwitch) OnStart() error {

func (evsw *eventSwitch) OnStop() {}

func (evsw *eventSwitch) AddListenerForEvent(listenerID, event string, cb EventCallback) {
func (evsw *eventSwitch) AddListenerForEvent(listenerID, event string, cb EventCallback) (err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid named returns when possible, any strong reason to have it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no strong reason, I'll change to the repository's convention.

evl.mtx.Lock()
defer evl.mtx.Unlock()

if evl.removed {
return
return fmt.Errorf("listener was removed.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even a struct for it which holds info about the listener id?

}

func (e ErrListenerWasRemoved) Error() string {
return fmt.Sprintf("listener %s was removed.", e.listener)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no dot (.) at the end please

Use https://github.com/golang/lint

@@ -208,8 +216,9 @@ func (evl *eventListener) AddEvent(event string) error {
defer evl.mtx.Unlock()

if evl.removed {
return fmt.Errorf("listener was removed.")
return &ErrListenerWasRemoved{listener: evl.id}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for error to be a pointer (&)

for i := 0; i < roundCount; i++ {
evsw.RemoveListener("listener")
}
done <- 1
}

func inputAddListenerForEvent(t *testing.T, evsw EventSwitch, done chan uint64) {
func inputAddListenerForEvent(t *testing.T, evsw EventSwitch, done chan uint64, roundCount int, stopInputEvent *bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a channel instead of *bool for stopInputEvent

func inputAddListenerForEvent(t *testing.T, evsw EventSwitch, done chan uint64, roundCount int, stopInputEvent chan<- bool) {
....
stopInputEvent <- true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chan is ok, but bool flag looks more simple,it's only used for reduce fail log.

@@ -73,9 +81,13 @@ func (evsw *eventSwitch) AddListenerForEvent(listenerID, event string, cb EventC
}
evsw.mtx.Unlock()

var err error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err := below

@@ -162,6 +166,52 @@ func TestAddDifferentListenerForDifferentEvents(t *testing.T) {
}
}

func inputRemoveListener(t *testing.T, evsw EventSwitch, done chan uint64, roundCount int) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why input prefix? also done chan struct{} would be more appropriate, don't you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes,done chan uint64 is inspired by other test case,here done chan struct{} is enough


func inputAddListenerForEvent(t *testing.T, evsw EventSwitch, done chan uint64, roundCount int, stopInputEvent *bool) {
for i := 0; i < roundCount; i++ {
index := i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need in this extra variable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it need, the next event callback use index for log.

}

func TestAddAndRemoveListenerConcurrency(t *testing.T) {
var stopInputEvent = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be united

var (
 ...
)


evsw.RemoveListener("listener") // make sure remove last

for i := 0; i < roundCount && !stopInputEvent; i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if the test fails? panic? or this test uncovers the DATA race?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this test is used to uncovers the DATA race

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm back from a long vacation,sorry for the delayed response,I will fix the review issue.

@@ -42,4 +42,10 @@ IMPROVEMENTS:
BUG FIXES:
- [autofile] \#2428 Group.RotateFile need call Flush() before rename (@goolAdapter)
- [node] \#2434 Make node respond to signal interrupts while sleeping for genesis time
- [consensus] [\#1690](https://github.com/tendermint/tendermint/issues/1690) wait for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where are these coming from? improper merge?

for i := 0; i < roundCount; i++ {
evsw.RemoveListener("listener")
}
done1 <- struct{}{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close(done1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here chan can recycled automatically by the garbage collector, but I'll close it explicit

stopInputEvent = true
})
}
done2 <- struct{}{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close(done2)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here chan can recycled automatically by the garbage collector, but I'll close it explicit

Copy link
Contributor

@xla xla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 :octocat: :shipit: 🍡

Thanks for the follow-up!

@xla xla merged commit 4b2bf02 into tendermint:develop Oct 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants