raft learner issue #7

WIZARD-CXY · 2019-03-21T07:48:54Z

code: learner branch
os: Mac

Steps to reproduce:

1. Start one main node with default settings, for example:

etcd

2. Add a learner node:

bin/etcdctl member add infra2 --peer-urls=127.0.0.1:23800 --learner

3. Start the learner node, for example:

export ETCD_NAME="infra2"
export ETCD_INITIAL_CLUSTER="default=http://localhost:2380,infra2=http://127.0.0.1:23800"
export ETCD_INITIAL_ADVERTISE_PEER_URLS="http://127.0.0.1:23800"
export ETCD_INITIAL_CLUSTER_STATE="existing"
bin/etcd --listen-client-urls 'http://localhost:23790' --advertise-client-urls 'http://localhost:23790' --listen-peer-urls 'http://localhost:23800'

4. After a while (once the learner has received the snapshot), send a read request to the learner node. The result is shown below:


➜  etcd git:(promote) ✗ bin/etcdctl --endpoints=127.0.0.1:23790 get "" --prefix
{"level":"warn","ts":"2019-03-21T15:33:55.183+0800","caller":"clientv3/retry_interceptor.go:60","msg":"retrying of unary invoker failed","target":"endpoint://client-480d3208-67b0-4bd5-a309-f649baf90e89/127.0.0.1:23790","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Error: context deadline exceeded

This does not match the design doc, which says the learner node can serve read requests.

Main server log:


2019-03-21 15:32:15.157285 W | etcdserver: server is likely overloaded
2019-03-21 15:32:23.099274 W | etcdserver: ignored out-of-date read index response; local node read indexes queueing up and waiting to be in sync with leader (request ID want 7587837078940101644, got 15930192439078804740)
2019-03-21 15:33:51.185246 W | etcdserver: timed out sending read state
2019-03-21 15:33:51.185342 W | etcdserver: failed to send out heartbeat on time (exceeded the 100ms timeout for 842.440628ms)

Learner node log:

2019-03-21 15:33:57.188397 W | etcdserver: timed out waiting for read index response (local node might have slow network)
2019-03-21 15:34:02.526932 W | etcdserver: read-only range request "key:\"\\000\" range_end:\"\\000\" " with result "error:context deadline exceeded" took too long (4.999960974s) to execute


WIZARD-CXY · 2019-03-21T07:51:31Z

@jingyih

jingyih · 2019-03-21T08:05:39Z

We do plan to have learner members serve ONLY serializable reads and endpoint status, but that limitation is not implemented yet. So I would expect etcdctl --endpoints=127.0.0.1:23790 get to work. Will take a closer look tomorrow.

jingyih · 2019-03-21T18:29:16Z

I was able to reproduce this. From the learner node's log, it looks like the learner is not able to get a read index from the leader, which is part of serving a linearizable read. I will take a closer look.

2019-03-21 11:24:47.439000 W | etcdserver: read-only range request "key:\"foo\" " with result "error:context canceled" took too long (4.999190886s) to execute
2019-03-21 11:24:49.439917 W | etcdserver: timed out waiting for read index response (local node might have slow network)

jingyih · 2019-03-22T03:52:07Z

Looks like we found a bug. When the leader receives a readIndex request, it assumes the request is from the local node (the leader itself) if quorum == 1. This is fine if there is no learner. But when there is a learner, the assumption is wrong: a learner is not counted in the quorum, so a request from it can reach a leader whose quorum is 1, and the leader never sends a readIndex response back to the learner. I will send a fix to the etcd-io repo. This bug is not blocking our learner implementation.

etcd/raft/raft.go

Lines 1025 to 1051 in a1408c5

    
    case pb.MsgReadIndex:
    	if r.quorum() > 1 {
    		if r.raftLog.zeroTermOnErrCompacted(r.raftLog.term(r.raftLog.committed)) != r.Term {
    			// Reject read only request when this leader has not committed any log entry at its term.
    			return nil
    		}
    		// thinking: use an interally defined context instead of the user given context.
    		// We can express this in terms of the term and index instead of a user-supplied value.
    		// This would allow multiple reads to piggyback on the same message.
    		switch r.readOnly.option {
    		case ReadOnlySafe:
    			r.readOnly.addRequest(r.raftLog.committed, m)
    			r.bcastHeartbeatWithCtx(m.Entries[0].Data)
    		case ReadOnlyLeaseBased:
    			ri := r.raftLog.committed
    			if m.From == None || m.From == r.id { // from local member
    				r.readStates = append(r.readStates, ReadState{Index: r.raftLog.committed, RequestCtx: m.Entries[0].Data})
    			} else {
    				r.send(pb.Message{To: m.From, Type: pb.MsgReadIndexResp, Index: ri, Entries: m.Entries})
    			}
    		}
    	} else {
    		r.readStates = append(r.readStates, ReadState{Index: r.raftLog.committed, RequestCtx: m.Entries[0].Data})
    	}
    	return nil
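
To make the failure mode concrete, here is a minimal, self-contained sketch of the single-voter case. It is a toy model, not the etcd/raft code: the types `leader` and `readIndexReq`, the helpers `newLeader` and `respond`, and both handlers are hypothetical names introduced for illustration only. `handleBuggy` mirrors the `else` branch quoted above; `handleFixed` shows one possible way to address it, by checking the sender even when the quorum is 1. The actual fix may be structured differently.

```go
package main

import "fmt"

// none mirrors raft's "no sender" placeholder (assumed to be 0 in this toy model).
const none uint64 = 0

// readIndexReq is a hypothetical stand-in for a MsgReadIndex message.
type readIndexReq struct {
	from uint64 // sender ID; none means the request originated locally
	ctx  string // opaque request context
}

// leader is a toy model of the leader state relevant to ReadIndex handling.
type leader struct {
	id        uint64
	voters    int               // voting members only; learners are not counted
	committed uint64            // current commit index
	local     []string          // request contexts answered as local read states
	remote    map[uint64]uint64 // member ID -> read index sent back to that member
}

func newLeader(id uint64, voters int, committed uint64) *leader {
	return &leader{id: id, voters: voters, committed: committed, remote: map[uint64]uint64{}}
}

// quorum is a majority of the voting members.
func (l *leader) quorum() int { return l.voters/2 + 1 }

// respond answers locally for local requests and sends a response otherwise.
func (l *leader) respond(m readIndexReq) {
	if m.from == none || m.from == l.id {
		l.local = append(l.local, m.ctx)
		return
	}
	l.remote[m.from] = l.committed
}

// handleBuggy treats every request as local when quorum == 1,
// so a learner's request is never answered over the network.
func (l *leader) handleBuggy(m readIndexReq) {
	if l.quorum() > 1 {
		l.respond(m) // the heartbeat round-trip is not modelled here
		return
	}
	l.local = append(l.local, m.ctx)
}

// handleFixed always checks the sender, regardless of quorum size.
func (l *leader) handleFixed(m readIndexReq) {
	l.respond(m)
}

func main() {
	// One voter (ID 1) plus a learner (ID 2) that forwards a read index request.
	req := readIndexReq{from: 2, ctx: "read-1"}

	buggy := newLeader(1, 1, 10)
	buggy.handleBuggy(req)
	fmt.Printf("buggy: local=%v remote=%v\n", buggy.local, buggy.remote)

	fixed := newLeader(1, 1, 10)
	fixed.handleFixed(req)
	fmt.Printf("fixed: local=%v remote=%v\n", fixed.local, fixed.remote)
}
```

In the buggy handler the learner's request ends up in the leader's own read states and nothing is sent to member 2, which matches the "timed out waiting for read index response" warning in the learner log. In the adjusted handler the commit index is sent back to the learner.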

jingyih · 2019-03-29T01:21:47Z

Fixed by etcd-io#10590.

WIZARD-CXY · 2019-03-29T01:40:58Z

great


jingyih mentioned this issue Mar 27, 2019

Leader does not respond to ReadIndex message from learner etcd-io/etcd#10589

Closed

jingyih closed this as completed Mar 29, 2019

jingyih mentioned this issue Apr 7, 2019

Task list: support raft learner in etcd. etcd-io/etcd#10537

Closed

33 tasks

jingyih pushed a commit that referenced this issue Nov 27, 2019

Merge pull request #7 from jingyih/add_request_type_for_cluster_attr

9c1d4ba

etcdserver: add v3 request type for cluster attr
