Consul allows stale reads during start-up #2644

j-pnd · 2017-01-10T17:00:38Z

`consul version` for both Client and Server

Client: (HTTP API)
Server: Tried with v0.6.0 and v0.7.0

`consul info` for both Client and Server

Server:

agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 1
build:
	prerelease = 
	revision = 46499d6e
	version = 0.6.0
consul:
	bootstrap = true
	known_datacenters = 1
	leader = true
	server = true
raft:
	applied_index = 61
	commit_index = 61
	fsm_pending = 0
	last_contact = never
	last_log_index = 61
	last_log_term = 9
	last_snapshot_index = 0
	last_snapshot_term = 0
	num_peers = 0
	state = Leader
	term = 9
runtime:
	arch = amd64
	cpu_count = 4
	goroutines = 61
	max_procs = 4
	os = linux
	version = go1.5.1
serf_lan:
	encrypted = false
	event_queue = 1
	event_time = 10
	failed = 0
	intent_queue = 0
	left = 0
	member_time = 1
	members = 1
	query_queue = 0
	query_time = 1
serf_wan:
	encrypted = false
	event_queue = 0
	event_time = 1
	failed = 0
	intent_queue = 0
	left = 0
	member_time = 1
	members = 1
	query_queue = 0
	query_time = 1

Operating system and Environment details

Ubuntu 14.04. Client and server on same host

Description of the Issue (and unexpected/desired result)

If consul is queried during start-up, it can return stale versions of data.

Reproduction steps

Run the following bash script with consul on your path. (Warning, this script will issue pkill consul).
The test writes a value to a test key, restarts consul, then reads the value out and checks if it matches. It then increments the value and continues in a loop.

The issue is present with or without the consistent read flag, and is present with clean shutdown or with kill -9. After the test finds the stale read it will exit. If you query the test key from consul after you will see that it eventually has the correct value. It seems that reads are being serviced during log replay.

sample run:
jpound:~/src/consul-bug> ./run.sh
./run.sh: line 21: 9907 Killed $consul_cmd > "$consul_log" 2>&1
./run.sh: line 21: 10046 Killed $consul_cmd > "$consul_log" 2>&1
./run.sh: line 21: 10248 Killed $consul_cmd > "$consul_log" 2>&1
./run.sh: line 21: 10442 Killed $consul_cmd > "$consul_log" 2>&1
./run.sh: line 29: 10620 Killed $consul_cmd > "$consul_log" 2>&1
Incorrect value read, expected 4 found 1

Script:

#!/bin/bash

consul_dir="consul.foo"
consul_cmd="consul agent -data-dir consul.foo -client 127.0.0.1 -bind 127.0.0.1 -server -bootstrap-expect 1"
consul_log="consul.out"

# ensure consul isn't running
pkill consul

# clean persistence and start consul
rm -rf "$consul_dir"
$consul_cmd  > "$consul_log"  2>&1 &
pid=$!
sleep 2

# Run test:
# Write a value to a test key, then read it out and check if it matches. Restart
# consul in between the write and read.
i="0"
while true; do
    # Write the value of 'i' to the test key
    ret="$(curl -s -X PUT -d "$i" http://127.0.0.1:8500/v1/kv/test)"
    if [ "$ret" != "true" ]; then
        echo "Failed to write key, try increasing the sleep time after the initial consul start ($ret)"
        exit 1
    fi

    # Restart consul
    kill -9 $pid
    wait
    $consul_cmd  > "$consul_log"  2>&1 &
    pid=$!

    # Read the value of 'i' from the test key, wait for consul to return a valid response
    val=""
    while [ "$val" == "No cluster leader" ] || [ "$val" == "" ]; do
        val="$(curl -s -X GET 'http://127.0.0.1:8500/v1/kv/test?raw&consistent')"
    done

    # ensure read value matches what we wrote
    if (( val != i )); then
        echo "Incorrect value read, expected $i found $val"
        exit -1
    fi

    # increment 'i'
    i=$[$i+1]
    if (( i % 10 == 0 )); then
        echo "Iteration $i"
    fi
done

Log Fragments

Log for last iteration (which produced the stale read)

==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
         Node name: 'jpound-vb'
        Datacenter: 'dc1'
            Server: true (bootstrap: true)
       Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
      Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
             Atlas: <disabled>

==> Log data will now stream in as it occurs:

    2017/01/10 08:59:34 [INFO] serf: EventMemberJoin: jpound-vb 127.0.0.1
    2017/01/10 08:59:34 [INFO] raft: Node at 127.0.0.1:8300 [Follower] entering Follower state
    2017/01/10 08:59:34 [WARN] serf: Failed to re-join any previously known node
    2017/01/10 08:59:34 [INFO] serf: EventMemberJoin: jpound-vb.dc1 127.0.0.1
    2017/01/10 08:59:34 [WARN] serf: Failed to re-join any previously known node
    2017/01/10 08:59:34 [INFO] consul: adding WAN server jpound-vb.dc1 (Addr: 127.0.0.1:8300) (DC: dc1)
    2017/01/10 08:59:34 [ERR] agent: failed to sync remote state: No cluster leader
    2017/01/10 08:59:34 [INFO] consul: adding LAN server jpound-vb (Addr: 127.0.0.1:8300) (DC: dc1)
    2017/01/10 08:59:34 [ERR] http: Request GET /v1/kv/test?raw&consistent, error: No cluster leader

   (... above message repeated as client loops ...)

   2017/01/10 08:59:35 [WARN] raft: Heartbeat timeout reached, starting election
    2017/01/10 08:59:35 [INFO] raft: Node at 127.0.0.1:8300 [Candidate] entering Candidate state
    2017/01/10 08:59:35 [INFO] raft: Election won. Tally: 1
    2017/01/10 08:59:35 [INFO] raft: Node at 127.0.0.1:8300 [Leader] entering Leader state
    2017/01/10 08:59:35 [INFO] consul: cluster leadership acquired
    2017/01/10 08:59:35 [INFO] consul: New leader elected: jpound-vb
    2017/01/10 08:59:35 [INFO] raft: Disabling EnableSingleNode (bootstrap)

The text was updated successfully, but these errors were encountered:

slackpad · 2017-01-10T20:45:20Z

Hi @j-pnd thanks for opening an issue. Looks like we need to make sure we are the leader and that the initial barrier has been cleared before we serve any consistent results.

There's likely a race (related to #2644) where the catalog update might be in but the leader tracking doesn't report a leader, so this blocks forever and then times out. As a workaround we can lower the query wait time to always allow for a few retries.

preetapan · 2017-06-09T15:48:41Z

@j-pnd could you run your script again to see if you are able to reproduce the stale read on 0.8.3? I've tried it with up to 30 iterations and it hasn't failed with a stale read yet.

j-pnd · 2017-06-09T17:32:08Z

Hi @preetapan, Unfortunately the bug still seems to be present. Though it takes much longer for it to repro. For me I saw it after 74 iterations.

Iteration 70
./run.sh: line 30: 5324 Killed $consul_cmd > "$consul_log" 2>&1
./run.sh: line 30: 5346 Killed $consul_cmd > "$consul_log" 2>&1
./run.sh: line 30: 5381 Killed $consul_cmd > "$consul_log" 2>&1
./run.sh: line 30: 5400 Killed $consul_cmd > "$consul_log" 2>&1
Incorrect value read, expected 73 found 55

preetapan · 2017-06-09T18:14:58Z

@j-pnd thanks for the confirmation. I saw it fail after 105 iterations, we will look into it.

Fix stale reads on server startup. Consistent reads will now wait for up to config.RPCHoldTimeout for the server to get past its raft log, before returning an error. Servers that are starting up will eventually catch up. This fixes issue #2644

slackpad · 2017-06-21T00:58:07Z

Fixed in #3154.

adias408 · 2020-03-22T23:19:14Z

@slackpad @preetapan this issue was fixed for "consistent" reads at https://github.com/hashicorp/consul/pull/3154/files#diff-cefe66c3774a83b3ec294243f1550944R441, which are not generally recommended (https://www.consul.io/api/features/consistency.html#consistent). It seems like it would be simple to also wait for whether the server is ready for consistent reads for the "default" mode without taking the cost of checking the peers on each such request via something like

        if queryOpts.RequireConsistent {
                if err := s.consistentRead(); err != nil {
                        return err
                }
        } else if !queryOpts.AllowStale {
                if !s.isReadyForConsistentReads() {
                        return structs.ErrNotReadyForConsistentReads
                }
        }

(adding the else if !queryOpts.AllowStale logic).
Is there any downside to doing something like this? We ran into this kind of situation when doing queries as a system is restarting after stopping all nodes. We wouldn't want to do consistent queries in general because of the cost but would want to avoid getting stale values due to the raft.db log not yet having been fully processed on startup.

slackpad added the type/bug Feature does not function as expected label Jan 10, 2017

slackpad added this to the 0.7.4 milestone Jan 10, 2017

slackpad mentioned this issue Feb 6, 2017

Drop testutil wait to 2 seconds. #2715

Merged

slackpad modified the milestones: 0.8.0, Triaged Feb 8, 2017

slackpad modified the milestones: 0.8.0, 0.8.1 Mar 30, 2017

slackpad modified the milestones: 0.8.1, 0.8.2 Apr 12, 2017

slackpad removed this from the 0.8.2 milestone Apr 25, 2017

slackpad added the theme/internal-cleanup Used to identify tech debt, testing improvements, code refactoring, and non-impactful optimization label May 25, 2017

slackpad closed this as completed Jun 21, 2017

slackpad mentioned this issue Nov 6, 2017

Existing key not found shortly after the leader is restarted #3649

Closed

vaLski mentioned this issue Mar 20, 2018

consul-replicate sync empty data during raft leader election in the master DC hashicorp/consul-replicate#82

Closed

manjotkhangura mentioned this issue Apr 2, 2018

Consul allows stale reads during start-up #4008

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consul allows stale reads during start-up #2644

Consul allows stale reads during start-up #2644

j-pnd commented Jan 10, 2017

slackpad commented Jan 10, 2017

preetapan commented Jun 9, 2017

j-pnd commented Jun 9, 2017

preetapan commented Jun 9, 2017

slackpad commented Jun 21, 2017

adias408 commented Mar 22, 2020 •

edited

Loading

Consul allows stale reads during start-up #2644

Consul allows stale reads during start-up #2644

Comments

j-pnd commented Jan 10, 2017

consul version for both Client and Server

consul info for both Client and Server

Operating system and Environment details

Description of the Issue (and unexpected/desired result)

Reproduction steps

Log Fragments

slackpad commented Jan 10, 2017

preetapan commented Jun 9, 2017

j-pnd commented Jun 9, 2017

preetapan commented Jun 9, 2017

slackpad commented Jun 21, 2017

adias408 commented Mar 22, 2020 • edited Loading

`consul version` for both Client and Server

`consul info` for both Client and Server

adias408 commented Mar 22, 2020 •

edited

Loading