COS418 Assignment 5: Key-Value Storage Service

Introduction

In this assignment you will build a fault-tolerant key-value storage service using your Raft library from the previous assignments. Your key-value service will be structured as a replicated state machine with several key-value servers that coordinate their activities through the Raft log. Your key/value service should continue to process client requests as long as a majority of the servers are alive and can communicate, in spite of other failures or network partitions.

Your system will consist of clients and key/value servers, where each key/value server also acts as a Raft peer. Clients send Put(), Append(), and Get() RPCs to key/value servers (called kvraft servers), who then place those calls into the Raft log and execute them in order. A client can send an RPC to any of the kvraft servers, but if that server is not currently a Raft leader, or if there's a failure, the client should retry by sending to a different server. If the operation is committed to the Raft log (and hence applied to the key/value state machine), its result is reported to the client. If the operation failed to commit (for example, if the leader was replaced), the server reports an error, and the client retries with a different server.

Software

We have supplied you with skeleton code and tests here. You will need to modify kvraft/client.go, kvraft/server.go, and perhaps kvraft/common.go. (Even if you don't modify common.go, you should submit it as-provided.) For this assignment we give you the option to either use your own implementation of Raft from HW4 or use our solution, which we recommend. Our solution will be provided to you as a binary [Go 1.8.3] [Go 1.9.1][Go 1.9.2] that you will import in your project.

To get up and running, execute the following commands, as in the previous assignments, and change into the src/kvraft directory:

  $ cd cos418 # directory containing the tarball
  $ tar -vzxf hw5.tgz
  $ rm hw5.tgz
  $ export GOPATH="$PWD"

To apply the binary, follow these instructions:

  $ tar -vzxf raft-binary*.tgz
  $ rm raft-binary*.tgz
  $ mv raft.go src/raft/raft.go
  $ cd src/kvraft
  $ go test # this should now print "Creating RAFT instance from binary"

Part I

The service supports three RPCs: Put(key, value), Append(key, arg), and Get(key). It maintains a simple database of key/value pairs. Put() replaces the value for a particular key in the database, Append(key, arg) appends arg to key's value, and Get() fetches the current value for a key. An Append to a non-existant key should act like Put.

You will implement the service as a replicated state machine consisting of several kvservers. Your kvraft client code (Clerk in src/kvraft/client.go) should try different kvservers it knows about until one responds positively. As long as a client can contact a kvraft server that is a Raft leader in a majority partition, its operations should eventually succeed.

Your kvraft servers should not directly communicate; they should only interact with each other through the Raft log.

Your first task is to implement a solution that works when there are no dropped messages, and no failed servers. Note that your service must provide sequential consistency to applications that use its client interface. That is, completed application calls to the Clerk.Get(), Clerk.Put(), and Clerk.Append() methods in kvraft/client.go must appear to have affected all kvservers in the same order, and have at-most-once semantics. A Clerk.Get(key) should see the value written by the most recent Clerk.Put(key, …) or Clerk.Append(key, …) (in the total order).

A reasonable plan of attack may be to first fill in the Op struct in server.go with the "value" information that kvraft will use Raft to agree on (remember that Op field names must start with capital letters, since they will be sent through RPC), and then implement the PutAppend() and Get() handlers in server.go. The handlers should enter an Op in the Raft log using Start(), and should reply to the client when that log entry is committed. Note that you cannot execute an operation until the point at which it is committed in the log (i.e., when it arrives on the Raft applyCh).

After calling Start(), your kvraft servers will need to wait for Raft to complete agreement. Commands that have been agreed upon arrive on the applyCh. You should think carefully about how to arrange your code so that your code will keep reading applyCh, while PutAppend() and Get() handlers submit commands to the Raft log using Start(). It is easy to achieve deadlock between the kvserver and its Raft library.

Your solution needs to handle the case in which a leader has called Start() for a client RPC, but loses its leadership before the request is committed to the log. In this case you should arrange for the client to re-send the request to other servers until it finds the new leader. One way to do this is for the server to detect that it has lost leadership, by noticing that a different request has appeared at the index returned by Start(), or that the term reported by Raft.GetState() has changed. If the ex-leader is partitioned by itself, it won't know about new leaders; but any client in the same partition won't be able to talk to a new leader either, so it's OK in this case for the server and client to wait indefinitely until the partition heals. More generally, a kvraft server should not complete a Get() RPC if it is not part of a majority.

You have completed Part I when you reliably pass the first test in the test suite: "One client". You may also find that you can pass the "concurrent clients" test, depending on how sophisticated your implementation is. From the src/kvraft directory:

$ go test -v -run Basic
=== RUN   TestBasic
Test: One client ...
  ... Passed
--- PASS: TestBasic (15.18s)
PASS
ok  kvraft 15.190s

Part II

In the face of unreliable connections and node failures, your clients may send RPCs multiple times until it finds a kvraft server that replies positively. One consequence of this is that you must ensure that each application call to Clerk.Put() or Clerk.Append() must appear in that order just once (i.e., write the key/value database just once).

Thus, your task in Part II is to cope with duplicate client requests, including situations where the client sends a request to a kvraft leader in one term, times out waiting for a reply, and re-sends the request to a new leader in another term. The client request should always execute just once.

You will need to uniquely identify client operations to ensure that they execute just once. You can assume that each clerk has only one outstanding Put, Get, or Append.

For stability, you must make sure that your scheme for duplicate detection frees server memory quickly, for example by having the client tell the servers which RPCs it has heard a reply for. It's OK to piggyback this information on the next client request.

You have completed Part II when you reliably pass all tests through TestPersistPartitionUnreliable().

$ go test -v
=== RUN   TestBasic
Test: One client ...
  ... Passed
--- PASS: TestBasic (15.22s)
=== RUN   TestConcurrent
Test: concurrent clients ...
  ... Passed
--- PASS: TestConcurrent (15.83s)
=== RUN   TestUnreliable
Test: unreliable ...
  ... Passed
--- PASS: TestUnreliable (16.68s)
=== RUN   TestUnreliableOneKey
Test: Concurrent Append to same key, unreliable ...
  ... Passed
--- PASS: TestUnreliableOneKey (1.40s)
=== RUN   TestOnePartition
Test: Progress in majority ...
  ... Passed
Test: No progress in minority ...
  ... Passed
Test: Completion after heal ...
  ... Passed
--- PASS: TestOnePartition (2.54s)
=== RUN   TestManyPartitionsOneClient
Test: many partitions ...
  ... Passed
--- PASS: TestManyPartitionsOneClient (24.08s)
=== RUN   TestManyPartitionsManyClients
Test: many partitions, many clients ...
  ... Passed
--- PASS: TestManyPartitionsManyClients (26.12s)
=== RUN   TestPersistOneClient
Test: persistence with one client ...
  ... Passed
--- PASS: TestPersistOneClient (18.68s)
=== RUN   TestPersistConcurrent
Test: persistence with concurrent clients ...
  ... Passed
--- PASS: TestPersistConcurrent (19.34s)
=== RUN   TestPersistConcurrentUnreliable
Test: persistence with concurrent clients, unreliable ...
  ... Passed
--- PASS: TestPersistConcurrentUnreliable (20.37s)
=== RUN   TestPersistPartition
Test: persistence with concurrent clients and repartitioning servers...
  ... Passed
--- PASS: TestPersistPartition (26.91s)
=== RUN   TestPersistPartitionUnreliable
Test: persistence with concurrent clients and repartitioning servers, unreliable...
  ... Passed
--- PASS: TestPersistPartitionUnreliable (26.89s)
PASS
ok  kvraft 214.069s

Resources and Advice

This assignment doesn't require you to write much code, but you will most likely spend a substantial amount of time thinking and staring at debugging logs to figure out why your implementation doesn't work. Debugging will be more challenging than in the Raft lab because there are more components that work asynchronously of each other. Start early!

You should implement the service without worrying about the Raft log's growing without bound. You do not need to implement snapshots (from Section 7 in the paper) to allow garbage collection of old log entries.

As noted, a kvraft server should not complete a Get() RPC if it is not part of a majority (so that it does not serve stale data). A simple solution is to enter every Get() (as well as each Put() and Append()) in the Raft log. You don't have to implement the optimization for read-only operations that is described in Section 8.

In Part I, you should probably modify your client Clerk to remember which server turned out to be the leader for the last RPC, and send the next RPC to that server first. This will avoid wasting time searching for the leader on every RPC.

Submission

Submit your code to the COS418 Assignment 5 Dropbox. You may submit multiple times, only the one in the Dropbox at the time of grading will be recorded.

The Dropbox script will run only a small number of tests (though you should of course pass these!). It is not a substitute for doing your own testing on the full go test cases detailed above. Before submitting, please run the full tests given above for both parts one final time. You are responsible for making sure your code works.

You will receive full credit for Part I if your software passes the tests mentioned for that section on the CS servers. You will receive full credit for Part II if your software passes the tests mentioned for that section on the CS servers.

The final portion of your credit is determined by code quality tests, using the standard tools gofmt and go vet. You will receive full credit for this portion if all files submitted conform to the style standards set by gofmt and the report from go vet is clean for your raftkv package (that is, produces no errors). If your code does not pass the gofmt test, you should reformat your code using the tool. You can also use the Go Checkstyle tool for advice to improve your code's style, if applicable. Additionally, though not part of the graded cheks, it would also be advisable to produce code that complies with Golint where possible.

Acknowledgements

This assignment is adapted from MIT's 6.824 course. Thanks to Frans Kaashoek, Robert Morris, and Nickolai Zeldovich for their support.

Last updated: 2018-01-14 22:57:14 -0500