Implementing a DHT in Go, part 1

In order to further explore the capabilities and limitations of Go, I thought it would be interesting to try implementing something that was practical, non-trivial, and of interest on its own. With that in mind, we're going to spend the next few posts creating an implementation of the Kademlia DHT in Go.

A DHT, or Distributed Hash Table, is a peer-to-peer networking primitive that, in its most basic formulation, permits storage and lookup of key/value pairs - that is, it's a hash table that's distributed across many nodes. The name is not entirely accurate for some newer DHTs, as some formulations permit many other operations besides those focused on data storage and retrieval.
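
To make that concrete, here's a minimal sketch of the interface a basic DHT presents to its users. This is purely illustrative - the method names are mine, not part of any spec, and our implementation won't reach this point until a later post:

type DHT interface {
  // Store a value under the given key, somewhere in the network.
  Store(key, value []byte);
  // Retrieve the value for a key, if any node is holding it.
  FindValue(key []byte) (value []byte, ok bool);
}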

Kademlia is a good example of a basic DHT, because unlike some competing algorithms, it's extremely simple. There are no explicit routing update messages, and the internal state it maintains is fairly straightforward and easy to understand. Lookups are also accomplished in an obvious - yet very efficient - manner. In return for this simplicity, Kademlia sacrifices a few of the features of competitors like Pastry and Chord - it's not as practical to implement other primitives such as pubsub over it.

The reference we're going to use for building our implementation is XLattice's excellent protocol spec. This covers the important facets of implementing a Kademlia DHT, while being easier to read than the original paper (PDF).

The central concept of any DHT is the Node ID. In the case of Kademlia, this is a 160-bit opaque identifier, used to identify nodes on the network, as well as content hashes. Let's define it in Go, using Go's handy ability to define methods on any named type:

const IdLength = 20

type NodeID [IdLength]byte

In order to create NodeIDs, we need a couple of functions - one to decode them from a hex string, and one to generate a new random ID:

func NewNodeID(data string) (ret NodeID) {
  decoded, err := hex.DecodeString(data);
  if err != nil || len(decoded) != IdLength {
    // Malformed input; a real implementation should surface this error to
    // the caller rather than silently returning a zero ID.
    return;
  }
  for i := 0; i < IdLength; i++ {
    ret[i] = decoded[i];
  }
  return;
}

func NewRandomNodeID() (ret NodeID) {
  for i := 0; i < IdLength; i++ {
    ret[i] = uint8(rand.Intn(256));
  }
  return;
}

This should be fairly self-explanatory so far. Note that we're using Go's support for named return values in order to avoid the need to declare a local variable in both of these functions. We're also making use of the encoding/hex and rand packages from the Go standard library.
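
As a quick sanity check, here's how these constructors might be used (the hex string is just an arbitrary example):

// Decode a known 20-byte ID from its 40-character hex representation...
id := NewNodeID("a48cd09f5c0d9ba5bd8d8a5b8a85b49a47b52a0d");
// ...or generate a fresh random one.
random := NewRandomNodeID();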

We should also define a function to convert a Node ID back into hex, for printing and so forth:

func (node NodeID) String() string {
  return hex.EncodeToString(node[0:IdLength]);
}

Since this function has the correct signature, our NodeID type now implements the fmt.Stringer interface, meaning that our String method will be used to convert our NodeID for display any time we use Printf or similar functions on it.
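
For example, a NodeID now prints as hex without any extra work on our part (again, the ID shown is just an example):

id := NewNodeID("a48cd09f5c0d9ba5bd8d8a5b8a85b49a47b52a0d");
fmt.Printf("Our ID: %v\n", id);  // Our ID: a48cd09f5c0d9ba5bd8d8a5b8a85b49a47b52a0d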

We should also define a couple of common operations on our NodeID - equality and ordering:

func (node NodeID) Equals(other NodeID) bool {
  for i := 0; i < IdLength; i++ {
    if node[i] != other[i] {
      return false;
    }
  }
  return true;
}

func (node NodeID) Less(other interface{}) bool {
  otherID := other.(NodeID);
  for i := 0; i < IdLength; i++ {
    if node[i] != otherID[i] {
      return node[i] < otherID[i];
    }
  }
  return false;
}

Together, these methods define a total ordering for NodeIDs. Comparison is big-endian: byte 0 is the most significant, so NodeIDs compare exactly as 160-bit unsigned integers would. This will allow us to sort NodeIDs, which will be important later.
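
A quick illustration, again with made-up IDs:

// Byte 0 is the most significant, so an ID starting 0x0f sorts below
// one starting 0xf0.
low := NewNodeID("0f00000000000000000000000000000000000000");
high := NewNodeID("f000000000000000000000000000000000000000");
fmt.Println(low.Less(high), high.Less(low));  // true false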

All DHTs rely on having a 'distance metric' - a way of comparing two NodeIDs or hashes to determine how far apart they are. Kademlia uses the 'XOR metric': The distance between two NodeIDs is the XOR of those two NodeIDs, interpreted as a number. For that, we'll need to define an Xor method for our NodeIDs:

func (node NodeID) Xor(other NodeID) (ret NodeID) {
  for i := 0; i < IdLength; i++ {
    ret[i] = node[i] ^ other[i];
  }
  return;
}
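
For instance, two IDs that differ only in the low bits of the last byte are very close together under this metric (example IDs once more):

a := NewNodeID("a48cd09f5c0d9ba5bd8d8a5b8a85b49a47b52a0d");
b := NewNodeID("a48cd09f5c0d9ba5bd8d8a5b8a85b49a47b52a0e");
fmt.Printf("%v\n", a.Xor(b));  // 38 zeros followed by "03": a distance of 3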

Now that we've defined our NodeID thoroughly, we can start on the routing table implementation. To enable efficient traversal through a DHT, a routing table needs to contain a selection of nodes both close to and far away from our own node. Kademlia accomplishes this by breaking up the routing table into 'buckets', with each 'bucket' corresponding to a particular range of distances between the nodes stored in that bucket and ourselves. Bucket 0 contains nodes whose IDs differ from ours in the first bit; bucket 1 contains nodes that share the first bit but differ in the second; and so forth.

This exponential division of the keyspace has a couple of implications. First, fully half the nodes in the DHT should be expected to end up in bucket 0, half of the remainder in bucket 1, and so forth. This means that we will have complete knowledge of the nodes nearest us, gradually getting sparser over increasing distance. This is necessary in order to ensure we can always find data if it exists in the DHT. The second implication of this approach is that the number of the bucket a given node should be placed in is determined by the number of leading zero bits in the XOR of our node ID with the target node ID, which makes for an easy implementation. Let's add a method to our NodeID type to facilitate this:

// PrefixLen returns the number of leading zero bits in the ID. Note that an
// all-zeros ID (which can only arise from XORing an ID with itself) is
// clamped to IdLength*8 - 1, keeping the result a valid bucket index.
func (node NodeID) PrefixLen() (ret int) {
  for i := 0; i < IdLength; i++ {
    for j := 0; j < 8; j++ {
      if (node[i] >> uint8(7 - j)) & 0x1 != 0 {
        return i * 8 + j;
      }
    }
  }
  return IdLength * 8 - 1;
}
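
Putting Xor and PrefixLen together gives us the bucket number directly. A small worked example, with deliberately simple IDs:

self := NewNodeID("0100000000000000000000000000000000000000");
far := NewNodeID("8100000000000000000000000000000000000000");   // differs in the first bit
near := NewNodeID("0100000000000000000000000000000000000001");  // differs only in the last bit
fmt.Println(self.Xor(far).PrefixLen());   // 0: 'far' goes in bucket 0
fmt.Println(self.Xor(near).PrefixLen());  // 159: 'near' goes in the last bucket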

Now we can begin defining our routing table proper:

const BucketSize = 20;

type Contact struct {
  id NodeID;
}

type RoutingTable struct {
  node NodeID;
  buckets [IdLength*8]*list.List;
}

func NewRoutingTable(node NodeID) (ret RoutingTable) {
  for i := 0; i < IdLength * 8; i++ {
    ret.buckets[i] = list.New();
  }
  ret.node = node;
  return;
}

First, we define a Contact struct, which contains the data we need to store in the routing table. For now, this is just the Node ID, but a practical implementation will also need to store some way of contacting the node in question, such as an IP address and port. We'll come back to that in future posts.

Our RoutingTable struct stores the current node's ID, and an array of list instances, one for each bit in the ID. The list package implements doubly-linked lists, and includes convenient methods such as MoveToFront. We'll make use of this in our Update method, which takes a Contact and updates the routing table with it:

func (table *RoutingTable) Update(contact *Contact) {
  prefix_length := contact.id.Xor(table.node).PrefixLen();
  bucket := table.buckets[prefix_length];
  // Scan the bucket for an existing entry with this contact's ID.
  var element *list.Element;
  for e := bucket.Front(); e != nil; e = e.Next() {
    if e.Value.(*Contact).id.Equals(contact.id) {
      element = e;
      break;
    }
  }
  if element == nil {
    if bucket.Len() < BucketSize {
      bucket.PushFront(contact);
    }
    // TODO: Handle insertion when the list is full by evicting old elements if
    // they don't respond to a ping.
  } else {
    bucket.MoveToFront(element);
  }
}

This function starts by finding the appropriate bucket, then scanning that bucket's list to locate the contact if it already exists. If it does, it moves the contact to the front of the bucket. This behaviour is an important part of Kademlia's robustness: nodes that have been around for a long time are more likely to remain in the DHT than new nodes, so Kademlia strongly prefers established nodes. If the contact isn't already in the bucket, it's pushed to the front if there's room. If there's not, it's simply discarded.

The TODO here is also important: If adding our contact to the bucket would make the bucket too full, we should asynchronously ping the least-recently-seen element in the bucket (the one at the back of the list). If it doesn't respond, we should evict it from the bucket.
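
Here's a rough sketch of how that eviction might look. Note that ping here is hypothetical - we won't have a real ping RPC to call until a later post:

// Sketch only: called when a bucket is full. 'ping' is an assumed helper
// that returns true if the contact responds; it doesn't exist yet.
func (table *RoutingTable) evictOrDiscard(bucket *list.List, contact *Contact) {
  oldest := bucket.Back();
  if !ping(oldest.Value.(*Contact)) {
    // The established contact is gone; replace it with the new one.
    bucket.Remove(oldest);
    bucket.PushFront(contact);
  }
  // Otherwise the established contact is kept, and the new one is discarded.
}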

Finally, we need a way to query the DHT, returning the closest nodes to a particular hash or ID. We do this by first finding the bucket that the queried ID would fall into and adding that bucket's elements to the values to return. We then work outward from there, adding elements from the buckets on either side, until our list contains at least enough elements or we run out of buckets to check. Then, we sort the list of results, and return those closest to the original ID. Here's the implementation:

type ContactRecord struct {
  node *Contact;
  sortKey NodeID;
}

func (rec *ContactRecord) Less(other interface{}) bool {
  return rec.sortKey.Less(other.(*ContactRecord).sortKey);
}

func copyToVector(start, end *list.Element, vec *vector.Vector, target NodeID) {
  for elt := start; elt != end; elt = elt.Next() {
    contact := elt.Value.(*Contact);
    vec.Push(&ContactRecord{contact, contact.id.Xor(target)});
  }
}

func (table *RoutingTable) FindClosest(target NodeID, count int) (ret *vector.Vector) {
  ret = new(vector.Vector).Resize(0, count);

  bucket_num := target.Xor(table.node).PrefixLen();
  bucket := table.buckets[bucket_num];
  copyToVector(bucket.Front(), nil, ret, target);

  for i := 1; (bucket_num-i >= 0 || bucket_num+i < IdLength*8) && ret.Len() < count; i++ {
    if bucket_num - i >= 0 {
      bucket = table.buckets[bucket_num - i];
      copyToVector(bucket.Front(), nil, ret, target);
    }
    if bucket_num + i < IdLength * 8 {
      bucket = table.buckets[bucket_num + i];
      copyToVector(bucket.Front(), nil, ret, target);
    }
  }
  
  sort.Sort(ret);
  if ret.Len() > count {
    ret.Cut(count, ret.Len());
  }
  return;
}

Note that sorting the vector is a simple matter of passing it to sort.Sort; this works thanks to our ContactRecord type, whose Less method compares records by their distance to the target. Also note the conditional call to Cut to eliminate excess elements; this is conditional due to a current bug in the Go library that causes Cut to expand a Vector if you call it with a start index larger than the vector's current length.
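
Tying it all together, using the routing table looks something like this (assuming the vector package's At accessor for element retrieval):

table := NewRoutingTable(NewRandomNodeID());
// ... populate the table with calls to table.Update(contact) ...
closest := table.FindClosest(NewRandomNodeID(), BucketSize);
for i := 0; i < closest.Len(); i++ {
  record := closest.At(i).(*ContactRecord);
  fmt.Printf("%v (distance %v)\n", record.node.id, record.sortKey);
}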

That's it for the implementation of our routing table; you can see the entire code, with some basic unit tests, here. In the next post, we'll go over Go's support for RPCs, and make use of it to begin implementing the RPCs that form the basis of our DHT.
