BitTorrent Protocols Analysis

The “BitTorrent Protocol” is in fact a set of protocols, each used at a different stage: torrent discovery, peer discovery, download, seeding, and so on.

Let’s take a look at the following diagram.

[Diagram: the BitTorrent protocols in action]

We can see the different protocols in action.

Let’s take a closer look at them:

  • HTTP Tracker Protocol
    • This is the oldest (and original) Tracker Protocol
    • Typically, it consists of an HTTP GET request for a given torrent/swarm with the following arguments, announcing the peer’s interest in the swarm and querying the tracker for more interested peers
      • info_hash: 20-byte SHA-1 hash of the bencoded form of the info value from the metainfo (.torrent) file
      • peer_id: 20-byte string that the downloader uses as its ID
      • example
        • GET /announce?peer_id=aaaaaaaaaaaaaaaaaaaa&info_hash=aaaaaaaaaaaaaaaaaaaa&port=6881&left=0&downloaded=100&uploaded=0&compact=1
    • The HTTP response will usually contain a list of peers in the requested swarm
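      • Each peer entry in the list contains the following fields (when compact=1 is requested, the entries are instead packed into 6 bytes each and carry only the IP address and port):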
        • peer id
        • ip
        • port
    • URLs for this protocol take the form: http://tracker:port/announce?peer_id=X1&info_hash=X2&port=X3&left=X4&downloaded=100&uploaded=0&compact=1
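As a rough illustration of the announce request described above, the following Python sketch builds the query string and decodes a compact peer list. The tracker host `tracker.example`, the function names, and the 6-byte compact peer layout (4-byte IPv4 address followed by a 2-byte big-endian port) are assumptions made for illustration; they are not taken from this post.

```python
import socket
import struct
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port,
                       uploaded=0, downloaded=0, left=0):
    """Build an HTTP tracker announce URL (hypothetical helper)."""
    params = {
        "info_hash": info_hash,   # raw 20 bytes; urlencode percent-escapes them
        "peer_id": peer_id,       # raw 20 bytes
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "compact": 1,
    }
    return tracker_url + "?" + urlencode(params)


def parse_compact_peers(blob):
    """Decode a compact peer list: 6 bytes per peer (IPv4 + port).

    The layout is an assumption about the compact=1 response format.
    """
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        (port,) = struct.unpack("!H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers
```

For example, `build_announce_url("http://tracker.example:6969/announce", b"a" * 20, b"b" * 20, 6881)` yields a URL of the same shape as the GET request shown earlier.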
  • UDP Tracker Protocol
    • it is similar to the HTTP Tracker Protocol, but it is a binary protocol carried over UDP
    • it is usually lighter and faster than the HTTP version, since it avoids the overhead of TCP connection setup and of HTTP headers
    • URLs for this protocol take the form: udp://tracker:port
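To give an idea of what "binary protocol" means here, the sketch below packs the first packet a client sends to a UDP tracker (the connect request) and unpacks the reply. The field layout and the magic constant follow the commonly documented UDP tracker format and are not described in this post, so treat them, and the helper names, as assumptions.

```python
import os
import struct

PROTOCOL_ID = 0x41727101980  # magic constant that opens every connect request
ACTION_CONNECT = 0


def new_transaction_id():
    """Pick a random 32-bit transaction id to match replies to requests."""
    return int.from_bytes(os.urandom(4), "big")


def build_connect_request(transaction_id):
    """16 bytes: protocol id (64 bit), action (32 bit), transaction id (32 bit)."""
    return struct.pack("!QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def parse_connect_response(data, transaction_id):
    """Return the connection id the tracker assigned, or raise ValueError."""
    if len(data) < 16:
        raise ValueError("connect response too short")
    action, tid, connection_id = struct.unpack("!IIQ", data[:16])
    if action != ACTION_CONNECT or tid != transaction_id:
        raise ValueError("unexpected action or transaction id")
    return connection_id
```

The returned connection id would then be included in the subsequent announce packet, which carries the same information as the HTTP query arguments.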
  • Local Peer Discovery
    • this protocol introduces a mechanism to announce the presence of a swarm to potential peers in the same LAN, using “HTTP over UDP multicast”
    • a peer can announce its swarms to these multicast groups:
      • A) 239.192.152.143:6771 (org-local)
      • B) [ff15::efc0:988f]:6771 (site-local)
    • An announce broadcast message looks like the following:
      • BT-SEARCH * HTTP/1.1
        Host: <host>
        Port: <port>
        Infohash: <ihash>
        cookie: <cookie (optional)>
    • Usually, a peer broadcasts these messages every 5 minutes
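The announce message above is plain text, so it is easy to assemble by hand. The following sketch builds it for the IPv4 group and shows how it could be sent with a standard UDP socket. The function names, the TTL of 1, and the trailing blank line that terminates the headers are assumptions for illustration.

```python
import socket

LPD_GROUP_V4 = ("239.192.152.143", 6771)  # IPv4 multicast group listed above


def build_lpd_announce(infohash_hex, listen_port, cookie=None):
    """Build a BT-SEARCH announce as bytes (hypothetical helper)."""
    host, port = LPD_GROUP_V4
    lines = [
        "BT-SEARCH * HTTP/1.1",
        f"Host: {host}:{port}",
        f"Port: {listen_port}",
        f"Infohash: {infohash_hex}",
    ]
    if cookie is not None:
        lines.append(f"cookie: {cookie}")
    # Assumption: headers end with an empty line, as in HTTP.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_lpd_announce(message):
    """Send one announce to the IPv4 group; TTL 1 keeps it on the local link."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(message, LPD_GROUP_V4)
```

A client would call `send_lpd_announce(build_lpd_announce(...))` on a timer, for instance every 5 minutes as mentioned above.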
  • DHT Protocol – “Distributed sloppy hash table”
    • The purpose of this protocol is similar to the previous ones: to find peers interested in a given swarm/torrent, but with an interesting twist:
      • it is implemented inside each Downloader, so that no trackers are needed
      • therefore, torrents announced in the DHT are potentially easier to find, albeit kind of “more public”, because their public “visibility” is not restricted to a particular set of trackers
    • Node
      • in the BitTorrent terminology, a Node is an entity that implements the DHT protocol
      • typically, a Downloader contains a DHT Node that implements the DHT protocol, cooperating with the other Nodes
      • each node is assigned a globally unique identifier, the “node id”
    • The protocol is based on Kademlia (A Peer-to-peer DHT algorithm Based on the XOR Metric)
    • DHT Queries
      • ping
        • checks that another node is still reachable (the DHT runs over UDP, so there is no connection as such; pings keep routing table entries fresh)
      • find_node
        • “id” containing the node ID of the querying node
        • “target” containing the ID of the node sought
        • this query can be useful for torrents that list DHT nodes instead of trackers
      • get_peers
        • get peers associated with a torrent infohash
        • if the queried node knows some peers with the infohash, they are returned
        • otherwise, the node returns a list of nodes that are “closer” to the queried infohash
          • in this case, the querying node should continue by querying these nodes
      • announce_peer
        • Announce that the peer controlling the querying node is interested in the torrent with the given infohash
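The notion of "closer" nodes and the shape of a DHT query can be sketched in a few lines of Python. Distance in Kademlia is the XOR of two IDs read as an integer, and queries are bencoded dictionaries sent over UDP. The key names used below (`t`, `y`, `q`, `a`) follow the usual DHT message format, which this post does not spell out, so treat them and the helper names as assumptions.

```python
def xor_distance(id_a, id_b):
    """Kademlia distance: XOR of two equal-length IDs, read as an integer."""
    return int.from_bytes(id_a, "big") ^ int.from_bytes(id_b, "big")


def closest_nodes(node_ids, target, count=8):
    """Return the `count` node IDs closest to `target` under the XOR metric."""
    return sorted(node_ids, key=lambda nid: xor_distance(nid, target))[:count]


def bencode(value):
    """Minimal bencoder for ints, bytes/str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode()
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode() if isinstance(k, str) else k, v) for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # keys must appear in sorted order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def get_peers_query(transaction_id, node_id, info_hash):
    """Build a get_peers query for the given infohash (hypothetical helper)."""
    return bencode({
        "t": transaction_id,
        "y": "q",
        "q": "get_peers",
        "a": {"id": node_id, "info_hash": info_hash},
    })
```

When the response contains nodes rather than peers, a client could pass them through `closest_nodes` and repeat the query against the best candidates, which is the iterative lookup described above.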
  • BitTorrent Protocol (a.k.a. Peer Protocol)
    • as the name implies, this is the most important BitTorrent protocol
    • it is used for symmetrical communication between peers, including
      • data transfer (torrent “pieces”)
      • metadata transfer (for torrents)
      • extended protocol data and metadata exchange
    • it can run over TCP or over uTP, a BitTorrent-specific transport layer built on top of UDP
      • uTP can be used to improve congestion management
    • typical message flow includes:
      • handshake + extensions
        • each peer sends a handshake message to the other
          • the handshake includes the desired torrent’s “infohash”
            • note that when contacting a new peer, the originator peer believes that the target peer has some of the desired torrent’s pieces
              • because it found it through some tracking mechanism based on the infohash
        • followed by
          • extensions supported
          • and a never-ending stream of length-prefixed messages
      • base messages
        • 0 – choke, 1 – unchoke
          • for bandwidth control
        • 2 – interested, 3 – not interested
          • specifies if the peer is interested in the pieces that the other peer has available for downloading
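        • 4 – have
          • a peer informs the other peer that it now has a given piece (the full set of pieces it already holds is normally sent once, right after the handshake, in a 5 – bitfield message)
        • 6 – request, 7 – piece
          • a peer can “request” a block of a given piece, identified by piece index, offset, and length
          • the other peer returns the data in a “piece” message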
      • a peer will keep requesting the “pieces” it hasn’t downloaded yet from the other peer
        • the other peer will return them
        • if a peer detects that the other one is not letting it download as much as it should, it may “choke” that peer, meaning it will refuse that peer’s requests until it sends “unchoke”
        • if a peer detects that the other peer only has pieces that it has already downloaded, it should send it a “not interested” message
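The handshake and the length-prefixed base messages listed above can be sketched as follows. The exact byte layout (a 68-byte handshake starting with the string "BitTorrent protocol", and messages framed as a 4-byte big-endian length followed by a 1-byte message id) comes from the commonly documented peer wire format rather than from this post, and the helper names are hypothetical.

```python
import struct

PSTR = b"BitTorrent protocol"
CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED, HAVE = 0, 1, 2, 3, 4
REQUEST, PIECE = 6, 7


def build_handshake(info_hash, peer_id, reserved=bytes(8)):
    """Length-prefixed protocol string, 8 reserved (extension) bytes, infohash, peer id."""
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def encode_message(msg_id, payload=b""):
    """Frame one message: 4-byte length (id + payload), 1-byte id, payload."""
    return struct.pack("!IB", 1 + len(payload), msg_id) + payload


def encode_request(index, begin, length):
    """Request `length` bytes starting at offset `begin` of piece `index`."""
    return encode_message(REQUEST, struct.pack("!III", index, begin, length))


def decode_messages(buffer):
    """Split a byte buffer into (msg_id, payload) pairs.

    Returns the decoded messages and any trailing bytes that do not yet
    form a complete message. Zero-length keep-alives are skipped.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("!I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break  # incomplete message; wait for more data
        body = buffer[offset + 4:offset + 4 + length]
        offset += 4 + length
        if length == 0:
            continue  # keep-alive carries no id
        messages.append((body[0], body[1:]))
    return messages, buffer[offset:]
```

Because TCP delivers a byte stream rather than discrete packets, `decode_messages` hands back the unconsumed tail so the caller can prepend it to the next chunk it reads.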
    • the BitTorrent protocol can be extended with new messages, which peers can use to check for extra functionality in other peers
      • extension examples:
        • DHT node support
          • (was not present in the initial protocol version)
        • Torrent Metadata exchange support
          • (was not present in the initial protocol version)
        • Peer Exchange (PEX)
          • (was not present in the initial protocol version)
    • Encryption (or, more precisely, obfuscation)
      • weak encryption can be used in this protocol to try to hide BitTorrent traffic from ISPs that block or shape it
      • an encapsulation protocol called Message Stream Encryption (MSE), also known as Protocol Encryption (PE), can be used for this purpose
        • it uses a completely random header and a D-H key exchange in order to accomplish its purpose
      • a downloader can announce encryption support to a tracker by using the following extra arguments
        • supportcrypto=1, requirecrypto=1, cryptoport=X

Conclusion

This concludes this quick overview of the protocols used by the BitTorrent network.

In the following posts, I will show some low-level protocol examples.
