BitTorrent Protocols Analysis

The “BitTorrent Protocol” is, in fact, a set of protocols, used in different stages, such as torrent discovery, peer discovery, download, seeding, and so on.

Let’s take a look at the following diagram.

[Diagram: the BitTorrent protocols in action]

We can see the different protocols in action.

Let’s take a closer look at them:

  • HTTP Tracker Protocol
    • This is the oldest (and original) Tracker Protocol
    • Typically, it consists of an HTTP GET request for a given torrent/swarm with the following arguments, announcing the peer’s interest in the swarm and querying the tracker for more interested peers
      • info_hash: 20-byte SHA-1 hash of the bencoded form of the info value from the metainfo file
      • peer_id: string of length 20 which the downloader uses as its id
      • example
        • GET /announce?peer_id=aaaaaaaaaaaaaaaaaaaa&info_hash=aaaaaaaaaaaaaaaaaaaa&port=6881&left=0&downloaded=100&uploaded=0&compact=1
    • The HTTP response will usually contain a list of peers in the requested swarm
      • Each peer entry in the list contains (in the compact format requested with compact=1, only the ip and port are returned):
        • peer id
        • ip
        • port
    • URLs for this protocol take the form: http://tracker:port/announce?peer_id=X1&info_hash=X2&port=X3&left=X4&downloaded=100&uploaded=0&compact=1
  • UDP Tracker Protocol
    • it is similar to the HTTP Tracker Protocol, but it is a binary protocol and it is UDP-based
    • it is usually lighter and faster than the HTTP version
    • URLs for this protocol take the form: udp://tracker:port
  • Local Peer Discovery
    • this protocol introduces a mechanism to announce the presence of a swarm to potential peers in the same LAN, using “HTTP over UDP multicast”
    • a peer can broadcast its swarms to these multicast groups:
      • A) 239.192.152.143:6771 (org-local)
      • B) [ff15::efc0:988f]:6771 (site-local)
    • An announce broadcast message looks like the following:
      • BT-SEARCH * HTTP/1.1
        Host: <host>
        Port: <port>
        Infohash: <ihash>
        cookie: <cookie (optional)>
    • Usually, a peer broadcasts these messages every 5 minutes
  • DHT Protocol – “Distributed sloppy hash table”
    • The purpose of this protocol is similar to the previous ones: to find peers interested in a given swarm/torrent, but with an interesting twist:
      • it is implemented inside each Downloader, so that no trackers are needed
      • therefore, torrents announced in the DHT are potentially easier to find, albeit kind of “more public”, because their public “visibility” is not restricted to a particular set of trackers
    • Node
      • in the BitTorrent terminology, a Node is an entity that implements the DHT protocol
      • typically, a Downloader contains a DHT Node that implements the DHT protocol, cooperating with the other Nodes
      • each node is assigned a globally unique identifier, the “node id”
    • The protocol is based on Kademlia (A Peer-to-peer DHT algorithm Based on the XOR Metric)
    • DHT Queries
      • ping
        • to check that another node is still reachable (DHT messages travel over UDP, so there is no connection to keep alive)
      • find_node
        • “id” containing the node ID of the querying node
        • “target” containing the ID of the node sought
        • this query can be useful for torrents that list specific nodes instead of trackers
      • get_peers
        • get peers associated with a torrent infohash
        • if the queried node knows some peers with the infohash, they are returned
        • otherwise, the node returns a list of nodes that are “closer” to the queried infohash
          • in this case, the querying node should continue by querying these nodes
      • announce_peer
        • Announce that the peer controlling the querying node is interested in the torrent with the given infohash
  • BitTorrent Protocol (a.k.a. Peer Protocol)
    • as the name implies, this is the most important BitTorrent protocol
    • it is used for symmetrical communication between peers, including
      • data transfer (torrent “pieces”)
      • metadata transfer (for torrents)
      • extended protocol data and metadata exchange
    • it can run over TCP or over uTP, a BitTorrent-specific transport protocol layered on UDP
      • uTP can be used to improve congestion management
    • typical message flow includes:
      • handshake + extensions
        • each peer sends a handshake message to the other
          • the handshake includes the desired torrent’s “infohash”
            • note that when contacting a new peer, the originator peer believes that the target peer has some of the desired torrent’s pieces
              • because it found it through some tracking mechanism based on the infohash
        • followed by
          • extensions supported
          • and a never-ending stream of length-prefixed messages
      • base messages
        • 0 – choke, 1 – unchoke
          • for bandwidth control
        • 2 – interested, 3 – not interested
          • specifies if the peer is interested in the pieces that the other peer has available for downloading
        • 4 – have
          • a peer can inform the other peer of which torrent pieces it already has downloaded
        • 6 – request, 7 – piece
          • a peer can “request” a given piece
          • the other peer returns the “piece”
      • a peer will keep requesting the “pieces” it hasn’t downloaded yet from the other peer
        • the other peer will return them
        • if a peer detects that the other one is not allowing it to download as many pieces as it should, it may “choke” the other peer, which means it will not allow it to download more pieces until “unchoked”
        • if a peer detects that the other peer only has pieces that it has already downloaded, it should send it a “not interested” message
    • the BitTorrent protocol can be extended with new messages, which peers can use to check for extra functionality in other peers
      • extension examples:
        • DHT node support
          • (was not present in the initial protocol version)
        • Torrent Metadata exchange support
          • (was not present in the initial protocol version)
        • Peer Exchange (PEX)
          • (was not present in the initial protocol version)
    • Encryption (or more precisely, Obfuscation)
      • weak encryption can be used in this protocol to try to hide BitTorrent traffic from ISPs that block or shape it
      • an encapsulation protocol called Message Stream Encryption (MSE), also known as Protocol Encryption (PE), can be used for this purpose
        • it uses a completely random header and a Diffie-Hellman key exchange in order to accomplish its purpose
      • a downloader can announce encryption support to a tracker by using the following extra arguments
        • supportcrypto=1, requirecrypto=1, cryptoport=X
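The handshake and the length-prefixed messages described above can be sketched in a few lines of Python. This is only a minimal sketch, assuming the classic layout of the original peer protocol (a 68-byte handshake, then messages framed as a 4-byte big-endian length followed by a 1-byte message id). The function names are hypothetical, no extensions are advertised, and no network I/O is performed.

```python
import struct

PSTR = b"BitTorrent protocol"  # protocol identifier sent in every handshake

# Message ids from the list above (ids 5 "bitfield" and 8 "cancel" also exist).
CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED, HAVE = 0, 1, 2, 3, 4
REQUEST, PIECE = 6, 7


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Return the handshake: pstrlen, pstr, 8 reserved bytes, info_hash, peer_id."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    reserved = bytes(8)  # extension bits; all zero means no extensions advertised
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def build_message(msg_id: int, payload: bytes = b"") -> bytes:
    """Frame a message as <4-byte big-endian length><1-byte id><payload>."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def parse_messages(stream: bytes):
    """Yield (msg_id, payload) for each complete message found in a byte stream."""
    offset = 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        if offset + 4 + length > len(stream):
            break  # incomplete message: a real client would wait for more data
        body = stream[offset + 4 : offset + 4 + length]
        offset += 4 + length
        if length == 0:
            continue  # a zero-length message is a keep-alive
        yield body[0], body[1:]


handshake = build_handshake(b"a" * 20, b"b" * 20)
have_piece_3 = build_message(HAVE, struct.pack(">I", 3))
# A request carries three integers: piece index, offset inside the piece, length.
request = build_message(REQUEST, struct.pack(">III", 0, 0, 16384))
parsed = list(parse_messages(build_message(INTERESTED) + have_piece_3 + request))
```

Note that a “request” addresses a block inside a piece (index, offset, length), so a large piece is normally fetched with several requests.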

Conclusion

This concludes this quick overview of the protocols used by the BitTorrent network.

In the following posts, I will show some low-level protocol examples.

BitTorrent Architecture Overview

According to its proponents, BitTorrent is “a free speech tool”.

This is indeed the case, as it allows its users to distribute content without a centralized authority, using mainly each user’s network and computing resources in a distributed fashion.

A user’s network and computing resources are shared with the other users (also called “peers” in this context), so that everyone can benefit from the expanded availability and censorship resistance of the BitTorrent network. The content can be stored on multiple peers at the same time, so that if a peer goes down, the content can still be obtained from the remaining peers that have a complete or partial copy of it.

Additionally, BitTorrent allows a user to download content from multiple peers, instead of downloading it from a single server. This feature can often enable faster download speeds for its users.

Privacy can also be enhanced by using BitTorrent with minimal or no logging, whenever possible. This contrasts with downloading content from a server, which is usually logged.

Main components/terms

BitTorrent has a complex terminology, so it is worth clarifying the main terms.

Let’s start with:

  • Original file/content
    • The content to be published/shared with other users
  • Downloader
    • Application that implements the BitTorrent protocols and specifications
      • there are several such applications
      • examples: uTorrent, qbittorrent
  • “.torrent” file (a.k.a. Metadata file)
    • a small “summary” file that describes the original content (such as its name, size and piece hashes) without containing it
      • the user can share the metadata file with other users, so that they can download the shared content
    • “info_hash” is the SHA-1 hash of the bencoded “info” section in the “.torrent” file
      • this hash is used to identify the torrent and for searching for peers seeding the torrent
  • Magnet URI
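    • a URI that identifies a torrent by its “info_hash”, so that the metadata can be fetched from other peers instead of from a file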
      • even simpler to share than a file
      • the user can share the magnet URI with other users, so that they can download the shared content
  • Piece
    • a segment of the original file
      • the original file is split into multiple “pieces” for downloading and uploading
  • Peer
    • another user with whom to share file “pieces”
  • Swarm
    • a group of peers sharing the same “torrent”
      • downloading
      • or uploading (also known as “seeding”)
    • identified by the torrent’s “info_hash”
  • Seed
    • a peer that has the entire “torrent” contents
      • has already downloaded all the “pieces” (or is the original publisher)
  • Tracker
    • an auxiliary service that behaves as a kind of “name server”
    • maps torrent “info_hashes” to lists of peers that are seeding each torrent
    • peers “announce” that they are “interested” in the torrent identified by a given “info_hash” and simultaneously receive a list of peers that are also “interested”
      • “interested” peers are willing to upload and download “pieces” of the torrent
    • trackers are set up and maintained by voluntary entities
  • DHT – Distributed Hash Table
    • another auxiliary service, implemented collectively by the BitTorrent nodes themselves rather than by dedicated servers
    • also behaves as a kind of “name server”
    • peers “announce” that they are “interested” in the torrent identified by a given “info_hash” and simultaneously receive a list of peers that are also “interested”
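To make the relationship between the “info_hash” and the Magnet URI more concrete, here is a minimal Python sketch that builds and parses a magnet link. It assumes the common `magnet:?xt=urn:btih:<hex info_hash>` form with optional `dn` (display name) and `tr` (tracker) parameters. The helper names and the sample tracker URL are hypothetical, and only hex-encoded hashes are handled.

```python
from urllib.parse import parse_qs, quote, urlsplit


def build_magnet(info_hash: bytes, name: str = "", trackers=()) -> str:
    """Build a magnet URI from a 20-byte info_hash (hex-encoded in the URI)."""
    uri = "magnet:?xt=urn:btih:" + info_hash.hex()
    if name:
        uri += "&dn=" + quote(name, safe="")
    for tracker in trackers:
        uri += "&tr=" + quote(tracker, safe="")
    return uri


def parse_magnet(uri: str):
    """Return (info_hash, name, trackers) from a hex-encoded btih magnet URI."""
    parts = urlsplit(uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")
    params = parse_qs(parts.query)
    prefix = "urn:btih:"
    topics = [xt for xt in params.get("xt", []) if xt.startswith(prefix)]
    if not topics:
        raise ValueError("no BitTorrent info_hash found")
    info_hash = bytes.fromhex(topics[0][len(prefix):])
    name = params.get("dn", [""])[0]
    return info_hash, name, params.get("tr", [])


sample_hash = bytes(range(20))  # placeholder value, not a real torrent
sample_uri = build_magnet(sample_hash, "My File", ["udp://tracker.example:6969"])
```

Since the URI carries only the hash and a few hints, a Downloader that starts from a magnet link still has to find peers first and then obtain the metadata from them.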

Step 1. Creating & sharing a new torrent + swarm

[Diagram: creating and sharing a new torrent and swarm]

In order to publish a new torrent, a user typically has to perform the following steps:

  • 1. Create the “.torrent” metadata file
    • select which content (local files) to include in the new “.torrent”
    • optionally select which tracker(s) will be used to announce the new torrent
    • optionally select if the torrent will be private (not announced in the DHT nor in public trackers) or if it will be public (announced in the DHT and in public trackers)
    • optionally select “web seeds” for the content
      • (these are just HTTP URLs pointing to some web server that is also serving the same content via HTTP)
  • 2. Save the “.torrent” metadata file and “Magnet URI” for later use
  • 3. Announce and Seed the new “.torrent”
    • Announce
      • In the “Trackers” defined in the metadata
      • In the DHT
      • For Local Peers (in the same LAN)
    • Seed
      • allow any interested peer to download from the initial seeder
    • the “info_hash” is used to uniquely identify the “.torrent” file
  • 4. Share the “.torrent” file or Magnet URI with the intended peer audience using means external to the BitTorrent network
    • web sites, emails, chat, sms, …
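The first step above, creating the metadata and deriving its “info_hash”, can be sketched as follows. This is a minimal sketch, assuming a single-file torrent with the usual “info” keys (`name`, `length`, `piece length`, `pieces`) and one `announce` tracker. The function names, the tracker URL and the tiny piece length are illustrative assumptions; real Downloaders handle multi-file torrents and many more options.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, bytes/str, lists and dicts in bencoded form."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda item: item[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(name: str, content: bytes, tracker: str, piece_length: int = 16384) -> dict:
    """Build a single-file metainfo dict; 'pieces' concatenates the SHA-1 of each piece."""
    pieces = b"".join(
        hashlib.sha1(content[i : i + piece_length]).digest()
        for i in range(0, len(content), piece_length)
    )
    info = {"name": name, "length": len(content), "piece length": piece_length, "pieces": pieces}
    return {"announce": tracker, "info": info}


def info_hash(metainfo: dict) -> bytes:
    """SHA-1 of the bencoded 'info' value; this identifies the torrent and its swarm."""
    return hashlib.sha1(bencode(metainfo["info"])).digest()


metainfo = make_metainfo("hello.txt", b"x" * 40000, "http://tracker.example/announce")
torrent_bytes = bencode(metainfo)  # what would be saved as the ".torrent" file
torrent_id = info_hash(metainfo)   # 20 bytes, used in announces and magnet links
```

Because the hash covers only the “info” section, changing the tracker list does not change the identity of the torrent.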

Example using uTorrent (Windows):

[Screenshot: uTorrent]

Example using qbittorrent (Linux):

[Screenshot: qbittorrent]

Step 2. Finding and Joining peers for a given torrent/swarm

[Diagram: finding and joining peers for a given torrent/swarm]

In order to find and join a torrent, a user typically has to perform the following steps:

  • 1. Search for interesting “.torrent” or Magnet URIs using means external to the BitTorrent network
    • web sites, emails, chat, sms, …
    • NOTE: BitTorrent does not provide a “content search” mechanism, as some of its predecessors did.
  • 2. Download and Save the “.torrent” file or Magnet URI
  • 3. Obtain the “info_hash”: calculate it from the “.torrent” file, or read it from the Magnet URI
  • 4. Query for peers seeding the torrent and Join the Swarm
    • using the calculated “info_hash”
    • announce interest in the torrent
    • get lists of peers
    • through the Trackers, the DHT, Local Peer Discovery
    • (and also through Peer Exchange, after some peers have been found through the other mechanisms)
  • 5. Download and Upload content “Pieces” from/to other Peers
  • 6. Reassemble the original file by assembling all downloaded pieces together
  • 7. Become a Seed
    • Keep seeding the contents
    • seeding all “pieces”
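Step 4 above, querying a tracker for peers, can be illustrated with a short Python sketch. It assumes an HTTP tracker and the compact response format, in which each peer is packed into 6 bytes (a 4-byte IPv4 address followed by a 2-byte big-endian port). The function names and the tracker URL are hypothetical, and the sketch only builds the request URL and decodes a peer list, without contacting any tracker.

```python
import struct
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port, left, downloaded=0, uploaded=0):
    """Build an HTTP tracker announce URL; binary fields are percent-encoded."""
    query = urlencode({
        "info_hash": info_hash,  # raw 20 bytes; urlencode percent-encodes them
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "compact": 1,
    })
    return f"{tracker_url}?{query}"


def decode_compact_peers(blob: bytes):
    """Decode a compact peer list into (ip, port) tuples."""
    if len(blob) % 6:
        raise ValueError("compact peer list length must be a multiple of 6")
    peers = []
    for offset in range(0, len(blob), 6):
        ip_bytes, port = struct.unpack_from(">4sH", blob, offset)
        peers.append((".".join(str(b) for b in ip_bytes), port))
    return peers


announce_url = build_announce_url(
    "http://tracker.example/announce", b"\xff" * 20, b"a" * 20, port=6881, left=0
)
sample_peers = decode_compact_peers(bytes([192, 168, 1, 10, 0x1A, 0xE1]))
```

In a real client the tracker response is itself bencoded, and the compact peer list is the value of its `peers` key.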

Adding a new torrent file example:

[Screenshots: adding a new torrent file]

Trackers example:

[Screenshot: trackers]

DHT Example:

[Screenshot: DHT]

Step 3. Downloading and Uploading torrent/swarm pieces

During the download and even after a full download, the “Downloader” also “seeds” the “pieces” it has already downloaded. This means that other peers can download these “pieces” from it. This allows for extra availability and extra bandwidth, when there are many peers in a swarm.

When one or more peers go offline, some of the torrent “pieces” may become “not available”. This means that other peers which still don’t have those “pieces” will not be able to download the full torrent until those “pieces” become available again.
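The availability problem can be shown with a small Python sketch. It assumes a hypothetical in-memory view of the swarm, where each peer name maps to the set of piece indexes that the peer currently has; a real Downloader builds this view from the messages received from each peer.

```python
def unavailable_pieces(num_pieces: int, peer_pieces: dict) -> list:
    """Return the indexes of pieces that no online peer currently has."""
    available = set()
    for pieces in peer_pieces.values():
        available.update(pieces)
    return [index for index in range(num_pieces) if index not in available]


swarm = {"peer_a": {0, 1, 2}, "peer_b": {2, 3, 4}}
missing_before = unavailable_pieces(5, swarm)  # every piece has at least one holder
del swarm["peer_b"]                            # peer_b goes offline
missing_after = unavailable_pieces(5, swarm)   # pieces held only by peer_b are gone
```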

As previously mentioned:

  • 5. Download and Upload content “Pieces” from/to other Peers
  • 6. Reassemble the original file by assembling all downloaded pieces together
  • 7. Become a Seed
    • Keep seeding the contents
    • seeding all “pieces”
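Reassembling the file can be sketched as below. This is a minimal sketch, assuming that the expected SHA-1 hash of every piece is known from the “.torrent” metadata and that all pieces are held in memory; the function name is hypothetical, and real Downloaders normally write pieces to disk as they arrive.

```python
import hashlib


def reassemble(pieces: dict, piece_hashes: list) -> bytes:
    """Join downloaded pieces in index order after checking each one against its SHA-1."""
    out = bytearray()
    for index, expected in enumerate(piece_hashes):
        data = pieces.get(index)
        if data is None:
            raise ValueError(f"piece {index} is missing")
        if hashlib.sha1(data).digest() != expected:
            raise ValueError(f"piece {index} failed its hash check")
        out += data
    return bytes(out)


content = b"abcdefghij"
piece_length = 4
chunks = [content[i : i + piece_length] for i in range(0, len(content), piece_length)]
piece_hashes = [hashlib.sha1(chunk).digest() for chunk in chunks]
downloaded = {2: chunks[2], 0: chunks[0], 1: chunks[1]}  # pieces may arrive in any order
restored = reassemble(downloaded, piece_hashes)
```

Checking each piece before accepting it is what allows a peer to download safely from strangers: a corrupted or forged piece is simply rejected and requested again.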

Download Examples (with uTorrent)

[Screenshot: download status for a specific torrent/file]

[Screenshot: peers in the swarm]

[Screenshot: peers in the swarm]

[Screenshot: fully downloaded torrent/file]

[Screenshot: peers in the swarm with decoded countries]

[Screenshot: fully downloaded torrent/file in “Seeding” status]

Final remarks

BitTorrent is a very powerful and popular free speech tool.

I will be describing more protocol details in future posts.
