A Minimal BitTorrent Client

July 19, 2025

Writing a BitTorrent client is a perfect weekend project for exploring network protocols. It's complex enough to be interesting, yet simple enough to build from scratch with minimal dependencies.

The challenge isn't understanding how BitTorrent works; there are plenty of excellent guides covering the protocol [1, 2] and its more complex components, like Distributed Hash Tables and Peer Exchange Algorithms. The challenge is knowing where to start and what to implement first. Should you implement magnet links? What about piece prioritization? How much of the protocol do you actually need to implement to get something working?

Instead of another protocol deep-dive, I thought it would be more interesting to showcase one way of building software: start with the absolute minimum that works, then iterate using an appropriate testing strategy that ensures we don't break what we've built so far. In this post, I'll build the simplest possible BitTorrent client: one that can download exactly one piece from one peer under perfect conditions. No error handling, no optimizations, just the bare bones that prove the concept works.

I use literate programming to walk through the code, but you can also check the complete end result on GitHub.

Simplifying the problem

Implementing a fully-featured client is too vague and too much in one go. The first step is to decide which simpler problem to solve. That means deciding which features to ignore for now, and which libraries or pre-existing components to use, either because they simplify the task or because we have no real interest in building our own.

Minimal BitTorrent Protocol

We have to simplify the problem by making a few assumptions so that the problem is no longer "Implementing a BitTorrent Client" but "Implementing a BitTorrent Client that only supports the smallest subset of the protocol while still being able to download something".

Here are the assumptions that I made:

  • Work with a tracker. Not Magnet links or other peer discovery mechanisms.
  • Tracker is available and always has a peer we can download from.
  • Download a single piece from a single file Torrent.
  • Connect to a single peer that has all the content of the Torrent.
  • Assume no disconnection or networking issue.
  • Download only, we don't respond to requests.
  • What we receive is correct; if the hash verification fails, we fail too. No retry.

These are the requirements for what we're going to implement. With that, we can write a leeching client that starts from a MetaInfo file, downloads a small file that fits in one piece, and works under somewhat perfect networking conditions.

If you wanted to simplify things even further, you could assume you know the address of the Peer you're going to download from. That way, you can skip the HTTP request to the Tracker and connect directly to it.

Minimal implementation

Now that we have redefined the problem and have a set of assumptions to work with, here's what we actually need to handle.

  • Ingesting the Torrent file to get the tracker's announce URL and the info_hash of the torrent.
  • Send an HTTP request to the Tracker to get a list of peers to connect to.
  • Connect to a peer and be able to send handshake, interested, request messages, and receive handshake, bitfield, unchoke, and piece.

That's it. Only the piece message really needs processing; for the other messages, we just verify that we receive them and discard them.
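
For reference, the message types involved can be written down as a small enum. This is only a sketch for readability; the code later in this post uses the raw numbers directly, and the names `MessageId`, `SENT` and `EXPECTED` are my own for illustration. The numeric IDs are the ones defined by the BitTorrent specification.

```python
from enum import IntEnum


class MessageId(IntEnum):
    """Peer wire message IDs as defined by the BitTorrent specification."""

    CHOKE = 0
    UNCHOKE = 1
    INTERESTED = 2
    NOT_INTERESTED = 3
    HAVE = 4
    BITFIELD = 5
    REQUEST = 6
    PIECE = 7
    CANCEL = 8


# Messages the minimal client sends and the ones it expects to receive.
# The handshake has no ID: it uses its own fixed 68-byte format.
SENT = (MessageId.INTERESTED, MessageId.REQUEST)
EXPECTED = (MessageId.BITFIELD, MessageId.UNCHOKE, MessageId.PIECE)
```

Because `IntEnum` members compare equal to plain integers, a check like `msg_type != MessageId.PIECE` behaves the same as `msg_type != 7`.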

Mise en place

Using dependencies

Because the starting point is the MetaInfo file (the .torrent file), we need to be able to ingest it. And for that, you have to be able to encode and decode bencoded strings. In my case, I decided to implement it, but you could decide to use libtorrent to parse the MetaInfo file directly, or to use bencode.py to decode it and extract what's needed from it.
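
To give an idea of how little is involved, here is a minimal bencode sketch. It is not the implementation from my repository: the function names `bdecode`, `bencode` and `info_hash` are chosen for illustration, there is no validation of malformed input, and computing the hash by re-encoding the info dictionary assumes the original file was encoded canonically (sorted keys).

```python
import hashlib


def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":  # list: l<items>e
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":  # dictionary: d<key><value>...e
        result, pos = {}, pos + 1
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


def bencode(value) -> bytes:
    """Encode ints, bytes, lists and dicts (with bytes keys) to bencode."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must appear in sorted order.
        body = b"".join(bencode(k) + bencode(value[k]) for k in sorted(value))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(metainfo: bytes) -> bytes:
    """SHA-1 of the bencoded info dictionary, as used in the handshake."""
    meta, _ = bdecode(metainfo)
    return hashlib.sha1(bencode(meta[b"info"])).digest()
```

All strings come back as bytes, so the announce URL would be read with `meta[b"announce"].decode()`.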

As for communication with the Tracker and Peers, we need two things:

  1. A way to make HTTP requests to the tracker's announce URL to at least be able to get a list of Peers. I'm going to use Python's requests module.
  2. A way to write structured data to a socket, since that's how messages are transmitted between peers. It can be fun to roll your own, but in my case, I'm going to use Python's struct module and asyncio streams.

socket is an alternative to asyncio streams, but later I plan on using async/await semantics so might as well use asyncio from the start even if not required for this simple implementation.
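
As a rough sketch of the tracker side, here is how the announce URL could be built and how a compact peer list could be decoded. The helper names `build_announce_url` and `parse_compact_peers` are hypothetical, the port value is a placeholder, and I'm assuming the tracker answers with the compact format (6 bytes per peer: 4 for the IPv4 address, 2 for the port). The HTTP call itself and the bencode decoding of the response are left out.

```python
import struct
from urllib.parse import urlencode


def build_announce_url(announce: str, info_hash: bytes, peer_id: bytes, left: int) -> str:
    """Build the tracker GET URL; raw bytes are percent-encoded by urlencode."""
    query = urlencode({
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": 6881,  # placeholder: this client never accepts connections
        "uploaded": 0,
        "downloaded": 0,
        "left": left,
        "compact": 1,
    })
    return f"{announce}?{query}"


def parse_compact_peers(peers: bytes) -> list[tuple[str, int]]:
    """Split the compact peer string into (ip, port) pairs, 6 bytes per peer."""
    result = []
    # Ignore any trailing bytes that do not form a full 6-byte entry.
    for i in range(0, len(peers) - len(peers) % 6, 6):
        a, b, c, d, port = struct.unpack_from(">4BH", peers, i)
        result.append((f"{a}.{b}.{c}.{d}", port))
    return result
```

The resulting URL can be passed to `requests.get`, and the value under the `peers` key of the decoded response handed to `parse_compact_peers`.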

Test bench

Now that we have defined our minimal client, we need a strategy to validate it.

Because of our simplifying assumptions, we can't test against real torrents in the wild; our client is too simplistic for that. Instead, we'll create a controlled environment with our own tracker and seeding peer.

We'll use Open Tracker and libtorrent for this. They are well-tested tools that we can safely assume work. Any test failures will be ours. The last thing we want is to debug our own tests.

This black-box approach tests the same interface a real user would use. The tests have no knowledge of implementation details, so they stay relevant as we iterate on our client. The trade-off is that we only test the happy path and need logs for debugging when things fail, but that's perfect for a proof of concept.

There are more details in the last part of this article: Testing setup in details, about the environment setup and tests.

SimpleClient Implementation

For the sake of simplicity, all the code can go in a single method. Refrain from doing any error handling, and don't try to consider anything other than the happy path.

I personally find that handling more than the happy path at this stage leads to distraction. You start to fiddle with the code, making it look a bit nicer… And before you know it you're writing a Message Handler Factory and hitting too many boxes on your Gang of Four Bingo card.

At a high-level it looks like this:

class SimpleClient: 
    async def fetch_first_piece(
            self, metainfo_file: str, output: str
    ) -> bool:
        # Ingest the torrent MetaInfo file
        torrent = TorrentInfo.from_file(metainfo_file)
        assert torrent

        # Get one peer
        peers = torrent.get_peers()
        addr, port = peers[0]

        # Open socket
        reader, writer = await asyncio.open_connection(addr, port)

        # BitTorrent message flow
        <<handshake>>
        <<recv-bitfield>>
        <<send-interested>>
        <<recv-unchoke>>
        <<block-size-calculation>>

        # Request individual blocks for the piece
        for i in range(block_count):
            <<send-block-request>>

            while True:
                <<recv-message>>

                if msg_type != 7:
                    <<discard-message>>

                else:
                    <<process-piece>>
                    break

        # Cleanup
        writer.close()

        # Write out the file
        with open(Path(output) / torrent.file, "wb") as f:
            f.write(data)

        return True

Note: I'm using a literate programming style where <<section-name>> represents code blocks that I'll define later. That way I can show the high-level flow first, then dive into the details later.

Handshake

After opening a socket with the peer, we send the opening message. It is 68 bytes long and is laid out as follows (byte ranges are start:end, end exclusive):

0:1 The number 19
1:20 The string BitTorrent protocol
20:28 All zero
28:48 SHA-1 hash of the bencoded info dictionary from the MetaInfo file
48:68 Our Peer ID

We expect to receive a message back in the same format, with the id of the peer we connected to.

handshake
logger.info("Sending 'handshake' message")
writer.write(
    struct.pack(
        ">B19s8x20s20s",
        19,
        b"BitTorrent protocol",
        torrent.info_hash[0],
        MY_PEER_ID.encode(),
    )
)
await writer.drain()

# Wait for response, extract peer id
response = await reader.readexactly(68)
peer_id = response[-20:]
logger.info(f"Connected to {peer_id=}")
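
The snippet above only extracts the peer id. As a sketch of what a stricter client might do, the response can be unpacked with the same layout and checked against the torrent we asked for. `parse_handshake` is a hypothetical helper, not part of the client in this post.

```python
import struct

PSTR = b"BitTorrent protocol"


def parse_handshake(response: bytes, expected_info_hash: bytes) -> bytes:
    """Validate a 68-byte handshake and return the remote peer ID."""
    # Same layout as the outgoing message, but the 8 reserved bytes are
    # read ("8s") instead of being written as zero padding ("8x").
    pstrlen, pstr, _reserved, info_hash, peer_id = struct.unpack(
        ">B19s8s20s20s", response
    )
    if pstrlen != 19 or pstr != PSTR:
        raise ValueError("not a BitTorrent handshake")
    if info_hash != expected_info_hash:
        raise ValueError("peer answered for a different torrent")
    return peer_id
```

The reserved bytes are ignored here; peers use them to advertise protocol extensions, which this minimal client does not support.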

Bitfield

The very first message after the handshake is a bitfield message, where the peer tells us what pieces it has available.

One of our assumptions is that the peer has the full file we're after, so we can just discard this message.

A more complete implementation would have to process it to decide if the peer has any piece you're interested in and ask for it. You also get updates when the peer has new pieces so you'd need to keep track of that. We won't for now.

recv-bitfield
logger.info("Expecting 'bitfield' message")
msg = await reader.readexactly(5)
msg_len, msg_type = struct.unpack(">IB", msg)

if msg_type != 5:  # Bitfield message is type 5
    raise Exception(f"Expected Bitfield got {msg_type=}, {msg_len=}")
logger.info("Received 'bitfield' message")
await reader.readexactly(msg_len - 1)
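
If we did want to look inside the bitfield, the check is short: each bit stands for one piece, with the high bit of the first byte corresponding to piece 0. The helper below, `has_piece`, is a hypothetical sketch and is not used by the client in this post.

```python
def has_piece(bitfield: bytes, index: int) -> bool:
    """Return True if the bit for piece `index` is set.

    The high bit of the first byte corresponds to piece 0.
    """
    byte_index, bit_offset = divmod(index, 8)
    if byte_index >= len(bitfield):
        return False
    return bool(bitfield[byte_index] & (0x80 >> bit_offset))
```

To use it, the payload read by `reader.readexactly(msg_len - 1)` would be kept instead of thrown away, then queried with `has_piece(payload, 0)`.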

Interested and unchoke

Now we have to tell the peer that we're interested in what it has.

send-interested
logger.info("Sending 'interested' message")
writer.write(struct.pack(">IB", 1, 2))
await writer.drain()

And in return it should unchoke us so that we're able to send requests.

recv-unchoke
logger.info("Expecting 'unchoke' message")
msg = await reader.readexactly(5)
msg_len, msg_type = struct.unpack(">IB", msg)

if msg_len != 1 or msg_type != 1:  # Unchoke is type 1, with no payload
    raise Exception(f"Expected 'unchoke' got {msg_type=}, {msg_len=}")

logger.info("Received 'unchoke' message")

At this point we're now ready to actually make requests for the blocks making up a piece.

Requesting blocks

Pieces are defined in the torrent file. Their size is typically a power of two (16 KiB, 32 KiB, ..., 1 MiB, ...). The MetaInfo file contains the SHA-1 hash of each piece for verification. A piece is the smallest unit that can be verified, the building block of the protocol.

But by convention we don't request the full piece in one chunk; we break it down into smaller blocks, usually 16 KiB. You could use something smaller, but not bigger, as the BitTorrent Spec explains:

All current implementations use 2^14 (16 kiB), and close connections which request an amount greater than that.

What we're most likely to get wrong is the size of the last block. If we can't break the piece into multiple blocks of equal size, the last block will be of a different size. Getting this wrong will prevent us from making a well-formed request and the peer might cut us off.

Later on, the same kind of consideration has to be made for the last piece of a torrent. It might not be the same size as the other pieces.

block-size-calculation
block_size = 16 * 2**10

q, r = divmod(torrent.piece_length, block_size)
if r > 0:
    block_count = q + 1
    last_block_size = r
else:
    block_count = q
    last_block_size = block_size

# Allocate the bytearray we're going to write the piece data to
data = bytearray(torrent.piece_length)

logger.info(f"{block_count=}, {block_size=}, {last_block_size=}")
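
The same arithmetic can be expressed as a function that returns every (offset, size) pair up front, which makes the last-block case easy to check by hand. This is an alternative sketch, not the code used in the client; `block_layout` and `piece_size` are names I made up for illustration.

```python
def block_layout(piece_length: int, block_size: int = 16 * 2**10) -> list[tuple[int, int]]:
    """Return the (offset, size) of every block needed to cover one piece."""
    return [
        (offset, min(block_size, piece_length - offset))
        for offset in range(0, piece_length, block_size)
    ]


def piece_size(total_length: int, piece_length: int, index: int) -> int:
    """Size of piece `index`; only the last piece can be shorter."""
    return min(piece_length, total_length - index * piece_length)


# A 40,000-byte piece needs two full blocks and a 7,232-byte last block.
print(block_layout(40_000))
# [(0, 16384), (16384, 16384), (32768, 7232)]
```

For the last piece of a torrent, `piece_size` gives the length to pass to `block_layout` instead of the nominal piece length.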

Now we're ready to send the individual requests. A 'real' client would request multiple blocks at the same time, a technique called pipelining, so as to saturate the connection. But we're going for simple, so we'll make the block requests sequentially.

send-block-request
offset = i * block_size
size = block_size if i < block_count - 1 else last_block_size

logger.info(f"Asking for {i=} of {block_count=}, size={size / (2**10)} kb")

writer.write(struct.pack(">IBIII", 13, 6, 0, offset, size))
await writer.drain()

Receiving blocks

For each block request we send, we should get back a piece message with the content, so we loop until we have everything we need. Sometimes the peer will also send other messages in between, like unchoke. To avoid being thrown off, we check that we are receiving a piece message and discard anything else.

First, get the first 5 bytes so we can check message length and type.

recv-message
msg = await reader.readexactly(5)
msg_len, msg_type = struct.unpack(">IB", msg)

If the msg_type is not 7, i.e. not piece, we consume it.

discard-message
logger.info(f"received {msg_len=}, {msg_type=}")
if msg_len > 1:
    await reader.readexactly(msg_len - 1)
    logger.info("Consumed message")
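
One case the fixed 5-byte read does not cover is the keep-alive message, which has a length of 0 and no type byte at all. It doesn't show up in my test runs, but a slightly more defensive reader could look like the sketch below. `read_message` is a hypothetical helper that reads the length prefix first and only then the body; it is not part of the client in this post.

```python
import asyncio
import struct
from typing import Optional, Tuple


async def read_message(reader: asyncio.StreamReader) -> Tuple[Optional[int], bytes]:
    """Read one length-prefixed message; return (message_id, payload).

    A keep-alive has length 0 and no ID byte, so it comes back as (None, b"").
    """
    (length,) = struct.unpack(">I", await reader.readexactly(4))
    if length == 0:
        return None, b""
    body = await reader.readexactly(length)
    return body[0], body[1:]
```

With this helper, the receive loop would skip every message whose ID is not 7 without having to consume the payload separately.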

If it is a piece message, we get the payload and write it to our bytearray.

process-piece
piece_data = await reader.readexactly(8)  # index (4) + begin (4)
piece_idx, bgn = struct.unpack(">II", piece_data)
block_data = await reader.readexactly(msg_len - 9)
data[bgn : bgn + len(block_data)] = block_data

logger.info(
    f"Received block message: {msg_type=}, {msg_len=}, {piece_idx=}, offset={bgn}"
)
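
One of the assumptions listed earlier is that we fail if hash verification fails, and the snippets shown here don't include that check. A sketch of what it could look like, assuming the raw `pieces` string from the info dictionary is available (the helper names are hypothetical):

```python
import hashlib


def split_piece_hashes(pieces: bytes) -> list[bytes]:
    """Split the MetaInfo `pieces` string into one 20-byte SHA-1 digest per piece."""
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def verify_piece(data: bytes, expected: bytes) -> bool:
    """Compare the SHA-1 digest of the downloaded piece with the expected one."""
    return hashlib.sha1(data).digest() == expected
```

In the client, this would be called once all blocks have been received, for instance `verify_piece(data, split_piece_hashes(pieces)[0])`, raising an exception if it returns False.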

And that's it! We don't need more than that for the simple case.

We can run this and verify that we get what we expect:

tests/test_bittorent.py::test_simple_download 
----------------------------------------------------------- live log call -----------------------------------------------------------
INFO     root:test_bittorent.py:98 Attempting to fetch first piece with our client
INFO     root:client.py:34 Sent 'handshake' message
INFO     root:client.py:39 Connected to peer_id=b'-LT20B0-gXH4yLS3!G6j'
INFO     root:client.py:42 Expecting 'bitfield' message
INFO     root:client.py:48 Received 'bitfield' message
INFO     root:client.py:54 Sent 'interested' message
INFO     root:client.py:57 Expecting 'unchoke' message
INFO     root:client.py:63 Received 'unchoke' message
INFO     root:client.py:81 block_count=1, block_size=16384, last_block_size=16384
INFO     root:client.py:87 Asking for i=0 of block_count=1, size=16.0 kb
INFO     root:client.py:99 received msg_len=1, msg_type=1
INFO     root:client.py:109 Received block message: msg_type=7, msg_len=16393, piece_idx=0, offset=0
INFO     root:test_bittorent.py:104 Successfully fetched first piece!
INFO     root:test_bittorent.py:115 Original checksum: 1bd4db450abc8914c2fac721cace2704ff4c16028e6d07293154dad289835694
INFO     root:test_bittorent.py:116 Downloaded checksum: 1bd4db450abc8914c2fac721cace2704ff4c16028e6d07293154dad289835694
PASSED

And that's our Simple BitTorrent Client, ~120 lines of Python. No error handling, no optimizations, just the absolute minimum to be able to download a single-file, single-piece Torrent.

Now we have a starting point for building more complex features.

Testing setup in details

Below are a few more details about the testing setup. I start Open Tracker manually, not as part of the test, simply because I kept running into permission issues, maybe because the tests are run from a virtual environment. I didn't have time to figure it out.

Also, the code below was generated with the help of an LLM, as a convenience because I'm not familiar with libtorrent.

This is, roughly, what our test code is going to look like:

async def test_bittorent_download(workspace: str):
    <<create-payload>>
    <<create-torrent>>
    <<create-mock-peer>>

    client = SimpleClient()
    result = await client.fetch_first_piece(torrent_file, download_dir)

    # Check that the payload size, and checksum match.
    <<verify-result>> 

Open Tracker

We can run our own instance of Open Tracker using a Docker image:

docker run -d --name opentracker -p 8080:6969 lednerb/opentracker-docker

and it will be ready for us at http://localhost:8080.

Metainfo file

We can use libtorrent to generate the MetaInfo file referencing our local tracker and whatever payload we are going to try to download.

I've made the choice to create the file as part of the test instead of crafting a static one that I then use during the tests. This makes it easier to have multiple tests with different payloads.

def create_torrent_file(payload_file: str, tracker: str, workspace: str) -> str:
    """Create the torrent file for the content of the payload in the workspace"""
    payload_path = Path(workspace) / payload_file

    fs = lt.file_storage()
    lt.add_files(fs, str(payload_path))

    t = lt.create_torrent(fs)
    t.add_tracker(tracker)
    t.set_creator("test-setup")

    lt.set_piece_hashes(t, str(payload_path.parent))
    torrent_data = lt.bencode(t.generate())

    torrent_path = payload_path.with_suffix(".torrent")
    with open(torrent_path, "wb") as f:
        f.write(torrent_data)

    logger.debug(f"Torrent file: {str(torrent_path)}")

    return str(torrent_path)

Mock peer

Again, we can use libtorrent to create a mock peer and give it the torrent we set up for our test. It will have the full payload in its workspace, ready to seed.

We can use this to also create multiple peers, each using a different port.

def create_mock_peer(port: int, payload: str, torrent: str, workspace_dir: str):
    """Create a single peer bound to port, seeding"""
    settings = {
        "listen_interfaces": f"0.0.0.0:{port}",
        "enable_dht": False,
        "enable_lsd": False,
    }

    session = lt.session(settings)
    info = lt.torrent_info(torrent)

    # Create the temp directory, and copy the asset to it
    peer_dir = Path(workspace_dir) / f"peer_{port}"
    peer_dir.mkdir(exist_ok=True)

    if payload:
        shutil.copy(payload, peer_dir)

    lt_params = {
        "ti": info,
        "save_path": str(peer_dir),
    }

    handle = session.add_torrent(lt_params)
    sessions.append(session)

    return {
        "port": port,
        "dir": str(peer_dir),
        "session": session,
        "handle": handle,
    }

With this in place, we can write a test case in which our client connects to a tracker, gets a peer, downloads a single piece from it, and verifies that the content matches what we expect.

async def test_simple_download(workspace, create_mock_peer):
    """Test downloading first piece from a single seeder using our Client"""
    # Create a small payload (smaller than default piece size)
    payload_file = create_payload(workspace, 16 * 2**10)
    torrent_file = create_torrent_file(payload_file, TRACKER_URL, workspace)

    # Create a seeding peer
    create_mock_peer(6881, payload_file, torrent_file, workspace)

    # Give the seeder time to start up and connect to tracker
    await asyncio.sleep(2)

    # Create download directory
    download_dir = Path(workspace) / "download"
    download_dir.mkdir(exist_ok=True)

    # Attempt to fetch the first piece
    logger.info("Attempting to fetch first piece with our client")
    client = SimpleClient()
    result = await client.fetch_first_piece(torrent_file, str(download_dir))

    # Verify we successfully connected and fetched without errors
    assert result, "Client should have successfully fetched the first piece"
    logger.info("Successfully fetched first piece!")

    # Verify the downloaded file exists
    downloaded_file = download_dir / Path(payload_file).name

    assert downloaded_file.exists(), f"Downloaded file should exist at {downloaded_file}"

    # Calculate checksums for comparison
    original_checksum = calculate_sha256(payload_file)
    downloaded_checksum = calculate_sha256(str(downloaded_file))

    logger.info(f"Original checksum: {original_checksum}")
    logger.info(f"Downloaded checksum: {downloaded_checksum}")

    assert original_checksum == downloaded_checksum, "Downloaded file checksum should match original"

    # Verify file size matches
    original_size = Path(payload_file).stat().st_size
    downloaded_size = downloaded_file.stat().st_size
    assert original_size == downloaded_size, f"File sizes should match: {original_size} != {downloaded_size}"
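
The test relies on two small helpers that aren't shown above, create_payload and calculate_sha256. The sketch below is one plausible way to write them, not necessarily what's in the repository: the default file name and the choice of random bytes are assumptions.

```python
import hashlib
import os
from pathlib import Path


def create_payload(workspace: str, size: int, name: str = "payload.bin") -> str:
    """Write `size` random bytes into the workspace and return the file path."""
    path = Path(workspace) / name
    path.write_bytes(os.urandom(size))
    return str(path)


def calculate_sha256(path: str) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Random content makes it unlikely that a test passes by accident, for example against a file of zeros left over from a preallocated bytearray.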

Finally, I like to add a test that exercises the setup itself, as a quick way to make sure it works as expected.

def test_setup(workspace, create_mock_peer):
    """Test the testing setup by creating a peer and make it download the torrent"""
    payload_file = copy_payload("image.png", ASSETS_DIR, workspace)
    torrent_file = create_torrent_file(payload_file, TRACKER_URL, workspace)

    seeding_peers = [
        create_mock_peer(port, payload_file, torrent_file, workspace)
        for port in [6100, 6101, 6102]
    ]
    logger.debug(f"{seeding_peers=}")

    time.sleep(2)
    assert Path(torrent_file).exists()

    leech = create_mock_peer(6881, None, torrent_file, workspace)
    logger.debug("Downloading...")

    # Download until complete
    h = leech["handle"]

    while not h.status().is_seeding:
        s = h.status()
        logger.debug(
            f"Progress: {s.progress * 100:.1f}% "
            f"Down: {s.download_rate / 1000:.1f} kB/s "
            f"Peers: {s.num_peers}"
        )
        time.sleep(1)

    logger.debug("Download complete!")
    assert True

Footnotes