A Minimal BitTorrent Client
July 19, 2025
Writing a BitTorrent client is a perfect weekend project for exploring network protocols. It's complex enough to be interesting, yet simple enough to build from scratch with minimal dependencies.
The challenge isn't understanding how BitTorrent works; there are plenty of excellent guides covering the protocol [1, 2], as well as more complex components like Distributed Hash Tables and Peer Exchange Algorithms. The challenge is knowing where to start and what to implement first. Should you implement magnet links? What about piece prioritization? How much of the protocol do you actually need to implement to get something working?
Instead of another protocol deep-dive, I thought it would be more interesting to showcase one way of building software: start with the absolute minimum that works, then iterate using an appropriate testing strategy that ensures we don't break what we've built so far. In this post, I'll build the simplest possible BitTorrent client: one that can download exactly one piece from one peer under perfect conditions. No error handling, no optimizations, just the bare bones that prove the concept works.
I use literate programming to go over the code but you can also check the complete end-result on GitHub.
Simplifying the problem
Implementing a fully-featured client is too vague and too much in one go. The first step is to decide which simpler problem to solve. That means deciding which features to ignore for now, and which libraries or pre-existing components to use, either to simplify the task or because we have no real interest in building our own.
Minimal BitTorrent Protocol
We have to simplify the problem by making a few assumptions so that the problem is no longer "Implementing a BitTorrent Client" but "Implementing a BitTorrent Client that only supports the smallest subset of the protocol while still being able to download something".
Here are the assumptions that I made:
- Work with a tracker. Not Magnet links or other peer discovery mechanisms.
- Tracker is available and always has a peer we can download from.
- Download a single piece from a single file Torrent.
- Connect to a single peer that has all the content of the Torrent.
- Assume no disconnection or networking issue.
- Download only, we don't respond to requests.
- What we receive is correct; if the hash verification fails, we fail too. No retry.
These are the requirements for what we're going to implement. With that, we can write a leeching client that starts from a MetaInfo file, downloads a small file that fits in one piece, and works under somewhat perfect networking conditions.
If you wanted to simplify things even further, you could assume you know the address of the Peer you're going to download from. That way, you can skip the HTTP request to the Tracker and connect directly to it.
Minimal implementation
Now that we have redefined the problem and have a set of assumptions to work with, here's what we actually need to handle.
- Ingest the Torrent file to get the tracker's announce URL and the info_hash of the torrent.
- Send an HTTP request to the Tracker to get a list of peers to connect to.
- Connect to a peer and be able to send handshake, interested, and request messages, and receive handshake, bitfield, unchoke, and piece.
That's it, and it's really only the piece message we need to process; for the other messages, we can just verify that we receive them and discard them without further processing.
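Apart from the handshake, the messages listed above share the same framing: a 4-byte big-endian length prefix, a 1-byte message id, then an optional payload. As a minimal sketch of that framing, the ids below are the ones from the BitTorrent spec, while the MessageId enum and the encode_message helper are illustrative names of mine and not part of the client built in this post:

```python
import struct
from enum import IntEnum


class MessageId(IntEnum):
    """Peer message ids as defined in the BitTorrent spec."""
    CHOKE = 0
    UNCHOKE = 1
    INTERESTED = 2
    NOT_INTERESTED = 3
    HAVE = 4
    BITFIELD = 5
    REQUEST = 6
    PIECE = 7
    CANCEL = 8


def encode_message(msg_id: MessageId, payload: bytes = b"") -> bytes:
    """Frame a message as <length:4><id:1><payload>; length counts id + payload."""
    return struct.pack(">IB", len(payload) + 1, msg_id) + payload


# 'interested' has no payload; 'request' carries index, begin, and length
interested = encode_message(MessageId.INTERESTED)
request = encode_message(MessageId.REQUEST, struct.pack(">III", 0, 0, 16 * 2**10))
```

The rest of this post writes the numeric ids inline to keep each snippet short, which is fine when there are only a handful of them.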
Mise en place
Using dependencies
Because the starting point is the MetaInfo file (the .torrent file), we need to be able to ingest it. And for that, you have to be able to encode and decode bencoded strings. In my case, I decided to implement it, but you could decide to use libtorrent to parse the MetaInfo file directly, or to use bencode.py to decode it and extract what's needed from it.
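To give an idea of how little is involved, here is a rough sketch of a bencode decoder and encoder, plus the info_hash computation. This is not the implementation from the repository; the function names are mine, and re-encoding the info dictionary assumes the original file was written canonically (dictionary keys sorted), which is what the spec requires.

```python
import hashlib


def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at data[i]; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":  # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":  # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            result[key], i = bdecode(data, i)
        return result, i + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", i)
    start = colon + 1
    length = int(data[i:colon])
    return data[start:start + length], start + length


def bencode(value) -> bytes:
    """Encode ints, bytes, lists and dicts (keys sorted, as the spec requires)."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        body = b"".join(bencode(k) + bencode(v) for k, v in sorted(value.items()))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(metainfo: bytes) -> bytes:
    """SHA-1 of the bencoded 'info' dictionary (assumes canonical encoding)."""
    decoded, _ = bdecode(metainfo)
    return hashlib.sha1(bencode(decoded[b"info"])).digest()
```

A stricter version would hash the exact byte span of the info dictionary in the file instead of re-encoding it, so that a non-canonical file still produces the hash its creator intended.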
As for communication with the Tracker and Peers, we need two things:
- A way to make HTTP requests to the tracker's announce URL to at least be able to get a list of Peers. I'm going to use Python's requests module.
- Write structured data to a socket, since that's how messages are transmitted between peers. It can be fun to roll your own, but in my case, I'm going to use Python's struct and asyncio streams libraries.
socket is an alternative to asyncio streams, but I plan on using async/await semantics later, so I might as well use asyncio from the start even if it isn't required for this simple implementation.
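For the tracker side, the two fiddly parts are percent-encoding the raw info_hash bytes in the query string and decoding the peer list that comes back. Here is a sketch of both using only the standard library; the function names are mine, and it assumes the tracker answers with the compact peer format (6 bytes per IPv4 peer), which is an assumption about the tracker rather than something this post verifies.

```python
import struct
from urllib.parse import quote, urlencode


def build_announce_url(announce: str, info_hash: bytes, peer_id: bytes,
                       port: int, left: int) -> str:
    """Build the GET URL for a tracker announce request."""
    query = urlencode(
        {
            "info_hash": info_hash,  # raw 20 bytes, percent-encoded below
            "peer_id": peer_id,
            "port": port,
            "uploaded": 0,
            "downloaded": 0,
            "left": left,
            "compact": 1,
        },
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{announce}?{query}"


def parse_compact_peers(peers: bytes) -> list[tuple[str, int]]:
    """Decode a compact peer list: 4 bytes of IPv4 address + 2 bytes of port."""
    result = []
    for i in range(0, len(peers) - 5, 6):
        ip_bytes, port = struct.unpack(">4sH", peers[i:i + 6])
        result.append((".".join(str(b) for b in ip_bytes), port))
    return result
```

With requests, the URL would be fetched with requests.get(url), and the bencoded response body decoded to reach its peers entry before handing it to parse_compact_peers.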
Test bench
Now that we have defined our minimal client, we need a strategy to validate it.
Because of our simplifying assumptions, we can't test against real torrents in the wild; our client is too simplistic for that. Instead, we'll create a controlled environment with our own tracker and seeding peer.
We'll use Open Tracker and libtorrent for this. They are well-tested tools that we can safely assume work. Any test failures will be ours. The last thing we want is to debug our own tests.
This black-box approach tests the same interface a real user would use, with no knowledge of implementation details, so the tests stay relevant as we iterate on our client. The trade-off is that we only test the happy path and need logs for debugging when things fail, but that's perfect for a proof of concept.
There are more details about the environment setup and tests in the last part of this article, Testing setup in details.
SimpleClient Implementation
For the sake of simplicity, all the code can go in a single method. Refrain from doing any error handling or considering anything other than the happy path.
I personally find that doing so can lead to distraction. You start to fiddle with the code, making it look a bit nicer… And before you know it you're writing a Message Handler Factory and hitting too many boxes on your Gang of Four Bingo card.
At a high level it looks like this:
    class SimpleClient:
        async def fetch_first_piece(
            self, metainfo_file: str, output: str
        ) -> bool:
            # Ingest the torrent MetaInfo file
            torrent = TorrentInfo.from_file(metainfo_file)
            assert torrent

            # Get one peer
            peers = torrent.get_peers()
            addr, port = peers[0]

            # Open socket
            reader, writer = await asyncio.open_connection(addr, port)

            # BitTorrent message flow
            <<handshake>>
            <<recv-bitfield>>
            <<send-interested>>
            <<recv-unchoke>>

            <<block-size-calculation>>

            # Request individual blocks for the piece
            for i in range(block_count):
                <<send-block-request>>
                while True:
                    <<recv-message>>
                    if msg_type != 7:
                        <<discard-message>>
                    else:
                        <<process-piece>>
                        break

            # Cleanup
            writer.close()

            # Write out the file
            with open(Path(output) / torrent.file, "wb") as f:
                f.write(data)

            return True
Note: I'm using a literate programming style where <<section-name>> represents code blocks that I'll define later. That way I can show the high-level flow first, then dive into the details later.
Handshake
After opening a socket with the peer, we send the opening message. It is 68 bytes long, and is laid out as follows:
Bytes | Content |
0:1 | The number 19 |
1:20 | The string BitTorrent protocol |
20:28 | All zero (reserved bytes) |
28:48 | SHA-1 hash of the bencoded info dictionary from the MetaInfo file |
48:68 | Our Peer ID |
We expect to receive a message back in the same format, with the id of the peer we connected to.
    logger.info("Sending 'handshake' message")
    writer.write(
        struct.pack(
            ">B19s8x20s20s",
            19,
            b"BitTorrent protocol",
            torrent.info_hash[0],
            MY_PEER_ID.encode(),
        )
    )
    await writer.drain()

    # Wait for response, extract peer id
    response = await reader.readexactly(68)
    peer_id = response[-20:]
    logger.info(f"Connected to {peer_id=}")
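The snippet above trusts the reply and only extracts the peer id. If you later want to check the reply, a sketch could look like the following. The helper names are hypothetical and not part of the client in this post; the layout is the 68-byte one from the table above.

```python
import struct

PSTR = b"BitTorrent protocol"
# length byte, protocol string, reserved bytes, info_hash, peer id: 68 bytes
HANDSHAKE = struct.Struct(">B19s8s20s20s")


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """Pack the 68-byte handshake with all reserved bytes set to zero."""
    return HANDSHAKE.pack(len(PSTR), PSTR, bytes(8), info_hash, peer_id)


def parse_handshake(response: bytes, expected_info_hash: bytes) -> bytes:
    """Validate a handshake reply and return the remote peer id."""
    pstrlen, pstr, _reserved, info_hash, peer_id = HANDSHAKE.unpack(response)
    if pstrlen != len(PSTR) or pstr != PSTR:
        raise ValueError("not a BitTorrent handshake")
    if info_hash != expected_info_hash:
        raise ValueError("peer answered for a different torrent")
    return peer_id
```

Comparing the info_hash is the useful part: it is the only thing in the handshake that ties the connection to the torrent we asked for.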
Bitfield
The very first message after the handshake is a bitfield message, where the peer tells us what pieces it has available.
One of our assumptions is that the peer has the full file we're after, so we can just discard this message.
A more complete implementation would have to process it to decide if the peer has any piece you're interested in and ask for it. You also get updates when the peer has new pieces so you'd need to keep track of that. We won't for now.
    logger.info("Expecting 'bitfield' message")
    msg = await reader.readexactly(5)
    msg_len, msg_type = struct.unpack(">IB", msg)
    if msg_type != 5:  # Bitfield message is type 5
        raise Exception(f"Expected Bitfield got {msg_type=}, {msg_len=}")
    logger.info("Received 'bitfield' message")
    await reader.readexactly(msg_len - 1)
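For reference, processing the bitfield instead of discarding it is mostly bit twiddling: the high bit of the first byte stands for piece 0, the next bit for piece 1, and so on. A small sketch, with a function name of my own choosing:

```python
def pieces_from_bitfield(bitfield: bytes, piece_count: int) -> set[int]:
    """Return the indices of the pieces the peer claims to have.

    The high bit of the first byte is piece 0, the next bit is piece 1, etc.
    Spare bits at the end of the last byte are ignored.
    """
    have = set()
    for index in range(piece_count):
        byte = bitfield[index // 8]
        if byte & (0x80 >> (index % 8)):
            have.add(index)
    return have
```

The piece count comes from the MetaInfo file (the length of the pieces field divided by 20), since the bitfield itself is padded to a whole number of bytes.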
Interested and unchoke
Now we have to tell the peer that we're interested in what it has.
    logger.info("Sending 'interested' message")
    writer.write(struct.pack(">IB", 1, 2))
    await writer.drain()
And in return it should unchoke us so that we're able to send requests.
    logger.info("Expecting 'unchoke' message")
    msg = await reader.readexactly(5)
    msg_len, msg_type = struct.unpack(">IB", msg)
    # Unchoke is type 1 with no payload; reject if either field is off
    if msg_len != 1 or msg_type != 1:
        raise Exception(f"Expected 'unchoke' got {msg_type=}, {msg_len=}")
    logger.info("Received 'unchoke' message")
At this point we're now ready to actually make requests for the blocks making up a piece.
Requesting blocks
Pieces are defined in the torrent file, and their size is typically a power of two (16 KiB, 32 KiB, ..., 1 MiB, ...). The MetaInfo file contains the SHA-1 hash of each piece for verification. A piece is the smallest unit that can be verified, the building block of the protocol.
But by convention we don't request the full piece in one chunk; we break it down into smaller blocks, usually 16 KiB. You could use something smaller, but not bigger, as the BitTorrent Spec explains:
All current implementations use 2^14 (16 kiB), and close connections which request an amount greater than that.
What we're most likely to get wrong is the size of the last block. If we can't break the piece into multiple blocks of equal size, the last block will be of a different size. Getting this wrong will prevent us from making a well-formed request and the peer might cut us off.
Later on, the same kind of consideration has to be made for the last piece of a torrent. It might not be the same size as the other pieces.
    block_size = 16 * 2**10
    q, r = divmod(torrent.piece_length, block_size)
    if r > 0:
        block_count = q + 1
        last_block_size = r
    else:
        block_count = q
        last_block_size = block_size

    # Allocate the bytearray we're going to write the piece data to
    data = bytearray(torrent.piece_length)
    logger.info(f"{block_count=}, {block_size=}, {last_block_size=}")
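To make the last-block arithmetic concrete, here is the same calculation written as a small generator, together with the equivalent rule for the last piece of a torrent. Both function names are illustrative and are not used by the client in this post.

```python
from typing import Iterator

BLOCK_SIZE = 16 * 2**10


def block_ranges(piece_len: int, block_size: int = BLOCK_SIZE) -> Iterator[tuple[int, int]]:
    """Yield (offset, size) for every block of a piece; the last may be shorter."""
    for offset in range(0, piece_len, block_size):
        yield offset, min(block_size, piece_len - offset)


def size_of_piece(index: int, total_length: int, piece_length: int) -> int:
    """Size of piece `index`; only the last piece of a torrent can be shorter."""
    return min(piece_length, total_length - index * piece_length)


# A 40,000-byte piece splits into two full blocks and a 7,232-byte remainder
print(list(block_ranges(40_000)))
# → [(0, 16384), (16384, 16384), (32768, 7232)]
```

Each (offset, size) pair maps directly onto the begin and length fields of a request message.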
Now we're ready to send the individual requests. A 'real' client would request multiple blocks at the same time, a technique called pipelining, so as to saturate the connection. But we're going for simple, so we'll make the block requests sequentially.
    offset = i * block_size
    size = block_size if i < block_count - 1 else last_block_size
    logger.info(f"Asking for {i=} of {block_count=}, size={size / (2**10)} kb")
    writer.write(struct.pack(">IBIII", 13, 6, 0, offset, size))
    await writer.drain()
Receiving blocks
For each block request we sent, we should get back a piece message with the content. So we're looping until we get all we need. Sometimes the peer will also send other messages in between, like unchoke, so to not be thrown off by them we'll check that we are receiving a piece message and discard anything else.
First, get the first 5 bytes so we can check message length and type.
    msg = await reader.readexactly(5)
    msg_len, msg_type = struct.unpack(">IB", msg)
If the msg_type is not 7, i.e. not piece, we consume it.
    logger.info(f"received {msg_len=}, {msg_type=}")
    if msg_len > 1:
        await reader.readexactly(msg_len - 1)
    logger.info("Consumed message")
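One caveat with reading five bytes up front: a keep-alive message is a bare length prefix of 0 with no type byte, so the fifth byte would actually belong to the next message. Under our perfect-conditions assumptions this doesn't come up, but a later iteration could read the length first. A sketch, where read_message is a hypothetical helper and not part of the client in this post:

```python
import asyncio
import struct
from typing import Optional


async def read_message(reader: asyncio.StreamReader) -> tuple[Optional[int], bytes]:
    """Read one length-prefixed message and return (msg_type, payload).

    A keep-alive has length 0 and no type byte; it is reported as (None, b"").
    """
    (msg_len,) = struct.unpack(">I", await reader.readexactly(4))
    if msg_len == 0:
        return None, b""
    body = await reader.readexactly(msg_len)
    return body[0], body[1:]
```

This reads the whole payload into memory before returning, which is fine for 16 KiB blocks.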
If it is a piece message, we get the payload and write it to our bytearray.
    piece_data = await reader.readexactly(8)  # index (4) + begin (4)
    piece_idx, bgn = struct.unpack(">II", piece_data)
    block_data = await reader.readexactly(msg_len - 9)
    data[bgn : bgn + len(block_data)] = block_data
    logger.info(
        f"Received block message: {msg_type=}, {msg_len=}, {piece_idx=}, offset={bgn}"
    )
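Once every block of the piece is in the bytearray, the piece can be checked against the hash from the MetaInfo file, as per our "if the hash verification fails, we fail too" assumption. A sketch of that check, where verify_piece is an illustrative name; it relies on the pieces value of the info dictionary being the concatenation of one 20-byte SHA-1 digest per piece:

```python
import hashlib


def verify_piece(data: bytes, pieces: bytes, index: int) -> bool:
    """Compare the SHA-1 of `data` with the expected hash of piece `index`.

    `pieces` is the raw `pieces` value of the info dictionary: one 20-byte
    SHA-1 digest per piece, concatenated in piece order.
    """
    expected = pieces[index * 20:(index + 1) * 20]
    return hashlib.sha1(data).digest() == expected
```

The data passed in has to be exactly the bytes of the piece, so for a file shorter than the piece length the buffer needs to be trimmed to the file size before hashing.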
And that's it! We don't need more than that for the simple case.
We can run this and verify that we get what we expect:
    tests/test_bittorent.py::test_simple_download
    ----------------------------------------------------------- live log call -----------------------------------------------------------
    INFO root:test_bittorent.py:98 Attempting to fetch first piece with our client
    INFO root:client.py:34 Sent 'handshake' message
    INFO root:client.py:39 Connected to peer_id=b'-LT20B0-gXH4yLS3!G6j'
    INFO root:client.py:42 Expecting 'bitfield' message
    INFO root:client.py:48 Received 'bitfield' message
    INFO root:client.py:54 Sent 'interested' message
    INFO root:client.py:57 Expecting 'unchoke' message
    INFO root:client.py:63 Received 'unchoke' message
    INFO root:client.py:81 block_count=1, block_size=16384, last_block_size=16384
    INFO root:client.py:87 Asking for i=0 of block_count=1, size=16.0 kb
    INFO root:client.py:99 received msg_len=1, msg_type=1
    INFO root:client.py:109 Received block message: msg_type=7, msg_len=16393, piece_idx=0, offset=0
    INFO root:test_bittorent.py:104 Successfully fetched first piece!
    INFO root:test_bittorent.py:115 Original checksum: 1bd4db450abc8914c2fac721cace2704ff4c16028e6d07293154dad289835694
    INFO root:test_bittorent.py:116 Downloaded checksum: 1bd4db450abc8914c2fac721cace2704ff4c16028e6d07293154dad289835694
    PASSED
And that's our Simple BitTorrent Client, ~120 lines of Python. No error handling, no optimizations, just the absolute minimum to be able to download a single-file, single-piece Torrent.
Now we have a starting point for building more complex features.
Testing setup in details
Below are a few more details about the testing setup. I start Open Tracker manually, not as part of the test, simply because I kept running into permission issues, maybe because the tests are run from a virtual environment. I didn't have time to figure it out.
Also, the code below was generated with the help of an LLM, as a convenience because I'm not familiar with libtorrent.
This is, roughly, what our test code is going to look like:
    async def test_bittorent_download(workspace: str):
        <<create-payload>>
        <<create-torrent>>
        <<create-mock-peer>>

        client = SimpleClient()
        result = await client.fetch_first_piece(torrent_file, download_dir)

        # Check that the payload size, and checksum match.
        <<verify-result>>
Open Tracker
We can run our own instance of Open Tracker using a Docker image:
    docker run -d --name opentracker -p 8080:6969 lednerb/opentracker-docker
and it will be ready for us at http://localhost:8080.
Metainfo file
We can use libtorrent to generate the MetaInfo file referencing our local tracker and whatever payload we are going to try to download.
I've made the choice to create the file as part of the test instead of crafting a static one that I then use during the tests. This makes it easier to have multiple tests with different payloads.
    from pathlib import Path

    import libtorrent as lt


    def create_torrent_file(payload_file: str, tracker: str, workspace: str) -> str:
        """Create the torrent file for the content of the payload in the workspace"""
        payload_path = Path(workspace) / payload_file

        # Describe the payload and attach our local tracker
        fs = lt.file_storage()
        lt.add_files(fs, str(payload_path))
        t = lt.create_torrent(fs)
        t.add_tracker(tracker)
        t.set_creator("test-setup")

        # Hash the pieces, then write the bencoded MetaInfo next to the payload
        lt.set_piece_hashes(t, str(payload_path.parent))
        torrent_data = lt.bencode(t.generate())
        torrent_path = payload_path.with_suffix(".torrent")
        with open(torrent_path, "wb") as f:
            f.write(torrent_data)
        logger.debug(f"Torrent file: {str(torrent_path)}")
        return str(torrent_path)
Mock peer
Again, we can use libtorrent to create a mock peer and give it the torrent we set up for our test. It will have the full payload in its workspace, ready to seed.
We can use this to also create multiple peers, each using a different port.
    def create_mock_peer(port: int, payload: str, torrent: str, workspace_dir: str):
        """Create a single peer bound to port, seeding"""
        settings = {
            "listen_interfaces": f"0.0.0.0:{port}",
            "enable_dht": False,
            "enable_lsd": False,
        }
        session = lt.session(settings)
        info = lt.torrent_info(torrent)

        # Create the temp directory, and copy the asset to it
        peer_dir = Path(workspace_dir) / f"peer_{port}"
        peer_dir.mkdir(exist_ok=True)
        if payload:
            shutil.copy(payload, peer_dir)

        lt_params = {
            "ti": info,
            "save_path": str(peer_dir),
        }
        handle = session.add_torrent(lt_params)
        sessions.append(session)
        return {
            "port": port,
            "dir": str(peer_dir),
            "session": session,
            "handle": handle,
        }
With this in place, we can write a test case in which our client connects to a tracker, gets a peer, downloads a single piece from it, and we verify that the content matches what we expect.
    async def test_simple_download(workspace, create_mock_peer):
        """Test downloading first piece from a single seeder using our Client"""
        # Create a small payload (smaller than default piece size)
        payload_file = create_payload(workspace, 16 * 2**10)
        torrent_file = create_torrent_file(payload_file, TRACKER_URL, workspace)

        # Create a seeding peer
        create_mock_peer(6881, payload_file, torrent_file, workspace)

        # Give the seeder time to start up and connect to tracker
        await asyncio.sleep(2)

        # Create download directory
        download_dir = Path(workspace) / "download"
        download_dir.mkdir(exist_ok=True)

        # Attempt to fetch the first piece
        logger.info("Attempting to fetch first piece with our client")
        client = SimpleClient()
        result = await client.fetch_first_piece(torrent_file, str(download_dir))

        # Verify we successfully connected and fetched without errors
        assert result, "Client should have successfully fetched the first piece"
        logger.info("Successfully fetched first piece!")

        # Verify the downloaded file exists
        downloaded_file = download_dir / Path(payload_file).name
        assert downloaded_file.exists(), f"Downloaded file should exist at {downloaded_file}"

        # Calculate checksums for comparison
        original_checksum = calculate_sha256(payload_file)
        downloaded_checksum = calculate_sha256(str(downloaded_file))
        logger.info(f"Original checksum: {original_checksum}")
        logger.info(f"Downloaded checksum: {downloaded_checksum}")
        assert original_checksum == downloaded_checksum, "Downloaded file checksum should match original"

        # Verify file size matches
        original_size = Path(payload_file).stat().st_size
        downloaded_size = downloaded_file.stat().st_size
        assert original_size == downloaded_size, f"File sizes should match: {original_size} != {downloaded_size}"
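The test relies on two small helpers, create_payload and calculate_sha256, that aren't shown here. A plausible sketch of what they could look like follows. The signatures are inferred from how the test calls them; the bodies, the default file name, and the use of random bytes are my assumptions rather than the code from the repository.

```python
import hashlib
import os
from pathlib import Path


def create_payload(workspace: str, size: int, name: str = "payload.bin") -> str:
    """Write `size` random bytes to a file in the workspace and return its path."""
    path = Path(workspace) / name
    path.write_bytes(os.urandom(size))
    return str(path)


def calculate_sha256(path: str) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 64 KiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 2**10), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Random content makes it unlikely that a buffer full of zeros would accidentally pass the checksum comparison.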
Finally, I like to add a test that exercises the setup itself, as a quick way to make sure it works as expected.
    def test_setup(workspace, create_mock_peer):
        """Test the testing setup by creating a peer and make it download the torrent"""
        payload_file = copy_payload("image.png", ASSETS_DIR, workspace)
        torrent_file = create_torrent_file(payload_file, TRACKER_URL, workspace)
        seeding_peers = [
            create_mock_peer(port, payload_file, torrent_file, workspace)
            for port in [6100, 6101, 6102]
        ]
        logger.debug(f"{seeding_peers=}")
        time.sleep(2)
        assert Path(torrent_file).exists()

        leech = create_mock_peer(6881, None, torrent_file, workspace)
        logger.debug("Downloading...")

        # Download until complete
        h = leech["handle"]
        while not h.status().is_seeding:
            s = h.status()
            logger.debug(
                f"Progress: {s.progress * 100:.1f}% "
                f"Down: {s.download_rate / 1000:.1f} kB/s "
                f"Peers: {s.num_peers}"
            )
            time.sleep(1)

        logger.debug("Download complete!")
        assert True