Writing Raw HTTP Over TCP with Sockets
You use HTTP every single day. Every website, every API call, every time you scroll Instagram, HTTP is working behind the scenes. But have you ever wondered what's actually happening when you type a URL and press Enter?
I decided to find out by building an HTTP/1.1 server from scratch. No frameworks. No libraries. Just raw sockets and a lot of curiosity.
This is everything I learned along the way.
What Even Is HTTP?
Let's start with the basics.
HTTP stands for Hypertext Transfer Protocol. It's just a set of rules for how computers talk to each other on the web. That's it. It's literally just text being sent back and forth.
When you visit google.com, here's what happens:
- Your browser opens a connection to Google's server
- Your browser sends a text message saying "Hey, give me the homepage"
- Google's server sends back a text message with the HTML
- Your browser renders it
The “text messages” follow a specific format — that format is defined by HTTP.
The First Surprise: HTTP Is Just Text
This blew my mind when I first learned it. HTTP/1.1 is literally plain text.
Here's what your browser sends when you visit a website:
GET /index.html HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0
Accept: text/html
That's it. That's an HTTP request. It's just text with some rules:
- Line 1: What you want (GET this page) and what version of HTTP you speak
- Lines 2-4: Extra info (headers) like "who are you" and "what do you accept"
- Empty line: Signals "I'm done talking"
The server responds with something like:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234
<!DOCTYPE html>
<html>
<body>Hello World!</body>
</html>
Again, just text! The first line says "everything's OK" (that's what 200 means). Then some headers. Then an empty line. Then the actual webpage.
You could literally do this with telnet if you wanted to. Open a terminal and try:
telnet example.com 80
Then type:
GET / HTTP/1.1
Host: example.com
Press Enter twice after the Host line. You'll see raw HTML come back. You just spoke HTTP manually!
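The same exercise can be scripted with Python's socket module. Here's a minimal sketch (the fetch helper and its defaults are my own names for illustration, not part of any library):

```python
import socket

def fetch(host: str, port: int = 80) -> bytes:
    """Send a bare-bones HTTP/1.1 GET and return the raw response bytes."""
    sock = socket.create_connection((host, port))
    # Same lines we typed into telnet; HTTP requires \r\n line endings
    request = (
        f"GET / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # ask the server to close when done
        "\r\n"                   # empty line: end of headers
    ).encode()
    sock.sendall(request)
    # Read until the server closes the connection
    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk
    sock.close()
    return response

# Try it: print(fetch("example.com").decode(errors="replace"))
```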
How Does the Text Get There?
Here's where it gets interesting. HTTP doesn't actually send anything. It's just the format of the messages.
The actual sending is done by TCP (Transmission Control Protocol). Think of it like this:
- TCP is the postal service: it delivers packages reliably
- HTTP is the language you write your letters in
When you send an HTTP request:
- Your computer opens a TCP connection
- You send your HTTP message through that connection
- The server sends an HTTP response back
- The connection closes (or stays open — more on that later)
TCP handles all the hard stuff: making sure data arrives, arrives in order, and nothing gets lost. HTTP just worries about the format.
Building the Server: Sockets 101
To build an HTTP server, you need to understand sockets. A socket is basically a phone line for computers.
Here's the lifecycle of a server:
1. Create a socket → "I want to make/receive calls"
2. Bind to a port → "My phone number is 8080"
3. Listen → "I'm waiting for calls"
4. Accept → "Hello, who's calling?"
5. Read/Write → "Let's talk"
6. Close → "Goodbye"
In Python, it looks like this:
import socket
# Create a socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind to localhost:8080
server.bind(('127.0.0.1', 8080))
# Start listening for connections
server.listen(5)
print("Waiting for connections...")
# Accept a connection (this blocks until someone connects)
client, address = server.accept()
print(f"Got a connection from {address}!")
# Read data from the client
data = client.recv(1024)
print(f"They said: {data}")
# Send a response
client.send(b"HTTP/1.1 200 OK\r\n\r\nHello!")
# Close the connection
client.close()
Run this, then open http://localhost:8080 in your browser. Your browser will send an HTTP request, your server will print it, and send back "Hello!"
Congrats! You just built an HTTP server in about 15 lines.
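The hard-coded response above works, but browsers are happier when you include the framing headers too. Here's one possible helper for that (build_response is my own name, a sketch rather than anything canonical):

```python
def build_response(body: bytes, content_type: str = "text/html") -> bytes:
    """Build a minimal HTTP/1.1 response with correct framing headers."""
    headers = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"  # tells the client how much body to read
        "\r\n"                              # empty line ends the headers
    )
    return headers.encode() + body
```

Content-Length matters more than it looks: it's how the client knows where the body ends (more on that below).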
The Tricky Part: TCP Is a Stream, Not Messages
Here's something that tripped me up for hours.
When you send:
GET / HTTP/1.1\r\nHost: localhost\r\n\r\n
You might expect to receive exactly that. But TCP doesn't work that way. TCP is a byte stream. It has no concept of "messages."
You might receive:
GET / HT              (first chunk)
TP/1.1\r\nHost        (second chunk)
: localhost\r\n\r\n   (third chunk)
Or you might get it all at once. Or split differently each time. You never know.
This means you can't just do data = socket.recv(1024) and assume you got a complete request. You need to keep reading until you find the pattern that marks "end of message."
For HTTP, that pattern is \r\n\r\n: an empty line after the headers.
Here's how I handle it:
def read_until_empty_line(sock):
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(1024)
        if not chunk:
            break  # Connection closed
        data += chunk
    return data
This keeps reading until we see that empty line. Only then do we know we have all the headers.
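Once the headers have arrived, splitting them apart is plain string work. Here's one possible sketch (parse_request is my own helper name; a production parser would also need to handle case-insensitive header names and malformed input):

```python
def parse_request(raw: bytes):
    """Split a raw request into (method, path, version, headers dict)."""
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Request line: "GET /index.html HTTP/1.1"
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Header line: "Name: value"
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, path, version, headers
```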
HTTP/1.0 vs HTTP/1.1: The Big Difference
Okay, here's something important.
In HTTP/1.0, after every request/response, the connection closes. Want to load a webpage with 10 images? That's 11 connections:
Connect → Request index.html → Response → Disconnect
Connect → Request image1.png → Response → Disconnect
Connect → Request image2.png → Response → Disconnect
... (8 more times)
This is slow! Opening a connection takes time (there's a whole handshake process).
HTTP/1.1 introduced persistent connections (also called "keep-alive"). One connection, multiple requests:
Connect → Request index.html → Response
→ Request image1.png → Response
→ Request image2.png → Response
→ ... (all on the same connection)
→ Disconnect (eventually)
Much faster! This is why browsers feel snappy despite loading dozens of resources.
In my server, this meant adding a loop:
def handle_connection(client_socket):
    while True:
        # Read a request
        request = read_request(client_socket)
        if not request:
            break  # Client disconnected

        # Send response
        response = handle_request(request)
        client_socket.sendall(response)

        # Check if client wants to keep connection open
        if should_close(request):
            break
    client_socket.close()
The loop keeps handling requests until the client says "I'm done" (by sending Connection: close) or the connection times out.
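The close-or-keep-open decision is where the two protocol versions differ, and it's small enough to sketch in full. This version takes the parsed HTTP version string and a headers dict rather than a whole request object (my own signature, chosen for a self-contained example):

```python
def should_close(version: str, headers: dict) -> bool:
    """Apply each HTTP version's default connection behavior."""
    conn = headers.get("Connection", "").lower()
    if version == "HTTP/1.1":
        # HTTP/1.1 keeps the connection open unless told otherwise
        return conn == "close"
    # HTTP/1.0 closes unless the client explicitly asks to keep it open
    return conn != "keep-alive"
```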
The Body Problem: How Do You Know When It Ends?
Headers are separated from the body by an empty line. But how do you know where the body ends?
If I send you:
POST /api/users HTTP/1.1
Content-Type: application/json
{"name": "John"}
How does the server know the body is just {"name": "John"} and not more data coming?
The answer: Content-Length.
POST /api/users HTTP/1.1
Content-Type: application/json
Content-Length: 16
{"name": "John"}
That 16 tells the server: "After the empty line, read exactly 16 bytes. That's the body."
In code:
content_length = int(headers.get('Content-Length', 0))
body = read_exactly(socket, content_length)
Without Content-Length (in HTTP/1.0), servers would just read until the connection closed. But with keep-alive connections, that doesn't work: the connection stays open! So Content-Length became essential.
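The read_exactly helper mirrors read_until_empty_line: because recv() can return fewer bytes than asked for, it has to loop until the full count arrives. One way to write it:

```python
def read_exactly(sock, n: int) -> bytes:
    """Read exactly n bytes from the socket, looping as needed."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed mid-body")
        data += chunk
    return data
```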
Chunked Encoding: When You Don't Know the Size
But what if you're streaming data? Like a live video feed or a huge file you're generating on-the-fly? You don't know the total size upfront.
Enter chunked transfer encoding:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
18
This is the first chunk.
19
This is the second chunk.
0
Each chunk starts with its size (in hexadecimal) on its own line, then the data, then a new line. Here 0x18 is 24 bytes and 0x19 is 25. The last chunk has size 0, signaling "I'm done."
It's clever because:
- Sender doesn't need to know total size upfront
- Receiver knows exactly when each chunk ends
- Works perfectly with keep-alive connections
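Decoding chunks is mostly bookkeeping: read a hex size line, take that many bytes, skip the trailing CRLF, repeat until a zero-size chunk. Here's a sketch that assumes the whole encoded body is already buffered (a real server would decode incrementally as bytes arrive):

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode a complete chunked-encoded body into plain bytes."""
    body = b""
    pos = 0
    while True:
        # The chunk size is a hex number terminated by CRLF
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        if size == 0:
            break  # final zero-size chunk: end of body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip chunk data plus its trailing CRLF
    return body
```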
Why We Need Multiple Connections
Here's a problem with HTTP/1.1 that I discovered the hard way.
Let's say you request three files:
- styles.css (small, fast)
- big-image.png (huge, slow)
- app.js (small, fast)
On a single connection, they go in order:
Request 1 → Response 1 (fast)
Request 2 → Response 2 (SLOW... still loading... almost there...)
Request 3 → Response 3 (fast, but had to wait!)
The small app.js had to wait for the huge image, even though it was ready! This is called head-of-line blocking.
The workaround? Browsers typically open up to 6 parallel connections to each server. That way, if one gets stuck on a slow resource, the others keep working.
But that's wasteful: 6 TCP connections mean 6 handshakes, 6 sets of resources, 6 times the memory.
This problem is exactly why HTTP/2 was created.
Handling Multiple Clients: Threading
When I first built my server, it could only handle one client at a time. While serving one person, everyone else had to wait.
The fix? Threads. Each client gets their own thread:
import threading
while True:
client, address = server.accept()
# Handle this client in a separate thread
thread = threading.Thread(target=handle_connection, args=(client,))
thread.start()
# Main thread immediately goes back to accepting new connections
Now my server can handle hundreds of clients simultaneously. Each thread handles one connection, and the main thread just keeps accepting new ones.
This is called the "thread-per-connection" model. It's simple and works great for moderate traffic. For massive scale (like millions of connections), you'd use async I/O or an event loop, but that's a story for another day.
What About Security?
Building this server made me realize how many ways things can go wrong:
1. Slowloris Attack: A malicious client connects and sends headers... very... slowly... one byte per minute. Your server thread is stuck waiting forever.
Fix: Set timeouts.
client_socket.settimeout(30)  # Give up after 30 seconds
2. Huge Headers: Someone sends a 10GB header. Your server runs out of memory.
Fix: Limit header size.
MAX_HEADER_SIZE = 8192  # 8KB max
3. Path Traversal: Someone requests /../../../etc/passwd and accesses files they shouldn't.
Fix: Validate paths and reject anything suspicious.
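One way to implement that check, assuming a hypothetical public/ directory as the document root (safe_path and WEB_ROOT are my own names for this sketch):

```python
import os

WEB_ROOT = os.path.realpath("public")  # hypothetical document root

def safe_path(url_path: str):
    """Resolve a URL path inside WEB_ROOT, or return None if it escapes."""
    candidate = os.path.realpath(os.path.join(WEB_ROOT, url_path.lstrip("/")))
    # realpath collapses ".." segments (and symlinks), so an escape
    # attempt resolves to somewhere outside WEB_ROOT and fails this check
    if candidate == WEB_ROOT or candidate.startswith(WEB_ROOT + os.sep):
        return candidate
    return None
```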
4. No HTTPS: Everything is sent in plain text. Anyone on the network can read it.
Fix: Add TLS. (I haven't done this yet; it requires wrapping the socket with Python's ssl module.)
HTTP/2 and HTTP/3: A Quick Look
So I built an HTTP/1.1 server. But the web has moved on. What's new?
HTTP/2: Multiplexing
Remember that head-of-line blocking problem? HTTP/2 fixes it with multiplexing.
Instead of sending requests one after another, HTTP/2 breaks everything into small "frames" with IDs:
[Frame: Stream 1, part 1]
[Frame: Stream 2, part 1]
[Frame: Stream 1, part 2]
[Frame: Stream 3, part 1]
[Frame: Stream 2, part 2]
...
Multiple requests and responses are interleaved on the same connection. If Stream 2 is slow, Streams 1 and 3 keep flowing. No more blocking!
Other HTTP/2 goodies:
- Binary format (not text anymore — faster to parse)
- Header compression (headers are often repetitive)
- Server push (server can send files before you ask for them)
HTTP/3: Goodbye TCP
HTTP/2 fixed blocking at the application layer. But TCP still has its own blocking problem: if one packet gets lost, TCP waits for it before delivering anything else.
HTTP/3 takes a radical approach: ditch TCP entirely.
It runs on QUIC, which uses UDP. QUIC implements reliability per stream, so a lost packet in one stream doesn't block others.
Plus:
- Faster connections (can resume with zero round-trips)
- Better for mobile (handles network switches gracefully)
- Built-in encryption (TLS 1.3 required)
HTTP/3 is what YouTube, Google, and Cloudflare use today. It's the future.
What I Learned
Building an HTTP server from scratch taught me more about networking than any tutorial or documentation. Here's what stuck with me:
- HTTP is just text. Don't be intimidated by protocols — they're just rules for formatting messages.
- TCP is a stream. You have to do the work of finding message boundaries. Nothing is "automatic."
- Keep-alive matters. Persistent connections are a massive performance win. HTTP/1.1's biggest contribution.
- Head-of-line blocking is real. It's why browsers open 6 connections, and why HTTP/2 exists.
- Security is never free. Every feature you add is a potential vulnerability.
- Threading is simple but limited. Great for learning, but not for handling a million connections.
If you want to truly understand HTTP, I encourage you to build your own server. It doesn't have to be fancy — even 50 lines of code will teach you more than reading specs for hours.
The code for my server is in server.py. Run it, poke it, break it, fix it. That's how you learn.
Try It Yourself
# Start the server
python server.py
# In another terminal, try these:
# Basic request
curl http://localhost:8080/
# See keep-alive in action (multiple requests, one connection)
curl http://localhost:8080/ http://localhost:8080/info
# POST with a body
curl -X POST -d "Hello!" http://localhost:8080/echo
# JSON API
curl -X POST -H "Content-Type: application/json" \
-d '{"name": "Alice"}' http://localhost:8080/json
# Slow endpoint (see head-of-line blocking)
curl http://localhost:8080/slow
# Chunked encoding
curl http://localhost:8080/chunked
Watch the server logs. You'll see connections opening, requests being processed, and keep-alive in action.
The Bottom Line
HTTP isn't magic. It's not even that complicated once you see it in action.
At its core, it's just:
- Open a connection
- Send a text message following a specific format
- Get a text message back
- Maybe do it again (keep-alive)
- Close the connection
Everything else — headers, methods, status codes, chunked encoding — is just details on top of this simple foundation.
Now you know how the web works. Go build something cool.
Questions? Found a bug? Just want to chat about networking? The code is all there in server.py. Dig in and explore.