
Contents

  • What Even Is HTTP?
  • The First Surprise: HTTP Is Just Text
  • How Does the Text Get There?
  • Building the Server: Sockets 101
  • The Tricky Part: TCP Is a Stream, Not Messages
  • HTTP/1.0 vs HTTP/1.1: The Big Difference
  • The Body Problem: How Do You Know When It Ends?
  • Chunked Encoding: When You Don't Know the Size
  • Why We Need Multiple Connections
  • Handling Multiple Clients: Threading
  • What About Security?
  • HTTP/2 and HTTP/3: A Quick Look
      • HTTP/2: Multiplexing
      • HTTP/3: Goodbye TCP
  • What I Learned
  • Try It Yourself
  • The Bottom Line
Feb 6, 2026
~ 15 MIN READ

Writing Raw HTTP Over TCP with Sockets

You use HTTP every single day. Every website, every API call, every time you scroll Instagram, HTTP is working behind the scenes. But have you ever wondered what's actually happening when you type a URL and press Enter?

I decided to find out by building an HTTP/1.1 server from scratch. No frameworks. No libraries. Just raw sockets and a lot of curiosity.

This is everything I learned along the way.


What Even Is HTTP?

Let's start with the basics.

HTTP stands for Hypertext Transfer Protocol. It's just a set of rules for how computers talk to each other on the web. That's it. It's literally just text being sent back and forth.

When you visit google.com, here's what happens:

  1. Your browser opens a connection to Google's server
  2. Your browser sends a text message saying "Hey, give me the homepage"
  3. Google's server sends back a text message with the HTML
  4. Your browser renders it

The “text messages” follow a specific format — that format is defined by HTTP.


The First Surprise: HTTP Is Just Text

This blew my mind when I first learned it. HTTP/1.1 is literally plain text.

Here's what your browser sends when you visit a website:

GET /index.html HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0
Accept: text/html
 

That's it. That's an HTTP request. It's just text with some rules:

  • Line 1: What you want (GET this page) and what version of HTTP you speak
  • Lines 2-4: Extra info (headers) like "who are you" and "what do you accept"
  • Empty line: Signals "I'm done talking"

The server responds with something like:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234
 
<!DOCTYPE html>
<html>
  <body>Hello World!</body>
</html>

Again, just text! The first line says "everything's OK" (that's what 200 means). Then some headers. Then an empty line. Then the actual webpage.

You could literally do this with telnet if you wanted to. Open a terminal and try:

telnet example.com 80

Then type:

GET / HTTP/1.1
Host: example.com
 

Press Enter twice after the Host line. You'll see raw HTML come back. You just spoke HTTP manually!
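If you'd rather script it than type into telnet, the same exchange can be sketched in a few lines of Python. To keep the demo self-contained (and avoid depending on the network), this version spins up a throwaway local server on a random port and has the client talk to it; the fixed "Hello!" response is made up for the demo:

```python
import socket
import threading

# A throwaway local server: accepts one connection, reads the request,
# and replies with a fixed HTTP response.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0 = let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def serve_once():
    client, _ = server.accept()
    client.recv(4096)              # read the request (ignored in this demo)
    client.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\nHello!")
    client.close()

threading.Thread(target=serve_once).start()

# The client side: the same bytes you would type into telnet
with socket.create_connection(("127.0.0.1", port)) as sock:
    sock.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    response = b""
    while True:                    # read until the server closes the connection
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk

print(response.decode())
```

Same idea as the telnet session: you write the request line, the Host header, and a blank line, and raw HTTP comes back.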


How Does the Text Get There?

Here's where it gets interesting. HTTP doesn't actually send anything. It's just the format of the messages.

The actual sending is done by TCP (Transmission Control Protocol). Think of it like this:

  • TCP is the postal service: it delivers packages reliably
  • HTTP is the language you write your letters in

When you send an HTTP request:

  1. Your computer opens a TCP connection
  2. You send your HTTP message through that connection
  3. The server sends an HTTP response back
  4. The connection closes (or stays open — more on that later)

TCP handles all the hard stuff: making sure data arrives, arrives in order, and nothing gets lost. HTTP just worries about the format.


Building the Server: Sockets 101

To build an HTTP server, you need to understand sockets. A socket is basically a phone line for computers.

Here's the lifecycle of a server:

1. Create a socket       → "I want to make/receive calls"
2. Bind to a port        → "My phone number is 8080"  
3. Listen                → "I'm waiting for calls"
4. Accept                → "Hello, who's calling?"
5. Read/Write            → "Let's talk"
6. Close                 → "Goodbye"

In Python, it looks like this:

import socket
 
# Create a socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
 
# Bind to localhost:8080
server.bind(('127.0.0.1', 8080))
 
# Start listening for connections
server.listen(5)
 
print("Waiting for connections...")
 
# Accept a connection (this blocks until someone connects)
client, address = server.accept()
print(f"Got a connection from {address}!")
 
# Read data from the client
data = client.recv(1024)
print(f"They said: {data}")
 
# Send a response
client.sendall(b"HTTP/1.1 200 OK\r\n\r\nHello!")
 
# Close the connection
client.close()

Run this, then open http://localhost:8080 in your browser. Your browser will send an HTTP request, your server will print it, and send back "Hello!"

Congrats! You just built an HTTP server in about 15 lines.


The Tricky Part: TCP Is a Stream, Not Messages

Here's something that tripped me up for hours.

When you send:

GET / HTTP/1.1\r\nHost: localhost\r\n\r\n

You might expect to receive exactly that. But TCP doesn't work that way. TCP is a byte stream. It has no concept of "messages."

You might receive:

  • GET / HT (first chunk)
  • TP/1.1\r\nHost (second chunk)
  • : localhost\r\n\r\n (third chunk)

Or you might get it all at once. Or split differently each time. You never know.

This means you can't just do data = socket.recv(1024) and assume you got a complete request. You need to keep reading until you find the pattern that marks "end of message."

For HTTP, that pattern is \r\n\r\n: an empty line after the headers.

Here's how I handle it:

def read_until_empty_line(sock):
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(1024)
        if not chunk:
            break  # Connection closed
        data += chunk
    return data

This keeps reading until we see that empty line. Only then do we know we have all the headers.
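Once you have the raw header bytes, splitting them into something usable is straightforward string work. Here's one possible sketch (the `parse_request` helper and its return shape are my naming for illustration, not from the article's server.py):

```python
def parse_request(raw: bytes):
    """Split raw header bytes into (method, path, version, headers dict)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)     # e.g. "GET / HTTP/1.1"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, path, version, headers

raw = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: text/html\r\n\r\n"
method, path, version, headers = parse_request(raw)
print(method, path, headers["host"])
```

Lowercasing the header names on the way in saves you from case-sensitivity bugs later, since clients are free to send `host:`, `Host:`, or `HOST:`.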


HTTP/1.0 vs HTTP/1.1: The Big Difference

Okay, here's something important.

In HTTP/1.0, after every request/response, the connection closes. Want to load a webpage with 10 images? That's 11 connections:

Connect → Request index.html → Response → Disconnect
Connect → Request image1.png → Response → Disconnect
Connect → Request image2.png → Response → Disconnect
... (8 more times)

This is slow! Opening a connection takes time (there's a whole handshake process).

HTTP/1.1 introduced persistent connections (also called "keep-alive"). One connection, multiple requests:

Connect → Request index.html → Response
        → Request image1.png → Response  
        → Request image2.png → Response
        → ... (all on the same connection)
        → Disconnect (eventually)

Much faster! This is why browsers feel snappy despite loading dozens of resources.

In my server, this meant adding a loop:

def handle_connection(client_socket):
    while True:
        # Read a request
        request = read_request(client_socket)
        if not request:
            break  # Client disconnected
        
        # Send response
        response = handle_request(request)
        client_socket.sendall(response)
        
        # Check if client wants to keep connection open
        if should_close(request):
            break
    
    client_socket.close()

The loop keeps handling requests until the client says "I'm done" (by sending Connection: close) or the connection times out.
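The `should_close` check can be sketched like this. The article's version takes the whole request; this sketch assumes the HTTP version string and a lowercase-keyed header dict have already been pulled out of it. The key point is that the two protocol versions have opposite defaults:

```python
def should_close(version: str, headers: dict) -> bool:
    """Decide whether to close after responding, per HTTP/1.0 vs 1.1 defaults."""
    connection = headers.get("connection", "").lower()
    if version == "HTTP/1.1":
        return connection == "close"        # 1.1: keep-alive unless told otherwise
    return connection != "keep-alive"       # 1.0: close unless explicitly kept alive

print(should_close("HTTP/1.1", {}))                        # stays open
print(should_close("HTTP/1.0", {}))                        # closes by default
```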


The Body Problem: How Do You Know When It Ends?

Headers are separated from the body by an empty line. But how do you know where the body ends?

If I send you:

POST /api/users HTTP/1.1
Content-Type: application/json
 
{"name": "John"}

How does the server know the body is just {"name": "John"} and not more data coming?

The answer: Content-Length.

POST /api/users HTTP/1.1
Content-Type: application/json
Content-Length: 16
 
{"name": "John"}

That 16 tells the server: "After the empty line, read exactly 16 bytes. That's the body."

In code:

content_length = int(headers.get('Content-Length', 0))
body = read_exactly(socket, content_length)

Without Content-Length (in HTTP/1.0), servers would just read until the connection closed. But with keep-alive connections, that doesn't work: the connection stays open! So Content-Length became essential.
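That `read_exactly` call has to loop, because a single `recv` can return fewer bytes than you asked for. A minimal sketch, demoed with a local socket pair so it runs without a network:

```python
import socket

def read_exactly(sock, n: int) -> bytes:
    """Read exactly n bytes from a socket, looping over partial recv() results."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed before body was complete")
        data += chunk
    return data

# Demo with a connected socket pair (no network needed)
a, b = socket.socketpair()
a.sendall(b'{"name": "John"}')
body = read_exactly(b, 16)
print(body)
a.close(); b.close()
```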


Chunked Encoding: When You Don't Know the Size

But what if you're streaming data? Like a live video feed or a huge file you're generating on-the-fly? You don't know the total size upfront.

Enter chunked transfer encoding:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
 
18
This is the first chunk.
19
This is the second chunk.
0
 

Each chunk starts with its size (in hexadecimal) on its own line, then the chunk data, then a CRLF. The last chunk has size 0, signaling "I'm done."

It's clever because:

  • Sender doesn't need to know total size upfront
  • Receiver knows exactly when each chunk ends
  • Works perfectly with keep-alive connections
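Decoding a chunked body can be sketched like this. For simplicity it assumes the whole encoded body is already in a buffer; a real server would pull more bytes from the socket as needed (and this sketch ignores optional chunk extensions and trailers):

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body: hex size line, chunk data, CRLF, repeat until size 0."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)   # chunk size is hexadecimal
        if size == 0:
            return body                      # size 0 marks the end
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2               # skip chunk data and its trailing CRLF

raw = b"18\r\nThis is the first chunk.\r\n19\r\nThis is the second chunk.\r\n0\r\n\r\n"
print(decode_chunked(raw).decode())
```

(0x18 is 24 bytes and 0x19 is 25 bytes, matching the two sentences exactly.)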

Why We Need Multiple Connections

Here's a problem with HTTP/1.1 that I discovered the hard way.

Let's say you request three files:

  1. styles.css (small, fast)
  2. big-image.png (huge, slow)
  3. app.js (small, fast)

On a single connection, they go in order:

Request 1 → Response 1 (fast)
Request 2 → Response 2 (SLOW... still loading... almost there...)
Request 3 → Response 3 (fast, but had to wait!)

The small app.js had to wait for the huge image, even though it was ready! This is called head-of-line blocking.

The workaround? Browsers open up to 6 parallel connections to each server. That way, if one gets stuck on a slow resource, the others keep working.

But that's wasteful: 6 TCP connections means 6 handshakes, 6 sets of resources, 6 times the memory.

This problem is exactly why HTTP/2 was created.


Handling Multiple Clients: Threading

When I first built my server, it could only handle one client at a time. While serving one person, everyone else had to wait.

The fix? Threads. Each client gets their own thread:

import threading
 
while True:
    client, address = server.accept()
    
    # Handle this client in a separate thread
    thread = threading.Thread(target=handle_connection, args=(client,))
    thread.start()
    
    # Main thread immediately goes back to accepting new connections

Now my server can handle hundreds of clients simultaneously. Each thread handles one connection, and the main thread just keeps accepting new ones.

This is called the "thread-per-connection" model. It's simple and works great for moderate traffic. For massive scale (like millions of connections), you'd use async I/O or event loops, but that's a story for another day.


What About Security?

Building this server made me realize how many ways things can go wrong:

1. Slowloris Attack: A malicious client connects and sends headers... very... slowly... one byte per minute. Your server thread is stuck waiting forever.

Fix: Set timeouts.

client_socket.settimeout(30)  # Give up after 30 seconds

2. Huge Headers: Someone sends a 10GB header. Your server runs out of memory.

Fix: Limit header size.

MAX_HEADER_SIZE = 8192  # 8KB max

3. Path Traversal: Someone requests /../../../etc/passwd and accesses files they shouldn't.

Fix: Validate paths and reject anything suspicious.

4. No HTTPS: Everything is sent in plain text. Anyone on the network can read it.

Fix: Add TLS. (I haven't done this yet; it requires wrapping the socket with the ssl module.)
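The first two fixes can be folded straight into the header-reading loop. A sketch of a hardened version, with a timeout against Slowloris and a size cap against giant headers (the 30-second and 8KB limits are arbitrary choices for illustration):

```python
import socket

MAX_HEADER_SIZE = 8192  # 8KB: reject absurdly large header blocks

def read_headers_safely(sock: socket.socket) -> bytes:
    sock.settimeout(30)  # Slowloris defense: give up on clients that stall
    data = b""
    while b"\r\n\r\n" not in data:
        if len(data) > MAX_HEADER_SIZE:
            raise ValueError("headers too large")  # memory defense
        try:
            chunk = sock.recv(1024)
        except socket.timeout:
            raise ConnectionError("client too slow")
        if not chunk:
            break
        data += chunk
    return data

# Demo with a connected socket pair (a well-behaved "client")
a, b = socket.socketpair()
a.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
headers = read_headers_safely(b)
a.close(); b.close()
```

Either failure path should end with the server closing the connection, not the whole process dying.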


HTTP/2 and HTTP/3: A Quick Look

So I built an HTTP/1.1 server. But the web has moved on. What's new?

HTTP/2: Multiplexing

Remember that head-of-line blocking problem? HTTP/2 fixes it with multiplexing.

Instead of sending requests one after another, HTTP/2 breaks everything into small "frames" with IDs:

[Frame: Stream 1, part 1]
[Frame: Stream 2, part 1]
[Frame: Stream 1, part 2]
[Frame: Stream 3, part 1]
[Frame: Stream 2, part 2]
...

Multiple requests and responses are interleaved on the same connection. If Stream 2 is slow, Streams 1 and 3 keep flowing. No more blocking!

Other HTTP/2 goodies:

  • Binary format (not text anymore — faster to parse)
  • Header compression (headers are often repetitive)
  • Server push (server can send files before you ask for them)

HTTP/3: Goodbye TCP

HTTP/2 fixed blocking at the application layer. But TCP still has its own blocking problem: if one packet gets lost, TCP waits for it before delivering anything else.

HTTP/3 takes a radical approach: ditch TCP entirely.

It runs on QUIC, which uses UDP. QUIC implements reliability per stream, so a lost packet in one stream doesn't block others.

Plus:

  • Faster connections (can resume with zero round-trips)
  • Better for mobile (handles network switches gracefully)
  • Built-in encryption (TLS 1.3 required)

HTTP/3 is what YouTube, Google, and Cloudflare use today. It's the future.


What I Learned

Building an HTTP server from scratch taught me more about networking than any tutorial or documentation. Here's what stuck with me:

  1. HTTP is just text. Don't be intimidated by protocols — they're just rules for formatting messages.

  2. TCP is a stream. You have to do the work of finding message boundaries. Nothing is "automatic."

  3. Keep-alive matters. Persistent connections are a massive performance win. HTTP/1.1's biggest contribution.

  4. Head-of-line blocking is real. It's why browsers open 6 connections, and why HTTP/2 exists.

  5. Security is never free. Every feature you add is a potential vulnerability.

  6. Threading is simple but limited. Great for learning, but not for handling a million connections.

If you want to truly understand HTTP, I encourage you to build your own server. It doesn't have to be fancy — even 50 lines of code will teach you more than reading specs for hours.

The code for my server is in server.py. Run it, poke it, break it, fix it. That's how you learn.


Try It Yourself

# Start the server
python server.py
 
# In another terminal, try these:
 
# Basic request
curl http://localhost:8080/
 
# See keep-alive in action (multiple requests, one connection)
curl http://localhost:8080/ http://localhost:8080/info
 
# POST with a body
curl -X POST -d "Hello!" http://localhost:8080/echo
 
# JSON API
curl -X POST -H "Content-Type: application/json" \
     -d '{"name": "Alice"}' http://localhost:8080/json
 
# Slow endpoint (see head-of-line blocking)
curl http://localhost:8080/slow
 
# Chunked encoding
curl http://localhost:8080/chunked

Watch the server logs. You'll see connections opening, requests being processed, and keep-alive in action.


The Bottom Line

HTTP isn't magic. It's not even that complicated once you see it in action.

At its core, it's just:

  • Open a connection
  • Send a text message following a specific format
  • Get a text message back
  • Maybe do it again (keep-alive)
  • Close the connection

Everything else — headers, methods, status codes, chunked encoding — is just details on top of this simple foundation.

Now you know how the web works. Go build something cool.


Questions? Found a bug? Just want to chat about networking? The code is all there in server.py. Dig in and explore.