Going Remote: The Streamable HTTP Transport

Take an MCP server off the local machine and onto the network: the Streamable HTTP transport, sessions, and why it replaced SSE — with the Python one-liner and the TypeScript wiring.

Series: Building MCP Servers — Part 7 of 12

Every server so far ran over stdio: the host launched it as a local subprocess and spoke to it through pipes. That’s perfect for a tool on your own machine, and useless the moment you want one server to serve many users, run in a container, or live behind a URL. For that you need the network transport — Streamable HTTP — and switching to it is a small change on one stack and a real one on the other.

Why Streamable HTTP (and not SSE)

If you’ve seen older MCP material, you’ll have met the HTTP+SSE (Server-Sent Events) transport. It was deprecated in favor of Streamable HTTP in the 2025-03-26 revision of the spec, and the replacement is worth understanding. The old design needed two endpoints and a long-lived connection that was awkward to scale and couldn’t resume if it dropped. Streamable HTTP uses a single endpoint (conventionally /mcp) that handles ordinary request/response over POST and can answer with an SSE stream when the server has more to send. It works behind standard load balancers, lets a client resume a dropped stream, and can even run statelessly. One door, plain HTTP, scales like anything else on your infrastructure.
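
To make "one door, plain HTTP" concrete, here is a minimal sketch of the first request a client sends to that single endpoint. It uses only the Python standard library and builds the request without sending it. The URL, the client name, and the helper name build_initialize_request are assumptions for illustration; the SDK clients shown later in this post do this for you.

```python
import json
import urllib.request

MCP_URL = "http://127.0.0.1:8000/mcp"  # assumed local endpoint, adjust to your server


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) the JSON-RPC initialize request as an HTTP POST."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "probe", "version": "1.0.0"},  # hypothetical client
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The client accepts either a plain JSON reply or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )


req = build_initialize_request()
```

The Accept header is the interesting part: by listing both content types, the client leaves it to the server to reply with a single JSON body or to open a stream on the same URL.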

The Python switch

On the Python side, FastMCP has the transport built in. The same server you wrote in Part 2 goes remote by changing one argument:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("calc-http")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers and return the sum."""
    return a + b

if __name__ == "__main__":
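    mcp.run(transport="streamable-http")  # serves on http://127.0.0.1:8000/mcp

The TypeScript SDK has no bundled web server, so the same server takes more wiring, here on Express:

import { randomUUID } from "node:crypto";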
import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { z } from "zod";

const app = express();
app.use(express.json());

const transports: Record<string, StreamableHTTPServerTransport> = {};

function newServer() {
  const server = new McpServer({ name: "calc-http", version: "1.0.0" });
  server.registerTool(
    "add",
    { title: "Add", description: "Add two numbers.", inputSchema: { a: z.number(), b: z.number() } },
    async ({ a, b }) => ({ content: [{ type: "text", text: String(a + b) }] })
  );
  return server;
}

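app.post("/mcp", async (req, res) => {
  const sid = req.headers["mcp-session-id"] as string | undefined;
  let transport = sid ? transports[sid] : undefined;
  if (sid && !transport) {
    // Unknown or expired session id: reject it instead of silently starting a new session.
    res.status(404).json({ jsonrpc: "2.0", error: { code: -32001, message: "Session not found" }, id: null });
    return;
  }
  if (!transport) {
    // No session id yet, so this should be the initialize request: create a transport for it.
    const created = new StreamableHTTPServerTransport({
      sessionIdGenerator: () => randomUUID(),
      onsessioninitialized: (id) => { transports[id] = created; },
    });
    // Drop the map entry when the session closes so the map doesn't grow forever.
    created.onclose = () => { if (created.sessionId) delete transports[created.sessionId]; };
    await newServer().connect(created);
    transport = created;
  }
  await transport.handleRequest(req, res, req.body);
});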

app.listen(8123);

This is the widest the two stacks diverge in the whole series, so it’s worth being honest about. Python’s FastMCP bundles a web server and gives you a string flag; you trade control for a one-liner. TypeScript hands you the transport and expects you to bring the HTTP server — here Express — and wire the routes yourself. More code, but you own the request lifecycle, which matters once auth and middleware enter the picture next post.

Sessions

Look at what the TypeScript server is juggling: a map of transports keyed by an id. That’s session management, and it’s the one new concept the network transport forces on you. When it answers the initialize request, the server generates a session id and returns it in an Mcp-Session-Id header; the client echoes that header on every subsequent request, and the server uses it to route the request back to the right transport. It’s how MCP keeps per-client state (a subscription, an in-progress operation) across separate, otherwise independent HTTP calls.
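
The routing rule is small enough to sketch without any framework. The following is an illustration only, not the SDK's implementation: SessionRegistry, Session, and route are hypothetical names, header keys are assumed to be lowercased already, and the status codes follow the spec's guidance (400 for a missing session id on a non-initialize request, 404 for an id the server doesn't know).

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Session:
    session_id: str
    state: dict = field(default_factory=dict)  # per-client state lives here


class SessionRegistry:
    """Maps Mcp-Session-Id values to per-client sessions (hypothetical helper)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def route(self, headers: Dict[str, str], method: str) -> Tuple[int, Optional[Session]]:
        """Return (http_status, session) for one incoming request."""
        sid = headers.get("mcp-session-id")  # assumes lowercased header names
        if sid is None:
            if method != "initialize":
                return 400, None  # only initialize may arrive without a session id
            session = Session(session_id=uuid.uuid4().hex)
            self._sessions[session.session_id] = session
            return 200, session
        session = self._sessions.get(sid)
        if session is None:
            return 404, None  # unknown or expired id: the client must re-initialize
        return 200, session


registry = SessionRegistry()
status, first = registry.route({}, "initialize")
status_again, same = registry.route({"mcp-session-id": first.session_id}, "tools/call")
```

After these calls, same is the very object created by the initialize request, which is the continuity the header exists to provide.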

You don’t always need it. The transport also runs stateless (no session id, a fresh context per request), which is simpler and scales horizontally without sticky routing — a good default for tools that hold no state between calls. Reach for sessions when a client genuinely needs continuity; otherwise stateless is less to get wrong.
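
For contrast, here is a rough sketch of what stateless handling amounts to: every POST is answered from its own body, and nothing is kept between calls. The names handle_post and TOOLS are hypothetical, only tools/call is handled, and a real server would let the SDK do this parsing. The point is only that any replica behind a load balancer could answer any request.

```python
import json


def add(a: int, b: int) -> int:
    return a + b


TOOLS = {"add": add}  # hypothetical tool table, rebuilt identically on every replica


def handle_post(raw_body: bytes) -> dict:
    """Answer one JSON-RPC tools/call request using only the request body."""
    request = json.loads(raw_body)
    if request.get("method") != "tools/call":
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    params = request["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32602, "message": "Unknown tool"}}
    value = tool(**params["arguments"])
    # No session lookup and nothing stored: the reply depends on this request alone.
    return {"jsonrpc": "2.0", "id": request["id"],
            "result": {"content": [{"type": "text", "text": str(value)}]}}


call = {"jsonrpc": "2.0", "id": 7, "method": "tools/call",
        "params": {"name": "add", "arguments": {"a": 2, "b": 3}}}
reply = handle_post(json.dumps(call).encode("utf-8"))
```

Because handle_post reads nothing but its argument, two identical requests get identical replies no matter which process serves them, which is why this mode needs no sticky routing.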

Connecting to it

The client side barely changes from the stdio version — swap the transport for the HTTP one and point it at the URL:

import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    async with streamablehttp_client("http://127.0.0.1:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("add", {"a": 2, "b": 3})

asyncio.run(main())

// The same call from TypeScript (top-level await needs an ES module).
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const transport = new StreamableHTTPClientTransport(new URL("http://127.0.0.1:8123/mcp"));
const client = new Client({ name: "probe", version: "1.0.0" });
await client.connect(transport);
const result = await client.callTool({ name: "add", arguments: { a: 2, b: 3 } });

Same call_tool, same result — the transport is the only thing that moved. That’s the payoff of MCP drawing the transport boundary where it did: the tools you wrote don’t know or care whether they’re reached over a pipe or a socket.

Final thoughts

Going remote is less about MCP than about everything that comes with being on a network — sessions, scaling, and the security you now can’t ignore, because a public /mcp endpoint is a public endpoint. The transport itself is nearly free, especially in Python. The work is in treating your server like the service it has just become, and the first item on that list is making sure not everyone who finds the URL gets to use it.

Next: Auth and Security for Remote Servers, where we put a door on the endpoint.

