Fetching Data From the Internet, in a Few Lines
Most real Python programs eventually need to talk to something over the network — a REST API, a weather service, a GitHub endpoint, a download. The standard library can do this (via urllib), but the de-facto tool in the Python world is a third-party library called requests. The API is so much friendlier that it's worth the single pip install.
pip install requests
If you're not already working inside a virtual environment, set one up first — it keeps this install scoped to your project.
A First GET Request
Three lines: call get, check the status code, look at the body. response.text gives you the response body as a string (truncated in the example below, because the full JSON is long).
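A minimal sketch of those three lines, using GitHub's public repository endpoint (which this page returns to later) as the target:

import requests

response = requests.get("https://api.github.com/repos/python/cpython")

print(response.status_code)  # 200 if everything worked
print(response.text[:200])   # the body as a string, truncated for display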
Status codes worth recognizing on a cheat-sheet level:
- 200 — OK, everything worked.
- 201 — Created (common POST response).
- 301 / 302 — redirects; requests follows them automatically by default.
- 400 — bad request; something in what you sent was wrong.
- 401 / 403 — not authenticated / not authorised.
- 404 — the resource doesn't exist.
- 429 — rate limited; slow down.
- 500 — server error.
Parsing a JSON Response
When the endpoint returns JSON, call .json() on the response — it parses the body and hands you a dict (or list):
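A sketch against the same GitHub endpoint used later on this page; the exact fields depend on the API you call:

import requests

response = requests.get("https://api.github.com/repos/python/cpython", timeout=10)
data = response.json()      # the JSON body, parsed into a dict

print(data["name"])         # "cpython"
print(data["description"])  # the repository description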
Under the hood, .json() is the same as json.loads(response.text) — just a shortcut for the common case.
Sending Query Parameters
Don't manually glue ?key=value&... onto the URL. Pass a dict to params=:
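For example (GitHub's search endpoint here is just a stand-in for any API that takes query parameters):

import requests

response = requests.get(
    "https://api.github.com/search/repositories",
    params={"q": "http client", "per_page": 5},
    timeout=10,
)

print(response.url)  # the fully encoded URL that was actually sent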
requests handles URL encoding for you — spaces, special characters, Unicode all work safely.
The actual URL that went out is available on response.url, handy for debugging.
POST Requests With JSON Bodies
For sending data — creating resources, submitting forms, calling mutating endpoints — use requests.post:
import requests

payload = {
    "title": "Docs update",
    "body": "Added HTTP requests page.",
    "labels": ["docs"],
}

response = requests.post(
    "https://api.example.com/issues",
    json=payload,
    headers={"Authorization": "Bearer YOUR_TOKEN"},
)

print(response.status_code)
print(response.json())
The json= argument does two things: it serializes payload to JSON and sets Content-Type: application/json. You could do both by hand with data=json.dumps(payload) and an explicit header, but json= is the idiomatic shortcut.
For form-encoded data (the kind a classic HTML form sends), use data= instead:
requests.post("https://example.com/login", data={"user": "rosa", "password": "..."})
Headers
Pass any custom headers as a dict:
import requests

response = requests.get(
    "https://api.example.com/profile",
    headers={
        "Authorization": "Bearer abc123",
        "User-Agent": "my-tool/1.0",
    },
)
Most APIs want an Authorization header for authentication. The exact scheme (Bearer, Basic, Token) is in their docs.
Timeouts Are Not Optional
By default, requests will wait forever for a response. In a real program, that turns a flaky server into a hung script. Always pass a timeout:
import requests

try:
    response = requests.get("https://api.example.com/slow", timeout=5)
except requests.Timeout:
    print("Server took too long.")
The number is seconds. timeout=5 means "give up if we don't have a response in 5 seconds." You can pass a tuple (connect_timeout, read_timeout) for more control.
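A quick sketch of the tuple form, with illustrative numbers and the same placeholder URL as above:

# 3.05 seconds to establish the connection, 27 seconds to read the response
response = requests.get("https://api.example.com/slow", timeout=(3.05, 27))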
Error Handling
Two kinds of problems can happen:
- HTTP-level errors (4xx, 5xx) — the server responded, but the response is an error. response.status_code tells you.
- Network-level problems — timeouts, DNS failures, unreachable hosts. These raise exceptions.
The idiomatic pattern combines both:
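A sketch of that pattern, with a placeholder URL:

import requests

try:
    response = requests.get("https://api.example.com/users/1", timeout=10)
    response.raise_for_status()
except requests.RequestException as exc:
    print(f"Request failed: {exc}")
else:
    print(response.json())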
raise_for_status() is a no-op for 2xx responses and an exception-raiser otherwise. RequestException is the base class for every error requests raises — a single catch for "anything that went wrong talking to this endpoint."
Downloading a File
For large binary downloads, stream the response so it doesn't sit in memory:
import requests

url = "https://example.com/large.zip"

with requests.get(url, stream=True, timeout=30) as r:
    r.raise_for_status()
    with open("large.zip", "wb") as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
stream=True tells requests not to preload the body. iter_content(chunk_size=...) yields the body a chunk at a time, which you write straight to disk.
Sessions: Reuse Connections and Defaults
If you're going to make several requests to the same service, use a Session. It reuses the underlying TCP connection (faster) and lets you set defaults once:
import requests
session = requests.Session()
session.headers.update({"Authorization": "Bearer abc123"})
# Every request through this session carries the header.
a = session.get("https://api.example.com/users/1")
b = session.get("https://api.example.com/users/2")
c = session.post("https://api.example.com/users", json={"name": "Rosa"})
For scripts that hit the same API dozens of times, a session is a meaningful speedup.
A Realistic Example: Small GitHub Client
Fetching the latest release of a repo:
import requests

def latest_release(owner, repo):
    url = f"https://api.github.com/repos/{owner}/{repo}/releases/latest"
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    data = response.json()
    return {
        "tag": data["tag_name"],
        "name": data["name"],
        "published": data["published_at"],
        "url": data["html_url"],
    }

release = latest_release("python", "cpython")
print(release)
Under fifteen lines: build the URL, request it, check for errors, extract the fields you care about. That's the shape of most API clients you'll write.
What About urllib?
The standard library's urllib.request can do everything requests can — in more lines and with worse ergonomics. If you absolutely cannot add a dependency, it's there:
import json
import urllib.request

with urllib.request.urlopen("https://api.github.com/repos/python/cpython") as r:
    data = json.loads(r.read().decode("utf-8"))

print(data["name"])
For anything beyond a quick script, requests (or httpx if you need async) is worth the install.
A Few Habits
- Always set a timeout. No exceptions.
- Use raise_for_status() in scripts that expect success — it converts a bad response into a loud exception.
- Pass dicts to params= and json=, not hand-built strings.
- Wrap related calls in a Session when you're hitting the same API many times.
- Log response.status_code and response.text when debugging — the body usually tells you exactly what the server disliked.
Next: Dates and Times
With requests in your toolbox, you can talk to any modern web API, download files, and build small integrations between services. Combined with the JSON and CSV pages before this one, you now have the full "fetch data, read it, do something with it, write it back out" loop — the shape of a huge number of real Python scripts. Most of that data has a timestamp on it, though, and the next page covers how Python represents dates, times, and the timezone traps worth avoiding.
Frequently Asked Questions
How do I make an HTTP request in Python?
Install the requests library with pip install requests, then call requests.get(url) for a GET or requests.post(url, json=...) for a POST. The response object has .status_code, .text, .json(), and .headers. Example: r = requests.get('https://api.example.com/users/1').
Should I use requests or urllib?
requests for anything you'd write by hand — its API is dramatically friendlier. urllib is built in and fine when adding a dependency is impossible, but it requires more code for things like JSON bodies and sessions. Many teams also reach for httpx, a library with a requests-compatible API, when they need async support.
How do I send JSON in a POST request in Python?
Pass json={'key': 'value'} to requests.post(...). requests serializes the dict to JSON and sets Content-Type: application/json for you. Don't pass both data= and json= — pick one.
How do I handle errors with the requests library?
Check response.status_code (200 means success), or call response.raise_for_status() to raise an exception on 4xx/5xx. Network-level problems (timeouts, DNS failures) raise subclasses of requests.RequestException — catch that to cover both.