Getting Started¶

Prerequisites¶

Python 3.10+
uv package manager (for development) or pipx (for global install)

Installation¶

Option A: Development install (with uv)¶

1. Clone the repository¶

git clone https://github.com/tadeasf/python-reddit-scraper.git
cd python-reddit-scraper

2. Install dependencies¶

uv sync

3. Download the stealth browser binary¶

This is a one-time setup (~80 MB):

uv run camoufox fetch

Note

The tool will remind you to run uv run camoufox fetch if the binary is missing.

With this setup, run the tool via uv run download-reddit-media.

Option B: Global install (use anywhere)¶

Install globally with pipx so you can use it from any directory:

pipx install git+https://github.com/tadeasf/python-reddit-scraper.git
camoufox fetch

Tip

After a global install, run download-reddit-media directly; the uv run prefix is not needed. The camoufox command is also available globally via pipx's injected scripts.

First Run¶

Interactive mode¶

# Development install (uv)
uv run download-reddit-media

# Global install
download-reddit-media

You'll be prompted for everything that isn't already configured:

Subreddits — Enter subreddits (comma-separated): wallpapers,earthporn
Media types — a checkbox dialog with [x] images [x] videos [x] gifs; press Space to toggle, Enter to confirm.
Output directory — press Enter to accept the shown default (./redditdownloads) or type a path (including ~).
Max pages per subreddit — press Enter to accept 50 or type a positive integer.

Any option you pass on the command line (-s, -o, --video-only, --max-pages, …) skips the matching prompt. Any option you've saved via configure is loaded from ~/.config/python_reddit_scraper/config.yaml and also skips the prompt.
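As a rough illustration of how the prompt answers described above might be normalized, here is a minimal sketch. The function names (parse_subreddits, parse_max_pages, parse_output_dir) are hypothetical and not part of the tool's API; only the defaults (50 pages, ./redditdownloads) and the ~ expansion come from the text above.

```python
from pathlib import Path

DEFAULT_MAX_PAGES = 50                  # default shown by the prompt
DEFAULT_OUTPUT_DIR = "./redditdownloads"  # default shown by the prompt


def parse_subreddits(answer: str) -> list[str]:
    """Split a comma-separated answer and drop empty entries."""
    return [name.strip() for name in answer.split(",") if name.strip()]


def parse_max_pages(answer: str, default: int = DEFAULT_MAX_PAGES) -> int:
    """Empty input accepts the default; anything else must be a positive integer."""
    answer = answer.strip()
    if not answer:
        return default
    value = int(answer)  # raises ValueError for non-numeric input
    if value <= 0:
        raise ValueError("max pages must be a positive integer")
    return value


def parse_output_dir(answer: str, default: str = DEFAULT_OUTPUT_DIR) -> Path:
    """Empty input accepts the default; a leading ~ is expanded to the home directory."""
    return Path(answer.strip() or default).expanduser()
```

For example, parse_subreddits("wallpapers, earthporn") yields the two subreddit names with whitespace removed, and parse_max_pages("") falls back to 50.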

Saving defaults with `configure`¶

Run configure once to persist your preferred media types, output directory, and page depth:

download-reddit-media configure

This writes (or updates) the defaults: block in ~/.config/python_reddit_scraper/config.yaml while leaving any providers: block intact. Re-run configure any time to change them.
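The "update defaults, keep providers" behaviour can be pictured as a small dictionary merge. The sketch below is an assumption about the general approach, not the tool's actual code: update_defaults is a hypothetical helper, and it works on an already-parsed config dictionary rather than on the YAML file itself.

```python
import copy


def update_defaults(config: dict, new_defaults: dict) -> dict:
    """Return a copy of config with the defaults block updated.

    Every other top-level key (for example providers) is carried over unchanged.
    """
    updated = copy.deepcopy(config)
    merged = dict(updated.get("defaults") or {})
    merged.update(new_defaults)  # new values win over previously saved ones
    updated["defaults"] = merged
    return updated
```

Working on a deep copy keeps the original dictionary untouched, so a failed write would not leave the in-memory config half-updated.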

Direct mode (all flags)¶

download-reddit-media -s wallpapers,earthporn --video-only --max-pages 20 -o ~/Pictures/reddit

Resolution order¶

For each option, the first rule that matches wins:

1. CLI flag (e.g. --video-only, -o /tmp/x).
2. defaults: block in ~/.config/python_reddit_scraper/config.yaml.
3. Interactive prompt (requires a TTY; scripts should pass the flags instead).
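The precedence above can be sketched as a single function. This is an illustrative sketch only: resolve_option and its parameters are hypothetical names, and raising an error when no TTY is available is an assumption about sensible behaviour, not a documented detail of the tool.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def resolve_option(
    cli_value: Optional[T],
    saved_value: Optional[T],
    prompt: Callable[[], T],
    interactive: bool,
) -> T:
    """Pick a value using the order: CLI flag, saved default, interactive prompt."""
    if cli_value is not None:
        return cli_value          # rule 1: an explicit flag always wins
    if saved_value is not None:
        return saved_value        # rule 2: value from the defaults block
    if not interactive:
        # Assumption: without a TTY there is nobody to answer a prompt.
        raise RuntimeError("no value supplied and no TTY available; pass the flag")
    return prompt()               # rule 3: ask the user
```

A caller might pass sys.stdin.isatty() as the interactive argument, so that scripts fail fast instead of blocking on a prompt.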

Custom output directory¶

download-reddit-media -s wallpapers -o ~/Pictures/reddit

Output Structure¶

Files live directly under the output directory, organized by subreddit and media type:

<output-dir>/
├── wallpapers/
│   ├── images/
│   ├── videos/
│   └── gifs/
└── earthporn/
    ├── images/
    ├── videos/
    └── gifs/

To refresh a subreddit, re-run against the same output directory: the scraper fetches fresh posts and the downloader skips any file that already exists on disk. Partial downloads are written to {filename}.part and renamed atomically on success, so an interrupted download is never mistaken for a complete file.
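The skip-if-present check and the .part rename can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the downloader's real implementation: save_media is a hypothetical function, and the byte chunks stand in for whatever the HTTP client streams.

```python
import os
from pathlib import Path
from typing import Iterable


def save_media(
    output_dir: Path,
    subreddit: str,
    media_type: str,
    filename: str,
    chunks: Iterable[bytes],
) -> bool:
    """Write a file under <output-dir>/<subreddit>/<media-type>/.

    Returns True if the file was written, False if it already existed.
    """
    target = output_dir / subreddit / media_type / filename
    if target.exists():
        return False  # already downloaded on an earlier run

    target.parent.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".part")
    with open(partial, "wb") as fh:
        for chunk in chunks:
            fh.write(chunk)

    # os.replace is atomic when source and destination share a filesystem,
    # so readers see either no file or the complete file.
    os.replace(partial, target)
    return True
```

If the process dies mid-write, only the .part file is left behind, and the next run does not treat it as a finished download because the final name never existed.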

Configuration File¶

User configuration lives in ~/.config/python_reddit_scraper/config.yaml. Both top-level blocks are optional.

defaults:
  media_types: [images, videos, gifs]   # non-empty subset of these three
  output_dir: /home/you/redditdownloads
  max_pages: 50

providers:
  - name: webshare
    accounts:
      - email: you@example.com
        api_key: <webshare-api-key>
  - name: proxy-cheap
    accounts:
      - username: <user>
        password: <pass>
        ip_address: 178.93.44.23
        port: 46271

Tip

Use download-reddit-media configure to write the defaults: block — it preserves your providers: block on write.
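If you edit the file by hand, a quick sanity check on the defaults: block can catch typos before a run. The sketch below is hypothetical (validate_defaults is not part of the tool) and checks only the two constraints stated on this page: media_types is a non-empty subset of the three known types, and max_pages is a positive integer. It takes the already-parsed defaults mapping as input.

```python
ALLOWED_MEDIA_TYPES = ("images", "videos", "gifs")


def validate_defaults(defaults: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the block looks fine."""
    problems = []

    media = defaults.get("media_types")
    if media is not None:
        if not isinstance(media, list) or not media:
            problems.append("media_types must be a non-empty list")
        else:
            unknown = [str(m) for m in media if m not in ALLOWED_MEDIA_TYPES]
            if unknown:
                problems.append("unknown media types: " + ", ".join(unknown))

    pages = defaults.get("max_pages")
    if pages is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(pages, bool) or not isinstance(pages, int) or pages <= 0:
            problems.append("max_pages must be a positive integer")

    return problems
```

Missing keys are not reported, since both the block and each of its keys are optional.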

Proxy providers¶

webshare: rotating proxies fetched via API on every run. Multiple accounts are tried in order; empty or invalid accounts are skipped.
proxy-cheap: static HTTP endpoints. SOCKS5 is not supported: Camoufox is built on Playwright's Firefox, which cannot authenticate against SOCKS5 proxies and fails with the error "Browser does not support socks5 proxy authentication". Request an HTTP endpoint from your proxy-cheap dashboard.
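To make the HTTP-only constraint concrete, here is a sketch of turning one proxy-cheap account entry into the proxy mapping that Playwright documents (server, username, password). Whether the scraper builds its proxy settings this way is an assumption; proxy_settings and its scheme parameter are hypothetical names.

```python
def proxy_settings(account: dict, scheme: str = "http") -> dict:
    """Build a Playwright-style proxy mapping from a proxy-cheap account entry."""
    if scheme.lower().startswith("socks"):
        # Authenticated SOCKS5 is rejected by the browser, so fail early.
        raise ValueError("authenticated SOCKS5 proxies are not supported; use an HTTP endpoint")
    return {
        "server": f"{scheme}://{account['ip_address']}:{account['port']}",
        "username": account["username"],
        "password": account["password"],
    }
```

With the example account from the configuration file, the server value would be http://178.93.44.23:46271.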

Next Steps¶

See CLI Reference for all available options
See API Reference for using the Python API directly