Web scraping indeed.com

Name: indeed.com Web Scraping API Benchmark
Creator: Scrapeway

Last updated: 2024-04-08

Indeed is one of the biggest job listing and recruitment portals in the world.

Indeed.com uses proprietary, constantly updated anti-scraping technology together with the Cloudflare anti-bot service. This makes it difficult to scrape Indeed data reliably, which is where web scraping APIs come in handy.

Overall, most web scraping APIs we've tested through our benchmarks perform well on Indeed.com, at an average cost of $4.39 per 1,000 scrape requests.

Indeed.com scraping API benchmarks

Scrapeway runs bi-weekly benchmarks for Indeed Jobs against the most popular web scraping APIs. Here's the ranking for this period:

Web scraping API benchmark for indeed.com — success rate, speed, cost per 1,000 requests. Data: 2026-07-11 to 2026-07-17.
#	Service	Success	Speed	Cost/1k	Reviews
1 🥇	Scrapfly	95% +9	7.5s +=	$7.18 +0.3	(237) ★ 4.9
2 🥈	Firecrawl	86% -7	7.8s +1.5	$8.18 -1.27	—
3 🥉	WebScrapingAPI	82% -1	19.1s +2.5	$2.71 =	—
4	Scraperapi	81% -8	2.3s =	$4.9 =	(62) ★ 4.6
5	Zenrows	40% +34	5.8s -0.6	$6.9 =	(103) ★ 4.8
6	Scrapingbee	32% -6	2.0s +0.1	$3.37 +0.15	(137) ★ 4.9
7	Scrapingant	15% =	30.5s +10.3	$1.9 =	—
8	Scrapingdog	0% —	— —	— —	—
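
One way to read the table is to set a minimum success rate and then compare the remaining services by cost. The sketch below does that with the figures from this period. The 80% threshold and the `cheapest_reliable` helper are assumptions chosen for illustration, not part of the benchmark methodology.

```python
# Rows copied from the benchmark table: (service, success rate, USD per 1,000 requests).
# Scrapingdog is left out because it has no cost figure for this period.
BENCHMARK = [
    ("Scrapfly", 0.95, 7.18),
    ("Firecrawl", 0.86, 8.18),
    ("WebScrapingAPI", 0.82, 2.71),
    ("Scraperapi", 0.81, 4.90),
    ("Zenrows", 0.40, 6.90),
    ("Scrapingbee", 0.32, 3.37),
    ("Scrapingant", 0.15, 1.90),
]


def cheapest_reliable(rows, min_success=0.80):
    """Return the services that meet the success threshold, cheapest first."""
    reliable = [row for row in rows if row[1] >= min_success]
    return sorted(reliable, key=lambda row: row[2])


for name, success, cost in cheapest_reliable(BENCHMARK):
    print(f"{name}: {success:.0%} success, ${cost:.2f} per 1,000 requests")
```

Raising or lowering `min_success` changes which services qualify, so the result depends on how much failure a given project can tolerate.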

How to scrape indeed.com?

Indeed.com is relatively easy to scrape: its pages are mostly static content with very few dynamic elements, so a headless browser is not required.

That being said, Indeed.com has several anti-scraping technologies in place, so it's recommended to use a reliable web scraping service that can bypass the constantly changing anti-scraping measures. See benchmarks for the most up-to-date results.
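
Even the best-performing services in the benchmark fail some requests, so scraping code commonly wraps each request in a retry. The following is a minimal sketch of retrying with exponential backoff. The `retry` helper and the `flaky_scrape` stand-in are hypothetical names used only for this illustration, and any real scrape call could be passed in instead.

```python
import random
import time


def retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or `attempts` is exhausted, backing off exponentially."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # wait 1x, 2x, 4x ... base_delay, plus a little jitter
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1))


# Stand-in for a real scrape call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_scrape():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("blocked")
    return "<html>ok</html>"


# The no-op sleep skips real waiting in this demo.
result = retry(flaky_scrape, sleep=lambda seconds: None)
print(result, calls["count"])
```

In real use the default `time.sleep` would be kept, and the caught exception type would be narrowed to whatever the chosen client raises on a blocked or failed request.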

Indeed's HTML pages are well structured and minimal, so they can be easily parsed using traditional HTML parsing tools like XPath or CSS selectors. That's often unnecessary, though, as the entire page dataset is available in JSON variables like _initialData.
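
As a rough illustration of reading such an embedded variable, the sketch below pulls a JSON object out of page source using only the standard library. The sample HTML is a made-up, heavily trimmed stand-in for a real page, and `extract_json_variable` is a hypothetical helper. It assumes the value assigned to the variable is valid JSON.

```python
import json
import re

# Made-up sample: real pages embed a much larger object.
SAMPLE_HTML = """
<html><body>
<script>window._initialData={"jobTitle": "Python Developer", "tags": ["remote", "{senior}"]};</script>
</body></html>
"""


def extract_json_variable(html: str, name: str) -> dict:
    """Find `name = {...}` in the page source and decode the JSON value after it."""
    match = re.search(re.escape(name) + r"\s*=\s*", html)
    if match is None:
        raise ValueError(f"variable {name!r} not found")
    # raw_decode parses one JSON value starting at the given index and ignores
    # whatever follows it, so braces inside strings or a trailing ";" are harmless.
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return data


data = extract_json_variable(SAMPLE_HTML, "_initialData")
print(data["jobTitle"])
```

Compared with a non-greedy regular expression over the braces, decoding from the start position avoids cutting the object short when a string value happens to contain `}`.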

Code example

indeed_scraper.py

import json
from pprint import pprint

from parsel import Selector

# install using `pip install scrapfly-sdk`
from scrapfly import ScrapflyClient, ScrapeConfig

# create an API client instance
client = ScrapflyClient(key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js: bool = False, headers: dict = None) -> Selector:
    api_result = client.scrape(ScrapeConfig(
        url=url,
        asp=True,  # enable anti scraping protection bypass
        render_js=render_js,  # headless browser rendering; off by default
        country=country or None,  # proxy country; None leaves the choice to the API
        headers=headers,
        cache=False,
        debug=True,
        method='GET',
    ))
    return api_result.selector

# example job page url: https://www.indeed.com/viewjob?jk=b890040081be3ef3
# example search page url:
url = "https://www.indeed.com/jobs?q=python&l=Seattle%2C%20WA"
selector = scrape(url, country="US")

# Indeed jobs can be found in a JavaScript variable as an array of job objects:
data = selector.re(r'window\.mosaic\.providerData\["mosaic-provider-jobcards"\]=(\{.+?\});')
data = json.loads(data[0])
jobs = data["metaData"]["mosaicProviderJobCardsModel"]["results"]
print(len(jobs))  # example output: 15
pprint(jobs[0])