Web scraping twitter.com

Name: twitter.com Web Scraping API Benchmark
Creator: Scrapeway

Last updated: 2024-04-08

X.com (formerly Twitter) is one of the biggest social networks out there and a popular web scraping target for tracking social signals and announcements.

X.com uses proprietary web scraping protection mechanisms that are constantly evolving. This makes it difficult to scrape Twitter data reliably, which is where web scraping APIs come in handy.

Overall, most web scraping APIs we've tested through our benchmarks perform well for X.com at $1.67 per 1,000 scrape requests on average.

Twitter.com scraping API benchmarks

Scrapeway runs bi-weekly benchmarks for X.com Tweets against the most popular web scraping APIs. Here's the ranking for this period:

Web scraping API benchmark for twitter.com — success rate, speed, cost per 1,000 requests. Data: 2026-07-11 to 2026-07-17.
#	Service	Success (change)	Speed (change)	Cost/1k (change)	Reviews and rating
1 🥇	Scrapfly	98% -1	9.0s -32.4	$0.83 +0.38	(237) ★ 4.9
2 🥈	WebScrapingAPI	89% —	13.1s —	$2.71 —	—
3 🥉	Scrapingant	87% +85	13.5s +3.5	$1.90 =	—
4	Zenrows	40% -3	15.3s +3.7	$6.90 =	(103) ★ 4.8
5	Scrapingdog	16% -75	1.1s +0.1	$1.00 =	—
6	Firecrawl	0% —	— —	— —	—
7	Scrapingbee	0% —	— —	— —	(137) ★ 4.9
8	Scraperapi	0% —	— —	— —	(62) ★ 4.6

How to scrape twitter.com?

X.com is relatively difficult to scrape because it is a JavaScript-heavy application, so a headless browser is required to render tweet pages.

In addition, X.com has many anti-scraping mechanisms in place, so it's recommended to use a reliable web scraping service that can bypass these constantly changing measures. See the benchmarks above for the most up-to-date results.

Parsing scraped X.com data with traditional HTML parsing tools like XPath or CSS selectors is relatively easy. Twitter uses `data-testid` attributes extensively throughout its application, so the data you need can be selected from the HTML with simple attribute selectors.
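As a minimal sketch of the attribute-based parsing described above, the snippet below runs an XPath-style lookup against a hand-written fragment that imitates X.com markup. The fragment and the `find_by_testid` and `text_of` helpers are illustrative assumptions, not markup or code taken from the live site. The standard library's ElementTree stands in for a real HTML parser here, so it only handles well-formed input.

```python
import xml.etree.ElementTree as ET

# Hand-written, well-formed fragment imitating X.com markup (an assumption,
# not captured from the live site).
SAMPLE_HTML = """
<article>
  <div data-testid="tweetText"><span>Hello </span><span>world</span></div>
  <div data-testid="app-text-transition-container"><span>1.2K</span></div>
  <div data-testid="app-text-transition-container"><span>34</span></div>
</article>
"""


def find_by_testid(root: ET.Element, testid: str) -> list[ET.Element]:
    """Return every element whose data-testid attribute equals `testid`."""
    return root.findall(f".//*[@data-testid='{testid}']")


def text_of(element: ET.Element) -> str:
    """Join all text nodes under an element into one string."""
    return "".join(element.itertext()).strip()


root = ET.fromstring(SAMPLE_HTML)
tweet_text = text_of(find_by_testid(root, "tweetText")[0])
counters = [text_of(el) for el in find_by_testid(root, "app-text-transition-container")]
print(tweet_text)
print(counters)
```

On real pages the same lookup would be written against a proper HTML parser, for example as the CSS selector `[data-testid=tweetText]`.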

Code example

twitter_scraper.py

from pprint import pprint

from parsel import Selector

# install using `pip install scrapfly-sdk`
from scrapfly import ScrapflyClient, ScrapeConfig, ScrapeApiResponse

# create an API client instance
client = ScrapflyClient(key="YOUR API KEY")


# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "us", render_js: bool = False, headers: dict = None) -> Selector:
    api_result: ScrapeApiResponse = client.scrape(ScrapeConfig(
        url=url,
        asp=True,  # enable anti-scraping protection bypass
        render_js=render_js,  # headless browser rendering, needed for tweet pages
        cache=False,
        country=country,
        rendering_stage='domcontentloaded',
        method='GET',
        headers=headers or {},
    ))
    return api_result.selector


url = "https://twitter.com/XCreators/status/1770093017506189440"
selector = scrape(url, render_js=True, country="us")

# Twitter can be parsed using css selectors and data-testid attributes
# the counters are assumed to appear in this order; fewer than five raises ValueError
views, reposts, quotes, likes, bookmarks, *_ = selector.css('[data-testid=app-text-transition-container] span::text').getall()
data = {
    # join every text node of the first tweet so links and mentions are not cut off
    "tweet": "".join(selector.css("[data-testid=tweetText]")[:1].css("::text").getall()),
    "views": views,
    "reposts": reposts,
    "quotes": quotes,
    "likes": likes,
    "bookmarks": bookmarks,
}
pprint(data)
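
The counters scraped this way are display strings, not numbers. The sketch below shows one way to normalize them, assuming X.com renders counts with comma thousands separators or `K`/`M`/`B` suffixes (for example `3,456` or `1.2K`). The `parse_count` helper and the sample values are hypothetical and were not produced by the scraper above.

```python
def parse_count(text: str) -> int:
    """Convert a display counter such as '1.2K' or '3,456' to an integer."""
    multipliers = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
    cleaned = text.strip().replace(",", "").upper()
    if not cleaned:
        # treat a missing counter as zero
        return 0
    suffix = cleaned[-1]
    if suffix in multipliers:
        # round() absorbs floating point noise such as 12300.000000000002
        return round(float(cleaned[:-1]) * multipliers[suffix])
    return int(cleaned)


# Hypothetical sample values in the same shape as the scraped `data` dict.
sample = {"views": "1.2M", "reposts": "3,456", "likes": "12.3K"}
normalized = {name: parse_count(value) for name, value in sample.items()}
print(normalized)
```

Abbreviated counters are rounded by the site itself, so the parsed values are approximations of the real totals.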