Zillow is one of the biggest real estate listing websites in the United States and holds a vast
amount of current and historical real estate data. This makes it one of the most popular real estate
targets for web scraping.
Zillow.com uses its own proprietary scraping protection technology in combination with the
PerimeterX anti-bot service. This makes it difficult to scrape Zillow property data reliably,
and this is where web scraping APIs come in handy.
Overall, most of the web scraping APIs we've tested in our benchmarks
perform well against Zillow.com, at an average of $2.18 per 1,000 scrape requests.
Zillow.com scraping API benchmarks
Scrapeway runs bi-weekly benchmarks for Zillow Listings against the most popular web scraping APIs. Here's the ranking for this period:
Web scraping API benchmark for zillow.com — success rate, speed, cost per 1,000 requests. Data: 2026-05-02 to 2026-05-08.
Zillow is one of the easier targets to parse as it's a highly efficient JavaScript
application that stores all of its data in JSON format, which means
a headless browser is not required.
That being said, Zillow.com has a lot of anti-scraping technology in place, so it's recommended to use
a reliable web scraping service that can bypass its constantly changing anti-scraping measures.
See benchmarks for the most up-to-date results.
Zillow's HTML pages embed their data as JSON in Next.js framework variables such as
__NEXT_DATA__, from which full listing datasets can be easily extracted, making it
an easy scraping target overall.
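To illustrate the structure, here's a minimal sketch of the double JSON decoding involved. The payload below is a mock that only mirrors the nesting — the cache key name and field values are illustrative, not real Zillow data:

```python
import json

# mock of Zillow's __NEXT_DATA__ payload illustrating the nesting:
# gdpClientCache is itself a JSON-encoded *string* inside the outer JSON document
next_data = json.dumps({
    "props": {"pageProps": {"componentProps": {
        "gdpClientCache": json.dumps({
            "PropertyQuery-1": {  # illustrative cache key, varies per page
                "property": {"zpid": 332857311, "price": 1650000}
            }
        })
    }}}
})

# decode the outer document, then decode the embedded cache string
cache = json.loads(next_data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(cache).values())[0]["property"]
print(property_data["zpid"])  # → 332857311
```

Note the second `json.loads` call: `gdpClientCache` is a JSON string embedded inside JSON, so one decode is not enough.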
Code example
zillow_scraper.py
import json
from parsel import Selector

# create scrape function that returns an HTML parser for a given URL
# (fill this in with your web scraping API of choice - see the
# provider examples for concrete implementations)
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    raise NotImplementedError("implement using your web scraping API")

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# the entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
import json
from parsel import Selector
# install using `pip install scrapfly-sdk`
from scrapfly import ScrapflyClient, ScrapeConfig

# create an API client instance
client = ScrapflyClient(key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.scrape(ScrapeConfig(
        url=url,
        asp=True,  # enable anti-scraping protection bypass
        render_js=render_js,
        country=country or None,
        headers=headers,
    ))
    return api_result.selector

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
import json
from parsel import Selector
# webscrapingapi has a Python SDK but it's not great, use httpx instead:
# `pip install httpx`
import httpx

# create an API client instance
client = httpx.Client(timeout=180)

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        "https://api.webscrapingapi.com/v1",
        headers=headers,
        params={
            "url": url,
            "api_key": "YOUR API KEY",  # NOTE: add your API KEY here!
            "timeout": 60_000,
            "render_js": "1" if render_js else "0",
        },
    )
    assert api_result.status_code == 200, api_result.reason_phrase
    return Selector(api_result.text)

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
import json
from parsel import Selector
# scrapingdog has no integration but we can use httpx
# install using `pip install httpx`
import httpx

# create an API client instance
client = httpx.Client(timeout=180)

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        "https://api.scrapingdog.com/scrape",
        params={
            "api_key": "YOUR API KEY",
            "url": url,
            "dynamic": "true" if render_js else "false",
            "premium": "true",
        },
    )
    assert api_result.status_code == 200, api_result.text
    return Selector(api_result.text)

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
import json
from parsel import Selector
# install using `pip install scraperapi`
from scraper_api import ScraperAPIClient

# create an API client instance
client = ScraperAPIClient(api_key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        url=url,
        headers=headers or {},
        render=render_js,
    )
    assert api_result.ok, api_result.text
    return Selector(api_result.text)

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
import json
from parsel import Selector
# install using `pip install zenrows`
from zenrows import ZenRowsClient

# create an API client instance
client = ZenRowsClient(apikey="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        url,
        headers=headers,
        params={
            "json_response": "true",
        },
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data['html'])

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
import json
from parsel import Selector
# install using `pip install scrapingbee`
from scrapingbee import ScrapingBeeClient

# create an API client instance
client = ScrapingBeeClient(api_key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        url,
        headers=headers,
        params={
            "json_response": True,
            "transparent_status_code": True,
        },
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data['body'])

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
import json
from parsel import Selector
# install using `pip install scrapingant-client`
from scrapingant_client import ScrapingAntClient

# create an API client instance
client = ScrapingAntClient(token="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.general_request(url)
    return Selector(api_result.content)

url = "https://www.zillow.com/homedetails/1414-1416-20th-Ave-San-Francisco-CA-94122/332857311_zpid/"
selector = scrape(url)
# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
data = json.loads(data)["props"]["pageProps"]["componentProps"]["gdpClientCache"]
property_data = list(json.loads(data).values())[0]['property']
# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(property_data)
For scraping Zillow.com we retrieve the HTML and extract the property dataset from a
hidden JSON variable. As Zillow.com is built with Next.js, this variable is available in the __NEXT_DATA__ script element.
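Once extracted, the property dataset is a plain Python dict. As a sketch of working with it, the example below flattens a few fields into a summary record — the field names (`zpid`, `livingArea`, etc.) reflect commonly seen keys in Zillow's dataset but should be verified against a live response, and the data here is mocked:

```python
# mock property dataset - the exact schema varies per listing, so the
# field names below are illustrative and should be checked against a
# real response
property_data = {
    "zpid": 332857311,
    "price": 1650000,
    "bedrooms": 6,
    "bathrooms": 4.0,
    "livingArea": 2416,
    "address": {"streetAddress": "1414-1416 20th Ave", "city": "San Francisco"},
}

# use .get() for optional fields so missing keys don't raise KeyError
summary = {
    "id": property_data.get("zpid"),
    "price": property_data.get("price"),
    "beds": property_data.get("bedrooms"),
    "baths": property_data.get("bathrooms"),
    "sqft": property_data.get("livingArea"),
    "street": property_data.get("address", {}).get("streetAddress"),
}
print(summary)
```

Defensive `.get()` access is worthwhile here because the dataset's shape changes between listing types (for sale, sold, rental) and over time as Zillow updates its frontend.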
Why scrape Zillow Listings?
Zillow is a popular web scraping target as it holds a large amount of
real estate data, from listing information to market trends and metadata.
With lead scraping, Zillow can be used to generate leads for real estate agents,
property owners and investors.
As real estate is one of the biggest markets in the world, Zillow is an invaluable
market research tool. It can be used to analyze market trends down to minute details like
specific neighborhoods and property types.
Zillow.com is also often scraped by real estate agents and investors to monitor the competition
and adjust their product and pricing strategies.