Walmart is one of the biggest e-commerce retailers in the United States, and its website holds product data
for both brick-and-mortar and online stores.
Walmart uses proprietary anti-scraping mechanisms that are constantly evolving.
This makes it difficult to scrape Walmart data reliably, which is where web scraping APIs come in handy.
Overall, most of the web scraping APIs we've tested in our benchmarks
perform well for Walmart, at an average cost of $2.73 per 1,000 scrape requests.
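To put that average in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the $2.73 per 1,000 requests figure applies uniformly and that each product page costs exactly one request; `estimate_cost` and its parameters are hypothetical names used only for illustration, and real provider pricing varies with the plan and the features enabled.

```python
# Average benchmark price quoted in this article (USD per 1,000 scrape requests).
AVERAGE_PRICE_PER_1000 = 2.73


def estimate_cost(requests: int, price_per_1000: float = AVERAGE_PRICE_PER_1000) -> float:
    """Estimate the cost in USD of a given number of scrape requests."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    return round(requests / 1000 * price_per_1000, 2)


if __name__ == "__main__":
    # Assumption: one scrape request per product page.
    for pages in (1_000, 50_000, 1_000_000):
        print(f"{pages:>9,} product pages: ${estimate_cost(pages):,.2f}")
```

Passing a different `price_per_1000` makes it easy to compare individual services from the benchmark table against the average.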
Walmart.com scraping API benchmarks
Scrapeway runs weekly benchmarks of the most popular web scraping APIs against Walmart product pages.
Here's the table for this week:
Walmart is relatively easy to scrape: its pages are mostly static content with
a few dynamic elements, so a headless browser is not required.
That being said, Walmart has a lot of anti-scraping mechanisms in place, so it's recommended to use
a reliable web scraping service that can bypass the constantly changing anti-scraping measures.
See benchmarks for the most up-to-date results.
Walmart's HTML can be difficult to parse because of the sheer number of data points. However,
many of them can be accessed through the Next.js framework data that Walmart embeds in the page.
To do this, look for the __NEXT_DATA__ script element in the HTML source.
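As a minimal, offline sketch of that lookup, the snippet below pulls the `__NEXT_DATA__` JSON out of an HTML string using only the Python standard library and then walks down to the product object. The sample HTML is made up, the helper names `extract_next_data` and `dig` are hypothetical, and the `props.pageProps.initialData.data.product` path is the one used by the scraper examples in this article; Walmart may change it at any time.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class _NextDataParser(HTMLParser):
    """Collects the text inside the <script id="__NEXT_DATA__"> element."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ JSON, or None if the page has none."""
    parser = _NextDataParser()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.chunks).strip()
    return json.loads(raw) if raw else None


def dig(data: Any, *keys: str) -> Any:
    """Walk nested dicts, returning None as soon as a key is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


# Made-up page used only to demonstrate the lookup.
SAMPLE_HTML = """
<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"initialData": {"data": {"product": {"name": "Example product"}}}}}}
</script>
</body></html>
"""

next_data = extract_next_data(SAMPLE_HTML)
product = dig(next_data, "props", "pageProps", "initialData", "data", "product")
print(product)
```

Using `dig` instead of chained `[...]` lookups means a changed page layout yields `None` rather than a `KeyError`, which is easier to detect and log in a long-running scraper.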
Walmart.com scraper
import json
from parsel import Selector
# install using `pip install scraperapi`
from scraper_api import ScraperAPIClient

# create an API client instance
client = ScraperAPIClient(api_key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.get(
        url=url,
        headers=headers or {},
        premium=True,
        country_code="US",
    )
    assert api_result.ok, api_result.text
    return Selector(api_result.text)

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)

# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)

{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
    "...truncated..."
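Since the full dataset is large, it is often convenient to keep only a handful of fields. The following is a sketch that flattens a product dictionary into a compact record; `summarize_product` is a hypothetical helper, the field names come from the example output above, and the sample values are copied from it. Missing fields come back as `None` instead of raising an error.

```python
def summarize_product(product: dict) -> dict:
    """Pick a few commonly used fields out of a Walmart product dictionary."""
    price_info = product.get("priceInfo") or {}
    current_price = price_info.get("currentPrice") or {}
    images = (product.get("imageInfo") or {}).get("allImages") or []
    # skip anything that is not an image object with a URL
    image_urls = [img["url"] for img in images if isinstance(img, dict) and "url" in img]
    return {
        "id": product.get("id"),
        "name": product.get("name"),
        "brand": product.get("brand"),
        "model": product.get("model"),
        "in_stock": product.get("availabilityStatus") == "IN_STOCK",
        "rating": product.get("averageRating"),
        "price": current_price.get("price"),
        "currency": current_price.get("currencyUnit"),
        "image_urls": image_urls,
    }


# Sample values copied from the example output; most fields are left out on purpose.
sample = {
    "id": "4SZSM8SXAAJT",
    "brand": "Apple",
    "model": "MGN63LL/A",
    "availabilityStatus": "IN_STOCK",
    "averageRating": 4.7,
    "priceInfo": {"currentPrice": {"price": 699, "currencyUnit": "USD"}},
}
summary = summarize_product(sample)
print(summary)
```

The same helper can be applied to the `product` variable produced by any of the scraper examples in this section, since they all read the same `__NEXT_DATA__` structure.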
import json
from parsel import Selector
# install using `pip install zenrows`
from zenrows import ZenRowsClient

# create an API client instance
client = ZenRowsClient(apikey="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.get(
        url,
        headers=headers,
        params={
            "json_response": "True",
            "premium_proxy": "True",
            "proxy_country": "US",
            "js_render": "True",
        }
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data['html'])

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)

# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)

{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
    "...truncated..."
import json
from parsel import Selector
# install using `pip install scrapfly-sdk`
from scrapfly import ScrapflyClient, ScrapeConfig, ScrapeApiResponse

# create an API client instance
client = ScrapflyClient(key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.scrape(ScrapeConfig(
        url=url,
        asp=True,
        country='US',
    ))
    return api_result.selector

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)

# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)

{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
    "...truncated..."
import json
from parsel import Selector
# webscrapingapi has a Python SDK but it's not great, use httpx instead:
# `pip install httpx`
import httpx

# create an API client instance
client = httpx.Client(timeout=180)

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.get(
        # send the request to the WebScrapingAPI endpoint rather than to Walmart directly
        # (assumed v1 endpoint; verify it against the WebScrapingAPI documentation)
        "https://api.webscrapingapi.com/v1",
        headers=headers,
        params={
            "url": url,
            "api_key": "YOUR API KEY",  # NOTE: add your API KEY here!
            "timeout": 60_000,
            "render_js": "1",
            "country": "US",
        },
    )
    assert api_result.status_code == 200, api_result.reason_phrase
    return Selector(api_result.text)

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)

# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)

{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
    "...truncated..."
import json
from parsel import Selector
# install using `pip install scrapingbee`
from scrapingbee import ScrapingBeeClient

# create an API client instance
client = ScrapingBeeClient(api_key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.get(
        url,
        headers=headers,
        params={
            "json_response": True,
            "transparent_status_code": True,
            "premium_proxy": "True",
            "country_code": "US",
            "render_js": "False",
        }
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data['body'])

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)

# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)

{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
    "...truncated..."
import json
from parsel import Selector
# install using `pip install scrapingant-client`
from scrapingant_client import ScrapingAntClient

# create an API client instance
client = ScrapingAntClient(token="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.general_request(
        url,
        browser=True,
        return_page_source=False,
        proxy_type='datacenter',
        proxy_country='US',
    )
    assert api_result.ok, api_result.text
    return Selector(api_result.text)

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)

# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)

{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
    "...truncated..."
import json
from pprint import pprint

# scrapingdog has no integration but we can use httpx
# install using `pip install httpx`
import httpx
from parsel import Selector

# create an API client instance
client = httpx.Client(timeout=180)


# create scrape function that returns HTML parser for a given URL
# note: render_js and headers are accepted but not used in this example
def scrape(url: str, country: str = "us", render_js=False, headers: dict = None) -> Selector:
    payload = {
        "api_key": "YOUR API KEY",
        "url": url,
        "premium": "true",
        "country": country,
    }
    api_result = client.post(
        "https://api.scrapingdog.com/scrape",
        json=payload,
    )
    data = api_result.json()
    assert data["success"], f"scrape failed: {data['message']}"
    return Selector(data["html"])


url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]
# the resulting dataset is pretty big but here are some example fields:
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
    "...truncated..."
The walmart.com scraper above uses HTML scraping and extracts a JSON variable that contains
the product data. This variable can be found in the script element with the id __NEXT_DATA__ in the HTML source.
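
The extraction step can also be tried offline, without any scraping API. The snippet below is a minimal sketch that uses only the Python standard library. The helper names extract_next_data and dig, and the trimmed sample_html page, are hypothetical and only serve as an illustration. It assumes the key path shown in the scraper above, which Walmart may change at any time.

```python
import json
import re
from typing import Any, Optional

# matches the <script id="__NEXT_DATA__" ...>...</script> element
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ JSON, or None if the script tag is missing."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


def dig(data: Any, *keys: str, default: Any = None) -> Any:
    """Follow a chain of dict keys, returning `default` as soon as one is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data


# tiny stand-in for a real product page (hypothetical, heavily trimmed)
sample_html = """
<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"initialData": {"data": {"product": {
  "name": "Apple MacBook Air 13.3 inch Laptop",
  "priceInfo": {"currentPrice": {"price": 699, "currencyUnit": "USD"}}
}}}}}}
</script>
</body></html>
"""

next_data = extract_next_data(sample_html)
product = dig(next_data, "props", "pageProps", "initialData", "data", "product", default={})
price = dig(product, "priceInfo", "currentPrice", "price")
print(product.get("name"), price)
```

Walking the keys with a helper like dig avoids a KeyError when a page is blocked or the layout changes, so a failed scrape can be detected and retried instead of crashing the scraper.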
Why scrape Walmart Products?
Walmart is a popular target for web scraping as it contains a massive e-commerce dataset
that can be used for various purposes like price monitoring,
market research, and competitive analysis.
With price monitoring scraping we can keep track of a product's historical pricing data and
take advantage of market fluctuations to make better purchasing decisions or investments.
Market research scraping, and especially Walmart review scraping,
can help to understand customer preferences through sentiment analysis,
identify trends through statistics, and make informed decisions about
new product development and marketing strategies.
Walmart is also often scraped by Walmart partners to monitor brand awareness and performance
and adjust their negotiation strategies.
Finally, Walmart contains so much data that it can be used in AI model training.