How to scrape walmart.com and which web scraping API to use

Walmart is one of the biggest e-commerce retailers in the United States, and its website carries product data for both its brick-and-mortar and online stores.

Walmart uses proprietary anti-scraping mechanisms that are constantly evolving. This makes it difficult to scrape Walmart data reliably, which is where web scraping APIs come in handy.

Overall, most web scraping APIs we've benchmarked perform well on Walmart, at an average cost of $3.03 per 1,000 scrape requests.

Walmart.com scraping API benchmarks

Scrapeway runs weekly benchmarks for Walmart Products for the most popular web scraping APIs. Here's the table for this week:

#	Service	Success %	Speed	Cost $/1000
1	Scraperapi	100% =	12.7s +4.5	$2.45 =
2	Scrapingant	99% +1	30.8s +7.4	$1.9 =
3	Zenrows	94% +50	8.4s +1.5	$6.9 =
4	Scrapfly	84% -2	7.8s +2.6	$3.97 +0.03
5	WebScrapingAPI	84% +3	11.1s -5.8	$2.71 =
6	Scrapingbee	50% -3	2.3s -0.4	$3.27 =
7	Scrapingdog	0%	-	-

Data range Jun 14 - Jun 20
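
Success rate and speed can be read together by estimating how long one successful scrape takes. The sketch below is an illustration, not part of the benchmark methodology: it assumes failed requests are simply retried, that attempts are independent, and that every attempt takes the average time from the table. The `Result` type and `expected_seconds_per_success` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Result:
    service: str
    success_rate: float  # fraction of successful scrapes, 0.0 to 1.0
    speed_s: float       # average seconds per request


# Figures copied from the table above (Jun 14 - Jun 20).
# Scrapingdog is left out because it has no speed figure this week.
RESULTS = [
    Result("Scraperapi", 1.00, 12.7),
    Result("Scrapingant", 0.99, 30.8),
    Result("Zenrows", 0.94, 8.4),
    Result("Scrapfly", 0.84, 7.8),
    Result("WebScrapingAPI", 0.84, 11.1),
    Result("Scrapingbee", 0.50, 2.3),
]


def expected_seconds_per_success(result: Result) -> float:
    """Expected time to get one successful scrape when retrying on failure.

    With independent attempts the expected number of attempts is
    1 / success_rate (the mean of a geometric distribution).
    """
    if result.success_rate <= 0:
        raise ValueError("success rate must be positive")
    return result.speed_s / result.success_rate


# Sort from fastest to slowest expected time per successful scrape.
ranked = sorted(RESULTS, key=expected_seconds_per_success)
for r in ranked:
    print(f"{r.service:<15} {expected_seconds_per_success(r):5.1f}s per successful scrape")
```

Under these assumptions a fast service with a low success rate can still come out ahead on time. The estimate says nothing about cost, since billing for failed requests may differ between services.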

How to scrape walmart.com?

Walmart is relatively easy to scrape: the pages are mostly static content with a few dynamic elements, so a headless browser is not required.

That being said, Walmart has a lot of anti-scraping mechanisms in place, so it's recommended to use a reliable web scraping service that can bypass the constantly changing anti-scraping measures. See benchmarks for the most up-to-date results.

Walmart's HTML can be difficult to parse because of the sheer number of data points. However, many of them can be accessed through the data that the Next.js framework Walmart uses embeds in the page. To do this, look for the __NEXT_DATA__ script element in the HTML source.
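
As a minimal, offline sketch of that lookup, the example below pulls the `__NEXT_DATA__` JSON out of an HTML string using only the Python standard library. The `NextDataParser`, `extract_next_data` and `dig` names are hypothetical helpers, and `sample_html` is a tiny stand-in for a real page. Returning `None` instead of raising is a design choice for pages that lack the script, which may happen when a request is blocked.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class NextDataParser(HTMLParser):
    """Collects the text inside <script id="__NEXT_DATA__">."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ JSON, or None if the page has none."""
    parser = NextDataParser()
    parser.feed(html)
    raw = "".join(parser.chunks).strip()
    return json.loads(raw) if raw else None


def dig(data: Any, *keys: str) -> Any:
    """Walk nested dicts, returning None instead of raising on a missing key."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


# Tiny stand-in page; a real product page embeds a much larger payload.
sample_html = """
<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"initialData": {"data": {"product": {"name": "Example"}}}}}}
</script>
</body></html>
"""

next_data = extract_next_data(sample_html)
product = dig(next_data, "props", "pageProps", "initialData", "data", "product")
print(product)
```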

Walmart.com scraper

import json
from parsel import Selector
# install using `pip install scraperapi`
from scraper_api import ScraperAPIClient

# create an API client instance
client = ScraperAPIClient(api_key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.get(
        url=url, 
        headers=headers or {},
        premium=True,
        country_code="US",
        )
    assert api_result.ok, api_result.text
    return Selector(api_result.text)

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
  "...truncated..."
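
Since the full product dataset is large, it can be convenient to flatten it into a handful of fields. The sketch below only uses keys that appear in the example output above; `summarize_product` is a hypothetical helper, and the `.get()` fallbacks assume that any of these keys may be missing on other products.

```python
def summarize_product(product: dict) -> dict:
    """Pick a few commonly needed fields out of the large product dict."""
    current_price = (product.get("priceInfo") or {}).get("currentPrice") or {}
    images = (product.get("imageInfo") or {}).get("allImages") or []
    return {
        "id": product.get("id"),
        "name": product.get("name"),
        "brand": product.get("brand"),
        "in_stock": product.get("availabilityStatus") == "IN_STOCK",
        "rating": product.get("averageRating"),
        "price": current_price.get("price"),
        "currency": current_price.get("currencyUnit"),
        # keep only entries that are dicts carrying a URL
        "images": [img["url"] for img in images if isinstance(img, dict) and "url" in img],
    }


# Trimmed copy of the example output above.
example_product = {
    "id": "4SZSM8SXAAJT",
    "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
    "availabilityStatus": "IN_STOCK",
    "averageRating": 4.7,
    "brand": "Apple",
    "imageInfo": {
        "allImages": [
            {
                "id": "0D4F1BA24DB24A7F89FA742D2A069922",
                "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
                "zoomable": True,
            },
        ],
    },
    "priceInfo": {
        "currentPrice": {"price": 699, "priceString": "$699.00", "currencyUnit": "USD"},
    },
}

summary = summarize_product(example_product)
print(summary)
```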

import json
from parsel import Selector
# install using `pip install scrapingant-client`
from scrapingant_client import ScrapingAntClient

# create an API client instance
client = ScrapingAntClient(token="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.general_request(
        url, 
        browser=True,
        return_page_source=False,
        proxy_type='datacenter',
        proxy_country='US',
        )
    assert api_result.ok, api_result.text
    return Selector(api_result.text)

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
  "...truncated..."

import json
from parsel import Selector
# install using `pip install zenrows`
from zenrows import ZenRowsClient

# create an API client instance
client = ZenRowsClient(apikey="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.get(
        url, 
        headers=headers,
        params={
            "json_response": "True",
            "premium_proxy": "True",
            "proxy_country": "US",
            "js_render": "True",
            }
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data['html'])

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
  "...truncated..."

import json
from parsel import Selector
# install using `pip install scrapfly-sdk`
from scrapfly import ScrapflyClient, ScrapeConfig, ScrapeApiResponse

# create an API client instance
client = ScrapflyClient(key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.scrape(ScrapeConfig(
        url=url, 
        asp=True,
        country='US',
        ))
    return api_result.selector

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
  "...truncated..."

import json
from parsel import Selector
# webscrapingapi has a Python SDK but it's not great, use httpx instead:
# `pip install httpx`
import httpx

# create an API client instance
client = httpx.Client(timeout=180)

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
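    api_result = client.get(
        # send the request to the API endpoint; the target page goes in the "url" param
        "https://api.webscrapingapi.com/v1",
        headers=headers,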
        params={
            "url": url,
            "api_key": "YOUR API KEY",  # NOTE: add your API KEY here!
            "timeout": 60_000,
            "render_js": "1",
            "country": "US",
            },
    )
    assert api_result.status_code == 200, api_result.reason_phrase
    return Selector(api_result.text)

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
  "...truncated..."

import json
from parsel import Selector
# install using `pip install scrapingbee`
from scrapingbee import ScrapingBeeClient

# create an API client instance
client = ScrapingBeeClient(api_key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    api_result = client.get(
        url, 
        headers=headers,
        params={
            "json_response": True,
            "transparent_status_code": True,
            "premium_proxy": "True",
            "country_code": "US",
            "render_js": "False",
            }
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data['body'])

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
  "...truncated..."

import json
from parsel import Selector
# scrapingdog has no integration but we can use httpx
# install using `pip install httpx`
import httpx

# create an API client instance
client = httpx.Client(timeout=180)

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str="", render_js=False, headers: dict=None) -> Selector:
    payload = {
        "api_key": "YOUR API KEY",
        "url": url,
        "premium": "true",
        "country": "us",
    }
    api_result = client.post(
        "https://api.scrapingdog.com/scrape",
        json=payload,
    )
    data = api_result.json()
    assert data['success'], f"scrape failed: {data['message']}"
    return Selector(data['html'])

url = "https://www.walmart.com/ip/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage/609040889"
selector = scrape(url)
# Walmart is using NextJS framework so the product data is stored in a JSON variable
data = selector.xpath('//script[@id="__NEXT_DATA__"]/text()').get()
data = json.loads(data)
product = data["props"]["pageProps"]["initialData"]["data"]["product"]

# the resulting dataset is pretty big but here are some example fields:
from pprint import pprint
pprint(product)
{
  "id": "4SZSM8SXAAJT",
  "name": "Apple MacBook Air 13.3 inch Laptop - Space Gray, M1 Chip, 8GB RAM, 256GB storage",
  "shortDescription": "Introducing The 13-inch MacBook Air with the Apple M1 chip is incredibly thin and light with a silent fanless design. It delivers remarkable performance and up to 18 hours of battery life. And it has a beautiful Retina display for super sharp text and vibrant colors. Amazing performance, Unbeatable price. It's a laptop you’re going to love!",
  "additionalOfferCount": 2,
  "availabilityStatus": "IN_STOCK",
  "averageRating": 4.7,
  "associatedBundleId": null,
  "suppressReviews": false,
  "brand": "Apple",
  "productTypeId": "710",
  "model": "MGN63LL/A",
  "buyNowEligible": true,
  "fulfillmentType": "FC",
  "fulfillmentBadge": "Tomorrow",
  "checkStoreAvailabilityATC": false,
  "checkAvailabilityGlobalDFS": false,
  "hasSellerBadge": null,
  "hasCarePlans": true,
  "hasHomeServices": null,
  "itemType": null,
  "primaryUsItemId": "609040889",
  "conditionType": "New",
  "imageInfo": {
    "allImages": [
      {
        "id": "0D4F1BA24DB24A7F89FA742D2A069922",
        "url": "https://i5.walmartimages.com/seo/Apple-MacBook-Air-13-3-inch-Laptop-Space-Gray-M1-Chip-8GB-RAM-256GB-storage_af1d4133-6de9-4bdc-b1c6-1ca8bd0af7a0.c0eb74c31b2cb05df4ed11124d0e255b.jpeg",
        "zoomable": true
      },
      "...truncated...",
    ],
  },
  "priceInfo": {
    "currentPrice": {
      "price": 699,
      "priceString": "$699.00",
      "variantPriceString": "$699.00",
      "currencyUnit": "USD",
      "bestValue": null,
      "priceDisplay": "$699.00"
    },
  "...truncated..."

The walmart.com scrapers above fetch the page HTML and extract a JSON variable that contains the product data. This variable is stored in the __NEXT_DATA__ script element of the HTML source.
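
The extraction step can be made a little more defensive. The following is a minimal sketch, not part of the benchmarked scrapers: it assumes the page embeds its data in a script element with the id __NEXT_DATA__ and uses the same key path as the scrapers above. It relies only on the standard library (`re`, `json`) instead of parsel so it runs without extra packages, and the helper name `extract_product` is hypothetical. It returns `None` when the script element or any expected key is missing, which can happen when a block page is returned instead of the product page.

```python
import json
import re
from typing import Optional

# matches the contents of <script id="__NEXT_DATA__" ...>...</script>
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)

# key path used by the scrapers above to reach the product object
PRODUCT_PATH = ("props", "pageProps", "initialData", "data", "product")


def extract_product(html: str) -> Optional[dict]:
    """Return the product dict from page HTML, or None if it is not there."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None  # no __NEXT_DATA__ script element in this page
    try:
        node = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # script element present but not valid JSON
    # walk the key path one level at a time, bailing out on any missing key
    for key in PRODUCT_PATH:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node if isinstance(node, dict) else None


# usage with a tiny hand-written page standing in for a real API response
sample_product = {"name": "Example", "priceInfo": {"currentPrice": {"price": 699}}}
sample_data = {"props": {"pageProps": {"initialData": {"data": {"product": sample_product}}}}}
sample_html = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    + json.dumps(sample_data)
    + "</script></body></html>"
)

product = extract_product(sample_html)
print(product["priceInfo"]["currentPrice"]["price"])  # 699
print(extract_product("<html><body>blocked</body></html>"))  # None
```

Returning `None` instead of raising lets the caller decide whether to retry the scrape or skip the URL.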

Why scrape Walmart Products?

Walmart is a popular target for web scraping as it contains a massive e-commerce dataset that can be used for various purposes like price monitoring, market research, and competitive analysis.

With price monitoring scraping we can keep track of a product's historical pricing data and take advantage of market fluctuations to make better purchasing decisions or investments.
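
As a rough illustration of price monitoring, the sketch below records one price observation per scrape and reports the change against the previous observation. It assumes a `product` dict shaped like the example output above (the `priceInfo` → `currentPrice` → `price` field). The `PriceHistory` class and its method names are hypothetical, and a real tracker would persist observations to a file or database rather than keep them in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class PriceHistory:
    """In-memory price log for a single product (hypothetical helper)."""

    product_id: str
    observations: List[Tuple[datetime, float]] = field(default_factory=list)

    def record(self, product: dict, seen_at: Optional[datetime] = None) -> Optional[float]:
        """Store the current price and return the change vs. the previous one."""
        price = float(product["priceInfo"]["currentPrice"]["price"])
        previous = self.observations[-1][1] if self.observations else None
        self.observations.append((seen_at or datetime.now(timezone.utc), price))
        # the first observation has nothing to compare against
        return None if previous is None else price - previous

    def lowest(self) -> Optional[float]:
        """Lowest price seen so far, or None if nothing was recorded."""
        return min((price for _, price in self.observations), default=None)


# two scrapes of the same product, with made-up prices for the second one
history = PriceHistory(product_id="609040889")
history.record({"priceInfo": {"currentPrice": {"price": 699}}})
change = history.record({"priceInfo": {"currentPrice": {"price": 649}}})
print(change)            # -50.0
print(history.lowest())  # 649.0
```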

Market research scraping, and especially Walmart review scraping, can help understand customer preferences through sentiment analysis, identify trends through statistics, and inform decisions about new product development and marketing strategies.

Walmart is also often scraped by Walmart partners to monitor brand awareness and performance and adjust their negotiation strategies.

Finally, Walmart contains so much data that it can be used in AI model training.
