Web Scraping Stockx.com Overview

2024-04-08

StockX is one of the biggest fashion marketplaces in the US and offers a unique, stock-like trading experience for wearables. This makes it an important web scraping target as it contains valuable pricing data.

StockX.com uses proprietary web scraping protection technology in combination with PerimeterX, Akamai and Cloudflare anti-bot services. This makes it difficult to scrape StockX data reliably, which is where web scraping APIs come in handy.

Overall, most web scraping APIs we've tested through our benchmarks perform well for StockX.com at $3.81 per 1,000 scrape requests on average.

Stockx.com scraping API benchmarks

Scrapeway runs weekly benchmarks for Stockx Products for the most popular web scraping APIs. Here's the table for this week:

Service (rank)  Success %  Change  Speed  Change  Cost $/1000  Change
1               100%       =       1.2s   +0.2    $0.15        =
2               100%       =       8.6s   -2.0    $6.9         =
3               100%       =       34.2s  +9.9    $1.9         =
4               99%        =       8.1s   +1.1    $9.8         =
5               98%        -1      9.1s   -1.4    $2.2         =
6               86%        +48     14.9s  -0.8    $2.45        =
7               62%        +1      2.2s   -0.6    $3.27        =
Data range Jun 21 - Jun 28
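
As a quick sanity check, the average cost quoted earlier can be reproduced from the Cost column of this table. The sketch below hard-codes this week's values and takes their unweighted mean; the variable names are illustrative only.

```python
from statistics import mean

# Cost per 1,000 requests for services 1-7, copied from the table above
costs_per_1000 = [0.15, 6.9, 1.9, 9.8, 2.2, 2.45, 3.27]

# Unweighted mean across all benchmarked services
average_cost = round(mean(costs_per_1000), 2)
print(f"${average_cost} per 1,000 requests")
```

Note that this is a plain average: it does not account for the differing success rates, so a cheap service with a low success rate may cost more per successful scrape than the table suggests.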

How to scrape stockx.com?

In terms of data extraction, StockX is one of the easier targets to scrape: it's a highly dynamic JavaScript application that embeds all of its data as JSON in the page HTML, which means a headless browser is not required.

That being said, StockX.com has a lot of anti-scraping technologies in place, so it's recommended to use a reliable web scraping service that can bypass the constantly changing anti-scraping measures. See benchmarks for the most up-to-date results.

StockX's HTML pages embed their data as JSON in Next.js framework variables like __NEXT_DATA__, from which the full product datasets can be easily extracted.
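
To illustrate the idea without any network access, here is a minimal sketch that pulls the __NEXT_DATA__ JSON out of an HTML string using only the Python standard library. The `NextDataParser` class, the `extract_next_data` helper and the miniature sample page are all hypothetical; real StockX pages are far larger and their exact JSON layout is not shown here.

```python
import json
from html.parser import HTMLParser


class NextDataParser(HTMLParser):
    """Collects the text inside <script id="__NEXT_DATA__">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ JSON, or raise if it is missing."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    if not parser.chunks:
        raise ValueError("no __NEXT_DATA__ script found")
    return json.loads("".join(parser.chunks))


# Made-up miniature page for illustration only
sample_html = """
<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"product": {"title": "Example Sneaker"}}}}
</script>
</body></html>
"""
data = extract_next_data(sample_html)
print(data["props"]["pageProps"]["product"]["title"])
```

Raising an error when the script tag is missing is a deliberate choice: a blocked or challenged response usually lacks this tag, so failing loudly makes anti-bot blocks easier to spot.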

Stockx.com scraper
import json
from pprint import pprint
from typing import Optional

# install using `pip install jsonpath-ng`
from jsonpath_ng import parse
from parsel import Selector
# install using `pip install scrapfly-sdk`
from scrapfly import ScrapflyClient, ScrapeConfig

# create an API client instance
client = ScrapflyClient(key="YOUR API KEY")


# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "US", render_js: bool = False, headers: Optional[dict] = None) -> Selector:
    api_result = client.scrape(ScrapeConfig(
        url=url,
        asp=True,  # enable anti-scraping protection bypass
        country=country,  # proxy country
        render_js=render_js,  # headless browser rendering, not needed for this page
        headers=headers,
    ))
    return api_result.selector


url = "https://stockx.com/air-jordan-4-retro-military-blue-2024"
selector = scrape(url)

# The entire dataset can be found in a javascript variable:
data = selector.css("script#__NEXT_DATA__::text").get()
if data is None:
    raise ValueError("__NEXT_DATA__ script not found; the page may be blocked or its layout changed")
data = json.loads(data)

# full dataset is huge but can be reduced using jsonpath:
products = []
for product in parse("$..product").find(data):
    products.append(product.value)

print(len(products))
# the resulting dataset is pretty big but here are some example fields:
pprint(products[0])
{
  "listingType": "STANDARD",
  "gender": "men",
  "browseVerticals": ["sneakers"],
  "primaryTitle": "Jordan 4 Retro",
  "secondaryTitle": "Military Blue (2024)",
  "description": "The Air Jordan 4 Retro Military Blue 2024 stands as a beacon of Jordan Brand's innovation and style. It features an off-white leather base contrasted with the striking military blue splashed on the eyelet wings, heel, and parts of the midsole, creating a look of disciplined yet daring design. The neutral grey touches on the forefoot and outsole balance the aesthetic, highlighting the sneaker's clean lines and geometric shapes.<br><br>Every detail is meticulously crafted, with the Jumpman logo on the tongue, the 'Flight' script on the tongue's underside, and a visible Air-Sole unit encased in the sole—all of which are hallmarks of the Jordan brand's dedication to quality and performance. The industrial blue mesh inserts not only offer breathability but also complement the sneaker's color scheme, adding both function and flair. The beloved Nike Air logo graces the heel and stays true to the original 1989 iteration.<br><br>Released on May 4, 2024, at a retail price of $215, the Air Jordan 4 Retro Military Blue quickly ascended to iconic status. Collectors and sneaker enthusiasts eagerly celebrated the revival of the military blue colorway on a timeless model.",
  "condition": "New",
  "productCategory": "sneakers",
  "editionType": null,
  "title": "Jordan 4 Retro Military Blue (2024)",
  "media": {
    "imageUrl": "https://images.stockx.com/images/Air-Jordan-4-Retro-Military-Blue-2024-Product.jpg?fit=fill&bg=FFFFFF&w=700&h=500&fm=webp&auto=compress&q=90&dpr=2&trim=color&updated_at=1713464187",
    "all360Images": [
      "https://images.stockx.com/360/Air-Jordan-4-Retro-Military-Blue-2024/Images/Air-Jordan-4-Retro-Military-Blue-2024/Lv2/img01.jpg?fm=webp&auto=compress&w=559&q=90&dpr=2&updated_at=1713385181",
      "..."
    ]
  },
  "brand": "Jordan",
  "variants": [ "...truncated..." ],
  "contentGroup": "sneakers",
  "traits": [ "...truncated..."],
  "model": "Jordan 4 Retro",
  "styleId": "FV5029-141",
  "market": {
    "state": {
      "lowestAsk": {
        "amount": 259
      },
      "numberOfAsks": 5990,
      "highestBid": {
        "amount": 291
      },
      "numberOfBids": 1918,
      "numberOfCustodialAsks": 0,
      "lowestCustodialAsk": null
    },
    "salesInformation": {
      "lastSale": 248
    }
  }
}

In the StockX.com scraper above, we retrieve the HTML and extract the entire page dataset from a hidden JSON variable. As StockX uses Next.js for page rendering, the full product dataset can be found under the "product" key, which we select recursively using jsonpath.
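
For readers who want to see what the recursive selection does, here is a rough standard-library sketch of the same idea as the `$..product` jsonpath expression. The `find_key` helper and the nested sample structure are hypothetical and only approximate what jsonpath does; they are not taken from StockX's actual page data.

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` at any depth of nested dicts/lists."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # keep descending, so nested matches are found as well
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Made-up structure for illustration only
data = {
    "props": {
        "pageProps": {
            "req": {"queries": [{"state": {"data": {"product": {"title": "A"}}}}]},
            "related": [{"product": {"title": "B"}}],
        }
    }
}
products = list(find_key(data, "product"))
print([p["title"] for p in products])
```

A recursive search like this is convenient because it does not depend on the exact path to the key, which tends to change between site releases.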

Why scrape Stockx Products?

StockX is a popular web scraping target because it has a large amount of e-commerce apparel data that can be used for various purposes like price monitoring, market research, and competitive analysis.

By scraping prices and sales on a schedule, we can keep track of a product's historical pricing data and current price trends. This is particularly useful because of StockX's unique pricing structure, which treats products like stocks.
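
As a rough sketch of what price monitoring could look like, the function below reduces a scraped product to a small snapshot record that can be stored on every run. The field names follow the "market" section of the example output shown earlier; the `price_snapshot` helper, the `scraped_at` argument and the idea that the ask and bid entries may be null are assumptions for illustration.

```python
from typing import Optional


def price_snapshot(product: dict, scraped_at: str) -> dict:
    """Reduce a product dict to the fields needed for price tracking."""
    state = product["market"]["state"]
    # Assumption: lowestAsk/highestBid may be null when there are no asks or bids
    lowest_ask = (state.get("lowestAsk") or {}).get("amount")
    highest_bid = (state.get("highestBid") or {}).get("amount")
    spread: Optional[int] = None
    if lowest_ask is not None and highest_bid is not None:
        spread = lowest_ask - highest_bid
    return {
        "scraped_at": scraped_at,
        "style_id": product.get("styleId"),
        "lowest_ask": lowest_ask,
        "highest_bid": highest_bid,
        "spread": spread,
        "last_sale": product["market"]["salesInformation"]["lastSale"],
    }


# Values copied from the example output above; the date is a placeholder
product = {
    "styleId": "FV5029-141",
    "market": {
        "state": {
            "lowestAsk": {"amount": 259},
            "highestBid": {"amount": 291},
        },
        "salesInformation": {"lastSale": 248},
    },
}
snapshot = price_snapshot(product, scraped_at="2024-06-28")
print(snapshot)
```

Appending one such record per scrape to a CSV file or database table is enough to build a price history for each style ID.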

Market research scraping, and especially review and bid/ask scraping, can help us understand customer preferences through sentiment analysis and price fluctuations. This can be used to identify trends through statistics and make informed decisions about product development and marketing strategies.

Stockx.com is also often scraped by Stockx sellers themselves to monitor competition and adjust their product and pricing strategies.

Finally, StockX contains a lot of user-generated data in the form of reviews, which can be used in AI training.