Etsy is one of the world's largest e-commerce marketplaces for user-generated listings, focusing on handmade and vintage items.
Etsy.com uses proprietary web scraping protection that is constantly updated, together with anti-bot services like DataDome. This makes it difficult to scrape Etsy data reliably, and this is where web scraping APIs come in handy.
Overall, most web scraping APIs we've tested through our benchmarks perform well for Etsy.com, at an average cost of $3.72 per 1,000 scrape requests.
Etsy.com scraping API benchmarks
Scrapeway runs weekly benchmarks for Etsy Products for the most popular web scraping APIs. Here's the table for this week:
Service | Success % | Speed | Cost $/1000
---|---|---|---
1 | 100% (=) | 4.4s (+0.9) | $4.9 (=)
2 | 100% (+1) | 7.8s (-1.2) | $6.9 (=)
3 | 99% (-1) | 13.1s (-1.4) | $4.19 (+0.44)
4 | 97% (+3) | 28.1s (-10.4) | $1.9 (=)
5 | 81% (-13) | 23.8s (-5.0) | $2.71 (=)
6 | 71% (+16) | 2.6s (+0.3) | $3.27 (=)
7 | 14% (=) | 8.6s (+0.3) | $2.2 (=)
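As a rough sanity check on the figures above, the sketch below recomputes the average cost from the table and estimates an effective cost per 1,000 successful scrapes. The services are keyed by table row because the table does not name them, and the effective-cost formula assumes failed requests are billed like successful ones, which may not hold for every service.

```python
# Benchmark rows copied from the table above:
# (table row, success rate, listed cost in $ per 1,000 requests)
BENCHMARKS = [
    (1, 1.00, 4.90),
    (2, 1.00, 6.90),
    (3, 0.99, 4.19),
    (4, 0.97, 1.90),
    (5, 0.81, 2.71),
    (6, 0.71, 3.27),
    (7, 0.14, 2.20),
]


def average_cost(rows):
    """Mean listed cost per 1,000 requests across all rows."""
    return sum(cost for _, _, cost in rows) / len(rows)


def effective_cost(success_rate, cost):
    """Cost per 1,000 successful scrapes.

    Assumption: failed requests are billed the same as successful ones.
    """
    return cost / success_rate


if __name__ == "__main__":
    print(f"average listed cost: ${average_cost(BENCHMARKS):.2f}")
    for row, success_rate, cost in BENCHMARKS:
        print(f"row {row}: ${effective_cost(success_rate, cost):.2f} per 1,000 successful scrapes")
```

Under that billing assumption, a cheap service with a low success rate can end up costing more per usable result than a pricier but more reliable one.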
How to scrape etsy.com?
Etsy is relatively easy to scrape: its pages are mostly static content with very few dynamic elements, so a headless browser is not required.
That being said, Etsy.com has a lot of anti-scraping technologies in place, so it's recommended to use a reliable web scraping service that can bypass the constantly changing anti-scraping measures. See benchmarks for the most up-to-date results.
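Because anti-scraping measures change and even well-performing services fail occasionally, a retry wrapper around the scrape call is a common precaution. The following is a minimal sketch, not part of any service's SDK: `scrape_with_retries`, its parameters, and the stand-in `make_flaky_scrape` helper are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


def scrape_with_retries(
    scrape: Callable[[str], str],
    url: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call scrape(url), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return scrape(url)
        except Exception:
            # re-raise once the final attempt has failed
            if attempt == attempts - 1:
                raise
            # wait base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


def make_flaky_scrape(failures: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a real API call: fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def flaky_scrape(url: str) -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise RuntimeError("blocked")
        return f"<html>{url}</html>"

    return flaky_scrape


if __name__ == "__main__":
    # skip real sleeping in this demo by passing a no-op sleep function
    html = scrape_with_retries(make_flaky_scrape(2), "https://www.etsy.com/", sleep=lambda s: None)
    print(html)
```

In real use, the `scrape` argument would be a function that calls the chosen scraping API and raises on a failed response.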
Etsy's HTML pages are well structured and even embed structured data (JSON-LD) that provides product details for automation. This data can be easily extracted using XPath or CSS selectors.
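As a minimal, offline sketch of that extraction step, the snippet below pulls JSON-LD objects out of an HTML string using only Python's standard library, so it runs without any scraping service. The `JsonLdExtractor` class, the `extract_products` helper, and the `SAMPLE_HTML` page are hypothetical names made up for illustration; the sample is a heavily trimmed imitation of a listing page, not real Etsy markup.

```python
import json
from html.parser import HTMLParser

# Trimmed, made-up page: one JSON-LD product block plus an unrelated script.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Python Cheat Sheet Coasters",
 "offers": {"@type": "Offer", "price": "21.49", "priceCurrency": "USD"}}
</script>
<script>window.tracking = {"offers": false};</script>
</head><body></body></html>
"""


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            self.documents.append(json.loads("".join(self._buffer)))


def extract_products(html: str) -> list:
    """Return all JSON-LD objects in the page whose @type is Product."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        doc for doc in parser.documents
        if isinstance(doc, dict) and doc.get("@type") == "Product"
    ]


if __name__ == "__main__":
    for product in extract_products(SAMPLE_HTML):
        print(product["name"], product["offers"]["price"])
```

Filtering on the `application/ld+json` script type and the `Product` type is a little stricter than matching any script that contains the word "offers". With parsel, the equivalent CSS selector would be `script[type="application/ld+json"]::text`.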
Using ScraperAPI:
import json
from pprint import pprint

from parsel import Selector
# install using `pip install scraperapi`
from scraper_api import ScraperAPIClient

# create an API client instance
client = ScraperAPIClient(api_key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        url=url,
        headers=headers or {},
        premium=True,
        # fall back to US when no country is given
        country_code=country or "US",
    )
    assert api_result.ok, api_result.text
    return Selector(api_result.text)

url = "https://www.etsy.com/sg-en/listing/1336771386/python-cheat-sheet-coasters-made-from-a"
# scrape through US geo-location as etsy is blocked in many countries
selector = scrape(url, country="US")
# etsy has product data as JSON in a script element
script = selector.xpath("//script[contains(text(),'offers')]/text()").get()
data = json.loads(script)
pprint(data)
Example output:
{'@context': 'https://schema.org',
'@type': 'Product',
'aggregateRating': {'@type': 'AggregateRating',
'ratingValue': '5.0',
'reviewCount': 615},
'brand': {'@context': 'https://schema.org',
'@type': 'Brand',
'name': 'WestArtFactory'},
'category': 'Electronics & Accessories < Gadgets',
'description': 'The Python Cheat Sheet is an absolute must for every software '
'engineer, hacker or programmer.\n'
'The product consists of a high-quality printed circuit board '
'(PCB).\n'
'The special thing about the board is the wafer-thin lettering '
'and conductor tracks coated with real gold.\n'
'The gold is applied very thinly using a highly complex '
'process and forms a homogeneous connection with the circuit '
'board.\n'
'This not only creates a unique look, but also a very special '
'feel.\n'
'\n'
'In addition to the excellent quality of the coaster, the '
'carefully selected software examples are also extremely '
'helpful in Python software development.\n'
'The cheat sheet offers helpful support, especially for '
'beginners. Professionals should see the cheat sheet as a sign '
'of belonging to the Python community :)\n'
'\n'
'Dimensions: 10 cm x 10 cm [3.94" x 3.94"]\n'
'Thickness: 1.6mm [0.04"]\n'
'Weight: 30g',
'gtin': 'n/a',
'image': [{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/bf2e1d/4363359320/il_fullxfull.4363359320_696a.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/c/3000/2384/0/96/il/bf2e1d/4363359320/il_340x270.4363359320_696a.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/2256d7/4363359086/il_fullxfull.4363359086_ct24.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/2256d7/4363359086/il_340x270.4363359086_ct24.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/18d781/4410691101/il_fullxfull.4410691101_mnk9.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/18d781/4410691101/il_340x270.4410691101_mnk9.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/02a44d/4410691105/il_fullxfull.4410691105_tows.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/02a44d/4410691105/il_340x270.4410691105_tows.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/3ce3cd/4363305468/il_fullxfull.4363305468_2u8t.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/3ce3cd/4363305468/il_340x270.4363305468_2u8t.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/9dd0f1/4410691113/il_fullxfull.4410691113_1vd7.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/9dd0f1/4410691113/il_340x270.4410691113_1vd7.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/764f10/4363305474/il_fullxfull.4363305474_g8nj.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/764f10/4363305474/il_340x270.4363305474_g8nj.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/d8f739/4410691059/il_fullxfull.4410691059_tgtk.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/d8f739/4410691059/il_340x270.4410691059_tgtk.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/712818/4410691049/il_fullxfull.4410691049_mhx2.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/712818/4410691049/il_340x270.4410691049_mhx2.jpg'}],
'logo': 'https://i.etsystatic.com/isla/22edb6/52736152/isla_fullxfull.52736152_gllqny0m.jpg?version=0',
'material': 'Circuit Board/Gold/FR4',
'name': 'Python Cheat Sheet Coasters made from a high quality circuit board '
'for software engineers, hackers and programmers',
'offers': {'@type': 'Offer',
'availability': 'https://schema.org/InStock',
'eligibleQuantity': 416,
'price': '21.49',
'priceCurrency': 'USD'},
'review': [{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'Sarah Givens'},
'datePublished': '2024-03-24',
'reviewBody': 'Super fast shipping and really great quality!',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}},
{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'Christina'},
'datePublished': '2024-03-14',
'reviewBody': 'Very pretty and useful, my husband was happy with '
'his gift',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}},
{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'bob'},
'datePublished': '2024-03-08',
'reviewBody': 'good long delivery due to 3rd party company delay',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}},
{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'Julien Aversano'},
'datePublished': '2024-02-27',
'reviewBody': 'Great quality, will order again',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}}],
'sku': '1336771386',
'url': 'https://www.etsy.com/listing/1336771386/python-cheat-sheet-coasters-made-from-a'}
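Once parsed, a product dictionary like the one above can be reduced to the handful of fields most pipelines need. The sketch below is one possible way to do that; `summarize_product` and its output keys are hypothetical, and it assumes `offers` is a single offer object, as in the output above, rather than a list.

```python
from typing import Any, Dict, Optional


def _to_float(value: Any) -> Optional[float]:
    """Convert a JSON-LD string or number to float, keeping missing values as None."""
    return float(value) if value is not None else None


def summarize_product(data: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a schema.org Product dictionary into a small record."""
    offer = data.get("offers") or {}
    rating = data.get("aggregateRating") or {}
    return {
        "name": data.get("name"),
        "price": _to_float(offer.get("price")),
        "currency": offer.get("priceCurrency"),
        # availability is a schema.org URL such as https://schema.org/InStock
        "in_stock": str(offer.get("availability", "")).endswith("InStock"),
        "rating": _to_float(rating.get("ratingValue")),
        "review_count": rating.get("reviewCount"),
        "image_count": len(data.get("image") or []),
    }


# Trimmed copy of the output above, used as sample input.
SAMPLE_PRODUCT = {
    "@type": "Product",
    "name": "Python Cheat Sheet Coasters made from a high quality circuit board "
            "for software engineers, hackers and programmers",
    "aggregateRating": {"@type": "AggregateRating", "ratingValue": "5.0", "reviewCount": 615},
    "offers": {
        "@type": "Offer",
        "availability": "https://schema.org/InStock",
        "price": "21.49",
        "priceCurrency": "USD",
    },
    "sku": "1336771386",
}

if __name__ == "__main__":
    print(summarize_product(SAMPLE_PRODUCT))
```

Missing fields come back as `None` instead of raising, which keeps the function usable if a listing omits ratings or offers.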
Using ZenRows:
import json
from pprint import pprint

from parsel import Selector
# install using `pip install zenrows`
from zenrows import ZenRowsClient

# create an API client instance
client = ZenRowsClient(apikey="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        url,
        headers=headers,
        params={
            "json_response": "True",
            "premium_proxy": "True",
            "js_render": "True",
            # fall back to US when no country is given
            "proxy_country": country or "US",
        }
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data['html'])

url = "https://www.etsy.com/sg-en/listing/1336771386/python-cheat-sheet-coasters-made-from-a"
# scrape through US geo-location as etsy is blocked in many countries
selector = scrape(url, country="US")
# etsy has product data as JSON in a script element
script = selector.xpath("//script[contains(text(),'offers')]/text()").get()
data = json.loads(script)
pprint(data)
Using Scrapfly:
import json
from pprint import pprint

from parsel import Selector
# install using `pip install scrapfly-sdk`
from scrapfly import ScrapflyClient, ScrapeConfig, ScrapeApiResponse

# create an API client instance
client = ScrapflyClient(key="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result: ScrapeApiResponse = client.scrape(ScrapeConfig(
        url=url,
        asp=True,
        # fall back to US when no country is given
        country=country or 'US',
    ))
    return api_result.selector

url = "https://www.etsy.com/sg-en/listing/1336771386/python-cheat-sheet-coasters-made-from-a"
# scrape through US geo-location as etsy is blocked in many countries
selector = scrape(url, country="US")
# etsy has product data as JSON in a script element
script = selector.xpath("//script[contains(text(),'offers')]/text()").get()
data = json.loads(script)
pprint(data)
Using ScrapingAnt:
import json
from pprint import pprint

from parsel import Selector
# install using `pip install scrapingant-client`
from scrapingant_client import ScrapingAntClient

# create an API client instance
client = ScrapingAntClient(token="YOUR API KEY")

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.general_request(
        url,
        browser=True,
        return_page_source=False,
        proxy_type='datacenter',
        # fall back to US when no country is given
        proxy_country=country or 'US',
    )
    assert api_result.ok, api_result.text
    return Selector(api_result.text)

url = "https://www.etsy.com/sg-en/listing/1336771386/python-cheat-sheet-coasters-made-from-a"
# scrape through US geo-location as etsy is blocked in many countries
selector = scrape(url, country="US")
# etsy has product data as JSON in a script element
script = selector.xpath("//script[contains(text(),'offers')]/text()").get()
data = json.loads(script)
pprint(data)
Using WebScrapingAPI:
import json
from pprint import pprint

# webscrapingapi has a Python SDK but it's not great, use httpx instead:
# `pip install httpx`
import httpx
from parsel import Selector

# create an API client instance
client = httpx.Client(timeout=180)

# NOTE: requests must go to the API endpoint, not to the target URL itself;
# this endpoint is an assumption, confirm it against the WebScrapingAPI documentation
API_ENDPOINT = "https://api.webscrapingapi.com/v1"

# create scrape function that returns HTML parser for a given URL
def scrape(url: str, country: str = "", render_js=False, headers: dict = None) -> Selector:
    api_result = client.get(
        API_ENDPOINT,
        headers=headers,
        params={
            "url": url,
            "api_key": "YOUR API KEY",  # NOTE: add your API KEY here!
            "timeout": 60_000,
            # fall back to US when no country is given
            "country": country or "US",
            "render_js": "1",
        },
    )
    assert api_result.status_code == 200, api_result.reason_phrase
    return Selector(api_result.text)

url = "https://www.etsy.com/sg-en/listing/1336771386/python-cheat-sheet-coasters-made-from-a"
# scrape through US geo-location as etsy is blocked in many countries
selector = scrape(url, country="US")
# etsy has product data as JSON in a script element
script = selector.xpath("//script[contains(text(),'offers')]/text()").get()
data = json.loads(script)
pprint(data)
Example output:
{'@context': 'https://schema.org',
'@type': 'Product',
'aggregateRating': {'@type': 'AggregateRating',
'ratingValue': '5.0',
'reviewCount': 615},
'brand': {'@context': 'https://schema.org',
'@type': 'Brand',
'name': 'WestArtFactory'},
'category': 'Electronics & Accessories < Gadgets',
'description': 'The Python Cheat Sheet is an absolute must for every software '
'engineer, hacker or programmer.\n'
'The product consists of a high-quality printed circuit board '
'(PCB).\n'
'The special thing about the board is the wafer-thin lettering '
'and conductor tracks coated with real gold.\n'
'The gold is applied very thinly using a highly complex '
'process and forms a homogeneous connection with the circuit '
'board.\n'
'This not only creates a unique look, but also a very special '
'feel.\n'
'\n'
'In addition to the excellent quality of the coaster, the '
'carefully selected software examples are also extremely '
'helpful in Python software development.\n'
'The cheat sheet offers helpful support, especially for '
'beginners. Professionals should see the cheat sheet as a sign '
'of belonging to the Python community :)\n'
'\n'
'Dimensions: 10 cm x 10 cm [3.94" x 3.94"]\n'
'Thickness: 1.6mm [0.04"]\n'
'Weight: 30g',
'gtin': 'n/a',
'image': [{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/bf2e1d/4363359320/il_fullxfull.4363359320_696a.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/c/3000/2384/0/96/il/bf2e1d/4363359320/il_340x270.4363359320_696a.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/2256d7/4363359086/il_fullxfull.4363359086_ct24.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/2256d7/4363359086/il_340x270.4363359086_ct24.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/18d781/4410691101/il_fullxfull.4410691101_mnk9.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/18d781/4410691101/il_340x270.4410691101_mnk9.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/02a44d/4410691105/il_fullxfull.4410691105_tows.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/02a44d/4410691105/il_340x270.4410691105_tows.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/3ce3cd/4363305468/il_fullxfull.4363305468_2u8t.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/3ce3cd/4363305468/il_340x270.4363305468_2u8t.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/9dd0f1/4410691113/il_fullxfull.4410691113_1vd7.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/9dd0f1/4410691113/il_340x270.4410691113_1vd7.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/764f10/4363305474/il_fullxfull.4363305474_g8nj.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/764f10/4363305474/il_340x270.4363305474_g8nj.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/d8f739/4410691059/il_fullxfull.4410691059_tgtk.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/d8f739/4410691059/il_340x270.4410691059_tgtk.jpg'},
{'@context': 'https://schema.org',
'@type': 'ImageObject',
'author': 'WestArtFactory',
'contentURL': 'https://i.etsystatic.com/32739805/r/il/712818/4410691049/il_fullxfull.4410691049_mhx2.jpg',
'description': None,
'thumbnail': 'https://i.etsystatic.com/32739805/r/il/712818/4410691049/il_340x270.4410691049_mhx2.jpg'}],
'logo': 'https://i.etsystatic.com/isla/22edb6/52736152/isla_fullxfull.52736152_gllqny0m.jpg?version=0',
'material': 'Circuit Board/Gold/FR4',
'name': 'Python Cheat Sheet Coasters made from a high quality circuit board '
'for software engineers, hackers and programmers',
'offers': {'@type': 'Offer',
'availability': 'https://schema.org/InStock',
'eligibleQuantity': 416,
'price': '21.49',
'priceCurrency': 'USD'},
'review': [{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'Sarah Givens'},
'datePublished': '2024-03-24',
'reviewBody': 'Super fast shipping and really great quality!',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}},
{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'Christina'},
'datePublished': '2024-03-14',
'reviewBody': 'Very pretty and useful, my husband was happy with '
'his gift',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}},
{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'bob'},
'datePublished': '2024-03-08',
'reviewBody': 'good long delivery due to 3rd party company delay',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}},
{'@type': 'Review',
'author': {'@type': 'Person', 'name': 'Julien Aversano'},
'datePublished': '2024-02-27',
'reviewBody': 'Great quality, will order again',
'reviewRating': {'@type': 'Rating',
'bestRating': 5,
'ratingValue': 5}}],
'sku': '1336771386',
'url': 'https://www.etsy.com/listing/1336771386/python-cheat-sheet-coasters-made-from-a'}
import json
from pprint import pprint

from parsel import Selector
# install using `pip install scrapingbee`
from scrapingbee import ScrapingBeeClient

# create an API client instance
client = ScrapingBeeClient(api_key="YOUR API KEY")


# create a scrape function that returns an HTML parser for a given URL
def scrape(url: str, country: str = "", render_js: bool = False, headers: dict = None) -> Selector:
    api_result = client.get(
        url,
        headers=headers,
        params={
            "json_response": True,
            "transparent_status_code": True,
            # fall back to US proxies when no country is given
            "country_code": country or "US",
            "premium_proxy": "True",
            "render_js": str(render_js),
        },
    )
    assert api_result.ok, api_result.text
    data = api_result.json()
    return Selector(data["body"])


url = "https://www.etsy.com/sg-en/listing/1336771386/python-cheat-sheet-coasters-made-from-a"
# scrape through a US geo-location as Etsy is blocked in many countries
selector = scrape(url, country="US")
# Etsy embeds product data as JSON in a script element
script = selector.xpath("//script[contains(text(),'offers')]/text()").get()
data = json.loads(script)
pprint(data)
import json
from pprint import pprint

from parsel import Selector
# Scrapingdog has no Python SDK, so we call its HTTP API with httpx
# install using `pip install httpx`
import httpx

# create an HTTP client with a generous timeout for slow scrapes
client = httpx.Client(timeout=180)


# create a scrape function that returns an HTML parser for a given URL
def scrape(url: str, country: str = "", render_js: bool = False, headers: dict = None) -> Selector:
    payload = {
        "api_key": "YOUR API KEY",
        "url": url,
        "premium": "true",
        # fall back to US proxies when no country is given
        "country": country.lower() or "us",
    }
    api_result = client.post(
        "https://api.scrapingdog.com/scrape",
        json=payload,
    )
    data = api_result.json()
    assert data["success"], f"scrape failed: {data['message']}"
    return Selector(data["html"])


url = "https://www.etsy.com/sg-en/listing/1336771386/python-cheat-sheet-coasters-made-from-a"
# scrape through a US geo-location as Etsy is blocked in many countries
selector = scrape(url, country="US")
# Etsy embeds product data as JSON in a script element
script = selector.xpath("//script[contains(text(),'offers')]/text()").get()
data = json.loads(script)
pprint(data)
As seen above, to scrape Etsy we can extract the schema.org structured data (embedded as JSON-LD), which contains most of the product data. The remaining details are available in the visible HTML and can be extracted using CSS or XPath selectors.
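As a rough illustration of that extraction step, the sketch below locates the JSON-LD script and flattens it into a compact record using only the Python standard library. The sample HTML is made up for the example (a trimmed copy of the product data shown above plus a hypothetical second JSON-LD block), and the `JsonLdCollector`, `find_product` and `summarize` names are assumptions, not part of any scraping SDK.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects every <script type="application/ld+json"> document on a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def find_product(html: str) -> dict:
    """Return the first JSON-LD document whose @type is Product."""
    collector = JsonLdCollector()
    collector.feed(html)
    for doc in collector.documents:
        if isinstance(doc, dict) and doc.get("@type") == "Product":
            return doc
    raise ValueError("no Product JSON-LD found")


def summarize(product: dict) -> dict:
    """Flatten the nested schema.org Product into a compact record."""
    offer = product.get("offers", {})
    rating = product.get("aggregateRating", {})
    return {
        "name": product.get("name"),
        "price": float(offer["price"]) if "price" in offer else None,
        "currency": offer.get("priceCurrency"),
        "rating": float(rating["ratingValue"]) if "ratingValue" in rating else None,
        "review_count": rating.get("reviewCount"),
        "images": [img["contentURL"] for img in product.get("image", [])],
    }


# Made-up page: the BreadcrumbList block is hypothetical, the Product block
# is a trimmed copy of the scraped output shown earlier.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">{"@type": "BreadcrumbList", "itemListElement": []}</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Python Cheat Sheet Coasters",
 "aggregateRating": {"@type": "AggregateRating", "ratingValue": "5.0", "reviewCount": 615},
 "image": [{"@type": "ImageObject", "contentURL": "https://i.etsystatic.com/32739805/r/il/bf2e1d/4363359320/il_fullxfull.4363359320_696a.jpg"}],
 "offers": {"@type": "Offer", "price": "21.49", "priceCurrency": "USD"}}
</script>
</head><body></body></html>
"""

summary = summarize(find_product(SAMPLE_HTML))
print(summary)
```

Matching on the script's `type` attribute and the `@type` field is one possible alternative to searching script text for the word `offers`; which is more robust on live Etsy pages would need to be checked against real responses.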
Why scrape Etsy Products?
Etsy is a popular web scraping target because it has a large amount of e-commerce data that can be used for various purposes like price monitoring, market research, and competitive analysis.
With price and sales monitoring, we can track a product's historical pricing and take advantage of market changes and trends. This data is commonly scraped by Etsy power users exploring the platform's trends before launching new products, as sales numbers and pricing data are publicly available on etsy.com.
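A minimal sketch of what such price tracking could look like once prices are collected on a schedule. The `PricePoint` type, the `price_changes` helper and the price history itself are invented for illustration; only the 21.49 price comes from the scraped output above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PricePoint:
    day: date
    price: float


def price_changes(history: list) -> list:
    """Return (day, old_price, new_price, percent_change) for each price change."""
    ordered = sorted(history, key=lambda point: point.day)
    changes = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.price != prev.price:
            pct = (cur.price - prev.price) / prev.price * 100
            changes.append((cur.day, prev.price, cur.price, round(pct, 1)))
    return changes


# Hypothetical weekly snapshots of the `offers.price` field
history = [
    PricePoint(date(2024, 3, 1), 21.49),
    PricePoint(date(2024, 3, 8), 21.49),
    PricePoint(date(2024, 3, 15), 18.99),
    PricePoint(date(2024, 3, 22), 21.49),
]

changes = price_changes(history)
for day, old, new, pct in changes:
    print(f"{day}: {old} -> {new} ({pct:+}%)")
```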
Market research scraping, and especially review scraping, can help us understand customer preferences through sentiment analysis and user metadata. This can be used to identify trends through statistics and make informed decisions about product development and marketing strategies.
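As a simple first step before any real sentiment analysis, the `review` list from the JSON-LD data can be aggregated directly. The sketch below uses the four reviews from the scraped output above; the `review_stats` helper is a hypothetical name, and rating aggregation is a stand-in here, not a sentiment model.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Reviews trimmed from the scraped product data shown earlier
reviews = [
    {"datePublished": "2024-03-24",
     "reviewBody": "Super fast shipping and really great quality!",
     "reviewRating": {"bestRating": 5, "ratingValue": 5}},
    {"datePublished": "2024-03-14",
     "reviewBody": "Very pretty and useful, my husband was happy with his gift",
     "reviewRating": {"bestRating": 5, "ratingValue": 5}},
    {"datePublished": "2024-03-08",
     "reviewBody": "good long delivery due to 3rd party company delay",
     "reviewRating": {"bestRating": 5, "ratingValue": 5}},
    {"datePublished": "2024-02-27",
     "reviewBody": "Great quality, will order again",
     "reviewRating": {"bestRating": 5, "ratingValue": 5}},
]


def review_stats(reviews: list) -> dict:
    """Summarize review count, average rating, rating spread and recency."""
    ratings = [r["reviewRating"]["ratingValue"] for r in reviews]
    dates = [date.fromisoformat(r["datePublished"]) for r in reviews]
    return {
        "count": len(reviews),
        "average_rating": mean(ratings),
        "rating_histogram": dict(Counter(ratings)),
        "latest": max(dates),
    }


stats = review_stats(reviews)
print(stats)
```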
Etsy.com is also often scraped by Etsy sellers themselves to monitor competition and adjust their pricing strategy.
Finally, Etsy contains a lot of user-generated data, which can be used in AI and machine learning models.