How to Scrape Decathlon Data Using Playwright and Python

Decathlon is a well-known retailer in the sporting goods industry, offering a wide range of products like shoes, sports apparel, and equipment. Extracting data from the Decathlon site can offer important insights into product pricing, trends, and market details. In this blog, we will explore how to scrape apparel data from Decathlon's site by category using a combination of Python and Playwright.

Playwright is an automation library that allows you to control web browsers such as Firefox, WebKit, and Chromium using programming languages like JavaScript and Python. It is an excellent tool for web scraping because it automates tasks like filling forms, clicking buttons, and scrolling. Using Playwright, we will navigate through each category on Decathlon's website and extract product information, including each product's name, pricing, and description.

In this blog, you'll gain a fundamental understanding of how to use Python and Playwright to extract data from Decathlon's site by category. We'll scrape several data attributes from individual product pages, including Product URL, Brand, MRP, Product Name, Number of Reviews, Sale Price, Color, Features, Ratings, and Product Information.

Let's go through a step-by-step guide to using Python and Playwright to extract apparel data from Decathlon by category.

Import Necessary Libraries

To begin the process, we need to import the necessary libraries that will enable us to interact with the Decathlon website and extract the desired information. Here are the libraries we need to import:

Import-Necessary-Libraries

The required libraries for scraping data from Decathlon's website by category using Playwright and Python are:

random: This library is used for generating random numbers. In this script it supplies the random wait times between request retries, which helps avoid sending rapid, repeated requests to the site.

asyncio: This library is essential for handling asynchronous programming in Python. Since we will be using the asynchronous API of Playwright, asyncio allows us to write asynchronous code, making it easier to work with Playwright effectively.

pandas: This library is widely used for data analysis and manipulation. It provides convenient data structures and functions for handling tabular data. In this tutorial, we may use pandas to store and manipulate the data obtained from the web pages during scraping.

async_playwright: This is the asynchronous API of Playwright, which is used to automate browsers. With the asynchronous API, you can perform multiple operations concurrently, making your scraping tasks faster and more efficient.

By importing these libraries, you'll have the tools to automate browser interactions, handle asynchronous operations, and store and analyze scraped data from Decathlon's website using Playwright and Python.
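For reference, the import block might look like the following minimal sketch; the pandas alias and the exact import order are assumptions rather than a copy of the original code:

import random                 # random pauses between request retries
import asyncio                # drive Playwright's asynchronous API
import pandas as pd           # collect scraped rows and export them to CSV
from playwright.async_api import async_playwright   # asynchronous browser automation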

Product URL Extraction

The next step is to extract the URLs of the apparel products based on their respective categories. We will navigate through the Decathlon website and collect the URLs for each product in each category.

Extraction-of-Product-URLs

To extract the product URLs from a web page, we can define a Python function called get_product_urls. This function uses Playwright to drive the browser and collect the product URLs from the page.

In this function, we start by initializing an empty list called product_urls. We use page.query_selector_all to find all the elements on the page that contain the product URLs, using the CSS selector a.product-link. We then iterate through each element and extract the href attribute, which contains the URL of the product page. The extracted URLs are appended to the product_urls list.

Next, we check if there is a "next" button on the page using page.query_selector. If a "next" button exists, we click on it using next_button.click(), and then recursively call the get_product_urls function to extract URLs from the next page. The URLs extracted from the subsequent pages are also appended to the product_urls list.

Finally, we return the product_urls list, which will contain all the extracted product URLs from the web page.
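A hedged sketch of get_product_urls along these lines is shown below; the a.product-link and next-button selectors are illustrative and would need to be checked against Decathlon's current markup:

async def get_product_urls(browser, page):
    """Collect product URLs from the current listing page, following pagination."""
    product_urls = []

    # Gather every product link on the page (selector is illustrative).
    for element in await page.query_selector_all("a.product-link"):
        href = await element.get_attribute("href")
        if href:
            product_urls.append(href)

    # If a "next" button exists, click it and collect URLs from the following page too.
    next_button = await page.query_selector("button.next-page")
    if next_button:
        await next_button.click()
        await page.wait_for_load_state("networkidle")
        product_urls += await get_product_urls(browser, page)

    return product_urls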

You can call this function from the scrape_urls routine sketched a little further below, passing the browser and page instances, to extract the product URLs for each category.

You-can-call-this-function-within-the-previous

To scrape product URLs for a specific product category, we first click the category button to expand the list of available categories. We then click each category in turn to filter the products and extract the corresponding URLs.

To-scrape-product-URLs
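A minimal sketch of that category loop, assuming illustrative selectors for the filter button and the category entries, could look like this:

async def scrape_urls(browser, page):
    """Expand the category menu, click each category, and collect its product URLs."""
    urls_by_category = {}

    # Open the category filter so the list of categories becomes visible (selector illustrative).
    await page.click("button.category-filter")

    for category in await page.query_selector_all("li.category-item"):
        name = (await category.text_content() or "").strip()
        await category.click()                          # filter products by this category
        await page.wait_for_load_state("networkidle")
        urls_by_category[name] = await get_product_urls(browser, page)

    return urls_by_category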

In the previous step, we filtered the products by category and obtained the respective product URLs. Now, we will proceed to extract the names of the products from the web pages.

To extract the product names, we can define a function called extract_product_names. This function will take the page instance as a parameter and utilize Playwright's query selector to identify the elements containing the product names. We will then iterate through these elements and extract the text content, which represents the product names.

In this function, we start by initializing an empty list called product_names. We use page.query_selector_all with the CSS selector .product-name to find all elements on the page that contain the product names. Then, we iterate through each element and extract the text content using element.text_content(). The extracted names are appended to the product_names list.

You can call this function after navigating to a specific product page within your scraping script. By doing so, you will obtain a list of the product names from the page, which can be further processed or stored as needed.

You-can-call-this-function-after

In the previous step, we defined a function called extract_product_names to extract the names of the products from the web pages.

In this function, we attempt to find the corresponding product name element on the page using page.query_selector and passing the appropriate CSS selector, which is .product-name. If the element is found, we retrieve the text content using name_element.text_content() and append it to the product_names list. If the element is not found or if an exception occurs during the process, we append the string "Not Available" to the product_names list.

By handling potential exceptions and setting a default value of "Not Available" when the product name element is not found, we ensure that the function runs smoothly and provides a meaningful result even in case of unexpected situations.

You can call this function within your scraping script after navigating to the product page to extract the name of the product.
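Putting that together, a sketch of the name extractor, returning a single value per product page and falling back to "Not Available", might look like this (the .product-name selector is an assumption):

async def extract_product_name(page):
    """Return the product name, or 'Not Available' if the element cannot be found."""
    try:
        name_element = await page.query_selector(".product-name")   # illustrative selector
        product_name = (await name_element.text_content()).strip()
    except Exception:
        product_name = "Not Available"
    return product_name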

Scraping Brand and Its Products

The next step is to scrape the brand of each product from its product page.

Scraping-Brand-and-Its-Products
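The brand extractor follows the same pattern as the name extractor; a sketch with an assumed .product-brand selector:

async def extract_product_brand(page):
    """Return the product brand, or 'Not Available' if the element cannot be found."""
    try:
        brand_element = await page.query_selector(".product-brand")   # illustrative selector
        product_brand = (await brand_element.text_content()).strip()
    except Exception:
        product_brand = "Not Available"
    return product_brand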

In the previous steps, we have seen how to extract the product name and brand from web pages. Now, let's move on to extracting other attributes such as sale price, MRP, ratings, color, total reviews, product information, and features.

For each of these attributes, we can define separate functions that use the query_selector method and text_content method, or similar methods, to select the relevant element on the page and extract the desired information. These functions will follow a similar structure as the extract_product_names and extract_product_brands functions.

In these functions, we attempt to locate the corresponding elements using appropriate CSS selectors for each attribute. If the element is found, we extract its text content and assign it to the respective variable. If an exception occurs during the process, we set the variable to an empty string or any default value that suits your needs.

You can call these functions within your scraping script, after navigating to the product page, to extract the desired attributes for each product.

Remember to adjust the CSS selectors used in these functions based on the structure of the web page you are scraping. Inspecting the HTML structure of the web page will help you identify the appropriate selectors for each attribute.
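As an illustration of this shared pattern, hedged sketches for two of the attributes (MRP and sale price) are shown below; the selectors are assumptions, and the extractors for ratings, reviews, color, and features follow the same shape:

async def extract_product_mrp(page):
    """Return the product MRP, or 'Not Available' if the element cannot be found."""
    try:
        mrp_element = await page.query_selector(".product-mrp")           # illustrative selector
        mrp = (await mrp_element.text_content()).strip()
    except Exception:
        mrp = "Not Available"
    return mrp


async def extract_sale_price(page):
    """Return the product sale price, or 'Not Available' if the element cannot be found."""
    try:
        price_element = await page.query_selector(".product-sale-price")  # illustrative selector
        sale_price = (await price_element.text_content()).strip()
    except Exception:
        sale_price = "Not Available"
    return sale_price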

Scraping Products’ MRP

Scraping-Products-MRP

Extraction of Sale Price of the Products

Extraction-of-Sale-Price-of-the-Products

Scraping Total Product Reviews

Scraping-Total-Product-Reviews

Scraping Product Ratings

Scraping-Product-Ratings

Scraping Product Colors

Scraping-Product-Colors

Scraping Product Features

Scraping-Product-Features

Scraping Product Description

The next function extracts the product description section from a Decathlon product page.

In this function, we attempt to locate the element containing the product description using the CSS selector .product-description. If the element is found, we extract its text content and assign it to the description variable.

To remove unwanted characters, we split the description text by the newline character (\n) and use a list comprehension to filter out any elements that are empty or contain only whitespace. We then use the strip() method to remove leading and trailing spaces from each remaining element. The resulting list of strings represents the cleaned product description.

By using this function in your scraping script, you can extract the product description from each product page and store it as a list of strings for further processing or analysis.
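A sketch of that description cleanup, assuming a .product-description selector, could look like this:

async def extract_product_description(page):
    """Return the product description as a list of cleaned, non-empty lines."""
    try:
        description_element = await page.query_selector(".product-description")  # illustrative selector
        raw_text = await description_element.text_content()
        # Split on newlines, drop empty or whitespace-only pieces, and strip each remaining line.
        description = [line.strip() for line in raw_text.split("\n") if line.strip()]
    except Exception:
        description = []
    return description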

Scraping Product Information

Scraping-Product-Information

The get_product_information function is an asynchronous function that takes a page object as its parameter. It aims to scrape product details from Decathlon's website.

Here's an explanation of the function's logic:

  • Find all product information entries on the page.
  • Iterate through each information entry.
  • Locate the elements containing the name and value of each information entry.
  • Extract the text content of the name and value elements.
  • Remove newline characters from extracted strings.
  • Add the name-value pair to a product_information dictionary.
  • If an exception occurs during the process (e.g., elements are not found or text extraction fails), set the product_information dictionary to a "Not Available" placeholder.
  • Return the product_information dictionary.

By using this function, you can extract the product information from each product page on Decathlon's website, providing valuable insights and data for further analysis or processing.
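A hedged sketch following that logic is shown below; the entry, name, and value selectors are assumptions about the page structure:

async def get_product_information(page):
    """Return a dict of {specification name: specification value} for the product page."""
    product_information = {}
    try:
        # Each information entry holds a name element and a value element (selectors illustrative).
        entries = await page.query_selector_all(".product-information li")
        for entry in entries:
            name_element = await entry.query_selector(".info-name")
            value_element = await entry.query_selector(".info-value")
            name = (await name_element.text_content()).replace("\n", " ").strip()
            value = (await value_element.text_content()).replace("\n", " ").strip()
            product_information[name] = value
    except Exception:
        product_information = {"Product Information": "Not Available"}
    return product_information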

Request Retry Using Maximum Retry Limits

Implementing a retry mechanism is an important aspect of web scraping as it helps handle temporary network errors or unexpected responses from the website. The goal is to increase the chances of success by sending the request again if it fails initially.

In this script, a retry mechanism is implemented before navigating to a URL. It uses a while loop that keeps trying to navigate to the URL until either the request succeeds or the maximum number of retries has been reached. If the maximum is reached without a successful request, the script raises an exception to handle the failure.

This function is particularly useful when scraping web pages because requests may occasionally time out or fail due to network issues. By incorporating a retry mechanism, you improve the reliability of the scraping process and increase the likelihood of obtaining the desired data.

async-def-perform-request-with-retry-page-url

The provided function is responsible for requesting a particular link using the goto method of the Playwright page object. It incorporates a retry mechanism in case the initial request fails.

Here's an updated explanation of the function:

The perform_request_with_retry function performs a request to a specific link by utilizing the goto method of the page object. It implements a retry mechanism by allowing the request to be retried up to a maximum number of times, defined by the MAX_RETRIES constant.

If the initial request fails, the function retries the request by entering a while loop. Between each retry, the function introduces a random wait time using the asyncio.sleep method. This random wait duration, ranging from 1 to 5 seconds, helps prevent rapid and frequent retries that could potentially exacerbate the request failures.

The function takes two arguments: link and page. The page argument represents the Playwright page object used to perform the request, while the link argument denotes the URL to which the request is made.

By utilizing this function, you can ensure that requests are retried in case of temporary failures, improving the overall reliability of your web scraping process.
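A minimal sketch of perform_request_with_retry along these lines, assuming MAX_RETRIES is set to 5, might look like this:

MAX_RETRIES = 5   # assumed maximum number of attempts per request

async def perform_request_with_retry(page, link):
    """Navigate to `link`, retrying up to MAX_RETRIES times with a random pause between attempts."""
    retry_count = 0
    while retry_count < MAX_RETRIES:
        try:
            await page.goto(link)
            return                                  # navigation succeeded
        except Exception:
            retry_count += 1
            # Random 1-5 second pause to avoid rapid, repeated retries.
            await asyncio.sleep(random.uniform(1, 5))
    raise Exception(f"Request to {link} failed after {MAX_RETRIES} attempts")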

Scraping and Saving Product Data

In the following step, we call the functions defined above and save the extracted data into lists.

Scraping-and-Saving-Product-Data

The provided Python script demonstrates the usage of an asynchronous function named "main" to scrape product information from Decathlon product pages. It leverages the Playwright library to launch a Firefox browser and navigate to the Decathlon page.

The "main" function follows these steps:

1. It utilizes the "get_product_urls" function to extract the URLs of each product from the Decathlon page, storing them in a list called "product_url".

2. It then iterates through each product URL in the "product_url" list.

3. For each URL, it uses the "perform_request_with_retry" function to load the product page, making use of a retry mechanism to handle temporary failures.

4. It extracts various information such as the product name, brand, star rating, number of reviews, MRP, sale price, color, features, and product information from each product page.

5. The extracted information is stored as a tuple in a list called "data".

6. After processing every 10 product URLs, it prints a progress message.

7. Once all the product URLs have been processed, it prints a completion message.

8. The data in the "data" list is converted to a Pandas DataFrame.

9. The DataFrame is saved as a CSV file using the "to_csv" method.

10. Finally, the script closes the browser instance using the "browser.close()" statement.

To execute the script, the "main" function is called using the "asyncio.run(main())" statement, which runs the "main" function as an asynchronous coroutine.

By running this script, you can scrape product information from Decathlon pages, store it in a structured format, and save it as a CSV file for further analysis or processing.
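Putting the pieces together, a hedged sketch of main using the helpers outlined above could look like the following; the listing URL, column names, and output filename are assumptions:

async def main():
    async with async_playwright() as pw:
        browser = await pw.firefox.launch(headless=True)      # launch a Firefox instance
        page = await browser.new_page()

        # Load the apparel listing page (URL is illustrative) and collect product URLs.
        await perform_request_with_retry(page, "https://www.decathlon.com/collections/apparel")
        product_urls = await get_product_urls(browser, page)

        data = []
        for index, url in enumerate(product_urls, start=1):
            await perform_request_with_retry(page, url)

            name = await extract_product_name(page)
            brand = await extract_product_brand(page)
            mrp = await extract_product_mrp(page)
            sale_price = await extract_sale_price(page)
            information = await get_product_information(page)
            # ...the ratings, reviews, color, and features extractors follow the same pattern.

            data.append((url, name, brand, mrp, sale_price, information))

            if index % 10 == 0:
                print(f"Processed {index} of {len(product_urls)} product pages")

        print("All product pages processed")

        # Convert the collected tuples to a DataFrame and save them as CSV.
        df = pd.DataFrame(data, columns=["Product URL", "Product Name", "Brand",
                                         "MRP", "Sale Price", "Product Information"])
        df.to_csv("decathlon_products.csv", index=False)

        await browser.close()


asyncio.run(main())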

Conclusion

In today's competitive business landscape, having access to accurate and up-to-date data is crucial for making informed decisions. Web scraping offers a valuable solution to extract data from websites like Decathlon, providing valuable insights into market trends, pricing, and competitor analysis.

Businesses can automate collecting data from Decathlon's website using tools like Playwright and Python. This enables them to gather information such as product offerings, pricing, reviews, and more, which can be used to gain a competitive edge and drive growth.

Partnering with a reputable web scraping company like Actowiz Solutions can further enhance the benefits of web scraping. Actowiz Solutions offers tailored web scraping solutions that cater to specific business needs, providing access to comprehensive and relevant data. From product details to pricing information and customer reviews, Actowiz Solutions enables brands to understand their industry and competition deeply.

By leveraging the power of web data, businesses can make data-driven decisions, optimize their operations, and drive profitability. Actowiz Solutions can provide the necessary expertise and tools to extract valuable data, whether for product development, market research, or marketing campaigns.

If you're ready to unlock the potential of web data for your brand, reach out to Actowiz Solutions, the experts in web scraping. Contact us today to explore how web scraping can revolutionize your business and give you a competitive advantage.

For all your web scraping, mobile app scraping, or instant data scraper service needs, Actowiz Solutions is here to assist you. Our team of experts is skilled in extracting data from various sources, including websites and mobile applications. Whether you need to gather market data, competitor information, or any other specific data set, we have the expertise and tools to deliver accurate and reliable results.
