Automotive intelligence depends heavily on timely data about new launches, variant specifications, on-road pricing, and feature mapping across regions.
Platforms like CarWale (India) and Yalla Motors (UAE) publish real-time information on new launches, expected prices, launch dates, and variant specifications.
This tutorial shows you exactly how to scrape both platforms using Python, Selenium, Requests, and BeautifulSoup.
This is the same framework Actowiz Solutions deploys for automotive OEMs, market research agencies, and mobility analytics platforms.
Let’s begin.
pip install selenium
pip install requests
pip install beautifulsoup4
pip install pandas
pip install lxml
pip install undetected-chromedriver
CarWale new launches page:
https://www.carwale.com/new-cars/
This page lists upcoming and newly launched models with expected prices and launch dates.
import undetected_chromedriver as uc
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from time import sleep

browser = uc.Chrome()
browser.get("https://www.carwale.com/new-cars/")
sleep(4)

# Scroll to the bottom several times so lazy-loaded cards render
for _ in range(6):
    browser.find_element(By.TAG_NAME, "body").send_keys(Keys.END)
    sleep(2)
from selenium.common.exceptions import NoSuchElementException

launch_cards = browser.find_elements(By.XPATH, '//div[contains(@class,"newcar-launch-card")]')

carwale_launches = []
for card in launch_cards:
    # Catch only the missing-element case so real errors still surface
    try:
        name = card.find_element(By.CLASS_NAME, "o-bkmzIL").text
    except NoSuchElementException:
        name = ""
    try:
        price = card.find_element(By.CLASS_NAME, "o-bkzfWJ").text
    except NoSuchElementException:
        price = ""
    try:
        launch_date = card.find_element(By.CLASS_NAME, "o-cpnuEd").text
    except NoSuchElementException:
        launch_date = ""
    try:
        url = card.find_element(By.TAG_NAME, "a").get_attribute("href")
    except NoSuchElementException:
        url = ""
    carwale_launches.append({
        "platform": "CarWale",
        "model_name": name,
        "expected_price": price,
        "launch_date": launch_date,
        "url": url,
    })
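Repeated END-key scrolling can surface the same card more than once, so it is worth deduplicating before moving on. A minimal sketch, keyed on the detail-page URL; the helper name `dedupe_by_url` is ours, not from the sites:

```python
def dedupe_by_url(records):
    """Keep the first record seen per URL; records without a URL pass through."""
    seen = set()
    unique = []
    for rec in records:
        key = rec.get("url")
        if key and key in seen:
            continue  # duplicate card from an earlier scroll pass
        if key:
            seen.add(key)
        unique.append(rec)
    return unique

# e.g. carwale_launches = dedupe_by_url(carwale_launches)
```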
Every model’s detail page contains a specifications table, which we can fetch with plain Requests:
import requests
from bs4 import BeautifulSoup

def scrape_carwale_specs(url):
    try:
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "lxml")
        spec_block = soup.find("div", {"id": "specifications"})
        if not spec_block:
            return {}
        specs = {}
        for row in spec_block.find_all("tr"):
            cols = row.find_all("td")
            if len(cols) == 2:
                key = cols[0].text.strip()
                val = cols[1].text.strip()
                specs[key] = val
        return specs
    except requests.RequestException:
        return {}
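Some sites reject requests that arrive with the default `python-requests` User-Agent. A hedged sketch of browser-like headers you could pass into the fetch above; the header values are illustrative, not requirements published by either site:

```python
def browser_headers():
    """Headers that mimic a desktop browser; values are illustrative."""
    return {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Accept-Language": "en-US,en;q=0.9",
    }

# Usage: requests.get(url, headers=browser_headers(), timeout=10)
```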
for car in carwale_launches:
    car["specifications"] = scrape_carwale_specs(car["url"])
Yalla Motors new car page:
https://uae.yallamotor.com/new-cars
Yalla Motors lists new models for the UAE market with names and prices.
from selenium.common.exceptions import NoSuchElementException

browser.get("https://uae.yallamotor.com/new-cars")
sleep(4)

for _ in range(10):
    browser.find_element(By.TAG_NAME, "body").send_keys(Keys.END)
    sleep(2)

yalla_records = []
cards = browser.find_elements(By.XPATH, '//div[contains(@class,"newcars-card")]')

for item in cards:
    try:
        name = item.find_element(By.CLASS_NAME, "model-title").text
    except NoSuchElementException:
        name = ""
    try:
        price = item.find_element(By.CLASS_NAME, "model-price").text
    except NoSuchElementException:
        price = ""
    try:
        url = item.find_element(By.TAG_NAME, "a").get_attribute("href")
    except NoSuchElementException:
        url = ""
    yalla_records.append({
        "platform": "Yalla Motors",
        "model_name": name,
        "price": price,
        "url": url,
    })
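Both sites return prices as display strings rather than numbers. A hedged parser sketch; the sample formats ("AED 85,000" for Yalla Motors, "₹ 12.5 Lakh" for CarWale) and the `parse_price` name are our assumptions, not guaranteed page formats:

```python
import re

def parse_price(text):
    """Pull a numeric value out of a display price string; None if nothing matches.
    'Lakh'/'Crore' multipliers are an assumption about CarWale's Indian format."""
    if not text:
        return None
    match = re.search(r"(\d+(?:,\d+)*(?:\.\d+)?)", text)
    if not match:
        return None
    value = float(match.group(1).replace(",", ""))
    lowered = text.lower()
    if "lakh" in lowered:
        value *= 100_000
    elif "crore" in lowered:
        value *= 10_000_000
    return value
```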
Variant specifications live in a table on each model’s detail page:
def scrape_yalla_specs(url):
    try:
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "lxml")
        specs_table = soup.find("table", {"class": "specs-table"})
        if not specs_table:
            return {}
        specs = {}
        for row in specs_table.find_all("tr"):
            cols = row.find_all("td")
            if len(cols) == 2:
                specs[cols[0].text.strip()] = cols[1].text.strip()
        return specs
    except requests.RequestException:
        return {}
for car in yalla_records:
    car["specifications"] = scrape_yalla_specs(car["url"])
import pandas as pd
df = pd.DataFrame(carwale_launches + yalla_records)
df.head()
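The `specifications` column now holds nested dicts. If you want each spec as its own column, `pandas.json_normalize` can flatten it; a sketch on sample rows standing in for the scraped frame (the spec keys and `spec_cols` name here are ours):

```python
import pandas as pd

# Sample rows standing in for the scraped DataFrame
df = pd.DataFrame([
    {"model_name": "Model A", "specifications": {"Engine": "1197 cc", "Power": "88 bhp"}},
    {"model_name": "Model B", "specifications": {"Engine": "1497 cc"}},
])

# Expand the nested dict column into one column per spec key (missing keys become NaN)
spec_cols = pd.json_normalize(df["specifications"].tolist())
flat = pd.concat([df.drop(columns="specifications"), spec_cols], axis=1)
```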
Example: extract engine cc from specs.
import re

def extract_engine(val):
    match = re.search(r"(\d+)\s?cc", val.lower())
    return int(match.group(1)) if match else None
Apply:
df["engine_cc"] = df["specifications"].apply(lambda x: extract_engine(x.get("Engine", "")) if isinstance(x, dict) else None)
Extract power (hp / bhp):

def extract_hp(val):
    match = re.search(r"(\d+)\s?(hp|bhp)", val.lower())
    return int(match.group(1)) if match else None
Extract mileage / fuel economy:

def extract_mileage(val):
    match = re.search(r"(\d+\.?\d*)\s?(km\/l|mpg)", val.lower())
    return float(match.group(1)) if match else None
Export the final mapped table:
df.to_csv("car_spec_mapping.csv", index=False)
Using Plotly:
import plotly.express as px
fig = px.histogram(df, x="engine_cc", color="platform", title="Engine Distribution Across New Launches")
fig.show()
Use Actowiz when you need automotive data feeds at scale. We support CarWale, Yalla Motors, and similar platforms, with full spec, pricing, variant & image extraction.
In this tutorial, you learned how to scrape new-car launches from CarWale and Yalla Motors, pull variant specifications from detail pages, normalize the results with pandas, and visualize them with Plotly.
This becomes your foundation for launch tracking, pricing analysis, and feature mapping across regions.
Actowiz Solutions can deploy a complete automotive data intelligence engine across India + UAE + GCC.
You can also reach us for all your mobile app scraping, data collection, web scraping, and instant data scraper service requirements!
Our web scraping expertise is relied on by 4,000+ global enterprises including Zomato, Tata Consumer, Subway, and Expedia — helping them turn web data into growth.