How to Scrape TripAdvisor Hotels Data Using Python and BeautifulSoup

How do you scrape a website and build a dataset?

TripAdvisor is the world’s biggest travel website and a popular place to find restaurants, hotels, transportation, and places to visit. When people plan a trip to a city or country, they are likely to visit TripAdvisor to find the best places to stay and see. TripAdvisor has more than 702 million reviews of the world’s top hotels, lists more than 8 million locations (restaurants, hotels, tourist attractions), and ranks 1st in the Travel and Tourism category in the United States.

In this blog, we provide a script that fetches the TripAdvisor hotels webpage, scrapes a few data elements, and builds a dataset. Here are the steps, carried out using Python and BeautifulSoup.

1. Import different libraries.

2. Review the HTML structure of a web page

3. Retrieve and parse the HTML data

4. Find and scrape data elements

5. Make a data frame

6. Convert data frame into a CSV file

Import Different Libraries

# Import the libraries.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import csv

Requests lets you send HTTP requests to a server, which returns a Response object containing the reply data (i.e., the HTML).

BeautifulSoup (bs4) pulls data out of HTML files and converts it into a BeautifulSoup object that represents the HTML as a nested data structure.

Pandas is used for data manipulation and analysis.

The csv module implements classes for reading and writing tabular data in CSV format.
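
As a small illustration of the nested data structure mentioned above, the sketch below parses a short, made-up HTML string (standing in for the text of a Response) and walks from a child tag to its parent. The markup and the variable names are assumptions for demonstration only.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the text of a requests Response.
html = '<div id="main"><p>Hello <b>world</b></p></div>'

soup = BeautifulSoup(html, 'html.parser')

# Tags can be reached as attributes, and each tag knows its parent.
bold_text = soup.b.text
parent_name = soup.b.parent.name
paragraph_text = soup.p.get_text()

print(bold_text)       # world
print(parent_name)     # p
print(paragraph_text)  # Hello world
```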

Review a Webpage’s HTML Structure

We need to understand the content and structure of the HTML tags on the webpage. For this project, we will use the TripAdvisor Hawaii Hotels and Places to Stay webpage (shown below). You can open this webpage by following its link.

[Screenshot: the TripAdvisor Hawaii hotels webpage]

We can scrape this webpage by parsing its HTML and extracting the data required for the dataset. To inspect the page, right-click anywhere on it, choose Inspect from the drop-down menu, click the arrow icon at the upper left of the HTML panel, and then click the hotel name (Prince Waikiki) in the listing. The result is the screen shown below.

[Screenshot: the browser inspector with the hotel name highlighted]

In the HTML panel, the line containing the hotel name, Prince Waikiki, is highlighted.

If you move up one line from the highlighted tag, you will find a div tag with the class “listing_title”. It is the parent of the highlighted tag. Therefore, to find, scrape, and capture the hotel names on the webpage, follow these steps.

Get all HTML elements for the particular parent (div tag with class = listing_title), including their children.

Extract the data elements and create a list of all the hotel names.

The code to find and extract hotel names might be the following:

hotels = []
# Collect the text of every div with the class listing_title.
for name in soup.findAll('div',{'class':'listing_title'}):
    hotels.append(name.text.strip())
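
To try this loop without fetching the live page, here is a minimal sketch that runs it against a small, hypothetical HTML sample modeled on the structure described above. The sample markup and the second hotel name are invented for illustration; the real page may differ and can change at any time.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the structure described in the text.
sample_html = """
<div class="listing_title"><a href="#">Prince Waikiki</a></div>
<div class="listing_title"><a href="#">  Example Beach Resort  </a></div>
<div class="other">Not a hotel name</div>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

hotels = []
# find_all is the current spelling of findAll; both return the same tags.
for name in soup.find_all('div', {'class': 'listing_title'}):
    # .text gathers the text of the div and of its children.
    hotels.append(name.text.strip())

print(hotels)
```

Only the two divs with the class listing_title contribute to the list, and strip() removes the padding around the second name.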

We will find, scrape, and store the other data elements on the webpage by following the same procedure.

Find and Scrape Data Elements

For each data element we need to scrape, we will get all HTML elements that match a particular tag and class. Then we will extract the values and store them in a list.

# Find and extract data elements.
hotels = []
for name in soup.findAll('div',{'class':'listing_title'}):
    hotels.append(name.text.strip())
ratings = []
for rating in soup.findAll('a',{'class':'ui_bubble_rating'}):
    ratings.append(rating['alt'])
reviews = []
for review in soup.findAll('a',{'class':'review_count'}):
    reviews.append(review.text.strip())
prices = []
for p in soup.findAll('div',{'class':'price-wrap'}):
    prices.append(p.text.replace('₹','').strip())
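
The values collected this way are all strings. If numeric columns are wanted, one possible approach is sketched below using regular expressions. The input formats ("4.5 of 5 bubbles", "1,234 reviews", "12,500") and the helper names parse_rating and parse_count are assumptions for illustration; check them against the strings the page actually returns.

```python
import re

def parse_rating(alt_text):
    """Return the first number in a rating string as a float, else None."""
    match = re.search(r'\d+(?:\.\d+)?', alt_text)
    return float(match.group()) if match else None

def parse_count(text):
    """Return the first integer in a string such as '1,234 reviews', else None."""
    match = re.search(r'\d[\d,]*', text)
    return int(match.group().replace(',', '')) if match else None

# Hypothetical scraped values; real strings may use another format.
ratings = ['4.5 of 5 bubbles', '4 of 5 bubbles', '']
reviews = ['1,234 reviews', '87 reviews', 'no reviews yet']
prices = ['12,500', '9,800', '']

rating_values = [parse_rating(r) for r in ratings]
review_counts = [parse_count(r) for r in reviews]
price_values = [parse_count(p) for p in prices]

print(rating_values)
print(review_counts)
print(price_values)
```

Strings without a number become None rather than raising an error, so a missing value does not stop the run.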

Creating a Data Frame

We will create a dictionary that holds the column names and the values for all the data elements that were scraped.

# Create the dictionary (named data to avoid shadowing the built-in dict).
data = {'Hotel Names':hotels,'Ratings':ratings,'Number of Reviews':reviews,'Prices':prices}

Create and show a data frame.

# Create the dataframe.
hawaii = pd.DataFrame.from_dict(data)
hawaii.head(10)
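
Building a data frame from separate lists only works when every list has the same length, and pandas raises an error otherwise. That can happen when a listing has no price or rating. One alternative, sketched below, is to loop over each hotel's container and build one row per hotel. The container class "listing" and the helper text_or_none are hypothetical, chosen for illustration rather than taken from TripAdvisor's markup.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical markup: each hotel sits in its own container div.
# The class name "listing" is an assumption, not TripAdvisor's markup.
sample_html = """
<div class="listing">
  <div class="listing_title"><a href="#">Hotel A</a></div>
  <a class="review_count">120 reviews</a>
  <div class="price-wrap">12,500</div>
</div>
<div class="listing">
  <div class="listing_title"><a href="#">Hotel B</a></div>
  <a class="review_count">45 reviews</a>
</div>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

def text_or_none(parent, tag, class_name):
    """Return the stripped text of the first matching child, or None."""
    found = parent.find(tag, {'class': class_name})
    return found.text.strip() if found else None

rows = []
for card in soup.find_all('div', {'class': 'listing'}):
    # One dictionary per hotel keeps each value on its own row.
    rows.append({
        'Hotel Names': text_or_none(card, 'div', 'listing_title'),
        'Number of Reviews': text_or_none(card, 'a', 'review_count'),
        'Prices': text_or_none(card, 'div', 'price-wrap'),
    })

frame = pd.DataFrame(rows)
print(frame)
```

Here the second hotel has no price, so its Prices cell is empty instead of shifting another hotel's price into that row.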

Converting Data Frames into a CSV file

# Convert dataframe to CSV file.
hawaii.to_csv('hotels.csv', index=False, header=True)
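
The to_csv call does not need the csv module imported at the start. For comparison, here is a minimal sketch that writes similar rows with csv.DictWriter and reads them back. The rows are made-up sample data, and an in-memory buffer is used in place of a file so the example leaves nothing on disk.

```python
import csv
import io

# Made-up rows standing in for the scraped hotel data.
rows = [
    {'Hotel Names': 'Hotel A', 'Prices': '12,500'},
    {'Hotel Names': 'Hotel B', 'Prices': '9,800'},
]

# Write to an in-memory buffer; pass open('hotels.csv', 'w', newline='')
# instead to write a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['Hotel Names', 'Prices'])
writer.writeheader()
writer.writerows(rows)

# Read the text back to confirm that values containing commas survive.
buffer.seek(0)
read_back = list(csv.DictReader(buffer))
print(read_back[0]['Prices'])
```

Values that contain a comma, such as the prices here, are quoted on output so they remain a single field when the file is read again.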

Putting It All Together

# Import the libraries.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import csv

# Extract the HTML and create a BeautifulSoup object.
url = 'https://www.tripadvisor.in/Hotels-g28932-Hawaii-Hotels.html'

# Adjacent string literals are joined, so the header contains no stray tabs.
user_agent = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/90.0.4430.212 Safari/537.36'),
    'Accept-Language': 'en-US, en;q=0.5'}

def get_page_contents(url):
    # Stop on a slow or unsuccessful response instead of parsing an error page.
    page = requests.get(url, headers=user_agent, timeout=30)
    page.raise_for_status()
    return BeautifulSoup(page.text, 'html.parser')

soup = get_page_contents(url)

# Find and extract the data elements.
hotels = []
for name in soup.findAll('div',{'class':'listing_title'}):
    hotels.append(name.text.strip())

ratings = []
for rating in soup.findAll('a',{'class':'ui_bubble_rating'}):
    ratings.append(rating['alt'])  

reviews = []
for review in soup.findAll('a',{'class':'review_count'}):
    reviews.append(review.text.strip())

prices = []
for p in soup.findAll('div',{'class':'price-wrap'}):
    prices.append(p.text.replace('₹','').strip())  

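# Create the dictionary (named data to avoid shadowing the built-in dict).
data = {'Hotel Names':hotels,'Ratings':ratings,'Number of Reviews':reviews,'Prices':prices}

# Create the dataframe and print the first rows (a bare head() shows nothing in a script).
hawaii = pd.DataFrame.from_dict(data)
print(hawaii.head(10))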

# Convert dataframe to CSV file.
hawaii.to_csv('hotels.csv', index=False, header=True)

Thank you for reading this blog. Please share your comments or feedback. For the best mobile app scraping and web scraping services, contact Actowiz Solutions now!
