Web Scraping Craigslist: A Complete Tutorial

This blog walks through code that extracts apartment data from the East Bay Area section of Craigslist. The code can be adapted to pull data from any category, region, property type, and so on.

Getting Data

The first thing we needed was the get module from the requests package. Then we defined a response variable and assigned it the result of calling get on the base URL. The base URL is the URL of the first page you want to scrape, excluding any additional arguments. We went to the apartments section for the East Bay and applied the "Has Picture" filter to narrow the search, so ours is not quite a true base URL.

We checked the length and type of that object to make sure it matches the number of posts on the page (there are 120). Then we imported BeautifulSoup from bs4, the module that can parse the HTML of the web page retrieved from the server.

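The import statements and setup might be sketched as follows. The search URL is an assumption (copy your own filtered URL from the browser's address bar), and since a live request needs network access, the request is shown commented out and the parsing runs on a canned snippet of Craigslist's old result-row markup:

```python
from requests import get
from bs4 import BeautifulSoup

# In a live run, fetch the filtered search page. The URL is an assumption:
# copy your own from the address bar after applying the "Has Picture" filter.
# response = get('https://sfbay.craigslist.org/search/eby/apa?hasPic=1')
# html_text = response.text

# Self-contained stand-in: a canned snippet of Craigslist's old markup.
html_text = '<ul class="rows"><li class="result-row">a post</li></ul>'

html_soup = BeautifulSoup(html_text, 'html.parser')
print(type(html_soup))  # <class 'bs4.BeautifulSoup'>
```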

Using the find_all method on the newly created html_soup variable, we found the posts. We had to study the website's structure to find the parent tag of the posts. Inspecting the page, you can see that the li tag of class result-row is the tag for a single post: it is the box that holds all of the elements we grabbed!

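A sketch of that lookup. The li/result-row class names follow Craigslist's old listing markup, which may have changed since; a canned two-post snippet stands in for the live page:

```python
from bs4 import BeautifulSoup

# Canned snippet mimicking Craigslist's old listing markup (assumption):
html_text = """
<ul class="rows">
  <li class="result-row">first post</li>
  <li class="result-row">second post</li>
</ul>
"""
html_soup = BeautifulSoup(html_text, 'html.parser')

# Each <li class="result-row"> is the parent box for one post.
posts = html_soup.find_all('li', class_='result-row')
print(type(posts))  # <class 'bs4.element.ResultSet'>
print(len(posts))   # 2 here; 120 on a full results page
```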

To scale this up, make sure to work in the following way:

• Grab the first post and each variable you want from it.
• Make sure you know how to access each of them for a single post before looping over the whole page.
• Finally, make sure you can successfully scrape one page before adding the loop that goes through all the pages.

The class bs4.element.ResultSet is indexed, so we looked at the first apartment by indexing posts[0]. It's all the code that belongs to that one li tag!


The price of this post is easy to grab.

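A self-contained sketch of the price grab, again using a canned post in the old result-row markup (an assumption):

```python
from bs4 import BeautifulSoup

# One canned post in Craigslist's old result-row markup (assumption):
html_text = """
<li class="result-row">
  <span class="result-price">$2200</span>
</li>
"""
post_one = BeautifulSoup(html_text, 'html.parser').find('li', class_='result-row')

# The price lives in a span of class 'result-price'; strip whitespace
# and the '$', then cast to int.
post_one_price = post_one.find('span', class_='result-price').text
post_one_price = int(post_one_price.strip().replace('$', ''))
print(post_one_price)  # 2200
```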

We scraped the date and time by specifying the 'datetime' attribute on the tag of class 'result-date'. By grabbing the 'datetime' attribute directly, we saved a data-cleaning step: there is no need to convert that string to a datetime object later. This could also be made a one-liner by putting ['datetime'] at the end of the .find() call, but we split it into two lines for clarity.

The post title and URL are easy to grab: the URL is the 'href' attribute of the link, pulled by specifying that argument, and the title is simply the text of the tag.

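A sketch of the date, title, and URL grabs on a canned post (the result-date and result-title class names follow the old markup and are assumptions):

```python
from bs4 import BeautifulSoup

# Canned post (old markup, assumption): a result-date <time> tag and the
# title link carrying the URL in its href attribute.
html_text = """
<li class="result-row">
  <time class="result-date" datetime="2019-07-12 11:38">Jul 12</time>
  <a href="https://sfbay.craigslist.org/eby/apa/d/oakland/123.html" class="result-title hdrlnk">Sunny 2br near the lake</a>
</li>
"""
post_one = BeautifulSoup(html_text, 'html.parser').find('li', class_='result-row')

# Grab the 'datetime' attribute directly -- no string-to-datetime
# conversion needed later. Two lines for clarity.
post_one_time = post_one.find('time', class_='result-date')
post_one_datetime = post_one_time['datetime']

# The link's href is the post URL; the tag's text is the title.
post_one_title = post_one.find('a', class_='result-title')
post_one_link = post_one_title['href']
post_one_title_text = post_one_title.text

print(post_one_datetime)    # 2019-07-12 11:38
print(post_one_title_text)  # Sunny 2br near the lake
```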

The number of bedrooms and the square footage are in the same tag, so we split the string and grabbed each value element-wise. The neighborhood is a tag of class 'result-hood', so we grabbed its text.

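A sketch of the bedrooms, square footage, and neighborhood grabs on a canned post (the housing and result-hood class names follow the old markup and are assumptions):

```python
from bs4 import BeautifulSoup

# Canned post (old markup, assumption): bedrooms and square footage share
# one 'housing' span; the neighborhood sits in a 'result-hood' span.
html_text = """
<li class="result-row">
  <span class="housing">2br - 950ft<sup>2</sup> -</span>
  <span class="result-hood">(Lake Merritt)</span>
</li>
"""
post_one = BeautifulSoup(html_text, 'html.parser').find('li', class_='result-row')

# Split the combined string and take each piece element-wise.
housing = post_one.find('span', class_='housing').text.split()
bedroom_count = int(housing[0].replace('br', ''))  # '2br'    -> 2
sqft = int(housing[2].replace('ft2', ''))          # '950ft2' -> 950

post_one_hood = post_one.find('span', class_='result-hood').text
print(bedroom_count, sqft, post_one_hood)  # 2 950 (Lake Merritt)
```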

The next block is the loop over all the pages for the East Bay. Since there isn't always information on square footage and number of bedrooms, we built a series of if statements inside the loop to handle all cases.


The loop starts on the first page and applies the same logic to every post on every page.


We included some data-cleaning steps in the loop, such as pulling the 'datetime' attribute, removing 'ft2' from the square-footage variable and casting the value to an integer, and removing 'br' from the number of bedrooms since that suffix is implied. With these small tasks done inside the loop, the data cleaning is already under way. Elegant code is the best kind! We could have done more, but the code would become very specific to this region and might not work elsewhere.

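The loop might be sketched as follows. Two canned "pages" stand in for live requests here; in a live run you would fetch each page of results with requests.get (Craigslist's old pagination used an s= offset parameter, e.g. s=120 for the second page) and sleep between requests out of politeness:

```python
from bs4 import BeautifulSoup

# Canned "pages" standing in for live requests (assumption). The second
# post deliberately omits housing and neighborhood info to exercise the
# if statements.
pages = [
    """<li class="result-row">
         <time class="result-date" datetime="2019-07-12 11:38">Jul 12</time>
         <span class="result-price">$2200</span>
         <a href="https://example.org/1.html" class="result-title hdrlnk">Post one</a>
         <span class="housing">2br - 950ft<sup>2</sup> -</span>
         <span class="result-hood">(Lake Merritt)</span>
       </li>""",
    """<li class="result-row">
         <time class="result-date" datetime="2019-07-12 12:05">Jul 12</time>
         <span class="result-price">$1500</span>
         <a href="https://example.org/2.html" class="result-title hdrlnk">Post two</a>
       </li>""",
]

post_timing, post_hoods, post_title_texts = [], [], []
bedroom_counts, sqfts, post_links, post_prices = [], [], [], []

for page_html in pages:
    page_soup = BeautifulSoup(page_html, 'html.parser')
    for post in page_soup.find_all('li', class_='result-row'):
        # Not every post lists bedrooms/square footage or a neighborhood,
        # so handle the missing cases explicitly.
        housing = post.find('span', class_='housing')
        if housing is not None:
            parts = housing.text.split()
            bedroom_counts.append(int(parts[0].replace('br', '')))
            sqfts.append(int(parts[2].replace('ft2', '')))
        else:
            bedroom_counts.append(None)
            sqfts.append(None)

        hood = post.find('span', class_='result-hood')
        post_hoods.append(hood.text if hood is not None else None)

        post_timing.append(post.find('time', class_='result-date')['datetime'])
        price = post.find('span', class_='result-price').text
        post_prices.append(int(price.strip().replace('$', '')))
        link = post.find('a', class_='result-title')
        post_links.append(link['href'])
        post_title_texts.append(link.text)

print(post_prices)  # [2200, 1500]
```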

Next, we built a dataframe from the lists of values we collected!

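A sketch of the dataframe construction with sample values in place of the scraped lists; the column names here are illustrative choices, not the originals:

```python
import pandas as pd

# Sample values standing in for the lists built by the scraping loop:
data = {
    'posted': ['2019-07-12 11:38', '2019-07-12 12:05'],
    'neighborhood': ['(Lake Merritt)', None],
    'post title': ['Post one', 'Post two'],
    'number bedrooms': [2, None],
    'sqft': [950, None],
    'URL': ['https://example.org/1.html', 'https://example.org/2.html'],
    'price': [2200, 1500],
}
eb_apts = pd.DataFrame(data)
print(eb_apts.shape)  # (2, 7)
```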

Wonderful! There it is. Admittedly, there is still a bit of data cleaning to do. We'll go through that quickly, and then it's time to explore the data!

Exploratory Data Analysis

Sadly, after removing duplicate URLs, we saw that we only had 120 instances. These numbers will differ if you run the code yourself, since different posts are up at different scraping times. There were also about 20 posts with no square footage or number of bedrooms listed. Statistically, this isn't an ideal data set, but we took note of it and pushed forward.

We wanted to see the distribution of prices for the East Bay, so we made the following plot. Using the .describe() method, we got a more detailed look. The cheapest place is $850, while the most expensive is $4,800.
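A sketch of the distribution plot and summary statistics, with a handful of synthetic prices standing in for the scraped column:

```python
import pandas as pd
import matplotlib
matplotlib.use('Agg')  # headless backend so no display window is needed
import matplotlib.pyplot as plt

# Synthetic prices standing in for the scraped column (assumption):
eb_apts = pd.DataFrame({'price': [850, 1500, 2200, 2800, 3100, 4800]})

# Distribution of prices for the East Bay.
eb_apts['price'].plot.hist(bins=10)
plt.xlabel('price ($)')
plt.savefig('price_hist.png')

# A more detailed look at the summary statistics.
print(eb_apts['price'].describe())
```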

The next code block generates a scatter plot in which the points are colored by the number of bedrooms. It shows a clear, understandable stratification: we see layers of points clustered around certain prices and square footages, and as price and square footage increase, so does the number of bedrooms.

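A sketch of that scatter plot on synthetic data (the column names and values here are stand-ins for the scraped dataframe):

```python
import pandas as pd
import matplotlib
matplotlib.use('Agg')  # headless backend
import matplotlib.pyplot as plt

# Synthetic sample standing in for the scraped data (assumption):
eb_apts = pd.DataFrame({
    'sqft':            [450, 600, 750, 900, 1100, 1400],
    'price':           [1200, 1500, 1900, 2300, 2900, 3600],
    'number bedrooms': [0, 1, 1, 2, 2, 3],
})

# Color each point by the number of bedrooms to show the stratification.
scatter = plt.scatter(eb_apts['sqft'], eb_apts['price'],
                      c=eb_apts['number bedrooms'], cmap='viridis')
plt.colorbar(scatter, label='bedrooms')
plt.xlabel('square footage')
plt.ylabel('price ($)')
plt.savefig('sqft_vs_price.png')
```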


We fitted a line on these two variables, with a bootstrap confidence interval around it. Let's look at the correlations. We used eb_apts.corr() to get them:

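A sketch of the correlation step on the same kind of synthetic sample:

```python
import pandas as pd

# Synthetic sample standing in for the scraped data (assumption):
eb_apts = pd.DataFrame({
    'sqft':            [450, 600, 750, 900, 1100, 1400],
    'price':           [1200, 1500, 1900, 2300, 2900, 3600],
    'number bedrooms': [0, 1, 1, 2, 2, 3],
})

# Pairwise correlations between the numeric columns.
correlations = eb_apts.corr()
print(correlations)
```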

As suspected, the correlation is strongest between number of bedrooms and square footage. That makes sense, since square footage increases as the number of bedrooms increases.


Prices by Neighborhood

We wanted to see how location affects price, so we grouped by neighborhood and aggregated by taking the mean of each variable.

We did this with a single line of code:

eb_apts.groupby('neighborhood').mean(), where 'neighborhood' is the by= argument and the aggregating function is the mean.

We noticed there were two spellings of North Oakland, 'Oakland North' and 'North Oakland', so we recoded one of them into the other like so:
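A sketch of that recode using pandas' Series.replace on a synthetic neighborhood column:

```python
import pandas as pd

# Sample neighborhood column containing both spellings (assumption):
eb_apts = pd.DataFrame({
    'neighborhood': ['North Oakland', 'Oakland North', 'Berkeley'],
    'price': [2100, 2000, 2500],
})

# Collapse 'Oakland North' into 'North Oakland'.
eb_apts['neighborhood'] = eb_apts['neighborhood'].replace(
    'Oakland North', 'North Oakland')
print(eb_apts['neighborhood'].unique())  # ['North Oakland' 'Berkeley']
```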

Grabbing the price and sorting it in ascending order shows us the cheapest and the most expensive places to live. The full line of code is eb_apts.groupby('neighborhood').mean()['price'].sort_values().

Finally, we looked at the spread of prices in each neighborhood. By doing so, we saw how much prices can vary within a neighborhood, and to what extent.

We produced a box plot of price for each neighborhood.
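A sketch of that per-neighborhood box plot on synthetic data, using pandas' DataFrame.boxplot:

```python
import pandas as pd
import matplotlib
matplotlib.use('Agg')  # headless backend
import matplotlib.pyplot as plt

# Synthetic prices per neighborhood (assumption):
eb_apts = pd.DataFrame({
    'neighborhood': ['Berkeley', 'Berkeley', 'Berkeley',
                     'North Oakland', 'North Oakland', 'Alameda'],
    'price': [1400, 2600, 4300, 1900, 2400, 2100],
})

# One box per neighborhood shows the spread of prices within it.
eb_apts.boxplot(column='price', by='neighborhood')
plt.ylabel('price ($)')
plt.savefig('price_by_neighborhood.png')
```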

Berkeley had an enormous range. That may be because it includes Downtown Berkeley, South Berkeley, and West Berkeley. In a future version of this project, it might be worth narrowing the scope of these variables so they better reflect price variability between neighborhoods within each city.

    Well, that's it from us! Feel free to contact us if you want to know more. You can also reach us for all your mobile app scraping and web scraping services requirements.
