
How to Scrape Google SERPs using Selenium in Python

In this guide, we show you how to scrape Google Search Results using Selenium in Python. You'll learn how to set up Selenium, handle dynamic content, and get around common issues. By the end, you'll be able to collect data from Google's search results efficiently.

Rizwan · August 19, 2025 · 6 min read


Today, I'm going to share how you can use Selenium to scrape Google's Search Engine Results Pages (SERPs) from scratch in Python. And guess what? We'll be doing this on a DigitalOcean server using a Jupyter Notebook, without using any scraping API. If you're ready for an adventure in web scraping, let's get started!

Setting Up Our DigitalOcean Server

First things first, we need a server. I'm opting for a DigitalOcean droplet because of its ease of use and reliability. For this project, I've set up a droplet with 2 vCPU, 2 GB RAM, and a 60GB SSD. This configuration runs on Ubuntu 22.04 (LTS) x64, offering a stable and robust environment for our scraping project.

SSH into Your Server

Once your droplet is ready, let’s SSH into it. If you’re familiar with SSH, this should be a breeze. Make sure you have your private key ready. Here’s the command I use:

ssh -L 8888:localhost:8888 -i /path/to/private/key root@{ip-address}

Notice the port tunneling in the command? That's because we're going to use Jupyter Notebook, which runs on port 8888 by default.

Setting Up the Environment

With access to our server, it's time to set up our Python environment with all the necessary dependencies, including Chrome and ChromeDriver. Here are the commands to get everything ready:

1. Update the Server:

Before installing anything, it's a good practice to update your server's package list:

sudo apt update

2. Install Google Chrome:

Google Chrome isn't included in the default Ubuntu repositories. To install it, first download the Debian package from the Chrome website:

wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb

Then install it using dpkg:

sudo dpkg -i google-chrome-stable_current_amd64.deb

If you encounter any errors, fix them by running:

sudo apt install -f

Confirm your installation by running this:

google-chrome --version

It should return:

Google Chrome {version-number-here}

3. Install ChromeDriver:

For Selenium to control Chrome, you need ChromeDriver, and its major version must match the Chrome version you just installed (check with google-chrome --version). The URL below uses 2.41 as an example; swap in the version that matches your browser. For Chrome 115 and newer, the downloads have moved to the Chrome for Testing site. Download and install it with:

wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
sudo mv chromedriver /usr/bin/chromedriver
sudo chown root:root /usr/bin/chromedriver
sudo chmod +x /usr/bin/chromedriver

4. Install Additional Dependencies:

You might need a few additional system packages to make sure everything works smoothly. xvfb provides the virtual display that PyVirtualDisplay drives, and unzip is what we used to extract the ChromeDriver archive:

sudo apt install -y libxi6 python3-pip unzip xvfb

5. Install Python Packages:

Now, install the necessary Python packages, including Selenium, BeautifulSoup, and Jupyter Notebook (note that the pip package for lxml is simply lxml, and xvfb is a system package we already installed above, not a pip package):

pip install selenium bs4 notebook lxml cchardet pyvirtualdisplay

With these dependencies installed, your DigitalOcean server is now ready for web scraping using Python.

6. Start Jupyter Notebook:

Now, start Jupyter notebook with this command:

jupyter notebook --allow-root

Now, you can access the Jupyter Notebook through your local browser by navigating to localhost:8888, thanks to the SSH tunnel we set up earlier. Use the full tokenized URL that Jupyter prints in the terminal the first time you connect.
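
Before moving on, it's worth confirming from a notebook cell that the key packages import cleanly. A minimal sanity check:

# Run in a Jupyter cell to confirm the scraping stack is importable
import selenium, bs4
from pyvirtualdisplay import Display

print("Selenium:", selenium.__version__)
print("BeautifulSoup:", bs4.__version__)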

Setting Up Selenium with Proxy

For scraping Google SERPs, we'll use Selenium with a proxy. I'm using Oxylabs' Datacenter Proxies, but you can choose any other reliable proxy service. Here's the sample code to set up Selenium with a headless Chrome browser and a proxy in a Jupyter Notebook:

import time
from bs4 import BeautifulSoup

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from pyvirtualdisplay import Display

# Start a virtual display so Chrome has a screen to render into
screen_display = Display(visible=0, size=(800, 800))
screen_display.start()

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=ddc.oxylabs.io:8011')  # route traffic through the proxy
chrome_options.add_experimental_option("prefs", {"profile.managed_default_content_settings.images": 2})  # skip downloading images
chrome_options.add_argument('--headless=new')

# Point Selenium at the ChromeDriver binary we installed earlier
with webdriver.Chrome(service=Service('/usr/bin/chromedriver'), options=chrome_options) as driver:
    driver.maximize_window()

    driver.get('https://www.google.com/search?q=restaurants+near+me')
    time.sleep(3)  # wait 3 seconds for the page to load

    thesoup = BeautifulSoup(driver.page_source, 'lxml')

In this code, we’re setting up the Chrome driver with the necessary options, including the proxy settings. The --headless=new argument runs the browser in the background, and the images preference stops Chrome from downloading images, which speeds up page loads.
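
If you want to confirm that requests are really going through the proxy, you can drop a quick check inside the same with block, before the Google request. This is just a sketch; httpbin.org/ip is one example of an IP-echo endpoint:

    # Optional check: the IP reported here should be the proxy's address, not your droplet's
    from selenium.webdriver.common.by import By
    driver.get('https://httpbin.org/ip')
    print(driver.find_element(By.TAG_NAME, 'body').text)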

Once everything is set up, the actual scraping is straightforward. We navigate to the Google search page for our query and then parse the page source with BeautifulSoup. Here, we’re looking for “restaurants near me”, but you can modify the query as needed.
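
Since search terms often contain spaces and special characters, it's safer to build the URL programmatically rather than hard-coding it. A small sketch using the standard library (the query string here is just an example):

from urllib.parse import quote_plus

query = "best pizza in new york"  # example query; swap in whatever you want to search
search_url = f"https://www.google.com/search?q={quote_plus(query)}"
print(search_url)  # https://www.google.com/search?q=best+pizza+in+new+york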

Extracting Data with BeautifulSoup

Assuming we have navigated to our desired Google search page and have the page source parsed into thesoup, our next step is to extract useful information. In this case, let’s extract the titles and URLs of the search results. Here’s how you can do it:

for result in thesoup.find_all('div', class_='tF2Cxc'):
    title = result.find('h3').get_text()
    link = result.find('a')['href']
    print(f"Title: {title}\nLink: {link}\n")

This code iterates through each search result (which, in Google’s HTML structure, is typically contained within div tags with the class ‘tF2Cxc’). For each result, it finds the title (<h3> tag) and the corresponding URL (the href attribute in the <a> tag). Keep in mind that Google changes these class names from time to time, so you may need to inspect the page and update the selectors.
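
If you’d rather keep the results than just print them, one option is to collect them into a list of dictionaries and write a CSV with the standard library. A minimal sketch that reuses thesoup and the ‘tF2Cxc’ selector from above; the serp_results.csv filename is just a placeholder:

import csv

results = []
for result in thesoup.find_all('div', class_='tF2Cxc'):
    title = result.find('h3')
    link = result.find('a')
    if title and link:  # skip any blocks that don't match the expected structure
        results.append({'title': title.get_text(), 'link': link['href']})

# Write the collected results to a CSV file
with open('serp_results.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['title', 'link'])
    writer.writeheader()
    writer.writerows(results)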

Advanced Data Extraction

If you want to get more sophisticated, you could also extract other pieces of information like the brief description (snippet) that Google provides for each search result:

for result in thesoup.find_all('div', class_='tF2Cxc'):
    title = result.find('h3').get_text()
    link = result.find('a')['href']
    snippet_div = result.find('div', class_='VwiC3b')
    snippet = snippet_div.get_text() if snippet_div else ''  # not every result has a snippet
    print(f"Title: {title}\nLink: {link}\nSnippet: {snippet}\n")

This code is similar to the previous one but also looks for a div with the class ‘VwiC3b’, which typically contains the search result snippet. Since not every result has one, we check for the div before calling get_text().
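
Google paginates results with a start query parameter (0 for the first page, 10 for the second, and so on), so you can walk through several result pages with the same driver. A rough sketch, meant to run inside the with block from the setup code above so that driver, time, and BeautifulSoup are in scope:

from urllib.parse import quote_plus

query = "restaurants near me"
for page in range(3):  # first three result pages
    url = f"https://www.google.com/search?q={quote_plus(query)}&start={page * 10}"
    driver.get(url)
    time.sleep(3)  # simple fixed wait, same as the earlier example
    page_soup = BeautifulSoup(driver.page_source, 'lxml')
    for result in page_soup.find_all('div', class_='tF2Cxc'):
        h3 = result.find('h3')
        if h3:
            print(h3.get_text())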

Final Thoughts Before Wrapping Up

In this article, I showed you how to scrape Google search results for purposes like SEO analysis, market research, or academic research. While Selenium and BeautifulSoup make it easy to fetch and parse the data, remember to use this power responsibly and abide by legal and ethical considerations.

Happy data extraction!

