Software Engineer, Python Enthusiast, Darbuka Player

How to manage cookies with Selenium

In the previous post, we looked at how to execute JavaScript code with the help of Selenium. We will explore cookie management in this post of the series.

What are cookies, and what are they used for? A cookie is a key-value pair followed by zero or more attribute-value pairs, stored on the client side. It is a small piece of data sent by the web application and stored in the web browser while the user is browsing that site. Cookies usually store information about users, their preferences, and past activities. …
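
To make the "key-value pair plus attributes" structure concrete, here is a minimal sketch that uses only the standard library. The cookie string and the helper name to_selenium_cookie are made up for illustration; the dict it returns follows the shape that WebDriver's add_cookie method accepts (required "name" and "value", optional keys such as "path" and "secure").

```python
from http.cookies import SimpleCookie

# A hypothetical Set-Cookie value: one key-value pair, then attributes.
raw = "theme=dark; Path=/; Max-Age=3600; Secure"

jar = SimpleCookie()
jar.load(raw)
morsel = jar["theme"]  # a Morsel holds the value and its attributes


def to_selenium_cookie(morsel):
    """Build a dict in the shape that WebDriver.add_cookie() accepts."""
    cookie = {"name": morsel.key, "value": morsel.value}
    if morsel["path"]:
        cookie["path"] = morsel["path"]
    if morsel["secure"]:
        cookie["secure"] = True
    return cookie


cookie = to_selenium_cookie(morsel)
print(cookie)
```

With a running browser session, the resulting dict could be passed to driver.add_cookie(cookie) after navigating to a page on the matching domain.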

How to execute JavaScript code with Selenium

In this post, we will look at how to execute JavaScript code on the page with Selenium.

JavaScript can access and manipulate all the elements on a web page through the Document Object Model (DOM). You can inspect an element on a web page and see its available methods using the developer tools of your browser of choice. You can run JavaScript code with Selenium when a certain action cannot be performed using regular Selenium commands.

To accomplish this, Selenium WebDriver will inject the JavaScript statement into the browser and the script will perform the job.
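
As a rough sketch of what this injection looks like from Python, the two helpers below call the WebDriver execute_script method. The helper names are hypothetical, and driver is assumed to be an already started WebDriver instance.

```python
def page_title_via_js(driver):
    """Run a JavaScript statement in the page and return its result."""
    return driver.execute_script("return document.title;")


def scroll_into_view(driver, element):
    """Pass a WebElement to the script; it is available as arguments[0]."""
    driver.execute_script("arguments[0].scrollIntoView(true);", element)
```

Values after the script string are exposed to the script through the arguments array, which is how an element found with regular Selenium commands can be handed over to JavaScript.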

There are two methods for the…

Farewell to 2020 with Python

If you like to send New Year greeting cards to your friends or family, maybe you want to do it programmatically this time. Let’s start…

Prepare Your Environment

Before starting, we need to create a virtual environment to isolate the dependencies from the rest of the system.

  • If you are using pip:
      python -m venv env
      source env/bin/activate
      pip install yagmail
  • If you are using pipenv:
      pipenv install yagmail
      pipenv shell

yagmail is a Python package created to interact with Gmail accounts to make sending emails easier.

Implementation & Explanation

The program will support both command-line parameters and an interactive mode.
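
One possible way to combine the two modes is to read values from the command line first and prompt only for what is missing. The sketch below uses argparse from the standard library; the option names (--to, --subject) and all function names are assumptions for illustration, not the program from the post.

```python
import argparse


def parse_args(argv=None):
    """Parse optional command-line parameters (hypothetical option names)."""
    parser = argparse.ArgumentParser(description="Send a new year greeting by email.")
    parser.add_argument("--to", help="recipient address")
    parser.add_argument("--subject", help="subject line")
    return parser.parse_args(argv)


def value_or_prompt(value, prompt, ask=input):
    """Use the command-line value if given, otherwise ask interactively."""
    return value if value else ask(prompt)


def collect_settings(argv=None, ask=input):
    """Merge command-line parameters with interactive answers."""
    args = parse_args(argv)
    return {
        "to": value_or_prompt(args.to, "Recipient: ", ask),
        "subject": value_or_prompt(args.subject, "Subject: ", ask),
    }
```

The ask parameter defaults to input, so the same function works in a terminal and can be driven by a stand-in function in tests.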

How to handle alert dialogs

In the previous post, we looked at working with cookies. We will explore the handling of alert dialogs in this post.

An alert is a pop-up window. It is triggered by some action performed by the user, or automatically due to some system settings. Its purpose is to give some information to the user, ask for permission, or take some input from the user.

The Selenium WebDriver Alert API provides methods to handle interactions with these pop-up message boxes.

We can categorize the alerts into the following three types:

  1. Simple Alert
  2. Confirmation Alert
  3. Prompt Alert
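
As a brief sketch, the three types map onto the same Alert object, reached through driver.switch_to.alert. The helper names below are hypothetical, and driver is assumed to be a WebDriver instance with an alert currently open.

```python
def accept_alert(driver):
    """Simple alert: read the message, then press OK."""
    alert = driver.switch_to.alert
    message = alert.text
    alert.accept()
    return message


def dismiss_confirmation(driver):
    """Confirmation alert: press Cancel instead of OK."""
    driver.switch_to.alert.dismiss()


def answer_prompt(driver, answer):
    """Prompt alert: type an answer, then press OK."""
    alert = driver.switch_to.alert
    alert.send_keys(answer)
    alert.accept()
```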

Simple Alert

A simple alert shows a custom…

Customizing events

In the previous post, we explored how to pass options to the Selenium WebDriver instance to set the various preferences for the browser.

The EventFiringWebDriver class is a wrapper for a WebDriver instance that supports firing events.

You can use this to take some actions before or after certain events like finding an element, navigating to a url, clicking an element, or quitting the browser.

It takes driver and event_listener as arguments. event_listener is an instance of a class that subclasses the AbstractEventListener class.

The following methods of the AbstractEventListener class can be overridden; implement all of them or only the ones you need.

* before_navigate_to(self, url, driver) *…
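
A minimal sketch of a listener that overrides only the navigation hooks might look like this. The names wrap_with_logging and NavigationLogger are hypothetical; after_navigate_to is the counterpart of before_navigate_to in AbstractEventListener. The Selenium imports sit inside the function only so that the sketch can be loaded on a machine without Selenium installed.

```python
def wrap_with_logging(driver):
    """Wrap a real WebDriver so that navigation events are printed."""
    # Imported here (an assumption of this sketch) so the module loads without Selenium.
    from selenium.webdriver.support.events import (
        AbstractEventListener,
        EventFiringWebDriver,
    )

    class NavigationLogger(AbstractEventListener):
        # Only the hooks we care about are overridden; the rest stay no-ops.
        def before_navigate_to(self, url, driver):
            print(f"About to open {url}")

        def after_navigate_to(self, url, driver):
            print(f"Opened {url}")

    return EventFiringWebDriver(driver, NavigationLogger())
```

Calling get on the returned wrapper would then print a line before and after each navigation.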

Tweak preferences of the browser

In the previous post, we explored how to control the keyboard actions with Selenium. We will see how we can configure the WebDriver instance by passing options/capabilities to it.

Passing Options

You can pass options to the WebDriver instance, such as headless to run the browser without a GUI.

The following example uses the Options class imported as FirefoxOptions to pass the headless option to the Firefox driver.
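
A minimal sketch of such an example, assuming Firefox and geckodriver are installed. The function name start_headless_firefox is hypothetical, and the Selenium imports are placed inside it only so the sketch can be loaded without Selenium present.

```python
def start_headless_firefox():
    """Start Firefox without a GUI and return the driver."""
    # Imported here (an assumption of this sketch) so the module loads without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options as FirefoxOptions

    options = FirefoxOptions()
    options.add_argument("-headless")  # Firefox's command-line flag for headless mode
    return webdriver.Firefox(options=options)
```

Remember to call quit on the returned driver when you are done, since a headless browser has no window to remind you that it is still running.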


Capabilities are options that you can use to customize and configure a WebDriver session. A client can use capabilities to specify required features while creating a new session.

List of Capabilities

  • browserName
  • browserVersion

Controlling the keyboard

In the previous post, we looked at the different navigation strategies. We will see how to control the keyboard actions in this post.

Selenium allows us to emulate keyboard actions such as pressing keys or clearing typed content. We can also use modifier keys like CTRL or SHIFT to perform combined key presses with the help of the ActionChains class.
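
Both ideas can be sketched in a few lines. The helper names are hypothetical; element is assumed to be a WebElement such as a text input, and driver a running WebDriver instance. The Selenium imports are kept inside select_all so the sketch loads without Selenium installed.

```python
def replace_text(element, text):
    """Clear whatever is in the field, then type the new text."""
    element.clear()
    element.send_keys(text)


def select_all(driver):
    """Press CTRL+A in the focused element using a modifier key."""
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.keys import Keys

    (
        ActionChains(driver)
        .key_down(Keys.CONTROL)  # hold CTRL
        .send_keys("a")          # press "a" while CTRL is held
        .key_up(Keys.CONTROL)    # release CTRL
        .perform()
    )
```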

Action Chains

Action chains allow interactions such as mouse movements, mouse button actions, key presses, and drag-and-drop. When you call methods for actions on the ActionChains object, the actions are stored in a queue in the ActionChains object. When you call…

How to navigate among URLs, windows, and frames

up arrows on a wooden door

In the previous post, we looked at different wait strategies. I will try to explain navigation among URLs, windows, frames, and alerts in this post.

Opening a Website

Opening a website is the first thing to do after starting a web browser. In Selenium, this is done by calling the get method on a WebDriver instance.
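
A minimal sketch of that call, wrapped in a hypothetical helper; driver is assumed to be an already started WebDriver instance and the URL is only a placeholder.

```python
def open_site(driver, url="https://example.com"):
    """Navigate to the URL and return the title of the loaded page."""
    driver.get(url)
    return driver.title
```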

Implementation Detail

  • The get method invokes execute(Command.GET, {'url': url}). The URL passed is sent in a POST request to the /session/:sessionId/url endpoint.
  • The execute method calls the command executor’s execute method. The command executor is an instance of the RemoteConnection class. …
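
To picture what ends up on the wire, the sketch below builds the request described above by hand. It is only an illustration of the endpoint and payload, not Selenium's own code; the function name navigate_request and the session id used with it are made up.

```python
import json


def navigate_request(session_id, url):
    """Describe the HTTP request that a navigation command boils down to."""
    return {
        "method": "POST",
        "path": f"/session/{session_id}/url",
        "body": json.dumps({"url": url}),
    }


request = navigate_request("abc123", "https://example.com")
print(request["method"], request["path"])
print(request["body"])
```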

Make sure it is there

In the previous post, we looked at how to locate elements before interacting with them. In this post, we will explore how waits work in Selenium to make sure that elements are present in the DOM before interacting with them.

The document.readyState property of an HTML document describes the loading state of the current document. By default, a driver.get request returns to the caller when the ready state becomes complete.
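
The ready state can also be checked by hand, which helps to see what the driver is waiting for. The helper below is a hypothetical sketch that polls document.readyState through execute_script; it illustrates the idea and is not how Selenium implements its own waiting.

```python
import time


def wait_for_ready_state(driver, timeout=10.0, poll=0.1):
    """Poll document.readyState until it is 'complete' or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

The function returns True as soon as the page reports complete and False if the timeout passes first, so the caller decides how to react to a slow page.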

There are three page loading strategies for Selenium WebDriver.


Wait until the entire page is loaded, as determined by the load event.


Wait until the…

How to locate elements on the page?

In the previous post, we looked at the high-level architecture of Selenium applications. This post will be about locating elements on a web page to interact with them.

We should first locate elements on a page to perform some operations on them. We can locate elements by id attribute, name attribute, CSS selector, class name, tag name, XPath, and full or partial link text.

Locating Strategies

The Python API provides the following methods to find elements on a page.

  • find_element_by_id
  • find_element_by_name
  • find_element_by_xpath
  • find_element_by_link_text
  • find_element_by_partial_link_text
  • find_element_by_tag_name
  • find_element_by_class_name
  • find_element_by_css_selector

These methods return a WebElement object, which represents a particular DOM node, or raise NoSuchElementException if no matching element is found.
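
When a missing element should not raise, one option is to go through find_elements, which returns an empty list instead. The helper below is a hypothetical sketch built on the generic find_elements(by, value) form; note that the find_element_by_* shortcuts listed above were removed in Selenium 4.3, where find_element(By.ID, "...") and its siblings are used instead.

```python
def first_or_none(driver, by, value):
    """Return the first matching WebElement, or None instead of raising."""
    matches = driver.find_elements(by, value)
    return matches[0] if matches else None
```

With a real driver, the by argument would come from the By class, for example first_or_none(driver, By.ID, "username") after importing By from selenium.webdriver.common.by.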
