Why Selenium? To make your life easier.

This is a detailed guide for beginners on web scraping using Selenium and ChromeDriver.

Coding is a skill that you can easily learn by focusing first on one programming language, such as Python, R or JavaScript. Personally, I prefer Python because it feels more practical and has lots of fun packages. In this article, I will be using Python and Spyder to introduce you to the Selenium package. The guide is outlined as follows:

  1. What is Selenium?
  2. Why Selenium?
  3. Downloading the necessary software and package.
  4. A simple automation process.

What is Selenium?

Selenium is a browser automation tool, available in Python as a package, that was built to automate web applications for testing purposes. Its main uses include, but are not limited to:

  • Web scraping (collecting or gathering data from the web, such as real estate prices and locations)
  • Automating web browser actions (click, input, select, navigate and so on)
  • Data mining
  • Gathering information such as competitors' reviews on product quality

Why Selenium?

Well, the package is highly flexible. It has official bindings for Java, Python, C#, Ruby and JavaScript, and community-maintained bindings exist for other languages such as PHP, Perl, Objective-C, Haskell and R. Selenium is also compatible with different platforms: Windows, Linux and macOS. Lastly, it supports most of the popular browsers: Google Chrome, Firefox, Internet Explorer, Safari, Opera and Microsoft Edge.

The package has many functions and is very easy to use. All you need is a laptop or PC, Wi-Fi and 30 minutes, and you will be able to start applying it to your work or your studies, or even start flexing on your friends that you can create bots ;).


Downloading necessary software programs and packages:

Let’s start by downloading Anaconda, which is basically a toolkit that bundles several programs for writing code. For example, it includes Spyder, an editor where you write and run your Python code. Anaconda offers other tools as well, but for this project I will be using Spyder because it is more user-friendly. You can download Anaconda from this link: https://www.anaconda.com/products/individual

After downloading Anaconda, open the application (no need to sign in) and install Spyder. Upon launching the latter, you will see the Spyder editor window.

Before we start coding, let’s first install the Selenium package. To do so, open your terminal (type “Terminal” in the search bar). After launching it, type in the following command:

pip install selenium

pip is Python's package installer, which lets you download and install Python packages.
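If you want to confirm that the installation worked, a small check like the one below can help. It is only a sketch that relies on Python's standard `importlib.metadata` module (Python 3.8 or newer); the helper name `installed_version` is made up for this example and is not part of Selenium.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("selenium"))
```

If this prints None, the package was most likely installed into a different Python environment than the one Spyder is using.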

Once this is done, you can head back to Spyder and start coding. We will start by importing Selenium and then its webdriver module:

import selenium 
from selenium import webdriver

P.S. To run your code in Spyder, press Shift+Enter.

Now, in order to control a browser, Selenium needs that browser's own driver. In this example I will be using Google Chrome. You can follow the link below to download its driver:

Following that step, copy the path (the location where you downloaded your ChromeDriver) and create a variable to store it (make sure you paste your path between quotation marks). Now, we have to create the driver:

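from selenium.webdriver.chrome.service import Service
path = '/Users/younes/Downloads/chromedriver'
driver = webdriver.Chrome(service=Service(path))  # Selenium 4 takes the driver path through a Service object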
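A wrong path is a common first error, so it can help to check that the path really points at a file before launching the browser. The sketch below does that and then starts Chrome the Selenium 4 way, by wrapping the path in a Service object. The helper names `check_driver_path` and `open_chrome` are my own for illustration, and the example path is the one used above.

```python
from pathlib import Path


def check_driver_path(path):
    """Return the path as a Path object, or raise if no file exists there."""
    driver_file = Path(path).expanduser()
    if not driver_file.is_file():
        raise FileNotFoundError(f"No chromedriver found at {driver_file}")
    return driver_file


def open_chrome(path):
    """Start Chrome through the chromedriver at `path` (assumes Selenium 4)."""
    # Imported inside the function so check_driver_path works even without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    driver_file = check_driver_path(path)
    return webdriver.Chrome(service=Service(str(driver_file)))


# Example use (opens a real browser, so it is left commented out):
# driver = open_chrome('/Users/younes/Downloads/chromedriver')
# driver.quit()  # closes the browser when you are done
```

Calling `driver.quit()` at the end of a session closes the browser window and stops the driver process running in the background.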

A simple automation process:

Now that everything has been set up, we can start testing and see how useful Selenium really is. The driver that we created is the tool that will do the automation for us. We will ask the driver to open a link by itself:

driver.get('https://www.instagram.com')

Now if you run your code, a browser will pop up with the Instagram login page. For this simple example, we will automate filling in fields and clicking buttons.

First, let’s explain how Selenium works. Basically, we search for the element we want to automate on the login page using its XPath, which is an expression that identifies the element within the page's HTML. For example, when the browser opens, a window may pop up asking us to accept the cookies. In order to log in, we need to handle this first. To do so, we have to locate the accept cookies button: right-click on it and choose Inspect.

Next, you will see a bunch of code on the right. Don’t get scared, you don’t have to worry about it. Just go to the top left of that panel and click on the symbol that has a pointer.

Upon clicking on the pointer, you can hover over the “Accept all” button and click on it to get its location in the code. Then, right-click on that code and copy its XPath.
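To get a feel for what an XPath actually says, here is a small sketch that runs two XPath-style lookups on a made-up page fragment using Python's built-in `xml.etree.ElementTree`. This is not Selenium and the fragment is not Instagram's real markup; ElementTree also supports only a subset of XPath and needs relative paths, so treat it purely as an illustration of the idea.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up page fragment; the real Instagram markup is different and changes often.
page = ET.fromstring("""<html>
  <body>
    <div>
      <button>Accept all</button>
      <button>Decline</button>
    </div>
    <form id="loginForm">
      <input name="username" />
      <input name="password" />
    </form>
  </body>
</html>""")

# Position-based path: "the first button inside the first div of the body".
# It stops matching as soon as the page layout changes.
by_position = page.find("./body/div[1]/button[1]")

# Attribute-based path: "the input named username inside the form with id loginForm".
# It keeps matching as long as those attributes stay the same.
by_attribute = page.find(".//form[@id='loginForm']/input[@name='username']")

print(by_position.text)          # Accept all
print(by_attribute.get("name"))  # username
```

The XPath you copy from the browser works the same way: it is either a chain of positions starting at the top of the page, or a shorter path anchored on an attribute such as an id.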

Now that we have the button's XPath, Selenium can automate pressing it. As simple as that, we can write this line of code to automate the process. You have to paste the XPath inside the round brackets, in quotation marks:

# find_element (singular) returns one clickable element; find_elements would return a list
accept_cookies_button = driver.find_element('xpath', '/html/body/div[2]/div/div/button[1]')

Now that we have told Selenium which button accepts the cookies, we can go ahead and tell it to click it.

accept_cookies_button.click()
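One thing to keep in mind: the cookie window can take a moment to appear, and looking for the button too early raises an error. Selenium ships its own waiting tool for this (WebDriverWait), but the idea behind it can be sketched with a plain retry loop. The helper `wait_for` below is hand-rolled for illustration, not a Selenium function, and it catches every exception only so that the sketch has no Selenium import.

```python
import time


def wait_for(action, timeout=10.0, poll_every=0.5):
    """Call `action` repeatedly until it stops raising, or until `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return action()
        except Exception:
            # Selenium raises NoSuchElementException while the element is missing.
            # Once the deadline has passed, the last error is re-raised.
            if time.monotonic() >= deadline:
                raise
            time.sleep(poll_every)


# Example use with the driver created earlier (needs a real browser, so it is commented out):
# button = wait_for(lambda: driver.find_element('xpath', '/html/body/div[2]/div/div/button[1]'))
# button.click()
```

The loop simply tries again every half second, so the script continues as soon as the button shows up instead of sleeping for a fixed time.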

Go ahead and run your code, and you can see that we automated the process. To fill in the username and password, we do the same thing: first identify the element by its XPath, then store it in a variable that Selenium will use. For the username, we get the following code:

# 'xpath' is the locator strategy; the older find_element_by_xpath was removed in Selenium 4.3
username_login = driver.find_element('xpath', '//*[@id="loginForm"]/div/div[1]/div/label/input')

Now in order to fill your username, the send_keys function will be used:

username_login.send_keys('your_username')

And voilà, run your code, and you can see that Selenium fills the username box by itself. In the same fashion, the password can be filled:

password = driver.find_element('xpath', '//*[@id="loginForm"]/div/div[2]/div/label/input')
password.send_keys('your_password')
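Once both fields work, the two steps can be wrapped in a small function so they are easy to reuse. This is just one possible way to organize it: the function name `fill_login_form` and the two constants are made up for this example, and the XPaths are the ones copied above, which may change whenever Instagram updates its page.

```python
USERNAME_XPATH = '//*[@id="loginForm"]/div/div[1]/div/label/input'
PASSWORD_XPATH = '//*[@id="loginForm"]/div/div[2]/div/label/input'


def fill_login_form(driver, username, password):
    """Type the username and password into the two login fields."""
    # 'xpath' is the plain-string form of Selenium's By.XPATH locator strategy.
    driver.find_element('xpath', USERNAME_XPATH).send_keys(username)
    driver.find_element('xpath', PASSWORD_XPATH).send_keys(password)


# Example use with the driver created earlier (needs a real browser, so it is commented out):
# fill_login_form(driver, 'your_username', 'your_password')
```

Keeping the XPaths in named constants means that when the page changes, there is only one place to update.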

Now, as practice, try to locate the login button and click it.

This simple example demonstrated how easy it is to automate the login process. I hope that this tutorial helped you understand the basics of Selenium, a powerful tool for automating browser tasks. Thank you for your time and I hope you enjoyed your stay.

Code on GitHub: https://github.com/younesskhatib/Selenium-Intro.git

Master student in business analytics trying to survive :)