Wednesday, June 23, 2021

How to Automate Login using Selenium and how to Extract and Submit Web Forms from a URL using Python

 https://www.thepythoncode.com/article/automate-login-to-websites-using-selenium-in-python

https://www.thepythoncode.com/article/extracting-and-submitting-web-page-forms-in-python





How to Automate Login using Selenium in Python

Learn how to use the Selenium library with ChromeDriver in Python to log in to websites automatically and verify that the login succeeded.



Controlling a web browser from a program is useful in many scenarios, such as website test automation and web scraping. A very popular framework for this kind of automation is Selenium WebDriver.

Selenium WebDriver is a browser-controlling library. It supports all major browsers (Firefox, Edge, Chrome, Safari, Opera, etc.) and is available for different programming languages, including Python. In this tutorial, we will use its Python bindings to automate logging in to websites.

Automating the login process to a website proves to be handy. For example, you may want to edit your account settings automatically, or extract some information that requires login.

We have a tutorial on extracting web forms using the BeautifulSoup library, so you may want to combine extracting login forms with filling them in as shown in this tutorial.

First, let's install Selenium for Python:

pip3 install selenium

The next step is installing the driver specific to the browser we want to control; download links are available on this page. I'm installing ChromeDriver, but you're free to use your favorite.

To make things concrete, I'll use the GitHub login page to demonstrate how you can log in automatically using Selenium.

Open up a new Python script and initialize the WebDriver:

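from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait

# GitHub credentials
username = "username"
password = "password"

# initialize the Chrome driver, pointing the Service at the driver binary
# (for example "chromedriver.exe" on Windows)
driver = webdriver.Chrome(service=Service("./chromedriver"))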

After you download and unzip the driver for your OS, put it in your current directory or in a known path, so you can pass its location when creating the webdriver.Chrome() instance. In my case, chromedriver.exe is in the current directory, so I simply pass its name.
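The script above keeps the credentials as string literals. As a minimal sketch of an alternative, assuming you prefer to keep them out of the source file, they could be read from environment variables. The function name load_credentials and the variable names GITHUB_USERNAME and GITHUB_PASSWORD are hypothetical, chosen only for this example:

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    GITHUB_USERNAME and GITHUB_PASSWORD are hypothetical variable names
    chosen for this sketch; use whatever names suit your setup.
    """
    try:
        return env["GITHUB_USERNAME"], env["GITHUB_PASSWORD"]
    except KeyError as missing:
        # fail early with a readable message instead of sending empty values
        raise RuntimeError(f"missing environment variable: {missing}") from None
```

With both variables set in your shell, `username, password = load_credentials()` would replace the two hard-coded assignments.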

Since we're interested in automating GitHub login, we'll navigate to the GitHub login page and inspect it to identify its HTML elements:

(Image: GitHub login page HTML elements)

The id of the login and password input fields, and the name of the Sign in button, will be useful for retrieving these elements in code and filling them in programmatically.

Notice that the username/email address input field has the id login_field, the password input field has the id password, and the submit button has the name commit. The code below goes to the GitHub login page, finds these elements, fills in the credentials and clicks the button:

from selenium.webdriver.common.by import By

# head to the GitHub login page
driver.get("https://github.com/login")
# find the username/email field and send the username to it
driver.find_element(By.ID, "login_field").send_keys(username)
# find the password input field and insert the password as well
driver.find_element(By.ID, "password").send_keys(password)
# click the login button
driver.find_element(By.NAME, "commit").click()

The find_element() method with By.ID retrieves an HTML element by its id, and the send_keys() method simulates keypresses. The code above makes Chrome type in the email and the password, and then click the Sign in button.
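The login steps above can also be wrapped in a function, which makes them easier to reuse and to try out without a browser. The sketch below is one possible arrangement, not part of the original script: the names github_login and LOGIN_URL are hypothetical, and the locator strategies are passed as the plain strings "id" and "name", which are the values behind Selenium's By.ID and By.NAME constants.

```python
LOGIN_URL = "https://github.com/login"


def github_login(driver, username, password):
    """Fill in and submit the GitHub login form using the given driver.

    `driver` can be any object exposing get() and find_element(by, value),
    such as a Selenium WebDriver instance.
    """
    driver.get(LOGIN_URL)
    # "id" and "name" are the strategy strings behind By.ID and By.NAME
    driver.find_element("id", "login_field").send_keys(username)
    driver.find_element("id", "password").send_keys(password)
    driver.find_element("name", "commit").click()
```

Because the function only relies on get() and find_element(), it can be exercised with a small stand-in object before pointing it at a real browser.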

The next thing to do is to determine whether our login was successful. There are many ways to detect that, but in this tutorial, we'll do it by detecting the errors shown upon login (of course, this will differ from one website to another).

(Image: GitHub login error page)

The above image shows what happens when we insert wrong credentials: a new HTML div element appears with the class "flash-error" and the text "Incorrect username or password.".

The code below uses WebDriverWait() to wait for the page to load after the login is performed, and then checks for the error:

from selenium.webdriver.common.by import By

# wait for the ready state to be complete
WebDriverWait(driver=driver, timeout=10).until(
    lambda x: x.execute_script("return document.readyState === 'complete'")
)
error_message = "Incorrect username or password."
# get the errors (if there are any)
errors = driver.find_elements(By.CLASS_NAME, "flash-error")
# optionally print the errors
# for e in errors:
#     print(e.text)
# if we find that error message within errors, then the login failed
if any(error_message in e.text for e in errors):
    print("[!] Login failed")
else:
    print("[+] Login successful")

We use WebDriverWait to wait until the document has finished loading. The execute_script() method executes JavaScript in the context of the browser, and the JS code return document.readyState === 'complete' returns True when the page is loaded and False otherwise.
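To make the waiting step less of a black box, here is a simplified sketch of the polling idea behind it: call a condition repeatedly until it returns something truthy or a timeout expires. This is not Selenium's implementation; wait_until and its parameters are hypothetical names for illustration, and the 0.5 second interval simply mirrors WebDriverWait's default poll frequency.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        # sleep briefly so the loop does not spin at full speed
        time.sleep(poll_interval)
```

In the script, the condition plays the role of the lambda passed to until(), which asks the browser whether document.readyState is 'complete'.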

Finally, we close our driver:

# close the driver
# quit() closes the browser window and ends the WebDriver session
driver.quit()
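If an exception is raised somewhere between opening the browser and the final line, the driver is never shut down. One way to guard against that, sketched here under the assumption that the driver exposes a quit() method as Selenium drivers do, is a try/finally wrapper. The name run_and_quit is hypothetical:

```python
def run_and_quit(driver, task):
    """Run task(driver) and always end the browser session afterwards."""
    try:
        return task(driver)
    finally:
        # runs whether task() returned normally or raised
        driver.quit()
```

Here `task` is any function that takes the driver and performs the login and follow-up steps.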

Conclusion

Alright, now you have the skills to log in automatically to the website of your choice. Note that GitHub will block you if you run the script multiple times with wrong credentials, so be aware of that.

Now you can do whatever you want after logging in with your account; add your code at the line where we print 'Login successful'.
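One way to organize that follow-up code, as a rough sketch rather than a prescribed structure, is to separate the success check from the action that should run afterwards. The function names login_succeeded and run_after_login are hypothetical; the error message is the one used earlier in this tutorial.

```python
ERROR_MESSAGE = "Incorrect username or password."


def login_succeeded(error_texts, error_message=ERROR_MESSAGE):
    """Return True when none of the error texts contains the error message."""
    return not any(error_message in text for text in error_texts)


def run_after_login(error_texts, on_success):
    """Call on_success() only if the login check passes; return its result."""
    if not login_succeeded(error_texts):
        print("[!] Login failed")
        return None
    print("[+] Login successful")
    return on_success()
```

In the script, this could be called as `run_after_login([e.text for e in errors], my_task)`, where my_task is whatever function you want to run once logged in.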

Also, if you log in successfully with your real account, you may encounter an email confirmation step. To get past it, you have to read your email programmatically with Python, extract the confirmation code, and insert it in real time using Selenium. A great challenge, isn't it? Good luck with it!
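As a starting point for that challenge, the extraction step could look like the sketch below. It assumes the confirmation code is a standalone run of six digits somewhere in the message text, which is a guess about the email format rather than something this tutorial confirms. The function name extract_confirmation_code is hypothetical, and fetching the email itself is left out.

```python
import re


def extract_confirmation_code(message_body, digits=6):
    """Return the first standalone run of `digits` digits, or None.

    The six-digit default is an assumption for this sketch; adjust it to
    match the emails you actually receive.
    """
    match = re.search(rf"\b(\d{{{digits}}})\b", message_body)
    return match.group(1) if match else None
```

The returned string could then be typed into the confirmation field with send_keys(), just like the username and password.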

Note that the login process differs from one website to another; the goal of this tutorial is to give you the essential skills to automate the login of your target website.




End
