Integration of Sikuli with Selenium for Interacting with GUI Components
Introduction to Sikuli
Sikuli is an open-source tool, originally developed at MIT, for interacting with flash objects. In this article, flash objects are elements that cannot be located with Selenium locators because they are not present in the DOM of the web page. Examples include Windows-based pop-ups and Flash games.
Sikuli is an automation tool that uses image recognition to handle GUI components. It can also automate Windows and other desktop applications. It works best when the GUI components are stable and not expected to change in the near future.
You might wonder how Selenium with Sikuli can reach elements that are not present in the DOM. As mentioned above, Sikuli uses image recognition (visual image matching): a screenshot of each element to be interacted with is captured and stored in a directory. Sikuli then locates the element on screen by matching it against the saved image, which is passed as a parameter to its various methods.
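To make the idea of visual image matching concrete, here is a minimal sketch that searches a larger "screen" image for an exact copy of a smaller reference image. It is only an illustration under simplifying assumptions: the class name TemplateMatchSketch and its methods are hypothetical, the comparison is pixel-exact, and Sikuli's own matcher is more tolerant and is not implemented this way.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

/** Naive exact-match image search; illustrative only, not Sikuli's implementation. */
final class TemplateMatchSketch {

    /** Returns the top-left corner of the first exact occurrence of target in screen, or null. */
    static Point find(BufferedImage screen, BufferedImage target) {
        int maxX = screen.getWidth() - target.getWidth();
        int maxY = screen.getHeight() - target.getHeight();
        // Slide the target over every possible position, row by row.
        for (int y = 0; y <= maxY; y++) {
            for (int x = 0; x <= maxX; x++) {
                if (matchesAt(screen, target, x, y)) {
                    return new Point(x, y);
                }
            }
        }
        return null; // target is larger than the screen or does not occur in it
    }

    /** Compares every pixel of target with the screen region starting at (offX, offY). */
    private static boolean matchesAt(BufferedImage screen, BufferedImage target, int offX, int offY) {
        for (int ty = 0; ty < target.getHeight(); ty++) {
            for (int tx = 0; tx < target.getWidth(); tx++) {
                if (screen.getRGB(offX + tx, offY + ty) != target.getRGB(tx, ty)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A black 8x6 "screen" with a red 2x2 "button" painted at (5, 3).
        BufferedImage screen = new BufferedImage(8, 6, BufferedImage.TYPE_INT_RGB);
        BufferedImage button = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 2; y++) {
            for (int x = 0; x < 2; x++) {
                button.setRGB(x, y, 0xFF0000);
                screen.setRGB(5 + x, 3 + y, 0xFF0000);
            }
        }
        Point p = find(screen, button);
        System.out.println(p == null ? "not found" : "found at " + p.x + "," + p.y);
    }
}
```

Once the position of the reference image is known, a tool can move the mouse there and click or type, which is why the saved screenshot must look like what is actually on screen.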
Integration of Sikuli with Selenium
Selenium and Sikuli are both automation tools, so before proceeding let's take a quick look at how they differ:
- Selenium automates only web-based applications, while Sikuli can automate both desktop and web applications.
- Selenium cannot automate flash objects, while automating them is the primary purpose of Sikuli.
- Selenium does not use image recognition, while Sikuli is based entirely on it.
Sikuli provides a Screen class whose object gives access to the methods predefined by Sikuli. Below are some commonly used methods:
- click() - Takes an image name as a parameter and clicks the matching element.
- doubleClick() - Takes an image name as a parameter and double-clicks the matching element.
- rightClick() - Takes an image name as a parameter and right-clicks the matching element.
- hover() - Takes an image name as a parameter and hovers over the matching element.
- type() - Takes an image name and an input value as parameters and types the text into the matching input field.
- dragDrop() - Takes two images (the source image name and the destination image name) as parameters, drags the source element from its position and drops it at the destination.
- paste() - Takes an image name and the text to paste as parameters and pastes the text into the matching input field.
- find() - Takes an image name as a parameter and finds the matching element on the current screen.
Let's have a look at a simple example using the above methods:
```java
package MyPackage;

import java.awt.AWTException;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.sikuli.script.FindFailed;
import org.sikuli.script.Screen;
import org.testng.annotations.Test;

public class SikuliTest {

    @Test
    public void func() throws FindFailed, InterruptedException, AWTException {
        // Open Google in Chrome and maximize the window so it matches the saved screenshots.
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.google.com/");
        driver.manage().window().maximize();

        Screen screen = new Screen();

        // Find the search field by its screenshot and type the query into it.
        screen.type("C:\\Users\\Lenovo-I7\\Desktop\\GoogleSearch.JPG", "who invented selenium");
        Thread.sleep(2000);

        // Find the search button by its screenshot and click it.
        screen.click("C:\\Users\\Lenovo-I7\\Desktop\\sikuli\\GoogleSearchButton.JPG");
        Thread.sleep(4000);

        driver.quit();
    }
}
```
Code walkthrough:
In the above code, we created an object of the Screen class. The object first calls the type() method to enter text into the Google search field and then calls the click() method to click the Google Search button and run the search.
Note: Make sure that you have taken the screenshots properly and passed the correct image name/path in your code; otherwise your script may fail with a FindFailed exception.
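One way to catch a wrong image name or path early is to check that the file exists before handing it to Sikuli. The sketch below uses only the Java standard library; the class ImagePathCheck and the method requireImage are hypothetical helpers introduced for illustration, not part of Sikuli or Selenium.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that fails fast when a reference image is missing. */
final class ImagePathCheck {

    /** Resolves imageName inside imageDir and returns its absolute path, or throws if it is missing. */
    static String requireImage(Path imageDir, String imageName) throws FileNotFoundException {
        Path image = imageDir.resolve(imageName).toAbsolutePath();
        if (!Files.isRegularFile(image)) {
            throw new FileNotFoundException("Reference image not found: " + image);
        }
        return image.toString();
    }

    public static void main(String[] args) throws IOException {
        // Demo with a temporary directory standing in for the screenshot folder.
        Path dir = Files.createTempDirectory("sikuli-images");
        Files.createFile(dir.resolve("GoogleSearch.JPG"));

        System.out.println(requireImage(dir, "GoogleSearch.JPG"));
        try {
            requireImage(dir, "Missing.JPG");
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The returned string can then be passed to methods such as screen.click() or screen.type(), so a missing screenshot is reported with a clear message instead of surfacing later as a failed match.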
Pattern class: It is good practice to use the Pattern class along with the Screen class to identify an element more precisely through additional attributes. Its constructor accepts an image path as a parameter.
Below are some methods predefined in the Pattern class:
- similar(similarity) - Accepts a value between 0 and 1 and returns a Pattern object that matches only when the similarity is at least that value.
- exact() - Sets the minimum similarity to 1.0 and returns a Pattern object that requires an exact match of the specified element.
- targetOffset(x, y) - Accepts x and y offsets, measured from the center of the matched image, and returns a Pattern object whose operations (such as a click) are performed at that offset point.
- getFilename() - Returns the filename of the image defined in the Pattern object.
- getTargetOffset() - Returns the target offset of the Pattern object.
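The two ideas behind similar()/exact() and targetOffset() can be sketched in plain Java. This is a simplified illustration under stated assumptions: the class PatternConceptsSketch and its methods are hypothetical, and the score used here (the fraction of identical pixels) is a stand-in for Sikuli's own similarity measure, which is computed differently.

```java
import java.awt.Point;
import java.awt.Rectangle;

/** Illustrates a minimum-similarity check and a target offset; not Sikuli's implementation. */
final class PatternConceptsSketch {

    /** Fraction of positions at which two equally sized pixel arrays agree (0.0 to 1.0). */
    static double similarity(int[] a, int[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int same = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == b[i]) {
                same++;
            }
        }
        return (double) same / a.length;
    }

    /** A candidate counts as a match only if its score reaches the minimum similarity. */
    static boolean accept(double score, double minSimilarity) {
        return score >= minSimilarity;
    }

    /** The action point is the center of the matched region shifted by (dx, dy). */
    static Point clickPoint(Rectangle match, int dx, int dy) {
        return new Point(match.x + match.width / 2 + dx, match.y + match.height / 2 + dy);
    }

    public static void main(String[] args) {
        int[] saved = {1, 2, 3, 4};
        int[] onScreen = {1, 2, 3, 9}; // one pixel differs from the saved image
        double score = similarity(saved, onScreen);

        System.out.println("score = " + score);
        System.out.println("accepted with minimum 0.7: " + accept(score, 0.7));
        System.out.println("accepted with minimum 1.0: " + accept(score, 1.0));

        // A label matched at (100, 50) with size 40x20; act 30 pixels to the right of its center.
        Point p = clickPoint(new Rectangle(100, 50, 40, 20), 30, 0);
        System.out.println("click at " + p.x + "," + p.y);
    }
}
```

A lower minimum similarity tolerates small rendering differences at the risk of matching the wrong element, while an offset is useful when the stable, recognizable part of the screen (for example a field label) sits next to the place you actually want to click.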
Now let's look at an example that uses the Pattern class along with the Screen class to interact with the window pop-up that appears when we try to upload a file.
In the screenshot above, the file upload pop-up is a flash object: it is a native window rather than part of the page's DOM, so it cannot be located using a Selenium locator, and we use Sikuli to upload the file. This operation can also be done with the Robot class, but that would increase the length of your code. If the GUI of an element is not expected to change in the near future, then Sikuli is the right option for accessing flash objects.
With the help of Sikuli, we type the image name/path into the "File name" field and then click the "Open" button to upload that image.
Here is a screenshot of the directory that stores the three images used in the code below (fileName.JPG, openFile.JPG and GoogleSearchButton.JPG).
Java code sample:
```java
package MyPackage;

import java.awt.AWTException;
import java.util.concurrent.TimeUnit;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.sikuli.script.FindFailed;
import org.sikuli.script.Pattern;
import org.sikuli.script.Screen;
import org.testng.annotations.Test;

public class SikuliFileUpload {

    @Test
    public void func() throws FindFailed, InterruptedException, AWTException {
        System.setProperty("webdriver.chrome.driver", "C:\\Users\\Lenovo-I7\\Desktop\\chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.get("https://blueimp.github.io/jQuery-File-Upload/");
        driver.manage().window().maximize();

        Screen screen = new Screen();

        // Selenium clicks "Add files", which opens the native file upload pop-up.
        WebElement addFile = driver.findElement(By.xpath("//*[@id=\"fileupload\"]/div/div[1]/span[1]"));
        addFile.click();
        // The implicit wait applies to later Selenium lookups only, not to the native pop-up.
        driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);

        // Sikuli types the path of the file to upload into the "File name" field.
        Pattern fileName = new Pattern("C:\\Users\\Lenovo-I7\\Desktop\\sikuli\\fileName.JPG");
        String filePath = "C:\\Users\\Lenovo-I7\\Desktop\\sikuli\\GoogleSearchButton.JPG";
        screen.type(fileName, filePath);

        // Sikuli clicks the "Open" button to confirm the selection.
        Pattern openFile = new Pattern("C:\\Users\\Lenovo-I7\\Desktop\\sikuli\\openFile.JPG");
        screen.click(openFile);
        Thread.sleep(2000);

        // Back in the page, Selenium starts the upload.
        WebElement startUpload = driver.findElement(By.xpath("//span[contains(text(),'Start upload')]"));
        startUpload.click();
        Thread.sleep(4000);

        driver.quit();
    }
}
```
Code walkthrough:
Here our aim is to test the file upload feature. We use the Pattern class to supply the path of the image for each element we want to interact with, and a Screen object to perform operations on those Pattern objects. Selenium first clicks the “Add Files” option. Once the Windows file upload pop-up appears, Sikuli takes over this flash object: it types the image path into the "File name" field and clicks "Open", after which Selenium clicks "Start upload" to upload the image.
Summarizing it all!!
Sikuli is best used for interacting with flash objects that cannot be accessed with Selenium locators alone. To integrate Sikuli with Selenium, we used various methods of the Screen and Pattern classes. Sikuli is a great tool when you need to interact with web elements on the basis of their visual match. Give it a try, and make sure that you have taken the screenshot of the desired web element correctly and passed the correct image name/path in your code.