1 - Page object models
Note: this page has merged contents from multiple sources, including
the Selenium wiki
Overview
Within your web app’s UI, there are areas where your tests interact with.
A Page Object only models these as objects within the test code.
This reduces the amount of duplicated code and means that if the UI changes,
the fix needs only to be applied in one place.
Page Object is a Design Pattern that has become popular in test automation for
enhancing test maintenance and reducing code duplication. A page object is an
object-oriented class that serves as an interface to a page of your AUT. The
tests then use the methods of this page object class whenever they need to
interact with the UI of that page. The benefit is that if the UI changes for
the page, the tests themselves don’t need to change, only the code within the
page object needs to change. Subsequently, all changes to support that new UI
are located in one place.
Advantages
- There is a clean separation between the test code and page-specific code, such as
locators (or their use if you’re using a UI Map) and layout.
- There is a single repository for the services or operations the page offers
rather than having these services scattered throughout the tests.
In both cases, this allows any modifications required due to UI changes to all
be made in one place. Helpful information on this technique can be found on
numerous blogs as this ‘test design pattern’ is becoming widely used. We
encourage readers who wish to know more to search the internet for blogs
on this subject. Many have written on this design pattern and can provide
helpful tips beyond the scope of this user guide. To get you started,
we’ll illustrate page objects with a simple example.
Examples
First, consider an example, typical of test automation, that does not use a
page object:
There are two problems with this approach.
- There is no separation between the test method and the AUT’s locators (IDs in
this example); both are intertwined in a single method. If the AUT’s UI changes
its identifiers, layout, or how a login is input and processed, the test itself
must change.
- The ID-locators would be spread in multiple tests, in all tests that had to
use this login page.
Applying the page object techniques, this example could be rewritten like this
in the following example of a page object for a Sign-in page.
and page object for a Home page could look like this.
So now, the login test would use these two page objects as follows.
There is a lot of flexibility in how the page objects may be designed, but
there are a few basic rules for getting the desired maintainability of your
test code.
Assertions in Page Objects
Page objects themselves should never make verifications or assertions. This is
part of your test and should always be within the test’s code, never in a page
object. The page object will contain the representation of the page, and the
services the page provides via methods but no code related to what is being
tested should be within the page object.
There is one, single, verification which can, and should, be within the page
object and that is to verify that the page, and possibly critical elements on
the page, were loaded correctly. This verification should be done while
instantiating the page object. In the examples above, both the SignInPage and
HomePage constructors check that the expected page is available and ready for
requests from the test.
Page Component Objects
A page object does not necessarily need to represent all the parts of a
page itself. This was noted by Martin Fowler in the early days, while first coining the term “panel objects”.
The same principles used for page objects can be used to
create “Page Component Objects”, as it was later called, that represent discrete chunks of the
page and can be included in page objects. These component objects can
provide references to the elements inside those discrete chunks, and
methods to leverage the functionality or behavior provided by them.
For example, a Products page has multiple products.
Each product is a component of the Products page.
The Products page HAS-A list of products. This object relationship is called Composition. In simpler terms, something is composed of another thing.
The Product component object is used inside the Products page object.
So now, the products test would use the page object and the page component object as follows.
The page and component are represented by their own objects. Both objects only have methods for the services they offer, which matches the real-world application in object-oriented programming.
You can even
nest component objects inside other component objects for more complex
pages. If a page in the AUT has multiple components, or common
components used throughout the site (e.g. a navigation bar), then it
may improve maintainability and reduce code duplication.
Other Design Patterns Used in Testing
There are other design patterns that also may be used in testing. Discussing all of these is
beyond the scope of this user guide. Here, we merely want to introduce the
concepts to make the reader aware of some of the things that can be done. As
was mentioned earlier, many have blogged on this topic and we encourage the
reader to search for blogs on these topics.
Implementation Notes
PageObjects can be thought of as facing in two directions simultaneously. Facing toward the developer of a test, they represent the services offered by a particular page. Facing away from the developer, they should be the only thing that has a deep knowledge of the structure of the HTML of a page (or part of a page) It’s simplest to think of the methods on a Page Object as offering the “services” that a page offers rather than exposing the details and mechanics of the page. As an example, think of the inbox of any web-based email system. Amongst the services it offers are the ability to compose a new email, choose to read a single email, and list the subject lines of the emails in the inbox. How these are implemented shouldn’t matter to the test.
Because we’re encouraging the developer of a test to try and think about the services they’re interacting with rather than the implementation, PageObjects should seldom expose the underlying WebDriver instance. To facilitate this, methods on the PageObject should return other PageObjects. This means we can effectively model the user’s journey through our application. It also means that should the way that pages relate to one another change (like when the login page asks the user to change their password the first time they log into a service when it previously didn’t do that), simply changing the appropriate method’s signature will cause the tests to fail to compile. Put another way; we can tell which tests would fail without needing to run them when we change the relationship between pages and reflect this in the PageObjects.
One consequence of this approach is that it may be necessary to model (for example) both a successful and unsuccessful login; or a click could have a different result depending on the app’s state. When this happens, it is common to have multiple methods on the PageObject:
The code presented above shows an important point: the tests, not the PageObjects, should be responsible for making assertions about the state of a page. For example:
could be re-written as:
Of course, as with every guideline, there are exceptions, and one that is commonly seen with PageObjects is to check that the WebDriver is on the correct page when we instantiate the PageObject. This is done in the example below.
Finally, a PageObject need not represent an entire page. It may represent a section that appears frequently within a site or page, such as site navigation. The essential principle is that there is only one place in your test suite with knowledge of the structure of the HTML of a particular (part of a) page.
Summary
- The public methods represent the services that the page offers
- Try not to expose the internals of the page
- Generally don’t make assertions
- Methods return other PageObjects
- Need not represent an entire page
- Different results for the same action are modelled as different methods
Example
2 - Domain specific language
A domain specific language (DSL) is a system which provides the user with
an expressive means of solving a problem. It allows a user to
interact with the system on their terms – not just programmer-speak.
Your users, in general, do not care how your site looks. They do not
care about the decoration, animations, or graphics. They
want to use your system to push their new employees through the
process with minimal difficulty; they want to book travel to Alaska;
they want to configure and buy unicorns at a discount. Your job as
tester is to come as close as you can to “capturing” this mind-set.
With that in mind, we set about “modeling” the application you are
working on, such that the test scripts (the user’s only pre-release
proxy) “speak” for, and represent the user.
The goal is to use ubiquitous language. Rather than referring to “load data into this table” or
“click on the third column” it should be possible to use language such as “create a new account” or
“order displayed results by name”
With Selenium, DSL is usually represented by methods, written to make
the API simple and readable – they enable a report between the
developers and the stakeholders (users, product owners, business
intelligence specialists, etc.).
Benefits
- Readable: Business stakeholders can understand it.
- Writable: Easy to write, avoids unnecessary duplication.
- Extensible: Functionality can (reasonably) be added
without breaking contracts and existing functionality.
- Maintainable: By leaving the implementation details out of test
cases, you are well-insulated against changes to the AUT*.
Further Reading
(previously located: https://github.com/SeleniumHQ/selenium/wiki/Domain-Driven-Design)
There is a good book on Domain Driven Design by Eric Evans http://www.amazon.com/exec/obidos/ASIN/0321125215/domainlanguag-20
And to whet your appetite there’s a useful smaller book available online for
download at http://www.infoq.com/minibooks/domain-driven-design-quickly
Java
Here is an example of a reasonable DSL method in Java.
For brevity’s sake, it assumes the driver
object is pre-defined
and available to the method.
This method completely abstracts the concepts of input fields,
buttons, clicking, and even pages from your test code. Using this
approach, all a tester has to do is call this method. This gives
you a maintenance advantage: if the login fields ever changed, you
would only ever have to change this method - not your tests.
It bears repeating: one of your primary goals should be writing an
API that allows your tests to address the problem at hand, and NOT
the problem of the UI. The UI is a secondary concern for your
users – they do not care about the UI, they just want to get their job
done. Your test scripts should read like a laundry list of things
the user wants to DO, and the things they want to KNOW. The tests
should not concern themselves with HOW the UI requires you to go
about it.
*AUT: Application under test
3 - Generating application state
Selenium should not be used to prepare a test case. All repetitive
actions and preparations for a test case, should be done through other
methods. For example, most web UIs have authentication (e.g. a login
form). Eliminating logging in via web browser before every test will
improve both the speed and stability of the test. A method should be
created to gain access to the AUT* (e.g. using an API to login and set a
cookie). Also, creating methods to pre-load data for
testing should not be done using Selenium. As mentioned previously,
existing APIs should be leveraged to create data for the AUT*.
*AUT: Application under test
7 - Tips on working with locators
When to use which locators and how best to manage them in your code.
Take a look at examples of the supported locator strategies.
In general, if HTML IDs are available, unique, and consistently
predictable, they are the preferred method for locating an element on
a page. They tend to work very quickly, and forego much processing
that comes with complicated DOM traversals.
If unique IDs are unavailable, a well-written CSS selector is the
preferred method of locating an element. XPath works as well as CSS
selectors, but the syntax is complicated and frequently difficult to
debug. Though XPath selectors are very flexible, they are typically
not performance tested by browser vendors and tend to be quite slow.
Selection strategies based on linkText and partialLinkText have
drawbacks in that they only work on link elements. Additionally, they
call down to querySelectorAll selectors internally in WebDriver.
Tag name can be a dangerous way to locate elements. There are
frequently multiple elements of the same tag present on the page.
This is mostly useful when calling the findElements(By) method which
returns a collection of elements.
The recommendation is to keep your locators as compact and
readable as possible. Asking WebDriver to traverse the DOM structure
is an expensive operation, and the more you can narrow the scope of
your search, the better.
8 - Test independency
Write each test as its own unit. Write the tests in a way that will not be
reliant on other tests to complete:
Let us say there is a content management system with which you can create
some custom content which then appears on your website as a module after
publishing, and it may take some time to sync between the CMS and the
application.
A wrong way of testing your module is that the content is created and
published in one test, and then checking the module in another test. This
is not feasible as the content may not be available immediately for the
other test after publishing.
Instead, you can create a stub content which can be turned on and off
within the affected test, and use that for validating the module. However,
for content creation, you can still have a separate test.
9 - Consider using a fluent API
Martin Fowler coined the term “Fluent API”. Selenium already
implements something like this in their FluentWait
class, which is
meant as an alternative to the standard Wait
class.
You could enable the Fluent API design pattern in your page object
and then query the Google search page with a code snippet like this one:
The Google page object class with this fluent behavior
might look like this:
10 - Fresh browser per test
Start each test from a clean, known state.
Ideally, spin up a new virtual machine for each test.
If spinning up a new virtual machine is not practical,
at least start a new WebDriver for each test.
Most browser drivers like GeckoDriver and ChromeDriver will start with a clean
known state with a new user profile, by default.