The page-object model (POM) is a frequently used design pattern in UI automation (browser, mobile or desktop).
It is widely regarded as “the right way” to write automated tests, and many automation gurus would vouch for its benefits.
In this article I will present a theoretical case against the page-object model for automated UI tests and show how it can actually reduce productivity.
UPDATE — part 2 has been uploaded — https://eldadu1985.medium.com/a-case-against-page-object-model-part2-2e3f7c1d0cdd
What is the page-object model?
UI tests are complex, costly and extremely difficult to write and maintain.
So as test suites got larger and larger, automation developers looked for ways to ease the maintenance of their code.
Intuitively, the solution was adherence to object-oriented programming principles.
The first principle that comes to mind is encapsulation, which states that behaviors (methods) should be bound to data fields (members) and that all access to the data fields is indirect, via the methods that encapsulate them.
In other words, the methods either retrieve or mutate the members, and by mutating them they change the state of the object.
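As a minimal illustration of encapsulation in this sense, consider the sketch below. The `Counter` class is a hypothetical example, not part of the test code discussed in this article: its `_count` field is reached only through methods, and one of those methods changes the state of the object.

```python
class Counter:
    """Hypothetical example: the data field is reached only through methods."""

    def __init__(self) -> None:
        self._count = 0  # data field (member), private by convention

    def increment(self) -> None:
        # Mutating method: changes the state of the object.
        self._count += 1

    def value(self) -> int:
        # Retrieving method: reads the state without changing it.
        return self._count


counter = Counter()
counter.increment()
counter.increment()
print(counter.value())
```

Keep this picture in mind: the state changes over the lifetime of the object, and the methods exist to manage that change.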
The second principle is actually a set of principles known as SOLID:
- Single responsibility: a class should have responsibility over a single part of the program, which it should encapsulate.
Or, in the words of Robert C. Martin (a.k.a. Uncle Bob), “there should never be more than one reason for a class to change”.
- Open-closed: classes are open for extension (for example, by inheritance) and closed for modification.
- Liskov substitution: if A subclasses B, then an instance of A can be used wherever a B is expected without breaking the program.
- Interface segregation: in short, you should chop your interfaces into small units so that no client is forced to depend on methods that it doesn’t use.
- Dependency inversion: classes depend on abstractions and not on concrete implementations.
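To make the last principle more tangible, here is a minimal sketch of dependency inversion using `typing.Protocol`. The names `Notifier`, `ConsoleNotifier`, `RecordingNotifier` and `report_result` are all hypothetical and chosen only for illustration.

```python
from typing import List, Protocol


class Notifier(Protocol):
    """The abstraction that the high-level code depends on (hypothetical)."""

    def send(self, message: str) -> None: ...


class ConsoleNotifier:
    """One concrete implementation: prints the message."""

    def send(self, message: str) -> None:
        print(f"notify: {message}")


class RecordingNotifier:
    """Another concrete implementation: stores messages, handy in tests."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def send(self, message: str) -> None:
        self.messages.append(message)


def report_result(notifier: Notifier, passed: bool) -> None:
    # Depends only on the Notifier abstraction, never on a concrete class.
    notifier.send("passed" if passed else "failed")


recorder = RecordingNotifier()
report_result(recorder, passed=True)
report_result(ConsoleNotifier(), passed=False)
```

Either implementation can be passed to `report_result` without changing it, which is also the substitution idea from the list above.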
The third principle is the use of design patterns: recurring problems should be solved with known, templated solutions.
With these principles dominating software engineering, the solution to maintainable UI automation code was to devise a design pattern, and that design pattern became known as the page-object model.
According to the page-object model, each “page” of the application is represented by a class (or an object).
These objects therefore have a “single responsibility” over that one page only.
The page object holds all the locators of that specific page, and all of its methods use these locators internally, so data is encapsulated with behavior.
The test code contains only the high-level test script while all the implementation is done in the page objects, so the two are decoupled and depend on each other only through abstractions.
Let’s see an example:
In this example, we have one test suite with one test case.
The test case performs login and then asserts that the welcome element has the text “Welcome Paul”.
With a single test case this would do.
But as test cases and test suites start to accumulate, it becomes troublesome.
The code is hard to reuse, and it is hard to tell at a glance what the test is actually doing.
So let’s refactor this code into a page object model:
Now, with the page-object model, our test case is shortened to just 5 lines of code, and it is flatter and more readable.
The business logic is implemented by one abstract class, Page, and two concrete classes, LoginPage and HomePage.
The implementation is decoupled and easy to reuse, with no duplicated code.
Things look promising!
The benefits associated with the page-object model are usually the following:
- Easy maintenance.
- Test scripts are more readable.
Indeed, page objects have their benefits; no one denies that.
But in software engineering, just like in any field of engineering and in life in general, we should look not only at the benefits, but also at the costs!
In other words, we should ask ourselves: do the benefits outweigh the costs, or vice versa?
Cost #1 — IT’S NOT A CLASS!
Let’s take a closer look at the LoginPage class:
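class LoginPage(Page):
    def __init__(self, drv):
        super().__init__(drv)  # the base Page is expected to store the driver as self._driver
        self._username_locator: Tuple[str, str] = (By.ID, "txtUsername")
        self._password_locator: Tuple[str, str] = (By.ID, "txtPassword")
        self._login_btn_locator: Tuple[str, str] = (By.ID, "btnLogin")

    def insert_username(self, username: str, timeout: int = 10) -> None:
        username_element = WebDriverWait(self._driver, timeout).until(
            EC.visibility_of_element_located(self._username_locator))  # wait condition assumed
        username_element.send_keys(username)

    def insert_password(self, password: str, timeout: int = 10) -> None:
        password_element = WebDriverWait(self._driver, timeout).until(
            EC.visibility_of_element_located(self._password_locator))
        password_element.send_keys(password)

    def click_login(self, timeout: int = 10) -> HomePage:
        login_btn = WebDriverWait(self._driver, timeout).until(
            EC.visibility_of_element_located(self._login_btn_locator))
        login_btn.click()
        return HomePage(self._driver)  # hand the next page object back to the test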
Yeah, it looks like a class… indeed.
It starts with the class keyword, its name is a noun, it has a self reference, it has 4 methods that are all verbs, and it has some members too.
But it is not a real class! It’s all an illusion!
The so called “members” of this “class” never mutate!
There’s no association between the data fields and the behavior (in other words, there’s no encapsulation).
All these “members” are read-only, and the methods are stateless.
So it’s not really a class…
You know what that really is?
It’s a NAMESPACE! That’s it! Just a fancy namespace, and nothing more.
Cost #2 — it doesn’t make any sense!
Let’s take a look at the test code once more.
login_page: LoginPage = LoginPage(self._driver)
home_page: HomePage = login_page.click_login()
# Compare the welcome text exactly; assertAlmostEqual is intended for numbers.
self.assertEqual(home_page.welcome_message, "Welcome Paul")
So what have we got?
We instantiate the LoginPage class into a login_page object, use 3 of its methods, never access its properties (well… because it doesn’t have any properties…) and that’s it: we dispose of it.
So why do we need this object in the first place?
We only need it for its methods, right? So if it’s just a fancy namespace with methods attached, why do we need to create an entire instance of this “class” and then dispose of it?
Why is that not “just” a function?
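To make the question concrete, here is a minimal sketch of the same login steps written as plain functions, with the locators as module-level constants. So that it runs without a browser, a hypothetical `FakeDriver` stands in for Selenium's `WebDriver`, and the plain string `"id"` stands in for `By.ID`. The fake classes and the credentials are assumptions made for illustration, not code from the original test suite.

```python
from typing import List, Tuple

Locator = Tuple[str, str]

# Read-only locators as module constants ("id" mirrors Selenium's By.ID).
USERNAME: Locator = ("id", "txtUsername")
PASSWORD: Locator = ("id", "txtPassword")
LOGIN_BTN: Locator = ("id", "btnLogin")


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement."""

    def __init__(self, log: List[str], name: str) -> None:
        self._log, self._name = log, name

    def send_keys(self, text: str) -> None:
        self._log.append(f"type {text!r} into {self._name}")

    def click(self) -> None:
        self._log.append(f"click {self._name}")


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; records every action."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def find_element(self, by: str, value: str) -> FakeElement:
        return FakeElement(self.log, value)


def insert_username(driver, username: str) -> None:
    driver.find_element(*USERNAME).send_keys(username)


def insert_password(driver, password: str) -> None:
    driver.find_element(*PASSWORD).send_keys(password)


def click_login(driver) -> None:
    driver.find_element(*LOGIN_BTN).click()


driver = FakeDriver()
insert_username(driver, "paul")    # placeholder credentials
insert_password(driver, "secret")
click_login(driver)
print("\n".join(driver.log))
```

Nothing is instantiated except the driver itself, and the test reads as a flat list of verbs.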
Cost #3 — It is not as readable as you think
For example, the click_login method of the LoginPage class returns an instance of the HomePage class, the idea being that the calling code cannot create the wrong page object by mistake (this kind of mistake-proofing is usually referred to as poka-yoke).
But while speculating about future problems, we usually add more and more layers of complexity to our code that do not pay off in the long run.
In this example it seems easy: you instantiate one object, use a few methods, get a new object and run another method.
But in a longer test case you are probably going to move between 8–10 pages, which means you’ll end up managing 8–10 page objects in your test code, and sometimes you might even go back and forth between these pages.
So now whoever writes or reads the test code has to keep track of all these instances and understand what each one means and what it is for.
Cost #4 — shared elements?
Very often, we have elements shared by different pages, and sometimes we even get similar behaviors in different pages.
In our example, after we log in we have a main menu at the top.
This is a shared element that almost all pages in our application use, and the selection of a single item from the menu is a shared behavior.
How are we going to encapsulate this shared behavior across different pages?
In object-oriented programming there are really only 2 ways to do that:
- Inheritance: objects inherit behaviors and fields from their parent classes.
- Composition: objects hold a reference to the object containing the shared behaviors.
So now you have to really think through what objects are needed for every page model and how to bind them all together.
The consequence is a massive conceptual overhead that is not likely to pay off.
If you wrote these methods as pure functions and the locators as constants (in other words, the behaviors and data fields would not be encapsulated), you could get more direct access to the needed data fields without the burden of wiring up all your objects.
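A minimal sketch of that idea for the shared menu follows. The XPath template, the menu item names and the `FakeDriver` (a stand-in so the sketch runs without a browser) are all hypothetical; `"xpath"` mirrors Selenium's `By.XPATH`.

```python
from typing import List, Tuple

Locator = Tuple[str, str]

# Hypothetical locator template for one entry of the shared top menu.
MENU_ITEM_XPATH = "//nav[@id='mainMenu']//a[text()='{item}']"


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement."""

    def __init__(self, clicked: List[Locator], locator: Locator) -> None:
        self._clicked, self._locator = clicked, locator

    def click(self) -> None:
        self._clicked.append(self._locator)


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; records clicked locators."""

    def __init__(self) -> None:
        self.clicked: List[Locator] = []

    def find_element(self, by: str, value: str) -> FakeElement:
        return FakeElement(self.clicked, (by, value))


def select_menu_item(driver, item: str) -> None:
    # Callable from any test step, whichever page is currently displayed.
    locator = ("xpath", MENU_ITEM_XPATH.format(item=item))
    driver.find_element(*locator).click()


driver = FakeDriver()
select_menu_item(driver, "Reports")
select_menu_item(driver, "Settings")
print(driver.clicked)
```

There is no base class to inherit from and no menu object to compose into every page: the function only needs the driver.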
Cost #5 — It’s speculative
The concern of shared elements, or cross-system considerations in general, leads to speculative software designs that become very difficult to grasp.
In other words, when we create our page object model, we start to think about future needs and problems that might occur and we prepare our code to absorb these future changes.
In our example, the Page class is abstract; we made it abstract because we didn’t want to duplicate the constructor in all the concrete classes.
So we might then decide that we don’t want to duplicate all our “wait” statements so we will refactor our code into this:
This refactor is actually a good one, because we eliminated duplicated code.
However, in the early stages of our design we imagine that there will be different waiting strategies: for example, waiting not only for the element itself to be visible but for a list of other elements to be visible.
If you follow object-oriented programming seriously, then you would probably create a WaitStrategy class with a “wait” method and have the Page class hold an instance of this class.
The result is automation code that has too many concepts and design patterns and is even harder to understand or keep track of.
Instead, if you just write functions, you can develop only the parts of the code that are necessary, and whenever you realize you need a new wait strategy you can implement it on the spot and use it where needed.
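Here is a minimal sketch of what that could look like, using only the standard library. The helper names `wait_until` and `all_of` are hypothetical, and the conditions are stand-ins: in real test code they would query the driver.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0,
               poll: float = 0.1) -> None:
    """Poll `condition` until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(poll)


def all_of(*conditions: Callable[[], bool]) -> Callable[[], bool]:
    """A second strategy, written only once it is needed: all must hold."""
    return lambda: all(check() for check in conditions)


# Usage with stand-in conditions.
started = time.monotonic()
wait_until(lambda: time.monotonic() - started > 0.2, timeout=2.0)
wait_until(all_of(lambda: True, lambda: True), timeout=1.0)
```

A new waiting strategy is just another function returning a condition; no class hierarchy has to be planned in advance.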
Cost #6 — You start asking silly questions
Should I have a “Waiter” object for different waiting strategies?
Should I create a “Navigator” object to navigate through these pages?
Or should pages navigate themselves?
Maybe I should also create a “clicker” object for different clicking strategies or a filler “Selector” object for different combobox selection strategies?
As your entire infrastructure is based on classes and objects instead of functions, your entire thought process becomes a huge philosophical dilemma.
Cost #7 — The kingdom of nouns
This refers to Steve Yegge’s satirical article, “Execution in the Kingdom of Nouns”.
The point of that article is that object-oriented code (particularly in Java) has a tendency to put too much emphasis on nouns (i.e. classes, objects, “things”) rather than on verbs (functions, methods, behavior).
The consequence is that noun-oriented thinking leads to overly abstract code where the actual lines of logic are scattered around and very difficult to track.
The various behaviors of the code are fractured into tiny pieces that would then have to be wired up together.
The page-object model falls into the same trap: since it imposes a concept in which all our actions happen in the context of pages, our entire orientation is towards the “things”.
Cost #8 — paralysis by analysis
Analysis paralysis is the situation in which the analysis phase takes so long that decision making is delayed beyond any realistic time scale.
Page-object models are more prone to this because of their heavily philosophical nature.
As our automation code gets over-engineered for an allegedly more flexible infrastructure, alongside speculative generalizations that are not needed, the design phase tends to take too long.
That’s not to say that no thought process is needed in designing automation code.
Like any code, automation infrastructure should be designed and planned out.
Over-planning and overthinking, however, lead to a situation where our gains become our losses.
The benefits of the page-object model are, in my opinion, outweighed by its costs.
The constant overthinking and the need to fracture your code into ever smaller pieces and then wire them all together add up to a burden that is not justified.