The evolution of cucumber UI test steps

I’m currently the framework/lead developer of a UI testing framework using Selenium WebDriver and Cucumber (written in Javascript using the webdriver.io bindings).  In an ideal world, cucumber test steps have no reference to anything beyond what is visible to the user in the user interface:

And I enter Password1 in the Password field
And I click the Log In button

In the initial version of the test steps that I wrote, I made steps like this reusable by allowing the test creator to enter custom data in reusable steps:

And I enter "Password1" in the "Password" field
And I click the "Log In" button

Where the text in parentheses is a variable, allowing the same step to be used for any similar field:

And I enter "johndoe" in the "Username" field
And I enter "Password1" in the "Password" field

The Javascript code behind these cucumber steps made certain assumptions about how the UI text was associated with the UI objects that the user needed to interact with, but this was hidden from the author of the cucumber steps. In the examples above, the assumption was that the text associated with the input field was contained in an associated <label> tag, e.g.,

<label for="username">Username</label>
<input type="text" name="username>

Another example step:

And I click the "Submit" button

which corresponds to this DOM element:

 <input type="submit" value="Submit">

For our applications that are actively under development, we discussed the need for consistent UI conventions with our developers and they agreed to abide by a set of UI design patterns. Since the agreed upon patterns represented best practices in HTML UI development in addition to making automation easier, our developers were agreeable. Afterwards, if we started to automate a new UI screen and discovered that it did not conform to the agreed-upon patterns, we filed a design-for-testability bug and the UI would be changed.

In the next phase of our automation project, we began to  to automate some of our legacy applications, and we quickly discovered that the developers had not been nearly as consistent with the UI designs of these applications: the assumptions behind our existing cucumber steps did not always hold true. Since these applications will eventually be phased out, the company is carefully limiting the amount of development work that is put into them. Therefore, we could not request that the UIs be changed to enable them to conform to the design patterns we had agreed upon for our new UIs.

To accommodate automation of these applications, we had to deviate from the cucumber principle that the test writer doesn’t need to know any more about the UI than what they can see. We developed some steps for other common UI patterns, and our automated test developers had to look at the DOM of the elements they needed to interact with in order to decide which step to use, such as:

And I enter "johndoe" in the input text field with name "username"

which corresponds to the input text field:

<input type="text" name="username">

and:

And I click the button that contains the text "Submit"

which corresponds to:

<button...>Submit</button>

(In that step, the ‘contains’ is the part that requires the user to understand the DOM and which differentiates this step from other steps related to buttons).

To my pleasant surprise, even our manual testers who didn’t really have any significant experience with HTML or the DOM learned these UI patterns quickly and adapted to them.

For the most part, we discovered that the UIs of our legacy applications, while not as conducive to the spirit of cucumber, still used only a fairly small set of UI conventions. We coded some more reusable steps for those other conventions and covered probably 90% of the cases that we encountered.

Eventually, I had to add some steps for the more obtuse UI design cases, such as:

And I enter "johndoe" in the "input" field with attribute "custom_attribute_name" with value "x_username"

I’m the first to admit that that is an ugly, ugly step, but steps like this allow us to automate the other 10% of UI designs that lie outside the conventions outlined above, and we’ve only had to use these types of steps a few times.

It did not take us long to create a library of steps that accommodate all of our standard UI interactions. I don’t think I’ve added a general-purpose UI interaction step in a couple of months, which is awesome.

For the next phase in the evolution of our automation framework, I’m thinking of using the XPATH ‘or’ attribute to collapse multiple steps back into one, but that may end up being more confusing to our test writers, especially if I can collapse some similar steps but not all of the same type. We’ll see.

 

Identifying form elements by the <label> tag contents

In a previous post, I explained how I use the text associated with UI objects in my cucumber tests. My steps look something like this:

Given I go to the login page
And I enter "johndoe" in the field with id "username"
And I enter "password1" in the field with id "password"

If the application under test is using the <label> tag to identify form elements (and it should! The <label> tag was designed specifically for this purpose), then your application under test has UI objects that look something like this:

<label for="username">Username</label>
<input name="username" type="text"></input>

Writing a cucumber step to interact with the <input> field based on the text in the <label> tag consists of locating the label element and then using the value of its for= attribute to locate the input element. Using the webdriver.io Selenium bindings, my code looks like this:

  this.Given(/^I enter "([^"]*)" in the "([^"]*)" field$/, function (textToEnter, labelText, callback) {
    xpath = '//label[contains(.,"' + labelText + '")]';
    this.getAttribute(xpath, 'for').then(
      function (value) {
      // get the input tag with the name= {value} and enter the text
        xpath = '//input[@name="' + value + '"]';
        that.setValue(xpath, textToEnter).then(
          function () {
            callback();
          },
          function (err) {
            callback(err);
          }
        );
      },
      function (err) {
        callback(err);
      }
    );
  });

Writing Selenium test steps for cucumber tests

In my current job, I’m responsible for developing a framework for performing UI testing of web and mobile applications with Selenium WebDriver/Appium, using the webdriver.io bindings for node.js Javascript. We have adopted cucumber as the format for defining the tests.

Conventional wisdom regarding UI testing holds that you should always strive to select UI objects by the attribute that is least likely to change. With web UIs, you’ll typically see a hierarchy of preferred selectors as the following:

  1. Element ID
  2. Element name
  3. CSS class(es) associated with the element
  4. Text associated with the element

If you used this strategy with cucumber, you would get tests that looked something like this:

Given I go to the login page
And I enter "johndoe" in the field with id "username"
And I enter "password1" in the field with id "password"
Then the input element with type "submit" should become enabled
And I click the input element with type "submit"

But such steps are counter to the principles of behavior-driven development. The primary purpose of cucumber is that it allows your business owner to write requirements that can also directly serve as tests, and identifiers such as element ID or name are artifacts of implementation details, not part of the basic requirements. Or more simply–your business owner should be able to describe interaction with the application using only what’s visible to them in the UI.

More typically, cucumber steps for the login scenario above would look something like this:

Given I go to the login page
And I enter "johndoe" in the "Username" field
And I enter "password1" in the "Password" field
The the "Log in" button should become enabled
And I click the "Log in" button

Note that the steps use the visible text associated with each element to select the element.

The argument against this approach is: “But whenever your UI text changes–and let’s face it, it often changes–you’ll have to go update every single test!” While this is true, my rebuttal is that it’s a simple global search and replace change, and more importantly, it’s not a change to the code behind the cucumber, only to the cucumber test definitions, which, in my opinion, is a lower risk change than an actual code change.

To be fair, as the number of tests grew, I saw the redundancy between tests as a maintenance risk, so I’ve abstracted out frequently used functions, such as login and navigation, into single steps. Steps like the examples above no longer exists in our codebase. Instead, we have single steps such as:

Given I log in as username "johndoe" and password "password1"

So, now, if the text on our login page changes, we only have to change it in one place.

In my next blog post, I’ll show some of the code behind these steps.

UPDATE: Here is the next blog post.