The evolution of cucumber UI test steps

I’m currently the lead developer of a UI testing framework using Selenium WebDriver and Cucumber (written in Javascript using the Node.js bindings). In an ideal world, cucumber test steps have no reference to anything beyond what is visible to the user in the user interface:

And I enter Password1 in the Password field
And I click the Log In button

In the initial version of the test steps that I wrote, I made steps like this reusable by allowing the test creator to enter custom data in reusable steps:

And I enter "Password1" in the "Password" field
And I click the "Log In" button

Where the text in quotation marks is a variable, allowing the same step to be used for any similar field:

And I enter "johndoe" in the "Username" field
And I enter "Password1" in the "Password" field

The Javascript code behind these cucumber steps made certain assumptions about how the UI text was associated with the UI objects that the user needed to interact with, but this was hidden from the author of the cucumber steps. In the examples above, the assumption was that the text associated with the input field was contained in an associated <label> tag, e.g.,

<label for="username">Username</label>
<input type="text" id="username" name="username">

Another example step:

And I click the "Submit" button

which corresponds to this DOM element:

<input type="submit" value="Submit">
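The step definition behind this pattern boils down to building an XPath from the visible text. A minimal sketch (the helper name is mine, not from our framework):

```javascript
// Hypothetical helper: build the XPath that the 'I click the ... button'
// step uses to find an <input type="submit"> by its visible value= text.
function submitButtonXpath(buttonText) {
  return '//input[@type="submit" and @value="' + buttonText + '"]';
}

// Inside the step definition, the framework would then do roughly:
//   this.click(submitButtonXpath(buttonText));
console.log(submitButtonXpath('Submit'));
// → //input[@type="submit" and @value="Submit"]
```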

For our applications that are actively under development, we discussed the need for consistent UI conventions with our developers and they agreed to abide by a set of UI design patterns. Since the agreed upon patterns represented best practices in HTML UI development in addition to making automation easier, our developers were agreeable. Afterwards, if we started to automate a new UI screen and discovered that it did not conform to the agreed-upon patterns, we filed a design-for-testability bug and the UI would be changed.

In the next phase of our automation project, we began to automate some of our legacy applications, and we quickly discovered that the developers had not been nearly as consistent with the UI designs of these applications: the assumptions behind our existing cucumber steps did not always hold true. Since these applications will eventually be phased out, the company is carefully limiting the amount of development work that is put into them. Therefore, we could not request that the UIs be changed to enable them to conform to the design patterns we had agreed upon for our new UIs.

To accommodate automation of these applications, we had to deviate from the cucumber principle that the test writer doesn’t need to know any more about the UI than what they can see. We developed some steps for other common UI patterns, and our automated test developers had to look at the DOM of the elements they needed to interact with in order to decide which step to use, such as:

And I enter "johndoe" in the input text field with name "username"

which corresponds to the input text field:

<input type="text" name="username">


And I click the button that contains the text "Submit"

which corresponds to:

<button>Submit</button>

(In that step, ‘contains’ is the part that requires the test writer to understand the DOM, and it is what differentiates this step from our other button-related steps.)
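The difference shows up directly in the XPath each step generates. A sketch (the function names are mine, for illustration):

```javascript
// Exact-match step: the button's text must equal the given string.
function buttonWithTextXpath(text) {
  return '//button[text()="' + text + '"]';
}

// 'Contains' step: the button's text only needs to include the string,
// which is why the test writer has to look at what the DOM actually holds.
function buttonContainingTextXpath(text) {
  return '//button[contains(.,"' + text + '")]';
}

console.log(buttonWithTextXpath('Submit'));
console.log(buttonContainingTextXpath('Submit'));
```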

To my pleasant surprise, even our manual testers who didn’t really have any significant experience with HTML or the DOM learned these UI patterns quickly and adapted to them.

For the most part, we discovered that the UIs of our legacy applications, while not as conducive to the spirit of cucumber, still used only a fairly small set of UI conventions. We coded some more reusable steps for those other conventions and covered probably 90% of the cases that we encountered.

Eventually, I had to add some steps for the more obtuse UI design cases, such as:

And I enter "johndoe" in the "input" field with attribute "custom_attribute_name" with value "x_username"

I’m the first to admit that that is an ugly, ugly step, but steps like this allow us to automate the other 10% of UI designs that lie outside the conventions outlined above, and we’ve only had to use these types of steps a few times.
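Under the hood, a step like that just assembles an XPath from the pieces the test writer supplies. A sketch (the helper name is mine, not from our framework):

```javascript
// Hypothetical helper behind the 'ugly' step: locate a tag by an
// arbitrary attribute name/value pair supplied by the test writer.
function elementByAttributeXpath(tag, attrName, attrValue) {
  return '//' + tag + '[@' + attrName + '="' + attrValue + '"]';
}

console.log(elementByAttributeXpath('input', 'custom_attribute_name', 'x_username'));
// → //input[@custom_attribute_name="x_username"]
```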

It did not take us long to create a library of steps that accommodate all of our standard UI interactions. I don’t think I’ve added a general-purpose UI interaction step in a couple of months, which is awesome.

For the next phase in the evolution of our automation framework, I’m thinking of using the XPath ‘or’ operator to collapse multiple steps back into one, but that may end up being more confusing to our test writers, especially if I can collapse some similar steps but not all of the same type. We’ll see.
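For example, the two button flavors above could in principle share one locator by combining their predicates with XPath’s ‘or’ operator. A sketch of the idea, not something we’ve deployed:

```javascript
// One locator for a clickable 'button', whether it is rendered as
// <button>Submit</button> or as <input type="submit" value="Submit">.
// ('and' binds tighter than 'or' in XPath, so this matches either form.)
function anyButtonXpath(text) {
  return '//*[self::button and contains(.,"' + text + '")' +
         ' or self::input[@type="submit"] and @value="' + text + '"]';
}

console.log(anyButtonXpath('Submit'));
```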


Automating android cordova/phonegap applications with appium

I am just putting this out there for others to find, because I had such a difficult time locating this crucial tidbit of information: if you select the Android WebView context in your hybrid mobile application, you can run your DOM-based Selenium tests against it using appium.

I am tasked with creating UI automation for an angularJS-based application that we are deploying as a web application and as a mobile application using cordova/phonegap. I wrote my Selenium-based tests for the web deployment and then wanted to use the same tests for the Android mobile application using appium. When I launched my mobile application in the Android emulator and then used Android’s UIAutomator to view the UI objects, all I saw were the Android-native objects, no DOM-based objects–even when I selected the WebView context. My heart sank because I thought I would have to write separate automation for the web deployment and the Android app. After quite a bit of Googling, though, I found the nugget of information above (I can’t find the source now). So, I’m able to write my tests against the web deployment using Selenium and then run them against the Android app using appium.
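Roughly, the switch looks like this with Appium’s JavaScript bindings. The exact context name varies by app and Appium version, so the helper below (a sketch; the names are mine) just looks for the first context starting with ‘WEBVIEW’:

```javascript
// Pick the WebView context out of the list that Appium's
// driver.contexts() returns, e.g. ['NATIVE_APP', 'WEBVIEW_com.example.app'].
function findWebviewContext(contexts) {
  for (var i = 0; i < contexts.length; i++) {
    if (contexts[i].indexOf('WEBVIEW') === 0) {
      return contexts[i];
    }
  }
  return null;
}

// Against a live Appium session, the usage would be roughly:
//   driver.contexts().then(function (ctxs) {
//     return driver.context(findWebviewContext(ctxs));
//   });
console.log(findWebviewContext(['NATIVE_APP', 'WEBVIEW_com.example.app']));
// → WEBVIEW_com.example.app
```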

Identifying form elements by the <label> tag contents

In a previous post, I explained how I use the text associated with UI objects in my cucumber tests. My steps look something like this:

Given I go to the login page
And I enter "johndoe" in the field with id "username"
And I enter "password1" in the field with id "password"

If the application under test is using the <label> tag to identify form elements (and it should! The <label> tag was designed specifically for this purpose), then your application under test has UI objects that look something like this:

<label for="username">Username</label>
<input id="username" name="username" type="text">

Writing a cucumber step to interact with the <input> field based on the text in the <label> tag consists of locating the label element and then using the value of its for= attribute to locate the input element. Using the Selenium bindings, my code looks like this:

  this.Given(/^I enter "([^"]*)" in the "([^"]*)" field$/, function (textToEnter, labelText, callback) {
    var that = this;
    // find the label whose visible text contains labelText
    var labelXpath = '//label[contains(.,"' + labelText + '")]';
    this.getAttribute(labelXpath, 'for').then(
      function (value) {
        // get the input tag with name= {value} and enter the text
        var inputXpath = '//input[@name="' + value + '"]';
        that.setValue(inputXpath, textToEnter).then(
          function () { callback(); },
          function (err) { callback(err); });
      },
      function (err) { callback(err); });
  });

Writing Selenium test steps for cucumber tests

In my current job, I’m responsible for developing a framework for performing UI testing of web and mobile applications with Selenium WebDriver/Appium, using the bindings for node.js Javascript. We have adopted cucumber as the format for defining the tests.

Conventional wisdom regarding UI testing holds that you should always strive to select UI objects by the attribute that is least likely to change. With web UIs, you’ll typically see a hierarchy of preferred selectors such as the following:

  1. Element ID
  2. Element name
  3. CSS class(es) associated with the element
  4. Text associated with the element
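In locator terms, that hierarchy maps to something like the following (the field and class names here are illustrative, not from a real application):

```javascript
// Illustrative locators for the same hypothetical username field,
// ordered from most to least stable.
var byId    = '//input[@id="username"]';          // 1. element ID
var byName  = '//input[@name="username"]';        // 2. element name
var byCss   = 'input.login-form.username';        // 3. CSS classes
var byLabel = '//label[contains(.,"Username")]';  // 4. associated text

console.log([byId, byName, byCss, byLabel].join('\n'));
```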

If you used this strategy with cucumber, you would get tests that looked something like this:

Given I go to the login page
And I enter "johndoe" in the field with id "username"
And I enter "password1" in the field with id "password"
Then the input element with type "submit" should become enabled
And I click the input element with type "submit"

But such steps are counter to the principles of behavior-driven development. The primary purpose of cucumber is that it allows your business owner to write requirements that can also directly serve as tests, and identifiers such as element ID or name are artifacts of implementation details, not part of the basic requirements. Or more simply–your business owner should be able to describe interaction with the application using only what’s visible to them in the UI.

More typically, cucumber steps for the login scenario above would look something like this:

Given I go to the login page
And I enter "johndoe" in the "Username" field
And I enter "password1" in the "Password" field
Then the "Log in" button should become enabled
And I click the "Log in" button

Note that the steps use the visible text associated with each element to select the element.

The argument against this approach is: “But whenever your UI text changes–and let’s face it, it often changes–you’ll have to go update every single test!” While this is true, my rebuttal is that it’s a simple global search and replace change, and more importantly, it’s not a change to the code behind the cucumber, only to the cucumber test definitions, which, in my opinion, is a lower risk change than an actual code change.

To be fair, as the number of tests grew, I saw the redundancy between tests as a maintenance risk, so I’ve abstracted out frequently used functions, such as login and navigation, into single steps. Steps like the examples above no longer exist in our codebase. Instead, we have single steps such as:

Given I log in as username "johndoe" and password "password1"

So, now, if the text on our login page changes, we only have to change it in one place.
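The composite step is just the smaller interactions chained inside one definition. Below is a sketch against a stub ‘world’ object so it can run standalone; in the real framework the setValue/click helpers would be Selenium-backed, and the locators here are assumptions:

```javascript
// Stub world that records actions instead of driving a browser.
var actions = [];
var world = {
  setValue: function (xpath, text) {
    actions.push(['setValue', xpath, text]);
    return Promise.resolve();
  },
  click: function (xpath) {
    actions.push(['click', xpath]);
    return Promise.resolve();
  }
};

// The body of a 'I log in as username ... and password ...' step:
// fill both fields, then click the submit button.
function logIn(world, username, password) {
  return world.setValue('//input[@name="username"]', username)
    .then(function () {
      return world.setValue('//input[@name="password"]', password);
    })
    .then(function () {
      return world.click('//input[@type="submit"]');
    });
}

logIn(world, 'johndoe', 'password1').then(function () {
  console.log(actions.length + ' UI actions performed');
});
```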

In my next blog post, I’ll show some of the code behind these steps.

UPDATE: Here is the next blog post.

Underestimating myself

I’m a self-taught programmer; I’ve never done it full-time. Most of the QA jobs that I have had over the years have included some amount of scripting and/or automated testing work, but it’s always been done alongside other QA tasks. When I was job hunting last year, I concluded that if I really wanted to remain competitive in my career, my next job needed to focus solely or primarily on writing test automation. Plus, I love to code, and I thought I would really enjoy such a job.

During my job hunt, my standard line about my coding skills was: I’m a decent coder, but because I’ve never done it full-time, I’m probably not qualified to fill a role where I’m the person primarily responsible for the development of an automation framework. When I interviewed here at Netspend for a senior automation developer job, I was told that they were seeking someone with a strong automation background but that they were less concerned whether the candidate knew the language being used, node.js/Javascript. Since node.js is a fairly new platform, and since Javascript is rarely used in automation, they (accurately, in my opinion) didn’t expect to find a candidate who had node.js/Javascript experience. Furthermore, they assumed that any qualified candidate could pick up the language.

It turned out that those were perfect expectations for me. During my interviews here, the interviewers were impressed with my extensive knowledge and experience with automation, and I passed their general programming tests. In regard to programming skills, they were more interested in my thought processes and logic than producing accurate code in the spur of the moment on a whiteboard. Plus, I did have a distant background with Javascript (not just in the browser; it was also the programming language of an automation tool that I used many years ago).

Now that I’ve been here at Netspend for about 8 months and developed a framework and automated tests for UI-testing, I’ve discovered that I misapprehended the skills that are necessary for building an automated testing framework. When I said I was probably not qualified to build an automation framework, I was thinking primarily of coding skills. I have learned the Javascript syntax and the node-specific knowledge necessary to build an automation framework, and I’ve built a robust framework and set of tests that adds value to our development process.

Last week, I worked with another senior QA engineer here at Netspend to define the desired skill sets of our different types of QA jobs: general QA, QA of our back-end systems, QA for our user-facing applications, and automated test developer (I’ll do another blog post about the results of this work). Here is the list of skills that we feel are necessary for an automated test developer in our organization:

  • Source Control Management (we use git)
    • How and Why
  • Continuous Integration (we use Bamboo)
  • Programming Basics, e.g.,
    • Conditionals
    • Data Types
    • Program flow
  • Advanced Programming, e.g.,
    • Development Patterns / Architecture
    • Abstractions / Refactoring
    • Contributing to Framework development
  • X-Path / DOM manipulation (for UI testing)
  • Automation Test Concepts, e.g.,
    • What to validate, and how
    • Types of automation
    • Reducing to minimum set of tests
    • Cost / Benefit analysis
  • Database basics
  • Linux basics

As it turns out (and as the people who hired me here–one of whom was the guy I worked with to create this list–astutely realized), being an expert coder of the language of choice doesn’t even make the list. I’ve been successful with the development of an automated testing framework here due to my knowledge of and experience with all of these areas.

Often overlooked test automation challenges

The company I work for develops several different products as part of a unified offering. These products need to work with each other and with products from other companies. Each product development team has its own manual QA and automation team, and we have a solutions testing team that ensures interoperability between our products and others. The company has been pretty ‘siloed’: each product’s automation team has worked mostly in isolation from the others.

Until now, the interoperability testing has been all manual, but the manager of that team has embarked on an effort (1) to get the different automation teams familiar with each other and their work, and (2) to leverage the automation efforts of the various product teams in the interoperability testing effort.

To that end, we’ve started holding a weekly cross-team automation meeting, run by the interoperability testing manager with a representative from each product’s automation team. The agenda consists of the following:

  1. Product configuration automation
  2. Test lab usage management (automated check out/in of lab resources)
  3. Automation strategy
  4. Test management and reporting

What’s striking is that each product automation team has put in significant effort to address all four of these functions (with a lot of duplicated work!), yet when we think about test automation, we typically think only about #3, the actual automated tests themselves.

Just an observation: when we think about test automation, we need to remember that there are several complex components to it besides the tests themselves.

“Just Good Enough”

One of the automation engineers on our team is extremely thorough. When she does code reviews, she sends back lengthy emails, and she provides a lot of good information regarding coding practices. Her devotion to detail is a real asset to the team. However, she is getting burned out on code reviews, and sometimes I think her time could be better spent on her own work.

As a team lead, I struggle with this type of team member. She’s doing outstanding work, and almost every point she makes is technically correct and/or a good practice. I can’t very well tell her she’s not doing a good job.

My solution is to realize that she has a different viewpoint from mine. Hers is technical: from a technical point of view she’s almost always right. But I have to balance the technical viewpoint with the business viewpoint. While what she is doing is technically right, from a business viewpoint it may not be the best use of her time. From a business viewpoint, sometimes the right thing is to consciously let some things slide.

On this team, we’re constantly refining our coding standards and practices. Lately, I’ve introduced the idea of ‘just good enough.’ This is shorthand for the business viewpoint, a way of balancing technically correct decisions with business realities.

A lot of software engineers are happy doing their coding and letting me deal with the business issues. This is one instance, however, where the engineers have to think about the business perspective as well.

The value of automated UI testing

During my recent job hunt, test automation came up in practically every interview, typically as some broad question like, "So, how would you go about implementing test automation?"

My standard answer is that you generally get the best bang for your buck the deeper in your code you test. As an example, I contrast maintenance of unit tests (deep end) vs. maintenance of automated UI tests: you have to update a UI test almost any time you make a change to the UI; you only have to update a unit test if you change an existing method. And the UI typically changes much more frequently than individual classes. Furthermore, UI changes frequently necessitate changing a whole string of user actions in your automated tests, whereas unit tests, by definition, are isolated and thereby typically much quicker to update.

This morning, I ran across a new blog post by B. J. Rollison, a.k.a. I.M. Testy, titled "UI Automation Out of Control," in which he lists some of the shortcomings of automated UI tests and some of the ways you should try to test before you resort to automated UI testing. It’s a good read.