How to market yourself

I really enjoy helping QA engineers with their careers, but if you’re a stranger asking for help, how you ask makes all the difference.
Back in 2006, I received this email:

Subject: QA in Austin
I am a QA professional in Minneapolis, and I may be moving to Austin in the next few months. I found your resume and web site through Google.
You sound like a pretty interesting, friendly guy based on your website. I’m hoping that you may be able to let me know of some people in Austin who may be hiring for senior software QA positions. I’d also be interested in learning about any professional quality assurance organizations in Austin. I’m currently a member of one in Minneapolis:
I’m not sure what the general salary range in Austin is compared to Minneapolis. I have a feeling that I may need to adjust my expectations downward.
I know that this request is out of the blue, but I would appreciate any time you could give me.

I happily provided him extensive information about Austin and the job market. In our subsequent email correspondence, I connected him with some local recruiters and other QA professionals who I thought might be able to help him.
When he later moved to Austin, he invited me to lunch to thank me for the help. We subsequently became good friends and good professional colleagues.
In contrast, in 2009, I received another request for help from a stranger:

Subject: Can you help me to find a job?
My resume is attached.

Here’s the rest of the correspondence between me and the person who sent the second email above.
My response:

I see from your resume that you’re a QA engineer. I actually help a lot of local QA engineers and others to find jobs, but I might suggest that an email with the entire content “Can you help me to find a job? My resume is attached” is not a great introduction to a stranger who might be in a position to help you.

The inquirer’s response:

I am sorry, that I have not given an introduction.
My name is [redacted], have MS in Mathematics and Diploma in Computer Science I have 6 years of QA experience from Dell and Borland. I also have CSTP certification from IIST.
My resume is attached for your ready reference. Can you please help me to find a job? I appericiate your great help. I found your email and resume when I googled under SQA.
I look forward to hear back from you soon.

A little better, but not much. Me:

Here’s the best help I can give you at this point…
When networking, especially with strangers, you need to do your homework, and to use the info that you uncover to try to make as personal a contact as possible. You’re selling yourself, and in the process showing the other person that you’re thorough, thoughtful, etc.
Your first email to me, and your second one, to a large extent, was like someone coming to my door and just saying, “Hi, I’m selling X. Do you want some?” I’ll just shut the door in that person’s face. That’s why those f***ing door-to-door magazine subscription scammer kids give you some story about how they’ll win a scholarship or some such shit if they sell enough subscriptions; they don’t just come to the door and ask if you want a subscription.
If I were you, I would have written something like this:

Hi Stan,
I see that you have an extensive history of QA in Austin and that you’ve recently worked at Borland. I also noticed from your resume that you have just taken a new job. How was the job hunt? What do you think about the local job market for QA?
My name is X and I am also a QA engineer here in Austin, and in fact, I also once worked for Borland. I am also looking for a new job, and I was wondering if you could offer any advice? [Then, invent some specific question that I might have some insight on, such as] In particular, I was wondering what automated testing tools are in greatest demand right now?
I’d appreciate any insight you can share into the job search in Austin. Please feel free to email me back or call me at xxx.
Regards, X

Good luck on your job hunt. If I can provide any other specific help, let me know.

Testing and Toyota

Testing rock star James Bach has published several good blog posts about the Toyota braking problems: Advice to Lawyers Suing Toyota, Toyota Story Analysis, CNN Believes Whatever Computers Say.
The following passage from the ‘Advice’ post struck me:

“Extensive testing” has no fixed meaning. To management, and to anyone not versed in testing, ALL testing LOOKS extensive. This is because testing bores the hell out of most people, and even a little of it seems like a lot.
That’s very true except when a bug slips through and gets caught by users.
That’s when it’s so much fun to remind management that they chose the amount of testing that let this bug slip past. Test lead to management: “Remember way back when I presented you with some options regarding testing? If we have X amount of time, we’ll get Y amount of testing done, with this prioritization of the work and these associated risks; if we have X+A amount of time, we can get Y+B amount of testing done, with these other risks.”

Documenting code changes with defect reports

Today, Rafe Colburn listed four reasons to file the bugs found in code reviews. A commenter points out that defect reports aren’t the only way of communicating about changes to the code:

I guess it depends on the local culture, but in my experience, developers only look at a bug report if it’s assigned to them. The revision control system is a better way to see what’s happened recently.

In my ideal system, I would take things a little further: every commit must have one or more work items (requirements, defect reports) associated with it and an indication of whether each is in progress or completed. The argument for this is pretty simple: if you’re not implementing a requirement or fixing a bug, then why the heck are you changing code?
Additionally, the build system should display the commits in each build, the work items associated with each commit, and a list of the changed files, with an easy way to view file diffs for each commit.
As a QA engineer, my need to see completed work items is obvious. However, the list of changed files and diffs provides a different but equally useful kind of data: an easy way to familiarize myself with the code, input for deciding how to test the change, and openings for starting discussions with the programmers about their code.
When I describe this system to programmers, their first thought is often that it requires a lot of red tape and documentation. I have only worked with a system like this once in my career, and in that situation, the programmers did not find it onerous. It’s true that they had to file defect reports for bugs that they found and fixed, but we relaxed our defect report standards in such cases; they didn’t have to fill out severity, steps to reproduce, etc., so filing a report took very little time. We decided that a minimal ‘placeholder’ defect report was good enough in many such cases if it bought us buy-in from the developers. Besides, as mentioned above, the reporting in the build system was a backup source of information about code changes.
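A rule like “no commit without a work item” can also be enforced mechanically. Here is a rough sketch of a Git commit-msg hook (illustrative only: the work-item ID format, such as BUG-123 or REQ-45 with an optional status tag, is my invention, not part of any system described above):

```python
#!/usr/bin/env python3
"""Git commit-msg hook: reject commits that reference no work item.

Illustrative only: assumes work items look like BUG-123 or REQ-45,
optionally followed by a status, e.g. "BUG-123 (completed)".
"""
import re
import sys

# One or more project prefixes, a dash, digits, and an optional status tag.
WORK_ITEM = re.compile(r"\b(BUG|REQ)-\d+(\s*\((in progress|completed)\))?")

def check(message: str) -> bool:
    """True if the commit message references at least one work item."""
    return WORK_ITEM.search(message) is not None

if __name__ == "__main__" and len(sys.argv) > 1:
    # Git invokes the hook with the path to the commit message file.
    with open(sys.argv[1], encoding="utf-8") as f:
        if not check(f.read()):
            sys.stderr.write(
                "Commit rejected: no work item referenced.\n"
                "If you're not implementing a requirement or fixing a bug, "
                "why are you changing code?\n")
            sys.exit(1)
```

To try it, you would copy it to .git/hooks/commit-msg and make it executable; Git passes the path of the commit message file as the first argument.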

A note on defect severity and priority

In my previous post, Defect severity vs. priority, I used examples to explain the rationale behind deciding when to fix, and when not to fix, defects. Given agile’s focus on not letting defects go unaddressed, I now see that these examples could have confused some readers.
Please note that that post addressed a general quality assurance concept, that the examples were hypothetical, and that it was not agile-specific.
I should write some more blog posts on my experiences with defects in agile environments.

Defect severity vs. priority

In my recent post, Unnecessary abstraction, I used defect severity as an example. I also mentioned that a more descriptive (less abstract) name for this information would be something like “Customer severity” or “Impact on user.”
In my post, I assumed a specific definition of severity. In my career, I’ve dealt repeatedly with confusion between defect severity and defect priority, so I thought I should document my preferred definitions here.
I define defect severity, as I mentioned above, as the effect on the software user. If severity is a dropdown field in the defect management software, I usually recommend values such as

  • Critical functionality broken, no workaround
  • Non-critical functionality broken, or critical with workaround
  • Minor functional defect
  • Cosmetic or minor usability issue

As I mentioned in my earlier post, the values for this field don’t have to be hierarchical. Who’s to say that ‘Non-critical functionality is broken’ is more or less severe than ‘Critical functionality broken, but with workaround’?
Unless new information is discovered regarding a defect (e.g., a work-around is identified), severity should not change.
When putting together a defect tracking process, I suggest that the person who enters the defect be required to provide a severity.
Defect priority represents the development team’s priority in regard to addressing the defect. It is a risk-management decision based on technical and business considerations related to addressing the defect. To make the term less abstract, I usually propose it be called ‘Development priority’ or something similar.
Priority can be determined only after technical and business considerations related to fixing the defect are identified; therefore the best time to assess priority is after a short examination of the defect, typically during a ‘bug scrub’ attended by both the product owner and technical representatives.
Here are some examples I give when explaining severity and priority:
High severity, low priority – Critical impact on user: nuclear missiles are launched by accident. Factor influencing priority: analysis reveals that this defect can only be encountered on the second Tuesday of the first month of the twentieth year of each millennium, and only then if it’s raining and five other failsafes have failed.
Business decision: the likelihood of the user encountering this defect is so low that we don’t feel it’s necessary to fix it. We can mitigate the situation directly with the user.
High severity, low priority – Critical impact on user: when this error is encountered, the application must be killed and restarted, which can take the application off-line for several minutes. Factors influencing priority: (1) analysis reveals that it will take our dev team six months of full-time refactoring work to fix this defect. We’d have to put all other work on hold for that time. (2) Since this is a mission-critical enterprise application, we tell customers to deploy it in a redundant environment that can handle a server going down, planned or unplanned.
Business decision: it’s a better business investment to make customers aware of the issue, how often they’re likely to encounter it, and how to work through an occurrence of it than to devote the time to fixing it.
Low severity, high priority – Minimal user impact: a typo. Factors influencing priority: (1) the typo appears prominently on our login screen; it’s not a terribly big deal for existing customers, but it’s the first thing our sales engineers demo to prospective customers, and (2) the effort to fix the typo is minimal.
Decision: fix it for next release and release it as an unofficial hotfix for our field personnel.
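To make the split concrete, here is a minimal sketch of how the two fields might live at different points in the workflow (in Python; the field values follow this post, but the class and member names are illustrative, not any real tracker’s schema). The reporter supplies severity at entry time; priority stays unset until the bug scrub:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Severity(Enum):
    """Impact on the user, chosen by the reporter when the defect is filed."""
    CRITICAL_NO_WORKAROUND = "Critical functionality broken, no workaround"
    NONCRITICAL_OR_WORKAROUND = ("Non-critical functionality broken, "
                                 "or critical with workaround")
    MINOR_FUNCTIONAL = "Minor functional defect"
    COSMETIC = "Cosmetic or minor usability issue"

class Priority(Enum):
    """Development priority, decided later at the bug scrub."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"

@dataclass
class Defect:
    summary: str
    severity: Severity                   # required at entry time
    priority: Optional[Priority] = None  # unset until the bug scrub

# The reporter must supply a severity; priority waits for the scrub.
missile_bug = Defect(
    summary="Nuclear missiles launched by accident",
    severity=Severity.CRITICAL_NO_WORKAROUND,
)
# Later, after the technical and business analysis at the bug scrub:
missile_bug.priority = Priority.LOW
```

Note that severity never changes after entry (barring new information), while priority is assigned, and can be reassessed, by the team.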

One unanticipated value of blogging

Over at Snarkmarket, I ran across this thought today:

I always tell people that blogging is useful, even if nobody’s reading, because it forces you to have an opinion on things. You don’t realize how blankly you experience most of the stuff you read every day until you force yourself to say something—even something very simple—about it.

When I was job hunting earlier this year, I benefited greatly from this blog: I had given more thought to many of the issues that came up in interviews than I had the last time I interviewed, before I started this blog.

Unnecessary abstraction

At my new job, I’m currently putting together a defect management process, something I’ve done at pretty much every company I’ve ever worked at. Part of the process includes defining data fields and values associated with defect reports.
A typical defect tracking system has a combo box field named “Severity” with the values high, medium, and low.
I wish I had a dime for every time I’ve answered the question, “So, what’s the difference again between ‘severity’ and ‘priority’?” or “What’s the difference between a high and a medium severity bug?”
Many companies I’ve worked at have tried to solve this problem by creating documentation that defines the fields and values. This type of documentation keeps me from having to repeat myself–I can just refer the person to the documentation–but it does not really address the source of the problem: both the field name and its values are abstractions of real-world data.
Over the years, I’ve begun to propose that we just give the fields and their values names that succinctly reflect their concrete meaning. Granted, this is typically easier with field names than their values, as the values typically require more explanation.
‘Severity’ would look more like this: “Customer severity” or even better “Impact on user”, with the following values:

  • Critical functionality broken, no workaround
  • Non-critical functionality broken, or critical with workaround
  • Minor functional defect
  • Cosmetic or minor usability issue

Granted, those long values make the UI of your defect management system and your reports a little messy, but in my experience, that’s a worthy sacrifice for the ambiguity that the longer wording eliminates.
An aside: in that example, I’m still trying to force my values to fit into another common convention: hierarchical levels of severity. But if you think about it, why should I force “Non-critical functionality broken” and “Critical functionality broken, with workaround” into one value? Why not just break those into separate values without worrying whether one is ‘more severe’ than the other? I’ll save that discussion for another blog post.
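For what it’s worth, dropping the hierarchy costs nothing in code, either. A quick Python sketch (the member names are my own, and splitting the combined value in two is the hypothetical from the aside above): a plain enumeration imposes no ordering on its members, so nothing forces us to rank one value against another:

```python
from enum import Enum

class ImpactOnUser(Enum):
    # The combined value from the list above, split into two members
    # that no longer claim one is 'more severe' than the other.
    NONCRITICAL_BROKEN = "Non-critical functionality broken"
    CRITICAL_WITH_WORKAROUND = "Critical functionality broken, with workaround"
    CRITICAL_NO_WORKAROUND = "Critical functionality broken, no workaround"
    MINOR_FUNCTIONAL = "Minor functional defect"
    COSMETIC = "Cosmetic or minor usability issue"

# Plain Enum members don't support ordering comparisons at all:
try:
    ImpactOnUser.NONCRITICAL_BROKEN < ImpactOnUser.CRITICAL_WITH_WORKAROUND
    ordered = True
except TypeError:
    ordered = False
# ordered is False: the field stays descriptive, not hierarchical.
```

If someone later wants a ranking for reporting purposes, that can be layered on separately rather than baked into the values themselves.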
My question to the millions of people who read this blog: why do we have these conventions regarding abstractions and hierarchical values in the first place? How did they come about? I have my opinions, but I’d like to hear yours.

The $23,148,855,308,184,500 bug

The story of Visa charging a number of customers $23,148,855,308,184,500 has been all over the news the last couple of days. Slashdot commenter rickb928 provides a plausible explanation for the error.

I work in this industry. The only novelty here is that the error got into production, and was not caught and corrected before it went that far.
Submitters send files to processors which are supposed to be formatted according to specifications.
Note I wrote ‘supposed to be’.
Some submitters do, from time to time, change their code, and sometimes they get it wrong. For instance padding a field with spaces instead of zeros. Woopsie…!
Seems that’s what happened here. Sounds like a hex or dec field got padded with hex 20, and boom.
This is annoying, especially when the processor gets to help correct the overwhelming number of errors, and then tries to explain that it wasn’t their fault. Plenty of blame to go around with this one.
And then explains why they don’t both validate/sanitize input, and test for at least some reasonable maximum value in the transaction amount. A max amount of $10,000,000 would have fixed this. That and an obvious lapse in testing. This is what keeps my bosses awake sometimes, fearing they will end up on the front page of the fishwrap looking stupid ’cause their overworked minions screwed something up, or didn’t check, or didn’t test very well. I love one of the guys we have testing. He’s insufferable, and he catches genuine show-stoppers on a regular basis. They can’t pay him what he’s been worth, literally $millions, just in avoiding downtime and re-working code that went too far down the wrong path.
Believe me, this is in some ways preferable to getting files with one byte wrong that doesn’t show up for a month, or sending the wrong data format (hex instead of packed binary or EBCDIC, for instance) and crashing the process completely. Please, I know data should never IPL a system. Tell it to the architects, please. As if they don’t know now, after the one crash…
If you knew what I know, you’d chuckle and share this story with some of your buddies in development and certification.
And pray a little.
At least it didn’t overbill the cardholders by $.08/transaction. That would suck. This is easy by comparison. Just fix the report data. Piece of cake. Evening’s worth of coding and slam it out in off-peak time. Hahahahaha!

That’s quite a missed test case!
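Incidentally, the commenter’s space-padding theory reproduces the headline figure exactly. Here’s a sketch of one plausible reconstruction (my assumption, not from the article): the real amount was $12.50, stored as the packed digits 0x12 0x50 at the end of an 8-byte field that should have been zero-padded, but the field was padded with ASCII spaces (hex 20) instead and then misread downstream as a big-endian binary count of cents:

```python
# The amount field as it should have been: zero-padded, packed digits 12 50.
correct = bytes([0x00] * 6 + [0x12, 0x50])
assert int(correct.hex()) == 1250          # read as decimal digits: $12.50

# The same field padded with ASCII spaces (hex 20) instead of zeros...
wrong = bytes([0x20] * 6 + [0x12, 0x50])
# ...then misread as a big-endian binary number of cents:
cents = int.from_bytes(wrong, "big")
dollars = cents // 100
print(f"${dollars:,}.00")                  # $23,148,855,308,184,500.00
```

A sanity check on a maximum transaction amount, as the commenter suggests, would have stopped this cold regardless of how the padding went wrong.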

Descriptive vs prescriptive testing

Over at his Collaborative Software Testing blog, Jonathan Kohl has an interesting post about Descriptive and Prescriptive Testing, which he defines as follows:
A prescriptive style is a preference towards direction (“do this, do that”) while a descriptive style is more reflective (“this is what we did”). Both involve desired outcomes or goals, but one attempts to plan the path to the outcome in more detail in advance, and the other relies on trying to reach the goals with the tools you have at hand, reflecting on what you did and identifying gaps, improving as you go, and moving towards that end goal.
Jonathan also explores the personality types that are drawn to each type of testing, and he relates how he recently incorporated descriptive testing into a prescriptive environment:

For example, I helped a friend out with a testing project a few weeks ago. They directed me to a test plan and classic scripted test cases. . . Within an hour or two of following test cases, I got worried about my mental state and energy levels. I stopped thinking and engaging actively with the application and I felt bored. I just wanted to hurry up and get through the scripted tests I’d signed on to execute and move on. I wanted to use the scripted test cases as lightweight guidance or test ideas to explore the application in far greater detail than what was described in the test cases. I got impatient and I had to work hard to keep my concentration levels up to do adequate testing. I finally wrapped up later that day, found a couple of problems, and emailed my friend my report.

And then he explains how he developed some descriptive testing:

The next day, mission fulfilled, I changed gears and used an exploratory testing approach. I created a coverage outline and used the test cases as a source of information to refer to if I got stuck. I also asked for the user manual and release notes. I did a small risk assessment and planned out different testing techniques that might be useful. I grabbed my favorite automated web testing tool and created some test fixtures with it so I could run through hundreds of tests using random data very quickly. That afternoon, I used my lightweight coverage to help guide my testing and found and recorded much more rich information, more bugs, and I had a lot of questions about vague requirements and inconsistencies in the application.

Like Jonathan, I definitely lean towards the descriptive testing approach. In fact, you might say that the experience that I recently described in my post Just enough process: the checklist was an attempt to balance out prescriptive and descriptive testing.

How did we get into this forest?

In my previous entry, I described a conversation I had recently. A colleague asked for my opinion on how to solve a problem with maintaining automated UI tests. After hearing a few details about the situation, I told him that in my opinion, his team had a larger problem than that: they would be better off focusing first on building and running a suite of lower-level tests, such as unit tests. Once that suite was solid, they could explore other automated testing options, such as UI tests.
The even bigger issue here is: how did they get into this situation in the first place?
I did not talk to my colleague in detail about this, but I mentioned one fact in my previous post that sheds some light on the source of the problem: this small company contracted out their testing. In their case, they used a company that provides a part-time local QA manager and a test team in Ukraine.
By separating the testing so completely from the rest of the development process, they left the test team to address automated testing by the only means available to them: the UI.
To his credit, this colleague and his coworkers realize that the outsourced testing is not suiting their needs, and they are exploring other options for testing. But it seemed clear to me that my colleague still maintained a mental separation between testing and development. He was looking for solutions to their quality problems that could be implemented within the realm of QA/testing, not solutions to more systemic problems–solutions that would involve changing the way his entire team works.