When I joined the project in 2018 we already had a UI test framework set up in Appium, with quite a few Cucumber scenarios. I was told the framework had been set up by the frontend developers and the scenarios had been written by the business analysts and product owners. The testers on the project were mainly doing manual testing, so my main focus was to drive the automated tests so the frontend developers could focus on development and write end-to-end tests.
The UI tests were already running on Jenkins multiple times a day, taking no longer than 10 minutes to execute, and the green status was displayed on the big screens around the office for everyone to see. There seemed to be a lot of confidence in the stability of the app because Jenkins was reporting that the UI tests had been passing. That same day we got an app release and were told to manually smoke test it. Within the first minute we found a blocker and had to stop testing, yet Jenkins was still reporting that the UI tests had passed. To my surprise, no one even questioned this.
After gaining access to the repository I looked into the code and, to my disbelief, found many scenarios with empty step definitions, which is why Jenkins was always reporting that the tests had passed. For those unfamiliar with BDD, the scenarios are written in feature files, and the step definitions are where you write the code that performs each action. An empty step definition means no code runs, so the step passes because nothing in it can fail. As the tests ran on CI, no one ever saw the app launch and immediately close. The generated report also showed the BDD steps as passed, with no indication that the steps were empty.
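To make the false positive concrete, here is a minimal sketch of why empty step definitions report green. It uses a tiny hypothetical runner (`runDemoScenario`), not the real Cucumber API, and the "When" and "Then" step texts are made up for illustration. The only assumption is that a step counts as passed when it does not throw, which is how BDD runners generally decide the outcome.

```typescript
// Hypothetical miniature runner for illustration; this is NOT the Cucumber API.
type DemoStep = { text: string; run: () => void };

// A scenario is reported as passed when no step throws.
function runDemoScenario(steps: DemoStep[]): "passed" | "failed" {
  for (const step of steps) {
    try {
      step.run();
    } catch {
      return "failed";
    }
  }
  return "passed";
}

// Every step definition has an empty body, so nothing ever touches the app.
// The "When" and "Then" texts are invented examples.
const emptyScenario: DemoStep[] = [
  { text: "Given I was logged in", run: () => {} },
  { text: "When I open the account screen", run: () => {} },
  { text: "Then I see my balance", run: () => {} },
];

const emptyScenarioResult = runDemoScenario(emptyScenario);
console.log(emptyScenarioResult); // prints "passed" although nothing was tested
```

The runner has no way to tell "did nothing" apart from "did everything correctly", which is exactly what the Jenkins report was hiding.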
Luckily the app was still in development and hadn’t been released to any users.
The first thing I did was add console logging to each step so we could easily see which steps a test had taken, making it easier to debug. Then, where possible, I filled in the desired action for the empty step definitions, and for step definitions that needed more time I threw an exception so the test would fail. The idea behind throwing an exception was to stop reporting false positives. I would rather fail an automated test and run it manually than not run anything at all. This immediately took our test execution time up to 90 minutes, and Jenkins was now showing tests as failing.
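The two changes, logging every step and failing loudly on unfinished ones, can be sketched roughly as below. The helper names (`notImplemented`, `runWithLogging`) and the "Then" step text are hypothetical and are not part of Appium or Cucumber. In a real framework the same idea would live inside the step definitions or a hook.

```typescript
// Hypothetical helpers for illustration; not part of Appium or Cucumber.
type LoggedStep = { text: string; run: () => void };

// Placeholder for a step nobody has implemented yet: it fails loudly
// instead of passing silently.
function notImplemented(text: string): LoggedStep {
  return {
    text,
    run: () => {
      throw new Error(`Step not implemented yet: ${text}`);
    },
  };
}

// Runs steps in order and logs each one before it executes, so the console
// shows how far the test got. Stops at the first failing step.
function runWithLogging(
  steps: LoggedStep[],
  log: (line: string) => void = (line) => console.log(line)
): boolean {
  for (const step of steps) {
    log(`STEP: ${step.text}`);
    try {
      step.run();
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      log(`FAILED: ${step.text} (${reason})`);
      return false;
    }
  }
  return true;
}

const loginThenPending: LoggedStep[] = [
  { text: "Given I was logged in", run: () => { /* real UI actions would go here */ } },
  notImplemented("Then I see my balance"),
];

// Collect the log lines in an array here; by default they go to the console.
const loggedLines: string[] = [];
const loggedOutcome = runWithLogging(loginThenPending, (line) => loggedLines.push(line));
console.log(loggedOutcome, loggedLines);
```

With this in place the unfinished step turns the scenario red, and the log shows which step was reached before the failure.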
Seeing the UI tests reporting red was a shock for certain people, as they had always seen them green and wanted them to stay green.
While we were trying to fix these failing tests, someone, for some bizarre reason, wrote a clever script that would tag the failing tests as “doNotRun”. As the name suggests, tests with the “doNotRun” tag would not be executed in future runs until someone fixed them and removed the tag. Once again we were back to a green Jenkins and all the UI tests were “passing” again.
After multiple meetings and discussions with various people, I finally managed to convince the appropriate people that tagging tests with “doNotRun” would only hide problems with the app. This was just one of the bad practices we were following at the time; there were more.
Whilst looking at the scenarios I noticed we had many variations of the same step. For example: “Given I was logged in”, “Given I was logged into the app”, “Given as a user I was logged in”, “Given the user was logged in”. These all do the same thing, but each ends up with a separate step definition containing duplicated code.
There were many cases of this throughout the tests, and I believe it happened for several reasons: multiple people were writing scenarios, not everyone understood how the scenarios work in an automated framework, and no one with a good understanding was reviewing the BDD.
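One way to collapse the duplicated login steps is to match them all with a single pattern, since Cucumber implementations generally accept a regular expression as a step pattern and match it against the text after the keyword. The sketch below only checks the pattern itself against the four phrasings quoted above. It is not our actual step definition, and the name `loggedInStep` is made up.

```typescript
// One pattern for the login step text that follows the "Given" keyword.
// Hypothetical name; a sketch, not the project's real step definition.
const loggedInStep = /^(?:as a user )?(?:I|the user) was logged in(?:to the app)?$/;

// The four phrasings found in the feature files, without the keyword.
const loginPhrasings: string[] = [
  "I was logged in",
  "I was logged into the app",
  "as a user I was logged in",
  "the user was logged in",
];

const allPhrasingsMatch = loginPhrasings.every((text) => loggedInStep.test(text));
console.log(allPhrasingsMatch); // prints true
```

A single step definition behind this pattern would remove the duplicated code, although agreeing on one phrasing during review keeps the feature files easier to read.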
As the app’s functionality grew, so did the automated UI test pack. Within just a few months our test execution time had gone up to almost 3 hours. More people were contributing tests, and adding “tests at speed” seemed to be the priority. The problem with this approach was that we were getting quantity rather than quality. We had a large number of tests that performed very similar actions, and developers would often write new code without checking whether the function already existed. This contributed to a bloated codebase.
Over the next few weeks, with the help of the developers, we managed to introduce “deep linking”, which opens the app at the screen you want rather than going through the whole journey to reach it. By also combining as many similar tests as possible, removing duplication and making some tweaks to the tests, we were able to bring the execution time down to 1 hour 15 minutes.
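As a rough illustration of the deep linking idea, a test can build a URL that names the target screen and hand it to the app instead of tapping through every earlier screen. Everything specific below is an assumption made for the example: the `exampleapp://` scheme, the `payments` screen, the `accountId` parameter and the `buildDeepLink` helper. How the URL is then opened on the simulator depends on the driver and platform, so that part is left out.

```typescript
// Hypothetical scheme, screen name and parameters; the real app's deep link
// format is not described in this post.
function buildDeepLink(screen: string, params: Array<[string, string]> = []): string {
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  const base = `exampleapp://${encodeURIComponent(screen)}`;
  return query.length > 0 ? `${base}?${query}` : base;
}

// Jump straight to a (made-up) payments screen for a given account.
const paymentLink = buildDeepLink("payments", [["accountId", "12345"]]);
console.log(paymentLink); // prints "exampleapp://payments?accountId=12345"
```

The saving comes from skipping the journey: a test that only cares about one screen no longer pays for login and navigation on every run.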
Speeding up execution was now our main focus, as we wanted faster feedback. Again with the help of some talented developers, we managed to get our automated UI tests running in parallel on multiple simulators. This proved to be a success, as we managed to reduce our execution time further, down to 45 minutes.
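The core of running in parallel is deciding which tests go to which simulator. Here is a minimal sketch of one simple strategy, dealing feature files out round-robin. It is not necessarily how our developers implemented it, and the function name and file names are hypothetical.

```typescript
// Deals feature files out round-robin, one bucket per simulator.
// A sketch of one possible strategy; names are made up for illustration.
function shardAcrossSimulators(featureFiles: string[], simulatorCount: number): string[][] {
  if (simulatorCount < 1 || Math.floor(simulatorCount) !== simulatorCount) {
    throw new Error("simulatorCount must be a positive integer");
  }
  const shards: string[][] = [];
  for (let shardIndex = 0; shardIndex < simulatorCount; shardIndex++) {
    // File i goes to simulator (i mod simulatorCount).
    shards.push(
      featureFiles.filter((_file, fileIndex) => fileIndex % simulatorCount === shardIndex)
    );
  }
  return shards;
}

const exampleShards = shardAcrossSimulators(
  ["login.feature", "onboarding.feature", "payments.feature", "profile.feature", "settings.feature"],
  2
);
console.log(exampleShards);
// prints [["login.feature", "payments.feature", "settings.feature"],
//         ["onboarding.feature", "profile.feature"]]
```

Round-robin ignores how long each file takes, so the slowest bucket still sets the total run time. Balancing buckets by recorded duration would be a natural refinement.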
We then added Mocha API tests to bolster our test pack.
At the time of writing, the API test framework was still in its early stages and had only a handful of tests. The plan was to build out the framework so that we could reduce the number of UI tests by moving them to the API test pack, as the API tests were super fast. As an example, fully onboarding a user took the UI test approximately 3 minutes, whereas the API test took only 13 seconds, roughly 14 times faster.