Fixing an “unfixable” bug with the help of Track.js

Recently I was helping a client who had found an intermittent issue in their single-page application. Under some circumstances – which they weren’t able to describe, as it wasn’t any particular sequence of actions – the UI would “lock up” and become completely unusable. The agency handling the development had acknowledged the defect, but declared the bug “unfixable”, because they didn’t have a reliable way of reproducing it or any diagnostics to work from.

JavaScript in the browser is notoriously harder to debug and test than server-side code, for a number of reasons; for a start, the code is running in an environment you can’t control – someone else’s computer, with someone else’s operating system and browser. The first challenge is to recreate the issue, or at least capture details of it; the second is to do something about it; and the third is to make sure it is fixed and stays fixed. I’ve had great results on a number of projects with similar challenges by using Track.js, and that’s what I turned to in this case as well.

Capturing the bug

There wasn’t a lot of information to go on, but the fact that the error was intermittent suggested that timing might be a factor. Modern applications usually do things asynchronously, especially loading data from a server, and without anything else to go on my initial hunch was that this would turn out to be a timing problem. The ideal thing would be a reliable way of triggering the issue that could be run again and again; this is a job for automation tests. Unfortunately the application had close to zero tests, so we started by building a simple set of automation scripts that simulated a journey through the application.

The way automation tests work is fairly simple: an external controlling process communicates with the browser, sending commands and querying the state of the window. A typical flow is to check whether a particular element is visible on the page, interact with it in some way, and then check whether the state of the page afterwards is what you expect. By mapping out the user’s intent, and translating it into elements on the page and interactions with them, we can form an automatable journey through the system. We can let this journey run over and over, and capture any errors that are triggered with logging. Ideally this would be a logging setup that we can later release into production, so that we see not only errors in development environments, but also any errors that a customer sees.
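The check–act–check loop above can be sketched in a few lines. This is a minimal illustration of the pattern rather than a real browser test: the `page` object and all the names here are hypothetical stand-ins for a real driver session (such as WebDriver), just to show the shape of a journey.

```javascript
// Minimal sketch of the check -> act -> check loop behind a journey test.
// The `page` object is a stand-in for a real browser driver session;
// every name here is illustrative, not a real automation API.

function runJourney(page, steps) {
  const errors = [];
  for (const step of steps) {
    // 1. Check: is the element we need actually on the page?
    if (!page.isVisible(step.selector)) {
      errors.push(`Expected ${step.selector} to be visible before "${step.name}"`);
      continue;
    }
    // 2. Act: perform the user's interaction.
    step.action(page);
    // 3. Check: did the page end up in the state we expect?
    if (!step.verify(page)) {
      errors.push(`Step "${step.name}" left the page in an unexpected state`);
    }
  }
  return errors; // an empty array means the journey passed
}

// A fake page standing in for the browser, purely for demonstration.
const fakePage = {
  state: { loggedIn: false },
  isVisible: () => true,
  click(selector) {
    if (selector === "#login") this.state.loggedIn = true;
  },
};

const errors = runJourney(fakePage, [
  {
    name: "log in",
    selector: "#login",
    action: (p) => p.click("#login"),
    verify: (p) => p.state.loggedIn === true,
  },
]);
console.log(errors.length === 0 ? "journey passed" : errors.join("\n"));
```

In a real setup each step would drive an actual browser, but the structure – check, act, check, collect errors – is the same.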

Capturing errors that happen in the browser is traditionally done with bug reports, but this isn’t ideal:

  • Not all users will send a bug report – many will suffer in silence or simply move on to a competitor
  • End users’ bug reports probably won’t have a deep level of information
  • By the time an end user is reporting a bug the damage to your reputation is already done

What we need is something that runs alongside the application, in the customer’s browser, capturing useful information about any errors that happen and sending it back to a collection point. This is exactly the problem Track.js solves: it is a lightweight JavaScript script that can simply be dropped into the application and configured to point at your account. There’s a web portal for viewing the results, with detailed analytics, full stack traces and more. Errors are categorized as well, which is a nice touch, so you can see when an error recurs. Another feature I like to turn on from the start is the Slack integration, so that we can see a real-time flow of new errors and daily summaries as part of the “ChatOps” dashboard. If you’re able to turn bug fixes around quickly, you can get into a state where developers pounce on these reports as soon as something new comes through, fix it, and release it in hours or less – which goes down very well with customers and stakeholders!
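“Dropped in” really does mean a couple of lines on the page. The snippet below is a hypothetical sketch of the shape of the setup, not the exact current snippet – the script URL, install call and token format may differ, so check the Track.js documentation for the real thing.

```html
<!-- Hypothetical drop-in sketch: the exact script URL and install call
     may differ – consult the Track.js docs for the current snippet. -->
<script src="https://cdn.trackjs.com/agent/v3/latest/t.js"></script>
<script>
  // Point the agent at your account; errors then flow to your portal.
  window.TrackJS && TrackJS.install({ token: "YOUR_ACCOUNT_TOKEN" });
</script>
```

Once that is on the page, uncaught errors are reported automatically – no per-error instrumentation needed.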

Fixing it!

With error logging set up, the next step was to point both the automation scripts and manual testing at the application, and watch for the error. Sure enough, before long we started seeing errors being triggered and captured. This turned out to be easier than I expected: it was indeed a timing issue, and the automated tests interacted with the page more quickly than real users do, so they triggered the problem more reliably. As well as logging, we had also configured the automation tests to take screenshots of the application. The error log gave us the exact time each error was triggered; knowing the time, we could go straight to the right screenshot. We showed this screenshot to the user who raised the bug, and they confirmed it was the behavior they had seen.

One of the really nice features of the Track.js error log is that it contains very rich information about each error, including timelines, source maps and more. We were able to go straight to the error and step back to its cause. In this particular case it was a classic timing problem: a method was being called on an object that hadn’t been set up yet, because the network call that populated it was still in-flight, resulting in the familiar “cannot read property of undefined” runtime error. This is simple to resolve with a little defensive coding; the easiest approach is to check that an object exists before interacting with it. The actual fix was to add guard clauses in all the places where these objects were accessed.
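A minimal sketch of that kind of guard clause (the names here are illustrative, not the client’s actual code):

```javascript
// Without the guard, accessing user.profile.name throws the familiar
// "cannot read property of undefined" error if the data from the
// server hasn't arrived yet.
function renderGreeting(user) {
  // Guard clause: bail out safely while the data is still in-flight.
  if (!user || !user.profile) {
    return "Loading…";
  }
  return `Hello, ${user.profile.name}!`;
}

console.log(renderGreeting(undefined));                    // data still in-flight
console.log(renderGreeting({ profile: { name: "Ada" } })); // data has arrived
```

The guard turns a hard crash into a graceful “still loading” state, which is usually what the user should see while a request is outstanding anyway.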

Making sure it is fixed and stays fixed!

Once we think the bug is fixed, we need to measure the impact of the fix. We want to make sure we’ve fixed the problem, and not introduced any new ones in its place! This is where the automated tests once again prove their worth. We can simply deploy the fix to a test environment, and set up the automation tests to run again and again until we’re happy. We can keep an eye on the error log, and if we still see the error we know we have more work to do. If we’re happy that the error is fixed, we can release the fix into production.
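The “run again and again” step can be as simple as looping the journey and tallying failures. A small self-contained sketch, where `journey` is a hypothetical function returning an array of errors (as our automation scripts did):

```javascript
// Soak-test sketch: repeat a journey many times and count failing runs.
// `journey` is a hypothetical function returning an array of errors;
// a real version would drive the browser through the whole flow.
function soak(journey, iterations) {
  let failures = 0;
  for (let i = 0; i < iterations; i++) {
    const errors = journey();
    if (errors.length > 0) failures++;
  }
  return failures;
}

// A stand-in journey that never fails, purely to show the shape.
const failures = soak(() => [], 100);
console.log(`${failures} failing runs out of 100`);
```

With an intermittent bug, zero failures over hundreds of runs – combined with a quiet error log – is far stronger evidence of a fix than a single green run.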

Another cool feature in Track.js is that it has dashboards showing both the most common errors, and trends. We should see the error we fixed disappearing from both. These dashboards are really useful, especially just after a new release goes out; if a release goes out and a new error starts trending, the odds are it’s related to something that was changed in that release. If we check in on these dashboards regularly, we can get a good sense of the health of the JavaScript side of the application, without resorting to having users reporting bugs to us.

In summary…

With increasingly complex JavaScript applications – single-page applications, progressive web apps and so on – the quality of the code that powers them matters more than ever; getting it wrong can really harm the reputation of your product, and hence your business! Finding JavaScript errors in the browser can be hard without the right tooling, especially when the errors are intermittent or hard to recreate. Client-side logging is a great tool to bridge that gap, and done right it can save a lot of time and energy in fixing errors. Of course you can write your own, but is that really the most valuable use of your time? There are other off-the-shelf tools out there, but why not give Track.js a look? It’s a really nice product, written by developers for developers, with plenty of helpful information and integrations with other tools. It takes very little time to set up, and there’s a free trial so you can see for yourself whether it meets your needs!