Saturday, June 29, 2013

Testing Backbone Frontends

When I first joined Coursera a year ago, we had no tests of our frontend code, but we knew this had to change. We are building a complex product for many users that will pass through many engineers' hands, and the only way we can have a reasonable level of confidence in making changes to old code is if there are tests for it. We will still encounter bugs, and users will still use the product in ways that we did not expect, but we can hope to avoid some of the more obvious bugs via our tests, and we have a mechanism in place to catch regressions. Traditionally, the frontend has been the least tested part of a webapp, since it was the "dumb" part of the stack, but now that we are putting so much logic and interactivity into the frontend, it needs to be just as thoroughly tested as the backend.

There are various levels of testing that we could do on our frontends: unit testing, integration testing, visual regression testing, and QA (manual) testing. Of those, we currently only do unit testing and QA testing at Coursera, but given infinite time and resources, we would cover the spectrum. Here's a rundown of those levels of testing, and how we do them - or could do them, one day.


Unit testing

When we call a function with particular parameters, does it do what we expect? When we instantiate a class with given options, do its methods do what we think they will? There are many popular JS unit testing frameworks now that help answer those questions, like Jasmine, QUnit, and Mocha.

We do a form of unit testing on our Backbone models and views, using a suite of testing technologies:

  • Mocha: An open-source test runner library that gives you a way to define suites of tests with setup and teardown functions, and then run them via the command-line or browser. It also gives you a way to signal test completion asynchronously. For example:
    describe('tests for the reporter library', function() {
      beforeEach(function() {
        // do some setup code
      });
      afterEach(function() {
        // do some cleanup code
      });
      it('renders the reporter template properly', function() {
        // test stuff
      });
      it('responds to the ajax request correctly', function(done) {
        // in some callback, call:
        done();
      });
    });
  • Chai: An open-source test assertion library that provides convenient functions for checking the state of a variable, using a surprisingly readable syntax. For example:
    chai.expect(2+2).to.be.equal(4);
    chai.expect(2+2).to.be.greaterThan(3);
    
  • JSDom: An open-source library that creates a fake DOM, including fake events. This enables us to test our views without actually opening a browser, which means that we can run quite a few tests in a small amount of time. For example, we can check that clicking changes some DOM:
    var view = new ReporterView().render();
    view.$el.find('input[value=quiz-wronggrade]').click();
    var $tips = view.$el.find('[data-problem=quiz-wronggrade]');
    chai.expect($tips.is(':visible'))
      .to.be.equal(true);
    chai.expect($tips.find('h5').eq(0).text())
      .to.be.equal('Tips');
    
  • SinonJS: An open-source library for creating stubs, spies, and mocks. We use it most often for mocking out our server calls with sample data that we store with the tests, like so:
    var forumThreadsJSON = JSON.parse(
        fs.readFileSync(path.join(__dirname, 'forum.threads.json'), 'utf-8'));

    server = sinon.fakeServer.create();
    server.respondWith("GET", '/forums/0/threads',
       [200,
       {"Content-Type": "application/json"},
       JSON.stringify(forumThreadsJSON)]);

    // We call this after we expect the AJAX request to have started
    server.respond();


    We can also use it for stubbing out functionality that does not work in JSDom, like functions involving window properties, or functionality that comes from 3rd party APIs:

    var util = browser.require('js/lib/util');
    sinon.stub(util, 'changeUrlParam',
       function(url, name, value) { return url + value;});
    
    var BadgevilleUtil = browser.require('js/lib/badgeville');
    sinon.stub(BadgevilleUtil, 'isEnabled',
       function() { return true;});
    

    Or we can use it to spy on methods, if we just want to check how often they're called. Sometimes this means turning an anonymous function into a named view method, for easier spy-ability (see the sketch below):

    sinon.spy(view, 'redirectToThread');
    // trigger the interaction that should call the method
    chai.expect(view.redirectToThread.calledOnce)
        .to.be.equal(true);
    view.redirectToThread.restore();
    
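Here's a hedged sketch of what that refactoring might look like, on a hypothetical Backbone view; the view name, event, and selector are made up for illustration, and it assumes Backbone and jQuery are loaded, as in our app code:

// Before: the handler is an anonymous function, so there is no
// property on the view that sinon.spy can wrap.
var ThreadListView = Backbone.View.extend({
  events: {
    'click .thread-link': function(e) {
      window.location.href = $(e.currentTarget).attr('href');
    }
  }
});

// After: the handler is a named method, so a test can call
// sinon.spy(view, 'redirectToThread') before triggering the click.
var ThreadListView = Backbone.View.extend({
  events: {
    'click .thread-link': 'redirectToThread'
  },
  redirectToThread: function(e) {
    window.location.href = $(e.currentTarget).attr('href');
  }
});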

Besides those testing-specific libraries, we also use NodeJS to execute the tests, along with various Node modules:

  • require: Similar to how we use this in our Backbone models and views to declare dependencies, we use require in the tests to bring in whatever libraries we're testing.
  • path: A library that helps construct paths on the file system.
  • fs: A library that helps us read our test files.

Let's see what all of that looks like together in one test suite. These are a subset of the tests for our various about pages. The first test is a very simple one, for a basically interaction-less, AJAX-less page. The second is for a page that makes one AJAX call:

describe('about pages', function() {
  var chai = require('chai');
  var path = require('path');
  var env  = require(path.join(testDir, 'lib', 'environment'));
  var fs   = require('fs');

  var Coursera;
  var browser;
  var sinon;
  var server;
  var _;

  beforeEach(function() {
    browser = env.browser(staticDir);
    Coursera  = browser.require('pages/home/app');
    sinon = browser.require('js/lib/sinon');
    _ = browser.require('underscore');
  });

  describe('aboutBody', function() {

    it('about page content', function() {
      var aboutBody = browser.require('pages/home/about/aboutBody');
      var body      = new aboutBody();
      var view      = body.render();

      chai.expect(document.title).to.be.equal('About Us | Coursera');
      chai.expect(view.$el.find('p').size()).to.be.equal(6);
      chai.expect(view.$el.find('h2').size()).to.be.equal(3);
    });
  });


  describe('jobsBody and jobBody', function(){

    var jobs     = fs.readFileSync(path.join(__filename, '../../data/about/jobs.json'), 'utf-8');
    var jobsJSON = JSON.parse(jobs);

    beforeEach(function() {
      server = sinon.fakeServer.create();
      server.respondWith("GET", Coursera.config.url.api + "common/jobvite.json", 
        [200, {"Content-Type":"application/json"}, jobs]);
    });

    it('job page content', function(done) {
      var jobBody = browser.require('pages/home/about/jobBody');
      var view      = new jobBody({jobId: jobsJSON[0].id});

      var renderJob = sinon.stub(view, 'renderJob', function() {
        renderJob.restore();
        view.renderJob.apply(view, arguments);
        chai.expect(view.$('.coursera-about-body h2').text())
          .to.be.equal(jobsJSON[0].title);
        done();
      });

      view.render();
      chai.expect(document.title).to.be.equal('Jobs | Coursera');
      server.respond();
    });

  });
});

Integration testing

Can a user go through the entire flow of signing up, enrolling, watching a lecture, and taking a quiz? This type of testing can be done via Selenium WebDriver, which opens a remote-controlled browser on a virtual machine, executes commands, and checks for expected DOM state. The same test can be run on multiple browsers, to make sure no regressions are introduced cross-browser. These tests can be slow to run, since they start up an entire browser, so it is common to use cloud services like SauceLabs to distribute tests across many servers and run them in parallel on multiple browsers.

There are client libraries for the Selenium WebDriver written in several languages, the most supported being Java and Python. For example, here is a test written in Python that goes through our login flow, entering user credentials and checking for the expected DOM:

from selenium.webdriver.common.by import By
import BaseSitePage

class SigninPage(BaseSitePage.BaseSitePage):
    def __init__(self, driver, waiter):
        super(SigninPage, self).__init__(driver, waiter)
        self._verify_page()

    def valid_login(self, email, password):
        self.enter_text('#signin-email', email)
        self.enter_text('#signin-password', password)
        self.click('.coursera-signin-button')
        self.wait_for(lambda: \
                self.is_title_equal('Your Courses | Coursera') or \
                self.is_title_equal('Coursera'))
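
To give a flavor of the cross-browser, remote-server side of this, here's a rough sketch using the selenium-webdriver Node bindings (a different client from the Python one above). The hub URL, page URL, and credentials are placeholders, while the selectors echo the Python example:

var webdriver = require('selenium-webdriver');

// Placeholder hub URL: a real setup would point at a Selenium Grid
// or a cloud provider's endpoint, with real credentials.
var HUB_URL = 'http://localhost:4444/wd/hub';

// Run the same login check against several browsers.
['firefox', 'chrome', 'internet explorer'].forEach(function(browserName) {
  var driver = new webdriver.Builder()
      .usingServer(HUB_URL)
      .withCapabilities({browserName: browserName})
      .build();

  // Placeholder page URL and test credentials.
  driver.get('http://staging.example.com/account/signin');
  driver.findElement(webdriver.By.css('#signin-email'))
      .sendKeys('tester@example.com');
  driver.findElement(webdriver.By.css('#signin-password'))
      .sendKeys('secret');
  driver.findElement(webdriver.By.css('.coursera-signin-button')).click();

  // Wait for one of the post-login titles, as in the Python example.
  driver.wait(function() {
    return driver.getTitle().then(function(title) {
      return title === 'Your Courses | Coursera' || title === 'Coursera';
    });
  }, 10000);

  driver.quit();
});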

We do not currently run our Selenium tests, as they are slow and fragile, and we have not had the engineering resources to make them more stable and easier to develop locally. We may outsource the writing and maintenance of these tests to our QA team one day, or hire a testing engineer who will improve them, or both.


Visual regression testing

If we took a screenshot of every part of the site before and after a change, would they line up? If there's a difference, is it on purpose, or should we be concerned? This would be most useful for checking the effects of CSS changes, which can range from subtle to fatal.

Few apps do this sort of testing yet, but there's a growing recognition of its utility, and thus we're seeing more libraries come out of the woodwork for it. Here's an example using Needle with Selenium:

from needle.cases import NeedleTestCase

class BBCNewsTest(NeedleTestCase):
    def test_masthead(self):
        self.driver.get('http://www.bbc.co.uk/news/')
        self.assertScreenshot('#blq-mast', 'bbc-masthead')

There's also Perceptual Diffs, PhantomCSS, CasperJS, and SlimerJS. For a more manual approach, there's the Firefox screenshot command with Kaleidoscope. Finally, there's dpxdt (pronounced depicted).
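
As a taste of those, here's a minimal sketch using PhantomCSS inside a CasperJS test; the local URL is a placeholder, the selector is borrowed from the test suite earlier in this post, and it assumes the casper and phantom globals that CasperJS provides:

var phantomcss = require('phantomcss');

phantomcss.init({
  screenshotRoot: './screenshots',
  failedComparisonsRoot: './failures'
});

casper.start('http://localhost:3000/about');

casper.then(function() {
  // The first run captures a baseline image of the element; later runs
  // diff new screenshots against that baseline and flag any mismatch.
  phantomcss.screenshot('.coursera-about-body', 'about-body');
});

casper.then(function() {
  phantomcss.compareAll();
});

casper.run(function() {
  // Exit non-zero if any comparison failed, so CI can catch it.
  phantom.exit(phantomcss.getExitStatus());
});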

We do not do visual regression testing at this time, but I do think it would be a good addition to our testing toolbelt, one that would catch issues no other testing layer would find. The times that I've wanted it the most were during upgrades of our Twitter Bootstrap base CSS.


QA (manual) testing

If we ask someone to try a series of steps in multiple browsers, will they see what we expect? This testing is the slowest and least automatable, but it can be great for finding subtle usability bugs, accessibility issues, and cross-browser weirdness.

Typically, when we have a new feature and we've completed the frontend as we envisioned it, we'll create a worksheet in our QA testing spreadsheet that gives an overall description of the feature, a staging server to test it on, and then a series of pages or sequences of interactions to try. We'll also specify what browsers to test in (or "our usual": Chrome, FF, IE, Safari, iPad), and anything in particular to look out for. Our QA team takes about a day to complete most feature tests, and depending on the feedback, we will put a feature through multiple QA rounds.
