Saturday, June 29, 2013

Testing Backbone Frontends

When I first joined Coursera a year ago, we had no tests of our frontend code, but we knew this had to change. We are building a complex product for many users that will pass through many engineers' hands, and the only way we can have a reasonable level of confidence in changing old code is if there are tests for it. We will still encounter bugs, and users will still use the product in ways that we did not expect, but we can hope to avoid some of the more obvious bugs via our tests, and we have a mechanism in place to catch regressions. Traditionally, the frontend has been the least tested part of a webapp, since it was once the "dumb" part of the stack, but now that we are putting so much logic and interactivity into our frontends, they need to be just as thoroughly tested as the backend.

There are various levels of testing that we could do on our frontends: unit testing, integration testing, visual regression testing, and QA (manual) testing. Of those, we currently only do unit testing and QA testing at Coursera, but given infinite time and resources, we would cover the spectrum. Here's a rundown of those levels of testing, and how we do them - or could do them, one day.


Unit Testing

When we call a function with particular parameters, does it do what we expect? When we instantiate a class with given options, do its methods do what we think they will? There are many popular JS unit testing frameworks now that help answer those questions, like Jasmine, QUnit, and Mocha.

We do a form of unit testing on our Backbone models and views, using a suite of testing technologies:

  • Mocha: An open-source test runner library that gives you a way to define suites of tests with setup and teardown functions, and then run them via the command-line or browser. It also gives you a way to asynchronously signal a test completion. For example:
    describe('tests for the reporter library', function() {
      beforeEach(function() {
        // do some setup code
      });
      afterEach(function() {
        // do some cleanup code
      });
      it('renders the reporter template properly', function() {
        // test stuff
      });
      it('responds to the ajax request correctly', function(done) {
        // in some callback, call:
        done();
      });
    });
  • Chai: An open-source test assertion library that provides convenient functions for checking the state of a variable, using a surprisingly readable syntax. For example:
    chai.expect(2+2).to.be.equal(4);
    chai.expect(2+2).to.be.greaterThan(3);
    
  • JSDom: An open-source library that creates a fake DOM, including fake events. This enables us to test our views without actually opening a browser, which means that we can run quite a few tests in a small amount of time. For example, we can check that clicking changes some DOM:
    var view = new ReporterView().render();
    view.$el.find('input[value=quiz-wronggrade]').click();
    var $tips = view.$el.find('[data-problem=quiz-wronggrade]');
    chai.expect($tips.is(':visible'))
      .to.be.equal(true);
    chai.expect($tips.find('h5').eq(0).text())
      .to.be.equal('Tips');
    
  • SinonJS: An open-source library for creating stubs, spies, and mocks. We use it the most often for mocking out our server calls with sample data that we store with the tests, like so:
    var forumThreadsJSON = JSON.parse(
        fs.readFileSync(path.join(__dirname, 'forum.threads.json')));
    
    server    = sinon.fakeServer.create();
    server.respondWith("GET", '/forums/0/threads', 
       [200,
       {"Content-Type":"application/json"},
       JSON.stringify(forumThreadsJSON)]);
    
    // We call this after we expect the AJAX request to have started
    server.respond();
    

    We can also use it for stubbing out functionality that does not work in JSDom, like functions involving window properties, or functionality that comes from 3rd party APIs:

    var util = browser.require('js/lib/util');
    sinon.stub(util, 'changeUrlParam',
       function(url, name, value) { return url + value;});
    
    var BadgevilleUtil = browser.require('js/lib/badgeville');
    sinon.stub(BadgevilleUtil, 'isEnabled',
       function() { return true;});
    

    Or we can use it to spy on methods, if we just want to check how often they're called. Sometimes this means making an anonymous function into a view method, for easier spy-ability:

    
    sinon.spy(view, 'redirectToThread');
    // do some stuff to call function to be called
    chai.expect(view.redirectToThread.calledOnce)
        .to.be.equal(true);
    view.redirectToThread.restore();
    

Besides those testing-specific libraries, we also use NodeJS to execute the tests, along with various Node modules:

  • require: Similar to how we use this in our Backbone models and views to declare dependencies, we use require in the tests to bring in whatever libraries we're testing.
  • path: A library that helps construct paths on the file system.
  • fs: A library that helps us read our test files.

Let's see what all of that looks like together in one test suite. This is a subset of the tests for our various about pages. The first test is a very simple one, for a basically interaction-less, AJAX-less page. The second is for a page that does one AJAX call:

describe('about pages', function() {
  var chai = require('chai');
  var path = require('path');
  var env  = require(path.join(testDir, 'lib', 'environment'));
  var fs   = require('fs');

  var Coursera;
  var browser;
  var sinon;
  var server;
  var _;

  beforeEach(function() {
    browser = env.browser(staticDir);
    Coursera  = browser.require('pages/home/app');
    sinon = browser.require('js/lib/sinon');
    _ = browser.require('underscore');
  });

  describe('aboutBody', function() {

    it('about page content', function() {
      var aboutBody = browser.require('pages/home/about/aboutBody');
      var body      = new aboutBody();
      var view      = body.render();

      chai.expect(document.title).to.be.equal('About Us | Coursera');
      chai.expect(view.$el.find('p').size()).to.be.equal(6);
      chai.expect(view.$el.find('h2').size()).to.be.equal(3);
    });
  });


  describe('jobsBody and jobBody', function(){

    var jobs     = fs.readFileSync(path.join(__filename, '../../data/about/jobs.json'), 'utf-8');
    var jobsJSON = JSON.parse(jobs);

    beforeEach(function() {
      server = sinon.fakeServer.create();
      server.respondWith("GET", Coursera.config.url.api + "common/jobvite.json", 
        [200, {"Content-Type":"application/json"}, jobs]);
    });

    it('job page content', function(done) {
      var jobBody = browser.require('pages/home/about/jobBody');
      var view      = new jobBody({jobId: jobsJSON[0].id});

      var renderJob = sinon.stub(view, 'renderJob', function() {
        renderJob.restore();
        view.renderJob.apply(view, arguments);
        chai.expect(view.$('.coursera-about-body h2').text())
          .to.be.equal(jobsJSON[0].title);
        done();
      });

      view.render();
      chai.expect(document.title).to.be.equal('Jobs | Coursera');
      server.respond();
    });

  });
});

Integration testing

Can a user go through the entire flow of sign up, enroll, watch a lecture, and take a quiz? This type of testing can be done via Selenium WebDriver, which opens up a remote controlled browser on a virtual machine, executes commands, and checks expected DOM state. The same test can be run on multiple browsers, to make sure no regressions are introduced cross-browser. They can be slow to run, since they do start up an entire browser, so it is common to use cloud services like SauceLabs to distribute tests across many servers and run them in parallel on multiple browsers.

There are client libraries for the Selenium WebDriver written in several languages, the most supported being Java and Python. For example, here is a test written in Python that goes through our login flow, entering user credentials and checking for the expected DOM:

from selenium.webdriver.common.by import By
import BaseSitePage

class SigninPage(BaseSitePage.BaseSitePage):
    def __init__(self, driver, waiter):
        super(SigninPage, self).__init__(driver, waiter)
        self._verify_page()

    def valid_login(self, email, password):
        self.enter_text('#signin-email', email)
        self.enter_text('#signin-password', password)
        self.click('.coursera-signin-button')
        self.wait_for(lambda: \
                self.is_title_equal('Your Courses | Coursera') or \
                self.is_title_equal('Coursera'))

We do not currently run our Selenium tests, as they are slow and fragile, and we have not had the engineering resources to put time into making them more stable and easier to develop locally. We may outsource the writing and maintenance of these tests to our QA team one day, or hire a testing engineer to improve them, or both.


Visual regression testing

If we took a screenshot of every part of the site before and after a change, do they line up? If there's a difference, is it on purpose, or should we be concerned? This would be most useful for checking the effects of CSS changes, which can range from subtle to fatal.

There are few apps doing this sort of testing, but there's a growing recognition of its utility and thus, we're seeing more libraries come out of the woodwork for it. Here's an example using Needle with Selenium:

from needle.cases import NeedleTestCase

class BBCNewsTest(NeedleTestCase):
    def test_masthead(self):
        self.driver.get('http://www.bbc.co.uk/news/')
        self.assertScreenshot('#blq-mast', 'bbc-masthead')

There's also Perceptual Diffs, PhantomCSS, CasperJS, and SlimerJS. For a more manual approach, there's the Firefox screenshot command with Kaleidoscope. Finally, there's dpxdt (pronounced depicted).

We do not do visual regression testing at this time, but I do think it would be a good addition in our testing toolbelt, and would catch issues that no other testing layers would find. The times that I've wanted this the most were during upgrades of our Twitter Bootstrap base CSS.


QA (manual) testing

If we ask someone to try a series of steps in multiple browsers, will they see what we expect? This testing is the slowest and least automate-able, but it can be great for finding subtle usability bugs, accessibility issues, and cross-browser weirdness.

Typically, when we have a new feature and we've completed the frontend per whatever we've imagined, we'll create a worksheet in our QA testing spreadsheet that gives an overall description of the feature, a staging server to test it on, and then a series of pages or sequences of interactions to try. We'll also specify what browsers to test in (or "our usual" - Chrome, FF, IE, Safari, iPad), and anything in particular to look out for. Our QA team takes about a day to complete most feature tests, and depending on the feedback, we will put a feature through multiple QA rounds.

Tuesday, June 25, 2013

Increasing Diverse Engineers in the Workplace:
A Call for Studies and Stories

Today I met with a small group of folks from SF startups and coding academies to talk about an issue we all face: increasing diversity in the engineering workplace. We came together to focus on women, but personally, I like to think of it more in terms of the ways that women are often different: they may often have alternative educational backgrounds (like the coding academies) and they may often have less confidence in their skills (possibly related to their background). Many of our engineering practices around hiring and retention don't cater well to candidates with those attributes, and there's probably a lot we can do to change that.

Our goal tonight was to brainstorm experiments that we could do in our workplaces to improve diversity (across the pipeline from attracting candidates to retaining employees), try out those experiments, and document the results publicly. Ideally, if the experiments are successful, other companies will be more likely to put in the effort to implement the changes and engineering workplace practices would gradually begin to change.

Here's what we broke the pipeline down into, and a few ideas in each area:

Attracting

  • Making a more gender-neutral jobs/about page, with diverse team photos.
  • Re-wording job descriptions to not use words like "dominant" or "mastery" and instead focus on collaboration and ability to learn on the job. (*This would need to actually be true!)
  • Removing CS degrees as "required" and moving them to "nice to have".
  • Hosting meetups for diverse meetup groups.
  • Surveying candidates at various stages of the pipeline to ask "What attracted you to the job?" and seeing if there's something that stands out as *not* attracting diverse applicants. (Like if your women engineers never answer "the job page", then something may be wrong there)

Interviewing

  • Sending technical worksheets out to candidates with not-as-obviously-strong resumes, giving them a chance to prove their skills before rejecting them.
  • Focusing less on algorithms (which are hallmarks of traditional CS education, but often not as relevant in something like frontend engineering) and more on architecture questions (which is always a practical concern).
  • Adding a coding exercise to the interview process that isn't on the whiteboard but is more like what they'd actually do on the job: like a full-day working inside a similar codebase, or an afternoon of pairing with a few engineers.
  • Making sure the interviewers themselves are diverse.

Onboarding & Retaining

  • Pairing new employees with a mentor that goes out of their way to help them progress and answer questions.
  • Having weekly one-on-ones with engineers, focusing on career growth, and even asking them questions like "Why haven't you asked for a raise yet?"
  • Having a dedicated group and events for a minority group (like Google's Gayglers and Women Engineers), which go out of their way to make sure they're supported.

For each of those ideas, we'd try to think about how we'd do it as an experiment, and what we'd measure to be able to really prove the success metrics. It's tricky because you ideally want to see the effect of a change on the whole pipeline - e.g. would tweaking the job page increase the number of women applicants that then go on to make it through the interview process, get hired, and do well at the job? That's quite a long pipeline to measure - but I think there are useful metrics to measure along the way.

I imagine many companies have tried out experiments in these areas, and I'd love to hear about studies and stories from those, so that we don't try to reinvent any wheels. Please share links in the comments, and if you have an experiment that you've done but not written about, I encourage you to write a post on it and link to that. Thank you!

Wednesday, June 12, 2013

Our Backbone Stack

Backbone is a base to build on top of. It gives you a framework for separating your data and your presentation into models and views, but there is a lot that it doesn't give you. It's up to you to figure out what else your unique app needs, and how much of that you'll get from open-source libraries or decide to write yourselves. I see that as both a good and a bad thing. It's good because Backbone can lend itself to many different sorts of apps, with the right combination of add-ons, but it's bad because it takes longer to find those add-ons and get them working happily together.

Saying all that, I thought I'd share what our "backbone stack" is at Coursera. I'd love to hear about your own stacks in the comments.


AJAX calls

After finding ourselves making the same modifications repeatedly to our AJAX calls, we created custom wrapper libraries to take care of those commonalities:

api.js is an AJAX API wrapper that takes care of emulating patch requests, triggering events, showing AJAX loading/loaded messages via asyncMessages.js, and creating CSRF tokens in the client. We use this from both our Backbone and non-Backbone code.

backbone.api.js is a Backbone-specific wrapper for api.js that overrides the sync method and adds create/update/read/delete methods. We mix this into our Backbone models like so:

_.extend(model.prototype, BackboneModelAPI);
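As a flavor of what such a wrapper handles, here's a minimal sketch of PATCH emulation - the function name and option shapes here are illustrative, not Coursera's actual api.js API:

```javascript
// Hypothetical sketch: emulate an HTTP PATCH request for servers or proxies
// that only accept POST, by rewriting the request options before sending.
// The option shape mimics jQuery.ajax settings, but the names are assumptions.
function emulatePatch(options) {
  if (options.type !== 'PATCH') {
    return options; // nothing to do for other verbs
  }
  var emulated = {};
  for (var key in options) {
    if (options.hasOwnProperty(key)) {
      emulated[key] = options[key];
    }
  }
  emulated.type = 'POST';
  emulated.headers = emulated.headers || {};
  // Tell the server which verb we really meant.
  emulated.headers['X-HTTP-Method-Override'] = 'PATCH';
  return emulated;
}
```

A wrapper like api.js would run every outgoing request's options through a step like this before handing them off to the actual AJAX transport.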

Relational Models

Out of the box, Backbone will take JSON from a RESTful API and automatically turn it into a Model or a Collection of Models. However, we have many APIs that return JSON that really represents multiple models (from multiple tables in our MySQL database), like courses with universities:

[{"name": "Game Theory",
  "id": 2,
  "universities": [{"name": "Stanford"}, {"name": "UBC"}]
}]

We quickly realized we needed a way to model that on the frontend, if we wanted to be able to use model-specific functionality on the nested models (which we often do).

Backbone-relational is an external library that makes it easier to deal with turning JSON into models/collections with sub collections inside of them, by specifying the relations like so:

var Course = Backbone.RelationalModel.extend({
   relations: [{
      type: Backbone.HasMany,
      key: 'universities',
      relatedModel: University,
      collectionType: Universities
    }]
});

We started using that for many of our Backbone apps, but we've had some performance and caching issues with it, so we've started stripping it out of our model-heavy apps and doing the conversion into nested models manually.

You could also use: Backbone.nested.


Regions

Backbone lets you create views and render them into arbitrary parts of your DOM, but many developers soon run into the desire for "regions" or "layouts". We want to specify different parts of the page, and only swap out the views in those parts across routes - like the header, footer, and main area. That's a better user experience, since there's no unnecessary refreshing of unchanging DOM.

origami.js is a custom library that lets us create regions associated with views, and then in a route, we'll specify which region we want to replace with a particular view file, plus additional options to pass to that view. In the view, we can bind to region events like "view:merged" or "view:appended" and take appropriate actions.

Our syntax for specifying the view regions is admittedly a bit unwieldy, but you get the idea:

Coursera.region.open({
  "pages/home/template/page": {
    regions: {
      header: {
        "pages/home/template/header": {
          initialize: {
            universityPartnerType: university.get('partner_type')
          }
        }
      },
      body: {
        "pages/home/university/universityPage": {
          initialize: {
            university: university
          }
        }
      }
    }
  }
});

Our region library also keeps track of "dirty models" and is responsible for throwing up a modal alert when the user tries to leave the view that there's unsaved data (similar to how you'd do a window.unload for a traditional website).
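To illustrate, the dirty-model bookkeeping can be reduced to a small tracking object like the one below. This is a hypothetical sketch with names of my own, not origami.js's actual API: views would mark models dirty on 'change' and clean on 'sync', and the region would consult the tracker before swapping views.

```javascript
// Hypothetical sketch of dirty-model tracking (not the actual origami.js code).
// A region consults hasUnsavedData() before replacing its view, and shows
// the "unsaved data" modal if it returns true.
function DirtyTracker() {
  this._dirty = {};
}

// Called when a model fires 'change' with unsaved edits.
DirtyTracker.prototype.markDirty = function(modelId) {
  this._dirty[modelId] = true;
};

// Called when a model is successfully saved (e.g. on 'sync').
DirtyTracker.prototype.markClean = function(modelId) {
  delete this._dirty[modelId];
};

DirtyTracker.prototype.hasUnsavedData = function() {
  for (var id in this._dirty) {
    if (this._dirty.hasOwnProperty(id)) {
      return true;
    }
  }
  return false;
};
```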

You could also use: Marionette.js or Chaplin.


Templating

Backbone requires Underscore as a dependency, and since Underscore includes a basic templating library, that's the one you'll see in the Backbone docs. However, we wanted a bit more out of our templating library.

Jade is a whitespace-significant, bracket-less HTML templating library. It's clean to look at because of the lack of brackets and the enforced indenting (like Python and Stylus), but its best feature (in my opinion) is that it auto-closes HTML tags. I've dealt with too many strange bugs from un-closed tags, and I like that it's one more thing I don't have to worry about when using Jade. Here's an example:

div
    h1 #{book.get('title')}
    p
    each author in book.get('authors')
        a(href=author.get('url')) #{author.get('name')}
    if book.get('published')
        a.btn.btn-large(href="/buy") Buy now!

You could also use: Handlebars, Mustache, or many other options.


Data Binding

Backbone makes it easy for you to find out when attributes on your Model have changed, via the "changed" event, and to query for all changed attributes since the last save via the changedAttributes method, but it does not officially offer any data ⟺ dom binding. If you are building an app where the user can change the data after it's been rendered, then you will find yourself wanting some sort of data binding to re-render that data when appropriate. We have many parts of Coursera where we need very little data-binding, like our course dashboard and course description pages, but we have other parts which are all data-binding, all-the-time, like our discussion forums and all of our admin editing interfaces.

Backbone.stickit is a lightweight data-binding library that we've started to use for a few of our admin interfaces. Here's a simple example from their docs:

Backbone.View.extend({
  bindings: {
    '#title': 'title',
    '#author': 'authorName'
  },
  render: function() {
    this.$el.html('<div id="title"/><input id="author">');
    this.stickit();
  }
});

We still do custom data-binding for many of our views (using the "changed" event, changedAttributes(), and partial re-rendering), and I like that because it gives me the most control to decide exactly how a view should change, and I don't have to fight against a binding library's assumptions.
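Stripped of Backbone specifics, that custom approach boils down to mapping changed attributes to the render functions that depend on them. A hypothetical sketch of that dispatch step (the names are mine, not our actual view code):

```javascript
// Hypothetical sketch: given the object returned by model.changedAttributes()
// and a map of attribute name -> renderer function, run only the renderers
// whose attributes actually changed. A view would call this from its
// 'change' event handler instead of re-rendering everything.
function partialRender(changed, renderers) {
  var called = [];
  for (var attr in changed) {
    if (changed.hasOwnProperty(attr) && renderers[attr]) {
      renderers[attr](changed[attr]);
      called.push(attr);
    }
  }
  return called; // which renderers ran, handy for testing
}
```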

You could also use: Knockback


History

Backbone offers the Router class and Backbone.History for creating a single-page web app experience, where the URL changes completely and the history is managed via that URL. In some cases, however, I use Backbone to create "widgets" that I can place on existing URLs, and I want to maintain history and back button in those widgets without changing the main URL of the page.

jQuery BBQ is an external non-Backbone specific library for maintaining history in the hash, and as it turns out, it works pretty well with Backbone. You can read my blog post on it for a detailed explanation.
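Conceptually, what BBQ manages amounts to serializing widget state into the location hash and parsing it back out on hashchange. A stripped-down sketch of that idea follows; jQuery BBQ itself handles merging, nested values, and the hashchange event far more robustly, and these function names are my own:

```javascript
// Sketch of hash-based widget state: serialize a flat state object into a
// location hash, and parse it back. jQuery BBQ does this (and more) for real.
function stateToHash(state) {
  var parts = [];
  for (var key in state) {
    if (state.hasOwnProperty(key)) {
      parts.push(encodeURIComponent(key) + '=' + encodeURIComponent(state[key]));
    }
  }
  return '#' + parts.join('&');
}

function hashToState(hash) {
  var state = {};
  var pairs = hash.replace(/^#/, '').split('&');
  for (var i = 0; i < pairs.length; i++) {
    if (!pairs[i]) { continue; }
    var pair = pairs[i].split('=');
    state[decodeURIComponent(pair[0])] = decodeURIComponent(pair[1]);
  }
  return state;
}
```

A widget would write its state with something like stateToHash on navigation, and restore it from hashToState when the page loads or the hash changes.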

You could also use: Backbone.Widget.

Monday, June 10, 2013

Referencing DOM from JS: there must be a DRYer, safer way

In our JS apps at Coursera, here's what a typical Backbone view might look like:

var ReporterView = Backbone.View.extend({
  render: function() {
    this.$el.html(ReporterTemplate());
  },
  events: {
     'change .coursera-reporter-input': 'onInputChange',
     'click .coursera-reporter-submit': 'onSubmitClick'
  },
  onInputChange: function() {
    this.$('.coursera-reporter-submit').attr('disabled', null);
  },
  onSubmitClick: function() {
    this.model.set('title', this.$('.coursera-reporter-input').val());
    this.model.save();
  }
});

We render out a basic template, setup a few event listeners, and respond to them by manipulating the DOM or sending some data to the server. But there are a few things that irk me about this setup:

  • We are repeating those class names in multiple places.
  • We are using CSS class names for events and manipulation.

They've been irking for a while, but today I finally decided to try out a few approaches to those issues.


DRYing It Out

It worries me whenever I see class names repeated, because I know I have to either remember to update them if we change the class name, or I need full coverage tests. Ideally we'd have the latter for all our views, but hey, if there's something I can do to make my code generally safer, why not?

To avoid repeating those class names, I could store them in some sort of constants that are accessible anywhere in the view, and only access them via the constants. For example:

var ReporterView = Backbone.View.extend({
  dom: {
     SUBMIT_BUTTON: '.coursera-reporter-submit',
     INPUT_FIELD:   '.coursera-reporter-input'
  },
  render: function() {
    this.$el.html(ReporterTemplate());
  },
  events: function() {
    var events = {};
    events['change ' + this.dom.INPUT_FIELD]    = 'onInputChange';
    events['click ' +  this.dom.SUBMIT_BUTTON]  = 'onSubmitClick';
    return events;
  },
  onInputChange: function() {
    this.$(this.dom.SUBMIT_BUTTON).attr('disabled', null);
  },
  onSubmitClick: function() {
    this.model.set('title', this.$(this.dom.INPUT_FIELD).val());
    this.model.save();
  }
});

There are a few drawbacks to that approach: 1) we add more constants to our code, which may increase file size, and 2) our events definition gets a little harder to read.

But it is certainly DRYer and there are also more benefits to it besides that, like easier-to-maintain testing code:

it('enables the submit button on change', function() {
  chai.expect(view.$(view.dom.SUBMIT_BUTTON).attr('disabled'))
    .to.be.equal('disabled');
  view.$(view.dom.INPUT_FIELD).trigger('change');
  chai.expect(view.$(view.dom.SUBMIT_BUTTON).attr('disabled'))
    .to.be.equal(undefined);
});

De-class-ification

It bothers me when I realize that I'm relying on CSS class names in my JS, because that means that:

  1. I need to create long, specific class names, since we compile the CSS used by our many views into just a few CSS bundles, and it would be a bad thing if two class names overlapped.
  2. I need to warn our designers not to touch those class names when they're doing a re-style of our HTML Jade templates, and/or I need to make sure that I have 100% coverage on the code using those class names.

I'd much rather have every CSS class name in our HTML templates be there purely for styling reasons, so that we only put effort into avoiding name collisions if it's actually necessary, and so that our designers can safely refactor the CSS independently of the JS.

An alternative to using CSS class names is to use data attributes instead, perhaps prefixing with js-* to indicate their use in JS. That would mean changing our selectors to something like:

var ReporterView = Backbone.View.extend({
  dom: {
     SUBMIT_BUTTON: '[data-js-submit-button]',
     INPUT_FIELD:   '[data-js-input-field]'
  },
...
});

There is a drawback to using data attributes, however: performance. According to local legend and this jsperf from Craig Patik, it is faster in all browsers to query by class than by data attribute, both via jQuery and the native document.querySelectorAll. Given that, a compromise would be to use CSS class names prefixed with js-* and to make it very clear to the engineering and design teams that those classes are never to be referenced in the stylesheets. For example:

var ReporterView = Backbone.View.extend({
  dom: {
     SUBMIT_BUTTON: '.js-submit-button',
     INPUT_FIELD:   '.js-input-field'
  },
...
});

So, that's what I came up with. Now, your turn. What's your approach to referencing the DOM from your Backbone views or Javascript app? How does it do in terms of DRYness, future-proof-ness, testability, and all those fancy terms that may not actually exist?

Friday, June 7, 2013

Exporting a Google Spreadsheet as JSON

⚠️ Update (Apr 2023): I do not actively maintain this script, so it may not work. See my recent post on using Neptyne as another approach.

I often use Google Spreadsheets as a lightweight database, by setting up columns, encouraging my colleagues to update it, and subscribing to notifications of changes. Then I export the spreadsheet as JSON and update a .json file in our codebase. It's also possible to use the jsonp output of a published spreadsheet for dynamically updated JSON, but when I'm worried about performance or the information getting mis-updated, I use the export-and-update approach.

In order to export it as JSON, I use a Google Apps Script.

Here are the steps for using the script:

  1. Create a new spreadsheet, and put your data in columns. Give each column a name, and choose carefully. Since this name will be used for JSON keys, the best names are lowercase, whitespace-less, and descriptive. Freeze the first row with the column names. 
    Screenshot of spreadsheet with reading list
  2. Go to Tools → Script Editor, and it will open up a code editing interface in another window. 
    Screenshot of Google spreadsheets UI, Tools menu and Script editor option
  3. Paste the JavaScript from this gist into the code editor. 
    Screenshot of Google Apps Script editor with code inside it
  4. Reload the spreadsheet and notice a new menu shows up called "Export JSON". Click on that, and you'll see two options:
    1. Export JSON for this sheet (with default configuration)
    2. Export JSON for all sheets (with default configuration)
      Screenshot of Google spreadsheet with Export JSON menu
  5. Once you click one of the export options, it will process for a few seconds and popup a textbox with the exported JSON. Now you can do whatever you'd like with that!
    Screenshot of exported JSON for Google spreadsheet
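The heart of any such script is converting the sheet's 2D array of values (as returned by Apps Script's sheet.getDataRange().getValues()) into objects keyed by the frozen header row. Here's a simplified sketch of that conversion step - the actual gist does more (menu setup via onOpen, key normalization, multiple sheets), and this helper name is my own:

```javascript
// Simplified sketch of the spreadsheet-to-JSON conversion. The first row of
// `values` holds the column names; every following row becomes one object.
function rowsToObjects(values) {
  var headers = values[0];
  var objects = [];
  for (var i = 1; i < values.length; i++) {
    var obj = {};
    for (var j = 0; j < headers.length; j++) {
      obj[headers[j]] = values[i][j];
    }
    objects.push(obj);
  }
  return objects;
}
```

In the script itself, JSON.stringify on the result is what produces the text shown in the popup.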

So there you have it! If you make any improvements to the script to make it more flexible or easier to use or better in any way, please let me know in the comments.

Wednesday, June 5, 2013

Improving Backbone App Performance

At CSSConf and JSConf, one of the big themes was performance. I saw talks from Chrome engineers on Jank-busting, a talk from Adobe engineer Peter Flynn on performance tuning in Brackets, and had a lengthy lunch discussion about API performance and caching. Eventually, I couldn't stand all that talking with no action, and I broke down and spent my night in the bar investigating a part of Coursera that is noticeably slow: the course catalog. We looked into the backend API performance a few weeks ago, but I figured, hey, maybe there were frontend improvements that we could make too. Here's what I did and what I learnt.

Step 1. Identify Problem Areas in Timeline

In order to show our course catalog to users, the browser goes through many steps: loading the JS and CSS files, parsing them, finding the right Backbone route, creating the view, sending off an AJAX request to get courses, and rendering everything to the DOM. We could have performance issues in each of those steps (and there's ways to improve the performance of all of them), but I wanted to identify if there were big problem areas with obvious room for improvement.

To start off with, I loaded the catalog with the Chrome timeline view open and recording, and looked for big blocks. I quickly found two of them that looked ripe for the fixing:

  • The callback function for the courses AJAX request that turned the JSON into Backbone collections and models. This one was all "Scripting" blocks (JS execution).
  • The function that rendered the tags and courses to the catalog. This showed up as staggered "Rendering" blocks (DOM operations).

Here's what that looked like in the timeline:



Step 2. Break them down in Profiler

The timeline view will give you a good rough idea of where the browser is spending its time, but the profiler can give you exact statistics on what functions are taking the most amount of time. After reloading the catalog while recording a Javascript CPU profile, I identified a few functions taking a suspicious amount of time, both from external libraries:

  • Backbone.Relational took up the majority of the time, with constructor and add operations.
  • moment.js took up a significant amount of time with date processing functions.

Here's what the profiler looked like in tree view with relevant nodes expanded:



Step 3. Remove, reload, record

I immediately started thinking of ways to reduce our usage of those external libraries (as I wasn't interested in trying to understand why they were so time-intensive), but before I tried those, I wanted to make sure I was scientific about seeing what changes had the greatest effect on performance. To the measuring machine!

I inserted console.time("Backbonify") and console.timeEnd("Backbonify") statements at the start and end of our AJAX callback, so that I could immediately see that time duration in the console on every reload. I discovered that the time varied between loads (by 20-80ms) and determined I would need to record a range, not a single number. Before making any changes, the Backbonify time was about 750-850ms.
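That measurement pattern, sketched in isolation (the wrapper function here is illustrative; in the real code the console.time/timeEnd calls simply bracket the body of the AJAX callback):

```javascript
// Sketch of the measurement pattern: wrap the suspect code path in
// console.time/console.timeEnd so the duration prints on every reload.
function timeBackbonify(processResponse, json) {
  console.time('Backbonify');
  var result = processResponse(json); // the AJAX callback under measurement
  console.timeEnd('Backbonify');
  return result;
}
```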

Now that I had a starting number to gauge my progress, I created a table of code-changes and time-ranges, and started changing code, reloading, and recording the new time spent. At first, I didn't worry about whether the code actually still worked - I just wanted to see how great an improvement I would get if, theoretically, I removed some bit of code. I didn't want to go through the effort of rewriting code until I understood how much I would gain.

Here's a screenshot of that table:



Step 4. Rewrite away the slowness

After my rapid-fire removal-and-reload session, I had a very good idea of what I could do to improve our performance while still retaining our functionality. Thankfully, we already had a number of tests for the code, so I had some confidence that I would know if I broke existing functionality.

Here's a run-down of the main changes that I made:

  • Removed Backbone.Relational
    Time saved: ~400ms
    We were using the library so that we could turn our JSON into nested collections of different models, and have a nice simple API for doing that, but I discovered that I could do the conversion myself and not take nearly as much time. Here's a basic conversion from the Topic.initialize() method:
    if (!this.get('courses') || !(this.get('courses') instanceof Courses)) {
      this.set('courses',
        new Courses(this.get('courses')), {silent: true});
    }
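    The same guard works outside of Backbone, too. Here's a plain-JS sketch of the idea, where Courses is a stand-in constructor and ensureCoursesCollection is a hypothetical helper name:

```javascript
// Stand-in for our Backbone collection of Course models.
function Courses(models) { this.models = models || []; }

// Only wrap the raw 'courses' array in a Courses collection if it
// hasn't been wrapped already - repeated calls are cheap no-ops.
function ensureCoursesCollection(topic) {
  if (!(topic.courses instanceof Courses)) {
    topic.courses = new Courses(topic.courses);
  }
  return topic.courses;
}
```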
    

  • Deferred and cached moment.js functionality
    Time saved: ~100ms
    We were using moment.js to set quite a few computed attributes on our Course models (related to start dates), and we were doing that upon initialization of the model, so that the computed attributes were always available just like normal attributes. I turned them instead into explicit function calls (course.getStartStatus() vs. course.get('start_status')) and rewrote the code that called them, so that we don't call the expensive moment functions until absolutely necessary. I also wrote a cache for moment.js so that, for example, I only call the .format() method if I've never formatted that date before. That improves performance both when calling the methods multiple times on the same model and when calling them across different models with the same start dates. Here's a snippet of that cache:
      var MomentCache = {
        // Maps a date-array key to its moment object.
        moments: {},
        // Maps a moment + format-string key to its formatted string.
        formats: {},
        getMomentFor: function(arr) {
          var key = arr.join('/');
          if (!MomentCache.moments[key]) {
            var momentFor = moment(arr);
            if (momentFor.isValid()) {
              MomentCache.moments[key] = momentFor;
            }
          }
          // Note: invalid dates are never cached, so this returns
          // undefined for them.
          return MomentCache.moments[key];
        },
        getFormatted: function(momentObj, formatString) {
          var key = momentObj.valueOf() + '|' + formatString;
          if (!MomentCache.formats[key]) {
            MomentCache.formats[key] = momentObj.format(formatString);
          }
          return MomentCache.formats[key];
        }
      };
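    Stripped of the moment.js specifics, the caching pattern looks like this - a sketch where expensiveFormat is a hypothetical stand-in for the costly call, and the cache key is built from the call's inputs, as in getFormatted:

```javascript
// Returns a memoized version of expensiveFormat: the underlying
// function only runs on a cache miss for a given (value, format) pair.
function makeFormatCache(expensiveFormat) {
  var cache = {};
  return function (value, formatString) {
    var key = value + '|' + formatString;
    if (!(key in cache)) {
      cache[key] = expensiveFormat(value, formatString);
    }
    return cache[key];
  };
}
```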
    

  • Stopped listening to Collection "add" for renders
    Time saved: ~60ms
    For the second problem area that I identified, the one that was DOM-render-heavy, I realized the problem was that we were doing many separate calls to our templating functions and jQuery append, and it would obviously be more efficient to reduce them to one template and append call. This was actually due to the way we were using Backbone events - we were listening to "add" on a collection, and despite the fact that we were adding all of our models at once to the collection, the "add" event triggers *for each model*, passing in the added model. To speed it up, I triggered a custom event for that batch add, and rendered all the non-rendered models at once in the callback. The callback looked something like:
        addTopics: function() {
          var self = this;
          var topicsToAdd = [];
          this.model.get('collection').each(function(topic) {
            // If we already rendered it, we don't render it again
            if (self.$('.coursera-catalog-course-listing-box[data-topic-id=' + topic.get('id') + ']').length) {
              return;
            }
            topicsToAdd.push(topic);
          });
          this.$('.coursera-catalog-listings').append(
            listingTemplate({topicsInfo: topicsToAdd}));
        }
    

  • Removed "comparator" on Collections
    Time saved: ~50ms
    We had a comparator function defined on some of our collections, which is a function that Backbone optionally looks for to create a default sort order for the models in a collection. As it turns out, we weren't ever relying on that default sort order, because we always use utility functions to sort our collections in different orders depending on the view and user interactions. I removed the comparators on the collections, and from now on, I will recommend that we never use them and always use explicit sorting functions.
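    A sketch of the explicit-sort approach: instead of a collection-level comparator that Backbone runs on every insert, each view calls a sorting helper only when it needs a particular order. sortByName is a hypothetical helper; a real one might live in a shared utils module.

```javascript
// Sort courses alphabetically by name, without mutating the input -
// called on demand by a view, not on every collection insert.
function sortByName(courses) {
  return courses.slice().sort(function (a, b) {
    return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
  });
}
```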


Step 5. Profit!

After all those changes, my final Backbonify time went down to 90-110ms. If my measurements are as highly scientific as I hope they are, that means I shaved off ~700ms in JS execution time. Those were all locally measured times, but the savings are similar in production.

Here's what the profile view looks like now:

A part of me is tempted to put an artificial delay back into the catalog, and do an A/B test to see if the performance changes actually do affect signups. But another part of me thinks that's a horrible thing to do to users. Maybe one day.

The Chrome profiling tools are a bit intimidating at first, but I encourage you to open them up, do some poking around, and see what your app's up to. I sure learnt a lot about ours — and I know I'll keep learning. There's still so much to improve!

Tuesday, June 4, 2013

Google I/O Talk: Online Learning Made Social

Over the last six months, we've been working with the Google+ Hangouts team to experiment with using hangouts to add another dimension of communication and social to the Coursera student experience. At I/O last month, I got the opportunity to share those experiments and talk about what we've learnt from them. It wasn't a very technical talk at all, as we've been learning more about humans than machines, but that made it a fun talk to give, too. I look forward to doing more experiments with hangouts and other forms of communication in the future. Stay tuned!

You can watch the talk on Youtube:

You can also follow along with my half of the slides below (I don't yet have the combined deck):