Sunday, March 31, 2013

Rewriting our Forums with Backbone

When we rolled out the redesign of the Coursera class platform back in January, I put up a prominent message asking for feedback, and as expected, we got a lot of it. Much of it was about the forums, where we had improved the aesthetics but neglected the core usability. We heard things like:

  • "I want to be able to link to a post."
  • "I can only edit comments, I can't edit posts."
  • "Whenever I do anything, the whole page reloads and I lose my place."

When I started to tackle the problems, I was faced with our legacy codebase: spaghetti code in which PHP output HTML that was then manipulated in JavaScript. Some of the actions were done via API-like calls and some were done via form POSTs and server redirects. There was no consistent architecture, and that made it hard for me to make changes that felt like they should be minor improvements. There was also an increasing amount of JavaScript, but it wasn't written in a clean way, and it worried me every time I added to it.

So I did the thing that everyone tells you not to do: rewrite the codebase. I knew it would be a lot of work, and that I risked introducing regressions, but I decided it would be worth it if it enabled us to iterate faster and innovate more in the future.


The Process

I started by turning the backend into a RESTful API, with logically organized, object-oriented classes representing the routes and the database models. Once I had enough of an API to give me thread data, I started on the frontend, a single-page Backbone web app following the style of our other Backbone apps (which I've spoken about in the past). From there, I just kept iterating, building back the features that we had before and figuring out the best way to approach our top usability bugs.

At a certain point, I realized that I couldn't handle this rewrite myself (a hard thing for me to admit, since I may be a bit of a cowboy coder), and I enlisted the help of my colleague Jacob (and his expertise as an avid Reddit moderator and user).

Once we had it 80% done, I started writing tests for the frontend. When we were 95% done, we enabled it via a per-class feature flag for our Community TAs class, and spent a week addressing feedback from the TAs and from our QA team. Then we started enabling it on classes, and after addressing the biggest concern from students (lack of Markdown support in the editor), we've enabled it for all our classes. From start to end, the rewrite took us about 6 weeks - three times as long as I hoped. One day I'll learn that most things take 3x as long as I expect them to. ☺


The Database

Since I wanted to be able to introduce the new forums in old classes - and also because I wanted to scope my rewrite down - I decided to stick with the same database design and model relations.

We use MySQL (in an effectively sharded way because each class has its own database), and this is my not so technical diagram of what our tables look like for forums:

A big thing to note is that each of our threads is always related to a forum, and that we do not have infinite nesting of comments like Disqus or Reddit. Instead, we have top-level posts, each of which can have associated comments. We may change this in a future rewrite to allow arbitrary levels of comment nesting, but for now, the post/comment relation is ingrained into our database design.


The Backend

Our class platform is currently written in PHP, and much of it is custom libraries, but, hey, if you're interested, here's how the new forums backend works:

  • We model the data with the PHP Active Record library, and use class functions and static functions to capture model-specific functionality. We use the Active Record functions as much as possible, but sometimes use our own SQL query system (like for INSERT IGNORE, which Active Record doesn't handle).
  • We have a simple Rest_Router class which can recognize registered routes and pass the requests to the appropriate class for processing.
  • We have a routes.php file which lists all of the forum API related routes.
  • We have a file of classes that extend our Rest_Controller class and handle the routes, defining get/patch/post/delete as needed. (We prefer patch instead of put, since partial updates are easy via Backbone and preferable.)

For example, this entry in the routes file is for deleting a user's subscription to a thread:

$router->delete('threads/:thread_id/subscriptions', 'Subscriptions#delete');

This class in the controller file handles that URL:

class Subscriptions extends \Rest_Controller {
    
  public function delete($params) {
    $response = new \Rest_Http_Response();
    try {
      $request_body = $this->get_request_body();
      $data = json_decode($request_body, true);
      $data['thread_id'] = $params['thread_id'];
      $data['user_id']   = _current_user('id');
      $subscription_data = \Forum\Thread_Subscription::delete_subscription($data);
      $response->set_json_body(json_encode($subscription_data));
    } catch (\Exception $e) {
      return $this->error_request($e);
    }
    return $response;
  }
}

And this is the Active Record model that is called:

class Thread_Subscription extends \ActiveRecord\Model {
    
  static $table_name = 'thread_subscriptions';
    
  public static function delete_subscription($data) {
    $subscription = self::get_for_user($data['thread_id'], $data['user_id']);
    $subscription->delete();
    return null;
  }
}
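
To make the routing step concrete, here is a rough sketch of how a route pattern like 'threads/:thread_id/subscriptions' can be matched and its parameters pulled out. It is written in JavaScript rather than PHP and is not the actual Rest_Router; compileRoute, createRouter, and dispatch are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch of pattern-based route matching; not the real Rest_Router.

// Turn 'threads/:thread_id/subscriptions' into a regex plus a list of param names.
function compileRoute(pattern) {
  const names = [];
  const source = pattern
    .split('/')
    .map(function (segment) {
      if (segment.charAt(0) === ':') {
        names.push(segment.slice(1));
        return '([^/]+)';
      }
      // Escape regex metacharacters in literal segments.
      return segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    })
    .join('/');
  return { regex: new RegExp('^' + source + '$'), names: names };
}

function createRouter() {
  const routes = [];
  return {
    // Register a handler for an HTTP method and a path pattern.
    add: function (method, pattern, handler) {
      routes.push({ method: method, compiled: compileRoute(pattern), handler: handler });
    },
    // Find the first matching route and call its handler with the extracted params.
    dispatch: function (method, path) {
      for (const route of routes) {
        if (route.method !== method) continue;
        const match = route.compiled.regex.exec(path);
        if (!match) continue;
        const params = {};
        route.compiled.names.forEach(function (name, i) {
          params[name] = match[i + 1];
        });
        return route.handler(params);
      }
      return null; // no route matched
    }
  };
}

const router = createRouter();
router.add('DELETE', 'threads/:thread_id/subscriptions', function (params) {
  return 'delete subscription for thread ' + params.thread_id;
});
```

With this sketch, router.dispatch('DELETE', 'threads/2703/subscriptions') calls the handler with thread_id set to '2703', while a GET to the same path falls through and returns null.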

The Frontend

We're a bit of a Backbone shop at Coursera now. We're not absolutely in love with it, but we've built up a lot of internal knowledge and best practices around it, so it makes sense for us to build our new apps in Backbone to keep our approach consistent. However, we do like to experiment in each app with different ways of using Backbone - like using Backbone-stickit for data binding in our most recent app. Sometimes those ways stick and become part of our best practices, and sometimes they fade away into oblivion.

Saying all that, here's a breakdown of how the forum Backbone app works. It's not perfect, but hey, it's a start.


The "Routes"

Most Backbone single-page web apps start with a routes file that maps URLs to views, and Backbone looks at the URL to figure out what view function to kick off. In this case, however, I wanted to code it so I could easily embed a forum thread on any page, regardless of URL. I want widgets, not routes.

To accomplish widget-like functionality, I wrote it so that the main JS file for the forum app looks for DIVs on the page with particular data attributes and replaces them with the relevant view. For example, here's our code for loading in a thread widget:

$('[data-forum-thread]').each(function() {
  var threadId = Number($(this).attr('data-thread-id'));
  var thread = new ThreadModel({id: threadId});
  new ThreadView(_.extend(opt, {
    el: $(this)[0], 
    model: thread,
    mode: threadMode
  }));
});
  

The Views

All of our views use Jade templates for HTML generation and separate Stylus files for CSS. Many of them listen to "change" or "sync" events on their respective models and then check the hash returned by changedAttributes() to see if they care about which attribute changed. That minimizes the amount of re-rendering that has to happen.
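
As a rough illustration of that check (not our actual view code), the decision can be boiled down to a small helper. The name shouldRender and the watched keys 'title' and 'subscribed' are made up for this sketch; the only Backbone behavior it relies on is that model.changedAttributes() returns a hash of changed attributes, or false when nothing changed.

```javascript
// Hypothetical helper: decide whether a view should re-render.
// `changed` is what Backbone's model.changedAttributes() returns:
// a hash of changed attributes, or false when nothing changed.
function shouldRender(changed, watchedKeys) {
  if (!changed) return false;
  return watchedKeys.some(function (key) {
    return Object.prototype.hasOwnProperty.call(changed, key);
  });
}

// Inside a Backbone view, this might be wired up roughly like so
// (the watched keys are illustrative, not real attribute names):
//
//   this.listenTo(this.model, 'change', function () {
//     if (shouldRender(this.model.changedAttributes(), ['title', 'subscribed'])) {
//       this.render();
//     }
//   });

// A header view that only cares about the title ignores a vote-count change.
console.log(shouldRender({ votes: 12 }, ['title', 'subscribed']));    // false
console.log(shouldRender({ title: 'New' }, ['title', 'subscribed'])); // true
console.log(shouldRender(false, ['title']));                          // false
```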

As an example, let's walk through ThreadView and its nested views. First, a diagram:

  • ThreadView is responsible for handling infinite loading and scrolling to permalinks. It defers all other rendering and event handling to its nested views, each of which knows to re-render itself only when relevant properties of the thread change:
    • ThreadHeaderView: manages the title, subscription and thread admin controls.
    • ThreadTagsView: shows tags and handles adding tags.
    • PostContainerView: creates containers for each post using PostView and each comment using CommentView.
      • PostView and CommentView both extend EntryView with no modification. The slight differences between them are handled with if checks inside EntryView (e.g., only a post can be pinned, not a comment).
      • EntryView handles rendering an entry in view mode with its admin controls and voting controls, and it knows how to render an edit mode when the user wants it.

The Models

Most of our models use Backbone-relational, an extension of Backbone that knows how to take a JSON object and turn its keys into related Collections of Models. Our models also use our custom API wrapper, which takes care of CSRF tokens and displaying AJAX loading messages at the top of the page.

For example, let's look at ThreadModel and its related models. First, a diagram:

  • ThreadModel extends Backbone.RelationalModel, turning its "posts" and "comments" keys into PostCollection and CommentCollection, respectively. It is responsible for fetching thread JSON from the server and for figuring out how to fetch previous/next pages of the JSON. We debated how best to do this, and settled on always passing down a "post skeleton" where each post has an "id" and "order", and then we track which parts of the skeleton we've filled in (based on "post_text" existing), and fill in above/below. ThreadModel also must keep track of which user IDs it's seen on posts, and it fetches user profiles for any new user IDs from our main user database.
    • PostModel and CommentModel both extend EntryModel, and they differ only by their url (as the APIs distinguish between post and comment). PostCollection and CommentCollection are just collections of those models.
    • EntryModel extends Backbone.RelationalModel and is used for saving individual posts and comments - creating new ones and editing existing ones. EntryModel is never used for fetching JSON, because we always fetch on the Thread level, but it theoretically could be if we wanted a standalone entry view one day.
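
The "post skeleton" bookkeeping described above can be sketched roughly as follows. This is an illustration under assumptions, not the real ThreadModel: the fields "id", "order", and "post_text" come from the description, but isFilled, idsToFetch, the direction values, and pageSize are hypothetical, and how the server is actually asked for the missing posts is not shown.

```javascript
// Hypothetical sketch of "post skeleton" paging; names and pageSize are illustrative.

// A post counts as filled in once its post_text has arrived from the server.
function isFilled(post) {
  return typeof post.post_text === 'string';
}

// Given the skeleton, return the ids of the next unfilled posts
// directly above or below the block that is already loaded.
function idsToFetch(skeleton, direction, pageSize) {
  const sorted = skeleton.slice().sort(function (a, b) { return a.order - b.order; });
  const filledIndexes = sorted
    .map(function (post, i) { return isFilled(post) ? i : -1; })
    .filter(function (i) { return i !== -1; });

  if (filledIndexes.length === 0) {
    // Nothing loaded yet: start from the top of the thread.
    return sorted.slice(0, pageSize).map(function (p) { return p.id; });
  }

  const first = filledIndexes[0];
  const last = filledIndexes[filledIndexes.length - 1];
  const slice = direction === 'above'
    ? sorted.slice(Math.max(0, first - pageSize), first)
    : sorted.slice(last + 1, last + 1 + pageSize);
  return slice.map(function (p) { return p.id; });
}

// Example: a permalink landed us on post 30, so only it is filled in.
const skeleton = [
  { id: 10, order: 1 },
  { id: 20, order: 2 },
  { id: 30, order: 3, post_text: 'Loaded via permalink' },
  { id: 40, order: 4 },
  { id: 50, order: 5 }
];

console.log(idsToFetch(skeleton, 'above', 1)); // ids: 20
console.log(idsToFetch(skeleton, 'below', 2)); // ids: 40, 50
```

Keeping the skeleton separate from the post bodies means the client always knows how many posts exist and where the loaded block sits, so "fill in above" and "fill in below" are just slices around the filled range.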

The Tests

We can't reasonably write so much logic in our JavaScript without also writing tests to verify that our logic is sound. We write our tests using the Mocha test framework and the Chai assertion library. We use Sinon to mock out our API responses with local test JSON. When we want to test our views, we use JSDOM to render a fake DOM and react to fake events, and then we can test that the resulting DOM looks like what we expect. JSDOM does not do everything the browser DOM does (notably, it's missing contentEditable support), but it does an awful lot and is much faster than spinning up an actual browser.

For example, here's a snippet of a test for checking that save works as expected:

it('should save new post and render in view mode', function() {
  postView.render();

  server.respondWith("POST", getPath('/api/forum/threads/2703/posts'), 
    [200, {"Content-Type":"application/json;charset=utf-8"}, JSON.stringify(postJSON)]);

  postView.$('.course-forum-post-edit-link').click();
  postView.$('button.course-forum-post-edit-save').click();

  chai.expect(postView.$('.course-forum-post-edit-save').attr('disabled'))
    .to.be.equal('disabled');
  server.respond();
  chai.expect(postView.$('.course-forum-post-text').text())
    .to.be.equal(postJSON.post_text);
});

The Result

You can try out our forums by enrolling in a class and participating, and if you're really curious to learn more about the frontend Backbone code, you can browse our source snapshot here. We accomplished what I set out to do: fix our major issues without feeling like I was hacking the code horribly, and make it easy to add new features on top of the codebase and to test them. I imagine we have a forum rewrite v3 in our future, for example if we decide to do truly real-time forums or infinitely nested comments, but I hope that we will be able to re-use much of what we developed for this version. Was the rewrite worth it? I think so, but ask me again next year. ☺
