Friday, May 3, 2013

Managing history in Backbone widgets with jQuery BBQ

In my last blog post, I talked about the two different Backbone architectures we're experimenting with at Coursera: 1) single page web apps, where Backbone takes care of serving particular views for given URLs, and 2) JS widgets, where we write DIVs with particular data attributes into our HTML, and a JS module finds all of them on the page and turns them into Backbone views.


The Widget Approach

We are using this approach for our discussion forums. We have widgets for displaying lists of threads, rendering a single thread, displaying an entire forum with multiple thread lists inside it, and more. We can potentially mash up a few widgets on the same page, if we want, because they each encapsulate all of their functionality inside them.

For example, here's the bit of code that creates our forum widget:

$('[data-coursera-forum-widget]').each(function() {
  var forumId = $(this).attr('data-forum-id');

  var forum = new ForumModel({
    id: forumId
  });
  new ForumView({
    el: $(this)[0],
    model: forum
  });
});

The Problem

There was one problem with this approach, however: users kept losing their state in the widgets. For example, when a TA was paging through a long list of threads, and then they clicked away to visit one and came back, the widget would forget that it was on that page and they'd have to start from the beginning. When the forums were originally written, in the classical Web 1.0 architecture, the state was always stored in the query parameters of the URL, but now, since we are in JS-land and no longer need to change the URL to change the content of the page, we lost the URL-managed state.

That meant that we lost the ability for users to use the back button through the states of one widget, to go forward to a completely different page and back to the previous state, and to bookmark the state. Once I got enough reports from users who missed those abilities (mostly from our super users, who consider the forums to be their inbox), I realized it was time to take on this problem.


Possible Solutions

There were a few solutions that I considered and talked through with my colleagues:

  • Remember their previous state in cookies or localStorage, and always restore it from there. That wouldn't easily get me back button support, however, and it may have been odd for the user to open the forum in a new tab and see that same state.
  • Open all the links in a new tab, so that they never left the page and lost their state. Yes, I admit, this was a non-ideal solution, and I did try it for a few hours but I quickly remembered that I should not be the one deciding that the user wants those links in a new tab. Also, that wouldn't solve the problem of back button through widget state.
  • Move to the single page web app approach, and let the Backbone router manage the history. That would mean losing the mashability of my widgets, the ability to put any combination of widgets on the same page, and I'm not ready to give that up.
  • Use a JS library to store each widget's history in the URL hash, and use the hashchange event (and fallback implementations) to support the back button.

As you can guess from the title of the post, I went with the final approach. It's the only one that solved all of the users' problems and let me keep my widget approach - plus it's a tried and true technique.


jQuery BBQ + Backbone

There are a few libraries out there that manage history via the hash, from very simple (js-hash) to a sophisticated polyfill approach (Hasher). The one that I was most familiar with was jQuery BBQ, and it's also the one that did everything I wanted and not too much more. Plus, its docs described exactly our scenario:

<Widget> Yo, hash, update my state parameters.
<Hash> No prob, dude, done. And you didn’t even have to know about that other widget’s parameter, I just merged them in there for you.
<Widget> There’s another widget?
<Widget2> Huh? Did someone say my name?

jQuery BBQ has a straightforward API - you can pushState an object that is merged into the current hash values, you can getState on a certain key, and you can listen to the hashchange event. To make it easy for multiple Backbone views to manage their state independently of each other, I wrote a WidgetView class with functions that can set and get state scoped to just the widget, and can trigger an event on the view whenever the widget's state changes in the hash. Here's what that class looks like:

Once my Backbone view extends that class, it can check the initial state when the view is loaded, it can set the state when the user clicks around (like on the sort or page controls), and it can listen to the state changed event to decide how to change the UI.
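To make the scoping idea concrete, here's a minimal sketch in plain JavaScript, with no jQuery BBQ or Backbone involved. It is not the actual WidgetView class: HashStore, WidgetState, and the "widgetId-key" naming scheme are all hypothetical names I'm assuming for illustration, and the store holds a hash string in memory instead of touching the real URL.

```javascript
// Parse a hash string like "#a=1&b=2" into an object {a: '1', b: '2'}.
function parseHash(hash) {
  var state = {};
  hash.replace(/^#/, '').split('&').forEach(function (pair) {
    if (!pair) return;
    var parts = pair.split('=');
    state[decodeURIComponent(parts[0])] =
      decodeURIComponent(parts.slice(1).join('='));
  });
  return state;
}

// Turn an object back into a hash string.
function serializeHash(state) {
  return '#' + Object.keys(state).map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(state[key]);
  }).join('&');
}

// Hypothetical stand-in for the URL hash: pushState merges new parameters
// into the existing ones, so widgets don't clobber each other's state.
function HashStore(initialHash) {
  this.hash = initialHash || '#';
  this.listeners = [];
}
HashStore.prototype.pushState = function (params) {
  var state = parseHash(this.hash);
  Object.keys(params).forEach(function (key) { state[key] = params[key]; });
  this.hash = serializeHash(state);
  this.listeners.forEach(function (listener) { listener(); });
};
HashStore.prototype.onChange = function (listener) {
  this.listeners.push(listener);
};

// Hypothetical per-widget wrapper: every key is prefixed with the widget's id.
function WidgetState(store, widgetId) {
  this.store = store;
  this.prefix = widgetId + '-';
}
WidgetState.prototype.set = function (key, value) {
  var params = {};
  params[this.prefix + key] = String(value);
  this.store.pushState(params);
};
WidgetState.prototype.get = function (key) {
  return parseHash(this.store.hash)[this.prefix + key];
};
// Collect only the parameters that belong to this widget.
WidgetState.prototype.snapshot = function () {
  var prefix = this.prefix;
  var all = parseHash(this.store.hash);
  var mine = {};
  Object.keys(all).forEach(function (key) {
    if (key.indexOf(prefix) === 0) mine[key.slice(prefix.length)] = all[key];
  });
  return mine;
};
// Call back only when this widget's own parameters changed.
WidgetState.prototype.onChange = function (callback) {
  var self = this;
  var last = JSON.stringify(self.snapshot());
  self.store.onChange(function () {
    var current = JSON.stringify(self.snapshot());
    if (current !== last) {
      last = current;
      callback(self.snapshot());
    }
  });
};
```

With something like this, a threads widget can call set('page', 3) while a forum widget calls set('sort', 'recent'), and each one only hears about changes to its own keys.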

Here's a slimmed down version of the ThreadsView that demonstrates that:



We'll see how this works out once we build out more widgets, but so far, it seems to be working well. Let me know in the comments what approach you've taken.

Thursday, May 2, 2013

Server-side HTML vs. JS Widgets vs. Single-Page Web Apps

At the recent GOTO Chicago conference, I gave a talk on "Frontend Architectures: from the prehistoric to the Post-modern." In just my first 10 months at Coursera, I've experienced the joys and woes of many different frontend architectures, and I wanted to share what I learnt. I detail everything in the slides, but I'll summarize my thoughts here as well.

How do you pick an architecture?

We could make our decision by just checking Twitter and seeing what all the cool kids are talking about, but uh, let's pretend that we're more scientific about it than that, and figure out what's important to us as web developers.

To start off with, here are a few things that users care a lot about:

  • Usability: they can do what they want to do, quickly and intuitively
  • Linkability: they can bookmark and link to parts of the site
  • Searchability/Shareability: they can share what they're using on social networks, and find it on Google (this is important for user growth)
  • More features, less bugs: 'nuff said.

Of course, we also should think about what we as developers care about:

  • Developer Productivity: we want to be able to iterate fast, try new things, and implement features in a way that makes us feel good about our code
  • Testability: we want confidence that our code won't break, and that we won't spend most of our time maintaining a decaying, fragile product
  • Performance: we want our servers to respond quickly and to serve a minimum number of requests, and we don't want our users to wait unnecessarily.

And when developers are happier, we can more quickly and safely improve the product, and that makes users happy.

Let's compare a few architectures...

So that's what I looked at when I reviewed our different architectures, and as it turns out, there are definite differences between the architectures in those respects.


Server-side HTML ("Web 1.0")


In much of our legacy codebase on class.coursera.org (where instructors create their courses with a combination of lectures, quizzes, assignments, wikis, and forums), we use PHP to output HTML, and it handles much of the interaction via page reloads. A bit of JS is sprinkled (in a not so pretty manner) throughout.

This architecture suffers the most in terms of usability - it's very hard for users to do many interactions in a small amount of time - but it does have definite benefits of easy linkability, shareability, and searchability.

As a developer though, I hate it. All of our data is entangled inside the HTML, and when we want to bring JS into the HTML, it quickly becomes an untestable mess. Plus, it ties us to a particular backend language, and if we want to change languages (which we're doing, to Scala/Play), we have to rewrite the presentation layer. Yes, it's possible to use this approach in a more elegant way, and try to use templates that are portable across languages, but it's just not likely.


JS widgets


For the class.coursera.org forums, we are starting to take a different approach: JS widgets. We write DIVs into the HTML with certain attributes (like data-coursera-forum-threads-widget) and then we include a widgets.js that turns those DIVs into Backbone views.

With this approach, we can create very dynamic interfaces with real-time updates, we can have decent linkability if we have server-side URLs and hash state, but we can't easily make the content in the widgets shareable and searchable, since bots don't understand the JS. However, that's not a big concern for us there, since much of the content is behind a login wall anyways.

From a developer perspective, we typically create APIs for the JS widgets to consume, and well-tested APIs mean that we can more easily create new frontends for the same data. After I ported our forums to the widgets approach, I was able to make new widgets for the same data in just a few hours, since I could re-use the same API. On the flip side, I have to spend more time writing tests for the frontend, since the user can change the state via sequences of interactions and many bugs may not surface until after a particular interaction.


Single-page web apps


On www.coursera.org, we serve the same HTML file for every URL, and that HTML file contains only require.js and a call to load a routes.js file. That routes file maps URLs to Backbone views, and we rely on the Backbone Router to figure out what view to load up and to manage history using the HTML5 history API (with the fallback hash technique in older browsers).
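As a rough illustration of what "mapping URLs to views" means, here's a small sketch in plain JavaScript. It doesn't use Backbone.Router, and the URL patterns and view names are made up for the example, not taken from our routes.js.

```javascript
// Hypothetical route table: URL patterns mapped to view names.
var routes = [
  { pattern: /^\/?$/, view: 'HomeView' },
  { pattern: /^\/courses\/?$/, view: 'CourseListView' },
  { pattern: /^\/course\/([\w-]+)\/?$/, view: 'CourseView' }
];

// Return the first matching view name plus any captured URL parameters,
// or null when no route matches.
function resolveRoute(path) {
  for (var i = 0; i < routes.length; i++) {
    var match = routes[i].pattern.exec(path);
    if (match) {
      return { view: routes[i].view, params: match.slice(1) };
    }
  }
  return null;
}
```

A real router also has to listen for URL changes and tear down the previous view, which is where most of the complexity lives.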

We get many of the same benefits and drawbacks as JS widgets with single page web apps, but there are a few key differences. We can have a potentially faster user experience because of the complete lack of true page reloads, but it's harder to do simple things like internal links (you'll end up with a double hash on older browsers), or listen to a window.onunload event. We would suffer from bad searchability here, but we decided that it's really important for users to be able to find and share the course descriptions, so we wrote a "Just in time renderer" that uses Selenium to render the HTML and serve that to bots instead.

On the developer side, it's much trickier to test, because our webapp now has state *across* routes, and our tests have to check different sequences of routes.


So which one's the best?

Trick question! None of them have it all, at least not yet. I would argue that if you're doing anything highly interactive (like an admin interface), you really want to take a heavy-JS approach, and also I'll point out that we're still in the early days of JS-heavy architectures. We will get better as developers at tackling problems like testability, and bots and browsers will likely get better at problems like searchability and performance. So, yes, it's a bit of a risk to go as far as the single page webapp approach today, but it will be less and less of a risk as time goes on.

In terms of JS widgets vs Single-page web apps, I think they each have their places. On class.coursera.org, I hope that we can give instructors increasing amounts of control and flexibility in how they put together their classes, and that's why I'm building out our functionality as widgets that can be combined together however they please. On www.coursera.org, where students browse classes, we are the only ones designing that experience, and we know what we want each page to be, so it makes more sense to go the single page web app approach. The only difference is in the routes vs. the widgets file (mapping URLs vs. transforming DIVs), so we can easily create Views and Models that we use in both apps and widgets.

For my full run-down of the good and bad bits about each approach, flip through the slides and click on the images in the slides to see more code, screenshots, or videos.

What's your approach? What do you like or dislike about it? Let me know in the comments!

Tuesday, April 16, 2013

Attracting women to developer events

GirlDevelopIt SF is now 1,500 members strong, and all but a handful of them are women interested in learning to program, make websites, and generally become more technically literate. Because of my involvement in GDI and likely also because I'm a fairly visible "woman in tech", I often get approached for my thoughts on how other events can attract more women. As a general rule, I prefer to avoid talking about the women-in-tech-thing and instead spend my time doing stuff about it, but well, every once in a while, I decide to share some thoughts. Please note that these are only my thoughts, and do not necessarily reflect those of every woman everywhere.

Given that, here are some ideas on how you can lower the barrier for women to come to your developer events:


Offer a low cost option

There are numerous reasons why you might consider lowering the financial barrier for women attendees:

  • Women don't traditionally earn as much as men. Studies show that women in the same roles as men often don't earn as much, for a myriad of reasons.
  • Many women are up-and-coming developers, so they're not earning the developer big bucks yet. There is not a high percentage of women in computer science degrees at college, but lately, at least around these parts, there's a bit of an everyone-learn-to-code revolution and women are looking around and deciding that maybe that is what they want to pursue. This is probably my distorted opinion from running GirlDevelopIt, but basically, the majority of women developers I know are in the learning stages or the junior engineer stage, so I typically don't recommend expensive conferences to them.
  • Women may not feel as comfortable at the conference as the men, so they have less financial incentive to attend. We pay money for things that we think we will get a high value out of, and no pain. When I think about paying for a conference, I think about the content but also about the atmosphere. Am I up for doing the whole only-woman-there thing? Do I feel like hanging out with a bunch of beer-drinking dudes? Sometimes I am, but sometimes I'm not, especially if the conference costs money. Like everyone, I hate spending money on something that wasn't as good of an experience as I'd hoped. I hate wasting time, too, but somehow that doesn't feel as bad. (And if I went to a conference for free and didn't like it, I could just leave, which I admittedly have done.)

How do you actually lower the financial barrier? There are a few ways:

  • Offer a scholarship that women can apply for. You can offer this out of your own event funds or partner with someone to be their official sponsor. Put together an application form that finds out what their need is like and how interested they truly are in the event. Organize a breakfast or lunch for scholarship recipients, so that they feel an obligation to show up and get an excuse to meet other people (meeting people at events is hard, especially if you're not a chronic-event-attender).
  • Offer free or discounted tickets for women groups to give out. You can find local groups like WomenWhoCode or GDI, contact their organizers, explain sincerely why you think the conference would be of interest (don't send a damn form letter) and offer them discounted tickets.
  • Make your event free for everyone. This is an option, but it will effectively lower the barrier for *all attendees*, so if you do this, you need to go through effort to target women specifically, way more than men. Google recently put on DevFestW, and they did outreach through all of the local women groups and got up to 60% RSVPs of their cap before they advertised it to the general public. If they'd advertised it to the general public at first, it would have likely filled out with mostly men, just for statistical reasons.

Increase the value proposition

I have an anecdote-based theory that to some attendees, a conference is attractive merely because it is an excuse to drink and socialize with like minded folks. Well, I personally don't get attracted to conferences for that, because I don't always like drinking (especially with strangers who I have yet to develop trust in), and because I don't always know that I'll find like minded folks there. When I go to a conference, it's because I'm excited about what I will learn at the conference, what speakers I will get to meet (like the authors of tools I'm using), and hey, if it offers kayaking on the side like JSConf is this year, I'll get excited about that too.

Given that the value proposition of a conference may matter more to women (or at least me), here is what I suggest:

  • Publish everything you know about the speakers and topics covered as soon as possible. Some conferences are well known enough that they attract attendees on their name alone, but that means that you are only attracting the attendees that already know the conference and have confidence in its value to them. If you want to attract women, most of them will be new attendees (given the low % of women attendees currently), and they may want more information than the brand.
  • If you do not have all the information published when ticket sales start, save some tickets. I often wait until I see a conference agenda and find out who else is going before I decide to attend it, but for more popular conferences, that sometimes means the conference is sold out by the time the value proposition is clear. If you are starting ticket sales before the full value is clear, then you might consider saving tickets that you can announce at later times, or offer through other channels.
  • Encourage previous attendees to recommend it to their (women) friends. Just like we trust restaurant recommendations from our friends more than other sources, I'm more likely to think that a conference will be beneficial if a friend tells me so.

Lower the intimidation factor

A conference can be a scary place. Hundreds of strangers milling around, an expectation of networking, content that might go too far over your head. For a woman, it can be even scarier because we stand out (so we do not have the option to blend in), and because we may suffer from imposter syndrome.

Here are a few ideas on improving that:

  • Clearly message the intended level of the conference, or make it beginner friendly. This reduces the fear of mismatched content level. For much more detailed thoughts on making an event newbie friendly, see this blog post.
  • Organize groups of women attendees. New things are less scary when we have a support crew, and that's something that you could encourage to make your event less scary. You could reach out to women groups and encourage them to make a Meetup out of it (and maybe they'd meet up for breakfast at the conference), or you could offer discounted rates to 2+ women attending together. You could also incorporate social networking into your ticketing process to see if that makes it easier for attendees to find other potential attendees.

Wrapping up

To summarize all of the above, I'd say: It's harder for women to know which events they will gain the most pleasure/experience the least pain, and they will tend to attend those with the clearest value proposition, lowest cost/commitment, and lowest intimidation factor.

Once again, I want to reiterate: these suggestions are based on my experiences, and do not necessarily reflect the views of other women. I know that some of my suggestions are along the lines of Affirmative Action, and there is much debate along those lines. I do not want to inspire heated debate, I just want to put my food for thought out there, and hopefully it can be helpful to some of you. I'd love to hear in the comments what has worked or not worked for your own events. Thanks!

Saturday, April 13, 2013

Making newbie-friendly developer events

GirlDevelopIt is all about welcoming and teaching newbies. Most of our students are completely new to web development, and they come to us because we try our best to provide a newbie-welcoming environment and get them over that newbie hump. Some of our students stop at the intros, others of them continue on to turn into 24/7 developers.

Since we have a membership of 1,500 women that are at least marginally interested in development, we often get approached by other event organizers that want to get more women at their events, and are hoping we can give them advice and advertise their event to our members.

My first response to them is always: "Is it newbie friendly?"

You see, most events are not newbie friendly, or at least not marketed that way. Many of them are actually the exact opposite. For example, you might see a hackathon that says "Are you a coding ninja? Compete to see who can hack the most in a weekend." I'm sorry, but you're not going to get newbies foaming at the mouth to come to that event. Nobody wants to attend an event where they don't feel wanted.

So, here's the first question you should ask yourself: "Do you want to be newbie friendly?"
Maybe you only want to attract experienced developers. If that's the crowd you're going for, then that's totally fine, just be aware that you've made that decision.

If you actually want to attract more beginner level developers, though, you will have to do a bit more work to do it well. Here's what I'd recommend:

  • Provide beginner level content at your event. Either it should all be beginner level, or there should be continuous parallel tracks for the different levels. If only one small part of your event is at a beginner level but the rest isn't, then you will likely not get many beginners, or you'll get beginners that feel overwhelmed most of the time.
  • If your event itself doesn't have a beginner track, then offer events in the weeks leading up that will cover the prerequisite knowledge. For example, when I put on a 3-day Google APIs hackathon in college, I realized that my computer science classmates were effectively newbies in web development, and we organized a 2-week series of workshops before the hackathon to get them up to speed. For another example, when we wanted to get a lot of GDI members at the Everyone Hacks event but we realized that many of them were new to hackathons, Adria Richards gave a great workshop on "How to Rock your First Hackathon" to answer their doubts and build up their confidence.
  • Make it very clear in your marketing material what the prerequisites are, and be as specific as possible. Even when we list prerequisites for our GDI workshops, we get questions from students who still aren't sure if it's at their level. Beginner students are inherently not experts, so it won't be as obvious to them as it is to you what their level is and whether it's appropriate.
  • If your event targets multiple levels of expertise, make that clear, and maybe give attendees the option to specify their level. For example, if you're listing ticket types for a hackathon and one of them is "Super Hacker", then you should also have a ticket type for "First time Hacker" (like in this sign up). You might say elsewhere that it's okay to be beginner level, but damn if I'm going to identify myself as a "Super Hacker". Beginners are easily intimidated. (Well, we all are, actually.)

Here are some ideas specifically about hackathons. I really think hackathons can be a fantastic experience, which is why I encourage our members to attend them, but I also think that they can be the most intimidating, since there are so many opportunities for beginners to feel bad for being a beginner.

  • Sign up people ahead of time to be designated coaches for newbie teams. In their pitches, the coaches can say "And I'm looking to mentor a team of beginners, so if that's you, join me!" Liz Howard did that at our Everyone Hacks event, and I think it relieved a lot of beginners who were worried about being a drain on an experienced team. When we had our pre-workshop on "How to Rock your first Hackathon", the question that we got over and over was "how will we find a team to join?", so it's worth it to spend time figuring out how team formation will work at your hackathon. Consider the case of strangers, beginners, shy folks, etc, and find something that will work for all of them.
  • Sign up mentors to wander around and help anyone that looks lost. A team coach can't always do it all, and it can be exhausting to mentor 24/7. At Everyone Hacks, my team decided to use Ruby on Rails, which I'm not familiar with, but luckily a Rails expert floated around and helped our team get it all up and running.
  • Make the hackathon less about competition and more about collaboration. Maybe that means making prizes for best team spirit or best idea, and maybe that means massaging the messaging in the marketing. The best thing about a hackathon is the people that you meet, anyway, so it doesn't hurt to put more emphasis on that aspect.

I'm not professing to be the world's expert on this, of course. This is just what I've observed during my experiences in GirlDevelopIt and the many developer events I attend. So, what do you do in your events to make them newbie friendly?

Wednesday, April 10, 2013

Outputting iCal with PHP

I'm a big Google Calendar user. If I don't have it on my calendar, then it's probably *not* going to happen. If I'm trying to schedule something into my week, then I'm always consulting my calendar to see how it fits in with everything else, or if it's making my week too busy. And, hey, I'm pretty sure I'm not the only GCal addict out there. (Oh, and before GCal, I was totally a Yahoo! Calendar user. Retro!)

So, when I first joined Coursera, I brought with me a list of ways I wanted to improve the student experience, and one of those was "Create a Google calendar of deadlines."

I was hoping this would be an easy thing, something I'd do in my first month. Of course, I didn't realize then that our legacy codebase was a tangle of PHP, that it was split across 5 git repositories, and that it was largely untested. So I repressed my dreams and worked on improving our architecture so that features like that *would* be an easy thing.

Well, as I just announced on the Coursera blog, I finally got to a place where I could write and test the feature, and we've started surfacing it on our classes.

I still had to write it in our legacy PHP codebase, but I don't actually mind PHP when it's written relatively cleanly and testably. I found the hardest part was figuring out exactly how to format my ICS files, and I spent a while going back and forth between this handy iCal Validator and the rather boring iCalendar specification.

I started by writing 2 general classes - CalendarEvent for generating VEVENTs, and Calendar for generating VCALs. Here's the most important function of the CalendarEvent class, the one that generates the string based on the event data:

public function generateString() {
  // Timestamp for the CREATED field: the moment this string is generated.
  $created = new DateTime();

  // iCalendar content lines are terminated with CRLF (\r\n).
  $content = "BEGIN:VEVENT\r\n"
           . "UID:{$this->uid}\r\n"
           . "DTSTART:{$this->formatDate($this->start)}\r\n"
           . "DTEND:{$this->formatDate($this->end)}\r\n"
           . "DTSTAMP:{$this->formatDate($this->start)}\r\n"
           . "CREATED:{$this->formatDate($created)}\r\n"
           . "DESCRIPTION:{$this->formatValue($this->description)}\r\n"
           . "LAST-MODIFIED:{$this->formatDate($this->start)}\r\n"
           . "LOCATION:{$this->location}\r\n"
           . "SUMMARY:{$this->formatValue($this->summary)}\r\n"
           . "SEQUENCE:0\r\n"
           . "STATUS:CONFIRMED\r\n"
           . "TRANSP:OPAQUE\r\n"
           . "END:VEVENT\r\n";
  return $content;
}

And the function for the Calendar class that generates the string of events:

public function generateString() {
  $content = "BEGIN:VCALENDAR\r\n"
             . "VERSION:2.0\r\n"
             . "PRODID:-//" . $this->author . "//NONSGML//EN\r\n"
             . "X-WR-CALNAME:" . $this->title . "\r\n"
             . "CALSCALE:GREGORIAN\r\n";

  foreach($this->events as $event) {
    $content .= $event->generateString();
  }
  $content .= "END:VCALENDAR";
  return $content;
}
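
The formatDate and formatValue helpers aren't shown above. As a rough sketch of what the iCalendar format expects from helpers like those, here's the idea in JavaScript: dates written as UTC in the compact YYYYMMDDTHHMMSSZ form, and backslashes, semicolons, commas, and newlines escaped in text values. The function names are hypothetical and this is not the PHP code from the gist.

```javascript
// Format a Date as a UTC iCalendar date-time, e.g. 20130410T120000Z.
function formatIcalDate(date) {
  return date.toISOString()      // e.g. 2013-04-10T12:00:00.000Z
    .replace(/[-:]/g, '')        // drop the date and time separators
    .replace(/\.\d{3}/, '');     // drop the milliseconds
}

// Escape a text value (SUMMARY, DESCRIPTION, ...) for an iCalendar file.
// The backslash must be escaped first so later escapes aren't doubled.
function escapeIcalText(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/;/g, '\\;')
    .replace(/,/g, '\\,')
    .replace(/\r?\n/g, '\\n');
}
```

Long lines are also supposed to be folded at 75 octets, which this sketch leaves out.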

Here's an example of using those classes to create a calendar with one event:

// Assumption for this example: $time is the deadline as a Unix timestamp.
$time = time();

$event_parameters = array(
  'uid' => '123',
  'summary' => 'Introduction Quiz Deadline',
  'description' => 'Make sure you check the website for the latest information',
  'start' => new DateTime('@'.($time - (60*60))),
  'end' => new DateTime('@'.$time),
  'location' => 'http://class.coursera.org/ml/quiz/index?id=2'
);
$event = new CalendarEvent($event_parameters);

$calendar = new Calendar();
$calendar->events = array($event);
$calendar->title  = 'Machine Learning Deadlines';
$calendar->author = 'Coursera Calendars';
$calendar->generateDownload();

In our own code, I wrote two more classes to help with generating those events for our own data, CourseItem and CourseCalendar (a subclass of Calendar).

You can check out the Calendar classes in this gist. If you've worked with iCalendar files in the past and know anything that we should be tweaking about what we're outputting, let me know more in the comments.

Sunday, March 31, 2013

Source "Snapshots"

In our small team of 20 engineers at Coursera, we've been talking a lot lately about what our approach to the world of "open source" should be, both now and long term.


Why open source?

We're all interested in the idea of open-sourcing parts of our code, and we're motivated by a few main reasons:

  • We are heavy users of open-source code in our Coursera codebase, and we would like to give back. Like many startups, we're built on top of a plethora of open-source technologies and libraries - Scala, Play, PHP, Django, Python, Backbone, RequireJS, plus numerous third party libraries built on top of those. One way of giving back is to contribute to the libraries themselves, and we've done that via bug fixes, documentation suggestions, and talks on how we use them. But we would also like to give back by showing them what we've built with their libraries, to serve as examples for other users of the library.

  • We would like to be able to point to our code as a reference in public settings. For example, if we post a bug report about a particular library, we'd like to be able to point to our whole code that uses it, to give the context. Maybe we could elicit a more informed response that way, and someone could suggest a better way to accomplish what we're trying to do. Plus, when I give talks about a particular feature, I'd love to be able to finish by pointing people to the full code for that feature, so that they don't have to guess at the functionality from the snippets I managed to fit on the slides.


Why not open source?

As much as we are motivated to open-source parts of our code, we also have our reservations:

  • We do not have the engineering resources to maintain both a private codebase and a public, open-source, community-fed codebase. There have been many interesting discussions lately about the "burden of open source", like this post from Divya after she deleted a popular github repository, and this talk by the Twitter Bootstrap co-creator where he compares the Bootstrap project to a cute puppy that becomes an old, fat, unwanted dog.

    They both realized that when you open source something, you're also creating a community that requires nurturing and a product that requires maintenance and upgrades. Most open source projects do not come close to approaching the popularity of Bootstrap, but even an open source project with a handful of forks and pull requests requires resources and time. For example, my lscache library with only 35 forks has had 5 pull requests over the course of its 2 year existence, and I can distinctly remember putting lscache maintenance on my TODO list, procrastinating it, and feeling bad.

    We're in startup mode at Coursera, and we need to budget our time for internal needs first, without worrying about the potential time it can take to maintain an open-source codebase.

  • We do not always feel comfortable open-sourcing all the parts of a feature. For example, many of our Backbone JS apps communicate with a REST API. We're comfortable with developers seeing our Backbone code, since our JS is already inherently public (in an obfuscated way), but we are hesitant about the server-side code behind the REST API. Maybe it would reveal secrets about upcoming or unknown features, or hey, maybe it has security holes that we haven't yet discovered. That code could also contain secrets like API keys, salts, and hashes, which we would have to carefully remove via a scrubbing script.

  • We do not always have the time to package a feature so that it's "ready-to-run". When you find an open source repository for a particular library or app that you actually want to use, you usually scroll down to find the "Installation instructions" and you hope that it's easy to get it working in a few minutes. As it turns out, it's not that easy to extract bits of a codebase and make them easy for anyone else to get running in their own environment. For example, at Coursera, we have our own custom build tool for our frontend code, we have custom templated configuration files that are used by that tool, and we have several Coursera-specific libraries that we use across all of our frontend apps. Yes, we could either open source all of those things, or figure out how to make our code work without depending on them, but either of those approaches would take significant time and resources.


An Approach: Source Snapshots?

But, as I started off by saying: we really want to give back to open source in some way. We've been mulling over our motivations and our reservations, and I think I've come up with an approach that I can use for many bits of our code, at least in the short term while we're low on resources: "snapshots".

A "snapshot" is a dump of some part of our codebase, taken at a point in time and copied into a public repository. It may be an incomplete dump (e.g., missing dependencies or server-side code), it would not necessarily be runnable, and it would come with no guarantee of being up-to-date or of ever being updated in the future.

The snapshot would still be useful, for developers looking to see how we approached some aspect in the codebase, and also for us to refer to in talks and blog posts. It would also be a way for us to dip our toes into the open source waters, and to see what developers are most interested in. If a particular snapshot got a lot of attention, then maybe one day, when we felt we had the resources, we would turn it into an actual living open-source library and spend the time needed to nurture that community. A snapshot can serve almost as an MVP, if you think of open source repositories as new products/features.

As an example, I've open sourced a snapshot of the Backbone JS for our forum rewrite, along with a blog post about how it works. If this goes well, we hope to snapshot more of our Backbone apps in the future, as well as the JS UI libraries that we've built on top of Bootstrap and Require.

So that's the hope: we can avoid the burden of open source while satisfying our desire to share our learnings. Let's see how this works. ☺

Rewriting our Forums with Backbone

When we rolled out the redesign of the Coursera class platform back in January, I put up a prominent message asking for feedback, and as can be expected, we got a lot of feedback. Much of the feedback was on the forums, where we had improved the aesthetics but neglected to improve the core usability. We had feedback like:

  • "I want to be able to link to a post."
  • "I can only edit comments, I can't edit posts."
  • "Whenever I do anything, the whole page reloads and I lose my place."

When I started to tackle the problems, I was faced with our legacy codebase: spaghetti code made of PHP that output HTML, which was then manipulated in JavaScript. Some of the actions were done via API-like calls and some were done via form POSTs and server redirects. There was no consistent architecture, and that made it hard for me to make changes that felt like they should be minor improvements. There was also an increasing amount of JavaScript, but it wasn't written in a clean way, and it worried me every time I added to it.

So I did the thing that everyone tells you not to do: rewrite the codebase. I knew it would be a lot of work, and that I risked introducing regressions, but I decided it would be worth it if it enabled us to iterate faster and innovate more in the future.


The Process

I started by turning the backend into a RESTful API, with logically organized, object-oriented classes representing the routes and the database models. Once I had enough of an API to give me thread data, I started on the frontend, a single-page Backbone web app following the style of our other Backbone apps (which I've spoken about in the past). From there, I just kept iterating, building back the features that we had before and figuring out the best way to approach our top usability bugs.

At a certain point, I realized that I couldn't handle this rewrite myself (a hard thing for me to admit, I may be a bit of a cowboy coder) and I enlisted the help of my colleague Jacob (and his expertise as an avid Reddit moderator and user).

Once we had it 80% done, I started writing tests for the frontend. When we were 95% done, we enabled it via a per-class feature flag for our Community TAs class, and spent a week addressing feedback from the TAs and from our QA team. Then we started enabling it on classes, and after addressing the biggest concern from students (lack of Markdown support in the editor), we've enabled it for all our classes. From start to end, the rewrite took us about 6 weeks - three times as long as I hoped. One day I'll learn that most things take 3x as long as I expect them to. ☺


The Database

Since I wanted to be able to introduce the new forums in old classes - and also because I wanted to scope my rewrite down - I decided to stick with the same database design and model relations.

We use MySQL (in an effectively sharded way because each class has its own database), and this is my not so technical diagram of what our tables look like for forums:

A big thing to note is that each of our threads is always related to a forum, and that we do not have infinite nesting of comments like Disqus or Reddit. Instead, we have top-level posts, each of which can have associated comments. We may change this in a future rewrite to allow arbitrary levels of nesting, but for now, the post/comment relation is ingrained into our database design.


The Backend

Our class platform is currently written in PHP, and much of it is custom libraries, but, hey, if you're interested, here's how the new forums backend works:

  • We model the data with the PHP Active Record library, and use class functions and static functions to capture model-specific functionality. We use the Active Record functions as much as possible, but sometimes fall back to our own SQL query system (like for INSERT IGNORE, which Active Record doesn't handle).
  • We have a simple Rest_Router class which can recognize registered routes and pass the requests to the appropriate class for processing.
  • We have a routes.php file which lists all of the forum API related routes.
  • We have a file of classes that extend our Rest_Controller class and handle the routes, defining get/patch/post/delete as needed. (We prefer patch instead of put, since partial updates are easy via Backbone and preferable.)

For example, this route in the routes file is for deleting a user's subscription to a thread:

$router->delete('threads/:thread_id/subscriptions', 'Subscriptions#delete');

This class in the controller file handles that URL:

class Subscriptions extends \Rest_Controller {
    
  public function delete($params) {
    $response = new \Rest_Http_Response();
    try {
      $request_body = $this->get_request_body();
      $data = json_decode($request_body, true);
      $data['thread_id'] = $params['thread_id'];
      $data['user_id']   = _current_user('id');
      $subscription_data = \Forum\Thread_Subscription::delete_subscription($data);
      $response->set_json_body(json_encode($subscription_data));
    } catch (\Exception $e) {
      return $this->error_request($e);
    }
    return $response;
  }
}

And this is the Active Record model that is called:

class Thread_Subscription extends \ActiveRecord\Model {
    
  static $table_name = 'thread_subscriptions';
    
  public static function delete_subscription($data) {
    $subscription = self::get_for_user($data['thread_id'], $data['user_id']);
    $subscription->delete();
    return null;
  }
}
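
To make the routing step a bit more concrete, here's a minimal sketch of how a pattern like 'threads/:thread_id/subscriptions' can be matched against a request path and turned into a params hash. It's written in JavaScript rather than PHP, and it is not our Rest_Router: the names compileRoute, RouterSketch, register and dispatch are all made up for illustration.

```javascript
// Hypothetical sketch of pattern-based routing. None of these names come
// from the Coursera codebase; they only illustrate the idea.

// Turn ':thread_id'-style segments into regex capture groups.
function compileRoute(pattern) {
  var paramNames = [];
  var source = pattern.replace(/:([A-Za-z_]+)/g, function (match, name) {
    paramNames.push(name);
    return '([^/]+)';
  });
  return { regex: new RegExp('^' + source + '$'), paramNames: paramNames };
}

function RouterSketch() {
  this.routes = [];
}

// Register a handler for an HTTP method and a URL pattern.
RouterSketch.prototype.register = function (method, pattern, handler) {
  var compiled = compileRoute(pattern);
  this.routes.push({
    method: method,
    regex: compiled.regex,
    paramNames: compiled.paramNames,
    handler: handler
  });
};

// Find the first matching route and call its handler with the named params.
RouterSketch.prototype.dispatch = function (method, path) {
  for (var i = 0; i < this.routes.length; i++) {
    var route = this.routes[i];
    var match = route.method === method && route.regex.exec(path);
    if (match) {
      var params = {};
      route.paramNames.forEach(function (name, index) {
        params[name] = match[index + 1];
      });
      return route.handler(params);
    }
  }
  return { status: 404 };
};

var router = new RouterSketch();
router.register('DELETE', 'threads/:thread_id/subscriptions', function (params) {
  return { status: 200, threadId: params.thread_id };
});

var deleteResponse = router.dispatch('DELETE', 'threads/2703/subscriptions');
var missingResponse = router.dispatch('GET', 'threads/2703/subscriptions');
```

In this sketch the DELETE request reaches the handler with thread_id set to the string '2703', while the GET request falls through to the 404 response because no route was registered for that method.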

The Frontend

We're a bit of a Backbone shop at Coursera now. We're not absolutely in love with it, but we've built up a lot of internal knowledge and best practices around it, so it makes sense for us to build our new apps in Backbone to keep our approach consistent. However, we do like to experiment in each app with different ways of using Backbone - like using Backbone-stickit for data binding in our most recent app. Sometimes those ways stick and become part of our best practices, and sometimes they fade away into oblivion.

Saying all that, here's a breakdown of how the forum Backbone app works. It's not perfect, but hey, it's a start.


The "Routes"

Most Backbone single-page web apps start with a routes file that maps URLs to views, and Backbone looks at the URL to figure out what view function to kick off. In this case, however, I wanted to code it so I could easily embed a forum thread on any page, regardless of URL. I want widgets, not routes.

To accomplish widget-like functionality, I wrote it so that the main JS file for the forum app looks for DIVs on the page with particular data attributes and replaces them with the relevant view. For example, here's our code for loading in a thread widget:

// opt and threadMode come from the surrounding code (not shown in this snippet).
$('[data-forum-thread]').each(function() {
  var threadId = Number($(this).attr('data-thread-id'));
  var thread = new ThreadModel({id: threadId});
  new ThreadView(_.extend(opt, {
    el: $(this)[0],
    model: thread,
    mode: threadMode
  }));
});
  

The Views

All of our views use Jade templates for HTML generation and separate Stylus files for CSS. Many of them listen to "change" or "sync" events on their respective models and then check the hash returned by changedAttributes() to see if they care about which attributes changed. That minimizes the amount of re-rendering that has to happen.
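
Here's a rough, Backbone-free sketch of that "only re-render if I care" check. The changedAttributes function below is a stand-in that mimics what Backbone's model.changedAttributes() returns (a hash of changed attributes, or false); ThreadHeaderSketch and the attribute names title, subscribed and num_views are hypothetical, not taken from our real views.

```javascript
// Stand-in for Backbone's model.changedAttributes(): returns a hash of the
// attributes that differ between two states, or false if nothing changed.
function changedAttributes(previous, current) {
  var changed = {};
  var hasChanges = false;
  Object.keys(current).forEach(function (key) {
    if (previous[key] !== current[key]) {
      changed[key] = current[key];
      hasChanges = true;
    }
  });
  return hasChanges ? changed : false;
}

// Hypothetical header view that only cares about two thread attributes.
function ThreadHeaderSketch() {
  this.watched = ['title', 'subscribed'];
  this.renderCount = 0;
}

// Called on a "change" event: re-render only if a watched attribute changed.
ThreadHeaderSketch.prototype.onChange = function (changed) {
  if (!changed) {
    return;
  }
  var relevant = this.watched.some(function (key) {
    return Object.prototype.hasOwnProperty.call(changed, key);
  });
  if (relevant) {
    this.renderCount++;
  }
};

var header = new ThreadHeaderSketch();
var before = { title: 'Week 1 questions', subscribed: false, num_views: 10 };

// A view-count bump is ignored by the header...
var afterView = { title: 'Week 1 questions', subscribed: false, num_views: 11 };
header.onChange(changedAttributes(before, afterView));
var rendersAfterViewBump = header.renderCount;

// ...but a subscription change triggers a re-render.
var afterSubscribe = { title: 'Week 1 questions', subscribed: true, num_views: 11 };
header.onChange(changedAttributes(afterView, afterSubscribe));
var rendersAfterSubscribe = header.renderCount;
```

The design choice here is that each nested view keeps its own short list of attributes it depends on, so a change to anything else on the thread costs one cheap lookup instead of a full re-render.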

As an example, let's walk through ThreadView and its nested views. First, a diagram:

  • ThreadView is responsible for handling infinite loading and scrolling to permalinks. It defers all other rendering and event handling to its nested views, each of which knows to re-render itself only when relevant properties of the thread change:
    • ThreadHeaderView: manages the title, subscription and thread admin controls.
    • ThreadTagsView: shows tags and handles adding tags.
    • PostContainerView: creates containers for each post using PostView and each comment using CommentView.
      • PostView and CommentView both extend EntryView with no modification. The slight differences between them are handled with if checks inside EntryView (e.g., only a post can be pinned, not a comment).
      • EntryView handles rendering an entry in view mode with its admin controls and voting controls, and it knows how to render an edit mode when the user wants it.
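
As a rough illustration of that last relationship, here's a sketch of two views that extend a shared base without adding anything of their own, with the post/comment difference handled by an if check in the base. This is plain prototype inheritance rather than Backbone's extend, and the names EntryViewSketch, controls and the entry's type field are assumptions for the example, not our actual code.

```javascript
// Hypothetical base view: knows how to list the controls for any entry.
function EntryViewSketch(entry) {
  this.entry = entry;
}

EntryViewSketch.prototype.controls = function () {
  var controls = ['vote', 'edit'];
  // Only a post can be pinned, not a comment.
  if (this.entry.type === 'post') {
    controls.push('pin');
  }
  return controls;
};

// PostViewSketch and CommentViewSketch extend the base with no modification.
function PostViewSketch(entry) {
  EntryViewSketch.call(this, entry);
}
PostViewSketch.prototype = Object.create(EntryViewSketch.prototype);
PostViewSketch.prototype.constructor = PostViewSketch;

function CommentViewSketch(entry) {
  EntryViewSketch.call(this, entry);
}
CommentViewSketch.prototype = Object.create(EntryViewSketch.prototype);
CommentViewSketch.prototype.constructor = CommentViewSketch;

var postControls = new PostViewSketch({ id: 1, type: 'post' }).controls();
var commentControls = new CommentViewSketch({ id: 2, type: 'comment' }).controls();
```

Keeping the branching in one place means there's a single template and a single set of event handlers to maintain, at the cost of a few conditionals in the base view.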

The Models

Most of our models use Backbone-relational, an extension of Backbone that knows how to take a JSON object and turn its keys into related Collections of Models. Our models also use our custom API wrapper, which takes care of CSRF tokens and displaying AJAX loading messages at the top of the page.

For example, let's look at ThreadModel and its related models. First, a diagram:

  • ThreadModel extends Backbone.RelationalModel, turning its "posts" and "comments" keys into PostCollection and CommentCollection, respectively. It is responsible for fetching thread JSON from the server and for figuring out how to fetch previous/next pages of the JSON. We debated how best to do this, and settled on always passing down a "post skeleton" where each post has an "id" and "order", and then we track which parts of the skeleton we've filled in (based on "post_text" existing), and fill in above/below. ThreadModel also must keep track of which user IDs it's seen on posts, and it fetches user profiles for any new user IDs from our main user database.
    • PostModel and CommentModel both extend EntryModel, and they differ only by their url (as the APIs distinguish between post and comment). PostCollection and CommentCollection are just collections of those models.
    • EntryModel extends Backbone.RelationalModel and is used for saving individual posts and comments - creating new ones and editing existing ones. EntryModel is never used for fetching JSON, because we always fetch on the Thread level, but it theoretically could be if we wanted a standalone entry view one day.
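
The "post skeleton" idea is easier to see in code, so here's a minimal sketch of it under some assumptions. The fields id, order and post_text are the ones described above; the helper names (isLoaded, nextIdsToFetch, newUserIds), the pageSize parameter and the user_id field on posts are hypothetical and are not taken from ThreadModel itself.

```javascript
// A skeleton entry counts as filled in once it carries "post_text".
function isLoaded(post) {
  return typeof post.post_text === 'string';
}

// Return the ids of up to pageSize unloaded posts directly above or below
// the block of posts that has already been loaded.
function nextIdsToFetch(skeleton, direction, pageSize) {
  var sorted = skeleton.slice().sort(function (a, b) {
    return a.order - b.order;
  });
  var loadedIndexes = [];
  sorted.forEach(function (post, index) {
    if (isLoaded(post)) {
      loadedIndexes.push(index);
    }
  });
  var slice;
  if (loadedIndexes.length === 0) {
    slice = sorted.slice(0, pageSize);
  } else if (direction === 'above') {
    var first = loadedIndexes[0];
    slice = sorted.slice(Math.max(0, first - pageSize), first);
  } else {
    var last = loadedIndexes[loadedIndexes.length - 1];
    slice = sorted.slice(last + 1, last + 1 + pageSize);
  }
  return slice.map(function (post) {
    return post.id;
  });
}

// Collect user ids not seen before, so each profile is fetched only once.
function newUserIds(posts, seen) {
  var fresh = [];
  posts.forEach(function (post) {
    if (post.user_id !== undefined && !seen[post.user_id]) {
      seen[post.user_id] = true;
      fresh.push(post.user_id);
    }
  });
  return fresh;
}

var skeleton = [
  { id: 11, order: 1 },
  { id: 12, order: 2 },
  { id: 13, order: 3, post_text: 'First loaded post', user_id: 7 },
  { id: 14, order: 4, post_text: 'Second loaded post', user_id: 9 },
  { id: 15, order: 5 },
  { id: 16, order: 6 }
];

var idsAbove = nextIdsToFetch(skeleton, 'above', 2);
var idsBelow = nextIdsToFetch(skeleton, 'below', 1);

var seenUsers = {};
var firstBatch = newUserIds(skeleton.filter(isLoaded), seenUsers);
var secondBatch = newUserIds(skeleton.filter(isLoaded), seenUsers);
```

With posts 13 and 14 loaded, this sketch asks for posts 11 and 12 when scrolling up and post 15 when scrolling down, and it only reports user ids 7 and 9 as new the first time it sees them.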

The Tests

We can't reasonably write so much logic in our JavaScript without also writing tests to verify that our logic is sound. We write our tests using the Mocha test runner framework and the Chai assertion framework. We use Sinon to mock out our API responses with local test JSON. When we want to test our views, we use JSDOM to render a fake DOM and react to fake events, and then we can test that the resulting DOM looks like what we expect. JSDOM does not do everything the browser DOM does (notably, it's missing contenteditable support), but it does an awful lot and is much faster than spinning up an actual browser.

For example, here's a snippet of a test for checking that save works as expected:

it('should save new post and render in view mode', function() {
  postView.render();

  server.respondWith("POST", getPath('/api/forum/threads/2703/posts'), 
    [200, {"Content-Type":"application/json;charset=utf-8"}, JSON.stringify(postJSON)]);

  postView.$('.course-forum-post-edit-link').click();
  postView.$('button.course-forum-post-edit-save').click();

  chai.expect(postView.$('.course-forum-post-edit-save').attr('disabled'))
    .to.be.equal('disabled');
  server.respond();
  chai.expect(postView.$('.course-forum-post-text').text())
    .to.be.equal(postJSON.post_text);
});

The Result

You can try out our forums by enrolling in a class and participating, and if you're really curious to learn more about the frontend Backbone code, you can browse our source snapshot here. We accomplished what I set out to do: I can now fix our major issues without feeling like I'm hacking the code horribly, and it's easy for me to add new features on top of the codebase and to test them. I imagine we have a forum rewrite v3 in the future, like if we decide to do truly real-time forums or infinitely nested comments, but I hope that we will be able to re-use much of what we developed for this version. Was the rewrite worth it? I think so, but ask me again next year. ☺
