Saturday, December 29, 2012

My End of Year Donations

I am not a tax expert (I consider myself accomplished if I do them at all), but I've heard that if you donate to charities by the end of the year, then it helps when you're doing taxes in April. I'm sure all of you know more than me. So, now that my ignorance is out of the way, here's where my end of year donations have gone to:

  • SF Bike Coalition: I am an SF biker myself (not in the hardcore sense of the word, but the commuter sense), and I appreciate everything that the coalition does to make this a more bike-friendly city.
  • Village Harvest: They are a local group that harvests fruit (and lets volunteers help them - I keep meaning to join in, one of these days), and donates them to feed the hungry.
  • Electronic Frontier Foundation: They protect our rights in the digital world, which is an increasingly important issue, with the crazy laws people are trying to pass.
  • Freedom from Religion Foundation: They work to uphold the separation of church and state, something I've been a fan of ever since I was a wee child, refusing to say "under god" during the pledge in school each day. People can have their beliefs, but they shouldn't be enforced on others.
  • Planned Parenthood: They're one of my favorite organizations to donate to, both because I've been a patient of them myself and I like what they provide to others. I wrote more about them in my previous post on donating to reduce birth rate.

So, there are a few days left in 2012. If you're doing any end of the year donations, where and why?

Saturday, December 15, 2012

A Tale of Two Bootstraps: Lessons Learned in Maintainable CSS


The Pull Request

346 files changed, 100 commits, 6 authors.

The title of that pull request? "Port to Bootstrap2." Yes, that was how much effort it took to make a change as seemingly simple as upgrading a CSS framework in our codebase.


The Journey

It all started on a rainy train ride, my daily commute back to San Francisco on the Caltrain. I knew that we used Bootstrap 1 in our legacy codebase (which powers class.coursera.org, where students watch lectures and take quizzes) and we were using Bootstrap 2 in our "modern" codebase (the one behind www.coursera.org, where students find courses to enroll in), and that was making it hard for us to share resources across them, so I thought, hey, I know, I'll just go through and upgrade all our class names in the old codebase. I had upgraded personal projects to Bootstrap 2 in the past, and it hadn't been that hard, especially with the help of the upgrade guide.

As you can guess, it took a wee bit longer than that train ride. Why?

  • Our codebase size: This was only my second foray into this codebase, and so I had very little idea how it worked and how big it actually was. After a few greps for Bootstrap class names, I soon realized that the codebase was about 10x bigger than I eventually realized. I was only familiar with the student side of class.coursera.org, but of course, there's an instructor side, and as it turns out, we have a rather comprehensive admin interface. I also discovered that we had multiple versions of some of our interfaces and we're still supporting both of them, until all the admins have time to learn the new interfaces.
  • Our codebase architecture: Our legacy codebase is written in a hand-spun PHP framework, spitting out HTML with inline JS and linked JS files. I expected to find some CSS class names in the JS, used in selectors or DOM creation, but besides finding a lot of that, I also discovered CSS class names outputted by PHP, and the worst, CSS class names in JS outputted by PHP. Yech! It meant that I had to look in every nook and cranny for class names, from the back to the front.
  • The bootstrap class names: Bootstrap - both 1 and 2 - uses very succinct class names, like "alert", "label", "btn". Those short names are great when you're hand typing out your CSS, but oh wow, they are a *bitch* to search for in a codebase of thousands of files. Do you know how often developers use the word "label"? All the freaking time. Bootstrap 1 even had these short add-on class names for component states like "primary", "success", "info", which thankfully Bootstrap 2 changed into the more specific "btn-primary", "btn-success", "btn-info".
  • The bootstrap JS components: At its core, Bootstrap is a CSS framework, but it offers a small number of JS components on top, and we were using a few of them. After upgrading to the new JS components and making them RequireJS-friendly (so that we could use the same JS libraries in our Require-powered codebase as this legacy one), I had to upgrade the relevant HTML and test that the interactivity still worked as expected. Now, I was no longer making just a visual change, I was possibly affecting interactions instead.

All of those factors combined turned my train-ride of a change into a month-long change, where I roped in more of our engineers, held many "Bootstrapv2athons", and worked with a QA team to do a weeklong test of the branch in multiple browsers.

Finally, we were ready to deploy it. Or, more accurately, my colleague convinced me that we should just get the damn thing out and be ready to deal with anything that we missed. As it turns out, yes, we still did miss a few things- a few big things, a few little things, but we responded as quickly as we could to minimize the damage. Now, finally, we have our entire codebase running off the same Bootstrap version and we can start to share our CSS and JS. Man, was that a journey of epic renaming proportions that I did not imagine.


The Lessons Learned

But I really don't think it should have to be that hard to upgrade a CSS framework. Here's what I want to try next time I work in a codebase that's based on Bootstrap:

  • Do not use the Bootstrap class names directly in our code. They are too short, and it's too hard to keep track of where they're being used.
  • Instead, use a CSS pre-processor like Less (which Bootstrap itself is based on) to extend the Bootstrap class names and define longer, more app-specific, more semantic class names. For example, we might extend btn into "coursera-generic-button" and we might extend btn/btn-success into "coursera-save-button".
  • We would then use those custom CSS class names everywhere instead of the Bootstrap class names, especially if we are using them in our JavaScript.

For a longer article that also recommends that technique for semantic more than maintenance reasons, read Stop Embedding Bootstrap Classes in your HTML.

What do you think? I'd love to hear ideas from those of you who rely on Bootstrap or similar frameworks in your codebase, and what you've done to avoid dependence and ease maintenance. I haven't yet had the chance to test out my strategy, so if there's a better way, I want to hear it.

Oh, on a related note, I hear that Bootstrap 3 is almost out... time to start my next epic branch?

Tuesday, December 4, 2012

Unit Testing our PHP Templates with Selector

I'm finally digging into the codebase that powers class.coursera.org, and it's a wild ride. The original Coursera prototype was built by a few grad students working in the co-founder's machine learning research labs, and like all scrappy prototypes, it was just meant to test whether the whole massively online class idea had any merit to it. As it turns out, it did, and that prototype went on to serve the next class, then the next class, until finally today, it's turned into the code that's serving 32 live courses. Needless to say, those grad students didn't realize when they were first building the codebase that it would one day be handed over to a team of bright and eager engineers who had never seen it before, so its not the most built on the most beautiful architecture or built around an open source framework.

Well, what is it then? It's a PHP codebase, with a custom built router, templating, and SQL querying engine, and it's a fair bit of code. When I first started here, I figured it wouldn't be that much code, based on what I'd seen as a student - but I didn't take into account just how many administrative interfaces power what students see. The professors need to upload their lectures, design quizzes (with variations and pattern-based grading), create peer assessments, view statistics, calculate grades, issue certificates, ban students for cheating, etc, etc. Now that I've dug into it, I realize how much we enable on our platform - and because of that, I realize how important it is to test our platform when we make changes.

Unfortunately, our legacy codebase didn't exactly have a lot of testing when I arrived (I will leave it as an exercise to the reader to figure out how much it did have), so now that I am making changes in it, I'm adding unit tests as I go along, which includes figuring out how to test the different aspects of the codebase, plus how to mock out functions and data.

Testing Templates

My most recent changes have all revolved around fixing broken "messaging" — making sure that students understand deadlines, that they know how many submissions they have left, that they know why they got the score they did — and much of that comes down to figuring out what strings to show to users and what should be rendered in the HTML templates.

Some people may argue that you shouldn't be testing HTML output, because that's simply your presentation, but I would argue that your presentation should be tested, because if a user sees a yellow warning instead of a red one when they've passed the deadline, then that's a bug. Some could also argue that no logic whatsoever should be in your templates, but well, I've never managed to do completely logic-less templates in an app, and I didn't attempt to tackle that goal here.

I started out testing the rendered HTML by testing for string equality or containment , but of course, that was horribly brittle and broke whenever I changed the slightest thing. I soon moved on to using Selector, a library that accepts a string of HTML and lets you query its contents as a DOM, so that you can check for elements and their attributes. It's a better technique because you can check for what matters (like class names) and ignore what doesn't (like whitespace).

As an example, here's the test for our quiz start screen, to make sure that it renders the appropriate data, start button, and an alert message given the passed in parameters.

function test_quiz_start_template() {
    $fake_quiz = $this->fake_quiz(1);

    $rendered = _prepare_template('A:app:quiz:start',
                          array(
                            'quiz' => $fake_quiz,
                            'view_state' => array('time' => 'before_soft_close_time',
                                                  'message' => ''),
                            'retry_delay' => '600',
                            'can_start' => true,
    ));
    $this->verify_quiz_start_template($rendered, 'Untimed', '0/100', '10 minutes 0 seconds', true, false);
$fake_quiz['duration'] = '100';
    $rendered = _prepare_template('A:app:quiz:start',
                          array(
                            'quiz' => $fake_quiz,
                            'view_state' => array('time' => 'before_soft_close_time',
                                                  ),
                            'message' => 'Warning!',
                            'retry_delay' => '0',
                            'can_start' => false,
    ));
    $this->verify_quiz_start_template($rendered, '1 minute 40 seconds', '0/100', '1 second', false, 'Warning!');
}

Notice how that function calls another function to actually do the verifying. Since I'm usually testing the output of a particular template with multiple sets of parameters, I typically make a single verify function that can be used for verifying the desired results, to avoid repeating myself. Here's what that function looks like, and this is the function that actually uses that Selector library:

function verify_quiz_start_template($rendered, $duration, $attempts, $retry, $start_button, $message) {
    $dom = new SelectorDOM($rendered);
    $this->verify_element_text_equals($dom, 'tr:contains(Duration) td', $duration);
    $this->verify_element_text_equals($dom, 'tr:contains(Retry) td', $retry);
    $this->verify_element_text_equals($dom, 'tr:contains(Attempts) td', $attempts);
    if ($start_button) {
        $this->verify_element_exists($dom, 'input.success');
    } else {
        $this->verify_element_doesnt_exist($dom, 'input.success');
    }
    if ($message) {
        $this->verify_element_text_equals($dom, '.course-quiz-start-alert', $message);
    } else {
        $this->verify_element_doesnt_exist($dom, '.course-quiz-start-alert');
    }
}

Okay, well, now you might notice that I'm calling a lot of verify_* functions inside. Those are functions that I've defined in my base TestCase class, so that I can use them anywhere where I want to test DOM output using the Selector library. Here are all the helper functions I've written so far:

function verify_and_find_element($dom, $selector) {
    $selected = $dom->select($selector);
    if (count($selected) == 0) {
      print 'Failed to find ' . $selector;
    }
    $this->assertTrue(count($selected) > 0);
    return $selected;
}

function verify_element_exists($dom, $selector) {
    $this->verify_and_find_element($dom, $selector);
}

function verify_element_doesnt_exist($dom, $selector) {
    $selected = $dom->select($selector);
    $this->assertTrue(count($selected) == 0);
}

function verify_element_text_equals($dom, $selector, $text) {
    $selected = $this->verify_and_find_element($dom, $selector);
    $this->assertEquals(trim($selected[0]['text']), trim($text));
}

function verify_element_attribute_equals($dom, $selector, $attribute, $text) {
    $selected = $this->verify_and_find_element($dom, $selector);
    $this->assertEquals(trim($selected[0]['attributes'][$attribute]), trim($text));
}

Please know that I am not a PHP expert. I used it in college to put together websites and at Google to show developers how to use the Maps API in a few articles, but I never worked on any sizable piece of software written in it. I'm trying to wrap my head about the best practices for PHP codebases, in terms of testing, architecture, and object-oriented design, and this may not be the best way to test template output. I'd love to hear your recommendations in the comments.

Oh, and yes, yes, we don't want this codebase to be PHP-powered forever — but re-writing it will take time and will be much easier once I fully understand it from all these tests I'm writing. ☺