I'm finally digging into the codebase that powers class.coursera.org, and it's a wild ride. The original Coursera prototype was built by a few grad students working in the co-founders' machine learning research labs, and like all scrappy prototypes, it was just meant to test whether the whole massively online class idea had any merit to it. As it turns out, it did, and that prototype went on to serve the next class, then the next class, until finally today, it's turned into the code that's serving 32 live courses. Needless to say, those grad students didn't realize when they were first building the codebase that it would one day be handed over to a team of bright and eager engineers who had never seen it before, so it's not built on the most beautiful architecture or around an open source framework.
Well, what is it then? It's a PHP codebase, with a custom-built router, templating system, and SQL querying engine, and it's a fair bit of code. When I first started here, I figured it wouldn't be that much code, based on what I'd seen as a student, but I didn't take into account just how many administrative interfaces power what students see. The professors need to upload their lectures, design quizzes (with variations and pattern-based grading), create peer assessments, view statistics, calculate grades, issue certificates, ban students for cheating, etc., etc. Now that I've dug into it, I realize how much we enable on our platform, and because of that, I realize how important it is to test our platform when we make changes.
Unfortunately, our legacy codebase didn't exactly have a lot of testing when I arrived (I will leave it as an exercise to the reader to figure out how much it did have), so now that I am making changes in it, I'm adding unit tests as I go along, which includes figuring out how to test the different aspects of the codebase, plus how to mock out functions and data.
Testing Templates
My most recent changes have all revolved around fixing broken "messaging" — making sure that students understand deadlines, that they know how many submissions they have left, that they know why they got the score they did — and much of that comes down to figuring out what strings to show to users and what should be rendered in the HTML templates.
Some people may argue that you shouldn't be testing HTML output, because that's simply your presentation, but I would argue that your presentation should be tested, because if a user sees a yellow warning instead of a red one when they've passed the deadline, then that's a bug. Some could also argue that no logic whatsoever should be in your templates, but well, I've never managed to do completely logic-less templates in an app, and I didn't attempt to tackle that goal here.
I started out testing the rendered HTML by testing for string equality or containment, but of course, that was horribly brittle and broke whenever I changed the slightest thing. I soon moved on to using Selector, a library that accepts a string of HTML and lets you query its contents as a DOM, so that you can check for elements and their attributes. It's a better technique because you can check for what matters (like class names) and ignore what doesn't (like whitespace).
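To make the difference concrete, here's a minimal sketch of the two approaches. It is not from our codebase: the markup and class names are made up, and it uses PHP's built-in DOMDocument and DOMXPath instead of Selector so that it runs on its own.

```php
<?php
// Hypothetical rendered output; note the stray extra space in the class attribute.
$rendered = '<div class="alert  alert-error">
    You have passed the deadline.
</div>';

// Brittle: an exact substring check fails because of the extra space.
$brittle_match = strpos($rendered, '<div class="alert alert-error">') !== false;

// Sturdier: parse the HTML and query for what matters.
libxml_use_internal_errors(true); // keep parser warnings out of the test output
$doc = new DOMDocument();
$doc->loadHTML($rendered);
$xpath = new DOMXPath($doc);

// Match any element whose class list contains the token "alert-error".
$query = '//*[contains(concat(" ", normalize-space(@class), " "), " alert-error ")]';
$alerts = $xpath->query($query);

$alert_count = $alerts->length;
$alert_text = trim($alerts->item(0)->textContent);
```

The substring check misses the alert even though a browser would render it as a red error, while the DOM query finds it and lets the test compare just the trimmed text.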
As an example, here's the test for our quiz start screen, which makes sure that it renders the appropriate data, start button, and alert message for the passed-in parameters.
function test_quiz_start_template() {
    $fake_quiz = $this->fake_quiz(1);
    // Untimed quiz, no alert message, start button shown.
    $rendered = _prepare_template('A:app:quiz:start',
        array(
            'quiz' => $fake_quiz,
            'view_state' => array('time' => 'before_soft_close_time',
                                  'message' => ''),
            'retry_delay' => '600',
            'can_start' => true,
        ));
    $this->verify_quiz_start_template($rendered, 'Untimed', '0/100', '10 minutes 0 seconds', true, false);
    // Timed quiz with a warning message and no start button.
    $fake_quiz['duration'] = '100';
    $rendered = _prepare_template('A:app:quiz:start',
        array(
            'quiz' => $fake_quiz,
            'view_state' => array('time' => 'before_soft_close_time',
                                  'message' => 'Warning!'),
            'retry_delay' => '0',
            'can_start' => false,
        ));
    $this->verify_quiz_start_template($rendered, '1 minute 40 seconds', '0/100', '1 second', false, 'Warning!');
}
Notice how that function calls another function to actually do the verifying. Since I'm usually testing the output of a particular template with multiple sets of parameters, I typically make a single verify function that can be used for verifying the desired results, to avoid repeating myself. Here's what that function looks like, and this is the function that actually uses that Selector library:
function verify_quiz_start_template($rendered, $duration, $attempts, $retry, $start_button, $message) {
    $dom = new SelectorDOM($rendered);
    $this->verify_element_text_equals($dom, 'tr:contains(Duration) td', $duration);
    $this->verify_element_text_equals($dom, 'tr:contains(Retry) td', $retry);
    $this->verify_element_text_equals($dom, 'tr:contains(Attempts) td', $attempts);
    if ($start_button) {
        $this->verify_element_exists($dom, 'input.success');
    } else {
        $this->verify_element_doesnt_exist($dom, 'input.success');
    }
    if ($message) {
        $this->verify_element_text_equals($dom, '.course-quiz-start-alert', $message);
    } else {
        $this->verify_element_doesnt_exist($dom, '.course-quiz-start-alert');
    }
}
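One way to take the single-verify-function idea a step further is to list each parameter set next to its expected results and loop over the list. The sketch below is only an illustration under stated assumptions: render_start_screen() and count_by_class() are hypothetical stand-ins that I made up for this example, and it uses PHP's built-in DOMDocument and DOMXPath rather than Selector so that it runs without our codebase.

```php
<?php
// Hypothetical stand-in for a template: shows an alert and/or a start button.
function render_start_screen($params) {
    $html = '<div class="quiz-start">';
    if ($params['message'] !== '') {
        $html .= '<p class="start-alert">' . htmlspecialchars($params['message']) . '</p>';
    }
    if ($params['can_start']) {
        $html .= '<input class="success" type="submit" value="Start">';
    }
    return $html . '</div>';
}

// Hypothetical helper: counts elements whose class list contains $class.
function count_by_class($html, $class) {
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    $query = '//*[contains(concat(" ", normalize-space(@class), " "), " ' . $class . ' ")]';
    return $xpath->query($query)->length;
}

// Each case: template parameters, expected start buttons, expected alerts.
$cases = array(
    array(array('can_start' => true, 'message' => ''), 1, 0),
    array(array('can_start' => false, 'message' => 'Warning!'), 0, 1),
);

$failures = 0;
foreach ($cases as $case) {
    list($params, $expected_buttons, $expected_alerts) = $case;
    $rendered = render_start_screen($params);
    if (count_by_class($rendered, 'success') !== $expected_buttons) {
        $failures++;
    }
    if (count_by_class($rendered, 'start-alert') !== $expected_alerts) {
        $failures++;
    }
}
```

In a PHPUnit test case, the same table could be fed to a test method through a data provider, with the counters replaced by assertions.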
Okay, well, now you might notice that I'm calling a lot of verify_* functions inside verify_quiz_start_template. Those are functions that I've defined in my base TestCase class, so that I can use them anywhere I want to test DOM output using the Selector library. Here are all the helper functions I've written so far:
// Asserts that at least one element matches the selector and returns the matches.
function verify_and_find_element($dom, $selector) {
    $selected = $dom->select($selector);
    // The message is shown by PHPUnit only when the assertion fails.
    $this->assertTrue(count($selected) > 0, 'Failed to find ' . $selector);
    return $selected;
}

function verify_element_exists($dom, $selector) {
    $this->verify_and_find_element($dom, $selector);
}

function verify_element_doesnt_exist($dom, $selector) {
    $selected = $dom->select($selector);
    $this->assertTrue(count($selected) == 0, 'Unexpectedly found ' . $selector);
}

// Compares the trimmed text of the first match; PHPUnit expects (expected, actual).
function verify_element_text_equals($dom, $selector, $text) {
    $selected = $this->verify_and_find_element($dom, $selector);
    $this->assertEquals(trim($text), trim($selected[0]['text']));
}

// Compares one attribute of the first match against the expected value.
function verify_element_attribute_equals($dom, $selector, $attribute, $text) {
    $selected = $this->verify_and_find_element($dom, $selector);
    $this->assertEquals(trim($text), trim($selected[0]['attributes'][$attribute]));
}
Please know that I am not a PHP expert. I used it in college to put together websites and at Google to show developers how to use the Maps API in a few articles, but I never worked on any sizable piece of software written in it. I'm trying to wrap my head around the best practices for PHP codebases, in terms of testing, architecture, and object-oriented design, and this may not be the best way to test template output. I'd love to hear your recommendations in the comments.
Oh, and yes, yes, we don't want this codebase to be PHP-powered forever — but re-writing it will take time and will be much easier once I fully understand it from all these tests I'm writing. ☺