Saturday, July 27, 2013

Rewriting Django Admin in Backbone

Preamble:

This is an internal guide I wrote for Coursera about our site admin app, which is a Backbone "port" of Django admin. I've snapshotted it here in case it's interesting to other folks that are using Django admin and going down the same route, or just generally thinking about making an admin interface in Backbone.

A Bit of History

When Coursera first began advertising courses on www.coursera.org, the process to add a new course was quite manual: our Course Ops team would work with the university admins to draft up HTML describing the course, engineers would paste that HTML into the DB, and make edits to it as requested. Or, that's how the lore goes.

Here's how that data might look on the site:

As you can imagine, that process doesn't scale: it took unnecessary engineering hours to make incremental edits to the course descriptions, and it did not please university admins to have to wait on such a slow cycle. Ideally, they could log in, edit the description, and see the changes live, within a matter of minutes.

So we set about making an admin interface for the data. Since www.coursera.org runs off a Python/Django/MySQL backend, we first went down the expected route: Django Admin, an app that's built into Django for easy editing of database tables, complete with different permission levels, edit logs, and an extensibility mechanism.

Here's what it looked like on our data:

Django Admin was easy to set up and skin, but when we started implementing requests from Course Ops to improve the editing experience and workflow, we found we had to fight against Django Admin and hack into its core in ways that felt wrong. We wanted a lot of different ways of editing data, which we could do via custom Django Admin widgets, but that often meant writing HTML and JavaScript into our Python code itself. We also wanted different buttons in our forms depending on the state of the model, like "open a session", and to do that, we had to modify the base templates with hacky HTML and JavaScript.

You can see a few of our widgets and buttons here:

We were making it work, sure, but I wasn't sure how long we could keep making it work, and if I could bring myself to look at the codebase later, knowing how we'd contorted it to meet our needs. Once I realized that I was the likely engineer to be making bug fixes and implementing feature requests, I decided it was time to move away, fast, before we got too deep into it.

Site Admin

So, over the course of the 3-day Labor Day weekend, I wrote "site admin", an admin interface that used our new approach to writing frontends, with a Backbone frontend and a RESTful Django API. It wasn't feature-complete with Django admin after those 3 days (and it still isn't), but it was built to be extensible on the client-side in a way that Django Admin was not, and that has made it much easier to build on.

Let's walk through the API and the frontend.

The API

The site admin API is designed to be exactly the sort of API that Backbone expects, a RESTful JSON API. It's based off an open-source project called Djangbone, but is now heavily modified.

In admin_api/views.py, the RestrictedAdminAPIView extends the generic Django View class, and defines get/post/put/delete methods that respond to the appropriate HTTP verb and understand how to generically fetch/create/edit/delete any model/collection. That file also contains AdminLogsView, which handles creation and retrieval of edit logs, and AdminSearchView, which is used by autocompletes in the frontend for finding models.
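
As a rough, framework-free illustration of that idea (this is not the actual RestrictedAdminAPIView code), a generic view can map each HTTP verb onto a fetch/create/edit/delete operation and only serialize a whitelist of fields. Everything named below, including GenericResourceView and the in-memory dict standing in for a queryset, is hypothetical:

```python
import json


class GenericResourceView(object):
    """Hypothetical stand-in for a generic admin API view.

    It works against a plain dict (id -> attributes) instead of a
    Django queryset, so the sketch runs without Django installed.
    """

    serialize_fields = ('id', 'name')
    allowed_methods = ('get', 'post', 'put', 'delete')

    def __init__(self, store):
        self.store = store

    def dispatch(self, method, obj_id=None, body=None):
        # Route the HTTP verb to the method of the same name.
        verb = method.lower()
        if verb not in self.allowed_methods:
            return 405, json.dumps({'error': 'method not allowed'})
        data = json.loads(body) if body else {}
        return getattr(self, verb)(obj_id, data)

    def serialize(self, obj):
        # Only whitelisted fields are sent down to the client.
        return dict((field, obj.get(field)) for field in self.serialize_fields)

    def get(self, obj_id, data):
        if obj_id is None:  # no id means "fetch the collection"
            return 200, json.dumps([self.serialize(o) for o in self.store.values()])
        if obj_id not in self.store:
            return 404, json.dumps({'error': 'not found'})
        return 200, json.dumps(self.serialize(self.store[obj_id]))

    def post(self, obj_id, data):
        new_id = max(self.store) + 1 if self.store else 1
        self.store[new_id] = dict(data, id=new_id)
        return 200, json.dumps(self.serialize(self.store[new_id]))

    def put(self, obj_id, data):
        if obj_id not in self.store:
            return 404, json.dumps({'error': 'not found'})
        self.store[obj_id] = dict(self.store[obj_id], **data)
        self.store[obj_id]['id'] = obj_id
        return 200, json.dumps(self.serialize(self.store[obj_id]))

    def delete(self, obj_id, data):
        if self.store.pop(obj_id, None) is None:
            return 404, json.dumps({'error': 'not found'})
        return 200, json.dumps({})
```

The real view additionally has to check permissions and run the posted data through a Django Form before saving, which this sketch leaves out.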

To set up an API for a particular set of models, we'd follow these steps, using the categories app and Category model as an example:

  • Update categories/models.py: We add a static method to the Model class that returns all of the models that the given user is allowed to administer.
    class Category(models.Model):
        @staticmethod
        def objects_administerd_by_user(user):
            if user.is_superuser:
                return Category.objects.all()
            else:
                return Category.objects.none()
    
  • Create categories/admin_api.py: In it, we create a new view that extends RestrictedAdminAPIView, where we specify the base_queryset and base_model, corresponding to our Model, and we provide a list of fields that should be serialized into the JSON (we do not want to pass down all fields, particularly in the case of models with related fields, like course students). We also define get_add_form and get_edit_form, which return a Django Form subclass based on the request. We often serve different forms to different users, like when we want looser field validation for super users versus university admins.
    import re

    CATEGORY_FIELDS = (
        'name',
        'description',
    )
    
    
    class CategoryAdminAPIView(RestrictedAdminAPIView):
        base_queryset = Category.objects.all()
        base_model = Category
    
        serialize_fields = CATEGORY_FIELDS + ('id', 'short_name')
    
        def get_add_form(self, request):
            if request.user.is_superuser:
                return NewCategoryAdminAPIForm
            else:
                return None
    
        def get_edit_form(self, request):
            if request.user.is_superuser:
                return EditCategoryAdminAPIForm
            else:
                return None
    
    
    class EditCategoryAdminAPIForm(AdminAPIForm):
    
        protected_fields = [
                            'name',
                            'short_name',
                            ]
    
        class Meta:
            model = Category
            fields = CATEGORY_FIELDS
    
        def clean(self):
            cleaned_data = super(EditCategoryAdminAPIForm, self).clean()
            short_name = cleaned_data.get('short_name')
            if short_name is not None and len(short_name) > 20:
                err = 'Please limit to 20 chars (currently %d).' % (
                    len(short_name))
                self._errors['short_name'] = self.error_class([err])
                del cleaned_data['short_name']
            if short_name is not None and not re.match(r'^[a-z0-9-\.]+$', short_name):
                self._errors['short_name'] = \
                    self.error_class(['Please limit to a-z,0-9.'])
                # pop() avoids a KeyError if the length check above already removed it
                cleaned_data.pop('short_name', None)
            name = cleaned_data.get('name')
            if name is not None and len(name) > 60:
                err = 'Please limit to 60 chars (currently %d).' % (
                    len(name))
                self._errors['name'] = self.error_class([err])
                del cleaned_data['name']
            return cleaned_data
    
    
    class NewCategoryAdminAPIForm(EditCategoryAdminAPIForm):
    
        class Meta:
            model = Category
            fields = CATEGORY_FIELDS + ('short_name',)
    
  • Update admin_api/urls.py: We add 2 URL patterns for this model's API, to handle the model and collection verbs:
    url(r'^categories$',
      CategoryAdminAPIView.as_view(),
      name="api_categories"),
    url(r'^categories/(?P<id>\d+)',
      CategoryAdminAPIView.as_view(),
      name="api_category"), 
        
  • Update admin_api/tests.py: We add tests for the API, checking permissions and validation.
    def test_create_category(self):
        client = Client()
    
        cats_url = reverse('api_categories')
    
        # Test: super-user can create category
        create_and_login_as_superuser(client)
        data = {
            'name': 'Fake Category',
            'short_name': 'fake-cat',
        }
        response = self.post_json(client, cats_url, data)
        response_json = simplejson.loads(response.content)
        self.assertEqual(response.status_code, 200)
        self.assertEqual(response_json['name'], data['name'])
        self.assertEqual(response_json['short_name'], data['short_name'])
    
        # Test: can't edit short name after it's created
        cat_url = reverse('api_category', args=[response_json['id']])
        data['short_name'] = 'fakecat2'
        response = self.put_json(client, cat_url, data)
        response_json = simplejson.loads(response.content)
        self.assertEqual(response.status_code, 200)
        self.assertEqual(response_json['short_name'], 'fake-cat')
    
        # Test: can't use an invalid short name
        data['short_name'] = 'SoGreat OMG'
        response = self.post_json(client, cats_url, data)
        response_json = simplejson.loads(response.content)
        self.assertEqual(response.status_code, 400)
        self.assertEqual(response_json['short_name'],
            ['Please limit to a-z,0-9.'])
    
        # Test: instructors can't create categories
        create_and_login_as_instructor(client)
        response = self.post_json(client, cats_url, data)
        self.assertEqual(response.status_code, 400)
    

The Frontend

The site admin frontend is designed along the same lines as the backend: generic views that understand models and collections in general, with ways to specify the differences for each model.

ModelAdminPageView is responsible for creating a page with a header, banner, and then nesting a ModelAdminFieldsView which knows how to create a form with fields, buttons, and links to related models. To create that form, ModelAdminFieldsView calls upon a number of views which extend FieldView, like Select2View and HiddenInputView, and renders them depending on the attributes <-> fields mapping in a model.

CollectionAdminPageView is responsible for creating a page with a header, "new model" button, and then nesting a CollectionAdminListView, which knows how to create statistics charts via the nvd3 library and a tabular view of a collection via CollectionAdminTableView.


To add a model to the frontend, we follow these steps, using the category model as an example:

  • Create models/CategoryAdminModel.js: This model extends AdminModel, an extension of Backbone.Model. It defines properties that are needed by Backbone, like the API endpoint, as well as custom properties that are needed by the views (and yes, that is not a perfect separation of data and presentation, but life must go on). The custom properties include a mapping of attributes to form field types, any custom buttons and modals, attributes to filter by in the table view, and more.
    define(["jquery",
              "backbone",
              "underscore",
              "js/core/coursera",
              "pages/site-admin/models/AdminModel"
              ],
    function($, Backbone, _, Coursera, AdminModel) {
    
      var model = AdminModel.extend({
    
        url: 'admin/categories',
        webUrlLabel: 'categories',
        label: 'Category',
    
        displayName: function() {
          return this.get('name');
        },
    
        fieldsets: function() {
    
          return [{
            name: 'name',
            type: 'text'
          }, {
            name: 'short_name',
            type: 'text',
            readonly: !this.isNew()
          }];
        }
    
      });
    
      return model;
    
    });
    
  • Create collections/CategoriesAdminCollection.js: This extends AdminCollection, and specifies a handful of properties needed by Backbone and our views. The bulk of the custom logic is in the model, not the collection.
    define(["backbone",
              "js/core/coursera",
              "pages/site-admin/collections/AdminCollection",
              "pages/site-admin/models/CategoryAdminModel"
              ],
    
    function(Backbone, Coursera, AdminCollection, CategoryAdminModel) {
    
      var collection = AdminCollection.extend({
    
        url: 'admin/categories',
    
        webUrlLabel: 'categories',
    
        label: 'Categories',
    
        model: CategoryAdminModel
    
      });
    
      return collection;
    });
    
  • Update site-admin/routes.js: This routes file extends Backbone.Router and defines generic URLs that can handle any AdminModel or AdminCollection. We add the new models and collections to modelAcls and collectionAcls, respectively, so that we know what collections to link a user to in the dashboard view. We enforce actual ACLs on the server-side, of course.

The future

The site admin API and frontend have served us reasonably well as we have grown to want more editing abilities and workflow improvements, but there is much to be improved upon.

Collaborative Editing

While developing site admin by myself in my living room, there was one thing that never occurred to me: I was building a collaborative editing interface. We have many different sorts of admins that edit course data, everyone from super users to university admins to instructors to TAs, and sometimes, there could be multiple admins editing at once. I didn't realize this, however, until after we unleashed site admin for a big launch and I got emails at 4am about instructors losing data and freaking out. I quickly realized how easily that could happen, if two staff were working on a course on different machines. As a quick fix, I added a notion of "protected fields" - fields that weren't allowed to go from something to nothing - and that prevented the worst case of admins losing all the text they'd painstakingly entered. It does add a problem, however, of admins legitimately wanting to clear fields sometimes, and engineers needing to manually make that change. It also doesn't protect against losing incremental data in a field.
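
To make the "protected fields" rule concrete, here is a minimal sketch of how such a check could work, assuming the saved and incoming data are plain dicts. The function name and signature are made up for illustration and are not the site admin implementation:

```python
def apply_protected_fields(current, incoming, protected_fields):
    """Return the data to save, refusing to blank out protected fields.

    A protected field may change from one non-empty value to another,
    but an empty (or missing) incoming value is replaced by the value
    that is already stored.
    """
    merged = dict(incoming)
    for field in protected_fields:
        if current.get(field) and not incoming.get(field):
            merged[field] = current[field]
    return merged


current = {'name': 'Intro to Logic', 'description': 'Long text'}
incoming = {'name': 'Logic 101', 'description': ''}
saved = apply_protected_fields(current, incoming, ['name', 'description'])
# The rename goes through; the accidental blanking of the description does not.
```

The trade-off described above is visible here: there is no way for an admin to intentionally clear a protected field through this code path.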

To make site admin work better for multiple editors, there are a few approaches we've thought of, which could be combined in some optimal way:

  • Partial updates: Currently site admin does a PUT of the full data of the model, and saves it wholly. An approach we take in many of our frontends now is to use Backbone's changedAttributes to track what changed, and only do an HTTP PATCH and partial update of the model. That would mean two admins could edit different fields and not worry about overwriting each other's changes.
  • Real-time update: We could poll for updates to the model, updating the fields when we see changes. If the admin is currently editing a field that was updated, we could ask them to confirm the overwrite or show them the changed version.
  • Notifications: We could keep track of what admins are on a page, and alert them about the possibility of concurrent edits, which might encourage them to consult with each other about what changes they are making.
  • Change confirmations: We could detect that something had changed since the admin started editing, and prompt the admin to confirm that yes, indeed, they are okay with that change.
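
The first and last of those ideas can be sketched together on the server side. This is only an illustration under assumed data shapes (plain dicts, and a snapshot of what the admin originally loaded sent along with the patch); the helper names are hypothetical:

```python
def changed_attributes(original, edited):
    """Return only the attributes whose values differ from the original.

    Plays the role that Backbone's changedAttributes plays on the client.
    """
    return dict((k, v) for k, v in edited.items() if original.get(k) != v)


def apply_patch(server_copy, base_snapshot, patch):
    """Apply a partial update, reporting fields someone else changed.

    server_copy:   what is stored right now
    base_snapshot: what the admin loaded before they started editing
    patch:         only the attributes the admin changed
    Returns (new_copy, conflicts); conflicting fields are left untouched
    so the admin can be asked to confirm them.
    """
    conflicts = sorted(
        key for key in patch
        if server_copy.get(key) != base_snapshot.get(key)
        and server_copy.get(key) != patch[key])
    new_copy = dict(server_copy)
    for key, value in patch.items():
        if key not in conflicts:
            new_copy[key] = value
    return new_copy, conflicts


base = {'name': 'Logic', 'description': 'old', 'short_name': 'logic'}
alice = changed_attributes(base, dict(base, description='alice text'))
bob = changed_attributes(base, dict(base, name='Logic 101'))

server, conflicts_a = apply_patch(base, base, alice)
server, conflicts_b = apply_patch(server, base, bob)
# Alice and Bob touched different fields, so both edits survive.

carol = {'description': 'carol text'}
server_after_carol, conflicts_c = apply_patch(server, base, carol)
# Carol edited a field Alice already changed, so it is flagged instead of saved.
```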

DRYer Permissions

We currently have code in the frontend that mirrors the permissions on the backend, for example to figure out which collections a particular admin has access to at all, and which fields should be read-only for particular admins. This code is problematic since it means twice the code to update whenever a permission changes, and it's also dangerous because an engineer could fool themselves into thinking that they'd secured something properly, if they did not put a test on the backend.

A better approach might be an API that sends down the permissions for an admin, perhaps using the HTTP OPTIONS verb. For example, an HTTP OPTIONS request to 'admin/api' might return the following:

{"collection": ["categories", "universities", "instructors"]}

When loading the form for a particular model, the HTTP OPTIONS request might return the following, based on inspecting the Django Form instances:

{"fields": {
   "short_name": {"read_only": false, "restriction": "[a-zA-Z]", "max_length": "20"}
}}

Those restrictions would also depend on whether the model was new or existing, and that would need to be represented in that API.
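
A minimal sketch of how such a payload could be assembled, including the new-versus-existing distinction. FieldSpec is a hypothetical stand-in for whatever would actually be read off the Django Form, and max_length is emitted as a number here rather than the string shown above:

```python
from collections import namedtuple

# Hypothetical description of one form field; not a Django class.
FieldSpec = namedtuple(
    'FieldSpec', ['name', 'max_length', 'pattern', 'editable_after_create'])


def describe_fields(field_specs, is_new):
    """Build an OPTIONS-style payload describing each field's restrictions."""
    fields = {}
    for spec in field_specs:
        fields[spec.name] = {
            # Some fields lock once the model exists (e.g. short_name).
            'read_only': (not is_new) and (not spec.editable_after_create),
            'restriction': spec.pattern,
            'max_length': spec.max_length,
        }
    return {'fields': fields}


specs = [
    FieldSpec('short_name', 20, r'[a-z0-9-\.]+', False),
    FieldSpec('name', 60, None, True),
]
payload_for_new = describe_fields(specs, is_new=True)
payload_for_existing = describe_fields(specs, is_new=False)
```

The frontend could then render fields as read-only based on this response instead of duplicating the rule that short_name locks after creation.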

Soft Delete

When we do a delete in site admin now, it does a true delete, removing the rows and related rows from the database tables. That is a scary operation, of course, since it means that the only way to get back the data is to find it in an old database backup, provided we still have it around for the time of deletion.

We would prefer to do a soft delete, which would mean adding a "deleted" column to each model, setting that to true upon deletion, and changing all of our APIs to honor that deleted column when fetching data. This also makes it easier to do an undo. The biggest hurdle to this change would be auditing all of the APIs that call upon that data to make sure they respect the change.
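
Here is a small sketch of the soft-delete idea using Python's built-in sqlite3 module rather than our Django models, so it can run on its own. The table and function names are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE category ('
    ' id INTEGER PRIMARY KEY,'
    ' name TEXT,'
    ' deleted INTEGER NOT NULL DEFAULT 0)')
conn.executemany('INSERT INTO category (name) VALUES (?)',
                 [('Math',), ('Art',)])


def soft_delete(conn, category_id):
    # Flag the row instead of removing it.
    conn.execute('UPDATE category SET deleted = 1 WHERE id = ?',
                 (category_id,))


def undelete(conn, category_id):
    # Undo is just flipping the flag back.
    conn.execute('UPDATE category SET deleted = 0 WHERE id = ?',
                 (category_id,))


def live_categories(conn):
    # Every read path has to remember this filter; that is the audit cost.
    rows = conn.execute(
        'SELECT name FROM category WHERE deleted = 0 ORDER BY id')
    return [row[0] for row in rows]


soft_delete(conn, 1)
after_delete = live_categories(conn)
undelete(conn, 1)
after_undo = live_categories(conn)
```

In Django itself, one plausible way to centralize the filter would be a custom model manager, though that still would not cover raw SQL or other services reading the same tables.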

Drafts vs. Master

When an admin edits and saves their changes in site admin now, the changes are immediately live. Admins don't always want this; they often want to be able to preview their changes, feel confident in them, and then make them live. We have this in place for our course admin data, like quizzes and lectures, but bringing it to site admin would require significant changes to the database tables, admin APIs, and user-facing APIs. A step in the right direction might be to make it possible for admins to preview their course pages with the unsaved data by sending it through an iframe and postMessage, perhaps. That is made more difficult by the fact that our admin APIs represent the data in a very different form than the user-facing APIs, however.

Improved Logs

We currently record who did what to which model, but the "what" is only whether it was a creation, edit, or delete. Ideally, we would also include exactly what fields were changed, and what the diff was between the fields. That would be made much easier to do by adding HTTP PATCH support. We also need better pagination and searching of the logs, and we may want to expose them to non-super-users.
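
As a sketch of what a richer log entry could contain, the following builds a per-field diff with the standard library's difflib. It assumes all field values are strings, and the entry format is invented for illustration:

```python
import difflib


def build_log_entry(user, action, before, after):
    """Describe an edit as a unified diff for each field that changed."""
    changes = {}
    for field in sorted(set(before) | set(after)):
        old, new = before.get(field, ''), after.get(field, '')
        if old != new:
            diff = difflib.unified_diff(
                old.splitlines(), new.splitlines(),
                fromfile='before', tofile='after', lineterm='')
            changes[field] = '\n'.join(diff)
    return {'user': user, 'action': action, 'changes': changes}


entry = build_log_entry(
    'admin1', 'edit',
    {'name': 'Logic', 'description': 'line one\nline two'},
    {'name': 'Logic', 'description': 'line one\nline 2'})
# Only "description" appears in entry['changes'], with -/+ lines for the edit.
```

With HTTP PATCH, the "after" dict would only contain the changed fields, so the server would not need to compare every attribute to find them.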

Workflow-Based vs. Model-Based

Django admin revolves around models, assuming that the admin thinks to themselves, "Yes, I'd like to edit such and such today." However, as it turns out, admins more often think in terms of goals or workflows, such as "Today I'd like to release grades." or "Today I'd like to open the course for enrollment and let all the subscribers know." These workflows often involve different models at each step, and it is unintuitive for the user to have to figure out that sequence. For a few of them, we've created "Checklist" views, with steps and links to the relevant models, anchored at the form element that should be edited. But I think it would be worth it to re-think the admin interface from scratch with the idea of workflows being a first-class citizen, instead of something we've tacked on at the end.
