Wednesday, February 22, 2023

Managing Python dependency versions for web projects

Though I've worked on several production Python codebases, I've never been in charge of managing the dependencies. However, I now find myself developing many templates to help Python devs get started with web apps on Azure, and I want to set those up with best practices for dependency management. After discussing with my fellow Python advocates and asking on Mastodon, this seems to be the most commonly used approach:

  1. Pin all the production requirements for the web app. (Not necessary to pin development requirements, like linters.)
  2. Use a service like PyUp or GitHub Dependabot to notify you when a dependency can be upgraded.
  3. As long as all tests pass with the newest version, update to the new version. This assumes full test coverage, and as Brian Okken says, "you’re not testing enough, probably, test more." And if you really trust your tests, use a tool like Anthony Shaw's Dependa-lot-bot to auto-merge GitHub PRs when all checks pass.

I've now gone through and made that change in my web app templates, so here's what that looks like in an example repo. Let's use flask-surveys-container-app, a containerized Flask app with a PostgreSQL database.


Pin the production requirements

My app already had a requirements.txt, but without any versions:

Flask
Flask-Migrate
Flask-SQLAlchemy
Flask-WTF
psycopg2
python-dotenv
SQLAlchemy
gunicorn
azure-keyvault-secrets
azure.identity

To figure out the current versions, I ran python3 -m pip freeze and copied the versions in:

Flask==2.2.3
Flask-Migrate==4.0.4
Flask-SQLAlchemy==3.0.3
Flask-WTF==1.1.1
psycopg2==2.9.5
python-dotenv==0.21.1
SQLAlchemy==2.0.4
gunicorn==20.1.0
azure-keyvault-secrets==4.6.0
azure-identity==1.12.0
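
If you'd rather script that copy step than do it by hand, here's a minimal sketch using only the standard library. The function names (normalize, pin_requirements, installed_versions) are made up for this illustration. Names are compared in normalized form, following the PEP 503 normalization rule, so azure.identity and azure-identity count as the same package.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name following PEP 503 (lowercase, runs of -_. become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pin_requirements(names, installed):
    """Return 'name==version' for each name found in `installed`.

    `installed` maps normalized project names to version strings.
    Names with no known version are returned unpinned.
    """
    pinned = []
    for name in names:
        version = installed.get(normalize(name))
        pinned.append(f"{name}=={version}" if version else name)
    return pinned


def installed_versions():
    """Collect the versions of every distribution in the current environment."""
    return {
        normalize(dist.metadata["Name"]): dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    }


# Hard-coded versions so the example doesn't depend on your environment;
# pass installed_versions() instead to pin against what is actually installed.
example = {"flask": "2.2.3", "azure-identity": "1.12.0"}
print(pin_requirements(["Flask", "azure.identity", "gunicorn"], example))
# prints ['Flask==2.2.3', 'azure.identity==1.12.0', 'gunicorn']
```

Anything that comes back unpinned (like gunicorn in the example) isn't installed in the environment you ran it in, which is worth checking before you commit the file.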

Add GitHub Dependabot

The first time that I set up Dependabot for a repo, I used the GitHub UI to automatically create the dependabot.yaml file inside the .github folder. After that, I copied the same file into every repo, since all my repos use the same options. Here's what the file looks like for an app that uses pip and stores requirements.txt in the root folder:

version: 2
updates:
  - package-ecosystem: "pip" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"

If your project has a different setup, read through the GitHub docs to learn how to configure Dependabot.
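
Copying the same file into many repos is easy to script. Here's a rough sketch, assuming each repo is checked out locally and uses the pip setup described above. The function name add_dependabot_config and the overwrite flag are hypothetical, and existing files are left alone unless you ask otherwise.

```python
from pathlib import Path

# Same options as the pip configuration shown above, minus the comments.
DEPENDABOT_CONFIG = """\
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
"""


def add_dependabot_config(repo_dir, overwrite=False):
    """Write .github/dependabot.yaml into repo_dir.

    Returns the path that was written, or None if the file already
    existed and overwrite is False.
    """
    target = Path(repo_dir) / ".github" / "dependabot.yaml"
    if target.exists() and not overwrite:
        return None
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(DEPENDABOT_CONFIG, encoding="utf-8")
    return target
```

You could call it in a loop over your local checkouts, then review and commit the new file in each repo as usual.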


Ensure the CI runs tests

For this system to work well, it really helps if there's an automated workflow that runs tests and checks test coverage. Here's a snippet from the Python workflow file for the Flask app:

- name: Install dependencies
  run: |
    python -m pip install --upgrade pip
    pip install -r requirements-dev.txt
- name: Run Pytest tests
  run: pytest
  env:
    DBHOST: localhost
    DBUSER: postgres
    DBPASS: postgres
    DBNAME: postgres

To make sure the tests are actually testing the full codebase, I use coverage (through the pytest-cov plugin, which provides the --cov flag) and make the tests fail if the coverage isn't high enough. Here's how I configure pyproject.toml to make that happen:

[tool.pytest.ini_options]
addopts = "-ra --cov"
testpaths = [
    "tests"
]
pythonpath = ['.']

[tool.coverage.report]
show_missing = true
fail_under = 100
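
To make the effect of fail_under concrete, here's a deliberately simplified sketch of the gate it describes. This is not how coverage.py is implemented (the real tool also deals with branch coverage and rounding); the function names are invented for illustration.

```python
def coverage_percent(covered_lines, total_lines):
    """Percentage of lines executed by the tests (an empty project counts as 100)."""
    if total_lines == 0:
        return 100.0
    return 100.0 * covered_lines / total_lines


def passes_gate(covered_lines, total_lines, fail_under=100.0):
    """True if total coverage meets the fail_under threshold."""
    return coverage_percent(covered_lines, total_lines) >= fail_under


# With fail_under = 100, a single untested line is enough to fail the run.
print(passes_gate(200, 200))  # prints True
print(passes_gate(199, 200))  # prints False
```

With the threshold at 100, any new dependency-related code path that lacks a test fails the build, which is what makes the automated upgrades trustworthy.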

Install Dependa-lot-bot

Since I have quite a few Python web app repos (for Azure samples), I expect to soon see my inbox flooded with Dependabot pull requests. To help me manage them, I installed Dependa-lot-bot, which should auto-merge GitHub PRs when all checks pass.

I added a .github/dependabot-bot.yaml file to tell the bot which packages are safe to merge:

safe:
 - Flask
 - Flask-Migrate
 - Flask-SQLAlchemy
 - Flask-WTF
 - psycopg2
 - python-dotenv
 - SQLAlchemy
 - gunicorn
 - azure-keyvault-secrets
 - azure-identity

Notably, that's every single package in this project's requirements, which may be a bit risky. Ideally, if my tests are comprehensive enough, they should notify me if there's an actual issue. If they don't, and an issue shows up in the deployed version, then that indicates my tests need to be improved.
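
The policy described here boils down to two conditions: the package is on the safe list, and every check passed. Here's a sketch of that decision, as an illustration of the rule rather than the bot's actual code. The function names are hypothetical, and comparing names in normalized (PEP 503) form is an assumption of this sketch.

```python
import re


def normalize(name):
    """Normalize a project name following PEP 503 (lowercase, runs of -_. become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def should_auto_merge(package, safe_packages, checks_passed):
    """Merge only when the package is on the safe list AND all checks passed."""
    safe = {normalize(p) for p in safe_packages}
    return checks_passed and normalize(package) in safe


safe = ["Flask", "SQLAlchemy", "azure-identity"]
print(should_auto_merge("flask", safe, checks_passed=True))     # prints True
print(should_auto_merge("flask", safe, checks_passed=False))    # prints False
print(should_auto_merge("requests", safe, checks_passed=True))  # prints False
```

Listing every package as safe means the checks are the only thing standing between an upgrade and the main branch, so the coverage threshold is doing real work here.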

I probably would not use this bot for a live website, like khanacademy.org, but it is helpful for the web app demo repos that I maintain.

So that's the process! Let me know if you have ideas for improvement or your own flavor of dependency management.
