A big part of my job in Python advocacy at Microsoft is to create and maintain code samples, like examples of how to deploy to Azure using FastAPI, Flask, or Django. We've recently undergone an effort to standardize our best practices across samples. Most best practices are straightforward, like using ruff for linting and black for PEP 8 formatting, but there's one area where the jury's still out: dependency management. Here are the approaches we've tried and the ways in which each has failed us. I'm writing this post in hopes of getting feedback from other maintainers on the best strategy.
Unpinned package requirements files
Quite a few of our samples simply provide a requirements.txt without versions, such as:
quart
uvicorn[standard]
langchain
openai
tiktoken
azure-identity
azure-search-documents
azure-storage-blob
The benefit of this approach is that a developer installing the requirements will automatically get the latest version of every package. However, that same benefit is also its curse:
- What happens when the sample is no longer compatible with the latest version? The goal of our samples is usually somewhat orthogonal to the exact technologies used, like getting an app deployed on App Service, and we generally want to prioritize a working sample over a sample that is using the very latest version. We could say, well, we'll just wait for a bug report from users, and then we'll scramble to fix it. But that assumes users will make reports and that we have the resources to scramble to fix old samples at any point.
- What if a developer bases their production code off the sample, and never ends up pinning versions? They may end up deploying that code to production, without tests, and be very sad when they realize their code is broken, and they don't necessarily know what version update caused the breakage.
So we have been trying to move away from the bare package listings, since neither of those situations is good.
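To find the samples that still rely on bare listings, a small check along these lines could run in CI. This is only a sketch: the function name `unpinned_requirements` is made up for illustration, and the regular expression covers simple `name==version` lines rather than the full requirements file syntax (no URLs, no environment markers before the pin).

```python
import re

# Matches an exact pin such as "quart==0.18.4" or "uvicorn[standard]==0.23.2".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==\S+")


def unpinned_requirements(lines):
    """Return the requirement lines that lack an exact '==' pin."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options like "-r"
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


sample = ["quart", "uvicorn[standard]", "openai==0.27.8", "# a comment", ""]
print(unpinned_requirements(sample))  # ['quart', 'uvicorn[standard]']
```

In a real repo, the list passed in would come from reading requirements.txt line by line, and a non-empty result would fail the build.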
Pinned direct dependencies
The next step is a requirements.txt file that pins known working versions of each direct dependency, such as:
quart==0.18.4
uvicorn[standard]==0.23.2
langchain==0.0.187
openai[datalib]==0.27.8
tiktoken==0.4.0
azure-identity==1.13.0
azure-search-documents==11.4.0b6
azure-storage-blob==12.14.1
With this approach, we also set up a dependabot.yaml file so that GitHub emails us every week when new versions are available, and we run tests in GitHub Actions so that we can use the pass/fail state to reason about whether a new version upgrade is safe to merge.
I was pretty happy with this approach, until it all fell apart one day. The quart library brings in the werkzeug library, and a new version of werkzeug came out that was incompatible with the pinned version of quart (which was also the latest). That meant that every developer who had our sample checked out suddenly saw a funky error upon installing requirements, caused by quart trying to use a feature no longer available in werkzeug. I immediately pinned an issue with workarounds for developers, but I still got DMs and emails from developers trying to figure out this sudden new error in previously working code.
I felt pretty bad as I'd heard developers warning about only pinning direct dependencies, but I'd never experienced an issue like this first-hand. Well, now I have, and I will never forget! I think this kind of situation is particularly painful for code samples, where we have hundreds of developers using code that they didn't originally write, so we don't want to put them in a situation where they have to fix a bug they didn't introduce and lack the context to quickly understand.
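One way to see which indirect dependencies are left floating is to ask the installed package metadata what each direct dependency declares. The sketch below uses the standard library's `importlib.metadata`; the helper names `requirement_name` and `declared_dependencies` are hypothetical, and the name parsing is deliberately simplistic. It only reports what is installed in the current environment, so the result may differ on another platform.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the bare package name from a string like 'Werkzeug>=2.2.0'."""
    match = re.match(r"[A-Za-z0-9._-]+", requirement.strip())
    return match.group(0).lower() if match else None


def declared_dependencies(package):
    """List the names a locally installed package declares as dependencies.

    Returns an empty list if the package is not installed. Dependencies that
    only apply to optional extras are included too.
    """
    try:
        requires = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []
    return sorted({requirement_name(r) for r in requires} - {None})


# With quart installed, werkzeug should show up in this list even though
# it never appears in a requirements.txt that pins only direct dependencies.
print(declared_dependencies("quart"))
```

Any name in that output that is missing from requirements.txt is a version pip is free to choose at install time.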
Compiled direct & indirect dependencies
I made a pull request for that repo to use pip-tools to compile pinned versions of all dependencies. Here's a snippet of the compiled file:
uvicorn[standard]==0.23.2
    # via -r app/backend/requirements.in
uvloop==0.17.0
    # via uvicorn
watchfiles==0.20.0
    # via uvicorn
websockets==11.0.3
    # via uvicorn
werkzeug==3.0.0
    # via
    #   flask
    #   quart
I assumed naively that I had it all figured out: this was the approach that we should use for all repos going forward! No more randomly introduced errors!
Unfortunately, I started getting reports that Windows users were no longer able to run the local server, with an error message that "uvloop is not supported on Windows". After some digging, I realized that our requirement of uvicorn[standard] brought in certain dependencies only in certain environments, including uvloop for Linux environments. Since I ran pip-compile in a Linux environment, the resulting requirements.txt included uvloop, a package that doesn't work on Windows. Uh oh!
I realized that our app didn't actually need the additional uvloop requirement, so I changed the dependency from uvicorn[standard] to uvicorn, and that resolved that issue. But I was lucky! What if there was a situation where we did need a particular environment-specific dependency? What approach would we use then?
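One partial answer I'm aware of: pip's requirements file format accepts PEP 508 environment markers, so a pinned line can be limited to certain platforms. Here's a rough sketch of post-processing a compiled file to attach such a marker. The `add_marker` helper is hypothetical, the choice of `sys_platform != "win32"` is my assumption about where uvloop should be installed, and any hand edit like this would be overwritten the next time pip-compile regenerates the file.

```python
def add_marker(lines, package, marker):
    """Append a PEP 508 environment marker to the pinned line for `package`."""
    result = []
    for line in lines:
        # The text before "==" is the package name (plus extras, if any).
        name = line.split("==", 1)[0].strip().lower()
        if name == package.lower() and ";" not in line:
            line = f"{line.rstrip()} ; {marker}"
        result.append(line)
    return result


compiled = [
    "uvicorn[standard]==0.23.2",
    "    # via -r app/backend/requirements.in",
    "uvloop==0.17.0",
    "    # via uvicorn",
]
for line in add_marker(compiled, "uvloop", 'sys_platform != "win32"'):
    print(line)
```

With the marker in place, pip skips that line on Windows and installs the pinned version everywhere else. It doesn't solve the general problem, though, since I'd have to know in advance which packages are environment-specific.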
I imagine the answer is to use some other tool that can both pin indirect dependencies and obey environment conditionals, and I know there are tools like poetry and hatch, but I'm not an expert in them. So, please, I request your help: what approach would avoid the issues we've run into with the three strategies described here? Thank you! 🙏🏼