Thursday, July 24, 2025

Automated repo maintenance via GitHub Copilot coding agent

I have a problem: I'm addicted to making new repositories on GitHub. As part of my advocacy role at Microsoft, my goal is to show developers how to combine technology X with technology Y, and a repository is a great way to prove it. But that means I now have hundreds of repositories that I am trying to keep working, and they require constant upgrades:

  • Upgraded Python packages, npm packages, GitHub Actions
  • Improved Python tooling (like moving from pip to uv, or black to ruff)
  • Hosted API changes (versions, URLs, deprecations)
  • Infrastructure upgrades (Bicep/Terraform changes)

All of those changes are necessary to keep the repositories working well, but they're pretty boring to make, and they're very repetitive. In theory, GitHub already offers Dependabot to manage package upgrades, but Dependabot hasn't worked for my more complex Python setups, so I often have to manually take over the Dependabot PRs. These are the kinds of changes that I want to delegate, so that I can focus on new features and technologies.

Fortunately, GitHub has introduced the GitHub Copilot coding agent, an autonomous agent powered by LLMs and MCP servers that can be assigned issues in your repositories. When you assign an issue to the agent, it will create a PR for the issue, put a plan in that PR, and ask for a review when it's made all the changes necessary. If you have comments, it can continue to iterate, asking for a review each time it thinks it's got it working.

I started off with some manual experimentation to see if GitHub Copilot could handle repo maintenance tasks, like tricky package upgrades. It did well enough that I then coded GitHub Repo Maintainer, a tool that searches for all my repos that require a particular maintenance task and creates issues for @Copilot in those repos with detailed task descriptions.

Here's what an example issue looks like:

Screenshot of issue assigned to Copilot agent

A few minutes after filing the issue, Copilot agent sends a pull request to address the issue:

Screenshot of PR from Copilot agent

To give you a feel for the kinds of issues that I've assigned to Copilot, here are more examples:

  • Update GitHub Actions workflow to use ubuntu-latest: This was an easy task. The only issue was with a more complex workflow where the latest ubuntu had a conflict with an additional service, and it came up with a roundabout way of fixing that.
  • Update Bicep to new syntax: This worked well when I provided it the exact new syntax to use. When I only told it that the old Bicep syntax was deprecated, it came up with a more convoluted fix and also tried to fix all the Bicep warnings. It got sidetracked because the agent uses "az bicep build" to check the Bicep syntax validity, and that tool includes warnings by default. Copilot generally likes to be a do-gooder and fix warnings too, so for Bicep-related tasks I will often tell it explicitly: "ignore the warnings, just fix the errors".
  • Upgrade a tricky Python package: This was a harder upgrade as it required upgrading another package at the same time, something Dependabot had failed to do. Copilot was able to work it out, but only once I pointed out that the CI failed and reminded it to make sure to pip install the requirements file.
  • Update a deprecated URL: This was easy for it, especially because my tool tells it exactly which files it found the old URLs in.
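For the deprecated URL case, here is a minimal sketch of how a tool might collect the affected files and turn them into an issue body. The function names, the list of file suffixes, and the wording of the issue are all assumptions for illustration, not the actual code from GitHub Repo Maintainer:

```python
from pathlib import Path

# Hypothetical defaults: which file types are worth scanning for URLs.
TEXT_SUFFIXES = (".py", ".md", ".bicep", ".yaml", ".yml", ".json", ".txt")


def find_files_with_url(repo_root: Path, old_url: str, suffixes=TEXT_SUFFIXES) -> list[str]:
    """Return sorted repo-relative paths of files that contain old_url."""
    matches = []
    for path in repo_root.rglob("*"):
        rel = path.relative_to(repo_root)
        # Skip directories, git internals, and file types we don't scan.
        if not path.is_file() or ".git" in rel.parts or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # unreadable or binary file
        if old_url in text:
            matches.append(rel.as_posix())
    return sorted(matches)


def build_url_issue_body(old_url: str, new_url: str, files: list[str]) -> str:
    """Build an issue description that names every file to change."""
    file_list = "\n".join(f"- `{name}`" for name in files)
    return (
        f"Replace the deprecated URL `{old_url}` with `{new_url}`.\n\n"
        f"The old URL appears in these files:\n{file_list}\n"
    )
```

Listing the files up front means the agent doesn't have to search for them itself, which is likely part of why this kind of task goes smoothly.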

Generally, a good strategy has been to verify the right fix in one repo first, and then send that well-crafted issue to the other affected repos.

How to assign issues to GitHub Copilot

The GitHub documentation has a great guide on using the UI, API, or CLI to assign issues to the GitHub Copilot coding agent. When using the API, we first have to check whether the Copilot agent is enabled, by querying whether the repository's suggestedActors includes copilot-swe-agent. If it does, we grab the id of the agent and use that id as the assignee when creating a new issue.

Here's what it looks like in Python to find the ID for the agent:

# Assumes `import httpx` and GITHUB_GRAPHQL_URL = "https://api.github.com/graphql" at module level.
async def get_repo_and_copilot_ids(self, repo):
  headers = {"Authorization": f"Bearer {self.auth_token}", "Accept": "application/vnd.github+json"}
  # Ask for the repository ID plus every actor that can be assigned issues.
  query = '''
    query($owner: String!, $name: String!) {
      repository(owner: $owner, name: $name) {
        id
        suggestedActors(capabilities: [CAN_BE_ASSIGNED], first: 100) {
          nodes {
            login
            __typename
            ... on Bot { id }
          }
        }
      }
    }
  '''
  variables = {"owner": repo.owner, "name": repo.name}

  async with httpx.AsyncClient(timeout=self.timeout) as client:
    resp = await client.post(GITHUB_GRAPHQL_URL, headers=headers, json={"query": query, "variables": variables})
    resp.raise_for_status()
    data = resp.json()
  repo_id = data["data"]["repository"]["id"]
  # The coding agent appears among the suggested actors with the login "copilot-swe-agent".
  copilot_node = next((n for n in data["data"]["repository"]["suggestedActors"]["nodes"]
      if n["login"] == "copilot-swe-agent"), None)
  if not copilot_node or not copilot_node.get("id"):
    raise RuntimeError("Copilot is not assignable in this repository.")
  return repo_id, copilot_node["id"]

The issue creation function uses that ID for the assignee IDs:

async def create_issue_graphql(self, repo, issue):
  repo_id, copilot_id = await self.get_repo_and_copilot_ids(repo)
  headers = {"Authorization": f"Bearer {self.auth_token}", "Accept": "application/vnd.github+json"}
  mutation = '''
  mutation($input: CreateIssueInput!) {
    createIssue(input: $input) {
      issue {
        id
        number
        title
        url
      }
    }
  }
  '''
  input_obj = {
    "repositoryId": repo_id,
    "title": issue.title,
    "body": issue.body,
    "assigneeIds": [copilot_id],
  }
  async with httpx.AsyncClient(timeout=self.timeout) as client:
    resp = await client.post(GITHUB_GRAPHQL_URL, headers=headers,
        json={"query": mutation, "variables": {"input": input_obj}})
    resp.raise_for_status()
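    data = resp.json()
  # GraphQL reports failures in an "errors" list, even on an HTTP 200 response.
  issue_data = ((data.get("data") or {}).get("createIssue") or {}).get("issue")
  if not issue_data:
    raise RuntimeError(f"Issue creation failed: {data.get('errors')}")
  return {
    "number": issue_data["number"],
    "html_url": issue_data["url"]
  }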
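Once one issue description is known to work, the same issue can be filed across many repos. Below is a rough sketch of that fan-out, assuming a client object that exposes the create_issue_graphql method shown above. The Repo and Issue dataclasses, the file_issue_in_repos helper, and the concurrency limit are hypothetical names and values introduced for this example:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Repo:  # hypothetical stand-in for the tool's repo object
    owner: str
    name: str


@dataclass
class Issue:  # hypothetical stand-in for the tool's issue object
    title: str
    body: str


async def file_issue_in_repos(client, repos, issue, max_concurrent=5):
    """File the same issue in every repo, recording repos that had to be skipped."""
    semaphore = asyncio.Semaphore(max_concurrent)  # limit simultaneous API calls

    async def file_one(repo):
        async with semaphore:
            try:
                return repo, await client.create_issue_graphql(repo, issue), None
            except RuntimeError as err:
                # Raised, for example, when Copilot is not assignable in the repo.
                return repo, None, err

    results = await asyncio.gather(*(file_one(repo) for repo in repos))
    created = {f"{r.owner}/{r.name}": res["html_url"] for r, res, err in results if err is None}
    skipped = {f"{r.owner}/{r.name}": str(err) for r, res, err in results if err is not None}
    return created, skipped
```

Only RuntimeError is caught here, so HTTP errors raised by raise_for_status() still stop the run. That seems like the safer default when a token or rate limit problem would affect every repo anyway.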

Lessons learned (so far!)

I've discovered that there are several intentional limitations on the behavior of the @Copilot agent:

  • Workflows must be approved before running: Typically, when a human contributor submits a pull request, and they're an existing contributor to the repository, the workflows automatically run on their PRs, and the contributor can see quickly if they need to fix any CI failures. For security reasons, GitHub requires a human to press "Approve and run workflows" on each push to a @Copilot PR. I will often press that, see that the CI failed, and comment @Copilot to address the CI failures. I would love to skip that manual process on my side, but I understand why GitHub is erring on the side of security here. See more details in their Copilot risk mitigation docs.
  • PRs must be marked "ready for review": Once again, typically a human contributor would start a PR in draft and mark it as "ready for review" before requesting a review. The @Copilot agent does not mark it as ready, and instead requires a human reviewer to mark it for them. According to my discussion with the GitHub team in the Copilot agent issue tracker, this is intentional to avoid triggering required reviews. However, I am hoping that GitHub adds a repository setting to allow the agent itself to mark PRs as ready, so that I can skip that trivial manual step.

I've also realized a few common ways that the @Copilot agent makes unsatisfactory PRs, and have started crafting issue descriptions better to improve the agent's success. My issue descriptions now include...

  • Validation steps: The agent will try to execute any validation steps, so if there are any that make sense, like running a pip install, a linter, or a script, I include those in the issue description. For example, for Bicep changes, issues include "After making this change, run `az bicep build` on `infra/main.bicep` to ensure the Bicep syntax is valid."
  • How to make a venv: While testing its changes, the agent kept making Python virtual environments in directories other than ".venv", which is the only directory name that I use, and the one that's consistently in my .gitignore files. I would then see PRs that had 4,000 changed files, due to an accidentally checked in virtual environment folder. Now, in my descriptions, I tell it explicitly to create the venv in ".venv".
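Both of those additions are easy to template. Here is a minimal sketch of an issue body builder that appends the venv instruction and numbered validation steps. The function name, its parameters, and the exact wording are assumptions for illustration rather than the text my tool generates:

```python
def build_issue_body(task: str, validation_steps: list[str], uses_python: bool = True) -> str:
    """Combine the task description with venv guidance and validation steps."""
    lines = [task.strip(), ""]
    if uses_python:
        # Keep the virtual environment in the one directory that .gitignore covers.
        lines += [
            "If you need a Python virtual environment, create it in `.venv` "
            "(for example, `python -m venv .venv`). Do not commit it.",
            "",
        ]
    if validation_steps:
        lines.append("After making this change, validate it:")
        lines += [f"{i}. {step}" for i, step in enumerate(validation_steps, start=1)]
    return "\n".join(lines).rstrip() + "\n"


body = build_issue_body(
    "Update the Bicep files to the new syntax.",
    ["Run `az bicep build` on `infra/main.bicep` to ensure the Bicep syntax is valid."],
    uses_python=False,
)
```

Keeping the steps in a list makes it simple to reuse the same validation commands across every repo that gets the same issue.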

It's early days, but I'm pretty excited that there's a way that I can keep making a ridiculous number of repositories and keep them well maintained. Definitely check out the GitHub Copilot coding agent to see if there are ways that it can help you automate the boring parts of repository maintenance.
