Friday, January 16, 2026

Using on-behalf-of flow for Entra-based MCP servers

In December, we presented a series about MCP, culminating in a session about adding authentication to MCP servers. I demoed a Python MCP server that uses Microsoft Entra for authentication, requiring users to first log in to the Microsoft tenant before they could use a tool. Many developers asked how they could take the Entra integration further, such as checking the user's group membership or querying their OneDrive. That requires an "on-behalf-of" (OBO) flow, a form of delegation in OAuth, where the MCP server uses the user's identity to call another API, like the Microsoft Graph API. In this blog post, I will explain how to use Entra with the OBO flow in a Python FastMCP server.

How MCP servers can use Entra authentication

The MCP authorization specification is based on OAuth 2.1, with a few additional features layered on top. Every MCP client acts as an OAuth client, and each MCP server acts as an OAuth resource server.

Diagram of OAuth 2.1 entities with MCP client and server

MCP auth adds these features to help clients determine how to authorize a server:

  • Protected resource metadata (PRM): Implemented on the MCP server, it points clients to the authorization server and describes how to present tokens (for example, supported scopes and bearer methods)
  • Authorization server metadata: Implemented on the authorization server, gives URLs for OAuth2 endpoints
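
To make the first item more concrete, here is a minimal sketch of the kind of document a server might publish at /.well-known/oauth-protected-resource. The field names follow RFC 9728; the URLs and the build_prm helper name are placeholders I made up for illustration. When you use an SDK, you would not normally write this by hand.

```python
import json


def build_prm(resource_url: str, auth_server_url: str, scopes: list[str]) -> dict:
    """Build a protected resource metadata document (field names per RFC 9728)."""
    return {
        "resource": resource_url,
        "authorization_servers": [auth_server_url],
        "scopes_supported": scopes,
        "bearer_methods_supported": ["header"],
    }


# Placeholder URLs for a locally running MCP server (assumption for this example).
prm = build_prm("http://localhost:8000/mcp", "http://localhost:8000", ["mcp-access"])
print(json.dumps(prm, indent=2))
```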

Additionally, to allow MCP servers to work with arbitrary MCP clients, MCP auth supports either of these client registration methods:

  • Dynamic Client Registration (DCR): Implemented on the authorization server, it can register new MCP clients as OAuth2 clients, even if it hasn't seen them before.
  • Client ID Metadata Documents (CIMD): An alternative to DCR, this requires the MCP client to host a CIMD document at a public URL, and requires the authorization server to fetch that document for details about the client.
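
For a sense of what DCR involves, the sketch below builds the JSON body a client might POST to an authorization server's registration endpoint. The field names come from RFC 7591; the helper name, client name, and redirect URI are hypothetical, and real clients may send more or fewer fields.

```python
def build_dcr_request(client_name: str, redirect_uris: list[str]) -> dict:
    """Build a Dynamic Client Registration request body (field names per RFC 7591)."""
    return {
        "client_name": client_name,
        "redirect_uris": redirect_uris,
        "grant_types": ["authorization_code", "refresh_token"],
        "response_types": ["code"],
        # "none" marks a public client that has no client secret.
        "token_endpoint_auth_method": "none",
    }


# Hypothetical client details, for illustration only.
registration = build_dcr_request("Example MCP Client", ["http://127.0.0.1:33418"])
print(registration)
```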

Microsoft Entra does support authorization server metadata, but it does not support either DCR or CIMD. That's actually fine if you are building an MCP server that's only going to be used with pre-authorized clients, for example, if the server will only be used with VS Code or with a specific internal MCP client. But if you are building an MCP server that can be used with arbitrary MCP clients, then either DCR or CIMD is required. So what do we do?

Fortunately, the FastMCP SDK implements DCR on top of Entra using an OAuth proxy pattern. FastMCP acts as the authorization server, intercepting requests and forwarding them to Entra when needed, and storing OAuth client information in a designated store (such as in-memory or Cosmos DB).

Diagram of OAuth proxy pattern

Let's walk through the steps to set that up.

Registering the server with Entra

Before the server can use Entra to authorize users, we need to register the server with Entra via an app registration. We can create the registration using the Azure Portal, Azure CLI, Microsoft Graph SDK, or even Bicep. In this case, I use the Python MS Graph SDK as it allows me to specify everything programmatically.

First, I create the Entra app registration, specifying the sign-in audience (single-tenant), redirect URIs (including local MCP server and VS Code redirect URIs), and the scopes for the exposed API.

import uuid

from msgraph.generated.models.api_application import ApiApplication
from msgraph.generated.models.application import Application
from msgraph.generated.models.permission_scope import PermissionScope
from msgraph.generated.models.web_application import WebApplication

request_app = Application(
    display_name="FastMCP Server App",
    sign_in_audience="AzureADMyOrg",  # Single tenant
    web=WebApplication(
        redirect_uris=[
            "http://localhost:8000/auth/callback",  # Local MCP server
            "https://vscode.dev/redirect",  # VS Code
            "http://127.0.0.1:33418",  # VS Code local redirect
            "https://deployedurl.com/auth/callback",  # Deployed MCP server
        ],
    ),
    api=ApiApplication(
        oauth2_permission_scopes=[
            PermissionScope(
                id=uuid.uuid4(),
                admin_consent_display_name="Access FastMCP Server",
                admin_consent_description="Allows access to the FastMCP server as the signed-in user.",
                user_consent_display_name="Access FastMCP Server",
                user_consent_description="Allow access to the FastMCP server on your behalf",
                is_enabled=True,
                value="mcp-access",
                type="User",
            )
        ],
        requested_access_token_version=2,  # Required by FastMCP
    ),
)
app = await graph_client.applications.post(request_app)

await graph_client.applications.by_application_id(app.id).patch(
  Application(identifier_uris=[f"api://{app.app_id}"]))

Thanks to that configuration, when an MCP client like VS Code requests an OAuth2 token, it will request a token with the scope "api://{app.app_id}/mcp-access", and the FastMCP server will validate that incoming tokens contain that scope.
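
FastMCP performs the scope validation for you, but the idea is simple enough to sketch. Entra v2 access tokens list delegated scopes in the "scp" claim as a space-separated string of short scope names (without the api:// prefix). The has_required_scope helper below is a hypothetical illustration of that check, not FastMCP's actual implementation.

```python
def has_required_scope(claims: dict, required: str) -> bool:
    """Return True if the token's space-separated "scp" claim includes the scope."""
    granted = claims.get("scp", "").split()
    return required in granted


# Example claims as they might appear in a decoded access token (made-up values).
print(has_required_scope({"scp": "mcp-access"}, "mcp-access"))
print(has_required_scope({"scp": "other-scope"}, "mcp-access"))
```

Splitting on whitespace before comparing avoids accidentally matching a longer scope name that merely contains the required one.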

Next, I create a service principal for that Entra app registration, which represents the Entra app in my tenant:

request_principal = ServicePrincipal(app_id=app.app_id, display_name=app.display_name)
await graph_client.service_principals.post(request_principal)

I need a way for the server to prove that it can use that Entra app registration, so I register a secret:

password_credential = await graph_client.applications.by_application_id(app.id).add_password.post(
  AddPasswordPostRequestBody(
    password_credential=PasswordCredential(display_name="FastMCPSecret")))

Ideally, I would like to move away from secrets, as Entra now has support for using federated identity credentials for Entra app registrations instead, but that form of credential isn't supported yet in the FastMCP SDK. If you choose to use a secret, make sure that you store the secret securely.
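
One simple way to keep the secret out of source code is to read it from the environment (populated from a secret store or deployment configuration) and fail fast if anything is missing. This is a minimal sketch: the variable names match the ones the server code reads, but the load_entra_settings helper is my own invention for this example.

```python
import os
from collections.abc import Mapping

REQUIRED_SETTINGS = (
    "ENTRA_PROXY_AZURE_CLIENT_ID",
    "ENTRA_PROXY_AZURE_CLIENT_SECRET",
    "AZURE_TENANT_ID",
)


def load_entra_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read the Entra settings from the environment, raising if any are missing."""
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SETTINGS}


# Demo with a fake environment; never hardcode real secrets like this.
fake_env = {
    "ENTRA_PROXY_AZURE_CLIENT_ID": "client-id-placeholder",
    "ENTRA_PROXY_AZURE_CLIENT_SECRET": "secret-placeholder",
    "AZURE_TENANT_ID": "tenant-id-placeholder",
}
settings = load_entra_settings(fake_env)
print(sorted(settings))
```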

Granting admin consent

This next step is only necessary when our MCP server wants to use an OBO flow to exchange access tokens for other resource server tokens (Graph API tokens, in this case). For the OBO flow to work, the Entra app registration needs permission to call the Graph API on behalf of users. If we controlled the client, we could force it to request the required scopes as part of the initial login dialog. However, since we are configuring this server to work with arbitrary MCP clients, we don't have that option. Instead, we grant admin consent to the Entra app for the necessary scopes, such that no Graph API consent dialog is needed.

This code grants admin consent to the associated service principal for the Graph API resource and scopes:

server_principal = await graph_client.service_principals_with_app_id(app.app_id).get()
grant = GrantDefinition(
    principal_id=server_principal.id,
    resource_app_id="00000003-0000-0000-c000-000000000000", # Graph API
    scopes=["User.Read", "email", "offline_access", "openid", "profile"],
    target_label="server application")
resource_principal = await graph_client.service_principals_with_app_id(grant.resource_app_id).get()
desired_scope = grant.scope_string()
await graph_client.oauth2_permission_grants.post(
  OAuth2PermissionGrant(
    client_id=grant.principal_id,
    consent_type="AllPrincipals",
    resource_id=resource_principal.id,
    scope=desired_scope))
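
The GrantDefinition helper used above is not defined in this post. As a rough sketch, assuming it is a plain dataclass whose scope_string() joins the scopes with spaces (the format that the scope field of an OAuth2PermissionGrant expects), it might look something like this:

```python
from dataclasses import dataclass


@dataclass
class GrantDefinition:
    """Hypothetical container describing one admin consent grant."""

    principal_id: str  # Object ID of the service principal receiving the grant
    resource_app_id: str  # App ID of the resource API (e.g. Microsoft Graph)
    scopes: list[str]  # Delegated permission names to grant
    target_label: str  # Human-readable label, useful for logging

    def scope_string(self) -> str:
        """Return the scopes as a single space-separated string."""
        return " ".join(self.scopes)


example = GrantDefinition(
    principal_id="placeholder-principal-id",
    resource_app_id="00000003-0000-0000-c000-000000000000",
    scopes=["User.Read", "email"],
    target_label="server application",
)
print(example.scope_string())
```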

If our MCP server needed to use an OBO flow with another resource server, we could request additional grants for those resources and scopes.

Our Entra app registration is now ready for the MCP server, so let's move on to see the server code.

Using FastMCP servers with Entra

In our MCP server code, we configure FastMCP's built-in AzureProvider based on the details from the Entra app registration process:

auth = AzureProvider(
    client_id=os.environ["ENTRA_PROXY_AZURE_CLIENT_ID"],
    client_secret=os.environ["ENTRA_PROXY_AZURE_CLIENT_SECRET"],
    tenant_id=os.environ["AZURE_TENANT_ID"],
    base_url=entra_base_url, # MCP server URL
    required_scopes=["mcp-access"],
    client_storage=oauth_client_store, # in-memory or Cosmos DB
)

To make it easy for our MCP tools to access an identifier for the currently logged-in user, we define a middleware that inspects the claims of the current token using FastMCP's get_access_token() and sets the "oid" (Entra object identifier) in the state:

class UserAuthMiddleware(Middleware):
    def _get_user_id(self):
        token = get_access_token()
        if not (token and hasattr(token, "claims")):
            return None
        return token.claims.get("oid")

    async def on_call_tool(self, context: MiddlewareContext, call_next):
        user_id = self._get_user_id()
        if context.fastmcp_context is not None:
            context.fastmcp_context.set_state("user_id", user_id)
        return await call_next(context)

    async def on_read_resource(self, context: MiddlewareContext, call_next):
        user_id = self._get_user_id()
        if context.fastmcp_context is not None:
            context.fastmcp_context.set_state("user_id", user_id)
        return await call_next(context)
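
When debugging, it can help to see which claims (such as "oid" and "scp") a token actually carries. The sketch below decodes the payload segment of a JWT using only the standard library. It does not verify the signature, so treat it strictly as a debugging aid and never as validation; the peek_jwt_claims name and the fake token are made up for this example.

```python
import base64
import json


def peek_jwt_claims(token: str) -> dict:
    """Decode a JWT's payload WITHOUT verifying its signature (debugging only)."""
    payload_b64 = token.split(".")[1]
    # JWTs strip base64 padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (for the demo token)."""
    return base64.urlsafe_b64encode(json.dumps(data).encode()).rstrip(b"=").decode()


# Build an unsigned, fake token purely to exercise the decoder.
fake_claims = {"oid": "00000000-0000-0000-0000-000000000001", "scp": "mcp-access"}
fake_token = ".".join([_b64url({"alg": "none"}), _b64url(fake_claims), "signature"])
print(peek_jwt_claims(fake_token)["oid"])
```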

When we initialize the FastMCP server, we set the auth provider and include that middleware:

mcp = FastMCP("Expenses Tracker",
  auth=auth,
  middleware=[UserAuthMiddleware()])

Now, every request made to the MCP server will require authentication. The server will return a 401 if a valid token isn't provided, and that 401 will prompt the MCP client to kick off the MCP authorization flow.
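
To illustrate how a 401 can point the client at the authorization flow: RFC 9728 defines a resource_metadata parameter on the WWW-Authenticate header that carries the URL of the protected resource metadata document. The two helpers below are a hypothetical sketch of building and parsing such a header, with a placeholder URL; they are not FastMCP's code.

```python
import re
from typing import Optional


def build_www_authenticate(resource_metadata_url: str) -> str:
    """Build a WWW-Authenticate value with the RFC 9728 resource_metadata parameter."""
    return f'Bearer resource_metadata="{resource_metadata_url}"'


def parse_resource_metadata(header_value: str) -> Optional[str]:
    """Extract the resource_metadata URL from a WWW-Authenticate value, if present."""
    match = re.search(r'resource_metadata="([^"]+)"', header_value)
    return match.group(1) if match else None


# Placeholder URL for a locally running server (assumption for this example).
header = build_www_authenticate(
    "http://localhost:8000/.well-known/oauth-protected-resource"
)
print(header)
print(parse_resource_metadata(header))
```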

Inside each tool, we can grab the user ID from the state and use it to customize the response for the user, for example, to store or query items in a database:

@mcp.tool
async def add_user_expense(
    date: Annotated[date, "Date of the expense in YYYY-MM-DD format"],
    amount: Annotated[float, "Positive numeric amount of the expense"],
    description: Annotated[str, "Human-readable description of the expense"],
    ctx: Context,
):
    """Add a new expense to Cosmos DB."""
    user_id = ctx.get_state("user_id")
    if not user_id:
        return "Error: Authentication required (no user_id present)"
    expense_item = {
        "id": str(uuid.uuid4()),
        "user_id": user_id,
        "date": date.isoformat(),
        "amount": amount,
        "description": description,
    }
    await cosmos_container.create_item(body=expense_item)
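
The same user_id can scope reads as well as writes. As a minimal sketch, assuming the item shape from the tool above, a companion tool could build a parameterized Cosmos DB query like this (the build_user_expenses_query helper is hypothetical):

```python
def build_user_expenses_query(user_id: str) -> dict:
    """Build a parameterized query that only matches the given user's expenses."""
    return {
        "query": "SELECT * FROM c WHERE c.user_id = @user_id",
        # Parameters keep the user ID out of the query text itself.
        "parameters": [{"name": "@user_id", "value": user_id}],
    }


spec = build_user_expenses_query("00000000-0000-0000-0000-000000000001")
print(spec["query"])
```

The query text and parameters could then be passed to the container's query_items method; I haven't shown that wiring here.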

Using OBO flow in FastMCP server

Now we have everything we need to use an OBO flow inside the MCP tools, when desired. To make it easier to exchange and validate tokens, we use the Python MSAL SDK, configuring a ConfidentialClientApplication similarly to how we set up the FastMCP auth provider:

confidential_client = ConfidentialClientApplication(
    client_id=os.environ["ENTRA_PROXY_AZURE_CLIENT_ID"],
    client_credential=os.environ["ENTRA_PROXY_AZURE_CLIENT_SECRET"],
    authority=f"https://login.microsoftonline.com/{os.environ['AZURE_TENANT_ID']}",
    token_cache=TokenCache(),
)

Inside the tool that requires OBO, we ask MSAL to exchange the MCP access token for a Graph API access token:

access_token = get_access_token()
graph_resource_access_token = confidential_client.acquire_token_on_behalf_of(
  user_assertion=access_token.token, scopes=["https://graph.microsoft.com/.default"]
)
graph_token = graph_resource_access_token["access_token"]
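
The exchange can fail, for instance if admin consent is missing or the incoming token has expired. In that case MSAL returns a dictionary with "error" and "error_description" keys rather than raising, so indexing "access_token" directly would produce a confusing KeyError. A small guard along these lines (the extract_access_token name and the choice of PermissionError are my own) gives a clearer failure:

```python
def extract_access_token(result: dict) -> str:
    """Return the access token from an MSAL result dict, or raise with details."""
    if "access_token" in result:
        return result["access_token"]
    error = result.get("error", "unknown_error")
    description = result.get("error_description", "no description provided")
    raise PermissionError(f"OBO token exchange failed: {error}: {description}")


# Simulated MSAL results (made-up values) to show both paths.
print(extract_access_token({"access_token": "fake-graph-token"}))
try:
    extract_access_token({"error": "invalid_grant", "error_description": "Consent required"})
except PermissionError as exc:
    print(exc)
```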

Once we successfully acquire the token, we can use it with the Graph API for any operations permitted by the scopes in the admin consent granted earlier. For this example, we call the Graph API to check whether the logged-in user is a member of a particular Entra group, and restrict tool usage if not:

async with httpx.AsyncClient() as client:
    url = (
        "https://graph.microsoft.com/v1.0/me/transitiveMemberOf/microsoft.graph.group"
        f"?$filter=id eq '{group_id}'&$count=true"
    )
    response = await client.get(
        url,
        headers={
            "Authorization": f"Bearer {graph_token}",
            "ConsistencyLevel": "eventual",
        },
    )
    data = response.json()
    membership_count = data.get("@odata.count", 0)
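
To turn that response into an allow/deny decision, one option is a small helper that fails closed: any non-200 status or missing count is treated as "not a member". This is a sketch of one reasonable policy, with a hypothetical is_group_member name, not necessarily what the full sample does.

```python
def is_group_member(status_code: int, data: dict) -> bool:
    """Return True only when Graph responded OK and reported at least one match."""
    if status_code != 200:
        # Fail closed: errors (expired token, missing consent) deny access.
        return False
    return data.get("@odata.count", 0) > 0


# Simulated Graph responses (made-up values).
print(is_group_member(200, {"@odata.count": 1, "value": [{"id": "group-id"}]}))
print(is_group_member(200, {"@odata.count": 0, "value": []}))
print(is_group_member(403, {"error": {"code": "Authorization_RequestDenied"}}))
```

Inside a tool, a False result would typically lead to returning an error message instead of performing the restricted operation.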

There are many other ways to use an OBO flow, such as querying the Graph API for more user details, uploading documents to OneDrive/SharePoint/Notes, sending emails, and more!

All together now

For the full code, check out the open source python-mcp-demos repository, and follow the deployment steps for Entra. The most relevant code files are:

  • auth_init.py: Creates the Entra app registration, service principal, client secret, and grants admin consent for OBO flow.
  • auth_update.py: Updates the app registration's redirect URIs after deployment, adding the deployed server URL.
  • auth_entra_mcp.py: The MCP server itself, configured with FastMCP's AzureProvider and tools that use OBO for group membership checks.

As always, please let me know if you have further questions or ideas for other Entra integrations.
