Commit 9ea2ad3

MSAL MSI with Credentials - Authentication Design (#5096)
* init
* doc links
* apis
* typo
* Update MSI V2 authentication steps and API table
* Create slc_revocation_spec.md
* Add MSAL EPIC link to related documents.
* Update revocation spec with unspecified credential issue
* Refactor JSON body construction in documentation
* pr comments
* pr comments
* pr comments
* pr comments
* azure_sdk
* BindingCertificateRefreshed
* Add mermaid sequence diagram for MSI V2 process
* Add sequence diagram to credential probe doc
* Update IMDS header handling logic
* Add sequence diagram for SLC revocation.
* Add mermaid sequence diagram for SLC revocation
* Add sections on SLC revocation and claims challenge
* Add sequence diagram for mTLS communication
* Remove mermaid sequence diagram from guidance document

---------

Co-authored-by: Gladwin Johnson <[email protected]>
1 parent 7fb7918 commit 9ea2ad3

4 files changed

+688
-0
lines changed

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
# Guidance for SDKs Consuming MSAL

## Overview

To support MSI V2 authentication with the `/credential` endpoint, the **Azure SDK** leverages the `IMsalMtlsHttpClientFactory` interface and **certificate management APIs** for secure communication with Azure AD using **mutual TLS (mTLS)**.

This section covers:

- How the Azure SDK uses **`IMsalMtlsHttpClientFactory`** for mTLS authentication.
- How SDKs interact with the **certificate APIs** to obtain the binding certificate.
- The **new `BindingCertificateRefreshed` event**, which notifies SDKs when a binding certificate is updated.

---

## **Handling Certificate Rotation for Long-Lived Clients**

### **The Problem: Certificate Expiry and Rotation**

- mTLS Proof of Possession (PoP) tokens are signed by a binding certificate.
- The binding certificate is valid for 90 days.
- A new certificate is made available 5 days before expiration.
- SDKs that consume MSAL and customize the `HttpClient` must ensure that their `HttpClient` uses the latest certificate.

### **Proposed Solution**

- SDK clients must monitor certificate updates and refresh their `HttpClient` dynamically.
- MSAL exposes an event-driven model to notify SDKs when the binding certificate is refreshed.

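The event-driven model described above can be illustrated with a minimal, language-neutral sketch in Python. Everything here is hypothetical: `CertificateSource` stands in for whatever raises the refresh notification, `MtlsClientHolder` stands in for SDK code that owns the HTTP client, and `client_builder` is any callable that turns a certificate into a client. None of these names are MSAL APIs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass(frozen=True)
class BindingCertificate:
    """Stand-in for a certificate; only the thumbprint matters in this sketch."""
    thumbprint: str


@dataclass
class CertificateSource:
    """Hypothetical publisher playing the role of the certificate-refreshed event."""
    _handlers: List[Callable[[BindingCertificate], None]] = field(default_factory=list)

    def subscribe(self, handler: Callable[[BindingCertificate], None]) -> None:
        self._handlers.append(handler)

    def publish_refresh(self, cert: BindingCertificate) -> None:
        # Notify every subscriber that a new binding certificate is available.
        for handler in list(self._handlers):
            handler(cert)


class MtlsClientHolder:
    """Keeps the HTTP client in step with the most recent binding certificate."""

    def __init__(
        self,
        source: CertificateSource,
        initial_cert: BindingCertificate,
        client_builder: Callable[[BindingCertificate], Any],
    ) -> None:
        self._client_builder = client_builder
        self.client = client_builder(initial_cert)
        source.subscribe(self._on_refreshed)

    def _on_refreshed(self, cert: BindingCertificate) -> None:
        # Build the replacement first, then swap the reference in one assignment,
        # so callers never observe a half-configured client.
        self.client = self._client_builder(cert)
```

A long-lived client would subscribe once at construction time and keep reading `holder.client` for each request, so a rotation takes effect on the next call without a restart.
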
## **New Interface: `IMsalMtlsHttpClientFactory`**

MSAL introduces the `IMsalMtlsHttpClientFactory` interface to facilitate **mTLS-based authentication** in Azure SDKs. This ensures:
- Secure token acquisition by enabling **mTLS authentication**.
- **Reusable `HttpClient` instances** to prevent socket exhaustion.
- Optimized **Azure SDK integration** for managing identities securely.

### **Interface Definition**

```csharp
public interface IMsalMtlsHttpClientFactory : IMsalHttpClientFactory
{
    /// <summary>
    /// Returns an HttpClient configured with a certificate for mutual TLS authentication.
    /// </summary>
    /// <param name="x509Certificate2">The certificate to be used for MTLS authentication.</param>
    /// <returns>An HttpClient instance configured with the specified certificate.</returns>
    HttpClient GetHttpClient(X509Certificate2 x509Certificate2);
}
```

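One way an implementation of `GetHttpClient(X509Certificate2)` could provide reusable clients is to cache one client per certificate. The following Python sketch shows that caching idea only; it is not the MSAL or Azure SDK implementation. `MtlsClientCache`, `build_client`, and the use of a thumbprint string as the cache key are assumptions made for illustration.

```python
import threading
from typing import Callable, Dict, Generic, TypeVar

ClientT = TypeVar("ClientT")


class MtlsClientCache(Generic[ClientT]):
    """Hypothetical cache that returns one client per certificate thumbprint."""

    def __init__(self, build_client: Callable[[str], ClientT]) -> None:
        self._build_client = build_client
        self._clients: Dict[str, ClientT] = {}
        self._lock = threading.Lock()

    def get_client(self, thumbprint: str) -> ClientT:
        # Reuse the existing client for this certificate if there is one;
        # otherwise build it once and remember it for later calls.
        with self._lock:
            client = self._clients.get(thumbprint)
            if client is None:
                client = self._build_client(thumbprint)
                self._clients[thumbprint] = client
            return client
```

Keying the cache on the certificate means a rotated certificate naturally produces a new client, while repeated calls with the same certificate share connections.
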
## **Binding Certificate**

SDKs that customize the `HttpClient` factory need a way to get the binding certificate for mTLS authentication. MSAL exposes APIs that help SDKs manage certificates by:

- Retrieving existing certificates from the OS store or memory.
- Notifying SDKs when a certificate is updated via the `BindingCertificateRefreshed` event.

| API Name                          | Purpose                                                                    |
|-----------------------------------|----------------------------------------------------------------------------|
| `GetManagedIdentitySourceAsync()` | Exposes the MSI source, including the new `IMDSV2` source.                 |
| `GetBindingCertificate()`         | Helper method to get the binding certificate when the source is `IMDSV2`. |
| `BindingCertificateRefreshed`     | Event that notifies SDKs when the binding certificate is updated.          |

Lines changed: 324 additions & 0 deletions
@@ -0,0 +1,324 @@
# MSAL MSI V2 /credential Endpoint Design Document

## Overview

This document provides detailed guidance for SDK developers implementing MSI V2 `/credential` endpoint support. It focuses on the **token acquisition process**, ensuring seamless interactions with Managed Identity Resource Providers (MIRPs) on **Azure Virtual Machines (VMs) and Virtual Machine Scale Sets (VMSS)**.

## Goals

The primary objective is to enable seamless token acquisition in MSI V2 for VM/VMSS, utilizing the `/credential` endpoint.

- Define the **MSI V2 token acquisition process**.
- Describe how MSAL interacts with the `/credential` endpoint and the ESTS regional token endpoint.
- Ensure compatibility with **Windows and Linux** VMs and VMSS.

## Token Acquisition Process

In **MSI V1**, IMDS or any other Managed Identity Resource Provider (MIRP) directly returns an **access token**. In **MSI V2**, the process involves two steps: MSAL first retrieves a short-lived credential (SLC) from the `/credential` endpoint, then exchanges it with ESTS for an access token.

```mermaid
sequenceDiagram
    participant Application
    participant MSAL
    participant IMDS
    participant ESTS

    Application ->> MSAL: 1. Request token using Managed Identity
    MSAL ->> IMDS: 2. Probe for `/credential` endpoint availability
    IMDS -->> MSAL: 3. Response (200 OK / 404 Not Found)

    alt `/credential` endpoint available
        MSAL ->> IMDS: 4. Request Short-Lived Credential (SLC) via `/credential`
        IMDS -->> MSAL: 5. Return SLC
        MSAL ->> ESTS: 6. Exchange SLC for Access Token via mTLS
        ESTS -->> MSAL: 7. Return Access Token
        MSAL ->> Application: 8. Return Access Token
    else `/credential` endpoint not available
        MSAL ->> IMDS: 4a. Fallback to legacy `/token` endpoint
        IMDS -->> MSAL: 5a. Return Access Token
        MSAL ->> Application: 6a. Return Access Token
    end
```

### Short-Lived Credential Retrieval from `/credential` Endpoint

- Azure Managed Identity Resource Providers host the `/credential` endpoint.
- The client (MSAL) calls the `/credential` endpoint to retrieve a **short-lived credential (SLC)**.
- This credential is valid for a short duration and must be used promptly in the next step.

### Access Token Acquisition via ESTS

- The client presents the **short-lived credential** to **ESTS** over **mTLS** as an assertion.
- ESTS validates the credential and issues an **access token**.
- The access token is then used to authenticate with Azure services.

## Certificate Handling

To start the flow, MSAL requires a certificate. MSAL follows these steps:

1. **Check for an existing certificate (Windows only)**: MSAL looks for a platform certificate (`devicecert.mtlsauth.local`) on the Azure resource, in both the Local Machine and Current User certificate stores.
2. **Create a new certificate if none is found**: If a platform certificate is not available, MSAL generates a self-signed certificate for authentication.
3. **Linux only**: MSAL always generates a self-signed certificate on Linux.

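The selection rules above can be sketched as a small decision function. This is an illustrative Python sketch, not MSAL code: `select_binding_certificate`, `find_platform_cert`, and `create_self_signed` are hypothetical names, and the returned labels reuse the `CertType` values (`"Platform"`, `"inMemory"`) listed in the telemetry section of this document.

```python
from typing import Callable, Optional, Tuple


def select_binding_certificate(
    os_name: str,
    find_platform_cert: Callable[[], Optional[str]],
    create_self_signed: Callable[[], str],
) -> Tuple[str, str]:
    """Return (certificate, cert_type) following the three rules above."""
    if os_name == "windows":
        # Rule 1: on Windows, prefer an existing platform certificate.
        cert = find_platform_cert()
        if cert is not None:
            return cert, "Platform"
    # Rules 2 and 3: no platform certificate (or Linux), so generate one.
    return create_self_signed(), "inMemory"
```

Passing the two lookups in as callables keeps the decision logic separate from the platform-specific certificate store access.
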
## Source Detection Logic

MSAL follows a source detection process to determine how to interact with MSI endpoints and acquire tokens.

### Environment Variable Check

MSAL checks specific environment variables to determine whether the application is running on one of these Azure resource types:

- **Service Fabric**
- **App Service**
- **Machine Learning**
- **Cloud Shell**
- **Azure Arc**

If one is identified, MSAL uses the appropriate legacy MSI endpoint for that resource.

### Fallback to IMDS

- If no specific Azure resource is identified from the environment variables, MSAL falls back to IMDS (VMs and VMSS).
- This fallback is the MSI V1 design, that is, the legacy fallback mechanism.
- In the new MSI V2 design, before fully falling back to IMDS, MSAL **probes for the credential endpoint**:
  - MSAL probes to see whether the `/credential` endpoint exists.
  - If the `/credential` endpoint is unavailable, MSAL falls back to the legacy `/token` endpoint.
  - If the probe is successful, MSAL can assume the current Azure resource is a VM or VMSS.

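The detection order described above (environment variables first, then the `/credential` probe, then the legacy IMDS path) can be sketched as follows. This Python sketch is illustrative only. The environment variable names in `EXAMPLE_MARKERS` are placeholders invented for the example and are not the variables MSAL actually checks; the returned labels reuse the `MsiSource` values from the telemetry section.

```python
from typing import Callable, Mapping, Sequence, Tuple

# Placeholder markers: these variable names are assumptions for illustration,
# not the real environment variables used by MSAL.
EXAMPLE_MARKERS: Sequence[Tuple[str, Tuple[str, ...]]] = (
    ("ServiceFabric", ("EXAMPLE_SERVICE_FABRIC_ENDPOINT",)),
    ("AppService", ("EXAMPLE_APP_SERVICE_ENDPOINT",)),
    ("CloudShell", ("EXAMPLE_CLOUD_SHELL_ENDPOINT",)),
    ("AzureArc", ("EXAMPLE_AZURE_ARC_ENDPOINT",)),
)


def detect_source(
    env: Mapping[str, str],
    markers: Sequence[Tuple[str, Tuple[str, ...]]],
    credential_endpoint_available: Callable[[], bool],
) -> str:
    """Pick a managed identity source using the order described above."""
    # 1. A resource identified from environment variables wins outright.
    for source, required in markers:
        if all(env.get(name) for name in required):
            return source
    # 2. Otherwise probe for /credential before falling back to legacy IMDS.
    return "ImdsV2" if credential_endpoint_available() else "ImdsV1"
```

The probe is passed in as a callable so that it only runs when no environment-based source matched, which mirrors the "before fully falling back" wording above.
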
## MSI V2 /credential Endpoint Details

### Short-Lived Credential Retrieval

- The `/credential` endpoint provides a **temporary credential** instead of an access token.
- This credential is valid only for a short duration (1 hour) and must be used **immediately** to acquire an access token from ESTS.
- This mechanism improves security by reducing the lifetime of sensitive authentication materials.

### Retry Logic

MSAL uses the **default Managed Identity [retry](https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/blob/651b71c7d1dcaf3261e598e01e017dfd3672bb25/src/client/Microsoft.Identity.Client/Http/HttpManagerFactory.cs#L28) policy** for MSI V2 credential and token requests, whether it calls the ESTS endpoint or the new `/credential` endpoint. That is, MSAL performs 3 retries with a 1-second pause between retries. Retries are performed only on certain error [codes](https://github.com/AzureAD/microsoft-authentication-library-for-dotnet/blob/651b71c7d1dcaf3261e598e01e017dfd3672bb25/src/client/Microsoft.Identity.Client/Http/HttpRetryCondition.cs#L12).

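The shape of that policy (one initial attempt, then up to 3 retries with a 1-second pause) can be sketched in Python as below. This is not MSAL's implementation. `send_with_retry` and its parameters are hypothetical, and the set of retryable status codes is deliberately left to the caller because the authoritative list lives in the linked `HttpRetryCondition` source.

```python
import time
from typing import Callable, Collection, Tuple


def send_with_retry(
    send: Callable[[], Tuple[int, str]],
    retryable_statuses: Collection[int],
    max_retries: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() once, then retry up to max_retries times on retryable statuses."""
    retries_done = 0
    while True:
        status, body = send()
        # Stop on any non-retryable status, or once the retry budget is spent.
        if status not in retryable_statuses or retries_done >= max_retries:
            return status, body
        retries_done += 1
        sleep(delay_seconds)
```

The `sleep` parameter exists only so the pause can be replaced in tests; production callers would leave the default.
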
## Steps for MSI V2 Authentication

This section outlines the steps required to acquire an access token using the MSI V2 `/credential` endpoint.

### 1. Check for an Existing (Platform) Certificate (Windows only)
- Search for the platform certificate (`devicecert.mtlsauth.local`) in the Local Machine store (`Cert:\LocalMachine\My`).
- If the certificate is not found in Local Machine, check the Current User certificate store (`Cert:\CurrentUser\My`).

### 2. Generate a New Certificate (if platform certificate is not found)
- If no valid platform certificate is found in `Cert:\LocalMachine\My` or `Cert:\CurrentUser\My`, create a new in-memory self-signed certificate.
- This applies in particular to Linux VMs, where platform certificates are not pre-configured, so MSAL must always generate an in-memory certificate for mTLS authentication.

#### Certificate Creation Requirements
- **Subject Name:** `CN=mtls-auth` (subject name not final).
- **Validity Period:** 90 days.
- **Key Export Policy:** Private key must be exportable to allow use for mTLS authentication.
- **Key Usage:** Must include Digital Signature and Key Encipherment.
- **Enhanced Key Usage:** Must include TLS Client Authentication (OID `1.3.6.1.5.5.7.3.2`).
- **Storage:** The certificate should exist only in memory. It is not stored in the certificate store and is discarded when the process exits.

#### Certificate Rotation Strategy
- **Track Expiry:** The expiration of the certificate must be monitored at runtime.
- **Rotation Trigger:** 5 days before expiry, generate a new in-memory certificate.

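The rotation trigger above reduces to a date comparison. A minimal Python sketch, assuming the caller supplies the certificate's expiry time and the current time as timezone-aware values; `needs_rotation` and `ROTATION_WINDOW` are hypothetical names, not MSAL APIs.

```python
from datetime import datetime, timedelta, timezone

# The 5-day window comes from the rotation strategy described above.
ROTATION_WINDOW = timedelta(days=5)


def needs_rotation(
    not_after: datetime,
    now: datetime,
    window: timedelta = ROTATION_WINDOW,
) -> bool:
    """Return True once the current time is within `window` of expiry (or past it)."""
    return now >= not_after - window


# Usage: a 90-day certificate becomes due for rotation on day 85.
issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
due = needs_rotation(expires, issued + timedelta(days=85))
```

Taking `now` as a parameter rather than reading the clock inside the function keeps the check deterministic and easy to test.
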
### 3. Extract Certificate Data
- Convert the certificate to a Base64-encoded string (`x5c`).
- Format the JSON payload containing the certificate details for request authentication.

### 4. Request MSI Credential
- Send a POST request to the IMDS `/credential` endpoint with the certificate details.
- The request must include:
  - `Metadata: true` header.
  - `X-ms-Client-Request-id` header with a GUID.
  - JSON body containing the certificate's public key in `jwk` format (see [RFC 7517, Appendix B](https://datatracker.ietf.org/doc/html/rfc7517#appendix-B)).
- Parse the response to extract:
  - `regional_token_url`
  - `tenant_id`
  - `client_id`
  - `credential` (short-lived credential).

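Steps 3 and 4 can be sketched in Python as a function that builds the headers and JSON body without sending anything. The field layout (`cnf.jwk` with `kty`, `use`, `alg`, `kid`, `x5c`) and the `kid` derivation (uppercase hex SHA-256 of the public key bytes) follow the end-to-end PowerShell script in this document. `build_credential_request` is a hypothetical helper; it assumes the caller already has the DER-encoded certificate and the raw public key bytes, and it omits the temporary `latch_key` field that the script marks for removal.

```python
import base64
import hashlib
import json
import uuid
from typing import Dict, Tuple


def build_credential_request(cert_der: bytes, public_key: bytes) -> Tuple[Dict[str, str], str]:
    """Build headers and JSON body for a POST to the /credential endpoint."""
    # kid: uppercase hex SHA-256 of the public key, as in the PowerShell script.
    kid = hashlib.sha256(public_key).hexdigest().upper()
    # x5c: Base64 (not Base64URL) encoding of the DER certificate.
    x5c = base64.b64encode(cert_der).decode("ascii")
    body = {
        "cnf": {
            "jwk": {
                "kty": "RSA",
                "use": "sig",
                "alg": "RS256",
                "kid": kid,
                "x5c": [x5c],  # x5c is an array of certificates
            }
        }
    }
    headers = {
        "Metadata": "true",
        "X-ms-Client-Request-id": str(uuid.uuid4()),
    }
    return headers, json.dumps(body, separators=(",", ":"))
```

Separating request construction from transport makes it straightforward to log or inspect the payload before it is sent to IMDS.
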
### 5. Request Access Token from ESTS
- Construct the OAuth2 request body, including:
  - `grant_type=client_credentials`
  - `scope=https://management.azure.com/.default`
  - `client_id` from the MSI response.
  - `client_assertion` containing the short-lived credential.
  - `client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer`.
- Send a POST request to the `regional_token_url` with the certificate for mutual TLS (mTLS) authentication.

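The request in step 5 can be assembled from the `/credential` response as shown in this Python sketch. `build_token_request` is a hypothetical helper. The URL shape (`regional_token_url`, then `tenant_id`, then `/oauth2/v2.0/token`) is taken from the end-to-end PowerShell script in this document; unlike that script, the sketch URL-encodes the form values with `urllib.parse.urlencode`, which is a choice made here rather than something the design specifies.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode

JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_token_request(credential_response: Dict[str, str], scope: str) -> Tuple[str, str]:
    """Return (token_url, form_body) for the ESTS token request."""
    base = credential_response["regional_token_url"].rstrip("/")
    token_url = "{}/{}/oauth2/v2.0/token".format(base, credential_response["tenant_id"])
    form_body = urlencode(
        {
            "grant_type": "client_credentials",
            "scope": scope,
            "client_id": credential_response["client_id"],
            "client_assertion": credential_response["credential"],
            "client_assertion_type": JWT_BEARER,
        }
    )
    return token_url, form_body
```

The caller would then POST `form_body` to `token_url` with `Content-Type: application/x-www-form-urlencoded` over a connection that presents the binding certificate.
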
### 6. Retrieve and Use Access Token
- Parse the response to extract the `access_token`.
- Use the access token to authenticate requests to Azure services.
- Handle any errors that may occur during the token request.

## End-to-End Script

```powershell
# Define certificate subject names
$searchSubject = "CN=devicecert.mtlsauth.local"  # Existing cert to look for
$newCertSubject = "CN=mtls-auth"  # Subject for new self-signed cert

# Step 1: Search for an existing, unexpired certificate in LocalMachine\My
# (Select-Object -First 1 guards against multiple matches returning an array)
$cert = Get-ChildItem -Path "Cert:\LocalMachine\My" |
    Where-Object { $_.Subject -eq $searchSubject -and $_.NotAfter -gt (Get-Date) } |
    Select-Object -First 1

# Step 2: If not found, search in CurrentUser\My
if (-not $cert) {
    Write-Output "🔍 No valid certificate found in LocalMachine\My. Checking CurrentUser\My..."
    $cert = Get-ChildItem -Path "Cert:\CurrentUser\My" |
        Where-Object { $_.Subject -eq $searchSubject -and $_.NotAfter -gt (Get-Date) } |
        Select-Object -First 1
}

# Step 3: If found, use it
if ($cert) {
    Write-Output "✅ Found valid certificate: $($cert.Subject)"
} else {
    Write-Output "❌ No valid certificate found in both stores. Creating a new self-signed certificate in CurrentUser\My..."

    # Step 4: Generate a new self-signed certificate in `CurrentUser\My`
    # For the POC the cert is created in the user store; in the product this will be an in-memory cert
    $cert = New-SelfSignedCertificate `
        -Subject $newCertSubject `
        -CertStoreLocation "Cert:\CurrentUser\My" `
        -KeyExportPolicy Exportable `
        -KeySpec Signature `
        -KeyUsage DigitalSignature, KeyEncipherment `
        -TextExtension @("2.5.29.37={text}1.3.6.1.5.5.7.3.2") `
        -NotAfter (Get-Date).AddDays(90)

    Write-Output "✅ Created certificate in CurrentUser\My: $($cert.Thumbprint)"
}

# Ensure `$cert` is valid
if (-not $cert) {
    Write-Error "❌ No certificate found or created. Exiting."
    exit 1
}

# Step 5: Compute SHA-256 of the Public Key for `kid`
$publicKeyBytes = $cert.GetPublicKey()
$sha256 = [System.Security.Cryptography.SHA256]::Create()
$certSha256 = [BitConverter]::ToString($sha256.ComputeHash($publicKeyBytes)) -replace "-", ""

Write-Output "🔐 Using SHA-256 Certificate Identifier (kid): $certSha256"

# Step 6: Convert certificate to Base64 for JWT (x5c field)
$x5c = [System.Convert]::ToBase64String($cert.RawData)
Write-Output "📜 x5c: $x5c"

# Step 7: Construct the JSON body properly
$bodyObject = @{
    cnf = @{
        jwk = @{
            kty = "RSA"
            use = "sig"
            alg = "RS256"
            kid = $certSha256  # Use SHA-256 instead of Thumbprint
            x5c = @($x5c)      # Ensures correct array formatting
        }
    }
    latch_key = $false  # Final version of the product should not have this. IMDS team is working on removing this.
}

# Convert JSON object to a string
$body = $bodyObject | ConvertTo-Json -Depth 10 -Compress
Write-Output "🔹 JSON Payload: $body"

# Step 8: Request MSI credential
$headers = @{
    "Metadata" = "true"
    "X-ms-Client-Request-id" = [guid]::NewGuid().ToString()
}

$imdsResponse = Invoke-WebRequest -Uri "http://169.254.169.254/metadata/identity/credential?cred-api-version=1.0" `
    -Method POST `
    -Headers $headers `
    -Body $body

$jsonContent = $imdsResponse.Content | ConvertFrom-Json

$regionalEndpoint = $jsonContent.regional_token_url + "/" + $jsonContent.tenant_id + "/oauth2/v2.0/token"
Write-Output "✅ Using Regional Endpoint: $regionalEndpoint"

# Step 9: Authenticate with Azure
$tokenHeaders = @{
    "Content-Type" = "application/x-www-form-urlencoded"
    "Accept" = "application/json"
}

$tokenRequestBody = "grant_type=client_credentials&scope=https://management.azure.com/.default&client_id=$($jsonContent.client_id)&client_assertion=$($jsonContent.credential)&client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer"

try {
    $tokenResponse = Invoke-WebRequest -Uri $regionalEndpoint `
        -Method POST `
        -Headers $tokenHeaders `
        -Body $tokenRequestBody `
        -Certificate $cert  # Use the full certificate object

    $tokenJson = $tokenResponse.Content | ConvertFrom-Json
    Write-Output "🔑 Access Token: $($tokenJson.access_token)"
} catch {
    Write-Error "❌ Failed to retrieve access token. Error: $_"
}
```

## Summary of New APIs on Managed Identity Builder

| API Name                          | Purpose                                                                         |
|-----------------------------------|---------------------------------------------------------------------------------|
| `GetBindingCertificate()`         | Helper method to get the binding certificate when a credential endpoint exists. |
| `GetManagedIdentitySourceAsync()` | Helper method to get the managed identity source.                               |
| `WithCorrelationID(GUID id)`      | Sets the correlation ID for managed identity requests (V2 source only).         |

## Client-Side Telemetry

To improve observability and diagnostics of Managed Identity (MSI) scenarios within MSAL, we propose introducing a **new telemetry counter** named `MsalMsiCounter`. This counter will be incremented (or otherwise recorded) whenever MSI token acquisition activities occur, capturing the most relevant context in the form of tags.

### Counter Name
- **`MsalMsiCounter`**

### Tags
Each time we increment `MsalMsiCounter`, we include the following tags:

1. **MsiSource**
   Describes which MSI path or resource is used.
   - Possible values: `"AppService"`, `"CloudShell"`, `"AzureArc"`, `"ImdsV1"`, `"ImdsV2"`, `"ServiceFabric"`

2. **TokenType**
   Specifies the type of token being requested or used.
   - Possible values: `"Bearer"`, `"POP"`, `"mtls_pop"`

3. **bypassCache**
   Indicates whether the MSAL cache was intentionally bypassed.
   - Possible values: `"true"`, `"false"`

4. **CertType**
   Identifies which certificate was used during the MSI V2 flow.
   - Possible values: `"Platform"`, `"inMemory"`, `"UserProvided"`

5. **CredentialOutcome**
   Records the outcome when the `/credential` endpoint (ImdsV2) is used.
   - Not found
   - Retry Failed
   - Retry Succeeded
   - Success

6. **MsalVersion**
   The MSAL library version in use.
   - Example: `"4.51.2"`

7. **Platform**
   The runtime/OS environment.
   - Examples: `"net6.0-linux"`, `"net472-windows"`

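To make the tag model concrete, the following Python sketch records one counter increment per distinct tag combination and rejects values outside the enumerations listed above. It is a toy model, not the proposed telemetry pipeline: `record_msi_event`, `ALLOWED_TAG_VALUES`, and the in-process `Counter` are all assumptions for illustration, and only the tags with a closed value set in this document are validated.

```python
from collections import Counter
from typing import Dict, Tuple

# Closed value sets copied from the tag list above.
ALLOWED_TAG_VALUES = {
    "MsiSource": {"AppService", "CloudShell", "AzureArc", "ImdsV1", "ImdsV2", "ServiceFabric"},
    "TokenType": {"Bearer", "POP", "mtls_pop"},
    "bypassCache": {"true", "false"},
    "CertType": {"Platform", "inMemory", "UserProvided"},
}

# Hypothetical in-process stand-in for MsalMsiCounter.
msal_msi_counter: Counter = Counter()


def record_msi_event(tags: Dict[str, str]) -> Tuple[Tuple[str, str], ...]:
    """Validate the tags, increment the counter, and return the counter key."""
    for name, allowed in ALLOWED_TAG_VALUES.items():
        value = tags.get(name)
        if value is not None and value not in allowed:
            raise ValueError("unexpected value {!r} for tag {}".format(value, name))
    # Sort the tags so the same combination always maps to the same key.
    key = tuple(sorted(tags.items()))
    msal_msi_counter[key] += 1
    return key
```

Validating at the call site keeps misspelled tag values from silently creating new series in whatever backend eventually receives the counter.
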
## Related Documents

- **[SLC Design Document](https://microsoft.sharepoint.com/:w:/t/AzureMSI/EURnTEtFXPlDngpYhCUioqUBvbSUWEX7vZjP0nm8bxUsQA?e=Ejok1n&wdLOR=cE6820299-49AF-4D7A-B7F7-F58D65C232B6)**
- **[MSAL EPIC](https://identitydivision.visualstudio.com/Engineering/_workitems/edit/3027078)**

## Glossary

- **MSAL (Microsoft Authentication Library):** SDK for authentication with Azure AD.
- **IMDS (Instance Metadata Service):** Metadata service for Azure VMs.
- **MIRP (Managed Identity Resource Provider):** Service that hosts the managed identity endpoints, including `/credential`.
- **mTLS (mutual TLS):** TLS in which the client also presents a certificate.
- **PoP (Proof of Possession) Token:** Token tied to a specific key.
- **SLC (Short-Lived Credential):** Temporary credential returned by the `/credential` endpoint and exchanged for an access token.
- **SAMI (System Assigned Managed Identity):** Auto-managed identity for Azure resources.
- **UAMI (User Assigned Managed Identity):** Manually created and assigned identity.

This specification serves as a reference for SDK developers integrating MSI V2 features into MSAL.
