docs: explain control plane services #4185
base: main
Andreas does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
📝 Walkthrough
This change updates and expands the architecture documentation. The main index was redesigned with a narrative overview and visual card-based navigation for six core services. Comprehensive documentation was added for two services: Ctrl (control plane orchestration with build and deployment workflows) and Krane (Kubernetes deployment abstraction). Metadata files were updated to register these new documentation pages.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Removed architecture configuration from mermaid initialization.
Actionable comments posted: 2
🧹 Nitpick comments (1)
apps/engineering/content/docs/architecture/services/ctrl/build.mdx (1)
9-11: Minor style suggestion: Lead with "first" for improved clarity.
Per the LanguageTool style guide, restructuring this sentence would improve clarity:
-The CLI first requests a deployment from the control plane, which returns a presigned S3 URL.
+First, the CLI requests a deployment from the control plane, which returns a presigned S3 URL.
This is a very minor stylistic improvement and can be deferred if not aligned with your documentation style guide.
📒 Files selected for processing (6)
- apps/engineering/content/docs/architecture/index.mdx(1 hunks)
- apps/engineering/content/docs/architecture/services/ctrl/build.mdx(1 hunks)
- apps/engineering/content/docs/architecture/services/ctrl/index.mdx(1 hunks)
- apps/engineering/content/docs/architecture/services/ctrl/meta.json(1 hunks)
- apps/engineering/content/docs/architecture/services/krane.mdx(1 hunks)
- apps/engineering/content/docs/architecture/services/meta.json(1 hunks)
🧰 Additional context used
🪛 LanguageTool
apps/engineering/content/docs/architecture/services/ctrl/build.mdx
[style] ~10-~10: Consider placing the discourse marker ‘first’ at the beginning of the sentence for more clarity.
Context: ...ication, the following process occurs:  The CLI first requests a deployment from the control ...
(SENT_START_FIRST_PREMIUM)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test Go API Local / Test
- GitHub Check: Test API / API Test Local
- GitHub Check: Build / Build
🔇 Additional comments (10)
apps/engineering/content/docs/architecture/services/ctrl/meta.json (1)
1-6: LGTM!
Metadata structure is consistent and correctly references the documentation pages (index and build).
apps/engineering/content/docs/architecture/services/meta.json (1)
1-19: LGTM!
Service entries properly added to navigation registry and positioned logically within the services list.
apps/engineering/content/docs/architecture/services/krane.mdx (3)
27-35: Strong architectural justification provided for design choices.
The StatefulSets rationale (lines 27-35) effectively explains the tradeoff between convenience (stable DNS) and convention (standard Deployments). The acknowledgment of this as a "known design compromise" and mention of future improvements is valuable context for maintainers.
107-131: RBAC configuration example is complete and accurate.
The YAML example is properly formatted and correctly specifies the required permissions for Krane's Kubernetes backend operations.
8-11: All references verified as correct.
- go/apps/krane/ directory exists ✓
- go/proto/krane/v1/deployment.proto file exists ✓
- /cli/run/krane CLI command documentation exists ✓
The documentation references are accurate and point to valid codebase locations.
apps/engineering/content/docs/architecture/services/ctrl/index.mdx (3)
43-43: Confirm "./build" cross-reference is valid.
Line 43 references ./build, which should point to the build.mdx file in the same directory. This appears correct based on the PR changes provided, but verify it renders properly in the documentation build.
8-10: All service metadata verified as accurate.
- go/apps/ctrl/ exists at ./go/apps/ctrl
- CLI command /cli/run/ctrl is properly documented at ./apps/engineering/content/docs/cli/run/ctrl/index.mdx
- Protocol Connect RPC (HTTP/2) is correctly stated and documented in the Technology Stack section
55-55: External documentation link is valid and exists in the codebase.
The referenced file deployment-service.mdx exists at apps/engineering/content/docs/architecture/workflows/deployment-service.mdx. The link path /docs/architecture/workflows/deployment-service correctly maps to this file. No action required.
Likely an incorrect or invalid review comment.
apps/engineering/content/docs/architecture/index.mdx (1)
3-3: New imports and narrative structure look good.
The Cards component import and architecture narrative provide excellent context for users navigating the services.
Also applies to: 6-8
apps/engineering/content/docs/architecture/services/ctrl/build.mdx (1)
75-95: All backend and storage file paths verified as correct.
Verification confirms all documented paths exist in the codebase:
- ✓ go/apps/ctrl/services/build/backend/depot/
- ✓ go/apps/ctrl/services/build/backend/docker/
- ✓ go/apps/ctrl/services/build/storage/s3.go
No corrections needed.
| ## Core Services | ||
|  | ||
| <Cards> | ||
| <Card | ||
| title="Control Plane (Ctrl)" | ||
| description="Orchestrates deployments, builds containers via Depot, provisions TLS certificates, and configures routing using durable Restate workflows" | ||
| href="./services/ctrl" | ||
| /> | ||
| <Card | ||
| title="Krane" | ||
| description="Kubernetes deployment abstraction that manages StatefulSets across multiple clusters and regions without replicating control plane logic" | ||
| href="./services/krane" | ||
| /> | ||
| <Card | ||
| title="API" | ||
| description="Handles key verification, analytics queries, and management operations in Go. Deployed to multiple AWS regions behind Global Accelerator" | ||
| href="./services/api/config" | ||
| /> | ||
| <Card | ||
| title="Gateway (GW)" | ||
| description="Routes traffic to customer deployments by querying the partition database, terminating TLS, and proxying requests to Kubernetes pods" | ||
| href="./services/gateway" | ||
| /> | ||
| <Card | ||
| title="ClickHouse" | ||
| description="Stores analytics events for key verification logs, API usage metrics, and audit trails with automatic scaling and replication" | ||
| href="./services/clickhouse" | ||
| /> | ||
| <Card | ||
| title="Vault" | ||
| description="Encrypts sensitive data using envelope encryption with AWS KMS, decrypting on demand without storing plaintext secrets" | ||
| href="./services/vault" | ||
| /> | ||
| </Cards> | 
🧩 Analysis chain
Verify all service card links resolve correctly.
The new Cards section references six service pages with relative paths. Confirm all destinations exist:
- ./services/ctrl → should resolve to ctrl/index.mdx ✓ (in PR)
- ./services/krane → should resolve to krane.mdx ✓ (in PR)
- ./services/api/config → verify this path exists
- ./services/gateway → verify this path exists
- ./services/clickhouse → verify this path exists
- ./services/vault → verify this path exists
🏁 Script executed:
#!/bin/bash
# Verify all service documentation link destinations exist
# Check architecture/services directory structure
fd --type f "*.mdx" --path "*architecture/services*" | sort
# Specific checks for referenced paths
echo "=== Checking specific paths ==="
for path in "api/config" "gateway" "clickhouse" "vault"; do
  if fd --type f "*.mdx" --path "*architecture/services/$path*" >/dev/null 2>&1; then
    echo "✓ services/$path exists"
  else
    echo "✗ services/$path MISSING"
  fi
done
Length of output: 377
🏁 Script executed:
#!/bin/bash
# Verify all service documentation link destinations exist (corrected)
# Check architecture/services directory structure
echo "=== All .mdx files in architecture/services ==="
find apps/engineering/content/docs/architecture/services -name "*.mdx" 2>/dev/null | sort
echo ""
echo "=== Checking specific referenced paths ==="
for path in "ctrl/index.mdx" "krane.mdx" "api/config.mdx" "gateway.mdx" "clickhouse.mdx" "vault.mdx"; do
  filepath="apps/engineering/content/docs/architecture/services/$path"
  if [ -f "$filepath" ]; then
    echo "✓ services/$path exists"
  else
    echo "✗ services/$path MISSING"
  fi
done
Length of output: 282
🏁 Script executed:
#!/bin/bash
# Check the actual structure of the docs and architecture directories
echo "=== Exploring docs directory structure ==="
find apps/engineering/content/docs -type d -name "architecture" 2>/dev/null
echo ""
echo "=== All files/dirs under docs/architecture (first 3 levels) ==="
find apps/engineering/content/docs/architecture -maxdepth 3 -type f -o -type d 2>/dev/null | head -50
echo ""
echo "=== Check if services directory exists ==="
if [ -d "apps/engineering/content/docs/architecture/services" ]; then
  echo "✓ services directory exists"
  ls -la apps/engineering/content/docs/architecture/services
else
  echo "✗ services directory does not exist"
fi
echo ""
echo "=== All .mdx files in docs/architecture (any depth) ==="
find apps/engineering/content/docs/architecture -name "*.mdx" 2>/dev/null | sort
Length of output: 244
🏁 Script executed:
#!/bin/bash
# Broader exploration
echo "=== Check if apps/engineering exists ==="
if [ -d "apps/engineering" ]; then
  echo "✓ apps/engineering exists"
else
  echo "✗ apps/engineering does not exist"
fi
echo ""
echo "=== Check if apps/engineering/content exists ==="
if [ -d "apps/engineering/content" ]; then
  echo "✓ apps/engineering/content exists"
else
  echo "✗ apps/engineering/content does not exist"
fi
echo ""
echo "=== List apps directory ==="
ls -la apps/ 2>/dev/null || echo "apps directory not found"
echo ""
echo "=== Check if the specific index.mdx file exists ==="
if [ -f "apps/engineering/content/docs/architecture/index.mdx" ]; then
  echo "✓ File exists"
  head -20 apps/engineering/content/docs/architecture/index.mdx
else
  echo "✗ File does not exist"
fi
echo ""
echo "=== Find all index.mdx files ==="
find . -name "index.mdx" 2>/dev/null | head -20
Length of output: 1510
🏁 Script executed:
#!/bin/bash
# Check service files in correct git directory
echo "=== All .mdx files in architecture/services ==="
find ./git/apps/engineering/content/docs/architecture/services -name "*.mdx" 2>/dev/null | sort
echo ""
echo "=== Checking specific referenced paths ==="
for path in "ctrl/index.mdx" "krane/index.mdx" "api/config/index.mdx" "gateway/index.mdx" "clickhouse/index.mdx" "vault/index.mdx"; do
  filepath="./git/apps/engineering/content/docs/architecture/services/$path"
  if [ -f "$filepath" ]; then
    echo "✓ services/$path exists"
  else
    echo "✗ services/$path MISSING"
  fi
done
echo ""
echo "=== Check if api/config is actually api.mdx or api/index.mdx ==="
find ./git/apps/engineering/content/docs/architecture/services -path "*api*" -name "*.mdx" 2>/dev/null
Length of output: 1115
Fix broken service card link: gateway page does not exist.
Verification found that five of the six referenced service pages exist, but the "Gateway (GW)" card links to ./services/gateway which has no corresponding documentation file (neither gateway.mdx nor gateway/index.mdx exists in the services directory).
Confirmed valid links:
- ✓ ./services/ctrl → ctrl/index.mdx
- ✓ ./services/krane → krane.mdx
- ✓ ./services/api/config → api/config.mdx
- ✓ ./services/clickhouse → clickhouse.mdx
- ✓ ./services/vault → vault.mdx
Broken link:
- ✗ ./services/gateway → file not found
Create gateway.mdx in the services directory or update the card href to point to an existing page.
🤖 Prompt for AI Agents
In apps/engineering/content/docs/architecture/index.mdx around lines 10 to 43
the "Gateway (GW)" Card href points to ./services/gateway which doesn't exist;
fix by either creating the missing documentation file at
apps/engineering/content/docs/architecture/services/gateway.mdx (or
services/gateway/index.mdx) with the Gateway content, or update the Card href to
an existing path for the gateway docs (e.g., point to the correct existing file
location) so the link resolves.
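The manual checks in this thread converge on one rule: a card href resolves if either `<href>.mdx` or `<href>/index.mdx` exists under the architecture docs directory. A minimal shell sketch of that rule follows, assuming the docs layout matches what the scripts above found. `resolve_doc_link` and `check_links` are hypothetical helper names, not part of the repository.

```shell
#!/bin/bash
# resolve_doc_link DIR HREF
# Succeeds if HREF (relative to DIR) maps to HREF.mdx or HREF/index.mdx.
resolve_doc_link() {
  local dir="$1" href="${2#./}"   # strip a leading "./" from the href
  [ -f "$dir/$href.mdx" ] || [ -f "$dir/$href/index.mdx" ]
}

# check_links DIR HREF...
# Prints one status line per href and returns non-zero if any is missing.
check_links() {
  local dir="$1"
  shift
  local status=0 href
  for href in "$@"; do
    if resolve_doc_link "$dir" "$href"; then
      echo "ok      $href"
    else
      echo "missing $href"
      status=1
    fi
  done
  return "$status"
}
```

Called as `check_links apps/engineering/content/docs/architecture ./services/ctrl ./services/gateway`, this should flag `./services/gateway` for the tree described in this review, since neither gateway.mdx nor gateway/index.mdx exists.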
|  | ||
| Private keys are encrypted using the vault service before storage. Certificates are stored in the partition database for fast gateway access without encryption overhead. Challenge records track certificate expiry with 90-day validity periods. | ||
|  | ||
| [Read detailed Certificate docs →](./certificates) | 
🧩 Analysis chain
Verify "./certificates" cross-reference.
Line 65 links to ./certificates. Confirm a certificates.mdx file exists in the same directory (apps/engineering/content/docs/architecture/services/ctrl/), or this link will be broken.
🏁 Script executed:
#!/bin/bash
# Check if certificates documentation exists for Ctrl service
fd --type f "certificates.mdx" --path "*architecture/services/ctrl*"
Length of output: 233
🏁 Script executed:
#!/bin/bash
# Check if certificates.mdx exists in the Ctrl service documentation directory
fd "certificates.mdx" "apps/engineering/content/docs/architecture/services/ctrl/"
Length of output: 0
🏁 Script executed:
#!/bin/bash
# List all .mdx files in the Ctrl service documentation directory to verify structure
fd --type f "\.mdx$" "apps/engineering/content/docs/architecture/services/ctrl/"
Length of output: 133
Remove or create the missing certificates documentation file.
The link to ./certificates on line 65 is broken. Verification confirms that certificates.mdx does not exist in apps/engineering/content/docs/architecture/services/ctrl/. Either remove the link or create the missing documentation file.
🤖 Prompt for AI Agents
In apps/engineering/content/docs/architecture/services/ctrl/index.mdx around
line 65, the link "[Read detailed Certificate docs →](./certificates)" points to
a missing file (certificates.mdx); either remove that link or add the missing
certificates.mdx in the same directory. To fix: if the docs should exist, create
apps/engineering/content/docs/architecture/services/ctrl/certificates.mdx with
the intended content and ensure frontmatter/title are correct; otherwise edit
index.mdx to delete or replace the link with a valid target.
|  | ||
| The CLI first requests a deployment from the control plane, which returns a presigned S3 URL. The CLI packages the source code into a tarball and uploads it directly to S3, bypassing the control plane for efficient transfer. Once uploaded, the CLI triggers the build by sending the S3 path to the control plane. | ||
|  | ||
| The control plane retrieves or creates a dedicated Depot project for the customer, then initiates a build with Depot. Depot provisions an isolated BuildKit machine, downloads the build context from S3, executes the Docker build, and pushes the resulting image to its registry. The image name is returned to the control plane. | 
Where does Depot provision the isolated buildkit machine? Is that our infra? Their infra?
Also "to its registry".. what is "its" here?
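The upload path described in the quoted paragraph (package the source into a tarball, then send it straight to S3 through the presigned URL) can be sketched in shell. This is an illustration only, not the CLI's actual code: `package_source` and `upload_source` are hypothetical names, and the presigned URL is assumed to be a PUT URL that the control plane has already returned. The control plane calls themselves are left out because their endpoints are not shown in this PR.

```shell
#!/bin/bash
# package_source SRC_DIR ARCHIVE
# Packs the contents of SRC_DIR into a gzip-compressed tarball at ARCHIVE.
# ARCHIVE should live outside SRC_DIR so the archive does not include itself.
package_source() {
  local src_dir="$1" archive="$2"
  tar -czf "$archive" -C "$src_dir" .
}

# upload_source ARCHIVE PRESIGNED_URL
# Sends the tarball directly to object storage with an HTTP PUT
# (assumption: the presigned URL was issued for PUT).
# --fail makes curl exit non-zero on an HTTP error status.
upload_source() {
  local archive="$1" presigned_url="$2"
  curl --fail --silent --show-error --upload-file "$archive" "$presigned_url"
}
```

Because the tarball goes to the presigned URL rather than through the control plane, the control plane only needs to receive the resulting S3 path when the build is triggered.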
|  | ||
| The deployment workflow progresses through several phases. It first builds the container image if building from source, then creates the deployment in Krane, our Kubernetes abstraction layer. Next it polls for instance readiness for up to 5 minutes, checking every second whether all pods are running. Once instances are ready, it registers them in the partition database so gateways can route traffic to them. It attempts to scrape an OpenAPI spec from the running service, though this is optional. Finally, it assigns domains and creates gateway configurations via the routing service, then marks the deployment as ready. | ||
|  | ||
| Each phase is durable. If ctrl crashes during deployment, Restate resumes from the last completed phase rather than restarting from the beginning. This ensures deployments complete reliably even during system failures. | 
not important right now but a link to whatever explains how this durability is provided through Restate would be useful
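The two behaviours in the quoted paragraphs, a bounded readiness poll and resuming after the last completed phase, can be illustrated with a small shell sketch. It is a simplified analogy, not Restate's implementation or the ctrl code: the journal file, `run_phase`, and `wait_until_ready` are hypothetical, and the poll counts attempts (300 attempts at a 1-second interval approximates the 5-minute limit).

```shell
#!/bin/bash
# wait_until_ready CHECK [MAX_ATTEMPTS] [INTERVAL_SECONDS]
# Calls CHECK until it succeeds or MAX_ATTEMPTS is exhausted.
wait_until_ready() {
  local check="$1" max_attempts="${2:-300}" interval="${3:-1}"
  local attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    if "$check"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1
}

# run_phase JOURNAL PHASE COMMAND [ARGS...]
# Skips PHASE if the journal already lists it; otherwise runs COMMAND and
# records PHASE only after COMMAND succeeds.
run_phase() {
  local journal="$1" phase="$2"
  shift 2
  if [ -f "$journal" ] && grep -qxF -- "$phase" "$journal"; then
    echo "skip $phase"
    return 0
  fi
  "$@" || return 1
  echo "$phase" >> "$journal"
}
```

A run that crashes and is restarted with the same journal skips every phase already recorded and continues with the first unrecorded one, which is the property the quoted paragraph attributes to the durable workflow.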
|  | ||
| ### Ctrl Service | ||
|  | ||
| The ctrl service provides health checks and instance metadata. Its primary operation is `Liveness`, which serves as a health check endpoint for Kubernetes probes. This service is minimal by design, handling only operational concerns rather than business logic. | 
This line is in opposition with the original statement on line 14.
|  | ||
| ## Kubernetes Backend | ||
|  | ||
| The Kubernetes backend runs inside a cluster with appropriate RBAC permissions. It uses in-cluster config to authenticate with the API server. | 
nit: what's "appropriate"? (link to the permission def or inline it)
commented with questions/suggestions 😄

What does this PR do?
This PR adds comprehensive architecture documentation for the Unkey system, focusing on:
The documentation includes sequence diagrams for deployment flows, detailed explanations of service responsibilities, and technical implementation details for both production and local development environments.