Add health check and separate deployments #2141
base: staging
…on jobs. Update bootstrap logic to conditionally initialize GraphQL components and endpoints based on the new environment variables. Improve health check endpoint to include service statuses and connectivity checks for database and Redis.
…nd staging configurations
- Introduced four GraphQL API instances (impact-graph-graphql-1 to impact-graph-graphql-4) for improved load balancing.
- Updated environment variables and health checks for each instance.
- Adjusted the impact-graph-jobs service to ensure proper logging and cron job functionality.
- Enhanced deployment scripts in GitHub Actions for phased rollout and health verification of services.
| needs: deploy | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: SSH and Redeploy - First Phase | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.PROD_HOST_ALL }} | ||
| username: ${{ secrets.PROD_USERNAME_ALL }} | ||
| key: ${{ secrets.PROD_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| ## Update each backend service one by one | ||
| ## First Deployment - Jobs instance and first GraphQL instance | ||
| docker compose rm -fs impact-graph-graphql-1 | ||
| docker compose rm -fs impact-graph-jobs | ||
| docker compose up --force-recreate -d impact-graph-graphql-1 | ||
| docker compose up --force-recreate -d impact-graph-jobs | ||
| # Wait for impact-graph-graphql-1 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-1 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-1)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-1 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-1 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-graphql-1 is healthy | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-1)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-1 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-jobs is running | ||
| if [ "$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs)" != "\"running\"" ]; then | ||
| echo "impact-graph-jobs is not running, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| echo "First deployment phase completed successfully" | ||
| rollout-deploy-2: |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
| needs: rollout-deploy-1 | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: SSH and Redeploy - Second Phase | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.PROD_HOST_ALL }} | ||
| username: ${{ secrets.PROD_USERNAME_ALL }} | ||
| key: ${{ secrets.PROD_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| ## Second Deployment | ||
| docker compose rm -fs impact-graph-graphql-2 | ||
| docker compose up --force-recreate -d impact-graph-graphql-2 | ||
| # Wait for impact-graph-graphql-2 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-2 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-2)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-2 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-2 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-graphql-2 is healthy | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-2)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-2 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| echo "Second deployment phase completed successfully" | ||
| rollout-deploy-3: |
| needs: rollout-deploy-2 | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: SSH and Redeploy - Third Phase | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.PROD_HOST_ALL }} | ||
| username: ${{ secrets.PROD_USERNAME_ALL }} | ||
| key: ${{ secrets.PROD_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| ## Third Deployment | ||
| docker compose rm -fs impact-graph-graphql-3 | ||
| docker compose up --force-recreate -d impact-graph-graphql-3 | ||
| # Wait for impact-graph-graphql-3 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-3 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-3)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-3 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-3 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-graphql-3 is healthy | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-3)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-3 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| echo "Third deployment phase completed successfully" | ||
| rollout-deploy-4: |
| needs: rollout-deploy-3 | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: SSH and Redeploy - Fourth Phase | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.PROD_HOST_ALL }} | ||
| username: ${{ secrets.PROD_USERNAME_ALL }} | ||
| key: ${{ secrets.PROD_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| ## Fourth Deployment - Additional GraphQL instances for production | ||
| docker compose rm -fs impact-graph-graphql-4 | ||
| docker compose rm -fs impact-graph-graphql-5 | ||
| docker compose up --force-recreate -d impact-graph-graphql-4 | ||
| docker compose up --force-recreate -d impact-graph-graphql-5 | ||
| # Wait for impact-graph-graphql-4 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-4 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-4)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-4 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-4 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Wait for impact-graph-graphql-5 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-5 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-5)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-5 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-5 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Check if both instances are healthy | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-4)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-4 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-5)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-5 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| echo "Fourth deployment phase completed successfully" | ||
| deployment-verification: |
| needs: rollout-deploy-4 | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Verify All Services | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.PROD_HOST_ALL }} | ||
| username: ${{ secrets.PROD_USERNAME_ALL }} | ||
| key: ${{ secrets.PROD_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| echo "=== Final Production Deployment Verification ===" | ||
| # Check all GraphQL instances health status | ||
| for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3 impact-graph-graphql-4 impact-graph-graphql-5; do | ||
| health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"") | ||
| echo "$service health status: $health_status" | ||
| if [ "$health_status" != "\"healthy\"" ]; then | ||
| echo "ERROR: $service is not healthy!" | ||
| exit 1 | ||
| fi | ||
| done | ||
| # Check jobs instance status | ||
| jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"") | ||
| echo "impact-graph-jobs status: $jobs_status" | ||
| if [ "$jobs_status" != "\"running\"" ]; then | ||
| echo "ERROR: impact-graph-jobs is not running!" | ||
| exit 1 | ||
| fi | ||
| echo "=== All services are healthy and running! ===" | ||
| echo "Production deployment completed successfully at $(date)" |
| needs: deploy | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: SSH and Redeploy - First Phase | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.STAGING_HOST_ALL }} | ||
| username: ${{ secrets.STAGING_USERNAME_ALL }} | ||
| key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| ## Update each backend service one by one | ||
| ## First Deployment - Jobs instance and first GraphQL instance | ||
| docker compose rm -fs impact-graph-graphql-1 | ||
| docker compose rm -fs impact-graph-jobs | ||
| docker compose up --force-recreate -d impact-graph-graphql-1 | ||
| docker compose up --force-recreate -d impact-graph-jobs | ||
| # Wait for impact-graph-graphql-1 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-1 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-1)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-1 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-1 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-graphql-1 is healthy | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-1)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-1 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-jobs is running | ||
| if [ "$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs)" != "\"running\"" ]; then | ||
| echo "impact-graph-jobs is not running, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| echo "First deployment phase completed successfully" | ||
| rollout-deploy-2: |
| needs: rollout-deploy-1 | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: SSH and Redeploy - Second Phase | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.STAGING_HOST_ALL }} | ||
| username: ${{ secrets.STAGING_USERNAME_ALL }} | ||
| key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| ## Second Deployment | ||
| docker compose rm -fs impact-graph-graphql-2 | ||
| docker compose up --force-recreate -d impact-graph-graphql-2 | ||
| # Wait for impact-graph-graphql-2 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-2 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-2)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-2 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-2 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-graphql-2 is healthy | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-2)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-2 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| echo "Second deployment phase completed successfully" | ||
| rollout-deploy-3: |
| needs: rollout-deploy-2 | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: SSH and Redeploy - Third Phase | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.STAGING_HOST_ALL }} | ||
| username: ${{ secrets.STAGING_USERNAME_ALL }} | ||
| key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| ## Third Deployment | ||
| docker compose rm -fs impact-graph-graphql-3 | ||
| docker compose up --force-recreate -d impact-graph-graphql-3 | ||
| # Wait for impact-graph-graphql-3 to be healthy (timeout after 5 minutes) | ||
| echo "Waiting for impact-graph-graphql-3 to become healthy..." | ||
| timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-3)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-3 to be healthy..."; sleep 5; done' | ||
| if [ $? -eq 124 ]; then | ||
| echo "Timeout waiting for impact-graph-graphql-3 to become healthy" | ||
| exit 1 | ||
| fi | ||
| # Check if impact-graph-graphql-3 is healthy | ||
| if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-3)" != "\"healthy\"" ]; then | ||
| echo "impact-graph-graphql-3 is not healthy, stopping deployment" | ||
| exit 1 | ||
| fi | ||
| echo "Third deployment phase completed successfully" | ||
| deployment-verification: |
| needs: rollout-deploy-3 | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Verify All Services | ||
| uses: appleboy/[email protected] | ||
| with: | ||
| host: ${{ secrets.STAGING_HOST_ALL }} | ||
| username: ${{ secrets.STAGING_USERNAME_ALL }} | ||
| key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }} | ||
| port: ${{ secrets.SSH_PORT }} | ||
| script: | | ||
| cd giveth-all | ||
| echo "=== Final Deployment Verification ===" | ||
| # Check all GraphQL instances health status | ||
| for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3; do | ||
| health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"") | ||
| echo "$service health status: $health_status" | ||
| if [ "$health_status" != "\"healthy\"" ]; then | ||
| echo "ERROR: $service is not healthy!" | ||
| exit 1 | ||
| fi | ||
| done | ||
| # Check jobs instance status | ||
| jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"") | ||
| echo "impact-graph-jobs status: $jobs_status" | ||
| if [ "$jobs_status" != "\"running\"" ]; then | ||
| echo "ERROR: impact-graph-jobs is not running!" | ||
| exit 1 | ||
| fi | ||
| echo "=== All services are healthy and running! ===" | ||
| echo "Deployment completed successfully at $(date)" |
Walkthrough
Introduces phased deployments in staging and production workflows, splits the monolithic impact-graph service into multiple GraphQL instances plus a dedicated jobs service, adds feature flags to control GraphQL and cronjobs, and gates server initialization and health reporting accordingly.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant GH as GitHub Actions
participant SSH as SSH Runner (Host)
participant DC as docker compose
participant SVC as Services (graphql-n, jobs)
rect rgb(245,248,255)
note over GH: Deploy job
GH->>SSH: git reset/checkout/pull (branch)
GH->>SSH: docker image prune -a --force
GH->>DC: pull images
end
rect rgb(240,255,245)
note over GH,SVC: Phased rollout (n=1..k)
GH->>SSH: rm graphql-n (+jobs in phase 1)
GH->>SSH: up --force-recreate graphql-n (+jobs)
loop health check ≤5 min
SSH->>SVC: docker inspect Health.Status
end
alt healthy
GH->>GH: proceed to next phase
else not healthy
GH--xGH: fail early
end
end
rect rgb(255,250,240)
note over GH,SVC: Final verification
GH->>SVC: check graphql-1..k healthy and jobs running
end
sequenceDiagram
autonumber
participant Node as Server bootstrap.ts
participant DB as Database
participant R as Redis
participant GQL as GraphQL Server
participant Cron as Cron Scheduler
participant HC as /health
Node->>Node: read ENABLE_GRAPHQL / ENABLE_CRONJOBS
alt GraphQL enabled
Node->>GQL: init schema, plugins, context, start
else
Node->>Node: skip GraphQL init
end
alt Cronjobs enabled
Node->>Cron: initialize cron jobs
else
Node->>Node: skip cron setup
end
HC->>DB: SELECT 1
HC->>R: PING
HC-->>Node: JSON {graphql, cronjobs, db, redis, version, env, uptime}
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 1
🧹 Nitpick comments (13)
config/example.env (1)
346-348: Document the default behavior for feature toggles
The comment states these are service feature toggles, but doesn't clarify what happens when these values are not set. Based on the code in src/server/bootstrap.ts, both default to true for backward compatibility.
Consider enhancing the documentation:
 # Service feature toggles
+# Both default to 'true' if not set (for backward compatibility)
+# Set to 'false' to disable the respective service functionality
 ENABLE_GRAPHQL=true
 ENABLE_CRONJOBS=true
.github/workflows/master-pipeline.yml (2)
167-207: Consider adding explicit permissions for GitHub token security
CodeQL correctly flags that this job lacks explicit permissions. While the job doesn't directly use GITHUB_TOKEN, setting minimal permissions is a security best practice.
Add minimal permissions block:
   rollout-deploy-1:
     needs: deploy
     runs-on: ubuntu-latest
+    permissions: {}
     steps:
333-333: Remove trailing spaces
YAMLlint detected trailing spaces that should be removed.
-          echo "=== Final Production Deployment Verification ==="
-
+          echo "=== Final Production Deployment Verification ==="
+
           # Check all GraphQL instances health status
           for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3 impact-graph-graphql-4 impact-graph-graphql-5; do
             health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"")
@@ -339,15 +339,15 @@
               exit 1
             fi
           done
-
+
           # Check jobs instance status
           jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"")
           echo "impact-graph-jobs status: $jobs_status"
           if [ "$jobs_status" != "\"running\"" ]; then
             echo "ERROR: impact-graph-jobs is not running!"
             exit 1
           fi
-
+
           echo "=== All services are healthy and running! ==="
Also applies to: 343-343, 351-351
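Since trailing whitespace is invisible in a rendered diff, a small local check can help confirm the fix. The following is a minimal sketch using standard `grep` and `sed`; the function names `find_trailing_ws` and `strip_trailing_ws` are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# find_trailing_ws <file>: list lines ending in spaces or tabs as "line_number:content".
# Exits non-zero when no such line exists (grep's usual behaviour).
find_trailing_ws() {
  grep -nE '[[:space:]]+$' "$1"
}

# strip_trailing_ws <file>: rewrite the file without trailing spaces or tabs.
# The content is written back in place so the file's permissions are preserved.
strip_trailing_ws() {
  local tmp
  tmp="$(mktemp)"
  sed -E 's/[[:space:]]+$//' "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}
```

For example, `find_trailing_ws .github/workflows/master-pipeline.yml` should print the lines YAMLlint reports, and nothing once they are cleaned.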
.github/workflows/staging-pipeline.yml (2)
193-232: Add minimal permissions for security compliance
Similar to the production pipeline, these jobs should have explicit permissions set.
   rollout-deploy-1:
     needs: deploy
     runs-on: ubuntu-latest
+    permissions: {}
     steps:
Apply the same change to rollout-deploy-2, rollout-deploy-3, and deployment-verification jobs.
311-311: Remove trailing spaces in YAML
-          echo "=== Final Deployment Verification ==="
-
+          echo "=== Final Deployment Verification ==="
+
           # Check all GraphQL instances health status
           for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3; do
             health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"")
@@ -317,7 +317,7 @@
               exit 1
             fi
           done
-
+
           # Check jobs instance status
           jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"")
           echo "impact-graph-jobs status: $jobs_status"
@@ -326,7 +326,7 @@
             echo "ERROR: impact-graph-jobs is not running!"
             exit 1
           fi
-
+
           echo "=== All services are healthy and running! ==="
Also applies to: 321-321, 329-329
docker-compose-staging.yml (8)
1-1: Bump Compose file version (or drop it) to avoid start_period incompatibility.
You’re using healthcheck.start_period but the file declares version "3.3", where start_period may be ignored by older docker-compose. Safer choices: bump to 3.8 or remove the version key (Compose Spec).
Apply either of these minimal diffs:
-version: '3.3'
+version: '3.8'  # ensure start_period is recognized by older docker-compose
or
-version: '3.3'
+# version intentionally omitted to use the latest Compose Specification
28-33: Make healthchecks independent of curl availability.
Relying on curl assumes it exists in the image. Use Node (always present here) to check /health, avoiding surprises.
Apply this pattern (repeat for each service block shown in the line ranges):
-    healthcheck:
-      test: ["CMD-SHELL", "curl -f http://localhost:4000/health || exit 1"]
+    healthcheck:
+      # Avoid dependency on curl: use Node to assert HTTP 200
+      test: ["CMD", "node", "-e", "require('http').get('http://127.0.0.1:4000/health', r => process.exit(r.statusCode===200?0:1)).on('error',()=>process.exit(1))"]
       interval: 30s
       timeout: 10s
       retries: 3
       start_period: 60s
Also applies to: 59-63, 89-93, 119-124
14-23: Mount config read-only; verify need for duplicate config mounts.
- Config should be immutable at runtime; mark those two config binds as read-only.
- You map ./config into both /usr/src/app/config and /usr/src/app/build/config. If the app only reads one, drop the other to reduce complexity.
Example for each service (apply to both config mounts):
       - type: bind
         source: ./config
         target: /usr/src/app/config
+        read_only: true
       - type: bind
         source: ./config
         target: /usr/src/app/build/config
+        read_only: true
Please confirm whether both container paths are actually read by the app at runtime. If only one is needed, we can delete the other bind.
Also applies to: 44-53, 74-83, 105-114
26-27: depends_on does not wait for Redis to be ready; consider health-gated dependency or app-level retries.
Compose’s depends_on controls start order only; it doesn’t wait for service health. If the app needs Redis on boot, add a Redis healthcheck and (if your Compose supports it) use condition: service_healthy, or ensure robust retry/backoff in the app bootstrap.
Minimal example once Redis has a healthcheck:
-    depends_on:
-      - redis-giveth
+    depends_on:
+      redis-giveth:
+        condition: service_healthy
If your environment’s compose doesn’t support conditions, keep app-level retry logic.
Also applies to: 56-57, 86-87, 117-118
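Where health-gated `depends_on` is not available, the retry/backoff idea can also be applied at the shell level, for instance in a deploy or entrypoint script. The sketch below is a stand-in for the app-level retry logic mentioned above, not the application's actual bootstrap code; `retry_with_backoff` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
# Runs the command until it succeeds, doubling the delay after each failure.
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -lt "$max_attempts" ]; then
      echo "Attempt $attempt/$max_attempts failed; retrying in ${delay}s..." >&2
      sleep "$delay"
      delay=$((delay * 2))
    fi
    attempt=$((attempt + 1))
  done
  echo "Giving up after $max_attempts attempts" >&2
  return 1
}

# Example (assumes the redis-giveth container from the compose file is running):
#   retry_with_backoff 5 1 docker exec redis-giveth redis-cli ping
```

Exponential delays keep early retries fast while avoiding a tight loop if Redis takes longer to start.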
115-116: Don’t publish the jobs port; it’s only used for internal health.
Jobs don’t expose a public API, and healthcheck runs inside the container. Remove the host port mapping to reduce surface area.
-    ports:
-      - "4000:4000"
126-135: Tighten Redis: drop unused env, add healthcheck, and update base.
- REDIS_ALLOW_EMPTY_PASSWORD is not used by the official redis image; remove to avoid confusion.
- Add a Redis healthcheck so other services can gate on it.
- Consider updating off EOL Alpine and pinning to a supported patch (or track minor): e.g., redis:7.2-alpine3.20.
Apply this diff:
   redis-giveth:
-    # it's better to not using the latest tag, maybe latest tag have some breaking changes
-    image: redis:7.2.0-alpine3.18
+    # Pin to a supported base; avoid EOL Alpine branches
+    image: redis:7.2-alpine3.20
     container_name: redis-giveth
-    environment:
-      - REDIS_ALLOW_EMPTY_PASSWORD=yes
     restart: always
     volumes:
       - redis-data:/data
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      interval: 10s
+      timeout: 3s
+      retries: 5
Please confirm your target runtime supports the chosen tag and that redis-cli is present (it is in the official image by default).
5-13: DRY the duplicated service scaffolding with anchors or env_file.
The three GraphQL services share image/command/restart/volumes and most env. You can simplify with YAML anchors or an env_file for common keys to reduce drift and speed edits.
Example (outside these blocks):
x-graphql-common: &graphql_common
  image: ghcr.io/giveth/impact-graph:staging
  command: npm run start:docker:server
  restart: always
  volumes:
    - type: bind
      source: ./config
      target: /usr/src/app/config
      read_only: true
    - type: bind
      source: ./config
      target: /usr/src/app/build/config
      read_only: true
    - type: bind
      source: ./logs
      target: /usr/src/app/logs

impact-graph-graphql-1:
  <<: *graphql_common
  environment:
    - ENVIRONMENT=staging
    - LOG_PATH=/usr/src/app/logs/impact-graph-graphql-1.log
    - ENABLE_GRAPHQL=true
    - ENABLE_CRONJOBS=false
  ports: ["4001:4000"]
Repeat for graphql-2/3 with only LOG_PATH and ports differing.
Also applies to: 35-43, 65-73, 100-103
10-12: Optional: prefer stdout logging over bind-mounted log files in containers.
File logging inside containers complicates rotation and can lead to permission issues. Consider logging to stdout/stderr and rely on the Docker logging driver or your aggregator; remove the logs bind if you adopt that approach.
If you want to keep file logs for now, no action needed—just ensure ./logs exists and is writable by the container user in your staging hosts.
Also applies to: 40-42, 70-72, 101-103
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
- .github/workflows/master-pipeline.yml (1 hunks)
- .github/workflows/staging-pipeline.yml (1 hunks)
- config/example.env (1 hunks)
- docker-compose-production.yml (2 hunks)
- docker-compose-staging.yml (2 hunks)
- src/server/bootstrap.ts (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/server/bootstrap.ts (5)
src/utils/logger.ts (1)
logger (44-44)
src/types/ApolloContext.ts (1)
ApolloContext (16-20)
src/services/authorizationServices.ts (1)
authorizationHandler (51-69)
src/orm.ts (1)
AppDataSource (8-69)
src/redis.ts (1)
redis (10-10)
🪛 GitHub Check: CodeQL
.github/workflows/staging-pipeline.yml
[warning] 193-233: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
[warning] 234-265: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
[warning] 266-297: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
[warning] 298-331: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
.github/workflows/master-pipeline.yml
[warning] 168-208: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
[warning] 209-240: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
[warning] 241-272: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
[warning] 273-319: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
[warning] 320-353: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}
🪛 YAMLlint (1.37.1)
.github/workflows/staging-pipeline.yml
[error] 311-311: trailing spaces
(trailing-spaces)
[error] 321-321: trailing spaces
(trailing-spaces)
[error] 329-329: trailing spaces
(trailing-spaces)
.github/workflows/master-pipeline.yml
[error] 333-333: trailing spaces
(trailing-spaces)
[error] 343-343: trailing spaces
(trailing-spaces)
[error] 351-351: trailing spaces
(trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (9)
.github/workflows/master-pipeline.yml (1)
161-165: Good use of Git reset for clean deployment state
The deployment now uses git reset --hard followed by explicit checkout and pull, ensuring a clean state that matches the remote repository. This is more reliable than relying on local changes.
src/server/bootstrap.ts (4)
86-89: Feature toggles default to enabled - verify backward compatibility
The feature flags check for !== 'false', which means they default to true. This ensures backward compatibility when the environment variables are not set.
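The same default-on rule can be mirrored in deploy scripts that need to agree with the server about which services are enabled. This is a minimal sketch of the `!== 'false'` comparison as described above, assuming the check is an exact, case-sensitive string match; `is_enabled` is a hypothetical helper, not code from bootstrap.ts.

```shell
#!/usr/bin/env bash
# is_enabled <value>: succeeds unless the value is exactly the string "false".
# Unset, empty, "true", or any other value counts as enabled (default-on).
is_enabled() {
  [ "${1-}" != "false" ]
}

# Example:
#   if is_enabled "${ENABLE_CRONJOBS-}"; then echo "cronjobs enabled"; fi
```

Under this rule a value such as `FALSE` or `0` still leaves the feature on, which is worth knowing when writing env files.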
319-375: Excellent health check implementation
The enhanced health check endpoint provides comprehensive service status including:
- Individual service states (GraphQL, cronjobs)
- Database and Redis connectivity verification
- Version and environment information
- Proper HTTP status codes (200 for healthy, 503 for degraded)
This will greatly improve observability in production.
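From an operator's side, the 200/503 distinction can be consumed with a small probe. The sketch below assumes the endpoint is reachable at `http://localhost:4000/health` (the URL used by the compose healthchecks); `probe_health` and `classify_health` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# probe_health <url>: print the HTTP status code returned by a health endpoint.
# curl prints 000 when it gets no HTTP response at all.
probe_health() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# classify_health <http_code>: map the status code to a coarse state.
classify_health() {
  case "$1" in
    200) echo "healthy" ;;
    503) echo "degraded" ;;
    *)   echo "unreachable_or_unexpected" ;;
  esac
}

# Example:
#   classify_health "$(probe_health http://localhost:4000/health)"
```

Keeping the classification separate from the network call makes the mapping easy to check without a running server.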
129-199: Proper gating of GraphQL initialization
The code correctly gates all GraphQL-related initialization including schema creation, Apollo server setup, and worker pool creation based on the isGraphqlEnabled flag. This ensures clean separation of concerns.
554-556: Cronjobs initialization properly gated
The cronjobs are only initialized when isCronjobsEnabled is true, preventing duplicate job execution across multiple instances.
docker-compose-production.yml (3)
4-124: Well-structured service separation for GraphQL instances
The separation into multiple GraphQL instances (1-4) with:
- Individual log files for better debugging
- Proper health checks on each instance
- Disabled cronjobs to prevent duplicate execution
- Different exposed ports for load balancing
This architecture enables zero-downtime deployments and better scalability.
125-154: Dedicated jobs service properly configured
The jobs service is correctly configured with:
- GraphQL disabled to prevent unnecessary resource usage
- Cronjobs enabled for background processing
- Separate log file for job monitoring
- Health check endpoint for monitoring
95-124: Verify if a fifth GraphQL instance is needed
The production pipeline references impact-graph-graphql-5, but only 4 GraphQL services are defined here. Either add the fifth service to this file or update the pipeline to only reference services 1-4.
docker-compose-staging.yml (1)
95-104: Confirm that /health is served when ENABLE_GRAPHQL=false (jobs service).
Your jobs container disables GraphQL but keeps the same server command. Verify that the process still binds :4000 and serves /health so the healthcheck won’t flap.
Suggested local check:
- docker compose up -d impact-graph-jobs
- docker compose logs -f impact-graph-jobs
- Inside the container: curl -f http://localhost:4000/health and confirm HTTP 200 with expected JSON.
Also applies to: 119-124
| docker compose rm -fs impact-graph-graphql-4 | ||
| docker compose rm -fs impact-graph-graphql-5 | ||
| docker compose up --force-recreate -d impact-graph-graphql-4 | ||
| docker compose up --force-recreate -d impact-graph-graphql-5 |
💡 Verification agent
🧩 Analysis chain
Container mismatch: impact-graph-graphql-5 not defined in docker-compose
The pipeline attempts to deploy impact-graph-graphql-5, but docker-compose-production.yml only defines services up to impact-graph-graphql-4.
🏁 Script executed:
#!/bin/bash
# Verify which GraphQL services are defined in docker-compose files
echo "=== Checking docker-compose-production.yml for GraphQL service definitions ==="
grep -n "impact-graph-graphql-" docker-compose-production.yml | head -20
echo -e "\n=== Checking if graphql-5 is defined anywhere ==="
grep -r "impact-graph-graphql-5" --include="*.yml" --include="*.yaml" .
Length of output: 2114
Critical: Missing service definition for impact-graph-graphql-5
A verification of your compose setup shows that only services impact-graph-graphql-1 through impact-graph-graphql-4 are defined in your Docker Compose files—no impact-graph-graphql-5 exists. The workflow steps at .github/workflows/master-pipeline.yml (lines 286–289) will therefore fail when trying to remove or start impact-graph-graphql-5.
• Location of broken refs:
- File: .github/workflows/master-pipeline.yml
  Lines: 286–289
    docker compose rm -fs impact-graph-graphql-4
  - docker compose rm -fs impact-graph-graphql-5
    docker compose up --force-recreate -d impact-graph-graphql-4
  - docker compose up --force-recreate -d impact-graph-graphql-5
• Verification summary:
- No “impact-graph-graphql-5:” block found in any docker-compose*.yml.
- Only definitions for services impact-graph-graphql-1–4 exist.
Next steps (choose one):
- Define impact-graph-graphql-5 in your compose file(s) (e.g. duplicate the -graphql-4 service and adjust as needed).
- Remove or update all references to impact-graph-graphql-5 in the master-pipeline workflow.
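The cross-check performed in this comment (service names referenced by the workflow versus service names defined in the compose file) could also be automated. The sketch below is one hedged way to do it with plain string matching rather than a YAML parser; the function names are hypothetical, and it assumes service keys sit on their own indented line ending in a colon.

```typescript
// Escape regex metacharacters so the prefix is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Collect unique first-capture (or whole-match) strings for a global pattern.
function collectMatches(pattern: RegExp, text: string, group: number): string[] {
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const name = match[group];
    if (name !== undefined && found.indexOf(name) === -1) found.push(name);
  }
  return found;
}

// Service keys such as "  impact-graph-graphql-1:" on an indented line of their own.
function definedServices(composeYaml: string, prefix: string): string[] {
  const pattern = new RegExp(
    '^[ \\t]+(' + escapeRegExp(prefix) + '[\\w-]+):[ \\t]*$',
    'gm',
  );
  return collectMatches(pattern, composeYaml, 1);
}

// Any mention of a prefixed service name in the workflow script.
function referencedServices(workflowText: string, prefix: string): string[] {
  const pattern = new RegExp(escapeRegExp(prefix) + '[\\w-]+', 'g');
  return collectMatches(pattern, workflowText, 0);
}

// Names the workflow uses that the compose file never defines.
function undefinedServices(
  composeYaml: string,
  workflowText: string,
  prefix: string,
): string[] {
  const defined = definedServices(composeYaml, prefix);
  return referencedServices(workflowText, prefix)
    .filter((name) => defined.indexOf(name) === -1)
    .sort();
}
```

Run against the production compose file and the master pipeline with the prefix impact-graph-graphql-, a non-empty result would flag the kind of mismatch reported here before the deployment job runs.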
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| docker compose rm -fs impact-graph-graphql-4 | |
| docker compose rm -fs impact-graph-graphql-5 | |
| docker compose up --force-recreate -d impact-graph-graphql-4 | |
| docker compose up --force-recreate -d impact-graph-graphql-5 | |
| docker compose rm -fs impact-graph-graphql-4 | |
| docker compose up --force-recreate -d impact-graph-graphql-4 |
🤖 Prompt for AI Agents
.github/workflows/master-pipeline.yml around lines 286 to 289: the workflow
references docker services impact-graph-graphql-5 (remove/stop/start) but no
such service exists in your docker-compose files (only impact-graph-graphql-1
through -4), causing the job to fail; fix by either adding a matching
impact-graph-graphql-5 service block to your docker-compose*.yml (e.g.,
duplicate and adjust the -graphql-4 service) or remove/update all references to
impact-graph-graphql-5 in this workflow so it only manages existing services.