Add health check and separate deployments #2141

ae2079 · 2025-08-26T23:30:48Z

Summary by CodeRabbit

New Features
- Health endpoint now returns detailed JSON (service statuses, environment, version, uptime).
- Feature flags to enable/disable GraphQL and background jobs.
Refactor
- Split single service into multiple GraphQL instances plus a dedicated jobs service for staging and production, improving scalability and isolation.
Chores
- Introduced phased, health-checked rollout pipelines with final verification for staging and production.
- Streamlined deployment flow and image refresh for more reliable releases.
- Example environment file updated with new service toggles.

…on jobs. Update bootstrap logic to conditionally initialize GraphQL components and endpoints based on the new environment variables. Improve health check endpoint to include service statuses and connectivity checks for database and Redis.

…nd staging configurations
- Introduced four GraphQL API instances (impact-graph-graphql-1 to impact-graph-graphql-4) for improved load balancing.
- Updated environment variables and health checks for each instance.
- Adjusted the impact-graph-jobs service to ensure proper logging and cron job functionality.
- Enhanced deployment scripts in GitHub Actions for phased rollout and health verification of services.

.github/workflows/master-pipeline.yml

+    needs: deploy
+    runs-on: ubuntu-latest
+    steps:
+      - name: SSH and Redeploy - First Phase
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.PROD_HOST_ALL }}
+          username: ${{ secrets.PROD_USERNAME_ALL }}
+          key: ${{ secrets.PROD_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            ## Update each backend service one by one
+            ## First Deployment - Jobs instance and first GraphQL instance
+            docker compose rm -fs impact-graph-graphql-1
+            docker compose rm -fs impact-graph-jobs
+            docker compose up --force-recreate -d impact-graph-graphql-1
+            docker compose up --force-recreate -d impact-graph-jobs
+
+            # Wait for impact-graph-graphql-1 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-1 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-1)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-1 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-1 to become healthy"
+                exit 1
+            fi
+
+            # Check if impact-graph-graphql-1 is healthy
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-1)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-1 is not healthy, stopping deployment"
+                exit 1
+            fi
+
+            # Check if impact-graph-jobs is running
+            if [ "$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs)" != "\"running\"" ]; then
+                echo "impact-graph-jobs is not running, stopping deployment"
+                exit 1
+            fi
+            echo "First deployment phase completed successfully"
+
+  rollout-deploy-2:


.github/workflows/master-pipeline.yml

+    needs: rollout-deploy-1
+    runs-on: ubuntu-latest
+    steps:
+      - name: SSH and Redeploy - Second Phase
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.PROD_HOST_ALL }}
+          username: ${{ secrets.PROD_USERNAME_ALL }}
+          key: ${{ secrets.PROD_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            ## Second Deployment
+            docker compose rm -fs impact-graph-graphql-2
+            docker compose up --force-recreate -d impact-graph-graphql-2
+
+            # Wait for impact-graph-graphql-2 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-2 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-2)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-2 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-2 to become healthy"
+                exit 1
+            fi
+
+            # Check if impact-graph-graphql-2 is healthy
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-2)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-2 is not healthy, stopping deployment"
+                exit 1
+            fi
+            echo "Second deployment phase completed successfully"
+
+  rollout-deploy-3:


.github/workflows/master-pipeline.yml

+    needs: rollout-deploy-2
+    runs-on: ubuntu-latest
+    steps:
+      - name: SSH and Redeploy - Third Phase
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.PROD_HOST_ALL }}
+          username: ${{ secrets.PROD_USERNAME_ALL }}
+          key: ${{ secrets.PROD_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            ## Third Deployment
+            docker compose rm -fs impact-graph-graphql-3
+            docker compose up --force-recreate -d impact-graph-graphql-3
+
+            # Wait for impact-graph-graphql-3 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-3 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-3)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-3 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-3 to become healthy"
+                exit 1
+            fi
+
+            # Check if impact-graph-graphql-3 is healthy
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-3)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-3 is not healthy, stopping deployment"
+                exit 1
+            fi
+            echo "Third deployment phase completed successfully"
+
+  rollout-deploy-4:


.github/workflows/master-pipeline.yml

+    needs: rollout-deploy-3
+    runs-on: ubuntu-latest
+    steps:
+      - name: SSH and Redeploy - Fourth Phase
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.PROD_HOST_ALL }}
+          username: ${{ secrets.PROD_USERNAME_ALL }}
+          key: ${{ secrets.PROD_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            ## Fourth Deployment - Additional GraphQL instances for production
+            docker compose rm -fs impact-graph-graphql-4
+            docker compose rm -fs impact-graph-graphql-5
+            docker compose up --force-recreate -d impact-graph-graphql-4
+            docker compose up --force-recreate -d impact-graph-graphql-5
+
+            # Wait for impact-graph-graphql-4 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-4 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-4)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-4 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-4 to become healthy"
+                exit 1
+            fi
+
+            # Wait for impact-graph-graphql-5 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-5 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-5)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-5 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-5 to become healthy"
+                exit 1
+            fi
+
+            # Check if both instances are healthy
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-4)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-4 is not healthy, stopping deployment"
+                exit 1
+            fi
+
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-5)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-5 is not healthy, stopping deployment"
+                exit 1
+            fi
+            echo "Fourth deployment phase completed successfully"
+
+  deployment-verification:


.github/workflows/master-pipeline.yml

+    needs: rollout-deploy-4
+    runs-on: ubuntu-latest
+    steps:
+      - name: Verify All Services
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.PROD_HOST_ALL }}
+          username: ${{ secrets.PROD_USERNAME_ALL }}
+          key: ${{ secrets.PROD_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            echo "=== Final Production Deployment Verification ==="
+            
+            # Check all GraphQL instances health status
+            for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3 impact-graph-graphql-4 impact-graph-graphql-5; do
+              health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"")
+              echo "$service health status: $health_status"
+              if [ "$health_status" != "\"healthy\"" ]; then
+                echo "ERROR: $service is not healthy!"
+                exit 1
+              fi
+            done
+            
+            # Check jobs instance status
+            jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"")
+            echo "impact-graph-jobs status: $jobs_status"
+            if [ "$jobs_status" != "\"running\"" ]; then
+              echo "ERROR: impact-graph-jobs is not running!"
+              exit 1
+            fi
+            
+            echo "=== All services are healthy and running! ==="
+            echo "Production deployment completed successfully at $(date)"


.github/workflows/staging-pipeline.yml

+    needs: deploy
+    runs-on: ubuntu-latest
+    steps:
+      - name: SSH and Redeploy - First Phase
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.STAGING_HOST_ALL }}
+          username: ${{ secrets.STAGING_USERNAME_ALL }}
+          key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            ## Update each backend service one by one
+            ## First Deployment - Jobs instance and first GraphQL instance
+            docker compose rm -fs impact-graph-graphql-1
+            docker compose rm -fs impact-graph-jobs
+            docker compose up --force-recreate -d impact-graph-graphql-1
+            docker compose up --force-recreate -d impact-graph-jobs
+
+            # Wait for impact-graph-graphql-1 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-1 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-1)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-1 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-1 to become healthy"
+                exit 1
+            fi
+
+            # Check if impact-graph-graphql-1 is healthy
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-1)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-1 is not healthy, stopping deployment"
+                exit 1
+            fi
+
+            # Check if impact-graph-jobs is running
+            if [ "$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs)" != "\"running\"" ]; then
+                echo "impact-graph-jobs is not running, stopping deployment"
+                exit 1
+            fi
+            echo "First deployment phase completed successfully"
+
+  rollout-deploy-2:


.github/workflows/staging-pipeline.yml

+    needs: rollout-deploy-1
+    runs-on: ubuntu-latest
+    steps:
+      - name: SSH and Redeploy - Second Phase
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.STAGING_HOST_ALL }}
+          username: ${{ secrets.STAGING_USERNAME_ALL }}
+          key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            ## Second Deployment
+            docker compose rm -fs impact-graph-graphql-2
+            docker compose up --force-recreate -d impact-graph-graphql-2
+
+            # Wait for impact-graph-graphql-2 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-2 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-2)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-2 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-2 to become healthy"
+                exit 1
+            fi
+
+            # Check if impact-graph-graphql-2 is healthy
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-2)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-2 is not healthy, stopping deployment"
+                exit 1
+            fi
+            echo "Second deployment phase completed successfully"
+
+  rollout-deploy-3:


.github/workflows/staging-pipeline.yml

+    needs: rollout-deploy-2
+    runs-on: ubuntu-latest
+    steps:
+      - name: SSH and Redeploy - Third Phase
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.STAGING_HOST_ALL }}
+          username: ${{ secrets.STAGING_USERNAME_ALL }}
+          key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            ## Third Deployment
+            docker compose rm -fs impact-graph-graphql-3
+            docker compose up --force-recreate -d impact-graph-graphql-3
+
+            # Wait for impact-graph-graphql-3 to be healthy (timeout after 5 minutes)
+            echo "Waiting for impact-graph-graphql-3 to become healthy..."
+            timeout 300 bash -c 'until [ "$(docker inspect --format="{{json .State.Health.Status}}" impact-graph-graphql-3)" == "\"healthy\"" ]; do echo "Waiting for impact-graph-graphql-3 to be healthy..."; sleep 5; done'
+            if [ $? -eq 124 ]; then
+                echo "Timeout waiting for impact-graph-graphql-3 to become healthy"
+                exit 1
+            fi
+
+            # Check if impact-graph-graphql-3 is healthy
+            if [ "$(docker inspect --format='{{json .State.Health.Status}}' impact-graph-graphql-3)" != "\"healthy\"" ]; then
+                echo "impact-graph-graphql-3 is not healthy, stopping deployment"
+                exit 1
+            fi
+            echo "Third deployment phase completed successfully"
+
+  deployment-verification:


.github/workflows/staging-pipeline.yml

+    needs: rollout-deploy-3
+    runs-on: ubuntu-latest
+    steps:
+      - name: Verify All Services
+        uses: appleboy/ssh-action@…
+        with:
+          host: ${{ secrets.STAGING_HOST_ALL }}
+          username: ${{ secrets.STAGING_USERNAME_ALL }}
+          key: ${{ secrets.STAGING_PRIVATE_KEY_ALL }}
+          port: ${{ secrets.SSH_PORT }}
+          script: |
+            cd giveth-all
+            echo "=== Final Deployment Verification ==="
+            
+            # Check all GraphQL instances health status
+            for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3; do
+              health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"")
+              echo "$service health status: $health_status"
+              if [ "$health_status" != "\"healthy\"" ]; then
+                echo "ERROR: $service is not healthy!"
+                exit 1
+              fi
+            done
+            
+            # Check jobs instance status
+            jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"")
+            echo "impact-graph-jobs status: $jobs_status"
+            if [ "$jobs_status" != "\"running\"" ]; then
+              echo "ERROR: impact-graph-jobs is not running!"
+              exit 1
+            fi
+            
+            echo "=== All services are healthy and running! ==="
+            echo "Deployment completed successfully at $(date)"
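
The verification scripts above stop at the first unhealthy container. As a rough sketch of an alternative (not what this PR does), the loop could report every service before failing, so that one run shows the full picture. `verify_services` is a hypothetical helper name, and dropping `json` from the format template is an assumption made here so the status compares as a bare word.

```shell
# Hypothetical helper: print the health status of every named container,
# then return non-zero if at least one of them was not "healthy".
verify_services() {
  local failed=0 name status
  for name in "$@"; do
    # Without `json`, the template prints the bare word (healthy, unhealthy, starting).
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null || echo not_found)"
    echo "$name health status: $status"
    if [ "$status" != "healthy" ]; then
      failed=1
    fi
  done
  return "$failed"
}
```

A staging run might then call `verify_services impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3 || exit 1`.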


coderabbitai · 2025-08-26T23:41:20Z

Walkthrough

Introduces phased deployments in staging and production workflows, splits the monolithic impact-graph service into multiple GraphQL instances plus a dedicated jobs service, adds feature flags to control GraphQL and cronjobs, and gates server initialization and health reporting accordingly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CI/CD Rollout Pipelines `/.github/workflows/staging-pipeline.yml`, `/.github/workflows/master-pipeline.yml` | Replace single-step docker-compose restarts with git-based sync and image prune/pull. Add sequential rollout jobs over SSH, per-phase container recreation, and 5‑minute health checks. Final verification step ensures all GraphQL instances are healthy and jobs is running. |
| Runtime Feature Flags & Health `/src/server/bootstrap.ts` | Add ENABLE_GRAPHQL and ENABLE_CRONJOBS gating. Conditionally initialize GraphQL server/endpoints and cronjobs. Expand /health JSON to report GraphQL, cronjobs, DB, Redis, version, environment, and uptime. |
| Compose: Production Split `/docker-compose-production.yml` | Remove single `impact-graph`. Add `impact-graph-graphql-1..4` (ports 4001–4004) and `impact-graph-jobs` (port 4000). Set per-service logs, feature flags, and healthchecks. |
| Compose: Staging Split `/docker-compose-staging.yml` | Remove single `impact-graph`. Add `impact-graph-graphql-1..3` (ports 4001–4003) and `impact-graph-jobs` (port 4000). Configure env, mounts, logs, and healthchecks. |
| Config Toggles `/config/example.env` | Add ENABLE_GRAPHQL=true and ENABLE_CRONJOBS=true under a new “Service feature toggles” section. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant GH as GitHub Actions
  participant SSH as SSH Runner (Host)
  participant DC as docker compose
  participant SVC as Services (graphql-n, jobs)

  rect rgb(245,248,255)
  note over GH: Deploy job
  GH->>SSH: git reset/checkout/pull (branch)
  GH->>SSH: docker image prune -a --force
  GH->>DC: pull images
  end

  rect rgb(240,255,245)
  note over GH,SVC: Phased rollout (n=1..k)
  GH->>SSH: rm graphql-n (+jobs in phase 1)
  GH->>SSH: up --force-recreate graphql-n (+jobs)
  loop health check ≤5 min
    SSH->>SVC: docker inspect Health.Status
  end
  alt healthy
    GH->>GH: proceed to next phase
  else not healthy
    GH--xGH: fail early
  end
  end

  rect rgb(255,250,240)
  note over GH,SVC: Final verification
  GH->>SVC: check graphql-1..k healthy and jobs running
  end
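
The health-check loop in the diagram above is repeated almost verbatim in every rollout phase of both pipelines. A minimal sketch of how it might be factored into one function follows. `wait_for_healthy`, its argument order, and its defaults are hypothetical; the 300-second timeout and 5-second interval simply mirror the values used in the workflow scripts.

```shell
# Hypothetical helper: poll a container until Docker reports it healthy,
# or give up once the timeout (in seconds) has elapsed.
wait_for_healthy() {
  local container="$1" timeout_s="${2:-300}" interval_s="${3:-5}"
  local deadline status
  deadline=$(( $(date +%s) + timeout_s ))
  while :; do
    # A missing container or a container without a healthcheck counts as not_found.
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null || echo not_found)"
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "Timeout waiting for $container (last status: $status)" >&2
      return 1
    fi
    sleep "$interval_s"
  done
}
```

Each phase could then reduce to recreating the container and calling, for example, `wait_for_healthy impact-graph-graphql-2 || exit 1`, which also makes the separate post-loop health comparison unnecessary.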

sequenceDiagram
  autonumber
  participant Node as Server bootstrap.ts
  participant DB as Database
  participant R as Redis
  participant GQL as GraphQL Server
  participant Cron as Cron Scheduler
  participant HC as /health

  Node->>Node: read ENABLE_GRAPHQL / ENABLE_CRONJOBS
  alt GraphQL enabled
    Node->>GQL: init schema, plugins, context, start
  else
    Node->>Node: skip GraphQL init
  end
  alt Cronjobs enabled
    Node->>Cron: initialize cron jobs
  else
    Node->>Node: skip cron setup
  end

  HC->>DB: SELECT 1
  HC->>R: PING
  HC-->>Node: JSON {graphql, cronjobs, db, redis, version, env, uptime}

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

New sitemap generation files logic #1945 — Also modifies bootstrap.ts around cronjob startup conditions; overlaps with new ENABLE_CRONJOBS gating.
Remove onramper and skip network tests #2136 — Alters server startup/bootstrapping; intersects with this PR’s GraphQL/cron gating and health endpoint changes.
Revert "Revert "New sitemap generation files logic"" #1948 — Adjusts cron scheduling/feature gating in bootstrap.ts; relevant to the introduced cron enable flag.

Suggested labels

Code Review

Suggested reviewers

kkatusic
RamRamez

Poem

I thump the ground: deploys in a row,
Five hops of GraphQL, one job in tow.
Flags up, ears perked—healthchecks sing,
Green lights bloom like a meadow in spring.
Compose burrows neatly split in two,
Staging to prod—hip-hop, we’re through! 🐇🚀

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (13)

config/example.env (1)
346-348: Document the default behavior for feature toggles

The comment states these are service feature toggles, but doesn't clarify what happens when these values are not set. Based on the code in src/server/bootstrap.ts, both default to true for backward compatibility.

Consider enhancing the documentation:
 # Service feature toggles
+# Both default to 'true' if not set (for backward compatibility)
+# Set to 'false' to disable the respective service functionality
 ENABLE_GRAPHQL=true
 ENABLE_CRONJOBS=true
.github/workflows/master-pipeline.yml (2)
167-207: Consider adding explicit permissions for GitHub token security

CodeQL correctly flags that this job lacks explicit permissions. While the job doesn't directly use GITHUB_TOKEN, setting minimal permissions is a security best practice.

Add minimal permissions block:
 rollout-deploy-1:
   needs: deploy
   runs-on: ubuntu-latest
+  permissions: {}
   steps:
333-333: Remove trailing spaces

yamllint detected trailing spaces that should be removed.
-            echo "=== Final Production Deployment Verification ==="
-            
+            echo "=== Final Production Deployment Verification ==="
+
             # Check all GraphQL instances health status
             for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3 impact-graph-graphql-4 impact-graph-graphql-5; do
               health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"")
@@ -339,15 +339,15 @@
                 exit 1
               fi
             done
-            
+
             # Check jobs instance status
             jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"")
             echo "impact-graph-jobs status: $jobs_status"
             if [ "$jobs_status" != "\"running\"" ]; then
               echo "ERROR: impact-graph-jobs is not running!"
               exit 1
             fi
-            
+
             echo "=== All services are healthy and running! ==="
Also applies to: 343-343, 351-351
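
For readers who want to catch this locally without yamllint, a small sketch using `grep` is shown below. `check_no_trailing_ws` is a hypothetical helper name; it treats any line ending in a space or tab as an error.

```shell
# Hypothetical helper: list lines that end in whitespace and fail if any exist.
check_no_trailing_ws() {
  # -H prints the file name, -n the line number, -E enables the + quantifier.
  if grep -HnE '[[:space:]]+$' "$@"; then
    echo "Trailing whitespace found" >&2
    return 1
  fi
}
```

It could be run as `check_no_trailing_ws .github/workflows/master-pipeline.yml .github/workflows/staging-pipeline.yml` before pushing.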
.github/workflows/staging-pipeline.yml (2)
193-232: Add minimal permissions for security compliance

Similar to the production pipeline, these jobs should have explicit permissions set.
 rollout-deploy-1:
   needs: deploy
   runs-on: ubuntu-latest
+  permissions: {}
   steps:
Apply the same change to rollout-deploy-2, rollout-deploy-3, and deployment-verification jobs.

311-311: Remove trailing spaces in YAML
-            echo "=== Final Deployment Verification ==="
-            
+            echo "=== Final Deployment Verification ==="
+
             # Check all GraphQL instances health status
             for service in impact-graph-graphql-1 impact-graph-graphql-2 impact-graph-graphql-3; do
               health_status=$(docker inspect --format='{{json .State.Health.Status}}' $service 2>/dev/null || echo "\"not_found\"")
@@ -317,7 +317,7 @@
                 exit 1
               fi
             done
-            
+
             # Check jobs instance status
             jobs_status=$(docker inspect --format='{{json .State.Status}}' impact-graph-jobs 2>/dev/null || echo "\"not_found\"")
             echo "impact-graph-jobs status: $jobs_status"
@@ -326,7 +326,7 @@
               echo "ERROR: impact-graph-jobs is not running!"
               exit 1
             fi
-            
+
             echo "=== All services are healthy and running! ==="
Also applies to: 321-321, 329-329
docker-compose-staging.yml (8)
1-1: Bump Compose file version (or drop it) to avoid start_period incompatibility.

You’re using healthcheck.start_period, but the file declares version "3.3". start_period was introduced in Compose file format 3.4, so older docker-compose releases may ignore or reject it. Safer choices: bump to 3.8 or remove the version key (Compose Spec).

Apply either of these minimal diffs:
-version: '3.3'
+version: '3.8'  # ensure start_period is recognized by older docker-compose
—or—
-version: '3.3'
+# version intentionally omitted to use the latest Compose Specification
28-33: Make healthchecks independent of curl availability.

Relying on curl assumes it exists in the image. Use Node (always present here) to check /health, avoiding surprises.

Apply this pattern (repeat for each service block shown in the line ranges):
-    healthcheck:
-      test: ["CMD-SHELL", "curl -f http://localhost:4000/health || exit 1"]
+    healthcheck:
+      # Avoid dependency on curl: use Node to assert HTTP 200
+      test: ["CMD", "node", "-e", "require('http').get('http://127.0.0.1:4000/health', r => process.exit(r.statusCode===200?0:1)).on('error',()=>process.exit(1))"]
       interval: 30s
       timeout: 10s
       retries: 3
       start_period: 60s
Also applies to: 59-63, 89-93, 119-124

14-23: Mount config read-only; verify need for duplicate config mounts.

Config should be immutable at runtime; mark those two config binds as read-only.

You map ./config into both /usr/src/app/config and /usr/src/app/build/config. If the app only reads one, drop the other to reduce complexity.

Example for each service (apply to both config mounts):
       - type: bind
         source: ./config
         target: /usr/src/app/config
+        read_only: true
       - type: bind
         source: ./config
         target: /usr/src/app/build/config
+        read_only: true
Please confirm whether both container paths are actually read by the app at runtime. If only one is needed, we can delete the other bind.

Also applies to: 44-53, 74-83, 105-114

26-27: depends_on does not wait for Redis to be ready; consider health-gated dependency or app-level retries.

Compose’s depends_on controls start order only; it doesn’t wait for service health. If the app needs Redis on boot, add a Redis healthcheck and (if your Compose supports it) use condition: service_healthy, or ensure robust retry/backoff in the app bootstrap.

Minimal example once Redis has a healthcheck:
-    depends_on:
-      - redis-giveth
+    depends_on:
+      redis-giveth:
+        condition: service_healthy
If your environment’s compose doesn’t support conditions, keep app-level retry logic.

Also applies to: 56-57, 86-87, 117-118

115-116: Don’t publish the jobs port; it’s only used for internal health.

The jobs service doesn’t expose a public API, and the healthcheck runs inside the container. Remove the host port mapping to reduce the attack surface.
-    ports:
-      - "4000:4000"
126-135: Tighten Redis: drop unused env, add healthcheck, and update base.

REDIS_ALLOW_EMPTY_PASSWORD is not used by the official redis image; remove to avoid confusion.

Add a Redis healthcheck so other services can gate on it.

Consider moving off the EOL Alpine base and pinning to a supported patch release (or tracking the minor version), e.g., redis:7.2-alpine3.20.

Apply this diff:
   redis-giveth:
-    # it's better to not using the latest tag, maybe latest tag have some breaking changes
-    image: redis:7.2.0-alpine3.18
+    # Pin to a supported base; avoid EOL Alpine branches
+    image: redis:7.2-alpine3.20
     container_name: redis-giveth
-    environment:
-      - REDIS_ALLOW_EMPTY_PASSWORD=yes
     restart: always
     volumes:
       - redis-data:/data
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      interval: 10s
+      timeout: 3s
+      retries: 5
Please confirm your target runtime supports the chosen tag and that redis-cli is present (it is in the official image by default).
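
The same `redis-cli ping` probe can be run by hand from the host to confirm the healthcheck would pass. The sketch below assumes the compose service is named `redis-giveth`, as in the diff above; `redis_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the Redis service answers PING with PONG.
redis_ready() {
  local service="${1:-redis-giveth}"
  # -T disables pseudo-TTY allocation so the output has no terminal control characters.
  [ "$(docker compose exec -T "$service" redis-cli ping 2>/dev/null)" = "PONG" ]
}
```

For example, `redis_ready || echo "redis-giveth is not answering PING"` gives a quick manual check from the `giveth-all` directory.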

5-13: DRY the duplicated service scaffolding with anchors or env_file.

The three GraphQL services share image/command/restart/volumes and most env. You can simplify with YAML anchors or an env_file for common keys to reduce drift and speed edits.

Example (outside these blocks):
x-graphql-common: &graphql_common
  image: ghcr.io/giveth/impact-graph:staging
  command: npm run start:docker:server
  restart: always
  volumes:
    - type: bind
      source: ./config
      target: /usr/src/app/config
      read_only: true
    - type: bind
      source: ./config
      target: /usr/src/app/build/config
      read_only: true
    - type: bind
      source: ./logs
      target: /usr/src/app/logs

impact-graph-graphql-1:
  <<: *graphql_common
  environment:
    - ENVIRONMENT=staging
    - LOG_PATH=/usr/src/app/logs/impact-graph-graphql-1.log
    - ENABLE_GRAPHQL=true
    - ENABLE_CRONJOBS=false
  ports: ["4001:4000"]
Repeat for graphql-2/3 with only LOG_PATH and ports differing.

Also applies to: 35-43, 65-73, 100-103

10-12: Optional: prefer stdout logging over bind-mounted log files in containers.

File logging inside containers complicates rotation and can lead to permission issues. Consider logging to stdout/stderr and rely on the Docker logging driver or your aggregator; remove the logs bind if you adopt that approach.

If you want to keep file logs for now, no action needed—just ensure ./logs exists and is writable by the container user in your staging hosts.

Also applies to: 40-42, 70-72, 101-103

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 45b32fb and 59240cb.

📒 Files selected for processing (6)

.github/workflows/master-pipeline.yml (1 hunks)
.github/workflows/staging-pipeline.yml (1 hunks)
config/example.env (1 hunks)
docker-compose-production.yml (2 hunks)
docker-compose-staging.yml (2 hunks)
src/server/bootstrap.ts (7 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/server/bootstrap.ts (5)

- src/utils/logger.ts (1): logger (44-44)
- src/types/ApolloContext.ts (1): ApolloContext (16-20)
- src/services/authorizationServices.ts (1): authorizationHandler (51-69)
- src/orm.ts (1): AppDataSource (8-69)
- src/redis.ts (1): redis (10-10)

🪛 GitHub Check: CodeQL

.github/workflows/staging-pipeline.yml

[warning] 193-233, 234-265, 266-297, 298-331: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}

.github/workflows/master-pipeline.yml

[warning] 168-208, 209-240, 241-272, 273-319, 320-353: Workflow does not contain permissions
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {{}}

🪛 YAMLlint (1.37.1)

.github/workflows/staging-pipeline.yml

[error] 311-311, 321-321, 329-329: trailing spaces (trailing-spaces)

.github/workflows/master-pipeline.yml

[error] 333-333, 343-343, 351-351: trailing spaces (trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: test
GitHub Check: Analyze (javascript-typescript)

🔇 Additional comments (9)

.github/workflows/master-pipeline.yml (1)

161-165: Good use of Git reset for clean deployment state

The deployment now runs git reset --hard followed by an explicit checkout and pull, so the working tree on the host matches the remote repository. This is more reliable than pulling on top of whatever local changes happen to exist.

src/server/bootstrap.ts (4)

86-89: Feature toggles default to enabled - verify backward compatibility

The feature flags are compared with !== 'false', so an unset variable (or any value other than the exact string 'false') leaves the feature enabled. This preserves backward compatibility when the environment variables are not set.
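A minimal sketch of that default-on check is shown below. ENABLE_GRAPHQL, ENABLE_CRONJOBS, isGraphqlEnabled and isCronjobsEnabled come from this PR; the helpers isFeatureEnabled and readFlags are hypothetical names used only for illustration, and the environment is passed in as a plain object rather than read from process.env.

```typescript
// Hypothetical helpers illustrating the `!== 'false'` check; not the PR's code.
type Env = Record<string, string | undefined>;

/** A flag is disabled only when it is exactly the string 'false'. */
function isFeatureEnabled(value: string | undefined): boolean {
  return value !== 'false';
}

function readFlags(env: Env): { isGraphqlEnabled: boolean; isCronjobsEnabled: boolean } {
  return {
    isGraphqlEnabled: isFeatureEnabled(env.ENABLE_GRAPHQL),
    isCronjobsEnabled: isFeatureEnabled(env.ENABLE_CRONJOBS),
  };
}

// Unset variables keep both features on (the pre-PR behaviour).
const defaults = readFlags({});

// A dedicated jobs container would turn GraphQL off explicitly.
const jobsOnly = readFlags({ ENABLE_GRAPHQL: 'false', ENABLE_CRONJOBS: 'true' });
```

One consequence of this comparison is that values such as 'FALSE' or '0' do not disable a feature, so deployment configs should use the lowercase string false.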

319-375: Excellent health check implementation

The enhanced health check endpoint provides comprehensive service status including:

- Individual service states (GraphQL, cronjobs)
- Database and Redis connectivity verification
- Version and environment information
- Proper HTTP status codes (200 for healthy, 503 for degraded)

This will greatly improve observability in production.
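The mapping from dependency checks to the response can be sketched as a pure function. The field names and the buildHealthReport helper below are assumptions for illustration only; the actual payload shape in src/server/bootstrap.ts may differ. The sketch shows only the rule stated above: 200 when the database and Redis checks pass, 503 otherwise.

```typescript
// Hypothetical shape and helper; field names are illustrative assumptions.
type CheckResult = 'ok' | 'error';

interface HealthInput {
  graphqlEnabled: boolean;
  cronjobsEnabled: boolean;
  database: CheckResult;
  redis: CheckResult;
  environment: string;
  version: string;
  uptimeSeconds: number;
}

function buildHealthReport(input: HealthInput) {
  // Degraded as soon as any connectivity check fails.
  const healthy = input.database === 'ok' && input.redis === 'ok';
  return {
    httpStatus: healthy ? 200 : 503,
    body: {
      status: healthy ? 'healthy' : 'degraded',
      services: { graphql: input.graphqlEnabled, cronjobs: input.cronjobsEnabled },
      checks: { database: input.database, redis: input.redis },
      environment: input.environment,
      version: input.version,
      uptime: input.uptimeSeconds,
    },
  };
}
```

In an Express-style handler the result could be sent with res.status(report.httpStatus).json(report.body), which lets a container healthcheck such as curl -f fail on the 503.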

129-199: Proper gating of GraphQL initialization

The code correctly gates all GraphQL-related initialization including schema creation, Apollo server setup, and worker pool creation based on the isGraphqlEnabled flag. This ensures clean separation of concerns.

554-556: Cronjobs initialization properly gated

The cronjobs are only initialized when isCronjobsEnabled is true, preventing duplicate job execution across multiple instances.

docker-compose-production.yml (3)

4-124: Well-structured service separation for GraphQL instances

The GraphQL API is split into four instances (1-4), each with:

- An individual log file for easier debugging
- A proper health check
- Cronjobs disabled to prevent duplicate execution
- A different exposed port for load balancing

This architecture enables zero-downtime deployments and better scalability.

125-154: Dedicated jobs service properly configured

The jobs service is correctly configured with:

- GraphQL disabled to prevent unnecessary resource usage
- Cronjobs enabled for background processing
- Separate log file for job monitoring
- Health check endpoint for monitoring

95-124: Verify if a fifth GraphQL instance is needed

The production pipeline references impact-graph-graphql-5 but only 4 GraphQL services are defined here.

Either add the fifth service to this file or update the pipeline to only reference services 1-4.
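A mismatch like this can be caught before deployment by comparing the service names a deploy script references against the services the compose file defines. The sketch below is a hypothetical helper, not part of the PR; it assumes all relevant services share the impact-graph- prefix and that the caller supplies the list of defined service names.

```typescript
// Hypothetical pre-deploy check; the function name and the prefix-based
// regex are assumptions made for illustration.
function findUndefinedServices(script: string, definedServices: string[]): string[] {
  const defined = new Set(definedServices);
  // Collect every token that looks like an impact-graph service name.
  const referenced = script.match(/impact-graph-[a-z0-9-]+/g) ?? [];
  // De-duplicate, then keep only names the compose file does not define.
  return Array.from(new Set(referenced)).filter(name => !defined.has(name));
}

const deployScript = [
  'docker compose rm -fs impact-graph-graphql-4',
  'docker compose rm -fs impact-graph-graphql-5',
  'docker compose up --force-recreate -d impact-graph-graphql-4',
  'docker compose up --force-recreate -d impact-graph-graphql-5',
].join('\n');

const missing = findUndefinedServices(deployScript, [
  'impact-graph-graphql-1',
  'impact-graph-graphql-2',
  'impact-graph-graphql-3',
  'impact-graph-graphql-4',
  'impact-graph-jobs',
]);
```

Here missing contains only impact-graph-graphql-5, the reference that has no matching service definition.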

docker-compose-staging.yml (1)

95-104: Confirm that /health is served when ENABLE_GRAPHQL=false (jobs service).

Your jobs container disables GraphQL but keeps the same server command. Verify that the process still binds :4000 and serves /health so the healthcheck won’t flap.

Suggested local check:

- docker compose up -d impact-graph-jobs
- docker compose logs -f impact-graph-jobs
- Inside the container: curl -f http://localhost:4000/health and confirm HTTP 200 with expected JSON.

Also applies to: 119-124

coderabbitai · 2025-08-26T23:41:23Z

.github/workflows/master-pipeline.yml

+            docker compose rm -fs impact-graph-graphql-4
+            docker compose rm -fs impact-graph-graphql-5
+            docker compose up --force-recreate -d impact-graph-graphql-4
+            docker compose up --force-recreate -d impact-graph-graphql-5


💡 Verification agent

🧩 Analysis chain

Container mismatch: impact-graph-graphql-5 not defined in docker-compose

The pipeline attempts to deploy impact-graph-graphql-5, but docker-compose-production.yml only defines services up to impact-graph-graphql-4.

🏁 Script executed:

#!/bin/bash
# Verify which GraphQL services are defined in docker-compose files
echo "=== Checking docker-compose-production.yml for GraphQL service definitions ==="
grep -n "impact-graph-graphql-" docker-compose-production.yml | head -20
echo -e "\n=== Checking if graphql-5 is defined anywhere ==="
grep -r "impact-graph-graphql-5" --include="*.yml" --include="*.yaml" .

Length of output: 2114

Critical: Missing service definition for impact-graph-graphql-5

A verification of your compose setup shows that only services impact-graph-graphql-1 through impact-graph-graphql-4 are defined in your Docker Compose files—no impact-graph-graphql-5 exists. The workflow steps at .github/workflows/master-pipeline.yml (lines 286–289) will therefore fail when trying to remove or start impact-graph-graphql-5.

• Location of broken refs:

File: .github/workflows/master-pipeline.yml
Lines: 286–289
docker compose rm -fs impact-graph-graphql-4
docker compose rm -fs impact-graph-graphql-5
docker compose up --force-recreate -d impact-graph-graphql-4
docker compose up --force-recreate -d impact-graph-graphql-5

• Verification summary:

- No “impact-graph-graphql-5:” block found in any docker-compose*.yml.
- Only definitions for services impact-graph-graphql-1–4 exist.

Next steps (choose one):

- Define impact-graph-graphql-5 in your compose file(s) (e.g. duplicate the -graphql-4 service and adjust as needed).
- Remove or update all references to impact-graph-graphql-5 in the master-pipeline workflow.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

 docker compose rm -fs impact-graph-graphql-4
-docker compose rm -fs impact-graph-graphql-5
 docker compose up --force-recreate -d impact-graph-graphql-4
-docker compose up --force-recreate -d impact-graph-graphql-5

🤖 Prompt for AI Agents

.github/workflows/master-pipeline.yml around lines 286 to 289: the workflow references docker services impact-graph-graphql-5 (remove/stop/start) but no such service exists in your docker-compose files (only impact-graph-graphql-1 through -4), causing the job to fail; fix by either adding a matching impact-graph-graphql-5 service block to your docker-compose*.yml (e.g., duplicate and adjust the -graphql-4 service) or remove/update all references to impact-graph-graphql-5 in this workflow so it only manages existing services.

ae2079 added 2 commits August 27, 2025 02:38

ae2079 requested a review from geleeroyale August 26, 2025 23:30

github-advanced-security bot found potential problems Aug 26, 2025

coderabbitai bot reviewed Aug 26, 2025

This was referenced Sep 8, 2025

create flag for disabling cronjobs or graphql queries #2119

Closed

create health check endpoint #2121

Open
