Made by Omnivya. This repo automates: cloning AzDo repos, generating filesystem SBOMs, scanning sources with TruffleHog, uploading to Dependency-Track, and enforcing a "Shai‑Hulud" blocklist policy to quickly surface risky components and the kinds of secrets that can leak from your codebase.
```sh
# 1) Install deps
uv sync

# 2) Clone/update + SBOM + TruffleHog (SSH-first, HTTPS fallback optional)
uv run --env-file .env ./azdo_clone_and_scan.py

# 3) Upload SBOMs to Dependency-Track
uv run --env-file .env.dt ./dt_bulk_upload_sbom.py

# 4) Create the Shai-Hulud policy and add PURL conditions from list_shai.txt
uv run --env-file .env.dt ./dt_create_shai_policy.py

# 5) Re-run analysis on all projects to apply the policy
uv run --env-file .env.dt ./dt_trigger_reanalysis.py
```

`azdo_clone_and_scan.py` lists repositories from an Azure DevOps project, clones/updates them over SSH, generates an SBOM per repo, and scans sources for secrets. It includes throttling controls, pretty colored logs, retries with backoff, SSH key selection, repo status handling (disabled/renamed/no-permission), and a final retry pass for transient failures.
- Python 3.13+
- `uv` package manager (recommended)
- Tools (optional but recommended): `cdxgen` (preferred) or `syft` for SBOM generation; `trufflehog` for secret scanning
- SSH key configured on Azure DevOps (`git@ssh.dev.azure.com`)
Install dependencies:

```sh
uv sync
```

Put the following in a `.env` file or set them in your environment.
- `AZDO_ORG_URL`: Azure DevOps org URL, e.g. `https://dev.azure.com/<org>` or `https://<org>.visualstudio.com`
- `AZDO_PROJECT`: Azure DevOps project name
- `AZDO_PAT`: Azure DevOps Personal Access Token (used only for REST listing)
- `WORKSPACE_DIR`: Directory where repos are cloned (default: `~/azdo-workspace`)
- `SBOM_OUT_DIR`: Directory for SBOM outputs (default: `~/azdo-scan/sbom`)
- `SECRETS_OUT_DIR`: Directory for TruffleHog outputs (default: `~/azdo-scan/secrets`)
- `MAX_WORKERS`: Thread count for per-repo processing; accepts an integer or the keywords `auto`, `cpu`, `max`, `default`
Results reuse / skipping:
- `SKIP_IF_RESULTS_EXIST`: `true|false` (default: `true`). If both the SBOM and TruffleHog outputs already exist for a repo, the repo is skipped entirely.
- Each artifact is also skipped independently if its output file already exists.
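The skip decision amounts to two independent existence checks. A minimal sketch (function name hypothetical):

```python
from pathlib import Path


def plan_repo_work(sbom_path: Path, secrets_path: Path,
                   skip_if_exists: bool = True) -> tuple[bool, bool]:
    """Decide which artifacts still need to be produced for a repo.
    Returns (need_sbom, need_secrets); when both are False the repo
    is skipped entirely."""
    need_sbom = not (skip_if_exists and sbom_path.exists())
    need_secrets = not (skip_if_exists and secrets_path.exists())
    return need_sbom, need_secrets
```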
Throttling and noise control:
- `GIT_MAX_CONCURRENCY`: Max concurrent networked git ops (default: `min(MAX_WORKERS, 4)`)
- `GIT_CLONE_CONCURRENCY`: Max concurrent clones (default: `1`)
- `GIT_MAX_RETRIES`: Retries for git clone/fetch (default: `3`)
- `HTTP_MAX_RETRIES`: Retries for REST listing (default: `4`)
- `BACKOFF_BASE_MS`: Backoff base in ms (default: `300`)
- `BACKOFF_MAX_MS`: Backoff max in ms (default: `5000`)
- `START_STAGGER_MS`: Random startup jitter per repo in ms (default: `0`)
- `GIT_QUIET`: Suppress git stdout (`true`/`false`, default: `true`)
- `GIT_PARTIAL_CLONE`: Use `--filter=blob:none` on clone (default: `false`)
- `NO_COLOR`: Disable ANSI colors in logs (any value disables)
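The `BACKOFF_BASE_MS`/`BACKOFF_MAX_MS` pair feeds a standard exponential-backoff-with-jitter delay. A minimal sketch of the pattern (function name hypothetical; the script's exact formula may differ):

```python
import random


def backoff_delay_ms(attempt: int, base_ms: int = 300,
                     max_ms: int = 5000) -> float:
    """Exponential backoff with full jitter: the cap grows as
    base * 2^attempt up to max_ms, and the actual delay is a uniform
    random value below that cap, which de-synchronizes retries."""
    capped = min(max_ms, base_ms * (2 ** attempt))
    return random.uniform(0, capped)
```

Full jitter (rather than a fixed exponential delay) keeps many workers from retrying in lockstep after a shared 429 or timeout.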
SSH:
- `GIT_SSH_KEY` or `AZDO_SSH_KEY`: Path to the private key to use (forces `IdentitiesOnly=yes`)
- `GIT_SSH_OPTS`: Extra ssh options (default includes `ConnectTimeout=20`, keepalives, `PreferredAuthentications=publickey`, `IPQoS=throughput`)
Ensure your SSH key is set up for Azure DevOps:

```sh
ssh -T git@ssh.dev.azure.com
```

Repository URLs are constructed as `git@ssh.dev.azure.com:v3/{org}/{project}/{repo}`.

Note: the script prefers the `sshUrl` returned by the Azure DevOps REST API when present, and falls back to the constructed URL only when it is missing. Existing clones have their origin URL updated automatically.
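The fallback URL construction is a simple template over org, project, and repo (helper name hypothetical):

```python
def build_ssh_url(org: str, project: str, repo: str) -> str:
    """Azure DevOps v3 SSH remote format, used only when the REST
    response carries no sshUrl field for the repository."""
    return f"git@ssh.dev.azure.com:v3/{org}/{project}/{repo}"
```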
With `.env`:

```sh
uv run --env-file .env ./azdo_clone_and_scan.py
```

Override or add variables inline:

```sh
GIT_MAX_CONCURRENCY=3 START_STAGGER_MS=500 uv run --env-file .env ./azdo_clone_and_scan.py
```

Note on TruffleHog: partial clones can yield empty results because blobs are not fetched, so partial clone is disabled here by default. If you need lower bandwidth, set `GIT_PARTIAL_CLONE=true`, but be aware scans may be incomplete unless you fetch full history/blobs.
Extra options for TruffleHog:
- `TRUFFLEHOG_ONLY_VERIFIED`: `true|false` (default follows your env)
- `TRUFFLEHOG_ARGS`: additional flags appended to the command, e.g. `--branch main` or `--since-commit <sha>`
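Assembling the TruffleHog invocation from these variables might look like the sketch below (function name, filesystem mode, and flag choices are assumptions based on TruffleHog v3's CLI; the script's actual command may differ):

```python
import shlex


def trufflehog_cmd(repo_dir: str, only_verified: bool,
                   extra_args: str = "") -> list[str]:
    """Build the argv for a TruffleHog scan of a local checkout.
    TRUFFLEHOG_ARGS is split shell-style and appended verbatim."""
    cmd = ["trufflehog", "filesystem", repo_dir, "--json"]
    if only_verified:
        cmd.append("--only-verified")
    # User-supplied extras go last so they can refine the scan
    cmd += shlex.split(extra_args)
    return cmd
```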
Without code changes you can also point `--env-file` at a custom env file:

```sh
uv run --env-file .env ./azdo_clone_and_scan.py
```

For each repository in the Azure DevOps project, the script:

- Skips disabled repositories (from the REST API `isDisabled`/`status`)
- Clones over SSH if missing (using the API `sshUrl` if available), or fetches/prunes and hard resets to `origin/HEAD` if present
- Generates an SBOM (prefers `cdxgen`, falls back to `syft`)
- Runs a `trufflehog` secret scan
- Classifies failures: `disabled`, `not-found-or-renamed`, `no-permission`, `timeout`, `unknown`
- Performs a final limited retry pass for transient classes (`timeout`, `unknown`)
- Appends results to a final JSON summary printed on stdout
Status lines are printed per repo with colors/emojis:
- Starting: `ℹ️ Cloning <repo>` or `ℹ️ Updating <repo>`
- SBOM/Secrets: `ℹ️ SBOM <repo> with <tool>` and `ℹ️ TruffleHog <repo>`
- Completion: `✅ <repo> cloned/updated`, `✅ SBOM ...`, `✅ TruffleHog ...` or `❌ ...`
- Skips: `⏭️ <repo> disabled|not-found-or-renamed|no-permission`
- SBOM: `${SBOM_OUT_DIR}/{project}_{repo}.cdx.json`
- Secrets: `${SECRETS_OUT_DIR}/{project}_{repo}.trufflehog.jsonl`
- Final summary: JSON to stdout
- `MAX_WORKERS` controls total threads
- `GIT_MAX_CONCURRENCY` caps simultaneous git fetch network ops
- `GIT_CLONE_CONCURRENCY` caps simultaneous clones
- `START_STAGGER_MS` reduces start bursts
- Retries/backoff (with jitter) are applied to git and REST calls
- `ModuleNotFoundError: requests` → run `uv sync`
- SSH permission denied → add your key to Azure DevOps and ensure your ssh-agent is loaded (`ssh-add -l`)
- Rate limiting (429) → lower `GIT_MAX_CONCURRENCY`, increase `START_STAGGER_MS`, or keep defaults and let backoff handle it
- Timeouts to `ssh.dev.azure.com:22` → reduce clone concurrency, raise `ConnectTimeout` in `GIT_SSH_OPTS`, or re-run at a quieter time; the script auto-retries and re-processes transient failures at the end
Currently, each repo is scanned right after its clone/update. If you want a two-phase mode (clone all, then scan), open an issue or adjust the flow; the code is structured to make this change straightforward.
Uploads CycloneDX SBOMs to Dependency-Track.
Highlights:
- Validates project existence via `/api/v1/project/lookup` and creates it when `DT_AUTOCREATE=true`
- Optionally waits for processing via token polling
Env:
- `DT_URL`, `DT_API_KEY` (required)
- `SBOM_DIR` (default `~/azdo-scan/sbom`)
- `DT_DEFAULT_VERSION` (default `HEAD`)
- `DT_AUTOCREATE` (default `true`)
- `WAIT_FOR_PROCESSING` (default `false`)
- `MAX_WORKERS` (same parsing as above)
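The upload targets Dependency-Track's `PUT /api/v1/bom` endpoint, whose JSON body carries the CycloneDX document base64-encoded. A sketch of the payload construction (helper name hypothetical; the HTTP call itself is omitted):

```python
import base64
from pathlib import Path


def bom_upload_payload(sbom_path: Path, project_name: str,
                       version: str = "HEAD",
                       autocreate: bool = True) -> dict:
    """JSON body for PUT /api/v1/bom. Dependency-Track returns a
    processing token in the response, which can be polled when
    WAIT_FOR_PROCESSING is enabled."""
    return {
        "projectName": project_name,
        "projectVersion": version,
        "autoCreate": autocreate,
        # The CycloneDX file goes in base64, not as raw JSON
        "bom": base64.b64encode(sbom_path.read_bytes()).decode("ascii"),
    }
```

The request also needs the `X-Api-Key` header set to `DT_API_KEY`.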
Run:

```sh
uv run --env-file .env.dt ./dt_bulk_upload_sbom.py
```

`dt_create_shai_policy.py` creates and enforces an Operational blocklist policy in Dependency-Track using the PURLs listed in `list_shai.txt`.
Features:
- Uses Dependency-Track OpenAPI v4 endpoints.
- Creates the policy via `PUT /api/v1/policy` (requires a generated UUID).
- Adds conditions via `PUT /api/v1/policy/{uuid}/condition` with operator `MATCHES`.
- Supports `.x` versions in `list_shai.txt` by converting them to regex (e.g. `1.2.x` → `1\.2\..*`).
- Accepts NPM (`pkg:npm/...`) and GitHub (`pkg:github/...`) purls.
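The `.x`-to-regex conversion described above can be sketched like this (helper names hypothetical; the script's implementation may differ):

```python
import re


def version_to_regex(version: str) -> str:
    """Turn a blocklist version into a regex for the MATCHES operator:
    literal dots are escaped, and a trailing .x becomes a wildcard,
    so 1.2.x -> 1\\.2\\..*"""
    if version.endswith(".x"):
        return re.escape(version[:-2]) + r"\..*"
    return re.escape(version)


def purl_to_condition(purl: str) -> str:
    """Rewrite the version part of a purl into its regex form.
    rpartition splits on the last '@', so scoped npm names survive."""
    base, _, version = purl.rpartition("@")
    return base + "@" + version_to_regex(version)
```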
Env (from `.env.dt`):
- `DT_URL` (or `DT_BASE_URL`), `DT_API_KEY`
- Optional: `POLICY_NAME` (default `Shai-Hulud Blocklist`), `LIST_FILE` (default `list_shai.txt`), `DRY_RUN`
Run:

```sh
uv run --env-file .env.dt ./dt_create_shai_policy.py
```

Notes:
- The operator is `MATCHES` to support wildcard versions.
- The script auto-generates a UUID and sets `global: true`.
Lines like:

```
pkg:npm/ansi-regex@6.2.x
```

are translated to a regex-backed purl condition:

```
pkg:npm/ansi-regex@6\.2\..*
```

which matches any patch version in the 6.2.* range.
If you created a policy manually in the UI (Administration → Policies), you can attach conditions to it with `dt_add_conditions_only.py`:

```sh
POLICY_NAME="Shai-Hulud Blocklist" uv run --env-file .env.dt ./dt_add_conditions_only.py
```

After adding conditions, trigger project-wide reanalysis and refresh metrics so violations appear:
```sh
# All projects
uv run --env-file .env.dt ./dt_trigger_reanalysis.py

# Filter by name pattern
uv run --env-file .env.dt ./dt_trigger_reanalysis.py "Loop_"
```

Under the hood:
- Triggers `POST /api/v1/finding/project/{uuid}/analyze` for each project.
- Calls `GET /api/v1/metrics/project/{uuid}/refresh` to refresh metrics.
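The two calls per project can be sketched as (method, URL, headers) tuples for an HTTP client to execute (function name hypothetical; `X-Api-Key` is Dependency-Track's API-key header):

```python
def reanalysis_requests(base_url: str, project_uuid: str,
                        api_key: str) -> list[tuple[str, str, dict]]:
    """Build the POST that re-evaluates findings/policy conditions and
    the GET that refreshes the project's metrics, in the order the
    script issues them."""
    headers = {"X-Api-Key": api_key}
    return [
        ("POST", f"{base_url}/api/v1/finding/project/{project_uuid}/analyze", headers),
        ("GET", f"{base_url}/api/v1/metrics/project/{project_uuid}/refresh", headers),
    ]
```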