-
-
Notifications
You must be signed in to change notification settings - Fork 10
Fix assure match detection and add coverage for parallel batches #185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix assure match detection and add coverage for parallel batches #185
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is being reviewed by Cursor Bugbot
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
|
|
||
| if !found && bytes.Contains(output, []byte(`"type":"match"`)) { | ||
| found = true | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Ripgrep Match Detection Flaw
The runParallelRipgrep function's match detection logic is contradictory. It checks for "type":"match" in ripgrep's output even when ripgrep exits with code 1, which explicitly indicates no matches. This can lead to false positives and is fragile due to relying on a literal string check in the raw JSON output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Auto Pull Request Review from LlamaPReview
🎯 TL;DR & Recommendation
Recommendation: Approve with suggestions
This PR introduces a shared parallel execution helper and adds tests for better coverage, but includes minor maintainability and architectural suggestions to enhance robustness and reduce duplication.
📄 Documentation Diagram
This diagram documents the new parallel ripgrep execution workflow introduced by the PR.
sequenceDiagram
participant User
participant executeAssure
participant runParallelRipgrep
participant StatusHandler
User->>executeAssure: Trigger execution
executeAssure->>runParallelRipgrep: Call with batch args
note over runParallelRipgrep: PR #35;185 added this helper
runParallelRipgrep->>runParallelRipgrep: Process batches in parallel
runParallelRipgrep-->>executeAssure: Return matches found
executeAssure->>StatusHandler: Report policy status
note over StatusHandler: New status reporting mechanism
StatusHandler-->>User: Final result
🌟 Strengths
- Adds comprehensive tests for parallel execution, ensuring reliability.
- Centralizes batch sizing logic, improving consistency across the codebase.
| Priority | File | Category | Impact Summary | Anchors |
|---|---|---|---|---|
| P2 | cmd/scan.go | Maintainability | Duplication of batch size constant increases maintenance risk | symbol:executeParallelScans, constant:parallelBatchSize |
| P2 | cmd/rg_executor.go | Maintainability | Fragile string matching could break on JSON format changes | |
| P2 | cmd/assure.go | Architecture | Asymmetric function interfaces suggest divergence in responsibilities | symbol:executeSingleAssure, constant:parallelBatchSize |
| P2 | cmd/assure.go | Architecture | Global handler pattern adds indirect state mutation | symbol:reportPolicyStatus |
| P2 | cmd/api.go | Maintainability | Hard-coded array indices risk panics on format changes |
🔍 Notable Themes
- Maintainability improvements needed to reduce code duplication and fragility in string parsing and array access.
- Architectural consistency suggested for function interfaces to align responsibilities and state management.
⚠️ **Unanchored Suggestions (Manual Review Recommended)**
The following suggestions could not be precisely anchored to a specific line in the diff. This can happen if the code is outside the changed lines, has been significantly refactored, or if the suggestion is a general observation. Please review them carefully in the context of the full file.
📁 File: cmd/scan.go
The new parallelBatchSize constant (value 25) in rg_executor.go centralizes batch sizing logic, but the related context shows executeParallelScans still uses a hard-coded batchSize of 25. This creates duplication and maintenance risk—if batch sizing logic changes, it must be updated in two places. The new parallel execution helper (runParallelRipgrep) should be adopted here for consistency.
Suggestion:
func executeParallelScans(rgPath string, baseArgs []string, filesToScan []string, writer *bufio.Writer) error {
var wg sync.WaitGroup
errChan := make(chan error, (len(filesToScan)+parallelBatchSize-1)/parallelBatchSize)
// Use runParallelRipgrep or refactor to share batch logicRelated Code:
func executeParallelScans(rgPath string, baseArgs []string, filesToScan []string, writer *bufio.Writer) error {
const batchSize = 25
var wg sync.WaitGroup
errChan := make(chan error, len(filesToScan)/batchSize+1)💡 LlamaPReview Community
Have feedback on this AI Code review tool? Join our GitHub Discussions to share your thoughts and help shape the future of LlamaPReview.
| } | ||
| if len(filesToAssure) > 25 { | ||
| if len(filesToAssure) > parallelBatchSize { | ||
| matchesFound, err = executeParallelAssure(rgPath, codePatternAssureJSON, filesToAssure, writer) | ||
| } else { | ||
| matchesFound, err = executeSingleAssure(rgPath, codePatternAssureJSON, filesToAssure, targetDir, policy, writer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: High
The parallel execution threshold now uses the centralized parallelBatchSize, improving consistency. However, executeSingleAssure still receives a targetDir and policy parameter while executeParallelAssure does not—this architectural asymmetry suggests the helper functions have diverged responsibilities. Consider aligning their interfaces or merging logic.
|
|
||
| } | ||
|
|
||
| reportPolicyStatus(policy, matchesFound) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: High
The new status reporting mechanism introduces a global handler pattern. While tests validate its behavior, the related context doesn't show other callers of executeAssure. This pattern is extensible but adds indirect state mutation—ensure future maintainers understand this side effect.
| if sarifReport.Runs[0].Invocations[0].Properties.ReportCompliant { | ||
|
|
||
| resultMsg = fmt.Sprintf("🟢 %s : %s", "Compliant") | ||
|
|
||
| resultMsg = "🟢 Compliant" | ||
| } else { | ||
| resultMsg = fmt.Sprintf("🔴 %s : %s", "Non Compliant") | ||
| resultMsg = "🔴 Non Compliant" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: High
Hard-coded array indices (e.g., [0]) assume a specific SARIF report structure. This is brittle—if the report format changes, this code may panic or misbehave. Add bounds checking or use helper methods to access nested properties safely.
| if sarifReport.Runs[0].Invocations[0].Properties.ReportCompliant { | |
| resultMsg = fmt.Sprintf("🟢 %s : %s", "Compliant") | |
| resultMsg = "🟢 Compliant" | |
| } else { | |
| resultMsg = fmt.Sprintf("🔴 %s : %s", "Non Compliant") | |
| resultMsg = "🔴 Non Compliant" | |
| } | |
| if len(sarifReport.Runs) > 0 && len(sarifReport.Runs[0].Invocations) > 0 { | |
| if props := sarifReport.Runs[0].Invocations[0].Properties; props != nil { | |
| if props.ReportCompliant { | |
| resultMsg = "🟢 Compliant" | |
| } else { | |
| resultMsg = "🔴 Non Compliant" | |
| } | |
| } | |
| } |
| if !found && bytes.Contains(output, []byte(`"type":"match"`)) { | ||
| found = true | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2 | Confidence: Medium
Match detection relies on fragile string matching against JSON output. While functional, this could break if ripgrep's JSON format changes. A more robust approach would parse the JSON to check for match objects. The current implementation may also false-positive on content containing the literal string "type":"match".
Code Suggestion:
// Consider proper JSON parsing for robustness
var outputs []map[string]interface{}
if err := json.Unmarshal(output, &outputs); err == nil {
for _, entry := range outputs {
if t, ok := entry["type"].(string); ok && t == "match" {
found = true
break
}
}
}
Summary
Testing
https://chatgpt.com/codex/tasks/task_e_68cab288b49c832496bfad4568dc4760
Important
The pull request refactors the
for 5ab833f.
executeAssureandexecuteParallelAssurefunctions for better clarity and performance, introduces a newrunParallelRipgrepfunction to handle parallel execution, and adds tests for theexecuteAssurefunction to ensure correct behavior. Additionally, it improves logging and status reporting mechanisms.This description was created by