
Incorrect StatusFilter usage in CapacityBufferPodListProcessor #8851

@moko-poi

Description

The CapacityBufferPodListProcessor incorrectly uses StatusFilter, causing it to filter out buffers that are ready for provisioning instead of selecting them. This results in the processor only processing buffers that are not ready, which is the opposite of the intended behavior.

Current Behavior

In cluster-autoscaler/processors/capacitybuffer/pod_list_processor.go:

func NewCapacityBufferPodListProcessor(...) *CapacityBufferPodListProcessor {
    return &CapacityBufferPodListProcessor{
        client: client,
        statusFilter: buffersfilter.NewStatusFilter(map[string]string{
            common.ReadyForProvisioningCondition: common.ConditionTrue,
            common.ProvisioningCondition:         common.ConditionTrue,
        }),
        // ...
    }
}

According to StatusFilter's implementation and documentation:

// NewStatusFilter creates an instance of statusFilter that filters out the buffers with condition in passed conditions.
func NewStatusFilter(conditionsToFilterOut map[string]string) *statusFilter

The filter excludes buffers that have the specified conditions. This means:

  • Buffers with ReadyForProvisioning: True OR Provisioning: True are filtered out
  • Only buffers without these conditions are processed

This is the opposite of what should happen. The processor should select buffers that are ready for provisioning, not exclude them.
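The exclusion semantics can be illustrated with a minimal, self-contained sketch. Note the types and function below are simplified stand-ins for illustration, not the real CapacityBuffer API:

```go
package main

import "fmt"

// Simplified stand-ins for the real CapacityBuffer types, for illustration only.
type Condition struct {
	Type   string
	Status string
}

type CapacityBuffer struct {
	Name       string
	Conditions []Condition
}

// excludeByStatus mirrors StatusFilter's documented semantics: buffers whose
// conditions match any entry in conditionsToFilterOut are dropped.
func excludeByStatus(buffers []*CapacityBuffer, conditionsToFilterOut map[string]string) (kept, dropped []*CapacityBuffer) {
	for _, b := range buffers {
		matched := false
		for _, c := range b.Conditions {
			if want, ok := conditionsToFilterOut[c.Type]; ok && want == c.Status {
				matched = true
				break
			}
		}
		if matched {
			dropped = append(dropped, b)
		} else {
			kept = append(kept, b)
		}
	}
	return kept, dropped
}

func main() {
	ready := &CapacityBuffer{Name: "ready", Conditions: []Condition{{Type: "ReadyForProvisioning", Status: "True"}}}
	pending := &CapacityBuffer{Name: "pending"}

	kept, _ := excludeByStatus([]*CapacityBuffer{ready, pending}, map[string]string{"ReadyForProvisioning": "True"})
	for _, b := range kept {
		fmt.Println(b.Name) // only "pending" survives: the ready buffer was excluded
	}
}
```

With these semantics, the Pod List Processor keeps exactly the buffers it should be processing out, and drops the ones it should process.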

Expected Behavior

The Pod List Processor should:

  1. Select buffers that have ReadyForProvisioning: True condition
  2. Process these buffers by generating virtual pods
  3. Update the buffer status to Provisioning: True

Root Cause Analysis

StatusFilter Design

StatusFilter is designed to exclude buffers with specified conditions. This design works correctly in Buffer Controller where it's wrapped with CombinedAnyFilter:

// Buffer Controller usage (CORRECT)
statusFilter: filters.NewCombinedAnyFilter(
    []filters.Filter{
        filters.NewStatusFilter(map[string]string{
            common.ReadyForProvisioningCondition: common.ConditionTrue,
            common.ProvisioningCondition:         common.ConditionTrue,
        }),
        filters.NewBufferGenerationChangedFilter(),
        filters.NewPodTemplateGenerationChangedFilter(client),
    },
)

Here, CombinedAnyFilter collects buffers from any sub-filter. The logic is:

  • StatusFilter excludes already-processed buffers
  • Other filters include buffers that need reprocessing
  • Combined result: process buffers that are NOT already processed OR need updating

However, Pod List Processor uses StatusFilter directly without CombinedAnyFilter, causing the inverted logic.
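The union semantics that make this work in the Buffer Controller can be sketched as follows. This is an illustrative model using simplified types and closures, not the actual cluster-autoscaler code:

```go
package main

import "fmt"

// Buffer is a simplified stand-in: Processed stands in for the
// ReadyForProvisioning/Provisioning conditions, GenChanged for a
// generation-changed check.
type Buffer struct {
	Name       string
	Processed  bool
	GenChanged bool
}

type Filter func(buffers []*Buffer) (kept []*Buffer)

// combinedAny keeps a buffer if ANY sub-filter keeps it (the union of the
// sub-filters' results), preserving input order.
func combinedAny(filters ...Filter) Filter {
	return func(buffers []*Buffer) []*Buffer {
		keptSet := map[*Buffer]bool{}
		for _, f := range filters {
			for _, b := range f(buffers) {
				keptSet[b] = true
			}
		}
		var kept []*Buffer
		for _, b := range buffers {
			if keptSet[b] {
				kept = append(kept, b)
			}
		}
		return kept
	}
}

func main() {
	notProcessed := func(bs []*Buffer) (out []*Buffer) {
		for _, b := range bs {
			if !b.Processed {
				out = append(out, b)
			}
		}
		return
	}
	genChanged := func(bs []*Buffer) (out []*Buffer) {
		for _, b := range bs {
			if b.GenChanged {
				out = append(out, b)
			}
		}
		return
	}

	buffers := []*Buffer{
		{Name: "new"},                                      // not processed yet
		{Name: "stale", Processed: true, GenChanged: true}, // processed, but spec changed
		{Name: "done", Processed: true},                    // processed, unchanged
	}
	for _, b := range combinedAny(notProcessed, genChanged)(buffers) {
		fmt.Println(b.Name) // "new" and "stale" are kept; "done" is dropped
	}
}
```

The exclusion filter only produces the intended result when it is one input to this union; used on its own, its output is inverted relative to what the Pod List Processor needs.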

Impact

This bug likely renders the CapacityBuffer feature non-functional:

  • Virtual pods are never injected for ready buffers
  • Scale-up based on capacity buffers never occurs
  • The feature appears to do nothing

Proposed Solutions

Solution 1: Create a new InclusiveStatusFilter (Recommended)

Create a separate filter that includes buffers with specified conditions:

// New filter: status_include_filter.go
type statusIncludeFilter struct {
    conditions map[string]string
}

func NewStatusIncludeFilter(conditionsToInclude map[string]string) *statusIncludeFilter {
    return &statusIncludeFilter{
        conditions: conditionsToInclude,
    }
}

func (f *statusIncludeFilter) Filter(buffersToFilter []*v1.CapacityBuffer) ([]*v1.CapacityBuffer, []*v1.CapacityBuffer) {
    var included []*v1.CapacityBuffer
    var excluded []*v1.CapacityBuffer

    for _, buffer := range buffersToFilter {
        if f.hasAllConditions(buffer) {
            included = append(included, buffer)
        } else {
            excluded = append(excluded, buffer)
        }
    }
    return included, excluded
}

func (f *statusIncludeFilter) hasAllConditions(buffer *v1.CapacityBuffer) bool {
    bufferConditions := buffer.Status.Conditions
    matchCount := 0
    
    for condType, condStatus := range f.conditions {
        for _, condition := range bufferConditions {
            if condition.Type == condType && string(condition.Status) == condStatus {
                matchCount++
                break
            }
        }
    }
    
    // All specified conditions must be present
    return matchCount == len(f.conditions)
}

Update Pod List Processor:

func NewCapacityBufferPodListProcessor(...) *CapacityBufferPodListProcessor {
    return &CapacityBufferPodListProcessor{
        client: client,
        statusFilter: buffersfilter.NewStatusIncludeFilter(map[string]string{
            common.ReadyForProvisioningCondition: common.ConditionTrue,
        }),
        // ...
    }
}

Note: We should only check ReadyForProvisioning: True, not both conditions, because:

  • Buffer Controller sets ReadyForProvisioning: True when buffer spec is processed
  • Pod List Processor sets Provisioning: True when virtual pods are injected
  • If we require both, no buffer would ever be processed (chicken-and-egg problem)

Solution 2: Use CombinedAnyFilter with inverted logic

Wrap StatusFilter with additional logic to invert the result. This is more complex and less clear than Solution 1, and is not recommended.

Solution 3: Modify StatusFilter to support both modes

Add a boolean parameter to StatusFilter:

func NewStatusFilter(conditions map[string]string, exclude bool) *statusFilter

Not recommended because it makes the API confusing and breaks existing code.

Additional Issues

Issue 1: Condition Update Logic

Currently, both Buffer Controller and Pod List Processor replace the entire Conditions array:

buffer.Status.Conditions = []metav1.Condition{readyCondition}  // Overwrites all conditions

This means:

  • When Buffer Controller sets ReadyForProvisioning, it removes any existing conditions
  • When Pod List Processor sets Provisioning, it removes ReadyForProvisioning
  • Both conditions never coexist

Recommendation: Use proper condition management that updates or appends conditions without removing others.
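The upsert pattern this recommendation describes can be sketched as below. The types here are simplified stand-ins for metav1.Condition; for real metav1.Condition slices, the apimachinery helper meta.SetStatusCondition (k8s.io/apimachinery/pkg/api/meta) already provides this update-or-append behavior:

```go
package main

import "fmt"

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string
}

// setCondition updates the matching condition in place, or appends it if
// absent, instead of replacing the whole slice (which drops unrelated
// conditions).
func setCondition(conditions []Condition, newCond Condition) []Condition {
	for i, c := range conditions {
		if c.Type == newCond.Type {
			conditions[i] = newCond
			return conditions
		}
	}
	return append(conditions, newCond)
}

func main() {
	conds := []Condition{{Type: "ReadyForProvisioning", Status: "True"}}
	conds = setCondition(conds, Condition{Type: "Provisioning", Status: "True"})
	fmt.Println(len(conds)) // both conditions now coexist: prints 2
}
```

With this pattern, the Pod List Processor can set Provisioning: True without erasing the ReadyForProvisioning condition the Buffer Controller wrote.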

Issue 2: Filter Logic Confusion

The current design where "filter" means "exclude" is counter-intuitive. Consider:

  • Renaming StatusFilter to StatusExcludeFilter for clarity
  • Creating StatusIncludeFilter as a separate, clearer alternative
  • Documenting the filter semantics more clearly

Testing Recommendations

  1. Add integration tests that verify:

     • Buffers with ReadyForProvisioning: True are actually processed
     • Virtual pods are generated for ready buffers
     • Conditions are properly updated

  2. Add unit tests for the new StatusIncludeFilter

  3. Test the complete lifecycle:

     • Create a CapacityBuffer
     • Verify Buffer Controller sets ReadyForProvisioning
     • Verify Pod List Processor generates virtual pods
     • Verify Scale Up Logic processes virtual pods
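A table-driven unit test for the proposed include-filter predicate might look like this sketch (again using simplified stand-in types rather than the real CapacityBuffer API):

```go
package main

import "fmt"

// Simplified stand-ins for the real types, for illustration only.
type Condition struct {
	Type   string
	Status string
}

type Buffer struct {
	Name       string
	Conditions []Condition
}

// hasAllConditions mirrors the proposed include-filter predicate: every
// wanted (type, status) pair must be present on the buffer.
func hasAllConditions(b *Buffer, want map[string]string) bool {
	matched := 0
	for condType, condStatus := range want {
		for _, c := range b.Conditions {
			if c.Type == condType && c.Status == condStatus {
				matched++
				break
			}
		}
	}
	return matched == len(want)
}

func main() {
	want := map[string]string{"ReadyForProvisioning": "True"}
	cases := []struct {
		buffer   *Buffer
		expected bool
	}{
		{&Buffer{Name: "ready", Conditions: []Condition{{Type: "ReadyForProvisioning", Status: "True"}}}, true},
		{&Buffer{Name: "not-ready", Conditions: []Condition{{Type: "ReadyForProvisioning", Status: "False"}}}, false},
		{&Buffer{Name: "no-conditions"}, false},
	}
	for _, tc := range cases {
		got := hasAllConditions(tc.buffer, want)
		fmt.Printf("%s: got=%v want=%v\n", tc.buffer.Name, got, tc.expected)
	}
}
```

In the real test this would be a `testing.T` table-driven test over `StatusIncludeFilter.Filter`, asserting both the included and excluded slices.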

References

  • cluster-autoscaler/capacitybuffer/filters/status_filter.go
  • cluster-autoscaler/processors/capacitybuffer/pod_list_processor.go
  • cluster-autoscaler/capacitybuffer/controller/controller.go
  • cluster-autoscaler/capacitybuffer/common/common.go

Priority

High - This bug likely prevents the CapacityBuffer feature from functioning at all in its current state.
