Flaky test: SqlServerDataOptionsEndToEndSpec times out intermittently in CI #550

@Aaronontheweb

Description

Akka.Persistence.Sql.Tests.SqlServer.SqlServerDataOptionsEndToEndSpec.Should_Start_ActorSystem_wth_Sql_Persistence fails intermittently in CI with a timeout waiting for message acknowledgment.

Failure Mode

Failed: Timeout 00:00:03 while waiting for a message of type System.String

The test times out at line 122 waiting for the first "ACK" message from the persistence actor:

_persistenceActor.Tell(1);
ExpectMsg<string>(Ack);  // <-- Times out here

Evidence of Flakiness

  • Passes when run in isolation
  • Fails intermittently when run as part of full test suite in CI
  • Not a code regression: the failure occurs in code paths unrelated to recent changes

Reproduction

# Passes in isolation
dotnet test src/Akka.Persistence.Sql.Tests/Akka.Persistence.Sql.Tests.csproj \
  --filter "FullyQualifiedName~SqlServerDataOptionsEndToEndSpec"

# May fail when run with full suite (resource contention)
dotnet test src/Akka.Persistence.Sql.Tests/Akka.Persistence.Sql.Tests.csproj
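Because the failure is intermittent, a single full-suite run may not trigger it. A minimal sketch of a loop that re-runs a command until it first fails is shown below. The helper name repeat_until_fail and the run count of 20 are illustrative assumptions, not part of the repository.

```shell
# Re-run a command up to <max_runs> times and stop at the first failure.
# Usage: repeat_until_fail <max_runs> <command> [args...]
# (hypothetical helper, written for POSIX sh)
repeat_until_fail() {
  ruf_max=$1
  shift
  ruf_run=1
  while [ "$ruf_run" -le "$ruf_max" ]; do
    echo "run ${ruf_run}/${ruf_max}"
    if ! "$@"; then
      echo "failed on run ${ruf_run}"
      return 1
    fi
    ruf_run=$((ruf_run + 1))
  done
  echo "no failure in ${ruf_max} runs"
}

# Example (assumed run count):
# repeat_until_fail 20 dotnet test src/Akka.Persistence.Sql.Tests/Akka.Persistence.Sql.Tests.csproj
```

The run number at which the loop stops gives a rough sense of how often the flake occurs on a given machine or runner.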

Root Cause Analysis

The test body is defined in the base class SqlDataOptionsEndToEndSpecBase.cs. Its setup:

  • Uses a SQL Server test container (SqlServerContainer)
  • Configures the ActorSystem manually via ActorSystemSetup (lines 75-82)
  • Does NOT use Akka.Hosting extensions

Likely causes:

  1. SQL Server container timing - Container may not be fully ready when test starts
  2. Resource contention - Multiple tests competing for SQL Server connections in parallel
  3. Network delays - Docker container networking issues in CI environment

Test Location

  • File: src/Akka.Persistence.Sql.Tests/SqlServer/SqlServerDataOptionsEndToEndSpec.cs
  • Base class: src/Akka.Persistence.Sql.Tests/SqlDataOptionsEndToEndSpecBase.cs
  • Test method: Should_Start_ActorSystem_wth_Sql_Persistence (line 116)

Suggested Fixes

  1. Increase timeout - Change from 3 seconds to 10+ seconds for CI environments
  2. Add container readiness check - Ensure SQL Server is fully initialized before running test
  3. Retry logic - Mark test with [Retry] attribute if available
  4. Sequential execution - Consider running SQL Server tests sequentially rather than in parallel
  5. Better container cleanup - Ensure proper cleanup between tests to avoid port/resource conflicts
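The readiness check in fix 2 would most naturally live in SqlServerContainer.cs. As a language-neutral illustration of the polling pattern, here is a shell sketch that could run as a CI pre-step. The helper name wait_until, the 60-second budget, and the sqlcmd probe (host, port, credentials, the SA_PASSWORD variable) are all assumptions for illustration; an open port alone would not prove that SQL Server accepts logins, so the probe should run a real query.

```shell
# Poll a probe command until it succeeds or the time budget is spent.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
# (hypothetical helper; relies on `date +%s`, available in GNU and BSD date)
wait_until() {
  wu_timeout=$1
  wu_interval=$2
  shift 2
  wu_deadline=$(( $(date +%s) + wu_timeout ))
  # Probe output is discarded; only the exit status matters.
  while ! "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$wu_deadline" ]; then
      echo "not ready after ${wu_timeout}s" >&2
      return 1
    fi
    sleep "$wu_interval"
  done
}

# Example probe (assumes sqlcmd is on PATH; connection details are placeholders):
# wait_until 60 2 sqlcmd -S localhost,1433 -U sa -P "$SA_PASSWORD" -Q "SELECT 1"
```

Polling against a deadline rather than sleeping for a fixed time keeps fast runs fast while still tolerating a slow container start on a loaded CI agent.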

Environment

  • Occurs in CI builds (Azure DevOps)
  • May be specific to one runner OS (Linux or Windows); which one has not been determined
  • Related to SQL Server container initialization timing

Impact

  • Low impact on development (passes in isolation)
  • Medium impact on CI reliability (causes occasional build failures)
  • Not a functional bug - test infrastructure issue

Related Files

  • src/Akka.Persistence.Sql.Tests/SqlServer/SqlServerDataOptionsEndToEndSpec.cs
  • src/Akka.Persistence.Sql.Tests/SqlDataOptionsEndToEndSpecBase.cs
  • src/Akka.Persistence.Sql.Tests.Common/Containers/SqlServerContainer.cs

Metadata

  • Labels: bug