Cursor is not deserialized correctly with multiple columns #1226

@nigel-nava

The bug:
Pausing and then resuming a task from the dashboard fails with an error when the task has multiple cursor columns.

The error:

ActiveRecord::StatementInvalid
PG::InvalidDatetimeFormat: ERROR: invalid input syntax for type timestamp with time zone: "["2025-03-11 14:30:01.132797000", "7d826fb4-b75e-4393-a799-e8cacbb32624"]" CONTEXT: unnamed portal parameter $1 = '...' 

The code:

# base job class
class DataMigration::MaintenanceTasksBaseJob < MaintenanceTasks::TaskJob
  queue_as :latency_30s
end

class ExampleTask < MaintenanceTasks::Task
  def collection
    User.all # has "id" and "created_at" columns
  end

  def cursor_columns
    [ :created_at, :id ]
  end

  def process(user)
    puts "Processing #{user.id}, #{user.created_at}"
    sleep 1
  end
end

My fix:

# base job class
class DataMigration::MaintenanceTasksBaseJob < MaintenanceTasks::TaskJob
  queue_as :latency_30s

  private

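  def build_enumerator(run, cursor:)
    deserialized_cursor = cursor
    # No cursor was passed in (resume from the dashboard), but the run holds a
    # multi-column cursor stored as a JSON string: parse it back into an array.
    # Array() guards against cursor_columns returning nil.
    if cursor.nil? && Array(@task.cursor_columns).length > 1 && run.cursor.is_a?(String)
      deserialized_cursor = JSON.parse(run.cursor)
    end
    super(run, cursor: deserialized_cursor)
  end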
end

I think what's happening is that pausing and resuming from the dashboard enqueues a new job with no serialized cursor position. In ac3872a we assign cursor_position to the value of run.cursor, but at that point the value is expected to be fully deserialized. When cursor_columns has multiple columns, run.cursor is actually a JSON string, which JobIteration does not handle correctly.
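To illustrate the suspected mismatch outside of Rails, here is a minimal sketch using only Ruby's standard library. It assumes the multi-column cursor is stored as a JSON-encoded array, as the error message suggests. The helper name deserialize_cursor and its column-count parameter are hypothetical and are not part of maintenance_tasks or JobIteration.

```ruby
require "json"

# Hypothetical helper, not part of maintenance_tasks or JobIteration.
# Returns the stored cursor as an array when it looks like a JSON-encoded
# multi-column cursor; otherwise returns the value unchanged.
def deserialize_cursor(stored_cursor, cursor_column_count)
  return stored_cursor unless stored_cursor.is_a?(String) && cursor_column_count > 1

  JSON.parse(stored_cursor)
rescue JSON::ParserError
  # Not valid JSON: leave the value as it was stored.
  stored_cursor
end

# A two-column cursor (created_at, id), assumed to be stored as a JSON string.
stored = ["2025-03-11 14:30:01.132797000", "7d826fb4-b75e-4393-a799-e8cacbb32624"].to_json

# Passing `stored` through as-is hands the database one string for the
# timestamp column, which matches the PG::InvalidDatetimeFormat error above.
restored = deserialize_cursor(stored, 2)
created_at_position, id_position = restored
```

With a single cursor column the helper leaves the value untouched, so only the multi-column case changes.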

Labels: bug