
Failover lost master #89

@dfredell


I found a scenario where the cluster loses its master.

It occurred when:

  1. I had 3 healthy nodes running, with a remote Consul and a static root password.
  2. I killed the master.
  3. Failover started on 37.
  4. mysqlrpladmin on 37 decided that 36 should be the master.
  5. 36 detected that it is the new master.
  6. 36 created a new containerpilot.json with the service `mysql-primary`.
  7. 36 then ran containerpilot -reload.
  8. This caused mysql to stop and start.
  9. When mysql came back up, it had no record of a primary.
  10. Reading /v1/kv/mysql-primary also returned no result.
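The check in step 10 can be reproduced with a short script. This is only a minimal sketch: it assumes the Consul agent is reachable at `http://localhost:8500` (adjust for the remote Consul), and the function names `decode_kv_response` and `read_primary` are hypothetical helpers, not part of this project. It relies on Consul's KV HTTP API returning 404 for a missing key and base64-encoding stored values.

```python
import base64
import json
import urllib.error
import urllib.request

# Assumption: address of the Consul agent; change it for a remote Consul.
CONSUL_URL = "http://localhost:8500"


def decode_kv_response(body):
    """Return the decoded value from a Consul KV JSON response body, or None."""
    entries = json.loads(body) if body else []
    if not entries:
        return None
    raw = entries[0].get("Value")
    # Consul base64-encodes values; a key with an empty value has Value null.
    return base64.b64decode(raw).decode("utf-8") if raw else None


def read_primary(key="mysql-primary", base_url=CONSUL_URL):
    """Fetch the key from Consul's KV store; None means nothing is recorded."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/kv/{key}", timeout=5) as resp:
            return decode_kv_response(resp.read())
    except urllib.error.HTTPError as err:
        if err.code == 404:  # Consul answers 404 when the key does not exist
            return None
        raise


if __name__ == "__main__":
    print(read_primary() or "no primary recorded")
```

Running this right after step 7 and again after step 8 should show whether the key was never written or was removed during the reload.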

failover.log
Servers (referred to above as 37 and 36, after the start of their hostname IDs):

  1. docker compose name: mysql_4, hostname: mysql-37f99a0a7a84, IP: 192.168.128.236
  2. docker compose name: mysql_5, hostname: mysql-363deb257281, IP: 192.168.128.235

Failover works fine when the node that gets the failover lock is also the node that mysqlrpladmin picks as the new master.
