Database Replication Breaks Every Week


Blinky
Member
Registered: 30.03.16 13:59
Timezone: UTC +0
Posts: 5

Hi,

I have a MySQL cluster with a master-slave replication and for some reason it seems like every Saturday the slave dies, usually resulting in a broken replication. It often complains about a duplicate primary key or too many connections. It doesn't even make sense really for it to break on Saturday morning since there's nothing special which runs that day and it's a low-traffic time period. Any ideas why this is happening?

Regards,
Blinky


ynori7
Administrator
Registered: 24.08.11 12:16
Timezone: UTC +2
Posts: 152

Maybe something is writing to the slave? Might not be likely if it doesn't break the rest of the week, but it could be that there's maybe some rarely executed query writing to the wrong host.

i537.photobucket.com/albums/ff338/ynori77/archenemysig1.jpg


HammerTime
Member
Registered: 20.02.13 13:54
Timezone: UTC +0
Posts: 5

Or maybe it's an issue with the replication method. Try checking out this article: https://www.kickstarter.com/backing-and-hacking/the-day-the-replication-died

If you have your replication set to use "Mixed" mode, it could be that it normally chooses statement-based replication and maybe you have a rarely executed query which is considered "unsafe" (check this: Determination of Safe and Unsafe Statements in Binary Logging) which triggers a row-based replication which leads to inconsistent data.


Blinky
Member
Registered: 30.03.16 13:59
Timezone: UTC +0
Posts: 5

Wow thanks for the fast replies! Unfortunately we've already made the slave read-only, so nothing can be writing to the slave (except for the replication). I just checked the replication method and it looks like we always use row-based, so it can't be what HammerTime suggested either.


HammerTime
Member
Registered: 20.02.13 13:54
Timezone: UTC +0
Posts: 5

Do you have any logs? Maybe it says something in there that can help identify where it's breaking.


Blinky
Member
Registered: 30.03.16 13:59
Timezone: UTC +0
Posts: 5

Well we don't have any logging enabled, but we did make a dump last time it broke. Here's part of it:

+---------+------------------+-----------------------------------------+------------------+---------+--------+----------------------------------+---------------------------------------------
---------------------------------------------------------+----------+
| Id      | User             | Host                                    | db               | Command | Time   | State                            | Info                                        
                                                         | Progress |
+---------+------------------+-----------------------------------------+------------------+---------+--------+----------------------------------+---------------------------------------------
---------------------------------------------------------+----------+
| 1       | system user      |                                         |                  | Connect | 684158 | Waiting for master to send event |                                             
                                                         | 0.000    |
| 2       | system user      |                                         |                  | Connect | 37499  | Waiting for table level lock     |                                                                                                      | 0.000    |
| 2645755 | softwaretoplists | webserver1:53547   | softwaretoplists | Query   | 86247  | User sleep                       | SELECT COUNT(DISTINCT d.id) as count FROM downloads d     LEFT JOIN download_category_mapping dcm ON | 0.000    |
| 2646488 | softwaretoplists | webserver2:37414   | softwaretoplists | Query   | 86098  | User sleep                       | SELECT COUNT(DISTINCT d.id) as count FROM downloads d     LEFT JOIN download_category_mapping dcm ON | 0.000    |
| 2646546 | softwaretoplists | webserver3:38945   | softwaretoplists | Query   | 86087  | User sleep                       | SELECT COUNT(DISTINCT d.id) as count FROM downloads d     LEFT JOIN download_category_mapping dcm ON | 0.000    |
| 2646638 | softwaretoplists | webserver3:39167   | softwaretoplists | Query   | 86074  | User sleep                       | SELECT COUNT(DISTINCT d.id) as count FROM downloads d     LEFT JOIN download_category_mapping dcm ON | 0.000    |
| 2646689 | softwaretoplists | webserver3:39372   | softwaretoplists | Query   | 86061  | User sleep                       | SELECT COUNT(DISTINCT d.id) as count FROM downloads d     LEFT JOIN download_category_mapping dcm ON | 0.000    |
| 2646747 | softwaretoplists | webserver4:43446   | softwaretoplists | Query   | 86049  | User sleep                       | SELECT COUNT(DISTINCT d.id) as count FROM downloads d     LEFT JOIN download_category_mapping dcm ON | 0.000    |
| 2646796 | softwaretoplists | webserver3:39802   | softwaretoplists | Query   | 86037  | User sleep                       | SELECT COUNT(DISTINCT d.id) as count FROM downloads d     LEFT JOIN download_category_mapping dcm ON | 0.000    |
| 2873476 | softwaretoplists | webserver2:38234   | softwaretoplists | Query   | 37496  | Waiting for table level lock     | SELECT C.*, (SELECT COUNT(*) FROM download_category_mapping DCM WHERE C.id=DCM.category_id) AS downl | 0.000    |
| 2873489 | softwaretoplists | webserver5:57274   | softwaretoplists | Query   | 37492  | Waiting for table level lock     | SELECT C.*, (SELECT COUNT(*) FROM download_category_mapping DCM WHERE C.id=DCM.category_id) AS downl | 0.000    |
| 2873493 | softwaretoplists | webserver6:39648   | softwaretoplists | Query   | 37491  | Waiting for table level lock     | SELECT C.*, (SELECT COUNT(*) FROM download_category_mapping DCM WHERE C.id=DCM.category_id) AS downl | 0.000    |
| 2873494 | softwaretoplists | webserver6:57289   | softwaretoplists | Query   | 37491  | Waiting for table level lock     | SELECT C.*, (SELECT COUNT(*) FROM download_category_mapping DCM WHERE C.id=DCM.category_id) AS downl | 0.000    |


HammerTime
Member
Registered: 20.02.13 13:54
Timezone: UTC +0
Posts: 5

Is it always this category mapping table which breaks?


Blinky
Member
Registered: 30.03.16 13:59
Timezone: UTC +0
Posts: 5

Yeah actually I think it is. Not sure why though. There's nothing really special about that table.


HammerTime
Member
Registered: 20.02.13 13:54
Timezone: UTC +0
Posts: 5

Try going to your database and run this command and tell me what it says:

SELECT TABLE_NAME, ENGINE FROM information_schema.TABLES where TABLE_SCHEMA='softwaretoplists';


Blinky
Member
Registered: 30.03.16 13:59
Timezone: UTC +0
Posts: 5

Here's the output:

mysql> SELECT TABLE_NAME, ENGINE FROM information_schema.TABLES where TABLE_SCHEMA='softwaretoplists';
+---------------------------+--------+
| TABLE_NAME                | ENGINE |
+---------------------------+--------+
| categories                | InnoDB |
| download_category_mapping | MyISAM |
| downloads                 | InnoDB |
+---------------------------+--------+
3 rows in set (0.02 sec)

I'm not sure why that table has a different engine. It may have been created at a later point in time.