pt-table-checksum reports a CRC_DIFF for several tables between my master and slave servers when I run the following commands:

$ pt-table-checksum h=master,u=user,p=password --empty-replicate-table --databases db --replicate systemadministration.checksums
$ pt-table-checksum h=master,u=user,p=password --databases db --replicate systemadministration.checksums --replicate-check 1

Results in:

Differences on P=3306,h=slave
DB         TBL                         CHUNK CNT_DIFF CRC_DIFF BOUNDARIES
db         table1                          0        0        1 1=1
db         table2                          0        0        1 1=1
db         table3                          0        0        1 1=1
db         table4                          0        0        1 1=1
db         table5                          0        0        1 1=1
db         table6                          0        0        1 1=1
db         table7                          0        0        1 1=1

However, when I run pt-table-sync, the script returns an exit code 0 and says that there are no issues.

$ pt-table-sync --execute --verbose --no-bin-log --tables db.table1 h=master,u=user,p=password h=slave
# Syncing h=slave,p=...,u=user
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      0      0 GroupBy   14:10:45 14:12:12 0    db.table1
$ echo $?
0

I've tried the different algorithms for the checksum command and have had no luck.

$ pt-table-checksum h=master,u=user,p=password --empty-replicate-table --algorithm=ACCUM --tables db.table1 --replicate systemadministration.checksums
DATABASE   TABLE                       CHUNK HOST                   ENGINE      COUNT         CHECKSUM TIME WAIT STAT  LAG
db         table1                          0 master                 MyISAM     141836 00141836D0139746   22 NULL NULL NULL
$ pt-table-checksum h=master,u=user,p=password --tables db.table1 --replicate systemadministration.checksums --replicate-check 1
Differences on P=3306,h=slave
DB         TBL                         CHUNK CNT_DIFF CRC_DIFF BOUNDARIES
db         table1                          0        0        1 1=1
$ echo $?
1

Any hints, or are there any other tools I can use to verify the data integrity?

RolandoMySQLDBA
stanleykylee
  • There is an improved version of pt-table-sync coming soon which is currently available in the trunk release of Percona toolkit that may work better for you. – Aaron Brown Dec 06 '11 at 14:25
  • @AaronBrown I looked at the Percona Launchpad and did not find anything newer. Could you point me to where you're seeing a new version? I looked here: http://bazaar.launchpad.net/~percona-toolkit-dev/percona-toolkit/2.0/files/head:/bin/ – stanleykylee Dec 06 '11 at 22:44
  • Sorry, it's pt-table-checksum that has changed, not pt-table-sync. My mistake. – Aaron Brown Dec 07 '11 at 02:22
  • @Rolando I tried your scripts out, but I'm getting "ls: *Repair.txt: No such file or directory". Have you seen this error with your script before? –  Feb 08 '12 at 03:47
  • Actually, that error comes up all the time WHEN THERE IS NOTHING TO REPAIR. That's a good thing. – RolandoMySQLDBA Feb 08 '12 at 17:18
  • Thanks! :) I'll put another catch in there. I was going to integrate this into Xymon so I can get alerts when it gets out of sync. –  Feb 08 '12 at 20:54

1 Answer

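The usual cause of this problem is that pt-table-sync uses CRC32 as its hash function. CRC32 is fast and cheap, but it can also produce collisions: "codding" and "gnu" have the same CRC32, for example. If two differing rows or chunks happen to collide, pt-table-sync sees matching checksums and reports nothing to fix. I recommend trying again with MD5 as the hash function (the --function option).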
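If you want to check the collision claim yourself, here is a minimal Python sketch. It assumes that zlib's CRC-32 is the same checksum MySQL's CRC32() computes, which I believe is the case but have not verified against your server. The helper names (crc32_hex, md5_hex, compare) are made up for this illustration and are not part of Percona Toolkit.

```python
import hashlib
import zlib


def crc32_hex(text: str) -> str:
    """Return the CRC-32 of the UTF-8 bytes of `text` as 8 hex digits."""
    return format(zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF, "08x")


def md5_hex(text: str) -> str:
    """Return the MD5 digest of the UTF-8 bytes of `text` as 32 hex digits."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def compare(a: str, b: str) -> dict:
    """Report whether two values produce the same hash under each function."""
    return {
        "crc32_equal": crc32_hex(a) == crc32_hex(b),
        "md5_equal": md5_hex(a) == md5_hex(b),
    }


if __name__ == "__main__":
    # The two words quoted above as a CRC32 collision.
    for word in ("codding", "gnu"):
        print(word, crc32_hex(word), md5_hex(word))
    print(compare("codding", "gnu"))
```

If crc32_equal comes back True while md5_equal is False, that is the kind of collision that can hide a real difference from a 32-bit checksum. A 128-bit hash makes an accidental match far less likely, at the cost of more CPU per row.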

  • Changing the function didn't seem to work. I ended up deleting all rows from the tables and then ran pt-table-sync to fill in the tables. Checksum and sync returned fine after that. – stanleykylee Dec 14 '11 at 01:03