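Finding rows duplicated across multiple fields in a table and deleting them, keeping the earliest one
Source: zhoz. A bug in the code produced duplicate data. The duplicates arose from a relationship between two tables, so the redundant rows have to be deleted while keeping the original first row (usually the one with the smallest ID). An example: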
CREATE TABLE `zhoz_mst` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`title` VARCHAR( 66 ) NOT NULL ,
`kana` VARCHAR( 66 ) ,
`zz` VARCHAR( 66 ) NOT NULL
) ENGINE = MYISAM ;
Insert test data:
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'aaa', 'bbb', 'ccc');
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'aaa', 'bbb2', '2');
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'aaa', 'bbb', '3');
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'aaa', 'bbb', '4');
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'a', 'b', '5');
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'a', NULL, '6');
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'a', NULL, '7');
INSERT INTO `zhoz0428`.`zhoz_mst` (`id` ,`title` ,`kana` ,`zz` )VALUES (NULL , 'a', 'b', '8');
1 aaa bbb ccc
2 aaa bbb2 2
3 aaa bbb 3
4 aaa bbb 4
5 a b 5
6 a NULL 6
7 a NULL 7
8 a b 8
Rows to delete (every duplicate except the one with the smallest ID in its group):
select a.id, a.title, a.kana from zhoz_mst a
where (a.title,a.kana) in (select title,kana from zhoz_mst group by title,kana having count(*) > 1)
and id not in (select min(id) from zhoz_mst group by title,kana having count(*)>1)
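The query above only lists the redundant rows. Below is a minimal sketch of the delete itself, assuming MySQL and the same `zhoz_mst` table. MySQL rejects a DELETE whose subquery reads from the table being deleted from (error 1093), so this sketch swaps the IN subqueries for a multi-table DELETE with a self-join. The aliases `dup` and `orig` are illustrative names, not part of the original.

```sql
-- Delete every row that has a twin with the same title and kana but a smaller id.
-- The row with the smallest id in each group has no such twin, so it survives.
DELETE dup
FROM zhoz_mst AS dup
JOIN zhoz_mst AS orig
  ON  dup.title = orig.title
  AND dup.kana  = orig.kana   -- '=' never matches NULL, so rows with a NULL kana are left alone
  AND dup.id    > orig.id;
```

With the test data above this should remove ids 3, 4 and 8. Running the SELECT first and comparing its ids against that expectation is a cheap safety check.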
Select the rows where both fields are identical:
select a.id, a.title, a.kana from zhoz_mst a
where (a.title,a.kana) in (select title,kana from zhoz_mst group by title,kana having count(*) > 1)
----------------------
1 aaa bbb
3 aaa bbb
4 aaa bbb
5 a b
8 a b
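Rows 6 and 7 are missing from this result even though both have title 'a' and a NULL kana, because a comparison involving NULL never evaluates to true. If such rows should count as duplicates too, one option in MySQL is the NULL-safe equality operator `<=>`. The following is a sketch under that assumption, on the same table. It lists the rows that have an earlier twin:

```sql
-- List rows for which an earlier row (smaller id) with the same title and kana exists.
SELECT a.id, a.title, a.kana
FROM zhoz_mst AS a
WHERE EXISTS (
    SELECT 1
    FROM zhoz_mst AS b
    WHERE b.title = a.title
      AND b.kana <=> a.kana   -- NULL-safe: NULL <=> NULL is true
      AND b.id < a.id
);
```

With the test data above this should return ids 3, 4, 7 and 8. Whether NULL values really mean "the same" depends on the data, so this is a choice to make rather than a default.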
If the data is important, back it up first.
The following is the backup method for PostgreSQL:
/usr/local/pgsql/bin/pg_dump -d aom -t zhoz_mst > zhoz_mst090428.sql
PostgreSQL study notes (updated): http://log.zhoz.com/read.php?469
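The example table itself is defined for MySQL (ENGINE = MYISAM), so here is a small sketch of a backup taken inside MySQL. The name `zhoz_mst_backup` is an assumption chosen for illustration.

```sql
-- Copy the table definition (columns and indexes), then copy every row.
CREATE TABLE zhoz_mst_backup LIKE zhoz_mst;
INSERT INTO zhoz_mst_backup SELECT * FROM zhoz_mst;

-- Sanity check: both counts should match before anything is deleted.
SELECT COUNT(*) FROM zhoz_mst;
SELECT COUNT(*) FROM zhoz_mst_backup;
```

This keeps the copy in the same database, which makes it easy to compare or restore rows, but it is not a substitute for a dump file stored elsewhere.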
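If related tables have to be updated at the same time as the deletion, code is needed to do it. I think it could also be done in plain SQL, but that carries some risk.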
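As a rough sketch of the plain SQL route: the related table is not described here, so the example assumes a hypothetical table `zhoz_detail` whose column `mst_id` refers to `zhoz_mst.id`. Both names are invented for illustration. The statement repoints child rows from each duplicate to the surviving row with the smallest id.

```sql
-- Hypothetical related table: zhoz_detail.mst_id refers to zhoz_mst.id.
UPDATE zhoz_detail AS d
JOIN zhoz_mst AS dup ON dup.id = d.mst_id
JOIN (
    -- One row per duplicated (title, kana) group, carrying the id to keep.
    SELECT title, kana, MIN(id) AS keep_id
    FROM zhoz_mst
    GROUP BY title, kana
    HAVING COUNT(*) > 1
) AS k ON k.title = dup.title AND k.kana = dup.kana
SET d.mst_id = k.keep_id
WHERE dup.id <> k.keep_id;   -- only rows that point at a redundant duplicate
```

Part of the risk is that MyISAM tables do not support transactions, so if the update succeeds and the later delete fails, nothing can be rolled back. A backup taken beforehand is the only way to recover.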
Outline of a PHP implementation:
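$data = $db->executeQuery($sql); // the SQL statement above
$exp_id_array = array(); // IDs that have already been handled
foreach ($data as $tmp) {
    $first_id = $tmp["id"];
    $first_title = $tmp["title"];
    $first_kana = $tmp["kana"];
    if (in_array($first_id, $exp_id_array)) {
        continue; // skip IDs that were already processed
    }
    // logic that operates on the other tables goes here
    $exp_id_array[] = $first_id; // assumed intent: remember the ID so it is skipped next time
}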