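Optimizing fuzzy matching (like) is painful in any relational database. Left-anchored patterns such as like 'abc%' are comparatively easy, because an ordinary index can serve them, but for patterns with a leading wildcard such as like '%c%' an ordinary index does not help.
This post describes one way to optimize like '%c%' queries in PostgreSQL, built on its full-text search feature.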
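For comparison, the left-anchored case can be covered by an ordinary B-tree index. The following is a minimal sketch and not part of the original test: the table demo_prefix and the index idx_demo_prefix_name are hypothetical names, and the text_pattern_ops operator class is included on the assumption that the database does not use the C collation.

```sql
-- Hypothetical table and index names, used for illustration only.
create table demo_prefix (id int, name text);

-- text_pattern_ops lets a B-tree index serve left-anchored LIKE patterns
-- even when the database collation is not "C".
create index idx_demo_prefix_name on demo_prefix (name text_pattern_ops);

-- A prefix pattern can be turned into a range condition on the index.
-- A pattern with a leading wildcard ('%abc%') cannot.
explain select * from demo_prefix where name like 'abc%';
```

Whether the planner actually chooses the index depends on table size and statistics, so the plan should be checked against real data.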
Test data preparation (environment: CentOS 6.5 + PostgreSQL 9.6.1).
postgres=# create table ts(id int,name text);
CREATE TABLE
postgres=# \d ts
      Table "public.ts"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 name   | text    |
postgres=# insert into ts select n,n||'_pjy' from generate_series(1,2000) n;
INSERT 0 2000
postgres=# insert into ts select n,n||'_mdh' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_lmm' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_syf' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_wbd' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_hhh' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_sjw' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_jjs' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_ymd' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_biu' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_dfl' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# select count(*) from ts;
  count
----------
 20002000
(1 row)
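The ten near-identical inserts above can also be written as one statement. This is a scaled-down sketch under stated assumptions: ts_demo is a hypothetical table used so the original ts is not loaded twice, and each suffix gets 2000 rows instead of 2000000 so that it runs quickly.

```sql
-- Hypothetical scratch table with the same shape as ts.
create table ts_demo (id int, name text);

-- The rare value that the test query searches for.
insert into ts_demo
select n, n || '_pjy' from generate_series(1, 2000) as n;

-- One insert for all ten common suffixes (cross join of suffixes and numbers).
-- Assumption: 2000 rows per suffix; the article uses 2000000.
insert into ts_demo
select n, n || '_' || suffix
from unnest(array['mdh','lmm','syf','wbd','hhh',
                  'sjw','jjs','ymd','biu','dfl']) as suffix,
     generate_series(1, 2000) as n;

select count(*) from ts_demo;
```

Changing the second generate_series upper bound to 2000000 should reproduce the row counts used in this test.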
Start the test. With no usable index, the query falls back to a sequential scan over the whole table:
postgres=# explain analyze select * from ts where name like '%pjy%';
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Seq Scan on ts  (cost=0.00..358144.05 rows=2000 width=15) (actual time=0.006..1877.087 rows=2000 loops=1)
   Filter: (name ~~ '%pjy%'::text)
   Rows Removed by Filter: 20000000
 Planning time: 0.031 ms
 Execution time: 1877.178 ms
(5 rows)
The key step: create a GIN index on the to_tsvector expression over name:
postgres=# create index idx_name on ts using gin (to_tsvector('english',name));
CREATE INDEX
postgres=# vacuum analyze ts;
VACUUM
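A full-text index stores whole lexemes rather than arbitrary substrings, so it is worth checking how a value is tokenized before relying on it. The sketch below uses only literals; '1_pjy' mirrors the shape of the test data and 'xpjyx' is a made-up value chosen to show a case where like and full-text matching disagree.

```sql
-- Show the lexemes produced for a value shaped like the test data.
select to_tsvector('english', '1_pjy');

-- Token-level match: the search term has to equal a whole lexeme.
select to_tsvector('english', '1_pjy') @@ to_tsquery('english', 'pjy') as token_match;

-- 'pjy' sits inside a longer word here: like finds it, @@ does not.
select 'xpjyx' like '%pjy%' as like_match,
       to_tsvector('english', 'xpjyx') @@ to_tsquery('english', 'pjy') as fts_match;
```

The last query should report like_match as true and fts_match as false, which is why this approach only replaces like '%c%' when the searched string is a separate token in the data. Because idx_name is an expression index, a query against ts also needs to use the same expression, to_tsvector('english', name), for the planner to consider it.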
postgres=# \d ts
      Table "public.ts"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 name   | text    |