postgresql关于like%xxx%的优化操作
  • 作者:admin
  • 发表时间:2021-06-01 07:51
  • 来源:未知

任何一个关系型数据库关于模糊匹配(like)的优化都是一件痛苦的事,相对而言,诸如like 'abc%'之类的还好一点,可以通过创建索引来优化,但对于like 'c%'之类的,真的就没有办法了。

这里介绍一种postgresql关于like 'c%'的优化方法,是基于全文检索的特性来实现的。

测试数据准备(环境centos6.5 + postgresql 9.6.1)。

postgres=# create table ts(id int,name text);
CREATE TABLE
postgres=# \d ts
Table "public.ts"
Column | Type  | Modifiers
--------+---------+-----------
id   | integer |
name  | text  |
postgres=# insert into ts select n,n||'_pjy' from generate_series(1,2000) n;
INSERT 0 2000
postgres=# insert into ts select n,n||'_mdh' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_lmm' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_syf' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_wbd' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_hhh' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_sjw' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_jjs' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_ymd' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_biu' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# insert into ts select n,n||'_dfl' from generate_series(1,2000000) n;
INSERT 0 2000000
postgres=# select count(*) from ts;
 count
----------
 20002000
(1 row)

 

开始测试:

postgres=# explain analyze select * from ts where name like '%pjy%';
                        QUERY PLAN                       
-----------------------------------------------------------------------------------------------------------
 Seq Scan on ts (cost=0.00..358144.05 rows=2000 width=15) (actual time=0.006..1877.087 rows=2000 loops=1)
  Filter: (name ~~ '%pjy%'::text)
  Rows Removed by Filter: 20000000
 Planning time: 0.031 ms
 Execution time: 1877.178 ms
(5 rows)

 

关键一步:

postgres=# create index idx_name on ts using gin (to_tsvector('english',name));
CREATE INDEX
postgres=# vacuum analyze ts;
VACUUM
postgres=# \d ts
   Table "public.ts"
 Column | Type  | Modifiers
--------+---------+-----------
 id   | integer |
 name  | text  |