I have a table in a database with a structure like this Keywords
U_id is a foreign key, id is the primary key
The keywords field is a list of words created by users separated by commas I was wondering if someone could suggest an efficient query to search such a table.
If you using MyISAM, you can create a fulltext index on field keywords
. Then search using:
select * from keywords k where match('test') against(k.keywords);
Of course CSV in a database is just about the worst thing you can do. You should put keywords in a separate table. Make sure to use InnoDB for all tables.
Table tags
-------------
id integer auto_increment primary key
keyword_id integer foreign key references keywords(id)
keyword varchar(40)
Now you can select using:
SELECT k.* FROM keywords k
INNER JOIN tags t ON (t.keyword_id = k.id)
WHERE t.keyword LIKE 'test' //case insensitive comparison.
Much much faster than CSV.
You should change your database design so that you have a table called user_keyword
and store each keyword in a separate row. You can then index this table and search it easily and efficiently:
WHERE keyword = 'foo'
If you can't modify the database then you can use FIND_IN_SET
but it won't be very efficient:
WHERE FIND_IN_SET('foo', keywords)
You have 2 options :
SELECT * FROM Keywords WHERE keyword LIKE %KEYWORD%
Separate keywords
in its own table, "connect" it to the old table via FOREIGN KEY, index it and you'll be able to search for exact keywords of keyword prefixes efficiently.
For example:
id U_id keywords create_date
1 - A,B,C -
Becomes:
PARENT_TABLE:
id U_id create_date
1 - -
CHILD_TABLE:
id keyword
1 A
1 B
1 C
Provided there is an index on keyword
, the following query should be efficient:
SELECT * FROM PARENT_TABLE
WHERE id IN (SELECT id FROM CHILD_TABLE WHERE keyword = ...)
---EDIT---
Based on Johan's comments below, it appears that InnoDB uses what is known as "index-organized tables" under Oracle or "clusters" under most other databases. Provided you don't need to query "from parent to child" (i.e. "give me all keywords for given id"), the PRIMARY KEY on CHILD_TABLE should be:
{keyword, id}
Since the keyword
is the first field in the composite index, WHERE keyword = ...
(or WHERE keyword LIKE 'prefix%'
) can use this index directly.