Full Text Search
How to use full text search in PostgreSQL.
Postgres has built-in functions to handle Full Text Search
queries. This is like a "search engine" within Postgres.
Preparation
For this guide we'll use the following example data:
Usage
The functions we'll cover in this guide are:
to_tsvector()
Converts your data into searchable tokens. to_tsvector()
stands for "to text search vector." For example:
_10select to_tsvector('green eggs and ham');_10-- Returns 'egg':2 'green':1 'ham':4
Collectively these tokens are called a "document" which Postgres can use for comparisons.
to_tsquery()
Converts a query string into tokens to match. to_tsquery()
stands for "to text search query."
This conversion step is important because we will want to "fuzzy match" on keywords.
For example if a user searches for eggs
, and a column has the value egg
, we probably still want to return a match.
Match: @@
The @@
symbol is the "match" symbol for Full Text Search. It returns any matches between a to_tsvector
result and a to_tsquery
result.
Take the following example:
The equality symbol above (=
) is very "strict" on what it matches. In a full text search context, we might want to find all "Harry Potter" books and so we can rewrite the
example above:
Basic full text queries
Search a single column
To find all books
where the description
contain the word big
:
Search multiple columns
Right now there is no direct way to use JavaScript or Dart to search through multiple columns but you can do it by creating computed columns on the database.
To find all books
where description
or title
contain the word little
:
Match all search words
To find all books
where description
contains BOTH of the words little
and big
, we can use the &
symbol:
Match any search words
To find all books
where description
contain ANY of the words little
or big
, use the |
symbol:
Notice how searching for big
includes results with the word bigger
(or biggest
, etc).
Partial search
Partial search is particularly useful when you want to find matches on substrings within your data.
Implementing partial search
You can use the :*
syntax with to_tsquery()
. Here's an example that searches for any book titles beginning with "Lit":
_10select title from books where to_tsvector(title) @@ to_tsquery('Lit:*');
Extending functionality with RPC
To make the partial search functionality accessible through the API, you can wrap the search logic in a stored procedure.
After creating this function, you can invoke it from your application using the SDK for your platform. Here's an example:
This function takes a prefix parameter and returns all books where the title contains a word starting with that prefix. The :*
operator is used to denote a prefix match in the to_tsquery()
function.
Handling spaces in queries
When you want the search term to include a phrase or multiple words, you can concatenate words using a +
as a placeholder for space:
_10select * from search_books_by_title_prefix('Little+Puppy');
Creating indexes
Now that we have Full Text Search working, let's create an index
. This will allow Postgres to "build" the documents pre-emptively so that they
don't need to be created at the time we execute the query. This will make our queries much faster.
Searchable columns
Let's create a new column fts
inside the books
table to store the searchable index of the title
and description
columns.
We can use a special feature of Postgres called
Generated Columns
to ensure that the index is updated any time the values in the title
and description
columns change.
Search using the new column
Now that we've created and populated our index, we can search it using the same techniques as before:
Query operators
Visit PostgreSQL: Text Search Functions and Operators
to learn about additional query operators you can use to do more advanced full text queries
, such as:
Proximity: <->
The proximity symbol is useful for searching for terms that are a certain "distance" apart.
For example, to find the phrase big dreams
, where the a match for "big" is followed immediately by a match for "dreams":
We can also use the <->
to find words within a certain distance of each other. For example to find year
and school
within 2 words of each other:
Negation: !
The negation symbol can be used to find phrases which don't contain a search term.
For example, to find records that have the word big
but not little
: