Querying the Full-Text Index in SQL Server

05 Dec
December 5, 2012

SQL Server provides Full-Text search capabilities through it’s Full-Text Index, a mature document search tool with neat features like thesaurus and stop-word integration as well as some semantic search and keyword extraction features in SQL Server 2012.

The Full-Text Index is used through 2 (scalar) functions CONTAINS and FREETEXT, and 2 (table-valued) functions CONTAINSTABLE and FREETEXTTABLE. In this post I will be briefly exploring the difference between each of those functions.

CONTAINS vs FREETEXT

Both of these functions are typically used in the WHERE clause of a SQL statement, as an argument they both take a list of columns (or all columns using *) that have Full-Text enabled, and a search string argument. Both functions can be localized to a particular language (in case of thesaurus expansions or stop-word elimination) using an additional optional LANGUAGE argument. Basically:

CONTAINS | FREETEXT ( {column_name | list | * } , ” , LANGUAGE )

The main difference through is the behavior and the level of control both these functions offer.

CONTAINS Full-Text Function FREETEXT Full-Text Function
Used when precise matching is required, or controlled fuzzy matching using search conditions Used when trying to expand the search terms to related terms through meaning, through stemming and thesaurus expansions, resulting in less precise search
Supports special characters (such as ‘&‘ ‘|‘) and functions (such as INFLECTIONAL, THESAURUS and proximity functions) Does not allow special characters or functions, each word in the search string is expanded and search for individual and the results are grouped using an OR operator
Allows querying on specific document properties (such as the Title) with Search Property List Does not allow property search
Advance function that does everything FREETEXT capable of doing and more Simple function that supports powerful features without dealing with complex expressions

Special functions supported by the CONTAINS’s search query include:

  • Prefix Search: such as ‘potato*‘ to match terms that are prefixed with the word potato, for examples potatoes or potatoed :S
  • Expansion Search: such as thesaurus or inflectional search, for example ‘FORMSOF (INFLECTIONAL, run) ‘ which could match running, runs, runner, etc.
  • Proximity Search: using generic proximity operators such as NEAR, or by customizing the proximity operators to reflect business requirement, such as the distance between words.
  • Weighted Search: CONTAINS allows you to assign different weights to different terms within the search query, very neat feature if you are doing sentence structure analysis as well.

CONTAINSTABLE vs FREETEXTTABLE

These functions operate in the exact same manner as their previous counterparts in terms of search capabilities and functionality, but offer the result as a table that can be used in the FROM clause of a SQL statement (instead of the WHERE), essentially similar to a table-valued function that returns a table with two columns: KEY and RANK.

  • KEY: the unique key that identifies a row (or a document) in the Full-Text search enabled table, the existence of the key is enforced by the requirements of creating a Full-Text Index.
  • RANK: a relevance rank of the returned document to the issued search query, this will allow you to rank search results by relevance.

Additionally, these two functions take an additional Top_N_By_Rank argument that allows you to limit the returned results according to a rank threshold.

And that is pretty much how you would go about using Full-Text search functions in SQL Server, of course to get the most out of Full-Text search you will need to look into configuring a custom thesaurus file, as well as experimenting with different search query configuration to yield optimum results for your particular requirements.

Querying the Full-Text Index in SQL Server rated 4 out of 5 by 1 readers

Querying the Full-Text Index in SQL Server , 4.0 out of 5 based on 1 ratings
2 votes
Tags: , , , , , , ,
0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>