OmniFind Can Help Improve Your Search Technique
Technical tips to help you use Secure Perspective to secure your system
Do you have text or text documents (e.g., PDF, Microsoft* Word, XML or HTML files) that you archive or would like to archive? Would you like to be able to search that set of documents for words and phrases, similar to a search engine? How about if that search could do linguistic matches, like finding “mice” when you search for “mouse”? If so, check out the new IBM OmniFind* Text Search Server product (5733-OMF).
Already available for DB2* for z/OS* and DB2 for Linux*, UNIX* and Windows*, OmniFind was released for DB2 for IBM i in May. It’s provided as a no-additional-charge product for IBM i 6.1. Integrated with DB2 for IBM i, OmniFind lets you index documents stored in DB2 i database tables, then search those documents using the powerful new SQL functions CONTAINS and SCORE.
Let’s see an example of OmniFind’s power. Suppose you have several short stories that you’ve inserted into the Story column of database table “Books.” Using OmniFind, you can create a search index over the Story column. Then, when you want to retrieve all of the records for stories about cats, submit the following SQL query:
SELECT * FROM BOOKS WHERE CONTAINS(STORY, ‘cats chasing mice’) = 1
This query returns the records associated with those short stories that contain any of the phrases “cats chasing mice,” “a cat chased a mouse,” “a mouse was chasing a cat,” “Cats were chasing mice,” etc. OmniFind’s search capability is more than just simple text match. It’s built around linguistic-analysis capabilities developed by IBM research.
But this search potential is only part of the story. OmniFind can also rank the results of a search by relevance using the new SCORE built-in function:
SELECT * FROM BOOKS ORDER BY SCORE(STORY, ‘cat’) DESC
This query returns all from the Books table with records ordered so the short stories that best match the search criteria are returned first. By adding the limiting clause FETCH FIRST, the result set is limited to 20 rows:
SELECT * FROM BOOKS ORDER BY SCORE(STORY, ‘cat’) DESC FETCH FIRST 20 ROWS ONLY
Because CONTAINS and SCORE are built-in SQL functions, the searches can be combined with other SQL joins and selection clauses. For example, to limit the result set to stories after 2007, use:
SELECT TITLE FROM BOOKS WHERE CONTAINS(STORY, ‘cat’) = 1 AND YEAR > 2007
Setting up a text search server on a Power Systems server running IBM i can be trouble free.
Nick Lawrence works for IBM in Rochester, Minn., on DB2 for IBM i. His responsibilities include catalog support, DB2 object management and full-text search. Nick can be reached at ntl@us.ibm.com.
More Articles From Nick Lawrence
Advertisement
Search our new 2012 Buyer's Guide.
Advertisement
Maximize your IT investment with monthly information from THE source...IBM Systems Magazine EXTRA eNewsletter. SUBSCRIBE NOW.
View past IBMi EXTRAs here
Related Articles
Cover Story | How metadata can reduce query and report complexity
E-Newsletter | Introducing DB2 for i Adaptive Query Processing
E-Newsletter | DB2 for i Learning Statistics Engine works smarter