Intended Audience
This tutorial is intended for developers using MySQL
(http://www.MySQL.com/) and PHP
(http://www.php.net) who want to create a searchable database of some sort of textual data. It will focus on the Full-text capabilities presented by MySQL, moving into the Boolean opportunities that are presented in the latest alpha version, 4.1, of MySQL.
Overview
Using directories to group articles by category is a great way to help people to navigate through many articles. At some point, however, someone will want to find all the articles that pertain to a certain topic that may not have a directory of it’s own, or may span many directories . This is what the search engine is for.
Learning Objectives
In this tutorial, you will learn:
- How to modify your current database to accommodate Full-text searching
- How to use a simple Full-text search to quickly gather relevant responses
- How to execute a more complex Full-text search with Boolean capabilities
- Tips on what to do and what not to do, as well as the security implications of some of the example scripts presented.
Definitions
MySQL – An Open Source database that is used by many PHP developers for it’s support and
speed, as well as because it’s free.
Full-text – Built in functionality in MySQL that allows users to search through certain
tables for matches to a string.
Boolean Search – A search which allows users to narrow their results through the use of
Boolean operators.
Boolean Operators – A deductive logical system by which a user can narrow
results through the use of AND, OR, XOR, and other
operators.
Background Information
Before the advent of the search engine, users had to search manually through dozens – or hundreds – of articles and tidbits to find the ones that were right for them. Nowadays, in our more user-centered world, we expect the results to come to the user, not the other way around. The search engine gets the computer to do the work for the user.
Prerequisites
- MySQL version 3.23.23 or better for the simple Full-text searching
- MySQL version 4.1 alpha or better for the complex Boolean searching
- PHP & A knowledge thereof.
Synopsis
Let’s start with a quick review of our situation:
We have a database that contains articles. We might create a table of database contents using a statement like this:
CREATE TABLE articles (body
TEXT, title VARCHAR(250), id INT NOT NULL auto_increment, PRIMARY
KEY(id);
Let’s say we have about 100 of these articles, covering various topics: MySQL, PHP, and various other topics of that sort. How do the users find the tutorials they want? Remember, we need to bring the results to
the user. This is going to be a search engine operation.
Initial Ideas
When I started to work with my first database which was only a tenth of the size, my MySQL query went something like this:
SELECT * FROM articles WHERE body LIKE '%$keyword%';
This was slow and inefficient.Every time someone searched for an article, they got far too many results, and as the database grew the system became downright shameful.
So what is the solution? It’s right here: Full-text Searching.
The Solution: Setup
Full-text Search is a feature introduced to MySQL in version 3.23.23. This is how I used it to fix my problem:
I started out with an update to my table:
ALTER TABLE articles ADD FULLTEXT(body, title);
This set ups our Full-text index. The (body, title) part tells us that we can search the body and title for keywords later on. We’ll find out how to use this later, once we’ve overcome a potential problem.
In my original database BLOB was my datatype for the body of the article. What’s the problem, you ask? BLOBs are meant primarily for binary data. What use is searching binary data? MySQL has been programmed not to index BLOB datatypes for Full-text searching. If you try to index BLOB
datatypes, you get an Error 140.
The fix for this is simple:
ALTER TABLE articles MODIFY body TEXT;
That switches datatype from BLOB to TEXT, thus making a useful column for searching.