Word search in SQL
The current version of manitou-mail uses C++ code inside the interface to deal with the inverted word index. There is also a Perl version in the Manitou::Words module (see sub search), but so far it wasn’t possible to issue a search directly from inside an SQL query, making it hard to combine the results with other criteria.
I’m glad to say that it’s now possible, by using a new wordsearch() function implemented in pl/pgsql.
This opens up the possibility of doing some interesting custom queries, such as for example:
SELECT mail_id,subject,msg_date FROM mail m JOIN (SELECT * FROM wordsearch(array['foo','bar']) AS id) s ON (m.mail_id=s.id) WHERE m.status&16= AND msg_date>now()-'1 year'::INTERVAL; |
This query retrieves the messages that contain both ‘foo’ and ‘bar’, excluding those that are in the trashcan, and those that are more than one year old.
This should be pretty fast, too. This one runs for me in 0.4s on a database containing about 47000 messages.
Not only this kind of query can be launched by any custom program using the mail database, but it can also be saved inside the manitou-mail user interface as a “user query” and be accessible from the quick selection panel.
The code will be shipped with future versions of manitou, but in the meantime the function source code is available in the wiki.