Sep 17, 2014

The solr search engine runs an indexing process on each document when it is first added to a collection, and again every time the document is updated. This process analyzes the content of each field in the document (for example, splitting the values of text fields into tokens) according to the document structure defined in the schema.xml configuration file.

Normally, this process needs to be carried out only once for every new or updated document. But sometimes it is necessary to regenerate the index for the whole collection, for instance when schema.xml is edited to change the structure of the documents. This post explains how to carry out this re-indexing operation, and some considerations to take into account.
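
As a rough picture of why a schema change forces a full re-index, here is a minimal Python sketch of a toy inverted index. It is not how solr is implemented: `tokenize`, `build_index`, and the sample documents are hypothetical names made up for illustration, and the list of indexed fields only stands in for the field definitions in schema.xml.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs, text_fields):
    """Map each (field, token) pair to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field in text_fields:
            for token in tokenize(doc.get(field, "")):
                index[(field, token)].add(doc_id)
    return index


# Hypothetical sample documents, keyed by document id.
docs = {
    "1": {"title": "Introduction to solr", "body": "Indexing documents"},
    "2": {"title": "solr collections", "body": "Creating a new collection"},
}

# Initial "schema": only the title field is analyzed and indexed.
index = build_index(docs, ["title"])

# After the "schema" changes to include the body field, the existing index
# knows nothing about body tokens, so it is rebuilt from the source documents.
reindexed = build_index(docs, ["title", "body"])
```

In this sketch the old index cannot be patched in place, because the tokens of the newly indexed field were never extracted; every document has to be analyzed again.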


 Posted at 10:57 am
Sep 01, 2014

OAuth is a mechanism, originally developed by Twitter, to validate the requests sent to a web server by a client application on behalf of a user.

This article attempts to give a simple explanation of the reasons behind the development of OAuth, and the concepts involved.


 Posted at 5:04 pm
Sep 01, 2014

A previous post in this blog explained how to do multiline searches with preg_match in PHP, to search for text strings that span more than one line.

This feature is also available in the vim editor, but the syntax of the regular expressions differs slightly from that used in PHP. This post summarizes these differences, and explains through some examples the right syntax accepted by vim to do a search across multiple lines.
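
To make the idea of a multiline search concrete, here is a minimal sketch using Python's `re` module rather than PHP or vim; the sample text and pattern are made up for illustration. The underlying issue is shared across engines: by default `.` does not match a newline, so a pattern cannot cross line boundaries unless told to. Python uses the `re.DOTALL` flag for this, PCRE (used by preg_match) has the comparable `s` modifier, and vim provides `\_.` to match any character including the end of a line.

```python
import re

# Hypothetical sample text spanning three lines.
text = "first line START\nmiddle line\nEND last line"

# Without DOTALL, "." stops at a newline, so the pattern cannot cross lines.
single_line = re.search(r"START.*END", text)

# With DOTALL, "." also matches "\n" and the match spans all three lines.
multi_line = re.search(r"START.*END", text, re.DOTALL)
matched = multi_line.group(0) if multi_line else None
```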


 Posted at 3:09 pm
Jul 13, 2014

In the solr search engine, a collection is a set of documents that share the same field structure.

The solr installation package includes a sample collection, “collection1”. In the simplest cases, this collection can be used out of the box, or with small changes to its configuration files. In other cases, however, it may be desirable to create several collections to index different types of documents. This post explains how to create and configure a new collection in a solr installation.


 Posted at 7:19 pm
Jul 05, 2014

Our previous post, Introduction to solr, covered the basics of installing an instance of this search engine, indexing documents, and performing queries on the indexed content.

But using solr efficiently for any purpose other than a demo requires adapting its configuration to the specific type of content that will be indexed.

This post describes the structure of schema.xml, the main configuration file of a solr collection. This file specifies the fields that may appear in a document, and their data types.


 Posted at 7:30 pm