Monday, December 20, 2010

Installing and configuring Apache Solr with MySQL


Apache Solr™ is a high-performance enterprise search server with XML/HTTP and JSON/Python/Ruby APIs, hit highlighting, faceted search, caching, replication, distributed search, database integration, and web admin and search interfaces.

1. Install Java

yum install java


2. Download & Extract Solr

cd /usr/local/src
wget http://www.apache.org/dist//lucene/solr/1.4.1/apache-solr-1.4.1.tgz
tar -xzvf apache-solr-1.4.1.tgz


3. Download MySQL Java connector

wget http://mysql.spd.co.il/Downloads/Connector-J/mysql-connector-java-3.1.14.zip
unzip mysql-connector-java-3.1.14.zip
cd mysql-connector-java-3.1.14
cp mysql-connector-java-3.1.14-bin.jar /usr/local/src/apache-solr-1.4.1/example/lib/
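
Before moving on, it can help to confirm that the connector jar actually landed in Solr's lib directory. The snippet below is a minimal sketch; the function name find_connector_jar is made up for illustration and is not part of Solr or MySQL.

```shell
# Hypothetical helper: print any MySQL Connector/J jars found in a directory.
# Returns non-zero when no matching jar is present.
find_connector_jar() {
    lib_dir=$1
    found=1
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        # With no match the glob stays literal, so test for a real file.
        if [ -f "$jar" ]; then
            echo "$jar"
            found=0
        fi
    done
    return $found
}
```

For example: find_connector_jar /usr/local/src/apache-solr-1.4.1/example/lib || echo "connector jar missing"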


4. Configure MySQL Database
Create a new XML file called data-config.xml and change the connection URL, user, password and query to suit your database. In this example, I am indexing my own table called users.

cd /usr/local/src/apache-solr-1.4.1/example/solr/conf
nano data-config.xml

Add the following content:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://192.168.1.7/mydatabase" user="myuser" password="mypassword"/>
  <document name="content">
    <entity name="node" query="select user_id AS id, mobile AS mobile, user_status AS status FROM users">
      <field column="id" name="id"/>
      <field column="mobile" name="mobile"/>
      <field column="status" name="status"/>
    </entity>
  </document>
</dataConfig>



5. Configure Solr
Edit solrconfig.xml, which is located in the /usr/local/src/apache-solr-1.4.1/example/solr/conf directory. Add the following requestHandler entry if it does not already exist.

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>


6. Now configure Solr’s schema by editing schema.xml in the /usr/local/src/apache-solr-1.4.1/example/solr/conf directory. Delete the existing field definitions and add the following fields as required. The XML format is self-explanatory.

<fields>

  <field name="id" type="string" indexed="true" stored="true" default="NOW" required="true"/>
  <field name="mobile" type="text" indexed="true" stored="true" required="true"/>
  <field name="status" type="text" indexed="false" stored="true" required="false"/>

</fields>

<uniqueKey>id</uniqueKey>

<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>mobile</defaultSearchField>
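
Solr's XML element names are case-sensitive, so an all-lowercase spelling such as requesthandler is not recognized where requestHandler is expected. The sketch below scans a config file for lowercase spellings of the camelCase elements used in this post. The function name check_solr_tag_case and the list of names are my own choices for illustration, not a Solr tool.

```shell
# Hypothetical helper: flag all-lowercase spellings of camelCase Solr elements.
# Prints offending lines with their line numbers; returns 0 if none are found.
check_solr_tag_case() {
    file=$1
    if grep -n -E '</?(dataconfig|datasource|requesthandler|uniquekey|defaultsearchfield)[ >/]' "$file"; then
        return 1
    fi
    return 0
}
```

Run it against each edited file, for example: check_solr_tag_case solrconfig.xml && echo "tag case looks fine"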



7. Start Solr

cd /usr/local/src/apache-solr-1.4.1/example
java -jar start.jar



Performing full or delta indexing
If everything works correctly, you can get Solr to fully index the configured tables by opening the following URL in your browser: http://:8983/solr/dataimport?command=full-import

You can check the status of the command by accessing http://:8983/solr/dataimport
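
The status check can also be scripted instead of done in the browser. The following is a rough sketch that assumes the status response contains an element of the form <str name="status">idle</str> (or busy while an import is running) and that curl is installed. The function names dih_status and wait_for_import are invented for this example.

```shell
# Hypothetical helper: read a DataImportHandler status response on stdin
# and print the value of its "status" element (e.g. idle or busy).
dih_status() {
    sed -n 's/.*<str name="status">\([^<]*\)<\/str>.*/\1/p' | head -n 1
}

# Hypothetical helper: poll the given dataimport URL until the handler
# no longer reports "busy". Checks every 5 seconds.
wait_for_import() {
    url=$1
    while [ "$(curl -s "$url" | dih_status)" = "busy" ]; do
        sleep 5
    done
}
```

For example, with Solr running on the local machine (the host name is an assumption): wait_for_import "http://localhost:8983/solr/dataimport"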

If everything works correctly, you can now search your data from http://:8983/solr/admin/ and the results will be returned in XML format.


To index only the data that has changed since the last full or delta import, request http://:8983/solr/dataimport?command=delta-import
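
If you want to trigger imports from a script or a cron job, a small wrapper that builds the URL keeps typos out of the command name. This is only a sketch: the function name dih_command_url is hypothetical, the host is passed in by the caller, and port 8983 is taken from the URLs above.

```shell
# Hypothetical helper: build a DataImportHandler URL for a given host and command.
# Rejects anything that is not one of the handler's known commands.
dih_command_url() {
    host=$1
    command=$2
    case $command in
        full-import|delta-import|status|reload-config|abort) ;;
        *)
            echo "unknown DataImportHandler command: $command" >&2
            return 1
            ;;
    esac
    echo "http://$host:8983/solr/dataimport?command=$command"
}
```

For example: curl -s "$(dih_command_url localhost delta-import)" > /dev/null. Keep in mind that delta-import relies on a deltaQuery attribute on the entity, which the configuration in step 4 does not define, so it would need to be added before delta imports pick up changed rows.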

You can now access these XML results from your web application. Client APIs are available for Ruby on Rails, PHP, Java and other languages.
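
As a quick check before wiring up a client library, you can query Solr from the command line. The sketch below assumes curl is available and that the XML response carries a <result name="response" numFound="..."> element. The function names solr_search and solr_num_found, the host and the sample query are all illustrative.

```shell
# Hypothetical helper: run a query against Solr's select handler.
# curl's -G and --data-urlencode take care of URL-encoding the query string.
solr_search() {
    host=$1
    query=$2
    curl -s -G "http://$host:8983/solr/select" --data-urlencode "q=$query"
}

# Hypothetical helper: read a Solr XML response on stdin and print
# the number of matching documents.
solr_num_found() {
    sed -n 's/.*<result name="response" numFound="\([0-9]*\)".*/\1/p' | head -n 1
}
```

For example: solr_search localhost "mobile:12345" | solr_num_found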

References
http://www.cabotsolutions.com/2009/05/using-solr-lucene-for-full-text-search-with-mysql-db/

http://www.ipros.nl/2008/12/15/using-solr-with-wordpress/

http://wiki.apache.org/solr/DataImportHandler#head-df246a3aed0bb38297f3449bc35a0bdf38a272b5

http://lucene.apache.org/solr/tutorial.html

1 comment:

Unknown said...

The code posted doesn't work. XML is case-sensitive. requestHandler not requesthandler