Java and J2EE Tutorials, Jsp and Servlet Tutorials, Spring MVC, Solr, XML, JSON Examples, Hibernate & Struts 2 Hello World projects



Sunday, 27 January 2013

How to use the Solr Data Import Handler to index a MySQL database table

In this article we will go through Importing and indexing My Sql database table data in solr4 using Data Import Handler. In our previous discussions we went through the Integration of solr4 with tomcat, now its time to provide solr4 some data on which the search will be done. In solr4 database table data can be imported and indexed using a number of ways. The most easiest and effective way of doing this is using solr's Data Import Handler. Data Import Handler is not there in solr4 package directly so we need to add a handler entry in solrconfig.xml to let solr4 know about our database configuration.

If you are new to solr4, I recommend you to go through following discussion before we start.

Introduction to Apache Solr 4.0 with Apache Tomcat


In this particular blog i am assuming that you have a basic knowledge of How Solr 4 Works. If you have configured your solr setup correctly than you will be able to see solr dashboard something like this screen.

In case you are not able to see solr dashboard than go through Apache Solr 4.0 with Apache Tomcat 7 in Ubuntu Linux.

Now to import and index data in solr4 you need to follow two step process and everything will be done.


1 Configuring data source


Our first step is to configure data source to do this thing just add a configuration file to your solr-home like this /home/javaguy/solr-home/collection1/conf/data-config.xml

<dataConfig>

<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/indian_team" user="root" password="root" batchSize="1" />
    <document name="players">

        <entity name="players_data" query="SELECT * FROM players_data">
            <field column="player_id" name="id" />
            <field column="player_name" name="player_name" />
<field column="player_age" name="player_age" />
<field column="player_description" name="player_description" />
        </entity>

    </document>
</dataConfig>
This file tells solr about your database credential and the fields of table that you want to import in order to enable search in solr. Document name and entity name can be given any value you want , in field columns add your table columns that you want to be imported as solr indexed data. column attribute represents the column name in your table and name attribute value represents related data fields value in schema.xml.

2 . Adding data import handler in solr-config.xml


Our next and final step is to add a data import handler entry in solr-config.xml to make solr aware of solr-config.xml and to point data source that is going to be imported. This can be done by adding following code in your solr-config.xml like this.


/home/javaguy/solr-home/collection1/conf/solr-config.xml 
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">   

<lst name="defaults">  
<str name="config">data-config.xml</str>
</lst> 

</requestHandler>

Make sure that you have added this handler entry at right place in your config, the best way is to add this just after any other handler entry.

Here you are all done with your data import configuration, now restart the server and go to Dataimport under collection 1, choose full-import from Command drop down, select your entity under Entity drop down and click  Execute Import.  If your import is successful you will see a green entry saying Indexing Completed as shown in the figure below.
By default solr indexed 0 to10 rows from database if you want to index more or less tables than change Start,Rows value before executing import.
We are all done




Here we are all done with importing and indexing data in solr. To check if your data is indexed correctly just go to Query tab under collection1 and execute query with default values and you will see something your indexed data in xml form like this screen.


This is all about importing and indexing data in solr using Data Import Handler. In upcoming blogs we will learn more tips and tricks about solr4 and full text searching.










Thanks for reading !
Being Java Guys Team



25 comments:

  1. Hi,
    I followed your two tutorials :
    setup
    and this one.

    when when I hit execute in the Query tab, there is no results.
    I think you are missing schema.xml configuration in this tutorial.

    Thanks

    ReplyDelete
    Replies
    1. Hi Majid,
      thanks for your reply,as we have some default facets fields in schema.xml already.. we don't need to make changes there for this particular example.

      If you have some more fields in your database and those does not matches to default fields just add that field as facet in schema.xml and you are done.

      regarding search you must try query like 'field:searchKeyword'
      or to check if your data is imported successfully plese use '*:*' as query.



      Thanks

      Delete
    2. Hi Ankush Thakur,

      Read ur Blog & its quite useful for beginners as well and i wanna know something , Is it possible to Index Entire Database( irrespective of tables); i.e. i mean instead of giving a specific table to index,(entity name="XXX" query="SELECT * FROM aspecifictablename">) is it possible to index the Entire Database , which includes all tables, procedures and views , all at once..??? like this.
      (entity name="XXX" query="SELECT * FROM sys.objects">) ..


      Regards
      Mohan M

      Delete
  2. Hi,
    At the end of your article yoiu said :

    In upcoming blogs we will learn more tips and tricks about solr4 and full text searching.

    Have you posted the searcher yet ?
    Thanks.

    ReplyDelete
  3. In dataimport section,no entity values is loading.please help!

    ReplyDelete
  4. In dataimport section "Configuration" link clicking time this error can be display how can this error handling?Can't find resource 'data-config.xml' in classpath or '/var/lib/tomcat6/webapps/solr-4.4.0/example/solr/collection1/conf/', cwd=/var/lib/tomcat6/webapps/solr-4.4.0/example

    ReplyDelete
  5. In dataimport section "Configuration" link clicking time this error can be display how can this error handling?Can't find resource 'data-config.xml' in classpath or '/var/lib/tomcat6/webapps/solr-4.4.0/example/solr/collection1/conf/', cwd=/var/lib/tomcat6/webapps/solr-4.4.0/example

    ReplyDelete
  6. Hi,
    thanks for your post.
    Is there any example in which we can import own XML files into solr using Data Import Handler or any other way?

    ReplyDelete
  7. Hi

    Please let know how to perform a full text search after data import. Currently I see that query like 'field:searchKeyword'
    or '*:*' as query.

    How do we get the query q=searchKeyword working?

    Any inputs here is appreciable.

    ReplyDelete
  8. Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Could not load driver: com.mysql.jdbc.Driver Processing Document # 1
    how can solve this error please help me?

    ReplyDelete
    Replies
    1. just copy your mySql-driver (mySql-collector-java.jar) in lib folder and start solr again

      Delete
    2. From where I can get mySql-collector-java.jar?

      Delete
    3. I have already placed mysql-connector-java-5.1.29.jar at contrib\dataimporthandler.

      my solrconfig.xml contains



      eventhough it still gives error.
      752475 [Thread-15] ERROR org.apache.solr.handler.dataimport.DataImporter û Full
      Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Could not load driver: com.mysql.jdbc.Driver Processing Document # 1

      any suggestion/help?

      Delete
  9. Hello, i'm having a class not found exception on DIH. please help. thnx in advance.

    ReplyDelete
  10. iam learning solr from your great tutorial http://www.beingjavaguys.com.
    but in the situation of fetching data from database i got the following error…please help me how to overcome that error

    SolrCore Initialization Failures

    collection1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load config for solrconfig.xml

    Please check your logs for more information

    Documentation
    Issue Tracker
    IRC Channel
    Community forum
    Solr Query Syntax

    There are no SolrCores running.
    Using the Solr Admin UI currently requires at least one SolrCore.

    Thanks..
    Mahesh

    ReplyDelete
  11. i got the following error…please help me how to overcome that error

    SolrCore Initialization Failures

    collection1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load config for solrconfig.xml

    ReplyDelete
  12. i got the following error during in process of fetching data from mysql database..

    collection1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: RequestHandler init failure ..

    try to answer me..how to solve that error

    ReplyDelete
  13. Hi Nagesh,
    Is it possible to extract and index data to solr using a MySQL query involving multiple joins? I ask this as generally the data is not concentrated over a single table and many tables are linked by different relations and i have a similar situation where i need to replace the MySQL search with Solr, without changing the database as it is used by a lot of other functionalities.

    ReplyDelete
    Replies
    1. of course you can use complex and multiple joins when using solr with Mysql. See one of my data-config.xml files attached. The other option hear is to use a view but I needed to pass url variables in my solr command - ${dataimporter.request.view}














      Delete
    2. hi Brian,
      can u please post your data-config.xml with multiple joins once more.. i cudnt find any attachment here...

      Delete
  14. You are definitely missing a lot of stuff, schema.xml for example...

    In any case Your "tutorial" does not work as is.

    ReplyDelete
  15. And you look like a punk.
    You answer is a bit douchey

    ReplyDelete
  16. can anyone tell about schema.xml for this tutorial??

    ReplyDelete
  17. how to add data import handler solr panel

    ReplyDelete

Like Us on Facebook


Like Us On Google+



Contact

Email: neel4soft@gmail.com
Skype: neel4soft