Relational Database Tables Indexed With SOLR

May 25 2011

Let’s say you have a gallery and it consists of three database tables (photos, category and gallery). Each of these tables has lots of related data, obviously…

By following the old database routine, I started indexing data from top to bottom (from gallery to photos table).

But this resulted in a weird behavior where SOLR would index gallery id and name, and only the first category; It simply ignored all other gallery categories and photos.

After spending days mind boggling what could go wrong, it suddenly occurred to me that the proper way to index data into SOLR would be to index is “opposite direction”, from bottom to top that is (from photos to gallery).

It was a great relief to finally see all relational data properly indexed :)

The example that bothered me was:

MySQL schema

photos
- id
- categoryId
- views
- url
- title

category
- id
- name
- galleryId

gallery
- id
- name

SOLR db-data-config.xml

<dataConfig>
	<dataSource
	    type="JdbcDataSource"
	    driver="com.mysql.jdbc.Driver"
	    url="jdbc:mysql://localhost/database"
	    user="username"
	    password="password"/>

	<document name="gallery">

		<entity rootEntity="true" name="photos" query="select *, rand() as uuid from photos">
		<field name="photo_id" column="id" />
		<field name="photo_categoryId" column="categoryId" />
		<field name="photo_views" column="views" />
		<field name="photo_url" column="url" />
		<field name="photo_title" column="title" />

			<entity name="category" query="select * from category where id='${photos.categoryId}'">
			<field name="category_id" column="id" />
			<field name="category_name" column="name" />
			<field name="category_galleryId" column="galleryId" />

				<entity name="gallery" query="select * from gallery where id='${category.galleryId}'">
				<field name="gallery_id" column="id" />
				<field name="gallery_name" column="name" />
				</entity>

			</entity>

		</entity>

	</document>

</dataConfig>
Tags: , , ,

Comments Off

Comments are closed at this time.

Performance Optimization WordPress Plugins by W3 EDGE