I've spent a few hours tinkering with a Ruby ActiveRecord plugin that
indexes database-backed models into Solr, removes them from the index,
and searches them. The results are:
$ script/console
>> Book.new(:title => "Solr in Action", :author => "Yonik & Hoss").save
=> true
>> Book.new(:title => "Lucene in Action", :author => "Otis & Erik").save
=> true
>> action_books = Book.find_by_solr("action")
=> [#<Book:0x2406db0 @attributes={"title"=>"Solr in Action",
"author"=>"Yonik & Hoss", "id"=>"21"}>, #<Book:0x2406d74 @attributes=
{"title"=>"Lucene in Action", "author"=>"Otis & Erik", "id"=>"22"}>]
>> action_books = Book.find_by_solr("actions") # to show stemming
=> [#<Book:0x279ebbc @attributes={"title"=>"Solr in Action",
"author"=>"Yonik & Hoss", "id"=>"21"}>, #<Book:0x279eb80 @attributes=
{"title"=>"Lucene in Action", "author"=>"Otis & Erik", "id"=>"22"}>]
>> Book.find_by_solr("yonik OR otis") # to show QueryParser boolean expressions
=> [#<Book:0x2793adc @attributes={"title"=>"Solr in Action",
"author"=>"Yonik & Hoss", "id"=>"21"}>, #<Book:0x2793aa0 @attributes=
{"title"=>"Lucene in Action", "author"=>"Otis & Erik", "id"=>"22"}>]
My model looks like this:
class Book < ActiveRecord::Base
  acts_as_solr
end
(ain't ActiveRecord slick?!)
acts_as_solr adds save and destroy hooks. All model attributes are
sent to Solr like this:
>> action_books[0].to_solr_doc.to_s
=> "<doc><field name='id'>Book:21</field><field name='type'>Book</field>
<field name='pk'>21</field><field name='title_t'>Solr in Action</field>
<field name='author_t'>Yonik & Hoss</field></doc>"
The Solr id has the form <model_name>:<primary_key>. The type field
holds the model name and is AND'd onto queries to narrow them to the
requesting model. The pk field is the primary key of the database
table, and the remaining attributes are named with an _t suffix to
leverage Solr's dynamic field capability. All _t fields are copied
into the default search field, "text".
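To make the naming scheme concrete, here is a minimal sketch that is
independent of ActiveRecord and Solr. The helpers solr_fields and
narrowed_query are hypothetical names introduced only for illustration;
the plugin itself builds a REXML document and composes the query inline.

```ruby
# Hypothetical helper (not part of the plugin): maps a model name and a
# plain attribute hash to the Solr field names described above.
def solr_fields(model_name, attributes)
  pk = attributes["id"].to_s
  fields = {
    "id"   => "#{model_name}:#{pk}", # unique across all models
    "type" => model_name,            # AND'd onto queries to narrow by model
    "pk"   => pk                     # primary key of the database row
  }
  attributes.each_pair do |key, value|
    next if key.to_s == "id"
    fields["#{key}_t"] = value.to_s  # _t suffix selects the dynamic text field
  end
  fields
end

# Hypothetical helper: wraps the user's query and restricts it to one model.
def narrowed_query(q, model_name)
  "(#{q}) AND type:#{model_name}"
end

book   = { "id" => 21, "title" => "Solr in Action", "author" => "Yonik & Hoss" }
fields = solr_fields("Book", book)
query  = narrowed_query("yonik OR otis", "Book")
```

Wrapping the user's query in parentheses keeps an expression such as
"yonik OR otis" from binding to the type clause in an unintended way.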
At this point it is extremely basic, with no configurability, and there
are lots of issues to address before it becomes something robust and
general purpose. But as a proof of concept I'm pleased at how easy
it was to write this hook.
I'd like to commit this to the Solr repository. Any objections?
Once committed, folks will be able to use "script/plugin install ..."
to install the Ruby side of things, and using a binary distribution
of Solr's example application and a custom solr/conf directory (just
for schema.xml) they'd be up and running quite quickly. If ok to
commit, what directory should I put things under? How about just
"ruby"?
I currently do not foresee having a lot of time to spend on this, but
I do feel quite strongly that having an "acts_as_solr" hook into
ActiveRecord will really lure in a lot of Rails developers. I'm sure
there will be plenty that will not want a hybrid Ruby/Java
environment, and for them there is the ever improving Ferret
project. Ferret, however, would still need layers added on top of it
to achieve all that Solr provides, so Solr is where I'm at now.
Despite my time constraints, I'm volunteering to bring this prototype
to a documented and easily usable state, and manage patches submitted
by savvy users to make it robust.
Thoughts?
Erik
p.s. For the really die-hard bleeding edgers, the complete
acts_as_solr code is pasted below. You can put it into a Rails
project as vendor/plugins/acts_as_solr.rb, along with a simple
one-line init.rb in vendor/plugins containing require 'acts_as_solr'.
Sheepishly, here's the hackery....
--------
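require 'active_record'
require 'rexml/document'
require 'net/http'
require 'uri'
require 'erb' # for ERB::Util.url_encode in find_by_solr

# POST body to the local Solr instance: /solr/select for searches,
# /solr/update for adds, deletes, and commits. Returns the response body.
def post_to_solr(body, mode = :search)
  url = URI.parse("http://localhost:8983")
  post = Net::HTTP::Post.new(mode == :search ? "/solr/select" : "/solr/update")
  post.body = body
  post.content_type = 'application/x-www-form-urlencoded'
  response = Net::HTTP.start(url.host, url.port) do |http|
    http.request(post)
  end
  response.body
end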
module SolrMixin
  module Acts #:nodoc:
    module ARSolr #:nodoc:
      def self.included(base)
        base.extend(ClassMethods)
      end

      module ClassMethods
        def acts_as_solr(options={}, solr_options={})
          # configuration = {}
          # solr_configuration = {}
          # configuration.update(options) if options.is_a?(Hash)
          # solr_configuration.update(solr_options) if solr_options.is_a?(Hash)
          after_save :solr_save
          after_destroy :solr_destroy
          include SolrMixin::Acts::ARSolr::InstanceMethods
        end

        def find_by_solr(q, options = {}, find_options = {})
          # narrow the query to documents of this model
          q = "(#{q}) AND type:#{self.name}"
          response = post_to_solr("q=#{ERB::Util.url_encode(q)}&wt=ruby&fl=pk")
          # wt=ruby returns a Ruby hash literal; eval is only safe
          # against a trusted Solr instance
          data = eval(response)
          docs = data['response']['docs']
          return [] if docs.size == 0
          ids = docs.collect {|doc| doc['pk']}
          conditions = [ "#{self.table_name}.id in (?)", ids ]
          self.find(:all, :conditions => conditions)
        end
      end
      module InstanceMethods
        def solr_id
          "#{self.class.name}:#{self.id}"
        end

        # add or update this record in the index
        def solr_save
          logger.debug "solr_save: #{self.class.name} : #{self.id}"
          xml = REXML::Element.new('add')
          xml.add_element to_solr_doc
          post_to_solr(xml.to_s, :update)
          solr_commit
          true
        end

        # remove from index
        def solr_destroy
          logger.debug "solr_destroy: #{self.class.name} : #{self.id}"
          post_to_solr("<delete><id>#{solr_id}</id></delete>", :update)
          solr_commit
          true
        end

        # <optimize> also commits pending changes, but it is more
        # expensive than a plain <commit/>
        def solr_commit
          post_to_solr('<optimize waitFlush="false" waitSearcher="false"/>', :update)
        end

        # convert instance to Solr document
        def to_solr_doc
          logger.debug "to_doc: creating doc for class: #{self.class.name}, id: #{self.id}"
          doc = REXML::Element.new('doc')
          # Solr id is <classname>:<id> to be unique across all models
          doc.add_element field("id", solr_id)
          doc.add_element field("type", self.class.name)
          doc.add_element field("pk", self.id.to_s)
          # iterate through the fields and add them to the document
          self.attributes.each_pair do |key,value|
            # _t is appended as a dynamic "text" field for Solr
            doc.add_element field("#{key}_t", value.to_s) unless key.to_s == "id"
          end
          doc
        end

        # build a <field name="...">value</field> element
        def field(name, value)
          field = REXML::Element.new("field")
          field.add_attribute("name", name)
          field.add_text(value)
          field
        end
      end
    end
  end
end
# reopen ActiveRecord and include all the above to make
# them available to all our models if they want it
ActiveRecord::Base.class_eval do
  include SolrMixin::Acts::ARSolr
end
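
One of the issues still to address: the SQL "id in (?)" lookup in
find_by_solr is not guaranteed to return rows in the order Solr ranked
them. Below is a minimal sketch of restoring that order after the
database fetch. The helper order_by_solr_rank and the Record struct are
hypothetical stand-ins for illustration; it only assumes each record
responds to id and that Solr returns the pk values in rank order.

```ruby
# Hypothetical helper (not part of the plugin): reorders loaded records
# so they follow the pk order returned by Solr.
def order_by_solr_rank(records, ranked_pks)
  by_pk = {}
  records.each { |record| by_pk[record.id.to_s] = record }
  # pks without a matching row (e.g. a stale index entry) are dropped
  ranked_pks.map { |pk| by_pk[pk.to_s] }.compact
end

# Stand-in for an ActiveRecord model instance
Record = Struct.new(:id, :title)

rows   = [Record.new(21, "Solr in Action"), Record.new(22, "Lucene in Action")]
ranked = order_by_solr_rank(rows, ["22", "21", "99"])
titles = ranked.map { |record| record.title }
```

Comparing ids as strings sidesteps the mismatch between the string pk
stored in Solr and the integer id on the database side.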