There are a few ways to use SolrJ. I just learned that I can use the
javabin format to get some performance gain, but when I try the binary
format nothing is added to the index. This is how I am trying to use it:
// Use the javabin request writer instead of the default XML one
server = new CommonsHttpSolrServer("http://localhost:8983/solr")
server.setRequestWriter(new BinaryRequestWriter())
// Ask Solr to read the update from a file on the server, then commit
request = new UpdateRequest()
request.setAction(UpdateRequest.ACTION.COMMIT, true, true)
request.setParam("stream.file", "/tmp/data.bin")
request.process(server)
Should this work? Could there be something wrong with the file? I
haven't found a good reference for how to create a javabin file, but
by reading the source code I came up with this (Groovy code):
fieldId = new NamedList()
fieldId.add("name", "id")
fieldId.add("val", "9-0")
fieldId.add("boost", null)
fieldText = new NamedList()
fieldText.add("name", "text")
fieldText.add("val", "Some text")
fieldText.add("boost", null)
fieldNull = new NamedList()
fieldNull.add("boost", null)
doc = [fieldNull, fieldId, fieldText]
docs = [doc]
root = new NamedList()
root.add("docs", docs)
fos = new FileOutputStream("data.bin")
new JavaBinCodec().marshal(root, fos)
fos.close()
I haven't found any examples of using stream.file like this with a
binary file. Is it supported? Or is it better/faster to use
StreamingUpdateSolrServer and send everything over HTTP instead? Would
the code for that look something like this?
while (moreDocs) {
    xmlDoc = readDocFromFileUsingSaxParser()
    doc = new SolrInputDocument()
    // placeholder values; the real ones would come from xmlDoc
    doc.addField("id", "9-0")
    doc.addField("text", "Some text")
    server.add(doc)
}
My instinct is that stream.file would be faster, because it doesn't
have to send the documents over HTTP and it doesn't have to create a
bunch of SolrInputDocument objects.
/Tim