Fastest way to use solrj

Tim Terlegård Mon, 18 Jan 2010 23:53:09 -0800

There are a few ways to use solrj. I just learned that I can use the
javabin format to get some performance gain, but when I try the binary
format nothing is added to the index. This is how I am trying to use it:


    server = new CommonsHttpSolrServer("http://localhost:8983/solr")
    server.setRequestWriter(new BinaryRequestWriter())
    request = new UpdateRequest()
    request.setAction(UpdateRequest.ACTION.COMMIT, true, true)
    request.setParam("stream.file", "/tmp/data.bin")
    request.process(server)

Should this work? Could there be something wrong with the file? I
haven't found a good reference for how to create a javabin file, but
by reading the source code I came up with this (groovy code):

    fieldId = new NamedList()
    fieldId.add("name", "id")
    fieldId.add("val", "9-0")
    fieldId.add("boost", null)
    fieldText = new NamedList()
    fieldText.add("name", "text")
    fieldText.add("val", "Some text")
    fieldText.add("boost", null)
    fieldNull = new NamedList()
    fieldNull.add("boost", null)
    doc = [fieldNull, fieldId, fieldText]
    docs = [doc]
    root = new NamedList()
    root.add("docs", docs)
    fos = new FileOutputStream("data.bin")
    new JavaBinCodec().marshal(root, fos)

I haven't found any examples of using stream.file like this with a
binary file. Is it supported? Is it better/faster to use
StreamingUpdateSolrServer and send everything over HTTP instead? Would
code for that look something like this?

    while (moreDocs) {
        xmlDoc = readDocFromFileUsingSaxParser()
        doc = new SolrInputDocument()
        doc.addField("id", "9-0")
        doc.addField("text", "Some text")
        server.add(doc)
    }

To me it instinctively looks as if stream.file would be faster because
it doesn't have to use HTTP and it doesn't have to create a bunch of
SolrInputDocument objects.

/Tim
