On 8/16/2011 7:14 AM, Erick Erickson wrote:
What have you tried and what doesn't it do that you want it to do?

This works; instantiating the StreamingUpdateSolrServer (server) and
the JDBC connection/SQL statement are left as exercises for the
reader <G>:

     while (rs.next()) {
       SolrInputDocument doc = new SolrInputDocument();

       String id = rs.getString("id");
       String title = rs.getString("title");
       String text = rs.getString("text");

       doc.addField("id", id);
       doc.addField("title", title);
       doc.addField("text", text);

       docs.add(doc);
       ++counter;
       ++total;
       // Completely arbitrary, just batch up more than one document for throughput!
       if (counter > 100) {
         server.add(docs);
         docs.clear();
         counter = 0;
       }
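     }
     if (!docs.isEmpty()) { // Send whatever is left over from the final partial batch
       server.add(docs);
     }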

I've implemented a basic loop with the structure you've demonstrated, but it doesn't do anything yet with SolrInputDocument or SolrDocumentList. I had hoped there would be a way to avoid going through the field list one by one, but what you've written suggests that the field-by-field method is required. I can live with that.

It does look like addField just takes an Object for the field value, so hopefully I can create a loop that determines the type of each field from the JDBC metadata, retrieves the correct Java type from the ResultSet, and inserts it. I imagine that everything still works if you happen to insert a field that doesn't exist in the index. This must be how the DIH does it, so I was hoping that the DIH might expose a method that takes a ResultSet and produces a SolrDocumentList. I still have to take a deeper look at the source and documentation.
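
A rough sketch of that metadata-driven loop, using only standard JDBC calls, might look like the following. The class name RowFields and the method rowToFields are hypothetical names made up for illustration. Relying on getObject to choose the Java type for each column, and skipping SQL NULL values, are assumptions rather than anything the DIH is known to do.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ or the DIH.
final class RowFields {

    private RowFields() {
    }

    /**
     * Copies the current row of the ResultSet into an ordered
     * column-label -> value map. The caller is expected to have
     * already positioned the cursor with rs.next().
     */
    static Map<String, Object> rowToFields(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        Map<String, Object> fields = new LinkedHashMap<>();
        // JDBC column indexes start at 1, not 0.
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            // getObject lets the driver pick the Java type for the column.
            Object value = rs.getObject(i);
            // Assumption: a SQL NULL means "leave the field out".
            if (value != null) {
                // getColumnLabel honors "AS" aliases in the SQL statement.
                fields.put(meta.getColumnLabel(i), value);
            }
        }
        return fields;
    }
}
```

Inside the while (rs.next()) loop, each map entry could then be passed to doc.addField(entry.getKey(), entry.getValue()) in place of the hand-written per-field calls. This only helps if the column labels match the field names in the Solr schema.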

Thanks for the help so far, I can get a little more implemented now.

Shawn
