On 8/16/2011 7:14 AM, Erick Erickson wrote:
What have you tried and what doesn't it do that you want it to do?

This works, instantiating the StreamingUpdateSolrServer (server) and the JDBC connection/SQL statement are left as exercises for the reader<G>:

    while (rs.next()) {
      SolrInputDocument doc = new SolrInputDocument();

      String id = rs.getString("id");
      String title = rs.getString("title");
      String text = rs.getString("text");

      doc.addField("id", id);
      doc.addField("title", title);
      doc.addField("text", text);

      docs.add(doc);
      ++counter;
      ++total;
      if (counter > 100) { // Completely arbitrary, just batch up more than one document for throughput!
        server.add(docs);
        docs.clear();
        counter = 0;
      }
    }

I've implemented a basic loop with the structure you've demonstrated, but it doesn't do anything yet with SolrInputDocument or SolrDocumentList. I figured there would be a way to avoid going through the field list one by one, but what you've written suggests that the field-by-field method is required. I can live with that.
It does look like addField just takes an Object, so hopefully I can create a loop that determines the type of each field from the JDBC metadata, retrieves the correct Java type from the ResultSet, and inserts it. I imagine that everything still works if you happen to insert a field that doesn't exist in the index. This must be how the DIH does it, so I was hoping that the DIH might expose a method that takes a ResultSet and produces a SolrDocumentList. I still have to take a deeper look at the source and documentation.
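A minimal sketch of the metadata-driven loop described above, using only standard JDBC calls. The class name ResultSetRowMapper and the method rowToFields are hypothetical and are not part of SolrJ or the DIH. It assumes the SQL column labels match the Solr field names and that SQL NULL values can simply be skipped.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of SolrJ or the DIH. */
final class ResultSetRowMapper {

    private ResultSetRowMapper() {}

    /**
     * Copies the current row of rs into a name -> value map, using the
     * JDBC metadata to discover the columns instead of naming them by hand.
     */
    static Map<String, Object> rowToFields(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        Map<String, Object> fields = new LinkedHashMap<>();
        // JDBC column indexes start at 1, not 0.
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            // getObject lets the driver choose the Java type for the column.
            Object value = rs.getObject(i);
            if (value != null) { // assumption: SQL NULLs are simply omitted
                fields.put(meta.getColumnLabel(i), value);
            }
        }
        return fields;
    }
}
```

Inside the while (rs.next()) loop, each entry of the returned map could then be passed to doc.addField(name, value) in place of the three hand-written addField calls. Some driver-specific value types may still need converting before Solr accepts them, so treat this as a starting point only.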
Thanks for the help so far. I can get a little more implemented now.

Shawn
