On 8/16/2011 1:12 PM, Shawn Heisey wrote:
On 8/16/2011 11:23 AM, Erick Erickson wrote:
The problem with anything "automatic" is that I don't see how it could know
which fields in the document to map DB columns to. Unless you had
fields that exactly matched column names, it would be iffy...

I assume DIH actually does something like this, but don't know any way
of having SolrJ automagically do this.

At root these kinds of things don't generalize well, but that doesn't mean
that there's not a good case for doing this.

In my case, the Solr field names are in perfect sync with the database field names. My DIH config doesn't mention any fields by name; it just passes them as-is and lets the schema handle everything. I'm perfectly OK with handling everything myself in my code, but if someone had already invented the wheel, there's no sense in designing a new one. :)

Thanks for all your help, Erick.

Here's what I've ended up with in my method that takes a ResultSet and puts the data into Solr. I have to get a testbed set up before I can actually test this code, which will take me a while. I'm inviting comment now, knowing it might have bugs. Eclipse is happy with it, but that doesn't mean it works. :)

    /**
     * Takes an SQL ResultSet and adds the documents to solr. Does it in batches
     * of fetchSize.
     *
     * @param rs
     * @throws SQLException
     * @throws IOException
     * @throws SolrServerException
     */
    private long addResultSet(ResultSet rs) throws SQLException,
            SolrServerException, IOException
    {
        long count = 0;
        int innerCount = 0;
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        ResultSetMetaData rsm = rs.getMetaData();
        int numColumns = rsm.getColumnCount();
        String[] colNames = new String[numColumns + 1];

        for (int i = 1; i < (numColumns + 1); i++)
        {
            colNames[i] = rsm.getColumnName(i);
        }

        while (rs.next())
        {
            count++;
            innerCount++;

            SolrInputDocument doc = new SolrInputDocument();
            for (int j = 1; j < (numColumns + 1); j++)
            {
                Object f;
                switch (rsm.getColumnType(j))
                {
                    case Types.BIGINT:
                    {
                        f = rs.getLong(j);
                        break;
                    }
                    case Types.INTEGER:
                    {
                        f = rs.getInt(j);
                        break;
                    }
                    case Types.DATE:
                    {
                        f = rs.getDate(j);
                        break;
                    }
                    case Types.FLOAT:
                    {
                        f = rs.getFloat(j);
                        break;
                    }
                    case Types.DOUBLE:
                    {
                        f = rs.getDouble(j);
                        break;
                    }
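                    case Types.TIME:
                    {
                        // getDate() would drop the time of day; use getTime().
                        f = rs.getTime(j);
                        break;
                    }
                    case Types.TIMESTAMP:
                    {
                        // Without this case, DATETIME/TIMESTAMP columns fall
                        // through to getString().
                        f = rs.getTimestamp(j);
                        break;
                    }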
                    case Types.BOOLEAN:
                    {
                        f = rs.getBoolean(j);
                        break;
                    }
                    default:
                    {
                        f = rs.getString(j);
                    }
                }
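                // The primitive getters return 0/false for SQL NULL, so
                // check wasNull() and leave NULL columns out of the document.
                if (!rs.wasNull())
                {
                    doc.addField(colNames[j], f);
                }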
            }
            docs.add(doc);

            /**
             * When we reach fetchSize, index the documents and reset the inner
             * counter.
             */
            if (innerCount == IdxStatic.fetchSize)
            {
                solrCore.add(docs);
                docs.clear();
                innerCount = 0;
            }
        }

        /**
         * If the outer loop ended before the inner loop reset, index the
         * remaining documents.
         */
        if (innerCount != 0)
        {
            solrCore.add(docs);
        }
        return count;
    }
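
The batching part of the method above (send every fetchSize documents, then flush the remainder) can be tried on its own before a testbed exists. What follows is only a minimal sketch of that pattern with no JDBC or SolrJ dependency. The names BatchSketch, addInBatches and the sink callback are hypothetical stand-ins, not part of SolrJ; in the real method the sink would be the solrCore.add(docs) call. It also assumes Java 8 or later for the method reference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class BatchSketch
{
    /**
     * Drains items from the iterator and hands them to sink in lists of at
     * most batchSize. Returns the total number of items seen.
     * (Hypothetical helper; sink stands in for solrCore.add(docs).)
     */
    static <T> long addInBatches(Iterator<T> items, int batchSize,
            Consumer<List<T>> sink)
    {
        if (batchSize < 1)
        {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        long count = 0;
        List<T> batch = new ArrayList<T>(batchSize);
        while (items.hasNext())
        {
            batch.add(items.next());
            count++;
            // The list size doubles as the inner counter.
            if (batch.size() == batchSize)
            {
                // Hand over a copy so clearing the batch cannot affect the sink.
                sink.accept(new ArrayList<T>(batch));
                batch.clear();
            }
        }
        // Flush whatever is left after the last full batch.
        if (!batch.isEmpty())
        {
            sink.accept(new ArrayList<T>(batch));
        }
        return count;
    }

    public static void main(String[] args)
    {
        List<List<Integer>> sent = new ArrayList<List<Integer>>();
        long total = addInBatches(
                Arrays.asList(1, 2, 3, 4, 5).iterator(), 2, sent::add);
        System.out.println(total + " items in " + sent.size()
                + " batches: " + sent);
    }
}
```

Using the size of the pending list as the counter removes the need for a separate innerCount, and the final isEmpty() check plays the same role as the innerCount != 0 test after the loop.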
