OK, here is my latest code to get the IndexReader from the Solr core. However, it still prints the non-string fields as special characters, even though I do load the schema file here. Please help.

public static void main(String[] args) throws Exception {
    SolrConfig config = new SolrConfig("/Users/yuchen/Work/data/", "solrconfig.xml", null);
    IndexSchema schema = new IndexSchema(config, "schema.xml", null);

    CoreContainer container = new CoreContainer(new SolrResourceLoader(SolrResourceLoader.locateInstanceDir()));
    CoreDescriptor dcore = new CoreDescriptor(container, "solr0", config.getResourceLoader().getInstanceDir());
    dcore.setConfigName(config.getResourceName());
    dcore.setSchemaName(schema.getResourceName());
    SolrCore core = new SolrCore("solr0", "/Users/yuchen/Work/data", config, schema, dcore);
    container.register("solr0", core, false);

    IndexReader reader = core.getSearcher().get().getReader();
    FieldCache cache = FieldCache.DEFAULT;

    int total = reader.numDocs();
    System.out.println("Total documents: " + total);

    for (int i = 0; i < 1; i++) {
        System.out.println("\n=============== Got Node: " + i + " =================");
        Document d = reader.document(i);
        List<Field> fields = d.getFields();

        for (Field f : fields) {
            String name = f.name();
            // For the int/float dynamic fields, this is the value that prints as special characters.
            String val = f.stringValue();
            System.out.println("get field / value: [" + name + "=" + val + "]");
        }
    }

    reader.close();
}

On Sun, Jul 5, 2009 at 7:58 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:

> Yuchen,
>
> schema.xml is a Solr configuration file that you can find in a conf
> directory under Solr home. Please go through the Solr tutorial on the site
> first.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Yuchen Wang <yuc...@trulia.com>
> > To: solr-user@lucene.apache.org
> > Sent: Sunday, July 5, 2009 1:19:12 PM
> > Subject: Re: Problem in parsing non-string dynamic field by using IndexReader
> >
> > Thanks for the reply. However, in the code I posted, where should I load the
> > schema.xml? I just created a Lucene IndexReader directly.
> >
> > On Sun, Jul 5, 2009 at 9:31 AM, Otis Gospodnetic wrote:
> >
> > > Yuchen,
> > >
> > > Make sure the fields you are trying to read are stored (stored="true" in
> > > schema.xml)
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > > ----- Original Message ----
> > > > From: Yuchen Wang
> > > > To: solr-user@lucene.apache.org
> > > > Sent: Sunday, July 5, 2009 12:43:50 AM
> > > > Subject: Problem in parsing non-string dynamic field by using IndexReader
> > > >
> > > > I have a task to parse all documents in a solr index. I use Lucene
> > > > IndexReader to read the index and go through each field from all documents.
> > > > However, for float or int dynamic fields, the stringValue() call always
> > > > returns some special characters. I tried tokenStreamValue, byteValue,
> > > > readerValue, and they return null.
> > > > Following is my method to parse the solr index. My question is, how can I
> > > > get the values from non-string dynamic fields properly?
> > > >
> > > > public static void main(String[] args) throws Exception {
> > > >     IndexReader reader = IndexReader.open("/path/to/my/index/directory");
> > > >
> > > >     int total = reader.numDocs();
> > > >     System.out.println("Total documents: " + total);
> > > >
> > > >     for (int i = 0; i < 1; i++) {
> > > >         Document d = reader.document(i);
> > > >
> > > >         List<Field> fields = d.getFields();
> > > >
> > > >         for (Field f : fields) {
> > > >             String name = f.name();
> > > >             String val = f.stringValue();
> > > >
> > > >             System.out.println("get field / value: [" + name + "=" + val + "]");
> > > >         }
> > > >     }
> > > >
> > > >     reader.close();
> > > > }
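
A possible explanation for the symptom in this thread, offered as a sketch rather than a confirmed diagnosis: if the int and float dynamic fields use one of Solr's sortable numeric types (an assumption, since the schema is not shown), the stored value is a packed, order-preserving string and not decimal text, so stringValue() hands back raw characters. The self-contained example below illustrates that idea for ints. The class name SortableIntSketch and the exact 3-char bit layout are illustrative assumptions modeled on that kind of encoding, not a copy of Solr's code.

```java
// Hypothetical illustration: pack an int into 3 chars so that plain string
// comparison gives the same order as numeric comparison.
final class SortableIntSketch {

    // Encode: flip the sign bit, then split the 32 bits into 8 + 12 + 12.
    static String encode(int val) {
        int shifted = val + Integer.MIN_VALUE; // wraps around, flipping the sign bit
        char[] out = new char[3];
        out[0] = (char) (shifted >>> 24);
        out[1] = (char) ((shifted >>> 12) & 0x0fff);
        out[2] = (char) (shifted & 0x0fff);
        return new String(out);
    }

    // Decode: reassemble the 32 bits and flip the sign bit back.
    static int decode(String s) {
        int val = (s.charAt(0) << 24) | (s.charAt(1) << 12) | s.charAt(2);
        return val - Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        String raw = encode(42);
        // Printing raw directly would show control characters, so show code points.
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            hex.append("U+").append(Integer.toHexString(raw.charAt(i))).append(' ');
        }
        System.out.println("encoded chars: " + hex.toString().trim());
        System.out.println("decoded value: " + decode(raw));
    }
}
```

In this sketch, 42 becomes three chars of which two are unprintable, which matches the "special characters" seen in the output above. With the schema already loaded, as in the latest code, the decoding would normally be delegated to the field's type instead of hand-rolled, along the lines of schema.getFieldType(name).toExternal(f); treat that exact call as something to verify against the Solr version in use.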