I would be interested in hearing about ways to improve the algorithm. For now
I run a very straightforward Lucene query inside a loop to get the docIds.
Here's what I did to get it working (favsBeans are the objects returned from a
query of the second core), but there is probably a better way to do it:
private int[] getDocIdsFromPrimaryKey(SolrQueryRequest req, List<Favorites> favsBeans)
        throws ParseException {
    // Open the index directory of the core that is handling this request.
    String indexDir = req.getCore().getIndexDir();
    FSDirectory index = null;
    try {
        index = FSDirectory.open(new File(indexDir));
    } catch (IOException e) {
        throw new ParseException("IOException, cannot open the index at: "
                + indexDir + " " + e.getMessage());
    }
    int[] docIds = new int[favsBeans.size()];
    int i = 0;
    for (Favorites favBean : favsBeans) {
        // One query per primary key; only the top hit is needed.
        String pkQueryString = "resourceId:" + favBean.getResourceId();
        Query pkQuery = new QueryParser(Version.LUCENE_CURRENT,
                "resourceId", new StandardAnalyzer()).parse(pkQueryString);
        IndexSearcher searcher = null;
        TopScoreDocCollector collector = null;
        try {
            searcher = new IndexSearcher(index, true);
            collector = TopScoreDocCollector.create(1, true);
            searcher.search(pkQuery, collector);
        } catch (IOException e) {
            throw new ParseException("IOException, cannot search the index at: "
                    + indexDir + " " + e.getMessage());
        }
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        // A key with no match yields an empty array, so check the length
        // before touching hits[0].
        if (hits != null && hits.length > 0) {
            docIds[i] = hits[0].doc;
            i++;
        }
    }
    // Drop unused slots so unmatched keys do not leave spurious 0 doc IDs.
    docIds = Arrays.copyOf(docIds, i);
    Arrays.sort(docIds);
    return docIds;
}
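The collection step (resolve each key, skip the ones with no match, return the IDs sorted) can be tried without an index. This is only a minimal sketch, assuming a plain Map stands in for the searcher; DocIdCollector, collectDocIds and the sample keys are made-up names for illustration, not part of Solr or Lucene:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DocIdCollector {
    /**
     * Resolves each primary key through the lookup table and returns the
     * matching doc IDs in ascending order. Keys with no entry are skipped,
     * so the result may be shorter than the input list.
     */
    static int[] collectDocIds(List<String> primaryKeys, Map<String, Integer> keyToDocId) {
        int[] docIds = new int[primaryKeys.size()];
        int found = 0;
        for (String key : primaryKeys) {
            Integer docId = keyToDocId.get(key);
            if (docId != null) {
                docIds[found++] = docId;
            }
        }
        // Trim to the number of matches before sorting.
        int[] result = Arrays.copyOf(docIds, found);
        Arrays.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: the Map plays the role of the first core.
        Map<String, Integer> keyToDocId = new HashMap<String, Integer>();
        keyToDocId.put("res-a", 7);
        keyToDocId.put("res-b", 3);
        int[] ids = collectDocIds(Arrays.asList("res-a", "missing", "res-b"), keyToDocId);
        System.out.println(Arrays.toString(ids)); // prints [3, 7]
    }
}
```

Trimming before the sort matters: if the array kept its original length, every unmatched key would leave a 0 that sorts to the front and looks like a real doc ID.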
-----Original Message-----
From: Erick Erickson [mailto:[email protected]]
Sent: 02 December 2010 13:46
To: [email protected]
Subject: Re: Return Lucene DocId in Solr Results
Sounds good, especially because your old scenario was fragile. The doc IDs in
your first core could change as a result of a single doc deletion and
optimize. So the doc IDs stored in the second core would then be wrong...

Your user-defined unique key is definitely a better way to go. There are some
tricks you could try if there are performance issues...
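
To make the fragility concrete, here is a toy model in plain Java with no Lucene involved. It assumes a doc ID is simply a document's position in the index and that an optimize compacts away deleted slots, which is a simplification of what Lucene really does; DocIdShiftDemo and its methods are made up for illustration and are not Lucene or Solr APIs:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy index: a document's "doc ID" is just its position in the list. */
final class DocIdShiftDemo {
    private final List<String> keys = new ArrayList<String>();      // primary key per slot
    private final List<Boolean> deleted = new ArrayList<Boolean>(); // deletion flag per slot

    /** Appends a document and returns the doc ID (position) it was given. */
    int add(String primaryKey) {
        keys.add(primaryKey);
        deleted.add(Boolean.FALSE);
        return keys.size() - 1;
    }

    /** Marks a document as deleted; positions stay unchanged until optimize(). */
    void delete(String primaryKey) {
        int id = docIdOf(primaryKey);
        if (id >= 0) {
            deleted.set(id, Boolean.TRUE);
        }
    }

    /** Compacts the index: deleted slots are dropped and later documents shift down. */
    void optimize() {
        for (int i = keys.size() - 1; i >= 0; i--) {
            if (deleted.get(i)) {
                keys.remove(i);
                deleted.remove(i);
            }
        }
    }

    /** Returns the current doc ID for a primary key, or -1 if there is no live match. */
    int docIdOf(String primaryKey) {
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i) && keys.get(i).equals(primaryKey)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        DocIdShiftDemo index = new DocIdShiftDemo();
        index.add("res-a");
        index.add("res-b");
        int stored = index.add("res-c"); // the ID a second core would have saved
        index.delete("res-a");
        index.optimize();
        System.out.println("stored doc ID:  " + stored);                  // 2
        System.out.println("current doc ID: " + index.docIdOf("res-c"));  // 1
    }
}
```

After one deletion and an optimize, the saved ID 2 no longer points at res-c, while a lookup by the primary key still finds it at its new position.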
Best
Erick
On Thu, Dec 2, 2010 at 7:47 AM, Lohrenz, Steven
<[email protected]> wrote:
> I know the doc ids from one core have nothing to do with the other. I was
> going to use the docId returned from the first core in the solr results and
> store it in the second core, so that the second core knows about the doc ids
> from the first core. Then, when you query the second core from the Filter in
> the first core, you get back a set of data that includes the docId of the
> first-core document it relates to.
>
> I have backed off from this approach and now have a user-defined primary key
> in the firstCore, which is stored as the reference in the secondCore. When
> the filter performs the search, it queries the firstCore for each primary
> key and gets the lucene docId from the returned doc.
>
> Thanks,
> Steve
>
> -----Original Message-----
> From: Erick Erickson [mailto:[email protected]]
> Sent: 02 December 2010 02:19
> To: [email protected]
> Subject: Re: Return Lucene DocId in Solr Results
>
> On the face of it, this doesn't make sense, so perhaps you can explain a
> bit. The doc IDs from one Solr instance have no relation to the doc IDs
> from another Solr instance. So anything that uses doc IDs from one Solr
> instance to create a filter on another instance doesn't seem to be
> something you'd want to do...
>
> Which may just mean I don't understand what you're trying to do. Can you
> back up a bit and describe the higher-level problem? This seems like it may
> be an XY problem, see:
> http://people.apache.org/~hossman/#xyproblem
>
> Best
> Erick
>
> On Tue, Nov 30, 2010 at 6:57 AM, Lohrenz, Steven
> <[email protected]> wrote:
>
> > Hi,
> >
> > I was wondering how I would go about getting the lucene docid included in
> > the results from a solr query?
> >
> > I've built a QueryParser to query another solr instance and join the
> > results of the two instances through the use of a Filter. The Filter needs
> > the lucene docid to work. This is the only bit I'm missing right now.
> >
> > Thanks,
> > Steve
> >
> >
>