: I need to build an XML-based query with Apache Lucene.
...
: and the query is actually also a document that I need to compare with the
: entire collection. Each attribute has a different similarity metric. e.g.
: description has tf-idf cosine similarity. Time is just the difference and
: "latitude" + "longitude" is compared using the haversine distance.
:
: For now I've only performed searches with simple textual queries such as
: "word1 word2". How can I build more complex queries instead ?
First off, it doesn't actually sound like you have any real requirement to
use "xml-based queries" ... your data just happens to be in XML, and you
want to query using other data that also happens to be in XML.

Second: it's not clear if you want to use the Lucene Java API directly, or
if you want to use Solr -- depending on the answer to that question, I
would suggest you send a (more detailed) description of your problem to
the appropriate user list (java-user@lucene or solr-user@lucene).

In general: mixing text based tf/idf queries with distance based queries
is completely possible -- all things in Lucene are ultimately a Query
object, which decides what Scoring to perform, and Lucene has Query types
that support all of the functionality you've described.
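
To make "mixing" concrete, here is a rough sketch of the arithmetic a combined score boils down to. It is plain Java and deliberately uses no Lucene classes: the class name, the method names, the 1/(1+d) distance decay and the weights are all assumptions made up for illustration, not anything Lucene prescribes. In a real index the equivalent work would happen inside whatever Query objects you combine.

```java
// Hypothetical sketch: plain Java only, no Lucene API involved.
final class MixedScoreSketch {

    // Mean Earth radius in kilometres (an assumed constant for this sketch).
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle (haversine) distance in km between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDPhi = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2);

        double a = sinHalfDPhi * sinHalfDPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLambda * sinHalfDLambda;

        // Clamp so rounding error can never push the asin argument above 1.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Weighted sum of a text relevance score and a distance-based score.
     * The distance is mapped to (0, 1] with 1 / (1 + d), so closer is better.
     * Both the decay function and the weights are illustrative choices.
     */
    static double combine(double textScore, double distanceKm,
                          double textWeight, double geoWeight) {
        double geoScore = 1.0 / (1.0 + distanceKm);
        return textWeight * textScore + geoWeight * geoScore;
    }
}
```

A caller would compute haversineKm between the query document's coordinates and each candidate's, then pass that distance and the text score to combine. A time difference could be folded in the same way with a third weighted term.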
-Hoss