Hi,
Given the following Lucene documents that I’m adding to my index (I expect to
have over 10 million of them, ranging in size from 1 KB to 50 KB):
<add>
<doc>
<field name="doc_type">PDF</field>
<field name="title">Some name</field>
<field name="summary">Some summary</field>
<field name="owner">Who owns this</field>
<field name="price">10</field>
<field name="isbn">1234567890</field>
</doc>
<doc>
<field name="doc_type">DOC</field>
<field name="title">Some name</field>
<field name="summary">Some summary</field>
<field name="owner">Who owns this</field>
<field name="price">10</field>
<field name="isbn">0987654321</field>
</doc>
<!-- and more docs -->
</add>
My question is this: what Lucene search syntax will give me back results the
fastest? If my user is interested in finding data within the “title” and “owner”
fields, restricted to “doc_type” “DOC”, should I build my Lucene search syntax as:
1) skyfall ian fleming AND doc_type:DOC
2) title:(skyfall OR ian OR fleming) owner:(skyfall OR ian OR fleming) AND
doc_type:DOC
3) Something else I don't know about.
Of the 10 million documents I will be indexing, 80% will be of "doc_type" PDF,
and about 10% of type DOC, so please keep that in mind as a factor (if that
will mean anything in terms of which syntax I should use).
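To make option 2 concrete, here is a minimal sketch of how such a query string could be assembled from the user's terms. The helper name build_fielded_query and its parameters are made up for illustration, the terms are assumed to be already escaped for the query parser, and the outer parentheses around the two field clauses are an assumption about the intended grouping (they are not in option 2 as written above).

```python
def build_fielded_query(terms, search_fields, filter_field, filter_value):
    """Build a Lucene-style query string (hypothetical helper).

    Each search field gets the same OR-ed group of terms, and the
    whole group is AND-ed with a single field restriction.
    Terms are assumed to be already escaped for the query parser.
    """
    # e.g. "skyfall OR ian OR fleming"
    ored_terms = " OR ".join(terms)
    # e.g. "title:(...) owner:(...)"
    field_clauses = " ".join(
        f"{field}:({ored_terms})" for field in search_fields
    )
    # Outer parentheses are an assumption: they make the restriction
    # apply to the title and owner clauses together.
    return f"({field_clauses}) AND {filter_field}:{filter_value}"


query = build_fielded_query(
    ["skyfall", "ian", "fleming"], ["title", "owner"], "doc_type", "DOC"
)
print(query)
```

This only builds the string; whether it runs faster than option 1 is exactly what I am asking.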
Thanks in advance,
- MJ