I have an interesting situation: searching business names where the results should be partially sorted by term position.
Searching for "Kramer Tractors" will not return any matches, as there are no results that exactly match this. However, there are business names that start with Kramer, and there are also business names that contain the word Tractor. One important point is that we don't want the document frequency to influence the score. Ideally we'd like the Kramer matches to appear before the Tractor matches.

At the moment I'm simply boosting the terms, as in Kramer^4 Tractor^2. I've also started playing with the Term Vector Component (TVC), but suspect, from the documentation, that document frequency is causing my results to be ordered not to my liking. If I read the following correctly, Korpan Tractor appears first because tractor has df=35 (kramer has df=72, so its tf-idf is lower).

<lst name="103503">
  <str name="uniqueKey">103503</str>
  <lst name="BUS_BUSINESS_NAME">
    <lst name="korpan">
      <int name="tf">1</int>
      <lst name="positions">
        <int name="position">0</int>
      </lst>
      <lst name="offsets">
        <int name="start">0</int>
        <int name="end">6</int>
      </lst>
      <int name="df">6</int>
      <double name="tf-idf">0.16666666666666666</double>
    </lst>
    <lst name="tractor">
      <int name="tf">1</int>
      <lst name="positions">
        <int name="position">1</int>
      </lst>
      <lst name="offsets">
        <int name="start">7</int>
        <int name="end">14</int>
      </lst>
      <int name="df">35</int>
      <double name="tf-idf">0.02857142857142857</double>
    </lst>
  </lst>
</lst>
<lst name="503457">
  <str name="uniqueKey">503457</str>
  <lst name="BUS_BUSINESS_NAME">
    <lst name="salvage">
      <int name="tf">1</int>
      <lst name="positions">
        <int name="position">3</int>
      </lst>
      <lst name="offsets">
        <int name="start">12</int>
        <int name="end">19</int>
      </lst>
      <int name="df">61</int>
      <double name="tf-idf">0.01639344262295082</double>
    </lst>
    <lst name="tractor">
      <int name="tf">1</int>
      <lst name="positions">
        <int name="position">2</int>
      </lst>
      <lst name="offsets">
        <int name="start">4</int>
        <int name="end">11</int>
      </lst>
      <int name="df">35</int>
      <double name="tf-idf">0.02857142857142857</double>
    </lst>
  </lst>
</lst>
<lst name="903">
  <str name="uniqueKey">903</str>
  <lst name="BUS_BUSINESS_NAME">
    <lst name="kramer">
      <int name="tf">1</int>
      <lst name="positions">
        <int name="position">0</int>
      </lst>
      <lst name="offsets">
        <int name="start">0</int>
        <int name="end">6</int>
      </lst>
      <int name="df">72</int>
      <double name="tf-idf">0.013888888888888888</double>
    </lst>
    <lst name="ltd">
      <int name="tf">1</int>
      <lst name="positions">
        <int name="position">1</int>
      </lst>
      <lst name="offsets">
        <int name="start">7</int>
        <int name="end">10</int>
      </lst>
      <int name="df">9798</int>
      <double name="tf-idf">1.0206164523372117E-4</double>
    </lst>
  </lst>
</lst>

Am I going in the wrong direction by trying to use the Term Vector Component to get Kramer matches first, then Tractor matches?

Thanks,
Corey
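
P.S. To make my reading of the numbers concrete, here is a small Python sketch (plain Python, not Solr code). It assumes the reported "tf-idf" is simply tf / df, which is what the values in the output appear to be, and it compares that with the ordering I actually want. The names `tvc_tf_idf`, `best_match_score` and `df_free_key` are hypothetical helpers for illustration only, not part of any Solr API.

```python
# term -> (tf, df, position), copied from the TVC output in this post
docs = {
    "103503": {"korpan": (1, 6, 0), "tractor": (1, 35, 1)},
    "503457": {"salvage": (1, 61, 3), "tractor": (1, 35, 2)},
    "903": {"kramer": (1, 72, 0), "ltd": (1, 9798, 1)},
}
query = ["kramer", "tractor"]  # order expresses priority: kramer first


def tvc_tf_idf(tf, df):
    # Assumption: the reported value is plain tf / df (it matches every term above).
    return tf / df


def best_match_score(terms, query_terms):
    # Highest tf/df among the query terms present in the document.
    scores = [tvc_tf_idf(terms[t][0], terms[t][1]) for t in query_terms if t in terms]
    return max(scores, default=0.0)


def df_free_key(terms, query_terms):
    # Hypothetical key for the desired order: which query term matched first,
    # then where it sits in the name. Document frequency is never consulted.
    for rank, term in enumerate(query_terms):
        if term in terms:
            return (rank, terms[term][2])
    return (len(query_terms), 0)


# Ordering driven by df: the rarer term (tractor, df=35) beats kramer (df=72).
by_df = sorted(docs, key=lambda d: -best_match_score(docs[d], query))

# Ordering I am after: kramer matches first, then tractor matches by position.
wanted = sorted(docs, key=lambda d: df_free_key(docs[d], query))

print("df-weighted:", by_df)
print("wanted:     ", wanted)
```

With the df-weighted key, document 903 (Kramer Ltd) ends up last; with the df-free key it comes first, followed by the two Tractor names in order of where the term appears.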