Is there a way in a Solr query to find documents that are composed of a
given set of n words? For example, here is the list of words:
- Love
- Ice
- Cream
- Sunny
- I
- To
- A
- On
- Elephant
- Balloon
And a percentage such as: 80%
Let’s assume you’re analyzing the text of the following sentence.
Please help me formulate a query that makes this easy, or do I have to build a
custom filter for this?
Shahzad
--
View this message in context:
http://lucene.472066.n3.nabble.com/Find-documents-that-are-composed-of-words-tp4094264p4094372.html
Sent from the Solr - User mailing list archive at
My client has a strange requirement: he will give a list of 500 words and
then set a percentage like 80%. He then wants to find those pages or
documents whose words are 80% from that list of 500 and only 20% unknown.
For example, we have this document:
word1 word2
No, unfortunately I did not get how this will help me. Could you explain in a
bit more detail?
Yes, the correct answer may be "Why?", but you cannot ask the client this.
He thinks there is something interesting in this formula, and if it works we
can index websites with Nutch + Solr and let users input queries that
can locate documents which have a given % of foreign words other than the list pr
Is there a way to build a plugin that gets all the words on a single page
and computes a percentage showing how many words on the page are foreign
(words not on the search list)?
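
To make the percentage above concrete, here is a rough Python sketch of the calculation only. It is not a Solr or Nutch plugin: the function names (`tokenize`, `foreign_word_ratio`, `meets_threshold`) and the simple `\w+` tokenizer are assumptions for illustration, and a Solr analyzer chain would likely split text differently. The word list is the one from the first message.

```python
import re

# Word list from the original question, lowercased for comparison.
KNOWN_WORDS = {"love", "ice", "cream", "sunny", "i", "to", "a", "on",
               "elephant", "balloon"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def foreign_word_ratio(tokens, known_words=KNOWN_WORDS):
    """Return the fraction of tokens that are NOT in the known word list."""
    if not tokens:
        return 0.0
    foreign = sum(1 for token in tokens if token not in known_words)
    return foreign / len(tokens)


def meets_threshold(text, known_words=KNOWN_WORDS, max_foreign=0.2):
    """True if at most `max_foreign` of the page's words are foreign.

    An empty page never matches, since it contains no known words at all.
    """
    tokens = tokenize(text)
    if not tokens:
        return False
    return foreign_word_ratio(tokens, known_words) <= max_foreign


page = "I love ice cream on a sunny day"
print(foreign_word_ratio(tokenize(page)))  # 1 foreign word out of 8
print(meets_threshold(page))
```

With the 80% setting, `max_foreign=0.2` means a page passes when no more than 20% of its words fall outside the list. Every occurrence of a word is counted here; whether repeated words should count once or many times is something the client would have to decide.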
Eric, agreed. The Solr + Nutch solution was proposed by me, and I had never
used these technologies; this is the first time I have handled these two. My
initial response to the client's requirements was to try to work with existing
industry tools and then modify them according to the client's requirements
instead of re-inve
Aloke Ghoshal, I'm trying to work out your equation. I am using the standard
schema provided by Nutch for Solr and am not aware of how to calculate
myfieldwordcount in the first query. I have no idea where this count will come
from. Is there any filter that will store the number of tokens generated for a
speci
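
One possible way to get such a count, sketched here in Python under stated assumptions: compute the number of tokens on the client side before the document is sent for indexing, and store it in its own field. The field name `myfieldwordcount` comes from the message above; the `content` field name, the `with_word_count` helper, and the `\w+` tokenizer are hypothetical, and the count will only agree with Solr's own token count if the analyzer splits the text the same way.

```python
import re


def count_tokens(text):
    """Count word tokens using a simple regex (assumed tokenizer)."""
    return len(re.findall(r"\w+", text))


def with_word_count(doc, text_field="content", count_field="myfieldwordcount"):
    """Return a copy of `doc` with the token count of `text_field` added.

    The original dictionary is left unchanged.
    """
    enriched = dict(doc)
    enriched[count_field] = count_tokens(doc.get(text_field, ""))
    return enriched


doc = {"id": "page-1", "content": "I love ice cream on a sunny day"}
print(with_word_count(doc))
```

The enriched dictionary could then be indexed like any other document, provided the schema defines an integer field with that name.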