There are many different “readability scores”. The most common is 
Flesch-Kincaid, which uses the number of words, number of sentences, and number 
of syllables. Solr has the word count, but not the other two.

https://en.wikipedia.org/wiki/Readability_test
https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests

I think Solr is the wrong tool for calculating readability scores.

The scores are fairly easy to calculate once you have the whole document. But 
the information stored in a Solr index is the wrong information for that 
calculation.
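
For illustration only, here is a minimal Python sketch of the Flesch-Kincaid grade level computed from the full text, outside Solr. The function names are made up for this example, and the syllable counter is a rough vowel-group heuristic (an assumption, not part of the published formula), so treat the results as approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count runs of vowels and
    discount a trailing silent 'e'. Real syllabification is harder."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence split on ., ! and ?; abbreviations will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    manual = "Press the power button. Wait for the light to turn green."
    print(round(flesch_kincaid_grade(manual), 2))
```

All three counts come from the raw text, which is why this is easier to run as a step before indexing. The resulting score could then be stored in an ordinary numeric field for sorting or filtering.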

Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On May 11, 2016, at 1:11 PM, A Laxmi <a.lakshmi...@gmail.com> wrote:
> 
> *What I mean is that a technical paper will have a different type of
> complexity from let's say a Shakespearean play, because the former will
> have technical jargon, while the latter will have really high level
> vocabulary.*
> 
> Good point. But I am thinking a 7th grader might find both of them complex
> to understand - one because of technical jargon and the other because of
> high-level vocabulary?
> 
> If it helps, I am looking at a set of user manuals of various products. I
> am trying to determine which of those user manuals are easier to read and
> which are more complex in comparison.
> 
> 
> 
> On Wed, May 11, 2016 at 3:58 PM, Binoy Dalal <binoydala...@gmail.com> wrote:
> 
>> Please correct me if I'm wrong, but I think what Joel means is the variety
>> of words in a document.
>> 
>> One more aspect that will come into play here, I think, is the different
>> types of complexity.
>> What I mean is that a technical paper will have a different type of
>> complexity from let's say a Shakespearean play, because the former will
>> have technical jargon, while the latter will have really high level
>> vocabulary.
>> 
>> On Thu, 12 May 2016, 01:17 A Laxmi, <a.lakshmi...@gmail.com> wrote:
>> 
>>> Yes, length of the words would be one way but was wondering if there are
>>> any other ways to identify the complexity.
>>> 
>>> On Wed, May 11, 2016 at 3:46 PM, A Laxmi <a.lakshmi...@gmail.com> wrote:
>>> 
>>>> Yes, length of the words would be one way but was wondering if there are
>>>> any ways to identify the complexity.
>>>> 
>>>> On Wed, May 11, 2016 at 3:36 PM, Joel Bernstein <joels...@gmail.com>
>>>> wrote:
>>>> 
>>>>> I'm wondering if the size of the vocabulary used would be enough for this?
>>>>> 
>>>>> Joel Bernstein
>>>>> http://joelsolr.blogspot.com/
>>>>> 
>>>>> On Wed, May 11, 2016 at 3:32 PM, A Laxmi <a.lakshmi...@gmail.com> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Is it possible to determine how complex a document is using Solr?
>>>>>> Complexity in terms of whether a document is readable by a 7th grader
>>>>>> vs. a PhD grad?
>>>>>> 
>>>>>> Thanks!
>>>>>> AL
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> --
>> Regards,
>> Binoy Dalal
>> 
