There are many different “readability scores”. The most common is Flesch-Kincaid, which uses the number of words, number of sentences, and number of syllables. Solr has the word count, but not the other two.
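To show how little is needed once you have the full text, here is a minimal Python sketch of the Flesch-Kincaid grade level. The coefficients are the standard published ones; the function names, the sentence splitting, and especially the vowel-group syllable counter are my own rough assumptions, not anything Solr provides, so treat the result as an approximation.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of vowels and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # "make" has a silent final 'e'; "table" does not.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Return the Flesch-Kincaid grade level of plain text.

    Formula: 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence split on ., ! and ?; abbreviations will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

For a set of user manuals, you would run something like this over the extracted text of each manual outside Solr and compare the scores, or store the score in a numeric field at index time if you want to sort or filter on it.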
https://en.wikipedia.org/wiki/Readability_test
https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests

I think Solr is the wrong tool for calculating readability scores. The scores are fairly easy to calculate once you have the whole document. But the information stored in a Solr index is the wrong information for that calculation.

Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On May 11, 2016, at 1:11 PM, A Laxmi <a.lakshmi...@gmail.com> wrote:
>
> *What I mean is that a technical paper will have a different type of
> complexity from let's say a Shakespearean play, because the former will
> have technical jargon, while the latter will have really high level
> vocabulary.*
>
> Good point. But, I am thinking a 7th grade might find both of
> them complex to understand - one because of technical jargon and other
> because of high level vocabulary?
>
> If it helps, I am looking at a set of user manuals of various products. I
> am trying to determine which of those user manuals are easier to read and
> which are more complex in comparison.
>
> On Wed, May 11, 2016 at 3:58 PM, Binoy Dalal <binoydala...@gmail.com> wrote:
>
>> Please correct me if I'm wrong, but I think what Joel means is the variety
>> of words in a document.
>>
>> One more aspect that will come into play here, I think, is the different
>> types of complexity.
>> What I mean is that a technical paper will have a different type of
>> complexity from let's say a Shakespearean play, because the former will
>> have technical jargon, while the latter will have really high level
>> vocabulary.
>>
>> On Thu, 12 May 2016, 01:17 A Laxmi, <a.lakshmi...@gmail.com> wrote:
>>
>>> Yes, length of the words would be one way but was wondering if there are
>>> any other ways to identify the complexity.
>>>
>>> On Wed, May 11, 2016 at 3:46 PM, A Laxmi <a.lakshmi...@gmail.com> wrote:
>>>
>>>> Yes, length of the words would be one way but was wondering if there are
>>>> any ways to identify the complexity.
>>>>
>>>> On Wed, May 11, 2016 at 3:36 PM, Joel Bernstein <joels...@gmail.com> wrote:
>>>>
>>>>> I'm wondering if the size of the vocabulary used would be enough for this?
>>>>>
>>>>> Joel Bernstein
>>>>> http://joelsolr.blogspot.com/
>>>>>
>>>>> On Wed, May 11, 2016 at 3:32 PM, A Laxmi <a.lakshmi...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is it possible to determine how complex a document is using Solr?
>>>>>> Complexity in terms of whether document is readable by a 7th grade vs. PHD
>>>>>> Grad?
>>>>>>
>>>>>> Thanks!
>>>>>> AL
>>
>> --
>> Regards,
>> Binoy Dalal