On 6 August 2014 14:13, Ali Nazemian <alinazem...@gmail.com> wrote:
>
> Dear all,
> Hi,
> I was wondering how I can manage to index comments in Solr. Suppose I am
> going to index a web page that contains a news article and some comments
> posted by people at the end of the page. How can I index these
> comments in Solr? Consider that I am going to do some analysis on
> these comments. For example, I want enough query flexibility to
> retrieve all comments posted between 24 June 2014 and 24 July
> 2014, or all the comments posted by a specific person. Therefore,
> defining these comments as a multi-valued field would not be the solution,
> since such query flexibility is not feasible in that case. So what is your
> suggestion about document granularity in this case? Can I treat all of
> these comments as new documents inside the main document (a tree-based
> structure)? What is your suggestion for this case? I think it is a common
> case when indexing web pages these days, so I am probably not the only one
> thinking about this situation. Please share your thoughts and perhaps your
> experiences with this situation. Thank you very much.

Parsing a web page and breaking it up into parts for indexing into different
fields is outside the scope of Solr. You might want to look at Apache Nutch,
which can index into Solr, and/or other web crawlers/scrapers.
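
Once a crawler or scraper has extracted the comments, one possible layout is
to index each comment as its own Solr document that points back to its page,
so that date ranges and authors can be queried directly. Below is a minimal
sketch in Python. The field names (doc_type, parent_id, author, posted_at),
the shape of the `page` dict, and the helper names are all hypothetical; the
code only builds the documents and a filter string, and does not talk to Solr.

```python
from datetime import datetime, timezone


def solr_date(dt):
    """Format a timezone-aware datetime as a UTC timestamp string
    (YYYY-MM-DDThh:mm:ssZ), the form Solr date fields accept."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_docs(page):
    """Turn one scraped page (hypothetical dict layout) into a flat list:
    one document for the article plus one document per comment."""
    docs = [{
        "id": page["id"],
        "doc_type": "page",          # hypothetical discriminator field
        "content": page["content"],
    }]
    for n, comment in enumerate(page["comments"], start=1):
        docs.append({
            "id": f"{page['id']}-c{n}",
            "doc_type": "comment",
            "parent_id": page["id"],  # link back to the article document
            "author": comment["author"],
            "posted_at": solr_date(comment["posted_at"]),
            "content": comment["text"],
        })
    return docs


def date_range_filter(field, start, end):
    """Build a Solr range query string: field:[start TO end]."""
    return f"{field}:[{solr_date(start)} TO {solr_date(end)}]"
```

With documents shaped like this, a filter such as
date_range_filter("posted_at", start, end) combined with doc_type:comment
would select comments in a date range, and author:<name> would select the
comments of one person, assuming the schema declares posted_at as a date field.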

Regards,
Gora
