> > > 2014-01-28 Jack Krupansky
> > >
> > > > 1. Nutch follows the links within HTML web pages to crawl the full
> > > > graph of a web of pages.
> Steam <http://steamcommunity.com/id/alexeiramone/> |
> 4sq<https://pt.foursquare.com/alexeiramone>| Skype: alexeiramone |
> Github <https://github.com/alexeiramone> | (11) 9 7613.0966 |
>
>
> 2014-01-28 rashmi maheshwari
>
> > Hi,
> >
>
> > 1. Nutch follows the links within HTML web pages to crawl the full graph
> > of a web of pages.
> >
> > 2. Think of a core as an SQL table - each table/core has a different type
> > of data.
> >
> > 3. SolrCloud is all about scaling and availability - multiple shards for
> > larger collections and multiple replicas for bo
Thanks, Saurish.
My office *intranet* is a SharePoint website. When I crawl it using
Nutch, I get an "Unauthorized access(404)" error. The website uses an
NTLM realm.
I read in a Nutch JIRA issue that SharePoint can be accessed using
Nutch. Nutch has the properties below in nutch-
Hi,
Question 1: Why do we use the spellings file under the Solr core conf folder?
What spellings do we enter in it?
Question 2: Implementing all synonyms is tough. Where could I get a list
of as many synonyms as we see in Google search?
--
Rashmi
Be the change that you want to
Hi,
Question 1: When Solr can parse HTML and documents such as doc, Excel, PDF,
etc., why do we need Nutch to parse HTML files? What is different?
Question 2: When do we use multiple cores in Solr? Is there a practical
business case where we need multiple cores?
Question 3: When do we go for cloud? What is
Hi,
How do I get the most relevant items at the top of search results using Solr search?
--
Rashmi
Be the change that you want to see in this world!
Hi,
I want to create a POC to search our intranet along with the documents
uploaded to it. Documents (PDF, Excel, Word documents, text files, images,
videos) also exist on SharePoint. SharePoint has access authentication
at the module level (folder level).
My intranet website is http://myintranet/