We also use Nutch in our environment. Nutch crawls the data and sends it to
Solr for indexing. I have implemented a custom search API that talks to my
Solr indexes, because I don't want to expose the indexes directly to the
outside. You can easily configure and build whatever you need with this kind
of combination.
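
A minimal sketch of such a wrapper, assuming a Solr core named "nutch" on
localhost and the title/url/content fields from a typical Nutch schema (the
URL, field list, limits and function name are all illustrative, not the code
described above). The idea is that the public API only ever builds a
restricted query against Solr's /select handler:

```python
from urllib.parse import urlencode

# Hypothetical values: adjust to your own Solr host, core, and schema.
SOLR_SELECT_URL = "http://localhost:8983/solr/nutch/select"
ALLOWED_FIELDS = ("title", "url", "content")
MAX_ROWS = 50


def build_solr_query(user_query, page=0, rows=10):
    """Translate a public search request into a restricted Solr select URL."""
    # Clamp paging so callers cannot request huge result sets.
    rows = max(1, min(int(rows), MAX_ROWS))
    page = max(0, int(page))
    params = {
        "q": user_query.strip() or "*:*",  # empty input falls back to match-all
        "defType": "edismax",              # parser meant for plain user-entered text
        "qf": "title content",             # fields searched (assumed schema)
        "fl": ",".join(ALLOWED_FIELDS),    # only return whitelisted fields
        "start": page * rows,
        "rows": rows,
        "wt": "json",
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)
```

The API layer would fetch that URL server-side and pass on only the fields it
wants to publish, so Solr itself can stay behind the firewall.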

On Wednesday, 30 October 2013, Palmer, Eric <epal...@richmond.edu> wrote:
> Thanks for the link
>
> Sent from my iPhone
>
> On Oct 30, 2013, at 4:06 PM, "Rajani Maski" <rajinima...@gmail.com> wrote:
>
>> Hi Eric,
>>
>>  I have also developed mini-applications replacing GSA for some of our
>> clients, using Apache Nutch + Solr to crawl multilingual sites and enable
>> multilingual search. Nutch+Solr is very stable and the Nutch mailing list
>> provides good support.
>>
>> Reference link to start:
>> apache nutch | profilerajanimaski
>>
>> Thanks
>> Rajani
>>
>>
>>
>>
>> On Thu, Oct 31, 2013 at 12:27 AM, Palmer, Eric <epal...@richmond.edu> wrote:
>>
>>> Markus and Jason
>>>
>>> thanks for the info.
>>>
>>> I will start to research Nutch.  I agree that writing a crawler is a
>>> rabbit hole.
>>>
>>>
>>> --
>>> Eric Palmer
>>>
>>> Web Services
>>> U of Richmond
>>>
>>> To report technical issues, obtain technical support or make requests
>>> for enhancements please visit
>>> http://web.richmond.edu/contact/technical-support.html
>>>
>>>
>>>
>>>
>>>
>>> On 10/30/13 2:53 PM, "Jason Hellman" <jhell...@innoventsolutions.com>
>>> wrote:
>>>
>>>> Nutch is an excellent option.  It should feel very comfortable for
>>>> people migrating away from the Google appliances.
>>>>
>>>> Apache Droids is another possible approach, and I've found people
>>>> using Heritrix or ManifoldCF for various use cases (and usually in
>>>> combination with other use cases where the extra overhead was worth the
>>>> trouble).
>>>>
>>>> I think the simplest approach will be Nutch... it's absolutely worth
>>>> taking a shot at.
>>>>
>>>> DO NOT write a crawler!  That is a rabbit hole you do not want to peer
>>>> down into :)
>>>>
>>>>
>>>>
>>>> On Oct 30, 2013, at 10:54 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>>>
>>>>> Hi Eric,
>>>>>
>>>>> We have also helped a government institution replace their
>>>>> expensive GSA with open source software. In our case we use Apache
>>>>> Nutch 1.7 to crawl the websites and index to Apache Solr. It is very
>>>>> effective, robust and scales easily with Hadoop if you need it to.
>>>>> Nutch may not be the easiest tool for the job but it is very stable,
>>>>> feature rich and has an active community here at Apache.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -----Original message-----
>>>>>> From:Palmer, Eric <epal...@richmond.edu>
>>>>>> Sent: Wednesday 30th October 2013 18:48
>>>>>> To: solr-user@lucene.apache.org
>>>>>> Subject: Replacing Google Mini Search Appliance with Solr?
>>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> Been lurking on the list for a while.
>>>>>>
>>>>>> Our two Google Mini search appliances, used to index our public web
>>>>>> sites, are at end of life and need replacing. Google is no longer
>>>>>> selling the Mini appliances, and buying the big appliance is not
>>>>>> cost-effective.
>>>>>>
>>>>>> http://search.richmond.edu/
>>>>>>
>>>>>> We would run a Solr replacement on Linux (CentOS, Red Hat, or
>>>>>> similar) with OpenJDK or Oracle Java.
>>>>>>
>>>>>> Background
>>>>>> ==========
>>>>>> ~130 sites