subject:"Nutch\/Solr"

Indexing HTML Metatags Nutch - SOLR

2020-01-18 Thread kra...@gds2.de

Hello, I have been trying this for several days without success (Nutch 1.16, Solr 7.3.1). I have followed this description: https://cwiki.apache.org/confluence/display/nutch/IndexMetatags Below is my nutch-site.xml file. I have created the core following this description: https://cwiki.apache

Indexing HTML Metatags Nutch - SOLR

2020-01-18 Thread kra...@gds2.de

Hello, I have been trying this for several days without success (Nutch 1.16, Solr 7.3.1). I have followed this description: https://cwiki.apache.org/confluence/display/nutch/IndexMetatags Below is my nutch-site.xml file. I have created the core following this description: https://cwiki.apache

Re: Nutch+Solr

2018-10-08 Thread Bineesh

This is solved. Nutch 1.15 has an index-writers.xml file in which we can pass the username and password for indexing to Solr. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Nutch+Solr

2018-10-03 Thread Terry Steichen

Bineesh, I don't use Nutch, so don't know if this is relevant, but I've had similar-sounding failures in doing and restoring backups. The solution for me was to deactivate authentication while the backup was being done, and then activate it again afterwards. Then everything was restored correctl

Nutch+Solr

2018-10-03 Thread Bineesh

Hello, We use Solr 7.3.1 and Nutch 1.15. We've set up authentication for our SolrCloud setup using the basic auth plugin (login details -> solr/SolrRocks). For Nutch to index data to Solr, the properties below were added to the nutch-site.xml file: solr.auth true Whether to enable HTTP basi

Nutch + Solr - Indexer causes java.lang.OutOfMemoryError: Java heap space

2014-09-07 Thread glumet

Decrease when handling very large documents to prevent Nutch from running out of memory. NOTE: It does not explicitly trigger a server side commit. -- View this message in context: http://lucene.472066.n3.nabble.com/Nutch-Solr-Indexer-causes-java-lang-OutOfMemoryError-Java-heap-space-tp415

Re: document id in nutch/solr

2013-06-24 Thread alxsss

Another way of overriding nutch fields is to modify solrindex-mapping.xml file. hth Alex. -Original Message- From: Jack Krupansky To: solr-user Sent: Sun, Jun 23, 2013 12:04 pm Subject: Re: document id in nutch/solr Add the "passthrough" dynamic field to your S

Re: document id in nutch/solr

2013-06-23 Thread Jack Krupansky

une 23, 2013 2:35 PM To: solr-user@lucene.apache.org Subject: Re: document id in nutch/solr Can somebody help with this one, please? On Fri, Jun 21, 2013 at 10:36 PM, Joe Zhang wrote: A quite standard configuration of nutch seems to automatically map "url" to "id". Two que

Re: document id in nutch/solr

2013-06-23 Thread Joe Zhang

Can somebody help with this one, please? On Fri, Jun 21, 2013 at 10:36 PM, Joe Zhang wrote: > A quite standard configuration of nutch seems to automatically map "url" > to "id". Two questions: > > - Where is such mapping defined? I can't find it anywhere in > nutch-site.xml or schema.xml. The l

document id in nutch/solr

2013-06-21 Thread Joe Zhang

A quite standard configuration of nutch seems to automatically map "url" to "id". Two questions: - Where is such mapping defined? I can't find it anywhere in nutch-site.xml or schema.xml. The latter does define the "id" field as well as its uniqueness, but not the mapping. - Given that nutch nutc

spellchecking in nutch solr

2011-09-01 Thread alxsss

Hello, I have tried to implement an index-based spellchecker in nutch-solr by adding a spell field to schema.xml and making it a copy of the content field. However, this doubled the size of the data folder, and the spell field, as a copy of the content field, appears in the XML feed, which is not necessary. Is it

Assistance required fine-tuning nutch/solr - (paid work)

2010-11-12 Thread Jean-Luc

I require the expertise of a developer who can assist with fine-tuning my nutch/solr setup. I have the basics working but I think I probably need a custom nutch plugin written. If you're interested please contact me: jeanluct [at] gmail . com Hope it's ok to post this here - I'm

Re: Nutch/Solr

2010-09-07 Thread Markus Jelsma

You should: - definitely upgrade to 1.1 (1.2 is on the way), and - subscribe to the Nutch mailing list for Nutch-specific questions. On Tuesday 07 September 2010 10:36:58 Yavuz Selim YILMAZ wrote: > In fact, I used Nutch 0.9, but I am thinking of moving to the new version. > > If anybody did

Re: Nutch/Solr

2010-09-07 Thread Yavuz Selim YILMAZ

In fact, I used Nutch 0.9, but I am thinking of moving to the new version. If anybody did something like that, I want to learn from their experience. If indexing an xml file, there are specific fields and all of them are dependent among them, so duplicates don't happen. I want to extract specific fi

Re: Nutch/Solr

2010-09-07 Thread Markus Jelsma

Depends on your version of Nutch. At least trunk and 1.1 obey the solrmapping.xml file in Nutch's configuration directory. I'd suggest you start with that mapping file and the Solr schema.xml file shipped with Nutch, as it exactly matches the mapping file. Just restart Solr with the new sche

Nutch/Solr

2010-09-07 Thread Yavuz Selim YILMAZ

I tried to combine Nutch and Solr and want to ask something. After crawling, Nutch has certain fields such as content, tstamp, title. How can I map the "content" field after crawling? Do I have to change the Lucene code (such as adding an extra field)? Or handle it at the Solr stage? Any suggestion? Thx. -- Yavu

Re: Nutch <-> Solr latest?

2008-06-25 Thread Chris Hostetter

: I'm curious, is there a spot / patch for the latest on Nutch / Solr : integration, I've found a few pages (a few outdated it seems), it would be nice : (?) if it worked as a DataSource type to DataImportHandler, but not sure if : that fits w/ how it works. Either way a nice contrib patch the way

Nutch <-> Solr latest?

2008-06-24 Thread Jon Baer

Hi, I'm curious, is there a spot / patch for the latest on Nutch / Solr integration, I've found a few pages (a few outdated it seems), it would be nice (?) if it worked as a DataSource type to DataImportHandler, but not sure if that fits w/ how it works. Either way a nice contrib patch

bzr branches for Apache Lucene/Nutch/Solr/Hadoop at Launchpad

2007-03-22 Thread rubdabadub

bzr branch. You can access them here. Nutch - https://launchpad.net/nutch Solr - https://launchpad.net/solr Lucene - https://launchpad.net/lucene Hadoop - https://launchpad.net/hadoop It only mirrors "trunk". That's what I need to follow, and I don't see any reason to mirror releases. Regards

19 matches
