RE: DIH full-import memory issue

2010-05-10 Thread caman
This may help: batchSize: the batch size used in the JDBC connection. http://wiki.apache.org/solr/DataImportHandler#Configuring_DataSources From: Geek Gamer [via Lucene] [mailto:ml-node+809069-2054572211-124...@n3.nabble.com] Sent: Monday, May 10, 2010 9:42 PM To: caman Subject: DIH

DIH full-import memory issue

2010-05-10 Thread Geek Gamer
Hi, I am facing issues with DIH full-import. I have a database with 3 million records that will translate into an index size of 6 GB. When I try to do a full import I get an out-of-memory error like: INFO: Starting Full Import May 10, 2010 11:44:06 PM org.apache.solr.handler.dataimport.Solr

what kind params should be in "pf" , pls give a demo.

2010-05-10 Thread Dickens Ting
Hello, I'm running Solr 1.4 for a BBS search. We index the subject and content. We are using DisMaxRequestHandler now. We make [...] Could anyone give me some advice on setting "pf"? What kind of params should be in "pf"? Please give a demo.
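A minimal sketch of what such a "pf" demo could look like, in Python. It assumes the request goes to a handler already configured for dismax and that the fields are named subject and content as in the question; the boost values and the function name build_dismax_params are made up for illustration. "pf" takes the same field^boost list as "qf", and it boosts documents in which the query terms occur close together as a phrase.

```python
from urllib.parse import urlencode


def build_dismax_params(user_query):
    """Return request parameters for a dismax query with phrase boosting.

    Field names come from the question; the boost values are assumptions.
    """
    return {
        "q": user_query,
        # qf: the fields each query term is matched against.
        "qf": "subject^2.0 content",
        # pf: same syntax as qf; documents where the query terms appear
        # together as a phrase in these fields get an extra boost.
        "pf": "subject^4.0 content^2.0",
    }


# URL-encoded query string that could be appended to the select URL.
query_string = urlencode(build_dismax_params("solr phrase boost"))
```

A common starting point is to list in "pf" the same fields as in "qf", with somewhat higher boosts, and tune from there.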

Re: Help with Embedded Server

2010-05-10 Thread Lance Norskog
This is the underlying exception: > Caused by: org.apache.solr.common.SolrException: No such core: Universities > - Embedded Solr Server >at > org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:104) >at > org.apache.solr.client.solrj.request.QueryReques

Re: Unbuffered entity enclosing request can not be repeated.

2010-05-10 Thread Lance Norskog
Yes, these occasionally happen with long indexing jobs. You might try limiting the number of documents per upload call. On Sun, May 9, 2010 at 9:16 PM, Satish Kumar wrote: > Found these errors in Tomcat's log file: > > May 9, 2010 10:57:24 PM org.apache.solr.common.SolrException log > SEVERE: ja
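A minimal sketch of the suggestion above (limiting the number of documents per upload call), in Python. The names batches and batch_size are hypothetical, and the actual upload call is left out because it depends on the client in use.

```python
def batches(docs, batch_size):
    """Yield consecutive slices of docs, each holding at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


# Example: five documents uploaded in calls of at most two documents each.
example_batches = list(batches(["d1", "d2", "d3", "d4", "d5"], 2))
```

Each yielded slice would then be sent in its own upload request, so a single request never carries more than batch_size documents.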

Help with Embedded Server

2010-05-10 Thread Eric Berry
Hello, I'm running into an exception when I try to search against the example index provided with the Solr download. I have created a grails application where I want to embed the Solr server. I'm using Solr 1.4.0. [exception] Caused by: org.apache.solr.client.solrj.SolrServerException: Error e

Re: How to query for similar documents before indexing

2010-05-10 Thread Mark Miller
There is no official support for dedupe at search time. You can take a look at the field collapse patch in JIRA though - we were thinking ahead when we added the ability to tag dupes during indexing for field collapsing at search time - but the search side support is not there yet. On 5/10/10

Re: How to query for similar documents before indexing

2010-05-10 Thread Ken Krugler
Hi all (especially Yonik), At the http://wiki.apache.org/solr/Deduplication page, it mentions "duplicate field collapsing" and later "Allow for both duplicate collapsing in search results..." But I don't see any mention of how deduplication happens during search time. Normally this requir

RE: How to query for similar documents before indexing

2010-05-10 Thread Markus Jelsma
Hi Matthieu, On the top of the wiki page you can see it's in 1.4 already. As far as I know the API doesn't return information on found duplicates in its response header; the wiki isn't clear on that subject. I, at least, never saw any other response than an error or the usual status code

Strange NPE with SOLR-236 (Field collapsing)

2010-05-10 Thread Eric Caron
Using the latest from trunk as of 2010-04-29, and the SOLR-236-trunk.patch from 2010-03-29 05:08, I get a nullpointerexception whenever I use collapse.field and a fq. Works: /solr/select/?q=sales&fq=country%3A1 Works: /solr/select/?q=sales&collapse.f

RE: How to query for similar documents before indexing

2010-05-10 Thread Matthieu Labour
Markus, thank you for your response. It would be great if the index had the option to prevent duplicates from entering the index. But is it going to be a silent action? Or will the add method return that it failed indexing because it detected a duplicate? Is it committed to 1.4 already? Chee

RE: How to query for similar documents before indexing

2010-05-10 Thread Markus Jelsma
Hi, Deduplication [1] is what you're looking for. It can utilize different analyzers that will add one or more signatures or hashes to your document depending on exact or partial matches for configurable fields. Based on that, it should be able to prevent new documents from entering the
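To illustrate the idea of a signature for exact matches on configurable fields, here is a rough Python sketch. It is not Solr's implementation: the function name exact_signature, the field names, and the choice of MD5 with a separator byte are assumptions made for the example.

```python
import hashlib


def exact_signature(doc, fields):
    """Hash the values of the chosen fields; equal values give equal signatures."""
    digest = hashlib.md5()
    for name in fields:
        digest.update(str(doc.get(name, "")).encode("utf-8"))
        # Separator byte so that ("ab", "c") and ("a", "bc") hash differently.
        digest.update(b"\x00")
    return digest.hexdigest()


# Two documents with different ids but identical subject and content.
first = {"id": "1", "subject": "Hello", "content": "Same text"}
second = {"id": "2", "subject": "Hello", "content": "Same text"}
is_duplicate = (
    exact_signature(first, ["subject", "content"])
    == exact_signature(second, ["subject", "content"])
)
```

A hash like this only catches exact matches; partial (near-duplicate) matching needs a fuzzier signature.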

Re: JSON formatted response from SOLR question....

2010-05-10 Thread Jon Baer
IIRC, I think what we ended up doing in a project was to use the VelocityResponseWriter to write the JSON and set the echoParams to read the handler setup (and looping through the variables). In the template you can grab it w/ something like $request.params.get("facet_fields") ... I don't remem

How to query for similar documents before indexing

2010-05-10 Thread Matthieu Labour
Hi I want to implement the following logic: Before I index a new document into the index, I want to check if there are already documents in the index with similar content to the content of the document about to be inserted. If the request returns 1 or more documents, then I don't want to inser

Re: Highlighting Performance On Large Documents

2010-05-10 Thread Lance Norskog
To search in a field, it has to be indexed. You can store a field without indexing if you want to highlight it. If you index it with the term* options, it should highlight faster. Since these do not speed up highlighting, your analysis stack is probably very simple. The term* options are variations

Re: SOLR-343 date facet mincount patch

2010-05-10 Thread Chris Hostetter
: Has anyone tried to apply this patch on Solr 1.4? When I tried I was able : to patch 'SOLR-343.patch' but it failed for another : 'DateFacetsMincountPatch.patch'. both attachments make the same change, one was just slightly newer and included tests. if you got either patch to apply to 1.4,

readonly access for all host except for localhost

2010-05-10 Thread Tommy Chheng
Is there a way to configure Solr to allow only read-only access for all external hosts, except when the request is coming from localhost? E.g. solr-server.com:8983/solr/select is read-only accessible from a remote server, and the remote server is not allowed to do any update/delete POST actions. --

RE: JSON formatted response from SOLR question....

2010-05-10 Thread caman
Take a look at AjaxSolr source code: http://github.com/evolvingweb/ajax-solr This should give you exactly what you need. thanks From: Tod [via Lucene] [mailto:ml-node+789105-593266572-124...@n3.nabble.com] Sent: Monday, May 10, 2010 7:22 AM To: caman Subject: JSON formatt

Re: keywords, terms component for suggestion of sentences.

2010-05-10 Thread stockii
another question ;) How can I merge an autocompletion for keywords with product names, like Amazon? :P My normal autocompletion works fine with the product names. I used the edgeNGram... How is it possible for Solr to recognize the difference, so that either a product name or a keyword is suggested? e

Re: Problem with pdf, upgrading Cell

2010-05-10 Thread Grant Ingersoll
I've integrated this into Solr's trunk: https://issues.apache.org/jira/browse/SOLR-1902 -Grant On May 6, 2010, at 3:40 AM, Sandhya Agarwal wrote: > Praveen, > > You can get the latest code, containing the fix, from here : > > http://lucene.apache.org/tika/source-repository.html > > Thanks,

Re: keywords, terms component for suggestion of sentences.

2010-05-10 Thread stockii
oh, nice! thx. thats what i searched for =) =) -- View this message in context: http://lucene.472066.n3.nabble.com/keywords-terms-component-for-suggestion-of-sentences-tp788939p789134.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Issue with range queries

2010-05-10 Thread Ahmet Arslan
> Hi all, > > I have a problem with range queries on an integer field. > (Solr 1.4) > > In my index, myField contains values between 0 and 3000. > > stored="true" required="false"/> > > Here are a few samples to give you an idea of the problem: > > fq=myField:[1 TO 1000] ... 0 results > > fq

Issue with range queries

2010-05-10 Thread Pierre-Luc Thibeault
Hi all, I have a problem with range queries on an integer field. (Solr 1.4) In my index, myField contains values between 0 and 3000. Here are a few samples to give you an idea of the problem: fq=myField:[1 TO 1000] ... 0 results fq=myField:[1 TO 999] ... 1288930 results fq=myField:[1 TO 300

JSON formatted response from SOLR question....

2010-05-10 Thread Tod
I apologize, this is such a JSON/javascript question but I'm stuck and am not finding any resources that address this specifically. I'm doing a faceted search and getting back in my facet_counts.facet_fields response an array of countries. I'm gathering the count of the array elements retur
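One thing that may be relevant here: with Solr's default JSON output, each field under facet_counts.facet_fields is a flat array that alternates value and count, so the array length is twice the number of distinct values. A small Python sketch of pairing them up follows; the helper name facet_pairs and the sample country data are made up for illustration.

```python
def facet_pairs(flat):
    """Turn [value, count, value, count, ...] into [(value, count), ...]."""
    # Even positions hold the facet values, odd positions hold their counts.
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical facet_fields entry for a "country" field.
country_facets = ["Canada", 12, "France", 7, "Japan", 3]
pairs = facet_pairs(country_facets)
distinct_countries = len(pairs)
```

The same pairing can be done in JavaScript by stepping through the array two elements at a time.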

Re: keywords, terms component for suggestion of sentences.

2010-05-10 Thread Ahmet Arslan
> Hello. > > I have a little problem. > > i want to import a keywords-field from my database which > looks like this: > Car, Radio, Car Radio, ... > > i import this with my DIH and i analyze with the > PatternTokenizerFactory. > > pattern=", *" /> > > suggestion for one word works fine, bu

Extending solr.SpellCheckComponent to get phrase collations

2010-05-10 Thread Sachin
Hi All, I've been playing around with SpellCheckComponent (solr 1.4) and ran into issues with suggestions for a phrase query. We use dismax request handler and it's an AND search in case the query terms count < 4 (specified by "mm" param). Since SpellCheckComponent checks the doc frequency

Issue with delta import (not finding data in a column)

2010-05-10 Thread ahammad
I have a Solr core that retrieves data from an Oracle DB. The DB table has a few columns, one of which is a Blob that represents a PDF document. In order to retrieve the actual content of the PDF file, I wrote a Blob transformer that converts the Blob into the PDF file, and subsequently reads it u

Re: keywords, terms component for suggestion of sentences.

2010-05-10 Thread stockii
thats my http-request for terms. ... terms/?terms.fl=tags&terms.prefix=car%20rad&wt=xml&terms.lower=car%20rad&terms.lower.incl=false -- View this message in context: http://lucene.472066.n3.nabble.com/keywords-terms-component-for-suggestion-of-sentences-tp788939p788945.html Sent from the Solr -

keywords, terms component for suggestion of sentences.

2010-05-10 Thread stockii
Hello. I have a little problem. I want to import a keywords field from my database which looks like this: Car, Radio, Car Radio, ... I import this with my DIH and analyze it with the PatternTokenizerFactory. Suggestion for one word works fine, but not for e.g. "Car Radio" when the user type
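For reference, a small Python sketch of how splitting on the pattern ", *" (the pattern quoted in the reply to this message) breaks up such a keywords value. It uses Python's re module as a stand-in for the tokenizer, which is an assumption; the variable names are made up.

```python
import re

# Comma followed by any number of spaces, as in the quoted pattern.
KEYWORD_SEPARATOR = re.compile(", *")

keywords_value = "Car, Radio, Car Radio"
tokens = KEYWORD_SEPARATOR.split(keywords_value)
# Each keyword, including the two-word "Car Radio", stays a single token.
```

Since "Car Radio" is kept as one token containing a space, a prefix typed by the user has to match that whole token, including its case, for it to be suggested.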

Speakers and Schedule for Berlin Buzzwords 2010 - Search, Store and Scale 7th/8th 2010

2010-05-10 Thread Isabel Drost
Hi folks, we proudly present the Berlin Buzzwords talks and presentations. As promised there are tracks specific to the three tags search, store and scale. We have a fantastic mixture of developers and users of open source software projects that make scaling data processing today possible. There

DIH: Clob Transformer doesn't transform Oracle clob

2010-05-10 Thread Agethle, Matthias
Hi, I have problems using DIH with Oracle Clobs. I use a Clob-Transformer but my clob is not transformed to string: oracle.sql.c...@14d1900 10 My data-config is as follows: Thanks, Matthias

AW: SOLR Based Search - Response Times - what do you consider slow or fast?

2010-05-10 Thread Markus.Rietzler
You write: > Our overall response (front end + SOLR) averages 0.5s to 0.7s with > SOLR typically taking about 100 - 300 ms. Is the 100-300 ms the time your application needs to query SOLR and get the response? What are the times if you query SOLR directly, without your frontend? We are also in th