EC2 instance type recommended for SOLR?

2015-11-07 Thread Costi Muraru
Hi folks, I'm trying to decide on the EC2 instance type to use for a Solr cluster. Some details about the cluster: 1) The total index size is 89.9 GB (somewhere around 20 million records). 2) The number of requests that reach Solr is pretty low (thousands per day),

SOLR plugin: Retrieve all values of multivalued field

2015-05-11 Thread Costi Muraru
Hi folks, I'm playing with a custom SOLR plugin and I'm trying to retrieve the value for a multivalued field, using the code below. == schema.xml: == input data: 83127 somevalue some other value some other value 3 some other value 4 == plugin: SortedDoc

Re: Using SolrCloud on Amazon EC2

2014-09-26 Thread Costi Muraru
Hi Timo, Why not use Cloudera's CDH5 which comes with Solr? Costi On Thu, Sep 25, 2014 at 10:43 AM, Timo Schmidt wrote: > Hi together, > > we currently plan to setup a project based on solr cloud and amazon > webservices. Our main search application is deployed using aws opsworks > which works

Re: Evaluate function only on subset of documents

2014-06-24 Thread Costi Muraru
Hi Chris, Thanks for your patience, I've now got a better picture of how things work. However, I don't believe that the two queries (the one with the post filter and the one without) are equivalent. Suppose out of the whole document set: XXX returns documents 1,2,3. AAA returns documents 6,7,8.

Re: Evaluate function only on subset of documents

2014-06-24 Thread Costi Muraru
Thanks guys for your answers. Sorry for the query syntax errors I've added in the previous queries. Chris, you've been really helpful. Indeed, point 3 is the one I'm trying to solve, rather than 2. You're saying that "BooleanScorer will consult the clauses in order based on which clause says it ca

Evaluate function only on subset of documents

2014-06-23 Thread Costi Muraru
Hi guys, I'm running some tests and I can't seem to figure this one out. Suppose we have a real estate index, containing homes for rent and purchase. The first kind of query I want to make is like so: - type:purchase AND {!frange u=10}mycustomfunction() The function is expensive and, in order to i
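For readers following this thread, here is a minimal sketch of one way the expensive function could be limited to documents that already match the cheap clause. It assumes the frange clause is moved into a filter query and marked with `cache=false` and a cost of at least 100, which is what Solr's documentation describes as the condition for running frange as a post filter. The helper name `build_params` and the cost value 200 are illustrative assumptions, not taken from the thread, and moving a clause from `q` to `fq` is not guaranteed to be equivalent to the original query in every case.

```python
from urllib.parse import urlencode


def build_params(main_query, function, upper, cost=200):
    """Build request parameters that keep the cheap clause in q and put
    the function range check in a non-cached, high-cost filter query."""
    # cache=false together with cost >= 100 asks Solr to run frange as a post filter
    post_filter = "{!frange u=%s cache=false cost=%d}%s" % (upper, cost, function)
    return {"q": main_query, "fq": post_filter}


# Values copied from the question above; build_params is a hypothetical helper.
params = build_params("type:purchase", "mycustomfunction()", 10)
query_string = urlencode(params)
print(query_string)
```

The resulting string can be appended to a select URL; only the parameter values are built here, no request is sent.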

Store Java object in field and retrieve it in custom function?

2014-06-19 Thread Costi Muraru
Hi, I'm trying to save a Java object in a binary field and afterwards use this value in a custom solr function. I'm able to put and retrieve the Java object in Base64 via the UI, but I can't seem to be able to retrieve the value in the custom function. In the function I'm using: termsIndex = Fiel

How to retrieve entire field value (text_general) in custom function?

2014-06-11 Thread Costi Muraru
I have a text_general field and want to use its value in a custom function. I'm unable to do so. It seems that the tokenizer messes this up and only a fraction of the entire value is being retrieved. See below for more details. 1 term1 term2 term3 < long name="_version_">1470628088879513600 2

Extract values from custom function for ValueSource with multiple indexable fields

2014-06-08 Thread Costi Muraru
Hi guys, I have a custom FieldType that adds several IndexableFields for each document. I also have a custom function, in which I want to retrieve these indexable fields. I can't seem to be able to do so. I have added some code snippets below. Any help is greatly appreciated. Thanks, Costi public

Re: MergeReduceIndexerTool takes a lot of time for a limited number of documents

2014-05-26 Thread Costi Muraru
pdate docs with the same ID, this > is due to the inherent limitation of the Lucene mergeIndex process. > > How long is "a long time"? attachments tend to get filtered out, so if you > want us to see the graph you might paste it somewhere and provide a link. > > Best, >

MergeReduceIndexerTool takes a lot of time for a limited number of documents

2014-05-26 Thread Costi Muraru
Hey guys, I'm using the MapReduceIndexerTool to import data into a SolrCloud cluster made up of 3 decent machines. Looking in the JobTracker, I can see that the mapper jobs finish quite fast. The reduce jobs get to ~80% quite fast as well. It is here where they get stuck for a long period of

Re: Update existing documents using MapReduceIndexerTool?

2014-05-06 Thread Costi Muraru
; Wolfgang. > > On May 6, 2014, at 3:08 PM, Costi Muraru wrote: > > > Hi guys, > > > > I've used the MapReduceIndexerTool [1] in order to import data into SOLR > > and seem to stumbled upon something. I've followed the tutorial [2] and > > managed to i

Update existing documents using MapReduceIndexerTool?

2014-05-06 Thread Costi Muraru
Hi guys, I've used the MapReduceIndexerTool [1] in order to import data into SOLR and seem to have stumbled upon something. I've followed the tutorial [2] and managed to import data into a SolrCloud cluster using the map reduce job. I ran the job a second time in order to update some of the existing do

Re: Fastest way to import big amount of documents in SolrCloud

2014-05-01 Thread Costi Muraru
everyday? > * How much of data are we talking about? > * What's your SolrCloud setup like? > * Do you already have some benchmarks which you're not happy with? > > > > On Thu, May 1, 2014 at 1:47 PM, Costi Muraru > wrote: > > > Hi guys, > > > > What wou

Fastest way to import big amount of documents in SolrCloud

2014-05-01 Thread Costi Muraru
Hi guys, What would you say is the fastest way to import data into SolrCloud? Our use case: each day do a single import of a large number of documents. Should we use SolrJ/DataImportHandler/other? Or is there perhaps a bulk import feature in SOLR? I came upon this promising link: http://wiki.apache

Re: Delete fields from document using a wildcard

2014-04-29 Thread Costi Muraru
2014 at 7:53 PM, Shawn Heisey wrote: > > > On 4/29/2014 5:25 AM, Costi Muraru wrote: > > > The problem is, I don't know the exact names of the fields I want to > > > remove. All I know is that they end in *_1600_i. > > > > > > When removing fields fr

Re: Delete fields from document using a wildcard

2014-04-29 Thread Costi Muraru
ect: http://www.solr-start.com/ - Accelerating your Solr > proficiency > > > On Tue, Apr 29, 2014 at 12:20 AM, Costi Muraru > wrote: > > Hi guys, > > > > Would be possible, using Atomic Updates in SOLR4, to remove all fields > > matching a pattern? For in

Delete fields from document using a wildcard

2014-04-28 Thread Costi Muraru
Hi guys, Would it be possible, using Atomic Updates in SOLR4, to remove all fields matching a pattern? For instance something like: 100 <*field name="*_name_i" update="set" null="true">* Or something similar to remove certain fields in all documents. Thanks, Costi
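A minimal client-side sketch of the idea, assuming the matching field names are first read from the stored document and then listed explicitly, since atomic updates are normally written against exact field names. Setting a field to null with the `set` modifier removes it. The helper name `build_removal_update`, the `id` key and the sample document are hypothetical and only illustrate the shape of the JSON payload; nothing is sent to Solr here.

```python
import fnmatch
import json


def build_removal_update(doc, pattern, id_field="id"):
    """Return an atomic-update document that nulls every field of `doc`
    whose name matches the shell-style `pattern`."""
    update = {id_field: doc[id_field]}
    for name in doc:
        if name != id_field and fnmatch.fnmatchcase(name, pattern):
            # {"set": None} serializes to {"set": null}, which removes the field
            update[name] = {"set": None}
    return update


# Hypothetical stored document, used only for illustration.
stored = {"id": "100", "a_name_i": 1, "b_name_i": 2, "title": "kept"}
payload = json.dumps([build_removal_update(stored, "*_name_i")])
print(payload)
```

The payload would then be posted to the collection's update handler. Atomic updates also depend on the schema and update log being set up for them, which this sketch does not check.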