Error messages

2012-05-10 Thread Tolga

Hi,

Apache servers are returning my post with the status messages
HTML_FONT_SIZE_HUGE,HTML_MESSAGE,HTTP_ESCAPED_HOST,NORMAL_HTTP_TO_IP,RCVD_IN_DNSWL_LOW,SPF_NEUTRAL,URI_HEX,WEIRD_PORT. 
I've tried clearing all formatting and re-posting, but the same thing 
occurred. What should I do?


Regards,


Join backport to 3.x

2012-05-10 Thread Valeriy Felberg
Hi,

I've applied the patch from
https://issues.apache.org/jira/browse/SOLR-2604 to Solr 3.5. It works
but noticeably slows down the query time. Has someone already solved
that problem?

Cheers,
Valeriy


Re: Dynamic creation of cores for this use case.

2012-05-10 Thread pprabhcisco123
Hi,

 Thanks, Sujatha, for your response.
 I tried to create the core as per the blog URL that you gave, which does
the following:

mkdir -p /etc/solr/conf/$name/conf
cp -a /etc/solr/conftemplate/* /etc/solr/conf/$name/conf/
sed -i "s/CORENAME/$name/" /etc/solr/conf/$name/conf/solrconfig.xml
curl "http://localhost:8080/solr/admin/cores?action=CREATE&name=$name&instanceDir=/etc/solr/conf/$name"

 I didn't understand what the sed command is trying to do. I searched
the solrconfig file for the core name to replace, but there is no "CORENAME"
placeholder in it.
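Note: that sed only does something if the conftemplate's solrconfig.xml contains a literal CORENAME placeholder, which the blog author presumably added by hand; it is not a stock Solr token. A minimal sketch of the substitution (all paths here are assumptions):

```shell
# Build a throwaway template containing the CORENAME placeholder,
# then substitute the real core name the way the blog's script does.
name=customer1
mkdir -p /tmp/solrdemo/$name/conf
printf '<dataDir>/var/solr/data/CORENAME</dataDir>\n' \
    > /tmp/solrdemo/$name/conf/solrconfig.xml
sed -i "s/CORENAME/$name/" /tmp/solrdemo/$name/conf/solrconfig.xml
cat /tmp/solrdemo/$name/conf/solrconfig.xml
```

If your solrconfig.xml has no CORENAME token, the sed is simply a no-op, which is harmless; the core name then comes solely from the CREATE parameters.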

 Please let me know what configuration changes I need to make for
a single core.

Thanks 
Prabhakaran. P

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Dynamic-creation-of-cores-for-this-use-case-tp3937696p3976507.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Newbie Tries to make a Schema.xml

2012-05-10 Thread Spadez
Right, for Long/Lat I found this information:

<!-- Long / Lat Field Type -->



<!-- Fields -->





Does this look more logical?
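For reference, a typical Solr 3.x lat/long setup looks roughly like this (only a sketch; the field names are assumptions, not necessarily what was posted):

```xml
<!-- Long / Lat Field Type -->
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>

<!-- Fields -->
<field name="latlong" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
```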

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Newbie-tries-to-make-a-Schema-xml-tp3974200p3976539.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EDisMax and Query Phrase Slop Problem

2012-05-10 Thread Ahmet Arslan
Hi Andre,

qs is used when you have an explicit phrase query (you need to use quotes for 
this) in your search string.

q="lisboa tipos"&qs=1


--- On Wed, 5/9/12, André Maldonado  wrote:

> From: André Maldonado 
> Subject: EDisMax and Query Phrase Slop Problem
> To: solr-user@lucene.apache.org
> Date: Wednesday, May 9, 2012, 3:41 PM
> Hi all.
> 
> In my index I have this field called textoboost. In one of
> the documents,
> the field have this value:
> 
>  />PORTUGAL *Lisboa *vendo ótimos aptos
> vários *tipos *totalmente legalizados reformados ótimas
> localidades ótimos
> preços, Inf. Rio 8353-2447 /9425-0025
> Sr.PedroPORTUGAL Lisboa
> vendo ótimos aptos vários tipos totalmente legalizados
> reformados ótimas
> localidades ótimos preços, Inf. Rio 8353-2447 /9425-0025
> Sr.Pedro
> 31217483121748imoveis />-1 dormitorio str>imoveisZP778492 />Venda >
> 
> And this is the analyzer for this field:
> 
> <fieldType name="..." class="solr.TextField" positionIncrementGap="1000">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="^(?:19|20)(\d{2})$" replacement="$1" />
>     <filter class="solr.ASCIIFoldingFilterFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true" />
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>   </analyzer>
> </fieldType>
> 
> So, when doing the following query, I get the above document:
> 
> http://00.000.0.00:8983/solr/Index/select/?start=0&rows=12&*
> q=lisboa%20tipos&qf=textoboost*&fq=localexibicao%3axpto*
> &defType=edismax&mm=100%*25&debugQuery=true&echoParams=all
> 
> And this is ok. But when asking for a query phrase slop of
> "1", the
> document is still returned:
> 
> http://00.000.0.00:8983/solr/Index/select/?start=0&rows=12&q=lisboa%20tipos&qf=textoboost&fq=localexibicao%3axpto
> *&qs=1*&defType=edismax&mm=100%25&debugQuery=true&echoParams=all
> 
> The Query Phrase Slop (qs param) isn't affecting matching,
> or have I misunderstood
> the usage of this parameter? In Solr Wiki, we
> have only this
> explanation:
> 
> qs (Query Phrase Slop)Amount of slop on phrase queries
> explicitly included
> in the user's query string (in qf fields; affects matching)
> 
> Thanks in advance.
> 
> *
> --
> *
> *"And you will know the truth, and the truth will set you free."
> (John 8:32)*
> 
>  *andre.maldonado*@gmail.com 
>  (11) 9112-4227


Re: Solr query issues

2012-05-10 Thread anarchos78
I am a newbie in this "Solr" thing, but with your advice I am on track now
(sort of).
It seems that the "Lucene" community is responsive, and fortunately it
doesn't turn its back on newbies!
Thank you guys,
Tom

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-query-issues-tp3974922p3976565.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Dynamic creation of cores for this use case.

2012-05-10 Thread pprabhcisco123
Hi sujatha,

Basically, I just want to explain the use case. It is described below:


1.  Create a VM running Solr, with one core per customer.
2.  Index all of each customer's data (config text, metadata, etc.) into
a single core.
3.  Create one fake "partner" per 30 customers (a separate column as
partner_id in the database table).
4.  Index all 30 customers' data into the "partners" view.
5.  Build a simple search page on top of the system that allows users to:
    - switch between customer or partner IDs
    - search various strings
    - receive the output from Solr
6.  With the goal of understanding:
    - How many Solr cores can we run per VM?
    - What's the load during indexing?
    - Invalidate and rebuild?
    - If we run multiple concurrent queries, how badly does it impact
      performance?

I am trying to create the second one. Please help with the same.
Thanks


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Dynamic-creation-of-cores-for-this-use-case-tp3937696p3976594.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr On Fly Field creation from full text for N-Gram Indexing

2012-05-10 Thread Husain, Yavar
I have full text in my database and I am indexing it using Solr. At
runtime, i.e. while the indexing is going on, can I extract certain
parameters based on a regex and create another field/column on the fly
using Solr for that extracted text?

For example, my DB has just 2 columns (DocId & FullText):

DocId   FullText
1       My name is Avi. RoleId: GYUIOP-MN-1087456. .

Now say, while indexing, I want to extract the RoleId, place it in another
column created on the fly, and index that column using N-Gram indexing. I
don't want to N-Gram the full text as that would be too time-expensive.

Thanks!! Any clues would be appreciated.


**This message may contain confidential or proprietary information intended only for 
the use of the addressee(s) named above or may contain information that is 
legally privileged. If you are not the intended addressee, or the person 
responsible for delivering it to the intended addressee, you are hereby 
notified that reading, disseminating, distributing or copying this message is 
strictly prohibited. If you have received this message by mistake, please 
immediately notify us by replying to the message and delete the original 
message and any copies immediately thereafter.

Thank you.~
**
FAFLD



Re: Solr On Fly Field creation from full text for N-Gram Indexing

2012-05-10 Thread Jack Krupansky

You can use "Regex Transformer" to extract from a source field.

See:
http://wiki.apache.org/solr/DataImportHandler#RegexTransformer
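A DIH sketch for the RoleId example from the question (the entity, table, and column names are assumptions, and the regex is only illustrative):

```xml
<entity name="doc" transformer="RegexTransformer"
        query="SELECT DocId, FullText FROM docs">
  <!-- Copy capture group 1 of the regex into a new roleId field -->
  <field column="roleId" sourceColName="FullText"
         regex="RoleId:\s*([A-Z0-9-]+)" />
</entity>
```

The roleId field can then be typed in schema.xml with an N-gram analyzer, so only the extracted token is n-grammed rather than the whole text.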

-- Jack Krupansky

-Original Message- 
From: Husain, Yavar

Sent: Thursday, May 10, 2012 6:04 AM
To: solr-user@lucene.apache.org
Subject: Solr On Fly Field creation from full text for N-Gram Indexing




RE: Solr On Fly Field creation from full text for N-Gram Indexing

2012-05-10 Thread Husain, Yavar
Thanks Jack.

I tried the Regex Transformer and the indexing has become really slow. Is
the Regex Transformer slower than N-Gram indexing? They may be apples and
oranges, but what I mean is: after extracting the field, I want to N-Gram
index it. So it seems that N-Gram indexing the full text (i.e. without
extracting what I need using the RegexTransformer) is the better solution,
ignoring space complexity?

Any views?

THANKS!!

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Thursday, May 10, 2012 4:09 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr On Fly Field creation from full text for N-Gram Indexing

You can use "Regex Transformer" to extract from a source field.

See:
http://wiki.apache.org/solr/DataImportHandler#RegexTransformer

-- Jack Krupansky




Re: Can one determine which results are "good enough" to alert users about?

2012-05-10 Thread Jan Høydahl
Hi,

The whole thinking of score threshold is flawed in this situation.
Chris, you say yourself that you plan to let people subscribe to searches which 
are known to have crappy results for perhaps the majority of hits, and there is 
no automatic way of rectifying that.

Imagine a search for the two words "Software License", and that your search 
does an OR search with stemming etc.
Now, in a large corpus of documents, scoring will see to it that the first page 
is probably filled with hits relevant to both words. But if you try to match 
smaller batches of documents, say all new docs every hour or day, you may very 
well be in a situation where no docs are relevant, but you still find plenty of 
matches for only "Software" or only "License"/"licenses"/"licensing". This would 
be slightly better with an AND search, but it would not be usable for alerting 
unless the query itself was a phrase query for "Software License".

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 9. mai 2012, at 22:55, Otis Gospodnetic wrote:

> Hi Chris,
> 
> I think there is some confusion here.
> When people say things about relevance scores they talk about comparing them 
> across queries.
> What you have is a different situation, or at least a situation that lends 
> itself to working around this, at least partially.
> 
> You have N users.
> Each user enters N queries.
> 
> You have an incoming stream of documents that you want to match against all 
> users' saved queries.
> 
> When a new document is matched you could:
> 1) send it to user right away
> 2) store it somewhere as a document that matched a query Q and send all 
> matches to users periodically.
> 
> If you go with 1) then either you send all matches to users, or you introduce 
> the notion of the score thresholds.  That's bad for the reason you already 
> identified.
> If you go with 2) then you have the option of batching up matches for each 
> saved query and alerting users only every N hours.  Then, you could introduce 
> logic that says:
> "If there are >N matches for query Q then remove all matches with score < X"
> "If there are >M matches for query Q, then remove all matches with score < Y"
> ...
> 
> Maybe you can turn this into a feature in your product ;)
> 
> Otis 
> 
> Performance Monitoring for Solr / ElasticSearch / HBase - 
> http://sematext.com/spm 
> 
> 
> 
>> 
>> From: Chris Harris 
>> To: solr-user@lucene.apache.org 
>> Sent: Wednesday, May 9, 2012 4:50 AM
>> Subject: Can one determine which results are "good enough" to alert users 
>> about?
>> 
>> I'm trying to think through a Solr-based email alerting engine that
>> would have the following properties:
>> 
>> 1. Users can enter queries they want to be alerted on, and the syntax
>> for alert queries should be the same syntax as my regular solr
>> (dismax) queries.
>> 
>> 1a. Corollary: Because of not just tf-idf but also dismax pf and qf
>> boosting, this implies that the set of documents that match a given
>> query will vary widely in quality; the first page of search results
>> will be quite good, but the last page won't be worth looking at.
>> 
>> 2. The email alerting engine shouldn't bother alerting people about
>> *all* new results for a given query; in particular it should avoid the
>> poor-quality tail of results and just alert on "the good stuff".
>> 
>> Unfortunately, my current understanding of Solr/Lucene is that there's
>> not a good automatic way to partition the set of query results into
>> "good stuff" vs "not good stuff". The main option I know of is to
>> filter out documents below a certain score threshold, but if you
>> search the Lucene/Solr mailing lists, people will advise that this is
>> unlikely to be fruitful. (It ultimately boils down to how Lucene/Solr
>> scores wasn't especially designed to mean anything as absolute
>> numbers, only when compared to other scores.)
>> 
>> This makes me wonder if there's something wrong with my original
>> requirements, or whether people have thought of some other way to
>> approach this.
>> 
>> Interestingly, Google appears to have solved this at least to some
>> degree with Google Alerts (http://www.google.com/alerts); there you
>> can choose to receive "Only the best results" rather than "All the
>> results". I'm not clear how they determine which results are "best",
>> but their UI certainly implies they've come up with some scheme for
>> it.
>> 
>> Thanks,
>> Chris
>> 
>> 



Field with attribut in the schema.xml ?

2012-05-10 Thread Bruno Mannina

Dear,

I can't find how to define in my schema.xml a field with this format.

My original format is:





WEBER WALTER


CH





ROSSI PASCAL


FR





I convert it to:
...
WEBER WALTER
ROSSI PASCAL
...

but how can I add the country code to the field without losing the link 
to each inventor?

Can I use an attribute?

Any ideas are welcome :)

Thanks,
Bruno Mannina


Re: Field with attribut in the schema.xml ?

2012-05-10 Thread G.Long

Hi :)

You could just add a field called country and then add the information 
to your document.


Regards,
Gary L.

On 10/05/2012 14:25, Bruno Mannina wrote:





Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Bruno Mannina

like that:

CH
FR

but in this case I lose the link between each inventor and his country?

If I search for an inventor named ROSSI with CH:
q=inventor:rossi AND inventor-country:CH

then I will get this result, but it's not correct because Rossi is FR.

On 10/05/2012 14:28, G.Long wrote:

Hi :)

You could just add a field called country and then add the information 
to your document.


Regards,
Gary L.









Solr indexing HTML metatags from Nutch

2012-05-10 Thread ML mail
Hello,

I am using Nutch 1.4 with Solr 3.6.0 and would like to get the HTML keywords 
and description metatags indexed into Solr. On the Nutch side I have followed 
http://wiki.apache.org/nutch/IndexMetatags to get Nutch parsing and 
extracting the metatags (using the index-metatags and parse-metatags plugins), 
but now when I run solrindex they simply don't get indexed.

In Solr I am using the schema.xml provided by Nutch and have added the 
following fields for the metatags:
 
        
        
        

and have created a solrindex-mapping.xml file as follow:








The rest is pretty much a default install of Solr. So now my question is: why 
can't I see the metatags indexed in Solr? Did I maybe forget to configure 
something in Solr?

Any suggestions are welcome.

Thanks
M.L.
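As a point of comparison, a working metatag setup usually declares the fields under the exact names the Nutch plugins emit and maps them one-to-one; a sketch (every field name here is an assumption to be checked against your actual Nutch output):

```xml
<!-- schema.xml -->
<field name="metatag.keywords" type="text" stored="true" indexed="true"/>
<field name="metatag.description" type="text" stored="true" indexed="true"/>

<!-- solrindex-mapping.xml -->
<field dest="metatag.keywords" source="metatag.keywords"/>
<field dest="metatag.description" source="metatag.description"/>
```

A frequent failure mode is a mismatch between the names Nutch produces and the names declared in Solr, so dumping one parsed segment to verify the emitted field names is worth doing first.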

Re: Field with attribut in the schema.xml ?

2012-05-10 Thread G.Long

When you add data into Solr, you add documents which contain fields.
In your case, you should create a document for each of your inventors 
with every attribute they could have.


Here is an example in Java:

SolrInputDocument doc = new SolrInputDocument();
doc.addField("inventor", "Rossi");
doc.addField("country", "FR");
solrServer.add(doc);
solrServer.commit(); // don't forget to commit so the docs become searchable
...
And then you do the same for all your inventors.

This way, each doc in your index represents one inventor and you can 
query them like:

q=inventor:rossi AND country:FR
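For completeness, the same document can be posted without SolrJ, as a Solr XML update message (same field names as in the Java example):

```xml
<add>
  <doc>
    <field name="inventor">ROSSI PASCAL</field>
    <field name="country">FR</field>
  </doc>
</add>
```

POST it to the /update handler, then send a commit.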

On 10/05/2012 14:33, Bruno Mannina wrote:

like that:

CH
FR

but in this case Ioose the link between inventor and its country?

if I search an inventor named ROSSI with CH:
q=inventor:rossi and inventor-country=CH

the I will get this result but it's not correct because Rossi is FR.











Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Michael Kuhlmann

On 10.05.2012 14:33, Bruno Mannina wrote:

like that:

CH
FR

but in this case Ioose the link between inventor and its country?


Of course, you need to index the two inventors into two distinct documents.

Did you mark those fields as multi-valued? That won't make much sense IMHO.

Greetings,
Kuli


Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Bruno Mannina
But I have more than 80 000 000 documents, with many fields having this 
kind of description?!

i.e.:
inventor
applicant
assignee
attorney

Must I create 4 documents for each document??

On 10/05/2012 14:41, G.Long wrote:

When you add data into Solr, you add documents which contain fields.
In your case, you should create a document for each of your inventors 
with every attribute they could have.


Here is an example in Java:

SolrInputDocument doc = new SolrInputDocument();
doc.addField("inventor", "Rossi");
doc.addField("country", "FR");
solrServer.add(doc);
...
And then you do the same for all your inventors.

This way, each doc in your index represents one inventor and you can 
query them like:

q=inventor:rossi AND country:FR















Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Bruno Mannina

>>Did you mark those fields as multi-valued?

yes, I did.


Re: Field with attribut in the schema.xml ?

2012-05-10 Thread G.Long
You don't have to create a document per field. You have to create a 
document per person.


If inventors, applicants, assignees and attorneys have properties in 
common, you could have a model like :






Regards,
Gary

On 10/05/2012 14:47, Bruno Mannina wrote:
But I have more than 80 000 000 documents with many fields with this 
kind of description?!


i.e:
inventor
applicant
assignee
attorney

I must create for each document 4 documents ??

















Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Michael Kuhlmann
I don't know the details of your schema, but I would create fields like 
name, country, street etc., and a field named role, which contains 
values like inventor, applicant, etc.

How would you do it otherwise? Create only four documents, each field 
containing 80 million values?
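That role-field approach would translate into schema fields roughly like these (a sketch; the names and types are assumptions):

```xml
<field name="patent_number" type="string" indexed="true" stored="true"/>
<field name="name"          type="text"   indexed="true" stored="true"/>
<field name="country"       type="string" indexed="true" stored="true"/>
<!-- role holds one of: inventor, applicant, assignee, attorney -->
<field name="role"          type="string" indexed="true" stored="true"/>
```

A query like q=name:rossi AND country:FR AND role:inventor then matches exactly one person, so the name/country link is never lost.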


Greetings,
Kuli

On 10.05.2012 14:47, Bruno Mannina wrote:

But I have more than 80 000 000 documents with many fields with this
kind of description?!

i.e:
inventor
applicant
assignee
attorney

I must create for each document 4 documents ??

















Suddenly OOM

2012-05-10 Thread Jasper Floor
Hi all,

we've been running Solr 1.4 for about a year with no real problems. As
of Monday it became impossible to do a full import on our master
because of an OOM. Now what I think is strange is that even after we
more than doubled the available memory there would still always be an
OOM. We seem to have reached a magic number of documents beyond which
Solr requires infinite memory (or at least more than 2.5x what it
previously needed, which is the same as infinite unless we invest in
more resources).

We have solved the immediate problem by changing autocommit=false,
holdability="CLOSE_CURSORS_AT_COMMIT", batchSize=1. Now
holdability in this case I don't think does very much as I believe
this is the default behavior. BatchSize certainly has a direct effect
on performance (about 3x time difference between 1 and 1). The
autocommit is a problem for us however. This leaves transactions
active in the db which may block other processes.

We have about 5.1 million documents in the index which is about 2.2 gigabytes.

A full index is a rare operation with us but when we need it we also
need it to work (thank you captain obvious).

With the settings above a full index takes 15 minutes. We anticipate
we will be handling at least 10x the amount of data in the future. I
actually hope to have solr 4 by then but I can't sell a product which
isn't finalized yet here.


Thanks for any insight you can give.

mvg,
Jasper


slave index not cleaned

2012-05-10 Thread Jasper Floor
Perhaps I am missing the obvious but our slaves tend to run out of
disk space. The index sizes grow to multiple times the size of the
master. So I just toss all the data and trigger a replication.
However, can't solr handle this for me?

I'm sorry if I've missed a simple setting which does this for me, but
if its there then I have missed it.

mvg
Jasper


Re: Field with attribut in the schema.xml ?

2012-05-10 Thread G.Long

I think I see what the problem is.
Correct me if I'm wrong, but I guess your schema does not represent a 
person but something which can contain a list of persons with different 
attributes, right?

The problem is that you can't easily reproduce the hierarchy of 
structured data. There is no attribute in a Lucene index as there can be 
in an XML document. If your structured data is not too complex, you could 
try to add a field to your schema called "person" and concatenate all 
properties (name, age, role, country) into this unique field, but that 
solution works only if you don't need to search these properties...


Regards,
Gary Long

On 10/05/2012 14:25, Bruno Mannina wrote:





Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Bruno Mannina
Actually I have documents like this one; the country of the inventor is 
inside the field "inventor".
It's not exactly an inventor notice, it's a patent notice with several 
fields.

The "patent-number" field is the field key.

Should I split my document and use the field key to link them (like in a 
normal database)?






EP1416522A4
20050921
19052554
THIN-FILM SEMICONDUCTOR DEVICE AND ITS PRODUCTION 
METHOD
DISPOSITIF SEMI-CONDUCTEUR A FILM MINCE ET SON 
PROCEDE DE PRODUCTION
DANNFILM-HALBLEITERBAUELEMENT UND VERFAHREN ZU 
SEINER HERSTELLUNG

H01L21/20D2
H01L  21/0220060101C I20051008RMEP 
ADV LCD TECH DEV CT CO LTD [JP]
MATSUMURA M [JP]
OANA Y [JP]
ABE H [JP]
YAMAMOTO Y [JP]
KOSEKI H [JP]
WARABISAKO M [JP]






On 10/05/2012 14:57, G.Long wrote:
You don't have to create a document per field. You have to create a 
document per person.


If inventors, applicants, assignees and attorneys have properties in 
common, you could have a model like :






Regards,
Gary





















Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Bruno Mannina

On 10/05/2012 15:12, G.Long wrote:

I think I see what the problem is.
Correct me if I'm wrong but I guess your schema does not represent a 
person but something which can contain a list of persons with 
different attributes, right?


Yes, exactly what I have! (see my next message)


Re: Field with attribut in the schema.xml ?

2012-05-10 Thread G.Long
I don't know what the best solution is. You could indeed split your 
documents and link them with the patent-number inside the same index. Or 
you could also use different cores with a specific schema (one core with 
the schema for the patent and one core with the schema for the inventor) 
and still link the inventors to the patent. see 
http://wiki.apache.org/solr/CoreAdmin  :)
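As a hedged illustration of the two-core setup suggested here — core and directory names are assumptions — solr.xml would look something like:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- each core has its own conf/schema.xml -->
    <core name="patents" instanceDir="patents"/>
    <core name="inventors" instanceDir="inventors"/>
  </cores>
</solr>
```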


Regards,


On 10/05/2012 15:28, Bruno Mannina wrote:
Actually I have documents like this one; the inventor's country is 
inside the "inventor" field.
It's not exactly an inventor notice, it's a patent notice with several 
fields.

The "patent-number" field is the fieldkey.

Should I split my document and use the field key to link them (like in 
a normal database)?






EP1416522A4
20050921
19052554
THIN-FILM SEMICONDUCTOR DEVICE AND ITS 
PRODUCTION METHOD
DISPOSITIF SEMI-CONDUCTEUR A FILM MINCE ET SON 
PROCEDE DE PRODUCTION
DÜNNFILM-HALBLEITERBAUELEMENT UND VERFAHREN ZU 
SEINER HERSTELLUNG

H01L21/20D2
H01L  21/0220060101C I20051008RMEP 
ADV LCD TECH DEV CT CO LTD [JP]
MATSUMURA M [JP]
OANA Y [JP]
ABE H [JP]
YAMAMOTO Y [JP]
KOSEKI H [JP]
WARABISAKO M [JP]





Re: Field with attribut in the schema.xml ?

2012-05-10 Thread Bruno Mannina


The problem is that you can't easily reproduce the hierarchy of 
structured data. There are no attributes in a Lucene index as there can be 
in an XML document. If your structured data is not too complex, you 
could try to add a field to your schema called "person" and 
concatenate all the properties (name, age, role, country) into this single 
field, but that solution works only if you don't need to search on 
these properties...


Yep, I think I will create a full-inventor field, with the country, that 
is stored but not indexed, and an inventor field, without the country, that is indexed.
I can't lose this information (country); I need it at least in the result 
of the request.


The search will be done only on the inventor field, and full-inventor will be 
returned in the XML result.
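A hedged schema.xml sketch of this two-field approach (field type names are assumptions):

```xml
<!-- stored for display in results, not searchable: e.g. "ROSSI PASCAL [FR]" -->
<field name="full-inventor" type="string" indexed="false" stored="true" multiValued="true"/>
<!-- searchable name only, with the country code dropped -->
<field name="inventor" type="text" indexed="true" stored="false" multiValued="true"/>
```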


Re: Suddenly OOM

2012-05-10 Thread Godfrey Obinchu
You need to tune garbage collection on your JVM to handle the OOM.

Sent from my iPhone

On May 10, 2012, at 21:06, "Jasper Floor"  wrote:

> Hi all,
> 
> we've been running Solr 1.4 for about a year with no real problems. As
> of monday it became impossible to do a full import on our master
> because of an OOM. Now what I think is strange is that even after we
> more than doubled the available memory there would still always be an
> OOM.  We seem to have reached a magic number of documents beyond which
> Solr requires infinite memory (or at least more than 2.5x what it
> previously needed which is the same as infinite unless we invest in
> more resources).
> 
> We have solved the immediate problem by changing autocommit=false,
> holdability="CLOSE_CURSORS_AT_COMMIT", batchSize=1. Now
> holdability in this case I don't think does very much as I believe
> this is the default behavior. BatchSize certainly has a direct effect
> on performance (about 3x time difference between 1 and 1). The
> autocommit is a problem for us however. This leaves transactions
> active in the db which may block other processes.
> 
> We have about 5.1 million documents in the index which is about 2.2 gigabytes.
> 
> A full index is a rare operation with us but when we need it we also
> need it to work (thank you captain obvious).
> 
> With the settings above a full index takes 15 minutes. We anticipate
> we will be handling at least 10x the amount of data in the future. I
> actually hope to have solr 4 by then but I can't sell a product which
> isn't finalized yet here.
> 
> 
> Thanks for any insight you can give.
> 
> mvg,
> Jasper


question about solr response qtime

2012-05-10 Thread G.Long

Hi :)

In what unit of time is the QTime of a QueryResponse expressed? Is it 
milliseconds?


Gary


Re: slave index not cleaned

2012-05-10 Thread Otis Gospodnetic
Hi Jasper,

Solr does handle that for you.  Some more stuff to share:

* Solr version?
* JVM version?
* OS?
* Java replication?
* Errors in Solr logs?
* deletion policy section in solrconfig.xml?
* merge policy section in solrconfig.xml?
* ...

You may also want to look at your Index report in SPM (http://sematext.com/spm) 
before/during/after replication and share what you see.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



- Original Message -
> From: Jasper Floor 
> To: solr-user@lucene.apache.org
> Cc: 
> Sent: Thursday, May 10, 2012 9:08 AM
> Subject: slave index not cleaned
> 
> Perhaps I am missing the obvious but our slaves tend to run out of
> disk space. The index sizes grow to multiple times the size of the
> master. So I just toss all the data and trigger a replication.
> However, can't solr handle this for me?
> 
> I'm sorry if I've missed a simple setting which does this for me, but
> if its there then I have missed it.
> 
> mvg
> Jasper
>


Re: question about solr response qtime

2012-05-10 Thread Walter Underwood
Yes, milliseconds. --wunder

On May 10, 2012, at 8:57 AM, G.Long wrote:

> Hi :)
> 
> In what unit of time is the QTime of a QueryResponse expressed? Is it 
> milliseconds?
> 
> Gary






Re: question about solr response qtime

2012-05-10 Thread Otis Gospodnetic
Gary - milliseconds, right.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



- Original Message -
> From: G.Long 
> To: solr-user@lucene.apache.org
> Cc: 
> Sent: Thursday, May 10, 2012 11:57 AM
> Subject: question about solr response qtime
> 
> Hi :)
> 
> In what unit of time is the QTime of a QueryResponse expressed? Is it 
> milliseconds?
> 
> Gary
>


Re: question about solr response qtime

2012-05-10 Thread crive
Yes

On Thu, May 10, 2012 at 4:57 PM, G.Long  wrote:

> Hi :)
>
> In what unit of time is the QTime of a QueryResponse expressed? Is it
> milliseconds?
>
> Gary
>


Re: question about solr response qtime

2012-05-10 Thread G.Long

Thank you both =)

Gary

On 10/05/2012 17:59, Otis Gospodnetic wrote:

Gary - milliseconds, right.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm 



Yes, milliseconds. --wunder

- Original Message -

From: G.Long
To: solr-user@lucene.apache.org
Cc:
Sent: Thursday, May 10, 2012 11:57 AM
Subject: question about solr response qtime

Hi :)

In what unit of time is the QTime of a QueryResponse expressed? Is it
milliseconds?

Gary





Re: Suddenly OOM

2012-05-10 Thread Otis Gospodnetic
Jasper,

The simple answer is to increase -Xmx :)
What is your ramBufferSizeMB (solrconfig.xml) set to?  Default is 32 (MB).

That autocommit you mentioned is a DB commit, not a Solr one, right?  If so, why 
is a commit needed when you *read* data from the DB?

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



- Original Message -
> From: Jasper Floor 
> To: solr-user@lucene.apache.org
> Cc: 
> Sent: Thursday, May 10, 2012 9:06 AM
> Subject: Suddenly OOM
> 
> Hi all,
> 
> we've been running Solr 1.4 for about a year with no real problems. As
> of monday it became impossible to do a full import on our master
> because of an OOM. Now what I think is strange is that even after we
> more than doubled the available memory there would still always be an
> OOM.  We seem to have reached a magic number of documents beyond which
> Solr requires infinite memory (or at least more than 2.5x what it
> previously needed which is the same as infinite unless we invest in
> more resources).
> 
> We have solved the immediate problem by changing autocommit=false,
> holdability="CLOSE_CURSORS_AT_COMMIT", batchSize=1. Now
> holdability in this case I don't think does very much as I believe
> this is the default behavior. BatchSize certainly has a direct effect
> on performance (about 3x time difference between 1 and 1). The
> autocommit is a problem for us however. This leaves transactions
> active in the db which may block other processes.
> 
> We have about 5.1 million documents in the index which is about 2.2 gigabytes.
> 
> A full index is a rare operation with us but when we need it we also
> need it to work (thank you captain obvious).
> 
> With the settings above a full index takes 15 minutes. We anticipate
> we will be handling at least 10x the amount of data in the future. I
> actually hope to have solr 4 by then but I can't sell a product which
> isn't finalized yet here.
> 
> 
> Thanks for any insight you can give.
> 
> mvg,
> Jasper
>


StatsComponent or ...?

2012-05-10 Thread vybe3142
Hi,

My requirement is to calculate the sum of a certain field over the result set.

StatsComponent does what I need eg. 


results in 
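For reference, a StatsComponent request and the shape of its response — the field name here is an assumption:

```xml
<!-- request: /select?q=*:*&stats=true&stats.field=price&rows=0 -->
<lst name="stats_fields">
  <lst name="price">
    <double name="min">...</double>
    <double name="max">...</double>
    <double name="sum">...</double>
    <long name="count">...</long>
    <double name="sumOfSquares">...</double>
    <double name="mean">...</double>
    <double name="stddev">...</double>
  </lst>
</lst>
```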


Questions:
1. I don't need to calculate min, max, sumOfSquares, etc. Is there a way to
limit the stats to the sum and nothing else?
2. Is there going to be a significant performance impact with StatsComponent
(the field is single-valued)?
3. Is there a better alternative to achieve what I want?

Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/StatsComponent-or-tp3977553.html
Sent from the Solr - User mailing list archive at Nabble.com.


Populating 'multivalue' fields (m:1 relationships)

2012-05-10 Thread Klostermeyer, Michael
I am attempting to index a DB schema that has a many:one relationship.  I 
assume I would index this within Solr as a multiValued="true" field; is that 
correct?

I am currently populating the Solr index w/ a stored procedure in which each DB 
record is "flattened" into a single document in Solr.  I would like one of 
those Solr document fields to contain multiple values from the m:1 table (i.e. 
[fieldName]=1,3,6,8,7).  I then need to be able to do a "fq=fieldname:3" and 
return the previous record.

My question is: how do I populate Solr with a multi-valued field for many:1 
relationships?  My first guess would be to concatenate all the values from the 
'many' side into a single DB column in the SP, then pipe that column into a 
multiValued="true" Solr field.  The DB side of that will be ugly, but would the 
Solr side index this properly?  If so, what delimiter would 
allow Solr to index each element of the multivalued field?
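As a sketch of the Solr side (field, type, and column names are assumptions): the schema declares the field multiValued, and DIH's RegexTransformer can split a concatenated column on a delimiter:

```xml
<!-- schema.xml: holds any number of title values per document -->
<field name="title" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- data-config.xml: split the concatenated DB column on commas -->
<entity name="doc" transformer="RegexTransformer" query="...">
  <field column="title" splitBy=","/>
</entity>
```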

[Warning: possible tangent below...but I think this question is relevant.  If 
not, tell me and I'll break it out]

I have gone out of my way to "flatten" the data within my SP prior to giving it 
to Solr.  For my solution stated above, I would have the following data (Title 
being the "many" side of the m:1, and PK being the Solr unique ID):

PK | Name | Title
Pk_1 | Dwight | Sales, Assistant To The Regional Manager
Pk_2 | Jim | Sales
Pk_3 | Michael | Regional Manager

Below is an example of a non-flattened record set.  How would Solr handle a 
data set in which the following data was indexed:

PK | Name | Title
Pk_1 | Dwight | Sales
Pk_1 | Dwight | Assistant To The Regional Manager
Pk_2 | Jim | Sales
Pk_3 | Michael | Regional Manager

My assumption is that the second Pk_1 record would overwrite the first, thereby 
losing the "Sales" title from Pk_1.  Am I correct on that assumption?

I'm new to this ballgame, so don't be shy about pointing me down a different 
path if I am doing anything incorrectly.

Thanks!

Mike Klostermeyer


Re: Error messages

2012-05-10 Thread Shawn Heisey

On 5/10/2012 2:02 AM, Tolga wrote:

Apache servers are returning my post with the status messages
HTML_FONT_SIZE_HUGE,HTML_MESSAGE,HTTP_ESCAPED_HOST,NORMAL_HTTP_TO_IP,RCVD_IN_DNSWL_LOW,SPF_NEUTRAL,URI_HEX,WEIRD_PORT. 
I've tried clearing all formatting and a re-post, but the same thing 
occurred. What to do?


Don't send HTML email, send plain text only.  Exactly how to do this 
will depend on what program/website you use for email.


Thanks,
Shawn



Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-10 Thread Ravi Solr
I cleaned the entire index and re-indexed it with SolrJ 3.6. Still I get
the same error every single day. How can I see whether the container
returned a partial/nonconforming response, since it may be hidden by
SolrJ?

Thanks

Ravi Kiran Bhaskar

On Mon, May 7, 2012 at 2:16 PM, Ravi Solr  wrote:
> Hello Mr. Miller and Mr. Erickson,
>              Found yet another inconsistency on the query server that
> might be causing this issue. Today morning also I got a similar error
> as shown in stacktrace below. So I tried querying for that
> "d101dd3a-979a-11e1-927c-291130c98dff" which is our unique key in the
> schema.
>
> On the server having issue it returned more than 10 docs with
> numFound="1051273" and on all other sane servers it returned only 1
> doc with numFound="1". This is really weird, as, we copied the entire
> index from a sane server onto the server having issues now just 2 days
> ago. Do you have any idea why this would happen ?
>
> [#|2012-05-07T12:58:54.055-0400|SEVERE|sun-appserver2.1.1|com.wpost.ipad.feeds.FeedController|_ThreadID=22;_ThreadName=httpSSLWorkerThread-9001-3;_RequestID=4203e3e5-c39d-4df7-a32a-600d0169c81f;|Error
> searching for thumbnails for d101dd3a-979a-11e1-927c-291130c98dff
> org.apache.solr.client.solrj.SolrServerException: Error executing query
>        at 
> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
>        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:311)
>        at xxx.xxx.xxx.xxx.populateThumbnails(FeedController.java:1184)
>        at xxx.xxx.xxx.xxx..findNewsBySection(FeedController.java:509)
>        at sun.reflect.GeneratedMethodAccessor197.invoke(Unknown Source)
> ..
> ...
> ..
> Caused by: java.lang.RuntimeException: Invalid version (expected 2,
> but 60) or the data in not in 'javabin' format
>        at 
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
>        at 
> org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
>        at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:333)
>        at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:211)
>        at 
> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
>        ... 43 more
> |#]
>
> Ravi Kiran Bhaskar
> Principal Software Engineer
> Washington Post Digital
> 1150 15th Street NW, Washington, DC 20071
>
> On Mon, May 7, 2012 at 9:36 AM, Mark Miller  wrote:
>> Normally this specific error is caused by a non-success HTTP error page being 
>> returned as the response. The response parser then tries to parse the HTML as javabin.
>>
>> Sent from my iPhone
>>
>> On May 7, 2012, at 7:37 AM, Erick Erickson  wrote:
>>
>>> Well, I'm guessing that the version of Solr (and perhaps there are
>>> classpath issues in here?) are different, somehow, on the machine
>>> slave that is showing the error.
>>>
>>> It's also possible that your config files have a different  LUCENE_VERSION
>>> in them, although I don't think this should really create the errors you're
>>> reporting.
>>>
>>> The thing that leads me in this direction is your statement that things
>>> are fine for a while and then go bad later.  If replication happens just
>>> before you get the index version error, that would point a finger at
>>> something like different Solr versions.
>>>
>>> If there is no replication before this error, then this probably isn't
>>> the problem
>>> and we'll have to look elsewhere...
>>>
>>> But this is all guesswork, just like every bug... things are only obvious 
>>> after
>>> you find the problem!
>>>
>>> Best
>>> Erick
>>>
>>>
>>> On Sun, May 6, 2012 at 11:08 AM, Ravi Solr  wrote:
 Thank you very much for responding Mr.Erickson. You may be right on
 old version index, I will reindex. However we have a 2
 separate/disjoint master-slave setup...only one query node/slave has
 this issue. if it was really incompatible indexes why isnt the other
 query server also throwing errors? that's what is throwing my
 debugging thought process off.

 Thanks

 Ravi Kiran Bhaskar
 Principal Software Engineer
 Washington Post Digital
 1150 15th Street NW, Washington, DC 20071

 On Sat, May 5, 2012 at 12:53 PM, Erick Erickson  
 wrote:
> The first thing I'd check is if, in the log, there is a replication 
> happening
> immediately prior to the error. I confess I'm not entirely up on the
> version thing, but is it possible you're replicating an index that
> is built with some other version of Solr?
>
> That would at least explain your statement that it runs OK, but then
> fails sometime later.
>
> Best
> Erick
>
> On Fri, May 4, 2012 at 1:50 PM, Ravi Solr  wrote:
>> Hello,
>>         We Recently we migrated our SOLR 3.6 server OS from Solaris
>> to CentOS and from then on we started seeing "Invalid version
>> (exp

RE: Nested CachedSqlEntityProcessor running for each entity row with Solr 3.6?

2012-05-10 Thread Brent Mills
Hi James,

I just pulled down the newest nightly build of 4.0 and it solves an issue I had 
been having with solr ignoring the caching of the child entities.  It was 
basically opening a new connection for each iteration even though everything 
was specified correctly.  This was present in my previous build of 4.0 so it 
looks like you fixed it with one of those patches.  Thanks for all your work on 
the DIH, the caching improvements are a big help with some of the things we 
will be rolling out in production soon.

-Brent

-Original Message-
From: Dyer, James [mailto:james.d...@ingrambook.com] 
Sent: Monday, May 07, 2012 1:47 PM
To: solr-user@lucene.apache.org
Cc: Brent Mills; dye.kel...@gmail.com; keithn...@dswinc.com
Subject: RE: Nested CachedSqlEntityProcessor running for each entity row with 
Solr 3.6?

Dear Kellen, Brent & Keith,

There now are fixes available for 2 cache-related bugs that unfortunately made 
their way into the 3.6.0 release.  These were addressed on these 2 JIRA issues, 
which have been committed to the 3.6 branch (as of today):
- https://issues.apache.org/jira/browse/SOLR-3430
- https://issues.apache.org/jira/browse/SOLR-3360
These problems were also affecting Trunk/4.x, with both fixes being committed to 
Trunk under SOLR-3430.

Should Solr 3.6.1 be released, these fixes will become generally available at 
that time.  They also will be part of the 4.0 release, which the Development 
Community hopes will be later this year.

In the mean time, I am hoping each of you can test these fixes with your 
installation.  The best way to do this is to get a fresh SVN checkout of the 
3.6.1 branch 
(http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/), switch 
to the "solr" directory, then run "ant dist".  I believe you need Ant 1.8 to 
build.

If you are unable to build yourself, I put an *unofficial* shapshot of the DIH 
jar here:
 
http://people.apache.org/~jdyer/unofficial/apache-solr-dataimporthandler-3.6.1-SNAPSHOT-r1335176.jar

Please let me know if this solves your problems with DIH Caching, giving you 
the functionality you had with 3.5 and prior.  Your feedback is greatly 
appreciated.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: not interesting [mailto:dye.kel...@gmail.com]
Sent: Monday, May 07, 2012 9:43 AM
To: solr-user@lucene.apache.org
Subject: Nested CachedSqlEntityProcessor running for each entity row with Solr 
3.6?

I just upgraded from Solr 3.4 to Solr 3.6; I'm using the same data-import.xml 
for both versions. The import functioned properly with 3.4.

I'm using a nested entity to fetch authors associated with each document, and 
I'm using CachedSqlEntityProcessor to avoid hitting the DB an unreasonable 
number of times. However, when indexing, Solr indexes very slowly and appears 
to be fetching all authors in the DB for each document. The index should be 
~500 megs; I aborted the indexing when it reached ~6gigs. If I comment out the 
nested author entity below, Solr will index normally.

Am I missing something obvious or is this a bug?
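For reference, a data-config.xml of the shape being described — table, column, and connection details are assumptions:

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"/>
  <document>
    <entity name="doc" query="SELECT id, title FROM documents">
      <!-- cached child entity: the authors table should be read only once,
           then looked up in memory for each parent row -->
      <entity name="author" query="SELECT doc_id, name FROM authors"
              processor="CachedSqlEntityProcessor" where="doc_id=doc.id">
        <field column="name" name="author"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```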







 
 
 




Also posted at SO if you prefer to answer there:
http://stackoverflow.com/questions/10482484/nested-cachedsqlentityprocessor-running-for-each-entity-row-with-solr-3-6

Kellen


fq syntax question

2012-05-10 Thread anarchos78
Hello,
Solr accepts the fq parameter like: localhost:8080/solr/select/?q=blah+blah
&fq=model:member+model:new_member
Is it possible to pass the fq parameter with an alternative syntax, like
fq=model=member&model=new_member, or in some other way?
Thank you,
Tom


--
View this message in context: 
http://lucene.472066.n3.nabble.com/fq-syntax-question-tp3977899.html
Sent from the Solr - User mailing list archive at Nabble.com.


Delete documents

2012-05-10 Thread Tolga

Hi,
I've been reading 
http://lucene.apache.org/solr/api/doc-files/tutorial.html and in the 
section "Deleting Data", I've edited schema.xml to include a field named 
id, and issued the command: for f in *; do java -Ddata=args -Dcommit=no -jar 
post.jar "$f"; done. I then went to the stats page, 
only to find no files were de-indexed. How can I do that?
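For reference, the tutorial's delete-by-id approach posts an XML delete message rather than re-posting the files (the id value below is illustrative):

```xml
<delete><id>SOLR1000</id></delete>
```

This is sent, for example, as java -Ddata=args -jar post.jar "<delete><id>SOLR1000</id></delete>" (post.jar commits by default).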


Regards,


Delete data

2012-05-10 Thread Tolga

Sorry, commit=no should have been commit=yes in my previous post.

Regards,


Re: Solr file size limit?

2012-05-10 Thread Bram Rongen
Hi Guys!

I've removed the two largest documents. One of them
consisted of a single field and was around 4MB of text.

This fixed my issue..

Kind regards,

Bram Rongen

On Fri, Apr 20, 2012 at 2:09 PM, Bram Rongen  wrote:

> Hmm, reading your reply again I see that Solr only uses the first 10k
> tokens from each field, so field length should not be a problem per se. It
> could be that my documents contain very large and unorganized tokens;
> could this trip Solr up?
>
>
> On Fri, Apr 20, 2012 at 2:03 PM, Bram Rongen  wrote:
>
>> Yeah, I'm indexing some PDF documents.. I've extracted the text through
>> tika (pre-indexing).. and the largest field in my DB is 20MB. That's quite
>> extensive ;) My Solution for the moment is to cut this text to the first
>> 500KB, that should be enough for a decent index and search capabilities..
>> Should I increase the buffer size for these sizes as well or will 32MB
>> suffice?
>>
>> FYI, output of ulimit -a is
>> core file size  (blocks, -c) 0
>> data seg size   (kbytes, -d) unlimited
>> scheduling priority (-e) 20
>> *file size   (blocks, -f) unlimited*
>> pending signals (-i) 16382
>> max locked memory   (kbytes, -l) 64
>> max memory size (kbytes, -m) unlimited
>> open files  (-n) 1024
>> pipe size(512 bytes, -p) 8
>> POSIX message queues (bytes, -q) 819200
>> real-time priority  (-r) 0
>> stack size  (kbytes, -s) 8192
>> cpu time   (seconds, -t) unlimited
>> max user processes  (-u) unlimited
>> virtual memory  (kbytes, -v) unlimited
>> file locks  (-x) unlimited
>>
>>
>> Kind regards!
>> Bram
>>
>> On Fri, Apr 20, 2012 at 12:15 PM, Lance Norskog wrote:
>>
>>> Good point! Do you store the large file in your documents, or just index
>>> them?
>>>
>>> Do you have a "largest file" limit in your environment? Try this:
>>> ulimit -a
>>>
>>> What is the "file size"?
>>>
>>> On Thu, Apr 19, 2012 at 8:04 AM, Shawn Heisey  wrote:
>>> > On 4/19/2012 7:49 AM, Bram Rongen wrote:
>>> >>
>>> >> Yesterday I've started indexing again but this time on Solr 3.6..
>>> Again
>>> >> Solr is failing around the same time, but not exactly (now the
>>> largest fdt
>>> >> file is 4.8G).. It's right after the moment I receive memory-errors
>>> at the
>>> >> Drupal side which make me suspicious that it maybe has something to do
>>> >> with
>>> >> a huge document.. Is that possible? I was indexing 1500 documents at
>>> once
>>> >> every minute. Drupal builds them all up in memory before submitting
>>> them
>>> >> to
>>> >> Solr. At some point it runs out of memory and I have to switch to
>>> 10/20
>>> >> documents per minute for a while.. then I can switch back to 1000
>>> >> documents
>>> >> per minute.
>>> >>
>>> >> The disk is a software RAID1 over 2 disks. But I've also run into the
>>> same
>>> >> problem at another server.. This was a VM-server with only 1GB ram and
>>> >> 40GB
>>> >> of disk. With this server the merge-repeat happened at an earlier
>>> stage.
>>> >>
>>> >> I've also let Solr continue with merging for about two days before
>>>  (in an
>>> >> earlier attempt), without submitting new documents. The merging kept
>>> >> repeating.
>>> >>
>>> >> Somebody suggested it could be because I'm using Jetty, could that be
>>> >> right?
>>> >
>>> >
>>> > I am using Jetty for my Solr installation and it handles very large
>>> indexes
>>> > without a problem.  I have created a single index with all my data
>>> (nearly
>>> > 70 million documents, total index size over 100GB).  Aside from how
>>> long it
>>> > takes to build and the fact that I don't have enough RAM to cache it
>>> for
>>> > good performance, Solr handled it just fine.  For production I use a
>>> > distributed index on multiple servers.
>>> >
>>> > I don't know why you are seeing a merge that continually restarts,
>>> that's
>>> > truly odd.  I've never used drupal, don't know a lot about it.  From my
>>> > small amount of research just now, I assume that it uses Tika, also
>>> another
>>> > tool that I have no experience with.  I am guessing that you store the
>>> > entire text of your documents into solr, and that they are indexed up
>>> to a
>>> > maximum of 10000 tokens (the default value of maxFieldLength in
>>> > solrconfig.xml), based purely on speculation about the "body" field in
>>> your
>>> > schema.
>>> >
>>> > A document that's 100MB in size, if the whole thing gets stored, will
>>> > completely overwhelm a 32MB buffer, and might even be enough to
>>> overwhelm a
>>> > 256MB buffer as well, because it will basically have to build the
>>> entire
>>> > index segment in RAM, with term vectors, indexed data, and stored data
>>> for
>>> > all fields.
>>> >
>>> > With such large documents, you may have to increase the
>>> maxFieldLength, or
>>> > you won't be able to search on the entire document text.  Depending on
>>> the
>
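For reference, the two settings discussed above live in solrconfig.xml; in Solr 1.4/3.x the defaults are roughly:

```xml
<indexDefaults>
  <!-- RAM buffer flushed to a segment when this size is reached -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <!-- each field is truncated to this many tokens at index time -->
  <maxFieldLength>10000</maxFieldLength>
</indexDefaults>
```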

Re: Identify indexed terms of document

2012-05-10 Thread Ahmet Arslan


> Is it possible to see what terms are indexed for a field of
> a document that has
> stored=false?

One way is to use http://wiki.apache.org/solr/LukeRequestHandler

> I have a search that doesn't work with quotes, like
> this: "field:TEXT Nº
> 1098". When I remove the quotes, the search finds the document
> (using AND as the
> default operator). My idea is to see which terms were
> indexed, to analyze
> what happened. If anyone has any idea why this error
> occurs, please help.

solr/admin/analysis.jsp is also a great tool. By the way, the proper way to fire 
a phrase query is &q=field:"TEXT Nº 1098"



Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-10 Thread Shawn Heisey

On 5/10/2012 12:27 PM, Ravi Solr wrote:

I cleaned the entire index and re-indexed it with SolrJ 3.6. Still I get
the same error every single day. How can I see whether the container
returned a partial/nonconforming response, since it may be hidden by
SolrJ?


If the server is sending a non-javabin error response that SolrJ doesn't 
parse, the logs from the container that runs Solr will normally give you 
useful information.  Here's the first part of something logged on mine 
for an invalid query - the query sent in only had one double quote.  The 
log actually contains the full Java stacktrace, I just included the 
first little bit:


May 10, 2012 2:13:17 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse '(  ("trader 
joe's))': Lexical error at line 1, column 20.  Encountered:  after 
: "\"trader joe\'s))"


Thanks,
Shawn



Data Import Handler with Dynamic Fields

2012-05-10 Thread dboychuck
I am trying to import data from my DB, but I have dynamic fields whose names
I don't always know. Can someone tell me why something like this
doesn't work?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Data-Import-Handler-with-Dynamic-Fields-tp3978419.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-10 Thread Ravi Solr
Thanks for responding Mr. Heisey... I don't see any parsing errors in
my log, but I see a lot of exceptions like the one listed below. Once
an exception like this happens, weirdness ensues. For example - to
check sanity I queried for uniquekey:"111" from the Solr admin GUI; it
gave back numFound equal to all docs in that index, i.e. it's not
searching for that unique key at all, it blindly matched all docs.
However, once you restart the server, the same index without any change
works perfectly, returning only one doc in numFound when you search for
uniquekey:"111"... I tried everything from reindexing, copying the index
from another sane server, deleting the entire index and reindexing from
scratch, etc., but in vain; it works for roughly 24 hours and then starts
throwing the same error no matter what the query is.


[#|2012-05-10T13:27:14.071-0400|SEVERE|sun-appserver2.1.1|xxx.xxx.xxx.xxx|_ThreadID=21;_ThreadName=httpSSLWorkerThread-9001-6;_RequestID=d44462e7-576b-4391-a499-c65da33e3293;|Error
searching data for section Local
org.apache.solr.client.solrj.SolrServerException: Error executing query
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:311)
at xxx.xxx.xxx.xxx(FeedController.java:621)
at xxx.xxx.xxx.xxx(FeedController.java:402)
at sun.reflect.GeneratedMethodAccessor184.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.web.bind.annotation.support.HandlerMethodInvoker.invokeHandlerMethod(HandlerMethodInvoker.java:175)
at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.invokeHandlerMethod(AnnotationMethodHandlerAdapter.java:421)
at org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter.handle(AnnotationMethodHandlerAdapter.java:409)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:774)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:719)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:644)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:549)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
at org.apache.catalina.core.ApplicationFilterChain.servletService(ApplicationFilterChain.java:427)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:315)
at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:287)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:218)
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:94)
at com.sun.enterprise.web.PESessionLockingStandardPipeline.invoke(PESessionLockingStandardPipeline.java:98)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:222)
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1093)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:166)
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:648)
at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:593)
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:587)
at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:1093)
at org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:291)
at com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.invokeAdapter(DefaultProcessorTask.java:670)
at com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.doProcess(DefaultProcessorTask.java:601)
at com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.process(DefaultProcessorTask.java:875)
at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.executeProcessorTask(DefaultReadTask.java:365)
at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:285)
at com.sun.enterprise.web.connector.grizzly.DefaultReadTask.doTask(DefaultReadTask.java:221)
at com.sun.enterpris

Solr - custom(dynamic) expire header to the solr Response

2012-05-10 Thread solrk
Is there any way to set the "Expires" header dynamically on the Solr
response?

Thanks.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-custom-dynamic-expire-header-to-the-solr-Response-tp3978170.html
Sent from the Solr - User mailing list archive at Nabble.com.
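
One possible approach, sketched under the assumption that you can put your own servlet filter or proxy in front of Solr (Solr itself only offers static cache-header configuration in solrconfig.xml): compute the header value per request. The RFC 1123 date string an "Expires" header needs can be produced with the JDK alone; the TTL policy here is hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class ExpiresHeader {
    // Builds an RFC 1123 "Expires" value for a response that should be
    // considered fresh for ttlMillis from nowMillis.
    public static String expiresValue(long nowMillis, long ttlMillis) {
        SimpleDateFormat fmt =
            new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT")); // HTTP dates are GMT
        return fmt.format(new Date(nowMillis + ttlMillis));
    }

    public static void main(String[] args) {
        // Epoch start plus zero TTL formats as the epoch date.
        System.out.println(expiresValue(0L, 0L));
    }
}
```

In a filter wrapped around /solr/select, the value would then be attached with response.setHeader("Expires", ...); the filter wiring itself depends on your container and is left out here.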


Solr Import Handler Custom Transformer not working

2012-05-10 Thread dboychuck
I have created a custom transformer for dynamic fields, but it doesn't seem to
be working correctly, and I'm not sure how to debug it against a live, running
Solr instance. 

Here is my transformer

package org.build.com.solr;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

import java.util.List;
import java.util.Map;

public class FacetsTransformer extends Transformer {

    public Object transformRow(Map<String, Object> row, Context context) {
        Object tf = row.get("facets");
        if (tf != null) {
            if (tf instanceof List) {
                List list = (List) tf;
                for (Object o : list) {
                    String[] arr = ((String) o).split("=");
                    if (arr.length == 3)
                        row.put(arr[0].replaceAll("[^A-Za-z0-9]", "") + "_" + arr[1], arr[2]);
                }
            } else {
                String[] arr = ((String) tf).split("=");
                if (arr.length == 3)
                    row.put(arr[0].replaceAll("[^A-Za-z0-9]", "") + "_" + arr[1], arr[2]);
            }
            row.remove("facets");
        }
        return row;
    }
}

Seems pretty standard. Here is the data contained in the facet row:
ADA=boolean=Yes|Category=string=Kitchen Faucets|Category=string=Pullout
Kitchen Faucets|Country of Origin=string=USA|Eco
Friendly=boolean=No|Escutcheon Included=boolean=Yes|Faucet
Centers=numeric=0|Faucet Holes=numeric=1|Filtering=boolean=No|Flow Rate
(GPM)=numeric=2.20|Handle Style=string=Metal Lever|Handles
Included=boolean=Yes|Height=numeric=16.63|Hose
Length=numeric=33|Installation Type=string=Deck Mounted|Low Lead
Compliant=Boolean=Yes|MasterFinish=String=Bronze
Tones|MasterFinish=String=Chromes|MasterFinish=String=Nickel
Tones|Material=string=Metal|Max Deck Thickness=numeric=1.38|Number Of
Handles=numeric=1|Pre Rinse=boolean=No|Pullout
Spray=boolean=Yes|Sidespray=boolean=No|Soap Dispenser
Included=boolean=No|Spout Height=numeric=10.00|Spout
Reach=numeric=9.50|Spout Swivel=numeric=360|Spout Type=string=Swivel|Sub
Category=string=Pullout Spray Faucets
|Theme=string=Traditional / Classic|Valve Type=string=Ceramic
Disc|Width=numeric=10.5|


here is my import config file




But for some reason none of the facets are showing up in my index. The
transformer is being used, because there are no class-not-found errors. Can
anybody lend a hand as to what might be going wrong here?
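
One way to narrow this down without a live Solr instance is to exercise the same per-entry logic in a plain main(). Note that in the sample data the facet entries are joined by '|' into one long string; if DIH hands the transformer that whole string rather than a list of entries, split("=") yields far more than three parts and every entry is silently skipped. The harness below is a hypothetical reconstruction for testing, not the poster's code:

```java
public class FacetsParseCheck {
    // Mirrors the transformer's per-entry logic: name=type=value
    // becomes {sanitizedName_type, value}, or null when skipped.
    static String[] parse(String entry) {
        String[] arr = entry.split("=");
        if (arr.length == 3) {
            return new String[] {
                arr[0].replaceAll("[^A-Za-z0-9]", "") + "_" + arr[1], arr[2] };
        }
        return null; // skipped, exactly as in transformRow
    }

    public static void main(String[] args) {
        String raw = "ADA=boolean=Yes|Category=string=Kitchen Faucets";
        // Whole pipe-joined string at once: split("=") gives 5 parts,
        // so the entry is dropped - a likely cause of the empty fields.
        System.out.println(parse(raw) == null);
        // Split on '|' first and each entry parses cleanly.
        for (String entry : raw.split("\\|")) {
            String[] kv = parse(entry);
            System.out.println(kv[0] + " -> " + kv[1]);
        }
    }
}
```

If the list branch is never taken in production, splitting the incoming string on "\\|" before the "=" split would be the first thing to try.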


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Import-Handler-Custom-Transformer-not-working-tp3978746.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Import Handler Custom Transformer not working

2012-05-10 Thread dboychuck
Also here is my schema

 
   
   

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Import-Handler-Custom-Transformer-not-working-tp3978746p3978748.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: SOLR Security

2012-05-10 Thread Klostermeyer, Michael
Instead of hitting the Solr server directly from the client, I think I would go 
through your application server, which has access to all the users' data 
and can forward requests to the Solr server, thereby hiding it from the client.

Mike


-Original Message-
From: Anupam Bhattacharya [mailto:anupam...@gmail.com] 
Sent: Thursday, May 10, 2012 9:53 PM
To: solr-user@lucene.apache.org
Subject: SOLR Security

I am using the Ajax-Solr framework for creating a search interface. The search 
interface works well.
In my case, the results have document-level security, so indexing records 
together with their authorized users helps me filter results per user based on 
the authentication of the user.

The problem is that I always have to pass a parameter userid={xyz} to the SOLR 
server, which anyone can discover from the SOLR URL (the Ajax call URL) using 
the Firebug Net console in Firefox, and can then change this parameter value 
to see other users' records which he/she is not authorized to see.
Basically it is a cross-site scripting issue.

I have read about some approaches to Solr security, like Nginx with Jetty and 
.htaccess-based security. Overall, what I understand from this is that we can 
restrict users from doing update/delete operations on SOLR, and we can also 
restrict the SOLR admin interface to certain IPs. But how can I restrict 
the {solr-server}/solr/select based results from access by different user ids?
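
Following the proxying advice above, a minimal sketch: the application server derives the user id from its own trusted session, never from the request parameters, and appends it as a filter query before forwarding to Solr. The field name "userid" and the URLs are hypothetical placeholders:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SolrProxyQuery {
    // Builds the Solr URL entirely server-side; the browser never sees
    // or controls the userid filter.
    public static String buildUrl(String solrBase, String userQuery,
                                  String trustedUserId) {
        try {
            String q  = URLEncoder.encode(userQuery, "UTF-8");
            // fq restricts results to docs this user may see.
            String fq = URLEncoder.encode("userid:" + trustedUserId, "UTF-8");
            return solrBase + "/select?q=" + q + "&fq=" + fq + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // trustedUserId would come from the server session, not the request.
        System.out.println(buildUrl("http://localhost:8983/solr", "faucet", "u42"));
    }
}
```

Ajax-Solr would then point at this proxy endpoint instead of Solr itself, so tampering with the browser-visible URL cannot widen the result set.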


Fwd: Delete documents

2012-05-10 Thread Tolga

Anyone at all?

 Original Message 
Subject:Delete documents
Date:   Thu, 10 May 2012 22:59:49 +0300
From:   Tolga 
To: solr-user@lucene.apache.org



Hi,
I've been reading
http://lucene.apache.org/solr/api/doc-files/tutorial.html and in the
section "Deleting Data", I've edited schema.xml to include a field named
id, issued the command for f in *; do java -Ddata=args -Dcommit=yes -jar
post.jar "$f"; done, and went on to the stats page
only to find no files were de-indexed. How can I do that?

Regards,



Re: Fwd: Delete documents

2012-05-10 Thread Otis Gospodnetic
Hi,

You've restarted Solr after editing the schema?
And checked the logs?  Paste?

Otis

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



- Original Message -
> From: Tolga 
> To: solr-user@lucene.apache.org
> Cc: 
> Sent: Friday, May 11, 2012 12:31 AM
> Subject: Fwd: Delete documents
> 
> Anyone at all?
> 
>  Original Message 
> Subject:     Delete documents
> Date:     Thu, 10 May 2012 22:59:49 +0300
> From:     Tolga 
> To:     solr-user@lucene.apache.org
> 
> 
> 
> Hi,
> I've been reading
> http://lucene.apache.org/solr/api/doc-files/tutorial.html and in the
> section "Deleting Data", I've edited schema.xml to include a field 
> named
> id, issued the command for f in *;java -Ddata=args -Dcommit=yes -jar
> post.jar "$f";done, 
> went on to the stats page
> only to find no files were de-indexed. How can I do that?
> 
> Regards,
>


Re: Join Query syntax

2012-05-10 Thread Otis Gospodnetic
Hi Sohail,

http://search-lucene.com/?q=Join&fc_project=Solr 


Hit #1.

Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 



- Original Message -
> From: Sohail Aboobaker 
> To: solr-user@lucene.apache.org
> Cc: 
> Sent: Thursday, May 10, 2012 10:13 PM
> Subject: Join Query syntax
> 
> Hi,
> 
> We have two indexes. One is for item master and other one is item detail.
> Our search results page is supposed to show all the item masters in a
> certain criteria but also include a column minimum price. This minimum
> price is the minimum price in item detail index.
> 
> Is there a way to do this in one query or do we have to do a query within
> loop of search results?
> 
> Thank you for your help.
> 
> Sohail
>
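
One hedged alternative to a per-result query loop, assuming the relevant Solr version lacks grouping or the join query parser: issue a single query over the item-detail index (or one batched by master ids) and reduce it client-side. The sketch below assumes each detail row carries a (masterId, price) pair; the names are placeholders.

```java
import java.util.HashMap;
import java.util.Map;

public class MinPriceMerge {
    // One pass over item-detail rows keeps the minimum price per
    // master, replacing a query inside the search-results loop.
    public static Map<String, Double> minPrices(String[][] details) {
        Map<String, Double> min = new HashMap<String, Double>();
        for (String[] d : details) {
            String masterId = d[0];
            double price = Double.parseDouble(d[1]);
            Double cur = min.get(masterId);
            if (cur == null || price < cur) {
                min.put(masterId, price);
            }
        }
        return min;
    }

    public static void main(String[] args) {
        String[][] details = {
            {"item1", "19.99"}, {"item1", "14.50"}, {"item2", "7.25"}
        };
        // Prints the minimum price found for each master id.
        System.out.println(minPrices(details));
    }
}
```

The merge then only needs a lookup per master row when rendering the results page, so the total is two Solr queries instead of N+1.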


Re: SOLR Security

2012-05-10 Thread Anupam Bhattacharya
Yes, I agree with you.

But the Ajax-Solr framework doesn't fit that model. Any alternative
solution ?

Anupam

On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
mklosterme...@riskexchange.com> wrote:

> Instead of hitting the Solr server directly from the client, I think I
> would go through your application server, which would have access to all
> the users data and can forward that to the Solr server, thereby hiding it
> from the client.
>
> Mike
>
>
> -Original Message-
> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
> Sent: Thursday, May 10, 2012 9:53 PM
> To: solr-user@lucene.apache.org
> Subject: SOLR Security
>
> I am using Ajax-Solr Framework for creating a search interface. The search
> interface works well.
> In my case, the results have document level security so by even indexing
> records with there authorized users help me to filter results per user
> based on the authentication of the user.
>
> The problem that I have to a pass always a parameter to the SOLR Server
> with userid={xyz} which one can figure out from the SOLR URL(ajax call url)
> using Firebug tool in the Net Console on Firefox and can change this
> parameter value to see others records which he/she is not authorized.
> Basically it is Cross Site Scripting Issue.
>
> I have read about some approaches for Solr Security like Nginx with Jetty
> & .htaccess based security.Overall what i understand from this is that we
> can restrict users to do update/delete operations on SOLR as well as we can
> restrict the SOLR admin interface to certain IPs also. But How can I
> restrict the {solr-server}/solr/select based results from access by
> different user id's ?
>


Re: Fwd: Delete documents

2012-05-10 Thread Jack Krupansky
Try using the actual id of the document rather than the shell substitution 
variable - if you're trying to delete one document.


To delete all documents, use delete by query:

<delete><query>*:*</query></delete>

See:
http://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F

-- Jack Krupansky

-Original Message- 
From: Tolga

Sent: Friday, May 11, 2012 12:31 AM
To: solr-user@lucene.apache.org
Subject: Fwd: Delete documents

Anyone at all?

 Original Message 
Subject: Delete documents
Date: Thu, 10 May 2012 22:59:49 +0300
From: Tolga 
To: solr-user@lucene.apache.org



Hi,
I've been reading
http://lucene.apache.org/solr/api/doc-files/tutorial.html and in the
section "Deleting Data", I've edited schema.xml to include a field named
id, issued the command for f in *;java -Ddata=args -Dcommit=yes -jar
post.jar "$f";done, went on to the stats page
only to find no files were de-indexed. How can I do that?

Regards,
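
As a footnote to Jack's reply: the update XML bodies involved are tiny and can be generated directly. A sketch (the id is a placeholder, and real code should XML-escape the values):

```java
public class DeleteXml {
    // Builds the XML body for deleting one document by unique key.
    public static String byId(String id) {
        return "<delete><id>" + id + "</id></delete>";
    }

    // Builds delete-by-query; the query "*:*" removes every document.
    public static String byQuery(String query) {
        return "<delete><query>" + query + "</query></delete>";
    }

    public static void main(String[] args) {
        System.out.println(byId("SOLR1000"));
        System.out.println(byQuery("*:*"));
    }
}
```

Either body can be sent with java -Ddata=args -jar post.jar "..." followed by a commit, which matches the tutorial workflow being attempted above.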



Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-10 Thread Shawn Heisey

On 5/10/2012 4:17 PM, Ravi Solr wrote:

Thanks for responding Mr. Heisey... I don't see any parsing errors in
my log, but I see a lot of exceptions like the one listed below. Once
an exception like this happens, weirdness ensues. For example, to
check sanity I queried for uniquekey:"111" from the Solr admin GUI; it
gave back numFound equal to all docs in that index, i.e. it's not
searching for that uniquekey at all, it blindly matched all docs.
However, once you restart the server, the same index without any change
works perfectly, returning only one doc in numFound when you search for
uniquekey:"111"... I tried everything from reindexing, copying the index
from another sane server, deleting the entire index and reindexing from
scratch, etc., but in vain; it works for roughly 24 hours and then starts
throwing the same error no matter what the query is.


[#|2012-05-10T13:27:14.071-0400|SEVERE|sun-appserver2.1.1|xxx.xxx.xxx.xxx|_ThreadID=21;_ThreadName=httpSSLWorkerThread-9001-6;_RequestID=d44462e7-576b-4391-a499-c65da33e3293;|Error
searching data for section Local
org.apache.solr.client.solrj.SolrServerException: Error executing query
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:311)
at xxx.xxx.xxx.xxx(FeedController.java:621)
at xxx.xxx.xxx.xxx(FeedController.java:402)


This is still saying solrj.  Unless I am completely misunderstanding the 
way things work, which I will freely admit is possible, this is the 
client code.  Do you have anything in the log files from Solr (the 
server)?  I don't have a lot of experience with Tomcat, because I run my 
Solr under jetty as included in the example.  It looks like the client 
is running under Tomcat, though I suppose you might be running Solr 
under a different container.


Thanks,
Shawn
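
One diagnostic note, offered as a hypothesis rather than a confirmed cause: in javabin errors like "Invalid version (expected 2, but 60)", 60 is the ASCII code of '<'. The client expected a javabin stream (whose first byte is the version, 2) but received markup, typically an XML response or a container-generated HTML error page:

```java
public class JavabinVersionHint {
    // The javabin format starts with a version byte (2). Seeing 60
    // means the first byte was '<' - markup, not javabin.
    public static char asChar(int firstByte) {
        return (char) firstByte;
    }

    public static void main(String[] args) {
        System.out.println((int) '<');  // prints 60
        System.out.println(asChar(60)); // prints <
    }
}
```

If that reading is right, capturing the raw HTTP response body when the error recurs (e.g. with the same query issued via curl) should reveal which error page the container is serving.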



Editing long Solr URLs - Chrome Extension

2012-05-10 Thread Amit Nithian
Hey all,

I don't know about you, but most of the Solr URLs I issue are fairly
lengthy, full of parameters on the query string, and browser location
bars aren't long enough and lack multi-line editing. I tried to find
something that addresses this but couldn't, so I wrote a Chrome
extension to help.

Please check out my blog post on the subject and please let me know if
something doesn't work or needs improvement. Of course this can work
for any URL with a query string but my motivation was to help edit my
long Solr URLs.

http://hokiesuns.blogspot.com/2012/05/manipulating-urls-with-long-query.html

Thanks!
Amit
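
For the curious, the core of such a tool is just query-string decomposition, which takes only a few lines; this sketch is unrelated to the extension's actual source:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStringSplitter {
    // Splits "k1=v1&k2=v2" into an ordered map, decoding percent escapes.
    public static Map<String, String> split(String queryString) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        try {
            for (String pair : queryString.split("&")) {
                int eq = pair.indexOf('=');
                String k = eq < 0 ? pair : pair.substring(0, eq);
                String v = eq < 0 ? ""   : pair.substring(eq + 1);
                params.put(URLDecoder.decode(k, "UTF-8"),
                           URLDecoder.decode(v, "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
        return params;
    }

    public static void main(String[] args) {
        // One parameter per line makes a long Solr URL readable.
        for (Map.Entry<String, String> e :
                split("q=*%3A*&rows=10&fl=id%2Cname").entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Rejoining the edited pairs with URLEncoder reverses the process, which is essentially what an edit-and-reissue workflow needs.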