Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-26 Thread Michael Kuhlmann
Hi,

Am 25.10.2011 23:53, schrieb Shawn Heisey:
> On 10/20/2011 11:00 AM, Shawn Heisey wrote:
>> [...] I've noticed a performance discrepancy when
>> processing every one of my delete records, currently about 25000 of
>> them.

I didn't understand what a delete record is. Do you delete records in
Solr? This shouldn't be done using records (what is a record in this
case? A document?); use a query for that.

Or do you add documents that you call delete records?

> I've managed to make this somewhat better by using multiple threads to
> do all the deletes on the six large static indexes at once, but that
> shouldn't be required.  The Perl version doesn't do them at the same time.

Are you sure? I don't know about the Perl client, but maybe it's doing
the network operation in the background?

In a single-threaded environment, the client has to wait when sending each
request until it has been completely sent to the server, doing nothing.
Multiple threads can help you a lot here.

You can check this by monitoring your client's CPU load.

> 10:27 < cedrichurst> the only difference i could see is deserializing
> the java binary object

This is true, but only in theory. Serializing and deserializing is so
fast that it shouldn't have a noticeable impact.

If you really want to be sure, use a SolrInputDocument instead of
annotated classes when sending documents, but as I said, this shouldn't
matter much.

What's more important: don't send single documents; rather, use
add(Collection) with multiple documents at once. At least, if I
understood you correctly, you want to send 25000 documents for update...
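For illustration, a rough SolrJ sketch of what I mean (only a sketch, assuming
SolrJ 3.x; the URL, field names and batch size are placeholders, not your setup):

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchAddExample
{
    public static void main(String[] args) throws Exception
    {
        // Placeholder URL -- point this at your own core.
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr");

        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 25000; i++)
        {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", i);                 // illustrative field names
            doc.addField("name", "document " + i);
            batch.add(doc);

            // Send a chunk at a time instead of one HTTP request per document.
            if (batch.size() == 1000)
            {
                server.add(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty())
        {
            server.add(batch);
        }
        server.commit();   // one commit at the end, not per document
    }
}

Batching like this saves one HTTP round trip per document, which usually matters
far more than any (de)serialization cost.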


-Kuli


Re: some basic information on Solr

2011-10-26 Thread stockii
i think with "incident" he means failures / downtimes / problems with solr!?

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 8 Cores, 
1 Core with 45 Million Documents other Cores < 200.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/some-basic-information-on-Solr-tp3448957p3453837.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: java.net.SocketException: Too many open files

2011-10-26 Thread Michael Kuhlmann
Hi,

we have a similar problem here. We already raised the file ulimit on the
server to 4096, but this only deferred the problem. We get a
TooManyOpenFilesException every few months.

The problem has nothing to do with real files. When we had the last
TooManyOpenFilesException, we investigated with netstat -a and saw that
there were about 3900 open sockets in Jetty.

Curiously, we only have one SolrServer instance per Solr client, and we
only have three clients (our running web servers).

We have set defaultMaxConnectionsPerHost to 20 and maxTotalConnections
to 100. There should be room enough.
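For reference, this is roughly how we create that single shared instance (only a
sketch, assuming SolrJ 3.x with Commons HttpClient 3.x; the URL is a placeholder
and the limits are the ones mentioned above):

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class PooledSolrServerFactory
{
    /**
     * Build one SolrServer backed by a bounded connection pool.
     * The returned instance should be shared, not created per request.
     */
    public static SolrServer create(String url) throws Exception
    {
        MultiThreadedHttpConnectionManager manager =
            new MultiThreadedHttpConnectionManager();
        manager.getParams().setDefaultMaxConnectionsPerHost(20);
        manager.getParams().setMaxTotalConnections(100);

        HttpClient httpClient = new HttpClient(manager);
        return new CommonsHttpSolrServer(url, httpClient);
    }
}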

Sorry that I can't help you; we still have not solved the problem on
our own.

Greetings,
Kuli

Am 25.10.2011 22:03, schrieb Jonty Rhods:
> Hi,
> 
> I am using solrj and for connection to server I am using instance of the
> solr server:
> 
> SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core0");
> 
> I noticed that after a few minutes it starts throwing the exception
> java.net.SocketException: Too many open files.
> It seems that it is related to instances of the HttpClient. How can I limit the
> instances to a certain number, like a connection pool in dbcp etc.?
> 
> I am not experienced in Java, so please help me resolve this problem.
> 
>  solr version: 3.4
> 
> regards
> Jonty
> 



Re: RE: Dismax handler - whitespace and special character behaviour

2011-10-26 Thread Khorail
In fact I tried without WordDelimiterFilterFactory (using a
PatternTokenizerFactory to tokenize on special chars) and I still have the
same problem. Apparently the dismax handler thinks that 'france-histoire' is a
single word even if I tokenize on '-'.


Le , Demian Katz  a écrit :
I just sent an email to the list about DisMax interacting with  
WordDelimiterFilterFactory, and I think our problems are at least  
partially related -- I think the reason you are seeing an OR where you  
expect an AND is that you have autoGeneratePhraseQueries set to false,  
which changes the way DisMax handles the output of the  
WordDelimiterFilterFactory (among others). Unfortunately, I don't have a  
solution for you... but you might want to keep an eye on my thread in  
case replies there shed any additional light.





- Demian

> -----Original Message-----
> From: Rohk [mailto:khor...@gmail.com]
> Sent: Tuesday, October 25, 2011 10:33 AM
> To: solr-user@lucene.apache.org
> Subject: Dismax handler - whitespace and special character behaviour
>
> Hello,
>
> I've got strange results when I have special characters in my query.
>
> Here is my request :
>
> q=histoire-france&start=0&rows=10&sort=score+desc&defType=dismax&qf=any^1.0&mm=100%
>
> Parsed query :
>
> +((any:histoir any:franc)) ()
>
> I've got 17000 results because Solr is doing an OR (should be AND).
>
> I have no problem when I'm using a whitespace instead of a special char :
>
> q=histoire france&start=0&rows=10&sort=score+desc&defType=dismax&qf=any^1.0&mm=100%
>
> +(((any:histoir) (any:franc))~2) ()
>
> 2000 results for this query.
>
> Here is my schema.xml (relevant parts) :
>
> positionIncrementGap="100" autoGeneratePhraseQueries="false">
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
> preserveOriginal="1"/>
> words="stopwords_french.txt" ignoreCase="true"/>
> words="stopwords_french.txt" enablePositionIncrements="true"/>
> language="French" protected="protwords.txt"/>
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>-->
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
> preserveOriginal="0"/>
> words="stopwords_french.txt" ignoreCase="true"/>
> words="stopwords_french.txt" enablePositionIncrements="true"/>
> language="French" protected="protwords.txt"/>
>
> I tried with a PatternTokenizerFactory to tokenize on whitespaces &
> special chars but no change...
> Even with a charFilter (PatternReplaceCharFilterFactory) to replace
> special characters by whitespace, it doesn't work...
>
> First line of analysis via solr admin, with verbose output, for query =
> 'histoire-france' :
>
> org.apache.solr.analysis.PatternReplaceCharFilterFactory {replacement=
> , pattern=([,;./\\'&-]), luceneMatchVersion=LUCENE_32}
> text histoire france
>
> The '-' is replaced by ' ', then tokenized by WhitespaceTokenizerFactory.
> However I still have different number of results for 'histoire-france'
> and 'histoire france'.
>
> My current workaround is to replace all special chars by whitespaces
> before sending query to Solr, but it is not satisfying.
>
> Did i miss something ?




Saravanan Chinnadurai/Actionimages is out of the office.

2011-10-26 Thread Saravanan . Chinnadurai
I will be out of the office starting  26/10/2011 and will not return until
28/10/2011.

Please email to itsta...@actionimages.com  for any urgent issues.


Action Images is a division of Reuters Limited and your data will therefore be 
protected
in accordance with the Reuters Group Privacy / Data Protection notice which is 
available
in the privacy footer at www.reuters.com
Registered in England No. 145516   VAT REG: 397000555


solr.PatternReplaceFilterFactory AND endoffset

2011-10-26 Thread roySolr
Hi,

I have some problems with the PatternReplaceFilter. I can't use the
WordDelimiter filter because i only want to replace some special chars chosen by
myself.

Some example:

Tottemham-hotspur (london)
Arsenal (london)

I want this:
replace "-" with " "
"(" OR ")" with "". 

In the analysis page i see this:

position1
term text   tottemham hotspur london
startOffset 0
endOffset   26

So the replacefilter works. Now i want to search "tottemham hotspur london".
This gives no results.

position1
term text   tottemham hotspur london
startOffset 0
endOffset   24

It works when i search for "tottemham-hotspur (london)".
I think the problem is the difference in offset(24 vs 26). I need some
help...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-PatternReplaceFilterFactory-AND-endoffset-tp3454049p3454049.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [ANNOUNCEMENT] PHP Solr Extension 1.0.1 Stable Has Been Released

2011-10-26 Thread alex
Hello roySolr,


roySolr wrote:
> 
> Are you working on some changes to support earlier versions of PHP? What
> is the status?
> 

I have supplied a patch, so that it can be compiled with PHP 5.2:
https://bugs.php.net/bug.php?id=59808

I contacted Israel a while ago to integrate this into the package, but he
hasn't answered yet.

Cheers,
 Alex


--
View this message in context: 
http://lucene.472066.n3.nabble.com/ANNOUNCEMENT-PHP-Solr-Extension-1-0-1-Stable-Has-Been-Released-tp3024040p3450788.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Incorrect Search Results showing up

2011-10-26 Thread Grant Ingersoll
If you add debugQuery=true to your request, what does it show for that last 
result?

On Oct 25, 2011, at 5:31 PM, aronitin wrote:

> Hi Group,
> 
> I've the defined a type "text" in the SOLR schema as shown below. 
> 
>  autoGeneratePhraseQueries="true">
>  
>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
>
> protected="protwords.txt"/>
>
>  
>  
>
>ignoreCase="true"
>words="stopwords.txt"
>enablePositionIncrements="true"
>/>
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>
> protected="protwords.txt"/>
>
>  
> 
> 
> A multi valued field is defined to use the type defined above
>  multiValued="true"/>
> 
> I index some content such as 
> - Google REST API
> - Facebook REST API
> - Software Architecture
> - Design Documents
> - Xml Web Services
> - Web API design
> 
> When I issue a search query like content:"rest api"~4, the matches that I
> get are
> - Google REST API (which is fine)
> - Facebook REST API (which is fine)
> - *Web API design* (which is not fine, because the query was a phrase query
> and rest and api should be within 4 words of each other)
> 
> Does any body see the 3rd search result as a correct search result to be
> returned? If yes, then what is explanation for that result based on the
> schema defined.
> 
> According to me 3rd result should not be returned as part of the search
> result. If somebody can point out anything wrong in my schema it will be
> great help to me.
> 
> Thanks
> Nitin
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Incorrect-Search-Results-showing-up-tp3452810p3452810.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Grant Ingersoll
http://www.lucidimagination.com







Re: questions about autocommit & committing documents

2011-10-26 Thread Erick Erickson
Not sure what you mean by a callback, can you clarify? You don't get
anything except the return from the add call as far as I know...

Best
Erick

On Tue, Oct 25, 2011 at 4:15 AM, darul  wrote:
> I was not sure thank you.
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/questions-about-autocommit-committing-documents-tp1582487p3450794.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


MultiValued fields and Facets...

2011-10-26 Thread Tiernan OToole
Good morning all.

I am currently indexing about 11 million records, and would like to add
faceting to the results page. I have tweaked the query string to include
faceting, but i am not getting anything back.

an Example Query string (slightly modified) is as follows:

http://localhost:8080/solr/select?indent=on&version=2.2&q=*:*&fq=&start=0&rows=10&fl=*%2Cscore&qt=&wt=json&explainOther=&hl.fl=&facet=on&facet.field=Category&facet.field=Warehouse

the Category and Warehouse fields are multivalue fields...

the results i get are as follows:

  "facet_counts":{
"facet_queries":{},
"facet_fields":{
  "Category":[],
  "Warehouse":[]},
"facet_dates":{},
"facet_ranges":{}}}

the data i am sending in has multiple values for Category and Warehouse. I
did read that this was not available in Solr 1.3 or 1.4... I am currently
running Solr 3.4, and its not working... Would it work if i went to solr 4,
or am i doing something wrong here?

Thanks in advance.

-- 
Tiernan O'Toole
blog.lotas-smartman.net
www.geekphotographer.com
www.tiernanotoole.ie


Re: A sort-by-geodist question

2011-10-26 Thread Erick Erickson
Hmmm, I'm not sure this is supported. Why can't you just use the "location" type
provided in the example schema?

Best
Erick

On Mon, Oct 24, 2011 at 9:39 PM, ☼ 林永忠 ☼ (Yung-chung Lin)
 wrote:
> Hi,
>
> I've started to use Solr to build up a search service, but I have
> encountered a problem here.
>
> However, when I use this URL, it always returns "*sort param could not be
> parsed as a query, and is not a field that exists in the index: geodist()"*
> *
> *
> http://localhost:8080/solr/select/?indent=true&fl=name,coordinates&q=*:*&sfield=coordinates&pt=45.15,-93.85&sort=geodist()%20asc
>
> It works only when I specify coordinates in geodist().
> http://localhost:8080/solr/select/?indent=true&fl=name,coordinates&q=*:*&sfield=coordinates&pt=45.15,-93.85&sort=geodist(45.15,-93.85)%20asc
>
> And the returned documents don't seem to be ranked by distance according to
> the criteria.
>
> My lucene is 3.4. The field 'coordinates' is in geohash format.
>
> Can anyone here give me some pointers?
>
> Thank you very much.
>
> Yung-chung Lin
>


Too many values for UnInvertedField faceting on field autocompleteField

2011-10-26 Thread Torsten Krah
I am getting this SolrException "Too many values for UnInvertedField
faceting on field autocompleteField".
I already added facet.method=enum to my search handler definition, but
this exception still happens.

Is there any known fix or workaround I can apply to get a result?

regards

Torsten


smime.p7s
Description: S/MIME cryptographic signature


Re: A sort-by-geodist question

2011-10-26 Thread Yung-chung Lin
Hi,

Thanks for the reply. I switched to the location type and it's working now.
I'm not sure if it's a problem with geohash or because I don't know the
configuration well enough, but it works now.

Thanks for the reply.

Yung-chung Lin

2011/10/26 Erick Erickson 

> Hmmm, I'm not sure this is supported. Why can't you just use the "location"
> type
> provided in the example schema?
>
> Best
> Erick
>
> On Mon, Oct 24, 2011 at 9:39 PM, ☼ 林永忠 ☼ (Yung-chung Lin)
>  wrote:
> > Hi,
> >
> > I've started to use Solr to build up a search service, but I have
> > encountered a problem here.
> >
> > However, when I use this URL, it always returns "*sort param could not be
> > parsed as a query, and is not a field that exists in the index:
> geodist()"*
> > *
> > *
> >
> http://localhost:8080/solr/select/?indent=true&fl=name,coordinates&q=*:*&sfield=coordinates&pt=45.15,-93.85&sort=geodist()%20asc
> <
> http://antlion.skimbl.com:8080/skimbl-solr/select/?indent=true&fl=name,coordinates&q=*:*&sfield=coordinates&pt=45.15,-93.85&sort=geodist(45.15,-93.85)%20asc
> >
> >
> > It works only when I specify coordinates in geodist().
> >
> http://localhost:8080/solr/select/?indent=true&fl=name,coordinates&q=*:*&sfield=coordinates&pt=45.15,-93.85&sort=geodist(45.15,-93.85)%20asc
> <
> http://antlion.skimbl.com:8080/skimbl-solr/select/?indent=true&fl=name,coordinates&q=*:*&sfield=coordinates&pt=45.15,-93.85&sort=geodist(45.15,-93.85)%20asc
> >
> >
> > And the returned documents don't seem to be ranked by distance according
> to
> > the criteria.
> >
> > My lucene is 3.4. The field 'coordinates' is in geohash format.
> >
> > Can anyone here give me some pointers?
> >
> > Thank you very much.
> >
> > Yung-chung Lin
> >
>


Re: MultiValued fields and Facets...

2011-10-26 Thread Erik Hatcher
That URL has several oddities to it... empty fq and qt parameters.  Try simply 
?q=*:*&facet=on&facet.field=Category&facet.field=Warehouse and see if that 
helps.
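For anyone doing this from Java, the SolrJ equivalent of that stripped-down
request would be something like the sketch below (assuming SolrJ 3.x; the URL is
a placeholder):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetCheck
{
    public static void main(String[] args) throws Exception
    {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr");

        // Bare-bones request: no empty fq/qt parameters, just faceting.
        SolrQuery query = new SolrQuery("*:*");
        query.setFacet(true);
        query.addFacetField("Category", "Warehouse");
        query.setRows(0);   // only the facet counts are of interest here

        QueryResponse response = server.query(query);
        for (FacetField field : response.getFacetFields())
        {
            System.out.println(field.getName());
            for (FacetField.Count count : field.getValues())
            {
                System.out.println("  " + count.getName() + " = " + count.getCount());
            }
        }
    }
}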

Erik

On Oct 26, 2011, at 07:08 , Tiernan OToole wrote:

> Good morning all.
> 
> I am currently indexing about 11 million records, and would like to add
> facating to the results page. I have tweaked the query string to include
> facating, but i am not getting anything back.
> 
> an Example Query string (slightly modified) is as follows:
> 
> http://localhost:8080/solr/select?indent=on&version=2.2&q=*:*&fq=&start=0&rows=10&fl=*%2Cscore&qt=&wt=json&explainOther=&hl.fl=&facet=on&facet.field=Category&facet.field=Warehouse
> 
> the Category and Warehouse fields are multivalue fields...
> 
> the results i get are as follows:
> 
>  "facet_counts":{
>"facet_queries":{},
>"facet_fields":{
>  "Category":[],
>  "Warehouse":[]},
>"facet_dates":{},
>"facet_ranges":{}}}
> 
> the data i am sending in has mutliple values for Category and Warehouse. I
> did read that this was not available in Solr 1.3 or 1.4... I am currently
> running Solr 3.4, and its not working... Would it work if i went to solr 4,
> or am i doing something wrong here?
> 
> Thanks in advance.
> 
> -- 
> Tiernan O'Toole
> blog.lotas-smartman.net
> www.geekphotographer.com
> www.tiernanotoole.ie



Re: MultiValued fields and Facets...

2011-10-26 Thread Tiernan OToole
Ok, so now i am getting something back, but still getting "odd" results...

I actually made a mistake in the first question... Category is MultiValued,
but Warehouse is not... So, when i run the query you suggested, Category
comes back with facets and counts, which is one step closer to where i want
to be, but Warehouse, which as i mentioned is a single valued field, is
not... If i just facet on Category, i get the results, but just
faceting on Warehouse still gets me nothing...

Very confused now...

--Tiernan

On Wed, Oct 26, 2011 at 12:53 PM, Erik Hatcher wrote:

> That URL has several oddities to it... empty fq and qt parameters.  Try
> simply ?q=*:*&facet=on&facet.field=Category&facet.field=Warehouse and see if
> that helps.
>
>Erik
>
> On Oct 26, 2011, at 07:08 , Tiernan OToole wrote:
>
> > Good morning all.
> >
> > I am currently indexing about 11 million records, and would like to add
> > facating to the results page. I have tweaked the query string to include
> > facating, but i am not getting anything back.
> >
> > an Example Query string (slightly modified) is as follows:
> >
> >
> http://localhost:8080/solr/select?indent=on&version=2.2&q=*:*&fq=&start=0&rows=10&fl=*%2Cscore&qt=&wt=json&explainOther=&hl.fl=&facet=on&facet.field=Category&facet.field=Warehouse
> >
> > the Category and Warehouse fields are multivalue fields...
> >
> > the results i get are as follows:
> >
> >  "facet_counts":{
> >"facet_queries":{},
> >"facet_fields":{
> >  "Category":[],
> >  "Warehouse":[]},
> >"facet_dates":{},
> >"facet_ranges":{}}}
> >
> > the data i am sending in has mutliple values for Category and Warehouse.
> I
> > did read that this was not available in Solr 1.3 or 1.4... I am currently
> > running Solr 3.4, and its not working... Would it work if i went to solr
> 4,
> > or am i doing something wrong here?
> >
> > Thanks in advance.
> >
> > --
> > Tiernan O'Toole
> > blog.lotas-smartman.net
> > www.geekphotographer.com
> > www.tiernanotoole.ie
>
>


-- 
Tiernan O'Toole
blog.lotas-smartman.net
www.geekphotographer.com
www.tiernanotoole.ie


Re: Too many values for UnInvertedField faceting on field autocompleteField

2011-10-26 Thread Yonik Seeley
On Wed, Oct 26, 2011 at 7:39 AM, Torsten Krah
 wrote:
> I am getting this SolrException "Too many values for UnInvertedField
> faceting on field autocompleteField".
> Already added facet.method=enum to my search handler definition but
> still this exception does happen.

The facet.method is not taking effect for some reason.
UnInvertedField is *only* used when facet.method=fc on a multivalued
field.
Maybe you have another facet.method=fc somewhere, or a per-field
override, or maybe you forgot to restart solr, or maybe there's some
other configuration bug.
Add facet=false&echoParams=all on a request to your handler to see all
of the request params echoed back in the response.
You can also try adding facet.method=enum directly to your request.
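If you're querying through SolrJ, a rough sketch of that diagnostic request (the
handler name and URL below are placeholders, not your actual configuration):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class EchoParamsCheck
{
    public static void main(String[] args) throws Exception
    {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr");

        SolrQuery query = new SolrQuery("*:*");
        query.setParam("qt", "/autocomplete");      // placeholder handler name
        query.setParam("facet", false);             // disable faceting for the check
        query.setParam("echoParams", "all");        // echo every effective parameter back
        // To test the workaround, the method can also be forced per request:
        // query.setParam("facet.method", "enum");

        QueryResponse response = server.query(query);
        System.out.println(response.getResponseHeader());  // echoed params show up here
    }
}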

-Yonik
http://www.lucidimagination.com

> Any known fix or workaround whan i can do to get a result?
>
> regards
>
> Torsten
>


Re: NRT and replication

2011-10-26 Thread Esteban Donato
thanks Mark and Tomas.  Tomas, you mean doing soft commits to all the
slave nodes right?  If so, that is what I'm planning to do with the
update processor commented above.

2011/10/21 Tomás Fernández Löbbe :
> I was thinking in this, would it make sense to keep the master / slave
> architecture, adding documents to the master and the slaves, do soft commits
> (only) to the slaves and hard commits to the master? That way you wouldn't
> be doing any merges on slaves. Would that make sense?
>
> On Fri, Oct 21, 2011 at 5:43 PM, Mark Miller  wrote:
>
>> Yeah - a distributed update processor like the one Yonik wrote will do fine
>> in simple situations.
>>
>> On Oct 17, 2011, at 7:33 PM, Esteban Donato wrote:
>>
>> > thanks Yonik.  Any idea of when this should be completed?  In the
>> > meantime I think I will have to add docs to every replica, possibly
>> > implementing an update processor.  Something similar to SOLR-2355?
>> >
>> > On Fri, Oct 14, 2011 at 7:31 PM, Yonik Seeley
>> >  wrote:
>> >> On Fri, Oct 14, 2011 at 5:49 PM, Esteban Donato
>> >>  wrote:
>> >>>  I found soft commits very useful for NRT search requirements.
>> >>> However I couldn't figure out how replication works with this feature.
>> >>>  I mean, if I have N replicas of an index for load balancing purposes,
>> >>> when I soft commit a doc in one of this nodes, is there any way that
>> >>> those "in-memory" docs get replicated to the rest of replicas?
>> >>
>> >> Nope.  Index replication isn't really that compatible with NRT.
>> >> But the new distributed indexing features we're working on will be!
>> >> The parent issue for this effort is SOLR-2358.
>> >>
>> >> -Yonik
>> >> http://www.lucene-eurocon.com - The Lucene/Solr User Conference
>> >>
>>
>> - Mark Miller
>> lucidimagination.com
>> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>


RE: Replication issues with multiple Slaves

2011-10-26 Thread Jaeger, Jay - DOT
Thanks for that information.  It was most useful.  

Does anyone know:  when this happens does the slave continue using its old 
index, and then try again at the next time interval?  (I sure hope so).

JRJ

-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io] 
Sent: Tuesday, October 25, 2011 3:15 PM
To: solr-user@lucene.apache.org
Subject: Re: Replication issues with multiple Slaves


> 1) Hmm, maybe, didn't notice that... but I'd be very confused why it works
> occasionally, and manual replication (through Solr Admin) always works ok
> in that case?
> 2) This was my initial thought, it was happening on one core (multiple
> commits while replication in progress), but I noticed it happening on
> another core (the one mentioned below) which only had 1 commit and a single
> generation (11 > 12) change to replicate.
> 
> 
> I too hoped and presumed that the Master is being Locked while replication
> is copying files... can anyone confirm this? We are using the native Lock
> type on a Windows/Tomcat server.

Replication does not lock the index from being written to.

> 
> Is anyone aware of any reason why the replication skips files, or fails to
> copy/find files other than because of presumably a commit or optimize
> re-chunking the segments and deleting them on the Master?

Slaves receive a list of files to download. Files further down the list may 
disappear before the slave gets a chance to download them. By keeping older commits 
we were able to work around this issue.

> 
> -Original Message-
> From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov]
> Sent: 25 October 2011 20:48
> To: solr-user@lucene.apache.org
> Subject: RE: Replication issues with multiple Slaves
> 
> I noted that in these messages the left hand side is lower case collection,
> but the right hand side is upper case Collection.  Assuming you did a
> cut/paste, could you have a core name mismatch between a master and a slave
> somehow?
> 
> Otherwise (shudder):  could you be doing a commit while the replication is
> in progress, causing files to shift about on it?  I'd have expected
> (perhaps naively) solr to have some sort of lock to prevent such a
> problem.  But if there is no internal lock, that would be a serious matter
> (and could happen to us, too, down the road).
> 
> JRJ
> 
> -Original Message-
> From: Rob Nicholls [mailto:robst...@hotmail.com]
> Sent: Tuesday, October 25, 2011 10:32 AM
> To: solr-user@lucene.apache.org
> Subject: Replication issues with multiple Slaves
> 
> 
> Hey guys,
> 
> We have a Master (1 server) and 2 Slaves (2 servers) setup and running
> replication across multiple cores.
> 
> However, the replication appears to behave sporadically and often fails
> when left to replicate automatically via poll. More often than not a
> replicate will fail after the slave has finished pulling down the segment
> files, because it cannot find a particular file, giving errors such as:
> 
> Oct 25, 2011 10:00:17 AM org.apache.solr.handler.SnapPuller copyAFile
> SEVERE: Unable to move index file from:
> D:\web\solr\collection\data\index.2011102510\_3u.tii to:
> D:\web\solr\Collection\data\index\_3u.tiiTrying to do a copy
> 
> SEVERE: Unable to copy index file from:
> D:\web\solr\collection\data\index.2011102510\_3s.fdt to:
> D:\web\solr\Collection\data\index\_3s.fdt
> java.io.FileNotFoundException:
> D:\web\solr\collection\data\index.2011102510\_3s.fdt (The system cannot
> find the file specified)
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.(Unknown Source)
> at org.apache.solr.common.util.FileUtils.copyFile(FileUtils.java:47)
> at org.apache.solr.handler.SnapPuller.copyAFile(SnapPuller.java:585)
> at
> org.apache.solr.handler.SnapPuller.copyIndexFiles(SnapPuller.java:621)
> at
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:317)
> at
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:
> 2 67)
> at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
> at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown
> Source) at java.util.concurrent.FutureTask.runAndReset(Unknown Source) at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access
> $ 101(Unknown Source)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPer
> i odic(Unknown Source)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Un
> k nown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
> Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
> 
> For these files, I checked the master, and they did indeed exist.
> 
> Both slave machines are configured the same, with the same replication
> settings and a 60 minutes p

RE: Loading data to SOLR first time ( taking too long)

2011-10-26 Thread Jaeger, Jay - DOT
No, we do not use DIH.  Based on other responses I saw, it seems likely that 
the issue is in the DIH component somehow.

JRJ

-Original Message-
From: Awasthi, Shishir [mailto:shishir.awas...@baml.com] 
Sent: Tuesday, October 25, 2011 3:24 PM
To: solr-user@lucene.apache.org; Jaeger, Jay - DOT
Subject: RE: Loading data to SOLR first time ( taking too long)

Ok that makes me feel better.

We have around 40 fields being loaded from multiple tables. Other than
not committing every row, is there any other setting that you make?

Are you also using DataImportHandler?

-Original Message-
From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov] 
Sent: Tuesday, October 25, 2011 4:03 PM
To: 'solr-user@lucene.apache.org'
Subject: RE: Loading data to SOLR first time ( taking too long)

My goodness.  We do 4 million in about 1/2 HOUR (7+ million in 40
minutes).

First question:  Are you somehow forcing Solr to do a commit for each
and every record?  If so, that way leads to the house of PAIN.

The thing to do next, I suppose, might be to try and figure out whether
the issue is in Solr proper, or in the database you are importing from.

What does your query against your database look like?
How many fields do you have per record (we have around 30, counting
copyField destinations)

Using a performance monitoring tool, try and find out the CPU
utilization, memory utilization, page write rates and physical disk
drive queue lengths to narrow down which of the two systems are having
the problem (assuming your database is not on the same machine as Solr!)

JRJ

-Original Message-
From: Awasthi, Shishir [mailto:shishir.awas...@baml.com]
Sent: Tuesday, October 25, 2011 2:57 PM
To: solr-user@lucene.apache.org
Subject: Loading data to SOLR first time ( taking too long)

Hi,

I recently started working on SOLR and loaded approximately 4 million
records to the solr using DataImportHandler. It took 5 days to complete
this process.

 

Can you please suggest how this can be improved? I would like this to be
done in less than 6 hrs.

 

Thanks,

Shishir

--
This message w/attachments (message) is intended solely for the use of
the intended recipient(s) and may contain information that is
privileged, confidential or proprietary. If you are not an intended
recipient, please notify the sender, and then please delete and destroy
all copies and attachments, and be advised that any review or
dissemination of, or the taking of any action in reliance on, the
information contained in or attached to this message is prohibited. 
Unless specifically indicated, this message is not an offer to sell or a
solicitation of any investment products or other financial product or
service, an official confirmation of any transaction, or an official
statement of Sender. Subject to applicable law, Sender may intercept,
monitor, review and retain e-communications (EC) traveling through its
networks/systems and may produce any such EC to regulators, law
enforcement, in litigation and as required by law. 
The laws of the country of each sender/recipient may impact the handling
of EC, and EC may be archived, supervised and produced in countries
other than the country in which you are located. This message cannot be
guaranteed to be secure or free of errors or viruses. 

References to "Sender" are references to any subsidiary of Bank of
America Corporation. Securities and Insurance Products: * Are Not FDIC
Insured * Are Not Bank Guaranteed * May Lose Value * Are Not a Bank
Deposit * Are Not a Condition to Any Banking Service or Activity * Are
Not Insured by Any Federal Government Agency. Attachments that are part
of this EC may have additional important disclosures and disclaimers,
which you should read. This message is subject to terms available at the
following link: 
http://www.bankofamerica.com/emaildisclaimer. By messaging with Sender
you consent to the foregoing.

--
This message w/attachments (message) is intended solely for the use of the 
intended recipient(s) and may contain information that is privileged, 
confidential or proprietary. If you are not an intended recipient, please 
notify the sender, and then please delete and destroy all copies and 
attachments, and be advised that any review or dissemination of, or the taking 
of any action in reliance on, the information contained in or attached to this 
message is prohibited. 
Unless specifically indicated, this message is not an offer to sell or a 
solicitation of any investment products or other financial product or service, 
an official confirmation of any transaction, or an official statement of 
Sender. Subject to applicable law, Sender may intercept, monitor, review and 
retain e-communications (EC) traveling through its networks/systems and may 
produce any such EC to regulators, law enforcement, in litigation and as 
required by law. 
The

missing core name in path

2011-10-26 Thread Fred Zimmerman
It is not a multi-core setup.  The solr.xml has null value for . ?
HTTP ERROR 404

Problem accessing /solr/admin/index.jsp. Reason:

missing core name in path



2011-10-26 13:40:21.182:WARN::/solr/admin/
java.lang.IllegalStateException: STREAM
at org.mortbay.jetty.Response.getWriter(Response.java:616)
at
org.apache.jasper.runtime.JspWriterImpl.initOut(JspWriterImpl.java:187)


Re: data import in 4.0

2011-10-26 Thread Adeel Qureshi
Any comments .. please

I am able to do the bulk import without the nested query, but with the nested query it
just keeps working on it and never seems to end ..

I would appreciate any help

Thanks
Adeel


On Sat, Oct 22, 2011 at 11:12 AM, Adeel Qureshi wrote:

> yup that was it .. my data import files version was not the same as solr
> war .. now I am having another problem though
>
> I tried doing a simple data import
>
> 
> 
>   
>   
>   
>
>   
>
> simple in terms of just pulling up three fields from a table and adding to
> index and this worked fine but when I add a nested or joined table ..
>
> 
> 
>   
>   
>   
>   
>   
>   
>
>   
>
> this data import doesnt seems to end .. it just keeps going .. i only have
> about 15000 records in the main table and about 22000 in the joined table ..
> but the Fetch count in dataimport handler status indicator thing shows that
> it has fetched close to half a million records or something .. i m not sure
> what those records are .. is there a way to see exactly what queries are
> being run by dataimport handler .. is there something wrong with my nested
> query ..
>
> THanks
> Adeel
>
>
> On Fri, Oct 21, 2011 at 3:05 PM, Alireza Salimi 
> wrote:
>
>> So to me it heightens the probability of classloader conflicts,
>> I haven't worked with Solr 4.0, so I don't know if set of JAR files
>> are the same with Solr 3.4. Anyway, make sure that there is only
>> ONE instance of apache-solr-dataimporthandler-***.jar in your
>> whole tomcat+webapp.
>>
>> Maybe you have this jar file in CATALINA_HOME\lib folder.
>>
>> On Fri, Oct 21, 2011 at 3:06 PM, Adeel Qureshi > >wrote:
>>
>> > its deployed on a tomcat server ..
>> >
>> > On Fri, Oct 21, 2011 at 12:49 PM, Alireza Salimi
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > How do you start Solr, through start.jar or you deploy it to a web
>> > > container?
>> > > Sometimes problems like this are because of different class loaders.
>> > > I hope my answer would help you.
>> > >
>> > > Regards
>> > >
>> > >
>> > > On Fri, Oct 21, 2011 at 12:47 PM, Adeel Qureshi <
>> adeelmahm...@gmail.com
>> > > >wrote:
>> > >
>> > > > Hi I am trying to setup the data import handler with solr 4.0 and
>> > having
>> > > > some unexpected problems. I have a multi-core setup and only one
>> core
>> > > > needed
>> > > > the dataimport handler so I have added the request handler to it and
>> > > added
>> > > > the lib imports in config file
>> > > >
>> > > > > regex="apache-solr-dataimporthandler-\d.*\.jar"
>> > />
>> > > > > > > > regex="apache-solr-dataimporthandler-extras-\d.*\.jar" />
>> > > >
>> > > > for some reason this doesnt works .. it still keeps giving me
>> > > ClassNoFound
>> > > > error message so I moved the jars files to the shared lib folder and
>> > then
>> > > > atleast I was able to see the admin screen with the dataimport
>> plugin
>> > > > loaded. But when I try to do the import its throwing this error
>> message
>> > > >
>> > > > INFO: Starting Full Import
>> > > > Oct 21, 2011 11:35:41 AM org.apache.solr.core.SolrCore execute
>> > > > INFO: [DW] webapp=/solr path=/select
>> > > params={command=status&qt=/dataimport}
>> > > > status=0 QTime=0
>> > > > Oct 21, 2011 11:35:41 AM
>> org.apache.solr.handler.dataimport.SolrWriter
>> > > > readIndexerProperties
>> > > > WARNING: Unable to read: dataimport.properties
>> > > > Oct 21, 2011 11:35:41 AM
>> > org.apache.solr.handler.dataimport.DataImporter
>> > > > doFullImport
>> > > > SEVERE: Full Import failed
>> > > > java.lang.NoSuchMethodError:
>> > org.apache.solr.update.DeleteUpdateCommand:
>> > > > method ()V not found
>> > > >at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.SolrWriter.doDeleteAll(SolrWriter.java:193)
>> > > >at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.cleanByQuery(DocBuilder.java:1012)
>> > > >at
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:183)
>> > > >at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:335)
>> > > >at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393)
>> > > >at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:374)
>> > > > Oct 21, 2011 11:35:41 AM
>> org.apache.solr.handler.dataimport.SolrWriter
>> > > > rollback
>> > > > SEVERE: Exception while solr rollback.
>> > > > java.lang.NoSuchMethodError:
>> > > org.apache.solr.update.RollbackUpdateCommand:
>> > > > method ()V not found
>> > > >at
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.SolrWriter.rollback(SolrWriter.java:184)
>> > > >at
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.DocBuilder.rollback(DocBuilder.java:249)
>> > > >at
>> > > >
>> > > >
>> > >
>> >
>> org.apache.solr.handler.dataimport.DataImpo

fixed schema problems, now running out of memory?

2011-10-26 Thread Fred Zimmerman
It's a small indexing job coming from nutch.

2011-10-26 15:07:29,039 WARN  mapred.LocalJobRunner - job_local_0011
java.io.IOException: org.apache.solr.client.solrj.SolrServerException: Error
executi$
at
org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRec$
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing
query
at
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.ja$
at
org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
at
org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRec$
... 3 more
Caused by: org.apache.solr.common.SolrException: Java heap space
 java.lang.OutOfMem$

Java heap space  java.lang.OutOfMemoryError: Java heap spaceat
org.apache.lucene$

request: localhost/solr/select?q=id:[* TO *]&fl=id,boost,tstamp,$
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHt$
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHt$
at
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.ja$
... 5 more


Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-26 Thread Shawn Heisey

On 10/26/2011 1:30 AM, Michael Kuhlmann wrote:

Hi,

Am 25.10.2011 23:53, schrieb Shawn Heisey:

On 10/20/2011 11:00 AM, Shawn Heisey wrote:

[...] I've noticed a performance discrepancy when
processing every one of my delete records, currently about 25000 of
them.

I didn't understand what a delete record is. Do you delete records in
Solr? This shouldn't be done using records (what is a record in this
case? A document?); use a query for that.

Or do you add documents that you call delete records?


A record is an entry in the idx_delete table in the database.  When 
something is deleted from the main database table, there's a trigger 
that inserts its did value (for document id), one of the unique IDs we 
have on each document, into idx_delete.  The build system uses this 
table to process deletes from Solr.  Please see the first message in 
this thread for full details.



I've managed to make this somewhat better by using multiple threads to
do all the deletes on the six large static indexes at once, but that
shouldn't be required.  The Perl version doesn't do them at the same time.

Are you sure? I don't know about the perl client, but maybe it's doing
the network operation in background?

In a single-threaded environment, the client has to wait when sending each
request until it has been completely sent to the server, doing nothing.
Multiple threads can help you a lot here.

You can check this when you monitor your client's cpu load.


The Perl programs use LWP::Simple and LWP::Simple::Post and have no 
threading or process forking of any kind.  I am not using a Perl/Solr 
API, I construct the URLs myself from saved templates and send them as a 
browser would.  I'll check and see if my superiors will let me post my 
code publicly.  If not, I may be able to redact it a bit and send it 
unicast to an interested party.



10:27<  cedrichurst>  the only difference i could see is deserializing
the java binary object

This is true, but only in theory. Serializing and deserializing is so
fast that this shouldn't impact.

If you really want to be sure, use a SolrInputDocument instead of
annotated classes when sending documents, but as I said, this shouldn't
matter much.

What's more important: Don't send single documents but rather use
add(Collection) with multiple documents at once. At least when I
understood you correctly that you want to send 25000 documents for update...


This is not for *adding* documents.  It's for making a query that looks 
like the following, with up to 1000 clauses instead of four:


did:(1 OR 2 OR 3 OR 4)

For inserting, I do use a Collection of SolrInputDocuments.  The delete 
process grabs values from idx_delete, does a query like the above (the 
part that's slow in Java), then if any documents are found, issues a 
deleteByQuery with the same string.  The Perl code uses a POST request 
for both the query and the delete, text/xml for the latter.


One possible thing I can do to make the Java code even faster is to set 
rows to zero before doing the query, since I only need numFound, not the 
actual results.  The Perl code does NOT do this, and yet it's super fast.


Any other ideas?

Thanks,
Shawn



Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-26 Thread Shawn Heisey

On 10/26/2011 10:29 AM, Shawn Heisey wrote:
One possible thing I can do to make the Java code even faster is to 
set rows to zero before doing the query, since I only need numFound, 
not the actual results.  The Perl code does NOT do this, and yet it's 
super fast.


It turns out I already thought of this and I DO set rows to zero.

private static final String SOLR_QT = "qt";
private static final String SOLR_ROWS = "rows";
private static final String NO_STATS_QUERY_TYPE = "lbcheck";

...

/**
 * Get the count of all documents matching a query.
 *
 * @param query the query
 * @return the long
 * @throws IdxException
 */
public long getCount(String query) throws IdxException
{
SolrQuery sq = new SolrQuery();
sq.setParam(SOLR_QT, NO_STATS_QUERY_TYPE);
sq.setParam(SOLR_ROWS, "0");
sq.setQuery(query);
QueryResponse qr = null;
try
{
qr = _solrCore.query(sq);
}
catch (Exception e)
{
throw new IdxException("Query '" + query + "' failed on " + _prefix
+ _name, e);
}
if (qr == null)
{
throw new IdxException("Count for '" + query + "' failed on "
+ _prefix);
}
else
{
long numFound = qr.getResults().getNumFound();
int qTime = qr.getQTime();
LOG.info(_prefix + _name + ": query QTime=" + qTime + 
",numFound=" + numFound);

return numFound;
}
}

And since someone might ask how I actually do the delete, see below.

/**
 * Delete by query.
 *
 * @throws IdxException
 *
 */
public void deleteByQuery(String query) throws IdxException
{
if (getCount(query) > 0)
{
try
{
UpdateResponse ur = _solrCore.deleteByQuery(query);
LOG.info(_prefix + _name + ": done deleting " + ur);
_needsCommit = true;
}
catch (Exception e)
{
throw new IdxException("deleteByQuery failed on " + _prefix
+ _name, e);
}
}
}




Java 7u1 fixes index corruption and crash bugs in Apache Lucene Core and Apache Solr

2011-10-26 Thread Uwe Schindler
Hi users of Apache Lucene Core and Apache Solr,

Oracle released Java 7u1 [1] on October 19. According to the release notes
and tests done by the Lucene committers, all bugs reported on July 28 are
fixed in this release, so code using Porter stemmer no longer crashes with
SIGSEGV. We were not able to experience any index corruption anymore, so it
is safe to use Java 7u1 with Lucene Core and Solr.

On the same day, Oracle released Java 6u29 [2] fixing the same problems
occurring with Java 6, if the JVM switches -XX:+AggressiveOpts or
-XX:+OptimizeStringConcat were used. Of course, you should not use
experimental JVM options like -XX:+AggressiveOpts in production
environments! We recommend everybody to upgrade to this latest version 6u29.

In case you upgrade to Java 7, remember that you may have to reindex, as the
unicode version shipped with Java 7 changed and tokenization behaves
differently (e.g. lowercasing). For more information, read
JRE_VERSION_MIGRATION.txt in your distribution package!

On behalf of the Apache Lucene/Solr committers,
Uwe Schindler

[1] http://www.oracle.com/technetwork/java/javase/7u1-relnotes-507962.html
[2] http://www.oracle.com/technetwork/java/javase/6u29-relnotes-507960.html

-
Uwe Schindler
uschind...@apache.org 
Apache Lucene PMC Member / Committer
Bremen, Germany
http://lucene.apache.org/




Re: fixed schema problems, now running out of memory?

2011-10-26 Thread Fred Zimmerman
More on what's happening. It seems to be timing out during the commit.

The new documents are small, but the existing index is large (11 million
records).

INFO: Closing Searcher@4a7df6 main
>
> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> ...
>


> Oct 26, 2011 4:51:17 PM org.apache.solr.update.processor.LogUpdateProcessor
> finish
> *INFO: {commit=} 0 2453
> **Oct 26, 2011 4:51:17 PM org.apache.solr.core.SolrCore execute
> **INFO: [] webapp=/solr path=/update
> params={waitSearcher=true&waitFlush=true&wt=javabin&commit=true&version=2}
> status=0 QTime=2453
> *Oct 26, 2011 4:51:52 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/select
> params={fl=id&wt=javabin&q=id:[*+TO+*]&rows=1&version=2} hits=11576871 
> *status=0
> QTime=35298*
> Oct 26, 2011 4:51:53 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/select
> params={fl=id&wt=javabin&q=id:[*+TO+*]&rows=1&version=2} hits=11576871
> status=0 QTime=1
> *java.lang.OutOfMemoryError: Java heap space*
> Dumping heap to /home/bitnami/apache-solr-3.4.0/example/heaplog ...
> Heap dump file created [306866344 bytes in 32.376 secs]



On Wed, Oct 26, 2011 at 11:09 AM, Fred Zimmerman wrote:

> It's a small indexing job coming from nutch.
>
> 2011-10-26 15:07:29,039 WARN  mapred.LocalJobRunner - job_local_0011
> java.io.IOException: org.apache.solr.client.solrj.SolrServerException:
> Error executi$
> at
> org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRec$
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: org.apache.solr.client.solrj.SolrServerException: Error
> executing query
> at
> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.ja$
> at
> org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
> at
> org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRec$
> ... 3 more
> Caused by: org.apache.solr.common.SolrException: Java heap space
>  java.lang.OutOfMem$
>
> Java heap space  java.lang.OutOfMemoryError: Java heap spaceat
> org.apache.lucene$
>
> request: localhost/solr/select?q=id:[* TO *]&fl=id,boost,tstamp,$
> at
> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHt$
> at
> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHt$
> at
> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.ja$
> ... 5 more
>
>


Re: fixed schema problems, now running out of memory?

2011-10-26 Thread Fred Zimmerman
http://wiki.apache.org/solr/SolrPerformanceFactors#Schema_Design_Considerations

The number of indexed fields greatly increases the following:
>
>    - Memory usage during indexing
>    - Segment merge time
>    - Optimization times
>    - Index size
>
> These impacts can be reduced by the use of omitNorms="true"


http://lucene.472066.n3.nabble.com/What-is-omitNorms-td2987547.html

1. length normalization will not work on the specific field--
> Which means matching documents with shorter length will not be
> preferred/boost over matching documents with greater length for the specific
> field, at search time.
> For my application, I actually prefer documents with greater length.
> 2. Index time boosting will not be available on the field.
> If, both the above cases are not required by you, then, you can set
> "omitNorms=true" for the specific fields.
> This has an added advantage, it will save you some(or a lot of) RAM also,
> since, with "omitNorms=false" on total "N" fields in the index will require
> RAM of size:
>  Total docs in index * 1 byte * N
> I have a lot of fields: I count 31 without omitNorms values, which means
> false by default.


Gak!  11,000,000 * 1 * 31 = roughly 340MB of RAM all by itself.

On Wed, Oct 26, 2011 at 1:01 PM, Fred Zimmerman wrote:

> More on what's happening. It seems to be timing out during the commit.
>
> The new documents are small, but the existing index is large (11 million
> records).
>
> INFO: Closing Searcher@4a7df6 main
>>
>> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>> ...
>>
>
>
>> Oct 26, 2011 4:51:17 PM
>> org.apache.solr.update.processor.LogUpdateProcessor finish
>> *INFO: {commit=} 0 2453
>> **Oct 26, 2011 4:51:17 PM org.apache.solr.core.SolrCore execute
>> **INFO: [] webapp=/solr path=/update
>> params={waitSearcher=true&waitFlush=true&wt=javabin&commit=true&version=2}
>> status=0 QTime=2453
>> *Oct 26, 2011 4:51:52 PM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/select
>> params={fl=id&wt=javabin&q=id:[*+TO+*]&rows=1&version=2} hits=11576871 
>> *status=0
>> QTime=35298*
>> Oct 26, 2011 4:51:53 PM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/select
>> params={fl=id&wt=javabin&q=id:[*+TO+*]&rows=1&version=2} hits=11576871
>> status=0 QTime=1
>> *java.lang.OutOfMemoryError: Java heap space*
>> Dumping heap to /home/bitnami/apache-solr-3.4.0/example/heaplog ...
>> Heap dump file created [306866344 bytes in 32.376 secs]
>
>
>
> On Wed, Oct 26, 2011 at 11:09 AM, Fred Zimmerman wrote:
>
>> It's a small indexing job coming from nutch.
>>
>> 2011-10-26 15:07:29,039 WARN  mapred.LocalJobRunner - job_local_0011
>> java.io.IOException: org.apache.solr.client.solrj.SolrServerException:
>> Error executi$
>> at
>> org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRec$
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>> Caused by: org.apache.solr.client.solrj.SolrServerException: Error
>> executing query
>> at
>> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.ja$
>> at
>> org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
>> at
>> org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRec$
>> ... 3 more
>> Caused by: org.apache.solr.common.SolrException: Java heap space
>>  java.lang.OutOfMem$
>>
>> Java heap space  java.lang.OutOfMemoryError: Java heap spaceat
>> org.apache.lucene$
>>
>> request: localhost/solr/select?q=id:[* TO *]&fl=id,boost,tstamp,$
>> at
>> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHt$
>> at
>> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHt$
>> at
>> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.ja$
>> ... 5 more
>>
>>
>


Upgrading the Index from 1.4.1 to 3.4 using replication

2011-10-26 Thread Nemani, Raj
All,

 

We are planning to upgrade our Solr instance from 1.4.1 to 3.4.  We
understand that we need to re-index all the documents given the changes
to the index structure.  If we set up a replication pipe with 1.4.1 as
the Master and 3.4 as the slave (with an empty index), would the
replication process convert the index from 1.4.1 format to 3.4 format?

 

Thanks so much in advance for your time and help.

Raj

 



Can dynamic fields defined by a prefix be used with LatLonType?

2011-10-26 Thread Tom Cooke
Hi,



I'm adding support for lat/lon data into an existing schema which uses
prefix-based dynamic fields e.g. "OBJECT_I_*".  I would like to add
"OBJECT_LL_*" as a dynamic field for LatLonType data but it seems that
the LatLonType always needs to add suffixes for the dynamically created
subfields which leads to a field name being generated that not only
matches the subfield suffix e.g. "*_coordinate" but also matches
"OBJECT_LL_*" leading to a clash.



Is there any way around this other than always using a suffix-based
approach to define any dynamic fields that contain LatLonType data?



Thanks,



Tom





Sign-up to our newsletter for industry best practice and thought leadership: 
http://www.gossinteractive.com/newsletter

Registered Office: c/o Bishop Fleming, Cobourg House, Mayflower Street, 
Plymouth, PL1 1LG. Company Registration No: 3553908

This email contains proprietary information, some or all of which may be 
legally privileged. It is for the intended recipient only. If an addressing or 
transmission error has misdirected this email, please notify the author by 
replying to this email. If you are not the intended recipient you may not use, 
disclose, distribute, copy, print or rely on this email.

Email transmission cannot be guaranteed to be secure or error free, as 
information may be intercepted, corrupted, lost, destroyed, arrive late or 
incomplete or contain viruses. This email and any files attached to it have 
been checked with virus detection software before transmission. You should 
nonetheless carry out your own virus check before opening any attachment. GOSS 
Interactive Ltd accepts no liability for any loss or damage that may be caused 
by software viruses.





solr break up word

2011-10-26 Thread Boris Quiroz
Hi,

I've got Solr running on a CentOS server working OK, but sometimes my application 
needs to index parts of a word. For example, if I search for the word 'dislike' it works 
fine, but if I search for 'disl' it returns zero results. Also, searching for 'disl*' returns 
some results (the same as searching for 'dislike'), but if I search for 'dislike*' it 
returns zero too. 

So, I've two questions:

1. How exactly the asterisk works as a wildcard?

2. What can I do to properly index parts of a word? I added some analyzer lines to my 
schema.xml (the snippet was stripped from the archived message), but I can't get it to 
work. Is what I did OK, or am I wrong?

Thanks.

--
Boris Quiroz
boris.qui...@menco.it



Get results ordered by field content starting with specific word

2011-10-26 Thread darul
I have seen many threads talking about it but not found any way on how to
resolve it.

In my schema 2 fields :



Results are sorted by field2 desc like in the following listing when looking
for "word1" as query pattern:



I would like to get Doc3 at the end because "word1" is not at the beginning
of the field content.

Have you any idea ? 

I have seen SpanNearQuery, tried FuzzySearch with no success etc...maybe
making a special QueryParserPlugin, but I am lost ;)

We use Solr 3.4

Thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Get-results-ordered-by-field-content-starting-with-specific-word-tp3455754p3455754.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: some basic information on Solr

2011-10-26 Thread Jaeger, Jay - DOT
It didn't look like that, but maybe.

Our experience has been very very good.  I don't think we have seen a crash in 
our prototype to date (though that prototype is also not very busy).  We have 
had as many a four cores, with as many as 35 million "documents".

-Original Message-
From: stockii [mailto:stock.jo...@googlemail.com] 
Sent: Wednesday, October 26, 2011 2:30 AM
To: solr-user@lucene.apache.org
Subject: Re: some basic information on Solr

i think with "incident" he mean, failures / downtimes / problems with solr !? 

-
--- System 

One Server, 12 GB RAM, 2 Solr Instances, 8 Cores, 
1 Core with 45 Million Documents other Cores < 200.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/some-basic-information-on-Solr-tp3448957p3453837.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Difficulties Installing Solr with Jetty 7.x

2011-10-26 Thread Jaeger, Jay - DOT
>From your logs, it looks like the Solr library is being found just fine, and 
>that the servlet is initing OK.

Does your Jetty configuration specify index.jsp in a welcome list?

We had that problem in WebSphere:  we got 404's the same way, and the cure was 
to modify the Jetty web.xml to include:


  index.jsp


In our Solr web.xml, and submitted a JIRA on the issue (I don't have the number 
handy, but I recall a quick response that the issue had been fixed).

Maybe Jetty's default has changed with the new version?

A quick test would be to try /solr/admin/index.jsp .  If that works, then this 
is probably your problem.

JRJ

-Original Message-
From: Scott Vanderbilt [mailto:li...@datagenic.com]
Sent: Tuesday, October 25, 2011 7:35 PM
To: solr-user@lucene.apache.org
Subject: Difficulties Installing Solr with Jetty 7.x

Hello. I am having trouble installing Solr 3.4.0 with Jetty 7.5.3. My OS
is OpenBSD 5.0, and JDK is 1.7.0.

I was able to successfully run the Solr example application which comes
bundled with an earlier version of Jetty (not sure which, but I'm
assuming pre-version 7). I would like--if at all possible--to run the
latest version of Jetty.

After some confusion resulting from the fact that the Jetty-specific
install docs at  are apparently
out of sync with the newest versions of Jetty, I was able to make some
progress by cloning the sample contexts file at
JETTY_HOME/contexts/test.xml, the contents of which is below. Also below
is the output when attempting to start Solr in the Jetty container.
(Sorry about the line-wrapping)

When I start Jetty's test application, I can successfully retrieve the
home page. However, when I attempt to start Solr, Jetty is obviously up
and serving HTTP requests, but attempts to connect to
 result in a 404.

Might someone be able to point out what mistake I am making? I'm sure
it's in the java output somewhere, but I am unable to discern where.
Alternatively, any pointers to relevant docs to help me get going would
also be greatly appreciated.

Many thanks in advance.


=
OUTPUT
=
jetty $/usr/local/jdk-1.7.0/bin/java -Dsolr.solr.home=/var/jetty/solr
-jar /var/jetty/start.jar
2011-10-25 16:44:50.110:INFO:oejs.Server:jetty-7.5.3.v20111011
2011-10-25 16:44:50.160:INFO:oejdp.ScanningAppProvider:Deployment
monitor /var/jetty/webapps at interval 1
2011-10-25 16:44:50.168:INFO:oejdp.ScanningAppProvider:Deployment
monitor /var/jetty/contexts at interval 1
2011-10-25 16:44:50.173:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/javadoc.xml
2011-10-25 16:44:50.240:INFO:oejsh.ContextHandler:started
o.e.j.s.h.ContextHandler{/javadoc,file:/var/jetty/javadoc/}
2011-10-25 16:44:50.241:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/test.xml
2011-10-25 16:44:50.358:INFO:oejw.WebInfConfiguration:Extract
jar:file:/var/jetty/webapps/test.war!/ to
/tmp/jetty-0.0.0.0-8080-test.war-_-any-/webapp
2011-10-25 16:44:51.155:INFO:oejsh.ContextHandler:started
o.e.j.w.WebAppContext{/,file:/tmp/jetty-0.0.0.0-8080-test.war-_-any-/webapp/},/var/jetty/webapps/test.war
2011-10-25 16:44:51.539:INFO:oejs.TransparentProxy:TransparentProxy @
/javadoc-proxy to http://download.eclipse.org/jetty/stable-7/apidocs
2011-10-25 16:44:51.543:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/solr.xml
2011-10-25 16:44:51.564:INFO:oejw.WebInfConfiguration:Extract
jar:file:/var/jetty/webapps/solr.war!/ to
/tmp/jetty-0.0.0.0-8080-solr.war-_-any-/webapp
2011-10-25 16:44:52.850:INFO:oejsh.ContextHandler:started
o.e.j.w.WebAppContext{/,file:/tmp/jetty-0.0.0.0-8080-solr.war-_-any-/webapp/},/var/jetty/webapps/solr.war
Oct 25, 2011 4:44:52 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Oct 25, 2011 4:44:52 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: using system property solr.solr.home: /var/jetty/solr
Oct 25, 2011 4:44:52 PM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/var/jetty/solr/'
Oct 25, 2011 4:44:53 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
Oct 25, 2011 4:44:53 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Oct 25, 2011 4:44:53 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: using system property solr.solr.home: /var/jetty/solr
Oct 25, 2011 4:44:53 PM org.apache.solr.core.CoreContainer$Initializer
initialize
INFO: looking for solr.xml: /var/jetty/solr/solr.xml
Oct 25, 2011 4:44:53 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Oct 25, 2011 4:44:53 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: using system property solr.solr

RE: Difficulties Installing Solr with Jetty 7.x

2011-10-26 Thread Jaeger, Jay - DOT
ERRATA, that should the the *SOLR* web.xml (not the Jetty web.xml)

Sorry for the confusion.

-Original Message-
From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov]
Sent: Wednesday, October 26, 2011 4:02 PM
To: 'solr-user@lucene.apache.org'
Subject: RE: Difficulties Installing Solr with Jetty 7.x

>From your logs, it looks like the Solr library is being found just fine, and 
>that the servlet is initing OK.

Does your Jetty configuration specify index.jsp in a welcome list?

We had that problem in WebSphere:  we got 404's the same way, and the cure was 
to modify the Jetty web.xml to include:


  index.jsp


In our Solr web.xml, and submitted a JIRA on the issue (I don't have the number 
handy, but I recall a quick response that the issue had been fixed).

Maybe Jetty's default has changed with the new version?

A quick test would be to try /solr/admin/index.jsp .  If that works, then this 
is probably your problem.

JRJ

-Original Message-
From: Scott Vanderbilt [mailto:li...@datagenic.com]
Sent: Tuesday, October 25, 2011 7:35 PM
To: solr-user@lucene.apache.org
Subject: Difficulties Installing Solr with Jetty 7.x

Hello. I am having trouble installing Solr 3.4.0 with Jetty 7.5.3. My OS
is OpenBSD 5.0, and JDK is 1.7.0.

I was able to successfully run the Solr example application which comes
bundled with an earlier version of Jetty (not sure which, but I'm
assuming pre-version 7). I would like--if at all possible--to run the
latest version of Jetty.

After some confusion resulting from the fact that the Jetty-specific
install docs at  are apparently
out of sync with the newest versions of Jetty, I was able to make some
progress by cloning the sample contexts file at
JETTY_HOME/contexts/test.xml, the contents of which is below. Also below
is the output when attempting to start Solr in the Jetty container.
(Sorry about the line-wrapping)

When I start Jetty's test application, I can successfully retrieve the
home page. However, when I attempt to start Solr, Jetty is obviously up
and serving HTTP requests, but attempts to connect to
 result in a 404.

Might someone be able to point out what mistake I am making? I'm sure
it's in the java output somewhere, but I am unable to discern where.
Alternatively, any pointers to relevant docs to help me get going would
also be greatly appreciated.

Many thanks in advance.


=
OUTPUT
=
jetty $/usr/local/jdk-1.7.0/bin/java -Dsolr.solr.home=/var/jetty/solr
-jar /var/jetty/start.jar
2011-10-25 16:44:50.110:INFO:oejs.Server:jetty-7.5.3.v20111011
2011-10-25 16:44:50.160:INFO:oejdp.ScanningAppProvider:Deployment
monitor /var/jetty/webapps at interval 1
2011-10-25 16:44:50.168:INFO:oejdp.ScanningAppProvider:Deployment
monitor /var/jetty/contexts at interval 1
2011-10-25 16:44:50.173:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/javadoc.xml
2011-10-25 16:44:50.240:INFO:oejsh.ContextHandler:started
o.e.j.s.h.ContextHandler{/javadoc,file:/var/jetty/javadoc/}
2011-10-25 16:44:50.241:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/test.xml
2011-10-25 16:44:50.358:INFO:oejw.WebInfConfiguration:Extract
jar:file:/var/jetty/webapps/test.war!/ to
/tmp/jetty-0.0.0.0-8080-test.war-_-any-/webapp
2011-10-25 16:44:51.155:INFO:oejsh.ContextHandler:started
o.e.j.w.WebAppContext{/,file:/tmp/jetty-0.0.0.0-8080-test.war-_-any-/webapp/},/var/jetty/webapps/test.war
2011-10-25 16:44:51.539:INFO:oejs.TransparentProxy:TransparentProxy @
/javadoc-proxy to http://download.eclipse.org/jetty/stable-7/apidocs
2011-10-25 16:44:51.543:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/solr.xml
2011-10-25 16:44:51.564:INFO:oejw.WebInfConfiguration:Extract
jar:file:/var/jetty/webapps/solr.war!/ to
/tmp/jetty-0.0.0.0-8080-solr.war-_-any-/webapp
2011-10-25 16:44:52.850:INFO:oejsh.ContextHandler:started
o.e.j.w.WebAppContext{/,file:/tmp/jetty-0.0.0.0-8080-solr.war-_-any-/webapp/},/var/jetty/webapps/solr.war
Oct 25, 2011 4:44:52 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Oct 25, 2011 4:44:52 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: using system property solr.solr.home: /var/jetty/solr
Oct 25, 2011 4:44:52 PM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/var/jetty/solr/'
Oct 25, 2011 4:44:53 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
Oct 25, 2011 4:44:53 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Oct 25, 2011 4:44:53 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: using system property solr.solr.home: /var/jetty/solr
Oct 25, 2011 4:44:53 PM org.apache.solr.core.CoreContainer$Initializer
initi

RE: Upgratding the Index from 1.4.1 to 3.4 using replication

2011-10-26 Thread Jaeger, Jay - DOT
I very much doubt that would work:  different versions of Lucene involved, and 
Solr replication does just a streamed file copy, nothing fancy.

JRJ

-Original Message-
From: Nemani, Raj [mailto:raj.nem...@turner.com] 
Sent: Wednesday, October 26, 2011 12:55 PM
To: solr-user@lucene.apache.org
Subject: Upgratding the Index from 1.4.1 to 3.4 using replication

All,

 

We are planning to upgrade our Solr instance from 1.4.1 to 3.4.  We
understand that we need to re-index all the documents given the changes
to the index structure.  If we setup a replication pipe with 1.4.1 as
the Master and 3.4 as the salve (with an empty index) is there would the
replication process convert the index from 1.4.1 format to 3.4 format?

 

Thanks so much in advance for your time and help.

Raj

 



Re: DisMax search

2011-10-26 Thread Erik Hatcher
Maybe a case sensitive issue?  defType it should be. 

   Erik

On Oct 26, 2011, at 16:03, jyn7  wrote:

> Hi,
> 
> I am using a dismax search and limiting the query parameters using qf:
> /solrbgp/select/?facet=true&qf=memnum&q=%229065%22&deftype=dismax&start=0&rows=10
> 
> My understanding is SOLR should now search only the memnum field for an
> exact match of "9065", since there is no data with memnum 9065 I should be
> getting an empty result set, but instead the result that comes up has 9065
> in another indexed string field(email) and not the memnum field.
> 
> 
> How do I limit the search to one single field?
> 
> Thanks.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/DisMax-search-tp3455671p3455671.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Analyzers from schema.xml with custom parser

2011-10-26 Thread Milan Dobrota
I created a custom plugin parser, and it seems like it is ignoring analyzers
from schema.xml. Is there any way to associate the two?


Re: Difficulties Installing Solr with Jetty 7.x

2011-10-26 Thread Scott Vanderbilt

Jay:

Thanks for the response.

$JETTY_HOME/etc/webdefault.xml is the unmodified file that came with 
Jetty, and it has a  referencing index.jsp, 
index.html, and index.htm.


Attempting to load /solr/admin.index.jsp generates a 404. All other URLs 
generate a 404 also, except /, which returns the Jetty test app home 
page. Not sure if this is useful, but that page contains the following info:


  This webapp is deployed in $JETTY_HOME/webapp/test and configured
  by $JETTY_HOME/contexts/test.xml

You refer to Solr's web.xml. I have no such file, or any other config 
files which are Solr-specific, so far as I can tell. I followed the Solr 
wiki page instructions , so 
apart from copying the solr.war into $JETTY_HOME/webapps/, the only 
other thing I copied over from the Solr example distribution was the 
directory apache-solr-3.4.0/example/solr/ as $JETTY_HOME/solr/.


My complete WAG is that the fix will lie somewhere in the contexts/ 
directory. I really see no other place to do Solr-specific configuration 
apart from $JETTY_HOME/etc/, and my intuition is that these files 
shouldn't be messed with unless the intention is to affect global 
container-wide behavior. Which I don't. I'm only trying to get Solr 
running. I may want to run other apps, so I'd rather leave Jetty's 
config files as is.




On 10/26/2011 2:05 PM, Jaeger, Jay - DOT wrote:

ERRATA, that should the the *SOLR* web.xml (not the Jetty web.xml)

Sorry for the confusion.

-Original Message-
From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov]
Sent: Wednesday, October 26, 2011 4:02 PM
To: 'solr-user@lucene.apache.org'
Subject: RE: Difficulties Installing Solr with Jetty 7.x


From your logs, it looks like the Solr library is being found just fine, and 
that the servlet is initing OK.

X-Spam-Status: No, hits=0.00 required=0.90

Does your Jetty configuration specify index.jsp in a welcome list?

We had that problem in WebSphere:  we got 404's the same way, and the cure was 
to modify the Jetty web.xml to include:


   index.jsp


In our Solr web.xml, and submitted a JIRA on the issue (I don't have the number 
handy, but I recall a quick response that the issue had been fixed).

Maybe Jetty's default has changed with the new version?

A quick test would be to try /solr/admin/index.jsp .  If that works, then this 
is probably your problem.

JRJ

-Original Message-
From: Scott Vanderbilt [mailto:li...@datagenic.com]
Sent: Tuesday, October 25, 2011 7:35 PM
To: solr-user@lucene.apache.org
Subject: Difficulties Installing Solr with Jetty 7.x

Hello. I am having trouble installing Solr 3.4.0 with Jetty 7.5.3. My OS
is OpenBSD 5.0, and JDK is 1.7.0.

I was able to successfully run the Solr example application which comes
bundled with an earlier version of Jetty (not sure which, but I'm
assuming pre-version 7). I would like--if at all possible--to run the
latest version of Jetty.

After some confusion resulting from the fact that the Jetty-specific
install docs at  are apparently
out of sync with the newest versions of Jetty, I was able to make some
progress by cloning the sample contexts file at
JETTY_HOME/contexts/test.xml, the contents of which is below. Also below
is the output when attempting to start Solr in the Jetty container.
(Sorry about the line-wrapping)

When I start Jetty's test application, I can successfully retrieve the
home page. However, when I attempt to start Solr, Jetty is obviously up
and serving HTTP requests, but attempts to connect to
  result in a 404.

Might someone be able to point out what mistake I am making? I'm sure
it's in the java output somewhere, but I am unable to discern where.
Alternatively, any pointers to relevant docs to help me get going would
also be greatly appreciated.

Many thanks in advance.


=
OUTPUT
=
jetty $/usr/local/jdk-1.7.0/bin/java -Dsolr.solr.home=/var/jetty/solr
-jar /var/jetty/start.jar
2011-10-25 16:44:50.110:INFO:oejs.Server:jetty-7.5.3.v20111011
2011-10-25 16:44:50.160:INFO:oejdp.ScanningAppProvider:Deployment
monitor /var/jetty/webapps at interval 1
2011-10-25 16:44:50.168:INFO:oejdp.ScanningAppProvider:Deployment
monitor /var/jetty/contexts at interval 1
2011-10-25 16:44:50.173:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/javadoc.xml
2011-10-25 16:44:50.240:INFO:oejsh.ContextHandler:started
o.e.j.s.h.ContextHandler{/javadoc,file:/var/jetty/javadoc/}
2011-10-25 16:44:50.241:INFO:oejd.DeploymentManager:Deployable added:
/var/jetty/contexts/test.xml
2011-10-25 16:44:50.358:INFO:oejw.WebInfConfiguration:Extract
jar:file:/var/jetty/webapps/test.war!/ to
/tmp/jetty-0.0.0.0-8080-test.war-_-any-/webapp
2011-10-25 16:44:51.155:INFO:oejsh.ContextHandler:started
o.e.j.w.WebAppContext{/,file:/

exact matches are not filtered to the top

2011-10-26 Thread Ji, Jason
Hi guys,

We have a case that we need to do wildcard search for either user's realname or 
username.(note that realname is not mandatory)
So we specified the copyField as below:

  
  





   



  



  
  




  



But the problem is that when the search results are displayed , the exact 
matches are not filtered to the top.

This should be a simple use case, anyone can suggest when goes wrong ?

Thanks,
Jason


Re: help needed on solr-uima integration

2011-10-26 Thread Xue-Feng Yang
Hi,

Is there logging for uima? From Logging in Solr Admin page, I couldn't find it.


Thanks,

Xue-Feng




From: Xue-Feng Yang 
To: "solr-user@lucene.apache.org" 
Sent: Tuesday, October 25, 2011 8:50:05 PM
Subject: Re: help needed on solr-uima integration


I configured solr-uima integration as the resource() I could found, but the 
data import results had empty data from uima. The other fields not from uima 
were there and no error messages. 


The following were the steps I did:

1) set shema.xml with all fields of both uima and non uima.
2) set lib, updateRequestProcessorChain for AE and maps,  requestHandler for 
update, and  DataImportHandler for config and update.processor.


Do I still miss anything?

Thanks,

Xue-Feng




From: Xue-Feng Yang 
To: "solr-user@lucene.apache.org" 
Sent: Monday, October 24, 2011 11:21:14 AM
Subject: Re: help needed on solr-uima integration


Thanks Koji. I found it. I should the solution there.

Xue-Feng



From: Koji Sekiguchi 
To: solr-user@lucene.apache.org
Sent: Monday, October 24, 2011 7:30:01 AM
Subject: Re: help needed on solr-uima integration

(11/10/24 17:42), Xue-Feng Yang wrote:
> Hi,
>
> Where can I find test code for
 solr-uima component?

You should find them under:

solr/contrib/uima/src/test

koji
-- 
Check out "Query Log Visualizer" for Apache Solr
http://www.rondhuit-demo.com/loganalyzer/loganalyzer.html
http://www.rondhuit.com/en/

Re: help needed on solr-uima integration

2011-10-26 Thread Xue-Feng Yang
Hi,

From Solr Info page, I can see my solr-uima core is there, but 
updateRequestProcessorChain is not there. What is the reason?

Thanks,

Xue-Feng





From: Xue-Feng Yang 
To: "solr-user@lucene.apache.org" 
Sent: Wednesday, October 26, 2011 7:56:21 PM
Subject: Re: help needed on solr-uima integration

Hi,

Is there logging for uima? From Logging in Solr Admin page, I couldn't find it.


Thanks,

Xue-Feng




From: Xue-Feng Yang 
To: "solr-user@lucene.apache.org" 
Sent: Tuesday, October 25, 2011 8:50:05 PM
Subject: Re: help needed on solr-uima integration


I configured solr-uima integration as the resource() I could found, but the 
data import results had empty data from uima. The other fields not from uima 
were there and no error messages. 


The following were the steps I did:

1) set shema.xml with all fields of both uima and non uima.
2) set lib, updateRequestProcessorChain for AE and maps,  requestHandler for 
update, and  DataImportHandler for config and update.processor.


Do I still miss anything?

Thanks,

Xue-Feng




From: Xue-Feng Yang 
To: "solr-user@lucene.apache.org" 
Sent: Monday, October 24, 2011 11:21:14 AM
Subject: Re: help needed on solr-uima integration


Thanks Koji. I found it. I should the solution there.

Xue-Feng



From: Koji Sekiguchi 
To: solr-user@lucene.apache.org
Sent: Monday, October 24, 2011 7:30:01 AM
Subject: Re: help needed on solr-uima integration

(11/10/24 17:42), Xue-Feng Yang wrote:
> Hi,
>
> Where can I find test code for
solr-uima component?

You should find them under:

solr/contrib/uima/src/test

koji
-- 
Check out "Query Log Visualizer" for Apache Solr
http://www.rondhuit-demo.com/loganalyzer/loganalyzer.html
http://www.rondhuit.com/en/

Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-26 Thread Michael Sokolov
Have you checked to see when you are committing?  Is the pattern the 
same in both instances?  If you are committing after each delete request 
in Java, but not in Perl, that could slow things down.


On 10/25/2011 5:53 PM, Shawn Heisey wrote:

On 10/20/2011 11:00 AM, Shawn Heisey wrote:
I've got two build systems for my Solr index that I wrote.  The first 
one is in Perl and uses GET/POST requests via HTTP, the second is in 
Java using SolrJ.  I've noticed a performance discrepancy when 
processing every one of my delete records, currently about 25000 of 
them.  It takes about 5 seconds in Perl and a minute or more via 
SolrJ.  In the perl system, I do a full delete like this once an 
hour.  The performance impact of doing it once an hour in the SolrJ 
version has forced me to do it only once per day.  The normal delete 
process in both cases looks for new records and deletes just those.  
It happens every two minutes in the Perl program and every minute in 
the Java program. 




Re: Query/Delete performance difference between straight HTTP and SolrJ

2011-10-26 Thread Shawn Heisey

On 10/26/2011 6:16 PM, Michael Sokolov wrote:
Have you checked to see when you are committing?  Is the pattern the 
same in both instances?  If you are committing after each delete 
request in Java, but not in Perl, that could slow things down.


The commit happens separately, not during the process.  The java logs I 
pasted did not include the other things that happen afterwards, or the 
commit, which can take another 10-30 seconds.


Here's the outer-level code that does the full update cycle.  It does 
deletes, reinserts (documents that have been changed), and inserts (new 
content), then a commit.  The innermost commit method (passed down from 
the code below through a couple of object levels) spits log messages of 
its own, and indicates that no commits are happening until after 
everything is done.


/**
 * Do all the updates.
 *
 * @throws IdxException
 */
public synchronized void updateIndex(boolean fullUpdate,
boolean useBuildCore) throws IdxException
{
refreshFlags();
if (fullUpdate)
{
_fullDelete = true;
_fullReinsert = true;
}

if (_dailyOptimizeStarted)
{
LOG.info(_lp
+ "Skipping delete and reinsert 
- optimization underway.");

}
else
{
doDelete(_fullDelete, useBuildCore);
doReinsert(_fullReinsert, useBuildCore);
turnOffFullUpdate();
}
doInsert(useBuildCore);
doCommit(useBuildCore);
}

Due to the multihreading of delete requests, I now have the full delete 
down to 10-15 seconds instead of a minute or more.  This is now an 
acceptable time, but I am completely mystified as to why the Pelr code 
can do it without multithreading just as fast, and often faster.  The 
Java code is long-running, and the Perl code is started by cron.  If you 
look back to the first message on the thread, you'll see commit messages 
in the Perl log, but those commits are done with the wait options set to 
false.  That's an extra step the Java code isn't doing - and it's STILL 
faster.


Thanks,
Shawn



Re: DisMax search

2011-10-26 Thread jyn7
I am searching for 9065 , so its not about case sensitivity. My search is
searching across all the field names and not limiting it to one
field(specified in the qf param and using deftype dismax)

--
View this message in context: 
http://lucene.472066.n3.nabble.com/DisMax-search-tp3455671p3456716.html
Sent from the Solr - User mailing list archive at Nabble.com.