Re: Status of Spelt integration

2009-12-07 Thread freerk55

The standard spell checker of Thunderbird works in eGroupware.
But not in Felamimail!!?? Why not?
How can I get it working as it does in the rest of eGroupware?

Freerk Jongsma



Toby Cole-2 wrote:
> 
> Hi Andrew,
>   We ended up abandoning the spelt integration as the built in solr  
> spellchecking improved so much during our project. Also, if you did go  
> the route of using spelt, I'd implement it as a spellcheck plugin  
> (which didn't exist as a concept when we started trying to shoehorn  
> spelt into solr).
> Regards, Toby.
> 
> On 30 Nov 2009, at 11:29, Andrey Klochkov wrote:
> 
>> Hi all
>>
>> I searched through the mail-list archives and saw that sometime ago  
>> Toby
>> Cole was going to integrate a spellchecker named Spelt into Solr. Does
>> anyone now what's the status of this? Anyone tried to use it with  
>> Solr? Does
>> it make sense to try it instead of standard spell checker?
>>
>> Some links on the subject:
>> http://markmail.org/message/cqt4qtzzwyceltqu#query:+page:1+mid:cqt4qtzzwyceltqu+state:results
>> http://markmail.org/search/?q=spelt#query:spelt+page:1+mid:krzofzojhg7hmms7+state:results
>> http://groups.google.com/group/spelt
>>
>> -- 
>> Andrew Klochkov
>> Senior Software Engineer,
>> Grid Dynamics
> 
> 




Re: Status of Spelt integration

2009-12-07 Thread Toby Cole

I'm pretty sure this isn't a Solr related question.
Have you tried asking on the eGroupware mailing lists? 
http://sourceforge.net/mail/?group_id=78745
Toby.

On 7 Dec 2009, at 08:52, freerk55 wrote:



The standard spell checker of Thunderbird works in eGroupware.
But not in Felamimail!!?? Why not?
How can I get it working as it does in the rest of eGroupware?

Freerk Jongsma



Toby Cole-2 wrote:


Hi Andrew,
We ended up abandoning the spelt integration as the built in solr
spellchecking improved so much during our project. Also, if you did  
go

the route of using spelt, I'd implement it as a spellcheck plugin
(which didn't exist as a concept when we started trying to shoehorn
spelt into solr).
Regards, Toby.

On 30 Nov 2009, at 11:29, Andrey Klochkov wrote:


Hi all

I searched through the mail-list archives and saw that sometime ago
Toby
Cole was going to integrate a spellchecker named Spelt into Solr.  
Does

anyone now what's the status of this? Anyone tried to use it with
Solr? Does
it make sense to try it instead of standard spell checker?

Some links on the subject:
http://markmail.org/message/cqt4qtzzwyceltqu#query:+page:1+mid:cqt4qtzzwyceltqu+state:results
http://markmail.org/search/?q=spelt#query:spelt+page:1+mid:krzofzojhg7hmms7+state:results
http://groups.google.com/group/spelt

--
Andrew Klochkov
Senior Software Engineer,
Grid Dynamics










Re: Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory

2009-12-07 Thread Koji Sekiguchi

Robin Wojciki wrote:

Koji, I was able to create a minimal replication.

Attached zip has solr.xml, solrconf.xml and Main.java. I was able to
replicate the issue by replacing the conf files in
apache-solr-1.4.0/example/solr/conf and running the class Main. Could
you please confirm whether this replication is enough?

Also, please let me know if I should log the ticket with Lucene or Solr.

Thanks,
Robin
  


Robin,

I reproduced the problem with your sample data, but it could be reproducible
without HTMLStripCharFilter ... I commented out the HTML strippers
in schema.xml and rebuilt the index with the following data:

<add>
  <doc>
    <field name="id">debug-1</field>
    <!-- second field name assumed -->
    <field name="text">hello world WGKEKW AWEHGSE</field>
  </doc>
</add>


still the exception occurred.

Can you check it and open a JIRA issue for Solr?

Thank you!

Koji

--
http://www.rondhuit.com/en/



Re: Question about the message "Indexing failed. Rolled back all changes."

2009-12-07 Thread yountod

That was it!  Thank you for the tip.  To clarify for other beginners:  Create
a blank file called dataimport.properties in your conf directory and don't
forget to make sure the system has write access to it.
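Once an import succeeds, the DataImportHandler writes the last index time into
that file, so it ends up containing something like (timestamp illustrative):

last_index_time=2009-12-07 12\:00\:00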




Lance Norskog-2 wrote:
> 
> This is definitely a bug. Please open a JIRA issue for this.
> 
> On Sat, Nov 21, 2009 at 10:53 AM, Bertie Shen 
> wrote:
>> Hey,
>>
>>  I figured out why we always see "Indexing failed.
>> Rolled back all changes.". It is because we need a
>> dataimport.properties file at conf/, into which indexing will write the
>> last indexing time. Without that file, SolrWriter.java will throw an
>> exception and Solr will produce this misleading "Indexing failed.
>> Rolled back all changes." output, although indexing actually
>> completed successfully.
>>
>>  I think we need to improve this functionality, at least documentation.
>>
>>  There are one more thing that we need to pay attention to, i.e. we need
>> to
>> make dataimport.properties writable by other users, otherwise,
>> last_index_time will not be written and the error message may still be
>> there.
>>
>> On Fri, Nov 13, 2009 at 9:35 AM, yountod  wrote:
>>
>>>
>>> The process initially completes with:
>>>
>>>  2009-11-13 09:40:46
>>>  Indexing completed. Added/Updated: 20 documents.
>>> Deleted
>>> 0 documents.
>>>
>>>
>>> ...but then it fails with:
>>>
>>>  2009-11-13 09:40:46
>>>   Indexing failed. Rolled back all changes.
>>>   2009-11-13 09:41:10
>>>  2009-11-13 09:41:10
>>>  2009-11-13 09:41:10
>>>
>>>
>>> 
>>> I think it may have something to do with this, which I found by using
>>> the
>>> DataImport.jsp:
>>> 
>>> (Thread.java:636) Caused by: java.sql.SQLException: Illegal value for
>>> setFetchSize(). at
>>> com.mysql.jdbc.Statement.setFetchSize(Statement.java:1864) at
>>>
>>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.(JdbcDataSource.java:242)
>>> ... 28 more
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Question-about-the-message-%22Indexing-failed.-Rolled-back-all--changes.%22-tp26242714p26340360.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
> 
> 
> 
> -- 
> Lance Norskog
> goks...@gmail.com
> 
> 




Multiindexing

2009-12-07 Thread Jörg Agatz
Hi Users,

I need help with multi-indexing in Solr.

I want one core and 3 to 5 different indexes, so I can search simultaneously in
all of them or in some of them.
I found the help in the wiki, but it doesn't help:
http://wiki.apache.org/solr/MultipleIndexes?highlight=%28multi%29
It says nothing about multi-indexing in Solr, and neither does the Solr 1.4 book.

Is there no way to use more than one index in one core/instance?

King


DIH Updating

2009-12-07 Thread Lee Smith

Hello All

Sorry, newbie question.

I'm looking at using the Data Import Handler to add my data to Solr.

But I am a little confused about how I go about updating the index. I
understand there is no in-place update, just a delete and re-add, but how
will Solr know what to remove and add?


I also hope someone doesn't mind giving me advice on the schema I should
use.


I will be indexing multiple tables, as each table represents a different
type of search. Here are the tables and the columns I'm looking at adding
to Solr.


Files:
- id
- display_name
- server_path
- file_type
- project_id

Folders:
- id
- folder_name
- fullpath
- project_id

Dailies:
- id
- scene
- take
- description
- filename (join)
- project_id

Assets
- id
- title
- project_id

Calendar: (Events)
- id
- title
- description
- project_id

On top of this I will be looking at doing full indexing using solr  
cell of the documents held in the file data table.


Hope some can point me in the right direction and thank you in advance

Regards

Lee




Solr Search in stemmed and non stemmed mode

2009-12-07 Thread khalid y
Hi !!

I'm looking for a way to have two indexes in Solr, one stemmed and one
non-stemmed. Why? It's simple :-)

My users can query for:
-  banking marketing => it returns all documents matching bank* and
market*
- "banking" marketing => it returns all documents matching "banking" exactly
and market*

The second request requires that I can switch between stemmed and non-stemmed
when the user writes the keyword in quotes.

The optimal solution: Solr can gracefully mix results from the stemmed and
non-stemmed indexes, with a good score calculation, etc.

The near-optimal solution: if Solr sees quotes, it switches to non-stemmed
mode for all keywords in the query.

I have an idea, but I prefer to hear the community's voice before proposing
it. I'll describe it in my next post.

If someone has a graceful idea for doing this :-)

Thanks


Re: Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory

2009-12-07 Thread Robin Wojciki
Koji,

In the sample I sent, the exception comes only if the
HTMLStripCharFilter is there.

However, your test case seems to capture the essence. Sorry if I sent
you on a wild goose chase.

Thanks for taking the time! I will log a ticket.
Robin

On Mon, Dec 7, 2009 at 5:09 PM, Koji Sekiguchi  wrote:
> Robin Wojciki wrote:
>>
>> Koji, I was able to create a minimal replication.
>>
>> Attached zip has solr.xml, solrconf.xml and Main.java. I was able to
>> replicate the issue by replacing the conf files in
>> apache-solr-1.4.0/example/solr/conf and running the class Main. Could
>> please confirm if this replication is enough.
>>
>> Also, please let me know if I should log the ticket with Lucene or Solr.
>>
>> Thanks,
>> Robin
>>
>
> Robin,
>
> I reproduced the problem with your sample data, but it could be
> reproduceable
> without HTMLStripCharFilter ... I commented out HTML Strippers
> in schema.xml and rebuild indexes with the following data:
>
> 
>  
>   debug-1
>   hello world WGKEKW AWEHGSE
>  
> 
>
> still the exception occurred.
>
> Can you check it and open a JIRA issue for Solr?
>
> Thank you!
>
> Koji
>
> --
> http://www.rondhuit.com/en/
>
>


RE: search on tomcat server

2009-12-07 Thread Jill Han
In fact, I just followed the instructions titled "Tomcat On Windows".
Here are the updates on my computer:
1. set -Dsolr.solr.home=C:\solr\example
2. changed dataDir to C:\solr\example\data in solrconfig.xml 
   at C:\solr\example\conf
3. created solr.xml at C:\Tomcat 5.5\conf\Catalina\localhost


  


I restarted Tomcat, went to http://localhost:8080/solr/admin/
Entered video in Query String field, and got
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="rows">10</str>
      <str name="start">0</str>
      <str name="indent">on</str>
      <str name="q">video</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <!-- numFound assumed; the result element was empty -->
  <result name="response" numFound="0" start="0"/>
</response>
My questions are:
1. Is the setting correct?
2. Where does Solr start searching for the words entered in the Query String field?
3. How can I make the result page look like a normal search results page, i.e.
showing "not found", or a URL when found, instead of returning XML?


Thanks a lot for your help,

Jill

-Original Message-
From: William Pierce [mailto:evalsi...@hotmail.com] 
Sent: Friday, December 04, 2009 12:56 PM
To: solr-user@lucene.apache.org
Subject: Re: search on tomcat server

Have you gone through the solr tomcat wiki?

http://wiki.apache.org/solr/SolrTomcat

I found this very helpful when I did our solr installation on tomcat.

- Bill

--
From: "Jill Han" 
Sent: Friday, December 04, 2009 8:54 AM
To: 
Subject: RE: search on tomcat server

> I went through all the links on 
> http://wiki.apache.org/solr/#Search_and_Indexing
> And still have no clue as how to proceed.
> 1. do I have to do some implementation in order to get solr to search doc. 
> on tomcat server?
> 2. if I have files, such as .doc, docx, .pdf, .jsp, .html, etc under 
> window xp, c:/tomcat/webapps/test1, /webapps/test2,
>   What should I do to make solr search those directories
> 3. since I am using tomcat, instead of jetty, is there any demo that shows 
> the solr searching features, and real searching result?
>
> Thanks,
> Jill
>
>
> -Original Message-
> From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
> Sent: Monday, November 30, 2009 10:40 AM
> To: solr-user@lucene.apache.org
> Subject: Re: search on tomcat server
>
> On Mon, Nov 30, 2009 at 9:55 PM, Jill Han  wrote:
>
>> I got solr running on the tomcat server,
>> http://localhost:8080/solr/admin/
>>
>> After I enter a search word, such as, solr, then hit Search button, it
>> will go to
>>
>> http://localhost:8080/solr/select/?q=solr&version=2.2&start=0&rows=10&in
>> dent=on
>>
>>  and display
>>
>>   
>>
>> -
>> > ndent=on>
>>  <
>>
>> -
>> > ndent=on>
>>  <  
>>
>>  <0
>>
>>  <0
>>
>> -
>> > ndent=on>
>>  <
>>
>>  <  10
>>
>>  <  0
>>
>>  <  on
>>
>>  <  solr
>>
>>  <  2.2
>>
>> 
>>
>>   
>>
>>  <  
>>
>>  
>>
>>  My question is what is the next step to search files on tomcat server?
>>
>>
>>
> Looks like you have not added any documents to Solr. See the "Indexing
> Documents" section at http://wiki.apache.org/solr/#Search_and_Indexing
>
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 


Re: question about schemas

2009-12-07 Thread solr-user


Lance Norskog-2 wrote:
> 
> You can make a separate facet field which contains a range of "buckets":
> 10, 20, 50, or 100 means that the field has a value 0-10, 11-20, 21-50, or
> 51-100. You could use a separate filter query with values for these
> buckets. Filter queries are very fast in Solr 1.4 and this would limit
> your range query execution to documents which match the buckets.
> 

Lance, I am afraid that I do not see how to use this suggestion.

Which of the three (four?) suggested schemas would I be using?  How would
these range facets prevent the potential issues I found such as getting
product facets instead of customer facets, or having very large numbers of
ANDs and ORs, and so forth.



Re: DIH Updating

2009-12-07 Thread Erick Erickson
The short form is that you must provide and identify a unique key (analogous
to a database PK). See http://wiki.apache.org/solr/UniqueKey

There's an example here:
http://wiki.apache.org/solr/DataImportHandler

But one red flag in your mail
is that you're thinking in terms
of tables. Think about *denormalizing* your data. There's really
no such thing as a join in SOLR/Lucene, and attempts
to emulate a DB-style join should be resisted ...

I know, I know. It really goes against the trained instincts of
a DB person to *replicate* data all over the place. But
search engines and RDBMSs are two very different beasts
and trying to make one behave like the other is usually...er...
unfortunate.

Before you get too far into your migration, I'd *strongly* advise
you to spend some time imagining what form a query would take
with your proposed schema. Don't even bother with using
SOLR query syntax at first, just construct your query with boolean
logic. Something like id:234 AND title:wonderful AND title:life. No
sub-selects, joins, etc. allowed. This will inform your schema no
end
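Just as a rough sketch (field names invented - adjust to taste), your five
tables could collapse into one flat document shape, with a type field to
filter on:

<field name="id"          type="string" indexed="true" stored="true" required="true"/> <!-- must be unique across tables, e.g. file-123 -->
<field name="doc_type"    type="string" indexed="true" stored="true"/>  <!-- file, folder, daily, asset, event -->
<field name="project_id"  type="string" indexed="true" stored="true"/>
<field name="name"        type="text"   indexed="true" stored="true"/>  <!-- display_name, folder_name, title, ... -->
<field name="description" type="text"   indexed="true" stored="true"/>
<field name="path"        type="string" indexed="true" stored="true"/>  <!-- server_path / fullpath -->

A query then needs no join at all, e.g. doc_type:file AND project_id:42 AND
name:storyboard.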

HTH
Erick

On Mon, Dec 7, 2009 at 9:13 AM, Lee Smith  wrote:

> Hello All
>
> Sorry newbie Q.
>
> Im looking at using the Data Import Handler to add my data to solr.
>
> But I am a little confused how I go about updating the index. I understand
> there is no update index so just a delete replace but how will solr know
> what to remove and add ?
>
> Also hope someone does not mind giving me advice on my scema I should use.
>
> I will be indexing multiple tables as each table means a different type of
> search. Here is the tables and the rows im looking at adding to solr.
>
> Files:
> - id
> - display_name
> - server_path
> - file_type
> - project_id
>
> Folders:
> - id
> - folder_name
> - fullpath
> - project_id
>
> Dailies:
> - id
> - scene
> - take
> - description
> - filename (join)
> - project_id
>
> Assets
> - id
> - title
> - project_id
>
> Calendar: (Events)
> - id
> - title
> - description
> - project_id
>
> On top of this I will be looking at doing full indexing using solr cell of
> the documents held in the file data table.
>
> Hope some can point me in the right direction and thank you in advance
>
> Regards
>
> Lee
>
>
>


Re: Solr 1.4: StringIndexOutOfBoundsException in SpellCheckComponent with HTMLStripCharFilterFactory

2009-12-07 Thread Robin Wojciki
Logged a ticket for Solr: https://issues.apache.org/jira/browse/SOLR-1630

Thanks,
Robin

On Mon, Dec 7, 2009 at 9:36 PM, Robin Wojciki  wrote:
> Koji,
>
> In the sample I sent, the exception comes only if the
> HTMLStripCharFilter is there.
>
> However, your test case seems to capture the essence. Sorry if I sent
> you on a wild goose chase.
>
> Thanks for taking the time! I will log a ticket.
> Robin
>
> On Mon, Dec 7, 2009 at 5:09 PM, Koji Sekiguchi  wrote:
>> Robin Wojciki wrote:
>>>
>>> Koji, I was able to create a minimal replication.
>>>
>>> Attached zip has solr.xml, solrconf.xml and Main.java. I was able to
>>> replicate the issue by replacing the conf files in
>>> apache-solr-1.4.0/example/solr/conf and running the class Main. Could
>>> please confirm if this replication is enough.
>>>
>>> Also, please let me know if I should log the ticket with Lucene or Solr.
>>>
>>> Thanks,
>>> Robin
>>>
>>
>> Robin,
>>
>> I reproduced the problem with your sample data, but it could be
>> reproduceable
>> without HTMLStripCharFilter ... I commented out HTML Strippers
>> in schema.xml and rebuild indexes with the following data:
>>
>> 
>>  
>>   debug-1
>>   hello world WGKEKW AWEHGSE
>>  
>> 
>>
>> still the exception occurred.
>>
>> Can you check it and open a JIRA issue for Solr?
>>
>> Thank you!
>>
>> Koji
>>
>> --
>> http://www.rondhuit.com/en/
>>
>>
>


Re: Solr Search in stemmed and non stemmed mode

2009-12-07 Thread Erick Erickson
Try searching the mail archive for
stemmer exact match
or similar, this has been discussed multiple times and you'll get more
complete discussions way faster.

One suggestion is to use two fields, one for the stemmed version
and one for the original, then use whichever field you need via the
DisMax handler (more detail in the mail archive).
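A rough sketch (field names invented; "textgen" stands in here for whatever
unstemmed text type your schema has):

<field name="body_stemmed" type="text"    indexed="true" stored="false"/>
<field name="body_exact"   type="textgen" indexed="true" stored="false"/>
<copyField source="body" dest="body_stemmed"/>
<copyField source="body" dest="body_exact"/>

Then point the handler's qf at body_stemmed, body_exact, or both, depending
on whether the user quoted the term.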

Best
Erick

On Mon, Dec 7, 2009 at 10:02 AM, khalid y  wrote:

> Hi !!
>
> I'm looking for a way to have two index in solr one stemmed and another non
> stemmed. Why ? It's simple :-)
>
> My users can do query for  :
> -  banking marketing =>  it return all document matches bank*** and
> market***
> - "banking" marketing => it return all document matches "banking" and
> market***
>
> The second request need that I can switch between stemmed and not stemmed
> when the user write the keyword with quotes.
>
> The  optimal solution is : solr can mix gracefully results from stemmed and
> non stemmed index, with a good score calculation ect...
>
> The near optimal solution is : if solr see " " it switch in non stemmed mod
> for all key words in query
>
> I have an idea but I prefer to listen the comunity voice before to propose
> it. I'll expose it in my next post.
>
> If someone has an graceful idea to do this craps :-)
>
> Thanks
>


Re: Solr Search in stemmed and non stemmed mode

2009-12-07 Thread khalid y
Thanks,

I'll read the mail archive.

Your suggestion is like mine but without the DisMax handler. I'm going to
read up on what this handler is.
I have one field "text" and another "text_unstemmed" into which I copy all the
other fields. I'm writing a custom query handler that checks whether quotes
exist and switches to the right field.

Going to read...

Thanks


2009/12/7 Erick Erickson 

> Try searching the mail archive for
> stemmer exact match
> or similar, this has been discussed multiple times and you'll get more
> complete discussions wy faster
>
> One suggestion is to use two fields, one for the stemmed version
> and one for the original, then use whichever field you need to via
> DixMax handler (more detail in the mail archive).
>
> Best
> Erick
>
> On Mon, Dec 7, 2009 at 10:02 AM, khalid y  wrote:
>
> > Hi !!
> >
> > I'm looking for a way to have two index in solr one stemmed and another
> non
> > stemmed. Why ? It's simple :-)
> >
> > My users can do query for  :
> > -  banking marketing =>  it return all document matches bank*** and
> > market***
> > - "banking" marketing => it return all document matches "banking" and
> > market***
> >
> > The second request need that I can switch between stemmed and not stemmed
> > when the user write the keyword with quotes.
> >
> > The  optimal solution is : solr can mix gracefully results from stemmed
> and
> > non stemmed index, with a good score calculation ect...
> >
> > The near optimal solution is : if solr see " " it switch in non stemmed
> mod
> > for all key words in query
> >
> > I have an idea but I prefer to listen the comunity voice before to
> propose
> > it. I'll expose it in my next post.
> >
> > If someone has an graceful idea to do this craps :-)
> >
> > Thanks
> >
>


Re: question about schemas (and SOLR-1131?)

2009-12-07 Thread solr-user


wojtekpia wrote:
> 
> Could this be solved with a multi-valued custom field type (including a
> custom comparator)? The OP's situation deals with multi-valuing products
> for each customer. If products contain strictly numeric fields then it
> seems like a custom field implementation (or extension of BinaryField?)
> *should* be easy - only the comparator part needs work. I'm not clear on
> how the existing query parsers would handle this though, so there's
> probably some work there too. 
> https://issues.apache.org/jira/browse/SOLR-1131 SOLR-1131  seems like a
> more general solution that supports analysis that numeric fields don't
> need.
> 

Thank you for your suggestion.

It was my hope that I had simply not understood how to properly define the
schema in Solr, or that I had not understood how to use the existing Solr
functionality.

I will further look into the suggestions that I have received so far,
however I have concerns that my Solr project cannot proceed with the
technology present.  Lance may be correct in his assertion that I am using
the incorrect tool for the job.



RE: Embedded for write, HTTP for read - cache aging

2009-12-07 Thread Peter 4U

Hi Erik,

 

Thanks for your answer.

 

Yes, I've done an /update to the http server, which certainly works as far as 
the 'reading' goes.

This sends the update to the back-end index though, which essentially defeats 
the purpose of having the embedded instance do the write (as writes are always 
local, but reads might be remote, the goal is for super-fast writes, at the 
potential cost of slower reads). Maybe the http server can be set as 
'Read-only' (redirected /update handler) so that it doesn't hit the back-end 
indexer, but still tells it to check the index on the next read?

 

The main performance bottleneck isn't Solr itself, but the HTTP 
wrapping/transmission.

At low traffic rates, it really makes no difference at all.

But when you get into 1000's writes/sec the http wrapping and transmission 
becomes more and more significant as the traffic rate rises. On average, we've 
seen ~3-8% efficiency increase at very high rates (using a typical Windows TCP 
stack). This might not seem like much, but at really high screaming input 
rates, it does make a difference.

The EmbeddedSolr instance itself wraps each request into an XML request, so I 
believe the performance of the EmbeddedSolr instance could be increased if it 
handled requests without any wrapping at all (NamedList).

 

Thanks,

Peter

 


 
> From: erik.hatc...@gmail.com
> To: solr-user@lucene.apache.org
> Subject: Re: Embedded for write, HTTP for read - cache aging
> Date: Mon, 7 Dec 2009 05:49:01 +0100
> 
> 
> On Dec 5, 2009, at 12:56 PM, Peter 4U wrote:
> > Does anyone know of a way to tell an http SolrServer to reload its 
> > back-end index (mark cache as dirty) periodically?
> 
> Send a  to the HTTP SolrServer.
> 
> > I have a scenario where an EmbeddedSolrServer is used for writing 
> > (for fast indexing), and an
> >
> > CommonsHttpSolrServer for reading (for remote access).
> 
> I'm curious, now much faster is it in your situation?
> 
> Erik
> 
  
_
Have more than one Hotmail account? Link them together to easily access both
 http://clk.atdmt.com/UKM/go/186394591/direct/01/

Re: Multiple Solr Instances - Multiple Jetty Instances

2009-12-07 Thread Smiley, David W.
If you have many documents (say > 10M documents, probably a larger threshold) 
then you will benefit from sharding your index, i.e. splitting your index up 
into multiple cores and using distributed searches.  You could use one VM and 
multiple cores just fine, assuming you have multiple CPUs.
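For example (host and core names hypothetical), a distributed query across two
cores on the same box looks like:

http://localhost:8983/solr/core0/select?q=foo&shards=localhost:8983/solr/core0,localhost:8983/solr/core1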

If not, then I see no point in using more Java VMs.  Java is pretty scalable in 
the enterprise, you know.

~ David Smiley
Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/

On Dec 6, 2009, at 11:56 PM, Amit Nithian wrote:

> This may be a silly question but is there any capacity gain if I run
> multiple jetty instances each having their own SOLR_HOME where each jetty
> instance/solr will replicate their index from a separate cluster of masters?
> I have a couple powerful multi-core servers and am not sure if/how a single
> JVM takes advantage of multi-cores and feel that I could increase my
> resource usage and hence search capacity by running multiple jetty instances
> per server as opposed to adding more machines.
> 
> Physical redundancy aside, is this acceptable practice?
> 
> Thanks!
> Amit




RE: search on tomcat server

2009-12-07 Thread Sascha Szott
Hi Jill,

just to make sure your index contains at least one document, what is the
output of a query for all documents?
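(Assuming the default select handler and your port, something like
http://localhost:8080/solr/select?q=*:*&rows=0
would do - numFound should be greater than zero.)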



Best,
Sascha

Jill Han wrote:
> In fact, I just followed the instructions titled as Tomcat On Windows.
> Here are the updates on my computer
> 1. -Dsolr.solr.home=C:\solr\example
> 2. change dataDir to C:\solr\example\data in
> solrconfig.xml at C:\solr\example\conf
> 3. created solr.xml at C:\Tomcat 5.5\conf\Catalina\localhost
> 
>  crossContext="true">
>value="c:/solr/example" override="true"/>
> 
>
> I restarted Tomcat, went to http://localhost:8080/solr/admin/
> Entered video in Query String field, and got
> /**
> 
> - 
> - 
>   0
>   0
> - 
>   10
>   0
>   on
>   video
>   2.2
>   
>   
>   
>   
> /
> My questions are
> 1. is the setting correct?
> 2. where does solr start to search words entered in Query String field
> 3. how can I make result page like general searching result page, such as,
> not found, if found, a url, instead of xml will be returned.
>
>
> Thanks a lot for your helps,
>
> Jill
>
> -Original Message-
> From: William Pierce [mailto:evalsi...@hotmail.com]
> Sent: Friday, December 04, 2009 12:56 PM
> To: solr-user@lucene.apache.org
> Subject: Re: search on tomcat server
>
> Have you gone through the solr tomcat wiki?
>
> http://wiki.apache.org/solr/SolrTomcat
>
> I found this very helpful when I did our solr installation on tomcat.
>
> - Bill
>
> --
> From: "Jill Han" 
> Sent: Friday, December 04, 2009 8:54 AM
> To: 
> Subject: RE: search on tomcat server
>
>> I went through all the links on
>> http://wiki.apache.org/solr/#Search_and_Indexing
>> And still have no clue as how to proceed.
>> 1. do I have to do some implementation in order to get solr to search
>> doc.
>> on tomcat server?
>> 2. if I have files, such as .doc, docx, .pdf, .jsp, .html, etc under
>> window xp, c:/tomcat/webapps/test1, /webapps/test2,
>>   What should I do to make solr search those directories
>> 3. since I am using tomcat, instead of jetty, is there any demo that
>> shows
>> the solr searching features, and real searching result?
>>
>> Thanks,
>> Jill
>>
>>
>> -Original Message-
>> From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
>> Sent: Monday, November 30, 2009 10:40 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: search on tomcat server
>>
>> On Mon, Nov 30, 2009 at 9:55 PM, Jill Han  wrote:
>>
>>> I got solr running on the tomcat server,
>>> http://localhost:8080/solr/admin/
>>>
>>> After I enter a search word, such as, solr, then hit Search button, it
>>> will go to
>>>
>>> http://localhost:8080/solr/select/?q=solr&version=2.2&start=0&rows=10&in
>>> dent=on
>>>
>>>  and display
>>>
>>>   
>>>
>>> -
>>> >> ndent=on>
>>>  <
>>>
>>> -
>>> >> ndent=on>
>>>  <  
>>>
>>>  <0
>>>
>>>  <0
>>>
>>> -
>>> >> ndent=on>
>>>  <
>>>
>>>  <  10
>>>
>>>  <  0
>>>
>>>  <  on
>>>
>>>  <  solr
>>>
>>>  <  2.2
>>>
>>> 
>>>
>>>   
>>>
>>>  <  
>>>
>>>  
>>>
>>>  My question is what is the next step to search files on tomcat
>>> server?
>>>
>>>
>>>
>> Looks like you have not added any documents to Solr. See the "Indexing
>> Documents" section at http://wiki.apache.org/solr/#Search_and_Indexing
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>



Re: Multiple Solr Instances - Multiple Jetty Instances

2009-12-07 Thread Amit Nithian
David thanks for your response. With that having been said, is there a
general ratio of the number of Tomcat/Jetty HTTP threads to allocate
relative to the number of CPU cores you have on your machine?

Is the default in Tomcat/Jetty acceptable?

Thanks again
Amit

On Mon, Dec 7, 2009 at 10:00 AM, Smiley, David W.  wrote:

> If you have many documents (say > 10M documents, probably a larger
> threshold) then you will benefit from sharding your index, i.e. splitting
> your index up into multiple cores and using distributed searches.  You could
> use one VM and multiple cores just fine, assuming you have multiple CPUs.
>
> If not, then I see no point in using more Java VMs.  Java is pretty
> scalable in the enterprise, you know.
>
> ~ David Smiley
> Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/
>
> On Dec 6, 2009, at 11:56 PM, Amit Nithian wrote:
>
> > This may be a silly question but is there any capacity gain if I run
> > multiple jetty instances each having their own SOLR_HOME where each jetty
> > instance/solr will replicate their index from a separate cluster of
> masters?
> > I have a couple powerful multi-core servers and am not sure if/how a
> single
> > JVM takes advantage of multi-cores and feel that I could increase my
> > resource usage and hence search capacity by running multiple jetty
> instances
> > per server as opposed to adding more machines.
> >
> > Physical redundancy aside, is this acceptable practice?
> >
> > Thanks!
> > Amit
>
>
>


Re: [SolrResourceLoader] Unable to load cached class-name

2009-12-07 Thread Chris Hostetter

: Ok, I'm not sure where my particular use of Solr fits into all this.
: I'm writing a log4j appender that adds each log entry to a Solr index.
: It's not really a Solr plugin.

So if i'm understanding correctly, you want to run a "stock" solr server, 
with none of your own custom code in that server, and then in another 
application you want to load a custom log4j appender that sends the log 
messages to the solr server.

do i understand that part correctly?

is your log4j appender using solrj to communicate with solr?

: I've removed all the extra files from the lib directory as you
: suggest, and now I get the following message when starting JBoss
: 
: 18:54:23,083 ERROR [AbstractKernelController] Error installing to
: Create: name=jboss.system:service=Logging,type=Log4jService
: state=Configured mode=Manual requiredState=Create
: java.lang.NoClassDefFoundError: 
org/apache/solr/client/solrj/SolrServerException

Hmmm... just to be clear: are you getting this error in the 
instance of Solr you've set up to *receive* the log messages, or in the 
application where you are generating the log messages?

If it's the former, then it sounds like you still have a mangled classpath 
-- the only thing you need there is the solr.war, no other special jar 
files.  If it's the latter, then it looks like you don't have all of the 
solrj client jars.  I don't remember off the top of my head which jars you 
need, but a quick "jar tf" looking for 
"org/apache/solr/client/solrj/SolrServerException" should be fairly 
trivial.




-Hoss



Re: comparing index-time boost and sort in the case of a date field

2009-12-07 Thread Chris Hostetter
: 
: I have a requirement where I need to display records with more recent values
: for approval_dt to come first when a query is made. I thought of approaching
: this in 2 different ways:-

...

: 2. INDEX-TIME boosting.
: I sorted the query from databse itself in asc order of approval_dt while
: creating my input xml and while creating each ** gave it a boost
: increment by 0.1 starting from 1.01. Those records which don't have a value

index time boosts are folded into the fieldNorm, which is a float indexed 
using a compressed byte encoding, so many nearby values all collapse 
down to the same final value -- which means you aren't going to get the 
granularity you want from index time boosts like 1.01 and 1.02 even if 
you do everything else perfectly.

if you have the luxury of sorting your docs before indexing them, then you 
should sort them by approval_dt *descending* and then iterate over them 
and add them to the index.  then you can use the new "_docid_ asc" sort 
syntax added in Solr 1.4.

Ascending sort by internal docid (ie: the order that documents are 
indexed) is essentially free in Lucene/Solr -- so you should find that 
much faster than sorting by an explicit field (or even sorting by score).
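For example, the request would then just carry something like
...&sort=_docid_+asc appended to it, instead of sorting on approval_dt.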



-Hoss



# in query

2009-12-07 Thread Joel Nylund

Hi,

How can I put a # sign in a query, do I need to escape it?

For example I want to query books with title that contain #

No work so far:
http://localhost:8983/solr/select?q=textTitle:"#";
http://localhost:8983/solr/select?q=textTitle:#
http://localhost:8983/solr/select?q=textTitle:"\#";

Getting:
org.apache.lucene.queryParser.ParseException: Cannot parse 'textTitle:\': 
Lexical error at line 1, column 12.  Encountered: <EOF> after : ""


and sometimes just no response.


thanks
Joel



RE: SolrPlugin Guidance

2009-12-07 Thread Chris Hostetter

: e.g. For the following query that looks for a file in a directory:
: q=+directory_name:"myDirectory" +file_name:"myFile"
: 
: We'd need to decompose the query into the following two queries:
: 1. q=+directory_name:"myDirectory"&fl=directory_id
: 2. q=+file_name:"myFile" +directory_id:(results from query #1)
: 
: I guess I'm looking for the following feedback:
: - Does this sound crazy?  

it's a little crazy, but not absurd.

: - Is the QParser the right place for this logic?  If so, can I get a 
: little more guidance on how to decompose the queries there (filter 
: queries maybe)?

a QParser could work. (and in general, if you can solve something with a 
QParser that's probably for the best, since it allows the most reuse). but 
exactly how to do it depends on how many results you expect from your 
first query:  if you are going to structure things so they have to 
uniquely id a directory, and you'll have a singleID, then this is 
something that could easily make sense in a QParser (you are essentailly 
just rewriting part of the query from string to id -- you just happen to 
be using solr as a lookup table for those strings).

but if you plan to support any arbitrary "N" directories, then you may 
need something more complicated ... straight filter queries won't help 
much because you'll want the union instead of hte intersection, so for 
every directoryId you find, use it as a query to get a DocSet and then 
maintain a running union of all those DocSets to use as your final filter 
(hmm... that may not actually be possible with the QParser API ... i 
haven't look at ti in a while, but for an approach like this you may beed 
to subclass QueryComponent instead)




-Hoss



Re: # in query

2009-12-07 Thread Paul Libbrecht

Sure you have to escape it! %23

otherwise the browser considers it a separator between the URL for  
the server (on the left) and the fragment identifier (on the right),  
which is not sent to the server.


You might want to read about "URL-encoding", escaping with backslash  
is a shell-thing, not a thing for URLs!
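For example, the first URL above becomes:

http://localhost:8983/solr/select?q=textTitle:%22%23%22

(%22 is the encoded double quote, %23 the encoded #).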


paul


Le 07-déc.-09 à 21:16, Joel Nylund a écrit :


Hi,

How can I put a # sign in a query, do I need to escape it?

For example I want to query books with title that contain #

No work so far:
http://localhost:8983/solr/select?q=textTitle:"#";
http://localhost:8983/solr/select?q=textTitle:#
http://localhost:8983/solr/select?q=textTitle:"\#";

Getting
org.apache.lucene.queryParser.ParseException: Cannot parse  
'textTitle:\': Lexical error at line 1, column 12.  Encountered:  
 after : ""


and sometimes just no response.


thanks
Joel







Re: # in query

2009-12-07 Thread Joel Nylund
ok thanks, sorry my brain wasn't working, but even when I URL-encode  
it, I don't get any results - is there something special I have to do  
for Solr?


thanks
Joel

On Dec 7, 2009, at 3:20 PM, Paul Libbrecht wrote:


Sure you have to escape it! %23

otherwise the browser considers it as a separator between the URL  
for the server (on the left) and the fragment identifier (on the  
right) which is not sent the server.


You might want to read about "URL-encoding", escaping with backslash  
is a shell-thing, not a thing for URLs!


paul


Le 07-déc.-09 à 21:16, Joel Nylund a écrit :


Hi,

How can I put a # sign in a query, do I need to escape it?

For example I want to query books with title that contain #

No work so far:
http://localhost:8983/solr/select?q=textTitle:"#";
http://localhost:8983/solr/select?q=textTitle:#
http://localhost:8983/solr/select?q=textTitle:"\#";

Getting
org.apache.lucene.queryParser.ParseException: Cannot parse  
'textTitle:\': Lexical error at line 1, column 12.  Encountered:  
 after : ""


and sometimes just no response.


thanks
Joel







Re: # in query

2009-12-07 Thread Erick Erickson
Well, the very first thing I would do is examine the field definition in
your schema file. I suspect that the tokenizers and/or
filters you're using for indexing and/or querying are doing something
to the # symbol. Most likely stripping it. If you're just searching
for the single-character term "#", I *think* the query parser silently just
drops that part of the clause out, but check on that.
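For example, if the title field needs to keep tokens like "#" or "C#" intact,
a minimal sketch of a field type that only splits on whitespace (and
lowercases) would be:

<fieldType name="text_ws" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Whether that's appropriate depends on what else you need from the title field.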

The second thing would be to get a copy of Luke and examine your
index to see if what you *think* is in your index actually is there.

HTH
Erick

On Mon, Dec 7, 2009 at 3:28 PM, Joel Nylund  wrote:

> ok thanks,  sorry my brain wasn't working, but even when I url encode it, I
> dont get any results, is there something special I have to do for solr?
>
> thanks
> Joel
>
>
> On Dec 7, 2009, at 3:20 PM, Paul Libbrecht wrote:
>
>  Sure you have to escape it! %23
>>
>> otherwise the browser considers it as a separator between the URL for the
>> server (on the left) and the fragment identifier (on the right) which is not
>> sent the server.
>>
>> You might want to read about "URL-encoding", escaping with backslash is a
>> shell-thing, not a thing for URLs!
>>
>> paul
>>
>>
>> Le 07-déc.-09 à 21:16, Joel Nylund a écrit :
>>
>>  Hi,
>>>
>>> How can I put a # sign in a query, do I need to escape it?
>>>
>>> For example I want to query books with title that contain #
>>>
>>> No work so far:
>>> http://localhost:8983/solr/select?q=textTitle:"#";
>>> http://localhost:8983/solr/select?q=textTitle:#
>>> http://localhost:8983/solr/select?q=textTitle:"\#";
>>>
>>> Getting
>>> org.apache.lucene.queryParser.ParseException: Cannot parse 'textTitle:\':
>>> Lexical error at line 1, column 12.  Encountered:  after : ""
>>>
>>> and sometimes just no response.
>>>
>>>
>>> thanks
>>> Joel
>>>
>>>
>>
>


Re: Question regarding scoring/boosting

2009-12-07 Thread Chris Hostetter


Unfortunately, understanding how Lucene scoring works isn't much of a 
beginner-level topic -- the short answer to your question is that adding a 
function to the "bf" param of dismax causes that function to be evaluated 
for every doc that matches your main query, and the scores are "boosted" in 
proportion to the values produced by those functions.
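For example (field name hypothetical), a request like

...&defType=dismax&q=ipod&bf=recip(rord(price),1,1000,1000)^0.3

effectively adds that function's value for each matching document (scaled by
the ^0.3 boost) onto the main query score.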

If you really want to see the nitty gritty details, use "debugQuery=true" 
and a breakdown of the full score "explanation" will be produced, but 
this will reference quite a few concepts that are by no means beginner 
level -- following the math is easy, understanding where the numbers come 
from is not.


: I'm what one would probably call a beginner with Solr. I have my data loaded
: in and I am getting the hang of querying things. However, I'm still rather
: unclear as to how the score can be affected by various parameters. I'm using
: the dismax request handler, and I just don't quite get how doing foo^value
: in the bf affects the score. Perhaps if someone could explain this at a
: basic level or point me in the direction of some documentation as to how
: this affects the final score this would be very helpful.
: 
: Thanks,
: Oliver
: 



-Hoss



Re: Solr plugin or something else for custom work?

2009-12-07 Thread Chris Hostetter

What you are describing corresponds pretty closely to some work currently 
in progress to make the DataImportHandler integrate with the 
ExtractingRequestHandler/Tika ... 

https://issues.apache.org/jira/browse/SOLR-1358

...in the meantime, your options are either to extract all the metadata 
yourself and push it (along with the attachment) as literal field values 
to ExtractingRequestHandler, or extract the content of the attachment 
yourself and use the XmlUpdateRequestHandler.

: I have a requirement where I am indexing attachements. Attachements hang off
: of a database entity(table). I also need to include some meta-data info from
: the database table as part of the index. Trying to find best way to
: implement using custom handler or something? where custom handler gets all
: required db records (those include document path) by consuming a web service
: (I can expose a method from my application as a web service) and then
: itereate through a list (returned by web serivce) and index required meta
: data along with indexing attachments (attachements path is part of meta data
: of an entity). Has anyone tried something like this or have suggestions how
: best to implement this requirement?



-Hoss



Exception encountered during replication on slave....Any clues?

2009-12-07 Thread William Pierce
Folks:

I am seeing this exception in my logs that is causing my replication to fail.   
 I start with  a clean slate (empty data directory).  I index the data on the 
postingsmaster using the dataimport handler and it succeeds.  When the 
replication slave attempts to replicate it encounters this error. 

Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller fetchLatestIndex
SEVERE: Master at: http://localhost/postingsmaster/replication is not 
available. Index fetch failed. Exception: Invalid version or the data in not in 
'javabin' format

Any clues as to what I should look for to debug this further?  

Replication is enabled as follows:

The postingsmaster solrconfig.xml looks as follows:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

The postings slave solrconfig.xml looks as follows:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://localhost/postingsmaster/replication</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>

Thanks,

- Bill




Re: Exception encountered during replication on slave....Any clues?

2009-12-07 Thread TCK
are you missing the port number in the master's url ?

-tck



On Mon, Dec 7, 2009 at 4:44 PM, William Pierce wrote:

> Folks:
>
> I am seeing this exception in my logs that is causing my replication to
> fail.I start with  a clean slate (empty data directory).  I index the
> data on the postingsmaster using the dataimport handler and it succeeds.
>  When the replication slave attempts to replicate it encounters this error.
>
> Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller fetchLatestIndex
> SEVERE: Master at: http://localhost/postingsmaster/replication is not
> available. Index fetch failed. Exception: Invalid version or the data in not
> in 'javabin' format
>
> Any clues as to what I should look for to debug this further?
>
> Replication is enabled as follows:
>
> The postingsmaster solrconfig.xml looks as follows:
>
> 
>
>  
>  commit
>  
>  
>
>  
>
> The postings slave solrconfig.xml looks as follows:
>
> 
>
>
>http://localhost/postingsmaster/replication
> 
>
>00:05:00
> 
>  
>
>
> Thanks,
>
> - Bill
>
>
>


Oddly slow replication

2009-12-07 Thread Simon Wistow
I have a Master server with two Slaves populated via Solr 1.4 native 
replication.

Slave1 syncs at a respectable speed, i.e. around 100MB/s, but Slave2 runs 
much, much slower - the peak I've seen is 56KB/s.

Both are running off the same hardware with the same config - 
compression is set to 'internal' and http(Conn|Read)Timeout are defaults 
(5000/1). 

I've checked to see if it was a disk problem using dd, and whether it was a 
network problem by doing a manual scp and an rsync from the slave to the 
master and from the master to the slave. 

I've shut down the replication polling on Slave1 just to see if that was 
causing the problem but there's been no improvement.

Any ideas?




Re: Exception encountered during replication on slave....Any clues?

2009-12-07 Thread William Pierce

tck,

thanks for your quick response.  I am running on the default port (8080). 
If I copy that exact string given in the masterUrl and execute it in the 
browser I get a response from solr:



<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <str name="status">OK</str>
  <str name="message">No command</str>
</response>

So the masterUrl is reachable/accessible so far as I am able to tell

Thanks,

- Bill

--
From: "TCK" 
Sent: Monday, December 07, 2009 1:50 PM
To: 
Subject: Re: Exception encountered during replication on slaveAny clues?


are you missing the port number in the master's url ?

-tck



On Mon, Dec 7, 2009 at 4:44 PM, William Pierce 
wrote:



Folks:

I am seeing this exception in my logs that is causing my replication to
fail.I start with  a clean slate (empty data directory).  I index the
data on the postingsmaster using the dataimport handler and it succeeds.
 When the replication slave attempts to replicate it encounters this 
error.


Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller 
fetchLatestIndex

SEVERE: Master at: http://localhost/postingsmaster/replication is not
available. Index fetch failed. Exception: Invalid version or the data in 
not

in 'javabin' format

Any clues as to what I should look for to debug this further?

Replication is enabled as follows:

The postingsmaster solrconfig.xml looks as follows:


   
 
 commit
 
 
   
 

The postings slave solrconfig.xml looks as follows:


   
   
   http://localhost/postingsmaster/replication

   

   00:05:00

 


Thanks,

- Bill







Re: Response writer configs

2009-12-07 Thread Chris Hostetter

: I guess we should remove this commented response writers from the
: example solrconfig. It adds no value.

The comment tried to make it clear that it was showing what writers were 
enabled by default.  But i changed it to be more in line with what we 
have for search components.





-Hoss



Re: Exception encountered during replication on slave....Any clues?

2009-12-07 Thread William Pierce
Just to make doubly sure,  per tck's suggestion,  I went in and explicitly 
added in the port in the masterurl so that it now reads:


http://localhost:8080/postingsmaster/replication

Still getting the same exception...

I am running solr 1.4, on Ubuntu karmic, using tomcat 6 and Java 1.6.

Thanks,

- Bill

--
From: "William Pierce" 
Sent: Monday, December 07, 2009 2:03 PM
To: 
Subject: Re: Exception encountered during replication on slaveAny clues?


tck,

thanks for your quick response.  I am running on the default port (8080). 
If I copy that exact string given in the masterUrl and execute it in the 
browser I get a response from solr:



- 
- 
 0
 0
 
 OK
 No command
 

So the masterUrl is reachable/accessible so far as I am able to tell

Thanks,

- Bill

--
From: "TCK" 
Sent: Monday, December 07, 2009 1:50 PM
To: 
Subject: Re: Exception encountered during replication on slaveAny 
clues?



are you missing the port number in the master's url ?

-tck



On Mon, Dec 7, 2009 at 4:44 PM, William Pierce 
wrote:



Folks:

I am seeing this exception in my logs that is causing my replication to
fail.I start with  a clean slate (empty data directory).  I index 
the

data on the postingsmaster using the dataimport handler and it succeeds.
 When the replication slave attempts to replicate it encounters this 
error.


Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller 
fetchLatestIndex

SEVERE: Master at: http://localhost/postingsmaster/replication is not
available. Index fetch failed. Exception: Invalid version or the data in 
not

in 'javabin' format

Any clues as to what I should look for to debug this further?

Replication is enabled as follows:

The postingsmaster solrconfig.xml looks as follows:


   
 
 commit
 
 
   
 

The postings slave solrconfig.xml looks as follows:


   
   
   http://localhost/postingsmaster/replication

   

   00:05:00

 


Thanks,

- Bill









how to set CORE when using Apache Solr extension?

2009-12-07 Thread regany

Hello,

Can anyone tell me how you set which Solr CORE to use when using the Apache
Solr extension? (Using Solr with multicores)
http://www.php.net/manual/en/book.solr.php

thanks,
regan



Re: Windows 7 / Java 64bit / solr 1.4 - solr.solr.home problem

2009-12-07 Thread Chris Hostetter

: I tried it on Vista 32 & Java 1.6.0_17-b04 and it works without a problem.
: Actually on all other computers in the office there is no problem - I am the
: only one using Windows 7
: 
: I did try with full path and it didn't work as well. Here's the result:

by the looks of it, your problem is happening long before solr (or even 
java) gets access to the system property ... it appears that something 
about how Windows 7 works is causing the command line args to be 
split at the first "." character, so it thinks the rest of the sysproperty 
name is supposed to be a classname.

things get even more interesting when you use the full path, and you can 
see that somewhere along the way, something is converting the "/" 
characters in the path to "." characters (but that may just be an attempt 
by the JRE to work around people who inadvertently try to run a java class 
using a path instead of a fully qualified classname).

My suggestion would be that you ignore Solr for a little while.  Start by 
double-checking that the docs for your Windows 7 version of Java don't say 
anything about needing to use special escaping of system properties, then 
make a little DOS batch file (or whatever they are called in Windows 7) 
that echoes back all of the command line arguments it gets, and test that 
out with some arguments that contain "." characters in them and see what 
that tells you ... because everything you describe makes this sound like 
funky new shell behavior introduced in Windows 7.

: PS C:\nginx\solr> java -Dsolr.solr.home=c:/nginx/solr/multicore -jar start.jar
: Exception in thread "main" java.lang.NoClassDefFoundError:
: /solr/home=c:/nginx/solr/multicore
: Caused by: java.lang.ClassNotFoundException:
: .solr.home=c:.nginx.solr.multicore


-Hoss



Re: Stopping & Starting

2009-12-07 Thread regany


Lee Smith-6 wrote:
> 
> So how can I stop and restart the service ?
> 
> Hope you can help get me going again.
> 
> Thank you
> Lee
> 


I found this shell script which works well for me...


#!/bin/sh -e

# Starts, stops, and restarts solr

SOLR_DIR="/usr/local/solr/example"
JAVA_OPTIONS="-Xmx1024m -DSTOP.PORT=8079 -DSTOP.KEY=stopkey -jar start.jar"
LOG_FILE="/var/log/solr.log"
JAVA="/usr/bin/java"

case $1 in
start)
echo "Starting Solr"
cd $SOLR_DIR
$JAVA $JAVA_OPTIONS 2> $LOG_FILE &
;;
stop)
echo "Stopping Solr"
cd $SOLR_DIR
$JAVA $JAVA_OPTIONS --stop
;;
restart)
$0 stop
sleep 1
$0 start
;;
*)
echo "Usage: $0 {start|stop|restart}" >&2
exit 1
;;
esac




Re: edismax using bigrams instead of phrases?

2009-12-07 Thread Chris Hostetter

: > I've started trying edismax, and have noticed that my relevancy ranking is
: > messed up with edismax because, according to the debug output, it's using
: > bigrams instead of phrases and inexplicably ignoring a couple of the pf

I noticed that as well while testing edismax on the train the other day 
(notes attached to SOLR-1553 earlier today).

: It's a feature in general - the problem with putting all the terms in
: a single phrase query is that you get no boosting at all if all of the
: terms don't appear.

But sometimes that's what you want -- pf was intended to support the 
use case where people remember an exact phrase from the text (ie: they 
cut/paste the title, or the first line from an abstract, etc...) and want 
that right at the top.  Removing that and replacing it with a shingles 
based approach allows other docs that match lots of "bits" of the input 
string to overshadow exact matches.


-Hoss



Re: edismax using bigrams instead of phrases?

2009-12-07 Thread Chris Hostetter

: I see that edismax already defines pf (bigrams) and pf3 (trigrams) -- how
: would folks think about just calling them pf / pf1 (aliases for each
: other?), pf2, and pf3? The pf would then behave exactly as it does in
: dismax.

changing edismax's current parsing logic to be applied to a "pf2" param 
and restoring the original "pf" logic certainly makes sense -- but i think 
it would be a mistake to have a "pf1" field that was an alias for "pf" ... 
as it stands the "pf" param in dismax is analogous to a "pf*" or 
"pf-Infinity" type option requiring all of the words, however many there 
are ... in the context of "pf2" and "pf3" a "pf1" option would imply that 
it did phrase boosting on each individual word -- which wouldn't be very 
useful at all (that's what qf is for).



-Hoss



Re: latency in solr response is observed after index is updated

2009-12-07 Thread Chris Hostetter

: We are observing latency (some times huge latency upto 10-20 secs) in solr
: response  after index is updated . whats the reason of this latency and how
: can it be minimized ? 
: Note: our index size is pretty large.

Please read the following wiki pages...

http://wiki.apache.org/solr/SolrPerformanceFactors
http://wiki.apache.org/solr/SolrCaching

...and if you still have any additional questions, please provide the 
relevant information about your setup (ie: solrconfig.xml, frequency of 
running snapinstaller, examples of the exact queries that are slow, 
etc...)


-Hoss



Re: Facet query with special characters

2009-12-07 Thread Chris Hostetter


: When performing a facet query where part of the value portion has a 
: special character (a minus sign in this case), the query returns zero 
: results unless I put a wildcard (*) at the end.

check your analysis configuration for this fieldtype, in particular look 
at what debugQuery produces for your parsed query, and look at what 
analysis.jsp says it will do at query time with the input string 
"pds-comp.domain" ... because it sounds like you have a disconnect between 
how the text is indexed and how it is searched.  adding a * to your 
input query forces it to make a WildcardQuery which doesn't use analysis, 
so you get a match on the literal token.

in short: i suspect your problem has nothing to do with query string 
escaping, and everything to do with field tokenization.


-Hoss



Re: NullPointerException thrown during updates to index

2009-12-07 Thread Chris Hostetter
: Hi,
: I'm running a distributed solr index (3 nodes) and have noticed frequent
: exceptions thrown during updates.  The exception (see below for full trace)

what do you mean "during updates" ? ... QueryComponent isn't used at all 
when updating hte index, so there may be a missunderstanding here.

I'm not very familiar with the code in question, but i opened a bug to 
track it, please update with any new info you may have as you do more 
testing...

https://issues.apache.org/jira/browse/SOLR-1631





-Hoss



why no results?

2009-12-07 Thread regany

hi all - newbie solr question - I've indexed some documents and can search /
receive results using the following schema - BUT ONLY when searching on the
"id" field. If I try searching on the title, subtitle, body or text field I
receive NO results. Very confused. Can anyone see anything
obvious I'm doing wrong? Regan.











 






 

 
 id

 
 text

 
 

 






-- 
View this message in context: 
http://old.nabble.com/why-no-results--tp26688249p26688249.html
Sent from the Solr - User mailing list archive at Nabble.com.
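
(The schema XML in the message above was eaten by the list archive. Pieced
together from the fragments that survive in the quoted replies below, it
appears to boil down to the following -- the indexed/stored attributes are
assumptions:

  <schema name="example" version="1.1">
    <types>
      <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    </types>
    <fields>
      <field name="id"       type="string" indexed="true" stored="true" multiValued="false" required="true"/>
      <field name="title"    type="string" indexed="true" stored="true" multiValued="false"/>
      <field name="subtitle" type="string" indexed="true" stored="true" multiValued="false"/>
      <field name="body"     type="string" indexed="true" stored="true" multiValued="false"/>
      <field name="text"     type="string" indexed="true" stored="false" multiValued="true"/>
    </fields>
    <uniqueKey>id</uniqueKey>
    <defaultSearchField>text</defaultSearchField>
  </schema>

Every field, including the catch-all "text" field, uses the string type,
which is what the replies below zero in on.)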



Re: why no results?

2009-12-07 Thread Tom Hill
Hi -

That's a common one to get bit by. The string

On Mon, Dec 7, 2009 at 7:44 PM, regany  wrote:

>
> hi all - newbie solr question - I've indexed some documents and can search
> /
> receive results using the following schema - BUT ONLY when searching on the
> "id" field. If I try searching on the title, subtitle, body or text field I
> receive NO results. Very confused. :confused: Can anyone see anything
> obvious I'm doing wrong Regan.
>
>
>
> 
>
> 
>
> 
> sortMissingLast="true" omitNorms="true" />
> 
>
>  
> 
> multiValued="false" required="true" />
> multiValued="false" />
> multiValued="false" />
> multiValued="false" />
> multiValued="true" />
>  
>
>  
>  id
>
>  
>  text
>
>  
>  
>
>  
> 
> 
> 
>
> 
>
> --
> View this message in context:
> http://old.nabble.com/why-no-results--tp26688249p26688249.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Re: why no results?

2009-12-07 Thread regany


Tom Hill-7 wrote:
> 
> That's a common one to get bit by. The string
> 


You lost me Tom? I think your message got cut off. I'm guessing something to
do with the "string" type??
-- 
View this message in context: 
http://old.nabble.com/why-no-results--tp26688249p26688295.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: why no results?

2009-12-07 Thread Tom Hill
Sorry, just discovered a keyboard shortcut for "send". :-)

That's a common one to get bit by. The fieldtype StrField indexes the entire
field as one item. So you can only find it if your search term is everything
in the field. That is, "fox" will not find "The Quick Brown Fox", because
it's not the whole field.

The ID field probably works because it has one term in it. "1" finds "1"
just fine.

Try solr.TextField instead.

Tom


On Mon, Dec 7, 2009 at 7:47 PM, Tom Hill  wrote:

> Hi -
>
> That's a common one to get bit by. The string
>
>
> On Mon, Dec 7, 2009 at 7:44 PM, regany  wrote:
>
>>
>> hi all - newbie solr question - I've indexed some documents and can search
>> /
>> receive results using the following schema - BUT ONLY when searching on
>> the
>> "id" field. If I try searching on the title, subtitle, body or text field
>> I
>> receive NO results. Very confused. :confused: Can anyone see anything
>> obvious I'm doing wrong Regan.
>>
>>
>>
>> 
>>
>> 
>>
>> 
>>> sortMissingLast="true" omitNorms="true" />
>> 
>>
>>  
>> 
>>> multiValued="false" required="true" />
>>> multiValued="false" />
>>> multiValued="false" />
>>> multiValued="false" />
>>> multiValued="true" />
>>  
>>
>>  
>>  id
>>
>>  
>>  text
>>
>>  
>>  
>>
>>  
>> 
>> 
>> 
>>
>> 
>>
>> --
>> View this message in context:
>> http://old.nabble.com/why-no-results--tp26688249p26688249.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>


Re: why no results?

2009-12-07 Thread regany


Tom Hill-7 wrote:
> 
> Try solr.TextField instead.
> 


Thanks Tom,

I've replaced the fieldtype section above with...






deleted my index, restarted Solr and re-indexed my documents - but the
search still returns nothing.

Do I need to change the type in the field sections as well?

regan
-- 
View this message in context: 
http://old.nabble.com/why-no-results--tp26688249p26688469.html
Sent from the Solr - User mailing list archive at Nabble.com.
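
(A hedged sketch of the change being discussed -- the analyzer below is an
assumption; the stock Solr 1.4 example schema's "text" type adds stopwords,
synonyms and stemming on top of this. The field declarations have to point at
the tokenized type too, and the documents need a full reindex afterwards:

  <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldtype>

  <field name="title"    type="text" indexed="true" stored="true"/>
  <field name="subtitle" type="text" indexed="true" stored="true"/>
  <field name="body"     type="text" indexed="true" stored="true"/>
  <field name="text"     type="text" indexed="true" stored="false" multiValued="true"/>

With the fields still declared as type="string", swapping the type definition
alone changes nothing, which matches the behaviour described above.)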



Re: Exception encountered during replication on slave....Any clues?

2009-12-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
are you able to hit
http://localhost:8080/postingsmaster/replication using a browser from
the slave box? if you are able to hit it, what do you see?


On Tue, Dec 8, 2009 at 3:42 AM, William Pierce  wrote:
> Just to make doubly sure,  per tck's suggestion,  I went in and explicitly
> added in the port in the masterurl so that it now reads:
>
> http://localhost:8080/postingsmaster/replication
>
> Still getting the same exception...
>
> I am running solr 1.4, on Ubuntu karmic, using tomcat 6 and Java 1.6.
>
> Thanks,
>
> - Bill
>
> --
> From: "William Pierce" 
> Sent: Monday, December 07, 2009 2:03 PM
> To: 
> Subject: Re: Exception encountered during replication on slaveAny clues?
>
>> tck,
>>
>> thanks for your quick response.  I am running on the default port (8080).
>> If I copy that exact string given in the masterUrl and execute it in the
>> browser I get a response from solr:
>>
>> 
>> - 
>> - 
>>  0
>>  0
>>  
>>  OK
>>  No command
>>  
>>
>> So the masterUrl is reachable/accessible so far as I am able to tell
>>
>> Thanks,
>>
>> - Bill
>>
>> --
>> From: "TCK" 
>> Sent: Monday, December 07, 2009 1:50 PM
>> To: 
>> Subject: Re: Exception encountered during replication on slaveAny
>> clues?
>>
>>> are you missing the port number in the master's url ?
>>>
>>> -tck
>>>
>>>
>>>
>>> On Mon, Dec 7, 2009 at 4:44 PM, William Pierce
>>> wrote:
>>>
 Folks:

 I am seeing this exception in my logs that is causing my replication to
 fail.    I start with  a clean slate (empty data directory).  I index
 the
 data on the postingsmaster using the dataimport handler and it succeeds.
  When the replication slave attempts to replicate it encounters this
 error.

 Dec 7, 2009 9:20:00 PM org.apache.solr.handler.SnapPuller
 fetchLatestIndex
 SEVERE: Master at: http://localhost/postingsmaster/replication is not
 available. Index fetch failed. Exception: Invalid version or the data in
 not
 in 'javabin' format

 Any clues as to what I should look for to debug this further?

 Replication is enabled as follows:

 The postingsmaster solrconfig.xml looks as follows:

 
   
     
     commit
     
     
   
  

 The postings slave solrconfig.xml looks as follows:

 
   
       
       http://localhost/postingsmaster/replication
 
       
       00:05:00
    
  


 Thanks,

 - Bill



>>>
>>
>



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com
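
(For reference, the replication XML quoted above was stripped by the archive.
A typical Solr 1.4 Java replication setup looks roughly like the following --
the replicateAfter value, masterUrl and poll interval are from the thread,
the confFiles list is an assumption:

  master solrconfig.xml:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
        <str name="confFiles">schema.xml,stopwords.txt</str>
      </lst>
    </requestHandler>

  slave solrconfig.xml:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://localhost:8080/postingsmaster/replication</str>
        <str name="pollInterval">00:05:00</str>
      </lst>
    </requestHandler>

The "Invalid version or the data in not in 'javabin' format" error usually
means the masterUrl returned something other than a javabin response, e.g. an
HTML error page or a redirect, which is why checking the URL from the slave
box, as suggested above, is the first step.)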


Re: how to set CORE when using Apache Solr extension?

2009-12-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
the core is a part of the uri

http://host:port/<app-name>/<core-name>/select

say if the core name is core1 and solr app name is solr deployed at port 8983
then it would look like
http://host:8983/solr/core1/select

On Tue, Dec 8, 2009 at 3:44 AM, regany  wrote:
>
> Hello,
>
> Can anyone tell me how you set which Solr CORE to use when using the Apache
> Solr extension? (Using Solr with multicores)
> http://www.php.net/manual/en/book.solr.php
>
> thanks,
> regan
> --
> View this message in context: 
> http://old.nabble.com/how-to-set-CORE-when-using-Apache-Solr-extension--tp26685174p26685174.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Oddly slow replication

2009-12-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
this has to be a network problem. We have never encountered such
vastly different speeds in the same LAN.

On Tue, Dec 8, 2009 at 3:22 AM, Simon Wistow  wrote:
> I have a Master server with two Slaves populated via Solr 1.4 native
> replication.
>
> Slave1 syncs at a respectable speed i.e. around 100MB/s but Slave2 runs
> much, much slower - the peak I've seen is 56KB/s.
>
> Both are running off the same hardware with the same config -
> compression is set to 'internal' and http(Conn|Read)Timeout are defaults
> (5000/1).
>
> I've checked to see if it was a disk problem using dd and if it was a
> network problem by doing a manual scp and an rsync from the slave to the
> master and the master to the slave.
>
> I've shut down the replication polling on Slave1 just to see if that was
> causing the problem but there's been no improvement.
>
> Any ideas?
>
>
>



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Replicating multiple cores

2009-12-07 Thread Jason Rutherglen
If I've got multiple cores on a server, I guess I need multiple
rsyncd's running (if using the shell scripts)?


Re: Replicating multiple cores

2009-12-07 Thread Shalin Shekhar Mangar
On Tue, Dec 8, 2009 at 11:48 AM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> If I've got multiple cores on a server, I guess I need multiple
> rsyncd's running (if using the shell scripts)?
>

Yes. I'd highly recommend using the Java replication though.

-- 
Regards,
Shalin Shekhar Mangar.