Re: Release date of SOLR 1.3

2008-05-19 Thread Andrew Savory
Hi,

2008/5/16 Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]>:
> If you have an immediate need, I must advise you against waiting for
> the Solr 1.3 release. The best strategy would be to take a nightly
> build and start using it. Test it thoroughly and report back any bugs
> you find. If everything is fine, go into production with that.

Since most production environments are reluctant to use nightly builds
(regardless of how stable the trunk is), and since there hasn't been a
Solr release in some time, would it be worth looking at which
outstanding issues are critical for 1.3, perhaps pushing some over
to 1.4, and trying to do a release soon?

I think trunk has already sufficiently diverged to make it worth doing
a release, and I'd be happy to help wherever I can (since I could
really do with a more recent release to run).


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


Re: Release date of SOLR 1.3

2008-05-19 Thread Ian Holsman (Lists)

Noble Paul നോബിള്‍ नोब्ळ् wrote:

If you have an immediate need, I must advise you against waiting for
the Solr 1.3 release. The best strategy would be to take a nightly
build and start using it. Test it thoroughly and report back any bugs
you find. If everything is fine, go into production with that.

--Noble


I'd be very hesitant to recommend that ANYONE go into production with 
non-released software if they are unfamiliar with the codebase. 
Waiting on the list for someone to fix a bug that is causing an 
outage for your site is somewhat of a career-limiting move.


I'd recommend using the stable release, and learning the codebase ;-)

regards
Ian


On Thu, May 15, 2008 at 12:28 AM, Matthew Runo <[EMAIL PROTECTED]> wrote:

There isn't a specific date so far, but I'd like to say that only once in
the year or so I've been working with the SVN head build of Solr have I
noticed a bug get committed. And it was fixed very quickly once it was
found... I think if you need development features you're probably
safe to use the SVN head, but remember that it is dev, and you should
*always* test new builds before actually using them =p

Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On May 14, 2008, at 9:08 AM, Umar Shah wrote:

Hi,

I'm using the latest trunk code from Solr.
I am basically using function queries (sum, product, scale) for my project,
which are not present in 1.2.
I wanted to know if there is a decided date for the release of Solr 1.3.
If the date is far off or not decided, what would be the best practice for
adopting the above-mentioned features while not compromising the stability
of the system?

thanks
-umar


Re: Auto commit and optimize settings

2008-05-19 Thread Lucas F. A. Teixeira

Take a look at mergeFactor (and probably use the compound file format).
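
(For reference, a minimal sketch of where those settings live, assuming the stock example solrconfig.xml layout of that era; the values shown are illustrative, not recommendations:)

<indexDefaults>
  <!-- a lower mergeFactor means fewer live segments, so fewer open files,
       at some indexing-speed cost -->
  <mergeFactor>10</mergeFactor>
  <!-- the compound file format packs each segment into a single .cfs file,
       further reducing file handles -->
  <useCompoundFile>true</useCompoundFile>
</indexDefaults>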

[]s,

Lucas Frare A. Teixeira
[EMAIL PROTECTED] 
Tel: +55 11 3660.1622 - R3018



Vaijanath N. Rao wrote:

Hi Otis and Solr-users,

I was under the impression that when one calls optimize, all the index 
segments created so far get merged. Hence my question about optimize.


The reason I want to optimize is that I have the autoCommit feature in 
solrconfig.xml set to commit after every 1000 documents. Once I do that, I 
get a "too many open files" error after some time, while crawling and 
indexing a large number of sites.


Is there a way I can avoid the "too many open files" issue altogether and 
yet have the index committed after every 1000 docs?


--Thanks and Regards
Vaijanath

Otis Gospodnetic wrote:

Hi,

There is no such option currently, and it is not likely that such a 
feature will be added, because index optimization is not really a 
quick and lightweight operation, so one typically optimizes only 
after the index is fully built and one knows the index will remain 
unchanged for a while.  If you do need to optimize periodically for 
some reason, just send optimize commands to Solr from your own 
application.



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 

From: Vaijanath N. Rao <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Cc: [EMAIL PROTECTED]
Sent: Monday, May 19, 2008 1:13:03 AM
Subject: Auto commit and optimize settings

Hi Solr-Users,

I have gone through the solrconfig.xml file in the example directory 
of the Solr build (nightly build). I wanted to know whether there is a 
way to tell Solr to optimize the index after a certain number of seconds 
has elapsed or a certain number of records has been indexed, as we can 
in the case of auto-commit.


--Thanks and Regards
Vaijanath



Re: Searching "inside of words"

2008-05-19 Thread Daniel Löfquist

Thank you for your reply.
I've been trying some things out this morning but I'm still not getting 
it to work properly. I have a feeling that I'm on the right track 
somewhat though.


The type in my schema.xml looks like this:


[fieldType XML stripped by the list archive -- per Chris Hostetter's reply 
further down, it used solr.NGramTokenizerFactory with minGram="2" and 
maxGram="18" in the index-time analyzer]

If I'm understanding everything correctly, this should create tokens of 
2 to 18 letters at indexing time, right?


However, I can't search properly now. I have to slice my search string 
up into 2-letter chunks. So if I'm searching for "monitor" I have to 
send "mo+ni+to+r" to Solr. Like this:

http://localhost:8080/solrtest/select/?q=mo+ni+to+r&q.op=AND
when I want it to be like this:
http://localhost:8080/solrtest/select/?q=monitor&q.op=AND

I'm sure I'm doing something completely wrong. I just need someone more 
wise to the ways of Lucene and Solr to point directly at what it is 
that's wrong ;-)


//Daniel

Chris Hostetter wrote:

: so the only ones I can utilize are EdgeNGramTokenizerFactory and
: NGramTokenizerFactory.
: 
: I've done some playing around with them but the best result I've gotten so far

: is a field-type that enables searching for specific letters, for example I can
: search for an item that contains the letters a and x, but it returns a hit no
: matter where these letters are in the text, they don't have to be next to each
: other, and that's not the result I was going for. If the field contains
: "monitor" I want a hit on a search for "onit" but not on "rint" for example.

NGramTokenizerFactory should work fine for this ... the key is to use it 
at indexing time with the appropriate min and max gram sizes to meet your 
needs -- at query time, don't use it at all (use keyword or 
whitespace tokenizer)


so the word "monitor" will be indexed as these tokens (but not 
neccessarily in this order)...


  m o n i t o r mo on ni it to or mon oni nit ... onit ...

and at search time when the user gives you "onit" that term will exist.

: I've never attempted to construct a new field-type of my own before and I'm
: finding the available documentation somewhat incomplete and not very helpful

FWIW: creating a new FieldType is almost never what you need if you 
are dealing with text ... creating new FieldTypes is something that 
typically only needs to be done in cases where you want specialized 
encoding or sorting.


-Hoss



--
Daniel Löfquist
Application Manager / Software Engineer

CDON.COM
Bergsgatan 20, Box 385, SE 201 23 Malmö, Sweden

Office: +46 40 601 61 00
Direct: +46 40 601 61 16
Mobile: +46 702 92 21 75
Fax: +46 40 601 61 20
E-mail: [EMAIL PROTECTED] 

CDON.COM 



Re: Searching "inside of words"

2008-05-19 Thread Otis Gospodnetic
You are doing the right thing.  If you are creating n-grams at index time, you 
have to match that at query time.  If the query is "monitor", you need to pass 
that through the n-gram tokenizer, too.  n-grams of length 18 look a little 
weird...


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Daniel Löfquist <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, May 19, 2008 7:14:52 AM
> Subject: Re: Searching "inside of words"
> 
> Thank you for your reply.
> I've been trying some things out this morning but I'm still not getting 
> it to work properly. I have a feeling that I'm on the right track 
> somewhat though.
> 
> The type in my schema.xml looks like this:
> 
> [fieldType XML stripped by the list archive -- see the note above]
> 
> If I'm understanding everything correctly, this should create tokens of 
> 2 to 18 letters at indexing time, right?
> 
> However, I can't search properly now. I have to slice my search-string 
> up into 2-letter chunks. So if I'm searching for "monitor" I have to 
> send "mo+ni+to+r" to Solr. Like this:
> http://localhost:8080/solrtest/select/?q=mo+ni+to+r&q.op=AND
> when I want it to be like this:
> http://localhost:8080/solrtest/select/?q=monitor&q.op=AND
> 
> I'm sure I'm doing something completely wrong. I just need someone more 
> wise to the ways of Lucene and Solr to point directly at what it is 
> that's wrong ;-)
> 
> //Daniel
> 
> Chris Hostetter wrote:
> > : so the only ones I can utilize are EdgeNGramTokenizerFactory and
> > : NGramTokenizerFactory.
> > : 
> > : I've done some playing around with them but the best result I've gotten so far
> > : is a field-type that enables searching for specific letters, for example I can
> > : search for an item that contains the letters a and x, but it returns a hit no
> > : matter where these letters are in the text, they don't have to be next to each
> > : other, and that's not the result I was going for. If the field contains
> > : "monitor" I want a hit on a search for "onit" but not on "rint" for example.
> > 
> > NGramTokenizerFactory should work fine for this ... the key is to use it 
> > at indexing time with the appropriate min and max gram sizes to meet your 
> > needs -- at query time, don't use it at all (use keyword or 
> > whitespace tokenizer)
> > 
> > so the word "monitor" will be indexed as these tokens (but not 
> > necessarily in this order)...
> > 
> >   m o n i t o r mo on ni it to or mon oni nit ... onit ...
> > 
> > and at search time when the user gives you "onit" that term will exist.
> > 
> > : I've never attempted to construct a new field-type of my own before and I'm
> > : finding the available documentation somewhat incomplete and not very helpful
> > 
> > FWIW: creating a new FieldType is almost never what you need if you 
> > are dealing with text ... creating new FieldTypes is something that 
> > typically only needs to be done in cases where you want specialized 
> > encoding or sorting.
> > 
> > -Hoss
> > 
> 



Re: Auto commit and optimize settings

2008-05-19 Thread Otis Gospodnetic
Vaijanath,

My suggestion is to:

- turn off autocommit
- double-check that mergeFactor is not too high (e.g. higher than 50)
- double-check your server's open file limit (ulimit -a is the command to run) 
and if it's low (e.g. 1024) increase it (info about how to do that might be on 
the Wiki; if not, Google will help)
- don't commit unless you really need searchers to pick up the changes.  
Committing every 1000 docs sounds wrong.  Commit at the end, or just optimize 
(see below)
- optimize once at the end of a larger indexing run.  Optimize infrequently, as 
optimization essentially rewrites your whole index, which means a lot of IO 
work for larger indices
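
(A minimal SolrJ sketch of the "commit once at the end" pattern Otis describes; the URL, field names, and document count are hypothetical, and CommonsHttpSolrServer is the SolrJ-era HTTP client class:)

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
    public static void main(String[] args) throws Exception {
        // hypothetical Solr instance
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        for (int i = 0; i < 100000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));            // illustrative fields
            doc.addField("url", "http://example.com/page" + i);
            server.add(doc);
            // no commit inside the loop: this is what avoids the searcher/file churn
        }
        server.commit();   // one commit at the end makes everything searchable
        server.optimize(); // optional: a single optimize after the full run
    }
}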


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Vaijanath N. Rao <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Cc: [EMAIL PROTECTED]
> Sent: Monday, May 19, 2008 2:49:01 AM
> Subject: Re: Auto commit and optimize settings
> 
> Hi Otis and Solr-users,
> 
> I was under the impression that when one calls optimize, all the index 
> segments created so far get merged. Hence my question about optimize.
> 
> The reason I want to optimize is that I have the autoCommit feature in 
> solrconfig.xml set to commit after every 1000 documents. Once I do that, I 
> get a "too many open files" error after some time, while crawling and 
> indexing a large number of sites.
> 
> Is there a way I can avoid the "too many open files" issue altogether and 
> yet have the index committed after every 1000 docs?
> 
> --Thanks and Regards
> Vaijanath
> 
> Otis Gospodnetic wrote:
> > Hi,
> >
> > There is no such option currently, and it is not likely that such a feature 
> > will be added, because index optimization is not really a quick and lightweight 
> > operation, so one typically optimizes only after the index is fully built and 
> > one knows the index will remain unchanged for a while.  If you do need to 
> > optimize periodically for some reason, just send optimize commands to Solr 
> > from your own application.
> >
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> > - Original Message 
> >  
> >> From: Vaijanath N. Rao 
> >> To: solr-user@lucene.apache.org
> >> Cc: [EMAIL PROTECTED]
> >> Sent: Monday, May 19, 2008 1:13:03 AM
> >> Subject: Auto commit and optimize settings
> >>
> >> Hi Solr-Users,
> >>
> >> I have gone through the solrconfig.xml file in the example directory of 
> >> the Solr build (nightly build). I wanted to know whether there is a way to 
> >> tell Solr to optimize the index after a certain number of seconds has elapsed 
> >> or a certain number of records has been indexed, as we can in the case of auto-commit.
> >>
> >> --Thanks and Regards
> >> Vaijanath
> >>
> >
> >
> >  



Re: Release date of SOLR 1.3

2008-05-19 Thread Chris Hostetter

: solr release in some time, would it be worth looking at what outstanding 
: issues are critical for 1.3 and perhaps pushing some over to 1.4, and 
: trying to do a release soon?

That's what is typically done when the developers start getting an itch to 
make a release.

Jira keeps track of all the outstanding issues that have been 
targeted for 1.3...
http://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&mode=hide&sorter/order=DESC&sorter/field=priority&resolution=-1&pid=12310230&fixfor=12312486

...some of these are major features that are in active development (in some 
cases: partially committed) while others are more wishlist items that misc 
people have said "it would be really cool to try and do this for 1.3" about, 
but that have no patches attached yet.

If people are particularly eager to see a 1.3 release, the best thing to 
do is subscribe to solr-dev and start a dialog there about which issues 
people think are "show stoppers" for 1.3 and what assistance the various 
people working on those issues can use.

-Hoss



Re: Searching "inside of words"

2008-05-19 Thread Chris Hostetter

: You are doing the right thing.  If you are creating n-grams at index 
: time, you have to match that at query time.  If the query is "monitor", 
: you need to pass that through n-gram tokenizer, too.  n-grams of length 
: 18 look a little weird

you don't *have* to use ngrams at query time ... his goal is "partial" 
word matching, so he wants to create various sized ngrams so that input 
like "onit" matches "monitor" but does not match "on it"

Daniel: the options for NGramTokenizerFactory are minGramSize 
and maxGramSize ... not minGram and maxGram ... you are getting the 
defaults (which are 1 and 2, I think).

it confused me too until I tried your schema changes, and then looked at 
the analysis.jsp link and saw only 1- and 2-gram tokens being created ... 
then I checked the class.
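
(Putting the two corrections together, a minimal sketch of what the fieldType could look like; the attribute names follow Hoss's note, while the field-type name and the lowercase filters are illustrative additions:)

<fieldType name="text_ngram" class="solr.TextField">
  <analyzer type="index">
    <!-- emit every 2- to 18-character gram at index time -->
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="18"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- query side: keep the input whole so "onit" is looked up as a single term -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>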



-Hoss



adding expand=true to WordDelimiterFilter

2008-05-19 Thread Geoffrey Young

hi :)

I'm having an interesting problem with my data.  in general, I want the 
results of the WordDelimiterFilter for better matching, but there are 
times when it's just too aggressive.  for example:


  boys2men => boys 2 men (good)
  p!nk => pnk (maybe)
  !!!  => (nothing - bad)

there's a special place for bands who name themselves just punctuation 
marks :)


anyway, one way around this is synonyms.  but if I do that then I need 
to run the synonym filter multiple times.  the first might expand


  !!!  => chk chk chk
  p!nk => pink

while the next would need to run after the WordDelimiterFilter for

  boys 2 men => boyz II men

I'd really like to avoid multiple passes (and multiple synonym files) if 
at all possible, but that's the solution I'm faced with currently...


unless an 'expand' option were added to the WordDelimiterFilter, in 
which case I'd have


  p!nk => p!nk pnk

after it runs, so I could just apply the synonyms once.  or maybe 
there's another solution I'm missing.


would it be difficult (or desirable) to add an expand option?

--Geoff
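
(A sketch of the two-pass synonym workaround described above, as an index-time analyzer chain; the two synonym file names are hypothetical, and the WordDelimiterFilter options shown are just one plausible configuration:)

<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- pass 1: rewrite punctuation-only names before WDF can strip them -->
  <filter class="solr.SynonymFilterFactory" synonyms="pre-wdf-synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1" catenateWords="1"/>
  <!-- pass 2: synonyms that only match after word splitting, e.g. boys 2 men -->
  <filter class="solr.SynonymFilterFactory" synonyms="post-wdf-synonyms.txt"
          ignoreCase="true" expand="true"/>
</analyzer>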



Re: [SPAM] [poll] Change logging to SLF4J?

2008-05-19 Thread Matthew Runo
I just read through the dev list's thread.. and I'm voting for SLF4J  
as well.


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On May 6, 2008, at 7:40 AM, Ryan McKinley wrote:

Hello-

There has been a long-running thread on solr-dev proposing switching
the logging system to use something other than JDK logging.
http://www.nabble.com/Solr-Logging-td16836646.html
http://www.nabble.com/logging-through-log4j-td13747253.html

We are considering using http://www.slf4j.org/.  Check:
https://issues.apache.org/jira/browse/SOLR-560

The "pro" argument is that:
* SLF4J allows more flexibility for people using solr outside the
canned .war to configure logging without touching JDK logging.

The "con" argument goes something like:
* JDK logging is already the standard logging framework.
* JDK logging is already in use.
* SLF4J adds another dependency (for something that already works)

On the dev list there are strong opinions on either side, but we
would like to get a larger sampling of opinion and validation before
making this change.

[  ] Keep solr logging as it is.  (JDK Logging)
[  ] Use SLF4J.

As a bonus question (this time fill in the blank):
I have tried SOLR-560 with my logging system and  
___.


thanks
ryan





RE: Help with Solr + KStem

2008-05-19 Thread Hung Huynh
Otis,

Thanks for helping me out. I downloaded the KStem source you provided below.
I have .class files for /apache/lucene/analysis/KStem*.class and
/apache/solr/analysis/KStemFilterFactory.class compiled from Eclipse. What
do I do next? Sorry, I'm a complete newbie in Java and Solr.

Do I:
1. Jar up all the /lucene/analysis/KStem*.class files and put KStem.jar in solr/lib?
2. What do I do with /apache/solr/analysis/KStemFilterFactory.class? I'm
still getting similar errors about class_not_found every time I hit the Solr
Admin page.

I originally tried to use the pre-compiled KStem2.jar version from Harry
Wagner. That didn't work for me either, or rather I didn't know what to do
with it.
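
(A sketch of the packaging step being asked about; the directory layout is an assumption based on the package names above, and $SOLR_HOME/lib is the usual location for custom jars:)

# run from the directory containing the compiled package tree
jar cf kstem.jar org/apache/lucene/analysis/KStem*.class \
                 org/apache/solr/analysis/KStemFilterFactory.class
# put the jar where the Solr webapp's classloader can find it
cp kstem.jar $SOLR_HOME/lib/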

Thanks,

Hung

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, May 14, 2008 4:06 PM
To: solr-user@lucene.apache.org
Subject: Re: Help with Solr + KStem

Hung,

You included the KStem jar itself, and that is good, but class
KStemFilterFactory does not exist anywhere in Solr.
You need to get it from here:
https://issues.apache.org/jira/browse/SOLR-379

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Hung Huynh <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 14, 2008 3:57:29 PM
> Subject: Help with Solr + KStem
> 
> 
> I have KStem.jar in solr/lib and solr/example/lib and made a change to
> schema.xml to include the KStem line (removed the Porter line):
> 
> <filter class="solr.KStemFilterFactory"/>
> 
> This is what I get when I try to hit the Solr Admin page. How can I go 
> about resolving this error?
> 
> Thanks,
> 
> HH
> 
> ---
> 
> 
> HTTP ERROR: 500
> Severe errors in solr configuration.
> 
> Check your log files for more detailed infomation on what may be wrong.
> 
> If you want solr to continue after configuration errors, change:
> 
> <abortOnConfigurationError>false</abortOnConfigurationError>
> 
> in solrconfig.xml
> 
> -
> org.apache.solr.core.SolrException: Error loading class 'solr.KStemFilterFactory'
>         at org.apache.solr.core.Config.findClass(Config.java:220)
>         at org.apache.solr.core.Config.newInstance(Config.java:225)
>         at org.apache.solr.schema.IndexSchema.readTokenFilterFactory(IndexSchema.java:629)
>         at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:607)
>         at org.apache.solr.schema.IndexSchema.readConfig(IndexSchema.java:331)
>         at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:71)
>         at org.apache.solr.core.SolrCore.<init>(SolrCore.java:196)
>         at org.apache.solr.core.SolrCore.getSolrCore(SolrCore.java:177)
>         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>         at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
>         at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
>         at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
>         at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
>         at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>         at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
>         at org.mortbay.jetty.Server.doStart(Server.java:210)
>         at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>         at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.mortbay.start.Main.invokeMain(Main.java:183)
>         at org.mortbay.start.Main.start(Main.java:497)
>         at org.mortbay.start.Main.main(Main.java:115)
> Caused by: java.lang.ClassNotFoundException: solr.KStemFilterFactory
>         at java.net.URLClassLoader$1.run(Unknown Source)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(Unknown Source)
>         at java.lang.ClassLoader.loadClass(Unknown Source)
>         at java.lang.ClassLoader.loadClass(Unknown Source)
>         at org.mortbay.jetty.webapp.WebAppClassLoader.loa

How to limit number of pages per domain

2008-05-19 Thread JLIST
I'm indexing pages from multiple domains. In any given
result set, I don't want to return more than two links
from the same domain, so that the first few pages won't
be all from the same domain. I suppose I could get more
(say, 100) pages from solr, then sort in memory in the
front-end server to mix the domains. But I'm not sure if
there is a simpler way of implementing this ...
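
(A sketch of that in-memory post-processing approach; the class and the {url, domain} representation are hypothetical:)

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DomainCapper {
    /** Keep at most maxPerDomain results per domain, preserving rank order. */
    public static List<String[]> cap(List<String[]> rankedDocs, int maxPerDomain) {
        Map<String, Integer> perDomain = new HashMap<String, Integer>();
        List<String[]> out = new ArrayList<String[]>();
        for (String[] doc : rankedDocs) {          // doc = {url, domain}
            String domain = doc[1];
            Integer seen = perDomain.get(domain);
            int count = (seen == null) ? 0 : seen.intValue();
            if (count < maxPerDomain) {            // drop anything past the cap
                out.add(doc);
                perDomain.put(domain, count + 1);
            }
        }
        return out;
    }
}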

-- 
Best regards,
Jack



HTTP Version Not Supported errors?

2008-05-19 Thread Matthew Runo

Hello folks!

We're starting to see a lot of errors in Solr/SolrJ with the message  
"HTTP Version Not Supported". I can't reproduce it, and it only seems  
to happen with load - if there's no one browsing our site, then we  
don't get the errors if we try browsing around ourselves. I looked  
about in the SolrJ code where it connects to the Solr server, but all  
seems well... any ideas?


Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

Begin forwarded message:

From: [EMAIL PROTECTED]
Date: May 19, 2008 4:05:02 PM PDT
To: [EMAIL PROTECTED]
Subject: [Log4j] [SMTPAppender] web43 error message

2008-05-19 16:05:02,982 [ERROR] - [EMAIL PROTECTED] - - service.SimpleSiteService (getBrandById:197) - Could not retrieve brand for ID[313]
org.apache.solr.client.solrj.SolrServerException: Invalid SOLR Query object
	at com.zappos.domain.dao.SearchDAO.getSearch(SearchDAO.java:68)
	at com.zappos.domain.dao.BrandDAO.getBrandById(BrandDAO.java:92)
	at com.zappos.domain.service.SimpleSiteService.getBrandById(SimpleSiteService.java:191)
	at com.zappos.zeta.action.ViewBrand.brand(ViewBrand.java:496)
	at com.zappos.zeta.action.ViewBrand.view(ViewBrand.java:286)
	at sun.reflect.GeneratedMethodAccessor590.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at net.sourceforge.stripes.controller.DispatcherHelper$6.intercept(DispatcherHelper.java:458)
	at net.sourceforge.stripes.controller.ExecutionContext.proceed(ExecutionContext.java:157)
	at net.sourceforge.stripes.controller.BeforeAfterMethodInterceptor.intercept(BeforeAfterMethodInterceptor.java:107)
	at net.sourceforge.stripes.controller.ExecutionContext.proceed(ExecutionContext.java:154)
	at net.sourceforge.stripes.controller.ExecutionContext.wrap(ExecutionContext.java:73)
	at net.sourceforge.stripes.controller.DispatcherHelper.invokeEventHandler(DispatcherHelper.java:456)
	at net.sourceforge.stripes.controller.DispatcherServlet.invokeEventHandler(DispatcherServlet.java:241)
	at net.sourceforge.stripes.controller.DispatcherServlet.doPost(DispatcherServlet.java:154)
	at net.sourceforge.stripes.controller.DispatcherServlet.doGet(DispatcherServlet.java:61)
	at javax.servlet.http.HttpServlet.service(Unknown Source)
	at javax.servlet.http.HttpServlet.service(Unknown Source)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Unknown Source)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(Unknown Source)
	at net.sourceforge.stripes.controller.StripesFilter.doFilter(StripesFilter.java:180)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Unknown Source)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(Unknown Source)
	at org.apache.catalina.core.ApplicationDispatcher.invoke(Unknown Source)
	at org.apache.catalina.core.ApplicationDispatcher.processRequest(Unknown Source)
	at org.apache.catalina.core.ApplicationDispatcher.doForward(Unknown Source)
	at org.apache.catalina.core.ApplicationDispatcher.forward(Unknown Source)
	at org.tuckey.web.filters.urlrewrite.NormalRewrittenUrl.doRewrite(NormalRewrittenUrl.java:195)
	at org.tuckey.web.filters.urlrewrite.RuleChain.handleRewrite(RuleChain.java:159)
	at org.tuckey.web.filters.urlrewrite.RuleChain.doRules(RuleChain.java:141)
	at org.tuckey.web.filters.urlrewrite.UrlRewriter.processRequest(UrlRewriter.java:90)
	at org.tuckey.web.filters.urlrewrite.UrlRewriteFilter.doFilter(UrlRewriteFilter.java:417)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Unknown Source)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(Unknown Source)
	at com.zappos.zeta.plumbing.SSLRedirectFilter.doFilter(SSLRedirectFilter.java:57)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Unknown Source)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(Unknown Source)
	at org.apache.catalina.core.StandardWrapperValve.invoke(Unknown Source)
	at org.apache.catalina.core.StandardContextValve.invoke(Unknown Source)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(Unknown Source)
	at org.apache.catalina.core.StandardHostValve.invoke(Unknown Source)
	at org.apache.catalina.valves.ErrorReportValve.invoke(Unknown Source)
	at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:563)
	at org.apache.catalina.core.StandardEngineValve.invoke(Unknown Source)
	at org.apache.catalina.ha.tcp.ReplicationValve.invoke(Unknown Source)
	at org.apache.catalina.ha.session.JvmRouteBinderValve.invoke(Unknown Source)
	at org.apache.catalina.connector.CoyoteAdapter.service(Unknown Source)
	at org.apache.coyote.http11.Http11NioP

Re: adding expand=true to WordDelimiterFilter

2008-05-19 Thread Chris Hostetter

by "expand=true" it sounds like you mean you are looking for a way to 
preserve the orriginal term without any characteres removed.

This sounds like SOLR-14 ... you might want to take a look at it, and see 
if the patch is still useable, and if not see if you can bring it up to
date.


-Hoss



Re: exceeded limit of maxWarmingSearchers

2008-05-19 Thread Chris Hostetter
: and I thought a true master-slave setup would be overkill. Is it really
: problematic to run queries on instances that aren't auto-warmed? Sounds like

it really depends on your usecases and what you consider "problematic" ... 
there's no inherent problem in having queries hit an unwarmed index, it 
isn't an error case or anything like that ... it's just that those 
queries *may* be slower than you're willing to live with, and if 
you hit an unwarmed index with a high volume of concurrent requests they 
*may* all be slow and they *may* cause general performance problems that 
could cascade ... 

the only hard and fast rule is that regardless of whether it's a master or 
a slave, you don't want to have "commits" (either explicit on a master 
or because of new snapshots on a slave) happen faster than you can open a 
new searcher -- if they do, then you either need to do less warming when 
opening a newSearcher, or slow down on the commits.

: I'm stuck between a rock and a hard place. Am I going to have to build my
: initial index w/ one configuration and then re-start with a different
: configuration? I'd prefer to avoid that.

this is where a master/slave setup typically comes into play ... build 
your initial index on a master with replication disabled, wait for a full 
build then enable replication and let your slave(s) pull the full index, 
warm it and then use it.

in theory, you could have "cascading" replication ... where real-time 
updates go to M, which has rapid autocommitting turned on and generates 
snapshots on every commit -- but cache autowarming is completely disabled.  
S1 pulls from M as fast as it can and has some caches with extremely 
conservative cache warming; it responds to queries that need very "up to 
date" info but are willing to wait a little bit in the event of a cache 
miss.  S2 pulls from M (or S1) much less frequently than S1 pulls from M; 
S2 has very aggressive cache warming and is used for responding to the 
bulk of queries, where responses need to be generated instantly but the 
info in those responses is allowed to be a little stale.
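
(As a rough illustration, the conservative-vs-aggressive warming difference largely comes down to the autowarmCount on the caches in each instance's solrconfig.xml; the sizes here are made up:)

<!-- S1: conservative warming, so new searchers open quickly -->
<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="16"/>

<!-- S2: aggressive warming, slower to open but fast once serving -->
<filterCache class="solr.LRUCache" size="4096" initialSize="4096" autowarmCount="2048"/>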


-Hoss



Re: adding expand=true to WordDelimiterFilter

2008-05-19 Thread Geoffrey Young



Chris Hostetter wrote:
by "expand=true" it sounds like you mean you are looking for a way to 
preserve the orriginal term without any characteres removed.


yes, that's it.



This sounds like SOLR-14 ... you might want to take a look at it, and see 
if the patch is still usable, and if not, see if you can bring it up to 
date.


I'm working with a team that deploys this all for me, so I've asked 
them.  I'll report back.


thanks for pointing it out :)

--Geoff


Problem getting spelling suggestions to work

2008-05-19 Thread oleg_gnatovskiy

Hello. I am having some trouble getting spelling suggestions to work. I am
running the latest nightly build of Solr. The URL I am hitting is:

http://localhost:8983/solr/select/?q=pizzza&qt=spellchecker&cmd=rebuild

and the response I am getting is:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">14</int>
    <lst name="params">
      <str name="cmd">rebuild</str>
      <str name="q">pizzza</str>
      <str name="qt">spellchecker</str>
    </lst>
  </lst>
</response>

Which is obviously missing the suggestions field. The reason for that is
likely that I overrode the default definition of /select. My /select is
defined in the following way:

<requestHandler name="/select" class="org.apache.solr.handler.component.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <arr name="components">
    <str>collapse</str>
    <str>facet</str>
    <str>mlt</str>
    <str>highlight</str>
    <str>debug</str>
  </arr>
</requestHandler>

The reason I am doing this is that I want to replace the query component
with the collapse component.

Am I missing something that would make the qt parameter work? Any help would
be appreciated.



anyone use hadoop+solr?

2008-05-19 Thread j . L
Can you talk about it?

Maybe I will use Hadoop + Solr.

Thanks for your advice.



-- 
regards
j.L


Re: Problem getting spelling suggestions to work

2008-05-19 Thread Otis Gospodnetic
I haven't actually used this in a while, but are you asking the handler for 
spellchecking (q=pizzza) or are you asking it to rebuild the index 
(cmd=rebuild)?  Asking for both at the same time might not be the best thing.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: oleg_gnatovskiy <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, May 19, 2008 9:01:14 PM
> Subject: Problem getting spelling suggestions to work
> 
> 
> Hello. I am having some trouble getting spelling suggestions to work. I am
> running the latest nightly build of Solr. The URL I am hitting is:
> 
> http://localhost:8983/solr/select/?q=pizzza&qt=spellchecker&cmd=rebuild
> 
> and the response I am getting is:
> 
> [response XML snipped -- reconstructed above]
> 
> 
> Which is obviously missing the suggestions field. The reason for that is
> likely that I overrode the default definition of /select. My /select is
> defined in the following way:
> 
> [requestHandler config snipped -- reconstructed above]
> The reason I am doing this, is that I want to replace the query component
> with the collapse component.
> 
> Am I missing something that would make the qt parameter work? Any help would
> be appreciated.



Re: How to limit number of pages per domain

2008-05-19 Thread Otis Gospodnetic
Jack, look at Solr JIRA and search for: field collapsing
There is a patch there that does this, though it's not in sync with the Solr 
trunk.
You can also look at how Nutch does this.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: JLIST <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, May 19, 2008 6:23:32 PM
> Subject: How to limit number of pages per domain
> 
> I'm indexing pages from multiple domains. In any given
> result set, I don't want to return more than two links
> from the same domain, so that the first few pages won't
> be all from the same domain. I suppose I could get more
> (say, 100) pages from solr, then sort in memory in the
> front-end server to mix the domains. But I'm not sure if
> there is a simpler way of implementing this ...
> 
> -- 
> Best regards,
> Jack



Re: HTTP Version Not Supported errors?

2008-05-19 Thread Otis Gospodnetic
I am guessing this is your Tomcat's error message.  I'd start by looking at 
Tomcat logs to see who's specifying invalid HTTP versions.  The thing to look 
for is something like "GET /solr/select HTTP/1.1"

That 1.1 is HTTP version 1.1.  I am guessing something/someone is specifying 
something like 1.2 (invalid) and Tomcat is complaining.  But it's a wild guess.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Matthew Runo <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, May 19, 2008 7:27:27 PM
> Subject: HTTP Version Not Supported errors?
> 
> Hello folks!
> 
> We're starting to see a lot of errors in Solr/SolrJ with the message  
> "HTTP Version Not Supported". I can't reproduce it, and it only seems  
> to happen with load - if there's no one browsing our site, then we  
> don't get the errors if we try browsing around ourselves. I looked  
> about in the SolrJ code where it connects to the Solr server, but all  
> seems well... any ideas?
> 
> Thanks!
> 
> Matthew Runo
> Software Developer
> Zappos.com
> 702.943.7833
> 
> Begin forwarded message:
> > From: [EMAIL PROTECTED]
> > Date: May 19, 2008 4:05:02 PM PDT
> > To: [EMAIL PROTECTED]
> > Subject: [Log4j] [SMTPAppender] web43 error message
> >
> > 2008-05-19 16:05:02,982 [ERROR] - [EMAIL PROTECTED] - - service.SimpleSiteService (getBrandById:197) - Could not retrieve brand for ID[313]
> > org.apache.solr.client.solrj.SolrServerException: Invalid SOLR Query object
> > [stack trace snipped -- see the original message above]

Re: Problem getting spelling suggestions to work

2008-05-19 Thread oleg_gnatovskiy

That's true, but that's not the problem. The problem is that you can't call
qt=spellchecker if you redefine /select in solrconfig.xml. I was wondering
how I could add the qt functionality back.
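
(One possible way back, sketched and untested: register the spellchecker as its own path-named handler in solrconfig.xml and query it directly, bypassing qt dispatch through the overridden /select; the element names mirror the spellchecker handler in the example solrconfig.xml of that era, but treat them as assumptions:)

<requestHandler name="/spellchecker" class="solr.SpellCheckerRequestHandler" startup="lazy">
  <str name="spellcheckerIndexDir">spell</str>
  <str name="termSourceField">word</str>
</requestHandler>

which would then be reachable at a URL like http://localhost:8983/solr/spellchecker?q=pizzza&cmd=rebuild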



Otis Gospodnetic wrote:
> 
> I haven't actually used this in a while, but are you asking the handler
> for spellchecking (q=pizzza) or are you asking it to rebuild the index
> (cmd=rebuild)?  Asking for both at the same time might not be the best
> thing.
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> - Original Message 
>> From: oleg_gnatovskiy <[EMAIL PROTECTED]>
>> To: solr-user@lucene.apache.org
>> Sent: Monday, May 19, 2008 9:01:14 PM
>> Subject: Problem getting spelling suggestions to work
>> 
>> 
>> Hello. I am having some trouble getting spelling suggestions to work. I
>> am
>> running the latest nightly build of Solr. The URL I am hitting is:
>> 
>> http://localhost:8983/solr/select/?q=pizzza&qt=spellchecker&cmd=rebuild
>> 
>> and the response I am getting is:
>> 
>> [response XML snipped -- reconstructed above]
>> Which is obviously missing the suggestions field. The reason for that is
>> likely that I overrode the default definition of /select. My /select is
>> defined in the following way:
>> 
>> [requestHandler config snipped -- reconstructed above]
>> The reason I am doing this, is that I want to replace the query component
>> with the collapse component.
>> 
>> Am I missing something that would make the qt parameter work? Any help
>> would
>> be appreciated.
> 
> 
> 
