Re: Querying Question

2008-08-21 Thread Erik Hatcher
That's correct. But your query clause was category_facet:test, and you said category_facet is a "string" type. If you copyField'd category_facet to category_text (a "text" type), then you need to search with category_text:test Erik On Aug 21, 2008, at 8:09 PM, Jake Conk wrote:
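
The distinction Erik draws here can be illustrated with a toy sketch. This is not Solr code: it only assumes that a "string" field indexes the whole value as a single term, while a "text" field is split into lowercased words. The function names and the sample value "Test Category" are hypothetical.

```python
def index_string(value):
    """A 'string'-like field: the entire value is one indexed term."""
    return {value}


def index_text(value):
    """A 'text'-like field: whitespace-split, lowercased terms."""
    return {word.lower() for word in value.split()}


def matches(indexed_terms, query_term):
    """True if the query term is one of the indexed terms."""
    return query_term in indexed_terms


# Hypothetical value stored in category_facet and copied to category_text.
category_facet = index_string("Test Category")
category_text = index_text("Test Category")

print(matches(category_facet, "test"))  # False: only the full value matches
print(matches(category_text, "test"))   # True: individual words are indexed
```

Under these assumptions, category_facet:test finds nothing, while category_text:test matches, which is why the query has to name the copied field.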

Re: Querying Question

2008-08-21 Thread Norberto Meijome
On Thu, 21 Aug 2008 18:09:11 -0700 "Jake Conk" <[EMAIL PROTECTED]> wrote: > I thought if I used to copy my string field to a text > field then I can search for words within it and not limited to the > entire content. Did I misunderstand that? but you need to search on the fields that are defined

Re: Querying Question

2008-08-21 Thread Jake Conk
I thought if I used to copy my string field to a text field then I can search for words within it and not limited to the entire content. Did I misunderstand that? Thanks, - Jake On Thu, Aug 21, 2008 at 5:53 PM, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > On Aug 21, 2008, at 7:33 PM, Jake Conk wr

Re: Querying Question

2008-08-21 Thread Walter Underwood
Also, "+" in a URL parameter turns into a space. The URL for this query: +field:Jake should look like this: ?q=%2Bfield%3AJake The admin UI takes care of that for you. wunder On 8/21/08 5:53 PM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote: > > On Aug 21, 2008, at 7:33 PM, Jake Conk wrot
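
Walter's point about "+" can be checked with Python's standard urllib.parse. This is a small sketch of generic URL encoding behaviour, not anything Solr-specific; only the query string from the message is used.

```python
from urllib.parse import parse_qs, urlencode

raw = "+field:Jake"

# Unencoded: the "+" is decoded as a space, so the operator is lost.
print(parse_qs("q=" + raw))   # {'q': [' field:Jake']}

# Encoded: "+" becomes %2B and ":" becomes %3A.
encoded = urlencode({"q": raw})
print(encoded)                # q=%2Bfield%3AJake
print(parse_qs(encoded))      # {'q': ['+field:Jake']}
```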

Re: Querying Question

2008-08-21 Thread Erik Hatcher
On Aug 21, 2008, at 7:33 PM, Jake Conk wrote: I'm having trouble using the + operator. According to the documentation if I put that operator in front of any term then it should find that term anywhere within the field. Be sure to look at this documentation:

Querying Question

2008-08-21 Thread Jake Conk
Hello, I'm having trouble using the + operator. According to the documentation if I put that operator in front of any term then it should find that term anywhere within the field. So if I want all the records that have the name "Jake" in them I started with a simple query that works: ?q=Jake

Re: "Multicore" and snapshooter / snappuller

2008-08-21 Thread Jon Baer
Thanks ... on a somewhat related note, does having the index on ZFS buy me anything, has anyone toyed w/ ZFS snapshots / send / recv to automount? Does it work? - Jon On Aug 21, 2008, at 6:43 PM, Alexander Ramos Jardim wrote: You need to setup one snapshooter for each index 2008/8/21 Jon

Re: "Auto commit error" and java.io.FileNotFoundException

2008-08-21 Thread Michael McCandless
OK indeed that revision of Lucene is before the workaround for that nasty JRE bug was committed. Can you test one of those JRE versions (known not to have this particular JRE bug) and see if you can get the original "massive deletion" problem to happen. I guess first try it without the

Re: "Multicore" and snapshooter / snappuller

2008-08-21 Thread Alexander Ramos Jardim
You need to setup one snapshooter for each index 2008/8/21 Jon Baer <[EMAIL PROTECTED]> > Hi, > > Ive started putting together a small cluster and going through the setup on > some of the scripts, do they have any awareness of a multicore setup? It > seems like I can only snapshot a single maste

Re: Buffer overflow attack on solr seen in the wild

2008-08-21 Thread Alexander Ramos Jardim
Yes, it is an SQL injection in fact. 2008/8/21 Mike Klaas <[EMAIL PROTECTED]> > Hi Jim, > > Looks like a sql injection attack that is automatically entered into search > forms. Solr should not be affected, but it could affect you if you insert > the raw/unescaped query into a sql database (for l

Re: Buffer overflow attack on solr seen in the wild

2008-08-21 Thread Mike Klaas
Hi Jim, Looks like a sql injection attack that is automatically entered into search forms. Solr should not be affected, but it could affect you if you insert the raw/unescaped query into a sql database (for logging, etc.). -Mike On 21-Aug-08, at 3:30 PM, Jim Hurst wrote: Hey folks, I

Buffer overflow attack on solr seen in the wild

2008-08-21 Thread Jim Hurst
Hey folks, I was just perusing a log on a production server and saw the entry below. It's all one line in the log, I've added line breaks to ease your viewing. I'm not informed enough to evaluate this as a threat. Any advice? Thanks, -Jim PS: query is lightly sanitized,

"Multicore" and snapshooter / snappuller

2008-08-21 Thread Jon Baer
Hi, I've started putting together a small cluster and going through the setup on some of the scripts, do they have any awareness of a multicore setup? It seems like I can only snapshot a single master directory, I'm assuming these tools are compatible with that type of setup but just want

Re: XML includes in solrconfig.xml/schema.xml

2008-08-21 Thread Henrib
Since I authored the patch, I'm guilty on all counts. :-) Amit Nithian wrote: > > I am not sure why they chose that direction over built-in entity include. > Entities are not the most used or known feature and I just did not think of this as a way to do it. I also wanted variable expansion in

Re: Testing query response time

2008-08-21 Thread Walter Underwood
We do two kinds of load testing for our Solr search farm. 1. Zipf distribution, using a log of user queries. This tests the engine without any caching in front. 2. Flat distribution, using unique queries (sort|uniq on the above log). This is a worst case load with a perfect cache in front. In bo
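
The two workloads described above can be derived from one query log roughly as follows. This is a minimal sketch: the function names and the sample log are made up for illustration, and it assumes the log is simply a list of query strings in the order users issued them.

```python
def zipf_workload(logged_queries):
    """Replay the log as-is, so popular queries repeat as often as users sent them."""
    return list(logged_queries)


def flat_workload(logged_queries):
    """One copy of each query (the sort|uniq step), so no repeat can be served from a cache."""
    return sorted(set(logged_queries))


# Hypothetical query log.
log = ["ipod", "ipod", "solr", "ipod", "lucene", "solr"]

print(zipf_workload(log))  # ['ipod', 'ipod', 'solr', 'ipod', 'lucene', 'solr']
print(flat_workload(log))  # ['ipod', 'lucene', 'solr']
```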

Re: "Auto commit error" and java.io.FileNotFoundException

2008-08-21 Thread Chris Harris
I'll see about using a newer/older JVM. In the meantime, according to the Solr admin page, which seems to get its info like so LucenePackage.class.getPackage().getImplementationVersion() what I've been testing is Lucene r652650. The Solr version is r654965, now modified of course to do some

Re: XML includes in solrconfig.xml/schema.xml

2008-08-21 Thread Amit Nithian
Thanks for pointing me to this. Although the patch page says that the include resource is an experimental feature. I am not sure why they chose that direction over built-in entity include. Thanks Amit On Thu, Aug 21, 2008 at 2:27 PM, Henrib <[EMAIL PROTECTED]> wrote: > > The other option is to u

Re: Testing query response time

2008-08-21 Thread Phillip Farber
On rereading my original post it does sound weird. Let me try again and thanks for bearing with me. I want to know how long solr will take to process a unique query taking full advantage of OS i/o buffers. I think executing a set of unique queries from a cold start should measure that if I

Re: Less aggressive stemmer?

2008-08-21 Thread Guillaume Smet
On Thu, Aug 21, 2008 at 11:23 PM, Jason Rennie <[EMAIL PROTECTED]> wrote: > Is there an option to perform less aggressive stemming in solr? We're using > the Porter stemmer. I see that there is an option for Snowball, but my > understanding is that Snowball is a refinement of Porter rather than >

Re: Less aggressive stemmer?

2008-08-21 Thread Kevin Osborn
We had similar problems and then switched to KStem and have been pretty happy with the results. http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi - Original Message From: Jason Rennie <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, August 21, 2008 2:23:36 PM

Re: XML includes in solrconfig.xml/schema.xml

2008-08-21 Thread Henrib
The other option is to use solr-646 which adds the ability to include files through an . Regards henri -- View this message in context: http://www.nabble.com/XML-includes-in-solrconfig.xml-schema.xml-tp19096292p19097243.html Sent from the Solr - User mailing list archive at Nabble.com.

Less aggressive stemmer?

2008-08-21 Thread Jason Rennie
Is there an option to perform less aggressive stemming in solr? We're using the Porter stemmer. I see that there is an option for Snowball, but my understanding is that Snowball is a refinement of Porter rather than something radically different. I think we'd be best off with something very basi

Re: XML includes in solrconfig.xml/schema.xml

2008-08-21 Thread Alexander Ramos Jardim
I would love to have this possibility! 2008/8/21 Amit Nithian <[EMAIL PROTECTED]> > I read a post online by Jacob Singh about XML includes for solrconfig.xml. > I > have several indices with somewhat overlapping schemas and having to > redefine these elements is not ideal. XML entity includes see

XML includes in solrconfig.xml/schema.xml

2008-08-21 Thread Amit Nithian
I read a post online by Jacob Singh about XML includes for solrconfig.xml. I have several indices with somewhat overlapping schemas and having to redefine these elements is not ideal. XML entity includes seems like a reasonable solution; however, the problem with Solr currently is that entity inclu

Re: "Auto commit error" and java.io.FileNotFoundException

2008-08-21 Thread Michael McCandless
Urgh, I was hoping we could repro the "massive deletion" with infoStream turned on. Uh-oh: that "off by 1" corruption is very likely due to the Sun JRE bug described here: https://issues.apache.org/jira/browse/LUCENE-1282 Can you downgrade to 1.6.0_03, or, upgrade to the latest beta

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Sean Timm
https://issues.apache.org/jira/browse/LUCENE-1360 Simon Hu wrote: I am definitely interested in trying your Similarity class. Can you please post the patch in jira? thanks -Simon Sean Timm wrote: In the example below, Doc1, and Doc2 will all have the same score for the query "chevrole

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Simon Hu
I am definitely interested in trying your Similarity class. Can you please post the patch in jira? thanks -Simon Sean Timm wrote: > > In the example below, Doc1, and Doc2 will all have the same score for > the query "chevrolet tahoe." We would prefer Doc2 to score higher than > Doc1. Th

Re: "Auto commit error" and java.io.FileNotFoundException

2008-08-21 Thread Chris Harris
Well shoot, upon further examination it appears that this time around there weren't actually any segment deletion problems. The "only" problem was a "doc counts differ" error. Interestingly, the count is only off by one. From the CheckIndex tool: Opening index @ /ssd/solr-/solr/exhibit

Re: Solr Logo thought

2008-08-21 Thread Mike Klaas
I thought the plan was to run more of a logo contest? -Mike On 21-Aug-08, at 9:29 AM, Otis Gospodnetic wrote: One more +1 for the eye/sun O. I don't think I thought "eye" when i saw it, but I think having an eye there is actually a cool little detail. I think Shalin should revive his pol

RE: shards and performance

2008-08-21 Thread Lance Norskog
We found that searching by itself was faster with the Distributed multicore search over three cores in the same servlet engine than with just one core. Faceting and sorting use more memory than simple searches, and we could not do faceting on our one simple index. We needed this for data analysis.

Re: shards and performance

2008-08-21 Thread Alexander Ramos Jardim
2008/8/21 Otis Gospodnetic <[EMAIL PROTECTED]> > Uh uh. 6 instances per node all pointing to the same index? > Yes, this can increase performance, but only because it essentially gives > you 6 separate searchers (SolrIndexSearchers). This clearly uses more RAM, > especially if you sort on fields

Re: shards and performance

2008-08-21 Thread Otis Gospodnetic
Uh uh. 6 instances per node all pointing to the same index? Yes, this can increase performance, but only because it essentially gives you 6 separate searchers (SolrIndexSearchers). This clearly uses more RAM, especially if you sort on fields and especially if you are not omitting norms where yo

Re: Testing query response time

2008-08-21 Thread Otis Gospodnetic
Hi, I think you are describing some "weird" unrealistic scenarios. There is typically no need to test "just solr" without relying on disk caches. Not using disk buffers will only work in trivial scenarios, but if you really want to test it, run something that hogs memory while running solr perf

Re: Recognizing date inputs

2008-08-21 Thread Otis Gospodnetic
Sounds like custom heuristics are needed, used with something like DateFormat. Nothing that automagically recognizes dates in various formats exists in Solr as far as I know. I bet someone, somewhere has written and outsourced something like this, though. http://www.google.com/search?hl=en

Re: Solr won't start under jetty on RHEL5.2

2008-08-21 Thread Jeremy Hinegardner
On Mon, Aug 18, 2008 at 04:20:12PM -0700, Jon Drukman wrote: > Jon Drukman wrote: >> I just migrated my solr instance to a new server, running RHEL5.2. I >> installed java from yum but I suspect it's different from the one I used >> to use. > > > Turns out my instincts were correct. The version

Re: Solr Logo thought

2008-08-21 Thread Otis Gospodnetic
One more +1 for the eye/sun O. I don't think I thought "eye" when i saw it, but I think having an eye there is actually a cool little detail. I think Shalin should revive his poll with your logo versions once they are ready. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Jason Rennie
Count me as interested. Our "documents" are product descriptions, many fields of which are very short. Not sure if it would make large enough of an impact to warrant us rolling our own solr build, but I'm definitely interested to see the custom Similarity class. Thanks, Jason On Thu, Aug 21, 2

Re: DIH - Document missing required field error

2008-08-21 Thread Todd Breiholz
That was it. Thanks! On Thu, Aug 21, 2008 at 9:30 AM, Noble Paul നോബിള്‍ नोब्ळ् < [EMAIL PROTECTED]> wrote: > oh the names are in CAPS. And in your dataconfig it is in small it > must be something like this > > > > > On Thu, Aug 21, 2008 at 7:33 PM, Todd Breiholz <[EMAIL PROTECTED]> wrote: > >

Re: DIH - Document missing required field error

2008-08-21 Thread Shalin Shekhar Mangar
Hmm...seems like none of the columns are getting copied over to the document. I see that all column names are in caps in Row#1, perhaps you need to add them in caps to the data-config too? DIH just stores each row as a Map and therefore map lookups are also case-sensitive. On Thu, Aug 21, 2008 at
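
The case-sensitivity problem Shalin describes is the ordinary behaviour of a map keyed by column name. The sketch below uses a plain Python dict as a stand-in for the row; it is not DIH code, and the TITLE column and the values are made up.

```python
# Hypothetical row as a database driver might return it, with upper-case column names.
row = {"MYGALLERY_ID": 42, "TITLE": "Sunset"}


def lookup(row, column):
    """Case-sensitive lookup: returns None when the key's case does not match."""
    return row.get(column)


print(lookup(row, "mygallery_id"))  # None: lower-case name finds nothing
print(lookup(row, "MYGALLERY_ID"))  # 42
```

A required field fed from the lower-case name would therefore arrive empty, which is consistent with the "missing required field" error in this thread.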

Re: best way to debug shard format errors

2008-08-21 Thread Ian Connor
I found that wrapping the call in BinaryResponseWriter with a try/catch solved the problem and allows null values to be returned. BinaryResponseWriter.java:141 try { val = useFieldObjects ? ft.toObject(f) : ft.toExternal(f); } catch (NumberFormatException e) {

Re: DIH - Document missing required field error

2008-08-21 Thread Noble Paul നോബിള്‍ नोब्ळ्
oh the names are in CAPS. And in your dataconfig it is in small it must be something like this On Thu, Aug 21, 2008 at 7:33 PM, Todd Breiholz <[EMAIL PROTECTED]> wrote: > On Wed, Aug 20, 2008 at 10:54 PM, Shalin Shekhar Mangar < > [EMAIL PROTECTED]> wrote: > >> Does the "mygallery_id" column i

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Alexander Ramos Jardim
The strategy I use is rather simple: I put the data I want to match in 2 fields, 1 tokenized (indexed=true, stored=false), 1 exact match (indexed=true, stored=true) 2008/8/20 Simon Hu <[EMAIL PROTECTED]> > > Hi > > I have a text field named prodname in the solr index. Lets say there are 3 > docu

Re: DIH - Document missing required field error

2008-08-21 Thread Todd Breiholz
On Wed, Aug 20, 2008 at 10:54 PM, Shalin Shekhar Mangar < [EMAIL PROTECTED]> wrote: > Does the "mygallery_id" column in your database allow nulls? It gets copied > to the id field in Solr which is required, hence the error. > Nope. I just confirmed that the mygallery_id column does *not* allow nu

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Sean Timm
In the example below, Doc1, and Doc2 will all have the same score for the query "chevrolet tahoe." We would prefer Doc2 to score higher than Doc1. The score length norm for each is also 0.5f. I presume which one appears first now falls to the order they were placed in the index? By using ou

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Mark Miller
Sean Timm wrote: To solve this, we wrote our own Similarity class which extends DefaultSimilarity and maps numTerms 1-10 to precalculated values between 1.5f and 0.3125f. For numTerms >10, we use the standard formula above. If anyone else is interested in this, I can post the code as a patch

Re: How to boost the score higher in case user query matches entire field value than just some words within a field

2008-08-21 Thread Sean Timm
Length normalization in the Similarity class will generally favor shorter fields. For example, with the DefaultSimilarity, the length norm for a 2 term field is 0.625. For a three term field it is 0.5. The norm is multiplied by the score. I say "generally will favor" because the length norm
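
The figures quoted here (0.625 for two terms, 0.5 for three) can be reproduced with a short sketch. It assumes the default length norm is 1/sqrt(numTerms) and that the norm is stored in a single byte using an encoding like Lucene's SmallFloat "315" format; the functions below are a from-memory Python port written for illustration, not the library's code, and their names are hypothetical.

```python
import math
import struct


def float_to_byte315(x):
    """Lossy one-byte encoding of a positive float (assumed SmallFloat-style)."""
    if x <= 0:
        return 0
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    small = bits >> 21
    fzero = (63 - 15) << 3
    if small <= fzero:
        return 1        # underflow maps to the smallest non-zero value
    if small >= fzero + 0x100:
        return 255      # overflow maps to the largest value
    return small - fzero


def byte315_to_float(b):
    """Decode the byte back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def stored_length_norm(num_terms):
    """Length norm as it would read back after the one-byte round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


for n in (1, 2, 3, 4, 10):
    print(n, stored_length_norm(n))
```

Under these assumptions, fields of three and four terms both read back as 0.5, so the precision lost in the byte encoding is what makes fields of different lengths score identically.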

Re: best way to debug shard format errors

2008-08-21 Thread Ian Connor
More of an update and work around. When you query a number field locally, it can return null. However, when you go through a shard if you have an empty number it throws an error. Should I open a bug for this? On Thu, Aug 21, 2008 at 8:56 AM, Ian Connor <[EMAIL PROTECTED]> wrote: > I think I have

Re: best way to debug shard format errors

2008-08-21 Thread Ian Connor
I think I have narrowed it down to: where integer is defined in the example as: It returns fine when I query directly, but blows up when going through the binary conversion that shards uses. On Thu, Aug 21, 2008 at 8:37 AM, Ian Connor <[EMAIL PROTECTED]> wrote: > Hi, > > What is the b

best way to debug shard format errors

2008-08-21 Thread Ian Connor
Hi, What is the best way to figure out which field is giving this error: Aug 21, 2008 8:34:17 AM org.apache.solr.common.SolrException log SEVERE: java.lang.NumberFormatException: For input string: "" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)

Re: localsolr and dataimport problems

2008-08-21 Thread TomWilliamson
Thanks. I've just tested localsolr using the latest solr trunk and this doesn't seem to work at all, so I'm guessing it is in need of an update. Great work on this though - SOLR is awesome! Cheers, Tom Patrick is on vacation this week. You can get the authoritative answer when he is back, bu

Re: Solr Logo thought

2008-08-21 Thread Lukáš Vlček
ConcurrentException... OK, I will try to prepare two designs. One which will not evoke the eye and the second which will keep the eye feeling. Then you can choose. Lukas On Thu, Aug 21, 2008 at 11:24 AM, Nick Jenkin <[EMAIL PROTECTED]> wrote: > I like the O, it is both the sun and it looks like

Re: Solr Logo thought

2008-08-21 Thread Nick Jenkin
I like the O, it is both the sun and it looks like an eye which fits in with the search. Good stuff. -Nick On 8/21/08, Lukáš Vlček <[EMAIL PROTECTED]> wrote: > Hi, > > Well, the eye looking O is not intentional. It is more a result of the > technique I used when doing the initial sketch. Believe