Re: java.lang.NullPointerException with MySQL DataImportHandler

2010-02-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Feb 4, 2010 at 10:50 AM, Lance Norskog wrote: > I just tested this with a DIH that does not use database input. > > If the DataImportHandler JDBC code does not support a schema that has > optional fields, that is a major weakness. Noble/Shalin, is this true? The problem is obviously not wi

Re: The Riddle of the Underscore and the Dollar Sign

2010-02-03 Thread Lance Norskog
Please reframe how you give the various fields and tests - it's hard to follow in this email. On Wed, Feb 3, 2010 at 12:50 PM, Christopher Ball wrote: > I am perplexed by the behavior I am seeing of the Solr Analyzer and Filters > with regard to Underscores. > > > > 1) I am trying to get rid of t

Re: query all filled field?

2010-02-03 Thread Lance Norskog
Queries that start with minus or NOT don't work. You have to do this: *:* AND -fieldX:[* TO *] On Wed, Feb 3, 2010 at 5:04 AM, Frederico Azeiteiro wrote: > Hum, strange.. I reindexed some docs with the field corrected. > > Now I'm sure the field is filled because: > > "fieldX:(*a*)" returns
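
The rewrite described in this reply can be sketched as a small helper. This is only a minimal sketch in Python that builds the query strings; the function names are hypothetical and nothing here talks to Solr.

```python
def filled_field_query(field):
    """Query string matching documents where `field` has any value."""
    return f"{field}:[* TO *]"


def missing_field_query(field):
    """Query string matching documents where `field` has no value.

    A purely negative clause is anchored to the match-all query *:*,
    as suggested in the reply above.
    """
    return f"*:* AND -{filled_field_query(field)}"


print(filled_field_query("fieldX"))   # fieldX:[* TO *]
print(missing_field_query("fieldX"))  # *:* AND -fieldX:[* TO *]
```

Anchoring to *:* gives the negative clause a set of documents to subtract from, which is the point of the reply.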

Re: Using solr to store data

2010-02-03 Thread Tommy Chheng
Hey AJ, For simplicity's sake, I am using Solr to serve as storage and search for http://researchwatch.net. The dataset is 110K NSF grants from 1999 to 2009. The faceting is all dynamic fields and I use a catch all to copy all fields to a default text field. All fields are also stored and used f

Re: Solr usage with Auctions/Classifieds?

2010-02-03 Thread Lance Norskog
Oops, forgot to add the link: http://www.lucidimagination.com/search/document/CDRG_ch04_4.4.4 On Wed, Feb 3, 2010 at 9:17 PM, Andy wrote: > How do I set up and use this external file? > > Can I still use such a field in fq or boost? > > Can you point me to the right documentation? Thanks > > ---

Re: java.lang.NullPointerException with MySQL DataImportHandler

2010-02-03 Thread Lance Norskog
I just tested this with a DIH that does not use database input. If the DataImportHandler JDBC code does not support a schema that has optional fields, that is a major weakness. Noble/Shalin, is this true? On Tue, Feb 2, 2010 at 8:50 AM, Sascha Szott wrote: > Hi, > > since some of the fields used

Re: Solr usage with Auctions/Classifieds?

2010-02-03 Thread Andy
How do I set up and use this external file? Can I still use such a field in fq or boost? Can you point me to the right documentation? Thanks --- On Wed, 2/3/10, Lance Norskog wrote: From: Lance Norskog Subject: Re: Solr usage with Auctions/Classifieds? To: solr-user@lucene.apache.org Date: We

Re: Solr response extremely slow

2010-02-03 Thread Lance Norskog
Is it possible that the virtual machine does not give clean system millisecond numbers? On Wed, Feb 3, 2010 at 5:43 PM, Erik Hatcher wrote: > > On Feb 3, 2010, at 1:38 PM, Rajat Garg wrote: >> >> Solr Specification Version: 1.3.0 >> Solr Implementation Version: 1.3.0 694707 - grantingersoll - 200

Re: Using solr to store data

2010-02-03 Thread Lance Norskog
If you're happy with disk sizes and indexing&search performance, there are still holes: Documents update instead of fields, so when you have a million documents that say "German" and should say "French", you have to reindex a million documents. There are no tools for managing distributed indexes,

Re: Slow QueryComponent.process() when queries have numbers in them

2010-02-03 Thread Lance Norskog
The debugQuery parameter shows you how the query is parsed into a tree of Lucene query objects. http://wiki.apache.org/solr/CommonQueryParameters#debugQuery On Wed, Feb 3, 2010 at 3:40 PM, Simon Wistow wrote: > According to my logs > > org.apache.solr.handler.component.QueryComponent.process() >

Re: Search wihthout diacritics

2010-02-03 Thread Lance Norskog
You need to add AsciiFoldingFilter to the query path as well as the indexing path. The solr/admin/analysis.jsp page lets you explore how these analysis stacks work. On Tue, Feb 2, 2010 at 5:53 PM, Olala wrote: > > Hi all! > > I have problem with Solr, and I hope everyboby in there can help me :)

Re: ClassCastException setting date.formats in ExtractingRequestHandler

2010-02-03 Thread Lance Norskog
Please file a bug for this in the JIRA: http://issues.apache.org/jira/browse/SOLR Please add all details. Thanks! -- Forwarded message -- From: Christoph Brill Date: Tue, Feb 2, 2010 at 4:11 AM Subject: ClassCastException setting date.formats in ExtractingRequestHandler To: sol

Re: weird behabiour when setting negative boost with bq using dismax

2010-02-03 Thread Lance Norskog
In the standard query parser, this means "remove all entries in which field_a = 54". > bq=-field_a:54^1 Generally speaking, by convention boosts in Lucene have unity at 1.0, not 0.0. So, a "negative boost" is usually done with boosts between 0 and 1. For this case, maybe a boost of 0.1 is

Solr 1.4: Full import FileNotFoundException

2010-02-03 Thread ranjitr
Hello, I have noticed that when I run concurrent full-imports using DIH in Solr 1.4, the index ends up getting corrupted. I see the following in the log files (a snippet): 2010-02-03T17:54:24 1265248464553 764 org.apache.solr.handler.dataimport.SolrWriter SEVERE org.apache.solr.han

Re: Solr usage with Auctions/Classifieds?

2010-02-03 Thread Lance Norskog
This field type allows you to have an external file that gives a float value for a field. You can only use functions on it. On Sat, Jan 30, 2010 at 7:05 AM, Jan Høydahl / Cominvent wrote: > A follow-up on the auction use case. > > How do you handle the need for frequent updates of only one field,

Re: Solr response extremely slow

2010-02-03 Thread Erik Hatcher
On Feb 3, 2010, at 1:38 PM, Rajat Garg wrote: Solr Specification Version: 1.3.0 Solr Implementation Version: 1.3.0 694707 - grantingersoll - 2008-09-12 11:06:47 There's the problem right there... that grantingersoll guy :) (kidding) Sounds like you're just hitting cache warming which can

Using solr to store data

2010-02-03 Thread AJ Asver
Hi all, I work on search at Scoopler.com, a real-time search engine which uses Solr. We currently use Solr for indexing but then fetch data from our CouchDB cluster using the IDs Solr returns. We are now considering storing a larger portion of data in Solr's index itself so we don't have to hit th

Re: need help with feb 3/2010 trunk and solr-236

2010-02-03 Thread Koji Sekiguchi
gdeconto wrote: I got latest trunk (feb3/2010) using svn and applied solr-236. I tried the latest patch with Solr trunk yesterday, no problems there. did an "ant clean" and it seems to build fine with no errors or warnings. Did you ant example? I think ant clean will delete everything

need help with feb 3/2010 trunk and solr-236

2010-02-03 Thread gdeconto
I got latest trunk (feb3/2010) using svn and applied solr-236. did an "ant clean" and it seems to build fine with no errors or warnings. however, when I start solr I get an error (here is a snippet): SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.com

Slow QueryComponent.process() when queries have numbers in them

2010-02-03 Thread Simon Wistow
According to my logs org.apache.solr.handler.component.QueryComponent.process() takes a significant amount of time (>5s but I've seen up to 15s) when a query has an odd pattern of numbers in e.g "neodymium megagauss-oersteds (MGOe) (1 MG·Oe = 7,958·10³ T·A/m = 7,958 kJ/m³" "myers 8e psycholo

Re: Guidance on Solr errors

2010-02-03 Thread Grant Ingersoll
Inline below. On Feb 2, 2010, at 8:40 PM, Vauthrin, Laurent wrote: > Hello, > > > > I'm trying to troubleshoot a problem that occurred on a few Solr slave > Tomcat instances and wanted to run it by the list to see if I'm on the > right track. > > > > The setup involves 1 master replicating

Re: Search wihthout diacritics

2010-02-03 Thread Grant Ingersoll
On Feb 2, 2010, at 8:53 PM, Olala wrote: > > Hi all! > > I have problem with Solr, and I hope everyboby in there can help me :) > > I want to search text without diacritic but Solr will response diacritic > text and without diacritic text. > > For example, I query "solr index", it will respon

The Riddle of the Underscore and the Dollar Sign

2010-02-03 Thread Christopher Ball
I am perplexed by the behavior I am seeing of the Solr Analyzer and Filters with regard to Underscores. 1) I am trying to get rid of them when shingling, but seem unable to do so with a Stopwords Filter. And yet they are being removed when I am not even trying to by the WordDelimiter Filter

autosuggest via solr.EdgeNGramFilterFactory (was: Re: wildcards in stopword list)

2010-02-03 Thread Lukas Kahwe Smith
Hi Ahmet, Well after some more testing I am now convinced that you rock :) I like the solution because it's obviously way less hacky and more importantly I expect this to be a lot faster and less memory intensive, since instead of a facet prefix or terms search, I am doing an "equality" compariso

Re: distributed search and failed core

2010-02-03 Thread Joe Calderon
thx guys, i ended up using a mix of code from the solr-1143 and solr-1537 patches, now whenever there is an exception there is a section in the results indicating the result is partial and also lists the failed core(s), we've added some monitoring to check for that output as well to alert us when a

Re: distributed search and failed core

2010-02-03 Thread Yonik Seeley
On Fri, Jan 29, 2010 at 3:31 PM, Joe Calderon wrote: > hello *, in distributed search when a shard goes down, an error is > returned and the search fails, is there a way to avoid the error and > return the results from the shards that are still up? The SolrCloud branch has load-balancing capabili

Re: Solr response extremely slow

2010-02-03 Thread Rajat Garg
Here you go - Solr Specification Version: 1.3.0 Solr Implementation Version: 1.3.0 694707 - grantingersoll - 2008-09-12 11:06:47 Lucene Specification Version: 2.4-dev Lucene Implementation Version: 2.4-dev 691741 - 2008-09-03 15:25:16 -- View this message in context: http://old.nabble.com/Sol

Re: distributed search and failed core

2010-02-03 Thread Ian Connor
My only suggestion is to put haproxy in front of two replicas and then have haproxy do the failover. If a shard fails, the whole search will fail unless you do something like this. On Fri, Jan 29, 2010 at 3:31 PM, Joe Calderon wrote: > hello *, in distributed search when a shard goes down, an err

SOLR Performance Tuning: Fuzzy Search

2010-02-03 Thread Fuad Efendi
I was lucky to contribute an excellent solution: http://issues.apache.org/jira/browse/LUCENE-2230 Even the 2nd edition of Lucene in Action advocates using fuzzy search only in exceptional cases. Another solution would be 2-step indexing (it may work for many use cases), but it is not "spellchecker

Re: Indexing an oracle warehouse table

2010-02-03 Thread caman
Thanks. I will give this a shot. Alexey-34 wrote: > >> What would be the right way to point out which field contains the term >> searched for. > I would use highlighting for all of these fields and then post process > Solr response in order to check highlighting tags. But I don't have so > many

Re: Any idea what could be wrong with this fq value?

2010-02-03 Thread javaxmlsoapdev
thanks Erik for the pointer. I had this field as "text" and after changing it to "string" it started working as expected. I am still not sure why this particular value("Infrastructure") was failing to bring back results. other values like "Network", "Information" etc worked fine when field was o

Re: wildcards in stopword list

2010-02-03 Thread Lukas Kahwe Smith
On 03.02.2010, at 14:34, Ahmet Arslan wrote: > >> Actually I plan to write a bigger blog post about the >> approach. In order to match the different fields I actually >> have a separate core with an index dedicated to auto suggest >> alone where I merge all fields together via some javascript >>

Re: Any idea what could be wrong with this fq value?

2010-02-03 Thread Erik Hatcher
is groupName a "string" field? If not, it probably should be. My hunch is that you're analyzing that field and it is lowercased in the index, and maybe even stemmed. Try &q=*:*&facet=on&facet.field=groupName to see all the *indexed* values of the groupName field. Erik On Feb 3,
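
The facet request suggested above can be assembled programmatically. A minimal sketch in Python using only the standard library; the base URL is the placeholder from the original question (`hostname:port` is not a real address), and the helper name `facet_values_url` is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder taken from the original question; substitute a real host and core.
BASE_URL = "http://hostname:port/solr/entities/select/"


def facet_values_url(field, base_url=BASE_URL):
    """Build a URL that lists the indexed values of `field` via faceting."""
    params = {"q": "*:*", "facet": "on", "facet.field": field}
    # urlencode percent-escapes characters such as '*' and ':' in the values.
    return base_url + "?" + urlencode(params)


print(facet_values_url("groupName"))
```

If the field is analyzed, the facet values returned are the indexed terms (for example lowercased), which is what makes the mismatch with the fq value visible.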

Any idea what could be wrong with this fq value?

2010-02-03 Thread javaxmlsoapdev
Following is my solr URL. http://hostname:port/solr/entities/select/?version=2.2&start=0&indent=on&qt=dismax&rows=60&fq=statusName:(Open OR Cancelled)&debugQuery=true&q=dev&fq=groupName:"Infrastructure“ “groupName” is one of the attributes I create fq (filterQuery) on. This field(groupName) is

Re: how to stress test solr

2010-02-03 Thread Marc Sturlese
I like to use JMeter with a large queries file. This way you can measure response times with lots of requests at the same time. Having JConsole opened at the same time you can check the memory status James liu-2 wrote: > > before stressing test, Should i close SolrCache? > > which tool u use? >

Re: wildcards in stopword list

2010-02-03 Thread Lukas Kahwe Smith
On 03.02.2010, at 15:19, Ahmet Arslan wrote: >>> With this field type, the query "ding" or "din" or >> "di" would return "Foo Bar Ding Dong". >> >> hmm wouldnt it return "foo bar ding dong" ? > > No, it will return original string. In this method you are not using faceting > anymore. You are

Re: wildcards in stopword list

2010-02-03 Thread Ahmet Arslan
> > With this field type, the query "ding" or "din" or > "di" would return "Foo Bar Ding Dong". > > hmm wouldnt it return "foo bar ding dong" ? No, it will return original string. In this method you are not using faceting anymore. You are just querying and requesting a field.
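
The behaviour described here (matching on a prefix of any word while returning the stored original string) can be illustrated outside Solr. The following is a rough Python sketch of the edge n-gram idea, not the solr.EdgeNGramFilterFactory implementation; the function names and the `min_gram`/`max_gram` defaults are assumptions chosen for illustration.

```python
from collections import defaultdict


def edge_ngrams(token, min_gram=1, max_gram=25):
    """Return the leading substrings of `token`, shortest first."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def build_index(phrases, min_gram=1, max_gram=25):
    """Map every lowercased edge n-gram to the original phrases containing it."""
    index = defaultdict(set)
    for phrase in phrases:
        for token in phrase.lower().split():
            for gram in edge_ngrams(token, min_gram, max_gram):
                index[gram].add(phrase)
    return index


def suggest(index, query):
    """Exact lookup of the lowercased query; returns the stored originals."""
    return sorted(index.get(query.lower(), ()))


index = build_index(["Foo Bar Ding Dong", "Foo Bar"])
print(suggest(index, "di"))   # ['Foo Bar Ding Dong']
print(suggest(index, "ding")) # ['Foo Bar Ding Dong']
```

The lookup is a plain equality match on the query term, and the value returned is the original string rather than the lowercased tokens.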

Re: ContentStreamUpdateRequest addFile fails to close Stream

2010-02-03 Thread Mark Miller
Hey Christoph, Could you give the patch at https://issues.apache.org/jira/browse/SOLR-1744 a try and let me know how it works out for you? -- - Mark http://www.lucidimagination.com Mark Miller wrote: > Christoph Brill wrote: > >> I tried to fix it in CommonsHttpSolrServer but I wasn't s

Re: wildcards in stopword list

2010-02-03 Thread Lukas Kahwe Smith
On 03.02.2010, at 14:34, Ahmet Arslan wrote: > >> Actually I plan to write a bigger blog post about the >> approach. In order to match the different fields I actually >> have a separate core with an index dedicated to auto suggest >> alone where I merge all fields together via some javascript >>

Re: wildcards in stopword list

2010-02-03 Thread Ahmet Arslan
> Actually I plan to write a bigger blog post about the > approach. In order to match the different fields I actually > have a separate core with an index dedicated to auto suggest > alone where I merge all fields together via some javascript > code: > > This way I can then use terms for a single

RE: query all filled field?

2010-02-03 Thread Frederico Azeiteiro
Hum, strange.. I reindexed some docs with the field corrected. Now I'm sure the field is filled because: "fieldX:(*a*)" returns docs. But "fieldX:[* TO *]" is returning the same as "*:*" (all results) I tried with "-fieldX:[* TO *]" and I get no results at all. I wonder if someone has tried th

Re: wildcards in stopword list

2010-02-03 Thread Lukas Kahwe Smith
On 03.02.2010, at 13:54, Lukas Kahwe Smith wrote: > The issue is that I have multiple fields of data (names, address etc) that > should all be relevant for the auto suggest. Furthermore a "phrase" entered > can either match on one field or any combination of fields. Phrase in this > context me

Re: wildcards in stopword list

2010-02-03 Thread Lukas Kahwe Smith
On 03.02.2010, at 13:41, Ahmet Arslan wrote: >> this way i can do a prefix facet search for the term "foo" >> or "bar" and in both cases i can show the user "Foo Bar" >> with a bit of frontend logic to split off the "payload" aka >> original data. > > So you have a list of phrases (pre-extracted

Re: wildcards in stopword list

2010-02-03 Thread Ahmet Arslan
> this way i can do a prefix facet search for the term "foo" > or "bar" and in both cases i can show the user "Foo Bar" > with a bit of frontend logic to split off the "payload" aka > original data. So you have a list of phrases (pre-extracted) to be used for auto-suggest? Or you are using bi-gra

Re: query all filled field?

2010-02-03 Thread Ahmet Arslan
> Is it possible to query some field in order to get only not > empty > documents? > > > > All documents where field x is filled? Yes. q=x:[* TO *] will bring back documents that have a non-empty x field.

Re: wildcards in stopword list

2010-02-03 Thread Lukas Kahwe Smith
On 03.02.2010, at 13:07, Ahmet Arslan wrote: >> i am doing some funky hackery inside DIH via javascript to >> make my autosuggest work. i basically split phrases and >> store them together with the full phrase: >> >> the phrase: >> "Foo Bar" >> >> becomes: >> >> Foo Bar >> foo bar >> {foo}Foo_

RE: query all filled field?

2010-02-03 Thread Frederico Azeiteiro
Ok, if anyone needs it: I tried fieldX:[* TO *] I think this is correct. In my case I found out that I was not indexing this field correctly because they are all empty. :) -Original Message- From: Frederico Azeiteiro [mailto:frederico.azeite...@cision.com] Sent: quarta-feira, 3 de Fev

Re: wildcards in stopword list

2010-02-03 Thread Ahmet Arslan
> I am wondering if there is some way to maintain a stopword > list with wildcards: > > ignoring anything that starts with "foo": > foo* A custom TokenFilterFactory derived from StopFilterFactory can remove a token if it matches a java.util.regex.Pattern. List of patterns can be loaded from a fi
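
The filtering logic suggested in this reply (drop a token when it matches one of a list of patterns) can be sketched independently of Solr. This is an illustrative Python sketch only; it is not a TokenFilterFactory, and the function names and the one-pattern-per-line format with `#` comments are assumptions.

```python
import re


def compile_patterns(lines):
    """Compile one regular expression per non-empty, non-comment line."""
    patterns = []
    for line in lines:
        text = line.strip()
        if text and not text.startswith("#"):
            patterns.append(re.compile(text))
    return patterns


def remove_stop_tokens(tokens, patterns):
    """Keep only tokens that no pattern matches in full."""
    return [t for t in tokens if not any(p.fullmatch(t) for p in patterns)]


# The wildcard "foo*" from the question, written as the regex "foo.*".
patterns = compile_patterns(["# ignore anything starting with foo", "foo.*"])
print(remove_stop_tokens(["foo", "foobar", "bar", "food"], patterns))  # ['bar']
```

Full-match semantics are used so that a pattern such as `foo.*` removes only tokens that start with "foo", not tokens that merely contain it.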

Re: Indexing an oracle warehouse table

2010-02-03 Thread Alexey Serba
> What would be the right way to point out which field contains the term > searched for. I would use highlighting for all of these fields and then post process Solr response in order to check highlighting tags. But I don't have so many fields usually and don't know if it's possible to configure So

Re: How can I make my solr admin Password Protected

2010-02-03 Thread Erik Hatcher
There's some basic info for Jetty and Resin here: http://wiki.apache.org/solr/SolrSecurity Keep in mind the various URLs that Solr exposes though, so if you aren't protecting /solr completely you'll want to be aware that /update can add/update/delete anything, and so on. Erik On

How can I make my solr admin Password Protected

2010-02-03 Thread Vijayant Kumar
Hi, Can anyone help me: how can I make my Solr admin password protected so that only authorised persons can access it? -- Thank you, Vijayant Kumar Software Engineer Website Toolbox Inc. http://www.websitetoolbox.com 1-800-921-7803 x211

Re: DataImportHandler - convertType attribute

2010-02-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Feb 3, 2010 at 4:16 PM, Erik Hatcher wrote: > > On Feb 3, 2010, at 5:36 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: >> >> On Wed, Feb 3, 2010 at 3:31 PM, Erik Hatcher >> wrote: >>> >>> One thing I find awkward about convertType is that it is JdbcDataSource >>> specific, rather than field-specifi

query all filled field?

2010-02-03 Thread Frederico Azeiteiro
Hi all, Is it possible to query some field in order to get only not empty documents? All documents where field x is filled? Thanks, Frederico

Re: Another basic question

2010-02-03 Thread Ahmet Arslan
> I have got a basic configuration of > Solr up and running and have loaded some > data to experiment with > Dataimport is pulling the expected number of rows in from > my DB view > > If I query for Beekeeping i get one result returned (as > expected) > > If I query for bee - I get no results > s

Another basic question

2010-02-03 Thread Stefan Maric
I have got a basic configuration of Solr up and running and have loaded some data to experiment with Dataimport is pulling the expected number of rows in from my DB view If I query for Beekeeping i get one result returned (as expected) If I query for bee - I get no results similarly for Bee etc

RE: Basic indexing question

2010-02-03 Thread Stefan Maric
Thanks that was it - I've now configured a dismax requesthandler that suits my needs -Original Message- From: Joe Calderon [mailto:calderon@gmail.com] Sent: 03 February 2010 00:20 To: solr-user@lucene.apache.org Subject: Re: Basic indexing question see http://wiki.apache.org/solr/S

Re: DataImportHandler - convertType attribute

2010-02-03 Thread Erik Hatcher
On Feb 3, 2010, at 5:36 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: On Wed, Feb 3, 2010 at 3:31 PM, Erik Hatcher wrote: One thing I find awkward about convertType is that it is JdbcDataSource specific, rather than field-specific. Isn't the current implementation far too broad? it is feature

Lucene User Group Meetup in Amsterdam

2010-02-03 Thread Uri Boness
Hi All, On 17th February we'll host the first Dutch Lucene User Group Meetup. This meet-up will be split into two parts: - The first part will be dedicated to the user group itself. We'll have an introduction to the members and have an open discussion about the goals of the user group and th

Re: DataImportHandler - convertType attribute

2010-02-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, Feb 3, 2010 at 3:31 PM, Erik Hatcher wrote: > One thing I find awkward about convertType is that it is JdbcDataSource > specific, rather than field-specific.  Isn't the current implementation far > too broad? it is feature of JdbcdataSource and no other dataSource offers it. we offer it be

wildcards in stopword list

2010-02-03 Thread Lukas Kahwe Smith
Hi, I am wondering if there is some way to maintain a stopword list with wildcards: ignoring anything that starts with "foo": foo* i am doing some funky hackery inside DIH via javascript to make my autosuggest work. i basically split phrases and store them together with the full phrase: the phr

Re: DataImportHandler - convertType attribute

2010-02-03 Thread Erik Hatcher
One thing I find awkward about convertType is that it is JdbcDataSource specific, rather than field-specific. Isn't the current implementation far too broad? Erik On Feb 3, 2010, at 1:16 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: implicit conversion can cause problem when Transformer

how to stress test solr

2010-02-03 Thread James liu
Before stress testing, should I close SolrCache? Which tool do you use? How do I do a stress test correctly? Any pointers? -- regards j.L ( I live in Shanghai, China)

Re: Deploying Solr 1.3 in JBoss 5

2010-02-03 Thread Luca Molteni
Apparently, that worked! I never realized that the order of the elements in XML is significant, nice to see. As always, problems lead to other problems, so now I'm facing a Xerces ClassCastException with JDK 6. org.jboss.xb.binding.JBossXBRuntimeException: Failed to create a new SAX pars

Re: Solr response extremely slow

2010-02-03 Thread Shalin Shekhar Mangar
On Wed, Feb 3, 2010 at 2:18 PM, Doddamani, Prakash < prakash.doddam...@corp.aol.com> wrote: > Hey > Can any one say which is the latest and stable version, > We are using 1.2 > >Solr Specification Version: 1.2.0 >Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02 17:35:12 >

RE: Solr response extremely slow

2010-02-03 Thread Doddamani, Prakash
Hey Can any one say which is the latest and stable version, We are using 1.2 Solr Specification Version: 1.2.0 Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02 17:35:12 Lucene Specification Version: 2007-05-20_00-04-53 Lucene Implementation Version: build 20

Re: C++ being filtered (please help)

2010-02-03 Thread Ahmet Arslan
> I have a field which may take the form "C++,PHP & > MySql,C#" > now i want to tokenize it based on comma or white space and > other word > delimiting characters only. Not on the plus sign. so that > result after > tokenization should be > C++ > PHP > MySql > C# > > But the result I am getting is