Re: filter query parsing problem

2010-01-20 Thread Shalin Shekhar Mangar
On Tue, Jan 19, 2010 at 3:10 AM, Ahmet Arslan wrote: > > I am submitting a query and it seems > > to be parsing incorrectly. Here > > is the query with the debug output. Any ideas what > > the problem is: > > > > > > > > ((VLog:814124 || VLog:12342) && > > (PublisherType:U || PublisherT

Re: How to backup / dump solr database

2010-01-20 Thread Shalin Shekhar Mangar
On Tue, Jan 19, 2010 at 6:38 PM, jmf wrote: > Hi, > > I'm using solr with the Plone CMS. I have just followed some tutorials, > and I > would like to 'dump' the solr database on production server and make it run > on > my development environment. Both are linux. > > So first the question is : i

AW: Restricting Facet to FilterQuery in combination with mincount

2010-01-20 Thread Chantal Ackermann
Thank you, Chris! That did clarify it. :-) Cheers, Chantal From: Chris Hostetter [hossman_luc...@fucit.org] Sent: Tuesday, 19 January 2010 23:27 To: solr-user@lucene.apache.org Subject: Re: Restricting Facet to FilterQuery in combination with mincount

AW: TermsComponent, multiple fields, total count

2010-01-20 Thread Chantal Ackermann
I find the DismaxRequestHandler perfect for matching multiple fields, matching phrases in other/subset of fields, weighting the different matches. It's powerful and fast. You can define several DismaxRequestHandlers if you want to offer different kinds of "search areas" to the user (e.g. search

Re: Please help: Failing tests

2010-01-20 Thread Shalin Shekhar Mangar
On Wed, Jan 20, 2010 at 2:26 AM, Siv Anette Fjellkårstad wrote: > > I'm trying to run the unit tests from Eclipse. Almost half the tests are > failing, and I don't know what I'm doing wrong. This is what I've done: > > 1. Checked out the code outside Eclipse's workspace > 2. File > New > Project >

Ruby client fails to build

2010-01-20 Thread Siddhant Goel
Hi, I'm using Solr 1.4 (and trying to use the Ruby client (solr-ruby) to access it). The problem is that I just cant get it to work. :-) If I run the tests (rake test), it fails giving me the following output - /path/to/solr-ruby/test/unit/delete_test.rb:52: invalid multibyte char (US-ASCII) /pat

Re: Fastest way to use solrj

2010-01-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
2010/1/20 Tim Terlegård : BinaryRequestWriter does not read from a file and post it >>> >>> Is there any other way or is this use case not supported? I tried this: >>> >>> $ curl /solr/update/javabin -F stream.file=/tmp/data.bin >>> $ curl /solr/update -F stream.body=' ' >>> >>> Solr did read

SV: Please help: Failing tests

2010-01-20 Thread Siv Anette Fjellkårstad
Thank you so much - that helped a lot. Now most of the tests are green, but I still have some failing. One of the failing tests is testMultiThreade and the error message is: Caused by: org.apache.solr.common.SolrException: QueryElevationComponent missing config file: 'elevate.xml either: C:\data\

Re: Ruby client fails to build

2010-01-20 Thread Erik Hatcher
Where are you getting your solr-ruby code from? You can simply "gem install" it to pull in an already pre-built gem. I just ran the tests on trunk, all passed, with the output pasted below. Erik ~/dev/solr/client/ruby/solr-ruby: rake test (in /Users/erikhatcher/dev/solr/client/ruby/s

Re: Ruby client fails to build

2010-01-20 Thread Siddhant Goel
On Wed, Jan 20, 2010 at 4:19 PM, Erik Hatcher wrote: > Where are you getting your solr-ruby code from? You can simply "gem > install" it to pull in an already pre-built gem. > I'm just picking it up from the 1.4 release. I also tried checking out the latest copy from svn, but the results were th

Re: Ruby client fails to build

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 6:32 AM, Siddhant Goel wrote: On Wed, Jan 20, 2010 at 4:19 PM, Erik Hatcher wrote: Where are you getting your solr-ruby code from? You can simply "gem install" it to pull in an already pre-built gem. I'm just picking it up from the 1.4 release. I also tried checking

big index vs. lots of small ones

2010-01-20 Thread Thorsten Scherler
Hi all, I have to do an analysis of the following use case. I am working as a consultant in a public company. We are discussing offering each public institution its own search server in the future, (probably) based on Apache Solr. However, the users of our portal should be able to search all indexes.

Re: filter query parsing problem

2010-01-20 Thread Ahmet Arslan
> If they are really filter queries i.e. specified through > "fq" then they will > not be run through an analyzer. > Does this mean filter queries are not analyzed? The query below returns a document. http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&debugQuery=on

LucidGaze, No Data

2010-01-20 Thread Markus Jelsma
Hello all, I have installed and reconfigured everything according to the readme supplied with the recent LucidGaze release. Files have been written in the gaze directory in SOLR_HOME but the *.log.x.y files are all empty! The rrd directory does contain something that is about 24MiB. In the en

Re: filter query parsing problem

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 8:11 AM, Ahmet Arslan wrote: If they are really filter queries i.e. specified through "fq" then they will not be run through an analyzer. Does this mean filter queries are not analyzed? The query below returns a document. http://localhost:8983/solr/select/?q=*%3A*&ver

Re: filter query parsing problem

2010-01-20 Thread Shalin Shekhar Mangar
On Wed, Jan 20, 2010 at 7:40 PM, Erik Hatcher wrote: > > On Jan 20, 2010, at 8:11 AM, Ahmet Arslan wrote: > > If they are really filter queries i.e. specified through >>> "fq" then they will >>> not be run through an analyzer. >>> >>> >> Does this mean filter queries are not analyzed? The query b

Re: filter query parsing problem

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 9:34 AM, Shalin Shekhar Mangar wrote: On Wed, Jan 20, 2010 at 7:40 PM, Erik Hatcher wrote: On Jan 20, 2010, at 8:11 AM, Ahmet Arslan wrote: If they are really filter queries i.e. specified through "fq" then they will not be run through an analyzer. Does this mean fi

Re: Unstemming after solr.PorterStemFilterFactory

2010-01-20 Thread Bogdan Vatkov
Hi Eric, I think I realize that and I am actually using this - I am using the stemmed, cased etc. token from the stored "term vectors" and additionally I am using the field values. But the fields values are different from the tokens in the level of granularity. When I access the term vector for my

Re: TermsComponent, multiple fields, total count

2010-01-20 Thread Lukas Kahwe Smith
On 19.01.2010, at 22:52, Lukas Kahwe Smith wrote: >>> I also want to match multiple fields at once. >> >> Can you give an example? > > > I enter "Kreuz" but this could either be part of a persons name or of a > street name, which are separate fields in my index mainly because they > analyzed

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
Sorry, I meant completely server-side - even more I want that at indexing time (I do not care about query-time as I am reading later the whole index anyway). On Wed, Jan 20, 2010 at 2:40 AM, Erick Erickson wrote: > Do you mean you want the URLs to be extracted on the client? > If so, no. Filters/

Field collapsing works but is tree modeling possible?

2010-01-20 Thread Kelly Taylor
I'm currently using the latest SOLR-236 patch (12/24/2009) and field-collapsing seems to be giving me the desired results, but I'm wondering if I should focus more on a tree view of my catalog data instead, as described in "Beyond Basic Faceted Search" Could either of the patches for SOLR-792 or

Replication clients logs in solr 1.4

2010-01-20 Thread Jérôme Etévé
Hi All, I'm using the built-in replication with master/slave(s) Solr and the indices are replicating just fine. Just one thing troubles me: Nothing happens in my logs/ directory .. On the slave(s), no logs/snapshot.current file. And on the master, nothing appears in logs/clients/ either. The log

Solr query single entity?

2010-01-20 Thread fredanthony
Hi, I have Solr setup to use a DataImportHandler with my database. In the data-config.xml file I have one document with two entities as follows: Now, my goal is I want certa

Re: big index vs. lots of small ones

2010-01-20 Thread Marc Sturlese
Check out this patch, which solves the distributed IDF problem: https://issues.apache.org/jira/browse/SOLR-1632 I think it fixes what you are explaining. The price you pay is that there are 2 requests per shard. If I am not wrong, the first is to get term frequencies and needed info and the second

Re: Unstemming after solr.PorterStemFilterFactory

2010-01-20 Thread Erick Erickson
Ah, OK. I take the "unnecessary" comment back. If you require the original form of the tokens (not just the original text), then you do have to do something to preserve them, so I think you're on the right track FWIW Erick On Wed, Jan 20, 2010 at 9:38 AM, Bogdan Vatkov wrote: > Hi Eric, > >

Re: Extracting URLs while indexing

2010-01-20 Thread Erick Erickson
I guess it depends on what you mean by "extract". There's nothing that I know of that, say, stores them to a file or separate field, or even does anything special with them. I think StandardTokenizerFactory tries to keep URLs together as a token in the field, but it's just another token... You sho

Re: [1.3] help with update timeout issue?

2010-01-20 Thread Jerome L Quinn
Lance Norskog wrote on 01/16/2010 12:43:09 AM: > If your indexing software does not have the ability to retry after a > failure, you might wish to change the timeout from 20 seconds to, say, > 5 minutes. I can make it retry, but I have somewhat real-time processes doing these updates. Does an

Re: Replication clients logs in solr 1.4

2010-01-20 Thread Jérôme Etévé
Oops. Ok my mistakes. The logs are actually for the solr 1.3 system scripts based distribution only. And the config files synchronize only on change .. J. 2010/1/20 Jérôme Etévé : > Hi All, > > I'm using the build in replication with master/slave(s) Solr and the > indices are replicating just

Re: Unstemming after solr.PorterStemFilterFactory

2010-01-20 Thread Bogdan Vatkov
Thanks! It is good to know I did not do something in vain :) On Wed, Jan 20, 2010 at 6:54 PM, Erick Erickson wrote: > Ah, OK. I take the "unnecessary" comment back. If you require > the original form of the tokens (not just the original text), then you > do have to do something to preserve them,

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
I am not absolutely sure about what I am saying but I think after tokenization I get the URLs as single tokens but with all the "interesting symbols" :) like "/",":" removed from the token. Is it normal? Is there a chance I misconfigured something? Best regards, Bogdan On Wed, Jan 20, 2010 at 7:0

Re: Extracting URLs while indexing

2010-01-20 Thread Erick Erickson
That's really hard to say without seeing your configuration ... If your field has WordDelimiterFactory with the proper catenate options set to one, that'd do it. Can you post the relevant parts of your schema? Erick On Wed, Jan 20, 2010 at 12:46 PM, Bogdan Vatkov wrote: > I am not absolutely s

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
that is the field type: and that is the field def: On Wed, Jan 20, 2010 at 7:53 PM, Erick Erickson wrote: > That's really hard to say without seeing your configuration ... > > If your field has WordDelimiterFactory wi

Need help : Solr configuration issue for sorting on title field

2010-01-20 Thread EL KASMI Hicham
Hello, We have a problem with sorting on title field in Solr instance of our production repository, we get the error message: "HTTP Status 500 - there are more terms than documents in field "titleStr", but it's impossible to sort on tokenized fields". After some googling and searching in this l

Re: Extracting URLs while indexing

2010-01-20 Thread Erick Erickson
You really need to have this page as a handy reference. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters Look in particular at what happens with WordDelimiterFilterFactory, you're breaking your tokens up on non-alpha char
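
To make the effect discussed in this thread concrete, here is a rough Python sketch of why a URL can come out of analysis as a single token with "/" and ":" stripped. It is only a toy approximation of splitting on non-alphanumeric characters and then catenating the parts; it is not WordDelimiterFilterFactory's actual implementation, and it does not model the filter's other behaviours such as splitting on case changes. The function names and the sample URL are made up for illustration.

```python
import re


def split_on_delimiters(token: str) -> list[str]:
    """Split a token on runs of non-alphanumeric characters (toy model)."""
    return [part for part in re.split(r"[^0-9A-Za-z]+", token) if part]


def catenate_all(token: str) -> str:
    """Join the sub-words back together, which drops every delimiter."""
    return "".join(split_on_delimiters(token))


url = "http://example.com/file.txt"  # hypothetical input token
parts = split_on_delimiters(url)
joined = catenate_all(url)
print(parts)
print(joined)
```

If the delimiters need to survive, the sketch suggests the simplest route is to keep that splitting step out of the field's analysis chain, which is what the follow-up in this thread reports trying.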

Re: solr blocking on commit

2010-01-20 Thread Jerome L Quinn
ysee...@gmail.com wrote on 01/19/2010 06:05:45 PM: > On Tue, Jan 19, 2010 at 5:57 PM, Steve Conover wrote: > > I'm using latest solr 1.4 with java 1.6 on linux.  I have a 3M > > document index that's 10+GB.  We currently give solr 12GB of ram to > > play in and our machine has 32GB total. > > > >

Re: solr blocking on commit

2010-01-20 Thread Yonik Seeley
On Wed, Jan 20, 2010 at 2:18 PM, Jerome L Quinn wrote: > This is essentially the same problem I'm fighting with.  Once in a while, > commit > causes everything to freeze, causing add commands to timeout. This could be a bit different. Commits do currently block other update operations such as ad

Re: solr blocking on commit

2010-01-20 Thread Jerome L Quinn
ysee...@gmail.com wrote on 01/20/2010 02:24:04 PM: > On Wed, Jan 20, 2010 at 2:18 PM, Jerome L Quinn wrote: > > This is essentially the same problem I'm fighting with.  Once in a while, > > commit > > causes everything to freeze, causing add commands to timeout. > > This could be a bit different.

Re: Extracting URLs while indexing

2010-01-20 Thread Bogdan Vatkov
Now I see I didn't review all the config that I took from the default config. Removed the WordDelimiterFilter and the StandardTokenizer seems to keep URLs but splits relative paths (e.g. /file/location/file.txt) and I want to keep such as single token. Any ideas? On Wed, Jan 20, 2010 at 8:13 PM, E

filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Tommy Chheng
I'm having trouble doing a filter query on a string field. Any ideas why it's working on dynamic int fields but not dynamic string fields? ex. http://localhost:8983/solr/select?indent=on&version=2.2&q=climate - correct http://localhost:8983/solr/select?version=2.2&q=climate&fq=awardedamounttodat

Re: filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Erik Hatcher
On Jan 20, 2010, at 4:27 PM, Tommy Chheng wrote: I'm having trouble doing a filter query on a string field. Any ideas why it's working on dynamic int fields but not dynamic string fields? ex. http://localhost:8983/solr/select?indent=on&version=2.2&q=climate - correct http://localhost:8983/

Re: filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Rob Casson
> http://localhost:8983/solr/select?indent=on&version=2.2&q=climate&fq=awardinstrument_s:Continuing+grant > Continuing grant everything that erik already mentioned, but looks like you also have a trailing space in the document, so even quoting it would require that last space.

Re: filter querying working on dynamic int fields but not dynamic string fields?

2010-01-20 Thread Tommy Chheng
Thanks, quoting it fixed it. I'm also going to strip the leading/trailing whitespace at index time. Tommy On 1/20/10 1:47 PM, Erik Hatcher wrote: On Jan 20, 2010, at 4:27 PM, Tommy Chheng wrote: I'm having trouble doing a filter query on a string field. Any ideas why it's working on dynami
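
For readers hitting the same problem, a minimal Python sketch of the fix described in this thread: wrap the string-field value in double quotes so the whole value, including spaces, is matched as one term, then URL-encode the parameters. The helper names `phrase_filter` and `select_url` are made up for illustration; the field name and host come from the messages above, and the trailing space in the value mirrors the one Rob Casson pointed out in the indexed document.

```python
from urllib.parse import urlencode


def phrase_filter(field: str, value: str) -> str:
    """Build field:"value", escaping backslashes and embedded double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def select_url(base: str, q: str, fq: str) -> str:
    """Append URL-encoded q and fq parameters to a select handler URL."""
    return base + "?" + urlencode({"q": q, "fq": fq})


# The indexed value has a trailing space, so the quoted value must include it.
fq = phrase_filter("awardinstrument_s", "Continuing grant ")
url = select_url("http://localhost:8983/solr/select", "climate", fq)
print(url)
```

Stripping leading and trailing whitespace at index time, as planned in the message above, removes the need to carry the trailing space in the query.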

Re: Contributors - Solr in Action Case Studies

2010-01-20 Thread Tom Burton-West
Hi Otis, We are using Solr to provide indexing for the full text of 5 million books (about 4-6 terabytes of text). Our index is currently around 3 terabytes distributed over 10 shards with about 310 GB of index per shard. We are using very large Solr documents (about 750MB of tex

Re: Need help : Solr configuration issue for sorting on title field

2010-01-20 Thread Chris Hostetter
: Subject: Need help : Solr configuration issue for sorting on title field : In-Reply-To: : References: : <359a92831001191640v7c063e28y8b3376b71ec3d...@mail.gmail.com> : : <359a92831001200903p73d89754t4ff15140b7ef7...@mail.gmail.com> : http://people.apache.org/~hossman/#thread

Re: build path

2010-01-20 Thread Chris Hostetter
: Subject: build path : References: <219927.42092...@web52905.mail.re2.yahoo.com> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if

Re: solr blocking on commit

2010-01-20 Thread Jerome L Quinn
ysee...@gmail.com wrote on 01/20/2010 02:24:04 PM: > On Wed, Jan 20, 2010 at 2:18 PM, Jerome L Quinn wrote: > > This is essentially the same problem I'm fighting with.  Once in a while, > > commit > > causes everything to freeze, causing add commands to timeout. > > This could be a bit different.

Problems with spellchecker

2010-01-20 Thread Simon Wistow
The spellchecker in my 1.4 install started behaving increasingly erratically and suggestions would only be returned some of the time with the same query. I tried to force a rebuild using spellcheck.build=yes The full request being /select/?q=alexandr the great& indent=on& fl=title& spellchec

Re: Dynamic boosting of ids at search time

2010-01-20 Thread Lance Norskog
http://www.lucidimagination.com/search/document/CDRG_ch04_4.4.4?q=ExternalFileField This lets you make a file with a boost value for every document. You can change the file and reload the new values with a . It hasn't been materially changed since 2007 and there are no unit tests, so it might not

filter query granularity

2010-01-20 Thread Wangsheng Mei
The following 3 search scenarios: > bla:A > bla:B > bla:A OR bla:B > are quite common, so I use 3 filter queries: fq=bla:A fq=bla:B fq=bla:A OR bla:B My question is, since the last fq document set will be built from the first two fq doc sets, will Solr still cache the last fq doc set or it just

Re: Does specifying a smaller number of rows in search improve efficiency?

2010-01-20 Thread Lance Norskog
The data in stored fields that is fetched back is in different files than the index data. So, when you ask for documents you are asking for more disk i/o. The different fields are in different places on the disk, so if you request only 1 out of 20 fields, the query will be slightly faster. I once m

Re: filter query granularity

2010-01-20 Thread Lance Norskog
The docset for "fq=bla:A OR bla:B" has no relation to the other two. Different 'fq' filters are made and cached separately. The first time you search with a filter query, Solr does that query and saves the list of documents matching the search. 2010/1/20 Wangsheng Mei : > The following 3 search se
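
The behaviour described here can be pictured with a small Python toy: a cache keyed by the exact filter query string, where each distinct string is run once and its document set stored on its own. This is only a sketch of the idea in the message, not Solr's filter cache code; `FilterCache`, `toy_query`, and the four-document `DOCS` index are invented for the example.

```python
from typing import Callable, Dict, FrozenSet, Set


class FilterCache:
    """Toy cache: one stored docset per distinct filter query string."""

    def __init__(self, run_query: Callable[[str], Set[int]]) -> None:
        self._run_query = run_query
        self._docsets: Dict[str, FrozenSet[int]] = {}
        self.misses = 0

    def docset(self, fq: str) -> FrozenSet[int]:
        if fq not in self._docsets:
            # First use of this exact string: run the query and keep the result.
            self.misses += 1
            self._docsets[fq] = frozenset(self._run_query(fq))
        return self._docsets[fq]


# Hypothetical index: document id -> value of field "bla".
DOCS = {1: "A", 2: "B", 3: "C", 4: "A"}


def toy_query(fq: str) -> Set[int]:
    """Evaluate 'bla:X' or 'bla:X OR bla:Y' against the toy index."""
    wanted = {clause.split(":", 1)[1] for clause in fq.split(" OR ")}
    return {doc_id for doc_id, value in DOCS.items() if value in wanted}


cache = FilterCache(toy_query)
for fq in ("bla:A", "bla:B", "bla:A OR bla:B", "bla:A"):
    cache.docset(fq)
print(cache.misses)
```

In this toy, the combined filter is computed by running its own query, even though the docsets for `bla:A` and `bla:B` are already cached; only the repeated `bla:A` is served from the cache.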

Solr Analysis Webinar Jan 28, 2010

2010-01-20 Thread Jay Hill
My colleague at Lucid Imagination, Tom Hill, will be presenting a free webinar focused on analysis in Lucene/Solr. If you're interested, please sign up and join us. Here is the official notice: We'd like to invite you to a free webinar our company is offering next Thursday, 28 January, at 2PM Eas

Re: Rounding dates on sort and filter

2010-01-20 Thread Lance Norskog
The precision of the date should not matter that much in the time for the first sort. Lucene makes a pair of arrays for the sorted field, one with each unique date and one with each document number in the index. (Yes, the entire index.) The first array will be shorter when you cut the date precisi
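
As an illustration of the two arrays mentioned above, the following Python sketch builds a sorted array of unique values and a per-document array pointing into it, then shows what rounding timestamps to the day changes. It is a toy model of the description in this message, not Lucene's actual data structure; `build_sort_arrays` and the sample timestamps are assumptions made for the example.

```python
from datetime import datetime


def build_sort_arrays(values):
    """Return (sorted unique values, per-document index into that array)."""
    unique = sorted(set(values))
    position = {value: i for i, value in enumerate(unique)}
    ords = [position[value] for value in values]
    return unique, ords


# Hypothetical index of four documents with second-precision timestamps.
stamps = [
    datetime(2010, 1, 20, 9, 15, 3),
    datetime(2010, 1, 20, 17, 42, 59),
    datetime(2010, 1, 21, 8, 0, 0),
    datetime(2010, 1, 20, 9, 15, 4),
]
full_unique, full_ords = build_sort_arrays(stamps)

# Round each timestamp down to the day before building the arrays.
days = [s.replace(hour=0, minute=0, second=0, microsecond=0) for s in stamps]
day_unique, day_ords = build_sort_arrays(days)

print(len(full_unique), len(day_unique), len(day_ords))
```

The unique-value array shrinks with coarser precision, while the per-document array always has one entry per document in the index.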

Re: Google Commerce Search

2010-01-20 Thread Lance Norskog
The Linux file systems are generally at least twice as fast as the Windows NTFS file system. Solr installations are mostly disk-limited so this will have a major effect. On Tue, Jan 19, 2010 at 12:53 PM, wojtekpia wrote: > > While Solr is functionally platform independent, I have seen much better

Replication Handler Severe Error: Unable to move index file

2010-01-20 Thread Trey
Does anyone know what would cause the following error?: 10:45:10 AM org.apache.solr.handler.SnapPuller copyAFile SEVERE: *Unable to move index file* from: /home/solr/cores/core8/index.20100119103919/_6qv.fnm to: /home/solr/cores/core8/index/_6qv.fnm This occurred a few days back and we notic

RE : Need help : Solr configuration issue for sorting on title field

2010-01-20 Thread EL KASMI Hicham
Sorry Chris and others, this is the first time I'm using a mailing list to ask a question. I'll send my question again in a new, clean message. Thanks for the references. Hicham Original message From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Date: Thu. 21/01/2010 0:12 To: so

Re: Replication Handler Severe Error: Unable to move index file

2010-01-20 Thread Otis Gospodnetic
It's hard to tell without poking around, but one of the first things I'd do would be to look for /home/solr/cores/core8/index.20100119103919/_6qv.fnm - does this file/dir really exist? Or, rather, did it exist when the error happened. I'm not looking at the source code now, but is that really

Re: solr blocking on commit

2010-01-20 Thread Steve Conover
> How solr organized so that search can continue when a commit has closed the > index? > Also, looking at lucene docs, commit causes a system fsync().  Won't search > also > get blocked by the IO traffic generated? ...I'll run iostat too and see if there's anything interesting to report

Re: filter query granularity

2010-01-20 Thread Wangsheng Mei
Thanks for your explanation, it makes a lot sense to me. 2010/1/21 Lance Norskog > The docset for "fq=bla:A OR bla:B" has no relation to the other two. > Different 'fq' filters are made and cached separately. The first time > you search with a filter query, Solr does that query and saves the > l

Re: Fastest way to use solrj

2010-01-20 Thread Tim Terlegård
Yes, it worked! Thank you very much. But do I need to use curl or can I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't use BinaryWriter then I don't know how to do this. /Tim 2010/1/20 Noble Paul നോബിള്‍ नोब्ळ् : > 2010/1/20 Tim Terlegård : > BinaryRequestWriter does not

question

2010-01-20 Thread Daniel Angelov
Is it possible to set a maximum number of indexed documents in Solr? For example, I want to insert at most 5000 documents into Solr; after that, Solr must refuse further inserts.