Re: Retrieve unique documents on a non id field

2012-11-07 Thread Rafał Kuć
Hello! Look at the field collapsing functionality - http://wiki.apache.org/solr/FieldCollapsing It allows you to group documents based on field value, query or function query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > Hi All, > Curren
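A minimal sketch of the kind of request the FieldCollapsing wiki page describes. The host `http://localhost:8983/solr` and the field name `dedup_key` are assumptions made for illustration; `group`, `group.field` and `group.limit` are the standard result-grouping parameters.

```python
from urllib.parse import urlencode


def grouped_query_url(base_url, query, group_field, rows=10):
    """Build a /select URL that collapses results on group_field.

    base_url and group_field are placeholders; adjust them to your setup.
    """
    params = {
        "q": query,
        "group": "true",             # turn result grouping on
        "group.field": group_field,  # one group per distinct field value
        "group.limit": 1,            # keep only the top document per group
        "rows": rows,                # number of groups to return
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical non-id field used only for this example.
url = grouped_query_url("http://localhost:8983/solr", "*:*", "dedup_key")
print(url)
```

With `group.limit=1` each group carries a single document, which is one way to get "unique documents" for a non-id field.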

Jetty Error while testing Solr

2012-11-07 Thread deniz
Hi all, I got a weird error while running tests on my Solr... here is the log for that error: ERROR o.a.solr.servlet.SolrDispatchFilter - null:org.eclipse.jetty.io.EofException at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:154) at org.eclipse.jetty.server.HttpOutp

Re: is it possible to save the search query?

2012-11-07 Thread Romita Saha
Hi, The following is the example; 1st query: http://localhost:8983/solr/db/select/?defType=dismax&&debugQuery=on&q=cashier2&qf=data^2 id&start=0&rows=11&fl=data,id Next query: http://localhost:8983/solr/db/select/?defType=dismax&&debugQuery=on&q=cashier2&qf=data id^2&start=0&rows=11&fl=data

Re: Testing Solr Cloud with ZooKeeper

2012-11-07 Thread Otis Gospodnetic
You didn't ask about this, but you'll want an odd number of zookeeper nodes. Think voting. Otis -- Performance Monitoring - http://sematext.com/spm On Nov 7, 2012 4:43 PM, "darul" wrote: > Yes instanceDir attribute point to new created core (with no conf dir) so > it > is stranged... > > but loo
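The "think voting" remark can be made concrete with a little arithmetic. This is only a sketch of majority-quorum math, not ZooKeeper code; the function names are made up for the example.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size and tolerated failures side by side.
for n in range(1, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

An ensemble of 4 tolerates the same single failure as an ensemble of 3, and 6 tolerates the same two failures as 5, so the extra even-numbered node buys no additional fault tolerance.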

Re: Questions about schema.xml

2012-11-07 Thread Prithu Banerjee
Those two values specify the analyzer type you want. There are two kinds: one for indexing, which analyzes the input documents to build the index, and one for querying, which analyzes your query. Typically the specified analyzer for index and qu

Re: is it possible to save the search query?

2012-11-07 Thread Otis Gospodnetic
Hi, Compare in what sense? An example will help. Otis -- Performance Monitoring - http://sematext.com/spm On Nov 7, 2012 8:45 PM, "Romita Saha" wrote: > Hi All, > > Is it possible to record a search query in solr and then compare it with > the previous search query? > > Thanks and regards, > R

Questions about schema.xml

2012-11-07 Thread johnmunir
Hi, Can someone help me understand the meaning of and in schema.xml, how they are used, and what I get back when the values are not the same? For example, given: If I make the

Re: SolrCloud, Zookeeper and Stopwords with Umlaute or other special characters

2012-11-07 Thread Lance Norskog
You can debug this with the 'Analysis' page in the Solr UI. You pick 'text_general' and then give words with umlauts in the text box for indexing and queries. Lance - Original Message - | From: "Daniel Brügge" | To: solr-user@lucene.apache.org | Sent: Wednesday, November 7, 2012 8:45:4

Re: Sorting by boolean function

2012-11-07 Thread Indika Tantrigoda
Hi, Yes, the new sort query worked! Thank you for the quick response. Thanks, Indika On 8 November 2012 04:31, Rafał Kuć wrote: > Hello! > > Ahhh, sorry. I see now. The function query you are specifying for the > sort is not proper. Solr doesn't understand that part: > > exists(start_time:[*

Re: Sorting by boolean function

2012-11-07 Thread Rafał Kuć
Hello! Ahhh, sorry. I see now. The function query you are specifying for the sort is not proper. Solr doesn't understand that part: exists(start_time:[* TO 1721] AND end_time:[1721 TO *]) try something like that: sort=if(exists(query($innerQuery)),100,0)+desc&innerQuery={!edismax}(start_time:[*

Re: Sorting by boolean function

2012-11-07 Thread Indika Tantrigoda
Thanks for the response, I did append the sort order but the result was the same, Can't determine a Sort Order (asc or desc) in sort spec 'if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0) asc', pos=23 Thanks, Indika On 8 November 2012 04:02, Rafał Kuć wrote: > Hello! > > Fro

Re: Deleting an individual document while delta index is running

2012-11-07 Thread Josh Turmel
Okay, thanks for the help guys... I *think* that this can be resolved by kicking off the delta and passing optimize=false since the default was true in 3.3. I'll post back if I see the issue pop back up. JT On Wednesday, November 7, 2012 at 1:34 PM, Josh Turmel wrote: > Here's what we have s

Re: Sorting by boolean function

2012-11-07 Thread Rafał Kuć
Hello! From looking at your sort parameter it seems that you are missing the order of sorting. The function query will return a value on which Solr will sort and Solr needs to know if the documents should be sorted in descending or ascending order. Try something like that: sort=if(exists(start_ti

Re: Testing Solr Cloud with ZooKeeper

2012-11-07 Thread darul
Yes, the instanceDir attribute points to the newly created core (with no conf dir), so it is strange... but it looks like I have played around too much when starting the main Solr shard. I will try everything again tomorrow and give you feedback. -- View this message in context: http://lucene.472066.n3.nabble.com/Testin

New solrcloud deployment, no results

2012-11-07 Thread Jeff Rhines
I have a cluster of 6 shards of Solr 4.0.0 deployed, one machine each, with no replicas, and another single machine running a zookeeper ensemble of 5. Using python sunburnt, I submit six documents with separate ids and populated text fields and commit them. No errors are reported. When I search

Re: Testing Solr Cloud with ZooKeeper

2012-11-07 Thread Erick Erickson
Right. Solr uses zookeeper only for configuration information. The index resides on the machines running solr. bq: In my solr instances directory tree, /solr/mycollection/ sometimes I have an "index" or "index.20121107185908378" You can configure Solr to keep snapshots of indexes around under co

Re: Testing Solr Cloud with ZooKeeper

2012-11-07 Thread darul
I reply to myself: darul wrote > Few questions: > - both of my zookeeper instances have their own data directory, as usual, but I did not > see much change inside after indexing the example docs. Is data stored > there, or is just > configuration (conf files) > stored in the zookeeper ensemble? Can

Testing Solr Cloud with ZooKeeper

2012-11-07 Thread darul
Hello everyone, Having used *Hadoop* (not in charge of deployment, just java code part) and *Solr 3.6* (deployment and coding) this year, today I made the solr cloud wiki. Well, * I have deployed 2 zookeeper (not embedded) instances * 2 solr instances with 2 shards (pointing to zookeeper nodes)

Re: custom request handler

2012-11-07 Thread Amit Nithian
Why not do this in a ServletFilter? Alternatively, I'd just write a front end application servlet to do this so that you don't firewall your internal admins off from accessing the core Solr admin pages. I guess you could solve this using some form of security but I don't know this well enough. If

Re: Deleting an individual document while delta index is running

2012-11-07 Thread Josh Turmel
Here's what we have set in our data-config.xml Thanks, Josh Turmel On Wednesday, November 7, 2012 at 1:00 PM, Shawn Heisey wrote: > On 11/7/2012 10:55 AM, Otis Gospodnetic wrote: > > Hi Shawn, > > > > Is the last part really correct? Optimization should be doable while > > updates are goin

Re: searching camel cased terms with phrase queries

2012-11-07 Thread Dmitry Kan
Hi Jack, It seems all the html tables in my original e-mail have been broken, sorry about that. Thanks for the ideas. You're very right that a compound word recognition would be the way to pursue. Most probably this feature is domain dependent, so implementing something generic might not satisfy

Re: Deleting an individual document while delta index is running

2012-11-07 Thread Shawn Heisey
On 11/7/2012 10:55 AM, Otis Gospodnetic wrote: Hi Shawn, Is the last part really correct? Optimization should be doable while updates are going on... or am I missing something? From what I recall when I was first putting my build system together, which I will admit was on Solr 1.4.0, I could

Re: Two questions about solrcloud

2012-11-07 Thread Tomás Fernández Löbbe
A transaction log file is opened after a hard commit and it stays open until a new hard commit is issued. Old tlogs are deleted, but always with the premise of keeping enough updates "lines" of tlog to allow peer-sync (currently, hardcoded to 100 updates). The recovery is made using the last tlog (

RE: [SOLR-2549] DIH LineEntityProcessor support for delimited & fixed-width files

2012-11-07 Thread Dyer, James
Try specifying the "escape" parameter. This is the character your file uses to escape delimiters occurring in the data. If this fixes your problem and if you do not want to have to specify "escape", you can alter the patch, line 113: change: (escape == '\\') to: (escape !=null && escape == '\\'

Re: get a list of terms sorted by total term frequency

2012-11-07 Thread Edward Garrett
i see... using the -t flag it would be cool if TermsComponent had an option to sort by total term frequency, something like terms.sort={count|index|ttf} surely that's a common enough use case On Wed, Nov 7, 2012 at 6:17 PM, Michael McCandless wrote: > Lucene's misc module has HighFreqTerms to

Subscription reg

2012-11-07 Thread sachin gk
Hi, Please confirm my subscription. Regards, Sachin

Re: Two questions about solrcloud

2012-11-07 Thread Otis Gospodnetic
Hi On Wed, Nov 7, 2012 at 7:49 AM, Tomás Fernández Löbbe wrote: > > 1.If I have a solrcloud cluster with two shards and 0 replica on two > > different server. > > when one of server restarts will the solr instance on that server replay > > the transaction log to make sure these operations pers

Re: get a list of terms sorted by total term frequency

2012-11-07 Thread Michael McCandless
Lucene's misc module has HighFreqTerms tool. Mike McCandless http://blog.mikemccandless.com On Wed, Nov 7, 2012 at 1:15 PM, Edward Garrett wrote: > hi, > > is there a simple way to get a list of all terms that occur in a field > sorted by their total term frequency within that field? > > Terms

get a list of terms sorted by total term frequency

2012-11-07 Thread Edward Garrett
hi, is there a simple way to get a list of all terms that occur in a field sorted by their total term frequency within that field? TermsComponent (http://wiki.apache.org/solr/TermsComponent) "provides fast field faceting over the whole index", but as counts it gives the number of documents that e

Re: Deleting an individual document while delta index is running

2012-11-07 Thread Otis Gospodnetic
Hi Shawn, Is the last part really correct? Optimization should be doable while updates are going on... or am I missing something? Otis -- Search Analytics - http://sematext.com/search-analytics/index.html Performance Monitoring - http://sematext.com/spm/index.html On Wed, Nov 7, 2012 at 12:17

Re: Deleting an individual document while delta index is running

2012-11-07 Thread Shawn Heisey
On 11/6/2012 2:01 PM, Josh Turmel wrote: Running Solr 3.3 We're running into issues where deleting individual documents (by ID) will time out, but it only seems to happen when our hourly delta index is being run to pull in new documents. Is there a way to work around this? Are you experiencing r

SolrCloud, Zookeeper and Stopwords with Umlaute or other special characters

2012-11-07 Thread Daniel Brügge
Hi, I am running a SolrCloud cluster with version 4.0.0. I have a stopwords file which is in the correct encoding. It contains German umlauts, e.g. 'ü'. I am also running a standalone ZooKeeper which contains this stopwords file. In my schema I am using the stopwords file in the standard w

Re: Nested Join Queries

2012-11-07 Thread Gerald Blanck
Thank you Erick for your reply. I understand that search is not an RDBMS. Yes, we do have a huge combinatorial explosion if we de-normalize and duplicate data. In fact, I believe our use case is exactly what the Solr developers were trying to solve with the addition of the Join query. And while

Re: searching camel cased terms with phrase queries

2012-11-07 Thread Jack Krupansky
This is one of those areas of Solr where you can refine and make improvements, as you have done, but never actually reach 100% satisfaction. And, in some cases, as here, you have a choice of settings and no single combination covers all cases. In this case, you really need compound-term recogn

Re: Stemmer German2

2012-11-07 Thread André Widhani
No - you need to restart Solr to pick up the changes to the schema and you need to re-index the existing documents. Regards, André From: Andreas Niekler [aniek...@informatik.uni-leipzig.de] Sent: Wednesday, 7 November 2012 16:40 To: solr-user@lucene.ap

Re: About solr fields (dynamic query)

2012-11-07 Thread Luis Cappa Banda
Hello! In my opinion, you are trying to use Solr as a complete, classic database, but Solr works best as a fast search layer over your data. Thus, you should index only the data that is a candidate for searching. My personal suggestion is that you think about how

Re: [SOLR-2549] DIH LineEntityProcessor support for delimited & fixed-width files

2012-11-07 Thread zakaria benzidalmal
James, Thank you for your quick answer. This is my data-config.xml file: Am I doing something wrong? Regards. __ Zakaria BENZIDALMAL mobile: 06 31 40 04 33 2012/11/7 Dyer, James > Zakaria, > > You might wa

Re: About solr fields (dynamic query)

2012-11-07 Thread Jack Krupansky
With Solr, you would need to pre-compute any computations in query expressions and index them for direct matching. So, you would need a field which was indexed with the sum of the values of the other two fields. -- Jack Krupansky -Original Message- From: blopez Sent: Wednesday, Novem
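A minimal sketch of the pre-computation idea, done on the client before documents are sent to Solr. The field names `A`, `B` and `C` come from the question in this thread; the derived field name `A_plus_B` and the helper function are assumptions made for illustration, and the final filter only simulates locally what the stored sum makes possible.

```python
def add_sum_field(doc, left="A", right="B", target="A_plus_B"):
    """Return a copy of doc with a derived field holding left + right.

    The target field name is hypothetical; it would also have to be
    declared in schema.xml before indexing.
    """
    enriched = dict(doc)
    enriched[target] = doc[left] + doc[right]
    return enriched


docs = [
    {"id": "1", "A": 2, "B": 3, "C": 5},
    {"id": "2", "A": 1, "B": 1, "C": 7},
]

# Enrich every document before it is posted to the index.
indexed = [add_sum_field(d) for d in docs]

# Local stand-in for the comparison the SQL query expressed as A + B = C.
matches = [d["id"] for d in indexed if d["A_plus_B"] == d["C"]]
print(matches)
```

The point of the design is that the arithmetic moves from query time to index time, so the query only has to match stored values.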

RE: [SOLR-2549] DIH LineEntityProcessor support for delimited & fixed-width files

2012-11-07 Thread Dyer, James
Zakaria, You might want to post your data-config.xml, or at least the part that uses SOLR-2549. If it's throwing an NPE, it certainly has a bug (if you're doing something wrong, it would at least give you a sensible error message). Also, unless you need to use DIH for some other reason, you m

Re: Stemmer German2

2012-11-07 Thread Andreas Niekler
Hello, thanks for the advice. If I now change the schema so that my lowercase factory comes before the stemmer, does the index update itself after the change? How could I achieve this? I stored all values within the index. Thanks, Andreas On 07.11.2012 10:47, André Widhani wrote: Do you use the

Re: Matching an exact phrase in a text field

2012-11-07 Thread Christopher Gross
I do have the omit positions turned on...d'oh. So if I cut those out, then the region (and other similar fields) should work correctly? Thanks Jack! -- Chris On Wed, Nov 7, 2012 at 10:31 AM, Jack Krupansky wrote: > Try the Solr Admin Analysis page to see how your phrase and input text are > a

Re: Matching an exact phrase in a text field

2012-11-07 Thread Jack Krupansky
Try the Solr Admin Analysis page to see how your phrase and input text are actually being analyzed. Also add &debugQuery=true to your query and see what the "parsed" query looks like. And check to see that "North" and "America" are not mentioned in either your stop words or synonyms. Also, che

Re: omitTermFreq only?

2012-11-07 Thread Jack Krupansky
Please clarify. Be specific, with details. -- Jack Krupansky -Original Message- From: anasalkouz Sent: Wednesday, November 07, 2012 5:19 AM To: solr-user@lucene.apache.org Subject: Re: omitTermFreq only? Unfortunately, in Solr 4 all this omitTermFreqAndPositions not working probably

Re: Exponential omitNorms

2012-11-07 Thread Walter Underwood
You are probably thinking of SweetSpotSimilarity. You might also want to look at pivoted document normalization. wunder On Nov 7, 2012, at 4:22 AM, Dotan Cohen wrote: > On Wed, Nov 7, 2012 at 2:01 PM, Otis Gospodnetic > wrote: >> The name escapes me but Lucene used to have a special Similarity

[SOLR-2549] DIH LineEntityProcessor support for delimited & fixed-width files

2012-11-07 Thread zakaria benzidalmal
Hi all, Could someone provide a clear example using this Processor (data-config.xml example)? I run into this problem after patching and building my code: GRAVE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerExcept

Re: New Index directory regardless of Solr.xml

2012-11-07 Thread Erick Erickson
Well, you have a couple of issues here: 1> the node's shard ID will have to be guaranteed to be the same as it was when the shard was created. 2> you should be able to just copy the index directory to wherever your node is looking for it now. or 3> instanceDir should point to the directory that has

Re: omitTermFreq only?

2012-11-07 Thread anasalkouz
Unfortunately, in Solr 4 omitTermFreqAndPositions does not seem to work properly -- View this message in context: http://lucene.472066.n3.nabble.com/omitTermFreq-only-tp3167128p4018733.html Sent from the Solr - User mailing list archive at Nabble.com.

About solr fields (dynamic query)

2012-11-07 Thread blopez
Hi all, I'm facing some problems with solr fields at query time. Let's see a simplified example. I have the fields A, B and C. In a relational DB, it's possible to launch a (let's say dynamically) query: SELECT * FROM wherever WHERE wherever.A + wherever.B = wherever.C I'm trying to do this in S

Re: Two questions about solrcloud

2012-11-07 Thread Tomás Fernández Löbbe
> 1.If I have a solrcloud cluster with two shards and 0 replica on two > different server. > when one of server restarts will the solr instance on that server replay > the transaction log to make sure these operations persistent to the index > files(commit the transaction log)? > Yes, the Solr i

Re: Exponential omitNorms

2012-11-07 Thread Dotan Cohen
On Wed, Nov 7, 2012 at 2:01 PM, Otis Gospodnetic wrote: > The name escapes me but Lucene used to have a special Similarity impl for > this sort of stuff. I think it's still there. > > We implemented a slightly better Similarity that used Gaussian distribution > and was thus smoother. Try doing tha

Re: Exponential omitNorms

2012-11-07 Thread Otis Gospodnetic
The name escapes me but Lucene used to have a special Similarity impl for this sort of stuff. I think it's still there. We implemented a slightly better Similarity that used Gaussian distribution and was thus smoother. Try doing that. Otis -- Performance Monitoring - http://sematext.com/spm On No

Re: Two questions about solrcloud

2012-11-07 Thread Otis Gospodnetic
Hi, Somebody else may answer 1. 2. Yes you can remove 4th server if you have only 3 shards. I'm not 100% certain if solr will never put 2 shards on the same server, so check what's on the server before shutting it down. Otis -- Performance Monitoring - http://sematext.com/spm On Nov 7, 2012 1:41

Re: Latin characters encoding. Example: letter "ñ".

2012-11-07 Thread Luis Cappa Banda
Latest news! It was simply a misspelling in Tomcat's server.xml. I specified "utf8" instead of "UTF-8". After fixing that, the problem was solved and everything is O.K. However, I hope that this thread will be useful for someone, because these kinds of Latin encoding problems are very common. Goodbye!

Re: migrating from solr3 to solr4

2012-11-07 Thread Carlos Alexandro Becker
It works, thank you very much. On Wed, Nov 7, 2012 at 6:49 AM, Stefan Matheis wrote: > Hey Carlos > > This sounds a bit like a change which was made in SOLR-1770: > > Index: example/solr/solr.xml > === > --- example/solr/solr.xml (r

Latin characters encoding. Example: letter "ñ".

2012-11-07 Thread Luis Cappa Banda
Hello! I've got some encoding problems with my new analyzer configuration. I've deployed a Solr server in Apache Tomcat, setting Tomcat's encoding to UTF-8 in server.xml. Solr's encoding is also set to UTF-8 in schema.xml. I have defined a fieldType like the following: ** * * * *

Re: Stemmer German2

2012-11-07 Thread André Widhani
Do you use the LowerCaseFilterFactory filter in your analysis chain? You will probably want to add it and if you already have, make sure it is _before_ the stemming filter so you get consistent results regardless of lower- or uppercase spelling. You can protect words from being subject to stemmi
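Why the filter order matters can be shown with a toy model. This is not the German2 stemmer: `toy_stem` is a stand-in written for this sketch, and its rule (strip a trailing "e" only from all-lowercase tokens) is an assumption chosen to mimic the elbe/Elbe behaviour reported in this thread.

```python
def toy_stem(token):
    """Toy stand-in for a stemmer that only recognises lowercase input."""
    if token.islower() and token.endswith("e"):
        return token[:-1]
    return token


def stem_then_lowercase(token):
    """Chain with the stemmer placed before the lowercase filter."""
    return toy_stem(token).lower()


def lowercase_then_stem(token):
    """Chain with the lowercase filter placed before the stemmer."""
    return toy_stem(token.lower())


# Compare both chain orders on the two spellings from the thread.
for word in ("elbe", "Elbe"):
    print(word, stem_then_lowercase(word), lowercase_then_stem(word))
```

In the first chain the two spellings end up as different index terms, while lowercasing first maps both to the same term, which is the consistency the advice above is after.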

Stemmer German2

2012-11-07 Thread Andreas Niekler
Dear List, I have an unwanted behavior with the German2 stemmer. For example, the river Elbe: if I input elbe, the word gets reduced to elb. If I input Elbe, everything is OK and elbe is stored in the index. If I now query for elbe or Elbe I of course get different results allowing the users

scale with 3 shards on single server

2012-11-07 Thread SuoNayi
Hi all, Because we cannot add or remove shards once a SolrCloud cluster has been set up, we have to predict the number of shards up front; say we need 3 shards. But right now we do not have enough servers to set up the cluster with one shard per server. In this situation, I have to set up the cluster

Re: migrating from solr3 to solr4

2012-11-07 Thread Stefan Matheis
Hey Carlos This sounds a bit like a change which was made in SOLR-1770: Index: example/solr/solr.xml === --- example/solr/solr.xml (revision 909013) +++ example/solr/solr.xml (working copy) @@ -29,6 +29,6 @@ If 'null' (or absent), co

Indexing text files in Solr

2012-11-07 Thread sharat89
Hey, I am new to Solr and Java. I am working on a project that involves the use of both. I want to learn the complete process of indexing a text file using Solr. Also, I'd like to learn how to interpret the results. I am looking for outputs like "number of occurrences of a particular word" . P