Re: Per-segment faceting and soft commits

2012-11-26 Thread Mikhail Khludnev
Jonathan, I wonder why softCommit does not create a new segment? I thought it doesn't call flush/fsync for a new segment. Can you give a reference to your explanation? Then, I guess, as far as "fieldValueCache" is not a Solr cache but a monitoring wrapper on top of the Lucene per-segment cache, what you see

Re: Search among multiple cores

2012-11-26 Thread Nicholas Ding
Hi Otis, Thank you so much, that's exactly what I need! Thanks Nicholas On Mon, Nov 26, 2012 at 10:28 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Would http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer save you some > work? > > Otis > -- > SOLR Performance Monitoring - http:/

Re: Search among multiple cores

2012-11-26 Thread Amit Nithian
You can simplify your code by searching across cores in the SearchComponent: 1) public class YourComponent implements SolrCoreAware --> Grab instance of CoreContainer and store (mCoreContainer = core.getCoreDescriptor().getCoreContainer();) 2) In the process method: * grab the core requested (SolrC

SolrCloud Performance - Indexing

2012-11-26 Thread deniz
As I am kind of confused, I want to check whether anyone else has the same confusion as me about SolrCloud. I have set up an environment with 3 Solr instances and 2 ZooKeepers, and tried to index some documents from a MySQL db. The total number of docs is around 3.5M. Before indexing I was expecting

SolrCloud(5x) - Errors while recovering

2012-11-26 Thread deniz
Here is briefly what is happening: I have a simple SolrCloud environment for test purposes, running with a zookeeper ensemble, not the ones embedded in Solr. I have 3 instances in the cloud, all of them are using RAMDirectory (which is enabled by new Solr release to use with cloud) After running

Per-segment faceting and soft commits

2012-11-26 Thread Jonathan Acheson
Using solr 4.0, I have 2 fields that I'm faceting on using facet.method=fcs. After the first facet query I can see the two fields in the fieldValueCache. If I then issue a soft commit at /update/json?softCommit=true and again look at the fieldValueCache the two entries are now gone. I was thinking

Re: Skewed IDF in multi lingual index

2012-11-26 Thread Robert Muir
Hi again Markus. Sorry for the slow reply here. I'm confused: are you saying the score goes negative? Are you sure there are no 3.x segments? Can you check that docCount is not -1? Do you happen to have a test, can you share your modified similarity, or give more details? I just want to make sure

Re: Search among multiple cores

2012-11-26 Thread Otis Gospodnetic
Would http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer save you some work? Otis -- SOLR Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Mon, Nov 26, 2012 at 7:18 PM, Nicholas Ding wrote: > Hi, > > I'm workin

Re: Weird negative query responses

2012-11-26 Thread Chris Hostetter
: Without knowing anything about how Solr is configured, I would guess that it : is because of a default operator of "OR" making it so that any of those filter : clauses will match. Give the following filter a try: that shouldn't matter -- regardless of the default operator the "-" in front of t

Re: Find the matched field in each matched document

2012-11-26 Thread Chris Hostetter
: Assume I have documents like this: : {"title": "Robert De Niro", "actors": []} : {"title": "ronin", "actors": ["robert de niro", "jean reno"]} : {"title": "casino", "actors": ["robert de niro", "Joe Pesci"]} ... : Now after search for "robert de niro" in both "title" and "Actors", : I wi

Re: Solr 4, optimizing while doing other updates?

2012-11-26 Thread Walter Underwood
Normal merges expunge deletes. You do not need to force a merge. Once per hour is almost certainly way too often. Before I used Solr, I worked on the Ultraseek team for nine years. Ultraseek had the same merging strategy, with a force merge option. I've worked with many, many customers on t

Re: positions and qf parameter in (e)dismax

2012-11-26 Thread Chris Hostetter
: We do not want to store positions for some fields or omit term and : positions (or just tf) for other fields. Obviously we don't need/want : explicit phrase matching on the fields we want to configure without : positions, but (e)dismax doesn't let us. All text fields configured in : the QF p

Re: Solr 4, optimizing while doing other updates?

2012-11-26 Thread Shawn Heisey
On 11/26/2012 5:56 PM, Walter Underwood wrote: You can optimize during updates, but you should not optimize at all, especially if you are doing continuous updates. Hands off that knob. I promise I'm not optimizing just because it's got a cool name, or because a README/HOWTO said to do it. I o

Re: High Slave CPU Intermittently After Replication

2012-11-26 Thread Shawn Heisey
On 11/26/2012 2:41 PM, richardg wrote: We started having high load again a few times today. Each time, looking at the SPM monitor, our filter cache starts having high lookups and a low hit rate. This is our filterCache setting: would it possibly be too slow? Here is the graph when it happens:

Re: Solr 4, optimizing while doing other updates?

2012-11-26 Thread Walter Underwood
You can optimize during updates, but you should not optimize at all, especially if you are doing continuous updates. Hands off that knob. wunder On Nov 26, 2012, at 4:48 PM, Shawn Heisey wrote: > For Solr 4.0 and higher, is it possible to optimize the index while other > updates are happening?

Solr 4, optimizing while doing other updates?

2012-11-26 Thread Shawn Heisey
For Solr 4.0 and higher, is it possible to optimize the index while other updates are happening? Based on some behavior I just saw, I think it might be. I ran a full-import using DIH -- six index shards with 13 million records each and a seventh shard (hot shard) with 317000. On a few of tho

Odd casting error in embedded Jetty container

2012-11-26 Thread Mark Bennett
I'm sure this is some config error, but I've checked a lot of things and not finding much on Google. Not sure if it's a jvm, maven, jetty or solr config issue. I'm running with a custom Solr 4.0.0 app under Jetty 8.1.2, with a clone of the example directory tree. The main error seems to be:

Re: Suggester with punctuation signs

2012-11-26 Thread Upayavira
You may want to change your tokenisation anyhow, as a search for 'universidad' will not match your term 'universidad,' But you are on the right track - to improve suggestions, improve what is in your index. Upayavira On Mon, Nov 26, 2012, at 07:54 PM, Jorge Luis Betancourt Gonzalez wrote: > Hi:

Re: High Slave CPU Intermittently After Replication

2012-11-26 Thread richardg
We started having high load again a few times today. Each time, looking at the SPM monitor, our filter cache starts having high lookups and a low hit rate. This is our filterCache setting: would it possibly be too slow? Here is the graph when it happens:

Re: Searchers, threads and performance

2012-11-26 Thread Chris Hostetter
: Our number one problem: Doing a commit from loading records, which can : happen throughout the day, makes all queries stop for 5-7 seconds. : This is a showstopper for deployment. Best guess: your queries rely on the FieldCache in some way (either sorting or faceting) and you aren't doing an

Suggester with punctuation signs

2012-11-26 Thread Jorge Luis Betancourt Gonzalez
Hi: I've configured my solr setup to use the suggester component and to get terms suggestions from a PHP application, the thing is that I'm getting results like universidad, note the punctuation sign, is there any way I can get rid of this? Or do I need to create a separate field and strip all

Re: error while doing partial update using curl

2012-11-26 Thread Chris Hostetter
: i tried issuing a command using curl with xml syntax and it turns out that it : replace my whole documents rather than updating a specific field this is : what i gave, i got an impression providing update=set will only changes that : field rather than reindexing the entire document. Any idea how

Re: error while doing partial update using curl

2012-11-26 Thread Darniz
i tried issuing a command using curl with xml syntax and it turns out that it replace my whole documents rather than updating a specific field this is what i gave, i got an impression providing update=set will only changes that field rather than reindexing the entire document. Any idea how to issue

Re: error while doing partial update using curl

2012-11-26 Thread Darniz
Sorry for urgency, but i tried many different things i would appreciate if anyone can provide solution for this. -- View this message in context: http://lucene.472066.n3.nabble.com/error-while-doing-partial-update-using-curl-tp4022313p4022408.html Sent from the Solr - User mailing list archive

Re: Spellchecker for multiple sites (and languages?)

2012-11-26 Thread André Schild
Ok, thanks André On 26.11.2012 18:45, Dyer, James wrote: The Lucene spellcheckers just look at each word in isolation, which is what the extended results are reporting on. So when using "maxCollationTries", etc, this information becomes less useful. It's when Solr tries to put these words

RE: Spellchecker for multiple sites (and languages?)

2012-11-26 Thread Dyer, James
The Lucene spellcheckers just look at each word in isolation, which is what the extended results are reporting on. So when using "maxCollationTries", etc, this information becomes less useful. It's when Solr tries to put these words together into a meaningful collation that you get a good query

Re: is there a way to prevent abusing rows parameter

2012-11-26 Thread Amit Nithian
If you're going to validate the rows parameter, you may as well validate the start parameter too. I've run into problems where start and rows with ridiculously high values crash our servers. On Thu, Nov 22, 2012 at 9:58 AM, solr-user wrote: > Thanks guys. This is a problem with the front end not v
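A minimal sketch of the validation suggested in this message, written in Python. The limits MAX_ROWS and MAX_START and the helper name clamp_paging are assumptions for illustration, not Solr settings; the idea is only that the front end bounds both paging parameters before a request reaches Solr.

```python
# Hypothetical limits; pick values that fit your own deployment.
MAX_ROWS = 100
MAX_START = 10_000


def clamp_paging(start, rows, default_rows=10):
    """Return (start, rows) coerced to integers within safe bounds."""
    try:
        start = int(start)
    except (TypeError, ValueError, OverflowError):
        start = 0  # unparsable start falls back to the first page
    try:
        rows = int(rows)
    except (TypeError, ValueError, OverflowError):
        rows = default_rows  # unparsable rows falls back to the default page size
    # Bound both values from below (no negatives) and from above.
    start = min(max(start, 0), MAX_START)
    rows = min(max(rows, 0), MAX_ROWS)
    return start, rows
```

Clamping silently is one design choice; a front end could equally reject out-of-range values with an error so that callers notice the problem.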

Re: Spellchecker for multiple sites (and languages?)

2012-11-26 Thread André Schild
On 26.11.2012 16:32, Markus Jelsma wrote: Hi - check the new spellchecker collate options. It limits spellchecker suggestions to the fq restrictions. If you filter on specific hosts, the spellchecker will only provide suggestions that are found in that host. Same goes for language. http://w

Re: Find the matched field in each matched document

2012-11-26 Thread Mikhail Khludnev
agh... forgot to mention pivot facets are also close to what you are looking for http://wiki.apache.org/solr/SimpleFacetParameters#Pivot_.28ie_Decision_Tree.29_Faceting Good luck. On Mon, Nov 26, 2012 at 6:21 PM, Alireza Salimi wrote: > Hi Mikhail, > > Thanks for the reply, I have a feeling that

Re: From Solr3.1 to SolrCloud

2012-11-26 Thread Mark Miller
On Mon, Nov 26, 2012 at 9:40 AM, roySolr wrote: > Mark: I'm using a separate zookeeper instance. I don't use the embedded zk in > solr. Doesn't matter either way. Clear deletes whole directories. -- - Mark

Re: Problem with migration to SolrAdaptersForLuceneSpatial4

2012-11-26 Thread David Smiley (@MITRE.org)
Hi Viacheslav, 1. You don't need JTS unless you're using polygons or WKT and your example uses neither. So you can remove the spatialContext attribute to use the default, and remove the JTS jar. But that shouldn't be related to your reported problem. 2. The units for d= in the circle are in de

RE: MultiValued facet behavior question

2012-11-26 Thread nicopost
Thanks Robert, your code helped me solve a problem I had! Saved me a lot of time & headaches cheers, Nico -- View this message in context: http://lucene.472066.n3.nabble.com/MultiValued-facet-behavior-question-tp3093851p4022375.html Sent from the Solr - User mailing list archive at Nabble.com

RE: Spellchecker for multiple sites (and languages?)

2012-11-26 Thread Dyer, James
Also see this recent mail list thread for an explanation how you can set up a master dictionary with everything in it but only get valid spell suggestions returned: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201211.mbox/%3c8f0d0142ca7ecc4287a9ec1bd8cb880c182dd3f...@uslvdcmbvp01.in

RE: Spellchecker for multiple sites (and languages?)

2012-11-26 Thread Markus Jelsma
Hi - check the new spellchecker collate options. It limits spellchecker suggestions to the fq restrictions. If you filter on specific hosts, the spellchecker will only provide suggestions that are found in that host. Same goes for language. http://wiki.apache.org/solr/SpellCheckComponent#spellc

RE: Problem with Solr 3.6.1 extracting ODT content using SolrCell's ExtractingRequestHandler

2012-11-26 Thread Brett Melbourne
Hi Erik, The document is committed successfully... it is just missing all the extracted content from Tika when I query for that document. i.e. the mapped content field attr_content is empty (fmap.content=attr_content) 1.9162908 24 2009-04-16T11:32:00 2012-11-23T00:29:39.73 ... B

Re: Copying few field using copyField to non multiValued field

2012-11-26 Thread Barry Galaxy
thanx erick! oh, there was a typo in my example... i meant: /*full_name = first_name + last_name */ you are correct, i would like to use keywordtokenizer to get an exact hit on a query such as: full_name:"Billy Corgan" coming from a source document with: */Billy & Corgan/* is this possible

Re: SolrCloud - Fails to read db config file

2012-11-26 Thread deniz
Mark Miller-3 wrote > It looks like your original path had a double / in it that was causing > problems. my original path in the config file doesnt have any double quotes, but when it is on solrcloud, it adds additional slash to the path... i am not an zookeeper expert or something but could it be

Spellchecker for multiple sites (and languages?)

2012-11-26 Thread André Schild
Hello, we are a long time nutch user (Since 0.7) Now we made the big jump from 0.9 to 1.5 and solr 4.0 We use it to index different websites and then provide site specific search for these. Currently we index the sites and store them all in one solr instance. The different sites are separate

Re: Solr Near Realtime with denormalized Data

2012-11-26 Thread Jack Krupansky
At least from what you have said, it doesn't sound as if a lot of updating (as opposed to adding new documents) would be needed. I suspect that a lot of users would rather enter a simple patient ID anyway (with a separate name lookup capability.) I mean, once a "visit" is complete, is it reall

Re: From Solr3.1 to SolrCloud

2012-11-26 Thread roySolr
Mark: I'm using a separate zookeeper instance. I don't use the embedded zk in solr. I can't find the location where the configs are stored; I can log in to zookeeper and see the configs. The delete command works, but I can't delete the whole config directory at once, only file by file. Erick, The nodes

Re: Find the matched field in each matched document

2012-11-26 Thread Alireza Salimi
Hi Mikhail, Thanks for the reply, I have a feeling that what I'm looking for is something that someone else must have already implemented. Basically it's a component which categorizes matched items by their type. For my requirement, even debugQuery should be fine because I'm expecting to return j

Re: Solr Near Realtime with denormalized Data

2012-11-26 Thread zbindigonzales
Hello again The problem is that the software is used in different fields. The table schema for hospital software is not the same as in the industrial sectors. Customers usually create their own schema. The worst-case scenario that we know is that there are four tables connected. Table patient -->

Re: Reload core via CoreAdminRequest doesnt work with solr cloud? (solrj)

2012-11-26 Thread Mark Miller
I think that core admin commands may not work with CloudSolrServer at the moment. There is a JIRA issue for it I think. For now, I'd use the HTTP imp and point at the node you want to work with or reload with the collections api. - Mark On Nov 22, 2012, at 12:17 PM, joe.cohe...@gmail.com wrote

Re: AutoSoftcommit option solr 4.0

2012-11-26 Thread Mark Miller
The best way to confirm the NRT aspect is to start loading a really large batch of documents and go to your browser and do a search for *:* (match all documents). See the result count. Hit refresh and see the result count. Hit refresh a couple times about once per second and see the result count
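The manual check described in this message (search for *:*, refresh about once per second, watch the result count) can be sketched as a small Python loop. The helper names solr_num_found and watch_counts are hypothetical, and the select URL in the final comment is an assumed local example; the query parameters q=*:*, rows=0 and wt=json are standard Solr request parameters.

```python
import json
import time
import urllib.request


def solr_num_found(select_url):
    """Return numFound for a match-all query against a Solr select handler."""
    # rows=0 asks only for the count; wt=json selects the JSON response writer.
    url = select_url + "?q=*:*&rows=0&wt=json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["response"]["numFound"]


def watch_counts(fetch_count, samples=5, interval=1.0, sleep=time.sleep):
    """Call fetch_count() `samples` times, pausing `interval` seconds between calls."""
    counts = []
    for i in range(samples):
        counts.append(fetch_count())
        if i < samples - 1:
            sleep(interval)  # no pause needed after the last sample
    return counts


# Example call (assumed URL, not executed here):
# watch_counts(lambda: solr_num_found("http://localhost:8983/solr/collection1/select"))
```

If soft commits are working while a large batch is loading, the returned counts should grow from one sample to the next.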

Re: From Solr3.1 to SolrCloud

2012-11-26 Thread Mark Miller
The command line util has a clear command. If you use the out of the box setup, it's something like: example/cloud-scripts/zkcli.sh -cmd clear /path/to/clear http://wiki.apache.org/solr/SolrCloud#Command_Line_Util - Mark On Nov 26, 2012, at 3:44 AM, roySolr wrote: > Ok, that's important for

Re: SolrCloud - Fails to read db config file

2012-11-26 Thread Mark Miller
It looks like your original path had a double / in it that was causing problems. It looks like the below is a bug. Could you please file a JIRA issue? As a workaround, try explicitly setting the directory to write the properties file in with the "directory" param. You should be able to set it to

Re: From Solr3.1 to SolrCloud

2012-11-26 Thread Erick Erickson
what is the _status_ of the "nodes that are already gone"? What is the test you run when you see this? It could just be that you're seeing nodes that are unresponsive but that ZK knows about. Best Erick On Mon, Nov 26, 2012 at 3:44 AM, roySolr wrote: > Ok, that's important for the traffic. > >

Re: Solr replication

2012-11-26 Thread Erick Erickson
"both servers are master and slave in the config file". So that means they're polling each other? I think to say anything intelligent you're going to have to provide more data. Please review: http://wiki.apache.org/solr/UsingMailingLists. The bottom line here is that it sounds like what you're try

Re: Solr Near Realtime with denormalized Data

2012-11-26 Thread Jack Krupansky
Denormalization IS the best practice, generally. You still haven't demonstrated an "exponential" increase. If there is any "exponential" increase, that is your own doing and you should simply NOT do that! The total number of documents would be the number of rows in your second table. Denormal

Re: Solr Near Realtime with denormalized Data

2012-11-26 Thread zbindigonzales
Hello Erick Thanks for your response. The main problem we have is that the data is denormalized, and this increases the number of documents to index exponentially. Let's say we have the following two tables. -- -- Table: Patient Table:Image -

Re: Solr replication

2012-11-26 Thread jacques.cortes
Thanks Antoine. In fact, both solr servers are master and slave in config file. They have the same config file and replicate from the same master url with the vip. So, the master is on the server with the vip mounted on. And if heartbeat toggle, the role is toggled too. The question is : what can

error while doing partial update using curl

2012-11-26 Thread Darniz
Hello, I am trying to update a field in my Solr doc using curl; I don't know why it's giving me this error when I try to run this statement: curl 'myhostname:8080/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"contentId#63481697","price":{"set":16595}}]' I am getting this error
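For readers who want to generate the JSON body from the curl command above rather than type it by hand, here is a minimal Python sketch. The helper name atomic_set_payload is hypothetical; the document id, field name and value are the ones from the message. The sketch only builds the request body. It does not send it, and it does not address the error being reported.

```python
import json


def atomic_set_payload(doc_id, field, value):
    """Build the JSON body for an atomic 'set' update of a single field."""
    # A list with one document: the unique key plus {field: {"set": value}}.
    return json.dumps([{"id": doc_id, field: {"set": value}}])


body = atomic_set_payload("contentId#63481697", "price", 16595)
print(body)
```

Building the body with json.dumps avoids shell-quoting mistakes when ids or values contain quotes or other special characters.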

Re: From Solr3.1 to SolrCloud

2012-11-26 Thread roySolr
Ok, that's important for the traffic. Some questions about zookeeper. I have done some tests and i have the following questions: - How can i delete configs from zookeeper? - I see some nodes in the clusterstate that are already gone. Why is this not up-to-date? Same for graph. Thanks again!

Re: AutoSoftcommit option solr 4.0

2012-11-26 Thread Vadim Kisselmann
Hi Shaveta, simple, index a doc and search for this ;) A soft commit stands for near-real-time search. It could take a couple of seconds to see this doc, but it should be there. Best regards Vadim 2012/11/26 Shaveta_Chawla : > I have migrated solr 3.6 to solr 4.0. I have implemented solr4.0's auto

Re: User context based search in apache solr

2012-11-26 Thread Mikhail Khludnev
I agree with Otis's suggestion. I don't think that the preference table/matrix ( 1)product_id, 2)user_id, 3)score_value) should be indexed in Lucene/Solr. Any key-value/RDBMS/iMDG updateable storage is fair enough to store preferences lists: user/sessionID -> { (product, weight), (product, weight)

Re: SolrCloud - Fails to read db config file

2012-11-26 Thread deniz
okay, after changing it to db-config from the full path above, i am able to see dataimport page, but still data import is failing... i see this in the logs SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to PropertyWriter implementation:ZKProper

Re: Find the matched field in each matched document

2012-11-26 Thread Mikhail Khludnev
Alireza, Please be aware that debugQuery works across retrieved search result page (# 'rows' from 'start'), but not for all numFound docs, it's also usually slow, however it carries all info which you need. In our platform we implemented similar functionality working really fast, but not really li

Re: SolrCloud - Fails to read db config file

2012-11-26 Thread deniz
Marcin Rzewucki wrote > Hi, > > It seems like the file is missing from Zookeeper. Can you confirm ? > > Regards. nope, i can see my db-config file on the admin interface of solr as well as zk client on command line, i dont think it is missing from zookeeper - Zeki ama calismiyor... Calis

Re: SolrCloud - Fails to read db config file

2012-11-26 Thread Marcin Rzewucki
Hi, It seems like the file is missing from Zookeeper. Can you confirm ? Regards. On 26 November 2012 07:57, deniz wrote: > Hi all, > > I am working on solrcloud and trying to import from db... but I am getting > this error: > > > > > 500 name="QTime">3Error opening > /configs/poppenuser//hom