Problem with SolrCloud + Zookeeper + DataImportHandler

2013-08-08 Thread 兴涛孙
Hello, guys: I've encountered a problem configuring many clustered nodes with Solr, so I want to ask for your help. Thanks in advance! The problems are listed as follows: 1. Installation platform: Solr 4.3.1, ZooKeeper 3.4.5 and Tomcat 7 with JDK 1.7 2. When I configured a single node with DIH to build

Re: Percolate feature?

2013-08-08 Thread Mark
Ok forget the mention of percolate. We have a large list of known keywords we would like to match against. Product keyword: "Sony" Product keyword: "Samsung Galaxy" We would like to be able to detect given a product title whether or not it matches any known keywords. For a keyword to be mat
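The snippet above is cut off before it states the exact matching rule, so the following is only a minimal sketch under an assumed rule: a keyword matches when its words appear in the title as a contiguous, case-insensitive run of tokens. The function names and the whitespace tokenizer are hypothetical and are not taken from the thread or from Solr.

```python
def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic analyzer)."""
    return text.lower().split()


def build_keyword_index(keywords):
    """Map each keyword's token tuple back to the original keyword string."""
    return {tuple(tokenize(k)): k for k in keywords}


def match_keywords(title, keyword_index):
    """Return known keywords that occur as a contiguous token run in the title."""
    tokens = tokenize(title)
    max_len = max((len(k) for k in keyword_index), default=0)
    found = []
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            candidate = tuple(tokens[start:start + length])
            # Slices near the end of the title can be shorter than requested.
            if len(candidate) == length and candidate in keyword_index:
                found.append(keyword_index[candidate])
    return found


known = build_keyword_index(["Sony", "Samsung Galaxy"])
print(match_keywords("Samsung Galaxy S4 case", known))
print(match_keywords("Galaxy by Samsung", known))
```

Under this assumed rule the second title does not match, because "Samsung" and "Galaxy" are neither adjacent nor in keyword order.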

Re: external zookeeper with SolrCloud

2013-08-08 Thread Shawn Heisey
On 8/8/2013 3:03 PM, Joshi, Shital wrote: We did quite a bit of testing and we think bug https://issues.apache.org/jira/browse/SOLR-4899 is not resolved in Solr 4.4 The commit for SOLR-4899 was made to branch_4x on June 10th. lucene_solr_4_4 code branch was created from branch_4x on July 8th.

Re: [POLL] Who & how does use "admin-extra" ?

2013-08-08 Thread Michael Sokolov
On 8/7/13 10:56 PM, Chris Hostetter wrote: : Didn't somebody once say this is used for customization of admin pages? it can be yes, that's why it originally existed -- Stefan's question was whether anyone was actually using it for that. I used it quite a bit back in the day at CNET as a way to "se

RE: external zookeeper with SolrCloud

2013-08-08 Thread Joshi, Shital
We did quite a bit of testing and we think bug https://issues.apache.org/jira/browse/SOLR-4899 is not resolved in Solr 4.4 -Original Message- From: Joshi, Shital [Tech] Sent: Wednesday, August 07, 2013 2:48 PM To: 'solr-user@lucene.apache.org' Subject: RE: external zookeeper with SolrCl

Re: Solr search on a large text field is very slow

2013-08-08 Thread Erick Erickson
If you really need wildcards on both ends, I don't think edgengram will work, you need plain ngrams. The trick is that the query side needs to make the 2-grams into phrases. BTW, I think you'd be fine with bigrams. Best Erick On Thu, Aug 8, 2013 at 1:47 PM, meena.sri...@mathworks.com < meena
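The idea in the reply above (index plain n-grams, then turn the query's 2-grams into a phrase) can be illustrated with a small simulation. This is plain Python, not Solr configuration; the field name `body_ngram` and all function names are hypothetical.

```python
def ngrams(term, n=2):
    """Return the overlapping character n-grams of a term, in order."""
    term = term.lower()
    return [term[i:i + n] for i in range(len(term) - n + 1)]


def to_phrase_query(field, fragment, n=2):
    """Build a phrase-style query string from the fragment's n-grams."""
    return '{}:"{}"'.format(field, " ".join(ngrams(fragment, n)))


def phrase_matches(indexed_term, fragment, n=2):
    """Simulate a phrase match: the fragment's n-grams must appear
    contiguously and in order among the indexed term's n-grams."""
    indexed = ngrams(indexed_term, n)
    wanted = ngrams(fragment, n)
    if not wanted:
        return False  # fragment shorter than n produces no grams
    last_start = len(indexed) - len(wanted)
    return any(indexed[i:i + len(wanted)] == wanted
               for i in range(last_start + 1))


print(to_phrase_query("body_ngram", "test"))
print(phrase_matches("latest", "test"))
print(phrase_matches("tset", "test"))
```

The phrase requirement is what stands in for the wildcards on both ends: the grams alone would also match terms that contain the same pairs of letters in a different order.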

Re: Storing A Taxonomy In A Separate Core For Query Expansion

2013-08-08 Thread Erick Erickson
No reason why not. You don't even have to keep it in a separate core: if you have some field in your docs that has the values "items" or "taxonomy", you could always use the appropriate fq when querying... You could also do this from the app: query the taxonomy core and do the substitutions in the app l

Re: Handling categories( level one and two) based navigation

2013-08-08 Thread Erick Erickson
You really haven't told us much about _how_ you want to use category and subcategory. Can a document belong to one or more subcategories? What have you tried? What use-case do you want to support? Etc. You might review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Thu, Aug 8, 201

Re: Suggest aka "autocomplete" request handler with solr 4.4

2013-08-08 Thread Utkarsh Sengar
Hi Chris, You were right, "appl" was matched to "application". So, I created a new type without the stemmer. New type:

Re: Problems with distributed MoreLikeThis

2013-08-08 Thread Shawn Heisey
On 8/6/2013 1:18 PM, Shawn Heisey wrote: I'm having some problems with distributed MLT. On 4.4, it seems completely broken. Searches that work on 4.2.1 return an exception on 4.4.0. This stackoverflow post shows the EarlyTerminatingCollectorException I'm getting: http://stackoverflow.com/ques

Re: Problems with distributed MoreLikeThis

2013-08-08 Thread Shawn Heisey
On 8/6/2013 9:43 PM, manju16832003 wrote: I'm not sure about the root cause in your case. However one thing to remember while MLT is that, *MLT does not work with integer fields*. In your case if 'catchall' is copyField and if you are trying to copy any integer values verify it again :-). The c

Re: Solr search on a large text field is very slow

2013-08-08 Thread meena.sri...@mathworks.com
Thanks for your responses; they helped me understand the issue. I dug through the documentation and am now implementing EdgeNGramFilterFactory to see how much I can speed up wildcard searches.

Re: Transform data at index time: country -> continent

2013-08-08 Thread Jack Krupansky
(I think you're better off with an update processor script, but...) The synonym filter supports 2.5 modes: 1. Replace mode country => continent 2. Expand mode country, continent - results in both terms if either is used 2.5) The expand="false" attribute that means treat expand mode as repla
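As a rough illustration of the modes listed above, here is a small Python simulation of synonym rules. It is a sketch, not the Solr implementation: `parse_rules` and `apply_synonyms` are hypothetical names, and the behaviour assumed for expand="false" (every term in a comma-separated list maps to the first term) should be checked against the synonym filter documentation for your Solr version.

```python
def parse_rules(lines, expand=True):
    """Turn synonym rule lines into a token -> replacement-list mapping."""
    mapping = {}
    for line in lines:
        if "=>" in line:
            # Replace mode: the left-hand terms become the right-hand terms.
            left, right = line.split("=>")
            targets = [t.strip() for t in right.split(",")]
            for source in left.split(","):
                mapping[source.strip()] = targets
        else:
            # Comma list: expand to all terms, or (assumed) reduce to the first.
            terms = [t.strip() for t in line.split(",")]
            targets = terms if expand else terms[:1]
            for term in terms:
                mapping[term] = targets
    return mapping


def apply_synonyms(tokens, mapping):
    """Rewrite a token stream using the mapping; unknown tokens pass through."""
    out = []
    for token in tokens:
        out.extend(mapping.get(token, [token]))
    return out


replace_rules = parse_rules(["germany, france => europe"])
expand_rules = parse_rules(["europe, germany, france"], expand=True)
reduce_rules = parse_rules(["europe, germany, france"], expand=False)

print(apply_synonyms(["germany"], replace_rules))
print(apply_synonyms(["germany"], expand_rules))
print(apply_synonyms(["germany"], reduce_rules))
```

For the country-to-continent case, the first and third rule sets both leave only the continent in the field, while the second keeps the country alongside it.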

Re: SolrCloud "deletedDocs"

2013-08-08 Thread Shawn Heisey
On 8/8/2013 10:47 AM, Rasmussen, Chris wrote: I'm running a 4.2 SOLRCloud instance with multiple servers/shards. As I'm indexing data, I review the results of the STATUS commands and note an extremely high number of "deletedDocs". I've combed through the source data to verify whether I'm sen

SolrCloud "deletedDocs"

2013-08-08 Thread Rasmussen, Chris
I'm running a 4.2 SOLRCloud instance with multiple servers/shards. As I'm indexing data, I review the results of the STATUS commands and note an extremely high number of "deletedDocs". I've combed through the source data to verify whether I'm sending duplicate documents ids, but haven't been a

Re: Question about soft commit and updateRequestProcessorChain

2013-08-08 Thread Erick Erickson
Well, in the Solr4 world, there's another option. Do a hard commit with openSearcher=false. That will guarantee that the documents are durably written because it'll close the segment. What it will NOT do is make the documents searchable, you need to do a soft commit to make that happen. The other
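The two kinds of commit described above can be requested through parameters on the update handler. The sketch below only builds the request URLs and sends nothing; the base URL and the core name `collection1` are assumptions, and `update_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def update_url(base_url, **params):
    """Build an update-handler URL carrying the given request parameters."""
    return "{}/update?{}".format(base_url.rstrip("/"), urlencode(params))


BASE = "http://localhost:8983/solr/collection1"  # assumed host and core name

# Hard commit that flushes to disk but does not open a new searcher.
hard_commit = update_url(BASE, commit="true", openSearcher="false")

# Soft commit that makes recent documents searchable.
soft_commit = update_url(BASE, softCommit="true")

print(hard_commit)
print(soft_commit)
```

In practice these settings are often placed in the autoCommit and autoSoftCommit sections of solrconfig.xml rather than sent by hand.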

Re: First Indexing a postgres database

2013-08-08 Thread Erick Erickson
What errors come out in the log (catalina.out) when you try to create cores? What is the full stack trace when you get your error? Is the error from the log or from the client? Details matter. Best Erick On Thu, Aug 8, 2013 at 4:00 AM, geoport wrote: > Hi, > i am using solr the first time and

Re: Solr search on a large text field is very slow

2013-08-08 Thread Aloke Ghoshal
Compare timings in the following cases: - Without the wildcard - With suffix wild card only - test* - With reverse wild card filter factory and two separate terms - *test OR test* On Thu, Aug 8, 2013 at 8:15 PM, meena.sri...@mathworks.com < meena.sri...@mathworks.com> wrote: > Index size is arou

Re: Solr search on a large text field is very slow

2013-08-08 Thread Rafał Kuć
Hello! In general, wildcard queries can be expensive. If you need wildcard queries, you can try the EdgeNGram - http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch > In

Re: Solr search on a large text field is very slow

2013-08-08 Thread Erick Erickson
Sometimes bolding comes through my e-mail as *, so is *test* with the asterisk on each end really what you're doing? Assuming so, this will inevitably be slow. It must iterate through all the terms in the field to see if any of them match. This is generally a bad practice. You can go to n-grams if

Re: Filtering suggestion results

2013-08-08 Thread Erick Erickson
Suggester just looks at the terms in the field you point it to, there's no way that I know of to do what you're asking Best Erick On Wed, Aug 7, 2013 at 4:40 PM, rohitrmd wrote: > Hi, > I have question regarding suggester component. > Can we filter suggestion results depending on particul

Solr search on a large text field is very slow

2013-08-08 Thread meena.sri...@mathworks.com
Index size is around 150 GB and there are around 6.5 million documents in the index. Search on a specific text field is very slow: it takes 1 to 2 minutes for wildcard queries like *test* with no highlighting and no facets. This field contributes 90% of the index size. This is my schema.xml

Re: Solr - how do I index barcode

2013-08-08 Thread Andre Bois-Crettez
I would go with a tokenizer to split each character as a separate token. (maybe https://cwiki.apache.org/confluence/display/solr/Tokenizers#Tokenizers-RegularExpressionPatternTokenizer can do) Add a LowerCaseFilterFactory so that casing is ignored. Untested : sku : "ABCD-EF34GD-JOHN" would

Re: Transform data at index time: country -> continent

2013-08-08 Thread Walter Underwood
SynonymFilter may have a keepOrig flag. If so, that would map countries to continents and not keep the country names. wunder On Aug 8, 2013, at 4:10 AM, Christian Köhler - ZFMK wrote: > Hi, > > I have thought about synonyms as well. But wouldn't leave me this with a > field that contains bo

Re: Solr4.4 DIH Headache

2013-08-08 Thread Stefan Matheis
> First things first, for the dataimport handler. Is it correct that when I > visit it from the admin panel it takes me to this URL: > > *http://x.com:8080/solr/#/collection1/dataimport//dataimport > * > > > That one is correct. the trailing "/dataimport" is the name of the handler you've de

Re: JSON Update create different copies of the same document

2013-08-08 Thread Jack Krupansky
Either your field values are in fact unique for those separate documents, or you have overwrite="false" on the input documents. -- Jack Krupansky -Original Message- From: Bruno René Santos Sent: Thursday, August 08, 2013 4:58 AM To: solr-user@lucene.apache.org Subject: JSON Update c
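The two explanations given above can be pictured with a toy in-memory index. This is only a sketch of the overwrite-by-unique-key behaviour, not how Solr stores documents; `TinyIndex` and its methods are hypothetical.

```python
class TinyIndex:
    """Toy index that replaces documents sharing the same unique key."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = []

    def add(self, doc, overwrite=True):
        """Add a document; with overwrite, drop any older copy of the same key."""
        if overwrite:
            key = doc[self.unique_key]
            self.docs = [d for d in self.docs if d[self.unique_key] != key]
        self.docs.append(doc)


index = TinyIndex()
index.add({"id": "1", "title": "old"})
index.add({"id": "1", "title": "new"})
print(len(index.docs))  # one copy: the later add replaced the earlier one

index.add({"id": "1", "title": "again"}, overwrite=False)
print(len(index.docs))  # two copies: overwrite was switched off

index.add({"id": "1 ", "title": "trailing space"})
print(len(index.docs))  # three copies: "1 " and "1" are different keys
```

The last call shows the first explanation: ids that look identical but differ in whitespace or case are distinct values, so nothing is replaced.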

Re: JSON Update create different copies of the same document

2013-08-08 Thread Yonik Seeley
On Thu, Aug 8, 2013 at 4:58 AM, Bruno René Santos wrote: > I thought that by adding a new document with the same id on Solr the > document already on Solr would be updated with the new info. Yes, this should be the case. > But both > documents appear on the search results... How can I update a d

Re: Suggest aka "autocomplete" request handler with solr 4.4

2013-08-08 Thread Vinícius
If correctSpelled is true, then "appl" was found in the Solr index. In this case, maybe the EnglishMinimalStemFilterFactory filter in the text_general fieldType is messing up your suggestion. On 6 August 2013 15:33, Utkarsh Sengar wrote: > Jack/Chris, > > 1. This is my complete schema.xml: > > https://gi

Re: How to parse multivalued data into single valued fields?

2013-08-08 Thread eShard
Ok, I have one index called Communities from an RSS feed. each item in the feed has multiple titles (which are all the same for this feed) So, the title needs to be cleaned up before it is put into the community index let's call the field community_title; And then an UpdateProcessorChain needs to

Re: Solr4.4 DIH Headache

2013-08-08 Thread Raymond Wiker
On Aug 8, 2013, at 15:57 , Spadez wrote: > Hi, > > QUESTION 1 > > > First things first, for the dataimport handler. Is it correct that when I > visit it from the admin panel it takes me to this URL: > > *http://x.com:8080/solr/#/collection1/dataimport//dataimport > * > When I visit it on this

Re: how to sort by frequency of values on a specific field?

2013-08-08 Thread Jack Krupansky
Sounds like faceting on a field. Or, how is it not like faceting? -- Jack Krupansky -Original Message- From: Luca Incrocci Sent: Thursday, August 08, 2013 8:40 AM To: solr-user@lucene.apache.org Subject: how to sort by frequency of values on a specific field? I'm working with Java a

Solr4.4 DIH Headache

2013-08-08 Thread Spadez
Hi, QUESTION 1 First things first, for the dataimport handler. Is it correct that when I visit it from the admin panel it takes me to this URL: *http://x.com:8080/solr/#/collection1/dataimport//dataimport * When I visit it on this page, it seems to load my config correctly in the right panel. A

how to sort by frequency of values on a specific field?

2013-08-08 Thread Luca Incrocci
I'm working with Java and SolrJ on Eclipse. How can I sort the results of a SolrQuery by occurrence of values on a certain field? For example, when I search top n articles (docType=0) of a particular author I want to sort query results by frequency of values in the journal_facet field (type Str

Re: Error loading class 'solr.ISOLatin1AccentFilterFactory'

2013-08-08 Thread Parul Gupta(Knimbus)
Hey ... It works for me! Thanks a lot!!! -- View this message in context: http://lucene.472066.n3.nabble.com/Error-loading-class-solr-ISOLatin1AccentFilterFactory-tp4083012p4083289.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Enabling DIH breaks Solr4.4

2013-08-08 Thread Spadez
Thank you both so much for your help. The regex was indeed outdated. Everything works perfectly now! :) -- View this message in context: http://lucene.472066.n3.nabble.com/Enabling-DIH-breaks-Solr4-4-tp4083282p4083286.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Enabling DIH breaks Solr4.4

2013-08-08 Thread Raymond Wiker
I think the problem is that you have the wrong name for the jar file: you have apache-solr-dataimporthandler instead of simply solr-dataimporthandler. In my solrconfig.xml, I have --- which may or may not work for you.

Re: Enabling DIH breaks Solr4.4

2013-08-08 Thread Rafał Kuć
Hello! Try changing the lib directives to absolute paths, by looking at the exception: Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler it seems that the DataImportHandler class is not seen by Solr class loader. -- Regards, Rafał Kuć Sematext

Enabling DIH breaks Solr4.4

2013-08-08 Thread Spadez
Hi, I'm a bit stuck here. I had Solr4.4 working without too many issues. I wanted to enable the DIH so I firstly added these lines to the solrconfig.xml: Restarted and everything was fine. Then I added these lines to the config as well which broke Solr: /opt/solr/collection1/conf/

RE: new field type - enum field

2013-08-08 Thread Elran Dvir
Hi all, Did anyone have a chance to look at the code? It's attached here: https://issues.apache.org/jira/browse/SOLR-5084 Thank you very much. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday, July 29, 2013 2:15 PM To: solr-user@lucene.apache.org Sub

Re: Solr - how do I index barcode

2013-08-08 Thread Mysurf Mail
2. notes 1. My current query is similar to this http://127.0.0.1:8983/solr/vault/select?q=ABCD&qf=Name+SKU&defType=edismax 2. I want it to be case insensitive On Thu, Aug 8, 2013 at 2:52 PM, Mysurf Mail wrote: > I have a document that contains the following data > > car { > id: guid

Re: Solr 4.4 Default shard

2013-08-08 Thread Prasi S
This was only with Solr 4.4. I didn't face the issue in any other versions. On Thu, Aug 8, 2013 at 4:23 PM, Prasi S wrote: > Initially i created a single collection, > > ->java -classpath .;zoo-lib/* org.apache.solr.cloud.ZkCLI -cmd > upconfig -zkhost localhost:2181 -confdir solr-conf -confna

Storing A Taxonomy In A Separate Core For Query Expansion

2013-08-08 Thread Michael Delaney
Hi Guys, Our application contains two data sets: - Items - Item taxonomy What i'm trying to do is allow our admin team to maintain a taxonomy which allows for items to be found easier. For example: if someone searched for 'dog', and the taxonomy contained 'Spaniel' as a narrower form of dog,

Solr - how do I index barcode

2013-08-08 Thread Mysurf Mail
I have a document that contains the following data car { id: guid name: string sku: list } Now, the barcodes don't have a pattern. A barcode can be either of the following: ABCD-EF34GD-JOHN ABCD-C08-YUVF I want to index my documents so that a search for 1. ABCD will return both. 2

Handling categories( level one and two) based navigation

2013-08-08 Thread payalsharma
Hi All, Our web application (e commerce ) requires primary and secondary categories in items. Based on this requirement I have following queries : 1) How category and subcategory are handled in solr version 4.4. I have used apache-solr-1.3.0 previously, but facets have undergone many big change

RE: SOLR USING 100% percent CPU and not responding after a while

2013-08-08 Thread nitin4php
Hi Biva, Any luck on this? Even we are facing same issue with exactly same configuration and setup. Any inputs will help a lot. -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-USING-100-percent-CPU-and-not-responding-after-a-while-tp4021359p4083234.html Sent from the

Re: Transform data at index time: country -> continent

2013-08-08 Thread Christian Köhler - ZFMK
Hi, One interesting issue: These countries that span continents - Turkey and Russia and some of the former USSR Republics. I arbitrarily assigned them a single continent: // Note: Turkey is mapped to Asia, and Russia to Europe, // Azerbaijan to Asia, Armenia to Asia, Cyprus to Asia, //

Re: Transform data at index time: country -> continent

2013-08-08 Thread Christian Köhler - ZFMK
Hi, I have thought about synonyms as well. But wouldn't this leave me with a field that contains both the original expression and additionally the continent? e.g. "germany, continent-europe". I am not sure if this might get in the way at some point. On the other hand this would enable me to have

Re: Solr 4.4 Default shard

2013-08-08 Thread Prasi S
Initially i created a single collection, ->java -classpath .;zoo-lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost localhost:2181 -confdir solr-conf -confname *myconf1* -->java -classpath .;zoo-lib/* org.apache.solr.cloud.ZkCLI -cmd linkconfig -zkhost 127.0.0.1:2181 -collection *fir

Re: Solr 4.4 Default shard

2013-08-08 Thread Anshum Gupta
Hi Prasi, I'd highly recommend you to go through the SolrCloud wiki here: http://wiki.apache.org/solr/SolrCloud . When it comes to SolrCloud, you need to read about collections before you go any further. I don't know anything about your use case so I'm guessing you just probably are trying to cre

Re: Problems installing Solr4 in Jetty9

2013-08-08 Thread Spadez
Apparently this is the error: 2013-08-08 09:35:19.994:WARN:oejw.WebAppContext:main: Failed startup of context o.e.j.w.WebAppContext@64a20878{/solr,file:/tmp/jetty-0.0.0.0-8080-solr.war-_solr-any-/webapp/,STARTING}{/solr.war} org.apache.solr.common.SolrException: Could not find necessary SLF4j log

DocValues for byte[] ... or a common codec for "selected fields"

2013-08-08 Thread Mathias Lux
Hi all! First of all: Solr is an amazing project. Big thanks to the community! I really appreciate the stability, and especially the pre-configured jetty example ;) And now for the question: I'm currently on my way to writing a RequestHandler for Solr that deals with content based image search (u

Solr 4.4 Default shard

2013-08-08 Thread Prasi S
I have setup solr 4.4 with cloud and have created two cores mycore_shard1, mycore_shard2. I have few questions here, 1. Once the setup is ready, i could see a default collection "collection" with :shard1 in the admin -> cloud page. How to remove it. I have deleted the core.properties file in the

Re: JSON Update create different copies of the same document

2013-08-08 Thread Bruno René Santos
Yes, id. I was just reading and maybe this happens because I do not send the _version_ with the document? I tried to send it with a value I chose (_version_=342342342) while Solr was empty of documents, and got the HTTP response 409 with the reason "Conflict". Any ideas? Regards Bruno On Thu, Aug

Re: Problems installing Solr4 in Jetty9

2013-08-08 Thread Rafał Kuć
Hello! Could you look at the logs and post what you find there? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch > I've been unable to install SOLR into Jetty. Jetty seems to be running fine, > and this is the steps I took to install solr: > # SOLR > cd

Problems installing Solr4 in Jetty9

2013-08-08 Thread Spadez
I've been unable to install SOLR into Jetty. Jetty seems to be running fine, and these are the steps I took to install solr: # SOLR cd /opt wget -O - $SOLR_URL | tar -xzf - cp solr-4.4.0/dist/solr-4.4.0.war /opt/jetty/webapps/solr.war cp -R solr-4.4.0/example/solr /opt/ cp -R /opt/solr-4.4.0/dist/ /

Re: JSON Update create different copies of the same document

2013-08-08 Thread Rafał Kuć
Hello! Do you have the unique identifier specified in the schema.xml ? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch > Hello, > I thought that by adding a new document with the same id on Solr the > document already on Solr would be updated with the

JSON Update create different copies of the same document

2013-08-08 Thread Bruno René Santos
Hello, I thought that by adding a new document with the same id on Solr the document already on Solr would be updated with the new info. But both documents appear on the search results... How can I update a document? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal

Re: [POLL] Who & how does use "admin-extra" ?

2013-08-08 Thread Bernd Fehling
I have a table of links to all my servers running SOLR, so I can jump from one admin page to any other server's admin page. There is also a link to my monitoring server. Not very innovative but better than an empty page. So, yes I'm using it. Regards, Bernd On 08.08.2013 00:24, Stefan Matheis wrote:

First Indexing a postgres database

2013-08-08 Thread geoport
Hi, I am using Solr for the first time and I have a lot of problems. First I installed Tomcat 7, then I copied solr.war (originally solr-4.2.1.war) into Tomcat's webapps directory. The second step was to copy example\solr into a custom directory that I called c:\solr. The p