Re: A few questions about solr and tika

2013-10-17 Thread primoz . skale
Everything about Tika extraction is written under those links. Basically what you need is the following: 1) requestHandler for Tika in solrconfig.xml 2) keep all the fields in schema.xml that are needed for Tika (they are marked in example schema.xml) and set those you don't need to indexed=fals

Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
I tried commenting out NOW in bq, but it didn't make any difference in the performance. I do see a minor entry in the queryfiltercache rate, which is a meager 0.02. I'm really struggling to figure out the bottleneck; are there any known pain points I should be checking? -- View this message in context: http

Re: Different document types in different collections OR same collection without sharing fields?

2013-10-17 Thread shrikanth k
Hi, Logically, maintenance will be easy, as both collections are in different folders. Next, even with separate fields in one collection, if the field list is not specified at search time then the results will be a combination of both domains. If this is mandatorily taking care at search/

Re: SolrDocumentList - bitwise operation

2013-10-17 Thread Michael Tyler
Hi, Regrets, I was confused with bit-set. I have Shawn's suggested approach in the system. I want to try other ways and test performance. How can I use join? I have 2 different solr indexes. localhost:8080/solr_1/select?q=content:test&fl=id,name,type localhost:8081/solr_1_1/select?q=text:t

Re: Switching indexes

2013-10-17 Thread Shawn Heisey
On 10/17/2013 12:51 PM, Christopher Gross wrote: OK, super confused now. http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3 Nets me this: 400 15007 Error CREATEing SolrCore 'test2': Could not find configName for collection test2 fou

Re: Skipping caches on a /select

2013-10-17 Thread Bill Bell
But global on a qt would be awesome !!! Bill Bell Sent from mobile > On Oct 17, 2013, at 2:43 PM, Yonik Seeley wrote: > > There isn't a global "cache=false"... it's a local param that can be > applied to any "fq" or "q" parameter independently. > > -Yonik > > >> On Thu, Oct 17, 2013 at 4:3

solrconfig.xml carrot2 params

2013-10-17 Thread youknow...@heroicefforts.net
Would someone help me out with the syntax for setting Tokenizer.documentFields in the ClusteringComponent engine definition in solrconfig.xml? Carrot2 is expecting a Collection of Strings. There's no schema definition for this XML file and a big TODO on the Wiki wrt init params. Every permuta

Re: Skipping caches on a /select

2013-10-17 Thread Tim Vaillancourt
Awesome, this makes a lot of sense now. Thanks a lot guys. Currently the only mention of this setting in the docs is under filterQuery on the "SolrCaching" page as: " Solr3.4 Adding the localParam flag of {!cache=false} to a query will prevent the filterCach

Re: Skipping caches on a /select

2013-10-17 Thread Chris Hostetter
: Does "cache=false" apply to all caches? The docs make it sound like it is for : filterCache only, but I could be misunderstanding. it's per *query* -- not per cache, or per request... /select?q={!cache=true}foo&fq={!cache=false}bar&fq={!cache=true}baz ...should cause 1 lookup/insert in the
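The per-query behaviour described above can be sketched as a small helper that attaches the {!cache=...} local param to each "q" or "fq" independently. This is only an illustration of how such a request could be assembled on the client side: the function name build_select_query and the (text, flag) tuples are hypothetical, and the queries foo, bar and baz are taken from the example URL in the message.

```python
from urllib.parse import urlencode, parse_qsl


def build_select_query(q, filters):
    """Build a /select query string with one cache local param per query.

    q       -- (query_text, cache_flag) pair for the main query
    filters -- list of (query_text, cache_flag) pairs, one per fq
    """
    def with_cache(text, cache):
        # The local param prefixes only the individual query it applies to.
        return "{!cache=%s}%s" % ("true" if cache else "false", text)

    params = [("q", with_cache(*q))]
    params += [("fq", with_cache(text, cache)) for text, cache in filters]
    # urlencode percent-escapes the braces and the exclamation mark.
    return urlencode(params)


# Same combination as in the message: q cached, first fq not, second fq cached.
query_string = build_select_query(("foo", True), [("bar", False), ("baz", True)])
print("/select?" + query_string)
```

Because the flag travels with each parameter, there is nothing request-wide to switch off; every "q" or "fq" that should skip its cache needs its own prefix.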

Re: Skipping caches on a /select

2013-10-17 Thread Yonik Seeley
There isn't a global "cache=false"... it's a local param that can be applied to any "fq" or "q" parameter independently. -Yonik On Thu, Oct 17, 2013 at 4:39 PM, Tim Vaillancourt wrote: > Thanks Yonik, > > Does "cache=false" apply to all caches? The docs make it sound like it is > for filterCac

Re: Skipping caches on a /select

2013-10-17 Thread Tim Vaillancourt
Thanks Yonik, Does "cache=false" apply to all caches? The docs make it sound like it is for filterCache only, but I could be misunderstanding. When I force a commit and perform a /select query many times with "cache=false", I notice my query gets cached still, my guess is in the queryResul

Re: Switching indexes

2013-10-17 Thread Michael Della Bitta
> load the configs into zookeeper, Yes. > stop tomcat, add it to the solr.xml file, and restart tomcat. To your CREATE URL, add the parameter &collection.configName= http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API Michael Della Bitta Applications Developer o
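As a rough sketch of the advice above, the CREATE request with the extra collection.configName parameter could be assembled like this. It assumes the Collections API endpoint (/admin/collections) described on the linked wiki page rather than the /admin/cores URL quoted elsewhere in the thread, and it assumes the config set has already been uploaded to ZooKeeper. The config name "myconf" and the helper name create_collection_url are placeholders, not values from the thread.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, config_name,
                          num_shards=1, replication_factor=3):
    """Return a Collections API CREATE URL that names its config set."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("numShards", num_shards),
        ("replicationFactor", replication_factor),
        # Tells SolrCloud which config set in ZooKeeper the collection uses.
        ("collection.configName", config_name),
    ]
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


# "myconf" is a hypothetical config set name.
url = create_collection_url("http://index1:8080/solr", "test2", "myconf")
print(url)
```

The URL is only built here, not sent; issuing it is left to whatever HTTP client is already in use.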

Check if dynamic columns exists and query else ignore

2013-10-17 Thread Utkarsh Sengar
I am trying to do this: if (US_offers_i exists): fq=US_offers_i:[1 TO *] else: fq=offers_count:[1 TO *] Where: US_offers_i is a dynamic field containing an int offers_count is a status field containing an int. I have tried this so far but it doesn't work: http://solr_server/solr/col1/select?
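One possible way to express the if/else above as a single filter query is to OR the preferred range with a clause that matches only documents lacking the preferred field. The sketch below builds that string; it assumes "exists" is meant per document, it has not been run against a Solr instance, and the helper name fallback_range_filter is made up for illustration.

```python
def fallback_range_filter(preferred, fallback, lower=1):
    """Build an fq that uses `preferred` when a document has it,
    and `fallback` only for documents where `preferred` is absent."""
    rng = "[%d TO *]" % lower
    # -field:[* TO *] excludes documents that have any value in the field.
    return "%s:%s OR (-%s:[* TO *] AND %s:%s)" % (
        preferred, rng, preferred, fallback, rng)


fq = fallback_range_filter("US_offers_i", "offers_count")
print("fq=" + fq)
```

The string still needs URL encoding before it is placed in a request.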

Re: Switching indexes

2013-10-17 Thread Christopher Gross
I can't find it in the Admin->Cloud->Tree part of the UI. Trying to "get" the file: [zk: localhost:2181(CONNECTED) 0] get /aliases.json Node does not exist: /aliases.json So it didn't stick -- I'm guessing. I don't see an error message regarding the "alias" in my logs either. Anywhere else I sh

RE: Switching indexes

2013-10-17 Thread Garth Grimm
Go to the admin screen for Cloud/Tree, and then click the node for aliases.json. To the lower right, you should see something like: {"collection":{"AdWorksQuery":"AdWorks"}} Or access the Zookeeper instance, and do a 'get /aliases.json'. -Original Message- From: Christopher Gross [mail
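To make the aliases.json content above concrete, here is a small sketch that parses a document of that shape and looks up what an alias points to. The JSON literal is the one shown in the message; the function name resolve_alias is hypothetical, and treating the value as a comma-separated list of collections is an assumption about multi-collection aliases.

```python
import json


def resolve_alias(aliases_json, name):
    """Return the collections an alias points to, or [name] if no alias exists."""
    aliases = json.loads(aliases_json).get("collection", {})
    target = aliases.get(name)
    # Assumption: an alias covering several collections is stored comma-separated.
    return target.split(",") if target else [name]


sample = '{"collection":{"AdWorksQuery":"AdWorks"}}'
print(resolve_alias(sample, "AdWorksQuery"))
```

If 'get /aliases.json' reports that the node does not exist, there is nothing to parse and the alias was most likely never created.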

Re: Switching indexes

2013-10-17 Thread Christopher Gross
Also, when I make an alias: http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=test1-alias&collections=test1 I get a pretty useless response: 00 So I'm not sure if it is made. I tried going to: http://index1:8080/solr/test1-alias/select?q=*:* but that didn't work. How do I use an

Re: Switching indexes

2013-10-17 Thread Christopher Gross
OK, super confused now. http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3 Nets me this: 400 15007 Error CREATEing SolrCore 'test2': Could not find configName for collection test2 found:[xxx, xxx, , x, xx] 400 For that

RE: Change config set for a collection

2013-10-17 Thread michael.boom
Thanks Garth! Yes, indeed, I know that issue. I had set up my SolrCloud using 4.5.0 and then encountered this problem, so I rolled back to 4.4.0 - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Change-config-set-for-a-collection-tp4096032p4096136.html S

Chegg is looking for a search engineer

2013-10-17 Thread Walter Underwood
I work at Chegg.com and I really like it, but we have more search work than I can do by myself, so we are hiring a senior software engineer for search. Most of our search services are on Solr. http://www.chegg.com/jobs/listings/?jvi=oAQGXfwN,Job If you'd like to know a lot more about Chegg's b

RE: Change config set for a collection

2013-10-17 Thread Garth Grimm
But if you're working with multiple configs in zookeeper, be aware that 4.5 currently has an issue creating multiple collections in a cloud that has multiple configs. It's targeted to be fixed whenever 4.5.1 comes out. https://issues.apache.org/jira/i#browse/SOLR-5306 -Original Message---

Re: Change config set for a collection

2013-10-17 Thread michael.boom
Thank you, Shawn! "linkconfig" - that's exactly what i was looking for! -- View this message in context: http://lucene.472066.n3.nabble.com/Change-config-set-for-a-collection-tp4096032p4096134.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Timeout Errors while using Collections API

2013-10-17 Thread Grzegorz Sobczyk
Thanks, I'll try upgrading. On 17 October 2013 15:55, Mark Miller wrote: > There was a reload bug in SolrCloud that was fixed in 4.4 - > https://issues.apache.org/jira/browse/SOLR-4805 > > Mark > > On Oct 17, 2013, at 7:18 AM, Grzegorz Sobczyk wrote: > > > Sorry for previous spam (something ate m

Re: Change config set for a collection

2013-10-17 Thread Shawn Heisey
On 10/17/2013 2:36 AM, michael.boom wrote: > This question was also asked some 10 months ago in > http://lucene.472066.n3.nabble.com/SolrCloud-4-1-change-config-set-for-a-collection-td4037456.html, > and then the answer was negative, but here it goes again, maybe now it's > different. > > Is it possibl

Re: ExtractRequestHandler, skipping errors

2013-10-17 Thread Koji Sekiguchi
Hi Roland, (13/10/17 20:44), Roland Everaert wrote: Hi, I helped a customer to deploy solr+manifoldCF and everything is going quite smoothly, but every time solr raises an exception, the manifoldcfjob feeding solr aborts. I would like to know if it is possible to configure the ExtractRequ

Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
Thanks Primoz, I was suspecting that too. But then, it's hard to imagine that only the query cache is contributing to the big performance hit. The setting applies to the old configuration, and it works pretty well even with the query cache's low hit rate. -- View this message in context: http://lucene

Re: limiting deep pagination

2013-10-17 Thread Peter Keegan
Yes, right now this constraint could be implemented in either the web app or Solr. I see now that many of the QTimes on these queries are <10 ms (probably due to caching), so I'm a bit less concerned. On Wed, Oct 16, 2013 at 2:13 AM, Furkan KAMACI wrote: > I just wonder that: Don't you implement

Re: Timeout Errors while using Collections API

2013-10-17 Thread Mark Miller
There was a reload bug in SolrCloud that was fixed in 4.4 - https://issues.apache.org/jira/browse/SOLR-4805 Mark On Oct 17, 2013, at 7:18 AM, Grzegorz Sobczyk wrote: > Sorry for previous spam (something ate my message) > > I have the same problem but with reload action > ENV: > - 3x Solr 4.2.

Re: Solr errors

2013-10-17 Thread Roland Everaert
I have just found this JIRA report, which could explain your problem: https://issues.apache.org/jira/browse/SOLR-2416 Regards, Roland. On Thu, Oct 17, 2013 at 3:30 PM, wonder wrote: > Thanks for the answer. Yes, Tika extracts, but does not index the content. Here is the > solr response > ... > "content":

Re: Regarding Solr Cloud issue...

2013-10-17 Thread Chris
I am also trying with something like - java -Durl=http://domainname.com:1981/solr/web/update -Dtype=application/json -jar /solr4RA/example1/exampledocs/post.jar /root/Desktop/web/*.json but it is giving error - 19:06:22 ERROR SolrCore org.apache.solr.common.SolrException: Unknown command: subDoma

Re: Solr errors

2013-10-17 Thread wonder
Thanks for the answer. Yes, Tika extracts, but does not index the content. Here is the solr response ... "content": [ " 9118_xmessengereu_v18ximpda.jar dimonvideo.ru.txt " ], ... None of these files are in the index. Any ideas? 17.10.2013 17:20, Roland Everaert wrote: Even though I haven't tested it myself, you ca

Re: Solr errors

2013-10-17 Thread Roland Everaert
Even though I haven't tested it myself, you can use Tika; it is able to extract documents from zip archives and index them, but of course it depends on the file type in the archive. Regards, Roland. On Thu, Oct 17, 2013 at 2:36 PM, wonder wrote: > Does anybody know how to index files in zip archives? >

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-17 Thread Jack Krupansky
The default Solr stopwords.txt file is empty, so SOMEBODY created that non-empty stop words file. The StopFilterFactory token filter in the field type analyzer controls stop word processing. You can remove that step entirely, or different field types can reference different stop word files, or

Re: Regarding Solr Cloud issue...

2013-10-17 Thread Chris
Wow thanks for all that, i just upgraded, linked my plugins & it seems fine so far, but i have run into another issue while adding a document to the solr cloud it says - org.apache.solr.common.SolrException: Unknown document router '{name=compositeId}' in the clusterstate.json i can see -

Re: A few questions about solr and tika

2013-10-17 Thread wonder
Thanks for the answer. If I don't want to store and index any fields I do: multiValued="true"/> multiValued="true"/> multiValued="true"/> multiValued="true"/> multiValued="true"/> multiValued="true"/> multiValued="true"/> multiValued="true"/> multiValued="true"/> stored="false" multiValued="true"/> O

Re: Solr errors

2013-10-17 Thread wonder
Does anybody know how to index files in zip archives?

ExtractRequestHandler, skipping errors

2013-10-17 Thread Roland Everaert
Hi, I helped a customer to deploy solr+manifoldCF and everything is going quite smoothly, but every time solr raises an exception, the manifoldcfjob feeding solr aborts. I would like to know if it is possible to configure the ExtractRequestHandler to ignore errors like it seems to be possibl

Solr errors

2013-10-17 Thread wonder
Hello everyone! Please tell me why Solr freezes when I add this file http://yadi.sk/d/dy-RtcHXB7KZU The response from the server does not come. curl "http://localhost:8085/solr/myCollection/update/extract?literal.id=doc1&literal.fileName=as&uprefix=attr_&&commit=true"; -F "myfile=@/media/PENDR

RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-17 Thread Akkinepalli, Bharat (ELS-CON)
Thanks Shalin. Regards, Bharat Akkinepalli -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Thursday, October 17, 2013 1:18 AM To: solr-user@lucene.apache.org Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after de

Re: SolrCloud on SSL

2013-10-17 Thread Christopher Gross
Tim, if a separate VLAN was an option, I wouldn't be trying to use SSL. -- Chris On Wed, Oct 16, 2013 at 7:27 PM, Tim Vaillancourt wrote: > Not important, but I'm also curious why you would want SSL on Solr (adds > overhead, complexity, harder-to-troubleshoot, etc)? > > To avoid the overhead, c

Re: Timeout Errors while using Collections API

2013-10-17 Thread Grzegorz Sobczyk
Sorry for previous spam (something ate my message) I have the same problem but with reload action ENV: - 3x Solr 4.2.1 with 4 cores each - ZK Before error I have: - 14, 2013 5:25:36 AM CollectionsHandler handleReloadAction INFO: Reloading Collection : name=products&action=RELOAD - hundreds of (

Re: Timeout Errors while using Collections API

2013-10-17 Thread Grzegorz Sobczyk
On 16 October 2013 11:48, RadhaJayalakshmi wrote: > Hi, > My setup is > Zookeeper ensemble - running with 3 nodes > Tomcats - 9 Tomcat instances are brought up, by registering with > zookeeper. > > Steps : > 1) I uploaded the solr configuration like db_data_config, solrconfig, > schema > xmls int

Status of wiki documentation on grouping under distributed search

2013-10-17 Thread Jackson, Andrew
On the SolrCloud wiki page (https://wiki.apache.org/solr/SolrCloud), I found this statement: The Grouping feature only works if groups are in the same shard. You must use the custom sharding feature to use the Grouping feature. However, the Distributed Search page (https://wiki.apache.o

Re: A few questions about solr and tika

2013-10-17 Thread primoz . skale
Why don't you check these: - Content extraction with Apache Tika ( http://www.youtube.com/watch?v=ifgFjAeTOws) - ExtractingRequestHandler ( http://wiki.apache.org/solr/ExtractingRequestHandler) - Uploading Data with Solr Cell using Apache Tika ( https://cwiki.apache.org/confluence/display/solr/Upl

A few questions about solr and tika

2013-10-17 Thread wonder
Hello everyone! Please tell me how and where to set Tika options in Solr. Where is the Tika conf? I want to know how I can eliminate response attributes that I do not need (such as links or images). Also, I am interested in how I can get and index only metadata in several file formats.

measure result set quality

2013-10-17 Thread Alvaro Cabrerizo
Hi, Imagine the following situation. You have a corpus of documents and a list of queries extracted from a production environment. The corpus hasn't been manually annotated with relevant/non-relevant tags for every query. Then you configure various solr instances changing the schema (adding synonyms, sto

Change config set for a collection

2013-10-17 Thread michael.boom
This question was also asked some 10 months ago in http://lucene.472066.n3.nabble.com/SolrCloud-4-1-change-config-set-for-a-collection-td4037456.html, and then the answer was negative, but here it goes again, maybe now it's different. Is it possible to change the config set of a collection using the Co

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-17 Thread Stavros Delisavas
Thank you, I found the file with the stopwords and noticed that my local file is empty (comments only) and the one on my webserver has a big list of English stopwords. That seems to be the problem. I think in general it is a good idea to use stopwords for random searches, but it is not useful in

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-17 Thread Upayavira
Stopwords are small words such as "and", "the" or "is", that we might choose to exclude from our documents and queries because they are such common terms. Once you have stripped stop words from your above query, all that is left is the word "wild", or so it is being suggested. Somewhere in your config
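The effect described above can be illustrated with a few lines that drop stop words from a query string. This is not Solr's StopFilterFactory, only a whitespace-splitting sketch of the idea; the query "the wild and the" is a made-up stand-in, since the original query is not shown in this message, and the stop word list is just the three examples named above.

```python
def strip_stop_words(query, stop_words):
    """Return the query terms that survive stop word removal."""
    stop = {word.lower() for word in stop_words}
    # Naive whitespace tokenization; a real analyzer chain does more than this.
    return [term for term in query.split() if term.lower() not in stop]


# Hypothetical query: every term except "wild" is a stop word.
remaining = strip_stop_words("the wild and the", ["and", "the", "is"])
print(remaining)
```

With an empty stop word list every term survives, which would explain why two servers with different stopwords.txt files can treat the same query differently.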