Re: Bloom filter

2014-07-29 Thread jim ferenczi
Hi Per, First of all the BloomFilter implementation in Lucene is not exactly a bloom filter. It uses only one hash function and you cannot set the false positive ratio beforehand. ElasticSearch has its own bloom filter implementation (using "guava like" BloomFilter), you should take a look at their
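To illustrate the distinction drawn above (a single hash function with no tunable error rate versus a filter sized from a target false-positive ratio), here is a minimal Python sketch. It is neither Lucene's nor ElasticSearch's implementation; the class name, the SHA-256 based double hashing, and the sizing helper are assumptions chosen for illustration. The sizing uses the standard formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.

```python
import hashlib
import math


def optimal_parameters(expected_items: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (number of bits m, number of hash functions k) for a target error rate."""
    m = math.ceil(-expected_items * math.log(false_positive_rate) / (math.log(2) ** 2))
    k = max(1, round(m / expected_items * math.log(2)))
    return m, k


class BloomFilter:
    """Illustrative Bloom filter; hypothetical class, not a Lucene or ElasticSearch API."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        self.m, self.k = optimal_parameters(expected_items, false_positive_rate)
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from two base hashes (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the stride is never zero
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))
```

With a single hash function (k = 1) the only way to lower the error rate is to grow the bit array, which is why a fixed-k filter cannot be tuned to a chosen false-positive ratio in the same way.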

Re: integrating Accumulo with solr

2014-07-29 Thread Ali Nazemian
Sure, Thank you very much for your guide. I think I am not that kind of gunslinger and probably I will go for another NoSQL that can be integrated with solr/elastic search much easier:) Best regards. On Sun, Jul 27, 2014 at 5:02 PM, Jack Krupansky wrote: > Right, and that's exactly what DataSta

Re: Shuffle results a little

2014-07-29 Thread babenis
Despite the fact that I upgraded to 4.9.0, grouping doesn't seem to work on a multi-valued field, i.e. I was going to try to group by tags + brand (where tags is a multi-valued field) and spread results apart or select unique combinations only -- View this message in context: http://lucene.47206

Question on multi-threaded faceting

2014-07-29 Thread Vamsee Yarlagadda
Hi, I am trying to work with multi-threaded faceting on SolrCloud and in the process I was hit by some issues. I am currently running the below upstream test on different SolrCloud configurations and I am getting a different result set per configuration. https://github.com/apache/lucene-solr/

Re: fq & bq

2014-07-29 Thread Jack Krupansky
Apply the boost to the specific query term you want boosted, like Name:Car*^200. -- Jack Krupansky -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) Sent: Tuesday, July 29, 2014 3:16 PM To: Jack Krupansky Cc: solr-user@lucene.apache.org Subject: RE: f
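As a rough sketch of the advice above, the boosted term can be placed directly in the q parameter and URL-encoded before it is sent. The helper name and the extra wt=json parameter are assumptions for illustration; only the Name:Car*^200 syntax comes from the message.

```python
from urllib.parse import parse_qs, urlencode


def boosted_term(field: str, prefix: str, boost: int) -> str:
    """Build a prefix query on one field with a per-term boost, e.g. Name:Car*^200."""
    return f"{field}:{prefix}*^{boost}"


# Hypothetical request parameters; wt=json is only there to make the example complete.
params = {"q": boosted_term("Name", "Car", 200), "wt": "json"}
query_string = urlencode(params)

# Decoding the query string again recovers the original boosted term.
decoded = parse_qs(query_string)
print(query_string)
print(decoded["q"][0])
```

The characters :, * and ^ are percent-encoded by urlencode, which avoids problems when the query is pasted into a URL by hand.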

Re: Character encoding problems

2014-07-29 Thread Paul Libbrecht
> If you are seeing " appelé au téléphone" in the browser, I would guess > that the data is being rendered in UTF-8 by your server and the content type > of the html is set to iso-8859-1 or not being set and your browser is > defaulting to iso-8859-1. > > You can force the encoding to utf-8

Re: Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread Anshum Gupta
Use the Split shard API to split an existing shard. It splits an existing shard into 2. https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3 You can then split the sub-shards. One thing to note though is that the Admin UI still doesn't comprehend the difference betw
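A minimal sketch of how such a split request could be built, assuming a Solr node at http://localhost:8983/solr and a collection called collection1 (both are placeholders, not taken from the thread). The action, collection and shard parameters are those of the Collections API SPLITSHARD call.

```python
from urllib.parse import urlencode


def split_shard_url(base_url: str, collection: str, shard: str) -> str:
    """Return the Collections API URL that asks Solr to split one shard in two."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and collection name; adjust to the actual cluster.
url = split_shard_url("http://localhost:8983/solr", "collection1", "shard1")
print(url)
```

Splitting the resulting sub-shards, as the message suggests, would mean repeating the call with each sub-shard's name in the shard parameter.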

Re: Character encoding problems

2014-07-29 Thread François Schiettecatte
Hi If you are seeing " appelé au téléphone" in the browser, I would guess that the data is being rendered in UTF-8 by your server and the content type of the html is set to iso-8859-1 or not being set and your browser is defaulting to iso-8859-1. You can force the encoding to utf-8 in the
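The mismatch described above can be reproduced in a few lines of Python: text encoded as UTF-8 but decoded as ISO-8859-1 turns each accented character into two garbage characters. This is only a demonstration of the symptom, not a diagnosis of the poster's setup.

```python
original = "appelé au téléphone"

# The server sends UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...but the browser interprets them as ISO-8859-1, producing mojibake.
misread = utf8_bytes.decode("iso-8859-1")

# Reversing the wrong decoding step recovers the original text.
repaired = misread.encode("iso-8859-1").decode("utf-8")

print(misread)
print(repaired)
```

If the page shows the doubled characters, the stored data is likely fine and only the declared content type needs to change.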

Re: Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread avgxm
Thanks a lot, Shawn. I have gotten as far as having the core come up per your instructions. Since numShards was set to 1, what is the next step to add more shards? Is it /admin/collections?action=CREATESHARD... or something else? Ultimately, I'd like to have shard1, shard2, shard3, with "router

Re: Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread Shawn Heisey
On 7/29/2014 2:23 PM, avgxm wrote: > Is there a correct way to take an existing Solr index (core.properties, > conf/, data/ directories from a standalone Solr instance) and copy it over > to a Solr cloud, with shards, without having to use import or re-indexing? > Does anyone know the proper steps

Re: Search results inconsistency when using joins

2014-07-29 Thread heaven
Yup, that's known, added it for future Solr releases. But seems this couldn't be a reason for such results discrepancy. -- View this message in context: http://lucene.472066.n3.nabble.com/Search-results-inconsistency-when-using-joins-tp4149810p4149925.html Sent from the Solr - User mailing list

Re: Scaling Issues

2014-07-29 Thread Ameya Aware
Hi, It is not going slow immediately.. it goes slow after a while. It is not doing Solr commit on each document. Size of Solr index - 117.31 MB for which enough system memory is available for file caching. I have Solr auto-commit enabled. 5000 false Thanks, Ameya On Tu

Copy existing index from standalone Solr to Solr cloud

2014-07-29 Thread avgxm
Is there a correct way to take an existing Solr index (core.properties, conf/, data/ directories from a standalone Solr instance) and copy it over to a Solr cloud, with shards, without having to use import or re-indexing? Does anyone know the proper steps to accomplish this type of a move? The ta

Re: Scaling Issues

2014-07-29 Thread Jack Krupansky
Make sure it isn't doing a Solr commit on each document. Is it slow immediately, like on the first 100 documents, or only after a while? When you do see it indexing very slowly, check the size of the Solr index - you should make sure that you have enough system memory available for file caching

Re: Search results inconsistency when using joins

2014-07-29 Thread Yonik Seeley
The join qparser has no "fq" parameter, so that is ignored. -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Tue, Jul 29, 2014 at 12:12 PM, heaven wrote: > _query_:"{!join from=profile_ids_im to=id_i v=$qTweet107001860 > fq=$fqTweet107001860

Re: Scaling Issues

2014-07-29 Thread Ameya Aware
yeah.. i tried that.. with null output connector all the files gets crawled in simply one hour.. On Tue, Jul 29, 2014 at 4:00 PM, Toke Eskildsen wrote: > Ameya Aware [ameya.aw...@gmail.com] wrote: > > I am using Apache ManifoldCF framework which connects to my local system > > and passes all th

RE: Scaling Issues

2014-07-29 Thread Toke Eskildsen
Ameya Aware [ameya.aw...@gmail.com] wrote: > I am using Apache ManifoldCF framework which connects to my local system > and passes all the documents in C drive to Solr. > There is total 362GB of data needs to be indexed. I am not performing any > complex analysis. If you are indexing "random" fil

Re: Character encoding problems

2014-07-29 Thread Gulliver Smith
Thanks for the information about URIEncoding="UTF-8" in the tomcat conf file, but that doesn't answer my main concerns: - what is the character encoding of the text in the title_fr field? - is there any way to force it to be UTF-8? On Tue, Jul 29, 2014 at 8:35 AM, wrote: > Hi, > > If you use sol

RE: fq & bq

2014-07-29 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi Jack, Thanks for the response. I have a question: using the standard parser, can I boost a field based on its value? E.g. if the Name field has Car, boost the results. I tried using the below query but it is getting all the data. Is there any other way I need to do this, e.g. q=(Name:Car) OR (Nam

Re: Scaling Issues

2014-07-29 Thread Ameya Aware
I am using Apache ManifoldCF framework which connects to my local system and passes all the documents in C drive to Solr. I am not doing any searches while indexing. There is total 362GB of data needs to be indexed. I am not performing any complex analysis. Thanks, Ameya On Tue, Jul 29, 2014

RE: Scaling Issues

2014-07-29 Thread Toke Eskildsen
Ameya Aware [ameya.aw...@gmail.com] wrote: [Solr -Xmx5120m] > I need to index around 30 documents but with above parameters > performance is coming very poor around 15000-2 documents per hour. 4-5 documents/second is a lot less than the numbers people normally cite, but we need to know

Re: Scaling Issues

2014-07-29 Thread Erick Erickson
95+ % of the time problems like this are not Solr, but the data acquisition, i.e. querying the DB, traversing the file system etc. We need to have an idea of what the indexing pipeline is all about before saying anything coherent. If you're using extractingrequesthandler for Word, PDFs, etc, you

Re: Scaling Issues

2014-07-29 Thread Timothy Potter
Hi Ameya, Tough to say without more information about what's slow. In general, when I've seen Solr index that slow, it's usually related to some complex text analysis, for instance, are you doing any phonetic analysis? Best thing to do is attach a Java profiler (e.g. JConsole or VisualVM) using rm

RE: Scaling Issues

2014-07-29 Thread Boogie Shafer
when you say performance is very poor, what is happening at the system level? e.g. are CPUs pegged out? is there a lot of IO wait? is the storage busy? is the network busy? some easy tools to watch this stuff live if you aren't sure and don't have full on system monitoring agents installed

Scaling Issues

2014-07-29 Thread Ameya Aware
Hi, I am running Solr with below parameters: -XX:MaxPermSize=128m -Xms5120m -Xmx5120m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts -XX:-UseGC

RE: Solr Wiki ContributorsGroup request

2014-07-29 Thread Chris Hostetter
: One of the specific pages that I've been looking at is the page on : compiling Solr: https://wiki.apache.org/solr/HowToCompileSolr . I go for it... : haven't seen a corresponding page in Confluence. Could you help me : understand the relationship between the wiki and the Confluence : doc

RE: Solr Wiki ContributorsGroup request

2014-07-29 Thread Edwards, Joshua
Thanks, Steve - One of the specific pages that I've been looking at is the page on compiling Solr: https://wiki.apache.org/solr/HowToCompileSolr . I haven't seen a corresponding page in Confluence. Could you help me understand the relationship between the wiki and the Confluence documentation

Re: Search results inconsistency when using joins

2014-07-29 Thread heaven
Just tried to remove joins and it worked as expected: q: ( _query_:"{!edismax qf='name_small_ngram' mm='1'}-foundation -association -organization -hospital -charity -news -info" AND ( _query_:"{!edismax qf='name_small_ngram emails_words_ngram sites_words_ngram rss_categories_texts twitter_

Re: SolrCloud without NRT and indexing only on the master

2014-07-29 Thread Erick Erickson
bq: What if I don't need NRT and in particular want the slave to use all resources for query answering, i.e. only the master shall index. But at the same time I want all the other benefits of SolrCloud. You want all the benefits of SolrCloud without... using SolrCloud? Your only two choices are t

Search results inconsistency when using joins

2014-07-29 Thread heaven
I was thinking these 2 queries should yield same results: q: ( _query_:"{!edismax qf='name_small_ngram' mm='1'}-foundation -association -organization -hospital -charity -news -info" AND ( _query_:"{!edismax qf='name_small_ngram emails_words_ngram sites_words_ngram rss_categories_texts twitt

Solr Wiki ContributorsGroup request

2014-07-29 Thread Edwards, Joshua
Hello - My name is Josh Edwards, and I am starting to get spun up on Solr, as we are investigating using it at our organization. As I work my way through the wiki, I would like the ability to update the documentation as I find typos and things that are out of date. Would it be possible for me

Re: Solr Wiki ContributorsGroup request

2014-07-29 Thread Steve Rowe
Hi Josh, I’ve added you to the Solr ContributorsGroup page. Note that the Solr Reference Guide[1] is now Solr’s official documentation, and that stale documentation you find in the wiki may have already been fixed there. I encourage you to comment on Solr Reference Guide pages where you find

Re: how to extract stats component with solrj 4.9.0

2014-07-29 Thread Shawn Heisey
On 7/28/2014 11:12 AM, Edith Au wrote: > I found this method FieldStatsInfo().getFacets() in the Solr 4.9.0 doc. > But it seems to me the method is missing in my Solrj 4.9.0 distribution. > Could this be a bug? or I have a bad distro? If you're sure that you have the 4.9.0 jar, that sounds like

RE: fq & bq

2014-07-29 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi Jack, My q:Fusion defType=edismax&q=Fusion&bq=(Type:"Car")^1000&fq=Country:US... facet fields and other join condition (I am using join for 2 cores). With q & fq I am getting results.. and q & bq I am getting results.. But q,bq,fq, returns 0 records.. Thanks Ravi -Original Message

Re: Solr on AWS ubuntu12.04 instance

2014-07-29 Thread Shalin Shekhar Mangar
Hi Pushkar, Not related to your problem but you should think about using the solr-scale-tk to setup your environment -- it really takes the pain away. https://github.com/LucidWorks/solr-scale-tk http://searchhub.org/2014/06/03/introducing-the-solr-scale-toolkit/ On Tue, Jul 29, 2014 at 6:45 PM,

RE: copy EnumField to text field

2014-07-29 Thread Elran Dvir
Hi all, I got an answer about default value. But what about the code change I suggested? Do you think it's good? For your convenience, I am rewriting my original message: I have an enumField called severity. these are its relevant definitions in schema.xml: And in enumsConfig.xml: N

Re: Solr on AWS ubuntu12.04 instance

2014-07-29 Thread aurelien . mazoyer
Ooops, didn't see Andrew's answer: sorry for my redundant answer :-) Aurélien On 29.07.2014 15:47, aurelien.mazo...@francelabs.com wrote: Hi Pusakar, Did you try to ping your solr from localhost in your ssh console: curl http://localhost:8983(or 8984 if you change the jetty port)/solr/collecti

Re: Solr on AWS ubuntu12.04 instance

2014-07-29 Thread Andrew Pawloski
Is port 8984 open in your ec2's security settings? From the ec2 instance, can you curl localhost:8984/solr? Do you see anything? On Tue, Jul 29, 2014 at 9:15 AM, pushkar sawant wrote: > Hi Team, > I have done Solr 4.9.0 setup on ubuntu 12.04 instance on AWS. > with Java 7. When i start the

Re: Solr on AWS ubuntu12.04 instance

2014-07-29 Thread aurelien . mazoyer
Hi Pusakar, Did you try to ping your solr from localhost in your ssh console: curl http://localhost:8983(or 8984 if you change the jetty port)/solr/collection1/admin/ping ? Aurélien On 29.07.2014 15:15, pushkar sawant wrote: Hi Team, I have done Solr 4.9.0 setup on ubuntu 12.04 instance on

Solr on AWS ubuntu12.04 instance

2014-07-29 Thread pushkar sawant
Hi Team, I have done Solr 4.9.0 setup on ubuntu 12.04 instance on AWS. with Java 7. When I start the solr with "java -jar start.jar" it starts with attached output. It says: 5460 [main] INFO org.eclipse.jetty.server.AbstractConnector – Started SocketConnector@0.0.0.0:8984 When I try to open it

Re: SolrCloud without NRT and indexing only on the master

2014-07-29 Thread Mikhail Khludnev
I never did it, but always like. http://lucene.472066.n3.nabble.com/Best-practice-for-rebuild-index-in-SolrCloud-td4054574.html From time to time such recipes are mentioned in the list. On Tue, Jul 29, 2014 at 12:39 PM, Harald Kirsch wrote: > Hi all, > > from the Solr documentation I find two

Re : Re: Multipart documents with different update cycles

2014-07-29 Thread aurelien . mazoyer
Yes, that is the point : I have to handle complex queries that perform full text search both on user-metadata and main part of documents :-(... Aurélien Do you search the frequently changing user-metadata? If not, maybe the external file field is helpful. https://cwiki.apache.org/confluence/di

Re: fq & bq

2014-07-29 Thread Jack Krupansky
Boosting simply rearranges the search results, but does not affect the count. Sounds like no result documents are matching your filter query. What is your "q" parameter? Try with q and fq alone as see if you get any results. Try with q set to your fq query alone and see what results you get,
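The point that a boost reorders results while a filter removes them can be shown with a toy in-memory model. This is not Solr's scoring or query parsing; the documents, field values and the search helper are invented for illustration only.

```python
# Hypothetical documents; field names mirror the thread, values are made up.
docs = [
    {"id": 1, "name": "Fusion", "Type": "Razor", "Country": "US"},
    {"id": 2, "name": "Fusion", "Type": "Car", "Country": "US"},
    {"id": 3, "name": "Fusion", "Type": "Car", "Country": "DE"},
]


def search(docs, matches, filters=(), boosts=()):
    """Toy search: filters restrict the hit set, boosts only change the ordering."""
    hits = [d for d in docs if matches(d) and all(f(d) for f in filters)]

    def score(d):
        # Base score of 1.0 plus the weight of every boost predicate that matches.
        return 1.0 + sum(weight for predicate, weight in boosts if predicate(d))

    return sorted(hits, key=score, reverse=True)


def is_fusion(d):
    return d["name"] == "Fusion"


def in_us(d):
    return d["Country"] == "US"


def is_car(d):
    return d["Type"] == "Car"


plain = search(docs, is_fusion, filters=[in_us])
boosted = search(docs, is_fusion, filters=[in_us], boosts=[(is_car, 1000)])
empty = search(docs, is_fusion, filters=[lambda d: d["Country"] == "FR"], boosts=[(is_car, 1000)])

print([d["id"] for d in plain])
print([d["id"] for d in boosted])
print(len(empty))
```

In this model, zero results with q, bq and fq together can only come from the filter matching nothing in combination with the main query, which is consistent with the suggestion to test q and fq on their own.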

Re: Character encoding problems

2014-07-29 Thread aurelien . mazoyer
Hi, If you use solr 4.8.1, you don't have to add URIEncoding="UTF-8" in the tomcat conf file anymore : https://wiki.apache.org/solr/SolrTomcat Regards, Aurélien MAZOYER On 29.07.2014 14:22, Gulliver Smith wrote: I have solr 4.8.1 under Tomcat 7 on Debian Linux. The connector in Tomcat's se

Character encoding problems

2014-07-29 Thread Gulliver Smith
I have solr 4.8.1 under Tomcat 7 on Debian Linux. The connector in Tomcat's server.xml has been changed to include character encoding UTF-8: I am posting to the server from PHP 5.5 curl. The extract POST was intercepted and confirmed that everything is being encoded in UTF-8. However, the resp

fq & bq

2014-07-29 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, I am using the bq to boost a particular value of a field. But when I try to add a filter query (fq) the results are zero. Do you have thoughts, can I use both bq and fq together...? &bq=(Type:"Car")^1000&fq=Country:US Thanks Ravi

SolrCloud without NRT and indexing only on the master

2014-07-29 Thread Harald Kirsch
Hi all, from the Solr documentation I find two options how replication of an indexing is handled: a) SolrCloud indexes on master and all slaves in parallel to support NRT (near realtime search) b) Legacy replication where only the master does the indexing and slave receive index copies onc

RE: crawling all links of same domain in nutch in solr

2014-07-29 Thread Markus Jelsma
Hi - use the domain URL filter plugin and list the domains, hosts or TLD's you want to restrict the crawl to. -Original message- > From:Vivekanand Ittigi > Sent: Tuesday 29th July 2014 7:17 > To: solr-user@lucene.apache.org > Subject: crawling all links of same domain in nutch in so