Re: find all two word phrases that appear in more than one document

2013-09-09 Thread Ali, Saqib
/ > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch > - Time is the quality of nature that keeps events from happening all at > once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) > > > On Tue, Sep 10, 2013 at 8:22 AM, Ali, Saqib wrote:

find all two word phrases that appear in more than one document

2013-09-09 Thread Ali, Saqib
Dear Solr Ninjas, We would like to run a query that returns two-word phrases that appear in more than one document. For example, take the string "Solr Ninja". Since it appears in more than one document in our Solr instance, the query should return it. The query should find all such phrases from

Re: removing duplicates

2013-08-21 Thread Ali, Saqib
, > Aloke > > > On Thu, Aug 22, 2013 at 2:44 AM, Ali, Saqib wrote: > > > hello, > > > > We have documents that are duplicates i.e. the ID is different, but rest > of > > the fields are same. Is there a query that can remove duplicate, and just > > leave

removing duplicates

2013-08-21 Thread Ali, Saqib
Hello, We have documents that are duplicates, i.e. the ID is different but the rest of the fields are the same. Is there a query that can remove the duplicates and leave just one copy of each document in Solr? There is one numeric field that we can key off to find duplicates. Please advise. Thanks
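A minimal client-side sketch of the "key off one numeric field" idea, assuming the documents have already been fetched from Solr as dictionaries. The field names `id` and `dup_key` are hypothetical placeholders, and this is not a built-in Solr query; it only decides which IDs could be passed to a delete-by-id request afterwards.

```python
def ids_to_delete(docs, key_field="dup_key", id_field="id"):
    """Return the IDs of every document except the first one seen per key value.

    `docs` is any iterable of dicts; `key_field` and `id_field` are
    hypothetical field names chosen for this illustration.
    """
    seen = set()
    extra_ids = []
    for doc in docs:
        key = doc[key_field]
        if key in seen:
            # A document with this key was already kept, so this one is a duplicate.
            extra_ids.append(doc[id_field])
        else:
            seen.add(key)
    return extra_ids


if __name__ == "__main__":
    sample = [
        {"id": "a", "dup_key": 1},
        {"id": "b", "dup_key": 1},
        {"id": "c", "dup_key": 2},
    ]
    print(ids_to_delete(sample))
```

Which copy survives depends only on the order in which the documents are read, so sorting the input first would make the choice deterministic.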

Re: [solr 4.4.0] SPLITSHARD and core autodiscovery

2013-08-02 Thread Ali, Saqib
Dmitry, That is expected behaviour. You need to manually remove the original core. Thanks. On Fri, Aug 2, 2013 at 6:03 AM, Dmitry Kan wrote: > Hello list, > > I was wondering, if what I see with the split shard a correct behaviour or > is something wrong. > > Following this article: > > http:

Re: uniqueKey: string vs. long integer

2013-08-01 Thread Ali, Saqib
, is the > key used in numeric calculations? Can it be negative? Is it ever sorted? > > But as a Solr best practice, I'd advise against it. > > -- Jack Krupansky > > -Original Message- > From: Ali, Saqib > Sent: Thursday, August 01, 2013 12:02 PM > To: solr

uniqueKey: string vs. long integer

2013-08-01 Thread Ali, Saqib
We have an application that was developed by a third party. It uses a uniqueKey that is a long integer instead of a string. Will there be any repercussions of using a long integer instead of a string for the uniqueKey? Thanks! :)

Re: FieldCollapsing issues in SolrCloud 4.4

2013-07-31 Thread Ali, Saqib
grouping field. > > If you have many groups, you may also experience a huge performance > hit, as the current implementation has been heavily optimized for low > number of groups (e.g. e-commerce categories). > > Paul > > > > On Wed, Jul 31, 2013 at 1:59 AM, Ali, Saqib

FieldCollapsing issues in SolrCloud 4.4

2013-07-30 Thread Ali, Saqib
Hello all, Is anyone experiencing issues with the numFound when using group=true in SolrCloud 4.4? Sometimes the results are off for us. I will post more details shortly. Thanks.

Using HP SiteScope to monitor individual Solr shards

2013-07-30 Thread Ali, Saqib
We would like to use HP SiteScope to monitor the availability of the individual Solr shards. Any ideas on how we can do that? Is there a shard-based URL that is a sure-shot way of knowing that the shard is healthy? Thanks! :)
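One possible shape for a per-shard URL, sketched under the assumption that each core has the ping request handler (`/admin/ping`) enabled in its solrconfig.xml. The host, port and core name below are hypothetical, and how SiteScope would consume the URL is not covered here.

```python
from urllib.parse import urlencode


def ping_url(base_url, core):
    """Build the ping-handler URL for one core.

    Assumes the core exposes /admin/ping; `base_url` and `core`
    are placeholders supplied by the caller.
    """
    query = urlencode({"wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/admin/ping?{query}"


if __name__ == "__main__":
    # Hypothetical node and core names, for illustration only.
    print(ping_url("http://localhost:8983/solr", "collection1_shard1_replica1"))
```

A monitor could then request this URL on every node that hosts a replica and alert on a non-OK response.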

Re: monitor jvm heap size for solrcloud

2013-07-26 Thread Ali, Saqib
You can use SPM (I think): http://sematext.com/spm/solr-performance-monitoring/ On Fri, Jul 26, 2013 at 1:36 PM, Joshi, Shital wrote: > We have SolrCloud cluster (5 shards and 2 replicas) on 10 boxes. While > running stress tests, we want to monitor JVM heap size across 10 nodes. Is > there a u

maximum number of documents per shard?

2013-07-23 Thread Ali, Saqib
still 2.1 billion documents?

Re: zkHost in solr.xml goes missing after SPLITSHARD using Collections API

2013-07-23 Thread Ali, Saqib
Thanks Alan and Shawn. Just installed Solr 4.4, and no longer experiencing the issue. Thanks! :) On Tue, Jul 23, 2013 at 7:21 AM, Shawn Heisey wrote: > On 7/23/2013 7:50 AM, Alan Woodward wrote: > > Can you try upgrading to the just-released 4.4? Solr.xml persistence > had all kinds of bugs i

zkHost in solr.xml goes missing after SPLITSHARD using Collections API

2013-07-23 Thread Ali, Saqib
Hello all, Every time I issue a SPLITSHARD using Collections API, the zkHost attribute in the solr.xml goes missing. I have to manually edit the solr.xml to add zkHost after every SPLITSHARD. Any thoughts on what could be causing this? Thanks.
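For reference, a sketch of how such a SPLITSHARD request is typically put together for the Collections API. The base URL, collection name and shard name are hypothetical, and the snippet only builds the URL; it does not send it and says nothing about the zkHost problem itself.

```python
from urllib.parse import urlencode


def splitshard_url(base_url, collection, shard):
    """Build a Collections API SPLITSHARD URL.

    `base_url` is the Solr webapp root; the collection and shard
    names are placeholders supplied by the caller.
    """
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(splitshard_url("http://localhost:8983/solr", "collection1", "shard1"))
```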

Re: add to ContributorsGroup - Instructions for setting up SolrCloud on jboss

2013-07-17 Thread Ali, Saqib
so long, hadn't looked at the list in a couple of days. > > > Erick > > On Fri, Jul 12, 2013 at 5:46 PM, Ali, Saqib wrote: > > username: saqib > > > > > > On Fri, Jul 12, 2013 at 2:35 PM, Ali, Saqib > wrote: > > > >> Hello, > >>

Re: Where to specify numShards when startup up a cloud setup

2013-07-16 Thread Ali, Saqib
What does the solr.xml look like on the nodes? On Tue, Jul 16, 2013 at 2:36 PM, Robert Stewart wrote: > I want to script the creation of N solr cloud instances (on ec2). > > But it's not clear to me where I would specify numShards setting. > From documentation, I see you can specify on the "first

Re: Clearing old nodes from zookeper without restarting solrcloud cluster

2013-07-15 Thread Ali, Saqib
Hello Luis, I don't think that is possible. If you delete clusterstate.json from ZooKeeper, you will need to restart the nodes. I could be very wrong about this. Saqib On Mon, Jul 15, 2013 at 8:50 PM, Luis Carlos Guerrero Covo < lcguerreroc...@gmail.com> wrote: > I know that you can cl

Re: Book contest idea - feedback requested

2013-07-15 Thread Ali, Saqib
Hello Alex, This sounds like an excellent idea! :) Saqib On Mon, Jul 15, 2013 at 8:11 PM, Alexandre Rafalovitch wrote: > Hello, > > Packt Publishing has kindly agreed to let me run a contest with e-copies of > my book as prizes: > http://www.packtpub.com/apache-solr-for-indexing-data/book > >

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2013-07-12 Thread Ali, Saqib
I am getting a java.lang.OutOfMemoryError: Requested array size exceeds VM limit on certain queries. Please advise: 19:25:02,632 INFO [org.apache.solr.core.SolrCore] (http-oktst1509.company.tld/12.5.105.96:8180-9) [collection1] webapp=/solr path=/select params={sort=sent_date+asc&distrib=false&w

Re: add to ContributorsGroup - Instructions for setting up SolrCloud on jboss

2013-07-12 Thread Ali, Saqib
username: saqib On Fri, Jul 12, 2013 at 2:35 PM, Ali, Saqib wrote: > Hello, > > Can you please add me to the ContributorsGroup? I would like to add > instructions for setting up SolrCloud using Jboss. > > thanks. > >

add to ContributorsGroup - Instructions for setting up SolrCloud on jboss

2013-07-12 Thread Ali, Saqib
Hello, Can you please add me to the ContributorsGroup? I would like to add instructions for setting up SolrCloud using Jboss. thanks.

Re: preferred container for running SolrCloud

2013-07-11 Thread Ali, Saqib
Thanks Walter. And the container.. On Thu, Jul 11, 2013 at 7:55 PM, Walter Underwood wrote: > Embedded Zookeeper is only for dev. Production needs to run a ZK cluster. > --wunder > > On Jul 11, 2013, at 7:27 PM, Ali, Saqib wrote: > > > With the embedded Zookeeper

Re: preferred container for running SolrCloud

2013-07-11 Thread Ali, Saqib
With the embedded Zookeeper or separate Zookeeper? Also, have you run into any issues with running SolrCloud on Jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal wrote: > We're running under jetty. > > Sent from my iPhone > > On Jul 11, 2013, at 6:06 PM, "Ali, Saqi

preferred container for running SolrCloud

2013-07-11 Thread Ali, Saqib
1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?

SolrCloud on Jboss

2013-07-08 Thread Ali, Saqib
Hello, Does anyone have step-by-step instructions for running SolrCloud on Jboss? Thanks

Re: SolrJ and SolrCloud

2013-07-08 Thread Ali, Saqib
Thanks Mark! On Mon, Jul 8, 2013 at 10:46 AM, Mark Miller wrote: > > On Jul 8, 2013, at 1:40 PM, "Ali, Saqib" wrote: > > > Hello all, > > > > We have an app that uses the SolrJ and instantiates using HttpSolrServer. > > > > Now that we woul

SolrJ and SolrCloud

2013-07-08 Thread Ali, Saqib
Hello all, We have an app that uses SolrJ and connects using HttpSolrServer. Now that we would like to move to SolrCloud, can we still use the same app, or do we HAVE to switch to CloudSolrServer server = new CloudSolrServer("?"); right away? Or will point to one instance using Htt

Re: 2.1billion+ document

2013-07-05 Thread Ali, Saqib
main data to > single shards so as to allow isolated queries against dedicated data models > in single shards. > > But if you just want the basics, it really is as easy as described above. > > Jason > > > On Jul 5, 2013, at 7:36 PM, "Ali, Saqib" wrote: > > >

Re: 2.1billion+ document

2013-07-05 Thread Ali, Saqib
port -- http://sematext.com/ > Performance Monitoring -- http://sematext.com/spm > > > > On Fri, Jul 5, 2013 at 8:42 PM, Ali, Saqib wrote: > > Question regarding the 2.1 billion+ document. > > > > I understand that a single instance of solr has a limit of 2.1 billion

2.1billion+ document

2013-07-05 Thread Ali, Saqib
Question regarding 2.1 billion+ documents. I understand that a single instance of Solr has a limit of 2.1 billion documents. We currently have a single Solr server. If we reach the 2.1 billion document limit, what is involved in moving to Solr DistributedSearch? Thanks! :)
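The 2.1 billion figure matches the largest signed 32-bit integer, 2^31 - 1 = 2,147,483,647, which is the commonly cited ceiling for a single Lucene index. A small sketch of the arithmetic for estimating a minimum shard count follows; the exact usable limit is an assumption here and may be slightly lower in practice.

```python
# Assumed per-index ceiling: the largest signed 32-bit integer.
MAX_DOCS_PER_INDEX = 2**31 - 1  # 2,147,483,647


def min_shards(total_docs, per_shard_limit=MAX_DOCS_PER_INDEX):
    """Smallest number of shards so that no shard exceeds the limit.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    if total_docs <= 0:
        return 0
    return -(-total_docs // per_shard_limit)


if __name__ == "__main__":
    for n in (1_000_000_000, 2_147_483_647, 2_147_483_648, 5_000_000_000):
        print(n, min_shards(n))
```

In practice a lower per-shard target is usually chosen for performance reasons, which is why `per_shard_limit` is a parameter rather than a constant.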

Re: [Announcement] Norch- a search engine for node.js

2013-07-05 Thread Ali, Saqib
Very interesting. What is the upper limit on the number of documents? Thanks! :) On Fri, Jul 5, 2013 at 11:53 AM, Fergus McDowall wrote: > Here is some news that might be of interest to users and implementers of > Solr > > > http://blog.comperiosearch.com/blog/2013/07/05/norch-a-search-engine-f

solrj distributed solr example

2013-07-05 Thread Ali, Saqib
Hello all, Can anyone please share a solrj example for distributed solr? Thanks! :)

Re: Moving from single Solr instance to Solr Cloud

2013-07-04 Thread Ali, Saqib
Hello Furkan, We are using Solr 4.3 Thanks On Thu, Jul 4, 2013 at 1:43 AM, Furkan KAMACI wrote: > Which version of Solr you are using? > > 2013/7/4 Ali, Saqib > > > We have single Solr instance with lot of indexed document. Now we would > > like to move to

Re: omitTermFreqAndPositions="true" in easy English, please?

2013-07-03 Thread Ali, Saqib
2013 12:54 AM > > To: solr-user@lucene.apache.org > Subject: Re: omitTermFreqAndPositions="true" in easy English, please? > > Yes, but it is simply doing an AND or OR of the individual terms - no > phrases or implied ordering of the terms. > > -- Jack Krupansky > >

Re: omitTermFreqAndPositions="true" in easy English, please?

2013-07-03 Thread Ali, Saqib
Sorry, change the query to: label: (Google AND Cloud AND Storage) Or will Solr add AND / OR behind the scenes? On Wed, Jul 3, 2013 at 9:59 PM, Ali, Saqib wrote: > So do I have to change my query to > label: (Google Cloud Storage) ? > > or will Solr add AND / OR behind the scenes?
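A sketch of the two ways the operator can be made explicit, assuming the standard query parser: either write AND between the terms, or leave the terms bare and set the default operator with the `q.op` request parameter. The helper name `build_params` is hypothetical; `label` is the field name from the question.

```python
from urllib.parse import urlencode


def build_params(field, terms, default_operator=None):
    """Return request parameters for a multi-term query on one field.

    With no default_operator, AND is written between the terms.
    Otherwise the terms are left bare and q.op carries the operator.
    """
    if default_operator is None:
        return {"q": f"{field}:({' AND '.join(terms)})"}
    return {"q": f"{field}:({' '.join(terms)})", "q.op": default_operator}


if __name__ == "__main__":
    terms = ["Google", "Cloud", "Storage"]
    print(urlencode(build_params("label", terms)))
    print(urlencode(build_params("label", terms, default_operator="AND")))
```

If neither is supplied, the operator between bare terms comes from the parser's configured default, so spelling it out in the query avoids depending on that setting.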

Re: omitTermFreqAndPositions="true" in easy English, please?

2013-07-03 Thread Ali, Saqib
terms. > > > -- Jack Krupansky > > -Original Message- From: Ali, Saqib > Sent: Thursday, July 04, 2013 12:52 AM > To: solr-user@lucene.apache.org > Subject: Re: omitTermFreqAndPositions="true" in easy English, please? > > > Jack, > > Thanks for the expl

Re: omitTermFreqAndPositions="true" in easy English, please?

2013-07-03 Thread Ali, Saqib
needs to be stored in the index. > > -- Jack Krupansky > > -Original Message- From: Ali, Saqib > Sent: Wednesday, July 03, 2013 10:31 PM > To: solr-user@lucene.apache.org > Subject: omitTermFreqAndPositions="true" in easy English, please? > > > Hello, >

Re: Use case indexed="false" stored="false" field

2013-07-03 Thread Ali, Saqib
Thank you Shawn for the excellent use case. :) On Wed, Jul 3, 2013 at 9:34 AM, Shawn Heisey wrote: > On 7/3/2013 9:22 AM, Ali, Saqib wrote: > >> What would be the use case for such a field: >> >> > stored="false"/> >> >> >> a

Re: unused fields in Solr schema.xml increase the index size

2013-07-03 Thread Ali, Saqib
ome need to reference them. > > You should keep your schema and config files in a version control system > so that you can always go back or view differences. > > -- Jack Krupansky > > -Original Message- From: Ali, Saqib > Sent: Wednesday, July 03, 2013 11:55 AM > To

omitTermFreqAndPositions="true" in easy English, please?

2013-07-03 Thread Ali, Saqib
Hello, Can anyone please explain omitTermFreqAndPositions="true" to me in easy English, please? Thanks.

Moving from single Solr instance to Solr Cloud

2013-07-03 Thread Ali, Saqib
We have a single Solr instance with a lot of indexed documents. Now we would like to move to a SolrCloud implementation. Can we move the existing index to SolrCloud? If so, how? Or do we need to reindex our data in SolrCloud? Thanks, Saqib

unused fields in Solr schema.xml increase the index size

2013-07-03 Thread Ali, Saqib
Hello all, Do unused fields in the Solr schema.xml increase the size of the index files? Should we be cleaning up those fields? Thanks. Saqib

Re: Use case indexed="false" stored="false" field

2013-07-03 Thread Ali, Saqib
.. you could still have update processors that look at the values of > "ignored" fields and maybe assigns them to other, non-ignored fields. > > -- Jack Krupansky > > -Original Message- From: Ali, Saqib > Sent: Wednesday, July 03, 2013 11:22 AM > To: solr-user@lucene

Use case indexed="false" stored="false" field

2013-07-03 Thread Ali, Saqib
Hello all, What would be the use case for such a field: and ? Thanks.

Re: copyField and storage requirements

2013-07-02 Thread Ali, Saqib
!!! :) On Tue, Jul 2, 2013 at 11:35 AM, Shawn Heisey wrote: > On 7/2/2013 12:22 PM, Ali, Saqib wrote: > > Newbie question: > > > > We have the following

copyField and storage requirements

2013-07-02 Thread Ali, Saqib
Newbie question: We have the following fields defined in the schema: the content field is about 500KB of data. My question is whether Solr stores the entire contents of that 500KB content field. We want to minimize the stored data in the Solr index, which is why we added the copyField te

Re: Storing Solr Index on NFS

2013-04-15 Thread Ali, Saqib
Apr 15, 2013, at 9:40 AM, Ali, Saqib wrote: > > > Greetings, > > > > Are there any issues with storing Solr Indexes on a NFS share? Also any > > recommendations for using NFS for Solr indexes? > > I recommend that you do not put Solr indexes on NFS. > > It can

Storing Solr Index on NFS

2013-04-15 Thread Ali, Saqib
Greetings, Are there any issues with storing Solr indexes on an NFS share? Also, any recommendations for using NFS for Solr indexes? Thanks, Saqib

Re: secure deployment of solr.war on jboss

2013-04-01 Thread Ali, Saqib
Thanks. Are you using an iptables firewall on the JBoss host to prevent access from other systems? Or are you using some JBoss configuration for that? Thanks, Saqib On Mon, Apr 1, 2013 at 6:25 AM, adityab wrote: > Hi Ali, > > We have Solr 4.2 on Jboss running on a separate VM behind firewall. Only IT

secure deployment of solr.war on jboss

2013-03-31 Thread Ali, Saqib
Hello all, We are using Apache Solr 4.2 in our application to provide search capabilities. We are deploying the solr.war file to jboss along with our application. Any suggestions on proper security controls for this type of solr setup? Also solr is now accessible to everyone from the http://jbos

Re: What is the graceful shutdown API for Solrj embedded?

2013-02-07 Thread Ali, Saqib
Hello Alex, I asked a similar question on server fault: http://serverfault.com/a/474442/156440 On Wed, Feb 6, 2013 at 7:05 PM, Alexandre Rafalovitch wrote: > Hello, > > When I CTRL-C the example Solr, it prints a bunch of graceful shutdown > messages. I assume it shuts down safe and without co

Re: Configuring the jetty shipped with Solr

2013-02-05 Thread Ali, Saqib
l blog: http://blog.outerthoughts.com/ > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch > - Time is the quality of nature that keeps events from happening all at > once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) > > > On Mon, Feb 4, 2013 at 6:41 PM, Ali, Sa

Configuring the jetty shipped with Solr

2013-02-04 Thread Ali, Saqib
Hello all, How do I change the configuration for the Jetty that is shipped with Apache Solr? Where are the configuration files located? I want to restrict the IP addresses that can connect to that instance of Solr. Thanks, Saqib