Re: MoreLikeThis - Odd results - what am I doing wrong?

2013-04-01 Thread DC tech
OK - so I have my SOLR instance running on AWS. Any suggestions on how to safely share the link? Right now, the whole SOLR instance is totally open. Gagandeep singh wrote: >say &debugQuery=true&mlt=true and see the scores for the MLT query, not a >sample query. You can use Amazon ec2 to br

Re: Need Help in Patching OPENNLP

2013-04-01 Thread karthicrnair
Thanks much!! Explorer -- Internet Explorer :) Sorry for the miscommunication. Yeah, let me check it once again. Appreciate all the help :) krn -- View this message in context: http://lucene.472066.n3.nabble.com/Need-Help-in-Patching-OPENNLP-tp4052362p4053094.html Sent from the Solr - User m

Re: Need Help in Patching OPENNLP

2013-04-01 Thread Gora Mohanty
On 2 April 2013 11:17, karthicrnair wrote: > Thanks Gora!! > > when I tried with ping command all my request got timed out. am able to > access the svn through my explorer though. > > What could be the issue now? :( Hard to tell. My guess would be that your network is blocking some things like IC

Re: Need Help in Patching OPENNLP

2013-04-01 Thread karthicrnair
Thanks Gora!! When I tried the ping command, all my requests timed out. I am able to access the svn through my explorer though. What could be the issue now? :( Thanks, krn -- View this message in context: http://lucene.472066.n3.nabble.com/Need-Help-in-Patching-OPENNLP-tp4052362p4053092.htm

Re: Need Help in Patching OPENNLP

2013-04-01 Thread Gora Mohanty
On 2 April 2013 11:08, Gora Mohanty wrote: > On 2 April 2013 11:00, karthicrnair wrote: >> Hi Erick, >> >> thank you so much or the help and support. >> >> As you have mentioned I have made the svn set up and while trying to connect >> using check out option am getting this error; >> >> C:\bin>sv

Re: Need Help in Patching OPENNLP

2013-04-01 Thread Gora Mohanty
On 2 April 2013 11:00, karthicrnair wrote: > Hi Erick, > > thank you so much or the help and support. > > As you have mentioned I have made the svn set up and while trying to connect > using check out option am getting this error; > > C:\bin>svn co https://svn.apache.org/repos/asf/lucene/dev/ Ple

Re: Need Help in Patching OPENNLP

2013-04-01 Thread karthicrnair
Hi Erick, thank you so much for the help and support. As you mentioned, I have set up svn, and while trying to connect using the checkout option I am getting this error: C:\bin>svn co https://svn.apache.org/repos/asf/lucene/dev/ svn: E175002: Unable to connect to a repository at URL 'http

Re: solr4.1 No live SolrServers available to handle this request

2013-04-01 Thread sling
Thanks for your reply. My solr.xml is like this: I have changed the zkClientTimeout from 15s to 30s, but this exception still shows. The load on the SolrCloud servers is not too heavy; they are 1.4 1.5 1. These disconnects appear in the SolrJ logs, while the solrcloud is

Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Otis Gospodnetic
Hi, Solr doesn't have anything like ES River. DIH (DataImportHandler) feels like the closest thing in Solr, though it's not quite the same thing. DIH pulls in data like a typical River does, but most people have external indexers that push data into Solr using one of its client libraries to talk

Re: [ANNOUNCE] Solr wiki editing change

2013-04-01 Thread Steve Rowe
On Apr 1, 2013, at 9:40 PM, "Vaillancourt, Tim" wrote: > I would also like to contribute to SolrCloud's wiki where possible. Please > add myself (TimVaillancourt) when you have a chance. Added to solr wiki ContributorsGroup.

RE: [ANNOUNCE] Solr wiki editing change

2013-04-01 Thread Vaillancourt, Tim
I would also like to contribute to SolrCloud's wiki where possible. Please add myself (TimVaillancourt) when you have a chance. Cheers, Tim -Original Message- From: Trey Grainger [mailto:solrt...@gmail.com] Sent: Saturday, March 30, 2013 9:43 PM To: d...@lucene.apache.org Cc: solr-user

Re: Top 10 Terms in Index (by date)

2013-04-01 Thread Andy Pickler
I need "total number of occurrences" across all documents for each term. Imagine this... Post #1: "I think, therefore I am like you" Reply #1: "You think too much" Reply #2: "I think that I think much as you" Each of those "documents" is put into 'content'. Pretending I don't have stop words, th

Re: Suggestions for Customizing Solr Admin Page

2013-04-01 Thread Chris Hostetter
: I want to customize Solr Admin Page. I think that I will need more : complicated things to manage my cloud. I will separate my Solr cluster into : just indexing ones and just response ones. I will index my documents by : categorical and I will index them at different collections. A key design c

Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Jack Krupansky
The Solr example really is rather simple. Download, unzip, run, add data, query. It's really that simple. Make sure you are looking at the Solr tutorial: http://lucene.apache.org/solr/4_2_0/tutorial.html Download from here: http://lucene.apache.org/solr/tutorial.html -- Jack Krupansky -O

Solr Multiword Search

2013-04-01 Thread skmirch
We have a catalog of media content which is ingested into solr. We are trying to do a spell check on the title of the catalog item, to make sure that the client is able to correctly predict and correct the (mis)typed text. The requirement is that corrected text match a title in the catalog. I h

Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Utkarsh Sengar
Thanks for the reply. So DSE is one of the options and I am looking into that too. However, before diving into solr+cassandra integration (which comes out of the box with DSE), I am just trying to set up a Solr instance on my local machine without the bloat the "example" solr instance has to offer

Re: Use of SolrJettyTestBase

2013-04-01 Thread Chris Hostetter
: I've subclassed SolrJettyTestBase, and added a test method (annotated : with @test). However, my test method is never called. I see the You got an immediate failure from the test setup, because you don't have assertions enabled in your JVM (the Lucene & Solr test frameworks both require asser

Re: Getting started with solr 4.2 and cassandra

2013-04-01 Thread Jack Krupansky
You might want to check out DataStax Enterprise, which actually integrates Cassandra and Solr. You keep the data in Cassandra, but as data is added and updated and deleted, the Solr index is automatically updated in parallel. You can add and update data and query using either the Cassandra API o

Getting started with solr 4.2 and cassandra

2013-04-01 Thread Utkarsh Sengar
Hello, I am evaluating solr 4.2 and ElasticSearch (I am new to both) for a search API, where data sits in cassandra. Getting started with elasticsearch is pretty straightforward and I was able to write an ES "river" which pulls data from cassa

Re: Top 10 Terms in Index (by date)

2013-04-01 Thread Tomás Fernández Löbbe
So you have one document per user comment? Why not use faceting plus filtering on the "dateCreated" field? That would count "number of documents" for each term (so, in your case, if a term is used twice in one comment it would only count once). Is that what you are looking for? Tomás On Mon, Apr
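The faceting-plus-filtering idea above can be sketched as a request URL. This is only an illustration, not the poster's actual query: the base URL, the 7-day window, and the helper name `top_terms_url` are assumptions, while the field names `content` and `dateCreated` come from the thread. As noted in the message, facet counts are per document, so a term used twice in one comment counts once.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def top_terms_url(base_url, field="content", date_field="dateCreated",
                  start="NOW-7DAYS", end="NOW", limit=10):
    """Build a Solr select URL that facets on `field` within a date range.

    The base URL and the default time window are hypothetical values
    chosen for illustration.
    """
    params = {
        "q": "*:*",
        "rows": 0,  # only the facet counts are needed, not the documents
        "fq": f"{date_field}:[{start} TO {end}]",  # filter by creation date
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,  # keep only the top N terms
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


url = top_terms_url("http://localhost:8983/solr")
print(url)
```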

Use of SolrJettyTestBase

2013-04-01 Thread Upayavira
I'm attempting to use SolrJettyTestBase to test a simple app that pushes content to Solr. I've subclassed SolrJettyTestBase, and added a test method (annotated with @test). However, my test method is never called. I see the following: SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Fou

Re: How does solr 4.2 do in returning large datasets ?

2013-04-01 Thread Otis Gospodnetic
Don't forget you could also tell Solr to return a CSV in response, which will be much lighter in terms of response size, though disk IO would still be there. Is returning large result sets recommended? No. :) But the same can be said for Lucene. Otis -- Solr & ElasticSearch Support http://semate
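As a rough sketch of the CSV suggestion above: Solr selects the response format with the `wt` parameter, and `wt=csv` asks for CSV. The base URL, the field names, and the sample response text below are assumptions made for illustration, not output from a real server.

```python
import csv
import io
from urllib.parse import urlencode, urlsplit, parse_qs


def csv_select_url(base_url, query, fields, rows):
    """Build a select URL that asks Solr for a CSV response (wt=csv)."""
    params = {
        "q": query,
        "fl": ",".join(fields),  # return only the fields that are needed
        "rows": rows,
        "wt": "csv",
    }
    return f"{base_url}/select?{urlencode(params)}"


def parse_csv_response(text):
    """Turn a CSV response body (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


url = csv_select_url("http://localhost:8983/solr", "*:*", ["id", "title"], 60000)

# Hypothetical response body, used only to show the parsing step.
sample = "id,title\n1,First\n2,Second\n"
docs = parse_csv_response(sample)
print(url)
print(docs)
```

Restricting `fl` to the fields the client actually uses reduces the response size further, whichever format is chosen.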

Top 10 Terms in Index (by date)

2013-04-01 Thread Andy Pickler
Our company has an application that is "Facebook-like" for usage by enterprise customers. We'd like to do a report of "top 10 terms entered by users over (some time period)". With that in mind I'm using the DataImportHandler to put all the relevant data from our database into a Solr 'content' fie

Filtering Search Cloud

2013-04-01 Thread Furkan KAMACI
I want to separate my cloud into two logical parts. One of them is the indexer cloud of SolrCloud. The second one is the searcher cloud of SolrCloud. My first question is this: does separating my cloud system make sense for performance improvement? Because I think that when indexing, searching make time to

Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

2013-04-01 Thread feroz_kh
Also, is it absolutely necessary to set maxSegments=1 if we need to reformat the whole index? -- View this message in context: http://lucene.472066.n3.nabble.com/Upgrade-Solr3-5-to-Solr4-1-Index-Reformat-tp4046391p4052991.html Sent from the Solr - User mailing list archive at Nabble.com

Re: Making tika process mail attachments eludes me

2013-04-01 Thread Chris Hostetter
: I believe that the handling of the multipart MIME lacks some error checking, and : it is probably related to the content outside the MIME boundaries (in my : example, the text "This is a multi-part message in MIME format."): : : I really hope that some SOLR developer can have a look, we canno

Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

2013-04-01 Thread feroz_kh
Hi Shawn, I tried optimizing using this command... curl 'http://localhost:/solr/update?optimize=true&maxSegments=10&waitFlush=true' And I got this response within seconds... 0840 Is this a valid response that one should get? I checked the statistics link from the /solr/admin page and it show

Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

2013-04-01 Thread feroz_kh
Hi Shawn, I tried optimizing using this command... curl 'http://10.7.233.54:8088/solr/update?optimize=true&maxSegments=10&waitFlush=true' And I got this response within seconds... 0840 Is this a valid response that one should get? I checked the statistics link from the /solr/admin page and it
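One possible reading of the "0840" above, offered as an assumption rather than a confirmed fact, is an XML response header whose tags were stripped by the archive: a `status` of 0 followed by a `QTime` of 840 milliseconds. A status of 0 means Solr handled the request without an error. The sketch below builds the same optimize URL and reads such a header; the sample XML is reconstructed for illustration and is not the poster's actual output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def optimize_url(base_url, max_segments=10):
    """Build the update URL used in the message above."""
    params = {"optimize": "true", "maxSegments": max_segments, "waitFlush": "true"}
    return f"{base_url}/update?{urlencode(params)}"


def read_response_header(xml_text):
    """Extract the integer entries (status, QTime) from a Solr XML response."""
    root = ET.fromstring(xml_text)
    header = root.find("lst[@name='responseHeader']")
    return {el.get("name"): int(el.text) for el in header.findall("int")}


# Reconstructed sample (assumption): what "0840" may have looked like with tags.
sample = (
    '<response><lst name="responseHeader">'
    '<int name="status">0</int><int name="QTime">840</int>'
    "</lst></response>"
)

url = optimize_url("http://10.7.233.54:8088/solr")
header = read_response_header(sample)
print(url)
print(header)
```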

Re: DocValues vs stored fields?

2013-04-01 Thread David Smiley (@MITRE.org)
Otis, DocValues are quite insufficient for true field updates. DocValues is a per-document value storage (hence the name); it's not uninverted/indexed. If you needed to search based on these values (e.g. find all docs that have this value or between these values) then that's not going to work.

Re: highlight on same field with different fragsize

2013-04-01 Thread Erick Erickson
Why do you want to do this? Can't you just take the longer frag size and cut it down for display in the app? Best Erick On Mon, Apr 1, 2013 at 6:33 AM, meghana wrote: > We use Solr4.2 in our application. We need to return highlight on same field > 2 times, with different fragsize. Solr allows to
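The suggestion above, requesting the longer fragment and shortening it in the application, might look like the following minimal sketch. The function name and the ellipsis convention are assumptions. It also ignores highlight markup: a cut can land inside an `<em>` tag, so real code would need to handle the tags separately.

```python
def shorten_fragment(fragment, max_chars):
    """Cut a highlight fragment to at most max_chars characters of text,
    breaking on whitespace, then append an ellipsis.

    Note: highlight markup such as <em> tags is not handled here.
    """
    if len(fragment) <= max_chars:
        return fragment
    cut = fragment[:max_chars]
    head, sep, _ = cut.rpartition(" ")
    # Fall back to a hard cut when the slice contains no whitespace.
    return (head if sep else cut).rstrip() + "..."


long_fragment = "the quick brown fox jumps"
print(shorten_fragment(long_fragment, 12))
```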

Re: solr4.1 No live SolrServers available to handle this request

2013-04-01 Thread Joel Bernstein
Check the Solr logs for Zookeeper disconnects. It could be that as load is increasing Solr is not able to respond to the Zookeeper pings which would bring the nodes offline. If you see Zookeeper disconnects then you can increase the zkClientTimeout set in solr.xml. But be aware that zk disconnects

RE: How does solr 4.2 do in returning large datasets ?

2013-04-01 Thread Swati Swoboda
It really depends on what you are returning (how big is each document? Just a document ID? Pages and pages of data in fields?). It can take a long time for Solr to render an XML with 60,000 results. Solr will be serializing the data and then you'd (presumably) be de-serializing it. Depending o

Re: secure deployment of solr.war on jboss

2013-04-01 Thread Ali, Saqib
Thanks. Are you using an iptables firewall on the JBoss server to prevent access from other systems? Or are you using some JBoss configuration for that? Thanks, Saqib On Mon, Apr 1, 2013 at 6:25 AM, adityab wrote: > Hi Ali, > > We have Solr 4.2 on Jboss running on a separate VM behind firewall. Only IT

How does solr 4.2 do in returning large datasets ?

2013-04-01 Thread Liz Sommers
I thought I remembered reading that Solr is not good for returning large datasets. We are currently using lucene 3.6.0 and returning datasets of 10,000 to 60,000 results. In the future we might need to return even larger datasets. Would you all recommend going to Solr for this, or should we stic

Re: secure deployment of solr.war on jboss

2013-04-01 Thread adityab
Hi Ali, We have Solr 4.2 on JBoss running on a separate VM behind a firewall. Only IT Administration and our FrontEnd Application Server are able to access the Solr servers in production. -- View this message in context: http://lucene.472066.n3.nabble.com/secure-deployment-of-solr-war-on-jboss

Re: had query regarding the indexing and analysers

2013-04-01 Thread Jack Krupansky
Yes, if there is only a single analyzer or an index analyzer is specified and the Porter stemmer is used in it. -- Jack Krupansky -Original Message- From: Rohan Thakur Sent: Monday, April 01, 2013 9:13 AM To: solr-user@lucene.apache.org Subject: Re: had query regarding the indexing an

Re: had query regarding the indexing and analysers

2013-04-01 Thread Rohan Thakur
Hi, does this mean that while indexing, "ace" is also being stored as "ac" in the Solr index? Thanks, regards, Rohan On Fri, Mar 22, 2013 at 9:49 AM, Jack Krupansky wrote: > Actually, it's the Porter Stemmer that is turning "ace" into "ac". > > Try making a copy of text_en_splitting and delete the > Porter

Re: Realtime updates solrcloud

2013-04-01 Thread Erick Erickson
Does the 30 second interval persist for a long time after you stop your queries? It's possible that your requests are queueing up and you have a bunch of searches in the queue in front of the update. Best Erick On Fri, Mar 29, 2013 at 8:33 AM, roySolr wrote: > Hello Guys, > > I want to use the

Re: Need Help in Patching OPENNLP

2013-04-01 Thread Erick Erickson
Here's the start-up page: http://wiki.apache.org/solr/HowToContribute First, just check out the code via svn and build it (see the page above). That'll tell you if you have all the tools available. Second, apply the patch to the source. From the root of your source, 'patch -p0 -i ' Third, execute

query regarding running solr4.1.0 on tomcat6

2013-04-01 Thread Rohan Thakur
Hi all, I had installed tomcat6 on CentOS (Red Hat Linux) and had configured Solr under the name solrt on Tomcat, and it was running fine. Then I placed another copy of the Solr home folder on the CentOS machine and changed the Tomcat directory to point to this new Solr, and now everything is working fine like the

Re: Solr Phonetic Search Highlight issue in search results

2013-04-01 Thread Erick Erickson
Good question, you're causing me to think... about code I know very little about. So rather than spouting off, I tried it and.. it works fine for me, either with or without using fast vector highlighter on, admittedly, a very simple test. So I think I'd try peeling off all the extra stuff you've

solr4.1 No live SolrServers available to handle this request

2013-04-01 Thread sling
Hi all. I am new to Solr. When I query SolrCloud 4.1 with SolrJ, the client throws exceptions as follows. There are 2 shards in my SolrCloud. Each shard is on a server with 4 CPU/3G RAM, and the JVM has 2G RAM. When the query requests get more and more, the exception occurs. [java] org.apache.solr

highlight on same field with different fragsize

2013-04-01 Thread meghana
We use Solr 4.2 in our application. We need to return highlights on the same field 2 times, with different fragsize. Solr allows highlighting on different fields with different fragsize as mentioned below, but this does not work with the same field http://localhost:8080/solr/select?q="my search"&hl=on&hl.fl=con

Found child node with improper name

2013-04-01 Thread yriveiro
I have this warning when I try to create a collection and the collection is not created. Apr 01, 2013 10:05:26 AM org.apache.solr.handler.admin.CollectionsHandler handleCreateAction INFO: Creating Collection : collection.configName=statisticsBucket-archive&maxShardsPerNode=3&name=ST-ARCHIVE_0

Architect /Lead Custom Cloud & Middleware Platforms (APACHE SOLR)

2013-04-01 Thread jessica k
We are very excited and proud to be selected as the only recruiting company assigned to a retained search project for a brand new High Technology Service line (Cloud Technologies) for a Top Tier IT Service and Solutions company. This new service line is very exciting and has many new products lin