Re: Semantic search with python numpy and Solr

2014-03-24 Thread Alexandre Rafalovitch
Didn't you ask exactly this question two weeks ago and got some replies that you need to do more domain analysis? Did you have any progress since and do you have more precise Solr-specific questions? Regards, Alex. P.s. http://www.brainyquote.com/quotes/quotes/a/alberteins133991.html Personal w

Semantic search with python numpy and Solr

2014-03-24 Thread Sohan Kalsariya
I am a beginner with Solr and started playing with it over the last month. I am building a search mechanism for http://allevents.in and I want to implement semantic search with Solr for when someone searches for events on our website. The back end is in PHP (solarium-client). So can you please guide me

Re: solr cloud distributed optimize() becomes serialized

2014-03-24 Thread Shalin Shekhar Mangar
Found it - https://issues.apache.org/jira/browse/LUCENE-5481 On Fri, Mar 21, 2014 at 8:11 PM, Mark Miller wrote: > Recently fixed in Lucene - should be able to find the issue if you dig a > little. > -- > Mark Miller > about.me/markrmiller > > On March 21, 2014 at 10:25:56 AM, Greg Walters (greg

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-24 Thread Roman Chyla
Perhaps useful: here is an open source implementation with near[digit] support, incl. analysis of proximity tokens. When days become longer maybe it will be packaged into a nice lib...:-) https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/grammars/ADS.g On 25 Mar 2014 00:14, "Salman

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-24 Thread Salman Akram
Basically we just created this syntax for the ease of users, otherwise on back end it uses W or N operators. On Tue, Mar 25, 2014 at 4:21 AM, Ahmet Arslan wrote: > Hi, > > There is no w/ syntax in surround. > /* Query language operators: OR, AND, NOT, W, N, (, ), ^, *, ?, " and > comma */ > > A

Re: Can the solr dataimporthandler consume an atom feed?

2014-03-24 Thread Gora Mohanty
On 25 March 2014 01:15, eShard wrote: > I confirmed the xpath is correct with a third party XPath visualizer. > /atom:feed/atom:entry parses the xml correctly. > > Can anyone confirm or deny that the dataimporthandler can handle an atom > feed? Yes, an ATOM feed can be consumed by DIH, as noted i

Re: Multiple Languages in Same Core

2014-03-24 Thread Alexandre Rafalovitch
Solr In Action has a significant discussion on the multi-lingual approach. They also have some code samples out there. Might be worth a look Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature t

Re: Required fields

2014-03-24 Thread Chris Hostetter
: What is the default value for the required attribute of a field element : in a schema? I've just looked everywhere I can think of in the wiki, the : reference manual, and the JavaDoc. Most of the documentation doesn't : even mention that attribute. Good catch, fixed... https://cwiki.apache.o

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-24 Thread Walter Underwood
That is similar to Verity VQL, but that used "NEAR/10". --wunder On Mar 24, 2014, at 4:21 PM, Ahmet Arslan wrote: > Hi, > > There is no w/ syntax in surround. > /* Query language operators: OR, AND, NOT, W, N, (, ), ^, *, ?, " and comma */ > > Ahmet > > > > On Monday, March 24, 2014 9:46

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-24 Thread Ahmet Arslan
Hi, There is no w/ syntax in surround.  /* Query language operators: OR, AND, NOT, W, N, (, ), ^, *, ?, " and comma */ Ahmet On Monday, March 24, 2014 9:46 PM, T. Kuro Kurosaka wrote: On 3/19/14 5:13 PM, Otis Gospodnetic wrote:> Hi, > > Guessing it's surround query parser's support for "withi

Question on highlighting edgegrams

2014-03-24 Thread Software Dev
In 3.5.0 we have the following. If we searched for "c" with highlighting enabled we would get back results such as: cdat crocdile cool beans But in the latest Solr (4.7) we get the full words highlighted back. Di

Re: Best approach to handle large volume of documents with constantly high incoming rate?

2014-03-24 Thread shushuai zhu
Jack, thanks. Actually the 20K events/sec is some low-end rate we estimated. It is not necessarily related to sensor; when you want to centralize data from many sources, regardless multi-tenancy, even for a single tenant, many events per second have to be handled. I have a question regarding

Multiple Languages in Same Core

2014-03-24 Thread Jeremy Thomerson
I recently deployed Solr to back the site search feature of a site I work on. The site itself is available in hundreds of languages. With the initial release of site search we have enabled the feature for ten of those languages. This is distributed across eight cores, with two Chinese languages plu

solr 4.x reindexing issues

2014-03-24 Thread Ravi Solr
Hello, We are trying to reindex as part of our move from 3.6.2 to 4.6.1 and have faced various issues reindexing 1.5 million docs. We don't use SolrCloud; it's still a Master/Slave config. For testing this I am using a single test server, reading from it and putting back into the same index. We send

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-24 Thread Otis Gospodnetic
I think SQP is getting axed, no? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Mon, Mar 24, 2014 at 3:45 PM, T. Kuro Kurosaka wrote: > On 3/19/14 5:13 PM, Otis Gospodnetic wrote:> Hi, > > > > Guessing it's surround query

Re: Can the solr dataimporthandler consume an atom feed?

2014-03-24 Thread eShard
Ok, I found one typo: the links need to be this: /atom:feed/atom:entry/atom:link/@href But the import still doesn't work... :( I guess I have to convert the feed over to RSS 2.0 -- View this message in context: http://lucene.472066.n3.nabble.com/Can-the-solr-dataimporthandler-consume-an-atom-f

Re: Fixing corrupted index?

2014-03-24 Thread zqzuk
Hi Thanks. But I am already using CheckIndex and the error is given by the CheckIndex utility: it could not even continue after reporting "could not read any segments file in directory". -- View this message in context: http://lucene.472066.n3.nabble.com/Fixing-corrupted-index-tp4126644p4126

Re: Fixing corrupted index?

2014-03-24 Thread Dmitry Kan
Hi, Have a look at: http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/CheckIndex.html HTH, Dmitry On Mon, Mar 24, 2014 at 8:16 PM, zqzuk wrote: > My Lucene index - built with Solr using Lucene4.1 - is corrupted. Upon > trying > to read the index using the following code I get

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-24 Thread T. Kuro Kurosaka
On 3/19/14 5:13 PM, Otis Gospodnetic wrote:> Hi, > > Guessing it's surround query parser's support for "within" backed by span > queries. > > Otis You mean this? http://wiki.apache.org/solr/SurroundQueryParser I guess this parser needs improvement in documentation area. It doesn't explain or hav

Re: Can the solr dataimporthandler consume an atom feed?

2014-03-24 Thread eShard
I confirmed the xpath is correct with a third party XPath visualizer. /atom:feed/atom:entry parses the xml correctly. Can anyone confirm or deny that the dataimporthandler can handle an atom feed? -- View this message in context: http://lucene.472066.n3.nabble.com/Can-the-solr-dataimporthandle

Re: Using Sentence Information For Snippet Generation

2014-03-24 Thread Dmitry Kan
Hi Furkan, I have done an implementation with a custom filler (special character) sequence in between sentences. A better solution I landed at was increasing the position of each sentence's first token by a large number, like 1 (perhaps, a smaller number could be used too). Then a user search
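The position-gap idea described in this reply can be sketched in a few lines of Python. This is only an illustration under assumptions, not Dmitry's implementation: the function name `assign_positions`, the whitespace tokenization, and the gap value passed in are all hypothetical.

```python
def assign_positions(sentences, sentence_gap):
    """Assign a position to every token, jumping ahead by `sentence_gap`
    at the first token of each sentence after the first one.

    `sentence_gap` is a hypothetical value chosen by the caller; the
    original message does not fix a specific number.
    """
    positioned = []
    pos = 0
    for index, sentence in enumerate(sentences):
        if index > 0:
            # Push the first token of this sentence far away from the
            # last token of the previous sentence.
            pos += sentence_gap
        for token in sentence.split():
            positioned.append((token, pos))
            pos += 1
    return positioned


tokens = assign_positions(["Solr is fast.", "Lucene powers it."], sentence_gap=100)
for token, position in tokens:
    print(position, token)
```

With positions spaced this way, a proximity or phrase match whose allowed distance is smaller than the gap cannot span a sentence boundary, which is what makes sentence-aware snippets possible.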

Re: Indexer: java.io.IOException: Job failed!

2014-03-24 Thread Laura McCord
So the problem might be because I’m running solr on tomcat port 8080. is there a way to resolve this so I can run the command successfully? Thanks, Laura On Mar 24, 2014, at 1:33 PM, Laura McCord wrote: > Hi, > > I’m trying to integrate Solr with Nutch and I performed all of the necessary

Indexer: java.io.IOException: Job failed!

2014-03-24 Thread Laura McCord
Hi, I’m trying to integrate Solr with Nutch and I performed all of the necessary steps except after Nutch performs the crawl it appears that I’m receiving a connection refused. 2014-03-24 11:42:43,062 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: TestCrawl/crawldb 2014-03-24 11:4

Fixing corrupted index?

2014-03-24 Thread zqzuk
My Lucene index - built with Solr using Lucene4.1 - is corrupted. Upon trying to read the index using the following code I get org.apache.solr.common.SolrException: No such core: collection1 exception: >> File configFile = new File(cacheFolder + File.separator + "solr.xml"); CoreContainer containe

Solr 4.3.1 memory swapping

2014-03-24 Thread Darrell Burgan
Hello all, we have a SolrCloud implementation in production, with two servers running Solr 4.3.1 in a SolrCloud configuration. Our search index is about 70-80GB in size. The trouble is that after several days of uptime, we will suddenly have periods where the operating system Solr is running in

Re: Solr Cloud collection keep going down?

2014-03-24 Thread Software Dev
Shawn, Thanks for pointing me in the right direction. After consulting the above document I *think* that the problem may be too large a heap, which may be affecting garbage collection and hence causing ZK timeouts. We have around 20G of memory on these machines with a min/max of heap at 6, 8 res

Re: Singles in solr for bigrams,trigrams in parsed_query

2014-03-24 Thread Dmitry Kan
Hi, Query rewrite happens down the chain, after query parsing. For example a wildcard query triggers an index based query rewrite where terms matching the wildcard are added into the original query. In your case, looks like the query rewrite will generate the ngrams and add them into the original

Re: Ram usage

2014-03-24 Thread Shawn Heisey
On 3/24/2014 9:48 AM, David Flower wrote: Its not saw toothing though it’s sitting solidly at 52% It may be very difficult to see the sawtooth effect unless you actually connect an app like jconsole to your running Solr instance and watch the graphs over time. My point was that what you've

Re: Can the solr dataimporthandler consume an atom feed?

2014-03-24 Thread eShard
The only message I get is: Indexing completed. Added/Updated: 0 documents. Deleted 0 documents. Requests: 1, Skipped: 0 And there are no errors in the log. Here's what the ibm atom feed looks like: http://www.w3.org/2005/Atom"; xmlns:wplc="http://www.ibm.com/wplc/atom/1.0"; xmlns:age="http://p

Re: Ram usage

2014-03-24 Thread David Flower
Its not saw toothing though it’s sitting solidly at 52% On 24/03/2014 15:46, "Shawn Heisey" wrote: >> I’m looking at dashboard page on all 4 nodes and seeing >> Physical Memory 92% compared with ~41-44% >> >> And JVM-Memory 52.9% compared to 23-28% >> >> The reason I mentioned slave is that on t

Re: Ram usage

2014-03-24 Thread Shawn Heisey
> I’m looking at dashboard page on all 4 nodes and seeing > Physical Memory 92% compared with ~41-44% > > And JVM-Memory 52.9% compared to 23-28% > > The reason I mentioned slave is that on the core overview page there is > An entry for Slave (Searching) that doesn’t appear on any of the other > no

Using Sentence Information For Snippet Generation

2014-03-24 Thread Furkan KAMACI
Hi; When I generate a snippet via Solr I do not want to cut off the beginning of any sentence in the snippet. So I need to do sentence detection. I think that I can do it before I send documents into Solr. I can put in some special characters that mark the beginning or end of a sentence. Then I can use that

Re: SolrCloud from "Stopping recovery for" warnings to crash

2014-03-24 Thread Lukas Mikuckis
We tried to set ZK timeout to 1s and did load testing (both indexing and search) and this issue didn't happen. 2014-03-24 17:00 GMT+02:00 Lukas Mikuckis : > Garbage Collectors Summary: > https://apps.sematext.com/spm-reports/s/rgRnwuShgI

Re: join and filter query with AND

2014-03-24 Thread Kranti Parisa
glad the suggestions are working for you! Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa On Mon, Mar 24, 2014 at 4:10 AM, Marcin Rzewucki wrote: > Hi, > > Yonik, thank you for explaining me the reason of the issue. The workarounds > you suggested are working fine. > Kranti, y

Re: Ram usage

2014-03-24 Thread David Flower
I’m looking at dashboard page on all 4 nodes and seeing Physical Memory 92% compared with ~41-44% And JVM-Memory 52.9% compared to 23-28% The reason I mentioned slave is that on the core overview page there is An entry for Slave (Searching) that doesn’t appear on any of the other nodes Cheers, D

Re: SolrCloud from "Stopping recovery for" warnings to crash

2014-03-24 Thread Lukas Mikuckis
Garbage Collectors Summary: https://apps.sematext.com/spm-reports/s/rgRnwuShgI Pool Size: https://apps.sematext.com/spm-reports/s/H16ndqichM First Stopping recovery warning: 4:00, OOM error: 6:30. 2014-03-24 16:35 GMT+02:00 Shalin Shekhar Mangar : > I am guessing that it is all related to memo

Re: Ram usage

2014-03-24 Thread Shawn Heisey
On 3/24/2014 7:15 AM, David Flower wrote: > We have a 4 node cluster with a collection that's sharded into 2 and each > shard having a master and a slave for redundancy however 1 node has decided > to use twice the ram that the others are using within the cluster > > The only difference we can spot

Re: SolrCloud from "Stopping recovery for" warnings to crash

2014-03-24 Thread Shalin Shekhar Mangar
I am guessing that it is all related to memory issues. I guess that as the used heap increases, full GC cycles increase causing ZK timeouts which in turn cause more recoveries to be initiated. In the end, everything blows up with the out of memory errors. Do you log GC activity on your servers? I

Re: Solr4.7 No live SolrServers available to handle this request

2014-03-24 Thread Greg Walters
Sathya, We're still missing a fair amount of information here though it looks like your cluster is healthy. How are you indexing and what's the request you're sending that results in the error you're seeing? Have you checked your nodes' logs for errors that correspond with the one you're seeing

Re: SolrCloud from "Stopping recovery for" warnings to crash

2014-03-24 Thread Lukas Mikuckis
Yes, we upgraded solr from 4.6.1 to 4.7 3 weeks ago (2 weeks before solr started crashing). When we were upgrading, we just upgraded solr and changed versions in collections configs. When solr crashes we get OOM but only 2h after first Stopping recovery warnings. Maybe you have any ideas when Sto

Re: Ram usage

2014-03-24 Thread David Flower
We’re still on 4.4.0 David On 24/03/2014 13:19, "Furkan KAMACI" wrote: >Hi David; > >Which version of Solr are you using? > >Thanks; >Furkan KAMACI > > >2014-03-24 15:15 GMT+02:00 David Flower : > >> Hi All >> >> We have a 4 node cluster with a collection that's sharded into 2 and each >> shard

Re: Ram usage

2014-03-24 Thread Furkan KAMACI
Hi David; Which version of Solr are you using? Thanks; Furkan KAMACI 2014-03-24 15:15 GMT+02:00 David Flower : > Hi All > > We have a 4 node cluster with a collection that's sharded into 2 and each > shard having a master and a slave for redundancy however 1 node has decided > to use twice the r

Ram usage

2014-03-24 Thread David Flower
Hi All We have a 4 node cluster with a collection that's sharded into 2 and each shard having a master and a slave for redundancy however 1 node has decided to use twice the ram that the others are using within the cluster The only difference we can spot between the nodes is that the one with the ra

Re: highlight did not work correctly

2014-03-24 Thread Ahmet Arslan
Hi, You may need to increase hl.maxAnalyzedChars which has a default of 51200. On Monday, March 24, 2014 2:33 PM, "panzj.f...@cn.fujitsu.com" wrote: Hi all While using solr 4.6 to highlight the result, I ran into a strange situation. Most searching results were correctly highlighted. But
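As a hedged illustration of the suggestion above, the following Python sketch builds a query string that raises `hl.maxAnalyzedChars` for a single request. The helper name, the field name `content`, and the value 1000000 are assumptions made for the example; only the parameter names `hl`, `hl.fl`, and `hl.maxAnalyzedChars` come from Solr itself.

```python
from urllib.parse import urlencode


def highlight_query_string(query, field, max_analyzed_chars):
    """Build request parameters that enable highlighting and raise the
    number of characters the highlighter is allowed to analyze."""
    params = {
        "q": query,
        "hl": "true",
        "hl.fl": field,  # field to highlight (hypothetical name below)
        "hl.maxAnalyzedChars": max_analyzed_chars,
    }
    return urlencode(params)


# The field name and the limit are placeholders, not recommended values.
print(highlight_query_string("solr", "content", 1000000))
```

The same parameter can also be set as a default in the request handler configuration instead of being sent with every request.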

highlight did not work correctly

2014-03-24 Thread panzj.f...@cn.fujitsu.com
Hi all While using solr 4.6 to highlight the result, I ran into a strange situation. Most searching results were correctly highlighted. But a few gave out all the content of the indexed webpage without any highlighted keywords. Has anybody ever met this problem? Here is my solrconfig.xml

Re: Getting 500s on distributed queries with SolrCloud

2014-03-24 Thread Ugo Matrangolo
Hi Shalin, Thank you for your answer. I'm already using custom hashing to make sure all the docs that are going to be grouped together are on the same shard. During index I make sure the uniqueKey is something like: productId!skuId so all the skus belonging to the same product will end up o
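The `productId!skuId` key layout mentioned in this message can be sketched as follows. This is a simplified illustration, not SolrCloud's routing code: the helper names are hypothetical, and the real router hashes the prefix rather than comparing strings. The sketch only shows that documents sharing the part before `!` share a routing key.

```python
from collections import defaultdict


def composite_id(product_id, sku_id):
    """Build a uniqueKey of the form productId!skuId."""
    return f"{product_id}!{sku_id}"


def route_key(doc_id):
    """Return the part of the id before '!', which decides co-location.
    Ids without a '!' are routed on the whole id."""
    prefix, separator, _ = doc_id.partition("!")
    return prefix if separator else doc_id


# Group some hypothetical SKUs by routing key to show they stay together.
ids = [composite_id("p1", "skuA"), composite_id("p1", "skuB"), composite_id("p2", "skuC")]
groups = defaultdict(list)
for doc_id in ids:
    groups[route_key(doc_id)].append(doc_id)
print(dict(groups))
```

Because all SKUs of one product share a routing key, grouping on the product can be resolved within a single shard.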

Re: Solr dih to read Clob contents

2014-03-24 Thread Prasi S
Below is my full configuration, And this is my xml data ZAYQ5181 Sam Mathews 2013-01-18T23:29:04.492 Thanks, Prasi On Mon, Mar 24, 2014 at 3:23 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > 1. I don't see the definition of a datasource named 'xmldata' in yo

Re: Solr dih to read Clob contents

2014-03-24 Thread Shalin Shekhar Mangar
1. I don't see the definition of a datasource named 'xmldata' in your data-config. 2. You have forEach="/*:summary" but I don't think that is a syntax supported by XPathRecordReader. If you can give a sample of the xml stored as Clob in your database, then we can help you write the right xpaths.

Re: Solr4.7 No live SolrServers available to handle this request

2014-03-24 Thread Sathya
Hi Greg, This is my Clusterstate.json. WatchedEvent state:SyncConnected type:None path:null [zk: 10.10.1.72:2185(CONNECTED) 0] get /clusterstate.json {"set_recent":{ "shards":{ "shard1":{ "range":"8000-d554", "state":"active", "replicas":{ "10.1

Re: join and filter query with AND

2014-03-24 Thread Marcin Rzewucki
Hi, Yonik, thank you for explaining me the reason of the issue. The workarounds you suggested are working fine. Kranti, your suggestion was also good :-) Thanks a lot! On 21 March 2014 20:00, Kranti Parisa wrote: > My example should also work, am I missing something? > > &q=({!join from=inne

Re: Solr dih to read Clob contents

2014-03-24 Thread Prasi S
My database configuration is as below and I get my response from solr as below org...@1c8e807 Am I missing anything? Thanks, Prasi On Thu, Mar 20, 2014 at 4:25 PM, Gora Mohanty wrote: > On 20 March 2014 14:53, Prasi S wrote: > >

Re: how to generate json response from the php solarium ?

2014-03-24 Thread Gora Mohanty
On 24 March 2014 12:35, Sohan Kalsariya wrote: > How can i get the json response from solr ? > I mean how can i get response of the searched results in json format > and print it in solarium php code ? Adding wt=json to the query will get you Solr results in JSON format. Please refer to the Solar
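As a rough illustration of the `wt=json` advice in this reply, here is a minimal Python sketch (not Solarium code) that builds a select URL asking Solr for a JSON response. The host, the core name `collection1`, and the query are placeholders assumed for the example.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, rows=10):
    """Return a Solr /select URL whose response format is JSON."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Base URL and query are hypothetical; adjust them to the actual core.
url = build_select_url("http://localhost:8983/solr/collection1", "title:concert")
print(url)
```

Fetching this URL with any HTTP client and decoding the body as JSON gives the result set as a dictionary; a PHP client would do the equivalent with its own HTTP and JSON functions.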

how to generate json response from the php solarium ?

2014-03-24 Thread Sohan Kalsariya
How can i get the json response from solr ? I mean how can i get response of the searched results in json format and print it in solarium php code ? -- Regards, *Sohan Kalsariya*