Solr 4.3.1 - SolrCloud nodes down and lost documents

2013-07-19 Thread Neil Prosser
While indexing some documents to a SolrCloud cluster (10 machines, 5 shards and 2 replicas, so one replica on each machine) one of the replicas stopped receiving documents, while the other replica of the shard continued to grow. That was overnight so I was unable to track exactly what happened (I'

IDNA Support For Solr

2013-07-19 Thread Furkan KAMACI
Hi; Is there any support for IDNA in Solr? (IDNA: http://en.wikipedia.org/wiki/Internationalized_domain_name)

RE: IDNA Support For Solr

2013-07-19 Thread Markus Jelsma
Hi - What kind of support would you expect Solr to provide? IDN is only about conversion between Unicode in your address bar and ASCII in the DNS. -Original message- > From:Furkan KAMACI > Sent: Friday 19th July 2013 11:09 > To: solr-user@lucene.apache.org > Subject: IDNA Support For So

Help !

2013-07-19 Thread narasimha.g
HI, Need help on configuring SOLR search in Alfresco. -- Regards Narasimha

custom field type plugin

2013-07-19 Thread Kevin Stone
I have a particular use case that I think might require a custom field type, however I am having trouble getting the plugin to work. My use case has to do with genetics data, and we are running into several situations where we need to be able to query multiple regions of a chromosome (or gene, or

Re: Indexing into SolrCloud

2013-07-19 Thread Erick Erickson
Usually EOF errors indicate that the packets you're sending are too big. Wait, though. 50K is not buffered docs, I think it's buffered _requests_. So you're creating a queue that's ginormous and asking 2 threads to empty it. But that's not really the issue I suspect. How many documents are you add

Re: Auto-sharding and numShard parameter

2013-07-19 Thread Erick Erickson
First, the numShards parameter is only relevant the very first time you create your collection. It's a little confusing because in the SolrCloud examples you're getting "collection1" by default. Look further down the SolrCloud Wiki page, the section titled "Managing Collections via the Collections A
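As a rough illustration of the point about numShards, here is a minimal Python sketch that only builds a Collections API CREATE URL (nothing is sent). The host, port and collection name are hypothetical; the shard and replica counts are borrowed from the first thread in this listing purely as sample values.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE URL; this does not contact Solr."""
    params = {
        "action": "CREATE",
        "name": name,
        # Per the reply above, numShards only matters when the
        # collection is first created.
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical host and collection name, for illustration only.
url = create_collection_url("http://localhost:8983/solr", "mycollection", 5, 2)
print(url)
```

Creating the collection explicitly this way keeps the shard count out of the node start-up parameters, which is the confusion the thread describes.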

Re: IDNA Support For Solr

2013-07-19 Thread Furkan KAMACI
I mean that: there is a web address: çorba.com However its IDNA coded version is: xn--orba-zoa.com You can check it from here: http://www.whois.com.tr/?q=%C3%A7orba&sldtld=com Let's assume that I've indexed a web page with that URL: xn--orba-zoa.com and one sear

RE: IDNA Support For Solr

2013-07-19 Thread Markus Jelsma
No, you'll have to index the Unicode version of the domain name. Nutch 1.x already deals with this conversion for you. Or you could create a custom update processor for Solr and code it there. It's quite simple, IDN is in the java.net package. -Original message- > From:Furkan KAMACI > Sen
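The reply points at java.net.IDN for use inside a Solr update processor. The sketch below is not that processor; it only illustrates the same Unicode/ASCII round trip using Python's built-in "idna" codec (an assumption of convenience, swapped in for the Java API), with the domain from this thread as input. The function names are made up for the example.

```python
def to_ascii_domain(domain):
    """Convert a Unicode domain name to its ASCII (IDNA) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain):
    """Convert an ASCII (IDNA) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


# "\u00e7" is the letter c with cedilla used in the thread's example domain.
print(to_ascii_domain("\u00e7orba.com"))
print(to_unicode_domain("xn--orba-zoa.com"))
```

Indexing the Unicode form, as suggested above, would mean applying the second conversion to incoming ASCII-encoded domains before they reach the index.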

Re: Auto-sharding and numShard parameter

2013-07-19 Thread Flavio Pompermaier
Thank you for the reply Erick, I was facing exactly that problem. From the documentation it seems that those parameters are required to run SolrCloud; instead, they are just used to initialize a sample collection. I think that in the examples in the user doc it would be better to separate thos

AW: Avoid Solr Pivot Faceting Out of Memory / Shorter result for pivot faceting requests with facet.pivot.ngroup=true and facet.pivot.showLastList=false

2013-07-19 Thread Sandro Zbinden
Dear Members. Do you guys think I am better off in the solr developer group with this question? To summarize, I would like to add a facet.pivot.ngroup=true param to show the count of the facet list. Furthermore, I would like to avoid out-of-memory exceptions by reducing the result of a facet

Re: Help !

2013-07-19 Thread Gora Mohanty
On 19 July 2013 10:39, wrote: > HI, > > Need help on configuring SOLR search in Alfresco. Please do not ask questions that are so overly broad that they are impossible to respond to. Firstly, do your basic homework: Alfresco is now integrated with Solr. Secondly, your question is more pertinent

RE: Indexing into SolrCloud

2013-07-19 Thread Beale, Jim (US-KOP)
Hi Erick! Thanks for the reply. When I call server.add() it is just to add a single document. But, still, I think you might be correct about the size of the ultimate request. I decided to grab the bull by the horns by instantiating my own HttpClient and, in so doing, my first run changed the

Date for 4.4 solr release

2013-07-19 Thread Jabouille Jean Charles
Hi, we are currently using solr 4.2.1. There are a lot of fixes in 4.4 that we need. Can we have an approximate date for the first stable release of solr 4.4, please? Regards, jean charles

Re: dataimporter, custom fields and parsing error

2013-07-19 Thread Alexandre Rafalovitch
Dumb question: they are in your schema? Spelled right, in the right section, using types also defined? Can you populate them by hand with a CSV file and post.jar? Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is

Collapsing similar queries

2013-07-19 Thread Otis Gospodnetic
Hi, Are there any known good tools or approaches to "collapsing queries"? For example, imagine 4 original queries: * big house * big houses * the big house * bigger house ...and all 4 being reduced/collapsed to just "big house". What might be some good approaches for doing this? 1) stem them all
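To make the question concrete, here is a deliberately crude Python sketch of the "stem them all" idea: drop stopwords, apply a toy suffix stripper, and use the result as a grouping key. The stopword list and the stemming rules are assumptions invented for this example; a real setup would more likely reuse an existing analyzer or stemmer.

```python
from collections import defaultdict

STOPWORDS = {"the", "a", "an"}  # tiny illustrative list, not a real one


def crude_stem(word):
    """Toy stemmer: strips a comparative 'er' or a plural 's'."""
    if word.endswith("er") and len(word) > 4:
        word = word[:-2]
        # Undo consonant doubling, e.g. "bigg" -> "big".
        if len(word) >= 2 and word[-1] == word[-2]:
            word = word[:-1]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word


def collapse_key(query):
    """Normalize a query into a key shared by 'similar' queries."""
    tokens = [t for t in query.lower().split() if t not in STOPWORDS]
    return " ".join(crude_stem(t) for t in tokens)


queries = ["big house", "big houses", "the big house", "bigger house"]
groups = defaultdict(list)
for q in queries:
    groups[collapse_key(q)].append(q)
print(dict(groups))
```

The toy rules over-stem many real words, so this only shows the shape of the approach, not a usable normalizer.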

Re: Date for 4.4 solr release

2013-07-19 Thread Gmail
Hahahaha ... Good 1 On 20/07/2013, at 1:43 AM, "Jack Krupansky" wrote: > real_soon:[NOW+3DAYS TO NOW+10DAYS] > > -- Jack Krupansky > > -Original Message- From: Jabouille Jean Charles > Sent: Friday, July 19, 2013 11:10 AM > To: solr-user@lucene.apache.org > Subject: Date for 4.4 solr r

Re: Date for 4.4 solr release

2013-07-19 Thread Jack Krupansky
real_soon:[NOW+3DAYS TO NOW+10DAYS] -- Jack Krupansky -Original Message- From: Jabouille Jean Charles Sent: Friday, July 19, 2013 11:10 AM To: solr-user@lucene.apache.org Subject: Date for 4.4 solr release Hi, we are currently using solr 4.2.1. There are a lot of fix in the 4.4 that

Indexing CSV files in a Folder

2013-07-19 Thread Rajesh Jain
Hi I have flume dumping CSV files in folders and I would like Solr to build an index using these CSV files. What should I do? Thanks, Rajesh

Re: Indexing CSV files in a Folder

2013-07-19 Thread Jack Krupansky
Read: http://wiki.apache.org/solr/UpdateCSV -- Jack Krupansky -Original Message- From: Rajesh Jain Sent: Friday, July 19, 2013 1:55 PM To: solr-user@lucene.apache.org Subject: Indexing CSV files in a Folder Hi I have flume dumping CSV files in folders and I would like Solr to bu
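One way to act on the UpdateCSV pointer without posting each file by hand is a small script that walks the folder. The Python sketch below is only that: the base URL, the /update/csv path with commit=true, and the content type are assumptions to check against the wiki page linked above, and the function names are hypothetical.

```python
from pathlib import Path
from urllib.request import Request, urlopen


def build_csv_requests(folder, solr_url="http://localhost:8983/solr"):
    """Prepare one POST request per *.csv file in the folder (nothing is sent)."""
    requests = []
    for path in sorted(Path(folder).glob("*.csv")):
        requests.append(Request(
            f"{solr_url}/update/csv?commit=true",  # assumed handler path
            data=path.read_bytes(),
            headers={"Content-Type": "text/csv; charset=utf-8"},
            method="POST",
        ))
    return requests


def post_folder(folder, solr_url="http://localhost:8983/solr"):
    """Send every prepared request; requires a running Solr instance."""
    for req in build_csv_requests(folder, solr_url):
        with urlopen(req) as resp:
            resp.read()
```

Running `post_folder` on a schedule would approximate automatic indexing; it does not track which files have already been sent.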

Re: Custom RequestHandlerBase XML Response Issue

2013-07-19 Thread Chris Hostetter
: So as you mentioned in your last mail, how can I prepare a combined : response for this xml doc and even if I do I don't think it would work : because the same I am doing in the RequestHandler. Part of the disconnect you seem to be having with the advice others have been giving you is that Solr

Re: Collapsing similar queries

2013-07-19 Thread Jack Krupansky
For starters, I think you need to elaborate your criteria for "queries that can be collapsed". You can say "they're similar", but then that begs the questions of: 1) How to measure similarity, and 2) What threshold level of similarity to use for "ok to collapse". Two measures of similarity to

Re: Indexing CSV files in a Folder

2013-07-19 Thread Rajesh Jain
Thanks Jack, I am looking for something more real-time. I am streaming CSV files into a folder using Flume and I would like Solr to build the index automatically, rather than posting using curl. I think there is some discussion about the MorphlineSolrSink on the Flume site, but there is very little documentation. I

Re: custom field type plugin

2013-07-19 Thread Chris Hostetter
: a chromosome (or gene, or other object types). All that really boils : down to is being able to give a number, e.g. 10234, and return documents : that have regions containing the number. So you'd have a document with a : list like ["1:16090","400:8000","40123:43564"], and it should come
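The matching rule quoted above (give a number, return documents with a region containing it) can be written down in a few lines. This Python sketch shows only the intended semantics, not a Solr field type; it assumes each region is a "start:end" string with inclusive bounds, and the document ids are made up.

```python
def parse_regions(regions):
    """Turn ["1:16090", ...] into [(1, 16090), ...]."""
    parsed = []
    for region in regions:
        start, end = region.split(":")
        parsed.append((int(start), int(end)))
    return parsed


def contains_position(regions, position):
    """True if any region includes the position (bounds assumed inclusive)."""
    return any(start <= position <= end
               for start, end in parse_regions(regions))


# Hypothetical documents; the first list is the one from the thread.
docs = {
    "doc1": ["1:16090", "400:8000", "40123:43564"],
    "doc2": ["20000:30000"],
}
matches = [doc_id for doc_id, regions in docs.items()
           if contains_position(regions, 10234)]
print(matches)
```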

AUTO: Siobhan Roche is out of the office (returning 22/07/2013)

2013-07-19 Thread Siobhan Roche
I am out of the office until 22/07/2013. I will respond to your query on my return, Thanks Siobhan Note: This is an automated response to your message "custom field type plugin" sent on 19/07/2013 13:06:27. This is the only notification you will receive while this person is away.

Request to be added to the ContributorsGroup

2013-07-19 Thread ricky gill
Hello, Would someone please be kind enough and add me to the "ContributorsGroup"? My Wiki Username is: RickyGill Thanks again. Regards Ricky Gill

dataimporter, custom fields and parsing error

2013-07-19 Thread Andreas Owen
i'm using solr 4.3 which i just downloaded today and am using only jars that came with it. i have enabled the dataimporter and it runs without error. but the field "path" (included in schema.xml) and "text" (file content) aren't indexed. what am i doing wrong? solr-path: C:\ColdFusion10\cfusion

Re: Solr 4.3 open a lot more files than solr 3.6

2013-07-19 Thread SolrLover
Did you try setting useCompoundFile to true in solrconfig.xml? Also, try using a lower mergeFactor which will result in fewer segments and hence fewer open files. Also, I assume you can set the limit using a ulimit command.. ex: ulimit -n20 -- View this message in context: http://lucen

Re: The way edismax parses colon seems weird

2013-07-19 Thread Jack Krupansky
What field type analyzer and tokenizer are you using, and what does a sample of the input data look like? Generally, a single backslash is all that is needed for escaping. And, escaping is not needed within a quoted phrase, except for quotes and literal backslashes. -- Jack Krupansky -Or

Re: Request to be added to the ContributorsGroup

2013-07-19 Thread Stefan Matheis
Sure :) Done! - Stefan On Friday, July 19, 2013 at 9:28 PM, ricky gill wrote: > Hello, > Would someone please be kind enough and add me to the “ContributorsGroup”? My Wiki Username is: RickyGill > Thanks again. > Regards

The way edismax parses colon seems weird

2013-07-19 Thread jefferyyuan
In our application, users may search for an error code like 12:34. We define default search fields, like: title^10 body_stored^8 content^5 So when a user searches for 12:34, we want to search for the error code in the specified fields. In the code, if we search q=12:34 directly, this can't find anything. It's expect

Re: The way edismax parses colon seems weird

2013-07-19 Thread Shawn Heisey
On 7/19/2013 4:01 PM, jefferyyuan wrote: If I type 2 \\, seems it can find the error page: q=12\\:34 (+DisjunctionMaxQuery((content:"12 34"^0.5 | body_stored:"(12\:34 12) 34"^0.8 | title:"12 34"^1.1)))/no_coord +(content:"12 34"^0.5 | body_stored:"(12\:34 12) 34"^0.8 | title:"12 34"^1.1) Exte

Re: The way edismax parses colon seems weird

2013-07-19 Thread Alexandre Rafalovitch
Could this be related: https://issues.apache.org/jira/browse/SOLR-4333 (Fixed in 4.4, so you could even run your test against RC1) Regards, Alex.

Re: The way edismax parses colon seems weird

2013-07-19 Thread Chris Hostetter
You haven't told us anything about how you have the analysis configured for the fields you are using -- and those details probably contain the specifics of your problem. once you've escaped the colon so that edismax no longer recognizes it as "search this specific user field" syntax, any other

Re: Indexing CSV files in a Folder

2013-07-19 Thread SolrLover
Did you look in to this link? http://www.marshut.com/ruzyy/download-and-configure-morphlinesolrsink.html

Re: The way edismax parses colon seems weird

2013-07-19 Thread jefferyyuan
Thanks very much for the reply. We are querying solr directly from the browser: http://localhost:8080/solr/select?q=12\:34&defType=edismax&debug=query&qf=content 12\:34 12\:34 (+12\:34)/no_coord +12\:34 ExtendedDismaxQParser And it seems this is not related to which (default) field I use to query.
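When testing from a browser it can be unclear what actually reaches Solr, since the backslash and colon sit unencoded in the URL. The Python sketch below escapes the colon with a single backslash and percent-encodes the whole query string, using the host and parameters from the message above. It does not address the parsing behaviour under discussion; the helper names are made up.

```python
from urllib.parse import urlencode


def escape_colons(term):
    """Prefix each colon with a single backslash for the query parser."""
    return term.replace(":", "\\:")


def select_url(term, base_url="http://localhost:8080/solr/select"):
    """Build a select URL with the escaped term safely percent-encoded."""
    params = {
        "q": escape_colons(term),
        "defType": "edismax",
        "debug": "query",
        "qf": "content",
    }
    return f"{base_url}?{urlencode(params)}"


print(select_url("12:34"))
```

Sending the encoded form removes one variable when comparing the debug output across clients.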

Re: Date for 4.4 solr release

2013-07-19 Thread Alexandre Rafalovitch
Shouldn't that be real_soon:[NOW/DAY+3DAYS TO NOW/DAY+10DAYS] You know, just to avoid the performance problems of the people asking every five minutes. :-) Regards, Alex. P.s. Or is this a premature optimization?
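Joke aside, the rounding in NOW/DAY has a practical effect: bounds computed from a rounded "now" stay identical all day, so repeated queries look the same and can be served from cache. The Python sketch below imitates that with plain datetime arithmetic; it is an approximation of the idea, not Solr's date math implementation, and the timestamps are arbitrary.

```python
from datetime import datetime, timedelta


def range_bounds(now, round_to_day):
    """Return (now+3 days, now+10 days), optionally rounding now down to midnight."""
    if round_to_day:
        now = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return now + timedelta(days=3), now + timedelta(days=10)


# Two requests made five minutes apart on the same day.
first = datetime(2013, 7, 19, 11, 10, 0)
second = first + timedelta(minutes=5)

print(range_bounds(first, True) == range_bounds(second, True))
print(range_bounds(first, False) == range_bounds(second, False))
```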

Re: The way edismax parses colon seems weird

2013-07-19 Thread Jack Krupansky
Very good chance that is it. -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Friday, July 19, 2013 7:16 PM To: solr-user@lucene.apache.org Subject: Re: The way edismax parses colon seems weird Could this be related: https://issues.apache.org/jira/browse/SOLR-433

Re: The way edismax parses colon seems weird

2013-07-19 Thread Jack Krupansky
I noticed that a single backslash in the URL query got turned into a backslash in the parsed query, which implies that the backslash was escaped (improperly) by Solr: http://localhost:8080/solr/select?q=12\:34&defType=edismax&debug=query&qf=content +12\:34 As a workaround, enclose the term in

RE: custom field type plugin

2013-07-19 Thread Kevin Stone
I can try again this weekend to get a clean environment. However, the order I did things in was the reverse of what you suggest. I got the AbstractSubTypeFieldType error first. Then I removed my jar from the sharedLib folder, and tried the war repacking solution. That is when I got NoClassDefFo

RE: custom field type plugin

2013-07-19 Thread Chris Hostetter
: I can try again this weekend to get a clean environment. However, the : order I did things in was the reverse of what you suggest. I got the Hmmm... then i'm kind of at a loss to explain what you're describing. need to see more details of the configs, dir structure, jar structure, etc...

Re: Solr index lot of pdf, doc, txt

2013-07-19 Thread sodoo
I'm using Solr 4.2, but I don't fully understand this recursive post approach. Maybe I could write a bash script, but a bash script is not a good solution. Is there another way? Please advise.

Re: Date for 4.4 solr release

2013-07-19 Thread adityab
+1 :D

Order by an expression in Solr

2013-07-19 Thread cmd.ares
In SQL you can order by an expression like: SELECT * FROM TABLE1 ORDER BY ( CASE WHEN COL1 BETWEEN ${PARAM1} - 10 AND ${PARAM1} + 10 AND COL2=${PARAM2} THEN 1 WHEN COL1 BETWEEN ${PARAM1} - 10 AND ${PARAM1} + 10 AND COL3=${PARAM2} THEN 2 WHEN COL1 - COL3=${PARAM3} THEN 3 WHE