How to index autoshape text in Excel 2007+

2013-07-21 Thread Hiroshi Tatsumi
Hi, I am using Solr 4.3.0. I'd like to index autoshape text in Excel 2007+ (.xlsx) files by using ExtractingRequestHandler, but I can't. I tried it with several MS Office files. The results are below. Success (I can index autoshape text.) - Excel 2003 (.xls) - Word 2003 (.doc) - Word 2007+ (.docx) Fail

Re: Solr 4.3.1 - SolrCloud nodes down and lost documents

2013-07-21 Thread Erick Erickson
Well, if I'm reading this right you had a node go out of circulation and then bounced nodes until that node became the leader. So of course it wouldn't have the documents (how could it?). Basically you shot yourself in the foot. Underlying here is why it took the machine you were re-starting so lo

Re: How to index autoshape text in Excel 2007+

2013-07-21 Thread Alexandre Rafalovitch
Solr uses Apache Tika for text extraction. Their mailing list (and issues list) might be a better place to resolve this. And if it is a bug, they would probably appreciate an example they could practice on. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linke
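
One way to check whether the missing text is a Tika extraction problem rather than a Solr indexing problem is to ask ExtractingRequestHandler to return the extracted text without indexing it. The sketch below only builds such a request URL. The base URL and the default `update/extract` handler path are assumptions about the local setup; `extractOnly`, `extractFormat` and `wt` are standard request parameters.

```python
from urllib.parse import urlencode


def extract_only_url(base_url: str, handler_path: str = "update/extract") -> str:
    """Build a URL that asks Solr Cell to return extracted text instead of indexing it.

    base_url and handler_path are assumptions about the local install;
    adjust them to match the handler registered in solrconfig.xml.
    """
    params = {
        "extractOnly": "true",    # return Tika's output, do not index the document
        "extractFormat": "text",  # plain text instead of the default XHTML
        "wt": "json",             # response writer
    }
    return f"{base_url.rstrip('/')}/{handler_path}?{urlencode(params)}"


# Hypothetical local Solr instance.
print(extract_only_url("http://localhost:8983/solr"))
```

Posting the .xlsx file to that URL as a multipart upload and searching the response for the autoshape text should show whether Tika extracted it at all. If the text is absent there, the Tika lists are the right place to report it.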

Re: DIH nested cached entities not working after upgrade

2013-07-21 Thread Alexandre Rafalovitch
Could you check with Solr 4.4 RC1: http://people.apache.org/~sarowe/staging_area/lucene-solr-4.4.0-RC1-rev1504776/solr/? There were some issues with nested keys ${a.b.c} due to the scoping mechanism implementation changes. Not a direct match, but might be easier to check this first than dig into d

Re: Avoid Solr Pivot Faceting Out of Memory / Shorter result for pivot faceting requests with facet.pivot.ngroup=true and facet.pivot.showLastList=false

2013-07-21 Thread Erick Erickson
Sorry, life's been really hectic lately. I don't know the pivot code, so can't make much of a comment on that. But when it comes to code changes, it's perfectly reasonable to open up a JIRA and attach the code as a patch. You might have to nudge people a bit to get them to carry it forward... The

RE: DIH nested cached entities not working after upgrade

2013-07-21 Thread Zac Smith
Same problem with 4.4.0 RC1. -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Sunday, July 21, 2013 5:57 AM To: solr-user@lucene.apache.org Subject: Re: DIH nested cached entities not working after upgrade Could you check with Solr 4.4 RC1: http://people.a

short-circuit OR operator in lucene/solr

2013-07-21 Thread Deepak Konidena
I understand that lucene's AND (&&), OR (||) and NOT (!) operators are shorthands for REQUIRED, OPTIONAL and EXCLUDE respectively, which is why one can't treat them as boolean operators (adhering to boolean algebra). I have been trying to construct a simple OR expression, as follows q = +(field1:
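
The REQUIRED / OPTIONAL / EXCLUDE reading described above can be illustrated with a toy model. This is a simplified sketch of matching only, not Lucene's implementation: it ignores scoring and minimum-should-match, and the terms `x` and `y` are hypothetical placeholders rather than the fields from the original query.

```python
from typing import Iterable, Set, Tuple

# Occurrence markers, mirroring the query-syntax prefixes.
MUST, SHOULD, MUST_NOT = "+", "", "-"


def matches(doc_terms: Set[str], clauses: Iterable[Tuple[str, str]]) -> bool:
    """Toy matcher: does a document (a set of terms) satisfy the clauses?"""
    clauses = list(clauses)
    required = [t for occur, t in clauses if occur == MUST]
    optional = [t for occur, t in clauses if occur == SHOULD]
    prohibited = [t for occur, t in clauses if occur == MUST_NOT]

    if any(t in doc_terms for t in prohibited):
        return False
    if not all(t in doc_terms for t in required):
        return False
    # Without required clauses, at least one optional clause must match.
    if not required:
        return any(t in doc_terms for t in optional)
    # With required clauses present, optional clauses only influence ranking.
    return True


# "x y" (both optional) behaves like x OR y.
print(matches({"y"}, [(SHOULD, "x"), (SHOULD, "y")]))
# "+x y": y is optional, so a document containing only y does not match.
print(matches({"y"}, [(MUST, "x"), (SHOULD, "y")]))
```

The second call shows why mixing a required clause with optional ones does not behave like a boolean OR: once any clause is required, the optional clauses no longer decide whether a document matches.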

Blog posts on extracting text features using Solr

2013-07-21 Thread Ken Krugler
Hi all, I recently posted parts 1 & 2 of a series on extracting text features for machine learning… http://www.scaleunlimited.com/2013/07/10/text-feature-selection-for-machine-learning-part-1/ http://www.scaleunlimited.com/2013/07/21/text-feature-selection-for-machine-learning-part-2/ It uses

Re: SolrEntityProcessor gets slower and slower

2013-07-21 Thread Manuel Le Normand
Minfeng, this issue gets worse as the number of shards you have grows; you can read Erick Erickson's post: http://grokbase.com/t/lucene/solr-user/131p75p833/how-distributed-queries-works. If you have 100M docs, I guess you are running into this issue. The common way to deal with this issue is by filteri

DIH and tinyint(1) Field

2013-07-21 Thread deniz
Hello, I have exactly the same problem as here http://lucene.472066.n3.nabble.com/how-to-avoid-DataImportHandler-from-interpreting-quot-tinyint-1-unsigned-quot-value-as-quot-Boolean--td4035241.html#a4036967 However, the solution there is ruining my date type fields... are there any oth

Re: DIH and tinyint(1) Field

2013-07-21 Thread Shalin Shekhar Mangar
Your database's JDBC driver is interpreting the tinyint(1) as a boolean. Solr 4.4 fixes the problem that affected date fields with convertType=true. It should be released by the end of this week. On Mon, Jul 22, 2013 at 12:18 PM, deniz wrote: > Hello, > > I have exactly the same problem as here > >