Re: Getting 411 Length required when adding docs

2011-11-15 Thread Chris Hostetter
: i am facing this strange issue of an HTTP 411 Length Required error. My Solr is hosted : on a third-party hosting company and it was working fine all this while. : i really don't understand why this happened. Attached is the stack trace; any : help will be appreciated. General rule of debugging client+serve

Different maxAnalyzedChars value in solrconfig.xml

2011-11-15 Thread Shyam Bhaskaran
Hi, I wanted to know whether we can set different maxAnalyzedChars values in solrconfig.xml for different fields. Can someone point out if this is possible at all? My requirement needs me to set different values for the maxAnalyzedChars parameter for two different fields. For exampl

Re: Extended Dismax QueryParser

2011-11-15 Thread Jamie Johnson
Also, to be fair, I'm not working with trunk, so I ask this without knowing whether this is fixed on trunk or not. If it's fixed already on trunk please just let me know. On Tue, Nov 15, 2011 at 10:10 PM, Jamie Johnson wrote: > I ran into an issue with the extended dismax query parser and multiple > tr

Extended Dismax QueryParser

2011-11-15 Thread Jamie Johnson
I ran into an issue with the extended dismax query parser and multiple trailing operators. I noticed that there is a fix for this for the dismax query parser (https://issues.apache.org/jira/browse/SOLR-874), which I've manually used to patch my system. Is there any chance of getting this done for ed

Re: Getting 411 Length required when adding docs

2011-11-15 Thread Darniz
Hello, does anyone have any advice? This is the code I am using: server = new CommonsHttpSolrServer("http://www.mysolrserver.com/solr"); Credentials def = new UsernamePasswordCredentials("xxx","xxx"); server.getHttpClient().getState().setCredentials(AuthScope.ANY, def); server.getHttpClient().
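A minimal SolrJ 3.x sketch of the setup being described, with a hypothetical URL and credentials. Enabling preemptive authentication on the underlying Commons HttpClient is one workaround sometimes suggested when authenticated updates fail with "411 Length Required" behind a proxy; it is not necessarily the fix in this particular case:

    import org.apache.commons.httpclient.UsernamePasswordCredentials;
    import org.apache.commons.httpclient.auth.AuthScope;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class PreemptiveAuthExample {
        public static void main(String[] args) throws Exception {
            // Hypothetical URL and credentials -- replace with your own.
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://www.mysolrserver.com/solr");

            server.getHttpClient().getState().setCredentials(
                AuthScope.ANY, new UsernamePasswordCredentials("user", "pass"));

            // Send credentials on the first request instead of waiting for a 401
            // challenge; some proxies reject the challenge/retry round trip with
            // "411 Length Required".
            server.getHttpClient().getParams().setAuthenticationPreemptive(true);

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "test-1");
            server.add(doc);
            server.commit();
        }
    }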

Inserting documents using get method

2011-11-15 Thread Darniz
Hello All, I am trying to insert a document using the server.addBean(obj) method. Somehow I am getting an HTTP 411 Length Required error. After trying a lot I decided to change my method from POST to GET. If I open a browser and execute this query mysolrserver/solr/update?stream.body=testTestL it work

Re: License Info

2011-11-15 Thread Chris Hostetter
: Since Apache Solr is governed by Apache License 2.0 - does it mean that all : jar files bundled within Solr are also governed by the same License ? Do I : have to worry about checking the License information of all bundled jar : files in my commercial Solr powered application ? You can be certa

Re: admin index version not updating

2011-11-15 Thread Chris Hostetter
: I have a setup with a master and single slave, using the collection : distribution scripts. I'm not sure if it's relevant, but I'm running : multicore also. I am on version 3.4.0 (we are upgrading from 1.3). : : My understanding that the indexVersion (a number) reported by the stats : page

Re: Help! - ContentStreamUpdateRequest

2011-11-15 Thread Erick Erickson
That's odd. What are your autocommit parameters? And are you either committing or optimizing as part of your program? I'd bump the autocommit parameters up and NOT commit (or optimize) from your client if you are. Best, Erick On Tue, Nov 15, 2011 at 2:17 PM, Tod wrote: > Otis, > > The files ar

Re: getting lots of errors doing bulk insertion

2011-11-15 Thread Otis Gospodnetic
Jason, What you read is valid advice.  Just don't commit that often, or even at all until the very end if you can wait. :) And make sure you are indexing to a machine that doesn't warm up caches and a searcher every time you commit. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - N
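A minimal SolrJ sketch of this advice, with a hypothetical URL and field names: add documents in chunks and issue a single commit at the very end rather than one per chunk:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BulkIndexer {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
            for (int i = 0; i < 100000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-" + i);
                doc.addField("text", "body of document " + i);
                batch.add(doc);

                // Send documents in chunks of 500, but do NOT commit per chunk.
                if (batch.size() == 500) {
                    server.add(batch);
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                server.add(batch);
            }
            // One commit at the very end.
            server.commit();
        }
    }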

Re: Deploy Solritas as a separate application?

2011-11-15 Thread jwang
Thanks Erik. Our web tier is in Java. I probably will implement something using the same velocity template. As you said I think there is plenty of code that I can borrow from the Solritas project. -- View this message in context: http://lucene.472066.n3.nabble.com/Deploy-Solritas-as-a-separate-a

Re: Auto-scaling solr setup

2011-11-15 Thread jwang
An option is to wrap your Solr slave in a beanstalk and have it take care of the auto-scaling. -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-scaling-solr-setup-tp3029913p3511140.html Sent from the Solr - User mailing list archive at Nabble.com.

Problems installing Solr PHP extension

2011-11-15 Thread Travis Low
I know this isn't strictly Solr, but I've been at this for hours and I'm at my wits' end. I cannot install the Solr PECL extension (http://pecl.php.net/package/solr), either by command line "pecl install solr" or by downloading and using phpize. Always the same error, which I see here: http://www

Re: Locating index files?

2011-11-15 Thread John
Yes, it's a Nutch class that provides integration with Solr and in doing so, should place the files where Solr expects them based on the Solr config file. Based on the use of Solr and its configuration, I placed the post here. The issue is resolved. -- View this message in context: http://lucene.

RE: Easy way to tell if there are pending documents

2011-11-15 Thread Latter, Antoine
Thank you, that does help - but I am looking more for a way to get at this programmatically. -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: Tuesday, November 15, 2011 11:22 AM To: solr-user@lucene.apache.org Subject: Re: Easy way to tell if there are
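One rough way to get at this programmatically is to fetch the same stats the admin page shows and look for the docsPending entry. The sketch below only assumes the layout of the 3.x stats page (/admin/stats.jsp serves XML); the URL, path, and exact structure may differ for your setup, so treat it as a starting point rather than a documented API:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class PendingDocsCheck {
        public static void main(String[] args) throws Exception {
            // Hypothetical host/core path -- adjust to your installation.
            URL stats = new URL("http://localhost:8983/solr/admin/stats.jsp");
            BufferedReader in = new BufferedReader(
                new InputStreamReader(stats.openStream(), "UTF-8"));
            StringBuilder page = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
            in.close();

            // Crude check: the update handler section of the stats output is
            // expected to contain a "docsPending" entry; real code should parse
            // the XML instead of scanning for the string.
            int idx = page.indexOf("docsPending");
            System.out.println(idx >= 0
                ? "docsPending stat found near offset " + idx
                : "no docsPending stat found");
        }
    }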

Re: Locating index files?

2011-11-15 Thread Chris Hostetter
FWIW: Based on your log description, it sounds like you are using some external index builder named "solrindex" ... the details on what exactly this "solrindex" app is are kind of crucial to understanding what/where "solrindex" is doing and where it's putting your data. "solrindex" is not the n

getting lots of errors doing bulk insertion

2011-11-15 Thread Jason Toy
I've written a script that does bulk insertion from my database; it grabs chunks of 500 docs (out of 100 million) and inserts them into Solr over HTTP. I have 5 threads that are inserting from a queue. After each insert I issue a commit. Every 20 or so inserts I get this error message: Error:

Re: naming facet queries?

2011-11-15 Thread Erik Hatcher
Yes... use key instead of name in your example below :) On Nov 15, 2011, at 15:12 , Robert Stewart wrote: > Is there any way to give a name to a facet query, so you can pick > facet values from results us
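A minimal SolrJ sketch of what the key local param looks like in practice, using the publish_date ranges from the question below and hypothetical labels; the counts come back keyed by the label instead of by the raw query string:

    import java.util.Map;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class NamedFacetQueries {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            SolrQuery q = new SolrQuery("*:*");
            q.setRows(0);
            q.setFacet(true);
            // The {!key=...} local param labels each facet query in the response.
            q.addFacetQuery("{!key=last_week}publish_date:[NOW-7DAY TO NOW]");
            q.addFacetQuery("{!key=last_month}publish_date:[NOW-1MONTH TO NOW]");

            QueryResponse rsp = server.query(q);
            Map<String, Integer> counts = rsp.getFacetQuery();
            System.out.println("last_week  = " + counts.get("last_week"));
            System.out.println("last_month = " + counts.get("last_month"));
        }
    }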

Re: Phrase between quotes with dismax edismax

2011-11-15 Thread Erick Erickson
The query re-writing is...er...interesting, and I'll skip that for now... As for why you're not getting results, see the mm parameter here: http://wiki.apache.org/solr/DisMaxQParserPlugin Especially the line "The default value is 100% (all clauses must match)", so I suspect your categories not mat
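A small sketch of how the mm parameter might be relaxed from the client side, using the query from the original post further down; the defType and mm values here are illustrative assumptions, not a recommended setting:

    import org.apache.solr.client.solrj.SolrQuery;

    public class MmExample {
        public static void main(String[] args) {
            SolrQuery q = new SolrQuery(
                "\"chef de projet\" category1071 category10055078 category10055405");
            q.set("defType", "edismax");
            // With mm at 100% every clause must match; a value like this lets
            // documents match on only some of the optional clauses.
            q.set("mm", "2<-1 4<75%");
            System.out.println(q);
        }
    }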

naming facet queries?

2011-11-15 Thread Robert Stewart
Is there any way to give a name to a facet query, so you can pick facet values from results using some name as a key (rather than looking for a match via the query itself)? For example, in the request handler I have: publish_date:[NOW-7DAY TO NOW] publish_date:[NOW-1MONTH TO NOW] I'd like results to h

Re: two word phrase search using dismax

2011-11-15 Thread alxsss
Hello, thanks for your letter. I investigated further and found out that we have title scored higher than content in the qf field, and the docs in the first places have one of the words in the title but not both of them. The doc in first place has only one of the words in the content. Docs with both
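A small sketch of adjusting the relative qf boosts from the client, using the title and content fields mentioned above; the boost values (and the pf phrase boost) are hypothetical and would need tuning against your own data:

    import org.apache.solr.client.solrj.SolrQuery;

    public class QfBoostExample {
        public static void main(String[] args) {
            SolrQuery q = new SolrQuery("two words");
            q.set("defType", "dismax");
            // Hypothetical boosts: weighting content higher than title changes
            // which documents rank first for multi-word queries.
            q.set("qf", "title^1.0 content^2.0");
            q.set("pf", "title^1.0 content^2.0");  // phrase boost on the same fields
            System.out.println(q);
        }
    }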

RE: File based wordlists for spellchecker

2011-11-15 Thread Dyer, James
>Doesn't IndexBasedSpellChecker simply extract (word, freq) pairs from the index, >put them into the spellcheckingIndex, and forget about the index altogether? >If so, then I'd only need to override index building, and reuse that. >Am I correct here, or does it actually go back to the original index? Yo

Re: Help! - ContentStreamUpdateRequest

2011-11-15 Thread Tod
Otis, The files are only part of the payload. The supporting metadata exists in a database. I'm pulling that information, as well as the name and location of the file, from the database and then sending it to a remote Solr instance to be indexed. I've heard Solr would prefer to get documen

Re: Search in multivalued string field does not work

2011-11-15 Thread Erick Erickson
Well, based on what you've written, they should be returning similar results, so there must be something else lurking. Possibilities: 1> the index is different on the two machines. How did you index server B? 2> Look at your admin/analysis page and try your text. This will help if you've over

Re: NGramFilterFactory - proximity and percentage of ngrams found

2011-11-15 Thread Erick Erickson
Well, I can have a go at two of them... (1) there isn't any relationship here. Although the q.op parameter can be used, see: http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29 (2) I have no real clue (3) Probably the edge factory would be good here, although

Re: CachedSqlEntityProcessor

2011-11-15 Thread Mark
FYI, my sub-entity looks like the following. On 11/15/11 10:42 AM, Mark wrote: I am trying to use the CachedSqlEntityProcessor with Solr 1.4.2; however, I am not seeing any performance gains. I've read some other posts that reference cacheKey and cacheLookup; however, I don't see any reference to

CachedSqlEntityProcessor

2011-11-15 Thread Mark
I am trying to use the CachedSqlEntityProcessor with Solr 1.4.2; however, I am not seeing any performance gains. I've read some other posts that reference cacheKey and cacheLookup; however, I don't see any reference to them in the wiki http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityPr

Re: File based wordlists for spellchecker

2011-11-15 Thread Tomasz Wegrzanowski
On 15 November 2011 15:55, Dyer, James wrote: > Writing your own spellchecker to do what you propose might be difficult.  At > issue is the fact that both the "index-based" and "file-based" spellcheckers > are designed to work off a Lucene index and use the document frequency > reported by Luce

How to programmatically check if the index is optimized or not?

2011-11-15 Thread Pranav Prakash
Hi, After the commit, my optimize usually takes 20 minutes. The thing is that I need to know programmatically if the optimization has completed or not. Is there an API call through which I can know the status of the optimization? *Pranav Prakash* "temet nosce" Twitter
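One possible approach, assuming the /admin/luke handler is registered (as it is in the example solrconfig), is to read the index-level stats it reports through SolrJ's LukeRequest; whether an "optimized" flag appears there, and under what key, should be verified against your Solr version:

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.LukeRequest;
    import org.apache.solr.client.solrj.response.LukeResponse;

    public class OptimizeCheck {
        public static void main(String[] args) throws Exception {
            // Hypothetical core URL -- adjust to your installation.
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");

            LukeRequest luke = new LukeRequest();   // hits the /admin/luke handler
            LukeResponse rsp = luke.process(server);

            // Index-level stats; the "optimized" entry is the one of interest here.
            Object optimized = rsp.getIndexInfo().get("optimized");
            System.out.println("optimized: " + optimized);
        }
    }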

Highlighting with a default copy field with EdgeNGramFilterFactory

2011-11-15 Thread João Nelas
Hi, I have a defaultSearch field that's also the target of a copyField with source="*". If I do a wildcard search "q=sas*" I get a match on all docs that have any word starting with "sas" and I get a separate highlight for every field where there is a match (using "hl=on&hl.fl=*"). This is grea

Re: Easy way to tell if there are pending documents

2011-11-15 Thread Otis Gospodnetic
Antoine, On the Solr Admin Stats page, search for "docsPending".  I think this is what you are looking for. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ > >From: "Latter, Antoine" >To: "'sol

Re: get a total count

2011-11-15 Thread Otis Gospodnetic
I'm assuming the question was about how MANY documents have been indexed across all shards. Answer #1: Look at the Solr Admin Stats page on each of your Solr instances and add up the numDocs numbers you see there. Answer #2: Use Sematext's free Performance Monitoring tool for Solr. On Index repor
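A small SolrJ sketch of Answer #1 done programmatically, with hypothetical shard URLs: ask each instance for a zero-row *:* query and add up the numFound values:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class TotalDocCount {
        public static void main(String[] args) throws Exception {
            // Hypothetical shard URLs; list every Solr instance you index to.
            String[] shards = {
                "http://shard1:8983/solr",
                "http://shard2:8983/solr"
            };

            long total = 0;
            for (String url : shards) {
                SolrServer server = new CommonsHttpSolrServer(url);
                SolrQuery q = new SolrQuery("*:*");
                q.setRows(0);  // only the count is needed, not the documents
                total += server.query(q).getResults().getNumFound();
            }
            System.out.println("total docs across all instances: " + total);
        }
    }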

Re: Help! - ContentStreamUpdateRequest

2011-11-15 Thread Otis Gospodnetic
Hi, How about just concatenating your files into one?  Would that work for you? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ > >From: Tod >To: solr-user@lucene.apache.org >Sent: Monday,

RE: File based wordlists for spellchecker

2011-11-15 Thread Dyer, James
Writing your own spellchecker to do what you propose might be difficult. At issue is the fact that both the "index-based" and "file-based" spellcheckers are designed to work off a Lucene index and use the document frequency reported by Lucene to base their decisions. Both spell checkers build

RE: Index format difference between 4.0 and 3.4

2011-11-15 Thread Latter, Antoine
We didn't have to re-index when we upgraded - but if you're using a master/slave setup, you won't be able to replicate from a higher version to a lower version - old solr cannot read the new indices. -Original Message- From: roz dev [mailto:rozde...@gmail.com] Sent: Monday, November 14,

Phrase between quotes with dismax edismax

2011-11-15 Thread Jean-Claude Dauphin
Hello, I would be very grateful if somebody could explain to me what the exact problem is and how to get the right results. Using dismax or edismax with the following query: EDISMAX query (q)=("chef de projet" category1071 category10055078 category10055405) gives no results (should get 33 docum

Re: memory usage keep increase

2011-11-15 Thread Erick Erickson
I'm pretty sure not. The phrase "virtual memory address space" is important here; that's not physical memory... Best, Erick On Mon, Nov 14, 2011 at 11:55 AM, Yongtao Liu wrote: > Hi all, > > I saw an issue where RAM usage keeps increasing when we run queries. > After looking in the code, it looks like Lucene u

Re: get a total count

2011-11-15 Thread Erick Erickson
Not sure I understand the question. You have to specifically address the docs to a particular shard when indexing, so you should know already. Solr automagically distributes *queries* across shards (if you've configured your installation for it), but not docs during indexing. If that makes no sens

Re: Using solr during optimization

2011-11-15 Thread Isan Fulia
Hi Mark, Thanks for the reply. You are right. We need to test first by decreasing the mergeFactor, seeing the indexing as well as searching performance, and having some numbers in hand. Also, after a partial optimize with the same mergeFactor, how long the performance lasts (both searching and indexing)

Search in multivalued string field does not work

2011-11-15 Thread mechravi25
Hi, I have some data indexed in two servers A and B. This is how the data looks in schema.xml for datatype "text" I have two another d

Re: Casesensitive search problem

2011-11-15 Thread Ahmet Arslan
> HI, > Even if I have used all the possible ways like <filter class="solr.LowerCaseFilterFactory"/> I am still getting the > same problem. If > anyone has faced the same problem before, please let me > know how you solved it. WordDelimiterFilterFactory with the splitOnCaseChange setting may cause this. It will b

Re: Solr 3.3 Sorting is not working for long fields

2011-11-15 Thread rajini maski
Thank you for the responses :) Found that the bug was in the naming convention of the fields (for tlong/long). I had given a numeric character as the name of the field. The Studyid field name was 450; I changed it to S450 and it started working :) Thank you all. Regards, Rajani On Tue, Nov 15, 2011 at 3

JVM Bugs affecting Lucene & Solr

2011-11-15 Thread Simon Willnauer
Hey folks, we recently looked into https://issues.apache.org/jira/browse/LUCENE-3235 again, an issue where a class using ConcurrentHashMap hangs / deadlocks on specific JVMs in combination with specific CPUs. It turns out it's a JVM bug in Sun / Oracle Java 1.5 as well as Java 1.6. It's apparently fix

Re: Solr 3.3 Sorting is not working for long fields

2011-11-15 Thread Michael Kuhlmann
Hi, On 15.11.2011 10:25, rajini maski wrote: [...] [...] Hmh, why didn't you just change the field type to tlong as you mentioned before? Instead you changed the class of the long type. There's nothing against this, it's just a bit confusing since long fields nor

Re: Solr 3.3 Sorting is not working for long fields

2011-11-15 Thread rajini maski
All, I didn't find any mistake in the schema. Below I have posted my schema file

Re: Solr 3.3 Sorting is not working for long fields

2011-11-15 Thread rajini maski
All, On Tue, Nov 15, 2011 at 1:21 PM, kashif.khan wrote: > Obviously there is some problem somewhere in the schema or other files. > The default Solr demo, which runs using start.jar, works well with the > long field. It is just that we do not know where the problem causing > this >

Re: creating solr index from nutch segments, no errors, no results

2011-11-15 Thread Michael Kuhlmann
I don't know much about Nutch, but it looks like there's simply a commit missing at the end. Try to send a commit, e.g. by executing curl http://host:port/solr//update -H "Content-Type: text/xml" --data-binary '<commit/>' -Kuli On 15.11.2011 09:11, Armin Schleicher wrote: hi there, [...]
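For completeness, the same commit can be sent from Java with SolrJ (hypothetical core URL):

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class SendCommit {
        public static void main(String[] args) throws Exception {
            // Point at the core that the Nutch solrindex job writes to.
            CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr");
            server.commit();  // makes previously added documents visible to searches
        }
    }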

creating solr index from nutch segments, no errors, no results

2011-11-15 Thread Armin Schleicher
Hi there, I am trying to create a full-text index over Internet Archive .warc files. The whole procedure (as described in the following) seems to work fine; I do not get any errors or warnings. However, there is no data being passed to Solr; at least q=*:* returns nothing. I double-checked the

Re: Solr 3.3 Sorting is not working for long fields

2011-11-15 Thread kashif.khan
Obviously there is some problem somewhere in the schema or other files. The default Solr demo, which runs using start.jar, works well with the long field. It is just that we do not know where the problem causing this error is. -- View this message in context: http://lucene.472066.n3.nabble

Re: two word phrase search using dismax

2011-11-15 Thread Michael Kuhlmann
On 14.11.2011 21:50, alx...@aim.com wrote: Hello, I use Solr 3.4 and Nutch 1.3. In the request handler we have 2<-1 5<-2 6<90%. As far as I know this means that for a two-word phrase search the match must be 100%. However, I noticed that in most cases documents with both words are ranked around 20 place

NGramFilterFactory - proximity and percentage of ngrams found

2011-11-15 Thread elisabeth benoit
Hello, I'm trying to use NGramFilterFactory for spell correction. I have three questions. 1) I use an edismax request handler. In this case, what is the relation between my ngrams and my default operator (q.op), if there is any? 2) Is there a way to control the proximity and percentage of ngrams