Solr Queries

2009-10-06 Thread Pravin Karne
Hi, I am new to Solr. I have the following queries: 1. Does Solr work in a distributed environment? If yes, how do I configure it? 2. Does Solr have Hadoop support? If yes, how do I set it up with Hadoop/HDFS? (Note: I am familiar with Hadoop) 3. I have employee information(id, name ,

Re: solr 1.4 formats last_index_time for SQL differently than 1.3 ?!?

2009-10-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
really? I don't remember that being changed. what difference do u notice? On Wed, Oct 7, 2009 at 2:30 AM, michael8 wrote: > > Just looking for confirmation from others, but it appears that the formatting > of last_index_time from dataimport.properties (using DataImportHandler) is > different in

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Ravi Kiran
Hello Mr.Hostetter, Thank you for patiently reading through my post, I apologize for being cryptic in my previous messages.. >>when you cut/pasted the facet output, you excluded the field names. based >>on the schema & solrconfig.xml snippets you posted later, i'm assuming >>they are usstate, and

Re: DataImportHandler problem: Feeding the XPathEntityProcessor with the FieldReaderDataSource

2009-10-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
hi Lance. db.blob is the correct field name so that is fine. you can probably open an issue and provide the testcase as a patch. That can help us track this better On Wed, Oct 7, 2009 at 12:45 AM, Lance Norskog wrote: > A side note that might help: if I change the dataField from 'db.blob' > to 'b

Re: Problems with DIH XPath flatten

2009-10-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
send a small sample xml snippet you are trying to index and it may help On Tue, Oct 6, 2009 at 9:29 PM, Adam Foltzer wrote: > Hi all, > > I'm trying to set up DataImportHandler to index some XML documents available > over web services. The XML includes both content and metadata, so for the > inde

Re: search by some functionality

2009-10-06 Thread David Smiley @MITRE.org
Maybe I'm missing something, but function queries aren't involved in determining whether a document matches or not, only its score. How is a custom function / value-source going to filter? ~ David Smiley hossman wrote: > > > : I read about this chapter before. It did not mention how to cre

Re: Importing CSV file slow/crashes

2009-10-06 Thread Nasseam Elkarra
Hello Yonik, Thank you for looking into this. Your question of if I'm using stock solr put me in the right direction. I am in fact using a patched version of solr to get hierarchal facet support (http://issues.apache.org/jira/browse/SOLR-64 ). I took out the 4 hiefacet fields from the schema

RE: Need "OR" in DisMax Query

2009-10-06 Thread Dean Missikowski (Consultant), CLSA
Hi David, See this thread for how I use OR with Dismax. http://www.mail-archive.com/solr-user@lucene.apache.org/msg19375.html -- Dean -Original Message- From: Ingo Renner [mailto:i...@typo3.org] Sent: 06 October 2009 05:00 To: solr-user@lucene.apache.org Subject: Re: Need "OR" in DisMax

Re: Authentication/Authorization with Master-Slave over HTTP

2009-10-06 Thread Chris Hostetter
: I want to be able to have SOLR Slave instance on publicly available host : (accessible via HTTP), and synchronize with Master securely (via HTTP) HTTP based replication only works with the new ReplicationHandler ... if you setup a proxy in front of your Master (either as a separate daemon,

Re: Why isn't the DateField implementation of ISO 8601 broader?

2009-10-06 Thread Chris Hostetter
: I would expect field:2001-03 to be a hit on a partial match such as : field:[2001-02-28T00:00:00Z TO 2001-03-13T00:00:00Z]. I suppose that my : expectation would be that field:2001-03 would be counted once per day for each : day in its range. It would follow that a user looking for documents re

Re: TermsComponent or auto-suggest with filter

2009-10-06 Thread R. Tan
Nice. In comparison, how do you do it with faceting? > "Two other approaches are to use either the TermsComponent (new in Solr > 1.4) or faceting." On Wed, Oct 7, 2009 at 1:51 AM, Jay Hill wrote: > Have a look at a blog I posted on how to use EdgeNGrams to build an > auto-suggest tool: > > ht

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Chris Hostetter
A few comments about the info you've provided... when you cut/pasted the facet output, you excluded the field names. based on the schema & solrconfig.xml snippets you posted later, i'm assuming they are usstate, and keyword, but you have to be explicit so that people can help correlate the r

Re: search by some functionality

2009-10-06 Thread Chris Hostetter
: I read about this chapter before. It did not mention how to create my : own customized function. : Can you point me to some instructions? The first step is to figure out how you can code your custom functionality as an extension of the ValueSource class... http://lucene.apache.org/solr/api/or

Re: Why isn't the DateField implementation of ISO 8601 broader?

2009-10-06 Thread Walter Lewis
On 6 Oct 09, at 5:31 PM, Chris Hostetter wrote: ...your expectations may be different then everyone elses. by requiring that the dates be explicit there is no ambiguity, you are in control of the behavior. The power of some of the other formulas in ISO 8601 is that you don't introduce f

Re: Why isn't the DateField implementation of ISO 8601 broader?

2009-10-06 Thread Tricia Williams
Thanks for making me think about this a little bit deeper, Hoss. Comments in-line. Chris Hostetter wrote: because those would be ambiguous. if you just indexed field:2001-03 would you expect it to match field:[2001-02-28T00:00:00Z TO 2001-03-13T00:00:00Z] ... what about date faceting, what s

Re: Question about PatternReplace filter and automatic Synonym generation

2009-10-06 Thread Chris Hostetter
: I ll try to explain with an example. Given the term 'it!' in the title, it : should match both 'it' and 'it!' in the query as an exact match. Currently, : this is done by using a synonym entry (and index time SynonymFilter) as : follows: : : it! => it, it! : : Now, the above holds true for

Re: Importing CSV file slow/crashes

2009-10-06 Thread Yonik Seeley
On Tue, Oct 6, 2009 at 1:06 PM, Nasseam Elkarra wrote: > I had a dev build of 1.4 from 5/1/2009 and importing a 20K row took less > than a minute. Updating to the latest as of yesterday, the import is really > slow and I had to cancel it after a half hour. This prevented me from > upgrading a few

Re: conditional sorting

2009-10-06 Thread Chris Hostetter
: I tried to simplify the problem, but the point is that I could have : really: complex requirements. For instance, "if in the first 5 results : none are older than one year, use sort by X, otherwise sort by Y". First 5 in what order? X? Y or something else? : So, the question is, is there a

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
Got it. Sorry for not having an answer for your problem. On 10/06/2009 04:58 PM, Ravi Kiran wrote: You dont see any facet fields in my query because I have configured them in the solrconfig.xml to give specific fields as facets by default in the dismax and standard handlers so that I dont have t

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Ravi Kiran
You dont see any facet fields in my query because I have configured them in the solrconfig.xml to give specific fields as facets by default in the dismax and standard handlers so that I dont have to specify all those fields individually everytime I query, all I need to do is just set facet=true tha

Re: What to set in query.setMaxRows()?

2009-10-06 Thread Chris Hostetter
: Sorry about asking this here, but I can't reach wiki.apache.org right now. : What do I set in query.setMaxRows() to get all the rows? http://wiki.apache.org/solr/FAQ#How_can_I_get_ALL_the_matching_documents_back.3F_..._How_can_I_return_an_unlimited_number_of_rows.3F How can I get ALL the mat
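
Rather than asking for every match in one request, a client can walk the result set in pages using Solr's start and rows parameters. The sketch below only computes the (start, rows) windows; it assumes the total (numFound) is already known from a first response, and the function name page_windows is hypothetical, not part of SolrJ.

```python
def page_windows(num_found, page_size):
    """Yield (start, rows) pairs that together cover num_found matches.

    num_found is assumed to come from an earlier response's numFound value.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, num_found, page_size):
        # The last window may be shorter than page_size.
        yield start, min(page_size, num_found - start)


if __name__ == "__main__":
    for start, rows in page_windows(25, 10):
        print(f"start={start}&rows={rows}")
```

Each pair maps directly onto one request, so memory use on both client and server stays bounded by the page size.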

Re: stats page slow in latest nightly

2009-10-06 Thread Chris Hostetter
: When I was working on it, I was actually going to default to not show : the size, and make you click a link that added a param to get the sizes : in the display too. But I foolishly didn't bring it up when Hoss made my : life easier with his simpler patch. we can always turn the size estimator

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
I am stumped then. I had a similar issue when I was using a field that was being heavily tokenized, but I corrected the issue by using a field(generated using copyField) that doesn't get analyzed at all. On the query you provided before I didn't see the parameters to tell solr for which field

Re: Why isn't the DateField implementation of ISO 8601 broader?

2009-10-06 Thread Chris Hostetter
:My question is why isn't the DateField implementation of ISO 8601 broader : so that it could include and MM as acceptable date strings? What because those would be ambiguous. if you just indexed field:2001-03 would you expect it to match field:[2001-02-28T00:00:00Z TO 2001-03-13
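
One way to stay explicit, as suggested above, is to expand a partial date on the client before querying. This is a minimal sketch of one possible interpretation (whole calendar month, second precision, inclusive at both ends); the helper name month_to_range is hypothetical, and other interpretations are equally defensible, which is exactly the ambiguity being discussed.

```python
import calendar


def month_to_range(year_month):
    """Turn 'YYYY-MM' into an explicit, inclusive Solr date range string."""
    year_text, month_text = year_month.split("-")
    year, month = int(year_text), int(month_text)
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    start = f"{year:04d}-{month:02d}-01T00:00:00Z"
    end = f"{year:04d}-{month:02d}-{last_day:02d}T23:59:59Z"
    return f"[{start} TO {end}]"


if __name__ == "__main__":
    print("field:" + month_to_range("2001-03"))
```

Because the range is built by the caller, the caller decides what "2001-03" means instead of the field type guessing.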

Re: Merging multicore indexes

2009-10-06 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 2:40 AM, Paul Rosen wrote: > Shalin Shekhar Mangar wrote: > > The path on the wiki page was wrong. You need to use the adminPath in the >> url. Look at the adminPath attribute in solr.xml. It is typically >> /admin/cores >> >> So the correct path for you would be: >> >> >>

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
Mark Miller wrote: > Jeff Newburn wrote: > >> So could that potentially explain our use of more ram on indexing? Or is >> this a rare edge case. >> >> > I think it could explain the JVM using more RAM while indexing - but it > should be fairly easily recoverable from what I can tell - so

Re: Problems with DIH XPath flatten

2009-10-06 Thread Adam Foltzer
Hi Shalin, Good question; sorry I forgot it in the initial post. I have tried with both a nightly build from earlier this month (Oct 2 I believe) as well as a build from the trunk as of yesterday afternoon. Adam On Tue, Oct 6, 2009 at 5:04 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrot

Re: Merging multicore indexes

2009-10-06 Thread Paul Rosen
Shalin Shekhar Mangar wrote: The path on the wiki page was wrong. You need to use the adminPath in the url. Look at the adminPath attribute in solr.xml. It is typically /admin/cores So the correct path for you would be: http://localhost:8983/solr/admin/cores?action=mergeindexes&core=merged&ind
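
The point about adminPath can be sketched as a small URL builder. This is an illustration only: the function name core_admin_url is hypothetical, the admin path must be read from your own solr.xml, and only the action and core parameters visible in the message above are used (the rest of the quoted URL is cut off and is not reproduced here).

```python
from urllib.parse import urlencode


def core_admin_url(solr_base, admin_path, **params):
    """Join the Solr base URL with the adminPath from solr.xml and add parameters."""
    path = solr_base.rstrip("/") + "/" + admin_path.strip("/")
    return path + "?" + urlencode(params)


if __name__ == "__main__":
    print(core_admin_url("http://localhost:8983/solr", "/admin/cores",
                         action="mergeindexes", core="merged"))
```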

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Ravi Kiran
Yes Exactly the same On Tue, Oct 6, 2009 at 4:52 PM, Christian Zambrano wrote: > And you had the analyzer for that field set-up the same way as shown on > your previous e-mail when you indexed the data? > > > > > On 10/06/2009 03:46 PM, Ravi Kiran wrote: > >> I did infact check it out any there i

Re: Problems with DIH XPath flatten

2009-10-06 Thread Shalin Shekhar Mangar
On Tue, Oct 6, 2009 at 9:29 PM, Adam Foltzer wrote: > Hi all, > > I'm trying to set up DataImportHandler to index some XML documents > available > over web services. The XML includes both content and metadata, so for the > indexable content, I'm trying to just index everything under the content >

solr 1.4 formats last_index_time for SQL differently than 1.3 ?!?

2009-10-06 Thread michael8
Just looking for confirmation from others, but it appears that the formatting of last_index_time from dataimport.properties (using DataImportHandler) is different in 1.4 vs. that in 1.3. I was troubleshooting why delta imports are no longer working for me after moving over to solr 1.4 (10/2 nighl

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
And you had the analyzer for that field set-up the same way as shown on your previous e-mail when you indexed the data? On 10/06/2009 03:46 PM, Ravi Kiran wrote: I did infact check it out any there is no weirdness in analysis page...see result below Index Analyzer org.apache.solr.analysis.Ke

Re: stats page slow in latest nightly

2009-10-06 Thread Joe Calderon
thx much guys, no biggie for me, i just wanted to get to the bottom of it in case i had screwed something else up.. --joe On Tue, Oct 6, 2009 at 1:19 PM, Mark Miller wrote: > I was worried about that actually. I havn't tested how fast the RAM > estimator is on huge String FieldCaches - it will b

RE: Solr Timeouts

2009-10-06 Thread Giovanni Fernandez-Kincade
Yeah that's exactly right Mark. What does the "maxCommitsToKeep"(from SolrDeletionPolicy in SolrConfig.xml) parameter actually do? Increasing this value seems to have helped a little, but I'm wary of cranking it without having a better understanding of what it does. -Original Message- F

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Ravi Kiran
I did infact check it out any there is no weirdness in analysis page...see result below Index Analyzer org.apache.solr.analysis.KeywordTokenizerFactory {} term position 1 term text New York term type word source start,end 0,8 payload org.apache.solr.analysis.TrimFilterFactory {} term position 1

Re: Solr Timeouts

2009-10-06 Thread Mark Miller
It sounds like he is indexing on a local disk, but reading the files to be index from NFS - which would be fine. You can get Lucene indexes to work on NFS (though still not recommended) , but you need to use a custom IndexDeletionPolicy to keep older commit points around longer and be sure not to

RE: Solr Timeouts

2009-10-06 Thread Feak, Todd
I seem to recall hearing something about *not* putting a Solr index directory on an NFS mount. Might want to search on that. That, of course, doesn't have anything to do with commits showing up unexpectedly in stack traces, per your original email. -Todd -Original Message- From: Giovan

Re: Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Christian Zambrano
Have you tried using the Analysis page to see what tokens are generated for the string "New York"? It could be one of the token filter is adding the token 'new' for all strings that start with 'new' On 10/06/2009 02:54 PM, Ravi Kiran wrote: Hello All, Iam getting some ghost face

Re: stats page slow in latest nightly

2009-10-06 Thread Mark Miller
I was worried about that actually. I havn't tested how fast the RAM estimator is on huge String FieldCaches - it will be fast on everything else, but it checks the size of each String in the array. When I was working on it, I was actually going to default to not show the size, and make you click a

RE: Solr and Garbage Collection

2009-10-06 Thread Fuad Efendi
Master-Slave replica: new caches will be warmed&prepopulated _before_ making new IndexReader available for _new_ requests and _before_ discarding old one - it means that theoretical sizing for FieldCache (which is defined by number of docs in an index and cardinality of a field) should be doubled..

Re: stats page slow in latest nightly

2009-10-06 Thread Yonik Seeley
Might be the new Lucene fieldCache stats stuff that was recently added? -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 3:56 PM, Joe Calderon wrote: > hello *, ive been noticing that /admin/stats.jsp is really slow in the > recent builds, has anyone else encountered this? > > > --

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
Jeff Newburn wrote: > So could that potentially explain our use of more ram on indexing? Or is > this a rare edge case. > I think it could explain the JVM using more RAM while indexing - but it should be fairly easily recoverable from what I can tell - so no explanation on the OOM yet. Still loo

stats page slow in latest nightly

2009-10-06 Thread Joe Calderon
hello *, ive been noticing that /admin/stats.jsp is really slow in the recent builds, has anyone else encountered this? --joe

Weird Facet and KeywordTokenizerFactory Issue

2009-10-06 Thread Ravi Kiran
Hello All, Iam getting some ghost facets in solr 1.4. Can anybody kindly help me understand why I get them and how to eliminate them. My schema.xml snippet is given at the end. Iam indexing Named Entities extracted via OpenNLP into solr. My understanding regarding KeywordTokenizerFact

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Jeff Newburn
So could that potentially explain our use of more ram on indexing? Or is this a rare edge case. -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 > From: Mark Miller > Reply-To: > Date: Tue, 06 Oct 2009 15:30:50 -0400 > To: > Subject: Re: Solr Trunk Heap Space I

RE: Solr Timeouts

2009-10-06 Thread Giovanni Fernandez-Kincade
That thread was blocking for an hour while all other threads were idle or blocked. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Tuesday, October 06, 2009 3:07 PM To: solr-user@lucene.apache.org Subject: Re: Solr Timeouts This speci

RE: Solr Timeouts

2009-10-06 Thread Giovanni Fernandez-Kincade
Yeah this is Java 1.6. The indexes are being written to a local disk, but the files being indexed live on NFS. -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Tuesday, October 06, 2009 2:59 PM To: solr-user@lucene.apache.org Subject: Re: Solr Timeouts Is this

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
This is looking like its just a Lucene oddity you get when adding a single doc due to some changes with the NRT stuff. Mark Miller wrote: > Okay - I'm sorry - serves me right for working sick. > > Now that I have put on my glasses and correctly tagged my two eclipse tests: > > It still appears tha

RE: Solr and Garbage Collection

2009-10-06 Thread Fuad Efendi
> I read pretty much all posts on this thread (before and after this one). Looks > like the main suggestion from you and others is to keep max heap size (-Xmx) > as small as possible (as long as you don't see OOM exception). I suggested absolute opposite; please note also that "as small as possi

RE: Geo Coding Service

2009-10-06 Thread Fuad Efendi
If you are looking for (simplified) ZIP/PostalCode -> Longitude-Latitude mapping (North America) check this: http://www.zipcodedownload.com I am using it for service area calculations for casaGURU renovation professionals at http://www.casaguru.com They even have API library (including stored pro

Re: Importing CSV file slow/crashes

2009-10-06 Thread Chris Hostetter
: Is it possible to narrow down what fields/field-types are causing the problems? : Or perhaps profile and see what's taking up time compared to the older version? Or: could you post your solrconfig + schema + csv files online so other people could help debug the problem? : : -Yonik : http:

Re: DataImportHandler problem: Feeding the XPathEntityProcessor with the FieldReaderDataSource

2009-10-06 Thread Lance Norskog
A side note that might help: if I change the dataField from 'db.blob' to 'blob', this DIH stack emits no documents. On 10/5/09, Lance Norskog wrote: > I've added a unit test for the problem down below. It feeds document > field data into the XPathEntityProcessor via the > FieldReaderDataSource, a

Re: Different sort behavior on same code

2009-10-06 Thread Yonik Seeley
Lucene's test for multi-valued fields is crude... it's essentially if the number of values (un-inverted term instances) becomes greater than the number of documents. -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 3:04 PM, wojtekpia wrote: > > Hi, > > I'm running Solr version 1.3.0
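
The heuristic described above can be modelled in a few lines to see why two indexes with the same schema may behave differently. This is a simplified model of the described check, not Lucene's actual code; the function name and the sample data are made up for illustration.

```python
def looks_multi_valued(values_per_doc):
    """Model of the crude check: more term instances than documents.

    values_per_doc holds, for each document, how many values the field has.
    """
    term_instances = sum(values_per_doc)
    num_docs = len(values_per_doc)
    return term_instances > num_docs


if __name__ == "__main__":
    # Environment A: some docs have two values, but some have none.
    env_a = [1] * 90 + [2] * 5 + [0] * 5
    # Environment B: every doc has at least one value, a few have two.
    env_b = [1] * 95 + [2] * 5
    print(looks_multi_valued(env_a), looks_multi_valued(env_b))
```

In this model, documents with no value offset documents with several values, so a field with multi-valued entries can slip past the check in one data set and trip it in another.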

Re: Solr Timeouts

2009-10-06 Thread Yonik Seeley
This specific thread was blocked for an hour? If so, I'd echo Lance... this is a local disk right? -Yonik http://www.lucidimagination.com On Mon, Oct 5, 2009 at 2:11 PM, Giovanni Fernandez-Kincade wrote: > I just grabbed another stack trace for a thread that has been similarly > blocking for o

Different sort behavior on same code

2009-10-06 Thread wojtekpia
Hi, I'm running Solr version 1.3.0.2009.07.08.08.05.45 in 2 environments. I have a field defined as: The two environments have different data, but both have single and multi valued entries for myDate. On one environment sorting by myDate works (sort seems to be by the 'last' value if multi va

Re: Solr Timeouts

2009-10-06 Thread Lance Norskog
Is this Java 1.5? There are known threading bugs in 1.5 that were fixed in Java 1.6. Also, there was one short series of 1.6 releases that wrote bogus Lucene index files. So, make sure you use the latest Java 1.6 release. Also, I hope this is a local disk. Some shops try running over NFS or Windo

Re: Importing CSV file slow/crashes

2009-10-06 Thread Yonik Seeley
Is it possible to narrow down what fields/field-types are causing the problems? Or perhaps profile and see what's taking up time compared to the older version? -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 1:48 PM, Nasseam Elkarra wrote: > Hello Erick, > > Sorry about that. I'm

Geo Coding Service

2009-10-06 Thread ram_sj
Hi, Can someone suggest me a good geo-coding service or software for commercial use. I want to find gecodes for large collection of address. I'm looking for a good long term service. Thanks Ram -- View this message in context: http://www.nabble.com/Geo-Coding-Service-tp25774277p25774277.html

Re: TermsComponent or auto-suggest with filter

2009-10-06 Thread Jay Hill
Have a look at a blog I posted on how to use EdgeNGrams to build an auto-suggest tool: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ You could easily add filter queries to this approach. For example, the query used in the blog could add filter
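
To make the idea concrete, here is a toy model of front-edge n-grams combined with a filter. It is not Solr's EdgeNGram filter and not the code from the blog post: the whole phrase is treated as one token, the gram size defaults are assumptions, and the category argument merely stands in for a filter query on a second field.

```python
def edge_ngrams(term, min_gram=1, max_gram=25):
    """Return the prefixes of term from min_gram up to max_gram characters."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


def suggest(prefix, entries, category=None):
    """Return phrases whose edge n-grams contain prefix, optionally filtered.

    entries is a list of (phrase, category) pairs; category plays the role
    of a filter on another field.
    """
    prefix = prefix.lower()
    return [
        text
        for text, cat in entries
        if prefix in edge_ngrams(text.lower())
        and (category is None or cat == category)
    ]


if __name__ == "__main__":
    data = [
        ("solr faceting", "search"),
        ("solar panels", "energy"),
        ("solr replication", "search"),
    ]
    print(edge_ngrams("solr"))
    print(suggest("sol", data, category="search"))
```

Because the grams are produced at index time, the lookup at query time is a plain term match, which is why adding a filter on another field is cheap.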

Re: Importing CSV file slow/crashes

2009-10-06 Thread Nasseam Elkarra
Hello Erick, Sorry about that. I'm using the CSV update handler. Uploading a local CSV using the stream.file parameter. There are 94 fields and 36 copyFields. Thank you, Nasseam On Oct 6, 2009, at 10:09 AM, Erick Erickson wrote: Well, without some better idea of *how* you're doing the imp

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
Okay - I'm sorry - serves me right for working sick. Now that I have put on my glasses and correctly tagged my two eclipse tests: It still appears that trunk likes to use more RAM. I switched both tests to one million iterations and watched the heap. The test from the build around may 5th (I pr

Re: Importing CSV file slow/crashes

2009-10-06 Thread Erick Erickson
Well, without some better idea of *how* you're doing the import, it's a little hard to say anything meaningful (hint, hint). Best Erick On Tue, Oct 6, 2009 at 1:06 PM, Nasseam Elkarra wrote: > Hello all, > > I had a dev build of 1.4 from 5/1/2009 and importing a 20K row took less > than a minute

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
Okay, I juggled the tests in eclipse and flipped the results. So they make sense. Sorry - goose chase on this one. Yonik Seeley wrote: > I don't see this with trunk... I just tried TestIndexingPerformance > with 1M docs, and it seemed to work fine. > Memory use stabilized at 40MB. > Most memory u

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Yonik Seeley
I don't see this with trunk... I just tried TestIndexingPerformance with 1M docs, and it seemed to work fine. Memory use stabilized at 40MB. Most memory use was for indexing (not analysis). char[] topped out at 4.5MB -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 12:31 PM, Mark Mi

De-basing / re-basing docIDs, or how to effectively pass calculated values from a Scorer or Filter up to (Solr's) QueryComponent.process

2009-10-06 Thread Aaron McKee
(Posted here, per Yonik's suggestion) In the code I'm working with, I generate a cache of calculated values as a by-product within a Filter.getDocidSet implementation (and within a Query-ized version of the filter and its Scorer method) . These values are keyed off the IndexReader's docID valu

Re: HighLithing exact phrases with solr

2009-10-06 Thread Koji Sekiguchi
Please try hl.usePhraseHighlighter=true parameter. (It should be true by default if you use the latest nightly, but I think you don't) Koji Antonio Calò wrote: Hi Guys I'm getting crazy with the highlighting in solr. The problem is the follow: when I submit an exact phrase query, I get the r
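
A request using this parameter might be assembled as follows. This is a sketch: the helper name and the field name "content" are hypothetical, while q, hl, hl.fl and hl.usePhraseHighlighter are standard Solr request parameters.

```python
from urllib.parse import urlencode


def phrase_highlight_query(phrase, field):
    """Build a query string for an exact-phrase search with phrase highlighting."""
    params = [
        ("q", f'{field}:"{phrase}"'),          # quoted, so it is a phrase query
        ("hl", "true"),                         # turn highlighting on
        ("hl.fl", field),                       # field to highlight
        ("hl.usePhraseHighlighter", "true"),    # highlight the phrase as a unit
    ]
    return urlencode(params)


if __name__ == "__main__":
    print(phrase_highlight_query("exact phrase", "content"))
```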

RE: Solr Timeouts

2009-10-06 Thread Giovanni Fernandez-Kincade
Is it possible that deletions are triggering these commits? Some of the documents that I'm making indexing requests for already exist in the index, so they would result in deletions. I tried messing with some of these parameters but I'm still running into the same problem: false

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
Yeah - I was wondering about that ... not sure how these guys are stacking up ... Yonik Seeley wrote: > TestIndexingPerformance? > What the heck... that's not even multi-threaded! > > -Yonik > http://www.lucidimagination.com > > > > On Tue, Oct 6, 2009 at 12:17 PM, Mark Miller wrote: > >> Darn

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Yonik Seeley
TestIndexingPerformance? What the heck... that's not even multi-threaded! -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 12:17 PM, Mark Miller wrote: > Darnit - didn't finish that email. This is after running your old short > doc perf test for 10,000 iterations. You see the same

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
Darnit - didn't finish that email. This is after running your old short doc perf test for 10,000 iterations. You see the same thing with 1000 iterations but much less pronounced eg gettin' worse with more iterations. Mark Miller wrote: > A little before and after. The before is around may 5th'is -

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
A little before and after. The before is around may 5th'is - the after is trunk. http://myhardshadow.com/memanalysis/before.png http://myhardshadow.com/memanalysis/after.png Mark Miller wrote: > Took a peak at the checkout around the time he says he's using. > > CharTokenizer appears to be holdin

How to retrieve the index of a string within a field?

2009-10-06 Thread Elaine Li
Hi, I have a field. The field has a sentence. If the user types in a word or a phrase, how can I return the index of this word or the index of the first word of the phrase? I tried to use &bf=ord..., but it does not work as i expected. Thanks. Elaine

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Mark Miller
Took a peek at the checkout around the time he says he's using. CharTokenizer appears to be holding onto much larger char[] arrays now than before. Same with snowball.Among - used to be almost nothing, now its largio. The new TokenStream stuff appears to be clinging. Needs to find some inner peace

Problems with DIH XPath flatten

2009-10-06 Thread Adam Foltzer
Hi all, I'm trying to set up DataImportHandler to index some XML documents available over web services. The XML includes both content and metadata, so for the indexable content, I'm trying to just index everything under the content tag: The result of this is that the title field gets populat

Tr : Questions about synonyms and highlighting

2009-10-06 Thread Nourredine K.
Hello, Even short/partial answers could satisfy me :) Nourredine. >Hi, >Can you please give me some answers for those questions : > >1 - How can I get synonyms found for a keyword ? > >I mean i search "foo" and i have in my synonyms.txt file the following tokens >: "foo, foobar, fee" (w

Re: FACET_SORT_INDEX descending?

2009-10-06 Thread Gerald Snyder
Reverse alphabetical ordering. The option "index" provides alphabetical ordering. I have a year_facet field, that I would like to display in reverse order (most recent years first). Perhaps there is some other way to accomplish this. Thanks. --Gerald Chris Hostetter wrote: : Is there a

Re: Creating cores using SolrJ

2009-10-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
yeah that is missing. I've just committed a setter/getter for dataDir in the create command. Do this: CoreAdminRequest.Create req = new CoreAdminRequest.Create(); req.setCoreName( name ); req.setInstanceDir(instanceDir); req.setDataDir(dataDir); return req.process( solrServer );

Creating cores using SolrJ

2009-10-06 Thread Licinio Fernández Maurelo
Hi there, i want to create cores using SolrJ, but i also want to create them in a given datadir. How can i do this? Looking at CoreAdminRequest methods i only found: - createCore(name, instanceDir, server) - createCore(name, instanceDir, server, configFile, schemaFile) None of above methods

RE: using regular expressions in solr query

2009-10-06 Thread Feak, Todd
Any particular reason for the double quotes in the 2nd and 3rd query example, but not the 1st, or is this just an artifact of your email? -Todd -Original Message- From: Rakhi Khatwani [mailto:rkhatw...@gmail.com] Sent: Tuesday, October 06, 2009 2:26 AM To: solr-user@lucene.apache.org Su

Re: search by some functionality

2009-10-06 Thread Sandeep Tagore
Hi Elaine, You can implement a function query in Solr in two ways: 1. Using Dismax request handler (with bf parameter). 2. Using the standard request handler (with _val_ field). I recommend the first option. Sandeep Elaine Li wrote: > > Hi Sandeep, > > I read about this chapter before. It
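
The two options can be compared side by side as request parameters. This is a sketch under assumptions: the helper names are hypothetical, the field name "created" is made up, and selecting dismax via defType depends on your Solr version and handler configuration. Note that in both forms the function only influences scoring, not which documents match.

```python
def dismax_with_boost(user_query, boost_function):
    """Option 1: dismax query with the function supplied through bf."""
    return {"defType": "dismax", "q": user_query, "bf": boost_function}


def standard_with_val(user_query, boost_function):
    """Option 2: standard query syntax with the function in a _val_ clause."""
    return {"q": f'{user_query} _val_:"{boost_function}"'}


if __name__ == "__main__":
    func = "recip(rord(created),1,1000,1000)"  # hypothetical field 'created'
    print(dismax_with_boost("solr", func))
    print(standard_with_val("solr", func))
```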

solr optimize - no space left on device

2009-10-06 Thread Phillip Farber
I am attempting to optimize a large shard on solr 1.4 and repeatedly get java.io.IOException: No space left on device. The shard, after a final commit before optimize, shows a size of about 192GB on a 400GB volume. I have successfully optimized 2 other shards that were similarly large without

TermsComponent or auto-suggest with filter

2009-10-06 Thread R. Tan
Hello, What's the best way to get auto-suggested terms/keywords that is filtered by one or more fields? TermsComponent should have been the solution but filters are not supported. Thanks, Rihaed

ISOLatin1AccentFilter before or after Snowball?

2009-10-06 Thread Chantal Ackermann
Hi all, from reading through previous posts on that subject, it seems like the accent filter has to come before the snowball filter. I'd just like to make sure this is so. If it is the case, I'm wondering whether snowball filters for i.e. French process accented language correctly, at all, or wh

Re: Re : Re : wildcard searches

2009-10-06 Thread Avlesh Singh
You are right, Angel. The problem would still persist. Why don't you consider putting the original data in some field. While querying, you can query on both the fields - analyzed and original one. Wildcard queries will not give you any results from the analyzed field but would match the data in you

using regular expressions in solr query

2009-10-06 Thread Rakhi Khatwani
Hi, i have an example in which i want to use a regular expression in my solr query: for example: suppose i wanna search on a sample : raakhi rajnish ninad goureya sheetal ritesh rajnish ninad goureya sheetal where my content field is of type text when i type in QUERY: content:raa* RESPONSE

Re: Need "OR" in DisMax Query

2009-10-06 Thread Ingo Renner
Am 05.10.2009 um 20:36 schrieb David Giffin: Hi David, Maybe I'm missing something, but I can't seem to get the dismax request handler to perform and OR query. It appears that OR is removed by the stop words. It's not the stop words, Dismax simply doesn't do any boolean operations, the onl

Re : Re : wildcard searches

2009-10-06 Thread Angel Ice
Ah yes, got it. But I'm not sure this will solve my problem, because I'm also using the IsoLatin1 filter, which removes the accented characters. So I will have the same problem with accented characters, because the original token is not stored with this filter. Laurent

Re: Date field being null

2009-10-06 Thread Avlesh Singh
> > I am defining a field: > > indexed="false" and stored="false"? really? This field is as good as nothing. What would you use it for? Can I have a null for such a field? > Yes you can. Moreover, as you have sortMissingLast="true" specified in your field type definition, documents having null va

Re: Re : wildcard searches

2009-10-06 Thread Avlesh Singh
You are processing your tokens in the filter that you wrote. I am assuming it is the first filter being applied and removes the character 'h' from tokens. When you are doing that, you can preserve the original token in the same field as well. Because as of now, you are simply removing the character

Re : wildcard searches

2009-10-06 Thread Angel Ice
Hi. Thanks for your answers Christian and Avlesh. But I don't understand what you mean by: "If you want to enable wildcard queries, preserving the original token (while processing each token in your filter) might work." Could you explain this point please? Laurent

Date field being null

2009-10-06 Thread Pooja Verlani
Hi, My fieldtype definition is like: I am defining a field: Can I have a null for such a field? or is there a way I can use it as a date field only if the value is null. I cant put the field as a string type as I have to apply recency sort and some filters for that field. Regards, Pooja

solr reporting tool adapter

2009-10-06 Thread Rakhi Khatwani
Hi, i wanted to query solr and send the output to some reporting tool. has anyone done something like that? moreover, which reporting filter is good?? Any suggestions? Regards, Raakhi