Re: How to retrieve the index of a string within a field?

2009-10-08 Thread Sandeep Tagore
Hi Elaine, As you are able to get the sentences which contain that phrase (when you use double quotes), it's OK with the 'text' field type. Frankly speaking, I don't know whether SolrJ's HTTP call will hang or not if you try to get 100 thousand records at a time. I never tried that. But I guess y

RE: Solr Quries

2009-10-08 Thread Pravin Karne
Thanks for your reply. I have one more query regarding the Solr distributed environment. I have configured Solr on two machines as per http://wiki.apache.org/solr/DistributedSearch But I have the following test case - Suppose I have two machines, Server1 and Server2. I have posted a record with id 1 on Server1 and

RE: Solr Quries

2009-10-08 Thread Pravin Karne
Thanks for your help. Can you please provide a detailed configuration for the Solr distributed environment? How do I set up master and slave? In which file(s) do I have to make changes for this? What are the shard parameters? Can we integrate ZooKeeper with this? Please provide details for this. Thanks in a

Re: DIH: Setting rows= on full-import has no effect

2009-10-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
I have raised an issue http://issues.apache.org/jira/browse/SOLR-1501 On Fri, Oct 9, 2009 at 6:10 AM, Jay Hill wrote: > In the past setting rows=n with the full-import command has stopped the DIH > importing at the number I passed in, but now this doesn't seem to be > working. Here is the command

Re: DIH Error in latest Nightly Builds

2009-10-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
raised an issue https://issues.apache.org/jira/browse/SOLR-1500 On Fri, Oct 9, 2009 at 7:10 AM, jayakeerthi s wrote: > Hi All, > > I tried Indexing data and got the following error., Used Solr nightly Oct5th > and nightly 8th, The same Configuration/query  is working in Older > version(May nightl

Re: Scoring for specific field queries

2009-10-08 Thread Avlesh Singh
Use the field analysis tool to see how the data is being analyzed in both the fields. Cheers Avlesh On Fri, Oct 9, 2009 at 12:56 AM, R. Tan wrote: > Hmm... I don't quite get the desired results. Those starting with "cha" are > now randomly ordered. Is there something wrong with the filters I ap

DIH Error in latest Nightly Builds

2009-10-08 Thread jayakeerthi s
Hi All, I tried indexing data and got the following error. I used the Solr nightly builds of Oct 5th and Oct 8th. The same configuration/query works in an older version (May nightly build). The db-data-config.xml has a simple select query. SEVERE: Full Import failed org.apache.solr.handler.dataimport.Dat

Re: multi-word synonyms and analysis.jsp vs real field analysis (query, index)

2009-10-08 Thread Koji Sekiguchi
Patrick, > parsedQueryString was something similar to "field:foo field:bar". At > index time, it works like expected. I guess that because you are searching q=foo bar, this causes an OR query. Use q="foo bar" instead. Koji Patrick Jungermann wrote: Hi list, I worked on a field type and its analyzi
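
A minimal sketch of the phrase-query suggestion above, not code from the thread. The helper name `phrase_query_param` is hypothetical; it only shows how a double-quoted phrase could be URL-encoded into the `q` parameter, assuming the standard Lucene backslash escape for embedded quotes.

```python
from urllib.parse import urlencode


def phrase_query_param(phrase: str) -> str:
    """Return a URL-encoded q parameter that searches `phrase` as one phrase."""
    # Escape embedded double quotes so the phrase stays a single quoted unit
    # (assumption: the query parser accepts backslash escapes).
    escaped = phrase.replace('"', '\\"')
    # Wrapping the text in double quotes asks for a phrase match instead of
    # the per-term OR query described in the message.
    return urlencode({"q": f'"{escaped}"'})


if __name__ == "__main__":
    print(phrase_query_param("foo bar"))
```

The encoded value can be appended to a select URL; the surrounding `%22` characters are the double quotes that make the difference between `q=foo bar` and `q="foo bar"`.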

DIH: Setting rows= on full-import has no effect

2009-10-08 Thread Jay Hill
In the past setting rows=n with the full-import command has stopped the DIH importing at the number I passed in, but now this doesn't seem to be working. Here is the command I'm using: curl 'http://localhost:8983/solr/indexer/mediawiki?command=full-import&rows=100' But when 100 docs are imported

Re: issue in adding data to a multivalued field

2009-10-08 Thread Koji Sekiguchi
Hi Rakhi, Use multiValued (capital V), not multivalued. :) Koji Rakhi Khatwani wrote: Hi, i have a small schema with some of the fields defined as: where the field author_name is multivalued. however in UI (schema browser), following r the details of author_name field, its nowher

multi-word synonyms and analysis.jsp vs real field analysis (query, index)

2009-10-08 Thread Patrick Jungermann
Hi list, I worked on a field type and its analyzing chain, in which I want to use the SynonymFilter with entries similar to: foo bar=>foo_bar During the analysis phase, I used the /admin/analysis.jsp view to test the analyzing results produced by the created field type. The output shows that a q

Re: [slightly off topic] Jetty and NIO

2009-10-08 Thread Grant Ingersoll
On Oct 8, 2009, at 7:37 PM, Yonik Seeley wrote: On Thu, Oct 8, 2009 at 6:24 PM, Grant Ingersoll wrote: So, if I'm on Centos 2.6 (64 bit), what connector should I be using? Based on the comments, I'm not sure the top one is the right thing either, but it also sounds like it is my only ot

concatenating tokens

2009-10-08 Thread Joe Calderon
Hello *, I'm using a combination of tokenizers and filters that gives me the desired tokens; however, for a particular field I want to concatenate these tokens back into a single string. Is there a filter to do that? If not, what are the steps needed to make my own filter to concatenate tokens? for exam

Re: Problems with WordDelimiterFilterFactory

2009-10-08 Thread Christian Zambrano
Bern, The only way that could be happening is if you are not using the field type you described in your original e-mail. The TokenFilter WordDelimiterFilterFactory should take care of the hyphen. On 10/08/2009 05:30 PM, Bernadette Houghton wrote: Thanks for this Patrick. If I remove one of t

Re: how can I use debugQuery if I have extended QParserPlugin?

2009-10-08 Thread gdeconto
Hi Yonik; My original post ( http://www.nabble.com/how-can-I-use-debugQuery-if-I-have-extended-QParserPlugin--tt25789546.html ) has the stack trace. =^D I am having trouble reproducing this issue co

Re: [slightly off topic] Jetty and NIO

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 6:24 PM, Grant Ingersoll wrote: > So, if I'm on Centos 2.6 (64 bit), what connector should I be using?  Based > on the comments, I'm not sure the top one is the right thing either, but it > also sounds like it is my only other choice. Right - the connector that Solr uses in

RE: Problems with WordDelimiterFilterFactory

2009-10-08 Thread Bernadette Houghton
Thanks for this Patrick. If I remove one of the hyphens, solr doesn't throw up the error, but still doesn't find the right record. I see from marklo's analysis page that solr is still parsing it with a hyphen. Changing this part of our schema.xml - To i.e. replacing non-al

[slightly off topic] Jetty and NIO

2009-10-08 Thread Grant Ingersoll
In the Solr example jetty.xml, there is the following setup and comments: default="8983"/> 5 1500 So, if I'm on Centos 2.6 (64 bit), what connector should I be using? Based on the comments, I'm

RE: Problems with WordDelimiterFilterFactory

2009-10-08 Thread Bernadette Houghton
Thanks for this, marklo; it is a *very* useful page. bern -Original Message- From: marklo [mailto:mar...@pcmall.com] Sent: Thursday, 8 October 2009 1:10 PM To: solr-user@lucene.apache.org Subject: Re: Problems with WordDelimiterFilterFactory Use http://solr-url/solr/admin/analysis.jsp t

RE: Sorting by insertion time

2009-10-08 Thread Steven A Rowe
Hi Tarjei, See https://issues.apache.org/jira/browse/SOLR-1478 - with trunk Solr (and soon, 1.4), you can use pseudo-field _docid_ for this purpose. Steve > -Original Message- > From: tarjei [mailto:tar...@nu.no] > Sent: Thursday, October 08, 2009 2:18 AM > To: solr-user@lucene.apache.o

Re: Problems with WordDelimiterFilterFactory

2009-10-08 Thread Patrick Jungermann
Hi Bern, the problem is the character sequence "--". A query is not allowed to contain consecutive minus characters. Remove one minus character and the query will be parsed without problems. Because of this parsing problem, I'd recommend a query cleanup before the submit to the
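
The query cleanup recommended above could look roughly like this. It is a sketch under assumptions, not Patrick's code: the function name `collapse_minus_runs` is hypothetical, and it only removes the consecutive minus characters that trigger the ParseException. It does not address whether the cleaned query then matches the intended record.

```python
import re


def collapse_minus_runs(query: str) -> str:
    """Replace each run of two or more '-' characters with a single '-'."""
    # The parser rejects consecutive minus signs, so collapse every run
    # before the query string is sent to Solr.
    return re.sub(r"-{2,}", "-", query)


if __name__ == "__main__":
    print(collapse_minus_runs("(Asia -- Civilization AND status_i:(2))"))
```

Collapsing instead of deleting keeps a single hyphen in place, which mirrors the "remove one minus character" advice in the message.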

RE: Problems with WordDelimiterFilterFactory

2009-10-08 Thread Bernadette Houghton
Sorry, the last line was truncated - HTTP Status 400 - org.apache.lucene.queryParser.ParseException: Cannot parse '(Asia -- Civilization AND status_i:(2)) ': Encountered "-" at line 1, column 7. Was expecting one of: "(" ... "*" ... ... ... ... ... "[" ... "{" ... ... -Original Messag

RE: Problems with WordDelimiterFilterFactory

2009-10-08 Thread Bernadette Houghton
Here's the query and the error - Oct 09 08:20:17 [debug] [196] Solr query string:(Asia -- Civilization AND status_i:(2)) Oct 09 08:20:17 [debug] [196] Solr sort by: score desc Oct 09 08:20:17 [error] Error on searching: "400" Status: org.apache.lucene.queryParser.ParseException: Canno

Re: indexing frequently-changing fields

2009-10-08 Thread Yonik Seeley
It's a bit round-about but you might be able to use ExternalFileField http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html The fieldType definition would look like Then you can use frange to include/exclude certain values: http://www.lucidimagination.com/blog/tag/

indexing frequently-changing fields

2009-10-08 Thread didier deshommes
I am using Solr to index data in a SQL database. Most of the data doesn't change after initial commit, except for a single boolean field that indicates whether an item is flagged as 'needing attention'. So I have a need_attention field in the database that I update whenever a user marks an item a

Re: delay while adding document to solr index

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 1:58 AM, swapna_here wrote: > i don't understand why my solr index increasing daily > when i am adding and deleting the same number of documents daily A delete is just a bit flip, and does not reclaim disk space immediately. Deleted documents are squeezed out when segment m

Re: Scoring for specific field queries

2009-10-08 Thread R. Tan
Hmm... I don't quite get the desired results. Those starting with "cha" are now randomly ordered. Is there something wrong with the filters I applied? On Thu, Oct 8, 2009 at 7:38 PM, Avlesh Singh wrote: > Filters? I did not mean filters at all. > I am in a mad rush right now, but on the face of

releasing memory?

2009-10-08 Thread Ryan McKinley
Hello- I have an application that can run in the background on a user Desktop -- it will go through phases of being used and not being used. I want to be able to free as many system resources when not in use as possible. Currently I have a timer that waits for 10 mins of inactivity and r

Re: IndexWriter InfoStream in solrconfig not working

2009-10-08 Thread Yonik Seeley
OK, move the infoStream part in solrconfig.xml from indexDefaults into mainIndex and it should work. -Yonik http://www.lucidimagination.com On Thu, Oct 8, 2009 at 2:40 PM, Yonik Seeley wrote: > I can't get it to work either, so I reopened > https://issues.apache.org/jira/browse/SOLR-1145 > > -Y

Re: IndexWriter InfoStream in solrconfig not working

2009-10-08 Thread Yonik Seeley
I can't get it to work either, so I reopened https://issues.apache.org/jira/browse/SOLR-1145 -Yonik http://www.lucidimagination.com On Wed, Oct 7, 2009 at 1:45 PM, Giovanni Fernandez-Kincade wrote: > I had the same problem. I'd be very interested to know how to get this > working... > > -Gio. >

Re: how can I use debugQuery if I have extended QParserPlugin?

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 12:14 PM, gdeconto wrote: > I did check the other posts, as well as whatever I could find on the net but > didn't find anything. > > Has anyone encountered this type of issue, or is what I am doing (extending > QParserPlugin) that unusual?? I think you need to provide some

Optimization of large shard succeeded

2009-10-08 Thread Phillip Farber
I thought I'd summarize a method that solved the problem we were having trying to optimize a large shard that was running out of disk space, df=100% (400g), du=~380g. After we ran out of space, if we restarted tomcat, segment files disappeared from disk leaving 3 segments. What worked: we u

RE: How to determine the size of the index?

2009-10-08 Thread Fishman, Vladimir
No, I need to know what is the size of the index. -Original Message- From: Sandeep Tagore [mailto:sandeep.tag...@gmail.com] Sent: Wednesday, October 07, 2009 10:20 PM To: solr-user@lucene.apache.org Subject: Re: How to determine the size of the index? Are you referring to schema info ??

Re: correct syntax for boolean search

2009-10-08 Thread Avlesh Singh
q=+fieldname1:(+(word_a1 word_b1) +(word_a2 word_b2) +(word_a3 word_b3)) +fieldname2:... Cheers Avlesh On Thu, Oct 8, 2009 at 7:40 PM, Elaine Li wrote: > Hi, > > What is the correct syntax for the following boolean search from a field? > > fieldname1:(word_a1 or word_b1) && (word_a2 or word_b2)
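
The query shape in the reply above can be generated programmatically. This is only an illustrative sketch: `all_groups_required` is a hypothetical helper, and it assumes the default operator is OR so that the words inside each inner group are alternatives while the leading `+` makes every group mandatory.

```python
def all_groups_required(field, groups):
    """Build '+field:(+(a b) +(c d) ...)' from a list of word groups."""
    # Each inner group lists alternatives (OR under the assumed default
    # operator); the '+' prefix marks the whole group as required.
    clauses = " ".join("+(" + " ".join(words) + ")" for words in groups)
    return f"+{field}:({clauses})"


if __name__ == "__main__":
    groups = [["word_a1", "word_b1"], ["word_a2", "word_b2"], ["word_a3", "word_b3"]]
    print(all_groups_required("fieldname1", groups))
```

Clauses for further fields can be joined with a space, each carrying its own `+` prefix.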

Re: ISOLatin1AccentFilter before or after Snowball?

2009-10-08 Thread Claudio Martella
Hello, I'm following the thread, but I think it still hasn't been answered whether the ISOLatin1AccentFilter goes before or after the stemmer. Any direct answer? Koji Sekiguchi wrote: > In this particular case, I don't think one is better than the other... > > In general, MappingCharFilter is more flexible

Re: UTF-8 and latin accents

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 12:48 PM, Claudio Martella wrote: > I'm trying to index documents with latin accents (italian documents). I > extract the text from .doc documents with Tika directly into .xml files. > If i open up the XML document with my Dashcode (i run mac os x) i can > see the characters

UTF-8 and latin accents

2009-10-08 Thread Claudio Martella
Hello list, I'm trying to index documents with Latin accents (Italian documents). I extract the text from .doc documents with Tika directly into .xml files. If I open up the XML document with my Dashcode (I run Mac OS X) I can see the characters correctly. my xml document is an xml document with t

Re: how can I use debugQuery if I have extended QParserPlugin?

2009-10-08 Thread gdeconto
I did check the other posts, as well as whatever I could find on the net but didn't find anything. Has anyone encountered this type of issue, or is what I am doing (extending QParserPlugin) that unusual?? gdeconto wrote: > > ... > one thing I noticed is that if I append "debugQuery=true" to a

Re: how to post(index) large file of 5 GB or greater than this

2009-10-08 Thread Yonik Seeley
What is this huge file? Solr XML? CSV? Anyway, if it's a local file, you can get Solr to directly read/stream it via stream.file. Examples are in http://wiki.apache.org/solr/UpdateCSV but it should work for any update format, not just CSV. -Yonik http://www.lucidimagination.com On Thu, Oct 8, 2009
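
A small sketch of how a `stream.file` request URL might be assembled. The host, port, handler path and file path below are assumptions for illustration, and `stream_file_url` is a hypothetical helper; the UpdateCSV wiki page linked above remains the reference for the actual parameters. Depending on the configuration, remote streaming may also need to be enabled on the server.

```python
from urllib.parse import urlencode


def stream_file_url(base_url: str, path: str) -> str:
    """Return an update URL asking Solr to read `path` from its local disk."""
    # stream.file names a file on the Solr server itself, so the client
    # never has to load or upload the file contents.
    params = {"stream.file": path, "commit": "true"}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed example values, not taken from the thread.
    print(stream_file_url("http://localhost:8983/solr/update/csv", "/data/docs.csv"))
```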

Re: Default query parameter for one core

2009-10-08 Thread Michael
On Wed, Oct 7, 2009 at 1:46 PM, Michael wrote: > Is there a way to not have the shards param at all for most cores, and for > core0 to specify it? E.g. core0 requests always get a "&shards=foo" appended, while other cores don't have an "&shards" param at all. Or, barring that, is there a way to

correct syntax for boolean search

2009-10-08 Thread Elaine Li
Hi, What is the correct syntax for the following boolean search from a field? fieldname1:(word_a1 or word_b1) && (word_a2 or word_b2) && (word_a3 or word_b3) && fieldname2:. Thanks. Elaine

Re: how to post(index) large file of 5 GB or greater than this

2009-10-08 Thread Walter Underwood
Are you indexing multiple documents? If so, split them into multiple files. A single XML file with all documents is not a good idea. Solr is designed to use batches for indexing. It will be extremely hard to index a 1TB XML file. I would guess that would need a JVM heap of well over 1T
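
The batching advice above can be sketched as a generator that hands out documents in fixed-size groups, so that each group can be posted as its own update request. The name `batches` and the batch size are illustrative assumptions; how each batch is serialized and posted is left out.

```python
from itertools import islice


def batches(docs, size):
    """Yield successive lists of at most `size` items from `docs`."""
    it = iter(docs)
    while True:
        # Pull at most `size` documents; only one batch is held in memory.
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    for batch in batches(range(5), 2):
        print(batch)
```

Because the input is consumed lazily, the source can itself be a generator that reads documents from a large file one at a time.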

Re: how to post(index) large file of 5 GB or greater than this

2009-10-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
You can write a simple program which streams the file from the disk to post it to Solr On Thu, Oct 8, 2009 at 7:10 PM, Elaine Li wrote: > You can increase the java heap size, e.g. java -Xms128m -Xmx8192m -jar <*.xml> > Or I split the file if it is too big. > > Elaine > > On Thu, Oct 8, 2009 at 6

Re: how to post(index) large file of 5 GB or greater than this

2009-10-08 Thread Elaine Li
You can increase the java heap size, e.g. java -Xms128m -Xmx8192m -jar <*.xml> Or I split the file if it is too big. Elaine On Thu, Oct 8, 2009 at 6:47 AM, Pravin Karne wrote: > Hi, > I am new to solr. I am able to index, search and update with small > size(around 500mb) > But if I try to index

Re: How to retrieve the index of a string within a field?

2009-10-08 Thread Elaine Li
Sandeep, When I submit a query, I actually make sure the searched phrase is wrapped in double quotes. When I do that, it will only return sentences with 'get what you'. If it does not have double quotes, it will return all the sentences as described in your email because without double quotes, it

issue in adding data to a multivalued field

2009-10-08 Thread Rakhi Khatwani
Hi, I have a small schema with some of the fields defined as: where the field author_name is multivalued. However, in the UI (schema browser), the following are the details of the author_name field; it's nowhere mentioned that it's multivalued. Field: author_name Field Type: text Properties: Indexed, T

Re: Facet query pb

2009-10-08 Thread clico
clico wrote: > > > > clico wrote: >> >> That's not a pb >> I want to use that in order to drill down a tree >> >> >> Christian Zambrano wrote: >>> >>> Clico, >>> >>> Because you are doing a wildcard query, the token 'AMERICA' will not be >>> analyzed at all. This means that 'AMERICA*' w

Re: solr reporting tool adapter

2009-10-08 Thread Rakhi Khatwani
Hi Lance, thanks a ton, will look into BIRT. Regards, Raakhi On Thu, Oct 8, 2009 at 1:22 AM, Lance Norskog wrote: > The BIRT project can do what you want. It has a nice form creator and > you can configure http XML input formats. > > It includes very complete Eclipse plugins and there is a book a

Re: how to rename a schema field, whose values are indexed already?

2009-10-08 Thread Shalin Shekhar Mangar
On Thu, Oct 8, 2009 at 4:32 PM, noor wrote: > Without re-indexing the data, > how to rename, any one of the schema field ?? > > Solr does not support renaming without re-indexing. Re-indexing is your best bet. If you cannot re-index for some reason and if you have all fields as stored then you

Re: ISOLatin1AccentFilter before or after Snowball?

2009-10-08 Thread Koji Sekiguchi
In this particular case, I don't think one is better than the other... In general, MappingCharFilter is more flexible than specific TokenFilters, such as ISOLatin1AccentFilter. For example, if you want your own character mapping rules, you can add them to mapping.txt. It should be easier than mod

Re: Scoring for specific field queries

2009-10-08 Thread Avlesh Singh
Filters? I did not mean filters at all. I am in a mad rush right now, but on the face of it your field definitions look right. This is what I asked for - q=(autoComplete2:cha^10 autoComplete:cha) Lemme know if this does not work for you. Cheers Avlesh On Thu, Oct 8, 2009 at 4:58 PM, R. Tan wro
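
The boosted two-field query suggested above can be assembled with a small helper. This is a sketch only: `boosted_query` is a hypothetical name, and the field names and boost value are simply the ones quoted in the message.

```python
def boosted_query(term, fields):
    """Build '(f1:term^boost f2:term ...)' from (field, boost) pairs."""
    clauses = []
    for field, boost in fields:
        clause = f"{field}:{term}"
        # A boost of None leaves the clause at its default weight.
        if boost is not None:
            clause += f"^{boost}"
        clauses.append(clause)
    return "(" + " ".join(clauses) + ")"


if __name__ == "__main__":
    print(boosted_query("cha", [("autoComplete2", 10), ("autoComplete", None)]))
```

The higher boost on the first field is what is meant to rank its matches above those of the second field.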

Re: Scoring for specific field queries

2009-10-08 Thread R. Tan
Hi Avlesh, I can't seem to get the scores right. I now have these types for the fields I'm targeting, My query is this, q=*:*&fq=auto

Re: Ranking of search results

2009-10-08 Thread bhaskar chandrasekar
Hi Amith,     I tried with the options you gave and gave debug=true at the end of the URL. I am getting output as       channel   channel   text:channel   text:channel -   http://hotmail";>1.2682627 = (MATCH) fieldWeight(text:channel in 3), product of: 2.828427 = tf(termFreq(text:channel)=8

Re: how to rename a schema field, whose values are indexed already?

2009-10-08 Thread noor
Without re-indexing the data, how can I rename any one of the schema fields? Sandeep Tagore wrote: I guess you can't do it. I tried it before. I had a field with name 'KEYWORD' and I changed it to 'keyword' and it didn't work. Everything else was normal; when I searched with 'KEYWORD' I got an excep

Re: how to rename a schema field, whose values are indexed already?

2009-10-08 Thread noor
Without re-indexing the data, how can I rename any one of the schema fields? Sandeep Tagore wrote: I guess you can't do it. I tried it before. I had a field with name 'KEYWORD' and I changed it to 'keyword' and it didn't work. Everything else was normal; when I searched with 'KEYWORD' I got an except

how to post(index) large file of 5 GB or greater than this

2009-10-08 Thread Pravin Karne
Hi, I am new to Solr. I am able to index, search and update with small sizes (around 500 MB). But if I try to index a file of 5 to 10 GB or more (larger than 500 MB), it gives a heap memory exception. While investigating, I found that post.jar or post.sh loads the whole file into memory. I use one workaround with dividi

Re: how to rename a schema field, whose values are indexed already?

2009-10-08 Thread Sandeep Tagore
I guess you can't do it. I tried it before. I had a field with name 'KEYWORD' and I changed it to 'keyword' and it didn't work. Everything else was normal; when I searched with 'KEYWORD' I got an exception saying undefined field, and when I searched with 'keyword', I got 0 results. It didn't work even aft

Re: ISOLatin1AccentFilter before or after Snowball?

2009-10-08 Thread Chantal Ackermann
Now, you got me wondering - which one should I like better? I didn't even know there is an alternative. :-) Chantal Koji Sekiguchi schrieb: No, ISOLatin1AccentFilterFactory is not deprecated. You can use either MappingCharFilterFactory+mapping-ISOLatin1Accent.txt or ISOLatin1AccentFilterFactory

how to rename a schema field, whose values are indexed already?

2009-10-08 Thread Noor
In Solr, how do I rename a schema field if its values are already indexed? Please share your suggestions. For example, I have a schema field named "subTitle" and now I want to rename it to "subtitle". What do I need to do for this change? Regards, Noor

Re: Scoring for specific field queries

2009-10-08 Thread R. Tan
I will have to pass on this and try your suggestion first. So, how does your suggestion (1 and 2) boost my startswith query? Is it because of the n-gram filter? On Thu, Oct 8, 2009 at 2:27 PM, Sandeep Tagore wrote: > > Yes it can be done but it needs some customization. Search for custom sor

Re: Facet query pb

2009-10-08 Thread clico
clico wrote: > > That's not a pb > I want to use that in order to drill down a tree > > > Christian Zambrano wrote: >> >> Clico, >> >> Because you are doing a wildcard query, the token 'AMERICA' will not be >> analyzed at all. This means that 'AMERICA*' will NOT match 'america'. >> >> On