Limiting the result content

2008-12-23 Thread Sajith Vimukthi
Hi all, I am writing an application which uses solrj to perform the search function. I basically have a huge set of PDFs, docs, etc., and these are indexed. I have almost finished the application, but I still have some problems. When I search for a parameter I get the whole indexed document which is su

RE: Limiting the content of the result

2008-12-23 Thread Sajith Vimukthi
Hi, Thanks a lot Shekhar. Indeed I need to limit the result within the particular field. So how can I get it done? -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, December 24, 2008 10:51 AM To: solr-user@lucene.apache.org Subject: Re: Limit

Re: Unicode characters that are not legal XML characters;

2008-12-23 Thread lucas song
I have written a class to deal with this problem. public class XmlCharFilter { public static String doFilter(String in) { StringBuffer out = new StringBuffer(); // Used to hold the output. char current; // Used to reference the current character. if (in == null || ("".equals(in)))

Re: Limiting the content of the result

2008-12-23 Thread Shalin Shekhar Mangar
You probably want highlighting: http://wiki.apache.org/solr/HighlightingParameters On Wed, Dec 24, 2008 at 10:33 AM, Sajith Vimukthi wrote: > Hi all, > > I am developing an application which indexes whole pdfs and other documents > to solr. I have completed a working version of my application. But
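As a rough illustration of the highlighting suggestion, the sketch below builds the query string for a request that returns only a document identifier plus highlighted fragments instead of the whole stored document. The parameters `fl`, `hl`, `hl.fl`, `hl.snippets` and `hl.fragsize` are standard Solr request parameters; the class name, the `id` field and the `content` field used in the usage note are assumptions for illustration, not taken from the thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the parameter part of a Solr select URL.
final class HighlightQuerySketch {

    /** URL-encodes one value as UTF-8 without leaking the checked exception. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Returns e.g. "q=...&fl=id&hl=true&hl.fl=...&hl.snippets=...&hl.fragsize=...". */
    static String build(String userQuery, String bodyField, int snippets, int fragSize) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);
        params.put("fl", "id");              // assumption: the unique key field is named "id"
        params.put("hl", "true");            // switch highlighting on
        params.put("hl.fl", bodyField);      // field to take fragments from
        params.put("hl.snippets", String.valueOf(snippets));
        params.put("hl.fragsize", String.valueOf(fragSize));

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }
}
```

Calling `HighlightQuerySketch.build("pdf search", "content", 2, 100)` gives a string that can be appended to a select URL. The highlighted field generally has to be stored for fragments to come back.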

Limiting the content of the result

2008-12-23 Thread Sajith Vimukthi
Hi all, I am developing an application which indexes whole pdfs and other documents to solr. I have completed a working version of my application. But there are some problems. The main one is that when I do a search the whole indexed document is shown. I have used solrj and need some help to reduce

Re: New User: Question about index locking options

2008-12-23 Thread Shalin Shekhar Mangar
On Wed, Dec 24, 2008 at 3:31 AM, Alan May wrote: > > I have multiple SolrJ clients that will be sending small update requests > but > should be staggered in time to prevent concurrent updates. However, to add > additional margin of error, I would like to apply the safest locking > mechanism poss

Re: How can i indexing MS-Outlook files?

2008-12-23 Thread Jeryl Cook
http://www.aduna-software.com/technologies/aperture/overview.view This component, Aperture, worked for me. Jeryl Cook /^\ Pharaoh /^\ http://pharaohofkush.blogspot.com/ "Whether we bring our enemies to justice, or bring justice to our enemies, justice will be done." --George W. Bush, Address to a

Re: How can i indexing MS-Outlook files?

2008-12-23 Thread Norberto Meijome
On Sun, 14 Dec 2008 19:22:00 -0800 (PST) Otis Gospodnetic wrote: > Perhaps an easier alternative is to index not the MS-Outlook files > themselves, but email messages pulled from the IMAP or POP servers, if that's > where the original emails live. PST files ('outlook files') are local to the end

New User: Question about index locking options

2008-12-23 Thread Alan May
Hi, First off, as a new user of Solr, I'm extremely impressed with the Solr service and accompanying admin tool. Thank you very much to those who have contributed! My environment: Windows 2003 R2 x64 edition server 4gb RAM Java 6 update 18 - 64 bit Tomcat 6.0.18 Solr 1.3 Single Index / No replic

Re: Problems with WordDelimiterFilterFactory

2008-12-23 Thread David Smiley @MITRE.org
It seems you want Id to only match on complete field values. If that is the case then you should not do tokenization nor perhaps any text analysis altogether. Consider removing the whole block or using KeywordTokenizerFactory plus a modicum of other stuff (perhaps lowercasing). For the particu

Re: emample for using SOLR for search against database tables

2008-12-23 Thread Erik Hatcher
Well, there is the EmbeddedSolrServer - Solr runs totally fine as a pure Java API. It can also be embedded in your own web apps without having Solr as a standalone service. Erik On Dec 23, 2008, at 10:29 AM, Glen Newton wrote: Depending on your requirements, using Lucene directl

Problems with WordDelimiterFilterFactory

2008-12-23 Thread GPS.
I am using a fieldType, with following configuration: I have When I try searching with : http://localhost:8001/solr/select/?q=Id:ARMZ It gives me complete list, where Id is: ARMZ or ARMZ117 or ARMZ129 What

Re: highlighting and stemming

2008-12-23 Thread David Bowen
I've filed a ticket on this so it doesn't get lost: https://issues.apache.org/jira/browse/SOLR-937 On Mon, Dec 22, 2008 at 11:53 AM, David Bowen wrote: > Yonik, thanks for looking into this. > > Here is a better example of the problem, using the example data from the > latest dev version. Add

Re: spellCheckComponent and dismax query type

2008-12-23 Thread Otis Gospodnetic
Jae Joo, Please have a look at the SpellCheckComponent page on the Wiki for info about setting that component up, so you don't have to make a separate request to the spellchecker. This will also address your issue. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Ori

Re: Any new python libraries?

2008-12-23 Thread Ed Summers
It should be easy_install-able: % easy_install solrpy //Ed On Tue, Dec 23, 2008 at 12:47 PM, jlist9 wrote: > Maybe I'm using an older version. I'll give it a try and report back. Thanks. > > On Tue, Dec 23, 2008 at 3:26 AM, Ed Summers wrote: >> Yes I've used it with Unicode, see test_unicode

Re: is my MoreLikeThis performance normal?

2008-12-23 Thread Eric Kilby
Is there any way of doing this with the SOLR handler? I was looking for a param something like mlt.maxdf that could be applied in order to enforce this type of condition, but there doesn't seem to be one (in 1.3 at least). Walter Underwood wrote: > > Common terms are not that useful for More L

Re: Any new python libraries?

2008-12-23 Thread jlist9
Maybe I'm using an older version. I'll give it a try and report back. Thanks. On Tue, Dec 23, 2008 at 3:26 AM, Ed Summers wrote: > Yes I've used it with Unicode, see test_unicode in the unittests [1]. > In fact one of the reasons why it was moved to google-code was so we > could rapidly fix some

Re: is my MoreLikeThis performance normal?

2008-12-23 Thread Walter Underwood
Common terms are not that useful for More Like This. Get rid of terms with a low IDF. You want selective terms. Usually, picking the top 20 or so terms by tf.idf will eliminate the low IDF terms, but you might need to specifically toss those. Phrase IDF is really, really useful for this. Note: I
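The selection step described above (rank candidate terms by tf.idf and keep the top few) can be sketched in plain Java. This is only an illustration of the idea, not how Solr's MoreLikeThis handler is implemented; the class name, the method names and the simple `tf * ln(N / df)` weighting are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: picks the most selective terms of one document.
final class TopTermSelector {

    /** Simple tf.idf weight; assumes df >= 1 and numDocs >= df. */
    static double tfIdf(int tf, int df, int numDocs) {
        return tf * Math.log((double) numDocs / df);
    }

    /**
     * Returns up to `limit` terms, highest tf.idf first.
     * termFreq: term -> frequency in the source document.
     * docFreq:  term -> number of documents in the index containing the term.
     */
    static List<String> topTerms(Map<String, Integer> termFreq,
                                 Map<String, Integer> docFreq,
                                 int numDocs,
                                 int limit) {
        List<String> terms = new ArrayList<>(termFreq.keySet());
        terms.sort(Comparator.comparingDouble(
                (String t) -> tfIdf(termFreq.get(t), docFreq.getOrDefault(t, 1), numDocs))
                .reversed());
        return new ArrayList<>(terms.subList(0, Math.min(limit, terms.size())));
    }
}
```

A term that occurs in every document gets a weight of zero under this formula, so very common terms fall to the bottom of the ranking and are cut off by the limit.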

Re: Unicode characters that are not legal XML characters

2008-12-23 Thread Bryan Talbot
I believe you can use the following Unicode characters in XML documents: U+0009, U+000A, U+000D, [U+0020-U+D7FF], [U+E000-U+FFFD], and [U+10000-U+10FFFF]. One of your documents contains a U0022 character which is an invalid space character for XML. http://www.unicode.org/unicode/reports/tr
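A minimal sketch of a filter based on the XML 1.0 character ranges, applied before a document is posted for indexing. It is not the `XmlCharFilter` class posted elsewhere in this thread (that post is truncated here); the class and method names are hypothetical.

```java
// Hypothetical helper: removes characters that XML 1.0 does not allow.
final class XmlCharSanitizer {

    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the input with every illegal code point dropped; null becomes "". */
    static String strip(String in) {
        if (in == null || in.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);       // handles surrogate pairs
            if (isLegalXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);     // advance by 1 or 2 chars
        }
        return out.toString();
    }
}
```

For example, `XmlCharSanitizer.strip("a" + (char) 22 + "b")` drops the control character and keeps the rest, while tabs and line breaks pass through unchanged. Iterating by code point rather than by `char` means unpaired surrogates are removed as well.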

Re: is my MoreLikeThis performance normal?

2008-12-23 Thread Eric Kilby
That is correct, we see similar if not longer query times if run in a regular query through the admin tool using the same terms that MLT is selecting. I was testing MLT to see if this was an unavoidable consequence of having terms that occur in a large number of documents, or whether it was some

Re: emample for using SOLR for search against database tables

2008-12-23 Thread Glen Newton
Depending on your requirements, using Lucene directly instead of Solr might be appropriate. Even in a web environment. Not likely a popular statement on the Solr list, but one that you should consider. :-) -Glen 2008/12/23 Manupriya : > > Yes... At present I want SOLR to run within my standalone

Re: "static facet field"

2008-12-23 Thread Marc Sturlese
Exactly! I thought facet queries only worked for numeric ranges. I was overcomplicating things a lot... Thank you. Erik Hatcher wrote: > > > On Dec 23, 2008, at 7:55 AM, Marc Sturlese wrote: >> My other goal is to do something I have called static facets by >> field. I >> would like to specify the terms
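One way to read the facet query approach is one `facet.query` parameter per term of interest, so that counts come back only for those terms. The sketch below builds such a parameter string; `facet` and `facet.query` are standard Solr parameters, while the class name and the choice of `q=*:*` with `rows=0` are assumptions for illustration. The field name `animals` comes from the original question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.List;

// Hypothetical helper: one facet.query per fixed term of a field.
final class FacetQuerySketch {

    /** URL-encodes one value as UTF-8 without leaking the checked exception. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds "q=*:*&rows=0&facet=true&facet.query=field:term..." in encoded form. */
    static String build(String field, List<String> terms) {
        StringBuilder sb = new StringBuilder();
        sb.append("q=").append(enc("*:*"));   // assumption: count over all documents
        sb.append("&rows=0");                 // only the counts are of interest here
        sb.append("&facet=true");
        for (String term : terms) {
            sb.append("&facet.query=").append(enc(field + ":" + term));
        }
        return sb.toString();
    }
}
```

The same parameters could also be set as defaults on a request handler in solrconfig.xml instead of being sent with every request.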

Re: Unicode characters that are not legal XML characters;

2008-12-23 Thread Jarek Zgoda
On 2008-12-23, at 14:46, rohit arora wrote: When I give the post command to build my index on my (databases / XML) file, it gives me an error like: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 22)) at [row,col {unknown-

spellCheckComponent and dismax query type

2008-12-23 Thread Jae Joo
I would like to use spell check with dismax, but it is not working. This query searches only the default search field, which is defined in schema.xml. http://localhost:8080/ibegin_mb3/spellCheckCompRH?q=pluming%20heaing&qt=dismax&spellcheck.q=pluming%20heaing&spellcheck.count=10&spellcheck=true&spellche

Unicode characters that are not legal XML characters

2008-12-23 Thread rohit arora
Hi, When I give the post command to build my index on my (databases / XML) file, it gives me an error like: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 22)) at [row,col {unknown-source}]: [1676,86] I found a built-in function in Perl to convert all m

Re: "static facet field"

2008-12-23 Thread Erik Hatcher
On Dec 23, 2008, at 7:55 AM, Marc Sturlese wrote: My other goal is to do something I have called static facets by field. I would like to specify the terms to facet in the solrconfig. Let's suppose I have a field called animals. I want to do facet fields with the tokens cat,doc,monkey and I wan

Unicode characters that are not legal XML characters;

2008-12-23 Thread rohit arora
Hi, When I give the post command to build my index on my (databases / XML) file, it gives me an error like: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 22)) at [row,col {unknown-source}]: [1676,86] I found a built-in function in Perl to convert all my

Re: emample for using SOLR for search against database tables

2008-12-23 Thread Manupriya
Yes... At present I want SOLR to run within my standalone java application which at a later stage may be ported to a web application. It would be great if I could get some initial pointers to achieve that. Thanks, Manu David Smiley @MITRE.org wrote: > > Absolutely... most of us come from a web-

Re: emample for using SOLR for search against database tables

2008-12-23 Thread Smiley, David W.
Absolutely... most of us come from a web-app world but there isn't anything intrinsically web-app about Solr except for the obvious fact that Solr itself runs on a servlet engine. If you mean is Solr embeddable so that it could run within your Java application (which may or may not be a web-app

emample for using SOLR for search against database tables

2008-12-23 Thread Manupriya
Hi, I am very new to using SOLR. I read through the SOLR documentation and tried to understand it, and I now have a feel for SOLR. My actual requirement is to use SOLR for search against database tables. I referred to the link at http://wiki.apache.org/solr/DataImportHandler#head-ac5699cd97e4dc

"static facet field"

2008-12-23 Thread Marc Sturlese
Hey there, I have been missing a couple of things in faceting these days: The first one was that I wanted to do date faceting with more than one date start, end and gap... I solved it by modifying the function getFacetDateCounts() and adding the parameters in the xml this way: date NOW/DAY-1YEAR NOW/

Re: Any new python libraries?

2008-12-23 Thread Ed Summers
Yes I've used it with Unicode, see test_unicode in the unittests [1]. In fact one of the reasons why it was moved to google-code was so we could rapidly fix some of the outstanding problems with the python client. If you can demonstrate a bug using the unittests we've got for it that would be great