Re: Indexing very large files.

2007-09-06 Thread Thorsten Scherler
On Thu, 2007-09-06 at 08:55 +0200, Brian Carmalt wrote: > Hello again, > > I run Solr on Tomcat under windows and use the tomcat monitor to start > the service. I have set the minimum heap > size to be 512MB and then maximum to be 1024mb. The system has 2 Gigs of > ram. The error that I get afte

Tagging using SOLR

2007-09-06 Thread Doss
Dear all, We are running an application built using SOLR, and now we are trying to build a tagging system using the existing SOLR indexed field called "tag_keywords". This field holds different keywords separated by commas. Please give suggestions on how we can build a tagging system using this field. Th

Re: Indexing very large files.

2007-09-06 Thread Brian Carmalt
Hi Thorsten, I am using Solr 1.2.0. I'll try the svn version out and see if that helps. Thanks, Brian Which version do you use of solr? http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/handler/XmlUpdateRequestHandler.java?view=markup The trunk version of the XmlUpdate

Re: Tagging using SOLR

2007-09-06 Thread Thorsten Scherler
On Thu, 2007-09-06 at 12:59 +0530, Doss wrote: > Dear all, > > We are running an application built using SOLR, now we are trying to build > a tagging system using the existing SOLR indexed field called > "tag_keywords", this field has different keywords separated by comma, please > give suggestio

Re: Indexing very large files.

2007-09-06 Thread Brian Carmalt
Hello again, I checked out the solr source and built the 1.3-dev version and then I tried to index the same file to the new server. I do get a different exception trace, but the result is the same. java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2882) a

solr.py problems with german "Umlaute"

2007-09-06 Thread Christian Klinger
Hi all, I am trying to add/update documents with the Python solr.py API. Everything works fine so far, but if I try to add a document that contains German umlauts (ö, ä, ü, ...) I get errors. Maybe someone has an idea how I could convert my data? Should I post this to JIRA? Thanks for your help. Btw: I ha

Re: solr.py problems with german "Umlaute"

2007-09-06 Thread Brian Carmalt
Hello Christian, Try it with title.encode('utf-8'). As in: kw = {'id':'12','title':title.encode('utf-8'),'system':'plone','url':'http://www.google.de'} Christian Klinger wrote: Hi all, i try to add/update documents with the python solr.py api. Everything works fine so far but if i try to
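
The suggestion above can be sketched as follows. This is a minimal illustration in Python 3 syntax; the 2007 thread would have been on Python 2, where `unicode` and `str` play the roles that `str` and `bytes` play here. The helper name `encode_fields` is hypothetical and is not part of solr.py.

```python
def encode_fields(fields):
    """Return a copy of `fields` with every text value encoded as UTF-8 bytes.

    Hypothetical helper for illustration only; not part of solr.py.
    """
    return {
        key: value.encode("utf-8") if isinstance(value, str) else value
        for key, value in fields.items()
    }


title = "Gr\u00fc\u00dfe aus K\u00f6ln"  # "Grüße aus Köln"
kw = encode_fields({"id": "12", "title": title, "system": "plone",
                    "url": "http://www.google.de"})

# Each non-ASCII character becomes a multi-byte UTF-8 sequence.
print(kw["title"])
```

Whether the caller or the client library should perform this step is a design choice; the sketch only shows what the encoding does to the field values.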

Re: Indexing very large files.

2007-09-06 Thread Thorsten Scherler
On Thu, 2007-09-06 at 11:26 +0200, Brian Carmalt wrote: > Hallo again, > > I checked out the solr source and built the 1.3-dev version and then I > tried to index the same file to the new server. > I do get a different exception trace, but the result is the same. > > java.lang.OutOfMemoryError:

Re: Tagging using SOLR

2007-09-06 Thread Erik Hatcher
On Sep 6, 2007, at 3:29 AM, Doss wrote: We are running an application built using SOLR, now we are trying to build a tagging system using the existing SOLR indexed field called "tag_keywords", this field has different keywords separated by comma, please give suggestions on how can we build

Re: Replication broken.. no helpful errors?

2007-09-06 Thread Bill Au
The snapinstaller script opens a new searcher by calling commit. From the attached debug output it looks like that actually worked: + /opt/solr/bin/commit + [[ 0 != 0 ]] + logExit ended 0 Try running the /opt/solr/bin/commit directly with the -V option. Bill On 9/5/07, Matthew Runo <[EMAIL PRO

RSS syndication Plugin

2007-09-06 Thread Thorsten Scherler
Hi all, I am curious whether somebody has written an RSS plugin for Solr. The idea is to provide an RSS syndication link for the current search. It should be really easy to implement, since it would just be a transformation solrXml -> RSS, which can easily be done with a simple XSL. Has somebody a

Re: RSS syndication Plugin

2007-09-06 Thread Ryan McKinley
perhaps: https://issues.apache.org/jira/browse/SOLR-208 in http://svn.apache.org/repos/asf/lucene/solr/trunk/example/solr/conf/xslt/ check: example_atom.xsl example_rss.xsl Thorsten Scherler wrote: Hi all, I am curious whether somebody has written a rss plugin for solr. The idea is to prov
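
As a rough sketch of how such a stylesheet might be used: assuming the XSLT response writer is enabled in solrconfig.xml, a search URL selects it with `wt=xslt` and names the stylesheet with `tr`. The base URL below is an assumption (the default of the example Jetty setup), and `rss_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def rss_search_url(query, base="http://localhost:8983/solr/select"):
    """Build a search URL whose response is transformed by example_rss.xsl.

    `base` is an assumed default; adjust it to the actual Solr instance.
    """
    params = {"q": query, "wt": "xslt", "tr": "example_rss.xsl"}
    return base + "?" + urlencode(params)


print(rss_search_url("ipod"))
```

The resulting link could then be offered as the syndication link for the current search.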

Re: Distribution Information?

2007-09-06 Thread Bill Au
That is very strange. Even if there is something wrong with the config or code, the static HTML contained in distributiondump.jsp should show up. Are you using the latest version of the JSP? There has been a recent fix: http://issues.apache.org/jira/browse/SOLR-333 Bill On 9/5/07, Matthew Run

Deleting all index, including synonyms.txt and stopwords.txt

2007-09-06 Thread lorenzo zhak
Hi, I'm facing an issue regarding a process on our Solr webapp. We currently reindex all documents every X days, since our stopword and synonym lists are edited very often and we use stopwordFilter and SynFilter at index time. For that process, I wrote an Ant script that does somethi

update servlet not working

2007-09-06 Thread Benjamin Li
Hi, We have the example Solr installed with Jetty. We are able to navigate to the solr/admin page, but when we try to POST an XML document via the command line, there is a fatal error. It seems that the solr/update servlet isn't running, giving an HTTP 400 error. Does anyone have any clue what is

Re: update servlet not working

2007-09-06 Thread Chris Hostetter
: We are able to navigate to the solr/admin page, but when we try to : POST an xml document via the command line, there is a fatal error. It : seems that the solr/update servlet isnt running, giving a http 400 : error. a 400 could mean a lot of things ... what is the full HTTP response you get ba

Re: update servlet not working

2007-09-06 Thread Benjamin Li
Oops, sorry, it says "missing content stream". As far as logs go: I have a request log, but didn't find anything with stack traces. Where is it? We're using the example one packaged with Solr. "GET /solr/update HTTP/1.1" 400 1401 Just to make sure, I typed "java -jar post.jar solrfile.xml" th

Re: Distribution Information?

2007-09-06 Thread Matthew Runo
Well, I do get... Distribution Info Master Server No distribution info present ... But there appears to be no information filled in. ++ | Matthew Runo | Zappos Development | [EMAIL PROTECTED] | 702-943-7833 +-

RE: Indexing very large files.

2007-09-06 Thread Lance Norskog
Now I'm curious: what is the use case for documents this large? Thanks, Lance Norskog

Re: Replication broken.. no helpful errors?

2007-09-06 Thread Matthew Runo
The thing is that a new searcher is not opened if I look in the stats.jsp page. The index version never changes. When I run.. sudo /opt/solr/bin/commit -V -u tomcat5 ..I get a new searcher opened, but even though it (in theory) installed the new index, I see no docs in there. During the s

Re: updates on the server

2007-09-06 Thread Matthew Runo
On a related note, it'd be great if we could set up a series of transformations to be done on data when it comes into the index, before being indexed. I guess a custom tokenizer might be the best way to do this though..? ie: -Post -Data is cleaned up, properly escaped, etc -Then data is pa

RE: solr.py problems with german "Umlaute"

2007-09-06 Thread Lance Norskog
I researched this problem before. The problem I found is that Python strings are not Unicode by default. You have to do something to make them Unicode. Here are the links I found: http://www.reportlab.com/i18n/python_unicode_tutorial.html http://evanjones.ca/python-utf8.html http://jjinux.blog

Re: solr.py problems with german "Umlaute"

2007-09-06 Thread Yonik Seeley
On 9/6/07, Brian Carmalt <[EMAIL PROTECTED]> wrote: > Try it with title.encode('utf-8'). > As in: kw = > {'id':'12','title':title.encode('utf-8'),'system':'plone','url':'http://www.google.de'} It seems like the client library should be responsible for encoding, not the user. So try changing title=

solr/home

2007-09-06 Thread Matt Mitchell
Hi, I recently upgraded to Solr 1.2. I've set it up through Tomcat using context fragment files. I deploy using the Tomcat web manager. In the context fragment I set the environment variable solr/home. This used to work as expected. The solr/home value pointed to the directory where "data"

Re: update servlet not working

2007-09-06 Thread Tom Hill
I don't use the java client, but when I switched to 1.2, I'd get that message when I forget to add the content type header, as described in CHANGES.txt > 9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using the new request dispatcher (SOLR-104). This requires posted co
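
To illustrate the content type point, here is a minimal sketch that builds (but does not send) an update request carrying an explicit content type header. The URL is an assumption based on the example Jetty setup, and `build_update_request` is a hypothetical helper name.

```python
from urllib.request import Request


def build_update_request(xml_body, url="http://localhost:8983/solr/update"):
    """Build a POST request for the update handler with an XML content type.

    The request is only constructed here; pass it to
    urllib.request.urlopen() to actually send it.
    """
    return Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


req = build_update_request('<add><doc><field name="id">12</field></doc></add>')
print(req.get_method(), req.get_header("Content-type"))
```

A request without a body, or without the header, is the kind of case that can produce a 400 response like the one reported in this thread.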

Re: solr/home

2007-09-06 Thread Tom Hill
It works for me. (fragments with solr 1.2 on tomcat 5.5.20) Could you post your fragment file? Tom On 9/6/07, Matt Mitchell <[EMAIL PROTECTED]> wrote: > Hi, > > I recently upgraded to Solr 1.2. I've set it up through Tomcat using > context fragment files. I deploy using the tomcat web manager.

Re: Replication broken.. no helpful errors?

2007-09-06 Thread Yonik Seeley
On 9/6/07, Matthew Runo <[EMAIL PROTECTED]> wrote: > The thing is that a new searcher is not opened if I look in the > stats.jsp page. The index version never changes. The index version is read from the index... hence if the lucene index doesn't change (even if a new snapshot was taken), the versio

Re: solr/home

2007-09-06 Thread Matt Mitchell
Here you go: crossContext="true" > This is the same file I'm putting into the Tomcat manager "XML Configuration file URL" form input. Matt On Sep 6, 2007, at 3:25 PM, Tom Hill wrote: It works for me. (fragments with solr 1.2 on tomcat 5.5.20) Could you post your fragment file? Tom

Re: solr.py problems with german "Umlaute"

2007-09-06 Thread Mike Klaas
On 6-Sep-07, at 12:13 PM, Yonik Seeley wrote: On 9/6/07, Brian Carmalt <[EMAIL PROTECTED]> wrote: Try it with title.encode('utf-8'). As in: kw = {'id':'12','title':title.encode ('utf-8'),'system':'plone','url':'http://www.google.de'} It seems like the client library should be responsible fo

Re: Replication broken.. no helpful errors?

2007-09-06 Thread Matthew Runo
Well, I've been playing around with it (removed all the snapshots, restarted tomcat) and it seems like it works now.. maybe. I was noticing that search2 and search3, the slaves, had searchers that had been opened several days ago - when we do several 100 commits and 2 optimizes on search1,

searching where a value is not null?

2007-09-06 Thread David Whalen
Hi all. I'm trying to construct a query that in pseudo-code would read like this: field != '' I'm finding it difficult to write this as a solr query, though. Stuff like: NOT field:() doesn't seem to do the trick. any ideas? dw

Re: searching where a value is not null?

2007-09-06 Thread Yonik Seeley
On 9/6/07, David Whalen <[EMAIL PROTECTED]> wrote: > Hi all. > > I'm trying to construct a query that in pseudo-code would read > like this: > > field != '' > > I'm finding it difficult to write this as a solr query, though. > Stuff like: > > NOT field:() > > doesn't seem to do the trick. > > any i
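
The reply is cut off in this archive, so the following is not necessarily what was suggested. One common way to express "the field has some value" in Lucene/Solr query syntax is an open-ended range query, `field:[* TO *]`, which matches documents that have at least one indexed term in that field. A minimal sketch, with `has_value_query` as a hypothetical helper name:

```python
from urllib.parse import urlencode


def has_value_query(field):
    """Return a range query matching documents with any indexed value in `field`."""
    return f"{field}:[* TO *]"


# URL-encode the query so it can be appended to a select URL.
params = urlencode({"q": has_value_query("field")})
print(params)
```

How empty strings behave depends on the field type and analysis chain, so it is worth checking against the actual schema.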

Re: Indexing very large files.

2007-09-06 Thread Mike Klaas
On 6-Sep-07, at 2:26 AM, Brian Carmalt wrote: Hallo again, I checked out the solr source and built the 1.3-dev version and then I tried to index the same file to the new server. I do get a different exception trace, but the result is the same. Note that StringBuilder expands capacity by al

Slow response

2007-09-06 Thread Aaron Hammond
I am pretty new to Solr and this is my first post to this list so please forgive me if I make any glaring errors. Here's my problem. When I do a search using the Solr admin interface for a term that I know does not exist in my index the QTime is about 1ms. However, if I add facets to the searc

Re: Slow response

2007-09-06 Thread Yonik Seeley
On 9/6/07, Aaron Hammond <[EMAIL PROTECTED]> wrote: > I am pretty new to Solr and this is my first post to this list so please > forgive me if I make any glaring errors. > > Here's my problem. When I do a search using the Solr admin interface for > a term that I know does not exist in my index the

Non-HTTP Indexing

2007-09-06 Thread Renaud Waldura
Dear Solr Users: Is it possible to index documents directly without going through any XML/HTTP bridge? I have a large collection (10^7 documents, some very large) and indexing speed is a concern. Thanks! --Renaud

RE: Non-HTTP Indexing

2007-09-06 Thread Wu, Daniel
There are a couple of choices, see: http://wiki.apache.org/solr/SolJava - Daniel > -Original Message- > From: Renaud Waldura [mailto:[EMAIL PROTECTED] > Sent: Thursday, September 06, 2007 2:21 PM > To: solr-user@lucene.apache.org > Subject: Non-HTTP Indexing > > > Dear Solr Users: > > Is

Re: Non-HTTP Indexing

2007-09-06 Thread Yonik Seeley
On 9/6/07, Renaud Waldura <[EMAIL PROTECTED]> wrote: > Is it possible to index documents directly without going through any > XML/HTTP bridge? > I have a large collection (10^7 documents, some very large) and indexing > speed is a concern. Where are these documents currently stored, and in what fo

RE: Slow response

2007-09-06 Thread Aaron Hammond
Thank you for your response; this does shed some light on the subject. Our basic question was why we were seeing slower responses the smaller our result set got. Currently we are searching about 1.2 million documents with the source document about 2KB, but we do duplicate some of the data. I bump

Re: Slow response

2007-09-06 Thread Mike Klaas
On 6-Sep-07, at 3:16 PM, Aaron Hammond wrote: Thank-you for your response, this does shed some light on the subject. Our basic question was why were we seeing slower responses the smaller our result set got. Currently we are searching about 1.2 million documents with the source document about 2

Re: Slow response

2007-09-06 Thread Mike Klaas
On 6-Sep-07, at 3:25 PM, Mike Klaas wrote: There are essentially two facet computation strategies: 1. cached bitsets: a bitset for each term is generated and intersected with the query result bitset. This is more general and performs well up to a few thousand terms. 2. field enumeratio
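
The first strategy can be sketched in a few lines. This is a toy model under stated assumptions, not Solr's implementation: each bitset is a plain Python integer in which bit i is set when document i matches, and `facet_counts` is a hypothetical helper name.

```python
def facet_counts(term_bitsets, result_bits):
    """Count, per term, how many documents are in both the term's bitset
    and the query result bitset."""
    return {
        term: bin(bits & result_bits).count("1")
        for term, bits in term_bitsets.items()
    }


# Five documents, numbered 0..4 (bit i set means document i matches).
term_bitsets = {
    "CA": 0b01011,  # documents 0, 1, 3
    "NY": 0b10100,  # documents 2, 4
}
result_bits = 0b01110  # the query matched documents 1, 2, 3

counts = facet_counts(term_bitsets, result_bits)
print(counts)
```

The cost grows with the number of distinct terms, since one intersection is computed per term, which fits the remark that this approach performs well up to a few thousand terms.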

caching query result

2007-09-06 Thread Jae Joo
Hi, I am wondering whether there is any way to cache faceted search results. I have 13 million documents and facet by state (50 values). If there is a mechanism to cache, I may get results back faster. Thanks, Jae

removing a field from the relevance calculation

2007-09-06 Thread Bart Smyth
Hi, I'm having trouble getting a field of type SortableFloatField to not weigh into the relevancy score returned for a document. So far I've tried boosting the field to 0.0 at index time using this field type - and also implemented a custom Similarity implementation that overrode lengthNorm(

Re: updates on the server

2007-09-06 Thread Erik Hatcher
On Sep 6, 2007, at 2:56 PM, Matthew Runo wrote: On a related note, it'd be great if we could set up a series of transformations to be done on data when it comes into the index, before being indexed. I guess a custom tokenizer might be the best way to do this though..? ie: -Post -Data is

Re: RSS syndication Plugin

2007-09-06 Thread Thorsten Scherler
On Thu, 2007-09-06 at 09:07 -0400, Ryan McKinley wrote: > perhaps: > https://issues.apache.org/jira/browse/SOLR-208 > > in http://svn.apache.org/repos/asf/lucene/solr/trunk/example/solr/conf/xslt/ > > check: > example_atom.xsl > example_rss.xsl Awesome. Thanks very much Ryan to point me into th

Re: solr/home

2007-09-06 Thread Erik Hatcher
Matt - hey! In your Solr console, which of these three messages do you see? * log.info("JNDI not configured for Solr (NoInitialContextEx)"); * log.info("No /solr/home in JNDI"); * log.warning("Odd RuntimeException while testing for JNDI: " + ex.getM

Re: removing a field from the relevance calculation

2007-09-06 Thread Yonik Seeley
On 9/6/07, Bart Smyth <[EMAIL PROTECTED]> wrote: > I'm having trouble getting a field of type SortableFloatField to not > weigh into the relevancy score returned for a document. > > sortMissingLast="true" omitNorms="true"/> > > So far I've tried boosting the field to 0.0 at index time using thi

Re: caching query result

2007-09-06 Thread Yonik Seeley
On 9/6/07, Jae Joo <[EMAIL PROTECTED]> wrote: > I have 13 millions and have facets by states (50). If there is a mechanism to > cache, I may get faster result back. How fast are you getting results back with standard field faceting (facet.field=state)?

Question on use of wildcard to field name at query

2007-09-06 Thread Toru Matsuzawa
Hi all. Wildcard cannot be used for field name by specifying query though storage in index is possible according to the specification of wildcard by dynamic field. I want to use wildcard to specify field name at query. Please teach something a good idea. The following images. --document