Re: How are multivalued fields used?

2008-10-13 Thread Brian Carmalt
Hello Gene, On Monday, 13.10.2008, at 23:32 +1300, ristretto.rb wrote: > How does one use this field type? > Forums, wiki, Lucene in Action, all coming up empty. > If there's a doc somewhere please point me there. > > I use pysolr to index. But, that's not a requirement. > > I'm not sure h

Re: Not enough space

2008-09-25 Thread Brian Carmalt
Search with Google for "swap file linux", or use your distro name in place of "linux". There is tons of info out there. On Thursday, 25.09.2008, at 02:07 -0700, sunnyfr wrote: > Hi, > I obviously have the same error, I just don't know how you add swap space? > Thanks a lot, > > > Yonik Seeley wrote: > > > > O

Re: Searching with Wildcards

2008-09-24 Thread Brian Carmalt
best bet is to create a new QParser(Plugin) that uses > >> Lucene's QueryParser directly. We probably should have that available > >> anyway in the core, just so folks coming from Lucene Java have the same > >> QueryParser. > >> > >>Erik >

Re: How to copy a solr index to another index with a different schema collapsing stored data?

2008-09-17 Thread Brian Carmalt
It wouldn't be that bad to merge the index externally and then reindex the results, if it is as simple as your example. Search for id:[1 TO *] with an fq for the category, and increment the slice of the results you need to process until you have covered all of the docs in the category. Request the content
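
The slicing described above can be sketched roughly as follows, assuming the standard Solr `q`, `fq`, `start`, and `rows` request parameters. The helper name `slice_params`, the field name `category`, and the page size are hypothetical and do not come from the thread.

```python
def slice_params(total_docs, rows, category):
    """Yield one set of Solr request parameters per slice of the results."""
    for start in range(0, total_docs, rows):
        yield {
            "q": "id:[1 TO *]",              # match every doc with an id
            "fq": "category:%s" % category,  # hypothetical category field
            "start": start,                  # offset of this slice
            "rows": rows,                    # slice size
        }


# Walk 25 docs in slices of 10: offsets 0, 10 and 20.
for params in slice_params(total_docs=25, rows=10, category="books"):
    print(params["start"], params["rows"])
```

Each parameter set would be sent as one search request; the total number of docs is taken from the `numFound` value of the first response.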

Searching with Wildcards

2008-09-02 Thread Brian Carmalt
Hello all, I need to get wildcard searches with highlighting up and running. I'd like to get it to work with a DismaxHandler, but I'll settle for starting with the StandardRequestHandler. I've been reading some of the past mails on wildcard searches and SOLR-195. It seems I need to change th

Re: too many open files

2008-07-14 Thread Brian Carmalt
t running again or call System.gc() periodically. How do I force the VM to release the files? This happens under RedHat with a 2.4 kernel and under Debian Etch with a 2.6 kernel. Thanks, Brian > -Yonik > > On Mon, Jul 14, 2008 at 9:15 AM, Brian Carmalt <[EMAIL PROTECTED]> wro

Re: too many open files

2008-07-14 Thread Brian Carmalt
Hello, I have a similar problem, not with Solr, but in Java. From what I have found, it is a usage and OS problem: it comes from using too many files, and the time it takes the OS to reclaim the fds. I found the recommendation that System.gc() should be called periodically. It works for me. May not be

Re: How to debug ?

2008-06-24 Thread Brian Carmalt
Hello Beto, There is a plugin for Jetty: http://webtide.com/eclipse. Add this as an update site and let Eclipse install the plugin for you. You can then start the Jetty server from Eclipse and debug it. Brian. On Wednesday, 25.06.2008, at 12:48 +1000, Norberto Meijome wrote: > On Tue, 24 J

Problem with searching using the DisMaxHandler

2008-06-19 Thread Brian Carmalt
Hello all, I have defined a DisMax handler. It should search in the following fields: content1, content2 and id (doc uid). I would like to be able to specify a query like the following: (search terms) AND (id1 OR id2 .. idn). My intent is to retrieve only the docs in which hits for the sear

Re: AW: My First Solr

2008-06-13 Thread Brian Carmalt
-- > type Status report > message undefined field text > description The request sent by the client was syntactically incorrect > (undefined field text). > > > Regards Thomas > > > -Urs

Re: My First Solr

2008-06-13 Thread Brian Carmalt
http://wiki.apache.org/solr/DisMaxRequestHandler In solrconfig.xml there are example configurations for the DisMax handler. Sorry I told you the wrong name, not enough coffee this morning. Brian. On Friday, 13.06.2008, at 09:40 +0200, Thomas Lauer wrote:

Re: AW: AW: My First Solr

2008-06-12 Thread Brian Carmalt
The DisMaxQueryHandler is your friend. On Friday, 13.06.2008, at 08:29 +0200, Thomas Lauer wrote: > ok, i find my files now. can I make all files to the default search file? > > Regards Thomas > > -Original Message----- > From: Brian Carmalt [mailto:[EMAIL PRO

Re: AW: My First Solr

2008-06-12 Thread Brian Carmalt
Can you see whether the document update is successful? When you start Solr with java -jar start.jar for the example, Solr will list the document ids of the docs that you are adding and tell you how long the update took. A simple but brute-force method to find out if a document has been committed is to

Re: My First Solr

2008-06-12 Thread Brian Carmalt
Hello Thomas, Have you performed a commit? Try adding <commit/> as the last line of the document you are adding. I would suggest you read up on commits, how often you should perform them, and how to do auto commits. Brian On Friday, 13.06.2008, at 07:20 +0200, Thomas Lauer wrote: > HI, > > i have
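
As an illustration of the commit discussed above, here is a minimal sketch that builds an explicit commit request for Solr's XML update handler. The URL is an assumption (the default location of the example server), and `build_commit_request` is a hypothetical helper name. The request is only constructed, not sent, so the sketch runs without a server.

```python
import urllib.request


def build_commit_request(update_url="http://localhost:8983/solr/update"):
    """Build (but do not send) a POST request carrying a <commit/> message."""
    return urllib.request.Request(
        update_url,  # assumed default URL of the example server
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


req = build_commit_request()
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Documents added before the commit only become visible to searches once the commit has been processed.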

Searching across many fields

2008-06-05 Thread Brian Carmalt
Hello All, We are thinking about a totally dynamic indexing schema, where the only field known to be in the index is the ID field. This means that in order to search in the index, the names of the fields we want to search must be specified: "q=title:solr+content:solr+summary:solr" and so
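
A query of the shape quoted above could be assembled along these lines when the field names are only known at run time. This is a sketch: `multi_field_query` is a hypothetical helper, and the field list simply repeats the three names from the message.

```python
from urllib.parse import urlencode


def multi_field_query(fields, term):
    """Repeat the same term against every field, separated by spaces."""
    return " ".join("%s:%s" % (field, term) for field in fields)


q = multi_field_query(["title", "content", "summary"], "solr")
# urlencode turns spaces into '+' and percent-encodes the colons.
print(urlencode({"q": q}))
```

The query string grows with the number of fields, which is the scaling concern the message raises.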

Re: exception while feeding converted text from pdf

2008-05-14 Thread Brian Carmalt
Hello Cam, Are you writing your XML by hand, as in without an XML writer? That can cause problems. In your exception it says "latitude 59&"; the & should have been converted to '&amp;' (I think). If you can use Java 6, there is an XMLStreamWriter in javax.xml.stream that does automatic special character escaping. This
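
The message recommends Java's XMLStreamWriter. For readers feeding Solr from Python, the same escaping step can be sketched with the standard library's `xml.sax.saxutils`. The helper `field_xml` and the field name `text` are hypothetical; this is a Python analogue, not the Java API named above.

```python
from xml.sax.saxutils import escape, quoteattr


def field_xml(name, value):
    """Render one <field> element with the name and value escaped."""
    # quoteattr adds the surrounding quotes and escapes the attribute value;
    # escape replaces &, < and > in the element text.
    return "<field name=%s>%s</field>" % (quoteattr(name), escape(value))


print(field_xml("text", "latitude 59& longitude"))
```

An unescaped `&` in hand-written XML is exactly what makes the parser fail on input like "latitude 59&".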

Re: indexing pdf documents

2008-05-14 Thread Brian Carmalt
Hello Cam, The wiki for RichDocuments explains how you can add metadata to the RDUpdater. http://wiki.apache.org/solr/UpdateRichDocuments I have used the patch to index docs and their metadata, but it was not exactly what we needed. Brian. Am Mittwoch, den 14.05.2008, 12:38 +0300 schrieb

Re: How to effectively search inside fields that should be indexed with changing them.

2007-12-14 Thread Brian Carmalt
/analyzers/factories. For the first part you'll likely want to extract (W+)0+ -- 1 or more letters followed by 1 or more zeros as one token, and then 0+(D+) -- 1 or more zeros followed by 1 or more digits. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message -
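
The split described above (letters, then padding zeros, then the remaining digits) can be tried out in plain Python before moving it into an analyzer. This is only a sketch: the pattern is one possible reading of the expressions in the message, whose backslashes appear to have been lost in the archive, and the sample title is the one from the original question in this thread.

```python
import re

# Letters, one or more padding zeros, then the remaining digits.
TITLE_PREFIX = re.compile(r"([A-Za-z]+)0+(\d+)")


def split_prefix(title):
    """Return (letters, number) from the start of the title, or None."""
    match = TITLE_PREFIX.match(title)
    return match.groups() if match else None


print(split_prefix("ABC0001231-This is an important doc.pdf"))
# prints ('ABC', '1231')
```

Inside Solr the equivalent would be a pattern-based tokenizer or filter in the field type's analyzer chain; the regular expression itself carries over.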

How to effectively search inside fields that should be indexed with changing them.

2007-12-11 Thread Brian Carmalt
Hello all, The titles of our docs have the form "ABC0001231-This is an important doc.pdf". I would like to be able to search for 'important', or '1231', or 'ABC000*', or 'This is an important doc' in the title field. I looked at the NGramTokenizer and tried to use it. In the index it doesn't

Re: out of heap space, every day

2007-12-04 Thread Brian Carmalt
Hello, I am also fighting with heap exhaustion, though during the indexing step. I was able to reduce, but not fix, the problem by setting the thread stack size to 64k with "-Xss64k". The minimum size is OS-specific, but the VM will tell you if you set the size too small. You can try it, it

Re: Weird memory error.

2007-11-20 Thread Brian Carmalt
Can you recommend one? I am not familiar with how to profile under Java. Yonik Seeley wrote: Can you try a profiler to see where the memory is being used? -Yonik On Nov 20, 2007 11:16 AM, Brian Carmalt <[EMAIL PROTECTED]> wrote: Hello all, I started looking into the scalability o

Weird memory error.

2007-11-20 Thread Brian Carmalt
Hello all, I started looking into the scalability of solr, and have started getting weird results. I am getting the following error: Exception in thread "btpool0-3" java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java

Re: [jira] Commented: (SOLR-380) There's no way to convert search results into page-level hits of a "structured document".

2007-10-31 Thread Brian Carmalt
There is more to consider here. Lucene now supports "payloads", additional metadata on terms that can be leveraged with custom queries. I've not yet tinkered with them myself, but my understanding is that they would be useful (and in fact designed in part) for representing structured docume

Re: Searching dynamic fields

2007-10-15 Thread Brian Carmalt
le feature. I would like to see Solr support wildcarded field names in request parameters, but we're not there yet. Erik On Oct 15, 2007, at 9:32 AM, Brian Carmalt wrote: Hello all, Is there a way to search dynamicFields, without having to specify the name of the field in a Qu

Re: Querying for an id with a colon in it

2007-10-15 Thread Brian Carmalt
Robert Young wrote: Hi, If my unique identifier is called guid and one of the ids in it is, for example, "article:123". How can I query for that article id? I have tried a number of ways but I always either get no results or an error. It seems to be to do with having the colon in the id value.
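
The reply to this question is cut off in the archive. As a general note, the Lucene query parser treats the colon as the field separator, so a literal colon inside a value has to be escaped with a backslash (or the whole value put in double quotes). A minimal sketch follows; `escape_term` is a hypothetical helper and the character list is the set of query-syntax special characters as I understand it.

```python
# Characters with special meaning in the Lucene query syntax.
LUCENE_SPECIAL = '\\+-!(){}[]^"~*?:&|'


def escape_term(value):
    """Prefix every special character in the value with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in value)


query = "guid:" + escape_term("article:123")
print(query)
# prints guid:article\:123
```

The quoted form `guid:"article:123"` is the other common way to keep the parser from splitting on the inner colon.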

Searching dynamic fields

2007-10-15 Thread Brian Carmalt
Hello all, Is there a way to search dynamicFields without having to specify the name of the field in a query? Example: I have indexed a doc with the field name myDoc_text_en, and I have a dynamic field *_text_en which maps to a type of text_en. How can I search this field without knowing its

Re: Indexing very large files.

2007-09-07 Thread Brian Carmalt
Lance Norskog wrote: Now I'm curious: what is the use case for documents this large? Thanks, Lance Norskog It is a fringe use case, but could become relevant for us. I was told to explore the possibilities, and that's what I'm doing. :) Since I haven't heard any suggestions as to how to

Re: solr.py problems with german "Umlaute"

2007-09-06 Thread Brian Carmalt
Hello Christian, Try it with title.encode('utf-8'). As in: kw = {'id':'12','title':title.encode('utf-8'),'system':'plone','url':'http://www.google.de'} Christian Klinger wrote: Hi all, i try to add/update documents with the python solr.py api. Everything works fine so far but if i try to
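
To show what the suggested `encode('utf-8')` call does to umlauts, here is a small sketch in present-day Python 3. The sample title is invented. The thread dates from the Python 2 era, where a unicode object had to be encoded to a byte string before being handed to solr.py; whether a current client library wants `str` or `bytes` is not something this sketch can answer.

```python
title = "Umlaute: äöü"             # invented sample title
encoded = title.encode("utf-8")    # bytes, as suggested in the reply

# Each umlaut becomes two bytes in UTF-8, so the byte string is longer.
print(len(title), len(encoded))

kw = {"id": "12", "title": encoded, "system": "plone",
      "url": "http://www.google.de"}
```

Decoding with `encoded.decode("utf-8")` restores the original text, which is a quick check that nothing was lost.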

Re: Indexing very large files.

2007-09-06 Thread Brian Carmalt
Hello again, I checked out the Solr source and built the 1.3-dev version, and then I tried to index the same file to the new server. I do get a different exception trace, but the result is the same. java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2882) a

Re: Indexing very large files.

2007-09-06 Thread Brian Carmalt
Hi Thorsten, I am using Solr 1.2.0. I'll try the svn version out and see if that helps. Thanks, Brian Which version do you use of solr? http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/handler/XmlUpdateRequestHandler.java?view=markup The trunk version of the XmlUpdate

Re: Indexing very large files.

2007-09-05 Thread Brian Carmalt
5 Sep 2007 17:18:09 +0200 Brian Carmalt <[EMAIL PROTECTED]> wrote: I've been trying to index a 300MB file to solr 1.2. I keep getting out of memory heap errors. Even on an empty index with one Gig of vm memory it still won't work. Hi Brian, VM != heap memory. VM =

Re: Indexing very large files.

2007-09-05 Thread Brian Carmalt
Yonik Seeley wrote: On 9/5/07, Brian Carmalt <[EMAIL PROTECTED]> wrote: I've been trying to index a 300MB file to solr 1.2. I keep getting out of memory heap errors. 300MB of what... a single 300MB document? Or does that file represent multiple documents in XML or CSV form

Indexing very large files.

2007-09-05 Thread Brian Carmalt
Hello all, I will apologize up front if this comes twice. I've been trying to index a 300MB file to solr 1.2. I keep getting out of memory heap errors. Even on an empty index with one Gig of vm memory it still won't work. Is it even possible to get Solr to index such large files? Do I need to