date:20070612

storing the document URI in the index

2007-06-12 Thread Ard Schrijvers

Hello, is it possible to configure solr to store the document URI in the lucene index (the URI is not an xml field, but just the document's location)? Or is everybody used to storing the contents of a document in the lucene index (doesn't this imply a much larger index though?), so instead of r

Re: storing the document URI in the index

2007-06-12 Thread Erik Hatcher

On Jun 12, 2007, at 8:51 AM, Ard Schrijvers wrote: > is it possible to configure solr to store the document URI in the lucene index (the URI is not an xml field, but just the document's location)? Yes. Set the field to be stored and non-indexed; field type "string" is what I use. > Or is ev

RE: storing the document URI in the index

2007-06-12 Thread Ard Schrijvers

Hello Erik, thanks for the fast answer (sorry that my mail does not indent quotes, but I must use webmail :-( ), but the problem I am facing is that I do not see solr storing the location of the documents it indexed. So, I need to store the location of a document in a field, but I do not see where solr would

Re: storing the document URI in the index

2007-06-12 Thread Otis Gospodnetic

Ard, You have to store the URI in a Field yourself. That means you need to define that field in the schema and you have to set its value when adding documents. Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Or
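Otis's advice can be illustrated with a minimal sketch, assuming a field named "uri" (a hypothetical name) has been declared in schema.xml as stored and non-indexed, as Erik suggested earlier in the thread. The helper below is not part of Solr; it only builds the XML update message on the client side, which is where the URI has to be set.

```python
import xml.etree.ElementTree as ET

# Assumed schema.xml entry (the field name "uri" is hypothetical):
#   <field name="uri" type="string" indexed="false" stored="true"/>


def build_add_xml(doc_id, uri, fields):
    """Build a Solr <add> message whose document carries its own URI."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # The client sets the uri field explicitly; Solr does not know it.
    for name, value in [("id", doc_id), ("uri", uri), *fields.items()]:
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


xml_msg = build_add_xml(
    "doc-1", "http://example.org/docs/1.xml", {"title": "Example"}
)
print(xml_msg)
```

Because the field is stored, the URI would come back with each search hit; because it is not indexed, it would not be searchable.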

RE: storing the document URI in the index

2007-06-12 Thread Ard Schrijvers

Hello Otis, thanks for the info. Would it be an improvement to be able to specify in schema.xml whether or not the URI should be stored, in a field whose name you can also specify in the schema? It might very well be possible that you do not "own" the xml documents you index over ht

indexing documents (or pieces of a document) by access controls

2007-06-12 Thread Nathaniel A. Johnson

Hi all, Can anyone give me some advice on breaking a document up and indexing it by access control lists? What we have are xml documents that are transformed based on the user viewing them. Some users might see all of the document, while others may see a few fields, and yet others see nothing at a

Re: storing the document URI in the index

2007-06-12 Thread Otis Gospodnetic

I'm afraid I don't understand your question. Perhaps somebody else does. Otis - Original Message From: Ard Schrijvers <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org; solr-user@lucene.apache.org Sent: Tuesday, June 12, 2007 9:23:16 AM Subject: RE: storing the document URI in the ind

Re: storing the document URI in the index

2007-06-12 Thread Yonik Seeley

On 6/12/07, Ard Schrijvers <[EMAIL PROTECTED]> wrote: > thanks for the info. Would it be an improvement to be able to specify in schema.xml whether or not the URI should be stored, in a field whose name you can also specify in the schema? It might very well be possible that you do not

Re: storing the document URI in the index

2007-06-12 Thread Walter Underwood

Solr doesn't have the URL of the document. The document is given to Solr in an HTTP POST. Solr is not a web spider, it is a search web service. wunder On 6/12/07 6:23 AM, "Ard Schrijvers" <[EMAIL PROTECTED]> wrote: > Hello Otis, > > thanks for the info. Would it be an improvement to be abl
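Walter's point, that Solr only ever sees the body of an HTTP POST, can be sketched as follows. The host and port are assumptions; the /solr/update path is the one mentioned elsewhere in this archive. The sketch only constructs the request and does not send it.

```python
import urllib.request

# Assumed location of the update handler; adjust to the actual deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def make_update_request(xml_body, url=SOLR_UPDATE_URL):
    """Wrap an XML update message in an HTTP POST request object."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


request = make_update_request(
    '<add><doc><field name="id">doc-1</field></doc></add>'
)
# urllib.request.urlopen(request) would send it; it is not called here.
print(request.get_method(), request.full_url)
```

Nothing in the request tells Solr where the document originally lived, which is why the location has to travel inside the document as an ordinary field.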

RE: indexing documents (or pieces of a document) by access controls

2007-06-12 Thread Ard Schrijvers

Hello Nate, IMHO, you will not be able to do this in solr unless you accept pretty hard constraints on your ACLs (I will get back to this in a moment). IMO, it is not possible to index documents along with ACLs. ACLs can be very fine grained, and the thing you describe, ACL specific parts of a

RE: storing the document URI in the index

2007-06-12 Thread Ard Schrijvers

Thanks Yonik and Walter, putting it that way, it does make good sense not to store the transient xml file, which is what it is in most of the use cases (I was thinking differently because I do have xml files on the file system or over http, like from a webdav call). Anyway, thanks for all the answers, and again, sry fo

RE: indexing documents (or pieces of a document) by access controls

2007-06-12 Thread Ard Schrijvers

Excuse me, I meant solr of course :-) > For these reasons, I do not think you can achieve with solar

Tomcat: The requested resource (/solr/update) is not available.

2007-06-12 Thread Matt Mitchell

Hi, I've got an app using Cocoon and Solr, both running through Tomcat. The post.sh file has been modified to grab local files and send them to Cocoon (via http); the Solr-fied xml from Cocoon is then sent to the update url in Tomcat/Solr. Not sure any of that is relevant though! I'm running t

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka

Daniel, I was reading your email and responses to it with great interest. I was aware that Solr has an implicit assumption that a field is mono-lingual per system. But your mail and its correspondence made me wonder if this limitation is practical for multi-lingual search applications. For bi-

Re: Multi-language indexing and searching

2007-06-12 Thread Yonik Seeley

On 6/12/07, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote: > For bi-lingual or tri-lingual search, we can have parallel fields (title_en, title_fr, title_de, for example) but this wouldn't scale well. Due to search across multiple fields, or due to increased index size? Lucene and Solr require t
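The parallel-field approach discussed here (title_en, title_fr, title_de) implies that a language-agnostic search has to fan out over every language variant of a field. A minimal sketch of that fan-out, assuming the field naming scheme from the thread and a hypothetical helper name:

```python
def multilingual_query(term, base_field="title", languages=("en", "fr", "de")):
    """Build one query clause per language-specific field, joined with OR."""
    # Escape backslashes and double quotes so the term stays a single phrase.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    clauses = [f'{base_field}_{lang}:"{escaped}"' for lang in languages]
    return " OR ".join(clauses)


print(multilingual_query("chat"))
```

The query grows by one clause per language, which is one way to read the scaling concern raised in the thread.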

RE: storing the document URI in the index

2007-06-12 Thread Thorsten Scherler

On Tue, 2007-06-12 at 16:33 +0200, Ard Schrijvers wrote: > Thanks Yonik and Walter, > > putting it that way, it does make good sense not to store the transient xml > file, which is what it is in most of the use cases (I was thinking differently because I > do have xml files on the file system or over http, like

Re: indexing documents (or pieces of a document) by access controls

2007-06-12 Thread Ken Krugler

> Hi all, Can anyone give me some advice on breaking a document up and indexing it by access control lists? What we have are xml documents that are transformed based on the user viewing them. Some users might see all of the document, while others may see a few fields, and yet others see nothing at a

Re: indexing documents (or pieces of a document) by access controls

2007-06-12 Thread Daniel Alheiros

Hi. As for the fields: if they are or aren't going to be present in the responses based on the user group, you can do it in many different ways (using XML transformation to remove the undesirable fields, implementing your own RequestHandler able to process your group information, filtering the data
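One of the options Daniel lists, filtering the data after the search, can be sketched on the client side. The group names and the field-visibility table below are hypothetical; this is not a Solr API, only an illustration of dropping fields a group may not see.

```python
# Hypothetical mapping from user group to the fields that group may see.
VISIBLE_FIELDS = {
    "staff": {"id", "title", "body", "notes"},
    "public": {"id", "title"},
}


def filter_fields(doc, group):
    """Return a copy of a result document with disallowed fields removed."""
    allowed = VISIBLE_FIELDS.get(group, set())  # unknown groups see nothing
    return {name: value for name, value in doc.items() if name in allowed}


hit = {"id": "doc-1", "title": "Report", "body": "full text", "notes": "internal"}
print(filter_fields(hit, "public"))
```

This only hides fields in the response; it does not stop a hidden field from influencing which documents match, which is part of the concern Ard raises about ACLs.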

Re: LIUS/Fulltext indexing

2007-06-12 Thread Vish D.

Sounds interesting. I can't seem to find any clear dates on the project website. Do you know? ...V1 shipping date? Thanks! On 6/12/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote: On 6/12/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: >... I think Tika will be the way forward (some of the code

Re: Multi-language indexing and searching

2007-06-12 Thread Daniel Alheiros

Hi Yonik. About how to handle the index at query time: I think that if you don't specify a language, you can return any document matching the term, regardless of language (if that's possible), or, if it suits your solution, you can define a default language to be used

RE: Multi-language indexing and searching

2007-06-12 Thread Ken Krugler

Daniel, I was reading your email and responses to it with great interest. I was aware that Solr has an implicit assumption that a field is mono-lingual per system. But your mail and its correspondence made me wonder if this limitation is practical for multi-lingual search applications. For bi-li

Re: question about sorting

2007-06-12 Thread Yonik Seeley

On 6/11/07, Xuesong Luo <[EMAIL PROTECTED]> wrote: For example, first name, department, job title etc. Something like first name might be able to be a single field that is searchable and sortable (use a keyword tokenizer followed by a lowercase filter). If the field contains multiple words, an
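The effect of the analysis chain Yonik describes (a keyword tokenizer followed by a lowercase filter) can be approximated in a few lines: the whole value stays one token and is lowercased, so sorting becomes case-insensitive. This is only an illustration of the resulting sort key, not Solr's implementation.

```python
def sort_key(value):
    """Mimic keyword tokenizer + lowercase filter: one lowercased token."""
    return value.lower()


names = ["bob", "Alice", "Mary Ann"]
print(sorted(names))                # plain sort puts uppercase first
print(sorted(names, key=sort_key))  # case-insensitive order
```

Because the value stays a single token, a search for "ann" alone would not match "Mary Ann" in such a field, which is the limitation that leads to the copy field discussed in the follow-up messages.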

Re: LIUS/Fulltext indexing

2007-06-12 Thread Bertrand Delacretaz

On 6/12/07, Vish D. <[EMAIL PROTECTED]> wrote: ...Sounds interesting. I can't seem to find any clear dates on the project website. Do you know? ...V1 shipping date?... Not at the moment, Tika just entered incubation and it's impossible to predict what will happen. But help is welcome, of cours

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka

Hi Yonik, > On 6/12/07, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote: > > For bi-lingual > > or tri-lingual search, we can have parallel fields (title_en, > > title_fr, title_de, for example) but this wouldn't scale well. > > Due to search across multiple fields, or due to increased index size? D

RE: question about sorting

2007-06-12 Thread Xuesong Luo

Thanks, Yonik. Unfortunately we have users whose first names contain more than one word, it seems copy field is my only option. Thanks Xuesong -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley Sent: Tuesday, June 12, 2007 10:35 AM To: solr-use

Re: question about sorting

2007-06-12 Thread Yonik Seeley

On 6/12/07, Xuesong Luo <[EMAIL PROTECTED]> wrote: Thanks, Yonik. Unfortunately we have users whose first names contain more than one word, it seems copy field is my only option. Yes, if you need to be able to match on part of a first name, rather than just exact first name. -Yonik

RE: Multi-language indexing and searching

2007-06-12 Thread Chris Hostetter

: Due to the proliferation of number of fields. Say, we want : to have the field "title" to have the title of the book in : its original language. But because Solr has this implicit : assumption of one language per field, we would have to have : the artificial fields title_fr, title_de, title_en

Re: To make sure XML is UTF-8

2007-06-12 Thread Ajanta Phatak

Hi Not sure if you've had a solution for your problem yet, but I had dealt with a similar issue that is mentioned below and hopefully it'll help you too. Of course, this assumes that your original data is in utf-8 format. The default charset encoding for mysql is Latin1 and our display format
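As a general illustration of the encoding issue in this thread, the sketch below re-encodes bytes that are known to be Latin-1 as UTF-8 before they are placed in an XML update message. It assumes the source encoding really is Latin-1; the exact fix described in the truncated message above is not visible here.

```python
def latin1_to_utf8(raw):
    """Re-encode Latin-1 bytes as UTF-8 bytes."""
    return raw.decode("latin-1").encode("utf-8")


raw = "café".encode("latin-1")  # é is the single byte 0xE9 in Latin-1
converted = latin1_to_utf8(raw)
print(raw, converted)
```

Decoding with the wrong source encoding does not raise an error for Latin-1 (every byte is valid), so a mislabeled input silently produces wrong characters.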

Re: LIUS/Fulltext indexing

2007-06-12 Thread Vish D.

Wonder if TOM could be useful to integrate? http://tom.library.upenn.edu/convert/sofar.html On 6/12/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote: On 6/12/07, Vish D. <[EMAIL PROTECTED]> wrote: > ...Sounds interesting. I can't seem to find any clear dates on the project > website. Do you kn

Re: To make sure XML is UTF-8

2007-06-12 Thread Tiong Jeffrey

Hi Ajanta, thanks! Since I used PHP, I managed to use the PHP decode function to change it to UTF-8. But just a question: even if we change the mysql default char-set to UTF-8, if the input is originally in another format, the mysql engine won't help to convert it to UTF-8, right? I think my questio

Compass vs Solr

2007-06-12 Thread Harini Raghavan

Hi Everyone, We have a web application with search functionality built using lucene. The search is across different types of data, so it does not scale well from the database. As lucene does not allow storing relational data, we decided to try out Compass since it provides an object relation mapp

Re: LIUS/Fulltext indexing

2007-06-12 Thread Bertrand Delacretaz

On 6/13/07, Vish D. <[EMAIL PROTECTED]> wrote: ...Wonder if TOM could be useful to integrate? http://tom.library.upenn.edu/convert/sofar.html... It might be interesting, and as I understand it, the goal of Tika is mostly to be a framework for plugging in various types of analyzers. So plugging in m

32 matches
