Hello,
is it possible to configure solr to store the document URI in the lucene index
(the URI is not an xml field, but just the document's location)? Or is
everybody used to storing the contents of a document in the lucene index
(doesn't this imply a much larger index though?), so instead of r
On Jun 12, 2007, at 8:51 AM, Ard Schrijvers wrote:
is it possible to configure solr to store the document URI in the
lucene index (the URI is not an xml field, but just the document's
location)?
Yes. Set the field to be stored and non-indexed; field type "string"
is what I use.
Or is ev
Hello Erik,
thanks for the fast answer (sry for my mail not indenting but must use webmail
:-( ), but the problem I am facing is that I do not see solr storing the
location of the documents it indexed. So, I need to store the location of a
document in a field, but I do not see where solr would
Ard,
You have to store the URI in a Field yourself. That means you need to define
that field in the schema and you have to set its value when adding documents.
Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/ - Tag - Search - Share
- Or
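A minimal sketch of what Otis describes, i.e. setting the URI value yourself when adding a document. The field names "id", "uri" and "text" are assumptions for illustration; they would have to match fields you declare in your own schema.xml (for the URI, a stored, non-indexed "string" field as Erik suggests).

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc_id: str, uri: str, body: str) -> str:
    """Build a Solr <add> message whose document carries its own URI."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # Hypothetical field names; they must exist in your schema.xml.
    for name, value in (("id", doc_id), ("uri", uri), ("text", body)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


print(build_add_xml("doc-1", "http://example.org/docs/a.xml", "hello"))
```

The client that fetches the document is the only party that knows its location, so it is the client that has to copy that location into a field.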
Hello Otis,
thanks for the info. Would it be an improvement to be able to specify in the
schema.xml whether or not the URI should be stored in a field whose name
you can also specify in the schema? It may very well be that you do
not "own" the xml documents you index over ht
Hi all,
Can anyone give me some advice on breaking a document up and indexing it
by access control lists. What we have are xml documents that are
transformed based on the user viewing it. Some users might see all of
the document, while others may see a few fields, and yet others see
nothing at a
I'm afraid I don't understand your question. Perhaps somebody else does.
Otis
- Original Message
From: Ard Schrijvers <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org; solr-user@lucene.apache.org
Sent: Tuesday, June 12, 2007 9:23:16 AM
Subject: RE: storing the document URI in the ind
On 6/12/07, Ard Schrijvers <[EMAIL PROTECTED]> wrote:
thanks for the info. Would it be an improvement to be able to specify in the schema.xml
whether or not the URI should be stored in a field whose name you can also specify
in the schema? It might be very well possible that you do not
Solr doesn't have the URL of the document. The document is given
to Solr in an HTTP POST.
Solr is not a web spider, it is a search web service.
wunder
On 6/12/07 6:23 AM, "Ard Schrijvers" <[EMAIL PROTECTED]> wrote:
> Hello Otis,
>
> thanks for the info. Would it be an improvement to be abl
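To illustrate the point that Solr only ever sees what is POSTed to it, here is a rough sketch of wrapping an update message in an HTTP POST. The endpoint URL is an assumption; the real host, port and path depend on the deployment. The request is built but not sent, so the sketch runs without a server.

```python
import urllib.request

# Assumed endpoint, for illustration only.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def make_update_request(xml_message: str,
                        url: str = SOLR_UPDATE_URL) -> urllib.request.Request:
    """Wrap an XML update message in an HTTP POST request (not sent here)."""
    return urllib.request.Request(
        url,
        data=xml_message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = make_update_request('<add><doc><field name="id">doc-1</field></doc></add>')
# urllib.request.urlopen(req) would actually send it; omitted on purpose.
print(req.get_method(), req.full_url)
```

Nothing in the request tells Solr where the document originally came from unless the message body itself contains that information.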
Hello Nate,
IMHO, you will not be able to do this in solr unless you accept pretty hard
constraints on your ACLs (I will get back to this in a moment). IMO, it is not
possible to index documents along with ACLs. ACLs can be very fine grained, and
the thing you describe, ACL specific parts of a
Thanks Yonik and Walter,
putting it that way, it does make good sense not to store the transient xml
file, which it is in most of the use cases (I was thinking differently because I do
have xml files on the file system or over http, like from a webdav call)
Anyway, thx for all answers, and again, sry fo
Excuse me, I meant solr of course :-)
> For these reasons, I do not think you can achieve with solar
Hi,
I've got an app using Cocoon and Solr, both running through Tomcat.
The post.sh file has been modified to grab local files and send them to
Cocoon (via http); the Solr-fied xml from Cocoon is then sent to the
update url in Tomcat/Solr. Not sure any of that is relevant though!
I'm running t
Daniel,
I was reading your email and responses to it with great
interest.
I was aware that Solr has an implicit assumption that
a field is mono-lingual per system. But your mail and
its correspondence made me wonder if this limitation
is practical for multi-lingual search applications. For bi-
On 6/12/07, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote:
For bi-lingual
or tri-lingual search, we can have parallel fields (title_en,
title_fr, title_de, for example) but this wouldn't scale well.
Due to search across multiple fields, or due to increased index size?
Lucene and Solr
requires t
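For reference, searching across parallel per-language fields amounts to OR-ing the same term over each field. A small sketch, assuming the field naming from the example above (title_en, title_fr, title_de) and a single-word term with no special query characters that would need escaping:

```python
def multilingual_query(term: str, base: str = "title",
                       langs=("en", "fr", "de")) -> str:
    """OR the same term across one field per language."""
    # Assumes `term` is a single word without query-syntax characters.
    return " OR ".join(f"{base}_{lang}:{term}" for lang in langs)


print(multilingual_query("paris"))
```

The query grows by one clause per language and per logical field, which is the scaling concern being discussed.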
On Tue, 2007-06-12 at 16:33 +0200, Ard Schrijvers wrote:
> Thanks Yonik and Walter,
>
> putting it that way, it does make good sense not to store the transient xml
> file, which it is in most of the use cases (I was thinking differently because I
> do have xml files on the file system or over http, like
Hi
As for the fields, if they should or shouldn't be present in the
responses depending on the user group, you can do it in many different ways
(using XML transformation to remove the undesirable fields, implementing
your own RequestHandler able to process your group information, filtering
the data
Sounds interesting. I can't seem to find any clear dates on the project
website. Do you know? ...V1 shipping date?
Thanks!
On 6/12/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote:
On 6/12/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>... I think Tika will be the way forward (some of the code
Hi Yonik.
About how to handle the index at query time:
I think that if you don't specify a language, you can return any document
matching the term, without considering different languages (if it's
possible) or if it's interesting for your solution, you can define a default
language to be used
On 6/11/07, Xuesong Luo <[EMAIL PROTECTED]> wrote:
For example, first name, department, job title etc.
Something like first name might be able to be a single field that is
searchable and sortable (use a keyword tokenizer followed by a
lowercase filter). If the field contains multiple words, an
On 6/12/07, Vish D. <[EMAIL PROTECTED]> wrote:
...Sounds interesting. I can't seem to find any clear dates on the project
website. Do you know? ...V1 shipping date?...
Not at the moment, Tika just entered incubation and it's impossible to
predict what will happen.
But help is welcome, of cours
Hi Yonik,
> On 6/12/07, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote:
> > For bi-lingual
> > or tri-lingual search, we can have parallel fields (title_en,
> > title_fr, title_de, for example) but this wouldn't scale well.
>
> Due to search across multiple fields, or due to increased index size?
D
Thanks, Yonik. Unfortunately we have users whose first names contain
more than one word, so it seems copy field is my only option.
Thanks
Xuesong
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Tuesday, June 12, 2007 10:35 AM
To: solr-use
On 6/12/07, Xuesong Luo <[EMAIL PROTECTED]> wrote:
Thanks, Yonik. Unfortunately we have users whose first names contain
more than one word, so it seems copy field is my only option.
Yes, if you need to be able to match on part of a first name, rather
than just the exact first name.
-Yonik
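The difference between the two field types can be illustrated in plain Python. This only imitates what the analyzers do (a keyword tokenizer plus lowercase filter versus a whitespace-split, lowercased field); the function names are made up for the illustration and are not Solr APIs.

```python
from typing import List


def keyword_lowercase(value: str) -> List[str]:
    """Whole value as one lowercased token: sortable, exact match only."""
    return [value.lower()]


def whitespace_lowercase(value: str) -> List[str]:
    """One lowercased token per word: partial-name matches work."""
    return value.lower().split()


def matches(query: str, tokens: List[str]) -> bool:
    """True if the lowercased query equals one of the indexed tokens."""
    return query.lower() in tokens


name = "Mary Ann"
print(matches("Mary Ann", keyword_lowercase(name)))
print(matches("ann", keyword_lowercase(name)))
print(matches("ann", whitespace_lowercase(name)))
```

Keeping one field of each kind, with the second populated by a copy field, gives both sorting on the full name and matching on part of it.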
: Due to the proliferation of fields. Say, we want
: to have the field "title" to have the title of the book in
: its original language. But because Solr has this implicit
: assumption of one language per field, we would have to have
: the artificial fields title_fr, title_de, title_en
Hi
Not sure if you've found a solution for your problem yet, but I dealt
with a similar issue that is mentioned below and hopefully it'll help
you too. Of course, this assumes that your original data is in utf-8 format.
The default charset encoding for mysql is Latin1 and our display format
Wonder if TOM could be useful to integrate?
http://tom.library.upenn.edu/convert/sofar.html
On 6/12/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote:
On 6/12/07, Vish D. <[EMAIL PROTECTED]> wrote:
> ...Sounds interesting. I can't seem to find any clear dates on the project
> website. Do you kn
Hi Ajanta,
thanks! Since I used PHP, I managed to use the PHP decode function to change
it to UTF-8.
But just a question: even if we change the mysql default char-set to UTF-8,
if the input is originally in another format, the mysql engine won't
convert it to UTF-8, right? I think my questio
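For what it's worth, converting the bytes yourself before indexing is short in most languages. A sketch in Python, assuming the input really is Latin-1 (ISO-8859-1); the function name is made up for the illustration:

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8."""
    # Decoding as Latin-1 never raises, so if the source encoding is
    # guessed wrong the result is silently garbled rather than rejected.
    return raw.decode("latin-1").encode("utf-8")


print(latin1_to_utf8(b"caf\xe9"))
```

The important part is knowing the original encoding; the conversion itself is mechanical.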
Hi Everyone,
We have a web application with search functionality built using lucene. The
search is across different types of data, so it does not scale well from the
database. As lucene does not allow storing relational data, we decided to
try out Compass since it provides an object relation mapp
On 6/13/07, Vish D. <[EMAIL PROTECTED]> wrote:
...Wonder if TOM could be useful to integrate?
http://tom.library.upenn.edu/convert/sofar.html...
It might be interesting, and as I understand it the goal of Tika is
mostly to be a framework for plugging in various types of analyzers.
So plugging in m