Hi Li,
I looked at doing something similar, where we only index the text
but retrieve search results / highlighting from the files. We ended up
giving up because of the amount of customisation required in Solr, mainly
because we wanted Solr's distributed search functionality, which
meant making
I used to store the full text in the Lucene index, but I found merging
is very slow, because merging two segments copies the .fdt files into a
new one. So I want to only index the full text, not store it. But when
searching I need the full text for features such as highlighting and
viewing the full text. I can s
You could implement a good solution with the underlying Lucene ParallelReader
http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/index/ParallelReader.html
Keep the 100 search fields - 'static' info - in one index, the
permissions info in another index that gets updated when the
permissi
What Ken describes is called 'role-based' security. Users have roles,
and security items talk about roles, not users.
http://en.wikipedia.org/wiki/Role-based_access_control
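As a sketch of how the query side of this can look, a client might translate a user's roles into a Solr filter query before searching. The 'role' field name and the matching-nothing fallback below are invented for illustration, not part of any existing patch:

```python
def role_filter_query(user_roles):
    """Build a Solr fq clause restricting results to documents tagged
    with any of the user's roles (the 'role' field name is made up)."""
    if not user_roles:
        # A pure negative query matches nothing, so role-less users see no docs
        return "-*:*"
    return "role:(%s)" % " OR ".join(sorted(user_roles))

# A user holding the 'editor' and 'viewer' roles:
print(role_filter_query({"editor", "viewer"}))  # role:(editor OR viewer)
```

The point of keeping this in an fq rather than baked into each document is exactly what the thread suggests: roles change without reindexing.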
On Tue, Jul 6, 2010 at 3:15 PM, Peter Sturge wrote:
> Yes, you don't want to hard code permissions into your index - it wil
Ah! I did not notice the 'too many open files' part. This means that
your mergeFactor setting is too high for what your operating system
allows. The default mergeFactor is 10 (which translates into thousands
of open file descriptors). You should lower this number.
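For reference, the setting lives in the indexDefaults section of solrconfig.xml; the value 4 here is just an example, not a recommendation:

```xml
<indexDefaults>
  <!-- A lower mergeFactor means fewer live segments and therefore fewer
       open file descriptors, at the cost of more frequent merge work
       during indexing. -->
  <mergeFactor>4</mergeFactor>
</indexDefaults>
```

Raising the OS file-descriptor limit (ulimit -n) is the complementary fix on the operating-system side.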
On Tue, Jul 6, 2010 at 1:14 PM, J
The index files are ill-formatted because the disk filled up while
feeding. Can I roll back to the last version? Is there any way to avoid
unexpected errors when indexing? Attachments are my segments_N files.
That's because deleting a document simply marks it as deleted;
it doesn't really do much else with it. All that work is deferred
to the optimize step, as you've found.
But deleted documents will NOT be found, even though the
admin page still shows their terms in the index.
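In update-message terms, the two steps look like this (the query value is only illustrative):

```xml
<!-- Marks matching documents as deleted; their term data stays on disk -->
<delete><query>field:value</query></delete>
<!-- Rewrites the index, physically dropping deleted docs and their terms -->
<optimize/>
```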
Best
Erick
On Tue, Jul 6
Underneath SOLR is Lucene. Here's a description of
Lucene's scoring algorithm (follow the "Similarity" link)
http://lucene.apache.org/java/2_4_0/scoring.html#Understanding%20the%20Scoring%20Formula
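In case the link moves, the formula described there is, in the notation of the Similarity javadoc:

```latex
\mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q)
  \cdot \sum_{t \in q} \Big( \mathrm{tf}(t,d) \cdot \mathrm{idf}(t)^2
  \cdot \mathrm{boost}(t) \cdot \mathrm{norm}(t,d) \Big)
```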
The letters in non-matching words aren't relevant; what is
relevant is the relationship between the number of sear
First, do you have a uniqueKey defined in your schema.xml? If you
do, some of those 300 rows could be replacing earlier rows.
You say: " if I have 200
rows indexed from postgres and 100 rows from Oracle, the full-import process
only indexes 200 documents from oracle, although it shows clearly that
Still not enough info.
Please show:
1> the field type (not field, but field type showing the analyzers for the
field you're interested in).
2> example data you've indexed
3> the query you submit
4> the response from the query (especially with &debugQuery=on appended to
the query).
Otherwise, it's
On Jul 6, 2010, at 3:44pm, Chris Hostetter wrote:
: Can you try "ant compile example"?
: After Lucene/Solr merge, solr ant build needs to compile before
example
: target.
the "compile" target is already in the dependency tree for the
"example"
target, so that won't change anything.
At
: (this is particularly odd since the nightlies include all the compiled
: lucene code as jars in a "lucene-libs/" directory, but the build system
: doesn't seem to use that directory ... at least not when compiling solrj).
https://issues.apache.org/jira/browse/SOLR-1989
-Hoss
: Can you try "ant compile example"?
: After Lucene/Solr merge, solr ant build needs to compile before example
: target.
the "compile" target is already in the dependency tree for the "example"
target, so that won't change anything.
At the moment, the "nightly" snapshots produced by hudson only
The CharFilters MUST come before the Tokenizer, due to their nature of
processing the character stream and not the tokens.
If you need to apply the accent normalization later in the analysis chain,
either use ISOLatin1AccentFilterFactory or help with the implementation of
SOLR-1978.
--
Jan Hø
Yes, you don't want to hard code permissions into your index - it will give
you headaches.
You might want to have a look at SOLR 1872:
https://issues.apache.org/jira/browse/SOLR-1872 .
This patch provides doc level security through an external ACL mechanism (in
this case, an XML file) controlling
(10/07/07 6:25), darknovan...@gmail.com wrote:
I'd like to try the new edismax feature in Solr, so I downloaded the
latest nightly (apache-solr-4.0-2010-07-05_08-06-42) and tried running
"ant example". It fails with a missing package error. I've pasted in
the output below. I tried a nightly fro
I'd like to try the new edismax feature in Solr, so I downloaded the latest
nightly (apache-solr-4.0-2010-07-05_08-06-42) and tried running "ant
example". It fails with a missing package error. I've pasted in the output
below. I tried a nightly from a couple weeks ago, and it did the same
t
That's exactly what it was. I forgot to commit.
Thanks,
Moazzam
On Tue, Jul 6, 2010 at 3:29 PM, Markus Jelsma wrote:
> Hi,
>
>
>
> If q=*:* doesn't show your insert, then you forgot the commit:
>
> http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
>
>
>
> Cheers,
>
Hi,
If q=*:* doesn't show your insert, then you forgot the commit:
http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
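In update-XML terms, the minimal sequence is (the document and field are only illustrative):

```xml
<add>
  <doc>
    <field name="id">1</field>
  </doc>
</add>
<!-- Nothing is visible to searchers until this is sent: -->
<commit/>
```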
Cheers,
-Original message-
From: Moazzam Khan
Sent: Tue 06-07-2010 22:09
To: solr-user@lucene.apache.org;
Subject: Solr results no
On Sat, Jul 3, 2010 at 1:10 PM, Lance Norskog wrote:
> You don't need to optimize, only commit.
OK, thanks for the tip, Lance. I thought the "too many open files"
problem was because I wasn't optimizing/merging frequently enough. My
understanding of your suggestion is that commit also does merg
Hi,
I just successfully inserted a document into Solr, but when I search
for it, it doesn't show up. Is it a cache issue or something? Is there
a way to make sure it was inserted properly, and that it's there?
Thanks,
Moazzam
: It fetches 5322 rows but doesn't process any documents and doesn't
: populate the index. Any suggestions would be appreciated.
I don't know much about DIH, but it seems weird that both of your entities
say 'rootEntity="false"'
looking at the docs, that definitely doesn't seem like what you
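For comparison, in the usual DIH setup only nested entities ever carry rootEntity="false"; a typical data-config layout (entity names and queries invented) looks like:

```xml
<document>
  <!-- rootEntity defaults to true: each row here becomes one Solr document -->
  <entity name="item" query="select id, name from item">
    <!-- A child entity is not a root entity; its rows only add fields
         to the parent's document -->
    <entity name="feature"
            query="select description from feature where item_id='${item.id}'"/>
  </entity>
</document>
```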
: What we are seeing is the request is dispatched to solr server,but its not
: being processed.
you'll have to explain what you mean by "not being processed" ?
According to your logs, DIH is in fact working and logging its
progress...
: 2010-06-14 12:51:01,328 INFO [org.apache.solr.core.SolrC
> Will quotes do an exact match within
> a proximity test?
No.
> If not, does anybody know how to accomplish this?
It is not supported out of the box. You need to plug in Lucene's
XmlQueryParser or SurroundQueryParser. Similar discussion:
http://search-lucene.com/m/PO3iXKRuAv1/
Will quotes do an exact match within a proximity test? For instance
body:""mountain goat" grass"~10
should match:
"the mountain goat went up the hill to eat grass"
but should NOT match
"the mountain where the goat lives is covered in grass"
If not, does anybody know how to accomplish this?
FYI - optimise() operations solved the issue.
Kumar_/|\_
www.saisk.com
ku...@saisk.com
"making a profound difference with knowledge and creativity..."
On Tue, Jul 6, 2010 at 11:47 AM, Kumaravel Kandasami <
kumaravel.kandas...@gmail.com> wrote:
> BTW, Using SOLRJ - javabin api.
>
>
>
> Kuma
BTW, Using SOLRJ - javabin api.
Kumar_/|\_
www.saisk.com
ku...@saisk.com
"making a profound difference with knowledge and creativity..."
On Tue, Jul 6, 2010 at 11:43 AM, Kumaravel Kandasami <
kumaravel.kandas...@gmail.com> wrote:
> Hi,
>
>How to delete the terms associated with the doc
Hi,
How do we delete the terms associated with a document?
Current scenario: We are deleting documents based on a query
('field:value').
The documents are getting deleted; however, the old terms associated with
the field are still displayed in the admin.
How do we make Solr re-evaluate and update
On Jul 6, 2010, at 8:27am, osocurious2 wrote:
Someone else was recently asking a similar question (or maybe it was
you but worded differently :) ).
Putting user level security at a document level seems like a recipe for
pain. Solr/Lucene don't do frequent update well...and being highly
Is there some sort of threshold I can tweak which sets how many letters
in non-matching words make a result more or less relevant?
Searching on title, q=fantasy football, and I get this:
{"title":"The Fantasy Football Guys",
"score":2.8387074},
{"title":"Fantasy Football Bums",
"score":2.8
Someone else was recently asking a similar question (or maybe it was you but
worded differently :) ).
Putting user level security at a document level seems like a recipe for
pain. Solr/Lucene don't do frequent update well...and being highly optimized
for query, I don't blame them. Is there any wa
Hi,
I have a SOLR installed on a Tomcat application server. This solr instance
has some data indexed from a postgres database. Now I need to add some
entities from an Oracle database. When I run the full-import command, the
documents indexed are only documents from postgres. In fact, if I have 200
Hi,
a bit more information would help to identify what the problem is in your
case, but in general these facts come to mind:
- leading wildcard queries are not available in solr (without extending the
QueryParser).
- no text analysing will be performed on the search word when using
wildcards
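Because of that second point, the client often has to mimic the index-time analysis itself before sending a wildcard query. A minimal sketch, assuming the indexed field is simply lowercased (your analyzer chain may do more):

```python
def prepare_wildcard_term(term):
    # Wildcard terms bypass the analysis chain, so if the field was
    # lowercased at index time, a query for 'Sol*' would never match the
    # indexed tokens unless the client lowercases the term itself.
    return term.lower()

print(prepare_wildcard_term("Sol*"))  # sol*
```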
I've a question about indexing/searching techniques in relation to
document-level security.
I'm planning a system that has, let's say, about 1 million search documents
with about 100 search fields each. Most of them are unstored to keep the
index size low, because some of them can contain some kilobytes
Hi,
thanks for the reply. I am an absolute beginner with Solr.
I have taken, for a start, the configuration from
{solr.home}/example/solr .
In solrconfig.xml all query parsers are commented out ;-( Where can I
find the QueryParser? Javadoc, Wiki?
Regards,
Robert
2010/7/6 Mark Miller :
> On
> If you do distributed indexing correctly, what about updating the documents
> and what about replicating them correctly?
Yes, you can do that and it'll work great.
On Mon, Jul 5, 2010 at 7:42 AM, MitchK wrote:
>
> I need to revive this discussion...
>
> If you do distributed indexing correctly,
On 7/6/10 8:53 AM, Robert Naczinski wrote:
> Hi,
>
> we use in our application EmbeddedSolrServer.
Great!
> Everything went fine.
Excellent!
> Now I want to use wildcard queries.
Cool!
>
> It does not work.
Bummer!
> Must the schema.xml be adapted?
Not necessarily...
>
> Can someon
Hi,
we use EmbeddedSolrServer in our application. Everything went fine.
Now I want to use wildcard queries.
It does not work. Must the schema.xml be adapted?
Can someone help me? I find nothing in the wiki. I just need a simple
example or a link.
Regards,
Robert
On 6/28/2010 8:28 AM, Alexey Serba wrote:
Ok, I'm trying to integrate the TikaEntityProcessor as suggested. I'm using
Solr Version: 1.4.0 and getting the following error:
java.lang.ClassNotFoundException: Unable to load BinURLDataSource or
org.apache.solr.handler.dataimport.BinURLDataSource
It
On Jul 4, 2010, at 5:10 PM, Andrew Clegg wrote:
Mark Miller-3 wrote:
On 7/4/10 12:49 PM, Andrew Clegg wrote:
I thought so but thanks for clarifying. Maybe a wording change on the
wiki
Sounds like a good idea - go ahead and make the change if you'd like.
That page seems to be marked
Hi,
Chris Hostetter wrote:
AND, OR, and NOT are just syntactic-sugar for modifying
the MUST, MUST_NOT, and SHOULD. The default op of "OR" only affects the
first clause of your query (R) because it doesn't have any modifiers --
Thanks for pointing that out!
-Sascha
the second clause has that