Hi,
We are planning to use Solr for processing a large volume of application log
files (around 10 billion documents, 5-6 TB in size).
One of the approaches we are considering is to use Distributed
Search extensively.
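
A rough sketch of what a Distributed Search request could look like, assuming Solr's `shards` request parameter. The host names and the `level` field are hypothetical placeholders, not part of the original setup.

```python
from urllib.parse import urlencode

# Hypothetical shard hosts; each is assumed to hold one slice of the log index.
SHARDS = ["loghost1:8983/solr", "loghost2:8983/solr", "loghost3:8983/solr"]


def distributed_query_url(coordinator, query, shards):
    """Build a select URL asking one Solr node to fan the query out to all shards."""
    params = {"q": query, "shards": ",".join(shards)}
    return "http://" + coordinator + "/select?" + urlencode(params)


# Any shard can act as the coordinator; the first one is used here.
url = distributed_query_url(SHARDS[0], "level:ERROR", SHARDS)
print(url)
```

The node that receives the request merges the per-shard results, so the client only ever talks to one host.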
What we have in mind is distributing the log files in multiple bo
On Tue, Dec 2, 2008 at 9:29 PM, Jake Conk <[EMAIL PROTECTED]> wrote:
> I am trying to exclude certain records from my search results in my
> query by specifying which ones I don't want back but it's not working
> as expected. Here is my query:
>
> +message:test AND (-thread_id:123 OR -thread_id:456
Hi everyone,
I'm wondering if the MoreLikeThis handler takes the boost function
parameter into account for the scoring (hence the sorting I guess) of
the similar documents it finds.
Thanks for your help !
Jerome.
--
Jerome Eteve.
Chat with me live at http://www.eteve.net
[EMAIL PROTECTED]
This is really cool. U... How does it integrate with the Data Import
Handler?
Lance
-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Friday, December 05, 2008 8:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Dealing with field values as key/value pairs
:
The default encoding on Windows is not UTF-8. This causes various weirdness
when you develop on Windows. This has helped me find all places in
string-handling that need the encoding name parameter, so it's not all bad.
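
As an illustration only (in Python rather than the language of the original code, which is not stated here), this is roughly what passing the encoding name explicitly looks like, together with the kind of garbling a Windows code page produces when the encoding is left implicit. The sample text and file name are made up.

```python
import os
import tempfile

text = "café, naïve, 日本語"

# Write and read with an explicit encoding instead of the platform default.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()

# Decode the same bytes as Windows-1252 to mimic reading the file with a
# non-UTF-8 default encoding.
with open(path, "rb") as f:
    garbled = f.read().decode("cp1252", errors="replace")

print(round_tripped == text)
print(garbled == text)
```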
Lance
-Original Message-
From: tushar kapoor [mailto:[EMAIL PROTECTE
On Sat, Dec 6, 2008 at 10:59 PM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>
> Hey there,
> I am hacking on some parts of the Solr source. I am building a feature so that
> every time I use the delta import handler, it starts getting info from
> the db starting from the last indexed document id
This sounds a little like my original problem of deltaQuery imports
per entity ...
https://issues.apache.org/jira/browse/SOLR-783
I wonder if those 2 hacks could be combined to fix the issue.
- Jon
On Dec 6, 2008, at 12:29 PM, Marc Sturlese wrote:
Hey there,
I am doing some hacks to some
Hey there,
I am hacking on some parts of the Solr source. I am building a feature so that
every time I use the delta import handler, it starts getting info from
the db starting from the last indexed document id (from the latest
execution).
The point of doing that is that if I start a full imp
Can you retrieve those thread_ids as is? That is, if you query for
thread_id:456 (w/o all the other stuff) what happens?
Also, try adding &debugQuery=true to your input parameters. This
should give you some more information about how the query was parsed,
etc.
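
A minimal sketch of the request suggested above: query for the bare thread_id with debugQuery=true appended. The host, port and handler path are assumptions about a default local install.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust to the real deployment.
SOLR_SELECT = "http://localhost:8983/solr/select"


def debug_url(query):
    """Return a select URL for `query` with query-parsing debug output enabled."""
    params = {"q": query, "debugQuery": "true"}
    return SOLR_SELECT + "?" + urlencode(params)


print(debug_url("thread_id:456"))
```

The debug section of the response shows the parsed query, which makes it easier to see how the negative clauses were interpreted.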
HTH,
Grant
On Dec 2, 2008
I don't think there is, since storage (or term vectors, but that
likely won't save you any space) is the only place that Solr has the
content stored in the correct "order". Namely, for searching,
documents are split up into an inverted index and it is really
cumbersome to recreate a docume
Typically this is handled through Solr's built-in replication
capabilities. This is commonly referred to as a master/slave or
master/worker setup whereby indexing takes place in one instance of
Solr, and then the worker nodes pull snapshots from the master on a
regular basis (I've seen pe
Hi Grant,
Q. Why do you need two web apps pointing to the same Solr data
directory?
A. I am planning to deploy Solr in a load-balanced environment where
there will be 3 web servers and 3 app servers. So there will be a Solr web app
deployed on each of the 3 app servers and there will be 1 SOLR_DATA fold
Hi Kumar,
Wow, a brave soul trying out Solr Cell (aka the
ExtractingRequestHandler) already! Cool!
To add in external metadata, you can pass in literal parameters, as in:
In your example, you could do something like:
&ext.literal.Category=Alphabets&ext.literal.Catalog_ID=1213123
This will
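
A small sketch of how such literal parameters could be assembled programmatically. The `ext.literal.` prefix is taken from the example above; the helper name `literal_params` is made up for illustration.

```python
from urllib.parse import urlencode


def literal_params(metadata, prefix="ext.literal."):
    """Turn a dict of external metadata into URL-encoded literal parameters."""
    return urlencode({prefix + name: value for name, value in metadata.items()})


query_string = literal_params({"Category": "Alphabets", "Catalog_ID": "1213123"})
print(query_string)
```

The resulting string can be appended to the request URL used to post the document.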
About the only thing you can do here is to increase the readAheadLimit
on the BufferedReader, but, by the looks of it, that also means we
need to modify the TokenStream Factories that create the
HTMLStripReader so that they take in some optional attributes. If you
can open a JIRA issue for
In Lucene (hence Solr) only one IndexWriter may write to an index at a
time (by design), so pointing two separate Solr instances at the same
index will result in the lock issue you describe.
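
To illustrate the one-writer-at-a-time rule, here is a simplified analogy using an exclusive lock file. It is not Lucene's actual locking code; the helper names and the exception class are hypothetical.

```python
import os
import tempfile


class IndexLockedError(Exception):
    """Raised when another writer already holds the lock for the index."""


def acquire_write_lock(index_dir):
    """Create the lock file atomically; fail if it already exists."""
    lock_path = os.path.join(index_dir, "write.lock")
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise IndexLockedError("index already has a writer: " + lock_path)
    os.close(fd)
    return lock_path


def release_write_lock(lock_path):
    """Remove the lock file so that another writer may open the index."""
    os.remove(lock_path)


index_dir = tempfile.mkdtemp()
first_lock = acquire_write_lock(index_dir)   # first web app opens a writer
try:
    acquire_write_lock(index_dir)            # second web app tries the same index
    second_writer_blocked = False
except IndexLockedError:
    second_writer_blocked = True

release_write_lock(first_lock)
reacquired = acquire_write_lock(index_dir)   # works once the first writer is gone
```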
I guess the question back to you is, why do you need two web apps
pointing to the same Solr data dir
Hi,
Just adding some additional steps I have tried.
I have tried the following scenarios for testing this.
FIRST
1. Index 5000 docs to Solr using the FIRST web app
2. Send a commit command from the SECOND web app
3. Try indexing docs from the SECOND web app.
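
The scenario above can be sketched as three update requests, assuming Solr's XML update messages posted to each web app's /update handler. The base URLs and the document fields are hypothetical, and the requests are only built here, not sent.

```python
from urllib.request import Request

# Hypothetical base URLs for the two web apps sharing one data directory.
FIRST = "http://localhost:8080/solr1"
SECOND = "http://localhost:8080/solr2"


def update_request(base_url, xml_body):
    """Build (but do not send) a POST to the /update handler of one web app."""
    return Request(
        base_url + "/update",
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


steps = [
    update_request(FIRST, '<add><doc><field name="id">1</field></doc></add>'),
    update_request(SECOND, "<commit/>"),
    update_request(SECOND, '<add><doc><field name="id">2</field></doc></add>'),
]

for step in steps:
    print(step.get_method(), step.full_url, step.data.decode("utf-8"))
```

Passing each request to `urllib.request.urlopen` would send it to a running instance.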
Hi,
I want to get snippets along with my results. For this, I use the
Highlighting Feature to return the context of fragment size 10.
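
For reference, a sketch of such a highlighting request, assuming Solr's standard `hl`, `hl.fl` and `hl.fragsize` parameters. The field name `content` and the query are placeholders.

```python
from urllib.parse import urlencode

# Placeholder query and field name; the fragment size of 10 is from the message.
params = {
    "q": "content:solr",
    "hl": "true",          # turn highlighting on
    "hl.fl": "content",    # field(s) to build snippets from
    "hl.fragsize": "10",   # requested fragment size
}
highlight_query = urlencode(params)
print(highlight_query)
```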
Some of the documents are very large (over 30 MB) in size and the
Highlighting feature works only for stored fields. So this makes it
necessary for me to store
Hi Steve,
You were right, it turned out to be an encoding issue, but a really weird
one. I was using Windows Notepad to save the stopwords file in UTF-8
encoding. On the other hand, I was using EditPlus to save the synonyms file. That
was the only difference. The moment I switched to EditPlus for sa
Hi,
I need to add some external metadata along with the documents I send to
ExtractingRequestHandler. Can someone please tell me how I can achieve
this?
E.g. say I need to index the file abc.pdf. I want to add some
additional information to the metadata such as Category = Alphabets,
Catalog
Hi,
Please help me with the following scenario.
I have a solr data folder "SOLR_DATA"
I have 2 web applications, solr1 and solr2, referring to the same
SOLR_DATA folder.
I am trying to index data using solr1/update and solr2/update
sequentially.
Indexing using the first