I'm using Solr to index our file servers (480K files).
If I don't optimize, I get a "too many open files" error at about 450K files
and a 3 GB index.
If I optimize, I get this stack trace during the commit of every
subsequent update:
java.io.FileNotFoundException: no segments* file
found in
org.
Are you using embedded Solr?
I had stumbled on a similar error:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg06085.html
-V
On Nov 12, 2007 2:16 PM, SDIS M. Beauchamp <[EMAIL PROTECTED]> wrote:
> I'm using solr to index our files servers ( 480K files )
>
> If I don't optimize, I '
What is your specific SolrQuery?
Calling:
query.setQuery( " stuff with spaces " );
does not call trim(), but some other calls do.
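To stay on the safe side, an indexing or querying client can trim the query string itself before sending it, rather than relying on the client library. A minimal Python sketch of that idea (the localhost URL is an assumption):

```python
from urllib.parse import urlencode

def build_select_url(base_url, raw_query):
    # Trim surrounding whitespace ourselves instead of relying on the
    # client library to call trim() for us.
    params = {"q": raw_query.strip()}
    return base_url + "/select?" + urlencode(params)

url = build_select_url("http://localhost:8983/solr", "  stuff with spaces  ")
```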
My query looks like this, e.g.:
(myField:"_T8sY05EAEdyU7fJs63mvdA" OR myField:"_T8sY0ZEAEdyU7fJs63mvdA"
OR myField:"_T8sY0pEAEdyU7fJs63mvdA") AND NOT
myField:"_T8sY1JEAE
No, I'm using a custom indexer, written in C#, which submits content using
POST requests.
I let Lucene manage the index on its own.
Florent BEAUCHAMP
-Original Message-
From: Venkatraman S [mailto:[EMAIL PROTECTED]
Sent: Monday, November 12, 2007 10:19 AM
To: solr-user@lucene.apache.or
Hello,
Until now, I've used two instances of Solr, one for each of my collections; it
works fine, but I wonder
if there is an advantage to using multiple indexes in one instance over several
instances with one index each.
Note that the two indexes have different schema.xml.
Thanks.
PL
> Date:
Hello,
I would like to use Solr to return ranges of searches on an integer
field. If I write offset:[0 TO 10] in the URL, it returns only the documents
with offset values 0, 1, and 10, but I want it to return the whole range
0, 1, 2, 3, 4, ..., 10. How can I do that with Solr?
Thanks in advance
Best rega
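The behaviour described above is typically what happens when the integer is indexed as a plain string field: range queries then compare lexicographically, so only "0", "1", and "10" fall between "0" and "10". In Solr a sortable integer field type (sint in the example schema) makes ranges behave numerically; zero-padding the values is the manual workaround. A Python sketch of both effects:

```python
values = ["0", "1", "2", "3", "4", "10"]

# Lexicographic comparison, as with a plain string field:
# only "0", "1", "10" fall inside the range ["0", "10"].
in_range = [v for v in sorted(values) if "0" <= v <= "10"]

# Zero-padding makes lexicographic order agree with numeric order.
padded = [v.zfill(5) for v in values]
in_range_padded = [v for v in sorted(padded) if "00000" <= v <= "00010"]
```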
I'd say yes. Solr supports Unicode and ships with language specific
analyzers, and allows you to provide your own custom analyzers if you
need them. This allows you to create different definitions
for the languages you want to support. For example here is an example
field type for French text whic
The advantages of a multi-core setup are configuration flexibility and
dynamically changing available options (without a full restart).
For high-performance production solr servers, I don't think there is
much reason for it. You may want to split the two indexes on to two
machines. You may w
On Nov 12, 2007 8:02 AM, Heba Farouk <[EMAIL PROTECTED]> wrote:
> I would like to use solr to return ranges of searches on an integer
> field, if I wrote in the url offset:[0 TO 10], it returns documents
> with offset values 0, 1, 10 only but I want to return the range 0,1,2,
> 3, 4 ,10. How
In my system, the heap size (old generation) keeps growing, caused by
heavy traffic.
I have adjusted the size of the young generation, but it does not help.
Does anyone have any recommendation regarding this issue? - Solr
configuration and/or web.xml ...etc...
Thanks,
Jae
Here is my situation.
I have 6 million articles indexed and am adding about 10K articles every day.
If I maintain only one index, whenever the daily feed is running, it
consumes the heap area and causes full GCs.
I am thinking of having multiple indexes - one for the ongoing query
service and o
just use the standard collection distribution stuff. That is what it is
made for! http://wiki.apache.org/solr/CollectionDistribution
Alternatively, open up two indexes using the same config/dir -- do your
indexing on one and the searching on the other. When indexing is done
(or finishes a
For starters, do you need to be able to search across groups or
sub-groups (in one query?)
If so, then you have to stick everything in one index.
You can add a field to each document saying what 'group' or 'sub-group'
it is in and then limit it at query time
q="kittens +group:A"
The advant
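A sketch of what such a group-restricted query looks like over plain HTTP, using a filter query (fq) for the group clause (the URL and field name are assumptions):

```python
from urllib.parse import urlencode

# Restrict a keyword search to one group via a filter query;
# Solr caches the filter separately from the main query.
params = {"q": "kittens", "fq": "group:A"}
url = "http://localhost:8983/solr/select?" + urlencode(params)
```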
Hi Guys
How do we add Word documents / PDF / text / etc. documents in Solr? How is the
content of the files stored or indexed? Are the documents stored
as XML in the filesystem?
Regards
Dwarak R
- Original Message -
From: "Ryan McKinley" <[EMAIL PROTECTED]>
To:
Sent: Mon
Ryan,
We currently have 8-9 million documents to index, and this number will grow in
the future. Also, we will never have a query that searches across groups,
but we will certainly have queries that search across sub-groups.
Now, keeping this in mind, we were wondering whether we could have mul
On Nov 12, 2007 3:46 AM, SDIS M. Beauchamp <[EMAIL PROTECTED]> wrote:
> If I don't optimize, I 've got a too many files open at about 450K files
> and 3 Gb index
You may need to increase the number of file descriptors in your system.
If you're using Linux, see this:
http://www.cs.uwaterloo.ca/~brec
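On Linux you can check and raise the per-process limit with ulimit -n; from inside a running process the same limit is visible via Python's resource module. A small sketch:

```python
import resource

# Current soft/hard limits on open file descriptors for this process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft:", soft, "hard:", hard)

# An unprivileged process may raise its soft limit up to the hard limit
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```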
Hello,
Does Solr support multiple instances within the same web application? If
so, how is this achieved?
Thanks in advance.
Regards,
Dilip TS
Hi
I found the thread about enabling leading wildcards in
Solr as an additional option in the config file. I've got a nightly Solr build
and I can't find any options related to leading wildcards in the
config files.
How can I enable leading wildcard queries in Solr? Thank you
--
rtfm :)
http://lucene.apache.org/solr/tutorial.html
On Nov 12, 2007 4:33 PM, Dwarak R <[EMAIL PROTECTED]> wrote:
> Hi Guys
>
> How do we add word documents / pdf / text / etc documents in solr ?. How do
> the content of the files are stored or indexed ?. Are these documents stored
> as XML in th
The related bug is still open:
http://issues.apache.org/jira/browse/SOLR-218
Bill
On Nov 12, 2007 10:25 AM, Traut <[EMAIL PROTECTED]> wrote:
> Hi
> I found the thread about enabling leading wildcards in
> Solr as additional option in config file. I've got nightly Solr build
> and I
Dilip.TS wrote:
Hello,
Does Solr support multiple instances within the same web application? If
so, how is this achieved?
If you want multiple indices, you can run multiple web-apps.
If you need multiple indices in the same web-app, check SOLR-350 -- it
is still in development, and make s
Highly unfortunate!
On Nov 12, 2007 9:07 PM, Traut <[EMAIL PROTECTED]> wrote:
> rtfm :)
> http://lucene.apache.org/solr/tutorial.html
>
> On Nov 12, 2007 4:33 PM, Dwarak R <[EMAIL PROTECTED]> wrote:
> > Hi Guys
> >
> > How do we add word documents / pdf / text / etc documents in solr ?. How
> do
: I'm trying to obtain faceting information based on the first 'x' (let's say
: 100-500) results matching a given (dismax) query. The actual documents
: matching the query are not important in this case, so intuitively the
can you elaborate on your use case ... the only time i've ever seen people
It seems there is no way to enable leading wildcard queries except by
editing the code and repacking the files. :(
On 11/12/07, Bill Au <[EMAIL PROTECTED]> wrote:
> The related bug is still open:
>
> http://issues.apache.org/jira/browse/SOLR-218
>
> Bill
>
> On Nov 12, 2007 10:25 AM, Traut <[EMAIL PROTECTE
Vote for that issue and perhaps it'll gain some more traction. A former
colleague of mine was the one who contributed the patch in SOLR 218 and it
would be nice to have that configuration option 'standard' (if off by
default) in the next SOLR release.
On Nov 12, 2007 11:18 AM, Traut <[EMAIL PROT
I have built the master Solr instance and indexed some files. Once I run
snapshooter, it complains with the error below. - snapshooter -d data/index (in
the solr/bin directory)
Did I miss something?
++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2007/11/12 12:38:40 taking snapshot
/solr/master/solr/data/index/snapshot.2
Thanks Ryan,
This looks like the way to go. However, when I set up my schema I get
"Error loading class 'solr.EdgeNGramFilterFactory'". For some reason
the class is not found. I tried the stable 1.2 build and even tried the
nightly build. I'm using "".
Any suggestions?
Thanks,
Mike
-Or
: "Error loading class 'solr.EdgeNGramFilterFactory'". For some reason
EdgeNGramFilterFactory didn't exist when Solr 1.2 was released, but the
EdgeNGramTokenizerFactory did. (the javadocs that come with each release
list all of the various factories in that release)
-Hoss
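For context, an edge n-gram filter indexes the leading prefixes of each token, which is what makes prefix-style autosuggest fast. A rough Python illustration of the grams such a filter would emit (the min/max gram parameters are assumptions):

```python
def edge_ngrams(term, min_gram=1, max_gram=4):
    # Front-side grams: prefixes of length min_gram..max_gram,
    # capped at the length of the term itself.
    return [term[:n] for n in range(min_gram, min(max_gram, len(term)) + 1)]

grams = edge_ngrams("solr")
```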
Is there a way to define a query such that the search result
contains only one representative of every set of documents that are
equal on a given field (it is not important which representative
document), i.e. to have the DISTINCT ON concept from relational
databases in Solr?
If this ca
I gather that the standard Solr query parser uses the same syntax for
proximity searches as Lucene, and that Lucene syntax is described at
http://lucene.apache.org/java/docs/queryparsersyntax.html#Proximity%20Searches
This syntax lets me look for terms that are within x words of each
other. The
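For reference, a small sketch of building such a proximity query string in the syntax that page describes (two terms within N positions of each other):

```python
def proximity_query(term1, term2, max_distance):
    # Lucene/Solr proximity syntax: "term1 term2"~N
    return f'"{term1} {term2}"~{max_distance}'

q = proximity_query("apache", "lucene", 10)
```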
DISCLAIMER: This is from a Lucene-centric viewpoint. That said, this may be
useful
For your line number, page number etc perspective, it is possible to index
special guaranteed-to-not-match tokens then use the termdocs/termenum
data, along with SpanQueries to figure this out at search time. Fo
Erik,
Probably because of my newness to Solr/Lucene, I now see what you/Yonik meant
by the "case" field, but I am not clear about your wording "per-book setting
attached at index time" - would you mind elaborating on that, so I am clear?
Dave
- Original Message
From: Erik Hatcher <[EMAIL
Will I need to use Solr 1.3 with the EdgeNGramFilterFactory in order to
get the autosuggest feature?
-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: Monday, November 12, 2007 1:05 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr + autocomplete
: "Error loadi
All,
I am working with very exact text and search over permanent documents (books).
It would be great to associate pronouns like he, she, him, her, I, my, etc.
with the actual author or person the pronoun refers to. I can see how I could
get pretty darn close with the synonym feature in Lucen
Hi Chris,
I gather that the standard Solr query parser uses the same syntax for
proximity searches as Lucene, and that Lucene syntax is described at
http://lucene.apache.org/java/docs/queryparsersyntax.html#Proximity%20Searches
This syntax lets me look for terms that are within x words of each
Attempting to answer my own question, which I should probably just try,
assuming I can doctor the indexed text -- I suppose I could do something like
change all instances of I, he, etc. that refer to one person to IJBA, HEJBA,
HIMJBA (making sure they would never equal a normal word) -- then use t
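That substitution could be done as a preprocessing pass before indexing. A naive sketch, assuming the JBA-suffixed tokens from the message above stand in for one resolved person (real coreference resolution would be needed to pick the right referent for each pronoun):

```python
import re

# Hypothetical mapping from pronoun to a unique token for one person
PRONOUN_MAP = {"I": "IJBA", "he": "HEJBA", "him": "HIMJBA"}

def tag_pronouns(text, mapping):
    # Whole-word replacement only, so "him" does not match inside "himself"
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(0)], text)

tagged = tag_pronouns("he saw him", PRONOUN_MAP)
```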
Erik - thanks, I am considering this approach, versus explicit redundant
indexing -- and am also considering Lucene -- the problem is, I am one week into
both technologies (though I have years in the search space) -- wish I could go to
Hong Kong -- any discounts available anywhere :)
Dave
- Orig
On Nov 12, 2007 2:20 PM, David Neubert <[EMAIL PROTECTED]> wrote:
> Erik - thanks, I am considering this approach, versus explicit redundant
> indexing -- and am also considering Lucene -
There's not a well defined solution in either IMO.
> - problem is, I am one week into both technologies (tho
: > - problem is, I am one week into both technologies (though have years in
the search space) -- wish I could
: > go to Hong Kong -- any discounts available anywhere :)
:
: Unfortunately the OS Summit has been canceled.
Or rescheduled to 2008 ... depending on whether you are a half-empty /
hal
All,
I have found (from using the Admin/Analysis page) that if I were to append
unique initials (that didn't match any other word or acronym) to each pronoun
(e.g. I-WCN, she-WCN, my-WCN, etc.) the default parsing and tokenization
for the text field in Solr might actually do the trick -- it p
On 13/11/2007, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> can you elaborate on your use case ... the only time i've ever seen people
> ask about something like this it was because true facet counts were too
> expensive to compute, so they were doing "sampling" of the first N
> results.
>
> In
Currently this functionality is not available in Solr out of the box;
however, there is a patch implementing field collapsing
(http://issues.apache.org/jira/browse/SOLR-236) which might be similar to what
you are trying to achieve.
Piete
On 13/11/2007, Jörg Kiegeland <[EMAIL PROTECTED]> wrote:
>
>
: It's not really a performance-related issue, the primary goal is to use the
: facet information to determine the most relevant product category related to
: the particular search being performed.
ah ... ok, i understand now. the order does matter, you want the "top N"
documents sorted by some
If I understand correctly, you can just do it like this (I use PHP):
$data1 = getDataFromInstance1($url);
$data2 = getDataFromInstance2($url);
You just run multiple Solr instances and fetch the data from each one.
On Nov 12, 2007 11:15 PM, Dilip.TS <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Does SOLR supports
If you use Tomcat, its default port is 8080 (plus other default ports),
so just use another Tomcat instance on port 8181 and other ports (I remember you
need to modify three ports per Tomcat instance).
I used to run four Tomcat instances on one server.
On Nov 9, 2007 7:39 AM, Isart Montane <[EMAIL PROTECTED]> wrote:
> Hi all,
>