Hello,
I have two sources of data for the same "things" to search. It is book
data in a library. First there is the usual bibliographic data (author,
title...) and then I have scanned and OCRed table of contents data about
the same books. Both are updated independently.
Now I don't know how t
: How do I access the ValueSource for my DateField? I'd like to use a
: ReciprocalFloatFunction from inside the code, adding it aside others in the
: main BooleanQuery.
The FieldType API provides a getValueSource method (so every FieldType
picks its own best ValueSource implementation).
-Hoss
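In rough, untested pseudocode against the 1.x API (the field name and the reciprocal constants below are placeholders, not from the original mail), that looks something like:

```
// hypothetical field "myDate"; check your Solr version's exact signatures
SchemaField field = schema.getField("myDate");
ValueSource vs = field.getType().getValueSource(field);
// ReciprocalFloatFunction computes a/(m*x + b); constants are placeholders
Query dateBoost = new FunctionQuery(
    new ReciprocalFloatFunction(vs, 1.0f, 1000f, 1000f));
mainQuery.add(dateBoost, BooleanClause.Occur.SHOULD);
```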
: > I know this is a lot and I'm going to decrease it, I was just experimenting,
: > but I need some guidelines of how to calculate the right size of the cache.
:
: Each filter that matches more than ~3000 documents will occupy maxDocs/8 bytes
: of memory. Certain kinds of faceting require one en
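The arithmetic can be made concrete with a small sketch (the 10M-document index size is a made-up example):

```java
public class FilterCacheMath {
    // A cached filter over more than ~3000 matching docs is stored as a
    // bitset: one bit per document in the index, i.e. maxDocs / 8 bytes.
    static long bytesPerFilter(long maxDocs) {
        return maxDocs / 8;
    }

    public static void main(String[] args) {
        long maxDocs = 10000000L; // hypothetical 10M-doc index
        // 10,000,000 / 8 = 1,250,000 bytes, about 1.2 MB per cached filter
        System.out.println(bytesPerFilter(maxDocs) + " bytes per cached filter");
    }
}
```

Multiply that by the filter cache's maxSize to bound the worst-case footprint.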
: INFO: {add=[10485, 10488, 10489, 10490, 10491, 10495, 10497, 10498, ...(42
: more)
: ]} 0 875
:
: However, when timing this instruction on the client side (I use SolrJ -->
: req.process(server)) I get totally different numbers (in the beginning the
: client-side measured time is about 2 seconds
:
: Hello, is this possible to do in one query: I have a query which returns
: 1000 documents with names and addresses. I can run a facet on the state field
: and see how many addresses I have in each state. But I also need to see
: how many families live in each state. So as a result I need a matrix
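(Plain field faceting returns independent per-field counts, not a cross-tab; assuming a "family" field exists, a single request can at least return both lists:)

```
http://localhost:8983/solr/select?q=...&facet=true&facet.field=state&facet.field=family
```

A true state-by-family matrix needs one facet.query per cell, or one filtered request per state.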
: searches. That is fine by me. But I'm still at the first question:
: How do I conduct a wildcard search for ARIZONA on a solr.TextField? I tried
As I said: it really depends on what kind of "index" analyzer you have
configured for the field -- the query analyzer isn't used at all when
dealing
: I have encountered a problem concerning the wildcard. When I search for
: field:testword I get 50 results. That's OK, but when I search for
: field:testwor* I get just 3 hits! I only get words returned without a
: whitespace after the term, like "testwordtest", but I won't find any single
: "testwo
: In my schema I have a multivalued field, and the values of that field are
: "stored" and "indexed" in the index. I wanted to know if it's possible to
: restrict the number of multiple values being returned from that field on a
: search? And how? Because, let's say, if I have thousands of values i
: Does anyone have more experience doing this kind of stuff and wants to share?
My advice: don't.
I work with (or work with people who work with) about two dozen Solr
indexes -- we don't attempt to update a single one of them in any sort of
transactional way. Some of them are updated "real time"
: The problem is that when I use the 'cd' request handler, the facet count for
: 'dvd' provided in the response is 0 because of the filter query used to only
: show the 'cd' facet results. How do I retrieve facet counts for both
: categories while only retrieving the results for one category?
the
: Is there any way to make the DisMaxRequestHandler a bit more forgiving with
: user queries? I'm only getting results when the user enters a close-to-perfect
: match. I'd like to allow near matches if possible, but I'm not sure
: how to add something like this when special query syntax isn't allowed
: I may be mistaken, but this is not equivalent to my query. In my query I have
: matches for x1, matches for x2 without slop and/or boosting, and then a match
: for "x1 x2" (exact match) with slop (~a) and boost (^b) in order to have
: exact-match results score better.
: The total score is the
: For this exact example, use the WordDelimiterFilter exactly as
: configured in the "text" fieldType in the example schema that ships
: with solr. The trick is to then use some slop when querying.
:
: FT-50-43 will be indexed as FT, 50, 43 / 5043 (the last two tokens
: are in the same position)
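For reference, the index analyzer being described looks roughly like this in the example schema (abridged; attribute values are from memory, so double-check your copy):

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splits FT-50-43 into FT / 50 / 43; catenateNumbers="1" also
         emits 5043 at the same position as the last number part -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A phrase query with a little slop, e.g. field:"FT-50-43"~1, then matches either tokenization.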
On Wed, 16 Jan 2008 16:54:56 +0100
"Philippe Guillard" <[EMAIL PROTECTED]> wrote:
> Hi here,
>
> It seems that Lucene accepts any kind of XML document but Solr accepts only
> flat name/value pairs inside a document to be indexed.
> You'll find below what I'd like to do. Thanks for help of any kind!
All,
I'm new to Solr and Tomcat and I'm trying to track down some odd errors.
How do I set up Tomcat to do fine-grained Solr-specific logging? I have
looked around enough to know that it should be possible to do per-webapp
logging in Tomcat 5.5, but the details are hard to follow for a newbie. A
I did see that bug, which made me suspect Lucene. In my case, I tracked down
the problem. It was my own application. I was using Java's
FileChannel.transferTo functions to copy my index from one location to another.
One of the files was bigger than 2^31-1 bytes, so that file was corrupted.
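The usual defensive pattern (a sketch, not the poster's actual code) is to loop, because transferTo may copy fewer bytes than requested, and a single call over a region larger than 2^31-1 bytes can misbehave on some platforms:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class SafeCopy {
    // Copy src to dst, looping so that partial transfers (and large files)
    // are handled correctly instead of silently truncating the output.
    public static void copy(File src, File dst) throws IOException {
        FileInputStream fis = new FileInputStream(src);
        FileOutputStream fos = new FileOutputStream(dst);
        try {
            FileChannel in = fis.getChannel();
            FileChannel out = fos.getChannel();
            long pos = 0;
            long size = in.size();
            while (pos < size) {
                // transferTo returns how many bytes were actually moved
                pos += in.transferTo(pos, size - pos, out);
            }
        } finally {
            fis.close();
            fos.close();
        }
    }
}
```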
Thanks, Otis. I will take a look at those profiling tools.
Best,
Dave
On 1/16/08, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
>
> David,
> I bet you can quickly identify the source using YourKit or another Java
> profiler. The jmap command-line tool might also give you some direction.
>
> Otis
>
Yonik,
I pulled SimplePostTool apart, pulled out the main() and the postFiles() and
just use it directly in Java via postFile() -> postData(). It seems to work
OK. Maybe I should upgrade to v1.3 and try doing things directly through
Solrj. Is 1.3 stable yet? Might that be a better plan altogether?
This may be a Lucene bug... IIRC, I saw at least one other lucene user
with a similar stack trace. I think the latest lucene version (2.3
dev) should fix it if that's the case.
-Yonik
On Jan 16, 2008 3:07 PM, Kevin Osborn <[EMAIL PROTECTED]> wrote:
> I am using the embedded Solr API for my index
Has anyone done any work integrating Dojo-based applications with Solr? I am
pretty new to both, but I wondered if anyone had developed an XSL for Solr
that returns Solr queries in Dojo data store format - JSON, but a specific
format of JSON. I am not even sure if this is sensible/possible.
Our basic setup is master/slave. We just want to make sure that we are not
syncing against an index that is in the middle of a large rebuild. But, I think
these issues are still separate from what I am experiencing.
I also tried this same scenario in a different development environment. No
problems.
David,
I bet you can quickly identify the source using YourKit or another Java
profiler. The jmap command-line tool might also give you some direction.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: David Thibault <[EMAIL PROTECTED]>
To: solr-
Kevin,
Perhaps you want to look at how Solr can be used in a master-slave setup. This
will separate your indexing from searching. Don't have the URL, but it's on
zee Wiki.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Kevin Osborn <[EMA
From your stack trace, it looks like it's your client running out of
memory, right?
SimplePostTool was meant as a command-line replacement to curl to
remove that dependency, not as a recommended way to talk to Solr.
-Yonik
On Jan 16, 2008 4:29 PM, David Thibault <[EMAIL PROTECTED]> wrote:
> OK,
It is more of a file structure thing for our application. We build in one place
and do our index syncing in a different place. I doubt it is relevant to this
issue, but figured I would include this information anyway.
- Original Message
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: sol
Do you trust the spellchecker 100%? (Not looking at its source now.) I'd peek
at the index with Luke (Luke I trust :)) and see if that term is really there
first.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Doug Steigerwald <[EMAIL PROTECT
Kevin,
Don't have the answer to the EOF, but I'm wondering why the index is moving. You
don't need to do that as far as Solr is concerned.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Kevin Osborn <[EMAIL PROTECTED]>
To: Solr
Sent: Wednesday,
I think you should try isolating the problem.
It may turn out that the problem isn't really to do with Solr, but with file
uploading.
I'm no expert, but that's what I'd try in such a situation.
Cheers,
Timothy Wonil Lee
http://timundergod.blogspot.com/
http://www.google.com/reader/shared/1684924941
OK, I have now bumped my tomcat JVM up to 1024MB min and 1500MB max. For
some reason Walter's suggestion helped me get past the 8MB file upload to
Solr but it's still choking on a 32MB file. Is there a way to set
per-webapp JVM settings in Tomcat, or is setting the overall Tomcat JVM
sufficient?
I see... but I really need to run it on Solr. We have already indexed
everything.
I don't really want to construct a query with 1K OR conditions and send it to
Solr to parse first and run after.
Maybe there is a way to go directly to Lucene, or Solr, and run such a query
from Java, passing Ar
I am using the embedded Solr API for my indexing process. I created a brand new
index with my application without any problem. I then ran my indexer in
incremental mode. This process copies the working index to a temporary Solr
location, adds/updates any records, optimizes the index, and then co
Having another weird spell checker index issue. Starting off from a clean index and spell check
index, I'll index everything in example/exampledocs. After the first rebuild of the spellchecker
index, the query below says the word 'blackjack' exists in the spellchecker index. Great, no problem
On 16-Jan-08, at 3:15 AM, farhanali wrote:
When I search a query, for example
http://localhost:8983/solr/select/?q=category&qt=dismax
it gives results, but when I want to search on the basis of a
field name,
like
http://localhost:8983/solr/select/?q=maincategory:Cars&qt=dismax
it does n
On 16-Jan-08, at 11:15 AM, [EMAIL PROTECTED] wrote:
I'm using Tomcat. I set Max Size = 5Gb and I checked in a profiler
that it actually uses the whole memory. There is no significant
memory use by other applications.
The whole change was that I increased the size of the cache to:
LRU Cache(maxSize=1048576, i
On 15-Jan-08, at 9:23 PM, Srikant Jakilinki wrote:
2) Solr that has to handle a large collective index which has to be
split up on multi-machines
- The index is ever increasing (TB scale) and dynamic and all of it
has to be searched at any point
This will require significant development on you
On 16-Jan-08, at 11:09 AM, Srikant Jakilinki wrote:
Thanks for that Shalin. Looks like I have to wait and keep track of
developments.
Forgetting about indexes that cannot fit on a single machine
(distributed search), any links to have Solr running in a 2-machine
environment? I want to
Solr provides a few scripts to create a multiple-machine deployment. One box
is set up as the master (used primarily for writes) and the others as slaves.
Slaves are added as per application requirements. The index is transferred
using rsync. Look at http://wiki.apache.org/solr/CollectionDistribution for
I'm using Tomcat. I set Max Size = 5Gb and I checked in a profiler that it
actually uses the whole memory. There is no significant memory use by other
applications.
The whole change was that I increased the size of the cache to:
LRU Cache(maxSize=1048576, initialSize=1048576, autowarmCount=524288, [EMAIL
PROTECT
Thanks for that Shalin. Looks like I have to wait and keep track of
developments.
Forgetting about indexes that cannot fit on a single machine
(distributed search), any links to have Solr running in a 2-machine
environment? I want to measure how much improvement there will be in
performance
Walter and all,
I had been bumping up the heap for my Java app (running outside of Tomcat)
but I hadn't yet tried bumping up my Tomcat heap. That seems to have helped
me upload the 8MB file, but it's crashing while uploading a 32MB file now. I
just bumped Tomcat to 1024MB of heap, so I'm not sure
Nice signature...=)
On 1/16/08, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> The PS really wasn't related to your OOM, and raising that shouldn't
> have changed the behavior. All that happens if you go beyond 10,000
> tokens is that the rest gets thrown away.
>
> But we're beyond my real knowledg
This error means that the JVM has run out of heap space. Increase the
heap space. That is an option on the "java" command. I set my heap to
600 MB and do it this way with Tomcat 6:
JAVA_OPTS="-Xmx600M" tomcat/bin/startup.sh
wunder
On 1/16/08 8:33 AM, "David Thibault" <[EMAIL PROTECTED]> wrote:
The PS really wasn't related to your OOM, and raising that shouldn't
have changed the behavior. All that happens if you go beyond 10,000
tokens is that the rest gets thrown away.
But we're beyond my real knowledge level about SOLR, so I'll defer
to others. A very quick-n-dirty test as to whether y
I tried raising the 10000 token limit (maxFieldLength) in solrconfig.xml,
and still no luck. I'm trying to
upload a text file that is about 8 MB in size. I think the following stack
trace still points to some sort of overflowed String issue. Thoughts?
Solr returned an error: Java heap space java.lang.OutOfMemoryError: J
I think your PS might do the trick. My JVM doesn't seem to be the issue,
because I've set it to -Xmx512m -Xms256m. I will track down the solr config
parameter you mentioned and try that. Thanks for the quick response!
Dave
On 1/16/08, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> P.S. Lucene by
P.S. Lucene by default limits the maximum field length
to 10K tokens, so you have to bump that for large files.
Erick
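In Solr that limit is exposed in solrconfig.xml; raising it there (the value below is just Integer.MAX_VALUE as an effectively-unlimited example) avoids the silent truncation:

```xml
<!-- solrconfig.xml: Lucene discards tokens beyond this count per field -->
<maxFieldLength>2147483647</maxFieldLength>
```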
On Jan 16, 2008 11:04 AM, Erick Erickson <[EMAIL PROTECTED]> wrote:
> I don't think this is a StringBuilder limitation, but rather your Java
> JVM doesn't start with enough memor
I don't think this is a StringBuilder limitation, but rather your Java
JVM doesn't start with enough memory. i.e. -Xmx.
In raw Lucene, I've indexed 240M files
Best
Erick
On Jan 16, 2008 10:12 AM, David Thibault <[EMAIL PROTECTED]>
wrote:
> All,
> I just found a thread about this on the
Hi here,
It seems that Lucene accepts any kind of XML document but Solr accepts only
flat name/value pairs inside a document to be indexed.
You'll find below what I'd like to do. Thanks for help of any kind!
Phil
I need to index products (hotels) whi
Hi Gene.
Have you set your app server / servlet container to allocate some of
this memory?
You can define the maximum and minimum heap sizes by adding/replacing some
parameters on the app server initialization:
-Xmx1536m -Xms1536m
Which app server / servlet container are you using?
Hello...
I have relatively large RAM (10GB) on my server which is running Solr. I
increased the cache settings and started to see OutOfMemory exceptions,
especially on facet search.
Does anybody have suggestions on how cache settings relate to memory
consumption? What are the optimal settings? How they c
All,
I just found a thread about this on the mailing list archives because I'm
troubleshooting the same problem. The kicker is that it doesn't take such
large files to kill the StringBuilder. I have discovered the following:
By using a text file made up of 3,443,464 bytes or less, I get no errors
My answers inline...
On Jan 16, 2008 3:51 AM, Dilip.TS <[EMAIL PROTECTED]> wrote:
> Hi Bill,
> I have some questions regarding the SOLR collection distribution.
> 1) Is it possible to add the index operations on the slave server
> using
> SOLR collection distribution and still have the master serv
Look at http://issues.apache.org/jira/browse/SOLR-303
Please note that it is still a work in progress, so you may not be able to use
it immediately.
On Jan 16, 2008 10:53 AM, Srikant Jakilinki <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> There is a requirement in our group of indexing and searching s
Hi,
In the web application we are developing we have two sets of details:
the personal details and the resume details. We allow 5 different
resumes to be available for each user, but we want the personal details
to remain the same across all 5 resumes. The problem is when personal details
are cha
When I search a query, for example
http://localhost:8983/solr/select/?q=category&qt=dismax
it gives results, but when I want to search on the basis of a field name,
like
http://localhost:8983/solr/select/?q=maincategory:Cars&qt=dismax
it does not give results; however
http://localhost:8983/
Hi Bill,
I have some questions regarding the SOLR collection distribution.
1) Is it possible to add the index operations on the slave server using
SOLR collection distribution and still have the master server updated with
these changes?
2) I have a requirement of having more than one Solr instance