I tried to reproduce this. However, "matches" always returns 4 in my
case (when using rows=1 and rows=2).
In your case the 2 documents on each core do belong to the same group,
right?
I did find something else. If I use rows=0 then an error occurs. I think we
need to further investigate this.
@Eric
By threshold, all I mean is the count of the documents returned; I am
not going to play with score. So if I have to commit my code to SVN, what's
the best way to go about it? I know I have to discuss my design here, which
would take at least a couple of days. But are there special instructions
: Suppose I have content which has a title and description. Users can tag
: content and search content based on tag, title, and description. Tags have
: more weight.
:
: Any inputs on how indexing and retrieval will work given there is content
: and tags using Solr? Has anyone implemented search ba
If you add &explainOther= (see
http://wiki.apache.org/solr/SolrRelevancyFAQ)
you might get some hints. You can use the TermsComponent
to see if the synonyms are getting in the index, but you'll
have to have a very restricted input set (like one doc) for that
to be helpful for a specific document.
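For example, something like this (a sketch; the query term and document
id are made up):

  http://localhost:8983/solr/select?q=my_synonym&debugQuery=on&explainOther=id:doc1

which returns the full scoring explanation for doc1 against that query.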
Part of it depends on what you mean by "threshold". If it's
just the number of matches, then fine. But if you're talking score
here, be very, very careful. Scores are not an absolute measure
of anything, they only tell you that "for _this_ query, the docs
should be ordered this way".
So I'd advise a
: It seems to be an unrecognisable pattern, this is from the log, last
: paragraph says "unknown character block name". The java version is
: "1.6.0_31":
Did you read the rest of my reply about testing whether Java recognizes your
block name independent of Solr? ... because that error is coming directly
Okay, I've played with this a bit more. Found something interesting:
When the groups returned do not include results from a core, then the core is
excluded from the count. (I have 1 group, 2 documents per core)
Example:
http://localhost:8983/solr/core0/select/?q=*:*&shards=localhost:8983/solr/c
>
> All documents of a group exist on a single shard, there are no cross-shard
> groups.
>
You only have to partition documents by group when the groupCount and some
other features need to be accurate. For the "matches" this is not
necessary. The matches are summed up while merging the shard responses.
A few more details to this thread -
When I try the Analysis tab from the admin console, I see that the synonym
is kicking in and it's matching the text in the document that I am expecting
to see as part of the results. However, the actual search is not returning
that document.
Also I used the TermsComponent
I am working on a component for indexing documents from a database that
contains medical records. The information is organized across several tables
and I am supposed to index records for varying sizes of sets of patients for
others to do IR experiments with. Each patient record has one or more
Karthick,
The solution that I use for this problem is to perform query1 and
query2 and boost results matching query1. Then Solr takes care of all
the deduplication (not necessarily merging) automatically. Would this
work for your situation?
I stole this idea from this slide deck:
"Make sure all r
Hmm, unless the ulimits are low, or the default mergeFactor was
changed, or you have many indexes open in a single JVM, or you keep
too many IndexReaders open, even in an NRT or frequent commit use
case, you should not run out of file descriptors.
Frequent commit/reopen should be perfectly fine, a
Here is solrconfig.xml. I am using Lucene NRT with soft commits: I
update the index every 5 seconds, soft commit every 1 second, and hard
commit every 15 minutes.
> SolrConfig.xml:
>
>
>
>false
>10
>2147483647
>1
>
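In solrconfig.xml terms, that cadence would look roughly like this (a
sketch, assuming the trunk/4.x autoCommit and autoSoftCommit syntax):

  <autoCommit>
    <maxTime>900000</maxTime>       <!-- hard commit every 15 minutes -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>         <!-- soft commit every second -->
  </autoSoftCommit>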
In the case of group=false:
numFound="26"
In the case of group=true:
34000
As a note, the grouped number changes when I hit refresh. It seems to display
the count from any single shard. (The top match also changes).
I haven't tried this in other versions of solr.
All documents of a group
The query in question should be:
> Solr Cell is great for proof-of-concept, but for heavy-duty
> applications,
> you're offloading all the processing on the Solr server,
> which can be a
> problem.
Good point!
Thank you
Hi,
Is it possible to determine the memory consumption (heap space) per core
in Solr trunk (4.0-SNAPSHOT)?
I just unloaded a core and saw the difference in memory usage, but it
would be nice to have a smoother way of getting the information without
core downtime.
It would also be interesting, wh
And probably 10,000 tokens (words). See maxFieldLength
in solrconfig.xml.
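Raising it is a one-liner in solrconfig.xml (a sketch; the value shown
is simply Integer.MAX_VALUE):

  <maxFieldLength>2147483647</maxFieldLength>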
Best
Erick
On Mon, Apr 2, 2012 at 8:57 AM, Sandro Feuillet
wrote:
> Hi,
>
> We have trouble indexing big text files with Solr.
> We extract PDF files with Tika and try to index them with Solr.
> But Solr doesn't index the
On 3/31/2012 4:30 AM, Suneel wrote:
Hello friends,
I am using DIH for Solr indexing. I have 60 million records in SQL which
need to be uploaded to Solr. I started caching; it was working smoothly and
memory consumption was normal, but after some time memory consumption
incrementally went higher and the process
I've answered my own question, but it left me with a lot of curiosity.
Why is the convention to build strings joined with commas (e.g. in
SolrQuery.addValueToParam) rather than to use the array option? All
these params are Map<String,String[]>, so why cram multiples into the
first slot with commas?
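For comparison, the array form is what you get when adding values
through ModifiableSolrParams (a sketch; the field values are made up):

  import org.apache.solr.common.params.ModifiableSolrParams;

  public class ParamsDemo {
    public static void main(String[] args) {
      ModifiableSolrParams params = new ModifiableSolrParams();
      // each add() appends another entry to the parameter's String[]
      // value instead of joining values into one comma-separated string
      params.add("fq", "type:book");
      params.add("fq", "inStock:true");
      System.out.println(params); // prints the params in query-string form
    }
  }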
I've got a prototype of a RequestHandler that embeds, within itself, a
SearchHandler. Yes, I read the previous advice to be a query
component, but I found it a lot easier to chart my course.
I'm having some trouble with sorting. I came up with the following.
'args' is the usual Map<String,String[]>. firstpassSort
Hi,
We have trouble indexing big text files with Solr.
We extract PDF files with Tika and try to index them with Solr.
But Solr doesn't index the entire text. As soon as a certain amount of text
is reached, Solr stops indexing the rest. We haven't found a setting or
parameter which defines the amo
You can index 2B tokens, so upping maxFieldLength should have
fixed your problem at least as far as Solr is concerned. How
many tokens get indexed? I'm not as familiar with Tika, but
there may be some kind of parameter there (although I
don't remember this coming up before)...
Did you restart Solr
Are you seeing a real problem here, besides just being alarmed by the
big numbers from top?
Consumption of virtual memory by itself is basically harmless, as long
as you're not running up against any of the OS limits (and, you're
running a 64 bit JVM).
This is just "top" telling you that you've m
Why do you care about virtual memory? It's after all, virtual. You can
allocate as much as you want.
For instance, MMapDirectory maps a load of virtual memory, but that
has little relation to how much physical memory is being used. Consider
looking at your app with something like jConsole and seei
Ok. got it. thanks
Best Regards
Alexander Aristov
On 2 April 2012 16:37, Erick Erickson wrote:
> You can't set the default operator for a single field. This implies
> you're using edismax? If that's the case, your app layer can
> massage the query to something like
> term1 term2 term3 field_x:
You can't set the default operator for a single field. This implies
you're using edismax? If that's the case, your app layer can
massage the query to something like
term1 term2 term3 field_x:(term1 AND term2 AND term3). In which
case field_x probably should not be in your qf parameter.
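A full request along those lines might look like (a sketch; the field
names are made up):

  /select?defType=edismax&qf=title+description&q=term1 term2 term3 field_x:(term1 AND term2 AND term3)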
Best
Erick
How often are you committing index updates? This kind of thing
can happen if you commit too often. Consider setting
commitWithin to something like, say, 5 minutes. Or doing the
equivalent with the autoCommit parameters in solrconfig.xml
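For example, commitWithin can also be passed per update request (a
sketch; the URL and document are made up):

  curl 'http://localhost:8983/solr/update?commitWithin=300000' \
    -H 'Content-Type: text/xml' \
    --data-binary '<add><doc><field name="id">1</field></doc></add>'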
If that isn't relevant, you need to provide some more details
No, you don't have to run zookeeper on each replica. Zookeeper
is a repository for your system (cluster) information. It knows
about each replica, but ZK does not need to run on each shard.
You can run one zookeeper instance for your entire cluster, no matter
how many shards/replicas you have.
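For example, every Solr node in the cluster can point at the same single
ZooKeeper instance (a sketch; the host and port are made up):

  java -DzkHost=zkserver:2181 -jar start.jar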
He
On Monday, April 2, 2012 at 2:00 PM, Stefan Matheis wrote:
> On Friday, March 30, 2012 at 11:33 PM, vybe3142 wrote:
> > When I paste the relevant part of the query into the SOLR admin UI query
> > interface,
> > {!join+from=join_id+to=id}attributes_AUTHORS.4:4, I fail to retrieve any
> > documents
On Saturday, March 31, 2012 at 6:01 PM, Yonik Seeley wrote:
> Shouldn't that be the other way? The admin UI should do any necessary
> escaping, so those "+" chars should instead be spaces?
We can, but is this really what you'd expect?
On Friday, March 30, 2012 at 11:33 PM, vybe3142 wrote:
> When I paste the relevant part of the query into the SOLR admin UI query
> interface,
> {!join+from=join_id+to=id}attributes_AUTHORS.4:4, I fail to retrieve any
> documents
Just go and paste the raw content into the form, then you'll get
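In other words, what goes into the form is the raw query:

  {!join from=join_id to=id}attributes_AUTHORS.4:4

while the '+'-separated variant quoted above is the URL-escaped form,
which belongs in the request URL rather than in the form.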
Hello Guys,
I am using Apache Solr 3.3.0 with Tika 1.0.
I have PDF files which I am pushing into Solr for content searching. Apache
Solr is indexing the PDF files and I can see them in the Apache Solr admin
interface for search. But the issue is that Apache Solr is not indexing the
whole file content. It is index
The "matches" element in the response should return the number of documents
that matched with the query and not the number of groups.
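For reference, in the XML response that element sits here (a sketch; the
group field and count are made up):

  <lst name="grouped">
    <lst name="group_field">
      <int name="matches">26</int>
      <arr name="groups">...</arr>
    </lst>
  </lst>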
Did you encounter this issue with other Solr versions as well (3.5 or
another nightly build)?
Martijn
On 2 April 2012 09:41, fbrisbart wrote:
> Hi,
>
> when you w
Alright, well, I discovered that PHP converts '.' in a variable name to '_',
causing my request to contain a variable for a non-existent facet_field.
2012/3/30 William Bell
> Can you also include a /select?q=*:*&wt=xml
>
> ?
>
> On Thu, Mar 29, 2012 at 11:47 AM, Erick Erickson
> wrote:
> > Hmmm, l
Hi!
I'm desperately trying to work out how to configure Solr in order to allow
it to make calls to the Alchemy service through the UIMA analysis engines.
Is there anybody who has been able to accomplish this?
Cheers
Hello Everyone,
On Windows Server.
I am facing the same problem: during indexing my memory consumption goes very
high. Based on the above discussion I checked my solrconfig.xml file and found
that "directoryFactory" is not configured yet. If I configure
directoryFactory, will it help me reduce the c
Hello friends,
I am using DIH for Solr indexing. I have 60 million records in SQL which
need to be uploaded to Solr. I started caching; it was working smoothly and
memory consumption was normal, but after some time memory consumption
incrementally went higher and the process reached more than 6 GB. That is
the reason
Hi,
when you write "I get xxx results", does that number come from 'numFound'?
Or do you really display xxx results?
When using both field collapsing and sharding, the 'numFound' may be
wrong. In that case, think about using 'shards.rows' parameter with a
high value (be careful, it's bad for performance).
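For example (a sketch; the shard addresses and value are made up):

  /select?q=*:*&group=true&group.field=groupid&shards=host1:8983/solr,host2:8983/solr&shards.rows=100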
Hi
It seems to be an unrecognisable pattern, this is from the log, last
paragraph says "unknown character block name". The java version is
"1.6.0_31":
***
SEVERE: null:org.apache.solr.common.SolrException: Plugin init failure for
[schema.xml] fieldType:Plugin init failure for [schema.xml]
analyze