Ok sure.
> " ngrams: The max number of tokens out of which singles will be make the
> dictionary. The default value is 2. Increasing this would mean you want
> more than the previous 2 tokens to be taken into consideration when making
> the suggestions. "
I got confused by this, as I could not ge
Hi Upayavira
Thank you for your explanation on the difference between traditional
grouping and collapsingQParser. I understand it better now.
On 6/19/2015 7:11 PM, Upayavira wrote:
On Fri, Jun 19, 2015, at 06:20 AM, Derek Poh wrote:
Hi
I read about "collapsingQParser returns the facet count the s
Hi Joel
By group heads, is it referring to the document that is used to represent
each group in the main result section?
E.g. using the below 3 documents, and we collapse on field supplier_id:
supplier_id:S1
product_id:P1
supplier_id:S2
product_id:P2
supplier_id:S2
product_id:P3
With collapse on
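For what it's worth, a hedged sketch of what that collapse request could look like (the field name comes from the example above; the rest of the request is an assumption):

```
q=*:*&fq={!collapse field=supplier_id}
```

With the three documents above, the collapsed result set would contain one group head per supplier: P1 for S1, and one of P2/P3 for S2 (by default the highest-scoring document in the group).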
Hi!
I'm facing a problem.
I'm using SolrCloud 4.10.3, with 2 shards; each shard has 2 replicas.
After indexing data to the collection and running the same query,
http://localhost:8983/solr/catalog/select?q=a&wt=json&indent=true
sometimes it returns the right result:
{
"responseHeader":{
"status":0,
Hmm, I can see some things you couldn't do with just using
a tint field for the year. Or rather, some things that wouldn't
be as convenient
But this might help:
http://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/ParseDateFieldUpdateProcessorFactory.html
or you can
Hi Chris,
Thank you for taking the time to write the detailed response. Very helpful.
Dealing with interesting formats in the source data and trying to evaluate
various options for our business needs. The second scenario you described
(where some values in the date field are just the year) will e
I'm not sure I understand your question ...
if you know that you are only ever going to have the 'year' then why not
just index the year as an int?
a TrieDateField isn't really of any use to you, because normal date type
usage (date math, date ranges) are useless because you don't have any rea
Hello,
Example csv doc has column 'just_the_year' and value '2010':
With the Schema API I can tell the indexing process to treat 'just_the_year'
as a date field.
I know that I can update the solrconfig.xml to correctly parse formats such
as MM/dd/ (which is awesome) but has anyone tried t
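A minimal solrconfig.xml sketch of such a chain (the chain name and the assumption that a bare year parses with a yyyy pattern are mine; treat this as a starting point, not a tested config):

```xml
<updateRequestProcessorChain name="parse-year">
  <processor class="solr.ParseDateFieldUpdateProcessorFactory">
    <arr name="format">
      <!-- assumption: a bare "2010" should parse as a year -->
      <str>yyyy</str>
      <str>MM/dd/yyyy</str>
    </arr>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

If that works as I expect, a value of '2010' would be indexed as 2010-01-01T00:00:00Z.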
Thanks Joel,
I don't know why I was unable to find the "understanding collapsing" email
thread via the search I did on the site but I found it in my own email search
now.
We'll look into our specific scenario and see if we can find a workaround.
Thanks!
CARLOS MAROTO
M +1 626 354 7750
If you see the last comment on:
https://issues.apache.org/jira/browse/SOLR-6143
You'll see there is a discussion starting about adding this feature.
Joel Bernstein
http://joelsolr.blogspot.com/
On Fri, Jun 19, 2015 at 4:14 PM, Joel Bernstein wrote:
> The CollapsingQParserPlugin does not provi
As stated previously, using Field Collapsing (group parameters) tends to
significantly slow down queries. In my experience, search response gets even
worse when:
- Requesting facets, which more often than not I do in my query formulation
- Asking for the facet counts to be on the groups via the
The CollapsingQParserPlugin does not provide facet counts that are the
same as the group.facet feature in Grouping. It provides facet counts that
behave like group.truncate.
The CollapsingQParserPlugin only collapses the result set. The facets
counts are then generated for the collapsed result se
Hi,
We are comparing results between Field Collapsing (&group* parameters) and
CollapseQParserPlugin. We noticed that some facets are returning incorrect
counts.
Here are the relevant parameters of one of our test queries:
Field Collapsing:
---
q=red%20dress&facet=true&facet
Thanks as always for the great answers!
Jim
On 6/19/15, 11:57 AM, "Erick Erickson" wrote:
>Jim:
>
>This is by design. There's no way to tell Solr to find all the cores
>available and put one replica on each. In fact, you're explicitly
>telling it to create one and only one replica, one and onl
Dirk,
There are 3 open JIRAs related to this behavior:
https://issues.apache.org/jira/browse/SOLR-3739
https://issues.apache.org/jira/browse/SOLR-3740
https://issues.apache.org/jira/browse/SOLR-3741
We worked around it by adding the explicit + signs if the query matched the
problematic pattern
Also, since you are tuning for relative times, you can tune on the smaller
index. Surely, you will want to test at scale. But tuning query, analyzer
or schema options is usually easier to do on a smaller index. If you get a 3x
improvement at small scale, it may only be 2.5x at full scale.
Do be aware that turning on &debug=query adds a load. I've seen the
debug component take 90% of the query time (to be fair, it usually takes
a much smaller percentage).
But if you set debug=all you'll see a section at the end of the response
with the time each component took, so you'll have a sense
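For example, in the same style of request as elsewhere in this thread (the exact parameter spelling is from memory, so verify against your version):

```
http://localhost:8983/solr/catalog/select?q=a&wt=json&indent=true&debug=all
```

Look for the debug/timing block at the end of the response; it lists prepare and process times per search component.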
As for now, the index size is 6.5 M records, and the performance is good
enough. I will re-build the index for all the records (14 M) and test it
again with debug turned on.
Thanks
On Fri, Jun 19, 2015 at 12:10 PM, Erick Erickson
wrote:
> First and most obvious thing to try:
>
> bq: the Solr w
Jim:
This is by design. There's no way to tell Solr to find all the cores
available and put one replica on each. In fact, you're explicitly
telling it to create one and only one replica, one and only one shard.
That is, your collection will have exactly one low-level core. But you
realized that...
On 6/19/2015 11:15 AM, Jim.Musil wrote:
> I noticed that when I issue the CREATE collection command to the api, it does
> not automatically put a replica on every live node connected to zookeeper.
>
> So, for example, if I have 3 solr nodes connected to a zookeeper ensemble and
> create a collect
I noticed that when I issue the CREATE collection command to the api, it does
not automatically put a replica on every live node connected to zookeeper.
So, for example, if I have 3 solr nodes connected to a zookeeper ensemble and
create a collection like this:
/admin/collections?action=CREATE&
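As a hedged example of forcing a replica onto each of 3 nodes (the collection name and counts are made up; check the Collections API docs for your version):

```
/admin/collections?action=CREATE&name=mycoll&numShards=1&replicationFactor=3&maxShardsPerNode=1
```

Without replicationFactor, Solr defaults to 1, so only one core is created, which matches the behavior described above.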
You really, really, really want to get friendly with the
admin/analysis page for questions like:
bq: You're probably right though. I probably have to create a better analyzer
really ;).
It shows you exactly what each link in your analysis chain does to the
input. Perhaps 75% of
the questions abo
Yes the number of indexed documents is correct. But the queries I perform
fall short of what they should be. You're probably right though. I probably
have to create a better analyzer.
And I'm not really worried about the other fields. I've already checked to see
if it's storing them correctly and i
2015-06-19 18:00 GMT+02:00 Erick Erickson :
> You really have to ask more specific questions here. What
> are you confused _about_? Have
I read that I could migrate using the backup script, so I looked for
the backup script in the Solr 4.7.1 source code but I haven't found
anything...
This may be another forehead-slapper (man, you don't know how often
I've injured myself that way).
Did you commit at the end of the SolrJ indexing to Testcore2? DIH automatically
commits at the end of the run, and depending on how your SolrJ program
is written
it may not have. Or just set autoComm
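If you go the autoCommit route, a minimal solrconfig.xml sketch (the interval values are arbitrary placeholders):

```xml
<!-- hard commit: flush to disk, but don't open a new searcher -->
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<!-- soft commit: make newly indexed documents visible to searches -->
<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>
```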
First and most obvious thing to try:
bq: the Solr was started with maximal 4G for JVM, and index size is < 2G
Bump your JVM to 8G, perhaps 12G. The size of the index on disk is very
loosely coupled to JVM requirements. It's quite possible that you're spending
all your time in GC cycles. Consider
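A sketch of the restart, assuming the 5.x bin/solr script (-m sets both -Xms and -Xmx):

```
bin/solr stop -all
bin/solr start -m 8g
```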
You really have to ask more specific questions here. What
are you confused _about_? Have
you gone through the tutorial? Read the Solr In Action book?
Tried _anything_?
Best,
Erick
On Fri, Jun 19, 2015 at 5:02 AM, shacky wrote:
> Hi.
> I have an old index running on a standalone Solr 4.7.1 and I
So, the first thing I can say is: if "it almost killed Solr with
280 files" is true, you are doing something wrong for sure.
At least, unless you are trying to index 4k full movies xD
Joking apart :
1) You should carefully design your analyser.
2) You should store your fields initially to verify you
Hi all,
I have the following search components, for which I don't have a solution at
the moment to get working in distributed mode on solr 4.10.4.
[standard query component]
[search component-1] (StageID - 2500):
handleResponses: get few values from docs and populate parameters for
stats component
On 6/19/2015 5:40 AM, Paul Revere wrote:
> Our log files show entries for each member indexed:
>
> Error: Could not create instance of 'SolrInputDocument'.
> ~~
> Exception: org.apache.solr.common.SolrInputDocument
There will be a *lot* more detail available on this exception. We will
need all o
Yeah, changing the field to "text_en" or "text_en_splitting"
actually made it so my indexer indexed all my files. The only problem is, I
don't think it's doing it well.
I have two Cores that I'm working with. Both of them have indexed the same
set of files. The first core, which I will r
Grouping does tend to be expensive. Our regular queries typically return in
10-15ms while the grouping queries take 60-80ms in a test environment (< 1M
docs).
This is ok for us, since we wrote our app to take the grouping queries out of
the critical path (async query in parallel with two prim
tomas.kalas wrote:
> Existing some hardware or software limits for indexing data?
The only really hard Solr limit is 2 billion X per shard, where X is document
count, unique values in a DocValues String field and other things like that.
There are some softer limits, after which performance degr
Silly thing … Maybe the immense token was generated because you tried to set
"string" as the field type for your text?
Could that be?
Can you wipe out the index, set a proper type for your text, and index
again ?
No worries about the not full stack trace,
We learn and do wrong things everyday :)
Errare humanu
Hi Wenbin,
To me, your instance appears well provisioned. Likewise, your analysis of test
vs. production performance makes a lot of sense. Perhaps your time would be
well spent tuning the query performance for your app before resorting to
sharding?
To that end, what do you see when you se
Yeah I'm just gonna say hands down this was a totally bad question. My fault,
mea culpa. I'm pretty new to working in an IDE environment and using a stack
trace (I just finished my first year of CS at University and now I'm
interning). I'm actually kind of embarrassed by how long it took me to
real
Hello,
I'm trying to parse Solr Responses with SolrJ, but the responses contain mixed
types : for example 'song' documents and 'movie' documents with different
fields.
The getBeans method takes a single class type as input parameter, which does
not allow for mixed-document-type responses.
What would
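One common workaround is to skip getBeans, read the raw document list, and dispatch on a discriminator field yourself. A minimal sketch, with plain maps standing in for SolrDocument (which also behaves like a Map); the "doc_type" field name and the bean shapes are assumptions:

```java
import java.util.*;

public class MixedTypeDispatch {
    // Stand-ins for the song/movie beans you would normally pass to getBeans().
    record Song(String id, String artist) {}
    record Movie(String id, String director) {}

    // Dispatch each raw document on a hypothetical "doc_type" field.
    static List<Object> dispatch(List<Map<String, Object>> docs) {
        List<Object> beans = new ArrayList<>();
        for (Map<String, Object> d : docs) {
            switch ((String) d.get("doc_type")) {
                case "song"  -> beans.add(new Song((String) d.get("id"), (String) d.get("artist")));
                case "movie" -> beans.add(new Movie((String) d.get("id"), (String) d.get("director")));
                default      -> { /* skip unknown types */ }
            }
        }
        return beans;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
            Map.of("doc_type", "song", "id", "1", "artist", "A"),
            Map.of("doc_type", "movie", "id", "2", "director", "B"));
        List<Object> beans = dispatch(docs);
        System.out.println(beans.size());                 // 2
        System.out.println(beans.get(0) instanceof Song); // true
    }
}
```

With real SolrJ you would iterate queryResponse.getResults() and call getFieldValue the same way.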
Hello i have a few questions for indexing data.
Existing some hardware or software limits for indexing data?
And is some maximum of indexed documents?
Thanks for your answers.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Limit-indexed-documents-tp4212913.html
Sent from th
Framework way?
Maybe try delving into the log4j framework and modify the log4j.properties
file. You can generate different log files based upon what class generated the
message. Here's an example that I experimented with previously, it generates
an update log, and 2 different query logs with
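Roughly along these lines in log4j.properties (the appender setup is standard log4j; the Solr logger name is an assumption from memory, so confirm it against what your install actually logs):

```properties
# Hypothetical: send Solr request logging to its own rolling file
log4j.appender.QUERY=org.apache.log4j.RollingFileAppender
log4j.appender.QUERY.File=logs/solr_query.log
log4j.appender.QUERY.MaxFileSize=10MB
log4j.appender.QUERY.MaxBackupIndex=9
log4j.appender.QUERY.layout=org.apache.log4j.PatternLayout
log4j.appender.QUERY.layout.ConversionPattern=%d{ISO8601} %-5p %m%n
# Route the class that logs requests to the QUERY appender only
log4j.logger.org.apache.solr.core.SolrCore.Request=INFO, QUERY
log4j.additivity.org.apache.solr.core.SolrCore.Request=false
```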
We are running PaperThin's CommonSpot CMS in a Cold Fusion 10 and MS SQL Server
2008 R2 environment. We're using Apache Solr 4.10.4 vice Cold Fusion's Solr. We
can create (and delete) collections through the CS CMS; they appear in (and
disappear from) both the physical file structure as well as
Hi.
I have an old index running on a standalone Solr 4.7.1 and I have to
migrate its index to my new SolrCloud 5.1 installation.
I'm looking for some way to do this but I'm a little confused.
Could you help me please?
Thank you very much!
Bye
2015-06-17 16:11 GMT+02:00 Shalin Shekhar Mangar :
> Is ZK healthy? Can you try the following from the server on which Solr
> is running:
>
> echo ruok | nc zk1 2181
Thank you very much Shalin for your answer!
My ZK cluster was not ready because two nodes were dead and only one
node was running.
I
I have enough RAM (30G) and hard disk (1000G). It is not I/O-bound or
disk-bound. In addition, Solr was started with a maximum of 4G for the
JVM, and the index size is < 2G. In a typical test, I made sure enough free RAM
of 10G was available. I have not tuned any parameter in the configuration,
it
The AnalyticsQuery can be used to implement custom faceting modules. This
would allow you to calculate facets counts in an algorithm similar to
group.facets before the result set is collapsed. If you are in distributed
mode you will also need to implement a merge strategy:
http://heliosearch.org/s
Unfortunately this won't give you group.facet results:
q=whatever
fq={!collapse tag=collapse}blah
facet.field={!ex=collapse}my_facet_field
This will give you the expanded facet counts as it removes the collapse
filter.
A good explanation of group.facets is here:
http://blog.trifork.com/2012/04/
On Fri, Jun 19, 2015, at 06:20 AM, Derek Poh wrote:
> Hi
>
> I read about "collapsingQParser returns the facet count the same as
> group.truncate=true" and has this issue with the facet count and the
> after-filter facet count not being the same.
> Using group.facet does not have this issue but its perf
I definitely agree with Erick: the stack trace you posted is again not
complete.
This is an example of the same problem you got with a complete, meaningful
stack trace :
"
Stacktrace you provided :
org.apache.solr.common.SolrException: Exception writing document id 12345
> to the index; possible a
The CollapsingQParserPlugin currently doesn't calculate facets at all. It
simply collapses the document set. The facets are then calculated only on
the group heads.
Grouping has special faceting code built into it that supports the
group.facet functionality.
Joel Bernstein
http://joelsolr.blogspo
Actually, the documentation is not clear enough.
Let's try to understand this suggester.
*Building*
This suggester builds an FST that it will use to provide the autocomplete
feature, running prefix searches on it.
The terms it uses to generate the FST are the tokens produced by the
"suggestFreeTextA
Hi
I read about "collapsingQParser returns the facet count the same as
group.truncate=true" and has this issue with the facet count and the
after-filter facet count not being the same.
Using group.facet does not have this issue but its performance is very
bad compared to collapsingQParser.
I trying t
Please open a JIRA with details of what the issues are; we should try to
support this.
On 18 Jun 2015 15:07, "Bence Vass" wrote:
> Hello,
>
> Is there any documentation on how to start Solr 5.2.1 on Solaris (Solaris
> 10)? The script (solr start) doesn't work out of the box, is anyone running
>