"500B" - as in 500,000,000,000? Really?
-- Jack Krupansky
-----Original Message-----
From: tomasv
Sent: Friday, July 18, 2014 8:18 PM
To: solr-user@lucene.apache.org
Subject: shards as subset of All Shards
Hello, This is kind of weird, but here goes:
We are setting up a document repository (
That query is representative of some of the queries in my test, but I
didn't notice any correlation between using the match all docs query and
poor query performance. Here's another example of a query that took longer
than expected.
qt=en&q=dress green
leather&fq=userId:(383)&fq={!tag=productR
Hello, This is kind of weird, but here goes:
We are setting up a document repository (SOLR4). This will be a large (to
us) repository of approximately 500B documents. The documents are based on
"people".
Once all my documents are uploaded, we will receive new (follow-up)
information on our "peop
search engn dev [sachinyadav0...@gmail.com] wrote:
> out of 700 million documents 95-97% values are unique approx.
That's quite a lot. If you are not already using DocValues for that, you should
do so.
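For reference, enabling DocValues is a schema.xml change on the facet field
itself, something like this (a sketch only; I'm assuming the user_digest field
from your facet query and a plain string type, so adjust names/types to your
schema):

  <field name="user_digest" type="string" indexed="true" stored="true"
         docValues="true"/>

Note that the field has to be re-indexed after the change.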
So, each shard handles ~175M documents. Even with DocValues, there is an
overhead of just hav
Further down in the stack trace you will find the "cause" of the exception.
Solr is calling the "init" method, but your code is throwing an exception.
Your jar is probably in the proper place, otherwise Solr wouldn't have been
able to load it and call the init method for it.
-- Jack Krupansky
You can specify an "alternate" query to use for highlighting purposes, with
the "hl.q" parameter. It doesn't affect the query results, but lets you
control which terms get highlighted.
See:
http://wiki.apache.org/solr/HighlightingParameters#hl.q
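For example, something along these lines (a sketch; the field names are the
ones from the question below):

  q=+date:{20031231 TO *] +(title:red)&hl=true&hl.fl=title&hl.q=title:red

The date range still restricts the results, but only "red" is considered for
highlighting.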
-- Jack Krupansky
If I use a combined query - a range query and others (term query) - all
matched terms in the field are highlighted. Is there any way to highlight only
the term(s) in the term query?
Here is an example:
+date:{20031231 TO *] +(title:red)
It highlights all terms except stopwords.
Using fq would not be an option because th
Hi,
Thanks for the reply. Is there a better way to do it if the scenario is the
following:
Indexed values: "abc def"
Query String:"xy abc def z"
So basically the query string has to match all the words present in the
indexed data to give a MATCH.
Hi, below is the text_general field type. When I search Text:Broadway it does
not return all the records, only a few. But when I search for
Text:*Broadway*, it gets more records. When I get into a multi-word search
like "Broadway Hotel", it may not get "Broadway"
On Fri, Jul 18, 2014 at 2:10 PM, Hayden Muhl wrote:
> I was doing some performance testing on facet queries and I noticed
> something odd. Most queries tended to be under 500 ms, but every so often
> the query time jumped to something like 5000 ms.
>
> q=*:*&fq={!tag=productBrandId}productBrandId:
I was doing some performance testing on facet queries and I noticed
something odd. Most queries tended to be under 500 ms, but every so often
the query time jumped to something like 5000 ms.
q=*:*&fq={!tag=productBrandId}productBrandId:(156
1227)&facet.field={!ex=productBrandId}productBrandId&face
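For reference, the full tag/exclude pattern being exercised here looks roughly
like this (a sketch; the brand IDs are just the ones from the truncated query
above):

  q=*:*&facet=true
   &fq={!tag=productBrandId}productBrandId:(156 1227)
   &facet.field={!ex=productBrandId}productBrandId

The {!tag=...} labels the filter, and {!ex=...} tells faceting to ignore that
filter when computing counts, so the brand facet still shows counts across all
brands.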
Comments inline
On 16 July 2014 20:31, Bjørn Axelsen
wrote:
> Hi Solr users
>
> I would appreciate your inputs on how to handle a mix of simple and nested
> documents in the easiest and most flexible way.
>
> I need to handle:
>
>    - simple documents: webpages, short articles etc. (approx
You are looking for wildcard queries, but they can be quite costly and you
will need to benchmark performance.
Especially suffix wildcard queries (of the form *abc) are quite costly.
You can convert a suffix query into a prefix query by using a
ReverseTokenFilter during index-time analysis.
A sea
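In Solr this is typically handled by ReversedWildcardFilterFactory in the
index-time analyzer, roughly like this (a sketch based on the stock example
schema; the field type name is made up):

  <fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
              maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

The query parser detects the reversed terms and rewrites leading-wildcard
queries into cheaper prefix queries against them.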
Right, this is the worst kind of use-case for faceting. You have
150M docs/shard and are asking up to 125M buckets to count
into, plus control structures. Performance of this (even without OOMs)
will be a problem. Having multiple queries execute this simultaneously
will increase memory usage.
So y
Hi,
My requirement is to give a match whenever a string is found within the
indexed data of a field irrespective of where it is found.
For example, if I have a field which is indexed with the data "abc". Now any
of the following query string must give a match: xyzabc,xyabc, abcxyz ..
I am using
The data will be stored 100 times in your example,
independently for each document, albeit compressed.
Hmmm, doing that would certainly reduce the disk space
requirements, but it'd also complicate the document read
process. Instead of a single contiguous read from
disk per document, there'd be mul
Probably a question better asked on the Lily or HBase user forums
since those projects use Solr and will have a much better sense
of what Solr versions are compatible.
Best,
Erick
On Fri, Jul 18, 2014 at 4:20 AM, Vivekanand Ittigi
wrote:
> Hi,
>
> I tried to Integrate Solr with HBase Using HB
Joins can be chained; I don't quite know if that fits your use-case... But
whenever I see a question that looks like "How can I make Solr behave
like a database", I have to ask two questions:
1> Is Solr the right tool? It's a marvelous search engine, but not an RDBMS.
if your problem really
Hi Alan or Areek,
Were you able to make this work?
I have Solr 4.9 with the SuggestComponent and AnalyzingInfixLookupFactory,
which is amazing, but I can't get filtering of the suggestions based on a
field that holds their IDs to work.
Any help would be appreciated.
The dictionary file named "fwfsta.bin" contains NULL.
That means the configuration is not correct.
Maybe I need to add or change something on
Thanks for the help
For traditional, non-SolrCloud "distributed" mode, load balancing and
sharded queries are independent concepts - you can use them each separately
or together at your choice. If you want the query to be sharded for a
non-SolrCloud Solr server, then you need to pass the "shards" parameter on
each
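For example (host names and core names are placeholders):

  http://localhost:8983/solr/core1/select?q=*:*
   &shards=host1:8983/solr/core1,host2:8983/solr/core1

The node that receives the request queries each listed shard and merges the
results.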
On 7/18/2014 12:51 AM, search engn dev wrote:
> From my understanding, Solr and SolrJ work as follows:
> 1. LBHttpSolrServer keeps pinging the above list of servers and maintains a
> list of live servers.
> 2. Every time a query arrives it picks one server from the list (round-robin
> fashion)
> 3. Sends q
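A minimal SolrJ sketch of that setup (assuming SolrJ 4.x; the URLs are
placeholders):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.LBHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class LbExample {
      public static void main(String[] args) throws Exception {
          // Pings the listed servers, keeps a live list, and round-robins
          // queries across the live ones.
          LBHttpSolrServer lb = new LBHttpSolrServer(
              "http://server1:8983/solr/collection1",
              "http://server2:8983/solr/collection1");
          QueryResponse rsp = lb.query(new SolrQuery("*:*"));
          System.out.println("numFound: " + rsp.getResults().getNumFound());
          lb.shutdown();
      }
  }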
out of 700 million documents 95-97% values are unique approx.
My facet query is :
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.limit=1&facet.field=user_digest
The above query throws an OOM exception as soon as I send it to Solr.
Hello,
Say I have 100 documents with the same large field value. Stored and
indexed.
I know the indexed tokens are stored only once with posting lists. But what
about original stored values? Do I get 100 copies of those? Or is Solr
smarter than that?
Regards,
Alex
Hi,
I tried to Integrate Solr with HBase Using HBase Indexer project
https://github.com/NGDATA/hbase-indexer/wiki (one of sub projects of Lily).
I used Apache HBase running on HDFS and Solr 4.8.0, but I started getting the
error mentioned below.
14/07/18 11:55:38 WARN impl.SepConsumer: Error processi
PS: You can give huge boosts to the URL at query time on a per-request basis.
Don't specify the bqs in solrconfig.xml; always determine and add the bqs for
the query at run time.
On 18 July 2014 15:49, Umesh Prasad wrote:
> Or you can give huge boosts to url at query time. If you are using dismax
> th
Or you can give huge boosts to the URL at query time. If you are using dismax
then you can use bq, like bq=myfield:url1^50. That will always bring up url1
as the first result.
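A full request might then look roughly like this (field names and the URL
value are placeholders):

  q=some query&defType=dismax&qf=title description
   &bq=url:"http://example.com/page1"^50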
On 18 July 2014 15:27, benjelloun wrote:
> hello,
>
> before index the URL to a field in Solr, you can use j
hello,
before indexing the URL to a field in Solr, you can use the Java API (SolrJ)
and do a test:
  if (url == "")
      index into field1
  else
      index into field2
then use edismax to boost a specific field:
  <str name="echoParams">explicit</str>
  <int name="rows">10</int>
  <str name="defType">edismax</str>
  <str name="qf">field1^5.0 field2^1.0</str>
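A rough SolrJ sketch of that test at index time (assuming SolrJ 4.x; core, id
and field names are placeholders):

  import org.apache.solr.client.solrj.impl.HttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class ConditionalIndexing {
      public static void main(String[] args) throws Exception {
          HttpSolrServer server =
              new HttpSolrServer("http://localhost:8983/solr/collection1");

          String url = "";              // the document's URL, possibly empty
          String content = "some text"; // whatever you are indexing

          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", "doc-1");
          // As suggested above: empty URL -> field1, otherwise field2.
          if (url == null || url.isEmpty()) {
              doc.addField("field1", content);
          } else {
              doc.addField("field2", content);
          }
          server.add(doc);
          server.commit();
          server.shutdown();
      }
  }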
hello,
for WordDelimiterFilterFactory:
this is an example in schema.xml to follow:
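(a typical configuration, assuming the stock settings from the Solr example
schema; tune the options to your data:)

  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
          generateNumberParts="1" catenateWords="1" catenateNumbers="1"
          catenateAll="0" splitOnCaseChange="1"/>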
and for WordBreakSolrSpe
Hello,
In Solr Admin, on /select I put:
q="indexa"
It should auto-suggest "indexation" and other suggestions if they exist.
I have this response:
{ "responseHeader": { "status": 0, "QTime": 169, "params": { "indent":
"true", "q": "indexa", "_": "1405671103093", "wt": "json" } }, "command":
"build",
Getting this error below on adding new custom filter to schema.xml:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Plugin init failure for [schema.xml] fieldType "textCustom": Plugin init
failure for [schema.xml] analyzer/filter: Error instantiating class:
'org.apache.so
From: search engn dev [sachinyadav0...@gmail.com]:
> 1 collection : 4 shards : each shard has one master and one replica
> total documents : 700 million
Are you using DocValues for your facet fields? What is the approximate number
of unique values in your facets and what is their type (string, nu