Do you have any warming queries? Also, how do you measure the speed? What
do the boot log timestamps show for your index as opposed to, say,
an empty example index?
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
Forwarding to the mailing list.
-- Forwarded message --
From:
Date: Tue, Jun 24, 2014 at 2:15 PM
Subject: Re: solr4.7.2 startup take too long time
Thanks for your reply.
I do not warm any queries, and the configuration is the default, as follows:
Hi Joel,
I had missed this email due to some issue with my Gmail settings.
The reason CollapsingQParserPlugin is more performant than regular grouping
is because:
1. The QParser refers to global ords for the group.field and avoids storing
strings in a set. This has two advantages:
a) In terms of memory (storing
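For reference, the collapsing parser is invoked as a plain filter query. A minimal SolrJ sketch (the core URL and the group_field name are placeholder assumptions):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CollapseExample {
    public static void main(String[] args) throws Exception {
        // Placeholder core URL; adjust to your deployment
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrQuery q = new SolrQuery("*:*");
        // Collapse the result set to one document per value of group_field
        q.addFilterQuery("{!collapse field=group_field}");
        QueryResponse rsp = server.query(q);
        System.out.println("hits: " + rsp.getResults().getNumFound());
        server.shutdown();
    }
}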
Hi ,
Found another bug with CollapsingQParserPlugin. Not a critical one.
It throws an exception when used with
true
Patch attached (against 4.8.1 but reproducible in other branches also)
518 T11 C0 oasc.SolrCore.execute [collection1] webapp=null path=null
params={q=*%3A*&fq=%7B%21collaps
Hi,
I'm getting the below error while trying to access SolrCloud.
I tried with,
added these two entries into the solrconfig.xml file.
The actual location is /opt/apps/prod/solr/dist; all the required jars below
are available there:
solr-4.8.1.war solr-map-reduce-4.8.1.jar
solr-a
Hi Erick,
That is what I did: I tried that input on the analysis page.
The index field splits the value into two words: "test" and "or123".
Now checking the query on the analysis page, the word is likewise split
into "test" and "or123".
By doing the query and looking into the debug result, I see
I am new to Nabble; I am sorry that I performed the wrong operation.
Now I will repeat my answer here.
There are no warming queries in my solrconfig.xml, as follows:
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">static firstSearcher warming in solrconfig.xml</str>
    </lst>
  </arr>
</listener>
Hi Guys,
As far as I know, the RAMDirectoryFactory setting does not work with replication.
(
https://cwiki.apache.org/confluence/display/solr/DataDir+and+DirectoryFactory+in+SolrConfig
)
By the way, can I use it for replication slave nodes (not the master),
or for SolrCloud?
Thanks,
Chunki.
Hi Sven,
StandardTokenizerFactory splits it into two pieces. You can confirm this on
the analysis page.
If this is something you don't want, let us know.
We can help you create an analysis chain that suits your needs.
Ahmet
On Tuesday, June 24, 2014 10:39 AM, Sven Schönfeldt
wrote:
Hi Erick
Hi experts,
We have a requirement to import data from HBase tables into Solr. We
tried with the help of DataImportHandler, but we couldn't find the
configuration steps or documentation for DataImportHandler for HBase. Can
anybody please share the steps to configure it?
We tried with a basic configurati
Hi,
Looks like you have jars of a different version than solr.war?
Ahmet
On Tuesday, June 24, 2014 10:33 AM, atp wrote:
Hi,
I'm getting the below error while trying to access SolrCloud.
I tried with,
added these two entries into the solrconfig.xml file.
The actual location is /opt/apps/prod/solr/dis
Hi,
There is no DataSource or EntityProcessor for HBase, I think.
Maybe http://www.lilyproject.org/lily/index.html works for you?
Ahmet
On Tuesday, June 24, 2014 1:27 PM, atp wrote:
Hi experts,
We have a requirement to import data from HBase tables into Solr. We
have tried with the help of
I'm trying to create a Norwegian Lemmatizer based on a dictionary, but
for some odd reason I don't get any search results even though the
Analyzer in Solr Admin shows that it does the right thing. It works at
query time if I have reindexed everything based on another stemmer, e.g.
NorwegianM
I have an index which contains data from a MySQL database; I created
this index using Solr's DataImportHandler. My requirement is: suppose I add
a new row to the database, I want to update that row in my existing Solr
index. I don't have any idea how to add the new record
Hi Erik,
thanks - if it helps, I eventually fixed the problem by deleting the
documents by id (via an http request), which apparently deleted all the
versions everywhere, then re-creating the documents via the admin interface
(update, csv). This seems to have left only one version of each document
Thanks guys for your answers.
Sorry for the query syntax errors in my previous queries.
Chris, you've been really helpful. Indeed, point 3 is the one I'm trying to
solve, rather than 2.
You're saying that "BooleanScorer will consult the clauses in order based
on which clause
says it ca
I am running Solr 4.5.1. Here is how my setup looks:
I have 2 modest-sized collections.
Collection 1 - 2 shards, 3 replicas (Size of Shard 1 - 115
MB, Size of Shard 2 - 55 MB)
Collection 2 - 2 shards, 3 replicas (Size of Shard 1 - 3.5
GB, Size of Shard 2 - 1 GB)
Thes
I am using Solr 4.5.1. I have two collections:
Collection 1 - 2 shards, 3 replicas (Size of Shard 1 - 115
MB, Size of Shard 2 - 55 MB)
Collection 2 - 2 shards, 3 replicas (Size of Shard 1 - 3.5
GB, Size of Shard 2 - 1 GB)
I have a batch process that performs indexi
On Tue, 2014-06-24 at 14:26 +0200, RadhaJayalakshmi wrote:
> Here are my observations from the application logs:
> 1) Out of 200 sample searches across both collections - 13 requests are slow
> (3 slow responses on Collection 1 and 10 slow responses on Collection 2).
>
> 2) When things run fast -
Single document update is quite possible! No worries there.
Since you’re using DIH (data import handler) you can use the delta-import
command, see
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler#UploadingStructuredDataStoreData
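If it helps, the command can also be sent from SolrJ. A hedged sketch (assumes the DIH handler is registered at the common default path /dataimport; yours may differ in solrconfig.xml):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class DeltaImportTrigger {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrQuery req = new SolrQuery();
        req.setRequestHandler("/dataimport"); // handler path from solrconfig.xml
        req.set("command", "delta-import");   // pick up rows changed since the last import
        req.set("commit", "true");
        server.query(req);
        server.shutdown();
    }
}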
I have found the answer to the above query: by using the delta import
handler. But if I am going to use Solr's delta import handler, then I
need to add a last-modified column to the database table. Is it possible to
achieve the same without altering the database table?
On 6/24/2014 12:51 AM, hrdxwandg wrote:
> Before I upgraded Solr to 4.7.2, I used Solr 3.6. When I started up Tomcat,
> Solr started up quickly; the index size was 35G. After I upgraded Solr to
> 4.7.2, I rebuilt the index completely, and the size of the index is 16G. But when I
> restart Tomcat, I found
Hi,
I'm new to Solr and would like to index my database. It is working fine for
plain columns.
But I have one field which takes an average value from the
database, and the average is not being saved in Solr.
Below is my sample dataconfig.xml
//fields<->Columns
Yes, localhost is replaced with the right Solr URL; the pasted one is a test
URL.
After debugging, we found the actual problem was that the path to the XML files was incorrect.
Thanks for all the support.
--Ravi
-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: Mon
How fast does it need to be?
I've done this sort of thing for relevance evaluation with a driver in Python.
Send the query, requesting 10 or 100 hits in JSON. Request only the URL field (the fl
parameter). Iterate through them until the URL matches. If it doesn't match,
request more. Print the number
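A rough SolrJ equivalent of that driver, for the archive (the url field name and the page size of 100 are assumptions):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocumentList;

public class RankFinder {
    // Returns the 1-based rank of targetUrl in the results of query, or -1 if absent.
    static long findRank(HttpSolrServer server, String query, String targetUrl) throws Exception {
        final int rows = 100;
        for (int start = 0; ; start += rows) {
            SolrQuery q = new SolrQuery(query);
            q.setFields("url");  // fl=url: request only the URL field
            q.setStart(start);
            q.setRows(rows);
            SolrDocumentList docs = server.query(q).getResults();
            for (int i = 0; i < docs.size(); i++) {
                if (targetUrl.equals(docs.get(i).getFieldValue("url"))) {
                    return start + i + 1;
                }
            }
            if (start + rows >= docs.getNumFound()) {
                return -1; // exhausted all hits without a match
            }
        }
    }
}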
I have a few questions before that.
Do you mean that running Jetty in production is good enough? Will all clustering
and load balancing be taken care of?
Can we run Jetty as a service on Windows Server?
Won't security be a problem if we use Jetty?
I am under the impression that Tomcat will be more robust
Okay, let me try again.
1. Here is some sample SolrJ code that creates a parent and child document
(I hope)
https://gist.github.com/anonymous/d03747661ef03923de74
2. I tried a block join query which didn't return any results (I tried the
"Block Join Parent Query Parser" approach described in this
Please don't. At least not until you prove that this
is where your bottleneck is. You haven't described
what you're trying to fix by making such a change.
Solr/Lucene already does a _lot_ of work to keep the relevant bits of
the index in memory. Additionally, the defaults use
MMapDirectory, which makes u
Hmmm. It would help if you posted a couple of other
pieces of information. BTW, if this is new code, are you
considering donating it back? If so, please open a JIRA so
we can track it, see: http://wiki.apache.org/solr/HowToContribute
But to your question:
First couple of things I'd do:
1> see wha
Wildcards are a tough thing to get your head around. I
think my first post on the users list was titled
"I just don't get wildcards at all" or something like that...
Right, wildcards aren't tokenized. So by getting your term
through the query parsing as a single token, including the
hyphen, when t
Thanks for letting us know.
Erick
On Tue, Jun 24, 2014 at 5:25 AM, yann wrote:
> Hi Erik,
>
> thanks - if it helps, I eventually fixed the problem by deleting the
> documents by id (via an http request), which apparently deleted all the
> versions everywhere, then re-creating the documents via
Hi,
I am getting OOM while indexing 400 million docs (nested, 7-20 children each).
The memory usage gets higher while indexing until it reaches 24g.
Also, after the OOM, when indexing stops, the memory stays at 24g; *seems like a
leak.*
*Solr & Collection Info: *
Solr 4.8, 6 shards, 1 replica per shard, 2
Hi Erlend,
After a quick look: I have implemented a similar TokenFilter that injects several
tokens at the same position.
Please see source code of : Zemberek2DeasciifyFilter in
https://github.com/iorixxx/lucene-solr-analysis-turkish
You can insert your line : final String[] values =
stemmer.ste
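The general shape of such a filter, for the archive: buffer the extra tokens and emit them with a position increment of 0. A hedged sketch (not the actual Zemberek2DeasciifyFilter source):

import java.io.IOException;
import java.util.LinkedList;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.AttributeSource;

public abstract class StackedTokenFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private final PositionIncrementAttribute posIncAtt = addAttribute(PositionIncrementAttribute.class);
    private final LinkedList<String> pending = new LinkedList<String>();
    private AttributeSource.State saved;

    protected StackedTokenFilter(TokenStream input) {
        super(input);
    }

    // Subclasses supply the alternative forms for the current term, e.g. stems.
    protected abstract String[] alternatives(String term);

    @Override
    public final boolean incrementToken() throws IOException {
        if (!pending.isEmpty()) {
            // Emit a buffered alternative stacked on the original token's position.
            restoreState(saved);
            termAtt.setEmpty().append(pending.removeFirst());
            posIncAtt.setPositionIncrement(0);
            return true;
        }
        if (!input.incrementToken()) {
            return false;
        }
        for (String alt : alternatives(termAtt.toString())) {
            pending.add(alt);
        }
        saved = captureState();
        return true; // pass the original token through first
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        pending.clear();
        saved = null;
    }
}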
Did you run the underlying query ATTRIBUTES.STATE:TX? Does it return
anything?
On Tue, Jun 24, 2014 at 6:59 PM, Vinay B, wrote:
> Okay, Let me try again.
>
> 1. Here is some sample SolrJ code that creates a parent and child document
> (I hope)
> https://gist.github.com/anonymous/d03747661ef039
Your indexing process looks fine, there's no reason to
change it.
Optimizing is _probably_ not necessary at all. In fact, in the 4.x
world it was changed to "forceMerge" to make it seem less
attractive (I mean, who wouldn't want an optimized index?)
That said, the batch indexing process has nothing
Hi,
You don't need to optimize just based on segment counts. Solr doesn't
optimize automatically because often it doesn't improve things enough to
justify the computational cost of optimizing. You shouldn't optimize unless
you do a benchmark and discover that optimizing improves performance.
If y
I think I am officially tired of having to explain why Solr doesn't do what
users expect for this query. I mean, I can accept that low level Lucene
should work strictly on the decomposed terms of test test-or*, but it is
very reasonable for users (even EXPERT users) to expect that the Solr query
That is strange indeed. The usual culprit is that there is a commit
in there and no autowarming, so you see pauses when the first
query hits after a commit. But you say you only build the index once
which would seem to rule that out.
I'd be interested in what is in your Solr logs around the time
i
By quickly looking at it, I think you have unreachable code in the
NorwegianLemmatizerFilter
class (certainly, attaching & debugging would be your best bet):
@Override
public boolean incrementToken() throws IOException {
  if (input.incrementToken()) {
    if (!keywordAttr.isKeyword()) {
Is there any way to limit the results of a query on the "from" index before it
gets joined?
The SQL analogy might be...
SELECT *
from toIndex join
(select * from fromIndex
where "some query"
limit 1000
) fromIndex on fromIndex.from=toIndex.to
Example:
_query_:"{!join fromIndex=expressionData fr
The one exception that we should always note: if your batch includes
deletion of existing documents, an optimize can be appropriate, because the
term frequencies stored by Lucene may be off while the deleted documents
still count as existing terms.
Is this exception noted in the Solr ref guide?
: Let's take this query sample:
: XXX OR AAA AND {!frange ...}
:
: For my use case:
: AAA returns a subset of 100k documents.
: frange returns 5k documents, all part of these 100k documents.
:
: Therefore, frange skips the most documents. From what you are saying,
: frange is going to be applied
Hello,
We have a running SolrCloud cluster, a simple setup of 4 nodes (2
shards and 2 replicas) with an index of about 140GB. Now we have to move to
another server and need to somehow copy the existing index without downtime
(if possible).
New config is exactly the same, same 4 nodes, same collections
I've just realized that the old and new clusters use different installations,
configs and lib paths. So the nodes from the new cluster will probably
simply refuse to start using configs from the old ZooKeeper.
The only option is if there is a way to run them with their own ZooKeeper and
then manually add them as replicas
Hi,
Yes, the query ATTRIBUTES.STATE:TX returns the child doc (see response
below) . Is there something else that I'm missing to link the parent and
the child? I followed your advice from my last thread and used a block join
in this attempt, but still don't see how the parent and child realize their
Two ideas:
1) monitor the GC activity with jvisualvm (comes with Oracle JDK), install
a VisualGC plugin, it is quite helpful. The idea is to try to find the GC
stop-the-world activities. If any are found, look at tweaking the GC
parameters. Some insight: http://wiki.apache.org/solr/ShawnHeisey Some mo
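For the record, GC activity can also be written to a log file for offline inspection with the standard HotSpot flags of that era, e.g.:

java -verbose:gc -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -jar start.jar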
Vinay,
Please upload your index dir somewhere; I can try to check what's wrong with
it.
On Tue, Jun 24, 2014 at 9:43 PM, Vinay B, wrote:
> Hi,
> Yes, the query ATTRIBUTES.STATE:TX returns the child doc (see response
> below) . Is there something else that I'm missing to link the parent and
> the c
Hello Kevin,
You can only apply some restriction clauses (with +) to the from side
query.
On Tue, Jun 24, 2014 at 8:09 PM, Kevin Stone wrote:
> Is there any way to limit the results of a query on the "from" index
> before it gets joined?
>
> The SQL analogy might be...
> SELECT *
> from toIndex
Enable heap dump on OOME, and build a histogram with jhat.
Did you try reducing the RAM buffer (ramBufferSizeMB) or max buffered docs?
Or enabling autocommit?
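Concretely, the flags and the jhat invocation look roughly like this (the dump path and pid are examples):

java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp -jar start.jar
jhat -J-Xmx8g /var/tmp/java_pid12345.hprof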
On Tue, Jun 24, 2014 at 7:43 PM, adfel70 wrote:
> Hi,
>
> I am getting OOM during indexing 400 million docs (nested 7-20 children).
> The memory usage gets h
I don't know what that means. Is that a no?
From: Mikhail Khludnev [mkhlud...@griddynamics.com]
Sent: Tuesday, June 24, 2014 2:18 PM
To: solr-user
Subject: Re: limit solr results before join
Hello Kevin,
You can only apply some restriction clauses (with +)
Michael, try this,
Thanks
https://www.dropbox.com/s/074p0wpjz916d78/test_core.tar.gz
On Tue, Jun 24, 2014 at 1:16 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Vinay,
> pls upload your index dir somewhere, I can try to check what's wrong with
> it.
>
>
> On Tue, Jun 24, 2014 at
_query_:"{!join fromIndex=expressionData from=anatomyID to=anatomyID
v='(anatomy:\"brain\") +id:[1 TO 1]'}"
On Tue, Jun 24, 2014 at 10:24 PM, Kevin Stone wrote:
> I don't know what that means. Is that a no?
>
>
> From: Mikhail Khludnev [mkhlud...@gri
I'm currently playing around with Solr Cloud migration strategies, too. I'm
wondering... when you say "zero downtime," do you mean zero *read*
downtime, or zero downtime altogether?
Michael Della Bitta
Applications Developer
o: +1 646 532 3062
appinions inc.
“The Science of Influence Marketing
Zero read downtime would be enough; we can safely stop index updates for a while.
But we have some API endpoints where read downtime is very undesirable.
Best,
Alex
I wonder what can be wrong there... It works for me absolutely fine.
Proof pic:
http://postimg.org/image/51qrsm48p/
Query:
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%22content_type%3AparentDocument%22%7DATTRIBUTES.STATE%3ATX&wt=json&indent=true&debugQuery=true
gives
"respon
So what I'm playing with now is creating a new collection on the target
cluster, turning off the target cluster, wiping the indexes, and manually
just copying the indexes over to the correct directories and starting
again. In the middle, you can run an optimize or use the Lucene index
upgrader tool
Thanks, I figured it out based on your last response. I had mistakenly
URL-encoded the wt=json and indent=true when constructing the request:
%26wt%3djson%26indent%3dtrue
Incidentally, this translates to
{!parent
which="content_type:parentDocument"}ATTRIBUTES.STATE:TX&wt=json&indent=true
and r
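For anyone hitting the same encoding trap: going through SolrJ sidesteps manual URL-encoding entirely, since the client encodes parameters itself. A small sketch using the query from this thread (the core URL is a placeholder):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class BlockJoinQueryExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // Local-params syntax is written plainly; SolrJ handles the escaping.
        SolrQuery q = new SolrQuery(
            "{!parent which=\"content_type:parentDocument\"}ATTRIBUTES.STATE:TX");
        q.set("debugQuery", "true");
        System.out.println(server.query(q).getResults().getNumFound());
        server.shutdown();
    }
}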
Hi,
I have a number of documents in a single core, indexed from different
sources, with common properties but different values.
The problem is that while fetching from one set of documents, I need to use "Raw
Query Parameters" as below.
http://solrserver/solr/collection1/select?q=*%3A*&wt=json&inden
Hi,
I'm testing the SuggestComponent and came up with some questions.
1. How can I set the term frequency as the weightField?
2. Why does it only work with stored fields? How can I return the value
resulting from my filter transformations (the indexed value)?
Thanks!
--
Sergio Roberto Charpinel Jr.
Hi Lalit,
_query_ is a magic field name. Please see:
http://searchhub.org/2009/03/31/nested-queries-in-solr/
Why do you use _query_="AuthenticatedUserName=lalit"? It is simply ignored.
Ahmet
On Tuesday, June 24, 2014 11:34 PM, lalitjangra
wrote:
Hi,
I have a number of documents in sin
This is easy if I only need to define a custom field to identify the desired
patterns (numbers, in my case).
For example, I could define a field thus:
Input:
hello, world bye 123-45 abcd sdfssdf --- aaa
Output:
123-45 ,
However, I also want to retain the behavi
Sorry, previous post got sent prematurely.
Here is the complete post:
This is easy if I only need to define a custom field to identify the desired
patterns (numbers, in my case).
For example, I could define a field thus:
Input:
hello, world bye 123-45 abcd sdfssdf -
Hi Chris,
Thanks for your patience; I've now got a better picture of how things work.
I don't believe however that the two queries (the one with the post filter
and the one without one) are equivalent.
Suppose out of the whole document set:
XXX returns documents 1,2,3.
AAA returns documents 6,7,8.
: I don't believe however that the two queries (the one with the post filter
: and the one without one) are equivalent.
:
: Suppose out of the whole document set:
: XXX returns documents 1,2,3.
: AAA returns documents 6,7,8.
: {!frange}customfunction returns documents 7,8.
:
: Running this quer
: I am upgrading an index from Solr 3.6 to 4.2.0.
: Everything has been picked up except for the old DateFields.
Just to be crystal clear:
1) 4.2 is already over a year old. The current release of Solr is 4.8,
and 4.9 will most likely be available within a day or two.
2) Even in 4.9, "solr.D
What about copyField'ing the content into a second field where you
apply the alternative processing? Then use eDismax to search both. You don't
have to store the other field, just index it.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
Check out the HBase Indexer http://ngdata.github.io/hbase-indexer/
Wolfgang.
On Jun 24, 2014, at 3:55 AM, Ahmet Arslan wrote:
> Hi,
>
> There is no DataSource or EntityProcessor for HBase, I think.
>
> Maybe http://www.lilyproject.org/lily/index.html works for you?
>
> Ahmet
>
>
> On Tues
: I recently tried upgrading our setup from 4.5.1 to 4.7+, and I'm
: seeing an exception when I use (1) a function to sort and (2) result
: grouping. The same query works fine with either (1) or (2) alone.
: Example below.
Did you modify your schema in any way when upgrading?
Can you provide so
When I edit a child document, a block join query for the parent no longer
returns any hits. I thought I read that this was the way things worked but
needed to know for sure.
If so, is there any other way to achieve this functionality (I can deal
with creating the child doc with the parent, but wou
good.
--
View this message in context:
http://lucene.472066.n3.nabble.com/fq-more-then-one-tp959849p4143943.html
Sent from the Solr - User mailing list archive at Nabble.com.
Block join is a very specialized feature of Solr - it requires that creation
and update of the parent and all children be done as a single update
operation for all of the documents. So... you cannot update a child document
by itself, but need to update the entire block.
Unfortunately, this lim
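For readers of the archive, "update the entire block" means re-sending the parent together with all of its children in a single add. A hedged SolrJ sketch (field names borrowed from earlier in this digest; the core URL is a placeholder):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BlockUpdateExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrInputDocument parent = new SolrInputDocument();
        parent.addField("id", "parent-1");
        parent.addField("content_type", "parentDocument"); // marker used by {!parent which=...}
        SolrInputDocument child = new SolrInputDocument();
        child.addField("id", "child-1");
        child.addField("ATTRIBUTES.STATE", "TX"); // the edited child value
        parent.addChildDocument(child); // children must be indexed with the parent, as one block
        server.add(parent);
        server.commit();
        server.shutdown();
    }
}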
Hi Solr !
I got this working. Here's how:
With the example Jetty runner, you can extract the tarball and go to the
examples/ directory, where you can launch an embedded core. Then find the
solrconfig.xml file and edit it to contain the following XML:
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">myhcfs:///solr</str>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
</directoryFactory>
I've always been under the impression that file system access speed is crucial
for Lucene-based storage, and have always advocated not using NFS for it (with
which we saw a slowdown of roughly a factor of 5). Have any performance
measurements been made for such a setting? Is FS caching sudde
Hey Suresh,
Could you be a little more specific about what solved your problem here?
I am currently facing the same problem and am trying to find a proper
solution.
Thanks!
~ Dom
2014-02-28 7:46 GMT+01:00 sureshrk19 :
> Thanks Shawn and Erick.
>
> I followed SOLR configuration document and modif
Thanks Ahmet and Wolfgang. I have installed hbase-indexer on one of the servers,
but here also I'm unable to start the hbase-indexer server.
Error: Could not find or load main class com.ngdata.hbaseindexer.Main
I have properly set the JAVA_HOME and INDEXER_HOME environment variables.
Please guide me.
Thanks.