Dear Yonik,
Hi,
Thank you very much for your response.
Best regards.
On Tue, Jul 21, 2015 at 5:42 PM, Yonik Seeley wrote:
> On Tue, Jul 21, 2015 at 3:09 AM, Ali Nazemian
> wrote:
> > Dear Erick,
> > I found another thing: I checked the number of unique terms for this
> > field using the schema browser,
Dears,
Hi,
I know that there are lots of tips on how to make Solr indexing
faster. Probably the most important ones on the client side are batch
indexing and multi-threaded indexing. There are other important factors
on the server side, which I don't want to
m
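For reference, a minimal SolrJ sketch of the client-side batching mentioned above (the URL, collection name, and batch size are examples, not from this thread):

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/mycoll");
        List<SolrInputDocument> batch = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            batch.add(doc);
            if (batch.size() == 1000) {  // send a full batch instead of one doc per request
                client.add(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) client.add(batch);
        client.commit();
        client.close();
    }
}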
Hi,
I am implementing search using Solr 5.0 and facing a very strange problem.
I have 4 fields, name, address, city, and state, in the document, apart
from a unique ID.
My requirement is that it should give me those results first where there is
a match in name, then address, then state, cit
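(One way to express that kind of priority, assuming the edismax query parser, is per-field boosting, e.g. defType=edismax&qf=name^10 address^5 state^3 city^1 — the boost values here are only illustrative.)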
Hi,
Would like to check: as I've created a SolrJ program and exported it as a
runnable JAR, how do I integrate it with Solr so that I can call
this JAR directly from Solr's REST API?
Currently I can only run it on command prompt using the command java -jar
solrj.jar
I'm using Solr 5.2.1
You can also use the types attribute to change the type of specific
characters, such as to treat "!" or "&" as an ALPHA.
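For example (the filename here is just an example), with types="wdfftypes.txt" on the filter and a wdfftypes.txt containing lines like:
! => ALPHA
& => ALPHA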
-- Jack Krupansky
On Tue, Jul 21, 2015 at 7:43 PM, Sathiya N Sundararajan
wrote:
> Upayavira,
>
> thanks for the helpful suggestion, that works. I was looking for an option
>
Upayavira,
thanks for the helpful suggestion, that works. I was looking for an option
to turn off/circumvent that particular WordDelimiterFilter's behavior
completely. Since our indexes are hundreds of terabytes, every time we
find a term that needs to be added, it will be a cumbersome process to
Bingo, thanks!
On Tue, Jul 21, 2015 at 4:12 PM, Konstantin Gribov
wrote:
> Try "invalidate caches and restart" in IDEA, remove .idea directory in
> lucene-solr dir. After that run "ant idea" and re-open project.
>
> Also, you have to, at least, close project, run "ant idea" and re-open it
> if s
Try "invalidate caches and restart" in IDEA, remove .idea directory in
lucene-solr dir. After that run "ant idea" and re-open project.
Also, you have to, at least, close project, run "ant idea" and re-open it
if switching between too diverged branches (e.g., 4.10 and 5_x).
вт, 21 июля 2015 г. в 2
Which can only happen if I post it to a web service, and won't happen if I
do it through config?
On Tue, Jul 21, 2015 at 2:19 PM, Upayavira wrote:
> yes, unless it has been added consciously as a separate field.
>
> On Tue, Jul 21, 2015, at 09:40 PM, Andrew Musselman wrote:
> > Thanks, so by the
yes, unless it has been added consciously as a separate field.
On Tue, Jul 21, 2015, at 09:40 PM, Andrew Musselman wrote:
> Thanks, so by the time we would get to an Analyzer the file path is
> forgotten?
>
> https://cwiki.apache.org/confluence/display/solr/Analyzers
>
> On Tue, Jul 21, 2015 at
In Java: UUID.randomUUID();
That is what I'm using.
Regards
> On 21 Jul 2015, at 22:38, Vineeth Dasaraju wrote:
>
> Hi Upayavira,
>
> I guess that is the problem. I am currently using a function for generating
> an ID. It takes the current date and time to milliseconds and generates the
> id.
Ah, nice tip, thanks! This could make scripts more portable, too.
Cheers,
Savvas
On 21 July 2015 at 08:40, Upayavira wrote:
> Note, when you start up the instances, you can pass in a hostname to use
> instead of the IP address. If you are using bin/solr (which you should
> be!!) then you ca
Thanks, so by the time we would get to an Analyzer the file path is
forgotten?
https://cwiki.apache.org/confluence/display/solr/Analyzers
On Tue, Jul 21, 2015 at 1:27 PM, Upayavira wrote:
> Solr generally does not interact with the file system in that way (with
> the exception of the DIH).
>
>
Hi Upayavira,
I guess that is the problem. I am currently using a function for generating
an ID. It takes the current date and time to milliseconds and generates the
id. This is the function.
public static String generateID() {
    Date dNow = new Date();
    // completed from the truncated original; the exact format pattern is a guess
    SimpleDateFormat ft = new SimpleDateFormat("yyyyMMddHHmmssSSS");
    return ft.format(dNow);
}
Are you making sure that every document has a unique ID? Index into an
empty Solr, then look at your maxdocs vs numdocs. If they are different
(maxdocs is higher) then some of your documents have been deleted,
meaning some were overwritten.
That might be a place to look.
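(A quick way to look, assuming the Luke handler is at its default path:
http://localhost:8983/solr/<core>/admin/luke reports both numDocs and maxDoc.)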
Upayavira
On Tue, Jul 21
Solr generally does not interact with the file system in that way (with
the exception of the DIH).
It is the job of the code that pushes a file to Solr to process the
filename and send that along with the request.
See here for more info:
https://cwiki.apache.org/confluence/display/solr/Uploading+
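For illustration, a SolrJ sketch that posts a file to /update/extract and passes a path token along as a literal field (the URL and field names are made up):

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class ExtractUpload {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/mycoll");
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new java.io.File("/user/andrew/1234/1234/file.pdf"), "application/pdf");
        req.setParam("literal.id", "file.pdf");
        // index a path segment as its own field, alongside the extracted text
        req.setParam("literal.dir_token_s", "1234");
        client.request(req);
        client.commit();
        client.close();
    }
}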
I can confirm this behavior: it is seen when sending JSON docs in batches and
never happens when sending one by one, but it is sporadic when sending batches.
It is as if Solr/Jetty drops a couple of documents out of the batch.
Regards
> On 21 Jul 2015, at 21:38, Vineeth Dasaraju wrote:
>
> Hi,
>
> Thank You Erick
Hi,
Thank you, Erick, for your inputs. I tried creating batches of 1000 objects
and indexing them to Solr. The performance is way better than before, but I
find that the number of indexed documents shown in the dashboard is
less than the number of documents that I had actually indexed through
sol
"contains" has to basically examine each and every term to see if it
matches. Say my
facet.contains=bbb. A matching term could be
aaabbbxyz
or
zzzbbbxyz
So there's no way to _know_ when you've found them all without
examining every last
one. So I'd try to redefine the problem to not require that.
I followed the instructions here
https://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ, including `ant
idea`, but I'm still not getting the links in Solr classes and methods; do
I need to add libraries, or am I missing something else?
Thanks!
I'm not sure, it's a remote team but will get more info. For now, assuming
that a certain directory is specified, like "/user/andrew/", and a regex is
applied to capture anything two directories below matching "*/*/*.pdf".
Would there be a way to capture the wild-carded values and index them as
f
Keeping to the user list (the right place for this question).
More information is needed here - how are you getting these documents
into Solr? Are you posting them to /update/extract? Or using DIH, or?
Upayavira
On Tue, Jul 21, 2015, at 06:31 PM, Andrew Musselman wrote:
> Dear user and dev lists
Dear user and dev lists,
We are loading files from a directory and would like to index a portion of
each file path as a field as well as the text inside the file.
E.g., on HDFS we have this file path:
/user/andrew/1234/1234/file.pdf
And we would like the "1234" token parsed from the file path a
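As an illustration, a small sketch of capturing those path segments client-side before indexing (the pattern and the example path handling are assumptions):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PathTokens {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("^/user/[^/]+/([^/]+)/([^/]+)/[^/]+\\.pdf$");
        Matcher m = p.matcher("/user/andrew/1234/1234/file.pdf");
        if (m.matches()) {
            // send these along with the document as extra fields
            System.out.println(m.group(1) + " " + m.group(2));  // 1234 1234
        }
    }
}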
I did find a dark corner of our application where a dev had left some
experimental code that snuck past QA because it was rarely used. A
client discovered it and was using it heavily over the past week. It was
generating multiple consecutive update/commit requests. It's been
disabled and the
Hi,
How can I upgrade the clusterstate.json to be split by collection?
I read this issue https://issues.apache.org/jira/browse/SOLR-5473.
In theory there is a param "stateFormat" that, when configured to 2, says to
use the /collections/collection/cluster.json format.
Where can I configure this?
—/Y
>
> Could this be due to caching? I have tried to disable all in my solrconfig.
If you mean Solr caches: no.
Solr caches live the life of the searcher.
So a new searcher means new caches (possibly warmed with updated results).
If you mean your application caching or browser caching, you should veri
Hi Mese,
let me try to answer your 2 questions:
1. What happens if a shard(both leader and replica) goes down. If the
> document on the "dead shard" is updated, will it forward the document to
> the
> new shard. If so, when the "dead shard" comes up again, will this not be
> considered for t
I tried using SolrMeter but for some reason it does not detect my URL and
throws a Solr server exception.
Sent from my iPhone
> On 21-Jul-2015, at 10:58 am, Alessandro Benedetti
> wrote:
>
> SolrMeter mate,
>
> http://code.google.com/p/solrmeter/
>
> Take a look, it will help you a lot !
>
>
SolrMeter mate,
http://code.google.com/p/solrmeter/
Take a look, it will help you a lot !
Cheers
2015-07-21 16:49 GMT+01:00 Nagasharath :
> Any recommended tool to test the query performance would be of great help.
>
> Thanks
>
--
Benedetti Alessandro
Visiting c
I am migrating from Solr 4.5.1 to Solr 5.2.1 on a Windows platform. I am using
multi-core, but not SolrCloud. I am having issues with my suite of JUnit
tests. My tests currently use code I found in SOLR-4502.
I was wondering whether anyone could point me at best-practice examples of
multi-c
Any recommended tool to test the query performance would be of great help.
Thanks
Ok. Thanks for your advice.
Regards,
Edwin
On 21 July 2015 at 15:37, Upayavira wrote:
> curl is just a command line HTTP client. You can use HTTP POST to send
> the JSON that you are mentioning below via any means that works for you
> - the file does not need to exist on disk - it just needs to
Hey Shawn, when I use the -m 2g option in my script I get the error 'cannot
open [path]/server/logs/solr.log for reading: No such file or directory'. I
do not see how this would affect that.
Okay. I'm going to run the index again with the specifications that you
recommended. This could take a few hours, but I will post the entire trace of
that error when it pops up again, and I will let you guys know the results of
increasing the heap size.
On 7/21/2015 8:17 AM, Paden wrote:
> There are some zip files inside the directory that are referenced in the
> database. I'm thinking those are the ones it's jumping right over. They
> are not the issue. At least I'm 95% sure. And Shawn, if you're still watching,
> I'm sorry, I'm using solr-5
There are some zip files inside the directory that are referenced in the
database. I'm thinking those are the ones it's jumping right over. They
are not the issue. At least I'm 95% sure. And Shawn, if you're still watching,
I'm sorry, I'm using solr-5.1.0.
Also, the function used to generate hashes is
org.apache.solr.common.util.Hash.murmurhash3_x86_32(), which produces a 32-bit
value. The ranges of hash values assigned to each shard are stored in
ZooKeeper. Since you are using only a single hash component, all 32 bits will
be used by th
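If you want to see the hash for a given id, a small sketch (assuming the simple, non-composite id case; seed 0 is what the default router uses, and the id is just an example):

import org.apache.solr.common.util.Hash;

public class HashCheck {
    public static void main(String[] args) {
        String id = "mydoc123";  // example document id
        // the same call the CompositeIdRouter makes for a plain id
        int hash = Hash.murmurhash3_x86_32(id, 0, id.length(), 0);
        // compare this value against each shard's hash range from ZooKeeper
        System.out.printf("%s -> %08x%n", id, hash);
    }
}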
When are you generating the UUID exactly? If you set the unique ID field on
an "update", and it contains a new UUID, you have effectively created a new
document. Just a thought.
-Original Message-
From: mesenthil1 [mailto:senthilkumar.arumu...@viacomcontractor.com]
Sent: Tuesday, Ju
On Tue, Jul 21, 2015 at 3:09 AM, Ali Nazemian wrote:
> Dear Erick,
> I found another thing: I checked the number of unique terms for this
> field using the schema browser, and it reported 1683404 terms! Does it
> exceed the maximum number of unique terms for the "fcs" facet method?
The real limit
Hello - this approach not only solves the problem but also allows me to run
different processing threads on other nodes.
Thanks!
Markus
-Original message-
> From:Chris Hostetter
> Sent: Saturday 18th July 2015 1:00
> To: solr-user
> Subject: Re: Programmatically find out if node is ov
Hi Dave,
generally, given the terms in a dictionary, it's much more efficient to run
prefix queries than "contains" queries.
Talking about using docValues: if I remember well, when they are loaded in
memory they are a skip list, so you can use two operators on them:
- next(), which simply gives you the next
I found that facet "contains" search takes much longer than facet prefix
search. Does anyone have an idea how to make "contains" search faster?
org.apache.solr.core.SolrCore; [concordance] webapp=/solr path=/select
params={q=sentence:"duty+of+care"&facet.field=autocomplete&indent=true&facet.prefix=duty+
Unable to delete by passing distrib=false as well. Also, it is difficult to
identify those duplicate documents among the 130 million.
Is there a way we can see the generated hash keys and map them to the
specific shards?
We have a similar situation: production runs Java 7u10 (yes, we know it's
old!), and has custom GC options (G1 works well for us), and a 40 GB heap.
We are a heavy user of NRT (sub-second soft-commits!), so that may be the
common factor here.
Every time we have tried a later Java 7 or Java 8, the he
Looking at the javadoc for the WordDelimiterFilterFactory, it suggests
a config along these lines (sketched here; only the protected attribute matters):
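<!-- protwords.txt and the flag values are examples -->
<filter class="solr.WordDelimiterFilterFactory" protected="protwords.txt"
        generateWordParts="1" generateNumberParts="1" catenateWords="1"/>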
Note the protected="x" attribute. I suspect if you put Yahoo! into a
file referenced by that attribute, it may survive analysis. I'd be
curious to hear whether it works.
Upayavira
On
I suspect you can delete a document from the wrong shard by using
update?distrib=false.
I also suspect there are people here who would like to help you debug
this, because it has been reported before, but we haven't yet been able
to see whether it occurred due to human or software error.
Upayavira
Bhawna,
I think you need to reconcile yourself to the fact that what you want to
achieve is not going to be possible.
Solr (and Lucene underneath it) is HEAVILY optimised for high read/low
write situations, and that leads to some latency in content reaching the
index. If you wanted to change this
Note, when you start up the instances, you can pass in a hostname to use
instead of the IP address. If you are using bin/solr (which you should
be!!) then you can use bin/solr -h my-host-name and that'll be used in
place of the IP.
Upayavira
On Tue, Jul 21, 2015, at 05:45 AM, Erick Erickson wrote
curl is just a command line HTTP client. You can use HTTP POST to send
the JSON that you are mentioning below via any means that works for you
- the file does not need to exist on disk - it just needs to be added to
the body of the POST request.
I'd say review how to do HTTP POST requests from yo
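For example, a bare-bones Java sketch of such a POST (the URL and JSON body are examples):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class PostJson {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:8983/solr/mycoll/update?commit=true");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        // JSON built in memory; nothing needs to exist on disk
        String body = "[{\"id\":\"1\",\"name\":\"test doc\"}]";
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes("UTF-8"));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}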
On Tue, Jul 21, 2015, at 02:00 AM, Shawn Heisey wrote:
> On 7/20/2015 5:45 PM, Vineeth Dasaraju wrote:
> > I am trying to install Banana on top of solr but haven't been able to do
> > so. All the procedures that I get are for an earlier version of solr. Since
> > the directory structure has change
Dear Erick,
I found another thing: I checked the number of unique terms for this
field using the schema browser, and it reported 1683404 terms! Does it
exceed the maximum number of unique terms for the "fcs" facet method? I read
somewhere it should be no more than 16M; is that true?!
Best regards.
O