is ticket.
>
> On Mon, May 1, 2017 at 9:15 PM, xavier jmlucjav
> wrote:
>
>> hi,
>>
>> I am facing this situation:
>> - I have a 3 node Solr 6.1 with some 1 shard, 1 node collections (it's
>> just
>> for dev work)
>> - the collections wher
hi,
I am facing this situation:
- I have a 3-node Solr 6.1 cluster with some 1-shard, 1-node collections (it's
just for dev work)
- the collections were created with:
action=CREATE&...&createNodeSet=EMPTY"
then
action=ADDREPLICA&...&node=$NODEA&dataDir=$DATADIR"
- I have taken a BACKUP of the collec
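The CREATE/ADDREPLICA sequence above can be sketched as simple URL builders; the base URL, collection name, node, and dataDir values below are placeholders, not the ones from the original setup.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # placeholder base URL

def create_empty_collection(name):
    # createNodeSet=EMPTY creates the collection metadata without placing replicas
    params = {"action": "CREATE", "name": name, "numShards": 1,
              "replicationFactor": 1, "createNodeSet": "EMPTY"}
    return f"{SOLR}/admin/collections?{urlencode(params)}"

def add_replica(name, node, data_dir):
    # ADDREPLICA then places one replica on a chosen node with a custom dataDir
    params = {"action": "ADDREPLICA", "collection": name, "shard": "shard1",
              "node": node, "dataDir": data_dir}
    return f"{SOLR}/admin/collections?{urlencode(params)}"

print(create_empty_collection("dev"))
print(add_replica("dev", "nodeA:8983_solr", "/data/solr/dev"))
```

Calling the two URLs in that order reproduces the setup described in the message.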
Hi,
After making the interval at which we call delta-import shorter and shorter, I
have found out that last_index_time in dataimport.properties is not
updated every time the indexing runs; it is skipped if no docs were added.
This happens at least in the following scenario:
- running delta as full i
ery powerful java wrapper, I used to use
it when Solr did not have a built-in daemon setup. It was built by someone
who was using JSW, and got pissed when that one went commercial. It is very
configurable, but of course more complex. I wrote something about it some
time ago
https://medium.com/@jmlucjav/
reload API call. For restarting we rely on remote provisioning
> tools such as Salt, other managing tools can probably execute commands
> remotely as well.
>
> If you operate more than just a very few machines, I'd really recommend
> using these tools.
>
> Markus
>
>
Hi,
When I need to restart a SolrCloud cluster, I always do this:
- log into host nb1, stop solr
- log into host nb2, stop solr
-...
- log into host nbX, stop solr
- verify all hosts did stop
- in host nb1, start solr
- in host nb2, start solr
-...
I always wondered, if this was not rea
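The stop-everything-then-start-everything procedure above can be sketched as a helper that emits the per-host commands. The host names and `bin/solr` invocations are placeholders; a real script would execute these over ssh and verify each stop before starting.

```python
HOSTS = ["nb1", "nb2", "nb3"]  # placeholder host names

def restart_commands(hosts):
    # stop Solr on every host first, then start everywhere, mirroring the
    # manual procedure; a real script would poll each host after the stops
    # to verify the process is actually gone before issuing any start
    cmds = [f"ssh {h} 'bin/solr stop'" for h in hosts]
    cmds += [f"ssh {h} 'bin/solr start -cloud'" for h in hosts]
    return cmds

for c in restart_commands(HOSTS):
    print(c)
```

Keeping stop and start as two separate passes avoids a mixed-version cluster state during the restart.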
Hi,
I have a lucene Query (Boolean query with a bunch of possibly complex
spatial queries, even polygon etc) that I am building for some MemoryIndex
stuff.
Now I need to add that same query to a Solr query (adding it to a bunch of
other fq I am using). Is there some way to piggyback the lucene
Hi
Is there somewhere a sample of some solrj code that given:
- a collection
- the id (like "IBM!12345")
returns the shard to which the doc will be routed? I was hoping to get that
info from CloudSolrClient itself but it's not exposing it as far as I can
see.
thanks
xavier
done, with simple patch https://issues.apache.org/jira/browse/SOLR-9697
On Thu, Oct 27, 2016 at 4:21 PM, xavier jmlucjav wrote:
> sure, will do, I tried before but I could not create a Jira, now I can,
> not sure what was happening.
>
> On Thu, Oct 27, 2016 at 3:14 PM, Shalin Sh
we'd have to hurry if this fix has to go in.
>
> On Thu, Oct 27, 2016 at 6:32 PM, xavier jmlucjav
> wrote:
>
> > Correcting myself here, I was wrong about the cause (I had already messed
> > with the script).
> >
> > I made it work by commenting out line 1261
IF "%ZK_OP%"=="cp" (
goto set_zk_dst
)
IF "%ZK_OP%"=="mv" (
goto set_zk_dst
)
set ZK_DST="_"
) ELSE IF NOT "%1"=="" (
set ERROR_MSG="Unrecognized or misplaced zk argument %1%"
Now upconfig works!
hi,
Am I missing something, or is this broken on Windows? I cannot upconfig; the
script keeps exiting immediately and showing usage, as if I used some wrong
parameters. This is on Win10, JDK 8. But I am pretty sure I saw it also on
Win7 (don't have that around anymore to try).
I think the issue is:
I did set up JNDI for DIH once, and you have to tweak the Jetty setup. Of
course, Solr should have its own Jetty instance; the old deploy-as-a-war
approach is not true anymore. I don't remember where, but there should be
some instructions somewhere; it took me an afternoon to set it up properly.
xavier
> > 26. sep. 2016 kl. 20.39 skrev Shawn Heisey :
> >
> > On 9/26/2016 6:28 AM, xavi jmlucjav wrote:
> >> Yes, I had to change some fields, basically to use TrieIntField etc
> >> instead
> >> of the old IntField. I was assuming by using the IndexUpgrader to
Hi Shawn/Jan,
On Sun, Sep 25, 2016 at 6:18 PM, Shawn Heisey wrote:
> On 9/25/2016 4:24 AM, xavi jmlucjav wrote:
> > Everything went well, no errors when solr restarted, the collections
> shows
> > the right number of docs. But when I try to
Hi,
I have an existing 3.6 standalone installation. It has to be moved to
SolrCloud 6.1.0. Reindexing is not an option, so I did the following:
- Use IndexUpgrader to upgrade 3.6 -> 4.4 -> 5.5. I did not upgrade to 6.x,
as 5.5 should be readable by 6.x
- Install solrcloud 6.1 cluster
- modify sche
parameter is always 0.
>
> Or your second query could even be just
> q=id:[last_id_returned_from_previous_query TO *]&sort=id
> asc&start=0&rows=1000
>
> Best,
> Erick
>
> On Mon, Jun 20, 2016 at 12:37 PM, xavi jmlucjav
> wrote:
> > Hi,
>
Hi,
I need to index into a new schema 800M docs that exist in an older Solr.
As all fields are stored, I thought I was very lucky, as I could:
- use wt=csv
- combined with cursorMark
to easily script out something that would export/index in chunks of 1M docs
or something. CSV output being very e
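The cursorMark-driven export loop described above can be sketched as follows. The `fetch` callable standing in for the actual HTTP request to `/select` is an assumption of this sketch, as is `id` being the uniqueKey field.

```python
def export_pages(fetch, rows=1000000):
    """Drive Solr cursorMark deep paging. `fetch(params)` stands in for an
    HTTP GET to /select and must return (csv_chunk, next_cursor)."""
    cursor = "*"
    while True:
        params = {"q": "*:*", "wt": "csv", "rows": rows,
                  # cursorMark requires a stable sort including the uniqueKey
                  "sort": "id asc",
                  "cursorMark": cursor}
        chunk, next_cursor = fetch(params)
        yield chunk
        if next_cursor == cursor:
            break  # cursor stopped advancing: all docs exported
        cursor = next_cursor
```

Iteration ends when Solr returns the same nextCursorMark that was sent, which is its signal that there are no more results (so the last yielded chunk may be empty).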
stand-by to become the live collection using aliases.
>
>
> Arcadius
>
>
> On 31 March 2016 at 18:04, xavi jmlucjav wrote:
>
> > Hi,
> >
> > I have been working with
> > AnalyzingInfixLookupFactory/BlendedInfixLookupFactory in 5.5.0, and I
> have
>
Hi,
I have been working with
AnalyzingInfixLookupFactory/BlendedInfixLookupFactory in 5.5.0, and I have
a number of questions/comments, hopefully I get some insight into this:
- Doc not complete/up-to-date:
- blenderType param does not accept 'linear' value, it did in 5.3. I
commented it out
In order to force an OOM do this:
- index a sizable amount of docs with a normal -Xmx; if you already have 350k
docs indexed, that should be enough
- now, stop solr and decrease memory, like -Xmx15m, start it, and run a
query with a facet on a field with very high cardinality, ask for all
facets. If
2016 at 1:46 AM, Erick Erickson
wrote:
> Well, I'd imagine you could spawn threads and monitor/kill them as
> necessary, although that doesn't deal with OOM errors
>
> FWIW,
> Erick
>
> On Thu, Feb 11, 2016 at 3:08 PM, xavi jmlucjav wrote:
> > For sure,
, y, if your use case allows , then we now have
> that in Tika.
>
> I've been wanting to add a similar watchdog to tika-server ... any
> interest in that?
>
>
> -Original Message-
> From: xavi jmlucjav [mailto:jmluc...@gmail.com]
> Sent: Thursday, February
I have found that when you deal with large amounts of all sorts of files, in
the end you find stuff (pdfs are typically nasty) that will hang tika. That
is even worse than a crash or OOM.
We used aperture instead of tika because at the time it provided a watchdog
feature to kill what seemed like a h
Mikhail, Yonik
thanks for having a look. This was my bad all the time...I forgot I was on
5.2.1 instead of 5.3.1 on this setup!! It seems some things were not there
yet on 5.2.1, I just upgraded to 5.3.1 and my query works perfectly.
Although I do agree with Mikhail the docs on this feature are a
Hi,
I am trying to get some faceting with the json facet api on nested doc, but
I am having issues. Solr 5.3.1.
This query gets the bucket numbers ok:
curl http://shost:8983/solr/collection1/query -d 'q=*:*&rows=0&
json.facet={
yearly-salaries : {
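A complete nested-document facet along these lines could look like the sketch below; the field names (`year_i`, `salary_f`) and the `blockChildren` filter are hypothetical, not taken from the truncated query above.

```python
import json
from urllib.parse import urlencode

# JSON Facet API request faceting on child documents; "year_i", "salary_f"
# and the blockChildren filter "type_s:person" are hypothetical names
facet = {
    "yearly_salaries": {
        "type": "terms",
        "field": "year_i",
        # switch the facet domain from parents to their child documents
        "domain": {"blockChildren": "type_s:person"},
        "facet": {"avg_salary": "avg(salary_f)"},
    }
}
params = {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}
print("http://localhost:8983/solr/collection1/query?" + urlencode(params))
```

The JSON Facet API accepts strict JSON, so building the spec as a dict and serializing it avoids quoting mistakes in hand-written curl bodies.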
hi,
While working with DIH, I tried schemaless mode, and found out it does not
work if you are indexing with DIH. I could not find any issue or reference
to this on the mailing list, even though I find it a bit surprising nobody
tried that combination so far. Did anybody test this before?
I manage
Hi,
I have a setup with AnalyzingInfixLookupFactory, suggest.count works. But
if I just replace:
s/AnalyzingInfixLookupFactory/BlendedInfixLookupFactory
suggest.count is not respected anymore; all suggestions are returned,
making it virtually useless.
I am using RC4 that I believe is also bein
On Sat, May 30, 2015 at 11:15 PM, Toke Eskildsen
wrote:
> xavi jmlucjav wrote:
> > I think the plan is to facet only on class_u1, class_u2 for queries from
> > user1, etc. So faceting would not happen on all fields on a single query.
>
> I understand that, but most of t
for a second opinion. We did not get to
discuss a different schema, but if we get to this point I will take that
plan into consideration for sure.
xavi
On Sat, May 30, 2015 at 10:17 PM, Toke Eskildsen
wrote:
> xavi jmlucjav wrote:
> > The reason for such a large number of fields:
orting
> >
> > Whether Solr breaks with thousands and thousands of fields is pretty
> > dependent on what you _do_ with those fields. Simply doing keyword
> > searches isn't going to put the same memory pressure on as, say,
> > faceting on them all (even if in
Hi guys,
someone I work with has been advised that currently Solr can support an
'infinite' number of fields.
I thought there was a practical limitation of, say, thousands of fields (for
sure less than a million), or things can start to break (I think I
remember seeing memory issues reported on th
this is easily doable by a custom (java code) request handler. If you want
to avoid writing any java code, you should investigate using
https://issues.apache.org/jira/browse/SOLR-5005 (I am myself going to have
a look at this interesting feature)
On Tue, Sep 9, 2014 at 4:33 PM, jimtronic wrote:
2014 at 1:05 PM, Aman Tandon
wrote:
> I have a question, does storing the data in copyfields save space?
>
> With Regards
> Aman Tandon
>
>
> On Tue, Aug 19, 2014 at 3:02 PM, jmlucjav wrote:
>
> > ok, I had not noticed text contains also the other metadata like
ok, I had not noticed that text also contains the other metadata like
keywords, description etc. Never mind!
On Tue, Aug 19, 2014 at 11:28 AM, jmlucjav wrote:
> In the sample schema.xml I can see this:
>
>
> stored="true" multiValued="true"/>
>
In the sample schema.xml I can see this:
I am wondering, how does having this split into two fields (text/content) save
space?
appreciated Stefan. Done updating.
On Thu, Jul 17, 2014 at 5:36 PM, Stefan Matheis
wrote:
> Xavi
>
> It’s the former :) I’ve added you to the contributors group
>
> -Stefan
>
>
> On Thursday, July 17, 2014 at 5:19 PM, jmlucjav wrote:
>
> > Hi guys,
> >
Hi guys,
I don't remember anymore what the policy is to have someone added to this
page:
- ask for edit rights and add your own line where needed
- send someone your line and they'll add it for you.
If the former, could I get edit permissions for the wiki? My login is
jmlucjav. If
://github.com/jmlucjav/sadat
Hi,
I have this scenario that I think is not unusual: Solr will get a user-
entered query string like 'apple pear france'.
I need to do this: if any of the terms is a country, then change the query
params to move that term to a fq, i.e:
q=apple pear france
to
q=apple pear&fq=country:france
What do
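The q-to-fq rewrite described above can be sketched as a small function; the country list here is a stand-in for a real lookup.

```python
COUNTRIES = {"france", "spain", "germany"}  # stand-in list; a real lookup would be larger

def split_query(user_q, countries=COUNTRIES):
    # move any term recognized as a country out of q and into an fq clause
    kept, fqs = [], []
    for term in user_q.split():
        if term.lower() in countries:
            fqs.append(f"country:{term.lower()}")
        else:
            kept.append(term)
    return " ".join(kept), fqs

q, fq = split_query("apple pear france")
print(q, fq)  # apple pear ['country:france']
```

Moving the recognized term to an fq keeps it out of scoring and lets Solr cache the country filter.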
I am confused: wouldn't a doc that matches both the phrase and the term
queries have a better score than a doc matching only the term query, even
if qf and pf are the same??
On Mon, Oct 28, 2013 at 7:54 PM, Upayavira wrote:
> There'd be no point having them the same.
>
> You're likely to include
davers wrote
> I want to elevate certain documents differently depending on a certain fq
> parameter in the request. I've read of somebody coding solr to do this but
> no code was shared. Where would I start looking to implement this feature
> myself?
Davers,
I am also looking into this feature.
> ==> Can't you get this from your container access logs after the fact? I
> may be misunderstanding something but why wouldn't mining the Jetty/Tomcat
> logs for the response size here suffice?
>
> Thanks!
> Amit
>
>
> On Thu, Apr 4, 2013 at 1:34 AM, xavie
t; The QueryResponseWriter class and in solrconfig.xml.
>
> -- Jack Krupansky
>
> -Original Message- From: xavier jmlucjav
> Sent: Wednesday, April 03, 2013 4:22 PM
> To: solr-user@lucene.apache.org
> Subject: do SearchComponents have access to response contents
>
>
>
I need to implement some SearchComponent that will deal with metrics on the
response. Some things I see will be easy to get, like number of hits for
instance, but I am more worried with this:
We need to also track the size of the response (as the size in bytes of the
whole xml response that is stre
Damn...I was fixated on the 14 there...I had naively thought that
the term freq would not be stored in the doc (that 1 would be stored), but I
guess it still stores the real value and then applies the custom similarity
at query time.
That means changing to a custom similarity does not need reindexing rig
u set the global similarity to solr.SchemaSimilarityFactory?
>
> See <http://wiki.apache.org/solr/SchemaXml#Similarity>.
>
> Steve
>
> On Mar 21, 2013, at 9:44 AM, xavier jmlucjav wrote:
>
> > Hi Felipe,
> >
> > I need to keep positions, that is why I canno
could be:
>
> indexed="true" stored="true" multiValued="false" omitNorms="true" />
>
> http://wiki.apache.org/solr/SchemaXml
>
>
> On Thu, Mar 21, 2013 at 7:35 AM, xavier jmlucjav
&
I have the following setup:
I index my corpus, and I can see tf is as usual; this term appears 14 times
in this field for this doc:
4.5094776 = (MATCH) weight(description:galaxy^10.0 in 440)
[DefaultSimilarity], result of:
11:51 AM, xavier jmlucjav wrote:
>
>> Hi,
>>
>> I have an index where, if I kill solr via Control-C, it consistently hangs
>> next time I start it. Admin does not show cores, and searches never
>> return.
>> If I delete the index contents and I restart again all
Hi,
I have an index where, if I kill solr via Control-C, it consistently hangs
next time I start it. Admin does not show cores, and searches never return.
If I delete the index contents and I restart again all is ok. I am on
windows 7, jdk1.7 and Solr4.0.
Is this a known issue? I looked in jira bu
Steve, worked like a charm.
thanks!
On Sun, Mar 17, 2013 at 7:37 AM, Steve Rowe wrote:
> See https://issues.apache.org/jira/browse/LUCENE-4843
>
> Let me know if it works for you.
>
> Steve
>
> On Mar 16, 2013, at 5:35 PM, xavier jmlucjav wrote:
>
> > I read too
I read too fast your reply, so I thought you meant configuring the
LimitTokenPositionFilter. I see you mean I have to write one, ok...
On Sat, Mar 16, 2013 at 10:33 PM, xavier jmlucjav wrote:
> Steve,
>
> Yes, I want only "one", "one two", and "one two three&
ecified limit) would
> be better, rather than subclassing ShingleFilter. You could use
> LimitTokenCountFilter as a model, especially its "consumeAllTokens" option.
> I think this would make a nice addition to Lucene.
>
> Also, what do you plan to use this for?
>
debug score info
- upgrade to griffon1.2.0
- allow using another handler (besides /select) enhancement
You can check it out here: https://github.com/jmlucjav/vifun
Binary distribution:
http://code.google.com/p/vifun/downloads/detail?name=vifun-0.6.zip
xavier
t; * The instructions say to highlight some numbers
> - I tried highlighting the 10 in the rows parameter
> - I also tried the 45.15 in "rest", and some of the scores in the
> results list
>
> I never see the extra parameters you show in this screen shot:
>
> https://r
s.pdf?raw=true
>
> roman
>
> On Sat, Feb 23, 2013 at 9:12 AM, jmlucjav wrote:
>
> > Hi,
> >
> > I have built a small tool to help me tweak some params in Solr (typically
> > qf, bf in edismax). As maybe others find it useful, I am open sourcing it
> >
bout the ability to override the "wt" param, so that you can point
> it to the "/browse" handler directly?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> 23. feb. 2013 kl. 1
Hi,
I have an index where schema browser histogram reports some terms that I
never indexed. When you run a query to get those terms you get of course
none. I optimized the index and same issue. The field is a TrieIntField.
I think I might have seen some post about this (or a similar) issue but di
without going through such rigorous testing, maybe for my case (interested
only in DAY), I could just index the trielong values such as 20121010,
20110101 etc...
This would take less space than trieDate (I guess), and I still have a date
looking number (for easier handling). I could even base the
thanks Lance.
I knew about rounding in the request params, but I want to know if there is
something to tweak at indexing time (by changing precisionStep in
schema.xml) in order to store only needed information. At query time, yes, I
would round to /DAY
--
View this message in context:
http://l
Hi
I have a TrieDateField in my index, where I will index dates (range
2000-2020). I am only interested in DAY granularity, that is, I don't
care about time (I'll index all based on the same timezone).
Is there an optimum value for precisionStep that I can use so I don't index
info I will not
If you are using DIH, it's just a matter of doing (for a mysql project I
have around, for example) something like this:
CONCAT(lat, ',', lon) as latlon
Paul Libbrecht-4 wrote
> PS: to stop this hell, I have a JSP pendant to the VelocityResponseWriter,
> is this something of interest for someone so that I contribute it?
Paul...yes it is! Anything that would help velocity related issues is
welcome
I have seen that issue several times, in my case it was always with an id
field, mysql db and linux. Same config but on windows did not show that
issue.
Never got to the bottom of it...as it was an id it was just working as it
was unique.
there is at least one scenario where no error is reported when it should be:
if the host runs out of disk while optimizing, it is not reported.
There is a Jira issue open, I think.
oh yeah, forgot about negatives and *:*...
thanks
that does not change the results for me:
-suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B))&debugQuery=true
-found 1
-suggest?q=suggest_terms:lap*&fq=type:P&fq=((-type:B)+OR+name:aa)&debugQuery=true
-found 0
looks like a bug?
xab
Hi,
I am trying to understand this scenario (Solr 3.6):
- /suggest?q=suggest_terms:lap*&fq=type:P&fq=(-type:B)
numFound=1
- I add an OR to the second fq. That fq is already satisfied by the found
doc, so adding the OR should still match it, right?
/suggest?q=suggest_terms:lap*&fq=type:P&fq=(-type:B OR n
oh, then it should work with 1.5?? OK, I know what happened then. I did not
see it happening myself, but he unzipped 3.6, started Solr with the example
config and got the error. He had Java 1.5, so I told him to upgrade and it
worked, so I assumed Solr required 1.6.
But this was on a linux box, so mo
Both wiki http://wiki.apache.org/solr/SolrInstall and tutorial
http://lucene.apache.org/solr/api/doc-files/tutorial.html state java 1.5 is
required, but trying to run solr3.6 with java 1.5 was giving some cryptic
error to a colleague.
xab
Hi,
I cannot seem to get the configuration of using a properties file for cores
right (with 3.6.0). In the Solr 3 Enterprise Search Server book they say this:
"This property substitution works in solr.xml , solrconfig.xml,
schema.xml, and DIH configuration files."
So my solr.xml is like this:
Is there some way to index docs (extracted from the main document) in a
second core when Solr is indexing the main document in a first core?
I guess it can be done by an UpdateProcessor in /core0 that prepares the new
docs and just calls /core1/update, but maybe someone has already done this in
a bett
have a look at
http://wiki.apache.org/solr/SimpleFacetParameters#facet.query_:_Arbitrary_Query_Faceting
Well now I am really lost...
1. yes I want to suggest whole sentences too, I want the tokenizer to be
taken into account, and apparently it is working for me in 3.5.0?? I get
suggestions that are like "foo bar abc". Maybe what you mention is only for
file based dictionaries? I am using the field
Just to be sure, reproduced this with example config from 3.5.
1. add to schema.xml
I have done that by getting the X top hits, finding the best match among them
(a combination of Levenshtein distance, contains... I tweaked the code till
testing showed good results), and then deciding if the candidate was a match
or not, again based on custom code plus a user-defined leniency value
xab
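The best-match selection described above can be sketched like this, using difflib's similarity ratio as a stand-in for a real Levenshtein distance; the leniency threshold plays the role of the user-defined value mentioned.

```python
from difflib import SequenceMatcher

def best_match(query, candidates, leniency=0.8):
    # score each candidate; difflib's ratio stands in for Levenshtein distance
    scored = [(SequenceMatcher(None, query.lower(), c.lower()).ratio(), c)
              for c in candidates]
    score, winner = max(scored)
    # accept the winner only if it clears the user-defined leniency threshold
    return winner if score >= leniency else None

print(best_match("Sage Creek Organic", ["Sage Creek Organics", "Sage Brush Co"]))
```

A production version would combine this with the contains check and the hand-tuned rules the message describes.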
I have double checked and still get the same behaviour. My field is:
Analysis
Hi,
I am using Suggester component, as advised in Solr3 book (using solr3.5):
a_suggest
org.apache.solr.spelling.suggest.Suggester
org.apache.solr.spelling.suggest.fst.FSTLookup
depending on your JVM version, -XX:+UseCompressedStrings would help alleviate
the problem. It did help me before.
xab
thanks, that will work I think
Hi,
I have a field type simpletext:
I have such a field name="fsuggest" type="simpletext"
I index there a value like this (between []): [wi/rac/house aa bbb]
I can see in the analysis page it is indexed as [wi/rac/h
I suspected it was to avoid caching, but I thought: what is the harm in
caching at the HTTP level if it's just suggestions? I would say it would be
even better.
So I can remove it...
thanks
Hi,
I am studying solritas with its browse UI that comes with the 3.5.0 example.
I have noticed the calls to /terms in order to get autocompletion terms have
a 'timestamp' parameter.
What is it for? I did not find any such param in the Solr docs.
Can it safely be removed?
thanks
Hi,
I was looking for a way to use spatial search given a location name (like
'dallas,tx'), and also given an IP, and I found
http://lucene.472066.n3.nabble.com/Spatial-Geonames-and-extension-to-Spatial-Solution-for-Solr-tc1311813.html
this post by Chris Mattmann mentioning some work with Geona
Ok thanks.
But I reviewed some of my searches and the - was not surrounded by
whitespace in all cases, so I'll have to remove lucene operators myself
from the user input. I understand there is no predefined way to do so.
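On the client side, SolrJ does ship ClientUtils.escapeQueryChars for Java; a rough Python equivalent of that kind of escaping could look like the sketch below (the exact character set is an approximation, and the SolrJ version also escapes whitespace, which this sketch leaves alone).

```python
# characters the Lucene query parser treats specially
SPECIAL = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(s):
    # backslash-escape each special character so user input like
    # "Sage Creek Organics - Enchanted" cannot trigger operator parsing
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in s)

print(escape_query_chars("Sage Creek Organics - Enchanted"))
# Sage Creek Organics \- Enchanted
```

Escaping before building q avoids the stray-minus problem described in the original question.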
Hi,
I am using edismax with end user entered strings. One search was not finding
what appeared to be the best match. The search was:
Sage Creek Organics - Enchanted
If I remove the -, the doc I want is found as best score. Turns out (I
think) the - is the culprit as the best match has 'enchanted
yes, I am using https://github.com/alexwinston/RunJettyRun which apparently
is a fork of the original project, created out of the need to use a
jetty.xml.
So I am already setting an additional jetty.xml; this can be done in the Run
configuration, no need to use a -D param. But as I mentioned solr
Hi,
I am following
http://www.lucidimagination.com/devzone/technical-articles/setting-apache-solr-eclipse
in order to be able to debug Solr in eclipse. I got it working fine.
Now, I usually use ./etc/jetty.xml to set the logging configuration. When
starting jetty in eclipse I don't see any log files c