Thanks Upaya for sharing. I am looking to deploy Solr in a Windows 64-bit
server environment. Some people do say Jetty works optimally in a Linux-based
environment. Having said that, I believe Solr will have improved its stability
within a Windows environment.
I agree with you on the advice. S
Hi Adrian,
since version 5.0 Solr has shipped with Jetty. But I think a more
interesting question is whether the default Jetty configuration
can be used "as is" in a production environment.
On Wed, Jul 15, 2015 at 8:43 AM, Adrian Liew
wrote:
> Hi all,
>
> Will like to ask your o
On 14/07/2015 17:04, Erick Erickson wrote:
Well, Shawn I for one am in your corner.
Schemaless is great for getting things running, but it's
not an AI. And it can get into trouble guessing. Say
it guesses a field should be an int because the first one
it sees is 123 but it's really a part number.
I'm doing some testing on long-running huge indexes.
Therefore I need a "clean" state after some days of running.
My idea was to open a new searcher with a commit command:
INFO - org.apache.solr.update.DirectUpdateHandler2;
start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expu
I also feel that having dataDir configurable makes enterprise deployments
easier. Generally, software is installed on the root disk, e.g. /opt/solr, and if
the data folder is within it, then the root drive will have to be
expanded as the Solr index grows or needs to be optimized, etc. Having the data
folder con
What do you mean by "clean" state? A searcher is a view over a given
index (let's say) "state"... if the state didn't change, why do you want
another (identical) view?
On 15 Jul 2015 02:30, "Bernd Fehling"
wrote:
>
> I'm doing some testing on long running huge indexes.
> Therefore I need a "clean
On top of that, sorry, I didn't answer your question because I don't know
if that is possible.
Best,
Andrea
On 15 Jul 2015 02:51, "Andrea Gazzarini" wrote:
> What do you mean by "clean" state? A searcher is a view over a given
> index (let's say) "state"... if the state didn't change, why do yo
Hi,
I'm using Solr 4.10.3, and I'm trying to update a doc field using atomic updates
(http://wiki.apache.org/solr/Atomic_Updates).
My schema.xml is like this:
I add a document with this command:
curl http://:/solr/default/update?commit=true -H
"Content-Type: text/xml" --data-binary '
Hi Erick,
Thanks for pointing out the main problem of my system.
Trung.
On Fri, Jul 10, 2015 at 11:47 PM, Erick Erickson
wrote:
> In a word, no. If you don't store the data it is completely gone
> with no chance of retrieval.
>
> There are a couple of things to think about though
>
> 1> The or
Triggering a soft commit implies that a new searcher is opened.
With a hard commit, you can decide whether or not to open the new searcher.
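For reference, a minimal sketch of the two variants (host and core name are
placeholders):

# hard commit that also opens a new searcher
curl 'http://localhost:8983/solr/collection1/update?commit=true&openSearcher=true'
# hard commit that flushes segments but keeps the current searcher
curl 'http://localhost:8983/solr/collection1/update?commit=true&openSearcher=false'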
But this is probably an X/Y problem.
Can you describe your real problem, rather than the way you were trying
to solve it?
Cheers
2015-
This is kinda weird and looks a lot like a bug.
Let me try to reproduce it locally!
I'll let you know soon!
Cheers
2015-07-15 10:01 GMT+01:00 Martínez López, Alfonso :
> Hi,
>
> i'm using Solr 4.10.3, and i'm trying update a doc field using atomic
> update (http://wiki.apache.org/solr/Atomic_Updat
Just tried on Solr 5.1 and I get the proper behaviour.
Actually, where is the value for dinamic_desc coming from?
I cannot see it in the updates, and actually it is not in my index.
Are you sure you have not forgotten any detail?
Cheers
2015-07-15 11:48 GMT+01:00 Alessandro Benedetti
:
Hi, thanks for your help!
The value for the 'dinamic_desc' field comes from the 'src_desc' field. I copy the value
with:
Seems like when I update a different field (field 'name') via atomic update,
the copyField directive copies the value again from 'src_desc' to 'desc_field',
instead of updating the val
Ohhh!
I didn't read it completely, so I missed the copy field.
Ok, now.
This is the explanation:
Copy fields are applied at indexing time, when the document arrives at the
RunUpdateRequestProcessor.
If I remember well, at this point, before we start indexing, the content
of the source field is added
Whatever you want to call the problem, I just wanted to open a new searcher
after several days of heavy load/searching on one of my slaves
to do some testing with empty field-/document-/filter-caches.
Sure, I could first add, then delete a document and do a commit.
Or maybe only do a fake update of a docu
Well yes, a simple empty commit won't do the trick, the searcher is not going
to reload on recent versions. Reloading the core will.
-Original message-
> From:Bernd Fehling
> Sent: Wednesday 15th July 2015 13:42
> To: solr-user@lucene.apache.org
> Subject: Re: To the experts: howto for
Thank you all for helping on this topic. I'm going to play with this and
might come back with more questions.
Steve
On Tue, Jul 14, 2015 at 1:57 PM, Erick Erickson
wrote:
> Steve:
>
> Simplest solution:
> remove WordDelimiterFilterFactory.
> Use something like PatternReplaceCharFilterFactory o
Hi Everyone,
Out of the box, Solr (Lucene?) is set to use OR as the default Boolean
operator. Can someone tell me the advantages / disadvantages of using OR
or AND as the default?
I'm leaning toward AND as the default because the more words a user types,
the narrower the result set should be.
T
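For what it's worth, with the standard (lucene) query parser the default
operator can be switched per request via q.op, which makes it easy to compare
the two behaviours side by side; host and core name below are placeholders, and
with edismax the mm parameter plays the equivalent role:

# every term optional (OR-like)
curl 'http://localhost:8983/solr/mycore/select?q=apples+oranges&q.op=OR'
# all terms required (AND-like)
curl 'http://localhost:8983/solr/mycore/select?q=apples+oranges&q.op=AND'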
Going through the code in the RunUpdateRequestProcessor we call at one
point :
…
Document luceneDocument = cmd.getLuceneDocument();
// SolrCore.verbose("updateDocument",updateTerm,luceneDocument,writer);
writer.updateDocument(updateTerm, luceneDocument);
..
Inside that method we call :
public
Hi Markus,
excellent, reloading the core did it.
Best regards
Bernd
Am 15.07.2015 um 13:44 schrieb Markus Jelsma:
> Well yes, a simple empty commit won't do the trick, the searcher is not going
> to reload on recent versions. Reloading the core will.
>
> -Original message-
>> From:B
OK, so effectively use the core product as it was in Solr 4, running a
schema.xml file to control doc structures and validation. In Solr 5, does
anyone have a clear link or some pointers as to the options for bin/solr
create_core to boot up the instance I need?
Thanks for all the help.
2015-07-15 12:44 GMT+01:00 Markus Jelsma :
> Well yes, a simple empty commit won't do the trick, the searcher is not
> going to reload on recent versions. Reloading the core will.
>
mmm Markus, let's assume we trigger a soft commit, even an empty one: if
openSearcher is true, it is not going to b
Ok, thanks very much.
It's when I try a second atomic update that I get the exception you mentioned,
"multiple values encountered
for non multiValued copy field". The first time there is no exception, but the
non-multivalued field gets indexed with 2 values.
Cheers.
And, to answer your other question, yes, you can turn off auto-warming. If
your instance is dedicated to this client task, it may serve no purpose or be
actually counter-productive.
In the past, I worked on a Solr-based application that committed frequently
under application control (vs. aut
Am 15.07.2015 um 14:47 schrieb Alessandro Benedetti:
...
>>> Whatever you want to call the problem, I just wanted to open a new searcher
>>> after several days of heavy load/searching on one of my slaves
>>> to do some testing with empty field-/document-/filter-caches.
>>
> Aren't you warming your caches
See SOLR-5783.
-Original message-
> From:Alessandro Benedetti
> Sent: Wednesday 15th July 2015 14:48
> To: solr-user@lucene.apache.org
> Subject: Re: To the experts: howto force opening a new searcher?
>
> 2015-07-15 12:44 GMT+01:00 Markus Jelsma :
>
> > Well yes, a simple empty comm
Just to re-iterate Charles' response with an example: we have a system
which needs to be as near-real-time as we can make it. So we have an
application-level commitWithin set to 250ms. Yes, we have to turn off a lot of
caching, auto-warming, etc., but it was necessary to make the index as real
time as we need
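As a rough sketch of what an application-level commitWithin looks like (host,
core and field values are made up):

curl 'http://localhost:8983/solr/mycore/update?commitWithin=250' -H 'Content-Type: text/xml' \
  --data-binary '<add><doc><field name="id">42</field><field name="title">near-real-time example</field></doc></add>'
# the document should become searchable within roughly 250 ms, with no explicit commit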
It is simply precision (AND) vs. recall (OR) - the former tries to limit
the total result count, while the latter tries to focus on relevancy of the
top results even if the total result count is higher.
Recall is good for discovery and browsing, where you sort of know what you
generally want, but
On 7/15/2015 3:01 AM, Martínez López, Alfonso wrote:
>
>
>
>
>
> multiValued="false" />
> multiValued="false" />
>
>
> And later I update the field 'name' with this command:
>
> curl http://:/solr/default/update?commit=true -H
> "Content-Type: text/xml" --data-binary ' name="id">1 upd
The AND default has one big problem. If the user misspells a single word, they
get no results. About 10% of queries are misspelled, so that means a lot more
failures.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
On Jul 15, 2015, at 7:21 AM, Jack Krup
Hey Shawn, I was debugging a little bit; this is the problem:
When adding a field from the Solr document to the Lucene one, even if that
field was previously added to the Lucene document by the execution of the
copyField instruction, this check is carried out:
org/apache/solr/update/DocumentBui
We are building an admin for our inventory. Using solr's faceting,
searching and stats functionality it provides different ways an admin can
look at the inventory.
The admin can also do some updates on the items and they need to see the
updates almost real time.
Our public facing website is alread
1. I can't follow your explanation.
2. childFilter=(image_uri_s:somevalue) OR (-image_uri_s:*)
is not correct: it lacks quotes, and it is pointless (selecting some term and
negating all terms gives nothing). Thus, the only workable syntax is
childFilter="other_field:somevalue -image_uri_s:*"
3. I c
Hi,
in some cases it can be necessary to have the copy field stored. My Solr
instance is used by some legacy applications that need to retrieve fields by
some specific field names. That's why I need to maintain 2 copies of the same
field: one with the old name and the other with the new name (that is
A common approach to this problem is to include the spellcheck component and,
if there are corrections, include a "Did you mean ..." link in the results page.
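Assuming the spellcheck component is already wired into the request handler,
asking for a collated suggestion is just a couple of extra request parameters
(host, core and the misspelled query are invented for illustration):

curl 'http://localhost:8983/solr/mycore/select?q=aples+oragnes&q.op=AND&spellcheck=true&spellcheck.collate=true'
# the spellcheck section of the response then carries collation suggestions
# that the UI can render as: Did you mean "apples oranges"?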
-Original Message-
From: Walter Underwood [mailto:wun...@wunderwood.org]
Sent: Wednesday, July 15, 2015 10:36 AM
To: solr-user@lu
bq: The admin can also do some updates on the items and they need to see the
updates almost real time.
Why not give the admin control over commits and default the other commits to
something reasonable? So make your defaults, say, 15 seconds (or 30 seconds
or longer). If the admin really needs the
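A hedged sketch of that split, with invented collection and field names:
background indexing sends plain updates and lets the configured autoSoftCommit
interval (say 15-30 seconds) make them visible, while the admin's save path
asks for an explicit commit:

# bulk/background updates: no commit parameter, visibility comes from autoSoftCommit
curl 'http://localhost:8983/solr/inventory/update' -H 'Content-Type: text/xml' \
  --data-binary '<add><doc><field name="id">item-1</field><field name="status">in_stock</field></doc></add>'
# admin edit: explicit commit so the change shows up right away
curl 'http://localhost:8983/solr/inventory/update?commit=true' -H 'Content-Type: text/xml' \
  --data-binary '<add><doc><field name="id">item-1</field><field name="status">reserved</field></doc></add>'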
On 7/15/2015 8:55 AM, Martínez López, Alfonso wrote:
> in some cases it can be necessary to have the copy field stored. My Solr
> instance is used by some legacy applications that need to retrieve fields by
> some specific field names. That's why I need to maintain 2 copies of the same
> field: o
Since they want to explicitly search within a given "version" of the data, this
seems like a textbook application for collection aliases.
You could have N public collection names: current_stuff, previous_stuff_1,
previous_stuff_2, ... At any given time, these will be aliased to reference
the
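For illustration, repointing an alias is a single Collections API call; the
collection names below are placeholders, and issuing CREATEALIAS again with the
same alias name simply moves it to the new collection:

curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=current_stuff&collections=stuff_20150715'
# later, after the next collection has been built and verified:
curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=current_stuff&collections=stuff_20150801'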
Alfonso:
Haven't worked with this myself, but could "field aliasing" handle your use-case
_without_ the need for a copyField at all?
See: https://issues.apache.org/jira/browse/SOLR-1205
Again I need to emphasize that I HAVE NOT worked with this so it may be
a really bad suggestion. Or it may not
The OP asked about MapReduceIndexerTool. My understanding is that this is
actually somewhat slower than the standard indexing path and is recommended
only if the site is already invested in the Hadoop infrastructure. E.g. input
files are already distributed on the Hadoop/Search cluster via HD
If you inlined the query rather than referencing the thread, it would be
easier to understand the problem.
Once again, what doesn't meet your expectation: the order of returned parents,
or the order of children attached to a parent doc?
On Wed, Jul 15, 2015 at 1:56 AM, DorZion wrote:
> I can sort the parent
If you're running in cloud mode, move to using collections with
the configs kept in Zookeeper.
Assuming you're not, you can use the create_core stuff, I'm
not sure what's unclear about it, did you try
bin/solr create_core -help? If that's not clear please make some
suggestions for making it more s
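For example, something along these lines creates a core from a classic
schema.xml-based config instead of the schemaless one; the core name and paths
are placeholders, and if I remember correctly -d also accepts the name of a
bundled configset such as basic_configs:

bin/solr create_core -c mycore -d basic_configs
# or point -d at your own config directory containing solrconfig.xml and schema.xml
bin/solr create_core -c mycore -d /path/to/my_conf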
Hi all
I asked a related question before but couldn't get any response (see
SolrQueryRequest in SolrCloud vs Standalone Solr), asking it differently
here.
Is there a way to invoke
IndexSearcher.search(Query, Collector) over a SolrCloud collection so that
it invokes the search/collect implicitly
Sorry Erick, I completely agree with you, I didn't specify in detail what
I was thinking:
"copy fields must not be executed if the updated field is not a source
field (in a copyField pair)"
Furthermore, I agree again with you: copyField should try to give a
different analysis to the new
2015-07-15 16:01 GMT+01:00 Mikhail Khludnev :
> 1. I can't get your explanation.
>
> 2. childFilter=(image_uri_s:somevalue) OR (-image_uri_s:*)
> is not correct, lacks of quotes , and pointless (selecting some term, and
> negating all terms gives nothing).
Not considering the syntax,
We are talk
Charles:
bq: My understanding is that this is actually somewhat slower than
the standard indexing path...
Yes and no. If you just use a single thread, you're right it'll be
slower since it has to copy a
bunch of stuff around. Then at the end, the --go-live step copies the
built index to Solr
the
Hi Charles,
Thank you for the response. We will be using aliasing. Looking into ways
to avoid ingestion into each of the collections, as you have mentioned: "For
example, would it be faster to make a file system copy of the most recent
collection ..."
MapReduceIndexerTool is not an option at this po
Thank you all. Looks like OR is a better choice vs. AND.
Charles: I don't understand what you mean by the "spellcheck component".
Do you mean OR works best with spell checker?
Steve
On Wed, Jul 15, 2015 at 11:07 AM, Reitzel, Charles <
charles.reit...@tiaa-cref.org> wrote:
> A common approach t
bq: Is there a way to invoke IndexSearcher.search(Query, Collector)
Problem is that this question doesn't make a lot of sense to me.
IndexSearcher is, by definition, local to a single Lucene
instance. Distributed requests are a whole different beast. If you're going
to try to use custom request ha
By the way, using OR as the default, other than returning more results as
more words are entered, the ranking and performance of the search remain
the same, right?
Steve
On Wed, Jul 15, 2015 at 12:12 PM, Steven White wrote:
> Thank you all. Looks like OR is a better choice vs. AND.
>
> Charles
This is really an apples/oranges comparison. They're essentially different
queries, and scores aren't comparable across different queries.
If you're asking "if doc 1 and doc 2 are returned by defaulting to AND or OR,
are they in the same position relative to each other?" then I'm pretty sure the
a
Erick,
Thanks for your response and for the pointers! This will be a good starting
point; I will go through these.
The good news is that in our use case, we don't really care about the two passes.
In fact, our results are ConstantScore, so we only need to aggregate (i.e.
sum) the results from each shard.
Hi Erick,
I understand there are variables that will impact ranking. However, if I
leave my edismax setting as is and simply switch from AND to OR as the
default Boolean, now if a user types "apples oranges" (without quotes) will
the ranking be the same as when I had AND? Will the performance be
OK, I checked with my data:
color:orlean => "numFound": 1,
-color:[* TO *] => "numFound": 602096 (it used to return 0 until 'pure
negational' (sic) queries were delivered)
color:orlean -color:[* TO *] => "numFound": 0,
color:orlean (*:* -color:[* TO *]) => "numFound": 602097,
fyi
https://lu
On Wed, Jul 15, 2015 at 10:46 AM, Chetan Vora wrote:
> Hi all
>
> I asked a related question before but couldn't get any response (see
> SolrQueryRequest in SolrCloud vs Standalone Solr), asking it differently
> here.
>
> Is there a way to invoke
>
> IndexSearcher.search(Query, Collector) over a
Mikhail -
This worked great.
http://localhost:8983/solr/demo/select?q={!parent
which='type:parent'}image_uri_s:somevalue&fl=*,[child
parentFilter=type:parent
childFilter=-type:parent]&indent=true
Thank you.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Querying-Nes
Mikhail
We do add new nodes with our custom results in some cases... just curious-
does that preclude us from doing what we're trying to do above? FWIW, we
can avoid the custom nodes if we had to.
Chetan
On Wed, Jul 15, 2015 at 12:39 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
>
Talking about performance, you should take a look at the difference in
performance between:
- disjunction of k sorted arrays, O(n*k*log(k)) in Lucene, where *k* is the
number of disjunction clauses and *n* the average posting list size (just learned
today from an expert Lucene committer)
- conjunction
bq: now if a user types "apples oranges" (without quotes) will
the ranking be the same as when I had AND?
You haven't defined "same". But at root I think this is a red
herring, you haven't stated why you care. They're different queries
so I think the question is really which is more or less satisf
bq: does that preclude us from doing what we're trying to do above?
Not at all. You just have to process each response and combine them
perhaps.
In this case, you might be able to get away with just specifying the
shards parameter to the query and having the app layer deal with
the responses. At
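A sketch of that, with made-up hosts and core names; the shards parameter
simply lists the cores to query, and the application layer merges whatever
each one returns:

curl 'http://host1:8983/solr/core1/select?q=*:*&shards=host1:8983/solr/core1,host2:8983/solr/core2,host3:8983/solr/core3'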
What are your cache sizes? Max doc?
Also, what GC settings are you using? 6GB isn't all that much for a
memory-intensive app like Solr, esp. given the number of facet fields
you have. Lastly, are you using docvalues for your facet fields? That
should help reduce the amount of heap needed to comput
Hello,
I've run into quite a snag and I'm wondering if anyone can help me out
here. So, the situation:
I am using the DataImportHandler to pull from a database and a Linux file
system. The database has the metadata; the file system has the document text. I
thought it had indexed all the files I had
That should be author 280 and 281. Sorry
Hi Everyone,
I need to use a RankQuery within a grouping [1].
I did some experiments with RerankQuery [2] and Solr 4.10.2, and it seems
that
if you group on a field, the re-ranking query is completely ignored
(in SolrCloud, and on a single instance).
I would expect to see the results in each group
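For anyone trying to reproduce this, a re-rank request looks roughly like the
following; the field names and values are invented, only the rq/rerank syntax
and the group parameters matter:

curl -G 'http://localhost:8983/solr/mycore/select' \
  --data-urlencode 'q=title:foo' \
  --data-urlencode 'rq={!rerank reRankQuery=$rqq reRankDocs=200 reRankWeight=3}' \
  --data-urlencode 'rqq=popularity:[10 TO *]' \
  --data-urlencode 'group=true' \
  --data-urlencode 'group.field=category_s'
# as described above, the re-ranking appears to be ignored as soon as group=true is set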
From here:
https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance
we can learn that the transaction log is needed when replicas are used in
SolrCloud.
Do I need it if I am not using replicas?
Could it be disabled for a performance improvement?
What are the negative i
I have a handler configured in solrconfig.xml with shards.tolerant=true, which
means unavailable shards are ignored when returning results.
Sometimes shards are not really down, but are doing GC or a heavy commit.
Is it possible, and how, to ignore them? I prefer to get a partial result
instead of a timeout er
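For what it's worth, shards.tolerant can be passed per request (or placed in
the handler's defaults section); host and collection below are placeholders:

curl 'http://localhost:8983/solr/mycollection/select?q=*:*&shards.tolerant=true'
# when some shards could not answer, the response header is flagged with partialResults=true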
On Wed, Jul 15, 2015 at 12:00 PM, Chetan Vora wrote:
> Mikhail
>
> We do add new nodes with our custom results in some cases... just curious-
> does that preclude us from doing what we're trying to do above? FWIW, we
> can avoid the custom nodes if we had to.
>
If your custom component doesn't m
bq: Do I need it if I am not using replicas
Yes. The other function of transaction logs is
to recover documents indexed to segments
that haven't been closed in the event of
abnormal termination (i.e. somebody pulls
the plug).
Here's some info you might find useful:
https://lucidworks.com/blog/u
After playing with SolrCloud I answered my own question: multiple collections
can live on the same node. Following the how-to in the solr-ref-guide was
getting me confused.
My first guess is that somehow these two documents have
the same uniqueKey as some other documents, so later
docs are replacing earlier ones. Although not conclusive,
looking at the admin page for the cores in question may
show numDocs=278 and maxDoc=280 or some such, in
which case that would be what's happenin
On 7/15/2015 12:42 PM, SolrUser1543 wrote:
> From here:
> https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance
> we can learn that the transaction log is needed when replicas are used in
> SolrCloud.
>
> Do I need it if I am not using replicas?
> Could it be disab
You were 100 percent right. I went back and checked the metadata looking for
multiple instances of the same file path. Both of the files had an extra set
of metadata with the same filepath. Thank you very much.
Sorry in advance if I am beating a dead horse here ...
Here is an article by Mark Miller that gives some background and examples:
http://blog.cloudera.com/blog/2013/10/collection-aliasing-near-real-time-search-for-really-big-data/
In particular, see the section entitled "Update Alias".
-Orig
Thanks Mikhail, the post is really useful!
I will study it in detail.
A slight change in the syntax changes the parsed query.
Anyway, I just tried the q=(image_uri_s:somevalue) OR (-image_uri_s:*)
query approach again.
And actually it is working as expected:
q=(name:nome) OR (-name:*) (gives me all t
As you've seen RankQueries won't currently have any effect on Grouping
queries.
A RankQuery can be combined with Collapse and Expand though. You may want
to review Collapse and Expand and see if it meets your use case.
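A rough sketch of Collapse and Expand combined with a re-rank query, with all
field names invented:

curl -G 'http://localhost:8983/solr/mycore/select' \
  --data-urlencode 'q=title:foo' \
  --data-urlencode 'fq={!collapse field=group_s}' \
  --data-urlencode 'rq={!rerank reRankQuery=$rqq reRankDocs=100 reRankWeight=2}' \
  --data-urlencode 'rqq=popularity:[10 TO *]' \
  --data-urlencode 'expand=true' \
  --data-urlencode 'expand.rows=5'
# one representative document per group_s value in the main results,
# with the other group members returned in the "expanded" section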
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Jul 15, 2015 at 2:36 PM,
Hi,
Can you please provide me the privilege to edit Wiki pages.
My Wiki username is Dikshant.
Thanks,
Dikshant
I added you to the Solr Wiki, if you need Lucene Wiki access let us know.
Erick
On Wed, Jul 15, 2015 at 7:59 PM, Dikshant Shahi wrote:
> Hi,
>
> Can you please provide me the privilege to edit Wiki pages.
>
> My Wiki username is Dikshant.
>
> Thanks,
> Dikshant
Thanks Erick! This is good for now.
On Thu, Jul 16, 2015 at 9:54 AM, Erick Erickson
wrote:
> I added you to the Solr Wiki, if you need Lucene Wiki access let us know.
>
> Erick
>
> On Wed, Jul 15, 2015 at 7:59 PM, Dikshant Shahi
> wrote:
> > Hi,
> >
> > Can you please provide me the privilege t