Dear Team,
I am trying to implement nested boosting in Solr using the map function:
http://www.example.com:8984/solr/collection1/select?&q=laundry
services&boost=map(query({!dismax
qf=titlex !v=$ql3 pf=""}),0,0,1,map(query({!dismax qf=city v='"mumbai"'
pf=""}),0,0,1,15))&ql3="laundry services".
But
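(For readability, here is my reading of the nested boost above, assuming Solr's five-argument map(x,min,max,target,default), where "default" is returned when x falls outside [min,max]; annotated pseudocode, with the field names taken from the query itself:

    boost = map(score of {!dismax qf=titlex v=$ql3}, 0, 0,
                1,        <- title query scored 0 (no match): neutral boost of 1
                map(score of {!dismax qf=city v='"mumbai"'}, 0, 0,
                    1,    <- title matched but city did not: boost stays 1
                    15))  <- both queries matched: boost of 15
)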
Hi,
I am using Solr 4.7.1 and trying to do a full import. My data source is a
table in MySQL. It has 1000 rows and 20 columns.
Whenever I try to do a full import, Solr stops responding. But when I
try to do an import with a limit of 40 rows or less, it works fine.
If I try to import more than
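(A common culprit with the DataImportHandler and MySQL, offered as a guess rather than a confirmed diagnosis: by default the MySQL JDBC driver buffers the entire result set in memory. Setting batchSize="-1" on the data source makes the driver stream rows instead; the url, user, and password values below are placeholders:

    <dataSource type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/mydb"
                batchSize="-1"
                user="user" password="password"/>
)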
We'll surely look into UIMA integration.
But before moving on, is this ( https://wiki.apache.org/solr/OpenNLP ) the only
link we've got for the integration? Isn't there any other article or link that
might help us fix this problem?
Thanks,
Vivek
On Tue, Jun 3, 2014 at 2:50 AM, Ahmet Arslan wrote:
>
Hello Jitka,
I wonder why you put the custom component logic into prepare() but not in
process()?
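(For context, a minimal sketch of the two extension points in question; the class name and comments are illustrative, but prepare() and process() are the actual SearchComponent hooks. prepare() runs on every component in the chain before any of them runs process(), so logic that depends on search results normally belongs in process():

    import java.io.IOException;
    import org.apache.solr.handler.component.ResponseBuilder;
    import org.apache.solr.handler.component.SearchComponent;

    public class RetryComponent extends SearchComponent {
      @Override
      public void prepare(ResponseBuilder rb) throws IOException {
        // Called for every component before any process() call;
        // no search results exist yet at this point.
      }

      @Override
      public void process(ResponseBuilder rb) throws IOException {
        // Called after earlier components (e.g. QueryComponent) have run,
        // so the ResponseBuilder now carries results to inspect.
      }

      @Override
      public String getDescription() { return "illustrative retry component"; }

      @Override
      public String getSource() { return null; }
    }
)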
On 28.05.2014 at 1:55, "Jitka" wrote:
> Hello and thanks for reading my question.
>
> If our high-level search handler doesn't get enough results back from a
> Solr
> query, it tweaks the qu
Hi Jean-Sebastien,
One thing you didn't mention is whether, as you increase (I assume)
the cache sizes, you actually see performance improve. If not, then maybe there
is no value in increasing them.
I assume you changed only one cache at a time? Were you able to get any one
of them to the poi
These are the typical queries we are using.
I'm curious if any of these parameters could be causing issues when using
synonyms.
?shards=myserver1.com:8080/svc/solr/wdsc,myserver1.com:8080/svc/solr/kms&sort=score
desc&q=(keyword:(this is a test) OR titleSearch:(this is a test) AND
(doctype:("D
We found a problem with the synonym list, and suspect there was some sort of
recursion causing the memory to be gobbled up until the JVM crashed.
Is this expected behavior from complex synonyms?
Or could this be due to the combination of complex synonyms and a bad query
format?
Jeremy D. Branha
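(An aside on the recursion theory: Lucene's SynonymFilter applies its mappings in a single pass and does not feed its own output back through the filter, so literal recursion would be surprising. What does grow quickly is query-time expansion of large multi-word groups; with entries like the following illustrative ones, not taken from the poster's list,

    television, tv, idiot box, small screen
    repair, fix, mend, patch up

and expand="true", each matching query term becomes an OR over the whole group, and multi-word entries multiply further inside phrase fields, so a long query against a complex list can get very large.)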
You'll get very different performance profiles from the various
highlighters (we saw up to 15x speed difference in our queries on
average by changing highlighters). The default one re-analyzes the
entire stored document in memory and is the slowest, but provides the
most faithful match to the
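(For anyone comparing: switching to the FastVectorHighlighter in 4.x takes one request parameter plus term vectors on the highlighted field. A sketch, with a hypothetical "body" field:

    <field name="body" type="text_en" indexed="true" stored="true"
           termVectors="true" termPositions="true" termOffsets="true"/>

and at query time:

    q=test&hl=true&hl.fl=body&hl.useFastVectorHighlighter=true

The term vectors cost index space but avoid re-analyzing the stored text.)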
So, we were finally able to reproduce the heap overload behavior with a
stress test of a query that highlighted the large fields we found. We'll
have to play around with the highlighting settings, but for now we've
disabled the highlighting on this query (which is a canned query that
doesn't even
We have 3 Solr instances on 3 different hosts, and we have an external
ZooKeeper configured for each Solr instance.
Suppose instance1 and instance2 are up and running and instance3 is down. A
few records are added to both of the running instances.
I am able to see the records that were added to in
Hi,
I am also facing similar issues with Tomcat running Solr. Were you able
to solve this issue?
Thanks
Subrata
Hi Shawn,
The reason why I am looking at the physical memory is that I see my nodes
falling off often. I have attached the cloud structure with this. I can't
seem to find the reason why this third node has 'gone away'. However, I can
still query it, as my Tomcat server is up and running.
Currently t
ZooKeeper allows clients to put watches on paths in the ZK tree. When the
cluster state changes, every Solr client is notified by the ZK server and then
each client reads the updated state. No polling is needed or even helpful.
In any event, reading from ZK is much more lightweight than writing,
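(For the curious, the watch pattern looks roughly like this with the plain ZooKeeper client; the path is the one Solr 4.x keeps shared cluster state under, and error handling is omitted:

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class StateWatcher implements Watcher {
      private final ZooKeeper zk;

      public StateWatcher(ZooKeeper zk) { this.zk = zk; }

      public byte[] readAndWatch() throws Exception {
        // Passing "this" registers a watch: the server pushes a single
        // notification when the node changes, so no polling is needed.
        return zk.getData("/clusterstate.json", this, null);
      }

      @Override
      public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
          try {
            readAndWatch(); // watches are one-shot: re-read and re-register
          } catch (Exception e) {
            // reconnect/retry handling would go here
          }
        }
      }
    }
)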
Hi,
I believe I answered it. Let me re-try:
There is no committed code for OpenNLP. There is an open ticket with patches.
They may not work with current trunk.
Confluence is the official documentation. The wiki is maintained by the
community, meaning the wiki can talk about some uncommitted features/stuf
I’m curious how CloudSolrServer works in practice.
I understand that it gets the active solr nodes from zookeeper, but does it do
this for every request?
If it does hit zk for every request, that seems to put a lot of pressure on the
zk ensemble.
If it does NOT hit zk for every request, then h
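(For reference, typical 4.x client code; hosts and collection name are placeholders. The client reads the cluster state from ZK once, caches it, and relies on watch notifications rather than fetching it per request, which matches the ZooKeeper-watch answer above:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
    server.setDefaultCollection("collection1");
    server.connect(); // reads cluster state and registers watches
    QueryResponse rsp = server.query(new SolrQuery("*:*"));
)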
Thanks for your reply. I'll check out that link.
On 6/2/2014 2:21 PM, Marc Campeau wrote:
> I notice I have this in the logs when I start SOLR for default example (I
> had the same with my own connection)
>
> 21242 [coreZkRegister-1-thread-1] INFO
> org.apache.solr.cloud.ShardLeaderElectionContext – Enough replicas found
> to continue.
> 21242
On 6/2/2014 1:39 PM, Kashish wrote:
> I have a SOLR Cluster Cloud(SOLR+Tomcat) set up with one shard across 3 VM's.
> All works well. But i see one node falls off after sometime. I noticed this
> with two shards as well. The physical memory shoots up to 3.61 GB for total
> of 3.73 GB. Even before i
I notice I have this in the logs when I start Solr for the default example (I
had the same with my own connection)
21242 [coreZkRegister-1-thread-1] INFO
org.apache.solr.cloud.ShardLeaderElectionContext – Enough replicas found
to continue.
21242 [coreZkRegister-1-thread-1] INFO
org.apache.solr.clou
Try to stay with a separate collection/core for each tenant - otherwise
relevancy for document scores gets "polluted" by other tenants, even if you
do use filter queries to isolate what documents get returned for a tenant in
a multi-tenant core.
-- Jack Krupansky
-Original Message-
F
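(For illustration, the filter-query isolation Jack mentions looks like this, with a hypothetical tenant_id field:

    q=printer repair&fq=tenant_id:acme

The fq restricts which documents are returned, but term statistics such as IDF are still computed over the whole shared index, which is why scores get "polluted"; a per-tenant collection keeps both results and statistics separate.)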
Here's my solrconfig.xml:
4.4
${solr.data.dir:}
${solr.lock.type:native}
true
6
3
false
${solr.autoSoftCommit.maxTime:15000}
1024
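(The XML element tags did not survive the archive. Reading the surviving values, the recognizable settings were probably along these lines; this is a partial, assumed reconstruction, and the bare numbers above are left out because they cannot be mapped confidently:

    <luceneMatchVersion>4.4</luceneMatchVersion>
    <dataDir>${solr.data.dir:}</dataDir>
    <lockType>${solr.lock.type:native}</lockType>
    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:15000}</maxTime>
    </autoSoftCommit>
)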
I have a SolrCloud cluster (Solr + Tomcat) set up with one shard across 3 VMs.
All works well, but I see one node fall off after some time. I noticed this
with two shards as well. The physical memory shoots up to 3.61 GB out of a total
of 3.73 GB. Even before I loaded the documents, the physical memory u
Hi all,
I've been reading up on SolrCloud (via Solr in Action) with an eye toward
multi-tenancy. (Read: "SolrCloud newbie")
One question that came up: what if a "one size fits all" synonyms file does
not work for all customers?
i.e. different customers/industries use different sets of synonym
James,
I get no results back and no suggestions for "wrangle"; however, I get
suggestions for "wranglr", and "wrangle" is not present in my index.
I am just searching for "wrangle" in a field that is created by copying
other fields; as to how it is analyzed, I don't have access to it right now.
Thanks
Hi,
One option (not tested by myself) could be the use of payloads (
http://wiki.apache.org/solr/Payloads).
Regards.
On Mon, Jun 2, 2014 at 7:58 PM, Hakim Benoudjit
wrote:
> Hi guys,
> Is it possible in solr to boost documents having a field value (Ex.
> :)?
> I know that it's possible to bo
Hakim,
That is what Boost Query (bq=) does.
http://wiki.apache.org/solr/DisMaxQParserPlugin#bq_.28Boost_Query.29
Jason
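(For example, with a hypothetical "color" field and boost factor; bq is applied at query time, on top of the normal query score:

    q=laptop&defType=dismax&qf=title&bq=color:blue^10

Documents whose color field matches "blue" are boosted above otherwise-equal matches.)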
On Jun 2, 2014, at 10:58 AM, Hakim Benoudjit wrote:
> Hi guys,
> Is it possible in solr to boost documents having a field value (Ex.
> :)?
> I know that it's possible to boo
Boon,
I expect you will find many definitions of “proper usage” depending upon
context and expected results. Personally, I don't believe this is Solr's job to
enforce, and there are many ways, through the use of directives in the servlet
container layer that can allow restrictions if you feel th
If "wrangle" is not in your index, and if it is within the max # of edits, then
it should suggest it.
Are you getting anything back from spellcheck at all? What is the exact query
you are using? How is the spellcheck field analyzed? If you're using
stemming, then "wrangle" and "wrangler" mig
Thanks. You mean "wrangler" has been stemmed to "wrangle"? If that's the
case, then why does it not return any results for "wrangle"?
On Mon, Jun 2, 2014 at 2:07 PM, david.w.smi...@gmail.com <
david.w.smi...@gmail.com> wrote:
> It appears to be stemmed.
>
> ~ David Smiley
> Freelance Apache Lu
It appears to be stemmed.
~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
On Mon, Jun 2, 2014 at 2:06 PM, S.L wrote:
> OK, I just realized that "wrangle" is a proper english word, probably thats
> why I dont get a suggestion for "
OK, I just realized that "wrangle" is a proper English word; probably that's
why I don't get a suggestion for "wrangler" in this case. However, in my
test index there is no "wrangle" present, so even though this is a proper
English word, since there is no occurrence of it in the index, shouldn't
Solr
Thanks, I will check the JIRA, but you didn't answer my first
question: is there no way to integrate Solr with OpenNLP? Or is there
any committed code with which I can go ahead?
Thanks,
Vivek
On Mon, Jun 2, 2014 at 10:30 PM, Ahmet Arslan wrote:
> Hi,
>
> Here is the jira issue : https:
I do not get any suggestion (when I search for "wrangle"); however, I
correctly get the suggestion "wrangler" when I search for "wranglr". I am
using the Direct and WordBreak spellcheckers in combination; I have not
tried using anything else.
Is the distance calculation of Solr different from what Le
Hi guys,
Is it possible in Solr to boost documents having a certain field value
(e.g. a given field:value pair)?
I know that it's possible to boost a field above other fields at
query time, but I want to boost a field value, not the field name.
And if so, is the boosting done at query time or on indexing?
--
Hakim Benoudjit.
What do you get then? Suggestions, but not the one you’re looking for, or
is it deemed correctly spelled?
Have you tried another spellChecker impl, for troubleshooting purposes?
~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
On S
Anyone?
On Sat, May 31, 2014 at 12:33 AM, S.L wrote:
> Hi All,
>
> I have a small test index of 400 documents , it happens to have an entry
> for "wrangler", When I search for "wranglr", I correctly get the collation
> suggestion as "wrangler", however when I search for "wrangle" , I do not
>
Siegfried:
Thanks! That pretty well nails the issue as being in Tika; it's nice to
know!
Erick
On Mon, Jun 2, 2014 at 10:14 AM, Siegfried Goeschl wrote:
> Hi folks,
>
> Brian was so kind and sent me the troublesome PDF document
>
> I gave it a try with PDFBox directly in order to extract the
Hi folks,
Brian was so kind as to send me the troublesome PDF document.
I gave it a try with PDFBox directly in order to extract the text
(PDFBox is used by Tika to extract the textual content of a PDF document):
* hitting an infinite loop with PDFBox 1.8.3
* no problems with PDFBox 1.8.4 & 1.8.
Hi,
Here is the jira issue : https://issues.apache.org/jira/browse/LUCENE-2899
Anyone can create an account.
I haven't used UIMA myself and have little knowledge of it, but I believe
it is possible to use OpenNLP inside UIMA.
You need to dig into UIMA documentation.
Solr UIMA integrati
Hi Arslan,
If not the uncommitted code, then which code should be used to integrate?
If I have to comment on my problems, which JIRA issue, and how do I post it?
And why are you suggesting UIMA integration? My requirement is integrating
with OpenNLP. You mean we can do all the activities through UIMA as we do
it u
On our search pages we have a main request where we really want to give the
correct answer, but we also have a number of other child searches performed
on that page where we're happy to get 90% of the way there and be able to
enforce an SLA.
Right now, when the main search finishes we have to comp
Do these numeric values have any significance to the application, or are
they merely to reserve holes that will be later filled in without reindexing
existing documents? I mean, there is no API to retrieve the numeric values
or query them, right? IOW, they are not like stored values or docvalues
Alas, the doc is silent on that point:
https://cwiki.apache.org/confluence/display/solr/Managed+Resources
The Solr javadoc adds no additional clarification.
At a minimum, it should either clearly state that the overwrite (replace)
feature is not supported, or show how to do it.
I haven't dug
Hi,
Uncommitted code can have these kinds of problems. It is not guaranteed to
work with the latest trunk.
You could comment on the problem you face on the JIRA ticket.
By the way, maybe you are after something doable with the already committed UIMA
stuff?
https://cwiki.apache.org/confluence/display/s
Thanks for your quick response.
Our JVM is configured with a heap of 8GB, so we are pretty close to the
"optimal" configuration you are mentioning. The only other programs running are
ZooKeeper (which has its own storage device) and a proprietary API (with a heap
of 1GB) we have on top of Solr t
We recently upgraded to Solr 4.8 and we are using the REST API to update
synonyms.
We are trying to migrate synonyms from the synonyms.txt file to the new files. We
have our synonyms defined in synonyms.txt, where we can override the expand
property with =>,
e.g. tv=>television in synonyms.txt.
I am wondering h
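(For reference, a one-way mapping such as tv=>television translates to a JSON map in the managed synonyms REST API; the resource name "english" and core name below are placeholders:

    curl -X PUT -H 'Content-type: application/json' \
      --data-binary '{"tv": ["television"]}' \
      'http://localhost:8983/solr/collection1/schema/analysis/synonyms/english'
)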
Upgrade steps are carried along in the CHANGES.txt file; there's a section
for every release (e.g. 4.1 -> 4.2, 4.5 -> 4.7, etc.). There's no 4.0 -> 4.8
in a single go, though, so I'd start there.
Best,
Erick
On Mon, Jun 2, 2014 at 7:14 AM, Alexandre Rafalovitch
wrote:
> You can do lots of new stu
On 6/2/2014 8:24 AM, Jean-Sebastien Vachon wrote:
> We have yet to determine where the exact breaking point is.
>
> The two patterns we are seeing are:
>
> - less cache (around 20-30% hit/ratio), poor performance but
> overall good stability
When caches are too small, a low hit ratio is
Would both then be supported? I see where it would be easily detectable.
And I also assume that this wouldn't break back-compat?
Best
Erick
On Mon, Jun 2, 2014 at 6:22 AM, Elran Dvir wrote:
> Hi all,
>
> I am the one that contributed EnumField code to Solr.
> There was a long discussion how th
Joe:
One thing to add: if you're returning that doc (or perhaps even some
fields; this bit is still something of a mystery to me), then the whole 180MB
may be being decompressed. Since 4.1, the stored fields have been compressed
on disk by default. That is, this is only true if the docs in question
Hmmm, when changing the schema there might be issues if you're changing the
definition of an already-existing field. I've seen weirdness when the
fundamental definition of a field changes, so I'd be cautious.
You'd only be able to add new fields via copyField, I'd guess.
In this situation, since yo
Hi All,
We have a 5-node setup running Solr 4.8.1 and we are trying to get the most
out of it by tuning the Solr caches.
Following is the output of the script version.sh provided with Tomcat
Server version: Apache Tomcat/7.0.39
Server built: Mar 22 2013 12:37:24
Server number: 7.0.39.0
OS Name:
You can do lots of new stuff, but I believe the old config will run ok
without changes. One thing to be aware of is the logging jar
unbundling and the manual correction for that when running under Tomcat.
That's on the wiki somewhere and should have been covered in the 4.2 to
4.7 changes.
Regards,
Alex.
I followed this link to integrate OpenNLP:
https://wiki.apache.org/solr/OpenNLP
Installation
For English language testing, until LUCENE-2899 is committed:
1. pull the latest trunk or 4.0 branch
2. apply the latest LUCENE-2899 patch
3. do 'ant compile'
cd solr/contrib/opennlp/sr
Hi! I have a question which I posted on
http://stackoverflow.com/questions/23959727/sum-of-nested-queries-in-solr about
taking the sum of OR'd nested queries. I'll repeat it here, but if you want
some SO points and have an answer, feel free to answer there.
[quote]
We have a search that takes
Here is my configuration :
schema.xml:
solrconfig.xml:
text_en
default
name_gen
solr.DirectSolrSpellChecker
internal
0.5
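(The XML tags were stripped in the archive, but the surviving values line up with a standard DirectSolrSpellChecker definition, so the solrconfig.xml part was presumably something like this; an assumed reconstruction, not the poster's verbatim config:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <str name="queryAnalyzerFieldType">text_en</str>
      <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">name_gen</str>
        <str name="classname">solr.DirectSolrSpellChecker</str>
        <str name="distanceMeasure">internal</str>
        <float name="accuracy">0.5</float>
      </lst>
    </searchComponent>
)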
Hi Wolfgang,
Thanks for your response. Could you give a running example of
MapReduceIndexerTool
for indexing CSV files?
If you are referring to
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-User-Guide/csug_mapreduceindexertool.html?scroll=cs
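(A sketch of an invocation, based on the Cloudera Search documentation of that era; the jar name, paths, and hosts are placeholders, and the CSV parsing itself is done by a readCSV command inside the referenced morphline file:

    hadoop jar search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
      --morphline-file morphline.conf \
      --output-dir hdfs://namenode:8020/tmp/outdir \
      --zk-host zk01:2181/solr \
      --collection collection1 \
      --go-live \
      hdfs://namenode:8020/indir/*.csv
)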
No, collate is working as expected. Please help. It is more like spell
checking with the Infix suggester.
Hi all,
I am the one who contributed the EnumField code to Solr.
There was a long discussion about how the integer values of an enum field should
be indicated in the configuration.
It was decided that the integer value wouldn't be written explicitly, but would
be implicitly determined by the value order.
Hi all,
First time posting, so the regular apologies if this is a popular question.
Anyhoo - I'm running Solr 4.0 on a test rig with multicore and I would like
to upgrade to 4.8.1. I can't find any clear tutorials on this on the web
and I can only see a thread on 4.2 -> 4.7 on the mailing list.
Can
Joe - there shouldn't really be a problem *indexing* these fields:
remember that all the terms are spread across the index, so there is
really no storage difference between one 180MB document and 180 1MB
documents from an indexing perspective.
Making the field "stored" is more likely to lead
Thanks for clearing this up. The wiki, being an authoritative reference, needs
to be corrected.
Re. default commit settings: I agree that educating developers is essential.
But in reality, you can't rely on this as the sole mechanism for ensuring
proper usage of the update API, especially for c
And the followup question would be.. if some of these documents are
legitimately this large (they really do have that much text), is there a
good way to still allow that to be searchable and not explode our index?
These would be "text_en" type fields.
On Mon, Jun 2, 2014 at 6:09 AM, Joe Gresock
So, we're definitely running into some very large documents (180MB, for
example). I haven't run the analysis on the other 2 shards yet, but this
could definitely be our problem.
Is there any conventional wisdom on a good "maximum size" for your indexed
fields? Of course it will vary for each sys
Hi Erick,
Thanks again. I'm using the same thing as a workaround: I first use "pivot
faceting" and then another call to fetch the actual documents.
On Fri, May 30, 2014 at 9:46 PM, Erick Erickson
wrote:
> OK, I see what you're trying to do. Unfortunately grouping is just not
> built to support multivalu
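(In request form, the workaround described above, with hypothetical field names: one pivot-facet call to get the groups,

    q=*:*&rows=0&facet=true&facet.pivot=brand,model

followed by a filtered query per bucket to fetch the actual documents:

    q=*:*&fq=brand:acme&fq=model:x1
)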
Sounds like you should consider using MapReduceIndexerTool. AFAIK, this is the
most scalable indexing (and merging) solution out there.
Wolfgang.
On Jun 2, 2014, at 10:33 AM, Vineet Mishra wrote:
> Hi Erick,
>
> Thanks for your mail, please let me go through with my use case.
> I am having ar
Not an exact answer: OpenGrok uses Lucene, but not Solr.
On 2 Jun 2014 07:48, "Alexandre Rafalovitch" wrote:
> Hello,
>
> Anybody knows of a recent projects that index SVN repos for Solr
> search? With or without UI.
>
> I know of similar efforts for other VCS, but the only thing I found
> for S
Hi Erick,
Thanks for your mail; let me walk you through my use case.
I have around 20-40 billion records to index, with each record
having around 200-400 fields; the data is sensor data, so it can easily be
stored as Integer or Float. Now, to index this huge amount of data, I am
going wi
Hi Otis,
I have to index a huge amount of data, around billions of records.
Since indexing via the HTTP POST mechanism would be slow and lethargic due to
network delay, I am indexing through EmbeddedSolrServer to create an
index which I can later upload to different shards in SolrCloud, al
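(For reference, the 4.x embedded pattern looks roughly like this; the solr home path and core name are placeholders:

    import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.core.CoreContainer;

    CoreContainer container = new CoreContainer("/path/to/solr/home");
    container.load(); // discovers and loads the cores under the solr home

    EmbeddedSolrServer server = new EmbeddedSolrServer(container, "collection1");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");
    server.add(doc);  // writes to the local core directly, no HTTP hop
    server.commit();
    server.shutdown();

The resulting index directory can then be moved under the target shard, as described above.)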