On 4/18/2014 6:15 PM, Candygram For Mongo wrote:
> We are getting Out Of Memory errors when we try to execute a full import
> using the Data Import Handler. This error originally occurred on a
> production environment with a database containing 27 million records. Heap
> memory was configured for
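One commonly suggested mitigation for DIH out-of-memory errors on large tables (a sketch only; driver, URL, and credentials here are placeholders, and this assumes a JDBC source such as MySQL) is to set batchSize="-1" on the data source so the driver streams rows instead of buffering the whole result set in heap:

```xml
<!-- data-config.xml (sketch): batchSize="-1" asks the MySQL JDBC driver
     to stream rows rather than hold all 27 million records in memory.
     driver/url/user/password values are placeholders. -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://dbhost/dbname"
            user="db_user"
            password="db_pass"
            batchSize="-1"/>
```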
Vineet, please share your setup once you have SolrCloud running.
Are you using Jetty or Tomcat?
On Saturday, April 19, 2014, Vineet Mishra wrote:
> Thanks Furkan, I will definitely give it a try then.
>
> Thanks again!
>
>
>
>
> On Tue, Apr 15, 2014 at 7:53 PM, Furkan KAMACI wrote:
>
>> Hi Vineet;
>>
>> I've
I guess you can apply some deboost for URL.
Lakshmi, it will be easier to offer suggestions if you also provide an
example of what you want to achieve.
On Saturday, April 19, 2014, A Laxmi wrote:
> Markus, like I mentioned in my last email, I have got the qf with title,
> content and url.
I have uploaded several files including the problem description with
graphics to this link on Google drive:
https://drive.google.com/folderview?id=0B7UpFqsS5lSjWEhxRE1NN2tMNTQ&usp=sharing
I shared it with this address "solr-user@lucene.apache.org" so I am hoping
it can be accessed by people in th
Just upload them in Google Drive and share the link with this group.
On Fri, Apr 18, 2014 at 9:15 PM, Candygram For Mongo <
candygram.for.mo...@gmail.com> wrote:
>
>
We consistently reproduce this problem on multiple systems configured with
6GB and 12GB of heap space. To quickly reproduce many cases for
troubleshooting we reduced the heap space to 64, 128 and 512MB. With 6 or
12GB configured it takes hours to see the error.
On Fri, Apr 18, 2014 at 5:54 PM,
I see heap size commands for 128 Meg and 512 Meg. That will certainly run out
of memory. Why do you think you have 6G of heap with these settings?
-Xmx128m -Xms128m
-Xmx512m -Xms512m
wunder
On Apr 18, 2014, at 5:15 PM, Candygram For Mongo
wrote:
> I have lots of log files and other files to
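For reference, a sketch of how a 6 GB heap would actually be set (assuming a Tomcat deployment; on Jetty the same -Xms/-Xmx flags go on the java command line instead):

```shell
# setenv.sh (Tomcat) -- sketch; catalina.sh sources this file at startup.
# Both -Xms and -Xmx must be raised, or the JVM keeps its default heap.
CATALINA_OPTS="$CATALINA_OPTS -Xms6g -Xmx6g"
```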
I have lots of log files and other files to support this issue (sometimes
referenced in the text below) but I am not sure the best way to submit. I
don't want to overwhelm and I am not sure if this email will accept graphs
and charts. Please provide direction and I will send them.
*Issue Descri
The LucidWorks Search query parser does indeed support multi-word synonyms
at query time.
I vaguely recall some Jira traffic on supporting multi-word synonyms at
query time for some special cases, but a review of CHANGES.txt does not find
any such changes that made it into a release, yet.
Th
Ahmet:
Yeah, the index vs. query time bit is a pain. Often what people will
do is take their best shot at index time, then accumulate omissions
and use that list for query time. Then whenever they can/need to
re-index, merge the query-time list into the index time list and start
over.
Not an ide
Hi Jack,
I am planning to extract and publish such words for Turkish language. But I am
not sure how to utilize them.
I wonder if there is a more flexible solution that will work query time only.
That would not require reindexing every time a new item is added.
Ahmet
On Friday, April 18, 20
Luke actually does this, or attempts to. The doc you assemble is lossy,
though:
- it doesn't have stop words
- all capitalization is lost
- original terms for synonyms are lost
- all punctuation is lost
I don't think you can do this unless you store term information.
It's slow.
original words that are
Markus, like I mentioned in my last email, I have got the qf with title,
content and url. That doesn't help a whole lot. Could you please advise if
there are any other parameters that I should consider for solr request
handler config or the numbers I have got for title, content, url in qf
parameter
Hi Markus, Yes, you are right. I passed the qf from my front-end framework
(PHP which uses SolrClient). This is how I got it set-up:
$this->solr->set_param('defType','edismax');
$this->solr->set_param('qf','title^10 content^5 url^5');
where you can see qf = title^10 content^5 url^5
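If the goal is to deboost url matches relative to title and content (as suggested earlier in the thread), one option is to set the weights server-side instead of in PHP. This is only a sketch; the handler name and the exact weights are illustrative:

```xml
<!-- solrconfig.xml (sketch): a lower weight on url effectively deboosts it -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title^10 content^5 url^0.5</str>
  </lst>
</requestHandler>
```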
Hi, replicating a full-featured search engine's behaviour is not going to work
with Nutch and Solr out of the box. You are missing a thousand features such as
proper main content extraction, deduplication, classification of content and
hub or link pages, and much more. These things are possible to i
Sorry, didn't think this through. You're right, still the same problem..
On 16 Apr 2014 17:40, "Alexandre Rafalovitch" wrote:
> Why? I want stored=false, at which point multivalued field is just offset
> values in the dictionary. Still have to reconstruct from offsets.
>
> Or am I missing somethi
Hi,
When I started to compare the search results with the two options below, I
see a lot of difference in the search results, especially the URLs that show
up at the top (from a relevancy perspective):
(1) Nutch 2.2.1 (with Solr 4.0)
(2) Bing custom search set-up
I wonder how I should tweak the boost p
Thanks Furkan, I will definitely give it a try then.
Thanks again!
On Tue, Apr 15, 2014 at 7:53 PM, Furkan KAMACI wrote:
> Hi Vineet;
>
> I've been using SolrCloud for such kind of Big Data and I think that you
> should consider to use it. If you have any problems you can ask it here.
>
> Tha
I believe you could use term vectors to retrieve all the terms in a
document, with their offsets. Retrieving them from the inverted index
would be expensive since the index is term-oriented, not
document-oriented. Without tv, I think you essentially have to scan the
entire term dictionary loo
try not setting softCommit=true, that's going to take the current
state of your index and make it visible. If your DIH process has
deleted all your records, then that's the "current state".
Personally I wouldn't try to mix-n-match like this, the results will
take forever to get right. If you absol
I've been working on getting AnalyzingInfixSuggester to make suggestions
using tokens drawn from multiple fields. I've done this by copying
tokens from each of those fields into a destination field, and building
suggestions using that destination field. This allows me to use
different analysi
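The copy-tokens-into-a-destination-field step described above is ordinarily wired up with copyField directives in schema.xml. A sketch, where the field and type names are hypothetical:

```xml
<!-- schema.xml (sketch): gather values from several source fields into one
     field, then build AnalyzingInfixSuggester suggestions from suggest_src -->
<field name="suggest_src" type="text_suggest" indexed="true" stored="true"
       multiValued="true"/>
<copyField source="title" dest="suggest_src"/>
<copyField source="author" dest="suggest_src"/>
```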
You're confusing a couple of things here. The /select_test can be
accessed by pointing your URL at it rather than using qt, i.e. the
destination you're going to will be
http://server:port/solr/collection/select_test
rather than
http://server:port/solr/collection/select
Best,
Erick
On Thu, Apr 17,
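For context, a handler reachable at .../collection/select_test would be declared in solrconfig.xml roughly like this (a sketch; the defaults shown are placeholders):

```xml
<!-- solrconfig.xml (sketch): the handler name becomes the URL path -->
<requestHandler name="/select_test" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="df">text</str>
  </lst>
</requestHandler>
```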
Hi Alistair,
quick email before getting my plane - I worked with similar requirements in the
past and tuning SOLR can be tricky
* are you hitting the same SOLR query handler (application versus manual
checking)?
* turn on debugging for your application SOLR queries so you see what query is
act
Is this a manageable list? That is, not a zillion names? If so, it
seems like you could do this with synonyms. Assuming your string_ci
bit is a "string" type, you'd need to change that to something like
KeywordTokenizerFactory followed by filters, and you might want to add
something like LowercaseF
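A sketch of the field type Erick describes (the type name and synonyms file are hypothetical): KeywordTokenizerFactory keeps the whole value as a single token, which is then lowercased and run through the synonym list of names:

```xml
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="names.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```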
cool, thanks.
Thanks,
Kranti K. Parisa
http://www.linkedin.com/in/krantiparisa
On Thu, Apr 17, 2014 at 11:37 PM, Erick Erickson wrote:
> No, the 5 most recently used in a query will be used to autowarm.
>
> If you have things you _know_ are going to be popular fqs, you could
> put them in newS
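Putting known-popular fqs in the newSearcher listener, as suggested above, looks roughly like this in solrconfig.xml (a sketch; the fq values are placeholders):

```xml
<!-- solrconfig.xml (sketch): warm popular filter queries on each new searcher -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">*:*</str><str name="fq">category:books</str></lst>
    <lst><str name="q">*:*</str><str name="fq">inStock:true</str></lst>
  </arr>
</listener>
```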
Hello,
I was looking into the "QueryElevationComponent" component.
As per the spec (http://wiki.apache.org/solr/QueryElevationComponent), if the
config is not found in ZooKeeper, it should be loaded from the data directory.
However, I see a bug: it doesn't seem to be working even in the latest 4.7.2
release.
I
Hey Jack,
thanks for the reply. I added autoGeneratePhraseQueries="true" to the
fieldType and now it's giving me even more results! I'm not sure if the
debug of my query will be helpful but I'll paste it just in case someone
might have an idea. This produces 113524 results, whereas if I manually
e
Hi Remi ,
Thanks for your reply.
I tried setting the query_text to "apple ipod" and added the
required doc_id to elevate.
I got the result but again I am not able to get the desired result for NLP
queries such as "ipod nano generation 5" or "apple ipod best music ".
As in both the que
Use an index-time synonym filter with a synonym entry:
indira nagar,indiranagar
But do not use that same filter at query time.
But, that may mess up some exact phrase queries, such as:
q="indiranagar xyz"
since the following term is actually positioned after the longest synonym.
To resolve t
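Jack's suggestion can be sketched as a field type with the synonym filter on the index analyzer only (the type name is illustrative; synonyms.txt would contain the line "indira nagar,indiranagar"):

```xml
<fieldType name="text_syn" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- synonyms applied at index time only, per the advice above -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```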
Hi,
I have a field called "title". It has the values "indira nagar"
as well as "indiranagar".
If I type either of those keywords, it should display both results.
Can anybody help with how we can do this?
I am using the title field in the following way:
Make sure your field type has the autoGeneratePhraseQueries="true" attribute
(default is false). q.op only applies to explicit terms, not to terms which
decompose into multiple terms. Confusing? Yes!
-- Jack Krupansky
-Original Message-
From: Alistair
Sent: Friday, April 18, 2014 6:1
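Concretely, the attribute goes on the fieldType element in schema.xml (a sketch; the type name and analyzer chain shown are illustrative):

```xml
<fieldType name="text_de" class="solr.TextField"
           autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```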
Hello all,
I'm a fairly new Solr user and I need my search function to handle compound
words in German. I've searched through the archives and found that Solr
already has a Filter Factory made for such words called
DictionaryCompoundWordTokenFilterFactory. I've already built a list of words
that I
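A sketch of wiring that filter into a German field type (the dictionary file name and the size limits are illustrative values, not recommendations):

```xml
<fieldType name="text_de_compound" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- splits compounds like "Donaudampfschiff" against the word list -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="german-words.txt"
            minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
            onlyLongestMatch="true"/>
  </analyzer>
</fieldType>
```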
Hi zzT
Putting numShards in core.properties also works.
I struggled a little bit while figuring out this "configuration approach".
I knew I am not alone! ;-)
On 2 April 2014 18:06, zzT wrote:
> It seems that I've figured out a "configuration approach" to this issue.
>
> I'm having the exact s
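For reference, a core.properties sketch with numShards set (all values here are placeholders):

```properties
# core.properties (sketch)
name=collection1_shard1_replica1
collection=collection1
shard=shard1
numShards=2
```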
On 4/18/2014 12:04 AM, Alexandre Rafalovitch wrote:
> Did you read through the CJK article series? Maybe there is something
> in there?
> http://discovery-grindstone.blogspot.com/2013/10/cjk-with-solr-for-libraries-part-1.html
>
> Sorry, no help on actual Japanese.
Almost everything I know about