Thanks for your guidance, Alexandre Rafalovitch.
I am looking into this seriously.
Another question: I am facing an error with replication of the EFF file.
This is the master replication configuration:
core/conf/solrconfig.xml
>
> commit
> startup
> ../data/external_eff_view
Dear Team,
I am working with an external file field (EFF), but I don't know how to
configure replication for the EFF files.
This is the master replication configuration:
core/conf/solrconfig.xml
commit
startup
../data/external_eff_views
The EFF file is present at
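(The XML tags appear to have been stripped from the config snippet above. As
a rough sketch only - element names from the standard ReplicationHandler
setup, with the path above reused as a placeholder - the master side usually
looks something like:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="replicateAfter">startup</str>
      <str name="confFiles">../data/external_eff_views</str>
    </lst>
  </requestHandler>

Note that confFiles is normally meant for files under conf/, which is part of
why shipping an EFF file that lives in the data directory is awkward.)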
This might be related:
https://issues.apache.org/jira/browse/SOLR-3514
On Sat, Jun 28, 2014 at 5:34 PM, Kamal Kishore Aggarwal <
kkroyal@gmail.com> wrote:
> Hi Team,
>
> I have recently implemented EFF in Solr. There are about 1.5 lacs (unsorted)
> values in the external file. After this imp
How would we know where the problem is? It's your custom
implementation. And it's your own documents, so we don't know field
sizes/etc. And it's your own metric (ok, Indian metric, but lacs are
fairly unknown outside of India).
Seriously though, have you tried using any memory profilers and
runnin
Any replies?
On Sat, Jun 28, 2014 at 5:34 PM, Kamal Kishore Aggarwal <
kkroyal@gmail.com> wrote:
> Hi Team,
>
> I have recently implemented EFF in Solr. There are about 1.5
> lacs (unsorted) values in the external file. After this implementation, the
> server has become slow. The solr query
On 7/1/2014 4:57 AM, mskeerthi wrote:
> I have to load my 5 million records from SQL Server into a single Solr
> index. I am getting the exception below after loading 1 million records. Is
> there any configuration or other approach for loading from SQL Server into Solr?
>
> Below is the exception I am get
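(The exception itself is cut off above, so this is only a guess at the setup:
if the import uses the DataImportHandler with a JDBC source, a data-config
along these lines is the usual starting point - the driver class is the real
Microsoft JDBC driver name, but the host, database, table and field names
here are placeholders:

  <dataConfig>
    <dataSource type="JdbcDataSource"
                driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
                url="jdbc:sqlserver://dbhost;databaseName=mydb;responseBuffering=adaptive"
                user="solr" password="..."/>
    <document>
      <entity name="item" query="SELECT id, title, description FROM items">
        <field column="id" name="id"/>
        <field column="title" name="title"/>
        <field column="description" name="description"/>
      </entity>
    </document>
  </dataConfig>

If the exception turns out to be an OutOfMemoryError, the
responseBuffering=adaptive connection property shown above is the Microsoft
driver's option for not buffering the whole result set in memory.)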
Hi, Jack,
Thank you very much for your solution, it works!
I'm sorry that I didn't make it clear at the beginning that by 'score' I
meant the document score (which Solr produces at query time).
Thank you very much for all of you,
Chun.
Chris,
We have actually done that. Our requirement was basically to have a single
installation of Solr assume different roles, with each role having its own
optimisation changes in solrconfig.xml and schema.xml.
When we start a role we basically switch to the files role_solrconfig.xml and
role_sch
On 7/2/2014 11:55 AM, IJ wrote:
> Here is a short wishlist based on the experience in debugging this issue:
> 1. Wish SolrQueryResponse could contain a list of node names / shard-replica
> names that a request passed through for processing the query (when debug is
> turned ON)
> 2. Wish SolrQueryR
Thanks Chris, being independent of the servlet container is good.
Eagerly waiting for solr 5 :)
With Regards
Aman Tandon
On Thu, Jul 3, 2014 at 7:58 AM, Chris Hostetter
wrote:
>
> This is a long standing issue in solr, that has some suggested fixes (see
> jira comments), but no one has been seriously a
That's good to know.
I don't actually want to do it. I want to see just how much of Solr's
schema and configuration can be reliably validated. The error messages
I've been getting back for misconfigured setups are less than ideal at
times. But it should be easy for me to validate certain things
This is a long-standing issue in Solr that has some suggested fixes (see
jira comments), but no one has been seriously affected by it enough for
anyone to invest time in trying to improve it...
https://issues.apache.org/jira/browse/SOLR-2357
In general, the fact that Solr is moving away from b
: Is it required for the schema.xml and solrconfig.xml to have those exact
: filenames?
It's an extremely good idea ... but strictly speaking, no...
https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-CREATE
This smells lik
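For example, per the CREATE parameters documented at that link, a core can be
pointed at differently named files (hypothetical core and file names):

  http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=core1&config=foo-solrconfig.xml&schema=foo-schema.xml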
Is it required for the schema.xml and solrconfig.xml to have those exact
filenames?
Can I alias schema.xml to foo.xml in some way, for example?
Thanks.
: Now that I think about it, though, is there a way to use the Update Xml
: messages with something akin to the cloud solr server? I only see examples
: posting to actual Solr instances, but we really need to be able to take
: advantage of the zookeepers to send our updates to the appropriate ser
Thanks Parnab, I am unfamiliar with payloads. Can you provide some info
about payloads and how they are helpful in NLP?
On Jul 2, 2014 7:41 PM, "parnab kumar" wrote:
> Aman,
>
> I feel focusing on "Question-Answering" and "Information Extraction"
> components of NLP should help you achieve
We reload at an interval of 6-7 days, and restart maybe every 15-18 days if
the response becomes too slow.
On Jul 2, 2014 7:09 PM, "Markus Jelsma" wrote:
> Hi, you can safely ignore this, it is shutting down anyway. Just don't
> reload the app a lot of times without actually restarting Tomcat.
>
> -
Created the JIRA:
https://issues.apache.org/jira/browse/SOLR-6222
On 30 June 2014 23:53, Joel Bernstein wrote:
> Sure, go ahead and create the ticket. I think there is more we can do here as
> well. I suspect we can get the CollapsingQParserPlugin to work with
> useFilterForSortedQuery=true if scor
How would the MapReduceIndexerTool (MRIT for short)
find the local disk to write from HDFS to for each shard?
All it has is the information in the Solr configs, which are
usually relative paths on the local Solr machines, relative
to SOLR_HOME. Which could be different on each node
(that would be s
bq: Is this a BUG or a FEATURE in Solr
How about "just the way it works"?
You've changed the route key with the same
unique key, taking control of the routing.
When you change that routing, how is Solr to
know where the _old_ document lived? It would
have to, say, query the entire cluster for an
Hi Manuel,
I think OCR error correction is one of the well-known NLP tasks.
I have thought in the past that it could be implemented using Lucene.
This is a brief idea:
1. You have got a Lucene index. This existing index is made from
correct (i.e. error-free) documents that are in the same domain as the OCR documen
You probably don't have a field named "score". That said, the Solr error
message is not very useful at all!
If you want to reference the document score, I don't think there is a direct
way to do it, but you can do it indirectly by using the query function:
.../select?q=MacBook&sort=sum(base_score,q
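A complete example of that approach might look like the following - a sketch,
not verified here; query($q) pulls in the score of the main query, and note
there is no whitespace inside the function:

  .../select?q=MacBook&sort=sum(base_score,query($q))+desc&wt=json&indent=true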
Hi Jack,
I tried as you suggested:
.../select?q=MacBook&sort=sum(base_score,score)+desc&wt=json&indent=true
but it didn't work and I got this error message
"error":{
"msg":"sort param could not be parsed as a query, and is not a field
that exists in the index: sum(base_score,score)",
"c
Hi Ahmet,
I also tried this
.../select?q=MacBook&sort=sum(base_score, score)+desc&wt=json&indent=true
I got the same error
"error":{
"msg":"Can't determine a Sort Order (asc or desc) in sort spec
'sum(base_score, score) desc', pos=15",
"code":400}}
Best regards,
Chun
Take a look at the synonym filter as well. I mean, basically that's exactly
what you are doing - adding synonyms at each position.
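As a rough illustration of the mechanics (a sketch only - the variants would
presumably come from your OCR output rather than a static synonyms file, and
ocr-variants.txt is a hypothetical name), an index-time analyzer that injects
alternatives at the same position looks like:

  <fieldType name="text_ocr" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="ocr-variants.txt"
              ignoreCase="true" expand="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>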
-- Jack Krupansky
-Original Message-
From: Manuel Le Normand
Sent: Wednesday, July 2, 2014 12:57 PM
To: solr-user@lucene.apache.org
Subject: Re: OCR - Sav
Hi,
When we run the Solr Map Reduce Indexer Tool (
https://github.com/markrmiller/solr-map-reduce-example), it generates
indexes on HDFS.
The last stage is Go Live, which merges the generated index into the live
SolrCloud index.
If the live SolrCloud writes its index to the local file system (rather than HDFS),
the Go
Thanks for posting this.
-- Jack Krupansky
-Original Message-
From: wrdrvr
Sent: Wednesday, July 2, 2014 1:47 PM
To: solr-user@lucene.apache.org
Subject: Re: Migration from Autonomy IDOL to SOLR
I know that this is an old thread, but I wanted to pass on some additional
information in
I think the white space after the comma is the culprit. No white space is
allowed in function queries that are embedded, such as in the sort
parameter.
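To illustrate (combining this with the query() suggestion from earlier in the
thread; untested sketch):

  sort=sum(base_score, query($q))+desc    (space after the comma: fails to parse)
  sort=sum(base_score,query($q))+desc     (no whitespace: parses as a function sort)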
-- Jack Krupansky
-Original Message-
From: Ahmet Arslan
Sent: Wednesday, July 2, 2014 2:19 PM
To: solr-user@lucene.apache.org
Subje
Hi,
Why did you use upper case? What happens when you use: sort=sum(...
On Wednesday, July 2, 2014 6:23 PM, rachun wrote:
Gora,
firstly I would like to thank you for your quick response.
.../select?q=MacBook&sort=SUM(base_score, score)+desc&wt=json&indent=true
I tried that but it didn't wo
This issue was finally resolved. Adding an explicit host-to-IP address mapping
in the /etc/hosts file seemed to do the trick. The one strange thing is - before
the hosts file entry was made - we were unable to simulate the 5 second delay
from the Linux shell by performing a simple nslookup. In any
case -
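For anyone hitting the same thing, the kind of entry meant here is just a
static line in /etc/hosts (hypothetical IP and hostname):

  10.0.0.12   solr-node1.example.com   solr-node1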
I know that this is an old thread, but I wanted to pass on some additional
information in blatant self promotion.
We've just completed an IDOL to Solr migration for our e-commerce site with
approximately 40 million items and anywhere between 200,000 and 300,000
searches per day. I am documenting s
So - we do end up with two copies/versions of the same document (uniqueid)
- one in each of the two shards. Is this a BUG or a FEATURE in Solr?
I have a follow-up question - in case one were to attempt to delete the
document - let's say using the CloudSolrServer deleteById() API - would that
atte
Thanks for your answers, Erick and Michael.
The term confidence level is an OCR output metric that tells, for every
word, the odds that it is the actual scanned term. I wish the OCR program would
output all the "suspected words" that together sum up to above ~90% confidence
of being the actual term, instead of o
Problem here is that you wind up with a zillion unique terms in your
index, which may lead to performance issues, but you probably already
know that :).
I've seen situations where running it through a dictionary helps. That
is, does each term in the OCR match some dictionary? Problem here is
that
Gora,
firstly I would like to thank you for your quick response.
.../select?q=MacBook&sort=SUM(base_score, score)+desc&wt=json&indent=true
I tried that but it didn't work and I got this error message
"error":{
"msg":"Can't determine a Sort Order (asc or desc) in sort spec
'SUM(base_score, scor
Thanks Ahmet,
I tried multiple combinations and finally got it working using the full query
as a nested query.
Is it fine to use the full query inside a nested query with filters via _query_
as below?
http://localhost:8983/solr/collection1/select?q=text:sharepoint&wt=json&indent=true&AuthenticatedUserName=ljangra&_q
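The general shape of that kind of nested query - a sketch, with some_field as
a placeholder and $AuthenticatedUserName dereferencing the request parameter
from the URL above - is:

  q=text:sharepoint AND _query_:"{!field f=some_field v=$AuthenticatedUserName}"

(remember to URL-encode the quotes when sending it as part of a URL).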
On 2 July 2014 20:32, rachun wrote:
> Dear all,
>
> Could anybody suggest me how to customize the score?
> So, I have data like this ..
>
> {ID : '0001', Title :'MacBookPro',Price: 400,Base_score:'121.2'}
> {ID : '0002', Title :'MacBook',Price: 350,Base_score:'100.2'}
> {ID : '0003', Title :'Lapto
Dear all,
Could anybody suggest how I can customize the score?
So, I have data like this:
{ID : '0001', Title :'MacBookPro',Price: 400,Base_score:'121.2'}
{ID : '0002', Title :'MacBook',Price: 350,Base_score:'100.2'}
{ID : '0003', Title :'Laptop',Price: 300,Base_score:'155.7'}
Notice that I ha
I don't have first-hand knowledge of how you would implement that, but I bet a
look at the WordDelimiterFilter would help you understand how to emit
multiple terms with the same positions pretty easily.
I've heard of this "bag of word variants" approach to indexing poor-quality
OCR output before for fin
We migrated a big application from Endeca (6.0, I think) several years ago.
We were not using any of the business UI tools, but we found that Solr is a lot
more flexible and performant than Endeca. But with more flexibility comes more
that you need to know.
The hardest thing was to migrate the E
Hello,
Many of our indexed documents are scanned and OCR'ed documents.
Unfortunately, we were not able to improve the OCR quality much (less than
80% word accuracy) for various reasons, a fact which badly hurts
retrieval quality.
As we use an open-source OCR, we think of changing every scanned
Aman,
I feel focusing on "Question-Answering" and "Information Extraction"
components of NLP should help you achieve what you are looking for. Go
through this book, *Taming Text* (http://www.manning.com/ingersoll/).
Most of your queries should be answered including details on implementa
Wow - so apparently I have terrible recall and should re-read this thread I
started on the same topic when upgrading from 1.4 to 3.6 and hit a very
similar fieldNorm issue almost two years ago! =)
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201207.mbox/%3CCALyTvnpwZMj4zxPbK0abVpnyRJny
Hi, you can safely ignore this, it is shutting down anyway. Just don't reload
the app a lot of times without actually restarting Tomcat.
-Original message-
> From:Aman Tandon
> Sent: Wednesday 2nd July 2014 7:22
> To: solr-user@lucene.apache.org
> Subject: Memory Leaks in solr 4.8.1
>
Hi, I don't think this is ever going to work with the MLT handler; you should
use the regular SearchHandler instead.
-Original message-
> From:SafeJava T
> Sent: Monday 30th June 2014 17:52
> To: solr-user@lucene.apache.org
> Subject: NPE when using facets with the MLT handler.
>
> I
Thanks Erick,
I will look into the MapReduce option. It would be helpful if I could get some
links on setting up Hadoop with Solr.