Hello everybody,
Little question :
status_official:true^1,5
How come this doesn't return any data, yet if I remove status_official it does?
I tried status_official:false^1 as well, but nothing comes up, and if I remove
this param I get some values.
I would like to boost some statuses.
Hi
I don't see where I'm going wrong.
I would like to boost some types of my books.
So if I do: &bq=status_official:0^1.5+status_creative:0^1.5
I get one result.
If I do: &bq=status_official:1^1.5+status_creative:1^1.5
Nothing. I think the result should still come up even if it doesn't have
this status.
I've actually added (status_official:1 OR status_creative:1)^2.5
sunnyfr wrote:
>
> Hi
>
> I dont get where I'm wrong.
> I would like to boost some type of my books.
>
> So If I do : &bq=status_official:0^1.5+status_creative:0^1.5
> I've one result
>
> If I do: &bq=status_official:1^1.5+st
Hi,
I just upgraded from solr-1.3-dev to 1.4-dev and I'm having issues with
the location of the dataDir.
I configure solr through -Dsolr.solr.home= /u01/app/apptest/solr
In v1.3 the datadir is located in /u01/app/apptest/solr/data
However, when I drop the 1.4 war in place, the dataDir is opened
We have a distributed index consisting of several shards. There could be
some documents repeated across shards. We want to remove the duplicate
records from the documents returned from the shards, and re-order the
results by grouping them on the basis of a clustering algorithm and
reranking the doc
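As a rough client-side sketch of the dedup step described above (assuming each document carries a unique id field; the helper below is hypothetical, not part of Solr):

```python
def dedupe(docs, id_of):
    """Keep the first occurrence of each unique id across shard results."""
    seen = set()
    unique = []
    for doc in docs:
        doc_id = id_of(doc)
        if doc_id not in seen:
            seen.add(doc_id)
            unique.append(doc)
    return unique
```

For large result sets you would normally want to push this into Solr itself rather than doing it in the client.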
Hi All,
Our web-based document management system has a few thousand users and is
growing rapidly. Like any SaaS, while we support a lot of customers,
only a few of them (those logged in) will be reading their index, and only
a subset of those logged in (who are adding documents) will be writing
to
Hi
How come if I put q=+wow-kill in my query (with the dismax handler), I get
books which contain wow and kill instead of books which have wow
in the title without kill?
Thanks a lot,
--
View this message in context:
http://www.nabble.com/dismax-%2B-and---tp4770p4770.html
Sent from the
On Feb 26, 2009, at 6:04 AM, CIF Search wrote:
We have a distributed index consisting of several shards. There
could be
some documents repeated across shards. We want to remove the duplicate
records from the documents returned from the shards, and re-order the
results by grouping them on the
I presume these all have different unique ids?
If you can address it at indexing time, then have a look at
https://issues.apache.org/jira/browse/SOLR-799
Otherwise, you might look at https://issues.apache.org/jira/browse/SOLR-236
On Feb 25, 2009, at 6:54 PM, Cheng Zhang wrote:
Is it possibl
1a. Multiple Solr instances partitioned by user_id%N, with index
files segmented by user_id field.
That can scale rather gracefully, though it does need reindexing
to add a server.
wunder
On 2/26/09 3:44 AM, "Vikram B. Kumar" wrote:
> Hi All,
>
> Our web based document management system has f
In your example there is no space between +wow and -kill, so my guess is that
Solr is interpreting wow-kill as all one word. Then, depending on the
field type, the tokenizer is probably splitting it into the two words wow
and kill along the -.
--
Jeff Newburn
> From: sunnyfr
> Reply-To:
> Date: Thu, 26 Fe
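A quick way to see the splitting Jeff describes; this regex split is only an approximation of what a word-splitting tokenizer does, not Solr's actual analysis chain:

```python
import re

def simple_tokenize(text):
    """Split on any run of non-alphanumeric characters, roughly what a
    word-splitting tokenizer would do to the un-spaced query."""
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]

# With no space before the '-', the query ends up as two plain terms,
# so the '-' is never seen as the prohibit operator.
```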
Hi Wunder,
Can you please elaborate?
Vikram
On Thu, Feb 26, 2009 at 10:13 AM, Walter Underwood
wrote:
> 1a. Multiple Solr instances partitioned by user_id%N, with index
> files segmented by user_id field.
>
> That can scale rather gracefully, though it does need reindexing
> to add a server.
>
>
Hi guys,
I'm looking for the parameter, or the way, to boost based on the order of
the words in the query.
Let's imagine people look for the "rich & famous" book ... so in the search
they will just write rich & famous.
And let's imagine a book with a better rating and a lot of views is called
famous & very rich; is there
Hi All:
Is there any study/research on using scanned paper documents as
images (maybe PDF), then using OCR or some other technique to
extract text, and the resulting index quality?
Thanks in advance,
Sithu D Sudarsan
sithu.sudar...@fda.hhs.gov
sdsudar...@ualr.edu
Hi Sithu,
There is a project called OCRopus, done by the DFKI; check the online demo
here: http://demo.iupr.org/cgi-bin/main.cgi
And also http://sites.google.com/site/ocropus/
Regards
Hannes
m...@hcmeyer.com
http://mimblog.de
On Thu, Feb 26, 2009 at 5:29 PM, Sudarsan, Sithu D. <
sithu.sudar...
On Thu, Feb 26, 2009 at 11:25 AM, sunnyfr wrote:
> How can I tell it to put a lot of more weight for the book which has exactly
> the same title.
A sloppy phrase query should work.
See the "pf" param in the dismax query parser.
-Yonik
http://www.lucidimagination.com
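A sketch of what that can look like with dismax (field names and boost values here are illustrative, not from the thread):

```
q=rich and famous&defType=dismax&qf=title^2 description&pf=title^10&ps=1
```

qf scores the individual terms, while pf adds a phrase query over the whole input against title; with a small ps (phrase slop), books whose title is (almost) exactly the query pick up the extra ^10 boost.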
With five servers, assign 1/5 of user_id's to each server. Choose
the number of servers to handle the number of logged-in users.
Each user's searches go to the single server with their data.
Partitioning by user_id is common with relational databases.
We do this to hold our two billion movie ratin
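Walter's scheme can be sketched in a few lines; the server names and the modulo routing below are hypothetical, just to make the idea concrete:

```python
# One entry per Solr instance; adding a server means changing this list
# (which, as Walter notes, requires reindexing).
SOLR_SERVERS = ["solr1", "solr2", "solr3", "solr4", "solr5"]

def server_for_user(user_id):
    """Every search and update for a user goes to one fixed server."""
    return SOLR_SERVERS[user_id % len(SOLR_SERVERS)]
```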
Thanks Hannes,
The tool looks good.
Sincerely,
Sithu D Sudarsan
sithu.sudar...@fda.hhs.gov
sdsudar...@ualr.edu
-Original Message-
From: hannesc...@googlemail.com [mailto:hannesc...@googlemail.com] On
Behalf Of Hannes Carl Meyer
Sent: Thursday, February 26, 2009 11:35 AM
To: solr-user@l
It's exactly what I'm looking for. Thank you Grant.
- Original Message
From: Grant Ingersoll
To: solr-user@lucene.apache.org
Sent: Thursday, February 26, 2009 6:56:22 AM
Subject: Re: unique result
I presume these all have different unique ids?
If you can address it at indexing time,
Another project worth investigating is Tesseract.
http://code.google.com/p/tesseract-ocr/
- Original Message
From: Hannes Carl Meyer
To: solr-user@lucene.apache.org
Sent: Thursday, February 26, 2009 11:35:14 AM
Subject: Re: Use of scanned documents for text extraction and indexing
H
Hello,
I am writing a custom analyzer for my field type. This analyzer would need
to use SolrResourceLoader and SolrConfig, so I want to make it
SolrCoreAware.
However, it seems that Analyzer classes aren't supposed to be used in this
way (as described in http://wiki.apache.org/solr/SolrPlugins).
Hello folks!
I was under the impression that this sync bottleneck was fixed in
recent versions of Solr/Lucene, but we're seeing it with 1.4-dev right
now. When we load test a server with >100 threads (using jmeter), we
see several threads all blocked at the same spot:
"http-8080-exec-505"
That's interesting.
We should be using read-only readers, which should not synchronize on
the deleted docs check. But as your stack trace shows, you're using
SegmentReader and MultiSegmentReader.
Right now, if I look at the admin/statistics page at the searcher, it
shows the following for the rea
I see a ReadOnlySegmentReader now; we're on an optimized index, which
gets around the isDeleted() check.
(solr4, optimized)
searcherName : searc...@260f8e27 main
caching : true
numDocs : 139583
maxDoc : 139583
readerImpl : ReadOnlySegmentReader
readerDir : org.apache.lucene.store.NIOFSDirec
There is quite a bit of literature available on this topic. This paper
presents a summary; nothing immediately applicable, I'm afraid.
Retrieving OCR Text: A survey of current approaches
Steven M. Beitzel, Eric C. Jensen, David A Grossman
Illinois Institute of Technology
It lists a number of othe
Another simple solution for your requirement is to use multicore. This way
you will have only one Solr webapp loaded with as many indexes as you need.
See more at http://wiki.apache.org/solr/MultiCore
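For reference, a minimal solr.xml along the lines that wiki page describes (core names and instance dirs are just examples):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="books" instanceDir="books" />
    <core name="users" instanceDir="users" />
  </cores>
</solr>
```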
2009/2/25 Michael Della Bitta
> Unfortunately, I think the way this works is the container cre
: I am writing a custom analyzer for my field type. This analyzer would need
: to use SolrResourceLoader and SolrConfig, so I want to make it
: SolrCoreAware.
1) Solr's support for using Analyzer instances is mainly just to make it
easy for people who already have existing Analyzer impls that th
: We should be using read-only readers, which should not synchronize on
FWIW: skimming through code that I don't normally look at to see the new
read-only changes, I noticed this in SolrCore...
// gets a non-caching searcher
public SolrIndexSearcher newSearcher(String name, boolean readOnly)
Tesseract is pure OCR. Ocropus builds on Tesseract.
Vikram
On Thu, Feb 26, 2009 at 12:11 PM, Shashi Kant wrote:
> Another project worth investigating is Tesseract.
>
> http://code.google.com/p/tesseract-ocr/
>
>
>
>
> - Original Message
> From: Hannes Carl Meyer
> To: solr-user@lucene.
Does anyone have any good documentation that explains how to set up
the warming feature within the config?
On Wed, Feb 25, 2009 at 11:58 AM, Marc Sturlese wrote:
>
> Shalin your patch worked perfect for my use case.
> Thank's both for the information!
>
>
>
> Amit Nithian wrote:
>>
>> I'm actuall
I believe the query component will generate the query in such a way that I
get the results that I want, but it will not process the returned results;
is that correct? Is there a way in which I can group the returned results,
rank each group separately, and return the results together? In other words wh
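One way to sketch the group-then-rerank step on the client side (the cluster labels and scoring functions below are placeholders for whatever the clustering algorithm produces):

```python
from itertools import groupby

def group_and_rerank(docs, cluster_of, score_of):
    """Group docs by cluster label, order each group by score, then emit
    the groups ordered by their best-scoring document."""
    docs = sorted(docs, key=cluster_of)
    groups = [sorted(group, key=score_of, reverse=True)
              for _, group in groupby(docs, key=cluster_of)]
    groups.sort(key=lambda g: score_of(g[0]), reverse=True)
    return [doc for group in groups for doc in group]
```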
Can anyone back that up?
IMHO Tesseract is the state of the art in OCR, but I'm not sure that "Ocropus
builds on Tesseract."
Can you confirm that Vikram has a point?
Shashi
- Original Message
From: Vikram Kumar
To: solr-user@lucene.apache.org; Shashi Kant
Sent: Thursday, February 26,
Hi,
If the master goes down and the slave(s) already have the index, search keeps
working.
If the master goes down during replication, search will keep working, but
the slave will not have/see the most recent index changes.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nu
You can use Tesseract, an open-source OCR engine owned by Google. It's
native C code, so to use it from Java you need JNI or direct process
creation. There is no PDF support, but you can use ImageMagick to
convert those docs on the fly. The engine scans documents line by line
without trying
I guess this is a bug introduced by SOLR-943. We shall raise an
issue (JIRA is down now).
--Noble
On Thu, Feb 26, 2009 at 4:26 PM, Thijs wrote:
> Hi,
>
> I just upgraded from solr-1.3-dev to 1.4-dev and I'm having issues with the
> location of the dataDir.
>
> I configure solr through -Dsolr.solr.