Hi,
We are using 4 cores (starting from core 0 to core 4) for parallel indexing
process. We use shards to do distributed indexing and we also use dismax
request handler when doing search.
I have configured core0 as Shards master core.
When I issue a search query (with dismax request handler) on
Tried one, of Perry Mason's secretary when she was young (and HOOOT),
Barbara Hale.
http://www.skylighters.org/ggparade/index8.html
Didn't find it. 1.8 billion images indexed is probably a DROP in the bucket of
what's out there.
Dennis Gearon
Signature Warning
It is alwa
I am not sure if I understand your question correctly..
Are you saying that you are not able to start Jetty server in linux box? or
SOLR application is not starting up even after server has started?
Thanks,
Barani
--
View this message in context:
http://lucene.472066.n3.nabble.com/SOLR-Config-
Hi,
As Tom suggested removing optimize and passing the ids as list (instead of
for loop) will surely increase the speed of deletion.
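
A minimal sketch of the batched delete, assuming the XML update format accepts several `<id>` elements inside one `<delete>` message (worth checking against your Solr version). The helper name `build_delete_xml` is hypothetical, and no commit or optimize is issued here.

```python
from xml.sax.saxutils import escape


def build_delete_xml(ids):
    """Build one <delete> message covering every id in `ids`."""
    # One message for the whole batch instead of one request per id.
    body = "".join("<id>{}</id>".format(escape(str(doc_id))) for doc_id in ids)
    return "<delete>{}</delete>".format(body)


# Example batch; ids are escaped so characters like '&' stay valid XML.
print(build_delete_xml(["doc1", "doc2", "a&b"]))
```

The resulting string would be posted to the update handler once per batch, followed by a single commit at the end.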
We have a program which fetches complete list of ID from back end (around 10
million) and compares it with the complete list of id's present in SOLR
document and d
Hi,
As far as I know there is no queuing mechanism in SOLR for concurrent
indexing request. It would simply ignore the concurrent request (first come
first serve basis).. Solr experts, please correct me if I am wrong..
To achieve concurrency, we have implemented a queue using JMS and we send
th
Just to give you some more clarification.. you can create multiple database
config file (separate) to extract the data from different sources and add
the hardcoded identifier in SOLR select query corresponding to each source.
So you will have multiple data import handler committing the data in to
I am not sure whether I understand your question properly.
If you are trying to get data from different database and dumping it to same
index file then you need to specify a way to retrieve a particular data back
from that XML (which actually contains the consolidated data from all Db's).
For do
Hi,
Thanks a lot for your reply..
I am using database import handler to get the data (DIH) from DB.
When I get a null data in single valued attribute the 'default' attribute
seems to work perfectly fine.
But seems like I need to validate the Null value (like using case when else
statement) in
Could you give us a bit more information?
How are you getting this information into Solr? SolrJ?
DataImportHandler? It's hard to see where the null value is getting
dropped, if we don't know the path that it is coming in.
I suspect that the default attribute won't do it. It's possible that
you mi
Tom,
I would like to reach out to you directly. What's your email address?
/j
From: Tom Hill
To: solr-user@lucene.apache.org
Sent: Fri, December 10, 2010 9:43:08 PM
Subject: Re: command line parameters for solr
java -jar start.jar --help
More docs here
http://do
hi group,
I have multiple document types indexed on a single core solr instance and
each comes from a different DB.
What is the best way to configure DIH to read each document type from
corresponding DB. As far as I could find DIH does not honour multiple
document tags inside the data config.
Th
On 2010-12-11 06:24, Dennis Gearon wrote:
There is actually some image recognition search engine software somewhere I
heard about. Take a picture of something, say a poster, upload it, and it will
adjust for some lighting/angle/distortion, and try to find it on the web
somewhere.
tineye
thanks Tom, really appreciate.
From: Tom Hill
To: solr-user@lucene.apache.org
Sent: Fri, December 10, 2010 9:43:08 PM
Subject: Re: command line parameters for solr
java -jar start.jar --help
More docs here
http://docs.codehaus.org/display/JETTY/A+look+at+the
java -jar start.jar --help
More docs here
http://docs.codehaus.org/display/JETTY/A+look+at+the+start.jar+mechanism
Personally, I usually limit access to localhost by using whatever
firewall the machine uses.
Tom
On Fri, Dec 10, 2010 at 7:55 PM, Jack O wrote:
> Hello,
>
> For starting solr, fr
Check out this page: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Look, in particular, for "stemming".
On Fri, Dec 10, 2010 at 7:58 PM, Jack O wrote:
> Hello,
>
> Need one more help:
>
> What do I have to do so that search will work for singulars and plurals ?
>
>
>
> I would real
There is actually some image recognition search engine software somewhere I
heard about. Take a picture of something, say a poster, upload it, and it will
adjust for some lighting/angle/distortion, and try to find it on the web
somewhere.
You hear about crazy stuff like this at dev camps. B
Hello,
Need one more help:
What do I have to do so that search will work for singulars and plurals ?
I would really appreciate all your help.
/J
Hello,
For starting solr, from where do i find the list of command line parameters.
java -jar start.jar blahblah...
I am especially looking for how to specify my own jetty config file. I want to
allow access of solr from localhost only.
I would really appreciate all your help.
/J
Searching for an image with a painted query! Wow.
On Wed, Dec 8, 2010 at 11:14 PM, Maciej Lisiewski wrote:
> There is imgSeek ( http://www.imgseek.net/isk-daemon ), which while being
> far from perfect (can't handle rotated images) is quite simple and has
> already been added to xapian.
> Paper o
There is I believe no way to do this without separate copies of your
script. Each 'handler=/dataimport' has to refer to a separate config
file.
You can make several copies and name them config1.xml, config2.xml
etc. You'll have to call each one manually, so you have to manage your
own thread pool.
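
A rough sketch of what "manage your own thread pool" could look like, assuming each config file is registered under its own request handler. The handler names `/dataimport1` and `/dataimport2` and the base URL are assumptions for illustration, not taken from any real setup. `clean=false` is passed because a full import clears the index by default, which would be unwelcome when several importers share one index.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

SOLR_BASE = "http://localhost:8983/solr"     # assumed base URL
HANDLERS = ["/dataimport1", "/dataimport2"]  # hypothetical handler names


def import_url(handler, base=SOLR_BASE):
    """URL asking one DIH handler to start a full import without cleaning."""
    return "{}{}?command=full-import&clean=false".format(base, handler)


def http_get(url):
    """Default fetcher: plain HTTP GET, returns the status code."""
    with urlopen(url) as response:
        return response.status


def run_imports(handlers, fetch=http_get, max_workers=2):
    """Trigger every handler from a small thread pool; returns {handler: result}."""
    urls = [import_url(h) for h in handlers]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, urls))
    return dict(zip(handlers, results))
```

Calling `run_imports(HANDLERS)` against a running instance would kick off both imports; the `fetch` argument exists only so the function can be exercised without a server.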
Hi All,
I am trying to debug my queries and see how scoring is done. I have 6 cores and
send the query to 6 shards and its dismax handler (with search on various
fields with different boostings). I enable debug, and view source but I'm
unable to see the explanations. I'm returning ID and scor
: I have a field that I use for facetting. I do not tokenize this field. It
: has entries like:
:
: AWB artikel 2, lid 1
: AWB artikel 8:75
: Algemene Wet Bestuursrecht artikel 8:75
I assume those are names of laws, followed by page/paragraph numbers in
various formats? (and evidently "lid" is
: I thought that I have to use NGramFilter for wildcard search.
: But It was the wrong idea.
: Thanks, iorixxx
your confusion may be because using EdgeNGramFilter is a way to make
"prefix" queries faster by precomputing the prefixes at index time
instead of at query time. (trading disk space f
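
To illustrate the precomputation idea in plain Python (this is not Solr's implementation): the parameter names mirror the filter's `minGramSize`/`maxGramSize` options, and the default values used here are assumptions.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading-edge n-grams of `token`, shortest first."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# Index time: store every prefix of the term once.
indexed = set(edge_ngrams("solr"))

# Query time: a prefix lookup becomes an exact-match test.
print("so" in indexed)  # True
print("ol" in indexed)  # False: only leading prefixes are stored
```

Each term is stored once per prefix length, which is where the extra index size comes from.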
: I made the field that is indexed with EdgeNGramFilterFactory as default
: search field. All my query responses are very slow, some of them taking more
: than 10seconds to respond.
based on the info you've given, there's dozens of possible reasons why you
might see slow queries -- it's hard to
: #SOLR-433 "MultiCore and SpellChecker replication" [1]. Based on the
: status of this feature request I'd assume that the normal procedure of
: keeping the spellchecker index up2date would be running a cron job on
: each node/slave that updates the spellchecker.
: Is that right?
i'm not 100% cer
: As Solr's standard date faceting does not appear to meet this need, we will
: need to use faceting on arbitrary queries, i.e. by passing multiple values
: for facet.query
correct, facet.date is really just a convenience feature over using
facet.query when you want lots of consistently sized ra
Hi Chris,
thanks for your description. I should think about this a little bit
more, then I will ask some details. The main problem is that Synonyms
are one kind of relations, and Thesaurus may contain 6-10 kinds of
relations. And it is depending on the user, which types of relations
he would like
: The question asked, in good faith, was does solr support or extend to
: implementing a thesaurus. It looks like it does not which is fine. It does
Well, my point was that "thesaurus" is not a feature description. it's a
data structure, and depending on your goals, the existing SynonymFilter
hi all,
We wish to implement date faceting with a 'sliding date range', 'last 24
hours, last week, last month, last year' . Google News currently implements
such faceting when you search for a topic.
As Solr's standard date faceting does not appear to meet this need, we will
need to use facetin
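
One way the sliding ranges might be expressed is as one `facet.query` per window using Solr date math. This is only a sketch: the field name `timestamp` is an assumption, and the window list simply mirrors the four ranges named above.

```python
from urllib.parse import urlencode

DATE_FIELD = "timestamp"  # hypothetical date field name

# Label -> Solr date-math expression for the start of the window.
WINDOWS = [
    ("last 24 hours", "NOW-1DAY"),
    ("last week", "NOW-7DAYS"),
    ("last month", "NOW-1MONTH"),
    ("last year", "NOW-1YEAR"),
]


def sliding_facet_params(query, field=DATE_FIELD):
    """Request parameters with one facet.query per sliding window."""
    params = [("q", query), ("facet", "true")]
    for _label, start in WINDOWS:
        params.append(("facet.query", "{}:[{} TO NOW]".format(field, start)))
    return params


# URL-encoded query string, ready to append to the select handler URL.
print(urlencode(sliding_facet_params("solr")))
```

The counts come back keyed by the query string itself, so the application would map each one back to its label.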
Hi,
I have a multivalued field for which some of the records have null or empty
data in it.
Since it's difficult to parse and match empty XML tags in SOLR output, I
thought I would assign a default value for those empty data as below.
But my approach is not working if this field has at least one
Hi everybody,
I'm having some troubles trying to figure out how to separate lines in a
paragraph from a search result, I'm indexing PDFs but when I search the
highlight terms I can not know when the first line ends and the next one
begins,
Is there a way to put a [...] like Google or a Paragraph
Thanks a lot for the response.
Unfortunately I can't check the statistics page. For some reason the solr
webapp itself is only returning a directory listing. This is sometimes
fixed when I restart but if I do that I'll lose the state I have now. I can
get at the JMX interface. Can I check my i
Hi John,
WeakReferences allow things to get GC'd, if there are no other
references to the object referred to.
My understanding is that WeakHashMaps use weak references for the Keys
in the HashMap.
What this means is that the keys in HashMap can be GC'd, once there
are no other references to the
: My imaginative use case:
: - the user enters a term and maybe he turns on a flag to get not just
: the term, but all terms, which related somehow with this (usually the
: synonyms and narrower terms).
: - Solr first find the queried term(s) in the thesaurus, then finds the
: related terms, modif
I have been load testing solr 1.4.1 and have been running into OOM errors.
Not out of heap but with the GC overhead limit exceeded message meaning that
it didn't actually run out of heap space but just spent too much CPU time
trying to make room and gave up.
I got a heap dump and sent it through t
Hi there,
We are trying to replace opentext (V7.6) autonomy with solr so that we can
index other contents, too. Due to lack of manpower and time, the management
wants to buy the adapter if available. Do you know of any vendor who sells
the adapter or professional service? Thank you.
Brian Ko
In looking at some of the docs support for geospatial search.
I see this functionality is mostly scheduled for upcoming release 4.0 (with
some playing around with backported code).
I note the support for the bounding box filter, but will "bounding box" be one
of the supported *data* types fo
All,
Right now I am using the default DIH config that comes with the Solr
examples. I update my index using the dataimport handler here
http://localhost:8983/solr/admin/dataimport.jsp?handler=/dataimport
This works fine but I want to be able to index more than just one feed at a
time and more im
Nutch is also a great option if you want a crawler. I have found that you
will need to use the latest version of PDFBox and its dependencies for
better results. Also, make sure to set JAVA_OPT to something really large so
that you won't exceed your heap size.
Adam
On Fri, Dec 10, 2010 at 6:27
Hi Lee,
Thank you very much for your quick answer!
It works fine!
Ciao,
Alessandro
solr-user@lucene.apache.org
-Original Message-
From: lee carroll [mailto:lee.a.carr...@googlemail.com]
Sent: 09 December 2010 18:46
To: solr-user@lucene.apache.org; alessandro.ri...@virgilio.it
S
Although I access solr from rails by sunspot, the rails server runs on
heroku, so on a different machine. I prefer to have solr as stand
alone server and want to tell sunspot where it can find the running
solr.
I am quite new to chef, but if I could help with writing a
cookbook I would. If y
Two Peters (or rather a stupid english bloke who can't work out how to type
fancy accents :-)
Sorry Péter (took me 10 minutes to work out i could cut and paste) my reply
was to the clustering post by Peter Sturge. Clustering sounds great but
being able to define a thesaurus scheme exactly would be
Hi Lee,
according to my vision the user could decide which relationship types
he would like to attach to his search, and the application would call
his attention to other possibilities. So there would be no heuristic
method applied, because e.g. broader terms would cause lots of
misleading result
Hi,
I have implemented a QueryParser that queries another solr core and returns a
list of values (resourceIds) that are the primary solr key on the main core. I
then query the main core using the resourceId to retrieve the Lucene docId. I
build up an array of ints of these doc ids. I put this a
Hi Pankaj,
you can find the needed documentation right here [1].
Hope this helps,
Tommaso
[1] : http://wiki.apache.org/solr/ExtractingRequestHandler
2010/12/10 pankaj bhatt
> Hi All,
> I am a newbie to SOLR and trying to integrate TIKA + SOLR.
> Can anyone please guide me, how to achieve
Hi All,
I am a newbie to SOLR and trying to integrate TIKA + SOLR.
Can anyone please guide me, how to achieve this.
* My Req is:* I have a directory containing a lot of PDF,DOC's and i need to
make a search within the documents. I am using SOLR web application.
I just need some
I will likely need to create one in the next week or two. Depends upon
how soon you need one.
The one you've found is probably designed to work with rails apps. It
assumes you have solr installed already, and adds another
instance/index.
I certainly need one that can do something that'll create s
Hi,
that there's no feedback indicates that our plans/preferences are
fine. Otherwise it's now a good opportunity to feed back :-)
Cheers,
Martin
On Wed, Dec 8, 2010 at 2:48 PM, Martin Grotzke
wrote:
> Hi,
>
> we're just planning to move from our replicated single index setup to
> a replicated
Hi Peter,
That's way too clever for me :-)
Discovering thesaurus relationships would be fantastic but it's not clear
what heuristics you would need to use to discover broader, narrower, related
documents etc. Although I might be doing the clustering down I'm sceptical
about the accuracy.
cheers Lee
Hi,
I tried to set up Solr with chef and so far found only the opscode
one, but this one sets up only the group and the user for solr, not the
solr engine. Does anyone know about a maintained solr chef cookbook?
Thanks for suggestion!
Georg
Hi Lee,
Perhaps Solr's clustering component might be helpful for your use case?
http://wiki.apache.org/solr/ClusteringComponent
On Fri, Dec 10, 2010 at 9:17 AM, lee carroll
wrote:
> Hi Chris,
>
> It's all a bit early in the morning for this mind :-)
>
> The question asked, in good faith, was
Hi Chris,
It's all a bit early in the morning for this mind :-)
The question asked, in good faith, was does solr support or extend to
implementing a thesaurus. It looks like it does not which is fine. It does
support synonyms and synonym rings which is again fine. The ski example was
an illustrat
On 09.12.2010 21:26, ext Chris Hostetter wrote:
: doc1 is name=A B category=B
: doc2 is name=A category=B
:
: when searching for the terms "A" and "B" I want doc2 to get a higher score.
: to be more specific, I don't want the term "B" to influence doc1's score in
: both and, only in one of them.
I also try to define the problem.
In the library world there are some general and special thesauri,
which reveal the relations between concepts. The relations have types
as Lee described: Preferred Term (PT), Broader Terms (BT), Narrower
Terms (NT) Related Terms (RT) and others. Some of these thes