Hi
I am trying to upgrade my Solr version from 1.4 to 3.2, but it's giving me the
exception below. I have checked the solr home path and it is correct. Please help
SEVERE: Could not start Solr. Check solr/home property
java.lang.NoSuchMethodError:
org.apache.solr.common.SolrException.logOnce(Lorg/slf4j/Lo
It looks like https://issues.apache.org/jira/browse/SOLR-2382 or even
https://issues.apache.org/jira/browse/SOLR-2613.
I guess by using SOLR-2382 you can specify your own SortedMapBackedCache
subclass which is able to share your Dictionary.
Regards
On Tue, Dec 6, 2011 at 12:26 AM, Brent Mills wr
Hi Chris:
Thanks a lot for your response. This is the kind of information I'm looking
for.
What you said about faceting is the key. I want to use my existing edismax
configuration to create the scored document result set of type Y. I don't want
to affect their scores, but for each document
On 12/5/2011 6:57 PM, Jamie Johnson wrote:
Question which is a bit off topic. You mention your algorithm for
sharding, how do you handle updates or do you not have to deal with
that in your scenario?
I have a long running program based on SolrJ that handles updates. Once
a minute, I run thro
On Mon, Dec 5, 2011 at 6:23 AM, Per Steffensen wrote:
and add features
What's the list of features you are looking for?
--
- Mark
http://www.lucidimagination.com
Shawn,
Question which is a bit off topic. You mention your algorithm for
sharding, how do you handle updates or do you not have to deal with
that in your scenario?
On Sat, Dec 3, 2011 at 1:54 PM, Shawn Heisey wrote:
> In another thread, something was said that sparked my interest:
>
> On 12/1/2
This sort of works, although it feels kind of hacky and I am not sure how
robust it is for more complicated situations. Is there any reason behind not
supporting a "proper" negative boost? Is there any mathematical restriction?
--
View this message in context:
http://lucene.472066.n3.nabble.com/P
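Since Lucene has no true negative boost, the commonly suggested equivalent is
to positively boost everything that does not match the unwanted term, e.g.
(field and term names are hypothetical):

  bq=(*:* -tag:Small)^5

rather than trying to attach a negative boost to tag:Small directly.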
: Then when I match a new Document with "Red, Big", Document 1 should be top,
: Document 2 in the middle, and Document 3 in the bottom. But I still want
: Document 3 to show up in result because it still matches on Red.
:
: If I simply add opposite tags in the query with <1 boost (search for "Red
Hi,
> add features corresponding to stuff that we used to use in ElasticSearch
Does that mean you have used ElasticSearch but decided to try SolrCloud instead?
I'm also looking at a distributed solution. ElasticSearch just seems much
further along than SolrCloud. So I'd be interested to hear ab
: Right, the Solr/Lucene query syntax isn't true Boolean logic, so
: applying all the neat DeMorgan's rules is sometimes surprising.
And more specifically, mixing "boolean" operators (AND/OR) with prefix
operators (+/-) is a recipe for disaster. In an expression like this..
XXX OR -YYY
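the -YYY clause is purely negative, and a purely negative clause matches
nothing on its own. The usual rewrite (a sketch, assuming default Lucene
query syntax) anchors the negation against all documents:

  XXX OR (*:* -YYY)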
On Mon, Dec 5, 2011 at 3:28 PM, Shawn Heisey wrote:
> On 12/4/2011 12:41 AM, Ted Dunning wrote:
>
>> Read the papers I referred to. They describe how to search a fairly
>> enormous corpus with an 8GB in-memory index (and no disk cache at all).
>>
>
> They would seem to indicate moving away from
Jeff,
I'm not entirely understanding everything you've been asking about (in
terms of what your ultimate goal is) but as far as the JoinQParser
specifically...
:
http://localhost:8091/solr/ing-content/select/?qt=partner-tmo&fq=type:node&q={!join+from=conceptId+to=id+fromIndex=partner-tmo}brc
: Have you looked at:
: http://wiki.apache.org/solr/SolrCaching
this page was actually a little light on details about fieldValueCache, so
I tried to fill in some of the blanks in the latest version.
https://wiki.apache.org/solr/SolrCaching#fieldValueCache
-Hoss
: I am using solr 3.4 and configured my DataImportHandler to get some data from
: MySql as well as index some rich document from the disk.
...
: But after some time I get the following error in my error log. It looks like
: a class missing error, Can anyone tell me which poi jar version w
On 12/4/2011 12:41 AM, Ted Dunning wrote:
Read the papers I referred to. They describe how to search a fairly enormous
corpus with an 8GB in-memory index (and no disk cache at all).
They would seem to indicate moving away from Solr. While that would not
be entirely out of the question, I don't
Hi Eric,
After reading more about the pf param I increased the boosts a few times, and
this solved options 2, 3, and 4 but not 1. As an example, for the phrase
"newspaper latimes", latimes.com is not even in the results to be boosted to
first place, and changing the mm param to 1<-1 5<-2 6<90% solves
only 1 and 4 but not 2
Is it possible to extract content for file types that Tika doesn’t support
without changing and rebuilding Tika? Do I need to specify a tika.config
file in the solrconfig.xml file, and if so, what is the format of that file?
One example that I’m trying to solve is for a document management syst
Martijn,
I'm just seeing this reply today, please excuse the late reply.
I tried your suggestion, and I do get results back, but I get a list of
Users when I am instead trying to get a list of Posts.
Is it not possible to arbitrarily sort by either side of the join in solr?
> Have
I'm not really sure how to title this but here's what I'm trying to do.
I have a query that creates a rather large dictionary of codes that are shared
across multiple fields of a base entity. I'm using the
cachedsqlentityprocessor but I was curious if there was a way to join this
multiple time
Hello Erik,
I will take a look at both:
org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessor
and
org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessor
and figure out what I need to extend to handle processing in the way I
am looking for. I am assumi
On 12/05/2011 01:52 PM, Michael Kelleher wrote:
I am crawling a bunch of HTML pages within a site (using ManifoldCF),
that will be sent to Solr for indexing. I want to extract some
content out of the pages, each piece of content to be stored as its
own field BEFORE indexing in Solr.
My guess
Michael -
I was following your discussion on the MCF list as well.
What kind of information do you want to extract from the HTML pages? The UIMA
thing would be fairly heavyweight. The simplest thing on the Solr side of the
equation would be to write an UpdateProcessor(Factory) and creat
I am crawling a bunch of HTML pages within a site (using ManifoldCF),
that will be sent to Solr for indexing. I want to extract some content
out of the pages, each piece of content to be stored as its own field
BEFORE indexing in Solr.
My guess would be that I should use a Document processing
I know I'm using Solr for a task that is better suited for the DB to handle,
but I'm doing this for reasons related to the overall design of my system. My
DB is going to become very large over time and it is constantly being updated
via Hadoop jobs that collect, analyze some data and generate t
*pk*: The primary key for the entity. It is *optional* and only needed
when using delta-imports. It has no relation to the uniqueKey defined in
schema.xml, but they both can be the same.
When used in a nested entity, is the pk the primary key column of the
join table or the key used for joining?
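A minimal DIH sketch of how pk is declared with a delta-import, assuming a
hypothetical "item" table (names and columns are illustrative):

  <entity name="item" pk="id"
          query="SELECT id, name FROM item"
          deltaQuery="SELECT id FROM item
                      WHERE last_modified > '${dataimporter.last_index_time}'"/>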
On Mon, Dec 5, 2011 at 6:23 AM, Per Steffensen wrote:
>Will it be possible to maintain a how-to-use section on
>http://wiki.apache.org/solr/NewSolrCloudDesign with examples, e.g. like to
>ones on http://wiki.apache.org/solr/SolrCloud,
Yep, it was on my near-term todo list to put up a quick deve
Yes completely agree, just wanted to make sure I wasn't missing the obvious :)
On Mon, Dec 5, 2011 at 1:39 PM, Yonik Seeley wrote:
> On Mon, Dec 5, 2011 at 1:29 PM, Jamie Johnson wrote:
>> In this
>> situation I don't think splitting one shard would help us we'd need to
>> split every shard to r
On Mon, Dec 5, 2011 at 1:29 PM, Jamie Johnson wrote:
> In this
> situation I don't think splitting one shard would help us we'd need to
> split every shard to reduce the load on the burdened systems right?
Sure... but if you can split one, you can split them all :-)
-Yonik
http://www.lucidimagin
Currently, solr grouping (http://wiki.apache.org/solr/FieldCollapsing)
sorts groups "by the score of the top document within each group". E.g.
[...]
"groups":[{
"groupValue":"81cb63020d0339adb019a924b2a9e0c2",
"doclist":{"numFound":9,"start":0,"maxScore":4.729042,"docs":[
{
Thanks Yonik, must have just missed it.
A question about adding a new shard to the index. I am definitely not
a hashing expert, but the goal is to have a uniform distribution of
buckets based on what we're hashing. If that happens then our shards
would reach capacity at approximately the same ti
Assuming you are using Drupal for the website, you can have Solr set
up and integrated with Drupal in < 5 minutes for local development
purposes.
See: https://drupal.org/node/1358710 for a pre-configured download.
-Peter
On Mon, Dec 5, 2011 at 11:46 AM, Achebe, Ike, JCL
wrote:
> Hi,
> My name i
A colleague came to me with a problem that intrigued me. I can see
partly how to solve it with Solr, but I am looking for insight into solving
the last step.
The problem:
1) Start from a set of text transcriptions of videos where there is a
timestamp associated with each word.
2) Index into Solr wit
Hi,
My name is Ike Achebe and I am a Developer Analyst with the Johnson County
Library. I'm actually researching better and less expensive alternatives to
"Google Appliance Search " , which is currently our search engine.
Fortunately, I have come across a variety of blogs recommending Lucene/S
On Monday, 05.12.2011, at 08:11 -0500, Erick Erickson wrote:
> You can try bumping up the timeouts in your SolrJ program, the
> SolrServer has a bunch of timeout options.
>
> You can pretty easily tell if the optimize has carried through
> anyway, your index files should have been reduced
> subs
That seems pretty straightforward. Thanks!
2011/12/5 Tomás Fernández Löbbe :
> You could try adding a
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LengthFilterFactory
>
> Regards,
>
> Tomás
Hello,
When I add a synonym to synonyms.txt it works fine. For example:
foo => bar (when searching for foo, bar also gets found)
But this won't work (assume bar-bar is indexed somewhere):
foo => bar-bar
What should I do to enable searching for synonyms with dashes in them?
Thank you,
Zoran
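One hedged guess at the cause: if the field's tokenizer splits bar-bar into
two tokens at index time, a single-token synonym can never line up with what
is in the index. A sketch of a field type where the dash survives
tokenization (names are illustrative):

  <fieldType name="text_syn" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

The admin analysis page is the quickest way to check what bar-bar actually
becomes in the current chain.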
Hello All,
I have my field description listed below, but I don't think it's pertinent,
as my issue seems to be with the query parser.
I'm currently using an edismax subquery clause to help with my searching as
such:
_query_:"{!type=edismax qf='ref_expertise'}\(nonlinear OR soliton\) AND
\"opti
Hi All,
If you've wanted a full time job working on Lucene or Solr, we have two
positions open that just might be of interest. The job descriptions are below.
Interested candidates should submit their resumes off list to
care...@lucidimagination.com.
You can learn more on our website:
ht
Thanks for answering
Mark Miller wrote:
Guess that is the whole point. Guess that I do not have to replicate
configuration files, since SolrCloud (AFAIK) does not use local
configuration files but information in ZK. And then it gets a little hard to
guess how to do it, since the explanation o
On Mon, Dec 5, 2011 at 9:21 AM, Jamie Johnson wrote:
> What does the version field need to look like?
It's in the example schema:
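Presumably something like this (a hedged guess at the stripped snippet, based
on the example schema of that era):

  <field name="_version_" type="long" indexed="true" stored="true"/>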
-Yonik
http://www.lucidimagination.com
Yes, and without doing much in the way of queries, either. Basically, our
test data has large numbers of distinct terms, each of which can be large
in themselves. Heap usage is a straight line -- up -- 75 percent of the
heap is consumed with byte[] allocations at the leaf of an object graph
li
What does the version field need to look like? Something like?
On Sun, Dec 4, 2011 at 2:00 PM, Yonik Seeley wrote:
> On Fri, Dec 2, 2011 at 10:48 AM, Mark Miller wrote:
>> You always want to use the distrib-update-chain. Eventually it will
>> probably be part of the default chain and auto
1> Try adding &debugQuery=on and see if the query parses the way
you expect.
2> Look at your admin/analysis page to see if your fields are getting
parsed the way you think.
3> Look in your admin/schema page to see if the actual terms are
what you expect...
Yeah, it's kind of daunting w
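For step 1, the request looks something like this (core URL and query are
illustrative):

  http://localhost:8983/solr/select?q=field1:foo&debugQuery=on

and the parsedquery entry in the debug section of the response shows exactly
what the query parser produced.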
> Guess that is the whole point. Guess that I do not have to replicate
> configuration files, since SolrCloud (AFAIK) does not use local
> configuration files but information in ZK. And then it gets a little hard to
> guess how to do it, since the explanation on http://wiki.apache.org/solr/
> SolrRe
Because I need the count and the result returned to the client side. Both
grouping and faceting offer me a solution to do that, but my doubt is
about performance ...
With Grouping my results are:
"grouped":{
"category":{
"matches": ...,
"groups":[{
"groupVa
Hi Kashif,
that is not possible in solr. The facets are always based on all the
documents matching the query.
But there is a workaround:
1) Do a normal query without facets (you only need to request doc ids
at this point)
2) Collect all the IDs of the documents returned
3) Do a second query for a
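A sketch of the two requests (field names are hypothetical):

  1) /select?q=title:foo&start=0&rows=10&fl=id
  2) /select?q=id:(17 OR 42 OR 93)&rows=0&facet=true&facet.field=category

The second query computes facets over exactly the IDs collected from the
first page of results.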
Have you considered specifying a boost
function in your handler instead? See:
http://wiki.apache.org/solr/DisMaxQParserPlugin#bf_.28Boost_Functions.29
Best
Erick
On Sun, Dec 4, 2011 at 12:43 AM, Zac Smith wrote:
> Hi,
>
> I think this is a pretty common requirement so hoping someone can easily
Have you looked at the "pf" (phrase fields)
parameter of edismax?
http://wiki.apache.org/solr/DisMaxQParserPlugin#pf_.28Phrase_Fields.29
Best
Erick
On Sat, Dec 3, 2011 at 7:04 PM, wrote:
> Hello,
>
> Here is my request handler
>
> <str name="defType">edismax</str>
> <str name="echoParams">explicit</str>
> <float name="tie">0.01</float>
> <str name="qf">site^1.5 content^0.5 title^1.
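As a hedged illustration of the pf parameter mentioned above, alongside a qf
like that one (boost values here are made up):

  defType=edismax&qf=site^1.5 content^0.5&pf=site^10 content^5

With pf, documents where the entire query appears as a phrase in one of the
listed fields get an extra boost, without changing which documents match.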
On 05.12.2011 14:28, Per Steffensen wrote:
Hi
Reading http://wiki.apache.org/solr/SolrReplication I notice the
"pollInterval" (guess it should have been "pullInterval") on the slaves.
That indicates to me that indexed information is not really "pushed" from
master to slave(s) on events defined
Why not just use the first form of the document
and just facet.field=category? You'll get
two different facet counts for XX and YY
that way.
I don't think grouping is the way to go here.
Best
Erick
On Sat, Dec 3, 2011 at 6:43 AM, Juan Pablo Mora wrote:
> I need to do some counts on a StrField f
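A minimal illustration of that suggestion (rows=0 since only the counts are
needed):

  /select?q=*:*&rows=0&facet=true&facet.field=category

Each distinct value of the category field then comes back with its own count,
so XX and YY each get their own number.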
> Is there a formal manner to transfer the data to a database or file-format
> from which it can be reloaded? I would say an export to a CSV file (which
> could become huge) and then reload it from that?
Not quite sure what you mean by that. The data is not *in* the
solr index if it's not stor
Some details please. Are you indexing and searching
on the same machine? How are you committing?
After every add? Via commitWithin? Via autocommit?
What version of Solr? What environment?
You might review:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Fri, Dec 2, 2011 at 2:35 PM, M
There's no good way to say to Solr "Use only this
much memory for searching". You can certainly
limit the size somewhat by configuring your caches
to be small. But if you're sorting, then Lucene will
use up some cache space etc.
Are you actually running into problems?
Best
Erick
On Fri, Dec 2, 2
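For instance, deliberately small caches in solrconfig.xml would look
something like this (sizes are illustrative):

  <filterCache class="solr.FastLRUCache" size="128" initialSize="128" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="128" initialSize="128" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="128" initialSize="128"/>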
Well, Solr is a text search engine, and a good one. But this sure
feels like a problem that RDBMSs were built to handle. Why do
you want to do this? Is your current performance a problem?
Are you blowing your space resources out of the water? Do you
want to distribute your app to places not connect
Right, the Solr/Lucene query syntax isn't true Boolean logic, so
applying all the neat DeMorgan's rules is sometimes surprising.
The first form takes all records with event dates or that fall outside your range
and inverts the results.
The second selects all documents that fall in the indicated da
Hi
Reading http://wiki.apache.org/solr/SolrReplication I notice the
"pollInterval" (guess it should have been "pullInterval") on the slaves.
That indicates to me that indexed information is not really "pushed" from
master to slave(s) on events defined by "replicateAfter" (e.g. commit),
but tha
Have you looked at:
http://wiki.apache.org/solr/SolrCaching
?
But no, they aren't used for the same thing. The people
who work on the code work hard to keep the memory
use down.
Best
Erick
On Fri, Dec 2, 2011 at 4:37 AM, RT RT wrote:
> Hi,
>
> I'm trying to understand caching, looking on the wi
You can try bumping up the timeouts in your SolrJ program, the
SolrServer has a bunch of timeout options.
You can pretty easily tell if the optimize has carried through
anyway, your index files should have been reduced
substantially. But I'm pretty sure it's completing successfully.
Why call it w
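A minimal SolrJ sketch of bumping those timeouts (Solr 3.x-era API; the URL
and values are illustrative):

  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

  public class OptimizeWithTimeouts {
    public static void main(String[] args) throws Exception {
      CommonsHttpSolrServer server =
          new CommonsHttpSolrServer("http://localhost:8983/solr");
      server.setConnectionTimeout(10000); // ms allowed to establish the HTTP connection
      server.setSoTimeout(600000);        // ms allowed for the response; optimize can take a while
      server.optimize();                  // blocks until the optimize request returns
    }
  }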
Yes, if you take a look at the debugQuery output you'll see the generated
query. It should contain the fields and boost as specified in the "q"
parameter.
You could also use index-time boosting if those boosts are static.
On Mon, Dec 5, 2011 at 9:55 AM, Robert Brown wrote:
> So I need to explici
So I need to explicitly set the boosts in the query?
ie
q=+(field1:this^2 field1:"that thing"^4) +(field2:other^3)
---
IntelCompute
Web Design & Local Online Marketing
http://www.intelcompute.com
On Mon, 5 Dec 2011 09:49:34 -0300, Tomás Fernández Löbbe
wrote:
> In this case, the boost and
In this case, the boost and fields in the "qf" parameter won't be
considered for the search. With this query Solr will search for documents
with the terms "this" and/or (depending on your default operator) "that" in
the field1 and the term "other" in the field2
On Mon, Dec 5, 2011 at 9:44 AM, Robe
Thanks Tomás,
My example should have read...
q=+(field1:this field1:that) +(field2:other)
I'm using edismax.
so with this approach, the boosts as specified in solrconfig qf will
remain in place?
---
IntelCompute
Web Design & Local Online Marketing
http://www.intelcompute.com
On Mon, 5 Dec
You could try adding a
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LengthFilterFactory
Regards,
Tomás
On Mon, Dec 5, 2011 at 6:01 AM, Marian Steinbach wrote:
> Hi!
>
> I am surprised to find an empty string as the most frequent index term in
> one of my fields. Until now I
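For reference, the filter suggested above goes into the field type's analyzer
chain; a minimal sketch (min/max values are illustrative):

  <filter class="solr.LengthFilterFactory" min="1" max="255"/>

With min="1", zero-length tokens are dropped before they are indexed.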
Hi Robert, the answer depends on the query parser you are using. If you are
using the "edismax" query parser, then the "qf" will only be used when you
don't specify any field in the "q" parameter. In your example the result
query will be boolean queries for "this" and "that" in the field1 and a
Di
Tomás Fernández Löbbe wrote:
Hi, AFAIK SolrCloud still doesn't support replication, that's why in the
example you have to copy the directory manually. Replication has to be
implemented by using the SolrReplication as you mentioned or use some kind
of distributed indexing (you'll have to do it you
Hi, AFAIK SolrCloud still doesn't support replication, that's why in the
example you have to copy the directory manually. Replication has to be
implemented by using the SolrReplication as you mentioned or use some kind
of distributed indexing (you'll have to do it yourself). SolrReplication
stuff i
Hi
My guess is that the work for achieving
http://wiki.apache.org/solr/NewSolrCloudDesign has begun on the branch
"solrcloud". It is hard to follow what is going on and how to use what
has been achieved - you cannot follow the examples on
http://wiki.apache.org/solr/SolrCloud anymore (e.g. there
If I have a set list in solrconfig for my "qf" along with their
boosts, and I then specify field names directly in q (where I could
also override the boosts), are the boosts left in place, or reset to 1?
this^3
that^2
other^9
ie q=field1:+(this that) +(other)
--
IntelCompute
We
Hi all,
I am looking for a solution where I want the facets obtained based on the
paging of solr documents.
For example:
say I have a query *:* and set start=0 and rows=10, and then I want facets on
any one of the fields in the 10 docs obtained, and not on the entire set of
docs for which the query matched
Hi
I have been working with ElasticSearch for a while now, and find it very
cool. Unfortunately we are no longer allowed to use ElasticSearch in our
project. Therefore we are looking for alternatives - Solr(Cloud) is an
option.
I have been looking at SolrCloud and worked through the "example
Hello,
I have one solr instance and I'm very happy with that. Now we have multiple
daily updates, and I see the response time is slower when doing an update. I
think I need some master-slave replication. Now my question is: is a slave
slower when there is a replication running from master to slave
Hi Marian,
thanks for your answer.
Using a copyField is a good idea.
Mark
2011/12/5 Marian Steinbach :
> Hi Mark!
>
> You could help yourself with creating an additional field. One field would
> hold the stemmed version and the other one would hold the unstemmed
> version.
>
> This would allow f
Hello friends,
I have integrated solr in my alfresco document management server.
I want to search on the metadata of documents stored in alfresco, and I also
want to display it in the search results.
I have added the meta tags as fields in schema.xml. But I am still unable to
search on the metadata of the do
Hi Alan,
Solr can do this fast and easy, but I wonder if a simple key-value store
won't fit your needs better.
Do you really only need to query by chart_id, or do you also need to
query by time range?
In either case, as long as your data fits into an in-memory database, I
would suggest
Hi Mark!
You could help yourself with creating an additional field. One field would
hold the stemmed version and the other one would hold the unstemmed
version.
This would allow for a higher boost on the unstemmed field.
Use copyField for convenience to copy the content from one field to the
oth
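A sketch of that setup in schema.xml (field and type names are hypothetical):

  <field name="body"       type="text_stemmed"   indexed="true" stored="true"/>
  <field name="body_exact" type="text_unstemmed" indexed="true" stored="false"/>
  <copyField source="body" dest="body_exact"/>

and then search both fields with a higher boost on the exact one, e.g.
qf=body_exact^3 body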
Hi,
I'd like to use the HunspellStemFilterFactory to improve my search results.
Why isn't there an arg "inject" like in solr.PhoneticFilterFactory to
add tokens instead of replacing them?
I don't want to replace them, because documents with the "unstemmed"
word should be more relevant.
Thanks.