Maybe you need the multi-core feature of Solr; you can have a single Solr
instance with separate configurations and indexes:
http://wiki.apache.org/solr/CoreAdmin
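A minimal multi-core solr.xml sketch (core names and instance directories here are hypothetical); each core gets its own conf/ and data/ directories:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- each core has its own schema.xml, solrconfig.xml, and index -->
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
```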
On Fri, Jun 3, 2011 at 12:04 PM, Naveen Gupta wrote:
> Hi
>
> I want to implement different index strategy where we want to keep inde
Hi all,
I wrote my own SearchHandler and therefore overrode the handleRequestBody
method. This method takes two input parameters: a SolrQueryRequest and a
SolrQueryResponse object.
What I'd like to do is get the query fields that are used in my
request.
Of course I can use req.getParams().g
Hi all,
I'm using the CJKTokenizerFactory tokenizer to handle text which contains both
Japanese and alphabetic words. However, I noticed that CJKTokenizerFactory
converts the alphabet to lowercase, so I cannot use the
WordDelimiterFilterFactory filter with the splitOnCaseChange property for
camel-case words.
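For reference, the field type in question looks roughly like this (an illustrative sketch, not the poster's actual schema). Because CJKTokenizerFactory lowercases alphabetic tokens as it tokenizes, the later splitOnCaseChange has no case information left to act on:

```xml
<fieldType name="text_cjk" class="solr.TextField">
  <analyzer>
    <!-- CJKTokenizer lowercases alphabetic tokens during tokenization,
         so case distinctions are gone before later filters run -->
    <tokenizer class="solr.CJKTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1"/>
  </analyzer>
</fieldType>
```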
@ Pravesh: It's 2 separate cores, not 2 indexes. Sorry for that.
@ Erick: Yes, I've seen this suggestion and it seems to be the only possible
solution. I'll look into it.
Thanks for your answers guys!
Kurt
On Wed, Jun 1, 2011 at 4:24 PM, Erick Erickson wrote:
> If I read this correctly, one app
Hi,
We have stemming in our Solr search and we need to retrieve the word/phrase
after stemming. That is if I search for "oranges", through stemming a search
for "orange" is carried out. If I turn on debugQuery I would be able to see
this, however we'd like to access it through the result if possib
You can use DataImportHandler for your full/incremental indexing. NRT
indexing requirements can vary per business need (I mean the acceptable delay
could be 5, 10, 15, or 30 minutes). It also depends on how much volume will
be indexed incrementally.
By the way, are you running a Master+Slave Solr setup?
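For the incremental part, DataImportHandler's delta-import can be driven by a last-modified column; a minimal data-config sketch (table and column names here are hypothetical):

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/db"/>
  <document>
    <!-- deltaQuery finds changed rows; deltaImportQuery re-fetches each one -->
    <entity name="item" pk="id"
            query="SELECT id, title FROM item"
            deltaQuery="SELECT id FROM item
                        WHERE last_modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, title FROM item
                              WHERE id = '${dataimporter.delta.id}'"/>
  </document>
</dataConfig>
```

The delta run is then triggered with /dataimport?command=delta-import on whatever schedule the business tolerates.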
--
V
By the way, why are you sorting on this field?
You could also index and store this field twice: first with its original
value, and second encoded to some unique code/hash, indexed so that you can
sort on that.
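One common way to realize that second, sortable copy is a copyField into a field type whose analyzer emits a single normalized token (a sketch; the field names are hypothetical):

```xml
<fieldType name="sortable" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <!-- one token per value, lowercased, so sorting is case-insensitive -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title" type="text" indexed="true" stored="true"/>
<field name="title_sort" type="sortable" indexed="true" stored="false"/>
<copyField source="title" dest="title_sort"/>
```

Queries then sort with sort=title_sort asc while displaying the original title field.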
--
View this message in context:
http://lucene.472066.n3.nabble.com/Sorting-tp3017285p3019055.html
Hi Tomás
Thanks, that makes a lot of sense, and your math is sound.
It is working well. An if() function would be great, and it seems it's coming
soon.
Richard
--
View this message in context:
http://lucene.472066.n3.nabble.com/Sorting-algorithm-tp3014549p3019077.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
in Solr 4.x (trunk version of mid may) I have noticed a null pointer
exception if I activate debugging (debug=true) and use a wildcard to
filter by facet value, e.g.
if I have a price field
..."&debug=true&facet.field=price&fq=price[500+TO+*]"
I get
SEVERE: java.lang.RuntimeException: ja
Stefan,
I guess there is a colon missing? &fq=price:[500+TO+*] should do the trick.
Regards
Stefan
On Fri, Jun 3, 2011 at 11:42 AM, Stefan Moises wrote:
> Hi,
>
> in Solr 4.x (trunk version of mid may) I have noticed a null pointer
> exception if I activate debugging (debug=true) and use a wildc
Hi Stefan,
sorry, actually there is a colon, I just forgot it in my example...
so the exception also appears for
&fq=price:[500+TO+*]
But only if debug=true... and "normal" price values work, e.g.
&fq=price:[500+TO+999]
Thanks,
Stefan
Am 03.06.2011 11:46, schrieb Stefan Matheis:
Stefan,
i
Hi
We want to post some files (RTF, DOC, etc.) to the Solr server using
PHP. One way is to post using curl.
Is there any client like the Java client (Solr Cell)?
URLs would also help.
Thanks
Naveen
Hi Kurt,
I think this is a bit more tricky than that.
For example, if a user searches for "oranges", the stemmer may return
"orang" which is not an existing word.
So getting stemmed words might/will not work for your highlighting purpose.
Ludovic.
-
Jouve
France.
--
View this message in co
On Fri, Jun 3, 2011 at 3:55 PM, Naveen Gupta wrote:
> Hi
>
> We want to post to solr server with some of the files (rtf,doc,etc) using
> php .. one way is to post using curl
I do not normally use PHP, and have not tried it myself.
However, there is a PHP extension for Solr:
http://wiki.apache.org
Hey Erick,
I wrote a separate process as you suggested, and achieved the task.
Thanks a lot
Vishal Parekh
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-update-database-record-after-indexing-tp2874171p3019217.html
Thanks to all,
I did it by using multicore.
vishal parekh
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-do-offline-adding-updating-index-tp2923035p3019219.html
Thanks kbootz,
Your suggestion works fine.
vishal parekh
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-concatenate-two-nodes-of-xml-with-xpathentityprocessor-tp2861260p3019223.html
Hi Jamie,
I don't know why range facets didn't make it into SolrJ. But I've recently
opened an issue for this:
https://issues.apache.org/jira/browse/SOLR-2523
I hope this will be committed soon. Check the patch out and see if you like
it.
Martijn
On 2 June 2011 18:22, Jamie Johnson wrote:
> C
Dear Solr experts,
I am curious to learn what visualization tools are out there to help me
"visualize" my query results. I am not talking about a language specific
client per se but something more like Carrot2 which breaks clusters in to
their knowledge tree and expandable pie chart. Sorry if thos
Hi Pravesh
We don't have that setup right now, but we are thinking of doing that:
for writes we are going to have one instance, and for reads we are going to
have another.
Do you have another design in mind? Kindly share.
Thanks
Naveen
On Fri, Jun 3, 2011 at 2:50 PM, pravesh wrote:
> You c
Yes,
That one I used, and it is working fine. Thanks to Nabble.
Thanks
Naveen
On Fri, Jun 3, 2011 at 4:02 PM, Gora Mohanty wrote:
> On Fri, Jun 3, 2011 at 3:55 PM, Naveen Gupta wrote:
> > Hi
> >
> > We want to post to solr server with some of the files (rtf,doc,etc) using
> > php .. one way
Hi Naveen:
Solr with RankingAlgorithm supports NRT. The performance is about 262
docs / sec. You can get more information about the performance and NRT
from here:
http://solr-ra.tgels.com/wiki/en/Near_Real_Time_Search
You can download Solr with RankingAlgorithm from here:
http://solr-ra.tgels
What is the "best practice" method to index the following in Solr:
I'm attempting to use solr for a book store site.
Each book will have a price but on occasions this will be discounted. The
discounted price exists for a defined time period but there may be many
discount periods. Each discount wi
You can go ahead with the Master/Slave setup provided by Solr. It's trivial to
set up, and you also get Solr's operational scripts for index syncing between
master and slave(s), or the Java-based replication feature.
There is no need to re-invent another architecture :)
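A sketch of the Java-based replication configuration in solrconfig.xml (the master host and poll interval here are illustrative):

```xml
<!-- on the master -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- on each slave -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```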
--
View this message in context:
Hello,
I'm trying to move a VuFind installation from an ailing physical server into a
virtualized environment, and I'm running into performance problems. VuFind is
a Solr 1.4.1-based application with fairly large and complex records (many
stored fields, many words per record). My particular i
You've got to tell us more about your setup. We can only guess that you're
on a remote file system and there's a problem there, which would be a
network problem outside of Solr's purview.
You might want to review:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Fri, Jun 3, 2011 at
Romi:
Please review:
http://wiki.apache.org/solr/UsingMailingLists
This is the Solr forum. jQuery questions are best directed at a
jQuery-specific forum.
Best
Erick
On Fri, Jun 3, 2011 at 2:27 AM, Romi wrote:
> lee carroll: Sorry for this. i did this because i was not getting any
> response. a
Hmmm, I just tried it on a trunk from a couple of days ago and it
doesn't error out.
Could you re-try with a new build?
Thanks
Erick
On Fri, Jun 3, 2011 at 5:51 AM, Stefan Moises wrote:
> Hi Stefan,
> sorry, actually there is a colon, I just forgot it in my example...
> so the exception also app
I'm not quite sure what you mean by "visualization" here. Do you
want to see the query parse tree? The results list in something other
than XML (see the /browse functionality if so). How documents are
ranked?
"Visualization" is another overloaded word ...
Best
Erick
On Fri, Jun 3, 2011 at 7:13 A
Hi Erick
sure, thanks for looking into it! I'll let you know if it's working for
me there, too...
(I'm using edismax btw., but I've also tested with standard and got the
exception)
Stefan
Am 03.06.2011 15:22, schrieb Erick Erickson:
Hmmm, I just tried it on a trunk from a couple of days ago
How often are the discounts changed? Because you can simply
re-index the book information with a multiValued "discounts" field
and get something similar to your example (&wt=json)
Best
Erick
On Fri, Jun 3, 2011 at 8:38 AM, Judioo wrote:
> What is the "best practice" method to index the foll
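As a sketch of what such a book document could look like in the JSON response (&wt=json) with a multiValued "discounts" field; the field names and the start|end|price encoding are hypothetical, decoded by the application:

```json
{
  "id": "book-123",
  "title": "Example Book",
  "price": 9.99,
  "discounts": [
    "2011-06-01T00:00:00Z|2011-06-07T23:59:59Z|7.99",
    "2011-06-20T00:00:00Z|2011-06-27T23:59:59Z|6.99"
  ]
}
```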
This bug was introduced during the cutover from strings to BytesRef on
TermRangeQuery.
I just committed a fix.
-Yonik
http://www.lucidimagination.com
On Fri, Jun 3, 2011 at 5:42 AM, Stefan Moises wrote:
> Hi,
>
> in Solr 4.x (trunk version of mid may) I have noticed a null pointer
> exception if
Do be careful how often you pull down indexes on your slaves. A
too-short polling interval can
lead to some problems. Start with, say, 5 minutes and ensure that your
autowarm time (see your
logs) is less than your polling interval.
Best
Erick
On Fri, Jun 3, 2011 at 8:43 AM, pravesh wrote:
>
Demian,
* You can run iostat or vmstat and see if there is disk IO during your slow
queries and compare that to disk IO (if any) with your fast/cached queries
* You can make sure you warm up your index well after the first and any new
searcher, so that OS and Solr caches are warmed up
* You can
This doesn't seem right. Here are a couple of things to try:
1> attach &debugQuery=on to your long-running queries. The QTime returned
is the time taken to search, NOT including the time to load the
docs. That'll
help pinpoint whether the problem is the search itself, or assembling the
Hi Adam,
Try this:
http://lmgtfy.com/?q=search%20results%20visualizations
In practice I find that visualizations are cool and attractive looking, but
often text is more useful because it's more direct. But there is room for
graphical representation of search results, sure.
Otis
Sematext
Hi Dmitry,
Yes, you could also implement your own custom SearchComponent. In this
component you could grab the query param, examine the query value, and based on
that add the shards URL param with appropriate value, so that when the regular
QueryComponent grabs stuff from the request, it has t
Hi,
I'm guessing your index is on some sort of network drive that got detached?
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Gaurav Shingala
> To: Apache SolrUser
> Sent: Fri, June
Hey Guys
Just a test mail, please ignore this.
--
Thanks & Regards
Jasneet Sabharwal
Software Developer
NextGen Invent Corporation
Thanks Erick for the response.
So my data structure is the same, i.e. they all use the same schema. Though
I think it makes sense for us to somehow break apart the data, for example
by the date it was indexed. I'm just trying to get a feel for how large we
should aim to keep those (by day, by we
Thanks to you and Otis for the suggestions! Some more information:
- Based on the Solr stats page, my caches seem to be working pretty well (few
or no evictions, hit rates in the 75-80% range).
- VuFind is actually doing two Solr queries per search (one initial search
followed by a supplemental
I am noticing something strange with our recent upgrade to solr 3.1 and
want to see if anyone has experienced anything similar.
I have a solr.StrField field named Status; the values are Enabled,
Disabled, or ''.
When I facet on that field I get:
Enabled 4409565
Disabled 29185
"" 112
The is
Hi,
We migrated to Solr a few days back, but now after going live we have
noticed a performance drop, especially when we do a delta index, which we
execute every hour with around 100,000 records. We have a multi-core
Solr server running on a Linux machine, with 4 GB given to the JV
Because when browsing through legislation, people want to browse in
the same order as it is actually printed in the hard copy volumes.
It did work by using a copyfield to a lowercase field.
On Fri, Jun 3, 2011 at 2:29 AM, pravesh wrote:
> BTW, why r u sorting on this field?
> You could also index
So here's what I'm seeing: I'm running Solr 3.1.
I'm running a Java client that executes an HttpGet (I tried HttpPost) with a
large shard list. If I remove a few shards from my current list it returns
fine; when I use my full shard list I get a "HTTP/1.1 400 Bad Request". If
I execute it in Firefox
It sounds like you're hitting the max URL length (8K is a common default) for
the HTTP web server that you're using to run Solr.
All of the web servers I know about let you bump this limit up via
configuration settings.
-- Ken
On Jun 3, 2011, at 9:27am, JohnRodey wrote:
> So here's what I'm s
Rohit:
Yes, run indexing on one machine (master), searches on the other (slave), and
set up replication between them. Don't optimize your index, and warm up the
searcher
and caches on the slaves. No downtime.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem sea
Right, if you facet results, then your warmup queries should include those
facets. The same with sorting. If you sort on fields A and B, then include
warmup queries that sort on A and B.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://searc
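In solrconfig.xml that translates into a newSearcher warming listener; a sketch, with A and B standing in for whatever fields real queries actually facet and sort on:

```xml
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- warm the caches with the facets and sorts real traffic uses -->
    <lst><str name="q">*:*</str><str name="facet">true</str>
         <str name="facet.field">A</str></lst>
    <lst><str name="q">*:*</str><str name="sort">A asc</str></lst>
    <lst><str name="q">*:*</str><str name="sort">B desc</str></lst>
  </arr>
</listener>
```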
Dan, does the problem go away if you get rid of those 112 documents with empty
Status or replace their empty status value with, say, "Unknown"?
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
>
Hi Otis,
Thanks! This sounds promising. Will this custom implementation hurt the
stability of the front-end Solr in any way? After implementing it, can I
run some tests to verify the stability / performance?
Dmitry
On Fri, Jun 3, 2011 at 4:49 PM, Otis Gospodnetic wrote:
> Hi Dmitry,
>
> Yes
Nah, if you can quickly figure out which shard a given query maps to, then all
this component needs to do is stick the appropriate shards param value in the
request and let the request pass through to the other SearchComponents in the
chain, including QueryComponent, which will know what to do
Otis and Erick,
Believe it or not, I did Google this and didn't come up with anything all
that useful. I was at the Lucene Revolution conference last year and saw
some presentations that had some sort of graphical representation of the query
results. The one from Basic Tech especially caught my attention
Hi,
Is it just me, or would others like things like:
* The ability to tell Solr (by passing some URL param?) to skip one or more of
its caches and get data from the index
* An additional attrib in the Solr response that shows whether the query came
from the cache or not
* Maybe something else a
Got it, I can quickly figure the shard out, thanks a lot Otis!
Dmitry
On Fri, Jun 3, 2011 at 8:00 PM, Otis Gospodnetic wrote:
> Nah, if you can quickly figure out which shard a given query maps to, then
> all
> this component needs to do is stick the appropriate shards param value in
> the
> re
It sounds like you need to increase the HTTP header size.
In tomcat the default is 4096 bytes, and to change it you need to add
maxHttpHeaderSize="" to the connector definition in server.xml
Colin.
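For example, the connector in Tomcat's server.xml might look like this (the port and size are illustrative):

```xml
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxHttpHeaderSize="65536"
           redirectPort="8443"/>
```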
-Original Message-
From: Ken Krugler [mailto:kkrugler_li...@transpac.com]
Sent: Friday,
Otis, I just deleted the documents and committed and I still get that error.
Thanks,
Dan
On 6/3/11 9:43 AM, Otis Gospodnetic wrote:
Dan, does the problem go away if you get rid of those 112 documents with empty
Status or replace their empty status value with, say, "Unknown"?
Otis
Semate
On Jun 2, 2011, at 8:29 PM, Naveen Gupta wrote:
> and what about NRT, is it fine to apply in this case of scenario
Is NRT really what's wanted here? I'm asking the experts, as I have a situation
not too different from the b.p.
It appears to me (from the docs) that NRT makes a difference in the
Hi,
Why not use HTTP POST?
Dmitry
On Fri, Jun 3, 2011 at 8:27 PM, Colin Bennett wrote:
> It sounds like you need to increase the HTTP header size.
>
> In tomcat the default is 4096 bytes, and to change it you need to add
> maxHttpHeaderSize="" to the connector definition in server.xml
>
> Coli
$ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0" >> resp.xml
$ xmlstarlet sel -t -v "//@numFound" resp.xml
--
Regards,
K. Gabriele
--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the
Yep that was my issue.
And like Ken said on Tomcat I set maxHttpHeaderSize="65536".
--
View this message in context:
http://lucene.472066.n3.nabble.com/Hitting-the-URI-limit-how-to-get-around-this-tp3017837p3020774.html
: How to know how many documents are indexed? Anything more elegant than
: parsing numFound?
> $ curl "http://192.168.34.51:8080/solr/select?q=*%3A*&rows=0"
> >> resp.xml
> $ xmlstarlet sel -t -v "//@numFound" resp.xml
solr/admin/stats.jsp is actually XML too, and contains numDocs and maxDoc
i
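For completeness, the xmlstarlet step can also be done with the Python standard library; a small sketch that pulls numFound out of a Solr XML response (the sample response string below is illustrative, not from a live server):

```python
# Extract numFound from a Solr select response without xmlstarlet,
# using only the standard library.
import xml.etree.ElementTree as ET

SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="4438862" start="0"/>
</response>"""

def num_found(xml_text):
    """Return the numFound attribute of the <result> element."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    return int(result.get("numFound"))

print(num_found(SAMPLE_RESPONSE))  # prints 4438862
```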
Hi all,
I need to highlight searched words in the original text (xml) of a document.
So I'm trying to develop a new Highlighter which uses the defaultHighlighter
to highlight some fields and then retrieve the original text file/document
(external or internal storage) and put the highlighted part
And what happens if you add &fl=?
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: dan whelan
> To: solr-user@lucene.apache.org
> Sent: Fri, June 3, 2011 1:38:33 PM
> Subject: Re: fq nul
Yes, when people talk about NRT search they refer to 'add to view lag'. In a
typical Solr master-slave setup this is dominated by waiting for replication,
doing the replication, and then warming up.
If your problem is indexing speed then that's a separate story that I think
you'll find answers
To clarify a bit more, I took a look at this function:
termPositions
public TermPositions termPositions()
throws IOException
Description copied from class: IndexReader
Returns an unpositioned TermPositions enumerator.
But it returns an unpositioned enumerat
> I need to highlight searched words in the original text
> (xml) of a document.
Why don't you remove XML tags in an analyzer? You can highlight XML by doing so.
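One way to do that (a sketch) is a char filter that strips markup before tokenizing, e.g. HTMLStripCharFilterFactory, which handles XML-style tags as well:

```xml
<fieldType name="text_xml" class="solr.TextField">
  <analyzer>
    <!-- strip tags before tokenizing; the stored original keeps its markup -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```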
$ curl --fail "http://192.168.34.51:8080/solr/admin/stats.jsp" >> resp.xml
$ xmlstarlet sel -t -v "//@numDocs" resp.xml
*Extra content at the end of the document*
On Fri, Jun 3, 2011 at 8:56 PM, Ahmet Arslan wrote:
> : How to know how many documents are indexed? Anything more elegant than
> : p
It returned results when I added the fl param.
Strange... wonder what is going on there
Thanks,
Dan
On 6/3/11 12:17 PM, Otis Gospodnetic wrote:
And what happens if you add&fl=?
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-luce
Quick impressions:
Faceting is usually best done on fields that don't have lots of unique
values, for three reasons:
1> It's questionable how much use a gazillion facets are to the user.
In the case of a field unique per document, in fact, it's useless.
2> Resource requirements go up a
Nope, cores are just a self-contained index, really.
What is the point of breaking them up? If you have some kind
of rolling currency (i.e. you only want to keep the last N days/weeks/months)
then you can always delete-by-query to age-out the relevant docs.
You'll be able to fit more on one serve
The original document is not indexed. Currently it is just stored, and could
be stored in a filesystem or a database in the future.
The different parts of a document are indexed in multiple different fields
with some different analyzers (stemming, multiple languages, regex, ...).
So, I don't thin
Why, I'm just wondering?
For a case where you know the next query could not already be in the
cache because it is so different from the norm?
Just for timing information, for instrumentation used for tuning (i.e. so
you can compare cached response times vs. non-cached response time
On Fri, Jun 3, 2011 at 1:02 PM, Otis Gospodnetic
wrote:
> Is it just me, or would others like things like:
> * The ability to tell Solr (by passing some URL param?) to skip one or more of
> its caches and get data from the index
Yeah, we've needed this for a long time, and I believe there's a JIR
Dan, this doesn't really have anything to do with your filter on the
Status field except that it causes different documents to be selected.
The root cause is a schema mismatch with your index.
A string field (or so the schema is saying it's a string field) is
returning "null" for a value, which is
Right, so now try adding different fields and see which one breaks it again.
Then you know which field is a problem and you can dig deeper around that field.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original
Robert,
Mainly so that you can tell how fast the search itself is when query or
documents or filters are not cached.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Robert Petersen
>
Romi,
If you don't have a unique ID field, you can always create a UUID - see
http://search-lucene.com/?q=uuid&fc_type=javadoc
If you don't want to use QEC, remove it from the list of components in
solrconfig.xml
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosyst
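The schema side of that is a UUIDField with default="NEW", so Solr generates the value itself; a sketch:

```xml
<fieldType name="uuid" class="solr.UUIDField" indexed="true"/>
<field name="id" type="uuid" indexed="true" stored="true" default="NEW"/>
```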
Roger, wrong list.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Roger Shah
> To: "solr-user@lucene.apache.org"
> Sent: Thu, May 26, 2011 3:06:15 PM
> Subject: Nutch Crawl error
>
>
Greetings all, I found a bug today while trying to upgrade from 1.4.1 to 3.1.
In 1.4.1 I was able to insert this doc:
User
14914457UserSan
Franciscojtoyjtoylife
hacker0.05
And then I can run the query:
http://localhost:8983/solr/select?q=life&qf=description_text&defType=dismax&sort=scores:rails_
Hi,
Discounts can change daily. Also there can be a lot of them (over time and
in a given time period).
Could you give an example of what you mean by multi-valuing the field.
Thanks
On 3 June 2011 14:29, Erick Erickson wrote:
> How often are the discounts changed? Because you can simply
> re