I don't know - by chance, I'm actually doing about the same sequence of events
right now with Solr 4.1, and the cores are running fine…
What do the logs say?
- Mark
On Feb 14, 2013, at 10:18 PM, Anirudha Jadhav wrote:
> *1.empty Zookeeper*
> *2.empty index directories for solr*
> *3.empty sol
I'm trying to explore part-of-speech tagging with SOLR. Firstly, am I
right in assuming that OpenNLP integration is the right direction in
which to proceed?
With respect to getting OpenNLP to work with SOLR
(http://wiki.apache.org/solr/OpenNLP), I tried following the
instructions, only to be
*1.empty Zookeeper*
*2.empty index directories for solr*
*3.empty solr.xml*
*3.1 upload / link cfg in zookeeper for test collection*
*4. start 4 solr servers on different machines*
*5. Access server*: I see
that's ok
*6. CREATE collection*
http://hostname:15000/solr/admin/collections?a
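For reference, a typical Collections API CREATE call looks like this (collection name and counts illustrative):
http://hostname:15000/solr/admin/collections?action=CREATE&name=test&numShards=4&replicationFactor=1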
You could run Lucene benchmark stuff and compare. Or look at
ActionGenerator from Sematext on Github which you could also use for
performance testing and comparing.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Feb 14, 2013 10:56 AM, "Michael Della Bitta" <
michael.della.bi...@appinion
Hi,
I have a column called 'lastUpdate' in my Solr index which contains the
last updated date. Now I want to fetch the last 24 lastUpdate dates from that
column. How can I do this?
Querying the Solr server with the following URL fetches me the result:
http://localhost/solr/MC_10701_catalogEntry/q=lastU
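A date range query is one way to express this; a sketch, assuming lastUpdate is a Solr date field (core name taken from the URL above, other values illustrative):
http://localhost/solr/MC_10701_catalogEntry/select?q=*:*&fq=lastUpdate:[NOW-1DAY TO NOW]&fl=lastUpdate&sort=lastUpdate desc&rows=24
The fq restricts results to the last 24 hours; dropping the fq and keeping the sort plus rows=24 would instead return the 24 most recent lastUpdate values.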
Use the edismax query parser and set the PF, PF2, and PF3 parameters so that
adjacent pairs and triples of query terms will get "phrase boosted".
See:
http://wiki.apache.org/solr/ExtendedDisMax#pf_.28Phrase_Fields.29
http://wiki.apache.org/solr/ExtendedDisMax#pf2_.28Phrase_bigram_fields.29
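For example, a sketch with illustrative field names and boost weights:
q=red leather sofa&defType=edismax&qf=name&pf=name^50&pf2=name^20&pf3=name^30
pf boosts documents where the whole query matches name as a phrase, while pf2 and pf3 boost adjacent word pairs and triples.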
-- J
Howdy,
I have a straight-forward index that contains a "name" field. I am currently
taking a string of text, tokenizing it into individual strings and making a
query out of them all against the "name" field.
Note that the name field is split up by a whitespace tokenizer and a lower
case filter du
We have two boxes; they are really nice servers: 32 CPU cores, 192 GB of
memory, with both RAID arrays and Fusion-io cards. But each of them runs
two instances of Solr, one for indexing and the other for
searching. The search index is on the Fusion-io card.
Each instance has 11 cores and a small core for making in
I took your advice, waited for the servers to go down, then:
[ec2-user@zuk-solr-slave-02 ~]$ ps -wwwf -p 10131
UID        PID  PPID  C STIME TTY          TIME CMD
tomcat   10131     1 17 23:00 ?        00:03:13 /usr/sbin/sshd
This doesn't say much :(
What should I do now?
--
View this mess
Hi,
I am curious to know how many Linux boxes you have and how many cores in
each of them. It was my understanding that Solr puts into memory all
documents found for a keyword, not the whole index. So why must it be faster
with more cores, when the number of selected documents from many sepa
Just to close this discussion: we solved the problem by splitting the index.
It turned out that distributed search across 12 cores is faster than
searching two cores.
All queries, tomcat configuration, and JVM configuration remain the same. Now
queries are served in milliseconds.
On Thu, Jan 31, 2013 at
Shawn,
Awesome. Exactly something I am looking for.
Thanks!
Ming
On Thu, Feb 14, 2013 at 12:00 PM, Shawn Heisey wrote:
> On 2/14/2013 12:46 PM, Mingfeng Yang wrote:
>
>> I have a few Solr indexes, each with 20-200 million documents, which were
>> indexed by querying multiple PostgreSQL data
On 2/14/2013 12:46 PM, Mingfeng Yang wrote:
I have a few Solr indexes, each with 20-200 million documents, which were
indexed by querying multiple PostgreSQL databases. If I rebuild the
index the same way, it will take a few months, because the PostgreSQL
query is slow.
Now, I need to
I have a few Solr indexes, each with 20-200 million documents, which were
indexed by querying multiple PostgreSQL databases. If I rebuild the
index the same way, it will take a few months, because the PostgreSQL
query is slow.
Now, I need to do the following changes to all indexes.
1. de
Steve Rowe [sar...@gmail.com] wrote:
> On Feb 14, 2013, at 11:24 AM, Walter Underwood wrote:
> > Laptop disks are slower than the EC2 disks.
> My laptop disk is an SSD.
So it's not a disk? ...Sorry, couldn't resist.
Unfortunately Amazon only has two SSD-backed solutions and they are #3 and #2
I think the order needs to be in lowercase. Try "asc" instead of "ASC".
Should be trivial to support uppercase ASC and DESC as well, not sure why
no one thought of adding that before...
https://issues.apache.org/jira/browse/SOLR-4458
...patches welcome
-Hoss
Oops - that's definitely not the link I meant to give ;-) Here's the
link from slideshare:
http://www.slideshare.net/thelabdude/boosting-documents-in-solr-lucene-revolution-2011
In there we used Mahout to calculate recommendation scores and then
loaded them using external file field.
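For reference, an external file field is declared along these lines in the example schema (field and key names illustrative):
<fieldType name="file" class="solr.ExternalFileField" keyField="id" defVal="0" stored="false" indexed="false" valType="pfloat"/>
The scores live in a file named external_<fieldname> in the index data directory, one docId=score line per document.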
Cheers,
Tim
Start by looking at Solr's external file field and
http://www.linkedin.com/profile/view?id=18807864&trk=tab_pro
On Thu, Feb 14, 2013 at 6:24 AM, Á_o wrote:
> Well, thinking a bit more, the second solution is not practical.
>
> If Solr retrieves, say, 1,000 documents, I would have to navigate
Works perfectly. Thank you. I didn't know there was a tokenizer that does nothing
before :)
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-define-a-lowercase-fieldtype-without-tokenizer-tp4040500p4040507.html
Sent from the Solr - User mailing list archive at Nabble.com.
You can use a KeywordTokenizerFactory, which will tokenise into a single
term, and then do your lowercasing. Does that get you what you want?
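A minimal sketch of that field type and the sort-field wiring (names illustrative, not from your schema):
<fieldType name="lowercase" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="name_sort" type="lowercase" indexed="true" stored="false"/>
<copyField source="name" dest="name_sort"/>
Then sort case-insensitively with sort=name_sort asc.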
Upayavira
On Thu, Feb 14, 2013, at 05:11 PM, Bing Hua wrote:
> Hi,
>
> I don't want the field to be tokenized because Solr doesn't support
> sorting
> on
Hi Peter,
Your "original query" didn't make it to the mailing list. You're experiencing
a long-standing nabble bug: nabble eats code. (I've told them about it a
couple of times, but the problem persists, so I guess they're not interested in
fixing it.)
My suggestion: don't use nabble for pos
Hi,
I don't want the field to be tokenized because Solr doesn't support sorting
on a tokenized field. In order to do case insensitive sorting I need to copy
a field to a lowercase but not tokenized field. How to define this?
I did below but it says I need to specify a tokenizer or a class for
ana
Ok, something went wrong with posting the code, since I did not escape the
quotes and ampersands.
I tried your code, but no luck.
Here's the original query I'm trying to execute. What characters do I need
to escape? I thought only the < and > characters?
Thanks!
--
View this message in contex
Thanks,
We run SOLR 4.0 in production. Yesterday, I ported our configuration to 4.1
on my local workstation. I just looked at the SOLR-4400 fix versions and as
per the info, I might wait till 4.2 before porting.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Multi-Core-On-
If you can spare the load of a long request, I'd do an unsorted query
for everything, non-paged. I'd dump that into a line-per-row format
and use something like Apache Hive to do the analysis.
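A sketch of such a dump using the CSV response writer, with hypothetical core and field names (set rows above your document count):
http://localhost:8983/solr/collection1/select?q=*:*&rows=10000000&fl=id,name,price&wt=csv
wt=csv gives you the line-per-row format Hive can consume directly.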
Michael Della Bitta
Appinions
18 East 41st Street, 2nd
On Feb 14, 2013, at 11:24 AM, Walter Underwood wrote:
> Laptop disks are slower than the EC2 disks.
My laptop disk is an SSD.
Just for sake of comparison, http://www.ec2instances.info/
At the low end, EC2 CPUs come in 1, 2, 2.5, and 3.25 unit sizes. A
m2.xlarge uses 3.25 unit CPUs, so one would have to step up to the
high storage, high IO, or cluster compute nodes to do better than that
at single threaded tasks.
Good th
No, you still have to fix problems with data-config.xml. It's just that prior to
4.0-alpha, if you started Solr with a problem in the config, you had no way to
fix it and refresh without restarting Solr (or at least doing a core
reload). With 4.0, you can fix your config file and just retry.
I t
Just using a single CPU (log processing with Python), my MacBook Pro (2GHz
Intel Core i7) is twice as fast as an m2.xlarge EC2 instance.
Laptop disks are slower than the EC2 disks.
EC2 is for quantity, not quality.
wunder
On Feb 14, 2013, at 5:10 AM, Jack Krupansky wrote:
> That raises the qu
I do a brute-force regression test where I read all the documents from
shard 1 and compare them to documents in shard 2. I had to have all the
fields stored to do that, but in my case that doesn't change the size of
the index much.
So, in other words, I do a search for a page's worth of documents
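A sketch of the per-shard queries involved, with hypothetical hosts and cores:
http://host1:8983/solr/shard1/select?q=*:*&sort=id asc&start=0&rows=100&fl=*&distrib=false
http://host2:8983/solr/shard2/select?q=*:*&sort=id asc&start=0&rows=100&fl=*&distrib=false
distrib=false keeps each query on the core it was sent to, so the two pages can be compared field by field.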
Ok, but I restarted solr several times and the issue still occurs. So my
guess is that the entity I added contains errors:
50'
END as PriceCategory From products)
Select PriceCategory, Count(*) as Cnt From Categorized Group By
PriceCategory ">
Or are you saying tha
This looks like https://issues.apache.org/jira/browse/SOLR-2115 , which was
fixed for 4.0-Alpha .
Basically, if you do not put a data-config.xml file in the "defaults" section
in solrconfig.xml, or if your config file has any errors, you won't be able to
use DIH unless you fix the problem and r
Or perhaps we should develop our own, Solr-based benchmark...
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where Influence Isn’t a Game
On Thu, Feb 14, 2013 at 10:54 AM, Michael Della Bi
My dual-core, HT-enabled Dell Latitude from last year has this CPU:
model name : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
bogomips: 4988.65
An m3.xlarge reports:
model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
bogomips : 4000.14
I tried running geekbench and phoronix-test-suite and fa
Daniel,
This bug has already been recorded and hopefully will be fixed in time for 4.2.
See https://issues.apache.org/jira/browse/SOLR-4361 .
James Dyer
Ingram Content Group
(615) 213-4311
-Original Message-
From: Daniel Rijkhof [mailto:daniel.rijk...@gmail.com]
Sent: Wednesday, Febr
MockAnalyzer is really just MockTokenizer+MockTokenFilter+
Instead you just define your own analyzer chain using MockTokenizer.
This is the way all lucene's own analysis tests work: e.g.
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/analysis/common/src/test/org/apache/lucene/analysis
Partial updates are nothing as clever as I may have made them sound; it is just
changing a record value, for example a last name from Smith to Jones. That's
my partial update.
No errors at all in indexing. I have not yet checked the logs, but the DIH
output counts show no errors. Here is an example
Well, thinking a bit more, the second solution is not practical.
If Solr retrieves, say, 1,000 documents, I would have to navigate through
ALL (maybe less with some reasonable upper limit) of them to recalculate the
scores and reorder them according to the new score although the Web App is
going t
Almost forgot. Do be aware of
https://issues.apache.org/jira/browse/SOLR-4400. This came to light under
an absurd load of opening/closing transient cores, which only means it
won't show up until you go into production. The fix is on both trunk and 4x.
On Thu, Feb 14, 2013 at 7:46 AM, Erick Eric
Daniel:
It would be great if you would go ahead and edit the Wiki, all you have to
do is create a signon. Having just gone through the pain of figuring this
out, you're best positioned to know how to warn others!
Best
Erick
On Thu, Feb 14, 2013 at 4:56 AM, Daniel Rijkhof wrote:
> James,
>
> I'
That raises the question of how your average professional notebook computer
(PC or Mac or Linux) compares to a garden-variety cloud server such as an
Amazon EC2 m1.large (or m3.xlarge) in terms of performance such as document
ingestion rate or how many documents you can load before load and/or q
Hi,
If I am not mistaken, I saw some open JIRA issues about collecting queries and
calculating popular searches, etc.
Some commercial solutions exist:
http://sematext.com/search-analytics/index.html
http://soleami.com/blog/soleami-start_en.html
--- On Wed, 2/13/13, ROSENBERG, YOEL (YOEL)** CTR **
wrote:
Fr
I updated this page: http://wiki.apache.org/solr/CoreAdmin, look for
"transientCacheSize" and "loadOnStartup". Be aware that this is somewhat in
flux, but anything you find please report!
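For the curious, a sketch of what those attributes look like in solr.xml (values illustrative):
<cores adminPath="/admin/cores" transientCacheSize="20">
  <core name="core1" instanceDir="core1" transient="true" loadOnStartup="false"/>
</cores>
transientCacheSize caps how many transient cores stay loaded at once; loadOnStartup="false" defers loading a core until its first use.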
Man, oh man, do I have a lot of documentation to do on all this once the
dust settles
Erick
On Wed, Feb
One data point: I can comfortably index and search the Wikipedia dump (11M
articles, 5M with text) on my Macbook Pro. Admittedly not heavy-duty
queries, but
Erick
On Wed, Feb 13, 2013 at 4:01 PM, Matthew Shapiro wrote:
> Excellent, thank you very much for the reply!
>
> On Wed, Feb 13, 201
If I'm understanding your question correctly, you have to build that out
yourself. Solr doesn't store the searches, nor the results.
Hmm, though if you keep the Solr logs around you can reconstruct the
queries from them, although it takes a bit of work. The other place would be
your servlet contain
Hi,
We need to get the filterCache in a Component but
SolrIndexSearcher.getCache(String name) does not return it. It seems the
filterCache is not added to cacheMap and can therefore not be returned.
SolrCache filterCache = rb.req.getSearcher().getCache("filterCache");
will always return null.
I'm trying to monitor the state of a master-slave Solr4.1 cluster. I can easily
get the generation number of the slaves using JMX like this:
solr/{corename}/org.apache.solr.handler.ReplicationHandler/generation
That works fine. However on the master, this number is always 1. Which makes it
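As a cross-check, the replication handler reports this over HTTP as well; a sketch with hypothetical host and core names:
http://master:8983/solr/corename/replication?command=details&wt=json
http://master:8983/solr/corename/replication?command=indexversion&wt=json
command=indexversion returns the version and generation of the latest replicable commit on the master.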
Hello Jack,
Thanks for your answer; it helped me gain a deeper understanding of what happens
at index time and find a solution myself:
It seems that putting the synonym filter in both filter chains (index and
query), setting expand="false", and putting the desired synonym first in the
row,
Hello Everyone,
I've been integrating Solr 4.1 into a Web GIS solution and it's working great.
I have implemented JTS within Solr 4.1 and indexed thousands of WKT polygons
provided by XML documents generated by GE's GIS Core system. Everything seems
to be working out great.
Now I have a feature
On all the products I have, I want to implement a price range filter.
Since this price range is applied to the entire population and not to a
single product, my assumption was that it would not make sense to define
this within the "shopitem" entity, but rather under the document
"shopitems". So that's wh
Yes, I made some changes to the request handler. I added edismax and removed the df field specified there, and now
it's working as I expected.
Thanks for the help, Ahmet.
> Date: Thu, 14 Feb 2013 01:31:14 -0800
> From: iori...@yahoo.com
> Subject: RE: Why a phrase is getting searched against defa
Almost. I did not benchmark it but tend to believe this
http://docs.oracle.com/javase/6/docs/api/java/util/LinkedHashMap.html :
"iteration over the collection-views of a LinkedHashMap requires time
proportional to the /size/ of the map, regardless of its capacity.
Iteration over a HashMap is like
Thanks Markus
I didn't know that page. It's all I needed.
Thanks again
On 14/02/2013 10:47, Markus Jelsma wrote:
See: admin/luke?show=index or the admin UI.
-Original message-
From:Miguel
Sent: Thu 14-Feb-2013 10:45
To: solr-user@lucene.apache.org
Subject: How-to get d
James,
I'm not completely sure, and I have not tested the following:
.last_index_time might also not be accessible...
Daniel
On Thu, Feb 14, 2013 at 12:47 AM, Daniel Rijkhof
wrote:
> James,
>
> I debugged it until I found where things go 'wrong'.
>
> Apparently the current implementation Varia
See: admin/luke?show=index or the admin UI.
-Original message-
> From:Miguel
> Sent: Thu 14-Feb-2013 10:45
> To: solr-user@lucene.apache.org
> Subject: How-to get date of indexing process
>
> Hi everybody
>
> I am looking for the way to get date of last indexing process or comm
Hi everybody
I am looking for a way to get the date of the last indexing process or
commit event that happened on my Solr server.
I found a possible solution: add a timestamp field, for example:
||
But I would like a solution that does not modify the schema of the Solr server.
I checked statistics p
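For reference, the field definition eaten above is presumably the example-schema timestamp field:
<field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>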
On 14 February 2013 14:42, Bayu Widyasanyata wrote:
> On Thu, Feb 14, 2013 at 3:53 PM, Gora Mohanty wrote:
>
>> 3. Depending on how you installed Solr, there should be a folder
>> like webapps/solr/WEB-INF/ . In that folder, edit web.xml, and
>> add <security-constraint> and <login-config> tags. The entries
>> for the
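For reference, a minimal sketch of those web.xml additions (role and realm names are illustrative and must match what you configure in Tomcat):
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr admin</web-resource-name>
    <url-pattern>/admin/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>solr-admin</role-name>
  </auth-constraint>
</security-constraint>
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Solr admin</realm-name>
</login-config>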
Hi,
instead of &edismax=true, can you try &defType=edismax?
ahmet
--- On Thu, 2/14/13, Pragyanshis Pattanaik wrote:
> From: Pragyanshis Pattanaik
> Subject: RE: Why a phrase is getting searched against default fields in solr
> To: "solr Forum"
> Date: Thursday, February 14, 2013, 10:21 AM
> It
On Thu, Feb 14, 2013 at 3:53 PM, Gora Mohanty wrote:
> 3. Depending on how you installed Solr, there should be a folder
> like webapps/solr/WEB-INF/ . In that folder, edit web.xml, and
> add <security-constraint> and <login-config> tags. The entries
> for the latter should match the entries in step 1.
>
One thing that
On 14 February 2013 14:05, Bayu Widyasanyata wrote:
> Hi,
>
> I'm sure it's an "old" question..
> I just want to protect the Admin page (/solr) with Basic Authentication.
> But I can't find a good answer out there yet.
>
> I use Solr 4.1 with Apache Tomcat/7.0.35.
[...]
The easiest way to do this with
Hi,
I'm sure it's an "old" question..
I just want to protect the Admin page (/solr) with Basic Authentication.
But I can't find a good answer out there yet.
I use Solr 4.1 with Apache Tomcat/7.0.35.
Could anyone give me a quick hints or links?
Thanks in advance!
--
wassalam,
[bayu]
Hi Hemant,
I think your use case would be useful for relevancy tuning. It could be
implemented as either SearchComponent or QParserPlugin.
The edismax query parser has pf2 and pf3 parameters that can remedy this to some degree.
Probably an edismax extension will be the best place to put it. Similar to
https://issues.
It is returning all the documents which contain the phrase, as it is
searching against the default field. My default field is like below.
I have defined SearchableField as the default field.
Thanks, Pragyanshis
> Date: Wed, 13 Feb 2013 23:18:06 -0800
> From: iori...@yahoo.com
> Subject: Re: Why a
I may be missing something but let me go back to your original statements:
1) You build the index once per week from scratch
2) You replicate this from master to slave.
My understanding of the way replication works is that it's meant to only
send along files that are new and if any files named the
Okay so I think I found a solution if you are a maven user and don't
mind forcing the test codec to Lucene40 then do the following:
Add this to your pom.xml under the <build><plugins> section:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>2.13</version>
  <configuration>
    <argLine>-Dtests.codec=Lucene40</argLine>
  </configuration>
</plugin>
I