Aggregate functions on faceted result

2010-03-11 Thread Marcus Herou
FacetFunctionComponent but felt that I should drop a mail here asking if it is already possible. Is it even remotely possible to create this function in SOLR ? Cheers //Marcus Herou -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/

Re: Is optimizing always necessary?

2010-01-30 Thread Marcus Herou
indexes > again and again should be time costing. > > 2010/1/30 Marcus Herou > > > If one only have additions do I then need to optimize the index at all ? > > > > I thought that only update/deletes created "holes" in the index. Or > should > > the

Is optimizing always necessary?

2010-01-29 Thread Marcus Herou
If one only has additions, do I then need to optimize the index at all? I thought that only updates/deletes created "holes" in the index. Or should the index be sorted on disk at all times, is that the reason? Cheers //Marcus -- Marcus Herou CTO and co-founder Tailsweep AB +4

Re: SOLR dynamic core creation

2009-08-28 Thread Marcus Herou
Thanks. Yup, I have reloaded the wiki all morning (GMT+1 time) Cheers //Marcus On Fri, Aug 28, 2009 at 2:06 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Fri, Aug 28, 2009 at 5:25 PM, Marcus Herou >wrote: > > > That seems sweet! One conf dir, man

Re: SOLR dynamic core creation

2009-08-28 Thread Marcus Herou
> after say August, you create an empty core "core-aug" and swap it > out with "newcore". > > Make every core use the same instancedir with different dataDir > > On Fri, Aug 28, 2009 at 3:08 PM, Marcus Herou > wrote: > > Hi. > > > > We ar

SOLR dynamic core creation

2009-08-28 Thread Marcus Herou
o choose which indices to search in to utilize the memory better. We probably need to distribute a file to the clients so they know where each index resides (or have the conf in a DB). Any thoughts? Cheers //Marcus -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tails

Re: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Marcus Herou
> > _ > > {Beto|Norberto|Numard} Meijome > > > > "Gravity cannot be blamed for people falling in love." > > Albert Einstein > > > > I speak for myself, not my employer. Contents may be hot. Slippery when > > wet. Reading disclaimers makes you go blind. Writing them is worse. You > > have been Warned. > > > > > > -- > View this message in context: > http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/

Re: Is there any other way to load the index beside using "http" connection?

2009-07-06 Thread Marcus Herou
Yes exactly just being friendly sharing a working routine. Took me some hours to figure out DIH myself at the time. //Marcus On Mon, Jul 6, 2009 at 1:32 PM, Norberto Meijome wrote: > On Sun, 5 Jul 2009 21:36:35 +0200 > Marcus Herou wrote: > > > Sharing some of our exports from D

Re: Is there any other way to load the index beside using "http" connection?

2009-07-05 Thread Marcus Herou
xport it. > > > > How about if LUSQL, some mentioned about this? Is this apps free(open > source) > > application? Do you have any experience with this apps? > > Not i, sorry. > > Have you looked into DIH? It's designed for this kind of work. > > B > _ > {Beto|Norberto|Numard} Meijome > > "Great spirits have often encountered violent opposition from mediocre > minds." > Albert Einstein > > I speak for myself, not my employer. Contents may be hot. Slippery when > wet. > Reading disclaimers makes you go blind. Writing them is worse. You have > been > Warned. > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/

Re: Scaling out/up or a mix

2009-07-01 Thread Marcus Herou
1, 2009 at 1:31 PM, Toke Eskildsen wrote: > On Tue, 2009-06-30 at 22:59 +0200, Marcus Herou wrote: > > The number of concurrent users today is insignificant but once we push > > for the service we will get into trouble... I know that since even one > > simple faceting query (whi

Re: Scaling out/up or a mix

2009-06-28 Thread Marcus Herou
soon. //Marcus On Sat, Jun 27, 2009 at 12:02 AM, Marcus Herou wrote: > Hi. > > I currently have an index which is 16GB per machine (8 machines = 128GB) > (data is stored externally, not in index) and is growing like crazy (we are > indexing blogs which is crazy by nature) and have

Scaling out/up or a mix

2009-06-26 Thread Marcus Herou
fields and collecting results ? What I'm trying to find out is what I can do to get most bang for the buck with a limited (aren't we all limited?) budget. Kindly //Marcus -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tai

Re: Date faceting - howto improve performance

2009-04-30 Thread Marcus Herou
questions. > > On Thu, Apr 30, 2009 at 1:00 AM, Marcus Herou >wrote: > > > Aha! > > > > Hmm , googling wont help me I see. any hints of usages ? > > > > /M > > > > > > On Tue, Apr 28, 2009 at 12:29 AM, Shalin Shekhar Mangar

Re: how to reset the index in solr

2009-04-30 Thread Marcus Herou
not an intended recipient; any form of > disclosure, copyright, distribution and any other means of use of > information is unauthorised and subject to legal implications. We do not > accept any liability for the transmission of incomplete, delayed > communication and recipients must check this email and any attachments for > the presence of viruses before downloading them. > > > > > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: Date faceting - howto improve performance

2009-04-29 Thread Marcus Herou
> work out-of-the-box for trie fields I think. But you could use facet.query > to achieve the same effect. On my simple benchmarks I've found trie fields > to give a huge improvement in range searches. > > On Sat, Apr 25, 2009 at 4:24 PM, Marcus Herou >wrote: > >

Re: Date faceting - howto improve performance

2009-04-27 Thread Marcus Herou
supported either at Lucene > level or at Solr level. If index 1 has m docs and index 2 has n docs, > index 1 will have m+n docs after adding index 2 to index 1. Documents > themselves are not modified by index merge. > > Cheers, > Ning > > > On Sat, Apr 25, 2009 at 4:03 PM, M

Re: Solr-1.4 indexing slower ?

2009-04-25 Thread Marcus Herou
s -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original Message > > From: Marcus Herou > > To: solr-user@lucene.apache.org > > Sent: Saturday, April 25, 2009 7:00:57 AM > > Subject: Solr-1.4 indexing slower ? > > > >

Re: Date faceting - howto improve performance

2009-04-25 Thread Marcus Herou
, Marcus Herou wrote: > Guys! > > Thanks for these insights, I think we will head for Lucene level merging > strategy (two or more indexes). > When merging I guess the second index need to have the same doc ids > somehow. This is an internal id in Lucene, not that easy to get hold o

Re: Date faceting - howto improve performance

2009-04-25 Thread Marcus Herou
2009 at 5:10 PM, Marcus Herou wrote: > Guys! > > Thanks for these insights, I think we will head for Lucene level merging > strategy (two or more indexes). > When merging I guess the second index need to have the same doc ids > somehow. This is an internal id in Lucene, not that

Re: Date faceting - howto improve performance

2009-04-25 Thread Marcus Herou
> > Andrzej), I remember he once provided the working recipe. > > > > > > Otis -- > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > > > - Original Message > > > From: Marcus Herou > > > To: solr-us

Solr-1.4 indexing slower ?

2009-04-25 Thread Marcus Herou
... I launched 1.4 because rumour had it that date faceting was faster in solr-1.4, which I believe it is. That's why I neglected to profile indexing speed. Didn't Lucene also change version between the two? Wondering if anyone else is experiencing the same issues. //Marcus -- Ma

Date faceting - howto improve performance

2009-04-25 Thread Marcus Herou
ng to get this to work (negates Q3). We currently have 8 shards; what would be the most efficient way to reindex the whole shebang? Dump the entire database to disk (sigh), create many xml file splits and use curl in a random/hash(numServers) manner on them? Kindly //Marcus -- M

Re: PageRank sort

2009-04-24 Thread Marcus Herou
0.0 0.0 0.0 0.0 7.0 On Sat, Apr 25, 2009 at 12:49 AM, Marcus Herou wrote: > Cool! > > GET ' > http://127.0.0.1:8110/solr/test/s

Re: PageRank sort

2009-04-24 Thread Marcus Herou
Cool! GET ' http://127.0.0.1:8110/solr/test/select?indent=on&start=0&rows=100&q={!boost b=blogRank v=$qq}&qq=title:solr&debugQuery=on' On Sat, Apr 25, 2009 at 12:43 AM, Marcus Herou wrote: > That seems wise... PageRank * Text-based Scoring. > > So you mean

Re: PageRank sort

2009-04-24 Thread Marcus Herou
> OR for a dismax relevancy query, > &q={!boost b=myScore v=$qq}&qq={!dismax qf=text_all pf=text_all}solr rocks > > If the {! type of syntax looks new, check out > http://wiki.apache.org/solr/LocalParams > powerful stuff! > > -Yonik > http://www.lucidimagination.com > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/ http://blogg.tailsweep.com/
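The `{!boost b=... v=$qq}` syntax quoted above contains braces, spaces and a `$`, so it needs URL encoding when sent over HTTP. What follows is a minimal sketch, not the thread's own code: it assumes the host, port and core path from the GET example in this thread (`http://127.0.0.1:8110/solr/test/select`) and a numeric field called `blogRank`. The helper `build_boost_url` is hypothetical and not part of Solr.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_boost_url(base_url, boost_field, user_query, rows=100):
    """Build a select URL whose text score is multiplied by a numeric field.

    base_url and boost_field are assumptions taken from the thread's example;
    adjust them to the actual core and schema.
    """
    params = {
        # Local-params syntax: wrap the real query ($qq) in a boost query.
        "q": "{!boost b=%s v=$qq}" % boost_field,
        "qq": user_query,
        "start": 0,
        "rows": rows,
        "indent": "on",
        "debugQuery": "on",
    }
    # urlencode escapes braces, "$", ":" and spaces for us.
    return base_url + "?" + urlencode(params)


url = build_boost_url(
    "http://127.0.0.1:8110/solr/test/select", "blogRank", "title:solr"
)
print(url)

# Round-trip check: decoding the query string gives back the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```

Keeping the user's query in a separate `qq` parameter means only that one value changes per request, while the boost wrapper stays fixed.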

Re: PageRank sort

2009-04-24 Thread Marcus Herou
That is fantastic, I am creating a really small index right now trying to figure out how to implement the FunctionQuery for this. //Marcus On Fri, Apr 24, 2009 at 10:55 PM, Yonik Seeley wrote: > On Fri, Apr 24, 2009 at 1:39 PM, Marcus Herou > wrote: > > Great! That seems like so

Re: PageRank sort

2009-04-24 Thread Marcus Herou
Works like a charm! Thank you sir. //Marcus On Fri, Apr 24, 2009 at 11:01 PM, Marcus Herou wrote: > That is fantastic, I am creating a really small index right now trying to > figure out howto implement the FunctionQuery for this. > > //Marcus > > > On Fri, Apr 24, 20

Re: Change boost of documents / single fields / external scoring ?

2009-04-24 Thread Marcus Herou
CustomScoreQuery(realQuery, prQuery); >> TopDocs hits = searcher.search(q, 10); >> >> MyPageRankScores is your class, subclassing ValueSource and implementing >> the >> getValues method. >> >> You could subclass CustomScoreQuery if you want to tweak just how the >> "

Re: PageRank sort

2009-04-24 Thread Marcus Herou
And I published the setup here: http://dev.tailsweep.com/solr-external-scoring/en/ /M On Sat, Apr 25, 2009 at 12:01 AM, Marcus Herou wrote: > Works like a charm! > > Thank you sir. > > //Marcus > > > On Fri, Apr 24, 2009 at 11:01 PM, Marcus Herou > wrote: > >

Re: PageRank sort

2009-04-24 Thread Marcus Herou
nd/or pseudo code ? > > > On Apr 24, 2009, at 1:52 AM, Marcus Herou wrote: > > Hi. >> >> I've posted before but here it goes again: >> >> I have BlogData data which is more or less 100% static but one field is >> not >> - the PageRank.

Re: Change boost of documents / single fields / external scoring ?

2009-04-24 Thread Marcus Herou
eSource that pulls in the external scores? > > Mike > > On Thu, Apr 23, 2009 at 4:01 PM, Marcus Herou > wrote: > > Hi. > > > > Confusing subject eh ? Trying to become a little clearer in a few > sentences. > > > > We have a Solr/Lucene index where eac

PageRank sort

2009-04-23 Thread Marcus Herou
nyone have an idea of howto implement these patterns in SOLR ? I have never extended SOLR but am not afraid of doing so if someone pushes me in the right direction. Kindly //Marcus -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/

Re: Change boost of documents / single fields / external scoring ?

2009-04-23 Thread Marcus Herou
Could an ExternalFileField help me ? http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html On Thu, Apr 23, 2009 at 10:01 PM, Marcus Herou wrote: > Hi. > > Confusing subject eh ? Trying to become a little clearer in a few > sentences. > > We have a

Re: Custom score for a id field

2009-04-23 Thread Marcus Herou
> Please let me know the solution to do this. > > Thanks, > Raju > > > -- > View this message in context: > http://www.nabble.com/Custom-score-for-a-id-field-tp23197465p23197465.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- Marcus Herou CTO

Re: modify SOLR scoring

2009-04-23 Thread Marcus Herou
? > > > > Thanks. > > > > Excuse for my english. > > > > > > -- > > View this message in context: http://www.nabble.com/modify-SOLR- > > scoring-tp23198326p23198326.html > > Sent from the Solr - User mailing list archive at Nabble.com. > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 marcus.he...@tailsweep.com http://www.tailsweep.com/ http://blogg.tailsweep.com/

Change boost of documents / single fields / external scoring ?

2009-04-23 Thread Marcus Herou
s ? Would there be a possibility to do some kind of join (parallel searches across separate index types)? Or send the result to a separate sorting algorithm? Hmmm, perhaps a subclass of Sort? Grasping at straws here folks... Hope one of the core experts can help us. Cheers //Marcus Herou -- Mar

Re: Buzz measurement - Aggregate functions

2008-10-10 Thread Marcus Herou
OTECTED]> wrote: > you can try using the field collapse patch (currently in JIRA). You'll > probably need to manually extract the patch code and apply it yourself as > its latest update only applies to an earlier version of solr (1.3-dev). > > http://issues.apache.org/jira/bro

Buzz measurement - Aggregate functions

2008-10-10 Thread Marcus Herou
--+-----+ -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: Ampersand issue making me go nuts!

2008-09-07 Thread Marcus Herou
But good! You can't imagine how many hours I've put into this so thanks again! A 1 minute solution :) Kindly //Marcus On Sun, Sep 7, 2008 at 3:10 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On Sun, Sep 7, 2008 at 7:50 AM, Marcus Herou <[EMAIL PROTECTED]> > wrote

Ampersand issue making me go nuts!

2008-09-07 Thread Marcus Herou
Hi, turning to the mailing list since I cannot find any similar case by googling. We have trouble searching when the "&" sign is included in the search query, for example: description:"h&m", "m&m" etc. The server setup looks like this. We run apache-solr-1.3.0-RC2.war on all machines (same issues w
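The fix that resolved this thread is cut off in this listing, so the following only illustrates one common cause of the symptom and is not confirmed by the source. An unescaped `&` inside a query value is read as a parameter separator in the HTTP query string, so the `q` parameter is cut short before the query parser ever sees it. The sketch uses Python's standard library purely to show the effect. It is not Solr client code.

```python
from urllib.parse import parse_qs, quote

raw_query = 'description:"h&m"'

# Unescaped: the "&" terminates the q parameter, so only part of the
# phrase is received as the value of q.
broken = parse_qs("q=" + raw_query)

# Escaped: "&" is sent as %26 and stays inside the value of q.
escaped = "q=" + quote(raw_query, safe="")
fixed = parse_qs(escaped)

print(broken)
print(escaped)
print(fixed)
```

If the client library already encodes parameters, encoding them again by hand would double-escape the value, so it is worth checking what is actually sent on the wire.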

Re: scaling / sharding questions

2008-06-15 Thread Marcus Herou
> 16 shards: 0*, 1*, 2*... d*, e*, f* > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message > > From: Marcus Herou <[EMAIL PROTECTED]> > > To: solr-user@lucene.apache.org > > Cc: [EMAIL PROTEC
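The "16 shards: 0*, 1*, 2*... d*, e*, f*" layout quoted above reads like routing on the first hexadecimal character of a key. The truncated reply does not say which key or hash is meant, so this sketch assumes an MD5 hex digest of the document id. `shard_for` is a hypothetical helper name.

```python
import hashlib

HEX_DIGITS = "0123456789abcdef"  # one shard per leading hex character


def shard_for(doc_id: str) -> str:
    """Return the shard label (first hex char of the MD5 digest) for doc_id.

    MD5 is an assumption for illustration; any stable hash with a roughly
    uniform hex digest would route documents the same way.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return digest[0]


# Count how many synthetic ids land in each of the 16 shards.
counts = {h: 0 for h in HEX_DIGITS}
for i in range(16000):
    counts[shard_for("doc-%d" % i)] += 1

print(counts)
```

Routing on a longer prefix (two hex characters for 256 buckets) would follow the same pattern if more shards were needed later.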

Re: Num docs

2008-06-14 Thread Marcus Herou
> > - Original Message > > From: Marcus Herou <[EMAIL PROTECTED]> > > To: solr-user@lucene.apache.org > > Sent: Thursday, June 12, 2008 3:17:52 PM > > Subject: Re: Num docs > > > > Cacti, Nagios you name it already in use :) > > > > Well I

Re: scaling / sharding questions

2008-06-14 Thread Marcus Herou
> shards, on the indexing writer side, but then merge groups of them into > logical shards which are snapshotted to reader solrs' on a frequent basis. > I haven't done any testing along these lines, but logically it seems like > an > idea worth pursuing. > > enjoy,

Re: Num docs

2008-06-12 Thread Marcus Herou
o have right in > Solr, > > > primarily for use when allocating and sizing shards for Distributed > Search. > > > JIRA enhancement/feature issue? > > > Otis > > > -- > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > > >

Re: Num docs

2008-06-10 Thread Marcus Herou
rcus, > > > > > > > > > For that you can rely on du, vmstat, iostat, top and such, too. :) > > > > > > Otis > > > -- > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > > > > - Original Message

Re: Num docs

2008-06-07 Thread Marcus Herou
er. You can get it from its > output. It may also be possible to get *just* that number, but I'm not > looking at docs/code right now to know for sure. > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message >

Num docs

2008-06-07 Thread Marcus Herou
Hi. Is there a way to retrieve IndexWriter.numDocs() in SOLR? Kindly //Marcus -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: scaling / sharding questions

2008-06-06 Thread Marcus Herou
> to correct me, or ask questions as you see fit. > > And yes, I will report how we are doing things when we get this all figured > out, > and if there are items that we can contribute back to Solr we will. If > nothing > else there will be a nice article of how we manage TB of data with Solr. > > enjoy, > > -jeremy > > -- > > Jeremy Hinegardner [EMAIL PROTECTED] > > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: Solr feasibility with terabyte-scale data

2008-05-11 Thread Marcus Herou
Sat, May 10, 2008 at 6:03 PM, Marcus Herou <[EMAIL PROTECTED]> wrote: > Hi Otis. > > Thanks for the insights. Nice to get feedback from a technorati guy. Nice > to see that the snippet of yours is almost a copy of mine, gives me the > right stomach feeling about this :)

Re: Solr feasibility with terabyte-scale data

2008-05-10 Thread Marcus Herou
bly will not need > > > millisecond query response. > > > > > > Our environment makes available Apache on blade servers (Dell 1955 dual > > > dual-core 3.x GHz Xeons w/ 8GB RAM) connected to a *large*, > > > high-performance NAS system over a de

Re: Solr feasibility with terabyte-scale data

2008-05-10 Thread Marcus Herou
> You should also look at a new project called Katta: > > http://katta.wiki.sourceforge.net/ > > First code check-in should be happening this weekend, so I'd wait until > Monday to take a look :) > > -- Ken > > > On 9 May 2008, at 01:17, Marcus Herou wrote: >> >> Cool.

Re: Solr feasibility with terabyte-scale data

2008-05-09 Thread Marcus Herou
>> millisecond query response. >>> >>> Our environment makes available Apache on blade servers (Dell 1955 dual >>> dual-core 3.x GHz Xeons w/ 8GB RAM) connected to a *large*, >>> high-performance NAS system over a dedicated (out-of-band) GbE switch >>> (Dell PowerConnect 5324) using a 9K MTU (jumbo packets). We are starting >>> with 2 blades and will add as demands require. >>> >>> While we have a lot of storage, the idea of master/slave Solr Collection >>> Distribution to add more Solr instances clearly means duplicating an >>> immense index. Is it possible to use one instance to update the index >>> on NAS while other instances only read the index and commit to keep >>> their caches warm instead? >>> >>> Should we expect Solr indexing time to slow significantly as we scale >>> up? What kind of query performance could we expect? Is it totally >>> naive even to consider Solr at this kind of scale? >>> >>> Given these parameters is it realistic to think that Solr could handle >>> the task? >>> >>> Any advice/wisdom greatly appreciated, >>> >>> Phil >>> >>> >>> >>> >> -- >> View this message in context: >> http://www.nabble.com/Solr-feasibility-with-terabyte-scale-data-tp14963703p17142176.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> > -- Marcus Herou CTO and co-founder Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: java.lang.IncompatibleClassChangeError

2008-01-24 Thread Marcus Herou
ou get funny compilation errors after > an update. > > > Marcus Herou wrote: > > Hi. > > > > I did a svn update in trunk and deployed new war on server and jars on > > client (after recompile) and got this. > > I read that the SolrServer changed from Ab

java.lang.IncompatibleClassChangeError

2008-01-24 Thread Marcus Herou
) at com.caucho.server.port.TcpConnection.run(TcpConnection.java:514) at com.caucho.util.ThreadPool.runTasks(ThreadPool.java:520) at com.caucho.util.ThreadPool.run(ThreadPool.java:442) at java.lang.Thread.run(Thread.java:619) -- Marcus Herou Solution Architect and Java

Re: OOE during indexing

2008-01-22 Thread Marcus Herou
as <[EMAIL PROTECTED]> wrote: > > On 22-Jan-08, at 9:46 PM, Marcus Herou wrote: > > > > OK I got the conclusion myself. add memory to the box and get some > > more > > boxes :) > > I'm glad you've come to that conclusion, but to reinforce it: Solr/ >

Re: OOE during indexing

2008-01-22 Thread Marcus Herou
62) > > : [06:25:53.877] at org.apache.solr.search.LRUCache.warm(LRUCache.java > :192) > : [06:25:53.877] at org.apache.solr.search.SolrIndexSearcher.warm( > : SolrIndexSearcher.java:1393) > > > > -Hoss > > -- Marcus Herou Solution Architect and Java developer Tailsweep AB +46702561312 [EMAIL PROTECTED] http://www.tailsweep.com/ http://blogg.tailsweep.com/

Re: OOE during indexing

2008-01-22 Thread Marcus Herou
towarming you need 2x peak memory usage. The only thing you can do > is increase your max heap size or be careful about cache autowarming > (possibly turning it off). > > cheers, > -Mike > > On 21-Jan-08, at 9:44 PM, Marcus Herou wrote: > > > Hi. > > >

OOE during indexing

2008-01-21 Thread Marcus Herou
rent.ThreadPoolExecutor$Worker.run( ThreadPoolExecutor.java:675) [06:25:53.877] at java.lang.Thread.run(Thread.java:595) Help anyone? Attaching schema.xml and solr