Re: Solr Clustering

2012-09-17 Thread Denis Kuzmenok
u want to cluster a few thousand documents, for example when a user searches Solr, just cluster the search results. Mahout is much more scalable and you probably need Hadoop for that. Thanks, chandan. On Tue, Sep 4, 2012 at 2:10 PM, Denis Kuzmenok wrote: > > > ---- Original Message ---

Solr Clustering

2012-09-04 Thread Denis Kuzmenok
Hi, all. I know there are carrot2 and Mahout for clustering. I want to implement the following: I fetch documents and want to group them into clusters when they are added to the index (I want to filter "similar" documents, for example within 1 week). I need these documents quickly, so I can't rely on some po

Solr Clustering

2012-09-04 Thread Denis Kuzmenok
Original Message Subject: Solr Clustering From: Denis Kuzmenok To: solr-user@lucene.apache.org CC: Hi, all. I know there are carrot2 and Mahout for clustering. I want to implement the following: I fetch documents and want to group them into clusters when they are added to

Re: Field grouping?

2011-08-31 Thread Denis Kuzmenok
But I don't know what values the price field will have in that query. It can be 100-1000, or 10-100, and I want to get ranges in every query, just splitting the price field by number of docs. > Yes, Ranged Facets > http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range > 2011/8/31

Field grouping?

2011-08-31 Thread Denis Kuzmenok
Hi. Suppose I have a field "price" with different values, and I want to get ranges for this field depending on the doc count. For example, I want to get 5 ranges for 100 docs with 20 docs in each range, 6 ranges for 200 docs = 34 docs in each range, etc. Is it possible with Solr?
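
The equal-count split asked about above can be sketched outside Solr. This is only a client-side illustration in Python, assuming the prices of the matching docs have already been fetched; `equal_count_ranges` is a hypothetical helper, not a Solr feature.

```python
import math


def equal_count_ranges(prices, num_ranges):
    """Split prices into buckets that hold roughly the same number of docs.

    Returns a list of (lowest_price, highest_price, doc_count) tuples.
    """
    ordered = sorted(prices)
    if not ordered or num_ranges < 1:
        return []
    # Docs per bucket, rounded up so that 200 docs / 6 ranges gives 34.
    size = math.ceil(len(ordered) / num_ranges)
    ranges = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        ranges.append((chunk[0], chunk[-1], len(chunk)))
    return ranges


if __name__ == "__main__":
    for low, high, count in equal_count_ranges(range(1, 201), 6):
        print(f"{low}-{high}: {count} docs")
```

Rounding up means the last range can hold fewer docs than the others, which matches the "34 docs" figure in the question for 200 docs and 6 ranges.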

Re: indexing but not able to search

2011-07-06 Thread Denis Kuzmenok
> Hi All > I indexed a set of documents using Solr, which are shown in the stats page > on the admin panel. > However, the search interface always returns 0 documents to me. > When I give the query as *:*, it does return me all the 20K odd documents I > tried indexing just a few hours back. > Can

Re: Strange behavior

2011-06-16 Thread Denis Kuzmenok
Of course, I stopped Solr before copying the index. Deleting the index and reindexing on the production server solved the issue. Strange, but it works. > Have you stopped Solr before manually copying the data? This way you > can be sure that index is the same and you didn't have any new docs

Re: Strange behavior

2011-06-14 Thread Denis Kuzmenok
2011, at 10:44 AM, Denis Kuzmenok wrote: >> Hi. >> >> I've debugged search on test machine, after copying to production server >> the entire directory (entire solr directory), i've noticed that one >> query (SDR S70EE K) does match on test server, and does not on >> production. >> How can that be? >>

Strange behavior

2011-06-14 Thread Denis Kuzmenok
Hi. I debugged search on a test machine. After copying the entire Solr directory to the production server, I noticed that one query (SDR S70EE K) matches on the test server but does not on production. How can that be?

Re: Edismax sorting help

2011-06-09 Thread Denis Kuzmenok
Your solution seems to work fine, not perfect, but much better than mine :) Thanks! >> If i do query like "Samsung" i want to see prior most relevant results >> with isflag:true and bigger popularity, but if i do query like "Nokia >> 6500" and there is isflag:false, then it should be higher

Edismax sorting help

2011-06-09 Thread Denis Kuzmenok
Hi, everyone. I have fields: text fields: name, title, text boolean field: isflag (true / false) int field: popularity (0 to 9) Now i do query: defType=edismax start=0 rows=20 fl=id,name q=lg optimus fq= qf=name^3 title text^0.3 sort=score desc pf=name bf=isflag sqrt(popularity) mm=100% debug
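
For readers who want to reproduce the request above, here is a minimal Python sketch that assembles the listed parameters into a URL. The base URL is an assumption (the message does not give one), `build_edismax_url` is a hypothetical helper, the empty `fq` is left out, and the truncated `debug...` parameter is omitted rather than guessed.

```python
from urllib.parse import urlencode


def build_edismax_url(base_url, user_query):
    """Build a select URL carrying the edismax parameters from the message."""
    params = {
        "defType": "edismax",
        "start": 0,
        "rows": 20,
        "fl": "id,name",
        "q": user_query,
        "qf": "name^3 title text^0.3",
        "sort": "score desc",
        "pf": "name",
        "bf": "isflag sqrt(popularity)",  # copied as written in the message
        "mm": "100%",
    }
    # urlencode percent-encodes reserved characters such as ^ and %.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed host and path; adjust to the actual Solr instance.
    print(build_edismax_url("http://localhost:8983/solr/select", "lg optimus"))
```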

Re: Problem with boosting function

2011-06-08 Thread Denis Kuzmenok
try: q=title:Unicamp&defType=dismax&bf=question_count^5.0 "title:Unicamp" in any search handler will search only in requested field > The queries I am trying to do are > q=title:Unicamp > and > q=title:Unicamp&bf=question_count^5.0 > The boosting factor (5.0) is just to verify if it was really

Re: Problem with boosting function

2011-06-08 Thread Denis Kuzmenok
Show your full request to Solr (all params) > Hi, > I'm trying to use bf parameter in solr queries but I'm having some problems. > The context is: I have some topics and an integer weight of popularity > (number of users that follow the topic). I'd like to boost the documents > according to this w

Re: Boosting result on query.

2011-06-08 Thread Denis Kuzmenok
> If you could move to 3.x and your "linked item" boosts could be > calculated offline in batch periodically you could use an external > file field to store the doc boost. > a few If's though I have 3.2 and external file field doesn't work without solr restart (on multicore instance).

Re: Documents update

2011-06-07 Thread Denis Kuzmenok
olr without external files and then create them - they are not working.. What is wrong? PS: Solr 3.2 > http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html > On Tuesday 31 May 2011 15:41:32 Denis Kuzmenok wrote: >> Flags are stored to filter results a

Need query help

2011-06-06 Thread Denis Kuzmenok
om selected values 3) Other values for selected properties (if any chosen) 4) Other brand_id for selected brand_id 5) Other price for selected price Will appreciate any help or thoughts! Cheers, Denis Kuzmenok

Re: Need Schema help

2011-06-02 Thread Denis Kuzmenok
Thursday, June 2, 2011, 6:29:23 PM, you wrote: Wow. This sounds nice. Will try this way. Thanks! > Denis, > would dynamic fields help: > field defined as *_price in schema > at index time you index fields named like: > [1-9]_[0-99]_price > at query time you search the price field for a given co
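
The dynamic-field suggestion quoted above can be illustrated with a short Python sketch. It assumes the first number in `[1-9]_[0-99]_price` is a country code and the second a region code (the message does not say which is which); `price_field_name` and `build_product_doc` are hypothetical helpers.

```python
def price_field_name(country_code, region_code):
    """Name of the dynamic field matching the *_price pattern in the schema."""
    return f"{country_code}_{region_code}_price"


def build_product_doc(product_id, prices):
    """Build one document with a separate price field per (country, region).

    prices maps (country_code, region_code) tuples to a price value.
    """
    doc = {"id": product_id}
    for (country_code, region_code), price in prices.items():
        doc[price_field_name(country_code, region_code)] = price
    return doc


if __name__ == "__main__":
    doc = build_product_doc("p1", {(1, 4): 3496.01, (2, 17): 3600.0})
    print(doc)
    # At query time, filter or sort on the field for the requested pair:
    print(price_field_name(1, 4))
```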

Need Schema help

2011-06-02 Thread Denis Kuzmenok
Hi) What i need: Index prices to products, each product has multiple prices, to each region, country, and price itself. I tried to do with field type "long" multiple:true, and form value as "country code + region code + price" (1004000349601, for example), but it has strange beha

Re: Solr memory consumption

2011-06-02 Thread Denis Kuzmenok
> Hey Denis, > * How big is your index in terms of number of documents and index size? 5 cores, average 250.000 documents, one with about 1 million (but without text, just int/float fields), one with about 10 million id/name documents, but with n-gram. Size: 4 databases about 1G (sum),

Re: Solr memory consumption

2011-06-01 Thread Denis Kuzmenok
you don't understand? Just start with whatever > the Solr example jetty has, and only change things if you have a reason > to (that you understand). > On 6/1/2011 1:19 PM, Denis Kuzmenok wrote: >> Overall memory on server is 24G, and 24G of swap, mostly all the time >>

Re: Solr memory consumption

2011-06-01 Thread Denis Kuzmenok
he default parameters from the Solr example > jetty, and if you don't run into any problems, then great. Starting > with the example jetty shipped with Solr would be the easiest way to get > started for someone who doesn't know much about Java/JVM. > On 6/1/2011 12:37 PM, Den

Re: Solr memory consumption

2011-06-01 Thread Denis Kuzmenok
So what should I do to avoid that error? I can use 10G on the server; now I try to run with flags: java -Xms6G -Xmx6G -XX:MaxPermSize=1G -XX:PermSize=512M -D64 Or should I set Xmx to a lower number, and what about the other params? Sorry, I don't know much about Java/JVM =( Wednesday, June 1, 2011, 7:29:

Re: Solr memory consumption

2011-06-01 Thread Denis Kuzmenok
Here is output after about 24 hours running solr. Maybe there is some way to limit memory consumption? :( test@d6 ~/solr/example $ java -Xms3g -Xmx6g -D64 -Dsolr.solr.home=/home/test/solr/example/multicore/ -jar start.jar 2011-05-31 17:05:14.265:INFO::Logging to STDERR via

Re: Solr memory consumption

2011-06-01 Thread Denis Kuzmenok
. it counts all memory, not > sure... if you don't have big values for 99.9%wa (which means WAIT I/O - > disk swap usage) everything is fine... > -Original Message- > From: Denis Kuzmenok > Sent: May-31-11 4:18 PM > To: solr-user@lucene.apache.org > Subject: Solr

Solr memory consumption

2011-05-31 Thread Denis Kuzmenok
I run multi-core Solr with flags: -Xms3g -Xmx6g -D64, but I see this in top after 6-8 hours, and it is still rising: 17485 test214 10.0g 7.4g 9760 S 308.2 31.3 448:00.75 java -Xms3g -Xmx6g -D64 -Dsolr.solr.home=/home/test/solr/example/multicore/ -jar start.jar Are there any ways t

Re: Documents update

2011-05-31 Thread Denis Kuzmenok
Will it be slow if there are 3-5 million key/value rows? > http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html > On Tuesday 31 May 2011 15:41:32 Denis Kuzmenok wrote: >> Flags are stored to filter results and it's pretty highloaded, it's >

Re: Documents update

2011-05-31 Thread Denis Kuzmenok
Flags are stored to filter results and the index is pretty heavily loaded. It's working fine, but I can't update the index very often just to keep the flags up to date =\ Where can I read about using external fields / files? > And it wouldn't work unless all the data is stored anyway. Currently there's > no w

Solr 3.1 commit errors

2011-05-30 Thread Denis Kuzmenok
After restart i have these errors every time i do commit via post.jar. Config: multicore / 5 cores, Solr 3.1 Lock obtain timed out: SimpleFSLock@/home/ava/solr/example/multicore/context/data/index/write.lock org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLoc

n-gram speed

2011-05-30 Thread Denis Kuzmenok
I have a database with an n-gram field, about 5 million documents. QTime is about 200-1000 ms; the database is not optimized because it must reply to queries at all times and data is updated often. Is it normal? Solr: 3.1, java -Xms2048M -Xmx4096M Server: i7, 12Gb

Re: Documents update

2011-05-27 Thread Denis Kuzmenok
I'm using 3.1 now. Indexing lasts for a few hours, and have big plain size. Getting all documents would be rather slow :( > Not with 1.4, but apparently there is a patch for trunk. Not > sure if it is in 3.1. > If you are on 1.4, you could first query Solr to get the data > for the document

Documents update

2011-05-27 Thread Denis Kuzmenok
Hi. I have an indexed database which is indexed a few times a day and contains tinyint flags (like is_enabled, is_active, etc). The content doesn't change too often, but the flags do. So if I index only the flags via post.jar, then the entire document is deleted and there's only the unique key and flags. Is

XML Update overwrite?

2011-05-12 Thread Denis Kuzmenok
Hi. I'm trying to understand the meaning of overwrite="false" in the XML that I post with post.jar. I see two possible behaviours: 1) if the document with specified uniquekey exists - it's not updated (even if some fields are changed) 2) if the document with specified uniquekey exists and all

Re: Solr sorting

2011-03-14 Thread Denis Kuzmenok
> --- On Mon, 3/14/11, Denis Kuzmenok wrote: >> From: Denis Kuzmenok >> Subject: Solr sorting >> To: solr-user@lucene.apache.org >> Date: Monday, March 14, 2011, 10:23 AM >> Hi. >> Is there any way to make such scheme working: >> I  have  many 

Solr sorting

2011-03-14 Thread Denis Kuzmenok
Hi. Is there any way to make the following scheme work: I have many documents, each has a random field to enable random sorting, and I have a weight field. I want to get random results, but documents with a bigger weight should appear more frequently. Is that possible? Thanks in advance.
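
One way to picture the weighted random ordering asked about above is the following client-side Python sketch. It is not a Solr feature: it swaps in the Efraimidis-Spirakis weighted sampling trick (sort by `u ** (1 / weight)` with `u` uniform in [0, 1)), assumes all weights are positive, and `weighted_shuffle` is a hypothetical helper.

```python
import random


def weighted_shuffle(docs, rng=None):
    """Return doc ids in random order, favouring documents with bigger weights.

    docs is a list of (doc_id, weight) pairs; every weight must be > 0.
    """
    rng = rng or random.Random()
    # A bigger weight pushes the key closer to 1, so the doc tends to rank higher.
    keyed = [(rng.random() ** (1.0 / weight), doc_id) for doc_id, weight in docs]
    keyed.sort(reverse=True)
    return [doc_id for _, doc_id in keyed]


if __name__ == "__main__":
    docs = [("a", 1), ("b", 5), ("c", 9)]
    print(weighted_shuffle(docs))
```

With this keying, the chance that a document comes first is proportional to its weight, so a document with weight 9 leads about nine times as often as one with weight 1.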

Solr context search

2010-11-17 Thread Denis Kuzmenok
Hi. I wonder whether there is a built-in way to do context search in Solr. I have about 50k documents (mainly 'name' of char(150)), so I receive the content of a page and should show the documents found in it. Of course I can just join the terms with OR and submit a search, but the accuracy would be not so goo