You want to cluster a few thousand documents; for example, when a user
searches Solr, just cluster the search results. Mahout is much more
scalable, and you would probably need Hadoop for that.
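For the search-results case, Carrot2 runs as a Solr search component and clusters each page of results on the fly. A minimal sketch of the solrconfig.xml wiring for Solr 3.x (the handler name and the title/snippet field names are assumptions):

```xml
<!-- Carrot2 clustering component from the Solr clustering contrib -->
<searchComponent name="clustering"
                 class="org.apache.solr.handler.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
  </lst>
</searchComponent>

<!-- A search handler that appends clusters to its own results -->
<requestHandler name="/clustering" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <str name="carrot.title">name</str>   <!-- assumed field names -->
    <str name="carrot.snippet">text</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>
```

This clusters only the rows returned by each query, which is exactly the few-thousand-documents scale Carrot2 is designed for.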
thanks
chandan
On Tue, Sep 4, 2012 at 2:10 PM, Denis Kuzmenok wrote:
> ---- Original Message ----
Hi, all.
I know there are Carrot2 and Mahout for clustering. I want to implement the
following:
I fetch documents and want to group them into clusters as they are added to
the index (I want to filter out "similar" documents, for example over a 1-week
window). I need these documents quickly, so I can't rely on some po
Original Message
Subject: Solr Clustering
From: Denis Kuzmenok
To: solr-user@lucene.apache.org
CC:
Hi, all.
I know there are Carrot2 and Mahout for clustering. I want to implement the
following:
I fetch documents and want to group them into clusters as they are added to
But I don't know what values the price field will have in that query. It
can be 100-1000, or 10-100, and I want to get ranges in every query:
just split the price field by document count.
> Yes, Range Facets
> http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range
> 2011/8/31
Hi.
Suppose I have a field "price" with varying values, and I want to get
ranges for this field depending on the document count. For example, I want
5 ranges for 100 docs with 20 docs in each range, 6 ranges for
200 docs (about 34 docs in each range), etc.
Is this possible with Solr?
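Solr's built-in range faceting (facet.range.gap) only produces fixed-width buckets, so equal-document-count ranges have to be computed client-side from the sorted price values. A rough sketch of that split, assuming you can fetch the prices first (e.g. with fl=price and a large rows value):

```python
def equal_count_ranges(prices, docs_per_range):
    """Split price values into (low, high) ranges of ~equal document count."""
    prices = sorted(prices)
    ranges = []
    for i in range(0, len(prices), docs_per_range):
        chunk = prices[i:i + docs_per_range]
        ranges.append((chunk[0], chunk[-1]))
    return ranges

# 100 docs, 20 per range -> 5 (low, high) pairs
print(equal_count_ranges(list(range(100)), 20))
```

Each (low, high) pair can then be turned into an fq=price:[low TO high] filter for the follow-up query.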
> Hi All,
> I indexed a set of documents using Solr, and they show up on the stats page
> of the admin panel.
> However, the search interface always returns 0 documents.
> When I give the query *:*, it does return all the 20K-odd documents I
> indexed just a few hours ago.
> Can
Of course, I did stop Solr before copying the index. Deleting the
index and reindexing on the production server did solve the issue. Strange,
but it works.
> Have you stopped Solr before manually copying the data? This way you
> can be sure that index is the same and you didn't have any new docs
2011, at 10:44 AM, Denis Kuzmenok wrote:
>> Hi.
>>
>> I've debugged search on a test machine; after copying the entire
>> directory (the whole Solr directory) to the production server, I noticed
>> that one query (SDR S70EE K) matches on the test server but not on
>> production.
>> How can that be?
>>
Hi.
I've debugged search on a test machine; after copying the entire
directory (the whole Solr directory) to the production server, I noticed
that one query (SDR S70EE K) matches on the test server but not on
production.
How can that be?
Your solution seems to work fine; not perfect, but much better than
mine :)
Thanks!
>> If I do a query like "Samsung", I want to see the most relevant results
>> with isflag:true and higher popularity first, but if I do a query like "Nokia
>> 6500" and the match has isflag:false, then it should still rank higher
Hi, everyone.
I have these fields:
text fields: name, title, text
boolean field: isflag (true / false)
int field: popularity (0 to 9)
Now I run this query:
defType=edismax
start=0
rows=20
fl=id,name
q=lg optimus
fq=
qf=name^3 title text^0.3
sort=score desc
pf=name
bf=isflag sqrt(popularity)
mm=100%
debug
try:
q=title:Unicamp&defType=dismax&bf=question_count^5.0
"title:Unicamp" in any search handler will search only in the requested field
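To check which parameters actually reach Solr, it can help to build the full query string explicitly. A small sketch (the host, port, and qf field list are assumptions):

```python
from urllib.parse import urlencode

params = {
    "q": "title:Unicamp",
    "defType": "dismax",
    "qf": "title",               # dismax needs qf; the field list is assumed
    "bf": "question_count^5.0",  # additive boost by topic popularity
    "fl": "id,title,score",
    "debugQuery": "true",        # explains how bf contributes to each score
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

With debugQuery=true, the explain section of the response shows the FunctionQuery term added by bf, which makes it easy to verify the boost is really applied.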
> The queries I am trying to do are
> q=title:Unicamp
> and
> q=title:Unicamp&bf=question_count^5.0
> The boosting factor (5.0) is just to verify if it was really
Show your full request to Solr (all params).
> Hi,
> I'm trying to use the bf parameter in Solr queries, but I'm having some problems.
> The context: I have some topics and an integer popularity weight
> (the number of users that follow the topic). I'd like to boost documents
> according to this w
> If you could move to 3.x, and your "linked item" boosts could be
> calculated offline in batch periodically, you could use an external
> file field to store the doc boost.
> A few if's, though.
I have 3.2, and external file fields don't work without a Solr restart
(on a multicore instance).
...Solr without the external files and then create them: they are not
working. What is wrong?
PS: Solr 3.2
> http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
> On Tuesday 31 May 2011 15:41:32 Denis Kuzmenok wrote:
>> Flags are stored to filter results a
...from selected values
3) Other values for the selected properties (if any are chosen)
4) Other brand_ids for the selected brand_id
5) Other prices for the selected price
Will appreciate any help or thoughts!
Cheers,
Denis Kuzmenok
Thursday, June 2, 2011, 6:29:23 PM, you wrote:
Wow. This sounds nice. Will try this way. Thanks!
> Denis,
> would dynamic fields help?
> a field defined as *_price in the schema;
> at index time you index fields named like:
> [1-9]_[0-99]_price
> at query time you search the price field for a given co
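The dynamic-field idea above can be sketched in schema.xml roughly like this (the field naming scheme and the float type are assumptions):

```xml
<!-- matches any field whose name ends in _price, e.g. 1_15_price -->
<dynamicField name="*_price" type="float" indexed="true" stored="true"/>
```

At index time each document gets one field per country/region combination, e.g. `<field name="1_15_price">349.60</field>`; at query time a filter such as fq=1_15_price:[100 TO 500] then hits only that combination.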
Hi)
What I need:
index prices for products; each product has multiple prices, one per
region and country.
I tried a field of type "long" with multiValued="true", forming the
value as "country code + region code + price" (1004000349601, for
example), but it has strange beha
> Hey Denis,
> * How big is your index in terms of number of documents and index size?
5 cores, averaging 250,000 documents; one with about 1 million (but
without text, just int/float fields), and one with about 10 million
id/name documents, but with n-grams.
Size: 4 databases, about 1G in total,
you don't understand? Just start with whatever
> the Solr example jetty has, and only change things if you have a reason
> to (that you understand).
> On 6/1/2011 1:19 PM, Denis Kuzmenok wrote:
>> Overall memory on server is 24G, and 24G of swap, mostly all the time
the default parameters from the Solr example
> jetty, and if you don't run into any problems, then great. Starting
> with the example jetty shipped with Solr would be the easiest way to get
> started for someone who doesn't know much about Java/JVM.
> On 6/1/2011 12:37 PM, Den
So what should I do to avoid that error?
I can use 10G on the server; right now I run with these flags:
java -Xms6G -Xmx6G -XX:MaxPermSize=1G -XX:PermSize=512M -D64
Or should I set Xmx lower, and what about the other params?
Sorry, I don't know much about Java/JVM =(
Wednesday, June 1, 2011, 7:29:
Here is the output after about 24 hours of running Solr. Maybe there is some
way to limit memory consumption? :(
test@d6 ~/solr/example $ java -Xms3g -Xmx6g -D64
-Dsolr.solr.home=/home/test/solr/example/multicore/ -jar start.jar
2011-05-31 17:05:14.265:INFO::Logging to STDERR via
. it counts all memory, not
> sure... if you don't have big values for 99.9%wa (which means I/O wait:
> disk/swap usage), everything is fine...
> -Original Message-
> From: Denis Kuzmenok
> Sent: May-31-11 4:18 PM
> To: solr-user@lucene.apache.org
> Subject: Solr
I run a multi-core Solr with the flags -Xms3g -Xmx6g -D64, but I see
this in top after 6-8 hours, and it is still rising:
17485 test214 10.0g 7.4g 9760 S 308.2 31.3 448:00.75 java
-Xms3g -Xmx6g -D64 -Dsolr.solr.home=/home/test/solr/example/multicore/ -jar
start.jar
Are there any ways t
Will it be slow if there are 3-5 million key/value rows?
> http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
> On Tuesday 31 May 2011 15:41:32 Denis Kuzmenok wrote:
>> Flags are stored to filter results and it's pretty highloaded, it's
&g
Flags are stored to filter results, and it's under pretty high load; it's
working fine, but I can't update the index very often just to keep the
flags up to date =\
Where can I read about using external fields / files?
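ExternalFileField is the usual answer here: per-document float values live in a plain text file that is reloaded on commit, with no reindexing of the documents themselves. A sketch for Solr 3.x (the field and type names are illustrative):

```xml
<!-- schema.xml: values come from a file, not from the index -->
<fieldType name="fileFloat" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="pfloat"/>
<field name="popularity" type="fileFloat" indexed="false" stored="false"/>
```

The values go in a file named external_popularity in the index data directory, one uniqueKey=value line per document (e.g. 12345=3.5). In 3.x the field is only usable inside function queries (e.g. in bf), not as a regular search field.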
> And it wouldn't work unless all the data is stored anyway. Currently there's
> no w
After a restart I get these errors every time I commit via post.jar.
Config: multicore / 5 cores, Solr 3.1.
Lock obtain timed out:
SimpleFSLock@/home/ava/solr/example/multicore/context/data/index/write.lock
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
SimpleFSLoc
I have a database with an n-gram field, about 5 million documents. QTime
is about 200-1000 ms; the database is not optimized because it must answer
queries at all times and the data is updated often. Is this normal?
Solr: 3.1, java -Xms2048M -Xmx4096M
Server: i7, 12GB
I'm using 3.1 now. Indexing takes a few hours, and the plain size is
big. Getting all documents would be rather slow :(
> Not with 1.4, but apparently there is a patch for trunk. Not
> sure if it is in 3.1.
> If you are on 1.4, you could first query Solr to get the data
> for the document
Hi.
I have an indexed database which is reindexed a few times a day and
contains tinyint flags (like is_enabled, is_active, etc.); the content
isn't changed too often, but the flags are.
So if I index only the flags via post.jar, the entire document is replaced
and only the unique key and the flags remain.
Is
Hi.
I'm trying to understand the meaning of overwrite="false" in the XML I
post with post.jar.
I see two possible behaviours:
1) if a document with the specified uniqueKey exists, it is not updated
(even if some fields have changed)
2) if a document with the specified uniqueKey exists and all
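For what it's worth, my understanding (worth verifying against your Solr version) is that neither reading is exact: overwrite="false" simply skips the delete-by-uniqueKey step, so the new document is added alongside the existing one and duplicates become possible; fields are never merged. Posted via post.jar it looks like:

```xml
<add overwrite="false">
  <doc>
    <field name="id">12345</field>
    <field name="is_active">1</field>
  </doc>
</add>
```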
> --- On Mon, 3/14/11, Denis Kuzmenok wrote:
>> From: Denis Kuzmenok
>> Subject: Solr sorting
>> To: solr-user@lucene.apache.org
>> Date: Monday, March 14, 2011, 10:23 AM
>> Hi.
>> Is there any way to make such scheme working:
>> I have many
Hi.
Is there any way to make the following scheme work:
I have many documents, each with a random field to enable random
sorting, and a weight field.
I want to get random results, but documents with a bigger weight should
appear more frequently.
Is that possible?
Thanks in advance.
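RandomSortField alone gives a uniform order, so the weighting has to come from elsewhere: one option, if your Solr version supports sorting by function queries, is sorting by the product of the random field and the weight; another is re-shuffling a fetched result page client-side. A sketch of the latter, using Efraimidis-Spirakis weighted sampling (doc names and weights are illustrative):

```python
import random

def weighted_shuffle(docs, weights):
    """Order docs randomly so that a doc with weight w is proportionally
    more likely to appear near the front (key = u ** (1/w), sorted desc)."""
    keyed = [(random.random() ** (1.0 / w), doc)
             for doc, w in zip(docs, weights)]
    keyed.sort(key=lambda kv: kv[0], reverse=True)
    return [doc for _, doc in keyed]

# A doc with weight 10 leads roughly 10x as often as a doc with weight 1.
print(weighted_shuffle(["cheap", "popular"], [1, 10]))
```

Over many requests every document still appears at every position, but the heavier ones surface first more often, which matches the "appear more frequently" requirement.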
Hi.
I wonder: is there a built-in way to do context search in
Solr?
I have about 50k documents (mainly a 'name' field of char(150)); I receive
the content of a page and should show the matching documents.
Of course, I can just join the terms with OR and submit a search, but the
accuracy would not be so goo