I would appreciate any comments or help on this. Thanks.
Rav
-- Forwarded message --
From: Ravish Bhagdev
Date: Fri, Mar 2, 2012 at 12:12 AM
Subject: Using MLT Handler to find similar documents but also filter
similar documents by a keyword.
To: solr-user@lucene.apache.org
Hi,
Chamnap,
that would be a view of the stored fields only (although Luke has some means
of reconstructing unstored fields).
In my search projects I have an indexer, and that component (not DIH) can
display an "indexed view" of a document.
Maybe it helps.
paul
On 10 March 2012 at 08:57, Anupam Bhattacharya wrote:
Hi all,
I am new to Solr. I would like to know how to avoid indexing
duplicate documents.
I have one table in the database with a single column that holds keywords
which are frequently repeated. When I tried to index it, all the terms in
the database were indexed.
I would like the duplicates to be ignored during indexing.
Thanks Anupam and Paul.
Yes, it can't display unstored fields. I can't find a way to extract
unstored fields in Luke. Any ideas?
In your project, which indexer do you use? Previously, I wrote a Ruby
script to index, but it took a lot of time. That's why I changed to DIH.
Chamnap
On Sat, Mar 1
Hi all,
I'm using DIH with Solr 3.5 to import data from MySQL. In my document, I have
some fields: name, category, text_spell, ...
text_spell is a multi-valued field which combines name and category
(category is a multi-valued field as well).
In this case, I would use ScriptTransformer
Hello,
First of all, you can access the Context, from which the parent entity
fields can be obtained (following your link):
The semantics of execution are the same as those of a Java transformer. The
method can have two arguments, as in 'transformRow(Map<String, Object> row,
Context context)' in the abstract class
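For illustration, here is a minimal sketch of a custom Java transformer with that signature (assuming the Solr 3.x DIH Transformer API; the field names are hypothetical and the multi-valued category is simply concatenated):

import java.util.Map;
import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

// Minimal sketch: a DIH transformer that adds a derived field to each row.
public class CombineFieldsTransformer extends Transformer {

  @Override
  public Object transformRow(Map<String, Object> row, Context context) {
    // Hypothetical field names: combine "name" and "category" into "text_spell".
    Object name = row.get("name");
    Object category = row.get("category");   // may be a List if multi-valued
    if (name != null && category != null) {
      row.put("text_spell", name + " " + category);
    }
    return row;
  }
}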
Hello,
DIH has a cute interactive ui with debug/verbose features. Have you checked
them?
On Sat, Mar 10, 2012 at 10:57 AM, Chamnap Chhorn wrote:
> Hi all,
>
> I'm doing data import using DIH in Solr 3.5. I'm curious to know whether it
> is possible to see the XML representation of indexed data from the browser
I made my own indexed-document representation using JDOM and then presented
it as a web page.
paul
On 10 March 2012 at 12:08, Chamnap Chhorn wrote:
> Thanks Anupam and Paul.
>
> Yes, it can't display unstored fields. I can't find a way to extract
> unstored fields in Luke. Any ideas?
> In your proje
Hi
I am trying to index 12MM docs faster than is currently happening in Solr
(using SolrJ). We have identified Solr's add method as the bottleneck (and not
commit, which is tuned OK through mergeFactor, maxRamBufferSize, and JVM
RAM).
Adding 1000 docs is taking approximately 25 seconds. We
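Not from the original thread, but as a sketch of the usual remedies (batching adds and using a concurrent client; this assumes the SolrJ 3.x StreamingUpdateSolrServer and hypothetical field names):

import java.util.ArrayList;
import java.util.Collection;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
  public static void main(String[] args) throws Exception {
    // Queue size and thread count are illustrative; tune for your hardware.
    StreamingUpdateSolrServer server =
        new StreamingUpdateSolrServer("http://localhost:8983/solr", 1000, 4);

    Collection<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    for (int i = 0; i < 1000000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", i);                  // hypothetical fields
      doc.addField("name", "document " + i);
      batch.add(doc);
      if (batch.size() == 1000) {             // send documents in batches
        server.add(batch);
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      server.add(batch);
    }
    server.commit();                           // one commit at the end
  }
}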
Thanks Mikhail.
Yeah, in this case copyField is better. I can combine multiple fields into
a new field, right? Something like this:
Anyway, I might need to access the child entity and parent entity. Can you
give me some examples of how to use Context? I'm not a Java developer,
it's a little
Chamnap,
The Context approach is somewhat experimental as it stands, and the only way
to explore it is to use a debugger or be ready to debug the JavaScript
manually. It is not well documented.
The common approach is copyField.
With Best Wishes.
On Sat, Mar 10, 2012 at 8:24 PM, Chamnap Chhorn wrote:
> Thanks Mikh
I have a case where I'd like to get documents which most closely match a
particular vector. The RowSimilarityJob of Mahout is ideal for
precalculating similarity between existing documents but in my case the
query is constructed at run time. So the UI constructs a vector to be
used as a query.
Mikhail, the DIH interactive UI doesn't work for me because I can't see the
XML of the indexed documents. I need to see it to make sure I'm doing it right.
How do you make sure you're doing it right with the DIH interactive UI?
On Sat, Mar 10, 2012 at 7:14 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wr
Hello, I have a great challenge here. I have a big file (1.2 GB) with more than
200 million records that need to be indexed. It might grow to a 9 GB file with
more than 1000 million records later.
One record contains 3 fields. I am quite new to Solr and Lucene, so I
have some questions:
1. It seems that Solr
On Mar 10, 2012, at 1:52 PM, neosky wrote:
> Hello, I have a great challenge here. I have a big file (1.2 GB) with more than
> 200 million records that need to be indexed. It might grow to a 9 GB file with
> more than 1000 million records later.
> One record contains 3 fields. I am quite new to Solr and Luc
Does "Solr" support a 3-way join? i.e.
http://wiki.apache.org/solr/Join (I have the 2-way join working)
For example, I am pulling 3 different tables from an RDBMS into one Solr core:
Table#1: Customers (parent table)
Table#2: Addresses (child table with foreign key to customers)
Tabl
Fields can be multi-valued. Put multiple phone numbers in a field and match all
of them.
wunder
On Mar 10, 2012, at 4:58 PM, Angelina Bola wrote:
> Does "Solr" support a 3-way join? i.e.
> http://wiki.apache.org/solr/Join (I have the 2-way join working)
>
> For example, I am pulling 3 differe
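As a sketch of the denormalized, multi-valued approach Walter describes (field names and values here are hypothetical, not from the thread), repeating a field name on a SolrJ document makes it multi-valued:

import org.apache.solr.common.SolrInputDocument;

public class MultiValuedExample {
  public static SolrInputDocument customerDoc() {
    // Sketch: flatten a customer, its addresses and its phone numbers
    // into a single document.
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "customer-42");
    doc.addField("name", "Acme Corp");
    // Repeating a field name makes it multi-valued (the schema field must allow it).
    doc.addField("phone", "555-0100");
    doc.addField("phone", "555-0101");
    doc.addField("address", "1 Main St");
    doc.addField("address", "9 Elm Ave");
    return doc;
  }
}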
Barring the horrible name, I am wondering if folks would be interested
in having something like this as an alternative to the standard
kstemmer. This is largely based on the SynonymFilter, except it builds
tokens using the kstemmer output and the original input. I've created a JIRA
for this to start discu
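A rough sketch of the idea (not the actual patch from the JIRA issue): a TokenFilter that emits the stemmed token and, when it differs, also the original token at the same position. The stem() call is a placeholder stub, and the class and field names are made up:

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

public final class StemAndKeepOriginalFilter extends TokenFilter {
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final PositionIncrementAttribute posIncAtt =
      addAttribute(PositionIncrementAttribute.class);
  private State pendingOriginal;   // original token waiting to be re-emitted

  public StemAndKeepOriginalFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (pendingOriginal != null) {
      restoreState(pendingOriginal);        // replay the saved original token
      posIncAtt.setPositionIncrement(0);    // stack it on the stemmed token
      pendingOriginal = null;
      return true;
    }
    if (!input.incrementToken()) {
      return false;
    }
    String original = termAtt.toString();
    String stemmed = stem(original);
    if (!stemmed.equals(original)) {
      pendingOriginal = captureState();     // remember the original token
      termAtt.setEmpty().append(stemmed);   // current token becomes the stem
    }
    return true;
  }

  private String stem(String term) {
    // Placeholder for the kstemmer call; identity stub keeps the sketch compilable.
    return term;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    pendingOriginal = null;
  }
}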
Hi all,
Could anyone please help explain to me how a row works in DIH?
Let's say a listing can have multiple keyphrase_assets. A keyphrase_asset
is a comma-separated value ("hotel,bank,..."). I need to index it and split it
on commas into a multi-valued keyphrase field.
function fKeyphrasePosition(row) {
}
Theref
Look at the MoreLikeThis feature in Lucene. I believe it does roughly
what you describe.
On Sat, Mar 10, 2012 at 9:58 AM, Pat Ferrel wrote:
> I have a case where I'd like to get documents which most closely match a
> particular vector. The RowSimilarityJob of Mahout is ideal for
> precalculating
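For reference, a minimal sketch of building a query from free text with Lucene's MoreLikeThis (assuming the Lucene 3.x contrib API; the index path and field names are hypothetical):

import java.io.File;
import java.io.StringReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.similar.MoreLikeThis;
import org.apache.lucene.store.FSDirectory;

public class MltSketch {
  public static void main(String[] args) throws Exception {
    IndexReader reader = IndexReader.open(FSDirectory.open(new File("/path/to/index")));
    IndexSearcher searcher = new IndexSearcher(reader);

    MoreLikeThis mlt = new MoreLikeThis(reader);
    mlt.setFieldNames(new String[] { "text" });   // hypothetical field
    mlt.setMinTermFreq(1);
    mlt.setMinDocFreq(1);

    // Build a query from free text assembled at run time (e.g. from the UI's "vector").
    Query query = mlt.like(new StringReader("terms describing the desired documents"));
    TopDocs hits = searcher.search(query, 10);
    System.out.println("matches: " + hits.totalHits);

    searcher.close();
    reader.close();
  }
}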
Yeah, I am a bit wary when people want to use the join() feature. To
get good performance you really need to stick to the
recommendation of denormalizing your database into multiValued search
fields.
You can also use external fields, or store formatted info in a
String field as JSON or XML
Why not wrap the call into a service and then call the right handler?
On Fri, Mar 9, 2012 at 10:11 AM, geeky2 wrote:
> hello all,
>
> Does Solr have a mechanism that could intercept a request (before it is
> handed off to a request handler)?
>
> The intent (from the business) is to send in a gene
debugQuery tells you.
On Fri, Mar 9, 2012 at 1:05 PM, Russell Black wrote:
> When searching across multiple fields, is there a way to identify which
> field(s) resulted in a match without using highlighting or stored fields?
--
Bill Bell
billnb...@gmail.com
cell 720-256-8076
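A small sketch of what that looks like from SolrJ (assuming a SolrJ 3.x client; the URL and field names are hypothetical): turn on debugQuery and read the per-document explain output, which names the matching clauses and therefore the fields.

import java.util.Map;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DebugQuerySketch {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    SolrQuery query = new SolrQuery("title:foo OR body:foo");  // hypothetical fields
    query.set("debugQuery", "true");                           // ask Solr to explain each match

    QueryResponse response = server.query(query);
    // One explanation per matching document, keyed by unique id.
    Map<String, String> explain = response.getExplainMap();
    for (Map.Entry<String, String> e : explain.entrySet()) {
      System.out.println(e.getKey() + " => " + e.getValue());
    }
  }
}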
Great answer Robert.
On Fri, Mar 9, 2012 at 12:06 PM, Robert Stewart wrote:
> Split up index into say 100 cores, and then route each search to a specific
> core by some mod operator on the user id:
>
> core_number = userid % num_cores
>
> core_name = "core"+core_number
>
> That way each index co
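For illustration, a sketch of that routing scheme from a SolrJ client (assuming the SolrJ 3.x CommonsHttpSolrServer; the base URL and core naming are hypothetical):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class UserCoreRouter {
  private static final int NUM_CORES = 100;
  private static final String BASE_URL = "http://localhost:8983/solr/";

  // Route a user's search to the core holding that user's documents.
  public static QueryResponse searchForUser(long userId, String queryString) throws Exception {
    long coreNumber = userId % NUM_CORES;          // same mod scheme as above
    String coreName = "core" + coreNumber;
    CommonsHttpSolrServer server = new CommonsHttpSolrServer(BASE_URL + coreName);
    return server.query(new SolrQuery(queryString));
  }
}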