Have you tried reindexing using DocValues? Fields used for faceting are then
stored on disk instead of in RAM via the FieldCache. If you have enough
memory they will be loaded into the system cache, but not onto the Java heap.
This is also good for GC when committing.
http://wiki.apache.org/solr/DocValues
-
Not a short-term solution, but it sounds like a good use case for migrating
to Solr 4.x and using DocValues instead of the FieldCache for faceting.
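A minimal sketch of what that looks like in schema.xml on 4.x (field and type
names here are just examples):

  <fieldType name="string_dv" class="solr.StrField" docValues="true"/>
  <field name="category" type="string_dv" indexed="true" stored="false" docValues="true"/>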
--
Hey there,
I'm testing a custom similarity which loads data from an external file
located in solr_home/core_name/conf/. I load data from the file into a Map
in the init method of the SimilarityFactory. I would like to reload that Map
every time a commit happens or every X hours.
To do that I've th
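Roughly what is being described, as a sketch (3.x-era packages; the file path,
the file format and the custom Similarity itself are hypothetical, and the
reload-on-commit part is still the open question):

  import java.io.BufferedReader;
  import java.io.FileReader;
  import java.io.IOException;
  import java.util.HashMap;
  import java.util.Map;
  import org.apache.lucene.search.DefaultSimilarity;
  import org.apache.lucene.search.Similarity;
  import org.apache.solr.common.params.SolrParams;
  import org.apache.solr.schema.SimilarityFactory;

  public class ExternalDataSimilarityFactory extends SimilarityFactory {
    private final Map<String, Float> weights = new HashMap<String, Float>();

    @Override
    public void init(SolrParams params) {
      super.init(params);
      // hypothetical format: one "term=weight" pair per line in the core's conf/ dir
      try {
        BufferedReader r = new BufferedReader(new FileReader("/path/to/core/conf/external-weights.txt"));
        String line;
        while ((line = r.readLine()) != null) {
          String[] kv = line.split("=");
          weights.put(kv[0], Float.valueOf(kv[1]));
        }
        r.close();
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    }

    @Override
    public Similarity getSimilarity() {
      // a real implementation would return a custom Similarity that consults 'weights'
      return new DefaultSimilarity();
    }
  }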
This is totally deprecated, but maybe it can be helpful if you want to
re-sort some documents:
https://issues.apache.org/jira/browse/SOLR-1311
--
Deduplication uses Lucene's IndexWriter.updateDocument with the signature
term. I don't think it's possible, as a default feature, to choose which
document to keep; the "original" will always be the last one indexed.
From the IndexWriter.updateDocument javadoc:
"Updates a document by first deleting the document(s)
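For reference, this is roughly what happens under the hood (a sketch; the
signature field name is whatever you configured):

  import java.io.IOException;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.index.Term;

  public class SignatureUpdater {
    public static void addOrReplace(IndexWriter writer, Document doc, String signature) throws IOException {
      // deletes any previously indexed document(s) carrying the same signature term, then adds
      // this one, which is why the "original" is always the last document indexed
      writer.updateDocument(new Term("signature", signature), doc);
    }
  }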
Hey there,
I'm wondering if there's a cleaner way to do this:
I've written a SearchComponent that runs as a last-component. In the prepare
method I build a DocSet (SortedIntDocSet) based on whether some values in the
FieldCache of a given field satisfy some rules (if the rules are
satisfied, se
As far as I know there's no issue about this. You have to reindex and that's
it.
In which kind of field are you changing the norms? (You will only see
changes in text fields.)
Using debugQuery=true you can see how norms affect the score (in case you
have not omitted them).
--
Replication is easier to manage and a bit faster. See the performance
numbers: http://wiki.apache.org/solr/SolrReplication
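For reference, the Java-based replication is configured in solrconfig.xml
roughly like this (host name and conf files are placeholders):

  <!-- on the master -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
  </requestHandler>

  <!-- on the slave -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master_host:8983/solr/replication</str>
      <str name="pollInterval">00:00:60</str>
    </lst>
  </requestHandler>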
--
hey there!
Can someone explain the impact of having multivalued fields when sorting?
I have read in other threads how they affect faceting, but couldn't
find any info on the impact when sorting.
Thanks in advance
--
I mean sorting the query results, not facets.
I am asking because I have added a multivalued field that has at most 10
values, but 70% of the docs have just 1 or 2 values in this multiValued
field. I am not doing faceting.
Since I added the multiValued field, the Java old gen seems to get full
m
Hey Erik,
I am currently sorting by a multiValued field. It appears to be a feature
that you cannot know which of the values of the multiValued field puts the
document in that position. This is fine for me; I don't care for my tests.
What I need to know is whether there is any performance issue in all of this.
Th
Maybe this helps:
http://wiki.apache.org/solr/SolrPlugins#QParserPlugin
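A minimal sketch of such a plugin (1.4/3.x-era API; the class name, the
delegated parser and the fixed boost are just examples):

  import org.apache.lucene.queryParser.ParseException;
  import org.apache.lucene.search.Query;
  import org.apache.solr.common.params.SolrParams;
  import org.apache.solr.common.util.NamedList;
  import org.apache.solr.request.SolrQueryRequest;
  import org.apache.solr.search.QParser;
  import org.apache.solr.search.QParserPlugin;

  public class MyBoostQParserPlugin extends QParserPlugin {
    public void init(NamedList args) {}

    @Override
    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
      return new QParser(qstr, localParams, params, req) {
        @Override
        public Query parse() throws ParseException {
          // delegate to the default parser and apply whatever boosting is needed
          Query q = subQuery(qstr, "lucene").getQuery();
          q.setBoost(2.0f);
          return q;
        }
      };
    }
  }

It would be registered in solrconfig.xml with something like
<queryParser name="myboost" class="com.example.MyBoostQParserPlugin"/> and used
as q={!myboost}your query.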
--
>>Well, sorting requires that all the unique values in the target field
>>get loaded into memory
That's what I thought, thanks.
>>But a larger question is whether what your doing is worthwhile
>>even as just a measurement. You say
>>"This is good for me, I don't care for my tests". I claim that
>>
I think there are people using this patch in production:
https://issues.apache.org/jira/browse/SOLR-1301
I have tested it myself, indexing data from CSV and from HBase, and it works
properly.
--
I think a good solution could be to use Hadoop with SOLR-1301 to build Solr
shards and then use Solr distributed search against those shards (you will
have to copy them from HDFS to local disk to search against them).
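A distributed query against shards built that way looks something like this
(hosts and core names are placeholders):

  http://host0:8983/solr/select?q=foo&shards=host1:8983/solr/shard1,host2:8983/solr/shard2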
--
Well, the patch consumes the data from a CSV. You have to modify the input to
use TableInputFormat (I don't remember if it's called exactly like that) and
it will work.
Once you've done that, you have to specify as many reducers as shards you
want.
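A rough sketch of those two changes on the Hadoop side (HBase does ship a
TableInputFormat; everything else here is schematic):

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
  import org.apache.hadoop.mapreduce.Job;

  public class ShardIndexJob {
    public static Job configure(Configuration conf, int numShards) throws IOException {
      Job job = new Job(conf, "solr-shard-builder");
      job.setInputFormatClass(TableInputFormat.class);  // read from HBase instead of CSV
      job.setNumReduceTasks(numShards);                 // one reducer (and one index) per shard
      return job;
    }
  }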
I know two ways to index using Hadoop:
method 1 (so
Hi Otis, just out of curiosity, which strategy do you use? Indexing on the map
or the reduce side?
Do you use it to build shards or a single monolithic index?
Thanks
--
Thanks, that's very useful info. However, I can't reproduce the error. I've
created an index where all documents have a multivalued date field and each
document has a minimum of one value in that field (most of the docs have 2
or 3). So the number of un-inverted term instances is greater than
the
>>*There are lots of docs with the same value; I mention that because I
suppose that repeated values have nothing to do with the number of un-inverted
term instances.
It does matter. I've been able to reproduce the error by setting different
values in each field:
HTTP Status 500 - there are more terms th
I suppose you use batchSize=-1 to index that amount of data. From Connector/J
5.1.7 onwards there's this param:
netTimeoutForStreamingResults
The default value is 600 (seconds). Increasing it may help (2400, for example?).
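In data-config.xml that would look something like this (values are just
examples):

  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb?netTimeoutForStreamingResults=2400"
              batchSize="-1"
              user="user" password="pass"/>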
--
Hey there,
I've done some tests with a custom java app using EmbeddedSolrServer to
create an index.
It works OK and I am able to build the index, but I've noticed that after the
commit and optimize are done, the app never terminates.
How should I end it? Is there any way to tell the EmbeddedSolrServer to
It seems that coreContainer.shutdown() solves the problem.
Anyone doing it in a different way?
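A sketch of that shutdown pattern (paths and core name are placeholders; the
CoreContainer bootstrap API differs a bit between Solr versions):

  import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
  import org.apache.solr.core.CoreContainer;

  public class Indexer {
    public static void main(String[] args) throws Exception {
      System.setProperty("solr.solr.home", "/path/to/solr/home");
      CoreContainer coreContainer = new CoreContainer.Initializer().initialize();
      EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "core1");
      try {
        // ... add documents ...
        server.commit();
        server.optimize();
      } finally {
        coreContainer.shutdown();   // without this, non-daemon threads keep the JVM alive
      }
    }
  }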
--
As far as I know, the higher you set the value, the faster the indexing
process will be (because more things are kept in memory). But depending on
your needs, it may not be the best option. If you set a high
mergeFactor and you want to optimize the index once the process is done,
this op
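For reference, the setting being discussed lives in solrconfig.xml (values are
illustrative):

  <indexDefaults>
    <mergeFactor>25</mergeFactor>        <!-- higher: faster indexing, more segments to merge on optimize -->
    <ramBufferSizeMB>64</ramBufferSizeMB>
  </indexDefaults>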
http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/
--
I need to load a FieldCache for a field which is a Solr "integer" type and has
at most 3 digits. Let's say my index has 10M docs.
I am wondering which is more optimal and less memory consuming: loading a
FieldCache.DEFAULT.getInts or a FieldCache.DEFAULT.getStringIndex.
The second one will have a
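The two calls being compared, as a sketch (Lucene 2.9/3.0-era FieldCache API;
the field name is a placeholder):

  import java.io.IOException;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.search.FieldCache;

  public class CacheComparison {
    public static void load(IndexReader reader) throws IOException {
      // one int per document: roughly 4 bytes * maxDoc (~40 MB for 10M docs)
      int[] asInts = FieldCache.DEFAULT.getInts(reader, "my_int_field");

      // one ord per document (again ~4 bytes * maxDoc) plus the array of unique
      // values kept as Strings (at most ~1000 of them for a 3-digit field)
      FieldCache.StringIndex asStrings = FieldCache.DEFAULT.getStringIndex(reader, "my_int_field");
    }
  }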
I noticed that long ago.
I fixed it by doing this in HighlightComponent.finishStage:
  @Override
  public void finishStage(ResponseBuilder rb) {
    boolean hasHighlighting = true;
    if (rb.doHighlights && rb.stage == ResponseBuilder.STAGE_GET_FIELDS) {
      Map.Entry[] arr = new NamedList.NamedListEnt
Well, these are pretty different things. SolrCloud is meant to handle
distributed search in an easier way than "raw" Solr distributed search.
You have to build the shards in your own way.
Solr+Hadoop is a way to build these shards/indexes in parallel.
--
You have to create the core's folder, with its conf inside, in the Solr home.
Once done you can call the create action of the admin handler:
http://wiki.apache.org/solr/CoreAdmin#CREATE
If you need to dynamically create, start and stop lots of cores there's this
patch, but I don't know about its curren
To create the core, the folder with the confs must already exist and has to
be placed in the proper place (inside the Solr home). Once you run the
create core action, the core will be added to solr.xml and dynamically
loaded.
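For example (core name and paths are placeholders):

  http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=core1&config=solrconfig.xml&schema=schema.xml&dataDir=data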
--
As far as I know, in the core admin page you can find out when the index was
last modified and committed by checking lastModified.
But what do startTime and uptime mean?
Thanks in advance
--
>> and i index data on the basis of these fields. Now, in case i need to add a
new field, is there a way i can >> add the field without corrupting the
previous data. Is there any feature which adds a new field with a
>> default value to the existing records.
You just have to add the new field in the schema.xml.
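That is, something like this in schema.xml (name and default value are just
examples); already-indexed documents simply won't have the field until they
are reindexed, while newly indexed documents get the default:

  <field name="new_field" type="string" indexed="true" stored="true" default="unknown"/>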
I have a doubt about how NRTCachingDirectory works.
As far as I've seen, it receives a delegate Directory and caches newly
created segments. So, if MMapDirectory is usually the default:
1. Does NRTCachingDirectory work as a sort of wrapper around MMapDirectory,
caching the new segments?
2. If I have
timeAllowed can be used outside distributed search. It is used by the
TimeLimitingCollector. When the search time reaches timeAllowed it will
stop searching and return the results it could find up to then.
This can be a problem when using incremental indexing. Lucene starts
searching from
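For reference, timeAllowed is passed as a plain request parameter in
milliseconds, e.g.:

  http://localhost:8983/solr/select?q=foo&timeAllowed=1000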
http://wiki.apache.org/solr/Deduplication
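The configuration from that page is roughly the following (the fields list and
the signature class are the parts you adapt):

  <updateRequestProcessorChain name="dedupe">
    <processor class="solr.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <str name="signatureField">signature</str>
      <bool name="overwriteDupes">true</bool>
      <str name="fields">name,features,cat</str>
      <str name="signatureClass">solr.processor.Lookup3Signature</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>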
--
http://lucene.472066.n3.nabble.com/Multiple-Facet-Dates-td495480.html
--
I need to dive into search grouping / field collapsing again. I've seen there
are lots of issues about it now.
Can someone point me to the minimum patches I need to run this feature in
trunk? I want to see the code of the most optimised version and what's being
done in distributed search. I think
In case you need to create lots of indexes and register/unregister them fast,
there is work under way: http://wiki.apache.org/solr/LotsOfCores
--
Any suggestions about this issue?
--
That's true, but the degradation is too big. If you launch concurrent
requests to a web app that doesn't use Solr, the time per request won't
degrade that much. To me, it looks more like a synchronized block is being hit
somewhere in Solr or Lucene and is causing this.
--
Hey there,
I've noticed a very odd behaviour with the snapinstaller and commit (using
collectionDistribution scripts). The first time I install a new index
everything works fine. But when installing a new one, I can't see the new
documents. Checking the status page of the core tells me that the ind
Tests are done on Solr 1.4.
The simplest way to reproduce my problem is having 2 indexes and a Solr box
with just one core. Both indexes must have been created with the same schema.
1- Remove the index dir of the core and start the server (core is up with an
empty index)
2- check status page of the co
I don't know if this could have something to do with the problem, but some of
the files of the indexes have the same size and name (in all the indexes but
not in the empty one).
I have also realized that when moving back to the empty index and
committing, numDocs and maxDocs change. Once I'm with the empt
I have some more info!
I've built another index bigger than the others, so the names of the files are
not the same. This way, if I move from any of the other indexes to the bigger
one or vice versa it works (I can see the changes in the version, numDocs and
maxDocs)! So, I think it is related to the name of
I've found the problem in case someone is interested.
It's because of the indexReader.reopen(). If it is enabled, when opening a
new searcher due to the commit, this code is executed (in
SolrCore.getSearcher(boolean forceNew, boolean returnSearcher, final
Future[] waitSearcher)):
...
if
Are you indexing with full-import? If so, and the resulting index has a
similar number of docs to the one you had before, try setting reopenReaders
to false in solrconfig.xml.
* You have to send the commit, of course.
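That is, in the <mainIndex> section of solrconfig.xml:

  <mainIndex>
    ...
    <reopenReaders>false</reopenReaders>
  </mainIndex>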
--
You have different options here. You can give more boost at indexing time to
the documents that have the fields you want set. For this to take effect you
will have to reindex and set omitNorms="false" on the fields you are going
to search. The same concept can be applied to boost single fields ins
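A sketch of both pieces (field names and boost values are just examples): the
schema side, plus an index-time boost in the update XML:

  <!-- schema.xml: norms must be kept for index-time boosts to affect scoring -->
  <field name="title" type="text" indexed="true" stored="true" omitNorms="false"/>

  <!-- update message: boost the whole document, or a single field -->
  <add>
    <doc boost="2.0">
      <field name="title" boost="3.0">Some title</field>
    </doc>
  </add>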
Has someone noticed this problem and solved it somehow? (without using
LUCENE_33 in the solrconfig.xml)
https://issues.apache.org/jira/browse/LUCENE-3668
Thanks in advance
--
Well an example would be:
synonyms.txt:
huge,big size
Then I have the docs:
1- The huge fox attacks first
2- The big size fox attacks first
Then if I query for huge, the highlights for each document are:
1- The huge fox attacks first
2- The big size fox attacks first
The analyzer looks like this
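(The analyzer XML was stripped from the archived message; a typical index-time
setup for multi-word synonyms would be roughly the following.)

  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>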
That's probably because you are using both the CollapseComponent and the
QueryComponent. I think the last 2 or 3 patches allow full replacement of the
QueryComponent. You just have to replace the QueryComponent registration with
the CollapseComponent one (the two config snippets were stripped from the
archived message; see the sketch below).
This will sort out your problem and make response times faster.
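A guess at what the stripped snippets looked like, based on the
field-collapsing patch (the exact class name may differ between patch
versions):

  <!-- replace this -->
  <searchComponent name="query" class="org.apache.solr.handler.component.QueryComponent"/>

  <!-- with this -->
  <searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>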
Jay Hill wrote:
>
> I'm doing some test
Hey there,
I would like to be able to do something like: After the indexing process is
done with DIH I would like to open an indexreader, iterate over all docs,
modify some of them depending on others and delete some others. I can easily
do this coding directly with Lucene, but would like to know if
code... so I'm trying to find the best way to do that as a plugin
rather than a hack, as far as possible.
Thanks in advance
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> It is best handled as a 'newSearcher' listener in solrconfig.xml.
> onImportEnd is invoked before committing
>
> On
e event fired is firstSearcher. newSearcher
> is fired when a commit happens
>
>
> On Tue, Jul 28, 2009 at 4:19 PM, Marc Sturlese
> wrote:
>>
>> Ok, but if I handle it in a newSearcher listener it will be executed
>> every
>> time I reload a core, isn't it? Th
(I
need to modify them depending on values of other documents, that's why I
can't do it with DIH delta-import).
Thanks in advance
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> On Tue, Jul 28, 2009 at 5:17 PM, Marc Sturlese
> wrote:
>>
>> That really sounds the best way
uments after indexing process is done
> with
> : DIH
> :
> : If you make your EventListener implements SolrCoreAware you can get
> : hold of the core on inform. use that to get hold of the
> : SolrIndexWriter
> :
> : On Wed, Jul 29, 2009 at 9:20 PM, Marc St
hold of
SolrIndexWriter just holding core...
Marc Sturlese wrote:
>
> Hey there,
> I would like to be able to do something like: After the indexing process
> is done with DIH I would like to open an indexreader, iterate over all
> docs, modify some of them depending on others and d
:>the only way to "negative boost" is to "positively boost" the inverse...
:>
:> (*:* -field1:value_to_penalize)^10
This will do the job as well, since bq supports pure negative queries (at least
in trunk):
bq=-field1:value_to_penalize^10
http://wiki.apache.org/solr/SolrRelevancyFAQ#head-76e53d
As far as I know you cannot do that with DIH. What size is your index?
Probably the best you can do is index from scratch again with full-import.
clico wrote:
>
> I hope it could be a solution.
>
> But I think I understood that u can use deletePkQuery like this
>
> "select document_id from ta
Hey there, I need to sort my query results alphabetically on a particular
field called "town". This field is analyzed with a KeywordAnalyzer and isn't
multiValued. Note that some docs don't have this field.
Doing just:
http://localhost/solr//select/?q=whatever&version=2.2&start=0&rows
r
> fielType definitions in the schema.
>
> On Mon, Aug 24, 2009 at 11:58 AM, Marc Sturlese
> wrote:
>
>>
>> Hey there, I need to sort my query results alphabetically for a
>> determinated
>> field called "town". This field is analyzed with a KeywordAn
jn Visinescu wrote:
>> >
>> > There's a "sortMissingLast" true/false property that you can set on
>> your
>> > fielType definitions in the schema.
>> >
>> > On Mon, Aug 24, 2009 at 11:58 AM, Marc Sturlese
>> > wrote:
>
Hey there,
I need a query to get the total number of documents in my index. I can get
it if I do this using the DismaxRequestHandler:
q.alt=*:*&facet=false&hl=false&rows=0
I have noticed this query is very memory consuming. Is there any more
optimized way in trunk to get the total number of documents of
Hey there, I am using DIH to import a db table and have written a custom
transformer following the example:
  package foo;
  public class CustomTransformer1 {
    public Object transformRow(Map<String, Object> row) {
      String artist = (String) row.get("artist");
      if (artist != null)
        row.put("artist", artist.trim());   // illustrative body; the original message is truncated here
      return row;
    }
  }
Doing this you will send the dump where you want:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/the/dump
Then you can open the dump with jhat:
jhat /path/to/the/dump/your_stack.bin
It will probably give you an OutOfMemoryError due to the large size of the
dump. In case you can give
I think it doesn't make sense to enable warming if your Solr instance is just
for indexing purposes (it's different if you use it for search as well). You
could also comment out the caches in solrconfig.xml.
Setting queryResultWindowSize and queryResultMaxDocsCached to zero might
help too... (but if
Hey there,
I am trying to set up the Katta integration plugin. I would like to know if
Katta's ranking algorithm is used when searching among shards. If so,
would it mean it solves the problem with the IDFs of distributed Solr?
--
Are you using one single solr instance with multicore or multiple solr
instances with one index each?
Erik_l wrote:
>
> Hi,
>
> Currently we're running 10 Solr indexes inside a single Tomcat6 instance.
> In the near future we would like to add another 30-40 indexes to every
> Tomcat instance we
hold you will suffer from slow response times.
Erik_l wrote:
>
> We're not using multicore. Today, one Tomcat instance host a number of
> indexes in form of 10 Solr indexes (10 individual war files).
>
>
> Marc Sturlese wrote:
>>
>> Are you using one single solr
Is there any way to make snapinstaller install the index in
snapshot20091023124543 (for example) from another disk? I am asking this
because I would like not to optimize the index on the master (if I do that,
it takes a long time to send it via rsync since it is so big). This way I would
just have to
Hey there,
I am thinking of developing date faceting for distributed search but I don't
know exactly where to start. I am familiar with the facet dates source code
and I think that if I could understand how distributed facet queries work it
shouldn't be that difficult.
I have read http://wiki.apache.org/solr/Writ
Hey there,
I am using Solr 1.4 out of the box and am trying to create a core at runtime
using the CREATE action.
I am getting this error when executing:
http://localhost:8983/solr/admin/cores?action=CREATE&name=x&instanceDir=x&persist=true&config=solrconfig.xml&schema=schema.xml&dataDir=da
With 1.4
-Add log4j jars to Solr
-Configure the SyslogAppender with something like:
log4j.appender.solrLog=org.apache.log4j.net.SyslogAppender
log4j.appender.solrLog.Facility=LOCAL0
log4j.appender.solrLog.SyslogHost=127.0.0.1
log4j.appender.solrLog.layout=org.apache.log4j.PatternLayout
log4j.appe
And what about:
vs.
Which is the difference between the two? Is bcdint always better?
Thanks in advance
Yonik Seeley-2 wrote:
>
> On Fri, Dec 4, 2009 at 7:38 PM, Jay Hill wrote:
>> 1) Is there any benefit to using the "int" type as a TrieIntField w/
>> precisionStep=0 over the "pint" ty
I am tracing QueryComponent.java and would like to know the purpose of the
doFSV function. I don't understand what fsv (field sort values) are for.
I have tried some queries with fsv=true and some extra info appears in the
response.
But I don't know what it is for and can't find much info out there. I read:
// The query
Hey there,
I need to be able to decide, once a document has been created, whether I want
it to be indexed or not. I have thought of implementing an
UpdateRequestProcessor to do that, but don't know how to tell Solr in the
processAdd method to skip the document.
If I delete all the fields, would it be skipped or
eers
>
> On Thu, Dec 10, 2009 at 12:09 PM, Marc Sturlese
> wrote:
>
>>
>> Hey there,
>> I need that once a document has been created be able to decide if I want
>> it
>> to be indexed or not. I have thought in implement an
>> UpdateRequestProces
Yes, it did
Cheers
Chris Male wrote:
>
> Hi,
>
> Yeah thats what I was suggesting. Did that work?
>
> On Thu, Dec 10, 2009 at 12:24 PM, Marc Sturlese
> wrote:
>
>>
>> Do you mean something like?:
>>
>>@Override
>>public vo
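A minimal sketch of the kind of processor being discussed (1.4-era packages;
the class name and the skip rule are made up). Skipping is done by simply not
forwarding the command down the chain:

  import java.io.IOException;
  import org.apache.solr.common.SolrInputDocument;
  import org.apache.solr.request.SolrQueryRequest;
  import org.apache.solr.request.SolrQueryResponse;   // org.apache.solr.response in later versions
  import org.apache.solr.update.AddUpdateCommand;
  import org.apache.solr.update.processor.UpdateRequestProcessor;
  import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

  public class SkipDocProcessorFactory extends UpdateRequestProcessorFactory {
    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp,
                                              UpdateRequestProcessor next) {
      return new UpdateRequestProcessor(next) {
        @Override
        public void processAdd(AddUpdateCommand cmd) throws IOException {
          SolrInputDocument doc = cmd.getSolrInputDocument();
          if (doc.getFieldValue("title") == null) {   // example skip rule
            return;                                   // not calling super -> document is never indexed
          }
          super.processAdd(cmd);
        }
      };
    }
  }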
Should the sortMissingLast param work on trie fields?
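For reference, the attribute goes on the fieldType declaration in schema.xml,
e.g. (example type):

  <fieldType name="tint" class="solr.TrieIntField" precisionStep="8"
             omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>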
--
View this message in context:
http://old.nabble.com/tire-fields-and-sortMissingLast-tp26873134p26873134.html
Sent from the Solr - User mailing list archive at Nabble.com.
If you want to retrieve a huge volume of rows you will end up with an
OutOfMemoryException due to the JDBC driver. Setting batchSize to -1 in your
data-config.xml (which internally sets it to Integer.MIN_VALUE) will make
the query execute in streaming mode, avoiding the memory exception.
Joe
Solr but don't know how to do this quoting stuff. I would like
to do something like this:
...title:"+query_string+" (setting boosting 3) and title:+query_string+
(setting boosting 2)...
I suppose I have to add something to the solrconfig.xml but couldn't find
what.
Any advice?
Than
Hey there,
I am doing the same and I am experiencing some trouble. I get the document
data searching by term. The problem is that when I do it several times
(inside a huge for loop) the app's memory use keeps increasing until almost
the whole memory is used...
Did you find any other way to do that?
J
cs... but the memory
problem never disappeared...
If I call the garbage collector every time I use the code above, the memory
doesn't increase indefinitely, but... the app works so slowly.
Any suggestion?
Thanks for replying!
Yonik Seeley wrote:
>
> On Sun, Nov 2, 2008 at 8:09 PM, Marc St
Hey, you are right,
I'm trying to migrate my app to Solr. For the moment I am using Solr for the
searching part of the app, but I am using my own Lucene app for indexing.
I should have posted in the Lucene forum for this issue. Sorry about that.
I am trying to use TermDocs properly now.
Thanks for your a
I
do the select and the mapping db_field - index_field
*The mysql connector is correctly added in the classpath
I think I must be missing something in my configuration but can't find
what...
Can anyone give me a hand? I am a bit lost with this problem...
Thanks in advance
Marc Sturlese
That worked! I was writing the tag in a bad way.
> It seems like your data-config does not have any tag. The
> following is the correct structure:
>
>
>
>
>
>
>
> On Tue, Nov 11, 2008 at 12:31 AM, Marc Sturlese
> <[EMAIL PROTECTED]>wrote:
>
Field
Inside my requesthandler called /dataimport (which uses
org.apache.solr.handler.dataimport.DataImportHandler class)
Has anyone done something similar?
Marc Sturlese
--
Hey there,
For a few weeks now I have been trying to migrate my Lucene core app to Solr
and many questions are coming to mind...
Before attending ApacheCon I thought that my Lucene index worked fine with my
Solr search engine, but after my conversation with Erik in the Solr BootCamp
I understood that the
Hey there, I am using dataimport with full-import successfully but there's no
way to make it work with delta-import. Apparently Solr doesn't show any error,
but it does not do what it is supposed to.
I think the problem is with dataimport.properties, because it is never
updated. I have it placed in the
> use.
>
> On Fri, Nov 14, 2008 at 4:35 PM, Marc Sturlese
> <[EMAIL PROTECTED]>wrote:
>
>>
>> Hey there, I am using dataimport with full-import successfully but
>> there's
>> no
>> way do make it work with delta-import. Aparently solr doesn
Hey,
That's the weird thing... in the log everything seems to work fine:
Nov 14, 2008 3:12:46 PM org.apache.solr.handler.dataimport.DataImportHandler
processConfiguration
INFO: Processing configuration from solrconfig.xml:
{config=/opt/netbeans-5.5.1/enterprise3/apache-tomcat-5.5.17/bin/solr/conf
> meant
> for debugging only. If you want to do a commit, add commit=true as a
> request
> parameter.
>
> On Fri, Nov 14, 2008 at 7:56 PM, Marc Sturlese
> <[EMAIL PROTECTED]>wrote:
>
>>
>> Hey,
>> That's the weird thing... in the
Hey there,
I have posted before about my situation, but I think my explanation
was a bit confusing...
I am using DataImportHandler and delta-import and it's working perfectly. I
have also coded my own SqlEntityProcessor to delete expired rows from the
index and database.
Now I need to do d
2008-11-12 05:10 PM this one exactly).
I have downloaded the latest nightly-build source code and couldn't find the
needed classes in there.
Does anyone know something? Should I ask this in the developers forum?
Thanks in advance
Marc Sturlese wrote:
>
> Hey there,
>
> I have post
Marc Sturlese wrote:
>
> Thank you so much. I have it sorted.
> I am wondering now if there is any more stable way to use deduplication
> than adding to the solr source project this patch:
> https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.iss
Hey there, I've been testing and checking the source of the
TextProfileSignature.java to avoid similar entries at indexing time.
What I understood is that it is useful for huge texts where the frequency of
the tokens (the words lowercased, keeping just numbers and letters in that
case) is important. If
>>
>> I have my own duplication system to detect that but I use String
>> comparison
>> so it works really slow...
>>
What are you doing for the String comparison? Not exact right?
hey,
My comparison method looks for similar (not just exact) matches... what I do
is compare two texts word by word. Wh
Ken Krugler wrote:
>
>>Marc Sturlese wrote:
>>>Hey there, I've been testing and checking the source of the
>>>TextProfileSignature.java to avoid similar entries at indexing time.
>>>What I understood is that it is useful for huge text where the frequency
Hey there,
I have started working with an index divided into 3 shards. When I did a
distributed search I got an error with the fields that were not string or
text. I read that the error was due to the BinaryResponseWriter and empty
non-string/text fields.
I found the solution in an old thread of this f
Hey there,
I am facing a problem doing field facets and I don't know if there exists
any solution in Solr for it.
I want to facet on a field that holds very short text. To do that I am
using the KeywordTokenizerFactory to keep all the words of the text in just
one token. I use Low
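A sketch of the field type being described (a single-token, lowercased field
used for faceting):

  <fieldType name="facetText" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>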
Hey there,
After developing my own classes extending SqlEntityProcessor,
JdbcDataSource and Transformer, I have my customized DataImportHandler almost
working.
I have one more goal to reach.
On one hand, I don't always have to index all the fields from my db row. For
example, fields from the db that