Please follow the instructions here: http://lucene.apache.org/solr/resources.html
F.
> On Jun 8, 2015, at 1:06 AM, Dylan wrote:
>
> On 30 May 2015 12:08, "Lalit Kumar 4" wrote:
>
>> Please unsubscribe me as well
>>
>> On May 30, 2015 15:23, Neha Jatav wrote:
>> Unsubscribe me
>>
Quoting Erik from two days ago:
Please follow the instructions here:
http://lucene.apache.org/solr/resources.html. Be sure to use the exact same
e-mail you used to subscribe.
> On May 30, 2015, at 6:07 AM, Lalit Kumar 4 wrote:
>
> Please unsubscribe me as well
>
> On May 30, 2015 15:23, Neh
can use 18.0.
Simple really.
François
> On May 26, 2015, at 10:30 AM, Robust Links wrote:
>
> i can't run 14.0.1. that is the problem. 14 does not have the interfaces i
> need
>
> On Tue, May 26, 2015 at 10:28 AM, François Schiettecatte <
> fschietteca...@gmail.c
Run whatever tests you want with 14.0.1, replace it with 18.0, rerun the tests
and compare.
François
> On May 26, 2015, at 10:25 AM, Robust Links wrote:
>
> by "dumping" you mean recompiling solr with guava 18?
>
> On Tue, May 26, 2015 at 10:22 AM, François Schi
Have you tried dumping guava 14.0.1 and using 18.0 with Solr? I did a while ago
and it worked fine for me.
François
> On May 26, 2015, at 10:11 AM, Robust Links wrote:
>
> i have a minhash logic that uses guava 18.0 method that is not in guava
> 14.0.1. This minhash logic is a separate maven p
Rebecca
You don’t want to give all the memory to the JVM. You want to give it just
enough for it to work optimally and leave the rest of the memory for the OS to
use for caching data. Giving the JVM too much memory can result in worse
performance because of GC. There is no magic formula to figu
Dinesh
See this:
http://wordlist.aspell.net/varcon/
You will need to do some work to convert to a SOLR friendly format though.
Cheers
François
> On Feb 12, 2015, at 12:22 AM, dinesh naik wrote:
>
> Hi ,
> We are looking for a dictionary to support American/British English synonym.
How about adding 'expungeDeletes=true' as well as 'commit=true'?
François
On Sep 13, 2014, at 4:09 PM, FiMka wrote:
> Hi guys, could you say how to delete a document in Solr? After I delete a
> document it still persists in the search results. For example there is the
> following document saved
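If it helps, a sketch of building that update request URL with both parameters (the host, port and core path are hypothetical):

```python
from urllib.parse import urlencode

# 'commit' makes the deletions visible to searches; 'expungeDeletes'
# additionally merges away segments that contain deleted documents.
params = {"commit": "true", "expungeDeletes": "true"}
url = "http://localhost:8983/solr/update?" + urlencode(params)
print(url)
```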
How about:
datefield:[NOW-1DAY/DAY TO *]
François
On Sep 2, 2014, at 6:54 AM, Aman Tandon wrote:
> Hi,
>
> I did it using this, fq=datefield:[2014-09-01T23:59:59Z TO
> 2014-09-02T23:59:59Z].
> Correct me if i am wrong.
>
> Is there any way to find this using the NOW?
>
>
> With Re
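For reference, a sketch of what NOW-1DAY/DAY works out to: shift back one day, then /DAY rounds down to midnight UTC (the "now" below is a made-up timestamp standing in for the real NOW):

```python
from datetime import datetime, timedelta, timezone

now = datetime(2014, 9, 2, 6, 54, tzinfo=timezone.utc)  # hypothetical "NOW"
# NOW-1DAY shifts back 24 hours; /DAY truncates to the start of that day.
start = (now - timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
fq = "datefield:[%s TO *]" % start.strftime("%Y-%m-%dT%H:%M:%SZ")
print(fq)  # datefield:[2014-09-01T00:00:00Z TO *]
```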
I would also get some metrics when SOLR is doing nothing; the JVM does work in
the background, and looking at the memory graph in VisualVM will show a nice
sawtooth.
François
On Aug 14, 2014, at 1:16 PM, Erick Erickson wrote:
> bq: I just don’t know why Solr is suddenly going nuts.
>
> Hm
Hi
If you are seeing " appelé au téléphone" in the browser, I would guess that
the data is being rendered in UTF-8 by your server while the content type of
the html is set to iso-8859-1, or is not set at all and your browser is
defaulting to iso-8859-1.
You can force the encoding to utf-8 in the
The VM will choose a default garbage collector for you; it might help to get
the stack trace to look at.
François
On Jul 24, 2014, at 10:06 AM, Ameya Aware wrote:
> ooh ok.
>
> So you want to say that since i am using large heap but didnt set my
> garbage collection, thats why i why gettin
not
if it is UNLOADed and then LOADed. It occurs whether G1, CMS or ParallelGC is
used for garbage collection.
I used JDK 1.7.0_60 and Tomcat 7.0.54 for the underlying layers.
Not sure where to take it from here?
Cheers
François
On Jun 16, 2014, at 4:50 PM, François Schiettecatte
wrote:
&
Hi
I am running into an interesting garbage collection issue and am looking for
suggestions/thoughts.
Because some word lists, such as synonyms, plurals, and protected words, need
to be updated on a regular basis, I have to RELOAD a number of cores in order
to 'pick up' the new lists.
What I have
Just click the 'Releases' link:
https://github.com/DmitryKey/luke/releases
François
On Jun 9, 2014, at 10:43 AM, Aman Tandon wrote:
> No, Anyways thanks Alex, but where is the luke jar?
>
> With Regards
> Aman Tandon
>
>
> On Mon, Jun 9, 2014 at 6:54 AM, Alexandre Rafalovitch
> wro
Have you tried using:
-XX:-UseGCOverheadLimit
François
On Apr 8, 2014, at 6:06 PM, Haiying Wang wrote:
> Hi,
>
> We were trying to merge a large index (9GB, 21 million docs) into current
> index (only 13MB), using mergeindexes command ofCoreAdminHandler, but always
> run into OOM e
Maybe you should try a more recent release of Luke:
https://github.com/DmitryKey/luke/releases
François
On Apr 7, 2014, at 12:27 PM, azhar2007 wrote:
> Hi All,
>
> I have a solr index which is indexed ins Solr.4.7.0.
>
> Ive attempted to open the index with Luke4.0.0 and also other v
Have you looked at the debugging output?
http://wiki.apache.org/solr/CommonQueryParameters#Debugging
François
On Apr 2, 2014, at 1:37 AM, Bob Laferriere wrote:
>
> I have built an commerce search engine. I am struggling with the word “no” in
> queries. We have products that are “No S
Better to use '+A +B' rather than AND/OR, see:
http://searchhub.org/2011/12/28/why-not-and-or-and-not/
François
On Mar 25, 2014, at 10:21 PM, Koji Sekiguchi wrote:
> (2014/03/26 2:29), abhishek jain wrote:
>> hi friends,
>>
>> when i search for "A and B" it gives me result for A , B
Hi
Why not copy the core directory instead of the data directory? The conf
directory is very small and that would ensure that you don't get schema
mismatch issues.
If you are stuck with copying the data directory, then I would replace the data
directory in the target core and reload that core,
> and using jetty with solr here..
>
>
> On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte <
> fschietteca...@gmail.com> wrote:
>
>> A few more specifics about the environment would help, Windows/Linux/...?
>> Jetty/Tomcat/...?
>>
>> Françoi
f the remote machine someone will need to go and restart
> the machine ...
>
> You can try use a kvm or other remote control system
>
> --
> Yago Riveiro
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>
>
> On Tuesday, October 22, 2013 at 5:46 PM, Françoi
If you are on linux/unix, use the kill command.
François
On Oct 22, 2013, at 12:42 PM, Raheel Hasan wrote:
> Hi,
>
> is there a way to stop/restart java? I lost control over it via SSH and
> connection was closed. But the Solr (start.jar) is still running.
>
> thanks.
>
> --
> Regards,
> Ra
Well no, the OS is smarter than that, it manages file system cache along with
other memory requirements. If applications need more memory then file system
cache will likely be reduced.
The command is a cheap trick to get the OS to fill the file system cache as
quickly as possible, not sure how
Kumar
You might want to look into the 'pf' parameter:
https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
François
On Oct 21, 2013, at 9:24 AM, kumar wrote:
> I am querying solr for exact match results. But it is showing some other
> results also.
>
> E
To put the file data into the file system cache, which would make for faster access.
François
On Oct 21, 2013, at 8:33 AM, michael.boom wrote:
> Hmm, no, I haven't...
>
> What would be the effect of this ?
>
>
>
> -
> Thanks,
> Michael
> --
> View this message in context:
> http://lucene.4
Hi
The approach I take is to store enough data in the SOLR index to render the
results page, and go to the database if the user wants to view a document.
Cheers
François
On Oct 6, 2013, at 9:45 AM, user 01 wrote:
> @Gora:
> you understood the schema correctly, but I can't believe it's strang
Shouldn't the search be more like this if you are searching in the
'descricaoRoteiro' field:
descricaoRoteiro:(BPS 8D BEACH*)
whereas in your example you have a space between 'descricaoRoteiro' and 'BPS':
descricaoRoteiro:BPS 8D BEACH*
François
On Sep 2, 2013, at 8:08 AM, Dmitr
Kamal
You could also use the 'mm' parameter to require a minimum match, or you could
prepend '+' to each required term.
Cheers
François
On May 13, 2013, at 7:57 AM, Kamal Palei wrote:
> Hi Rafał Kuć
> I added q.op=AND as per you suggested. I see though some initial record
> document contain
Hi
Just ran into this bug while playing around with 3.6. Using edismax and
entering a search like this "(text:foobar)" causes the query parser to mangle
the query as shown by the results below. Adding a space after the first paren
solves this. I checked 3.6.1 and get the same issue. I recall
I would create a hash of the document content and store that in SOLR along with
any document info you wish to store. When a document is presented for indexing,
hash that and compare to the hash of the stored document, index if they are
different and skip if they are not.
François
On Nov 24,
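A minimal sketch of that skip-if-unchanged check (the helper name is made up):

```python
import hashlib

def content_hash(text: str) -> str:
    # Hash of the document content, stored in Solr alongside the doc info.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

stored_hash = content_hash("original document content")    # what Solr has
incoming_hash = content_hash("original document content")  # what was presented
# Index only when the content actually changed; skip otherwise.
needs_indexing = incoming_hash != stored_hash
print(needs_indexing)  # False: unchanged content, skip it
```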
I suspect it is just part of the wildcard handling, maybe someone can chime in
here, you may need to catch this before it gets to SOLR.
François
On Nov 12, 2012, at 5:44 PM, johnmu...@aol.com wrote:
> Thanks for the quick response.
>
>
> So, I do not want to use ReversedWildcardFilterFactory,
John
You can still use leading wildcards even if you don't have the
ReversedWildcardFilterFactory in your analysis, but it means you will be
scanning the entire dictionary when the search is run, which can be a
performance issue. If you do use ReversedWildcardFilterFactory you won't have
that perf
Aaron
The best way to make sure the index is cached by the OS is to just cat it on
startup:
cat `find /path/to/solr/index` > /dev/null
Just make sure your index is smaller than RAM, otherwise data will be rotated
out.
Memory mapping is built on the virtual memory system, and I suspect
What is probably going on is that the response is not being interpreted as
UTF-8 but as some other encoding.
What are you using to display the response?
François
On Aug 28, 2012, at 8:08 AM, zehoss wrote:
> Hi,
> at the beginning I would like to sorry for my english. I hope my message
> will
You should check this at pcper.com:
http://pcper.com/ssd-decoder
http://pcper.com/content/SSD-Decoder-popup
Specs for a wide range of SSDs.
Best regards
François
On Aug 23, 2012, at 5:35 PM, Peyman Faratin wrote:
> Hi
>
> Is there a SSD brand and spec that the community re
I would create two indices, one with your content and one with your ads. This
approach would allow you to precisely control how many ads you pull back and
how you merge them into the results, and you would be able to control schemas,
boosting, defaults fields, etc for each index independently.
On Jul 11, 2012, at 2:52 PM, Shawn Heisey wrote:
> On 7/2/2012 2:33 AM, Nabeel Sulieman wrote:
>> Argh! (and hooray!)
>>
>> I started from scratch again, following the wiki instructions. I did only
>> one thing differently; put my data directory in /opt instead of /home/dev.
>> And now it works!
Giovanni
stored="true" means the data is stored in the index and can be returned with
the search results (see the 'fl' parameter). This is independent of
indexed="true", which means that you can store but not index a field:
Best regards
François
On Jun 30, 2012, at 9:57 AM, Giovanni Gherdovich wrote:
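For illustration, a minimal schema.xml field declaration that is stored but not indexed (the field name is hypothetical):

```xml
<field name="summary" type="string" indexed="false" stored="true"/>
```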
n 19, 2012, at 9:03 AM, Bruno Mannina wrote:
> Linux Ubuntu :) since 2 months ! so I'm a new in this world :)
>
> Le 19/06/2012 15:01, François Schiettecatte a écrit :
>> Well that depends on the platform you are on, you did not mention that.
>>
>> If you
mes during the process but How can I check
> IO HDD ?
>
> Le 19/06/2012 14:13, François Schiettecatte a écrit :
>> Just a suggestion, you might want to monitor CPU usage and disk I/O, there
>> might be a bottleneck.
>>
>> Cheers
>>
>> François
>>
Just a suggestion, you might want to monitor CPU usage and disk I/O, there
might be a bottleneck.
Cheers
François
On Jun 19, 2012, at 7:07 AM, Bruno Mannina wrote:
> Actually -Xmx512m and no effect
>
> Concerning maxFieldLength, no problem it's commented
>
> Le 19/06/2012 13:02, Erick Erick
FWIW it looks like this feature has been enabled by default since JDK 6 Update
23:
http://blog.juma.me.uk/2008/10/14/32-bit-or-64-bit-jvm-how-about-a-hybrid/
François
On Mar 15, 2012, at 6:39 AM, Husain, Yavar wrote:
> Thanks a ton.
>
> From: L
Ola
Here is what I have for this:
##
#
# Log4J configuration for SOLR
#
# http://wiki.apache.org/solr/SolrLogging
#
#
# 1) Download LOG4J:
# http://logging.apache.org/log4j/1.2/
# http://logging.apache.org/log4j/1.2/download.h
You could take a look at this:
http://www.let.rug.nl/vannoord/TextCat/
Will probably require some work to integrate/implement through
François
On Feb 20, 2012, at 3:37 AM, bing wrote:
> I have looked into the TikaCLI with -language option, and learned that Tika
> can output only the la
Have you tried checking any logs?
Have you tried identifying a file which did not make it in and submitting just
that one and seeing what happens?
François
On Feb 9, 2012, at 10:37 AM, Rong Kang wrote:
>
> Yes, I put all file in one directory and I have tested file names using
> code.
>
Anderson
I would say that this is highly unlikely, but you would need to pay attention
to how they are generated, this would be a good place to start:
http://en.wikipedia.org/wiki/Universally_unique_identifier
Cheers
François
On Feb 8, 2012, at 1:31 PM, Anderson vasconcelos wrote:
>
Using ReversedWildcardFilterFactory will double the size of your dictionary
(more or less), maybe the drop in performance that you are seeing is a result
of that?
François
On Jan 17, 2012, at 9:01 PM, Shyam Bhaskaran wrote:
> Hi,
>
> For reverse indexing we are using the ReversedWildcardFilte
Johnny
What you are going to want to do is boost the artist field with respect to the
others, for example using edismax my 'qf' parameter is:
number^5 title^3 default
so hits in the number field get a five-fold boost and hits in the title field
get a three-fold boost. In your case you
About the search 'referal_url:*www.someurl.com*', having a wildcard at the
start will cause a dictionary scan for every term you search on unless you use
ReversedWildcardFilterFactory. That could be the cause of your slowdown if you
are I/O bound, and even if you are CPU bound for that matter.
I am not an expert on this but the oom-killer will kill off the process
consuming the greatest amount of memory if the machine runs out of memory, and
you should see something to that effect in the system log, /var/log/messages I
think.
François
On Dec 14, 2011, at 2:54 PM, Adolfo Castro Menna
You might try the snowball stemmer too, I am not sure how closely that will fit
your requirements though.
Alternatively you could use synonyms.
François
On Nov 29, 2011, at 1:08 AM, mina wrote:
> thank you for your answer.i read it and i use this filter in my schema.xml in
> solr:
>
>
>
> b
It won't and depending on how your analyzer is set up the terms are most likely
stemmed at index time.
You could create a separate field for unstemmed terms though, or use a less
aggressive stemmer such as EnglishMinimalStemFilterFactory.
François
On Nov 29, 2011, at 12:33 PM, Robert Brown wro
It looks like you are using the plural stemmer, you might want to look into
using the Porter stemmer instead:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming
François
On Nov 28, 2011, at 9:14 AM, mina wrote:
> I use solr 3.3,I want solr index words with their suffixes. whe
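For illustration, a sketch of a field type wired up with the Porter stemmer (the type name is made up; the tokenizer and filter classes are the stock Solr ones):

```xml
<fieldType name="text_porter" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```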
Wouldn't 'diseases AND water' or '+diseases +water' return you that result? Or
you could search on 'water' while filtering on 'diseases'.
Or am I missing something here?
François
On Nov 8, 2011, at 4:19 PM, sharnel pereira wrote:
> Hi,
>
> I have 10k records indexed using solr 1.4
>
> We hav
simply query for "Solr". This is
> what's Solr made for. :)
>
> -Kuli
>
> Am 01.11.2011 13:24, schrieb François Schiettecatte:
>> Arshad
>>
>> Actually it is available, you need to use the ReversedWildcardFilterFactory
>> which I am sure you
Arshad
Actually it is available, you need to use the ReversedWildcardFilterFactory
which I am sure you can Google for.
Solr and SQL address different problem sets with some overlaps but there are
significant differences between the two technologies. Actually '%Solr%' is a
worse case for SQL bu
Erik
I would complement the date with default values as you suggest and store a
boolean flag indicating whether the date was complete or not, or store the
original date if it is not complete which would probably be better because the
presence of that data would tell you that the original date w
You have not said how big your index is but I suspect that allocating 13GB for
your 20 cores is starving the OS of memory for caching file data. Have you
tried 6GB with 20 cores? I suspect you will see the same performance as 6GB &
10 cores.
Generally it is better to allocate just enough memory
Wildcard terms are not analyzed, so your synonyms.txt may come into play here;
have you checked the analysis for deniz* ?
François
On Sep 7, 2011, at 10:08 PM, deniz wrote:
> well yea you are right... i realised that lack of detail issue here... so
> here it comes...
>
>
> This is from my sche
My memory of this is a little rusty but isn't mmap also limited by mem + swap
on the box? What does 'free -g' report?
François
On Sep 7, 2011, at 12:25 PM, Rich Cariens wrote:
> Ahoy ahoy!
>
> I've run into the dreaded OOM error with MMapDirectory on a 23G cfs compound
> index segment file. Th
I note that there is a full download option available, might be easier than
crawling.
François
On Sep 4, 2011, at 9:56 AM, Markus Jelsma wrote:
> Hi,
>
> Solr is a search engine, not a crawler. You can use Apache Nutch to crawl
> your
> site and have it indexed in Solr.
>
> Cheers,
>
>> Hi
Satish
You don't say which platform you are on but have you tried links (with ln on
linux/unix) ?
François
On Aug 31, 2011, at 12:25 AM, Satish Talim wrote:
> I have 1000's of cores and to reduce the cost of loading unloading
> schema.xml, I have my solr.xml as mentioned here -
> http://wiki.a
ding and
> encodes apropriatly. this should be a common solr problem if all search
> engines treat utf-8 that way, right?
>
> Any ideas how to fix that? Is there maybe a special solr functionality for
> this?
>
> 2011/8/27 François Schiettecatte
>
>> Merlin
>>
>
Merlin
Ü encodes to two characters in utf-8 (C39C), and one in iso-8859-1 (%DC) so it
looks like there is a charset mismatch somewhere.
Cheers
François
On Aug 27, 2011, at 6:34 AM, Merlin Morgenstern wrote:
> Hello,
>
> I am having problems with searches that are issued from spiders that
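The mismatch is easy to demonstrate with a quick sketch, percent-encoding 'Ü' under each charset:

```python
from urllib.parse import quote

# U+00DC is two bytes in UTF-8 (C3 9C) but one byte in ISO-8859-1 (DC),
# so the same query string differs depending on which charset the client used.
print(quote("Ü", encoding="utf-8"))       # %C3%9C
print(quote("Ü", encoding="iso-8859-1"))  # %DC
```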
It sounds to me like you are looking for HTTP persistent connections
(connection keep-alive as opposed to close), and a singleton object. This would
be outside SOLR per se.
A few caveats though: I am not sure if tomcat supports keep-alive, and I am not
sure how SOLR deals with multiple requests co
Assuming you are running on Linux, you might want to check /var/log/messages
too (the location might vary), I think the kernel logs forced process
termination there. I recall that the kernel usually picks the process
consuming the most memory; there may be other factors involved too.
Franç
Indeed, the analysis will show if the term is a stop word; the term gets
removed by the stop filter, and turning on verbose output shows that.
François
On Jul 31, 2011, at 6:27 PM, Shashi Kant wrote:
> Check your Stop words list
> On Jul 31, 2011 6:25 PM, "François Schiettecatte"
That seems a little far fetched, have you checked your analysis?
François
On Jul 31, 2011, at 4:58 PM, randohi wrote:
> One of our clients (a hot girl!) brought this to our attention:
> In this document there are many f* words:
>
> http://sec.gov/Archives/edgar/data/1474227/00014742271032/
I have not seen this mentioned anywhere, but I found a useful 'trick' to
restart solr without having to restart tomcat. All you need to do is 'touch'
the solr.xml in the solr.home directory. It can take a few seconds but solr
will restart and reload any config.
Cheers
François
On Jul 27, 201
Note that the Qtime in the response packet is the search, exclusive of
> assembling the response so that's probably a good number to measure.
>
> Best
> Erick
>
> On Fri, Jul 8, 2011 at 8:01 AM, jame vaalet wrote:
>> i would prefer every setting to be in its defa
I get slf4j-log4j12-1.6.1.jar from
http://www.slf4j.org/dist/slf4j-1.6.1.tar.gz, it is what interfaces slf4j to
log4j, you will also need to add log4j-1.2.16.jar to WEB-INF/lib.
François
On Jul 26, 2011, at 3:40 PM, O. Klein wrote:
>
> François Schiettecatte wrote:
>>
>&
FWIW, here is the process I follow to create a log4j aware version of the
apache solr war file and the corresponding lo4j.properties files.
Have fun :)
François
##
#
# Log4J configuration for SOLR
#
# http://wiki.apache.org/solr/Sol
Adding to my previous reply, I just did a quick check on the 'text_en' and
'text_en_splitting' field types and they both strip leading '#'.
Cheers
François
On Jul 22, 2011, at 10:49 AM, Shawn Heisey wrote:
> On 7/22/2011 8:34 AM, Jason Toy wrote:
>> How does one search for words with character
Check your analyzers to make sure that these characters are not getting
stripped out in the tokenization process, the url for 3.3 is somewhere along
the lines of:
http://localhost/solr/admin/analysis.jsp?highlight=on
And you should indeed be searching on "\#test".
François
On Jul 2
You need to do something like this in Tomcat's ./conf/server.xml file:
See 'URIEncoding' in http://tomcat.apache.org/tomcat-7.0-doc/config/http.html
Note that this will assume that the encoding of the data is in utf-8 if (and
ONLY if) the charset parameter is not set in the HTTP request
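A sketch of the Connector attribute in question (the port, protocol and timeout are whatever your install already uses):

```xml
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           URIEncoding="UTF-8"/>
```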
I think anything but a 200 OK means it is dead, like the proverbial parrot :)
François
On Jul 19, 2011, at 7:42 AM, Romi wrote:
> But the problem is when solr server is not runing
> *"http://host:port/solr/admin/ping"*
>
> will not give me any json response
> then how will i get the status :(
>
Easy, the hyphen is out on its own (with spaces on either side) and is probably
getting removed from the search by the tokenizer. Check your analysis.
François
On Jul 14, 2011, at 6:05 AM, roySolr wrote:
> It looks like it's still not working.
>
> I send this to SOLR: q=arsenal \- london
>
>
http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
http://wiki.apache.org/solr/SolrQuerySyntax
François
On Jul 13, 2011, at 1:29 PM, GAURAV PAREEK wrote:
> Hello,
>
> What are wildcards we can use with the SOLR ?
>
> Regards,
> Gaurav
You just need to provide a second sort field along the lines of:
sort=score desc, author desc
François
On Jul 12, 2011, at 6:13 AM, Lox wrote:
> Hi,
>
> In the case where two or more documents are returned with the same score, is
> there a way to tell Solr to sort them alphabetically?
Hi
I don't think that anyone has run such benchmarks; in fact this topic came up
two weeks ago and I volunteered to do it because I have some spare time this
week, so I am going to run some benchmarks this weekend and report back.
The machine I have to do this on is a core i7 960, 24GB,
Celso Pinto wrote:
>> Hi François,
>>
>> it is indeed being stemmed, thanks a lot for the heads up. It appears
>> that stemming is also configured for the query so it should work just
>> the same, no?
>>
>> Thanks again.
>>
>> Regards,
&
I would run that word through the analyzer, I suspect that the word 'teste' is
being stemmed to 'test' in the index, at least that is the first place I would
check.
François
On Jun 30, 2011, at 2:21 PM, Celso Pinto wrote:
> Hi everyone,
>
> I'm having some trouble figuring out why a query wit
Indeed, I find the Porter stemmer to be too 'aggressive' for my taste, I prefer
the EnglishMinimalStemFilterFactory, with the caveat that it depends on your
data set.
Cheers
François
On Jun 29, 2011, at 6:21 AM, Ahmet Arslan wrote:
>> Hi, when i query for "elegant" in
>> solr i get results fo
wrote:
> Thanks François Schiettecatte, information you provided is very helpful.
> i need to know one more thing, i downloaded one of the given dictionary but
> it contains many files, do i need to add all this files data in to
> synonyms.text ??
>
> -
> Thanks & Regard
work on it, as there are some other low hanging fruits I've to
>>> capture. Will share my thoughts soon.
>>>
>>>
>>> *Pranav Prakash*
>>>
>>> "temet nosce"
>>>
>>> Twitter <http://twitter.com/pranavprakash>
le <http://www.google.com/profiles/pranny>
>
>
> 2011/6/28 François Schiettecatte
>
>> Maybe there is a way to get Solr to reject documents that already exist in
>> the index but I doubt it, maybe someone else can chime in here. You
>> could do a search for
> Since I am using SOLR as index engine Only and using Riak(key-value
> storage) as storage engine, I dont want to do the overwrite on duplicate.
> I just need to discard the duplicates.
>
>
>
> 2011/6/28 François Schiettecatte
>
>> Create a hash from the url an
Create a hash from the url and use that as the unique key, md5 or sha1 would
probably be good enough.
Cheers
François
On Jun 28, 2011, at 7:29 AM, Mohammad Shariq wrote:
> I also have the problem of duplicate docs.
> I am indexing news articles, Every news article will have the source URL,
> I
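A sketch of deriving the uniqueKey that way (the field layout is made up); because duplicate URLs produce the same id, re-adding the same article overwrites the existing doc rather than duplicating it:

```python
import hashlib

def doc_id(url: str) -> str:
    # md5 of the source URL; identical URLs always yield the same id.
    return hashlib.md5(url.encode("utf-8")).hexdigest()

doc = {"id": doc_id("http://example.com/news/story-1"), "title": "Some headline"}
print(doc["id"] == doc_id("http://example.com/news/story-1"))  # True
```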
Well you need to find word lists and/or a thesaurus.
This is one place to start:
http://wordlist.sourceforge.net/
I used the US/UK English word list for my synonyms for an index I have because
it contains both US and UK English terms; the list lacks some medical terms
though, so we just
Wayne
I am not sure what you mean by 'changing the record'.
One option would be to implement something like the synonyms filter to generate
the TC for SC when you index the document, which would index both the TC and
the SC in the same location. That way your users would be able to search with
Mike
I would be very interested in the answer to that question too. My hunch is that
the answer is no too. I have a few text databases that range from 200MB to
about 60GB with which I could run some tests. I will have some downtime in
early July and will post results.
From what I can tell the
That is correct, but you only need to commit; optimize is not a requirement
here.
François
On Jun 18, 2011, at 11:54 PM, Mohammad Shariq wrote:
> I have define in my solr and Deleting the docs from solr using
> this uniqueKey.
> and then doing optimization once in a day.
> is this right way to
on a project with 30+
normalized tables, but only 4 cores.
Perhaps describing what you are trying to achieve would give us greater insight
and thus be able to make more concrete recommendation?
Cheers
François
On Jun 18, 2011, at 2:36 PM, shacky wrote:
> Il 18 giugno 2011 20:27, Franç
Sure.
François
On Jun 18, 2011, at 2:25 PM, shacky wrote:
> 2011/6/15 Edoardo Tosca :
>> Try to use multiple cores:
>> http://wiki.apache.org/solr/CoreAdmin
>
> Can I do concurrent searches on multiple cores?
'd keep the default
> settings. My real issue is why are not query keywords treated as a
> set?<http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201106.mbox/%3CBANLkTikHunhyWc2WVTofRYU4ZW=c8oe...@mail.gmail.com%3E>
> 2011/6/18 François Schiettecatte
>
>> What do you have
What do you have set up for stemming?
François
On Jun 18, 2011, at 8:00 AM, Gabriele Kahlout wrote:
> Hello,
>
> Debugging query results I find that:
> paste
> content:past
>
> Now paste and past are two different words. Why does Solr not consider
> that? How do I make it?
>
> --
> Regards,
I am assuming that you are running on Linux here; I have found atop to be very
useful to see what is going on.
http://freshmeat.net/projects/atop/
dstat is also very useful but needs a little more work to 'decode'.
Obviously there is contention going on, you just need to figure out
I think you will need to provide more information than this, no-one on this
list is omniscient AFAIK.
François
On Jun 14, 2011, at 10:44 AM, Denis Kuzmenok wrote:
> Hi.
>
> I've debugged search on test machine, after copying to production server
> the entire directory (entire solr director
Underscores and dashes are fine, but I would think that colons (:) are verboten.
François
On Jun 4, 2011, at 9:49 PM, Jamie Johnson wrote:
> Is there a list anywhere detailing field name restrictions. I imagine
> fields containing periods (.) are problematic if you try to use that field
> when