Re: Unsubscribe me

2015-06-08 Thread François Schiettecatte
Please follow instructions here: http://lucene.apache.org/solr/resources.html F. > On Jun 8, 2015, at 1:06 AM, Dylan wrote: > > On 30 May 2015 12:08, "Lalit Kumar 4" wrote: > >> Please unsubscribe me as well >> >> On May 30, 2015 15:23, Neha Jatav wrote: >> Unsubscribe me >>

Re: Unsubscribe me

2015-05-30 Thread François Schiettecatte
Quoting Erik from two days ago: Please follow the instructions here: http://lucene.apache.org/solr/resources.html. Be sure to use the exact same e-mail you used to subscribe. > On May 30, 2015, at 6:07 AM, Lalit Kumar 4 wrote: > > Please unsubscribe me as well > > On May 30, 2015 15:23, Neh

Re: YAJar

2015-05-26 Thread François Schiettecatte
can use 18.0. Simple really. François > On May 26, 2015, at 10:30 AM, Robust Links wrote: > > i can't run 14.0.1. that is the problem. 14 does not have the interfaces i > need > > On Tue, May 26, 2015 at 10:28 AM, François Schiettecatte < > fschietteca...@gmail.c

Re: YAJar

2015-05-26 Thread François Schiettecatte
Run whatever tests you want with 14.0.1, replace it with 18.0, rerun the tests and compare. François > On May 26, 2015, at 10:25 AM, Robust Links wrote: > > by "dumping" you mean recompiling solr with guava 18? > > On Tue, May 26, 2015 at 10:22 AM, François Schi

Re: YAJar

2015-05-26 Thread François Schiettecatte
Have you tried dumping guava 14.0.1 and using 18.0 with Solr? I did a while ago and it worked fine for me. François > On May 26, 2015, at 10:11 AM, Robust Links wrote: > > i have a minhash logic that uses guava 18.0 method that is not in guava > 14.0.1. This minhash logic is a separate maven p

Re: how to debug solr performance degradation

2015-02-24 Thread François Schiettecatte
Rebecca You don’t want to give all the memory to the JVM. You want to give it just enough for it to work optimally and leave the rest of the memory for the OS to use for caching data. Giving the JVM too much memory can result in worse performance because of GC. There is no magic formula to figu

Re: American British Dictionary for Solr

2015-02-12 Thread François Schiettecatte
Dinesh See this: http://wordlist.aspell.net/varcon/ You will need to do some work to convert to a SOLR friendly format though. Cheers François > On Feb 12, 2015, at 12:22 AM, dinesh naik wrote: > > Hi , > We are looking for a dictionary to support American/British English synonym.

Re: Solr: How to delete a document

2014-09-13 Thread François Schiettecatte
How about adding 'expungeDeletes=true' as well as 'commit=true'? François On Sep 13, 2014, at 4:09 PM, FiMka wrote: > Hi guys, could you say how to delete a document in Solr? After I delete a > document it still persists in the search results. For example there is the > following document saved

Re: Date field related query

2014-09-02 Thread François Schiettecatte
How about : datefield:[NOW-1DAY/DAY TO *] François On Sep 2, 2014, at 6:54 AM, Aman Tandon wrote: > Hi, > > I did it using this, fq=datefield:[2014-09-01T23:59:59Z TO > 2014-09-02T23:59:59Z]. > Correct me if i am wrong. > > Is there any way to find this using the NOW? > > > With Re
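
A minimal sketch of how the suggested date-math filter could be sent as a request parameter. The field name `datefield` comes from the thread; the helper name and the `q=*:*` match-all query are assumptions made for illustration only.

```python
from urllib.parse import urlencode, parse_qs


def build_recent_docs_params(date_field="datefield"):
    """Return URL-encoded Solr parameters for 'since the start of yesterday'.

    NOW-1DAY/DAY subtracts one day from the current time and then rounds
    down to the start of that day; '*' leaves the upper bound open.
    """
    return urlencode({
        "q": "*:*",  # assumed match-all query, not from the thread
        "fq": f"{date_field}:[NOW-1DAY/DAY TO *]",
    })


params = build_recent_docs_params()
print(params)
```

Encoding the parameter with `urlencode` avoids problems with the spaces, brackets and slash in the range expression when it is appended to a select URL.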

Re: Random OOM Exceptions

2014-08-14 Thread François Schiettecatte
I would also get some metrics when SOLR is doing nothing, the JVM does do work in the background and looking at the memory graph in VisualVM will show a nice sawtooth. François On Aug 14, 2014, at 1:16 PM, Erick Erickson wrote: > bq: I just don’t know why Solr is suddenly going nuts. > > Hm

Re: Character encoding problems

2014-07-29 Thread François Schiettecatte
Hi If you are seeing " appelé au téléphone" in the browser, I would guess that the data is being rendered in UTF-8 by your server and the content type of the html is set to iso-8859-1 or not being set and your browser is defaulting to iso-8859-1. You can force the encoding to utf-8 in the

Re: Java heap space error

2014-07-24 Thread François Schiettecatte
A default garbage collector will be chosen for you by the VM, might help to get the stack trace to look at. François On Jul 24, 2014, at 10:06 AM, Ameya Aware wrote: > ooh ok. > > So you want to say that since i am using large heap but didnt set my > garbage collection, thats why i why gettin

Re: Garbage collection issue and RELOADing cores

2014-07-01 Thread François Schiettecatte
not if it is UNLOADed and then LOADed. It occurs whether G1, CMS or ParallelGC is used for garbage collection. I used JDK 1.7.0_60 and Tomcat 7.0.54 for the underlying layers. Not sure where to take it from here? Cheers François On Jun 16, 2014, at 4:50 PM, François Schiettecatte wrote:

Garbage collection issue and RELOADing cores

2014-06-16 Thread François Schiettecatte
Hi I am running into an interesting garbage collection issue and am looking for suggestions/thoughts. Because some word lists such as synonyms, plurals, protected words need to be updated on a regular basis I have to RELOAD a number of cores in order to 'pick up' the new lists. What I have

Re: Any way to view lucene files

2014-06-09 Thread François Schiettecatte
Just click the 'Releases' link: https://github.com/DmitryKey/luke/releases François On Jun 9, 2014, at 10:43 AM, Aman Tandon wrote: > No, Anyways thanks Alex, but where is the luke jar? > > With Regards > Aman Tandon > > > On Mon, Jun 9, 2014 at 6:54 AM, Alexandre Rafalovitch > wro

Re: OutOfMemoryError while merging large indexes

2014-04-08 Thread François Schiettecatte
Have you tried using: -XX:-UseGCOverheadLimit François On Apr 8, 2014, at 6:06 PM, Haiying Wang wrote: > Hi, > > We were trying to merge a large index (9GB, 21 million docs) into current > index (only 13MB), using mergeindexes command of CoreAdminHandler, but always > run into OOM e

Re: Reading Solr index

2014-04-07 Thread François Schiettecatte
Maybe you should try a more recent release of Luke: https://github.com/DmitryKey/luke/releases François On Apr 7, 2014, at 12:27 PM, azhar2007 wrote: > Hi All, > > I have a solr index which is indexed ins Solr.4.7.0. > > Ive attempted to open the index with Luke4.0.0 and also other v

Re: The word "no" in a query

2014-04-02 Thread François Schiettecatte
Have you looked at the debugging output? http://wiki.apache.org/solr/CommonQueryParameters#Debugging François On Apr 2, 2014, at 1:37 AM, Bob Laferriere wrote: > > I have built an commerce search engine. I am struggling with the word “no” in > queries. We have products that are “No S

Re: AND not as a boolean operator in Phrase

2014-03-25 Thread François Schiettecatte
Better to use '+A +B' rather than AND/OR, see: http://searchhub.org/2011/12/28/why-not-and-or-and-not/ François On Mar 25, 2014, at 10:21 PM, Koji Sekiguchi wrote: > (2014/03/26 2:29), abhishek jain wrote: >> hi friends, >> >> when i search for "A and B" it gives me result for A , B

Re: Solr cores across multiple machines

2013-12-17 Thread François Schiettecatte
Hi Why not copy the core directory instead of the data directory? The conf directory is very small and that would ensure that you don't get schema mismatch issues. If you are stuck with copying the data directory, then I would replace the data directory in the target core and reload that core,

Re: Stop/Restart Solr

2013-10-22 Thread François Schiettecatte
> and using jetty with solr here.. > > > On Tue, Oct 22, 2013 at 9:54 PM, François Schiettecatte < > fschietteca...@gmail.com> wrote: > >> A few more specifics about the environment would help, Windows/Linux/...? >> Jetty/Tomcat/...? >> >> Françoi

Re: Stop/Restart Solr

2013-10-22 Thread François Schiettecatte
f the remote machine someone will need to go and restart > the machine ... > > You can try use a kvm or other remote control system > > -- > Yago Riveiro > Sent with Sparrow (http://www.sparrowmailapp.com/?sig) > > > On Tuesday, October 22, 2013 at 5:46 PM, Françoi

Re: Stop/Restart Solr

2013-10-22 Thread François Schiettecatte
If you are on linux/unix, use the kill command. François On Oct 22, 2013, at 12:42 PM, Raheel Hasan wrote: > Hi, > > is there a way to stop/restart java? I lost control over it via SSH and > connection was closed. But the Solr (start.jar) is still running. > > thanks. > > -- > Regards, > Ra

Re: Solr timeout after reboot

2013-10-21 Thread François Schiettecatte
Well no, the OS is smarter than that, it manages file system cache along with other memory requirements. If applications need more memory then file system cache will likely be reduced. The command is a cheap trick to get the OS to fill the file system cache as quickly as possible, not sure how

Re: Exact Match Results

2013-10-21 Thread François Schiettecatte
Kumar You might want to look into the 'pf' parameter: https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser François On Oct 21, 2013, at 9:24 AM, kumar wrote: > I am querying solr for exact match results. But it is showing some other > results also. > > E

Re: Solr timeout after reboot

2013-10-21 Thread François Schiettecatte
To put the file data into file system cache which would make for faster access. François On Oct 21, 2013, at 8:33 AM, michael.boom wrote: > Hmm, no, I haven't... > > What would be the effect of this ? > > > > - > Thanks, > Michael > -- > View this message in context: > http://lucene.4

Re: Can I use app specific document id as the document id that Solr uses for internal purposes?

2013-10-06 Thread François Schiettecatte
Hi The approach I take is to store enough data in the SOLR index to render the results page, and go to the database if the user wants to view a document. Cheers François On Oct 6, 2013, at 9:45 AM, user 01 wrote: > @Gora: > you understood the schema correctly, but I can't believe it's strang

Re: setQuery in SolrJ

2013-09-02 Thread François Schiettecatte
Shouldn't the search be more like this if you are searching in the 'descricaoRoteiro' field: descricaoRoteiro:(BPS 8D BEACH*) or in your example you have a space in between 'descricaoRoteiro' and 'BPS': descricaoRoteiro:BPS 8D BEACH* François On Sep 2, 2013, at 8:08 AM, Dmitr

Re: Mandatory words search in SOLR

2013-05-13 Thread François Schiettecatte
Kamal You could also use the 'mm' parameter to require a minimum match, or you could prepend '+' to each required term. Cheers François On May 13, 2013, at 7:57 AM, Kamal Palei wrote: > Hi Rafał Kuć > I added q.op=AND as per you suggested. I see though some initial record > document contain

Bug in query parser?

2012-12-28 Thread François Schiettecatte
Hi Just ran into this bug while playing around with 3.6. Using edismax and entering a search like this "(text:foobar)" causes the query parser to mangle the query as shown by the results below. Adding a space after the first paren solves this. I checked 3.6.1 and get the same issue. I recall

Re: Indexing only on change

2012-11-24 Thread François Schiettecatte
I would create a hash of the document content and store that in SOLR along with any document info you wish to store. When a document is presented for indexing, hash that and compare to the hash of the stored document, index if they are different and skip if they are not. François On Nov 24,
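
The hash-and-compare idea above can be sketched roughly as follows. This is an illustration under stated assumptions: a plain dictionary stands in for the hash field stored in SOLR, and the function names are hypothetical.

```python
import hashlib


def content_hash(text):
    """Hash the document content; SHA-256 is an arbitrary choice here."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_indexing(doc_id, text, stored_hashes):
    """Return True if the document is new or its content has changed.

    stored_hashes is a stand-in for the hash stored alongside each
    document in the index (an assumption for this sketch).
    """
    new_hash = content_hash(text)
    if stored_hashes.get(doc_id) == new_hash:
        return False  # unchanged, skip indexing
    stored_hashes[doc_id] = new_hash  # remember the hash of what we index
    return True


hashes = {}
print(needs_indexing("doc-1", "hello world", hashes))    # first time seen
print(needs_indexing("doc-1", "hello world", hashes))    # same content
print(needs_indexing("doc-1", "hello world!", hashes))   # content changed
```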

Re: Is leading wildcard search turned on by default in Solr 3.6.1?

2012-11-12 Thread François Schiettecatte
I suspect it is just part of the wildcard handling, maybe someone can chime in here, you may need to catch this before it gets to SOLR. François On Nov 12, 2012, at 5:44 PM, johnmu...@aol.com wrote: > Thanks for the quick response. > > > So, I do not want to use ReversedWildcardFilterFactory,

Re: Is leading wildcard search turned on by default in Solr 3.6.1?

2012-11-12 Thread François Schiettecatte
John You can still use leading wildcards even if you don't have the ReversedWildcardFilterFactory in your analysis but it means you will be scanning the entire dictionary when the search is run which can be a performance issue. If you do use ReversedWildcardFilterFactory you won't have that perf

Re: MMapDirectory, demand paging, lazy evaluation, ramfs and the much maligned RAMDirectory (oh my!)

2012-10-24 Thread François Schiettecatte
Aaron The best way to make sure the index is cached by the OS is to just cat it on startup: cat `find /path/to/solr/index` > /dev/null Just make sure your index is smaller than RAM otherwise data will be rotated out. Memory mapping is built on the virtual memory system, and I suspect
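
For environments where the shell one-liner above is awkward to run, the same warm-up can be sketched in Python: read every file under the index directory once and discard the bytes, so the OS has a chance to keep them in its file system cache. The function name and chunk size are assumptions; as noted in the message, this only helps when the index is smaller than available RAM.

```python
import os


def warm_file_cache(index_dir, chunk_size=1024 * 1024):
    """Read every file under index_dir once and return the bytes read.

    The data itself is thrown away; the point is the side effect of the
    OS pulling the file contents into its file system cache.
    """
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            with open(os.path.join(root, name), "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

Calling `warm_file_cache("/path/to/solr/index")` at startup would be the rough equivalent of the `cat` command.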

Re: Solr and Tomcat - problem with unicode characters

2012-08-28 Thread François Schiettecatte
What is probably going on is that the response is not being interpreted as UTF-8 but as some other encoding. What are you using to display the response? François On Aug 28, 2012, at 8:08 AM, zehoss wrote: > Hi, > at the beginning I would like to sorry for my english. I hope my message > will

Re: recommended SSD

2012-08-23 Thread François Schiettecatte
You should check this at pcper.com: http://pcper.com/ssd-decoder http://pcper.com/content/SSD-Decoder-popup Specs for a wide range of SSDs. Best regards François On Aug 23, 2012, at 5:35 PM, Peyman Faratin wrote: > Hi > > Is there a SSD brand and spec that the community re

Re: The way to customize ranking?

2012-08-23 Thread François Schiettecatte
I would create two indices, one with your content and one with your ads. This approach would allow you to precisely control how many ads you pull back and how you merge them into the results, and you would be able to control schemas, boosting, default fields, etc for each index independently.

Re: Can't find solr.xml

2012-07-11 Thread François Schiettecatte
On Jul 11, 2012, at 2:52 PM, Shawn Heisey wrote: > On 7/2/2012 2:33 AM, Nabeel Sulieman wrote: >> Argh! (and hooray!) >> >> I started from scratch again, following the wiki instructions. I did only >> one thing differently; put my data directory in /opt instead of /home/dev. >> And now it works!

Re: difference between stored="false" and stored="true" ?

2012-06-30 Thread François Schiettecatte
Giovanni stored="true" means the data is stored in the index and can be returned with the search results (see the 'fl' parameter). This is independent of 'indexed'. Which means that you can store but not index a field: Best regards François On Jun 30, 2012, at 9:57 AM, Giovanni Gherdovich wrote:

Re: Indexation Speed?

2012-06-19 Thread François Schiettecatte
n 19, 2012, at 9:03 AM, Bruno Mannina wrote: > Linux Ubuntu :) since 2 months ! so I'm a new in this world :) > > Le 19/06/2012 15:01, François Schiettecatte a écrit : >> Well that depends on the platform you are on, you did not mention that. >> >> If you

Re: Indexation Speed?

2012-06-19 Thread François Schiettecatte
mes during the process but How can I check > IO HDD ? > > Le 19/06/2012 14:13, François Schiettecatte a écrit : >> Just a suggestion, you might want to monitor CPU usage and disk I/O, there >> might be a bottleneck. >> >> Cheers >> >> François >>

Re: Indexation Speed?

2012-06-19 Thread François Schiettecatte
Just a suggestion, you might want to monitor CPU usage and disk I/O, there might be a bottleneck. Cheers François On Jun 19, 2012, at 7:07 AM, Bruno Mannina wrote: > Actually -Xmx512m and no effect > > Concerning maxFieldLength, no problem it's commented > > Le 19/06/2012 13:02, Erick Erick

Re: Solr out of memory exception

2012-03-15 Thread François Schiettecatte
FWIW it looks like this feature has been enabled by default since JDK 6 Update 23: http://blog.juma.me.uk/2008/10/14/32-bit-or-64-bit-jvm-how-about-a-hybrid/ François On Mar 15, 2012, at 6:39 AM, Husain, Yavar wrote: > Thanks a ton. > > From: L

Re: Solr logging

2012-02-20 Thread François Schiettecatte
Ola Here is what I have for this: ## # # Log4J configuration for SOLR # # http://wiki.apache.org/solr/SolrLogging # # # 1) Download LOG4J: # http://logging.apache.org/log4j/1.2/ # http://logging.apache.org/log4j/1.2/download.h

Re: Development inside or outside of Solr?

2012-02-20 Thread François Schiettecatte
You could take a look at this: http://www.let.rug.nl/vannoord/TextCat/ Will probably require some work to integrate/implement though François On Feb 20, 2012, at 3:37 AM, bing wrote: > I have looked into the TikaCLI with -language option, and learned that Tika > can output only the la

Re: Help:Solr can't put all pdf files into index

2012-02-09 Thread François Schiettecatte
Have you tried checking any logs? Have you tried identifying a file which did not make it in and submitting just that one and seeing what happens? François On Feb 9, 2012, at 10:37 AM, Rong Kang wrote: > > Yes, I put all file in one directory and I have tested file names using > code. >

Re: Using UUID for uniqueId

2012-02-08 Thread François Schiettecatte
Anderson I would say that this is highly unlikely, but you would need to pay attention to how they are generated, this would be a good place to start: http://en.wikipedia.org/wiki/Universally_unique_identifier Cheers François On Feb 8, 2012, at 1:31 PM, Anderson vasconcelos wrote: >

Re: Question on Reverse Indexing

2012-01-17 Thread François Schiettecatte
Using ReversedWildcardFilterFactory will double the size of your dictionary (more or less), maybe the drop in performance that you are seeing is a result of that? François On Jan 17, 2012, at 9:01 PM, Shyam Bhaskaran wrote: > Hi, > > For reverse indexing we are using the ReversedWildcardFilte

Re: best query for one-box search string over multiple types & fields?

2012-01-15 Thread François Schiettecatte
Johnny What you are going to want to do is boost the artist field with respect to the others, for example using edismax my 'qf' parameter is: number^5 title^3 default so hits in the number field get a five-fold boost and hits in the title field get a three-fold boost. In your case you

Re: Doing url search in solr is slow

2012-01-09 Thread François Schiettecatte
About the search 'referal_url:*www.someurl.com*', having a wildcard at the start will cause a dictionary scan for every term you search on unless you use ReversedWildcardFilterFactory. That could be the cause of your slowdown if you are I/O bound, and even if you are CPU bound for that matter.

Re: Shutdown hook issue

2011-12-14 Thread François Schiettecatte
I am not an expert on this but the oom-killer will kill off the process consuming the greatest amount of memory if the machine runs out of memory, and you should see something to that effect in the system log, /var/log/messages I think. François On Dec 14, 2011, at 2:54 PM, Adolfo Castro Menna

Re: how index words with their perfix in solr?

2011-11-29 Thread François Schiettecatte
You might try the snowball stemmer too, I am not sure how closely that will fit your requirements though. Alternatively you could use synonyms. François On Nov 29, 2011, at 1:08 AM, mina wrote: > thank you for your answer.i read it and i use this filter in my schema.xml in > solr: > > > > b

Re: Don't snowball depending on terms

2011-11-29 Thread François Schiettecatte
It won't and depending on how your analyzer is set up the terms are most likely stemmed at index time. You could create a separate field for unstemmed terms though, or use a less aggressive stemmer such as EnglishMinimalStemFilterFactory. François On Nov 29, 2011, at 12:33 PM, Robert Brown wro

Re: how index words with their perfix in solr?

2011-11-28 Thread François Schiettecatte
It looks like you are using the plural stemmer, you might want to look into using the Porter stemmer instead: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming François On Nov 28, 2011, at 9:14 AM, mina wrote: > I use solr 3.3,I want solr index words with their suffixes. whe

Re: query within search results

2011-11-08 Thread François Schiettecatte
Wouldn't 'diseases AND water' or '+diseases +water' return you that result? Or you could search on 'water' while filtering on 'diseases'. Or am I missing something here? François On Nov 8, 2011, at 4:19 PM, sharnel pereira wrote: > Hi, > > I have 10k records indexed using solr 1.4 > > We hav

Re: Is SQL Like operator feature available in Apache Solr query

2011-11-01 Thread François Schiettecatte
simply query for "Solr". This is > what's Solr made for. :) > > -Kuli > > Am 01.11.2011 13:24, schrieb François Schiettecatte: >> Arshad >> >> Actually it is available, you need to use the ReversedWildcardFilterFactory >> which I am sure you

Re: Is SQL Like operator feature available in Apache Solr query

2011-11-01 Thread François Schiettecatte
Arshad Actually it is available, you need to use the ReversedWildcardFilterFactory which I am sure you can Google for. Solr and SQL address different problem sets with some overlaps but there are significant differences between the two technologies. Actually '%Solr%' is a worst case for SQL bu

Re: Uncomplete date expressions

2011-10-29 Thread François Schiettecatte
Erik I would complement the date with default values as you suggest and store a boolean flag indicating whether the date was complete or not, or store the original date if it is not complete which would probably be better because the presence of that data would tell you that the original date w

Re: drastic performance decrease with 20 cores

2011-09-26 Thread François Schiettecatte
You have not said how big your index is but I suspect that allocating 13GB for your 20 cores is starving the OS of memory for caching file data. Have you tried 6GB with 20 cores? I suspect you will see the same performance as 6GB & 10 cores. Generally it is better to allocate just enough memory

Re: synonyms.txt: different results on admin and on site..

2011-09-08 Thread François Schiettecatte
Wildcard terms are not analyzed, so your synonyms.txt may come into play here, have you checked the analysis for deniz* ? François On Sep 7, 2011, at 10:08 PM, deniz wrote: > well yea you are right... i realised that lack of detail issue here... so > here it comes... > > > This is from my sche

Re: MMapDirectory failed to map a 23G compound index segment

2011-09-07 Thread François Schiettecatte
My memory of this is a little rusty but isn't mmap also limited by mem + swap on the box? What does 'free -g' report? François On Sep 7, 2011, at 12:25 PM, Rich Cariens wrote: > Ahoy ahoy! > > I've run into the dreaded OOM error with MMapDirectory on a 23G cfs compound > index segment file. Th

Re: Solr and wikipedia for schools

2011-09-04 Thread François Schiettecatte
I note that there is a full download option available, might be easier than crawling. François On Sep 4, 2011, at 9:56 AM, Markus Jelsma wrote: > Hi, > > Solr is a search engine, not a crawler. You can use Apache Nutch to crawl > your > site and have it indexed in Solr. > > Cheers, > >> Hi

Re: shareSchema="true" - location of schema.xml?

2011-08-31 Thread François Schiettecatte
Satish You don't say which platform you are on but have you tried links (with ln on linux/unix) ? François On Aug 31, 2011, at 12:25 AM, Satish Talim wrote: > I have 1000's of cores and to reduce the cost of loading unloading > schema.xml, I have my solr.xml as mentioned here - > http://wiki.a

Re: Error while decoding %DC (Ü) from URL - results in ?

2011-08-29 Thread François Schiettecatte
ding and > encodes apropriatly. this should be a common solr problem if all search > engines treat utf-8 that way, right? > > Any ideas how to fix that? Is there maybe a special solr functionality for > this? > > 2011/8/27 François Schiettecatte > >> Merlin >> >

Re: Error while decoding %DC (Ü) from URL - results in ?

2011-08-27 Thread François Schiettecatte
Merlin Ü encodes to two bytes in utf-8 (%C3%9C), and one in iso-8859-1 (%DC) so it looks like there is a charset mismatch somewhere. Cheers François On Aug 27, 2011, at 6:34 AM, Merlin Morgenstern wrote: > Hello, > > I am having problems with searches that are issued from spiders that
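
The charset mismatch can be reproduced with Python's standard library. This sketch only illustrates the two encodings mentioned above; it makes no claim about how any particular servlet container decodes URLs.

```python
from urllib.parse import quote, unquote

# The same character percent-encoded under two different charsets.
utf8_form = quote("Ü", encoding="utf-8")         # two bytes
latin1_form = quote("Ü", encoding="iso-8859-1")  # one byte

# Decoding the ISO-8859-1 form as if it were UTF-8 fails: the lone byte
# 0xDC is not valid UTF-8, so it becomes the replacement character.
mismatched = unquote("%DC", encoding="utf-8", errors="replace")

# Decoding with the charset the client actually used works.
matched = unquote("%DC", encoding="iso-8859-1")

print(utf8_form, latin1_form, repr(mismatched), matched)
```

A client that sends `%DC` to a server expecting UTF-8 would see this kind of replacement character, which matches the "results in ?" symptom in the subject line.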

Re: SolrServer instances

2011-08-26 Thread François Schiettecatte
Sounds to me that you are looking for HTTP Persistent Connections (connection keep-alive as opposed to close), and a singleton object. This would be outside SOLR per se. A few caveats though, I am not sure if tomcat supports keep-alive, and I am not sure how SOLR deals with multiple requests co

Re: Solr 3.3 crashes after ~18 hours?

2011-08-02 Thread François Schiettecatte
Assuming you are running on Linux, you might want to check /var/log/messages too (the location might vary), I think the kernel logs forced process termination there. I recall that the kernel will usually pick the process consuming the most memory, there may be other factors involved too. Franç

Re: Solr can not index "F**K"!

2011-07-31 Thread François Schiettecatte
Indeed, the analysis will show if the term is a stop word, the term gets removed by the stop filter, turning on verbose output shows that. François On Jul 31, 2011, at 6:27 PM, Shashi Kant wrote: > Check your Stop words list > On Jul 31, 2011 6:25 PM, "François Schiettecatte"

Re: Solr can not index "F**K"!

2011-07-31 Thread François Schiettecatte
That seems a little far fetched, have you checked your analysis? François On Jul 31, 2011, at 4:58 PM, randohi wrote: > One of our clients (a hot girl!) brought this to our attention: > In this document there are many f* words: > > http://sec.gov/Archives/edgar/data/1474227/00014742271032/

Re: schema.xml changes, need re-indexing ?

2011-07-27 Thread François Schiettecatte
I have not seen this mentioned anywhere, but I found a useful 'trick' to restart solr without having to restart tomcat. All you need to do is 'touch' the solr.xml in the solr.home directory. It can take a few seconds but solr will restart and reload any config. Cheers François On Jul 27, 201

Re: performance variation with respect to the index size

2011-07-26 Thread François Schiettecatte
Note that the Qtime in the response packet is the search, exclusive of > assembling the response so that's probably a good number to measure. > > Best > Erick > > On Fri, Jul 8, 2011 at 8:01 AM, jame vaalet wrote: >> i would prefer every setting to be in its defa

Re: Spellcheck compounded words

2011-07-26 Thread François Schiettecatte
I get slf4j-log4j12-1.6.1.jar from http://www.slf4j.org/dist/slf4j-1.6.1.tar.gz, it is what interfaces slf4j to log4j, you will also need to add log4j-1.2.16.jar to WEB-INF/lib. François On Jul 26, 2011, at 3:40 PM, O. Klein wrote: > > François Schiettecatte wrote: >> >

Re: Spellcheck compounded words

2011-07-26 Thread François Schiettecatte
FWIW, here is the process I follow to create a log4j aware version of the apache solr war file and the corresponding log4j.properties files. Have fun :) François ## # # Log4J configuration for SOLR # # http://wiki.apache.org/solr/Sol

Re: problem searching on non standard characters

2011-07-22 Thread François Schiettecatte
Adding to my previous reply, I just did a quick check on the 'text_en' and 'text_en_splitting' field types and they both strip leading '#'. Cheers François On Jul 22, 2011, at 10:49 AM, Shawn Heisey wrote: > On 7/22/2011 8:34 AM, Jason Toy wrote: >> How does one search for words with character

Re: problem searching on non standard characters

2011-07-22 Thread François Schiettecatte
Check your analyzers to make sure that these characters are not getting stripped out in the tokenization process, the url for 3.3 is somewhere along the lines of: http://localhost/solr/admin/analysis.jsp?highlight=on And you should indeed be searching on "\#test". François On Jul 2

Re: POST VS GET and NON English Characters

2011-07-20 Thread François Schiettecatte
You need to do something like this in the ./conf/tomcat server.xml file: See 'URIEncoding' in http://tomcat.apache.org/tomcat-7.0-doc/config/http.html Note that this will assume that the encoding of the data is in utf-8 if (and ONLY if) the charset parameter is not set in the HTTP request

Re: How to find whether solr server is running or not

2011-07-19 Thread François Schiettecatte
I think anything but a 200 OK means it is dead like the proverbial parrot :) François On Jul 19, 2011, at 7:42 AM, Romi wrote: > But the problem is when solr server is not runing > *"http://host:port/solr/admin/ping"* > > will not give me any json response > then how will i get the status :( >
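
A rough sketch of that rule, treating a refused connection, a timeout or any non-200 status as "not running". The function name and the timeout value are assumptions; the ping path is the one quoted in the thread.

```python
import urllib.request


def solr_is_up(ping_url, timeout=2.0):
    """Return True only if the ping URL answers with HTTP 200.

    urlopen raises HTTPError for 4xx/5xx responses and URLError when the
    server cannot be reached; both are subclasses of OSError, so a single
    except clause covers "no response at all" as well.
    """
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```

For example, `solr_is_up("http://host:port/solr/admin/ping")` with a real host and port would return False when the server is down, rather than raising because no JSON came back.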

Re: - character in search query

2011-07-14 Thread François Schiettecatte
Easy, the hyphen is out on its own (with spaces on either side) and is probably getting removed from the search by the tokenizer. Check your analysis. François On Jul 14, 2011, at 6:05 AM, roySolr wrote: > It looks like it's still not working. > > I send this to SOLR: q=arsenal \- london > >

Re: Wildcard

2011-07-13 Thread François Schiettecatte
http://lucene.apache.org/java/2_9_1/queryparsersyntax.html http://wiki.apache.org/solr/SolrQuerySyntax François On Jul 13, 2011, at 1:29 PM, GAURAV PAREEK wrote: > Hello, > > What are wildcards we can use with the SOLR ? > > Regards, > Gaurav

Re: Result list order in case of ties

2011-07-12 Thread François Schiettecatte
You just need to provide a second sort field along the lines of: sort=score desc, author desc François On Jul 12, 2011, at 6:13 AM, Lox wrote: > Hi, > > In the case where two or more documents are returned with the same score, is > there a way to tell Solr to sort them alphabetically?

Re: performance variation with respect to the index size

2011-07-08 Thread François Schiettecatte
Hi I don't think that anyone has run such benchmarks, in fact this topic came up two weeks ago and I volunteered some time to do that because I have some spare time this week, so I am going to run some benchmarks this weekend and report back. The machine I have to do this is a core i7 960, 24GB,

Re: Wildcard search not working if full word is queried

2011-07-01 Thread François Schiettecatte
Celso Pinto wrote: >> Hi François, >> >> it is indeed being stemmed, thanks a lot for the heads up. It appears >> that stemming is also configured for the query so it should work just >> the same, no? >> >> Thanks again. >> >> Regards,

Re: Wildcard search not working if full word is queried

2011-06-30 Thread François Schiettecatte
I would run that word through the analyzer, I suspect that the word 'teste' is being stemmed to 'test' in the index, at least that is the first place I would check. François On Jun 30, 2011, at 2:21 PM, Celso Pinto wrote: > Hi everyone, > > I'm having some trouble figuring out why a query wit

Re: filters effect on search results

2011-06-29 Thread François Schiettecatte
Indeed, I find the Porter stemmer to be too 'aggressive' for my taste, I prefer the EnglishMinimalStemFilterFactory, with the caveat that it depends on your data set. Cheers François On Jun 29, 2011, at 6:21 AM, Ahmet Arslan wrote: >> Hi, when i query for "elegant" in >> solr i get results fo

Re: Include synonys in solr

2011-06-28 Thread François Schiettecatte
wrote: > Thanks François Schiettecatte, information you provided is very helpful. > i need to know one more thing, i downloaded one of the given dictionary but > it contains many files, do i need to add all this files data in to > synonyms.text ?? > > - > Thanks & Regard

Re: Removing duplicate documents from search results

2011-06-28 Thread François Schiettecatte
work on it, as there are some other low hanging fruits I've to >>> capture. Will share my thoughts soon. >>> >>> >>> *Pranav Prakash* >>> >>> "temet nosce" >>> >>> Twitter <http://twitter.com/pranavprakash>

Re: Removing duplicate documents from search results

2011-06-28 Thread François Schiettecatte
le <http://www.google.com/profiles/pranny> > > > 2011/6/28 François Schiettecatte > >> Maybe there is a way to get Solr to reject documents that already exist in >> the index but I doubt it, maybe someone else can chime in here. You >> could do a search for

Re: Removing duplicate documents from search results

2011-06-28 Thread François Schiettecatte
> Since I am using SOLR as index engine Only and using Riak(key-value > storage) as storage engine, I dont want to do the overwrite on duplicate. > I just need to discard the duplicates. > > > > 2011/6/28 François Schiettecatte > >> Create a hash from the url an

Re: Removing duplicate documents from search results

2011-06-28 Thread François Schiettecatte
Create a hash from the url and use that as the unique key, md5 or sha1 would probably be good enough. Cheers François On Jun 28, 2011, at 7:29 AM, Mohammad Shariq wrote: > I also have the problem of duplicate docs. > I am indexing news articles, Every news article will have the source URL, > I
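The suggestion above (hash the source URL and use the digest as the unique key) can be sketched as follows. This is a minimal illustration, not the poster's actual code: the function name `url_to_doc_id` is hypothetical, SHA-1 is chosen only because the message names it as "good enough", and stripping surrounding whitespace is an assumed normalization step.

```python
import hashlib


def url_to_doc_id(url: str) -> str:
    """Return a stable hex digest of a URL for use as a document's unique key.

    Hypothetical helper: the name and the whitespace stripping are
    assumptions made for illustration only.
    """
    normalized = url.strip()
    # SHA-1 yields a 40-character hex string; hashlib.md5 would work the same way.
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(url_to_doc_id("http://example.com/news/article-1"))
```

Because the same URL always maps to the same key, re-submitting an article produces a document with an identical unique key rather than a second copy under a different one.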

Re: Include synonys in solr

2011-06-28 Thread François Schiettecatte
Well, you need to find word lists and/or a thesaurus. This is one place to start: http://wordlist.sourceforge.net/ I used the US/UK English word list for my synonyms for an index I have because it contains both US and UK English terms; the list lacks some medical terms though so we just

Re: Searching in Traditional / Simplified Chinese Record

2011-06-20 Thread François Schiettecatte
Wayne I am not sure what you mean by 'changing the record'. One option would be to implement something like the synonyms filter to generate the TC for SC when you index the document, which would index both the TC and the SC in the same location. That way your users would be able to search with

Re: Extending Solr Highlighter to pull information from external source

2011-06-20 Thread François Schiettecatte
Mike I would be very interested in the answer to that question too. My hunch is that the answer is no too. I have a few text databases that range from 200MB to about 60GB with which I could run some tests. I will have some downtime in early July and will post results. From what I can tell the

Re: Is it true that I cannot delete stored content from the index?

2011-06-19 Thread François Schiettecatte
That is correct, but you only need to commit, optimize is not a requirement here. François On Jun 18, 2011, at 11:54 PM, Mohammad Shariq wrote: > I have define in my solr and Deleting the docs from solr using > this uniqueKey. > and then doing optimization once in a day. > is this right way to
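As a rough sketch of the point above (delete by unique key, then commit, with no optimize step), the two XML update messages could be built like this. The helper name `delete_by_id_xml` is hypothetical, and how the messages are sent (typically an HTTP POST to the core's update handler) is left out because the URL depends on the installation.

```python
from xml.sax.saxutils import escape


def delete_by_id_xml(doc_id: str) -> str:
    """Build a Solr XML delete-by-id message (hypothetical helper name)."""
    # escape() protects &, < and > inside the id value.
    return "<delete><id>" + escape(doc_id) + "</id></delete>"


# A commit makes the deletion visible to searchers; no optimize message is built.
COMMIT_XML = "<commit/>"

if __name__ == "__main__":
    print(delete_by_id_xml("doc-42"))
    print(COMMIT_XML)
```

Sending the delete message followed by the commit message is enough for the document to stop appearing in results.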

Re: Multiple indexes

2011-06-18 Thread François Schiettecatte
on a project with 30+ normalized tables, but only 4 cores. Perhaps describing what you are trying to achieve would give us greater insight and thus be able to make more concrete recommendation? Cheers François On Jun 18, 2011, at 2:36 PM, shacky wrote: > Il 18 giugno 2011 20:27, Franç

Re: Multiple indexes

2011-06-18 Thread François Schiettecatte
Sure. François On Jun 18, 2011, at 2:25 PM, shacky wrote: > 2011/6/15 Edoardo Tosca : >> Try to use multiple cores: >> http://wiki.apache.org/solr/CoreAdmin > > Can I do concurrent searches on multiple cores?

Re: Why does paste get parsed into past?

2011-06-18 Thread François Schiettecatte
'd keep the default > settings. My real issue is why are not query keywords treated as a > set?<http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201106.mbox/%3CBANLkTikHunhyWc2WVTofRYU4ZW=c8oe...@mail.gmail.com%3E> > 2011/6/18 François Schiettecatte > >> What do you have

Re: Why does paste get parsed into past?

2011-06-18 Thread François Schiettecatte
What do you have set up for stemming? François On Jun 18, 2011, at 8:00 AM, Gabriele Kahlout wrote: > Hello, > > Debugging query results I find that: > paste > content:past > > Now paste and past are two different words. Why does Solr not consider > that? How do I make it? > > -- > Regards,

Re: Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread François Schiettecatte
I am assuming that you are running on Linux here; I have found atop to be very useful to see what is going on. http://freshmeat.net/projects/atop/ dstat is also very useful but needs a little more work to 'decode'. Obviously there is contention going on, you just need to figure out

Re: Strange behavior

2011-06-14 Thread François Schiettecatte
I think you will need to provide more information than this, no-one on this list is omniscient AFAIK. François On Jun 14, 2011, at 10:44 AM, Denis Kuzmenok wrote: > Hi. > > I've debugged search on test machine, after copying to production server > the entire directory (entire solr director

Re: Solr Field name restrictions

2011-06-04 Thread François Schiettecatte
Underscores and dashes are fine, but I would think that colons (:) are verboten. François On Jun 4, 2011, at 9:49 PM, Jamie Johnson wrote: > Is there a list anywhere detailing field name restrictions. I imagine > fields containing periods (.) are problematic if you try to use that field > when
