Text search within facets?

2010-02-12 Thread chasiubao
Hello, Is it possible to do a text search within facets? Something that will return me what words solr used to gather my results and how many of those results were found. For example, if I have the following field: and it has docs that contain something like english bulldog french bulldog b

Re: How to reindex data without restarting server

2010-02-12 Thread Emad Mushtaq
Hi, Thanks! This is very useful :) :) On Fri, Feb 12, 2010 at 7:55 AM, Joe Calderon wrote: > if you use the core model via solr.xml you can reload a core without having > to restart the servlet container, > http://wiki.apache.org/solr/CoreAdmin > > On 02/11/2010 02:40 PM, Emad Mushtaq wrote:
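A minimal sketch of the core reload Joe describes, assuming the CoreAdmin handler is mounted at `/admin/cores` (the path is configurable in solr.xml) and a core named `core0`; both names are illustrative and not taken from the thread. It only builds the request URL; fetching it with any HTTP client triggers the reload.

```python
from urllib.parse import urlencode


def core_admin_url(base_url, action, **params):
    """Build a CoreAdmin request URL, e.g. .../admin/cores?action=RELOAD&core=core0."""
    # "action" goes first, followed by the action-specific parameters.
    query = urlencode({"action": action, **params})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


# Hypothetical host and core name, for illustration only.
reload_url = core_admin_url("http://localhost:8983/solr", "RELOAD", core="core0")
print(reload_url)
```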

EmbeddedSolrServer vs CommonsHttpSolrServer

2010-02-12 Thread dcdmailbox-info
Hi all, I am new to solr/solrj. I correctly started up the server example given in the distribution (apache-solr-1.4.0\example\solr), populated the index with the test data set, and successfully tested with an http query string via browser (e.g. http://localhost:8983/solr/select/?indent=on&q=video&fl=

inconsistency between analysis.jsp and actual search

2010-02-12 Thread Lukas Kahwe Smith
Hi I am indexing the name "FC St. Gallen" using the following type: Which according to analysis.jsp gets split into: f | fc | s | st | g | ga | gal | gall | gal

Re: inconsistency between analysis.jsp and actual search

2010-02-12 Thread Ahmet Arslan
> Which according to analysis.jsp gets split into: > f | fc | s | st | g | ga | gal | gall | galle | gallen > > So far so good. > > Now if I search for "fc st.gallen" according to > analysis.jsp it will search for: > fc | st | gallen > > But when I do a dismax search using the following handler:
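The field type definition was lost from the archived message, but the token list quoted above looks like the output of an edge n-gram filter applied after tokenizing and lowercasing. The sketch below is a simplified stand-in written under that assumption, not Solr's actual analysis chain; the function names are hypothetical.

```python
def edge_ngrams(token, min_size=1, max_size=10):
    """Return the leading prefixes of a token, from min_size up to max_size characters."""
    upper = min(len(token), max_size)
    return [token[:n] for n in range(min_size, upper + 1)]


def index_terms(tokens):
    """Lowercase each token and expand it into its edge n-grams."""
    terms = []
    for token in tokens:
        terms.extend(edge_ngrams(token.lower()))
    return terms


# "FC St. Gallen" after tokenizing on whitespace and punctuation.
terms = index_terms(["FC", "St", "Gallen"])
print(" | ".join(terms))
```

At query time no n-grams are generated in this model, which matches the `fc | st | gallen` query-side output mentioned in the message.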

Re: EmbeddedSolrServer vs CommonsHttpSolrServer

2010-02-12 Thread Ron Chan
I suspect this has something to do with the dataDir setting in the example's solrconfig.xml: ${solr.data.dir:./solr/data}. We use the example's solrconfig.xml as the base for our deployments and always comment this out; the default of having conf and data sitting under the solr home works wel

Local Solr Inconsistent results for radius

2010-02-12 Thread Emad Mushtaq
Hello, I have a question related to local solr. For certain locations (latitude, longitude), the spatial search does not work. Here is the query I try to make which gives me no results: q=*&qt=geo&sort=geo_distance asc&lat=33.718151&long=73.060547&radius=450 However if I make the same query wit

Re: inconsistency between analysis.jsp and actual search

2010-02-12 Thread Lukas Kahwe Smith
On 12.02.2010, at 11:17, Ahmet Arslan wrote: > analysis.jsp does not do actual query parsing. just shows produced tokens > step by step in analysis (charfilter, tokenizer, tokenfilter) phase. > "admin/analysis.jsp page will show you how your field is processed while > indexing and while querying

optimize is taking too much time

2010-02-12 Thread mklprasad
Hi, in my Solr I have 1,42,45,223 records, about 50 GB. Now when I am loading a new record and it tries to optimize the docs, it takes too much memory and time. Can anybody please tell me whether there is any property in Solr to get rid of this? Thanks in advance -- View this message in contex

Re: EmbeddedSolrServer vs CommonsHttpSolrServer

2010-02-12 Thread dcdmailbox-info
Yes, you are right. [code.2] works fine after commenting out the following lines in solrconfig.xml. Is this different behaviour from EmbeddedSolrServer correct? Or can it be considered a low priority bug? Thanks for your prompt reply! Dino. -- From: Ron Chan

Re: EmbeddedSolrServer vs CommonsHttpSolrServer

2010-02-12 Thread Erik Hatcher
When using EmbeddedSolrServer, you could simply set the solr.data.dir system property or launch your process from the same working directory where you are launching the HTTP version of Solr. Either of those should also work to alleviate this issue. Erik On Feb 12, 2010, at 5:36 AM
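To illustrate why the working directory matters here, the sketch below mimics how a `${name:default}` placeholder such as `${solr.data.dir:./solr/data}` resolves: the system property wins if it is set, otherwise the default after the colon is used, and a relative default is then interpreted against the process's current directory. This is a simplified model of the substitution, not Solr's own code; `resolve` is a hypothetical helper.

```python
import re

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, props):
    """Replace ${name:default} placeholders using props, falling back to the default."""
    def _substitute(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(f"no value and no default for property {name!r}")

    return _PLACEHOLDER.sub(_substitute, text)


setting = "${solr.data.dir:./solr/data}"
print(resolve(setting, {}))                                   # relative default
print(resolve(setting, {"solr.data.dir": "/var/solr/data"}))  # explicit property
```

With the property unset, the relative default points at different directories depending on where the process was launched, which is consistent with the two remedies Erik suggests.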

Re: EmbeddedSolrServer vs CommonsHttpSolrServer

2010-02-12 Thread Ron Chan
I don't think this is a bug; the default behaviour is for /data to sit under Solr home. There should be no need to use this parameter unless it is a special case. Not sure why it is like this in the example. - Original Message - From: dcdmailbox-i...@yahoo.it To: solr-user@lucene.apache.

Good literature on search basics

2010-02-12 Thread javaxmlsoapdev
Does anyone know good literature(web resources, books etc) on basics of search? I do have Solr 1.4 and Lucene books but wanted to go in more details on basics. Thanks, -- View this message in context: http://old.nabble.com/Good-literature-on-search-basics-tp27562021p27562021.html Sent from the

persistent cache

2010-02-12 Thread Tim Terlegård
Does Solr use some sort of a persistent cache? I do this 10 times in a loop: * start solr * create a core * execute warmup query * execute query with sort fields * stop solr Executing the query with sort fields takes 5-20 times longer the first iteration than the other 9 iterations. For

Re: persistent cache

2010-02-12 Thread Shalin Shekhar Mangar
2010/2/12 Tim Terlegård > Does Solr use some sort of a persistent cache? > > I do this 10 times in a loop: > * start solr > * create a core > * execute warmup query > * execute query with sort fields > * stop solr > > Executing the query with sort fields takes 5-20 times longer the first > i

Re: Dismax phrase queries

2010-02-12 Thread Shalin Shekhar Mangar
On Fri, Feb 12, 2010 at 6:06 AM, Jason Rutherglen < jason.rutherg...@gmail.com> wrote: > I'd like to boost an exact phrase match such as q="video poker" over > q=video poker. How would I do this using dismax? > > I tried pre-processing video poker into, video poker "video poker" > however that ju

Re: spellcheck

2010-02-12 Thread michaelnazaruk
I try to config spellcheck, but I still have this problem: Config: solr.FileBasedSpellChecker file spellings.txt UTF-8 ./spellcheckerFile false false 1 true file spellcheck Maybe I have thi

Re: Local Solr Inconsistent results for radius

2010-02-12 Thread Mauricio Scheffer
Hi Emad, I had the same issue ( http://old.nabble.com/Spatial---Local-Solr-radius-td26943608.html ), it seems that this happens only on eastern areas of the world. Try inverting the sign of all your longitudes, or translate all your longitudes to the west. Cheers, Mauricio On Fri, Feb 12, 2010 a

Re: inconsistency between analysis.jsp and actual search

2010-02-12 Thread Ahmet Arslan
> Anyways so how can I get "st.gallen" split into two terms > at query time? As you mentioned in your first mail, query st.gallen is already broken into two terms/words. But query parser constructs a phrase query. There was a discussion about this behaviour earlier. http://www.lucidimagination

Fwd: indexing: issue with default values

2010-02-12 Thread nabil rabhi
in the schema.xml I have fields with int type and default value, e.g.: but when a document has no value for the field "postal_code" at indexing, I get the following error: Posting file Immo.xml to http://localhost:8983/solr/update/ Error 500 HTTP ERROR: 500 For input string: "" java.lang.Numb

Re: persistent cache

2010-02-12 Thread Tim Terlegård
2010/2/12 Shalin Shekhar Mangar : > 2010/2/12 Tim Terlegård > >> Does Solr use some sort of a persistent cache? >> > Solr does not have a persistent cache. That is the operating system's file > cache at work. Aha, that's very interesting and seems to make sense. So is the primary goal of warmup

Re: Good literature on search basics

2010-02-12 Thread Jaco
See http://markmail.org/thread/z5sq2jr2a6eayth4 On 12 February 2010 12:14, javaxmlsoapdev wrote: > > Does anyone know good literature(web resources, books etc) on basics of > search? I do have Solr 1.4 and Lucene books but wanted to go in more > details > on basics. > > Thanks, > -- > View this

Re: indexing: issue with default values

2010-02-12 Thread Erik Hatcher
When a document has no value, are you still sending a postal_code field in your post to Solr? Seems like you are. Erik On Feb 12, 2010, at 8:12 AM, nabil rabhi wrote: in the schema.xml I have fileds with int type and default value exp: stored="true" default="0"/> but when a docume

Re: Dismax phrase queries

2010-02-12 Thread Jason Rutherglen
Was going to post that I more or less figured it out. Dismax handles this automatically with the ps parameter, which is different than the bs parameter... On Fri, Feb 12, 2010 at 3:48 AM, Shalin Shekhar Mangar wrote: > On Fri, Feb 12, 2010 at 6:06 AM, Jason Rutherglen < > jason.rutherg...@gmail.

Re: indexing: issue with default values

2010-02-12 Thread nabil rabhi
yes, sometimes the document has postal_code with no values , i still post it to solr 2010/2/12 Erik Hatcher > When a document has no value, are you still sending a postal_code field in > your post to Solr? Seems like you are. > >Erik > > > On Feb 12, 2010, at 8:12 AM, nabil rabhi wrote:

Re: Collating results from multiple indexes

2010-02-12 Thread Jan Høydahl / Cominvent
Really? The last time I looked at AIE, I am pretty sure there was Solr core msgs in the logs, so I assumed it used EmbeddedSolr or something. But I may be mistaken. Anyone from Attivio here who can elaborate? Is the join stuff at Lucene level or on top of multiple Solr cores or what? -- Jan Høy

Re: indexing: issue with default values

2010-02-12 Thread Erik Hatcher
That would be the problem then, I believe. Simply don't post a value to get the default value to work. Erik On Feb 12, 2010, at 10:18 AM, nabil rabhi wrote: yes, sometimes the document has postal_code with no values , i still post it to solr 2010/2/12 Erik Hatcher When a documen
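Erik's advice can be applied on the client side by dropping empty fields before posting. The following is a minimal sketch that builds a Solr XML `<add>` message and skips any field whose value is missing or blank, so the schema default can take effect; the document contents and the `build_add_xml` name are made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(doc):
    """Build an <add> message, omitting fields with no value so schema defaults apply."""
    fields = []
    for name, value in doc.items():
        # An empty <field> for an int type is what triggers the NumberFormatException.
        if value is None or str(value).strip() == "":
            continue
        fields.append(f"<field name={quoteattr(name)}>{escape(str(value))}</field>")
    return "<add><doc>" + "".join(fields) + "</doc></add>"


# Hypothetical document with an empty postal_code.
xml = build_add_xml({"id": "42", "city": "Paris", "postal_code": ""})
print(xml)
```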

Re: indexing: issue with default values

2010-02-12 Thread nabil rabhi
Thanks Erik, that was very helpful 2010/2/12 Erik Hatcher > That would be the problem then, I believe. Simply don't post a value to > get the default value to work. > >Erik > > > On Feb 12, 2010, at 10:18 AM, nabil rabhi wrote: > > yes, sometimes the document has postal_code with no va

Re: persistent cache

2010-02-12 Thread Tommy Chheng
One solution is to add the persistent cache with memcache at the application layer. -- Tommy Chheng Programmer and UC Irvine Graduate Student Twitter @tommychheng http://tommy.chheng.com On 2/12/10 5:19 AM, Tim Terlegård wrote: 2010/2/12 Shalin Shekhar Mangar: 2010/2/12 Tim Terlegård Do

Re: Text search within facets?

2010-02-12 Thread Ahmet Arslan
> For example, if I have the following field: > > stored="true"/> > > and it has docs that contain something like > > english bulldog > french bulldog > bichon frise > > If I search for "english bulldog" and facet on "dog", I > will get the > following: > > 135 > 23 > 12 That's strange. The q

expire/delete documents

2010-02-12 Thread Matthieu Labour
Hi, Is there a way for solr or lucene to expire documents based on a field in a document? Let's say that I have a createTime field whose type is date, can I set a policy in schema.xml for solr to delete the documents older than X days? Thank you

Re: Local Solr Inconsistent results for radius

2010-02-12 Thread Emad Mushtaq
Hello Mauricio, Do you know why such a problem occurs. Has it to do with certain latitudes, longitudes. If so why is it happening. Is it a bug in local solr? On Fri, Feb 12, 2010 at 5:50 PM, Mauricio Scheffer < mauricioschef...@gmail.com> wrote: > Hi Emad, > > I had the same issue ( > http://old

Re: expire/delete documents

2010-02-12 Thread Mat Brown
You could easily have a scheduled job that ran delete by query to remove posts older than a certain date... On Fri, Feb 12, 2010 at 13:00, Matthieu Labour wrote: > Hi, Is there a way for solr or lucene to expire documents based on a field in a > document. Let's say that I have a createTime field w

Re: Deleting spelll checker index

2010-02-12 Thread darniz
Hi guys, opening this thread again. I need to get around this issue. I have a spellcheck field defined and I am copying two fields, make and model, to this field. I have buildoncommit and buildonoptimize set to true, hence when I index data and try to search for a word accod I get back suggestion ac

Re: Local Solr Inconsistent results for radius

2010-02-12 Thread Mauricio Scheffer
Yes, it seems to be a bug, at least with the code you and I are using. If you don't need to search across the whole globe, try translating your longitudes as I suggested. On Fri, Feb 12, 2010 at 3:04 PM, Emad Mushtaq wrote: > Hello Mauricio, > > Do you know why such a problem occurs. Has it to do

Re: persistent cache

2010-02-12 Thread Tom Burton-West
Hi Tim, We generally run about 1600 cache-warming queries to warm up the OS disk cache and the Solr caches when we mount a new index. Do you have/expect phrase queries? If you don't, then you don't need to get any position information into your OS disk cache. Our position information takes ab

Has anyone prepared a general purpose synonyms.txt for search engines

2010-02-12 Thread Emad Mushtaq
Hi, I was wondering if anyone has prepared a synonyms.txt for general purpose search engines, that can be shared. If not could you refer me to places where such a synonym list or thesaurus can be found. Synonyms for search engines are different from the regular thesaurus. Any help would be highly

Re: Has anyone prepared a general purpose synonyms.txt for search engines

2010-02-12 Thread Julian Hille
Hi, at openthesaurus.org or .com you can find a MySQL version of synonyms; you just have to join it to fit the synonym schema of solr yourself. On 12.02.2010 at 20:03, Emad Mushtaq wrote: > Hi, > > I was wondering if anyone has prepared a synonyms.txt for general purpose > search engines, th

Re: Encountering a roadblock with my Solr schema design...use dedupe?

2010-02-12 Thread Amit Nithian
Hi all, I am the author of the article referenced in this thread and after reading it again, I can understand where there might have been confusion and my apologies on that. I have edited the article to indicate that a deduplication component is in the works and referenced SOLR-236. The article ca

Re: Has anyone prepared a general purpose synonyms.txt for search engines

2010-02-12 Thread Emad Mushtaq
Wow thanks!! You all are awesome! :D :D On Sat, Feb 13, 2010 at 12:32 AM, Julian Hille wrote: > Hi, > > at openthesaurus.org or .com you can find a mysql version of synonyms you > just have to join it to fit the synonym schema of solr yourself. > > > On 12.02.2010 at 20:03, Emad Mushtaq wrote:

Re: Has anyone prepared a general purpose synonyms.txt for search engines

2010-02-12 Thread Julian Hille
Hi, You're welcome. That's something Google came up with some weeks ago :) On 12.02.2010 at 20:42, Emad Mushtaq wrote: > Wow thanks!! You all are awesome! :D :D > > On Sat, Feb 13, 2010 at 12:32 AM, Julian Hille wrote: > >> Hi, >> >> at openthesaurus.org or .com you can find a mysql version o

Re: implementing profanity detector

2010-02-12 Thread Mike Perham
On Thu, Feb 11, 2010 at 10:49 AM, Grant Ingersoll wrote: > > Otherwise, I'd do it via copy fields.  Your first field is your main field > and is analyzed as before.  Your second field does the profanity detection > and simply outputs a single token at the end, safe/unsafe. > > How long are your

For caches, any reason to not set initialSize and size to the same value?

2010-02-12 Thread Jay Hill
If I've done a lot of research and have a very good idea of where my cache sizes are having monitored the stats right before commits, is there any reason why I wouldn't just set the initialSize and size counts to the same values? Is there any reason to set a smaller initialSize if I know reliably t

Re: For caches, any reason to not set initialSize and size to the same value?

2010-02-12 Thread Yonik Seeley
On Fri, Feb 12, 2010 at 5:23 PM, Jay Hill wrote: > If I've done a lot of research and have a very good idea of where my cache > sizes are having monitored the stats right before commits, is there any > reason why I wouldn't just set the initialSize and size counts to the same > values? Is there an

reloading sharedlib folder

2010-02-12 Thread Joe Calderon
when using solr.xml, you can specify a sharedlib directory to share among cores, is it possible to reload the classes in this dir without having to restart the servlet container? it would be useful to be able to make changes to those classes on the fly or be able to drop in new plugins

RE: For caches, any reason to not set initialSize and size to the same value?

2010-02-12 Thread Fuad Efendi
I always use initial size = max size, just to avoid Arrays.copyOf()... Initial (default) capacity for HashMap is 16, when it is not enough - array copy to new 32-element array, then to 64, ... - too much wasted space! (same for ConcurrentHashMap) Excuse me if I didn't understand the question...
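The growth pattern Fuad describes can be put into numbers. The sketch below models a `java.util.HashMap`-style table that starts at a given capacity and doubles whenever the entry count exceeds capacity times the load factor (0.75 by default in Java). It is a back-of-the-envelope model, not a measurement of Solr's caches, and `count_resizes` is a hypothetical helper.

```python
def count_resizes(entries, initial_capacity=16, load_factor=0.75):
    """Count table doublings needed to hold `entries`, and the final capacity."""
    capacity = initial_capacity
    resizes = 0
    # The table grows whenever the entry count exceeds capacity * load_factor.
    while entries > capacity * load_factor:
        capacity *= 2
        resizes += 1
    return resizes, capacity


print(count_resizes(512))        # starting from the default capacity of 16
print(count_resizes(512, 1024))  # starting from a capacity that already fits
```

In this model, filling 512 entries from the default capacity takes six doublings, while a table created large enough needs none. Note that, because of the load factor, an initial capacity exactly equal to the entry count (512) would still resize once.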

RE: For caches, any reason to not set initialSize and size to the same value?

2010-02-12 Thread Fuad Efendi
Funny, Arrays.copy() for HashMap... but something similar... Anyway, I use same values for initial size and max size, to be safe... and to have OOM at startup :) > -Original Message- > From: Fuad Efendi [mailto:f...@efendi.ca] > Sent: February-12-10 6:55 PM > To: solr-user@lucene.apach

Re: Deleting spelll checker index

2010-02-12 Thread darniz
Any update on this? Do you guys want me to rephrase my question, if it's not clear? Thanks darniz darniz wrote: > > HI Guys > Opening this thread again. > I need to get around this issue. > i have a spellcheck field defined and i am copying two fields make and > model to this field > > > i have

Re: implementing profanity detector

2010-02-12 Thread Chris Hostetter
: Otherwise, I'd do it via copy fields. Your first field is your main : field and is analyzed as before. Your second field does the profanity : detection and simply outputs a single token at the end, safe/unsafe. you don't even need custom code for this ... copyField all your text into a 'ha

Re: expire/delete documents

2010-02-12 Thread Chris Hostetter
: You could easily have a scheduled job that ran delete by query to : remove posts older than a certain date... or since you specifically asked about deleting anything older than X days (in this example i'm assuming x=7)... createTime:[NOW-7DAYS TO *] -Hoss

migrating from solr 1.3 to 1.4

2010-02-12 Thread Sachin Sebastian
Hi there, I'm trying to migrate from solr 1.3 to solr 1.4 and I have a few issues. Initially my localsolr was throwing a NullPointer exception and I fixed it by changing the type of lat and lng to 'tdouble'. But now I'm not able to update the index. When I try to update the index it throws out error say

cannot match on phrase queries

2010-02-12 Thread Kevin Osborn
I am seeing this in several of my fields. I have something like "Samsung X150" or "Nokia BH-212". And my query will not match on X150 or BH-212. So, my query is something like +model:(Samsung X150). Through debugQuery, I see that this gets converted to +(model:samsung model:"x 150"). It matches

Re: Solr 1.4: Full import FileNotFoundException

2010-02-12 Thread Chris Hostetter
: I have noticed that when I run concurrent full-imports using DIH in Solr : 1.4, the index ends up getting corrupted. I see the following in the log I'm fairly confident that concurrent imports won't work -- but it shouldn't corrupt your index -- even if the DIH didn't actively check for this

Re: Cannot get like exact searching to work

2010-02-12 Thread Chris Hostetter
: > Can your query consist of more than one words? : : Yes, and I expect it almost always will (the query string is coming : from a search box on a website). ... : Actually it won't. The data I am indexing has extra spaces in front : and is capitalized. I really need to be able to filter

Interesting stuff; Solr as a syslog store.

2010-02-12 Thread Antonio Lobato
Hey everyone, I don't actually have a question, but I just thought I'd share something really cool that I did with Solr for our company. We run a good amount of servers, well into the several hundreds, and naturally we need a way to centralize all of the system logs. For a while we used a com

Re: sorting

2010-02-12 Thread Chris Hostetter
:title^1.2 contentEN^0.8 contentIT^0.8 contentDE^0.8 :title^1.2 contentEN^0.8 contentIT^0.8 contentDE^0.8 FWIW: I don't think you understand what the "bf" param is for ... it's not analogous to qf and pf, it's for expressing a list of boost functions -- a function can be a simple field

Re: sorting

2010-02-12 Thread Chris Hostetter
: that *may* be causing your problem, if the function parser is attempting : to generate the FieldCache for your content fields. Yep ... that's it ... if you use a bare field name as a function, and that field name is not numeric, the result is an OrdFieldSource which uses the FieldCache. I o

RE: expire/delete documents

2010-02-12 Thread Fuad Efendi
> or since you specifically asked about deleting anything older > than X days (in this example i'm assuming x=7)... > > createTime:[NOW-7DAYS TO *] createTime:[* TO NOW-7DAYS]
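Fuad's corrected range, `[* TO NOW-7DAYS]`, selects documents whose `createTime` is at least seven days old. A scheduled job could post it as a delete-by-query update message; the sketch below only builds that XML payload, and the `delete_older_than_xml` name is an assumption for illustration. A commit has to follow before the deletion becomes visible to searches.

```python
from xml.sax.saxutils import escape


def delete_older_than_xml(field, days):
    """Build a delete-by-query message for documents older than `days` days."""
    if days <= 0:
        raise ValueError("days must be a positive integer")
    # Open lower bound, upper bound relative to the current time.
    query = f"{field}:[* TO NOW-{days}DAYS]"
    return f"<delete><query>{escape(query)}</query></delete>"


payload = delete_older_than_xml("createTime", 7)
print(payload)
```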

Re: How to reindex data without restarting server

2010-02-12 Thread Chris Hostetter
: if you use the core model via solr.xml you can reload a core without having : to restart the servlet container, : http://wiki.apache.org/solr/CoreAdmin For making a schema change, the steps would be: - create a "new_core" with the new schema - reindex all the docs into "new_core" - "SW
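The archived message is cut off after the second step, so what follows is an assumption rather than a reconstruction: it sketches the CoreAdmin requests that would create a second core and later exchange it with the live one using the CREATE and SWAP actions. The handler path `/admin/cores`, the `live_core` name, and the `instanceDir` value are all hypothetical; only `new_core` comes from the message.

```python
from urllib.parse import urlencode


def core_admin_url(base_url, action, **params):
    """Build a CoreAdmin request URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


base = "http://localhost:8983/solr"  # hypothetical host

# 1. Create "new_core" from a directory that holds the new schema.
create_url = core_admin_url(base, "CREATE", name="new_core", instanceDir="new_core")
# 2. Reindex all documents into new_core (not shown).
# 3. Exchange the freshly built core with the one serving queries.
swap_url = core_admin_url(base, "SWAP", core="live_core", other="new_core")

print(create_url)
print(swap_url)
```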

Re: Deleting spelll checker index

2010-02-12 Thread Chris Hostetter
: Any update on this Patience my friend ... 5 hours after you send an email isn't long enough to wait before asking for "any update on this" -- it's just increasing the volume of mail everyone gets and distracting people from actual bugs/issues. FWIW: this doesn't really seem directly related

Re: cannot match on phrase queries

2010-02-12 Thread Kevin Osborn
It appears that omitTermFreqAndPositions is indeed the culprit. I assume it has to do with the fact that the index parsing of BH-212 puts multiple terms in the same position. From: Kevin Osborn To: Solr Sent: Fri, February 12, 2010 5:28:08 PM Subject: cannot

Re: Solr 1.4: Full import FileNotFoundException

2010-02-12 Thread Noble Paul നോബിള്‍ नोब्ळ्
concurrent imports are not allowed in DIH, unless you set up multiple DIH instances On Sat, Feb 13, 2010 at 7:05 AM, Chris Hostetter wrote: > > : I have noticed that when I run concurrent full-imports using DIH in Solr > : 1.4, the index ends up getting corrupted. I see the following in the log > >

Re: Solr 1.4: Full import FileNotFoundException

2010-02-12 Thread Chris Hostetter
: concurrent imports are not allowed in DIH, unless u setup multiple DIH instances Right, but that's not the issue -- the question is whether attempting to do so might be causing index corruption (either because of a bug or because of some possibly really odd config we currently know nothing abo

parsing strings into phrase queries

2010-02-12 Thread Kevin Osborn
Right now if I have the query model:(Nokia BH-212V), the parser turns this into +(model:nokia model:"bh 212 v"). The problem is that I might have a model called Nokia BH-212, so this is completely missed. In my case, I would like my query to be +(model:nokia model:bh model:212 model:v). This is
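One way to get the term-by-term query Kevin asks for is to split the input on the client before it reaches the query parser. The sketch below is a rough stand-in for that idea: it breaks each token on punctuation and on letter/digit boundaries, then emits one clause per part. It is not Solr's word delimiter logic and not necessarily what the thread settled on (the snippet is truncated); `split_parts` and `term_query` are hypothetical names.

```python
import re


def split_parts(token):
    """Split a token into lowercase runs of letters or digits, e.g. BH-212V -> bh, 212, v."""
    return [part.lower() for part in re.findall(r"[A-Za-z]+|[0-9]+", token)]


def term_query(field, text):
    """Build a required group of individual term clauses instead of a phrase query."""
    parts = []
    for token in text.split():
        parts.extend(split_parts(token))
    clauses = " ".join(f"{field}:{part}" for part in parts)
    return f"+({clauses})"


query = term_query("model", "Nokia BH-212V")
print(query)
```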

Re: Interesting stuff; Solr as a syslog store.

2010-02-12 Thread Olivier Dobberkau
On 13.02.2010 at 03:02, Antonio Lobato wrote: > Just thought this would be a neat story to share with you all. I've really > grown to love Solr, it's something else! Hi Antonio, Great. Would you also share the source code somewhere? May the Source be with you. Thanks. Olivier