Date boosting with dismax question

2011-10-21 Thread Craig Stadler
Solr Specification Version: 1.4.0 Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40 Lucene Specification Version: 2.9.1 Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25 precisionStep="6" positionIncrementGap="0"/> stored="false" omitNorms="true"

Re: Sorting fields with letters?

2011-10-21 Thread Tomás Fernández Löbbe
I don't know if you'll find exactly what you need, but you can sort by any field or FunctionQuery. See http://wiki.apache.org/solr/FunctionQuery On Fri, Oct 21, 2011 at 7:03 PM, Peter Spam wrote: > Is there a way to use a custom sorter, to avoid re-indexing? > > > Thanks! > Pete > > On Oct 21, 2

Re: Sorting fields with letters?

2011-10-21 Thread Peter Spam
Is there a way to use a custom sorter, to avoid re-indexing? Thanks! Pete On Oct 21, 2011, at 2:13 PM, Tomás Fernández Löbbe wrote: > Well, yes. You probably have a string field for that content, right? So the > content is being compared as strings, not as numbers; that's why something > like 100

Re: NRT and replication

2011-10-21 Thread Tomás Fernández Löbbe
I was thinking about this: would it make sense to keep the master/slave architecture, add documents to both the master and the slaves, and do soft commits (only) on the slaves and hard commits on the master? That way you wouldn't be doing any merges on the slaves. Would that make sense? On Fri, Oct 21, 2011

Re: can solr follow and index hyperlinks embedded in rich text documents (pdf, doc, etc)?

2011-10-21 Thread Tomás Fernández Löbbe
Hi Tod, Solr doesn't actually crawl. If you need to feed Solr with that kind of information, you'll have to use a crawling tool or implement that yourself. Regards, Tomás On Fri, Oct 21, 2011 at 2:48 PM, Tod wrote: > I have a feeling the answer is "no" since you wouldn't want to start > inde

Re: Sorting fields with letters?

2011-10-21 Thread Tomás Fernández Löbbe
Well, yes. You probably have a string field for that content, right? So the content is being compared as strings, not as numbers; that's why something like 1000 sorts lower than 2. Leading zeros would be an option. Another option is to separate the field into numeric fields and sort by those (this last
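To illustrate the leading-zeros option, here is a minimal Python sketch. It assumes the padded value would be written to a separate string field at index time (so it does require re-indexing); the function name `padded_sort_key` and the padding width of 6 are hypothetical choices, not anything from the thread.

```python
import re

def padded_sort_key(value, width=6):
    """Left-pad every run of digits with zeros so string order matches numeric order."""
    return re.sub(r"\d+", lambda m: m.group(0).zfill(width), value)

# Sample values taken from the original question.
values = ["10A100", "8A100", "10A1", "2A1", "10A99", "1A1", "11C15"]

# Sorting by the padded key gives the order a human would expect.
ordered = sorted(values, key=padded_sort_key)
print(ordered)
```

The width only needs to be at least as large as the longest digit run in the data; anything longer would sort incorrectly again.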

Re: NRT and replication

2011-10-21 Thread Mark Miller
Yeah - a distributed update processor like the one Yonik wrote will do fine in simple situations. On Oct 17, 2011, at 7:33 PM, Esteban Donato wrote: > thanks Yonik. Any idea of when this should be completed? In the > meantime I think I will have to add docs to every replica, possibly > impleme

Re: SOLR CLOUD IN TWO DIFFERENT HOSTS

2011-10-21 Thread Mark Miller
On a quick pass this looks okay - especially if it works on the same host. Seems odd you would get a 404 with the zk link - without more info I don't know what is up with that, but perhaps when Solr tries to determine the local machine's address, it's not finding what you want? We use localHost =

Re: data import in 4.0

2011-10-21 Thread Alireza Salimi
So to me it heightens the probability of classloader conflicts. I haven't worked with Solr 4.0, so I don't know whether the set of JAR files is the same as in Solr 3.4. Anyway, make sure that there is only ONE instance of apache-solr-dataimporthandler-***.jar in your whole tomcat+webapp. Maybe you have thi

Sorting fields with letters?

2011-10-21 Thread Peter Spam
Hi everyone, I have a field that has a letter in it (for example, 1A1, 2A1, 11C15, etc.). Sorting it seems to work most of the time, except for a few things, like 10A1 is lower than 8A100, and 10A100 is lower than 10A99. Any ideas? I bet if my data had leading zeros (ie 10A099), it would beh

Re: data import in 4.0

2011-10-21 Thread Adeel Qureshi
It's deployed on a Tomcat server. On Fri, Oct 21, 2011 at 12:49 PM, Alireza Salimi wrote: > Hi, > > How do you start Solr: through start.jar, or do you deploy it to a web > container? > Sometimes problems like this are caused by different class loaders. > I hope my answer helps. > > Regard

Re: Error Finding solrconfig.xml

2011-10-21 Thread rocco2004
Hi Hoss, It ended up being a permission issue. I moved the example folder to /usr/local/jakarta/tomcat/webapps/solr/WEB-INF/ and it was able to find it. Java seems to report file not found even when it doesn't have permission to access the file. I'm wondering in such cases under what user runs tomcat an

Re: java.lang.NoSuchMethodError: org.slf4j.spi.LocationAwareLogger.log

2011-10-21 Thread Tod
On 10/19/2011 2:58 PM, wrote: Hi Tod, I had similar issue with slf4j, but it was NoClassDefFound. Do you have some other dependencies in your application that use some other version of slf4j? You can use mvn dependency:tree to get all dependencies in your application. Or maybe there's some other

Re: data import in 4.0

2011-10-21 Thread Alireza Salimi
Hi, How do you start Solr: through start.jar, or do you deploy it to a web container? Sometimes problems like this are caused by different class loaders. I hope my answer helps. Regards On Fri, Oct 21, 2011 at 12:47 PM, Adeel Qureshi wrote: > Hi I am trying to set up the data import handl

can solr follow and index hyperlinks embedded in rich text documents (pdf, doc, etc)?

2011-10-21 Thread Tod
I have a feeling the answer is "no" since you wouldn't want to start indexing a large volume of office documents containing hyperlinks that could lead all over the internet. But, since there might be a use case like "a customer just asked me if it could be done?", I thought I would make sure.

Re: Can Solr handle large text files?

2011-10-21 Thread Peter Spam
Thanks for your note, Anand. What was the maximum chunk size for you? Could you post the relevant portions of your configuration file? Thanks! Pete On Oct 21, 2011, at 4:20 AM, anand.ni...@rbs.com wrote: > Hi, > > I was also facing the issue of highlighting the large text files. I applied

Re: Can Solr handle large text files?

2011-10-21 Thread Peter Spam
Thanks for the response, Karsten. 1) What's the recommended maximum chunk size? 2) Does my tokenizer look reasonable? Thanks! Pete On Oct 21, 2011, at 2:28 AM, karsten-s...@gmx.de wrote: > Hi Peter, > > highlighting in large text files can not be fast without dividing the > original text in

data import in 4.0

2011-10-21 Thread Adeel Qureshi
Hi, I am trying to set up the data import handler with Solr 4.0 and am having some unexpected problems. I have a multi-core setup and only one core needed the dataimport handler, so I have added the request handler to it and added the lib imports in the config file. For some reason this doesn't work .. it

success with indexing Wikipedia - lessons learned

2011-10-21 Thread Fred Zimmerman
http://business.zimzaz.com/wordpress/2011/10/how-to-clone-wikipedia-mirror-and-index-wikipedia-with-solr/

Re: Highlighting misses some characters

2011-10-21 Thread Dirceu Vieira
Whether to remove the filter or not really depends on how it is used in the search and what result is expected from it. Have a look at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory I'd say you should find out what exactly are the requirements for the search,

Re: Want to support "did you mean xxx" but is Chinese

2011-10-21 Thread Ken Krugler
Hi Floyd, Typically you'd do this by creating a custom analyzer that - segments Chinese text into words - Converts from words to pinyin or zhuyin Your index would have both the actual Hanzi characters, plus (via copyfield) this phonetic version. During search, you'd use dismax to search agai

Re: Highlighting misses some characters

2011-10-21 Thread docmattman
Yea, I'm using EdgeNGramFilterFactory, should I remove that? I actually inherited this index from another person who used to be part of the project, so there may be a few things that need to be changed. Here is my field type from the schema:

Re: How to make UnInvertedField faster?

2011-10-21 Thread Michael McCandless
Well... the limitation of DocValues is that it cannot handle more than one value per document (which UnInvertedField can). Hopefully we can fix that at some point :) Mike McCandless http://blog.mikemccandless.com On Fri, Oct 21, 2011 at 7:50 AM, Simon Willnauer wrote: > In trunk we have a feat

Re: How to make UnInvertedField faster?

2011-10-21 Thread Jason Rutherglen
Sweet + Very cool! On Fri, Oct 21, 2011 at 7:50 AM, Simon Willnauer < simon.willna...@googlemail.com> wrote: > In trunk we have a feature called IndexDocValues which basically > creates the uninverted structure at index time. You can then simply > suck that into memory or even access it on disk d

Re: LUCENE-2208 (SOLR-1883) Bug with HTMLStripCharFilter, given patch in next nightly build?

2011-10-21 Thread Vadim Kisselmann
UPDATE: I checked out the latest trunk version and patched it with the patch from LUCENE-2208. This patch seems not to work. Or I did something wrong. My old log snippets: Http - 500 Internal Server Error Error: Carrot2 clustering failed And this was caused by: Http - 500 Internal

SOLRNET combine LocalParams with SolrMultipleCriteriaQuery?

2011-10-21 Thread Grüger , Joscha
Hello, does anybody know how to combine SolrMultipleCriteriaQuery and LocalParams (in SOLRnet)? I've tried things like this (don't worry about the bad code, it's just a test) var test = solr.Query(BuildQuery(parameters), new QueryOptions { FilterQue

Re: How to make UnInvertedField faster?

2011-10-21 Thread Simon Willnauer
In trunk we have a feature called IndexDocValues which basically creates the uninverted structure at index time. You can then simply suck that into memory or even access it on disk directly (RandomAccess). Even if I can't help you right now this is certainly going to help you here. There is no need

RE: Can Solr handle large text files?

2011-10-21 Thread Anand.Nigam
Hi, I was also facing the issue of highlighting the large text files. I applied the solution proposed here and it worked. But I am getting the following error: Basically 'hitGrouped.vm' is not found. I am using solr-3.4.0. Where can I get this file from? Its reference is present in browse.vm

Re: Want to support "did you mean xxx" but is Chinese

2011-10-21 Thread Li Li
We have implemented one supporting "did you mean" and prefix suggestion for Chinese. But we based our work on Solr 1.4 and made many modifications, so it will take time to integrate it into current Solr/Lucene. Here is our solution; any advice is welcome. 1. offline words and p

Re: Painfully slow indexing

2011-10-21 Thread Simon Willnauer
On Wed, Oct 19, 2011 at 3:58 PM, Pranav Prakash wrote: > Hi guys, > > I have set up a Solr instance and upon attempting to index document, the > whole process is painfully slow. I will try to put as much info as I can in > this mail. Pl. feel free to ask me anything else that might be required. >

Re: arbitrary results

2011-10-21 Thread Dirceu Vieira
Hi Peter, You might wanna check out http://wiki.apache.org/solr/QueryElevationComponent. Regards, Dirceu On Fri, Oct 21, 2011 at 11:44 AM, Peter A. Kirk wrote: > Hi > > is it possible to set up Solr so that a search for a particular term > results in some arbitrary (selected) documents? > For

arbitrary results

2011-10-21 Thread Peter A. Kirk
Hi is it possible to set up Solr so that a search for a particular term results in some arbitrary (selected) documents? For example, I want a search for "elephant" to return documents with id's 17, 18 and 36. Even though these documents would not normally occur in a result for a search for "ele

Re: Can Solr handle large text files?

2011-10-21 Thread karsten-solr
Hi Peter, highlighting in large text files cannot be fast without dividing the original text into small pieces. So take a look at http://xtf.cdlib.org/documentation/under-the-hood/#Chunking and at http://www.lucidimagination.com/blog/2010/09/16/2446/ Which means that you should divide your files a
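As a rough illustration of the chunking idea, the Python sketch below splits a long text into overlapping pieces that could each be indexed as its own document. The chunk size of 1000 characters and the overlap of 100 are assumptions picked for the example; the thread does not state a recommended size, and `chunk_text` is a hypothetical helper.

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a phrase
    crossing a chunk boundary is still fully contained in one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

# Stand-in for the content of a large file.
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

Each chunk would then carry a reference to its parent file so that hits can be grouped back together at query time.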

Re: Getting single documents by fq on unique field, performance

2011-10-21 Thread pravesh
This approach seems fine. You might benchmark it through load test etc. Regds Pravesh -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-single-documents-by-fq-on-unique-field-performance-tp3440229p3440351.html Sent from the Solr - User mailing list archive at Nabble.com

Re: Highlighting misses some characters

2011-10-21 Thread Dirceu Vieira
Hi, Are you using any kind of NGram tokenizer? At first I'd have said it is caused by stemming, but since it's not like the stem and its derived word are being highlighted, it's more like parts of it are... If you use NGram or EdgeNGram this will generate tokens for each part of the word (the si
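To see why an edge n-gram filter can lead to partial highlights, here is a small Python sketch of the tokens such a filter would emit for one word. It only imitates the front-edge behaviour; the function `edge_ngrams` and the gram sizes used are illustrative assumptions, not the filter's actual implementation or defaults.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the prefixes of `token` whose length lies in [min_gram, max_gram]."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

# Every prefix becomes its own indexed term, so a query matching "high"
# can end up highlighting only that part of "highlight".
print(edge_ngrams("highlight", min_gram=2, max_gram=5))
```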

Re: Painfully slow indexing

2011-10-21 Thread Alain Rogister
As an alternative, I can suggest this one which worked great for me: - generate the ready-for-indexing XML documents on a file system - use curl to feed them into Solr I am not dealing with huge volumes, but was surprised at how *fast* Solr was indexing my documents using this simple approach. Al
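A minimal sketch of the first step (generating ready-for-indexing XML) could look like the following in Python. It uses Solr's `<add><doc><field name="...">` update format; the field names `id` and `title` are made up for the example and would have to match the real schema.

```python
import xml.etree.ElementTree as ET

def build_add_xml(docs):
    """Serialize a list of dicts as a Solr <add> update message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")

xml_payload = build_add_xml([{"id": 1, "title": "Solr & Lucene"}])
print(xml_payload)
```

The resulting string can be written to a file and posted to the update handler with curl, followed by a commit; the exact URL depends on the installation.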

Getting single documents by fq on unique field, performance

2011-10-21 Thread Robert Brown
Hi, We do regular searches against documents, with highlighting on. To then view a document in more detail, we re-do the search but using fq=id:12345 to return the single document of interest, but still want highlighting on, so sending the q param back again. Is there anything you would rec
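For reference, the request described here could be assembled as in the Python sketch below. The base URL, the highlighted field name `content`, and the helper `single_doc_url` are assumptions for illustration; `q`, `fq`, `hl`, `hl.fl` and `rows` are standard Solr request parameters.

```python
from urllib.parse import urlencode

def single_doc_url(base_url, user_query, doc_id, highlight_field):
    """Build a select URL that re-runs the user's query restricted to one document."""
    params = [
        ("q", user_query),          # original query, kept so highlighting still works
        ("fq", f"id:{doc_id}"),     # filter down to the single document of interest
        ("hl", "true"),
        ("hl.fl", highlight_field),
        ("rows", 1),
    ]
    return f"{base_url}?{urlencode(params)}"

url = single_doc_url("http://localhost:8983/solr/select", "large text", 12345, "content")
print(url)
```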

Re: inconsistent results when faceting on multivalued field

2011-10-21 Thread Darren Govoni
My interpretation of your results is that your FQ found 1281 documents with the value 1213206 in the sou_codeMetier field. Of those results, 476 also had 1212104 as a value...and so on. Since ALL the results will have the field value in your FQ, then I would expect the "other" values to be equal or less
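This interpretation can be checked on a toy data set. The Python sketch below imitates a filter query followed by field faceting on a multivalued field; the four documents and the value 1110000 are invented for the example, and `facet_counts` is a hypothetical helper, not Solr code.

```python
from collections import Counter

def facet_counts(docs, field, fq_value):
    """Filter docs by fq_value, then count every value of `field` in the matches."""
    matching = [d for d in docs if fq_value in d[field]]
    counts = Counter(value for d in matching for value in d[field])
    return len(matching), counts

# Invented sample data: each document holds several codes in one field.
docs = [
    {"id": 1, "sou_codeMetier": ["1213206", "1212104"]},
    {"id": 2, "sou_codeMetier": ["1213206"]},
    {"id": 3, "sou_codeMetier": ["1212104"]},
    {"id": 4, "sou_codeMetier": ["1213206", "1212104", "1110000"]},
]

num_found, counts = facet_counts(docs, "sou_codeMetier", "1213206")
print(num_found, dict(counts))
```

The filtered value's count equals the number of matching documents, and every other value's count is the number of matching documents that also carry it, so it can never be larger.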

Re: inconsistent results when faceting on multivalued field

2011-10-21 Thread Alain Rogister
Pravesh, Not exactly. Here is the search I do, in more detail (different field name, but same issue). I want to get a count for a specific value of the sou_codeMetier field, which is multivalued. I expressed this by including a fq clause : /select/?q=*:*&facet=true&facet.field=sou_codeMetier&fq

Re: hierarchical synonym

2011-10-21 Thread Lukáš Vlček
Hi, I think what you are looking for are synonym rules like this: dog, cat, bird => animal I think the following link can be interesting to you as well: http://wisdombase.net/wiki/index.php?title=Hiearchy_synonym_search_solution_in_solr I am not a Solr expert but speaking about Lucene synonyms
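To make the effect of such a rule concrete, here is a small Python sketch that parses one explicit mapping and applies it to a token stream. It is a simplified imitation of what a synonym filter does with `=>` rules (tokens on the left are replaced by those on the right); `parse_synonym_rule` and `apply_synonyms` are hypothetical helpers.

```python
def parse_synonym_rule(rule):
    """Turn 'a, b => c' into {'a': ['c'], 'b': ['c']}."""
    left, right = rule.split("=>")
    sources = [term.strip() for term in left.split(",")]
    targets = [term.strip() for term in right.split(",")]
    return {source: targets for source in sources}

def apply_synonyms(tokens, mapping):
    """Replace each token that has a mapping; pass all other tokens through."""
    output = []
    for token in tokens:
        output.extend(mapping.get(token, [token]))
    return output

mapping = parse_synonym_rule("dog, cat, bird => animal")
print(apply_synonyms(["my", "dog", "barks"], mapping))
```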

Re: Painfully slow indexing

2011-10-21 Thread pravesh
Are you posting through HTTP/SolrJ? Your script time 'T' includes the time from sending the POST request to receiving the successful response, right? Try sending in small batches, like 10-20. BTW, how many documents are you indexing? Regds Pravesh -- View this message in context:
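The batching suggestion could be sketched like this in Python. The batch size of 20 follows the range mentioned above; the `batches` helper and the list of 45 placeholder ids are assumptions, and the actual posting of each batch is left out.

```python
def batches(items, batch_size=20):
    """Yield consecutive slices of `items` with at most batch_size elements each."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Placeholder for the documents to index.
doc_ids = list(range(45))

# Each batch would be sent in a single update request.
sizes = [len(batch) for batch in batches(doc_ids)]
print(sizes)
```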

Re: inconsistent results when faceting on multivalued field

2011-10-21 Thread pravesh
Could you clarify the below: >>When I make a search on facet.qua_code=1234567 ?? Are you trying to say, when you fire a fresh search for a facet item, like q=qua_code:1234567? This would fetch documents where the qua_code field contains either the term 1234567 OR both terms (1234567 & 9384738..