Re: Hierarchical faceting

2010-08-13 Thread John Wang
Check out project Bobo: http://sna-projects.com/bobo/ A lucene based faceted search library. Now with solr plugin: http://snaprojects.jira.com/wiki/display/BOBO/Bobo+Solr+Plugin -John On Fri, Aug 13, 2010 at 8:56 PM, Jayendra Patil < jayendra.patil@gmail.com> wrote: > Multiple values are p

Re: Hierarchical faceting

2010-08-13 Thread Jayendra Patil
Multiple values are probably same as Multiple Tokens with a high position increment gap. Would still prefer to go with the multivalued field approach, as it is inbuilt and easier to get back the individual facets with the count in the response. Regards, Jayendra On Fri, Aug 13, 2010 at 7:57 AM, M

Re: diacritics on query string

2010-08-13 Thread Jayendra Patil
*ASCIIFoldingFilter* is probably the filter you want for replacing accented characters with their unaccented equivalents. However, I don't see it in your config. You can easily debug the issue with the Solr analysis tool. Regards, Jayendra On Fri, Aug 13, 2010 at 3:20 AM, Andrea Gazzarini < andrea.gazzar.
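
To make the idea of accent folding concrete, here is a minimal Python sketch. It is only an approximation based on Unicode decomposition and is not the mapping table that ASCIIFoldingFilter itself uses; the function name `fold_accents` is hypothetical.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Rough approximation of accent folding (assumption: NOT Lucene's own mapping).

    Decompose each character (NFKD) and drop the combining marks,
    so that "à" becomes "a".
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


folded = fold_accents("intertestualità")
print(folded)
```

If the same folding is applied at both index time and query time, the accented and unaccented spellings should end up as the same term.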

Re: How to compile nightly build?

2010-08-13 Thread Jayendra Patil
Yup, the nightly build you pointed out has pre-built code and does not include the Lucene and module dependencies needed for compilation. In case you want to compile from source, you can check out the code from https://svn.apache.org/repos/asf/lucene/dev/trunk/solr There are

Re: Tomcat / Solr clustered

2010-08-13 Thread Erick Erickson
It would help a lot if you could tell us what errors you're getting, what you've actually configured and what you expect. It's very hard to diagnose a problem with so little information. Please review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Fri, Aug 13, 2010 at 5:21 PM, Clau

Re: uniqueKey and custom fieldType

2010-08-13 Thread Erick Erickson
In order to make even a guess, we'd have to see your new field type. Particularly its field definitions and the analysis chain... Best Erick On Fri, Aug 13, 2010 at 5:16 PM, j wrote: > Does fieldType have any effect on the thing that I specify should be > unique? > > uniqueKey has been working

Re: Problem instantiating CommonsHttpSolrServer using solrj

2010-08-13 Thread Patrick Archibald
I went through jar hell yesterday. I finally got Solrj working. http://jarfinder.com was a big help. Rock on, PLA Patrick L Archibald http://patrickarchibald.com On Fri, Aug 13, 2010 at 7:25 PM, Chris Hostetter wrote: > > : I get the following runtime error: > : > : Exception in thread "main

Re: can searcher.getReader().getFieldNames() return only stored fields?

2010-08-13 Thread Chris Hostetter
: however, both of these can/will return fields that are not stored. is there : a parameter that I can use to only return fields that are stored? : : there does not seem to be a IndexReader.FieldOption.STORED and cant tell if : any of the others might work At the level of this API, the IndexRea

Re: dismax debugging hyphens dashes

2010-08-13 Thread Chris Hostetter
: I have a solr instance with 1 document whose title is "ABC12-def". I : am using dismax. While "abc", "12", and "def" do match, "abc12" and : "def" do not. Here is a the parsedquery_toString, I'm having trouble : understanding it: : : +(id:abc12^3.0 | title:"(abc12 abc) 12"^1.5) (id:abc12^3.0 |

Re: Problem instantiating CommonsHttpSolrServer using solrj

2010-08-13 Thread Chris Hostetter
: I get the following runtime error: : : Exception in thread "main" java.lang.NoClassDefFoundError: : org/apache/solr/client/solrj/SolrServerException : Caused by: java.lang.ClassNotFoundException: : org.apache.solr.client.solrj.SolrServerException ... : I am following the this link : h

Re: Do we need index analyzer for query elevation component

2010-08-13 Thread Chris Hostetter
: In order for query elevation we define a type. do we really need index time : analyzer for query elevation type. If you declared an index analyzer, it would probably never be used in this context (i don't remember the details of QEC off the top of my head) but to be clear: regardless of usa

Re: How to compile nightly build?

2010-08-13 Thread Chris Hostetter
The nightly test artifacts don't currently contain everything needed to recompile the sources; this is a known issue... https://issues.apache.org/jira/browse/SOLR-1989 ...if you want to compile from source off the trunk or 3x branch, you need to check out the *entire* branch (not just the "

Re: Facet Fields - ID vs. Display Value

2010-08-13 Thread Chris Hostetter
: If your concern is performance, faceting integers versus faceting strings, I : believe Lucene makes the differences negligible. Given that choice I'd go From a speed standpoint, I believe the difference is negligible, but from a memory standpoint the data structures for dealing with an integer

Tomcat / Solr clustered

2010-08-13 Thread Claudio Devecchi
Hi, could somebody help me please? I'm trying to run clustered Solr on Tomcat 6. I followed the instructions on the wiki but it does not work. My question is: is it enough to follow that part of the Solr instructions, or do I have to configure something in Tomcat? Thanks

uniqueKey and custom fieldType

2010-08-13 Thread j
Does fieldType have any effect on the thing that I specify should be unique? uniqueKey has been working for me up until recently. I changed the field that is unique from type "string" to a fieldType that I have defined. Now when I do an update I get a newly created document (so that I have duplicat

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-13 Thread Tod
On 8/12/2010 8:02 PM, Chris Hostetter wrote: : It returns in around a second. When I execute the attached code it takes just : over three minutes. The optimal for me would be able get closer to the : performance I'm seeing with curl using Solrj. I think your problem may be that StreamingUpdate

Re: wildcards in solr synonyms file

2010-08-13 Thread Chris Hostetter
: So for example, i have a document : cars - 10% off : and I search for word "discount", this document should be returned as well. : : In synonyms file, I have written : discount => *% I would implement something like this by having something at index time like the PatternReplaceTokenFilter at i
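
The reply above is cut off, so the following is only one possible reading of it: rewrite percentage tokens to a marker term at index time so that a search for that term matches. This Python sketch illustrates the pattern replacement alone; the regex, the marker word and the function name are assumptions, and it is not Solr configuration.

```python
import re

# Assumption: any run of digits followed by "%" counts as a discount marker.
PERCENT_TOKEN = re.compile(r"\d+%")


def rewrite_percent_tokens(text: str, marker: str = "discount") -> str:
    """Replace tokens such as "10%" with a searchable marker term."""
    return PERCENT_TOKEN.sub(marker, text)


indexed_text = rewrite_percent_tokens("cars - 10% off")
print(indexed_text)
```

With this rewrite applied to the indexed text, a query for "discount" would match the document without needing a wildcard in the synonyms file.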

Re: Search Results optimization

2010-08-13 Thread Hasnain
I'm sorry, the query q=stapler^100 hammer ^0 is working fine, but I still need guidance with my second question.

Re: edismax pf2 and ps

2010-08-13 Thread Yonik Seeley
On Fri, Aug 13, 2010 at 2:38 PM, Ron Mayer wrote: > Yonik Seeley wrote: >> Perhaps a ps2 parameter to match pf2? > > That might be nice. > > I could try to put together such a patch if people were interested. > > One more thing I've been contemplating is if my results might > be even better if I h

Re: edismax pf2 and ps

2010-08-13 Thread Ron Mayer
Yonik Seeley wrote: > Perhaps a ps2 parameter to match pf2? That might be nice. I could try to put together such a patch if people were interested. One more thing I've been contemplating is if my results might be even better if I had a couple different "pf2"s with different "ps"'s at the same ti

Can I tell Solr to merge *oldest* rather than smallest segments - if so I think I wouldn't need optimize anymore.

2010-08-13 Thread Ron Mayer
Short summary: * If I could make Solr merge oldest segments (or the one with the most deleted docs) rather than smallest segments; I think I'd almost never need "optimize". * Can I tell Solr to do this? Or if not, can someone point me in the right direction regarding where I might

Re: Search Results optimization

2010-08-13 Thread Hasnain
Thank you for the quick response. I've tried using boosting and used the query q=stapler^100 hammer ^0, but it is still mixing up the results. Am I doing it wrong? Also, if I have items named as "Swingline red stapler" and "Arm & hammer", how would I know when querying that "swingline red stapler" is o

Re: edismax pf2 and ps

2010-08-13 Thread Yonik Seeley
Perhaps a ps2 parameter to match pf2? -Yonik http://www.lucidimagination.com On Fri, Aug 13, 2010 at 2:11 PM, Ron Mayer wrote: > Jayendra Patil wrote: >> We pretty much had the same issue, ended up customizing the ExtendedDismax >> code. >> >> In your case its just a change of a single line >>  

RE: analysis tool vs. reality

2010-08-13 Thread Burton-West, Tom
+1 I just had occasion to debug something where the interaction between the queryparser and the analyzer produced *interesting* results. Having a separate jsp that includes the whole chain (i.e. analyzer/tokenizer/filter and qp) would be great! Tom -Original Message- From: Michael McC

Re: edismax pf2 and ps

2010-08-13 Thread Ron Mayer
Jayendra Patil wrote: > We pretty much had the same issue, ended up customizing the ExtendedDismax > code. > > In your case its just a change of a single line > addShingledPhraseQueries(query, normalClauses, phraseFields2, 2, > tiebreaker, pslop); > to > addShingledPhraseQueries(q

Re: Programmatic Access to Solr schema?

2010-08-13 Thread Chris Hostetter
: Are you looking to get access to a remote schema? You can pull schema.xml : via HTTP using a URL like: Alternately: using the LukeRequestHandler may give you more structured access to the schema, depends on what you are looking for. : If you're accessing the schema from inside a custom Solr

Re: Parsing solrj results

2010-08-13 Thread Chris Hostetter
: I have implement a solrj client for quering index data from database. the : search result is in text but with SolrDocument[{description= which is a : field in xml. How i can parse out this. Thanks. I'm more than a little confused by your question -- you are using SolrJ? correct? SolrJ takes c

Re: DataImportHandler and SAXParseExceptions with Jetty

2010-08-13 Thread harrysmith
Shawn Heisey-4 wrote: > > Because < and > are critical characters in XML, you have to encode them > to actually use them as part of your config, just as you do on an HTML > page. Use &lt; instead of <. When I first ran into this, I was > surprised that &gt; was not required as well, but it's p
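
As a small illustration of the encoding described above, Python's standard library can escape the XML-critical characters before text is pasted into a config file. The SQL string here is a made-up example, not taken from the thread.

```python
from xml.sax.saxutils import escape

# Hypothetical query text containing both critical characters.
query = "SELECT id FROM item WHERE price < 10 AND qty > 0"

# escape() rewrites &, < and > as XML entities.
escaped = escape(query)
print(escaped)
```

This escapes `>` as well as `<`, which is harmless. For text that goes inside an XML attribute, quotes need handling too, and `xml.sax.saxutils.quoteattr` covers that case.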

different pdf version issue

2010-08-13 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
I have a problem indexing PDF files that are PDF version 1.5 or 1.6. There is no problem at all indexing PDF files with version 1.4. Here is the error I got: HTTP ERROR: 500 org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.pdfpar...@44ff

Re: Results from More then One Cors?

2010-08-13 Thread Janne Majaranta
Hi Jörg, You can use the "shards" parameter to search the cores you want. For example, say you have core0, core1, core2. You want to have results from all the cores. Then you could use the following example URL: http://localhost:8983/solr/core0/select?q=*:*&sort=myfield+desc&shards=localhost:8983
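
The example URL above is truncated, so here is a hedged Python sketch of how such a request could be assembled. It assumes each shard entry has the form `host:port/solr/corename` and that the entries are joined with commas; the helper name `sharded_select_url` is hypothetical.

```python
from urllib.parse import urlencode


def sharded_select_url(host, cores, params):
    """Build a select URL on the first core that fans out to all listed cores."""
    # Assumption: one "host/solr/core" entry per core, comma separated.
    shards = ",".join(f"{host}/solr/{core}" for core in cores)
    # Keep *, :, / and , readable instead of percent-encoding them.
    query = urlencode({**params, "shards": shards}, safe="*:/,")
    return f"http://{host}/solr/{cores[0]}/select?{query}"


url = sharded_select_url(
    "localhost:8983",
    ["core0", "core1", "core2"],
    {"q": "*:*", "sort": "myfield desc"},
)
print(url)
```

The request is sent to one core, which then queries every core named in `shards` and merges the results.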

Re: Results from More then One Cors?

2010-08-13 Thread Jörg Agatz
Sorry, I'll try again with more details :-) I have a lot of cores, 10 to 20. Now I'm looking for a way to get results from 5 to 10 cores at the same time. I need to sort all results and view facets across the 10 cores at the same time. So: one query, one server with 20 cores, and o result wit

Re: DataImportHandler and SAXParseExceptions with Jetty

2010-08-13 Thread Shawn Heisey
On 8/12/2010 8:32 PM, harrysmith wrote: Win XP, Solr 1.4.1 out of the box install, using jetty. If I add greater than or less than (i.e. < or >) in any xml field and attempt to load or run from the DataImportConsole I receive a SAXParseException. Example follows: If I don't have a 'less than' it w

SolrIndexSearcher.QueryCommand filters and filter

2010-08-13 Thread Stephen Green
I find myself in a situation where I need to handle a query that has a number of filter queries associated with it as well as another constraint that generates a DocSet of documents that should be applied as a filter against the search results. I'm building a query component to deal with this case

Solr Reports

2010-08-13 Thread Samuel Lopes Grigolato
Hi, What is the best way to extract report information of a Solr server, like indexing statistics (taxonomy, documents without any category, etc), and search statistics (common queries, zero-result queries, etc) ? Will I need to code such things in custom processors/search handlers? Thanks, Samue

commitReserveDuration question

2010-08-13 Thread Cuong Hoang
Hi all, Can someone please explain how commitReserveDuration works in Solr replication? There isn't much information on how this property would affect the commit process. I tried to look into the code but I don't think I can make good sense of it. The reason I'm asking this question is that I have

Re: Search Results optimization

2010-08-13 Thread Marco Martinez
You can use a higher boost for stapler to accomplish your requirement. Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42 2010/8/13 Hasnain > > Hi All, > > My question is related to search results,

Re: Hierarchical faceting

2010-08-13 Thread Mats Bolstad
Thank you for your answer. I sure will implement something in that direction. But couldn't multiple tokens be used instead of multiple values? // some tokenizer and filters that generates "0//Europe", "1//Europe//Norway", "2//Europe//Norway//Oslo" Wouldn't that work in just the sam
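
The depth-prefixed tokens mentioned above can be generated from a category path with a few lines of Python. This is only a sketch of the token scheme from the message; the function name is hypothetical, and how the tokens reach the index (multiple values or multiple tokens) is left open.

```python
def hierarchy_tokens(path, sep="//"):
    """Return one token per level: "<depth><sep><path up to that level>"."""
    return [
        f"{depth}{sep}{sep.join(path[:depth + 1])}"
        for depth in range(len(path))
    ]


tokens = hierarchy_tokens(["Europe", "Norway", "Oslo"])
print(tokens)
```

Because every token starts with its depth, faceting with a prefix such as `1//Europe//` (for example through Solr's `facet.prefix` parameter) should return only the direct children of Europe.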

Wiki documentation Packaged as single HTML or PDF

2010-08-13 Thread Samuel Lopes Grigolato
Hello, I need to ship the Solr wiki documentation, preferably in PDF format, with a solution to a customer. I tried to find some way to do this but it seems the wiki doesn't have an export feature. Does anyone know how I can achieve this? Thanks in advance, Samuel.

Re: Indexing Hanging during GC?

2010-08-13 Thread Rebecca Watson
hi, ok I have a theory about the cause of my problem -- java's GC failure I think is due to a solr memory leak caused by overlapping auto-commit calls -- does that sound plausible?? (ducking for cover now...) I watched the log files and noticed that when the threads start to increase (from a st

Re: Deleting with the DIH sometimes doesn't delete

2010-08-13 Thread Qwerky
I'm using solr 1.4.1 and I've got about 280,000 docs in the index. I'm using a multi core setup (if that makes any difference) with 2 cores. When I check the stats from the JSP my updateHandler reports 3 deletes; cumulative_deletesById : 3 When I search from the admin page the docs are still fo

Search Results optimization

2010-08-13 Thread Hasnain
Hi All, My question is related to search results, I want to customize my query so that for query "stapler hammer", I should get results for all items containing word "stapler" first and then results containing hammer, right now results are mixing up, I want them sorted, i.e. all results of staple

Re: analysis tool vs. reality

2010-08-13 Thread Michael McCandless
Maybe, separate from analysis.jsp (showing only how text is analyzed), Solr needs a debug page showing the steps the field's QueryParser goes through on a given query, to debug such tricky QueryParser/Analyzer interactions? We could make a wrapper around the analyzer that records each text fragmen

diacritics on query string

2010-08-13 Thread Andrea Gazzarini
Hi, I have a problem regarding a diacritic character on my query string : *q=intertestualità * which is encoded in *q=intertestualit%E0 * What I'm not understanding is the following query response fragments : 0 23 score desc score,title on on 0 *intertestualit* 2.2 3
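
The `%E0` in the encoded query is the single ISO-8859-1 byte for "à", whereas UTF-8 encodes the same character as two bytes. The Python sketch below shows only that difference. Whether a charset mismatch between client and server is what truncates the echoed term to "intertestualit" is an assumption, not something the message confirms.

```python
from urllib.parse import quote

term = "intertestualità"

# Same term, percent-encoded under two different character sets.
latin1_encoded = quote(term, encoding="latin-1")
utf8_encoded = quote(term, encoding="utf-8")

print(latin1_encoded)
print(utf8_encoded)
```

If the servlet container decodes query strings as UTF-8, a lone `%E0` byte is not a valid sequence, so checking which charset each side uses is a reasonable first step.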