date:20100318

AW: Will Solr fit our needs?

2010-03-18 Thread Moritz Maedler

Hi guys! Thanks a lot for your suggestions and help - I really appreciate that! As we need e.g. the price for sorting, I think it must be in the index? Thus, I'm not sure that a key-value store is the thing we are looking for, as we need a good search engine. Currently we are using several indic

Re: Will Solr fit our needs?

2010-03-18 Thread Lukáš Vlček

On Thu, Mar 18, 2010 at 8:45 AM, Moritz Maedler wrote: > Hi guys! > > Thanks a lot for your suggestions and help - I really appreciate that! > As we need e.g. the price for sorting, I think it must be in the index? > Thus, I'm not sure that a key-value store is the thing we are looking for > as

HTTP Status 500 - null java.lang.IllegalArgumentException at java.nio.Buffer.limit(Buffer.java:249)

2010-03-18 Thread Marc Des Garets

Hi, I am doing a really simple query on my index (it's running in tomcat): http://host:8080/solr_er_07_09/select/?q=hash_id:123456 I am getting the following exception: HTTP Status 500 - null java.lang.IllegalArgumentException at java.nio.Buffer.limit(Buffer.java:249) at org.apache.lucene

solrj sends duplicate documents

2010-03-18 Thread Tim Terlegård

I'm using StreamingUpdateSolrServer to index a document. StreamingUpdateSolrServer server = new StreamingUpdateSolrServer("http://localhost:8983/solr/core0", 20, 4); server.setRequestWriter(new BinaryRequestWriter()); SolrInputDocument doc = new SolrInputDocument(); doc.addField("id", "12121212")

Re: solrj sends duplicate documents

2010-03-18 Thread Erik Hatcher

The StreamingUpdateSolrServer does not support binary format, unfortunately. Erik On Mar 18, 2010, at 8:15 AM, Tim Terlegård wrote: I'm using StreamingUpdateSolrServer to index a document. StreamingUpdateSolrServer server = new StreamingUpdateSolrServer("http://localhost:8983/solr/c

Return all Facets?

2010-03-18 Thread homerlex

I'm starting to play with Solr. I am looking at the API and see that there is an addFacetField on the SolrQuery object that is required to specify which facet fields you want returned. Is there any way to specify that we want all facet fields without explicitly having to add them all via addFacetFi

Re: Term Highlighting without store text in index

2010-03-18 Thread Alexey Serba

Hey Dominique, See http://www.lucidimagination.com/search/document/5ea8054ed8348e6f/highlight_arbitrary_text#3799814845ebf002 Although it might be not good solution for huge texts, wildcard/phrase queries. http://issues.apache.org/jira/browse/SOLR-1397 On Mon, Mar 15, 2010 at 4:09 PM, dbejean

excluder filters and multivalued fields

2010-03-18 Thread Marc Sturlese

I don't think there's a way to do what has come to my mind but want to be sure. Let's say I have a doc with 2 fields, one is multiValued doc1: name->john year->2009;year->2010;year->2011 And I query for: q=john&fq=-year:2010 Doc1 won't be in the matching results. Is there a way to make it appea

RE: XPath Processing Applied to Clob

2010-03-18 Thread Craig Christman

You could also do the xpath processing on the oracle end using the extract or extractValue functions. Here's a good reference: http://www.psoug.org/reference/xml_functions.html -Original Message- From: Neil Chaudhuri [mailto:nchaudh...@potomacfusion.com] Sent: Wednesday, March 17, 201

Re: solrj sends duplicate documents

2010-03-18 Thread Tim Terlegård

It would be nice if the documentation mentioned this. :) /Tim 2010/3/18 Erik Hatcher: > The StreamingUpdateSolrServer does not support binary format, unfortunately. > > Erik > > On Mar 18, 2010, at 8:15 AM, Tim Terlegård wrote: > >> I'm using StreamingUpdateSolrServer to index a document

where can i get an synonym.txt and spellcheck.txt ?

2010-03-18 Thread stocki

Hello. I'm looking for a synonym.txt and a spellcheck.txt. Where can I find them in the laaarge internet? Or how do you fill these two files with good names? -- View this message in context: http://old.nabble.com/where-can-i-get-an-synonym.txt-and-spellcheck.txt---tp27946812p27946812.html Sent from t

Re: Return all Facets?

2010-03-18 Thread Erik Hatcher

No, there isn't. How would one know what all the facet fields are, though? One trick, use the luke request handler to get the list of fields, then use that list to construct the facet fields request parameters. Erik On Mar 18, 2010, at 8:40 AM, homerlex wrote: I'm starting to p
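A minimal sketch of the trick Erik describes, in Python for illustration only. It assumes the Luke request handler's JSON response carries a top-level "fields" object keyed by field name; that shape, the helper name facet_params_from_luke, and the sample data are all assumptions, not taken from the thread. The sketch only builds the request parameters and never contacts Solr.

```python
from urllib.parse import urlencode


def facet_params_from_luke(luke_response, query="*:*"):
    """Build select-request parameters that facet on every listed field.

    Assumption: luke_response is a dict shaped like
    {"fields": {"<field name>": {...}, ...}}.
    """
    field_names = sorted(luke_response.get("fields", {}))
    params = [("q", query), ("facet", "true")]
    # One facet.field parameter per field, repeated in the query string.
    params.extend(("facet.field", name) for name in field_names)
    return urlencode(params)


# Hypothetical Luke output for a small index.
sample = {"fields": {"category": {"type": "string"}, "brand": {"type": "string"}}}
query_string = facet_params_from_luke(sample)
print(query_string)
```

In practice one would probably filter the field list first (for example, to indexed, non-tokenized fields), since faceting on every field in the schema can be expensive.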

Re: where can i get an synonym.txt and spellcheck.txt ?

2010-03-18 Thread Erick Erickson

You probably won't find a good synonyms file. The problem is that synonyms tend to be domain-specific, so a synonyms file for chemistry would be of little use for psychology. Spellcheck is generally more useful if it's derived from words already *in* your index. It's of little use to a user to ha

some snynonym clarifications

2010-03-18 Thread Mark Fletcher

Hi, Just needed some help to understand the following synonym mappings:- 1. aaa => does it mean:- if the user queries for aaa it is replaced with and documents matching are searched for or does it mean if the user queries for aaa, documents with aaa as well a

Re: some snynonym clarifications

2010-03-18 Thread Markus Jelsma

Hi, Check out the wiki page on the SynonymFilterFactory. It gives a decent explanation on the subject. The backslash is just for escaping otherwise meaningful characters. [1]:http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory Cheers, On Thursday 18 March 2
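As a rough illustration of the two rule styles the wiki page describes, here is a toy Python model. It is not the SynonymFilterFactory implementation: it handles single-token terms only, ignores the backslash escaping mentioned above, and models comma-separated equivalents as if expansion were enabled. The function names and the sample terms are hypothetical.

```python
def parse_synonym_rules(lines):
    """Parse 'a => b, c' (explicit mapping) and 'a, b, c' (equivalents)."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if "=>" in line:
            left, right = line.split("=>", 1)
            targets = [t.strip() for t in right.split(",") if t.strip()]
            # Every term on the left is REPLACED by the terms on the right.
            for source in (s.strip() for s in left.split(",")):
                if source:
                    rules[source] = targets
        else:
            group = [t.strip() for t in line.split(",") if t.strip()]
            # Equivalent terms: each one expands to the whole group.
            for term in group:
                rules[term] = group
    return rules


def apply_synonyms(tokens, rules):
    """Rewrite a token list according to the parsed rules."""
    out = []
    for token in tokens:
        out.extend(rules.get(token, [token]))
    return out


replacing = parse_synonym_rules(["agriculture => farming"])
keeping = parse_synonym_rules(["agriculture => agriculture, farming"])
equivalent = parse_synonym_rules(["tv, television"])

print(apply_synonyms(["agriculture"], replacing))   # original term is dropped
print(apply_synonyms(["agriculture"], keeping))     # original term is kept
print(apply_synonyms(["tv", "stand"], equivalent))  # both forms are emitted
```

The point of the second rule is that an explicit mapping keeps the original term only if that term is repeated on the right-hand side.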

Re: some snynonym clarifications

2010-03-18 Thread Mark Fletcher

Hi, Thanks for the mail. I had tried the WIKI. My doubts remaining were mainly:- 1. If we have synonyms specified and they replace your search keyword with the ones specified wouldn't we face a risk of our original keyword missed out. What i meant is if I have a keyword for search say "agricultu

Recommended OS

2010-03-18 Thread blargy

Does anyone have any recommendations on which OS to use when setting up Solr search server? Any memory/disk space recommendations? Thanks -- View this message in context: http://old.nabble.com/Recommended-OS-tp27948306p27948306.html Sent from the Solr - User mailing list archive at Nabble.com

Opinions on Facet+Fulltext behavior?

2010-03-18 Thread Mark Bennett

Most sites allow you to search for some text, and then click on Facets (or Tags or Taxonomy branches) to drill down into your search. Most sites also show the search box in these search results, with the text previously entered, so that you can edit it and resubmit. Perhaps you want to add a word

Re: Recommended OS

2010-03-18 Thread K Wong

http://wiki.apache.org/solr/FAQ#What_are_the_Requirements_for_running_a_Solr_server.3F I have Solr running on CentOS 5.4. It runs fine on the OpenJDK 1.6.0 and Tomcat 5. If I were to do it again, I'd probably just stick with Jetty. You really will need to read the docs to get the settings right a

Re: where can i get an synonym.txt and spellcheck.txt ?

2010-03-18 Thread stocki

aha, okay thx. and how do you get your spellcheck words from your product names? we sometimes have very looong names. how is it possible to use the spellchecker function or autosuggestion in the right way? Erick Erickson wrote: > > You probably won't find a good synonyms file. The proble

Re: [search_dev] Opinions on Facet+Fulltext behavior?

2010-03-18 Thread Mark Bennett

Hi Chris, A cool idea, and I like that on Google too. But while that's great for techies, not for other demographics. The restriction on "no checkbox or 'start new search'" was because those were considered too complicated / distracting / old-school for the target users, so punctuation in the se

Re: Recommended OS

2010-03-18 Thread Jean-Sebastien Vachon

On 2010-03-18, at 1:03 PM, K Wong wrote: > http://wiki.apache.org/solr/FAQ#What_are_the_Requirements_for_running_a_Solr_server.3F > > I have Solr running on CentOS 5.4. It runs fine on the OpenJDK 1.6.0 > and Tomcat 5. If I were to do it again, I'd probably just stick with > Jetty. Would you mi

Re: Recommended OS

2010-03-18 Thread blargy

Beat me to the punch with that question. KWong, did you happen to install the Apache APR? Wondering if it is even worth the trouble. I am thinking about going with RedHat Enterprise 5 unless anyone has any objections? Jean-Sebastien Vachon wrote: > > > On 2010-03-18, at 1:03 PM, K Wong wrote

Re: Recommended OS

2010-03-18 Thread K Wong

We're running Solr to provide search services to a Drupal 6 installation. The site is very low traffic (35 uniques a day) and search doesn't get used very often. I was thinking that I could get away with running it on the Jetty that comes with Solr. It's just one less thing that has to be looked af

Re: Solr query parser doesn't invoke analyzer for simple term query?

2010-03-18 Thread Chris Hostetter

: It seems that Solr's query parser doesn't pass a single term query no ... the query parser always uses the analyzer for "text" regardless of whether it's a single term or not (it doesn't even know if it's a single term until the Analyzer tells it) cases where the analyzer isn't used are thing

Re: Solr query parser doesn't invoke analyzer for simple term query?

2010-03-18 Thread Chris Hostetter

: : Thank you, Marco. I see the debug out put that looks like: : title_jpn:2001年 : title_jpn:2001年 : PhraseQuery(title_jpn:"2001 年") : title_jpn:"2001 年" ... : Does this mean the standard query parser does send the : raw query string to the Analyzer and (because the query : yielded more t

Re: dynamic categorization & transactional data

2010-03-18 Thread caman

1) Took care of the first one by Transformer. 2) Any input on 2 please? I need to store # of views and popularity with each document and that can change pretty often. Recommended to use database or can this be updated to Solr directly? My issue with DB is that with every Solr search hit, will have

Re: dynamic categorization & transactional data

2010-03-18 Thread Smiley, David W.

You'll probably want to influence your relevancy on this popularity number that is changing often. ExternalFileField looks like a possibility though I haven't used it. Another would be using an in-memory cache which stores all popularity numbers for any data that has its popularity updated sin

Re: dynamic categorization & transactional data

2010-03-18 Thread caman

David, Much appreciated. This gives me enough to work with. I missed one important point. Our data changes pretty frequently, which means we may be running deltas every 5-10 minutes. In-memory should work. Thanks David Smiley @MITRE.org wrote: > > You'll probably want to influence your releva

Re: Return all Facets?

2010-03-18 Thread homerlex

Thanks for the reply. Can someone point me to a sample on how to use the luke request handler to get this info? Erik Hatcher-4 wrote: > > No, there isn't. How would one know what all the facet fields are, > though? > > One trick, use the luke request handler to get the list of fields, >

Re: Return all Facets?

2010-03-18 Thread Smiley, David W.

Coincidentally I'm working on something like this right now. However in my case, I want results filtered by the current search for the facets they use (which is a subset of all available), with a count. This is a sort of meta-facet since its faceting on the facetable fields. I've implemented

Issue with exact matching

2010-03-18 Thread Alex Thurlow

I'm trying to give a super boost to fields that match exactly, but it doesn't appear to be working. I have this: stored="true"/> sortMissingLast="true" omitNorms="true"> The dataset has two items with title="Rude Boy", but they are coming up way down the list. My query looks li

Re: dynamic categorization & transactional data

2010-03-18 Thread Grant Ingersoll

On Mar 18, 2010, at 2:44 PM, caman wrote: > > 1) Took care of the first one by Transformer. This is often also something done by a classifier that is trained to deal with all the statistical variations in your text. Tools like Weka, Mahout, OpenNLP, etc. can be applied here. > 2) Any input

Re: [ANN] Zoie Solr Plugin - Zoie Solr Plugin enables real-time update functionality for Apache Solr 1.4+

2010-03-18 Thread brad anderson

Tried following their tutorial for plugging zoie into solr: http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Server It appears it only allows you to search on documents after you do a commit? Am I missing something here, or is the plugin not doing anything? Their tutorial tells you to do a co

trimfilterfactory on string fieldtype?

2010-03-18 Thread Tommy Chheng

Can the trim filter factory work on string fieldtypes? When I define a trim filter factory on a string fieldtype, i get an exception: org.apache.solr.common.SolrException: Unknown fieldtype 'string' specified on field id at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.jav

good spell dictionary

2010-03-18 Thread michaelnazaruk

Can anyone tell me, where I can buy or download free spell dictionary for solr? I need not simple dictionary! I need very good spell american-english dictionary(or only american)! -- View this message in context: http://old.nabble.com/good-spell-dictionary-tp27950854p27950854.html Sent from the

DIH questions

2010-03-18 Thread Shawn Heisey

Below is my data-config.xml file, which I am using to build an index for my first shard. I have a couple of questions. Can Solr include the hostname (short version) it's running on in the query? Alternatively, is there a way to override the query with a URL parameter before or when doing the

Re: DIH questions

2010-03-18 Thread Lukas Kahwe Smith

On 18.03.2010, at 23:12, Shawn Heisey wrote: > Below is my data-config.xml file, which I am using to build an index for my > first shard. I have a couple of questions. > > Can Solr include the hostname (short version) it's running on in the query? > Alternatively, is there a way to override

Re: Boundary match as part of query language?

2010-03-18 Thread Chris Hostetter

: Now, I know how to work-around this, by appending some unique character : sequence at each end of the field and then include this in my search in : the front end. However, I wonder if any of you have been planning a : patch to add a native boundary match feature to Solr that would : automagi

Re: Facet pagination

2010-03-18 Thread Chris Hostetter

: Is there a way to get *total count of facets* per field? sorry, no. you can skip ahead, but the only way to know when you're done is when you stop getting constraints back for that field. -Hoss
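A minimal Python sketch of the "skip ahead until nothing comes back" loop Hoss describes. The fetch_page callable is a hypothetical stand-in for a real request that would pass the offset and page size as Solr's facet.offset and facet.limit parameters; here it is faked with an in-memory list so the example runs without a server.

```python
def iter_facet_values(fetch_page, page_size=100):
    """Yield facet constraints page by page until an empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return  # an empty page is the only signal that we are done
        yield from page
        offset += page_size


# Fake data source standing in for Solr (hypothetical values).
ALL_VALUES = ["red", "green", "blue", "black", "white"]


def fake_fetch(offset, limit):
    return ALL_VALUES[offset:offset + limit]


values = list(iter_facet_values(fake_fetch, page_size=2))
print(len(values), values)
```

Counting the values this way costs one request per page, which is the price of not having a total count available up front.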

Re: Generating a sitemap

2010-03-18 Thread Chris Hostetter

: Been testing nutch to crawl for solr and I was wondering if anyone had : already worked on a system for getting the urls out of solr and generating : an XML sitemap for Google. it's pretty easy to just paginate through all docs in solr, so you could do that -- but I'd be really surprised if Nut
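The "paginate through all docs" approach could be sketched roughly as follows, in Python. The fetch_docs callable is hypothetical; a real one would query Solr with the standard start and rows parameters. The field name "url" and the sample documents are assumptions for illustration, not details from the thread.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def collect_urls(fetch_docs, rows=100, url_field="url"):
    """Page through all documents and collect the value of url_field."""
    urls = []
    start = 0
    while True:
        docs = fetch_docs(start, rows)
        if not docs:
            break  # no more documents
        urls.extend(doc[url_field] for doc in docs if url_field in doc)
        start += rows
    return urls


def build_sitemap(urls):
    """Serialize a list of URLs as a sitemap <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # ElementTree escapes & etc.
    return ET.tostring(urlset, encoding="unicode")


# Fake index standing in for Solr (hypothetical documents).
DOCS = [{"url": "http://example.com/a"},
        {"url": "http://example.com/b"},
        {"url": "http://example.com/c"}]


def fake_fetch_docs(start, rows):
    return DOCS[start:start + rows]


sitemap = build_sitemap(collect_urls(fake_fetch_docs, rows=2))
print(sitemap)
```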

Re: Multi valued fields

2010-03-18 Thread Chris Hostetter

: Can I build a query such as : : : -field: A : : which will return all documents that do not have "exclusive" A in the : their field's values. By exclusive I mean that I don't want documents : that only have A in their list of values. In my sample case, the query : would return doc A

Re: Filtering search results

2010-03-18 Thread Chris Hostetter

: For example, in dice.com, the visitor can search by some keyword and filter : further by Skill, Country, Province, City, Telecommute, Travel Required : (shown on the left pane on dice.com). We were wondering if there is some : built-in feature/functionality that can be used from Solr to implemen

Re: Issue with exact matching

2010-03-18 Thread Erick Erickson

I only have time for a quick glance, but what jumps out is that this part: title:rude boy^100 probably isn't matching "boy" against your title field, it's matching "rude" against title, but "boy" against your default field and boosting the "boy" part. Try parenthesizing (at least that works in Lu
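To make the grouping issue concrete, here is a small Python helper that builds a field-scoped clause so that every term, and the boost, applies to the intended field. It is only a string-building sketch: the helper name is hypothetical and it does not escape query-syntax special characters.

```python
def field_query(field, text, boost=None, phrase=False):
    """Build a clause such as title:(rude boy)^100 or title:"rude boy"^100."""
    terms = text.split()
    if not terms:
        raise ValueError("text must contain at least one term")
    if phrase:
        body = '"' + " ".join(terms) + '"'
    elif len(terms) > 1:
        # Parentheses keep all terms attached to the field.
        body = "(" + " ".join(terms) + ")"
    else:
        body = terms[0]
    clause = field + ":" + body
    if boost is not None:
        clause += "^" + str(boost)
    return clause


grouped = field_query("title", "rude boy", boost=100)
exact = field_query("title", "rude boy", boost=100, phrase=True)
print(grouped)
print(exact)
```

The grouped form matches either term in the field, while the quoted form asks for the two terms as a phrase, which is usually closer to what "exact match" means.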

Re: DIH questions

2010-03-18 Thread Shawn Heisey

That looks very useful. So does this mean that this will work? URL text: ?command=full-import&numShards=6&modValue=0&minDid=229615984 XML: query="SELECT * FROM [table] WHERE (did % ${dataimporter.request.numShards}) = ${dataimporter.request.modValue} AND ${dataimporter.request.minDid} >= did"
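To see what SQL the configuration above would presumably end up with, the following Python sketch imitates the placeholder substitution. It is a simulation for illustration, not the DataImportHandler's own code; the function name is hypothetical, and the query template and URL text are copied from the message.

```python
import re
from urllib.parse import parse_qs

# Matches placeholders such as ${dataimporter.request.numShards}.
PLACEHOLDER = re.compile(r"\$\{dataimporter\.request\.(\w+)\}")


def expand_request_params(query_template, url_query):
    """Replace request placeholders with values taken from the URL."""
    parsed = parse_qs(url_query.lstrip("?"))
    params = {name: values[0] for name, values in parsed.items()}

    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError("missing request parameter: " + name)
        return params[name]

    return PLACEHOLDER.sub(substitute, query_template)


template = ("SELECT * FROM [table] WHERE "
            "(did % ${dataimporter.request.numShards}) = "
            "${dataimporter.request.modValue} AND "
            "${dataimporter.request.minDid} >= did")
url_text = "?command=full-import&numShards=6&modValue=0&minDid=229615984"

sql = expand_request_params(template, url_text)
print(sql)
```

Because the values are pasted into the SQL text as-is, anything that can reach the import URL can influence the query, so the handler should not be exposed to untrusted clients.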

stream.url Contention

2010-03-18 Thread Giovanni Fernandez-Kincade

I recently switched from posting a file (PDFs in this case) to the Extract handler, to using the Stream.URL parameter. I've noticed a huge amount of contention around opening URL connections: http-8080-Processor36 [BLOCKED] CPU time: 0:47 sun.net.www.protocol.file.Handler.openConnection(URL) jav

Re: good spell dictionary

2010-03-18 Thread Erick Erickson

Spellcheck is generally more useful if it's derived from words already *in* your index. It's of little use to a user to have spellcheck/autosuggest show terms that aren't in the index... HTH Erick On Thu, Mar 18, 2010 at 6:00 PM, michaelnazaruk wrote: > > Can anyone tell me, where I can buy or

Re: Generating a sitemap

2010-03-18 Thread Jon Baer

It's also possible to try and use the Velocity contrib response writer and paging it w/ the sitemap elements. BTW generating a sitemap was a big reason of a switch we did from GSA to Solr because (for some reason) the map took way too long to generate (even simple requests). If you page throug

Re: DIH questions

2010-03-18 Thread Shawn Heisey

I gave this config idea a try, looks like it works perfectly. I thought at first that it wasn't working, but as is usual with such things, my XML was faulty. Many many thanks! Shawn On 3/18/2010 5:19 PM, Shawn Heisey wrote: That looks very useful. So does this mean that this will work? U

[POLL] Users of abortOnConfigurationError ?

2010-03-18 Thread Chris Hostetter

Due to some issues with the (lack of) functionality behind the "abortOnConfigurationError" option in solrconfig.xml, I'd like to take a quick poll of the solr-user community... * If you have never heard of the abortOnConfigurationError option prior to this message, please ignore this emai

Re: Return all Facets?

2010-03-18 Thread Erik Hatcher

David - sounds kinda like this one: http://issues.apache.org/jira/browse/SOLR-1280 :) Maybe you'd be up for rounding this issue out with your enhancements and get this committable? Erik On Mar 18, 2010, at 4:06 PM, Smiley, David W. wrote: Coincidentally I'm working on something li

Re: [ANN] Zoie Solr Plugin - Zoie Solr Plugin enables real-time update functionality for Apache Solr 1.4+

2010-03-18 Thread Erik Hatcher

"When I don't do the commit, I cannot search the documents I've indexed." - that's exactly how Solr without Zoie works, and it's how Lucene itself works. Gotta commit to see the documents indexed. Erik On Mar 18, 2010, at 5:41 PM, brad anderson wrote: Tried following their tutori

How many facet values are too many?

2010-03-18 Thread Andy

My understanding is that too many facet values will decrease performance How many is too many? Are there any rules of thumb for this? 2 related questions: - I expect a facet field to have many values (values are user generated), any thing I can do to minimize the performance impact? - Any way

Re: Weired behaviour for certain search terms

2010-03-18 Thread Akash Sahu

I tried adding &hl.maxAnalyzedChars=-1 to my search query but it didn't help. Just wanted to know if there are limitations on certain search terms. It's a bit strange that Solr is not behaving properly for certain terms (especially returning the excerpts in the highlighting dictionary). The terms wh

Re: PDFBox/Tika Performance Issues

2010-03-18 Thread Mattmann, Chris A (388J)

Hi Giovanni, Let's try and isolate the problem. Can you try parsing the PDF file with tika-app as a standalone? Take your tika-app jar file then run java -jar tika-app-0.7-SNAPSHOT.jar -m /path/to/pdf/file That should give you something like: Content-Type: application/pdf created: Thu Sep 06 0

Omitting norms question

2010-03-18 Thread blargy

Should I avoid omitting norms on any fields that I would like to boost via a boost query/function query? For example I have a created_on field on one of my documents and I would like to add some sort of function query to this field when querying. In this case does this mean I need to have the n

56 matches
