Re: index the links having a certain website

2012-04-01 Thread Marcelo Carvalho Fernandes
Hi Manuel, Do you mean you need to index HTML files? What kind of search do you imagine doing? --- Marcelo Carvalho Fernandes On Friday, March 30, 2012, Manuel Antonio Novoa Proenza <mano...@estudiantes.uci.cu> wrote: > Hello > > I'm not good with English, and therefore I had to resort

Re: Position Solr results

2012-04-01 Thread Marcelo Carvalho Fernandes
Try using the "score" field in the search results. --- Marcelo Carvalho Fernandes On Friday, March 30, 2012, Manuel Antonio Novoa Proenza <mano...@estudiantes.uci.cu> wrote: > Hi > > I'm not good with English, and for this reason I had to resort to a translator. > > I have the followin

Re: Content privacy, search & index

2012-04-01 Thread dbenjamin
Hi Paul, You lost me :-) You mean implementing a specific RequestHandler just for my needs? Also, when you say "It'd transform a query for "a b" into "+(a b) +(authorizedBit)"", that's not so clear to me. Do you mind explaining this like I was a six-year-old? ;-) (even if I think that's just a

RE: Content privacy, search & index

2012-04-01 Thread dbenjamin
Hi spring, Solution 1 is what I had in mind. So I can't do the whole thing directly in Solr? (Except maybe by implementing a new RequestHandler like Paul suggested) Concerning the auto-complete of friends in the search box, you won't use the auto-complete feature from Solr then, will you? Beca

Re: Content privacy, search & index

2012-04-01 Thread Paul Libbrecht
Hello Benjamin, On 1 Apr 2012, at 11:48, dbenjamin wrote: > You lost me :-) > You mean implementing a specific RequestHandler just for my needs ? I think a QueryComponent is enough; it'd extend QueryComponent. Its prepare method reads all the params and calls the ResponseBuilder's setQuery wi
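
The rewrite discussed in this thread, turning a query for "a b" into "+(a b) +(authorizedBit)", can be sketched at the string level. This is plain Python, not Solr code: the function name `restrict_query` and the clause passed to it are hypothetical, and in Solr the same step would presumably happen inside the component's prepare method rather than on a raw string.

```python
def restrict_query(user_query: str, authorized_clause: str) -> str:
    """Return a query that requires both the user's query and the authorization clause.

    Assumption: both arguments are already valid query fragments.
    This sketch does no escaping of the user's input.
    """
    user_query = user_query.strip()
    if not user_query:
        raise ValueError("user_query must not be empty")
    # '+' marks each parenthesised group as mandatory, so a document
    # must match the user's terms AND the authorization clause.
    return f"+({user_query}) +({authorized_clause})"


if __name__ == "__main__":
    print(restrict_query("a b", "authorizedBit"))
```

Because the user's text is wrapped without escaping, a real implementation would need to build the combined query from parsed query objects instead of concatenating strings.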

Re: reproducibility of query results

2012-04-01 Thread Ahmet Arslan
> I appear to be observing some > unpredictability in query results, and I > wanted to eliminate Solr itself as a possible cause. > > Using 1.4 at the moment. I insert a stack of documents (using > the > EmbeddedSolrServer) and then run a query, retrieving 200 > results. (A > significant fraction o

Re: reproducibility of query results

2012-04-01 Thread Benson Margulies
I make a new index each iteration. If I insert the same docs in the same order, should I expect the same query results? Note that I shut down entirely after the adds, then in a new process run the queries. On Apr 1, 2012, at 11:37 AM, Ahmet Arslan wrote: >> I appear to be observing some >> unpre

Re: 'foruns' don't match 'forum' with NGramFilterFactory (or EdgeNGramFilterFactory)

2012-04-01 Thread Bráulio Bhavamitra
Using edismax defType made the thing work, could anybody explain why? (any stemming is enabled now) bráulio 2012/2/14 Bráulio Bhavamitra > Hello all, > > I'm experimenting with NGramFilterFactory and EdgeNGramFilterFactory. > > Both of them show a match in my solr admin analysis, but when I q

Re: index the links having a certain website

2012-04-01 Thread Manuel Antonio Novoa Proenza
Hi Marcelo, Certainly I want to index HTML documents, but I would like to store the links they contain separately from the text, for example to see which links to external websites a page has. I reiterate that my English is very bad, so I use a translator; anyway, I then send you what I mean in

Re: Position Solr results

2012-04-01 Thread Manuel Antonio Novoa Proenza
hi Marcelo In that sense I think the score does not help. The score is a number that I determined at that position results generated are a given site. For example : I perform the following query : q = university Solr generates several results among which is that of a certain website. Does s

Re: reproducibility of query results

2012-04-01 Thread Ahmet Arslan
> i make a new index each iteration. if > I insert the same docs in the > same order, should I expect the same query results? Note > that I shut > down entirely after the adds, then in a new process run the > queries. By saying a new index, you mean you create an empty, new index? Yes, you shou

RE: reproducibility of query results

2012-04-01 Thread Steven A Rowe
If your results are only sorted by score, it's possible that some have exactly the same score. Unless you use a secondary sort, I don't think the order of returned results among same-scored hits is guaranteed. As a result, if you cut off hits at some fixed threshold, you could see different en
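
The tie-breaking point above can be illustrated outside Solr. The sketch below is plain Python with made-up document ids and scores; it only shows why adding a unique secondary key makes the order, and therefore any fixed cutoff, reproducible. In Solr the corresponding idea would be a secondary sort on a unique field, for example `sort=score desc,id asc`, assuming the schema has a unique `id` field.

```python
def rank(hits):
    """Order (doc_id, score) pairs by score descending, then doc_id ascending.

    The doc_id tie-breaker makes the result independent of input order.
    """
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    # Hypothetical hits: d1 and d3 share the same score.
    run_a = [("d3", 1.0), ("d1", 1.0), ("d2", 2.0)]
    run_b = [("d1", 1.0), ("d2", 2.0), ("d3", 1.0)]

    # Cutting off at two results gives the same documents in both runs.
    print(rank(run_a)[:2])
    print(rank(run_b)[:2])
```

Without the secondary key, which of the two equal-scored documents survives the cutoff would depend on the order in which they happened to arrive.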

Re: reproducibility of query results

2012-04-01 Thread Benson Margulies
On Sun, Apr 1, 2012 at 1:05 PM, Steven A Rowe wrote: > If your results are only sorted by score, it's possible that some have > exactly the same score. Unless you use a secondary sort, I don't think the > order of returned results among same-scored hits is guaranteed. As a > result, if you cut

Re: ExtractingRequestHandler

2012-04-01 Thread Erick Erickson
Yes, you can, but generally, storing the raw input in Solr is not the best approach. The problem here is that pretty soon you get a huge index that contains *everything*. Solr was not intended to be a data store. Besides, you then need to store the binary form of the file. Solr only deals with

RE: ExtractingRequestHandler

2012-04-01 Thread spring
Hi Erik, I think we have some misunderstanding. I want to index the text of the docs in Solr (only indexed, NOT stored). But I want the text (Tika output) back for: * later faster reindexing (some text extraction like OCR takes really long) * use the text for other processings The original doc

Re: 'foruns' don't match 'forum' with NGramFilterFactory (or EdgeNGramFilterFactory)

2012-04-01 Thread Erick Erickson
It's really hard to explain when there's not much background info. You haven't provided much to analyze. You might review: http://wiki.apache.org/solr/UsingMailingLists Best Erick 2012/4/1 Bráulio Bhavamitra : > Using edismax defType made the thing work, could anybody explain why? > > (any stemmi

Re: ExtractingRequestHandler

2012-04-01 Thread Erick Erickson
Ahhh, OK. Sure, anything you store in Solr you can get back. The key is not Tika, but your schema.xml file, and setting 'stored="true"' bq: So my question was if I can index the original doc via ExtractingRequestHandler in Solr AND get back the text output, in a single call. I know of no way to

Re: ExtractingRequestHandler

2012-04-01 Thread Bill Bell
I have had good luck with creating a separate core index for just data. This is a different core than the indexed core. Very fast. Bill Bell Sent from mobile On Apr 1, 2012, at 11:15 AM, Erick Erickson wrote: > Yes, you can. but Generally, storing the raw input in Solr is > not the best

Open deleted index file failing jboss shutdown with Too many open files Error

2012-04-01 Thread Gopal Patwa
I am using a Solr 4.0 nightly build with NRT, and I often get the error "Too many open files" during auto commit. I have searched this forum, and what I found is that it is related to the OS ulimit setting; please see my ulimit settings below. I am not sure what ulimit setting I should have for open files. ulimit -n

Re: Luke using shards

2012-04-01 Thread Lance Norskog
> (http://localhost:8983/solr/admin/luke?shards=localhost:8983/solr,localhost:7574/solr) If luke did distributed search, it would go into an infinite loop :) On Thu, Mar 29, 2012 at 5:03 AM, Dmitry Kan wrote: > One option to try here (not verified) is to set up a Solr front that will > point to

Re: Trouble handling Unit symbol

2012-04-01 Thread Rajani Maski
Thank you for the reply. On Sat, Mar 31, 2012 at 3:38 AM, Chris Hostetter wrote: > > : We have data having such symbols like : ? > : Indexed data has -Dose:"0 ?L" > : Now , when it is searched as - Dose:"0 ?L" >... > : Query Q value observed : S257:"0 ??L/injection" > > First o

Re: SolrCloud

2012-04-01 Thread asia
Then what exactly does SolrCloud do? When I fire a query, I get a response even without ZooKeeper. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-tp3867086p3876820.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud

2012-04-01 Thread asia
Thanks for replying. So if I make a replica of each shard, should I use ZooKeeper for every shard and replica, or only for the replicas? One more question: I am using Solr in a Tomcat and Eclipse environment with SolrJ, so I am a bit confused as to how to use ZooKeeper in it

Re: SolrCloud new....

2012-04-01 Thread asia
Hello, I am working on the same thing. I have tried the wiki example, but I am getting errors. I want to use ZooKeeper with SolrJ in Eclipse using Tomcat. I need a little help with how to integrate ZooKeeper in Eclipse for SolrCloud. -- View this message in context: http://lucene.472066.n3.nabble.co

Re: Responding to Requests with Chunks/Streaming

2012-04-01 Thread Mikhail Khludnev
Hello, Small update - reading streamed response is done via callback. No SolrDocumentList in memory. https://github.com/m-khl/solr-patches/tree/streaming here is the test https://github.com/m-khl/solr-patches/blob/d028d4fabe0c20cb23f16098637e2961e9e2366e/solr/core/src/test/org/apache/solr/response