Re: Custom Solr FunctionQuery Error

2011-12-28 Thread Parvin Gasimzade
Thank you for your answers. I have a Map and want to boost the score of those documents at search time. In my example I get that map inside ValueSource and boost the matched documents' score. If {!graph} is added to the query, it will return the boosted query; otherwise it will return regular l

Re: best practice to introducing singletons inside of Solr (IoC)

2011-12-28 Thread Mikhail Khludnev
Erick, Ok. Let me try with plain java one. Possibly I'll need more tight integration like injecting a core into the singleton, etc. But I don't know yet. Thanks for your efforts. On Wed, Dec 28, 2011 at 5:48 PM, Erick Erickson wrote: > I must be missing something here. Why would this be any dif

Re: Grouping results after Sorting or vice-versa

2011-12-28 Thread Vijayaragavan
Hi Juan, I'm using Solr 3.1 The type of the date field is long. Let's say, the documents indexed in Solr server be.. 1326c5cc09bbc99a_1 1326c5cc09bbc99a 1316078009000 <.. Some Other fields here ..> Some subject here... Some message here... 1321dff33cecd5f4_1 1321dff33cecd5f4 131495631

Re: solr keep old docs

2011-12-28 Thread Mikhail Khludnev
Alexander, I have two ideas how to implement fast dedupe externally, assuming your PKs don't fit to java.util.*Map: - your crawler can use inprocess RDBMS (Derby, H2) to track dupes; - if your crawler is stateless - it doesn't track PKs which has been already crawled, you can retrieve it
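A minimal sketch of the first idea above (tracking already-seen PKs in an in-process database), written in Python. Assumptions: sqlite3 stands in for the in-process RDBMS (the message names Derby and H2), the crawler is a single process, and the table name `seen` and the helpers `open_tracker` and `is_new` are hypothetical.

```python
import sqlite3

def open_tracker(path=":memory:"):
    """Open (or create) the store of primary keys that were already crawled."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen (pk TEXT PRIMARY KEY)")
    return conn

def is_new(conn, pk):
    """Record pk; return True the first time it is seen, False afterwards."""
    # INSERT OR IGNORE changes no row when the key already exists,
    # so rowcount tells us whether this pk is a duplicate.
    cur = conn.execute("INSERT OR IGNORE INTO seen (pk) VALUES (?)", (pk,))
    conn.commit()
    return cur.rowcount == 1
```

A crawler would call `is_new(conn, pk)` before sending a document to Solr and skip the add when it returns False, so the older copy in the index is left untouched.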

Re: High response time after being idle

2011-12-28 Thread Odey
It seems like my operating system was causing me trouble in some way. I couldn't find what was triggering this issue, but after migrating the whole project from WAMP to LAMP it has been resolved and everything is running smoothly again. Thank you very much for your help! Regards,

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
Yes I have been warned that query index each time before adding doc to index might be resource consuming. Will check it. As for the overwrite parameter I think the name is not the best then. People outside the "business" like me misuse it and assume what I wrote. Overwrite shall mean what it means

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
Unfortunately I have a lot of duplicates, and given that searching might suffer, I will try implementing an update processor. But your idea is interesting and I will consider it, thanks. Best Regards Alexander Aristov On 28 December 2011 19:12, Tanguy Moal wrote: > Hello Alexander, > > I don

Re: Solr Distributed Search vs Hadoop

2011-12-28 Thread Ted Dunning
This copying is a bit overstated here because of the way that small segments are merged into larger segments. Those larger segments are then copied much less often than the smaller ones. While you can wind up with lots of copying in certain extreme cases, it is quite rare. In particular, if you

Re: Custom Shingle Factory Filter Requirement

2011-12-28 Thread Vannia Rajan
On Tue, Dec 27, 2011 at 1:10 PM, Ahmet Arslan wrote: > > To achieve this behavior, you can use StandardTokenizerFactory and > EdgeNGramFilterFactory and LowerCaseFilterFactory at index time. > > > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory > Thanks, b

Re: Solr Distributed Search vs Hadoop

2011-12-28 Thread Lance Norskog
Here is an example of schema design: a PDF file of 5MB might have maybe 50k of actual text. The Solr ExtractingRequestHandler will find that text and only index that. If you set the field to stored=true, the 5MB will be saved. If stored=false, the PDF is not saved. Instead, you would store a link to

Re: Migration from Solr 1.4 to Solr 3.5

2011-12-28 Thread Lance Norskog
Yes, the 3.5 Solr is opening and reading the Solr 1.4 index. When you do a commit, it will rewrite the index in 3.5 format. Doing a complete copy of the configs from 1.4 to 3.5 is easy, but there are a lot of new features and changed defaults in the solrconfig.xml file. These make indexing faster,

Re: Sort facets by defined custom Collator

2011-12-28 Thread Chris Hostetter
: Subject: Sort facets by defined custom Collator deja-vu... http://www.lucidimagination.com/search/p:solr/s:email/l:user/sort:date?q=%22Facet+Ordering%22 -Hoss

Re: Facet Ordering

2011-12-28 Thread Chris Hostetter
: I've seen in the solr faceting overview that it is possible to sort : either by count or lexicographically, but is there a way to sort so : the lowest counts come back first? Peter Sturge looked into this a while back and provided a patch, but there were some issues with it that never got reso

Re: Facet Ordering

2011-12-28 Thread Jamie Johnson
I have a database where a user is searching for documents, and the things which I'm faceting on are tags. Tags boil down to things of interest, perhaps names, places, etc. The user in our case has asked for the ability to change the ordering so they can easily find things that appear very infrequ

Re: Facet Ordering

2011-12-28 Thread Koji Sekiguchi
(11/12/29 5:50), Jamie Johnson wrote: I've seen in the solr faceting overview that it is possible to sort either by count or lexicographically, but is there a way to sort so the lowest counts come back first? As far as I know, no. What is your use case? koji -- http://www.rondhuit.com/en/
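Since the replies in this thread suggest there is no built-in ascending count sort, one possible workaround is to re-sort the facet list on the client. The sketch below is Python and assumes the facet was requested with wt=json in the default flat layout (term, count, term, count, ...) and with facet.limit=-1 so that rare terms are not cut off; the helper name `facets_ascending` and the sample values are hypothetical.

```python
def facets_ascending(flat):
    """Turn Solr's flat [term, count, term, count, ...] list into
    (term, count) pairs sorted by count ascending, then by term."""
    pairs = list(zip(flat[0::2], flat[1::2]))
    return sorted(pairs, key=lambda p: (p[1], p[0]))

# Hypothetical value of facet_counts -> facet_fields -> "tags" in a JSON response.
sample = ["paris", 42, "oslo", 3, "lima", 3, "rome", 17]
print(facets_ascending(sample))
```

Fetching every term with facet.limit=-1 can be expensive on fields with many distinct values, so this is only reasonable when the tag vocabulary is fairly small.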

Re: High response time after being idle

2011-12-28 Thread Chris Hostetter
: Is it possible that the system is running out of RAM, and swapping, : or is aggressively swapping for some reason? it doesn't have to be the solr /tomcat process memory getting swapped out -- but that's certainly possible -- it could also be that the filesystem cache is expunging the disk pag

Re: High response time after being idle

2011-12-28 Thread Otis Gospodnetic
Right, I think that's what's happening here. Google "swappiness" if you are on Linux. Alternatively, one could add something to prevent the OS from swapping out Solr's process. Here is how ElasticSearch does it, for example: https://github.com/elasticsearch/elasticsearch/issues/464 Otis

Re: Custom Solr FunctionQuery Error

2011-12-28 Thread Yonik Seeley
On Wed, Dec 28, 2011 at 2:16 AM, Parvin Gasimzade wrote: > I have created custom Solr FunctionQuery in Solr 3.4. > I extended ValueSourceParser, ValueSource, Query and QParserPlugin classes. Note that you only need a QParserPlugin implementation for top level query types, not function queries. Wi

Re: Custom Solr FunctionQuery Error

2011-12-28 Thread Juan Grande
Hi Parvin, You must also add the query parser definition to solrconfig.xml, for example: *Juan* On Wed, Dec 28, 2011 at 4:16 AM, Parvin Gasimzade < parvin.gasimz...@gmail.com> wrote: > Hi all, > > I have created custom Solr FunctionQuery in Solr 3.4. > I extended ValueSourceParser, ValueSou

Re: High response time after being idle

2011-12-28 Thread Erick Erickson
What else, if anything, do you have running on the server? Because it's possible that pages are being swapped out for other processes to use. Solr itself shouldn't, as far as I know, time out anything so I expect you're running into issues with the op system. Best Erick On Wed, Dec 28, 2011 at 1

Re: Grouping results after Sorting or vice-versa

2011-12-28 Thread Juan Grande
Hi, I don't have an answer, but maybe I can help you if you provide more information, for example: - Which Solr version are you running? - Which is the type of the date field? - The output you are getting - The output you expect - Any other information that you consider relevant. Thanks, *Juan*

Facet Ordering

2011-12-28 Thread Jamie Johnson
I've seen in the solr faceting overview that it is possible to sort either by count or lexicographically, but is there a way to sort so the lowest counts come back first?

Re: edismax doesn't obey 'pf' parameter

2011-12-28 Thread Chris Hostetter
: Of course. What I meant to say was there is : always exactly one token in a non-tokenized : field and it's offset is always exactly 0. There : will never be tokens at position 1. : : So asking to match phrases, which is based on : term positions is basically a no-op. That's not always true. c

Re: FTP mount crash when crawling with solrj

2011-12-28 Thread Chris Hostetter
: I have a lots of files in my FTP account,and i use the curlftpfs to mount : them to folder and then start index them with solrj api, but after a minutes : pass something strange happen and the mounted folder is not accessible and : crash,also i can not unmount it and the message "device is in us

Re: Solr-3.5.0/Nutch-1.4 - SolrDeleteDuplicates fails

2011-12-28 Thread Chris Hostetter
: Exception in thread "main" java.io.IOException: Job failed! : : at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252) : : at : org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373) : : at : org.apache.nutch.indexer.solr.SolrDeleteDup

Re: XPathEntityProcessor and ExtractingRequestHandler

2011-12-28 Thread Chris Hostetter
: Can I use a XPathEntityProcessor in conjunction with an : ExtractingRequestHandler? Also, the scripting language that : XPathEntityProcessor uses/supports, is that just ECMA/JavaScript? : : Or is XPathEntityProcessor only supported for use in conjuntion with the : DataImportHandler? The Entit

Re: LineEntityProcessor

2011-12-28 Thread Chris Hostetter
You really haven't posted enough details for people to guess as to what your problem might be (in particular: the actual examples of your configs, and any log messages during the import). Please consult this wiki page and then post a followup with more details... https://wiki.apache.org

Re: solr keep old docs

2011-12-28 Thread Chris Hostetter
: That said, writing your own update request handler : that detected this case isn't very difficult, : extend UpdateRequestProcessorFactory/UpdateRequestProcessor : and use it as a plugin. i can't find the thread at the moment, but the general issue that has caused people headaches with this typ

Re: High response time after being idle

2011-12-28 Thread Gora Mohanty
On Wed, Dec 28, 2011 at 8:52 PM, Odey wrote: > Hello, > > I'm running Solr 3.5 on a XAMPP/Tomcat environment. It's working pretty good > for just one exception: when Solr remains idle without handling any requests > for about 5-10 mins the first request sent again will be delayed for a few > secon

High response time after being idle

2011-12-28 Thread Odey
Hello, I'm running Solr 3.5 on a XAMPP/Tomcat environment. It's working pretty good for just one exception: when Solr remains idle without handling any requests for about 5-10 mins the first request sent again will be delayed for a few seconds. Subsequent requests are lightning-fast as usual. So i

Re: solr keep old docs

2011-12-28 Thread Tanguy Moal
Hello Alexander, I don't know much about your requirements in terms of size and performances, but I've had a similar use case and found a pretty simple workaround. If your duplicate rate is not too high, you can have the SignatureProcessor to generate fingerprint of documents (you already did

Re: Poor performance on distributed search

2011-12-28 Thread Yonik Seeley
On Wed, Dec 28, 2011 at 5:47 AM, ku3ia wrote: > So, based on p.2) and on my previous researches, I conclude, that the more > documents I want to retrieve, the slower is search and main problem is the > cycle in writeDocs method. Am I right? Can you advice something in this > situation? For the fi

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
Thanks Eric, it sets me direction. I will be writing new plugin and will get back to the dev forum with results and then we will decide next steps. Best Regards Alexander Aristov On 28 December 2011 18:08, Erick Erickson wrote: > Well, the short answer is that nobody else has > 1> had a simil

Re: Problems while searching in default field

2011-12-28 Thread Erick Erickson
Right, you were misled by the discussion for that patch; the option you specified was NOT how the patch was eventually implemented. Try reading this page instead: http://wiki.apache.org/solr/MultitermQueryAnalysis The short form is that with 3.6 (i.e. 3.x at this point) you may not have to do

Re: solr keep old docs

2011-12-28 Thread Erick Erickson
Well, the short answer is that nobody else has 1> had a similar requirement AND 2> not found a suitable work around AND 3> implemented the change and contributed it back. So, if you'd like to volunteer. Seriously. If you think this would be valuable and are willing to work on it, hop on over

Re: How can I check if a more complex query condition matched?

2011-12-28 Thread Erick Erickson
There's no easy/efficient way that I know of to do this. Perhaps a good question is what value-add this is going to make for your app and is there a better way to convey this information. For instance, would highlighting convey "enough" information to your user? You're right that you don't want to

Re: best practice to introducing singletons inside of Solr (IoC)

2011-12-28 Thread Erick Erickson
I must be missing something here. Why would this be any different from any other singleton? I just did a little experiment where I implemented the classic singleton pattern in a RequestHandler and accessed from a Filter (both plugins) with no problem at all, just the usual blah var = MySingleton.ge

Re: Indexing problem

2011-12-28 Thread Martin Koch
Could it be a commit you're needing? curl 'localhost:8983/solr/update?commit=true' /Martin On Wed, Dec 28, 2011 at 11:47 AM, mumairshamsi wrote: > http://lucene.472066.n3.nabble.com/file/n3616191/02.xml 02.xml > > i am trying to index this file for this i am using this command > > java
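For reference, the same commit request as the curl command above can be issued from a script. This is a minimal Python sketch; the base URL is the default from the message, and the helper names `commit_url` and `commit` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def commit_url(base="http://localhost:8983/solr"):
    """Build the update URL that asks Solr to commit pending documents."""
    return base + "/update?" + urlencode({"commit": "true"})

def commit(base="http://localhost:8983/solr", timeout=10):
    """Send the commit request and return the HTTP status code."""
    with urlopen(commit_url(base), timeout=timeout) as resp:
        return resp.status
```

Calling `commit()` after running post.jar makes newly added documents visible to searches, provided a Solr instance is actually listening at the assumed address.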

Re: Indexing problem

2011-12-28 Thread Ahmet Arslan
> http://lucene.472066.n3.nabble.com/file/n3616191/02.xml > 02.xml > > i am trying to index this file for this i am using this > command > > java -jar post.jar *.xml > > commands run fine but when i search not result is > displaying > > I think it is encoding problem can any one help

Re: Migration from Solr 1.4 to Solr 3.5

2011-12-28 Thread Bhavnik Gajjar
Thanks community! That helps! To check practically, I have now setup Solr 3.5 in test environment. Few observations on that, 1. I simply copy-pasted one of the Solr 1.4 instance on Solr 3.5 setup (after correcting schema.config and solr.config files based on what is suited for 3.5). If

Re: Problems while searching in default field

2011-12-28 Thread mechravi25
Hi, Thanks a lot guys. I tried the following options 1.) Downloaded the solr 3.5.0 version and updated the schema.xml file with the sample fields i have. I then tried to set the property "ignoreCaseForWildcards=true" for a field type as mentioned in the url given for the patch-2438, but got the

Indexing problem

2011-12-28 Thread mumairshamsi
http://lucene.472066.n3.nabble.com/file/n3616191/02.xml 02.xml I am trying to index this file, and for this I am using this command: java -jar post.jar *.xml The command runs fine, but when I search no results are displayed. I think it is an encoding problem. Can anyone help?

Grouping results after Sorting or vice-versa

2011-12-28 Thread vijayrs
The issue i'm facing is... I didn't get the expected results when i combine "group" param and "sort" param. The query is... http://localhost:8080/solr/core1/select/?qt=nutch&q=*:*&fq=userid:333&group=true&group.field=threadid&group.sort=date%20desc&sort=date%20desc where "threadid" is a hexadeci

Re: solr keep old docs

2011-12-28 Thread Alexander Aristov
the problem with dedupe (SignatureUpdateProcessor ) is that it REPLACES old docs. I have tried it already. Best Regards Alexander Aristov On 28 December 2011 13:04, Lance Norskog wrote: > The SignatureUpdateProcessor is for exactly this problem: > > > http://www.lucidimagination.com/search/lin

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread meghana
Thanks iorixxx and Koji for your reply. So can I fulfill my requirement by using hl.regex.pattern and setting hl.fragmenter=regex? I was looking at these fields on the wiki, and I am thinking of using them to show my highlighted text in my desired format. my string is like below 1s: This is v

RE: Poor performance on distributed search

2011-12-28 Thread ku3ia
Hi all. Due to my code review, I discovered next things: 1) as I wrote before, seems there is a low disk read speed; 2) at ~/solr-3.5/solr/core/src/java/org/apache/solr/response/XMLWriter.java and in the same classes there is a writeDocList => writeDocs method, which contains a cycle for of all doc

Re: How can I check if a more complex query condition matched?

2011-12-28 Thread Max
Thanks for your reply, I thought about using the debug mode, too, but the information is not easy to parse and doesnt contain everything I want. Furthermore I dont want to enable debug mode in production. Is there anything else I could try? On Tue, Dec 27, 2011 at 12:48 PM, Ahmet Arslan wrote: >

Was:Re: hl.boundaryScanner and hl.bs.chars [off topic]

2011-12-28 Thread Tanguy Moal
Dear list, I'd like to bounce on that issue... IMHO, configuration parsing could be a little bit stricter... At least, what stands for a "severe" configuration error could be user-defined. Let me give some examples that are common errors and that don't trigger the "abortOnConfigurationError"

Re: How to run the solr dedup for the document which match 80% or match almost.

2011-12-28 Thread Lance Norskog
You would have to implement this yourself in your indexing code. Solr has an analysis plugin which does the analysis for your text and then returns the result, but does not query or index. You can use this to calculate the fuzzy hash, then search against index. You might be able to code this in an

Re: solr keep old docs

2011-12-28 Thread Lance Norskog
The SignatureUpdateProcessor is for exactly this problem: http://www.lucidimagination.com/search/link?url=http://wiki.apache.org/solr/Deduplication On Tue, Dec 27, 2011 at 10:42 PM, Alexander Aristov wrote: > I get docs from external sources and the only place I keep them is solr > index. I have

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread Koji Sekiguchi
(11/12/28 17:08), Ahmet Arslan wrote: FastVectorHighlighter requires Solr 3.1 http://wiki.apache.org/solr/HighlightingParameters#hl.useFastVectorHighlighter Right. In addition, boundaryScanner requires 3.5. koji -- http://www.rondhuit.com/en/

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread Ahmet Arslan
> I tried by adding BoundaryScanner in my > solrconfig.xml  and set > hl.useFastVectorHighlighter=true, termVectors=on, > termPositions=on and > termOffsets=on. in my query. then also i didn't get any > effect on my > highlighting. > do i missing anything , or doing anything wrong?? > i like to

Re: hl.boundaryScanner and hl.bs.chars

2011-12-28 Thread meghana
Hi Koji, thanks for the reply. I tried adding a BoundaryScanner to my solrconfig.xml and set hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and termOffsets=on in my query, but I still didn't get any effect on my highlighting. my solr config setting is as below

Re: Solr - Mutivalue field search on different elements

2011-12-28 Thread meghana
Hi Koji, thanks for the reply. I tried adding a BoundaryScanner to my solrconfig.xml and set hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and termOffsets=on in my query, but I still didn't get any effect on my highlighting. my solr config setting is as below

Re: Solr - Mutivalue field search on different elements

2011-12-28 Thread Ahmet Arslan
> i can't delete 1s ,2s ...etc from my > field value , i have to keep text in > this format... so i'll apply slop in my search to do my > needed search done. It is OK if you cant delete 1s, 2s, etc from field value. We can eat up those special markups in analysis chain. PatternReplaceCharFil
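To illustrate the kind of pattern replacement described above, here is a small Python sketch of stripping the markers before text is analyzed. It is plain Python, not Solr configuration, and it assumes the markers look like digits followed by "s" and an optional colon ("1s:", "2s"), which is a guess based on the examples in this thread; the sample sentence is hypothetical.

```python
import re

# Assumed marker shape: digits, then "s", then an optional colon ("1s:", "2s").
MARKER = re.compile(r"\b\d+s:?\s*")

def strip_markers(text):
    """Remove the time-style markers so only the spoken text remains."""
    return MARKER.sub("", text).strip()

print(strip_markers("1s: This is first. 2s: This is second."))
```

Note that this pattern would also remove ordinary tokens such as "30s" inside the text, so a real pattern would need to be tightened to the exact markup in use.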