is there any attribute in schema.xml to avoid duplication in solr?

2011-10-04 Thread nagarjuna
Hi everybody, I want to know whether there is any attribute in schema.xml to avoid duplication. Please reply. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/is-there-any-attribute-in-schema-xml-to-avoid-duplication-in-solr-tp3392408p3392408.html Sent from

Re: UniqueKey filed length exceeds

2011-10-04 Thread kiran.bodigam
Thanks for your reply, Erick. (My use case: -MM-DD 13:54:11.414632 needs to be the unique key.) When I try to search the data with http://localhost:8080/solr/select/?q=2009-11-04:13:51:07.348184 it throws the following error; even though I changed my schema to textfield, I am getting the following

Re: Documents Indexed, SolrJ see nothing before long time

2011-10-04 Thread darul
Thank you Christopher, I have found one issue in my code when building a query, though I do not know why it is not working. When I comment out this line, I get the right result count: // solrQuery.setParam("fq", "+creation_date:[* TO NOW] +type:QUESTION"); where creation_date is a Date field and type o

how to avoid duplicates in search results?

2011-10-04 Thread nagarjuna
Hi everybody, I got the following response: 0 0 groups on 0 participate 2.2 30 testing group testing group http://abc.xyz.com/groups/testing-group/discussions/62 testing group testing group http://abc.xyz.com/groups/testing-

Re: Documents Indexed, SolrJ see nothing before long time

2011-10-04 Thread darul
Well, I guess it is stupid to apply a +creation_date:[* TO NOW] filter.

Re: how to avoid duplicates in search results?

2011-10-04 Thread Edoardo Tosca
You can probably use the Grouping feature: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters There is also a Document Duplicate Detection at index time: http://wiki.apache.org/solr/Deduplication On Tue, Oct 4, 2011 at 9:55 AM, nagarjuna wrote: > Hi everybody > i got the followi

Re: SolrJ Annotation for multiValued field

2011-10-04 Thread darul
Well, another mistake on my part: it works. Sorry ;)

Re: email - DIH

2011-10-04 Thread jb
Nobody to help? I tried telnet to get information about the emails. Via telnet with IMAP I can get any required field. Is this an implementation issue?

Re: Query failing because of omitTermFreqAndPositions

2011-10-04 Thread Michael McCandless
This is because, within one segment only 1 value (omitP or not) is possible, for all the docs in that segment. This then means, on merging segments with different values for omitP, Lucene must "reconcile" the different values, and that reconciliation will favor omitting positions (if it went the o

Re: UniqueKey filed length exceeds

2011-10-04 Thread Jamie Johnson
It looks like your query is getting parsed as a field and a value (field: 2009-11-04, value: 13:51:07.348184). If you'd like to make a query like this you need to escape the ':', so something like 2009-11-04\:13\:51\:07.348184. See the following link for more information: http://lucene.apache.org/java/2
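A minimal Python sketch of the escaping suggested above. The helper name escape_colons is hypothetical and not part of Solr or SolrJ; it handles only ':', the one character at issue in this thread, and leaves other query-syntax characters alone.

```python
def escape_colons(term: str) -> str:
    """Backslash-escape every ':' so the query parser does not treat it as a field separator."""
    return term.replace(":", "\\:")


# The timestamp-style key from this thread, escaped for use as a query term.
print(escape_colons("2009-11-04:13:51:07.348184"))
```

The escaped string would then still need URL encoding before being placed in the q parameter of a request.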

Re: is there any attribute in schema.xml to avoid duplication in solr?

2011-10-04 Thread Tom Gullo
uniqueKey avoids entries with the same id.

How to achieve Indexing @ 270GiB/hr

2011-10-04 Thread Pranav Prakash
Greetings, While going through the article 265% indexing speedup with Lucene's concurrent flushing I was stunned by the endless possibilities in which Indexing speed could be increased. I'd like to take inputs from eve

RE: solr searching for special characters?

2011-10-04 Thread Ahmet Arslan
> how it is possible also explain me and which tokenizer > class can support for > finding the special characters . Probably WhitespaceTokenizer will do the job for you. Plus you need to escape special characters (if you are using the defType=lucene query parser). Anyhow you need to provide us mor

Re: Deploy Solritas as a separate application?

2011-10-04 Thread Erik Hatcher
On Oct 3, 2011, at 23:32 , jwang wrote: > Solritas is a nice search UI integrated with Solr with many features we could > use. However we do not want to build our UI into our Solr instance. We will > have a front-end web app interfacing with Solr. Is there an easy way to > deploy Solritas as a se

Re: SOLR HttpCache Qtime

2011-10-04 Thread Lord Khan Han
We are using this Qtime field and publishing it on our front-end web site. Even though the httpCache decreases the query time in reality, the response still carries the old cached Qtime value. We could use our own internal timing instead of Solr's, but I just wonder whether there is any way to tell Solr, when the response is coming from the httpCache, to re-calculate the Qt

Re: Shingle and Query Performance

2011-10-04 Thread Lord Khan Han
We figured out that if we use only the shingle field, not combined with output unigrams, then performance gets better. If we use output unigrams, it is not good compared with the normal index field. So we decided to make a separate field containing only the combined shingles and use this field to support the main queries. On Wed, Aug 31, 201

Analyzer Tokenizer for Exact and Contains search on single field

2011-10-04 Thread Satish Talim
I am a Solr newbie. Let's say we have a field with 4 records as follows: "James" "James Edward" "James Edward Gray" "JamesEdward" a. In Solr 3.4, I want an exact search on the given field for "James Edward". Record 2 should be returned. b. Next on the same field, I want to check whether "James"

Re: Determining master/slave from ZK in SolrCloud

2011-10-04 Thread Jamie Johnson
Ok, so I am pretty sure this information is not available. What is the most appropriate way to add information like this to ZK? I can obviously look for the system properties enable.master and enable.slave, but that won't be fool proof since someone could put this in the config file instead and n

Re: SOLR HttpCache Qtime

2011-10-04 Thread Erick Erickson
But if the HTTP cache is what's returning the value, Solr never sees anything at all, right? So Solr doesn't have a chance to do anything here. Best Erick On Tue, Oct 4, 2011 at 9:24 AM, Lord Khan Han wrote: > We are using this Qtime field and publishing in our front web. Even the > httpCache de

Indexing PDF

2011-10-04 Thread Héctor Trujillo
Hi all, I'm indexing pdf's files with SolrJ, and most of them work. But with some files I’ve got problems because they stored estrange characters. I got stored this content: +++ Starting a Search Application 

Re: Indexing PDF

2011-10-04 Thread Paul Libbrecht
Full of boxes for me. Héctor, you need another way to reference these! (e.g. a URL) paul On Oct 4, 2011, at 16:49, Héctor Trujillo wrote: > Hi all, I'm indexing pdf's files with SolrJ, and most of them work. But with > some files I’ve got problems because they stored estrange characters. I got

Re: Determining master/slave from ZK in SolrCloud

2011-10-04 Thread Jamie Johnson
I'm putting this out there for comment. Right now I'm in ZKControllers and changed register as follows: public void register(SolrCore core, boolean forcePropsUpdate) throws IOException, and at line 479 I've added this SolrRequestHandler requestHandler = core.getRequestHandler("/replicatio

Re: Determining master/slave from ZK in SolrCloud

2011-10-04 Thread Mark Miller
Because the distributed indexing phase of SolrCloud will not use replication, we have not really gone down this path at all. One thing we are considering is adding the ability to add various roles to each shard as hints - eg a shard might be designated a searcher and another an indexer. You mi

Re: Determining master/slave from ZK in SolrCloud

2011-10-04 Thread Jamie Johnson
Thanks for the reply Mark. So a couple of questions. When is distributed indexing going to be available on Trunk? Are there docs on it now? I think having Roles on the shard would scratch the itch here, because as you said I could then include a role which indicated what to do with this server.

Re: Determining master/slave from ZK in SolrCloud

2011-10-04 Thread Jamie Johnson
also as an FYI I created this JIRA https://issues.apache.org/jira/browse/SOLR-2811 which perhaps should be removed if the roles option comes to life. Is there a JIRA on that now? On Tue, Oct 4, 2011 at 12:12 PM, Jamie Johnson wrote: > Thanks for the reply Mark.  So a couple of questions. > > W

Re: sorting using function query results are notin order

2011-10-04 Thread abhayd
Any help?

RE: Analyzer Tokenizer for Exact and Contains search on single field

2011-10-04 Thread Steven A Rowe
Hi Satish, I don't think there is a single analyzer that does what you want. However, you could send the info to a second field with copyField, and use e.g. WhitespaceTokenizer on one field for contains-style queries, and KeywordTokenizer on the other field (or just use the "string" field type)
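To illustrate why two differently analyzed fields cover both cases, here is a small Python simulation using the four records from the original question. It is only a sketch: the two functions imitate whitespace-style and keyword-style tokenization in plain Python and are not Solr's analyzers, and all function names are hypothetical.

```python
records = ["James", "James Edward", "James Edward Gray", "JamesEdward"]


def keyword_tokens(value):
    # Keyword-style analysis: the whole field value is kept as a single token.
    return [value]


def whitespace_tokens(value):
    # Whitespace-style analysis: the value is split on runs of whitespace.
    return value.split()


def exact_match(query, docs):
    # Matches only documents whose single keyword token equals the query.
    return [d for d in docs if query in keyword_tokens(d)]


def contains_word(word, docs):
    # Matches documents that have the word as one of their whitespace tokens.
    return [d for d in docs if word in whitespace_tokens(d)]


print(exact_match("James Edward", records))
print(contains_word("James", records))
```

In this simulation the exact search returns only the second record, while the contains-style search returns the first three; "JamesEdward" is a single token and does not match "James".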

Re: Determining master/slave from ZK in SolrCloud

2011-10-04 Thread Jamie Johnson
So I see this JIRA which references roles https://issues.apache.org/jira/browse/SOLR-2765 I'm looking at implementing what Yonik suggested, namely in solrconfig.xml I have something like these will be pulled out and added to the cloudDescriptor so they can be saved in ZK. Seem reasonable?

Suggestions feature

2011-10-04 Thread Milan Dobrota
I am working on a feature similar to Youtube suggestions (where the videos are suggested based on your viewing history). What I do is parse the history and get the user's interests, in the form of weighted topics. When I boost according to those interests, the dominant ones take over the result lis

Re: sorting using function query results are notin order

2011-10-04 Thread Yonik Seeley
Hmmm, try adding fl={!func}Count to make sure Count is an indexed field and function queries are getting the right values. -Yonik http://www.lucene-eurocon.com - The Lucene/Solr User Conference On Mon, Oct 3, 2011 at 3:42 PM, abhayd wrote: > hi > I am trying to sort results from solr using sum

Re: Determining master/slave from ZK in SolrCloud

2011-10-04 Thread Jamie Johnson
so my initial test worked, this appeared in ZK now roles=searcher,indexer which I can use to tell if it should be used to write to or not. It had fewer changes to other files as well I needed to change CloudDescriptor (add roles variable/methods) CoreContainer (parse roles attribute) ZKCo

Solr Schema and how?

2011-10-04 Thread caman
Hello all, we have a screen builder application where users design their own forms. They can create form fields of type date, text, number, large text, etc., up to a total of 500 fields per screen. Once screens are designed, the system automatically handles the type checking for val

Search on content_type

2011-10-04 Thread ahmad ajiloo
Hi, I'm using Nutch for crawling. I indexed my data using the index-more plugin and added my required fields (like content_type) to schema.xml in Solr. Now how can I search only PDF files (one of the content types) using this new index? What query should I enter to search PDF files in Solr?

Re: Indexing PDF

2011-10-04 Thread ahmad ajiloo
I have this problem too, when indexing some Persian PDF files. 2011/10/4 Héctor Trujillo > Hi all, I'm indexing pdf's files with SolrJ, and most of them work. But > with > some files I’ve got problems because they stored estrange characters. I got > stored this content: > +++ > > Starting a

Re: SOLR HttpCache Qtime

2011-10-04 Thread Lord Khan Han
I just want to be sure, because it is Solr's internal HTTP cache, not an outside HTTP cache. On Tue, Oct 4, 2011 at 5:39 PM, Erick Erickson wrote: > But if the HTTP cache is what's returning the value, > Solr never sees anything at all, right? So Solr > doesn't have a chance to do anything here. >

Case Insensitive Sting

2011-10-04 Thread Strokin, Eugene
Hello, I know that this topic was already discussed, but I want to make sure I understood it right. I need to have a field for the email of a user. I should be able to find documents by this field, and it should be an exact match, and case insensitive. Based on what I've found from previous discussi

Re: Indexing PDF

2011-10-04 Thread Robert Muir
Your persian pdf problem is different, and already taken care of in pdfbox trunk https://issues.apache.org/jira/browse/PDFBOX-1127 On Tue, Oct 4, 2011 at 2:04 PM, ahmad ajiloo wrote: > I have this problem too, in indexing some of persian pdf files. > > 2011/10/4 Héctor Trujillo > >> Hi all, I'm

http request works, but wget same URL fails

2011-10-04 Thread Fred Zimmerman
This http request works as desired (bringing back a csv file) http://zimzazsearch3-1.bitnamiapp.com:8983/solr/select?indent=on&version=2.2&q=battleship&wt=csv&; but the same URL submitted via wget produces the 500 error reproduced below. I want the wget to download the csv file. What's going on

Re: Case Insensitive Sting

2011-10-04 Thread Ahmet Arslan
> Hello, I know that this topic was > already discussed, but I want to make sure I understood it > right. > I need to have a field for email of a user. I should be > able to find a document(s) by this field, and it should be > exact match, and case insensitive. > Based on that I've found from previ

"Private" text fields

2011-10-04 Thread Kristian Rickert
I'm trying to find a way to query a "private" field in solr while using the text fields. So I want to allow private tags only searchable by an assigned owner. The private tags will also query along side regular keyword tags. Here's an example: Company A (identified by idA) searches and finds com

RE: Suggestions feature

2011-10-04 Thread Steven A Rowe
Hi Milan, I have three ideas: 1. Boost by log(weight) instead of just by weight. This would reduce weight-to-weight ratios and so reduce the likelihood of hit list domination, while still retaining the user's relative preferences. Multiple log applications will further decrease the weight-to
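A small Python sketch of the first idea, showing how taking the logarithm shrinks the ratio between two weights while keeping their order. The topic names and weights are made-up illustration values, and damped_boosts is a hypothetical helper, not a Solr function.

```python
import math


def damped_boosts(weights):
    """Map each topic weight w to log(w). Weights must be > 1 for a positive boost."""
    return {topic: math.log(w) for topic, w in weights.items()}


# Hypothetical interest weights parsed from a user's history.
interests = {"music": 100.0, "cooking": 10.0}
boosts = damped_boosts(interests)

raw_ratio = interests["music"] / interests["cooking"]
damped_ratio = boosts["music"] / boosts["cooking"]
print(raw_ratio, damped_ratio)
```

With these numbers the dominant topic goes from ten times the weight of the other to roughly twice the weight, so it is less likely to take over the whole result list.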

Re: http request works, but wget same URL fails

2011-10-04 Thread Fred Zimmerman
got it. curl "http://zimzazsearch3-1.bitnamiapp.com:8983/solr/select/?indent=on&q=video&fl=name,id&wt=csv" works like a champ. On Tue, Oct 4, 2011 at 15:35, Fred Zimmerman wrote: > This http request works as desired (bringing back a csv file) > > > http://zimzazsearch3-1.bitnamiapp.com:89

Re: Remove results limit

2011-10-04 Thread Tomás Fernández Löbbe
Hi Andrew, I think this question belongs to the users list more than to the dev's list. Programmatically, it depends on the client library you are using, if you are using SolrJ, it should be something like: SolrQuery query = new SolrQuery(); ... query.setRows(20); query.setStart(40); You can als
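The same paging can also be expressed as plain request parameters. A sketch in Python, assuming the standard start and rows request parameters; page_params is a hypothetical helper, and the match-all query is only a placeholder.

```python
from urllib.parse import urlencode


def page_params(page, rows):
    """Return start/rows parameters for a zero-based page number."""
    return {"start": page * rows, "rows": rows}


# Third page of 20 results: the same values as setRows(20) and setStart(40).
params = {"q": "*:*", **page_params(page=2, rows=20)}
print(urlencode(params))
```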

Hierarchical faceting with Date

2011-10-04 Thread Ravi Bulusu
Hi, I'm trying to perform a hierarchical (pivot) faceted search and it doesn't work with date (as one of the fields). My questions are 1. Is this a supported feature or just a bug that needs to be addressed? 2. If it is not intended to be supported, what is the complexity involved in implementing

Re: SOLR error with custom FacetComponent

2011-10-04 Thread Ravi Bulusu
Thanks for your response. I could solve my use case with your suggestion. -Ravi Bulusu On Sat, Sep 24, 2011 at 1:51 PM, Ravi Bulusu wrote: > Erik, > > Unfortunately the facet fields are not static. The field are dynamic SOLR > fields and are generated by different applications. > The field name

Re: UniqueKey filed length exceeds

2011-10-04 Thread Chris Hostetter
: if you'd like to make a query like this you need to escape the : so : something like or use the "term" QParser, which was created for the explicit purpose of never needing to worry about escaping terms in your index... q={!term f=id}2009-11-04:13:51:07.348184 -Hoss
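A short Python sketch of building such a request URL so that the local-params prefix and the colons survive URL encoding. The base URL is taken from the earlier message in this thread and is an assumption about the setup; term_query_url is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed host, port and handler path, copied from the earlier message in this thread.
BASE = "http://localhost:8080/solr/select"


def term_query_url(field, value, base=BASE):
    """Build a select URL whose q parameter uses the term query parser on one field."""
    q = "{!term f=%s}%s" % (field, value)
    return base + "?" + urlencode({"q": q})


url = term_query_url("id", "2009-11-04:13:51:07.348184")
print(url)

# Decoding the query string again gives back the unescaped q value.
decoded_q = parse_qs(urlsplit(url).query)["q"][0]
print(decoded_q)
```

No backslash escaping is applied to the value here, which is the point of the term parser as described above; only the usual URL encoding is needed.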

Re: is there any attribute in schema.xml to avoid duplication in solr?

2011-10-04 Thread Chris Hostetter
: i want to know whether is there any attribute in schema.xml to avoid : the duplications? You need to explain your problem better ... "duplications" can mean different things to different people. (duplicate documents? duplicate terms in a field? etc...) Please provide a detailed desc

composite Unique Keys?

2011-10-04 Thread Jason Toy
I have several different document types that I store. I use a serialized integer that is unique to the document type. If I use id as the uniqueKey, then there is a possibility to have colliding docs on the id, what would be the best way to have a unique id given I am storing my unique identifier
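The excerpt does not show an answer to this question. One common approach, offered here only as an assumption and not as something confirmed in the thread, is to build the uniqueKey value on the client by combining the document type with the per-type integer. A Python sketch with a hypothetical helper and made-up type names:

```python
def composite_key(doc_type, type_local_id):
    """Combine a document type and its per-type integer id into one unique string."""
    return f"{doc_type}-{int(type_local_id)}"


# Two documents of different types that share the same per-type id no longer collide.
print(composite_key("question", 42))
print(composite_key("answer", 42))
```

The original integer and the type can still be stored in their own fields if they are needed for filtering.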

Re: sorting using function query results are notin order

2011-10-04 Thread abhayd
That was it: Count was not indexed. Works fine now.

Re: sorting using function query results are notin order

2011-10-04 Thread Chris Hostetter
: Subject: Re: sorting using function query results are notin order : : that was it..COunt was not indexed...works fine now Hmmm... which version of solr are you using? what exactly is the FieldType of your "Count" field? Since Solr 3.1, attempting to use a function on a non-indexed field *s

Re: how to avoid duplicates in search results?

2011-10-04 Thread Chris Hostetter
: There is also a Document Duplicate Detection at index time: : http://wiki.apache.org/solr/Deduplication Or just setting "url" as your uniqueKey field would solve this simple use case, but it's not entirely clear what else you consider "duplicates" besides this one example. : > - : > testin

StreamingUpdateSolrServer and commitWithin

2011-10-04 Thread Leonardo Souza
Hi, I'm confused about using StreamingUpdateSolrServer and the commitWithin parameter in conjunction with waitSearcher and waitFlush. Does a request like this make sense? UpdateRequest updateRequest = new UpdateRequest(); updateRequest.setCommitWithin(12); updateRequest.setWaitSearcher(false);

Re: solr 1.4 facet.limit behaviour in merging from several shards

2011-10-04 Thread Chris Hostetter
: OK, if SOLR-2403 being related to the bug I described, has been fixed in : SOLR 3.4 than we are safe, since we are in the process of migration. Is it : possible to verify this somehow? Is FacetComponent class is the one I should : start checking this from? Can you give any other pointers? Accor

Re: indexing FTP documet with solrj

2011-10-04 Thread Chris Hostetter
: I want to index some document with solrj API's but the URL of theses : documents is FTP, : How to set username and password for FTP acount in solrj : : in solrj API there is CommonsHttpSolrServer method but i do not find any : method for FTP configuration it sounds like you are getting ocn

Re: Any plans to support function queries on score?

2011-10-04 Thread Chris Hostetter
: Do you have any plans to support function queries on score field? for : example, sort=floor(product(score, 100)+0.5) desc? You most certainly can compute function queries on the "score" of a query, but you have to be explicit about which query you want to use the score of. You seem to alr

Re: SOLR HttpCache Qtime

2011-10-04 Thread Erick Erickson
Still doesn't make sense to me. There is no Solr HTTP cache that I know of. There is a queryResultCache. There is a filterCache. There is a documentCache. There may even be custom cache implementations. There's a fieldValueCache. There's no http cache internal to Solr as far as I can tell. If yo

Re: schema changes changes 3.3 to 3.4?

2011-10-04 Thread Erick Erickson
It looks to me like you changed the analysis chain for the field in question by removing stemmers of some sort or other. The quickest way to answer this kind of question is to get familiar with the admin/analysis page (don't forget to check the verbose checkboxes). Enter the term in both the index

Re: How to obtain the Explained output programmatically ?

2011-10-04 Thread Erick Erickson
If we're talking SolrJ here, I *think* you can get it from the NamedList returned from SolrResponse.getResponse, but I confess I haven't tried it. If not SolrJ, what is your plugin doing? And where do you expect your plugin to be in the query process? Best Erick On Mon, Oct 3, 2011 at 5:31 PM, D

Re: SOLR HttpCache Qtime

2011-10-04 Thread Nicholas Chase
Seems to me what you're asking is how to have an accurate query time when you're getting a response that's been cached by an HTTP cache. This might be from the browser, or from a proxy, or from something else, but it's not from Solr. The reason that the QTime doesn't change is because it's th

Re: sorting using function query results are notin order

2011-10-04 Thread abhayd
hi solr-spec-version 4.0.0.2011.07.19.16.15.08 solr-impl-version 4.0-SNAPSHOT ${svnversion} - ad895d - 2011-07-19 16:15:08 lucene-spec-version 4.0-SNAPSHOT lucene-impl-version 4.0-SNAPSHOT ${svnversion} - ad895d - 2011-07-19 16:15:13

Re: sorting using function query results are notin order

2011-10-04 Thread Chris Hostetter
: solr-spec-version : 4.0.0.2011.07.19.16.15.08 : solr-impl-version : 4.0-SNAPSHOT ${svnversion} - ad895d - 2011-07-19 16:15:08 Uh, ok ... that doesn't really answer my question at all. According to that info, your version of solr was *built* on 2011-07-19, using

Scoring of DisMax in Solr

2011-10-04 Thread David Ryan
Hi, When I examine the score calculation of DisMax in Solr, it looks to me that DisMax is using tf x idf^2 instead of tf x idf. Does anyone have insight why tf x idf is not used here? Here is the score contribution from one field: score(q,c) = queryWeight x fieldWeight =
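A worked sketch in Python of where the squared idf comes from, assuming the usual breakdown shown in Lucene's classic TF-IDF explain output, where idf appears once in queryWeight and once in fieldWeight. The excerpt above is cut off before its own formula, so the factor names and the numbers below are illustrative assumptions, not values from the thread.

```python
def query_weight(idf, query_norm, boost=1.0):
    # Query side: idf enters here once (together with any boost and the query norm).
    return boost * idf * query_norm


def field_weight(tf, idf, field_norm):
    # Document side: idf enters here a second time, alongside tf and the field norm.
    return tf * idf * field_norm


def term_score(tf, idf, query_norm, field_norm, boost=1.0):
    # The product of the two sides therefore contains idf squared.
    return query_weight(idf, query_norm, boost) * field_weight(tf, idf, field_norm)


# Illustrative values only.
tf, idf, query_norm, field_norm = 2.0, 3.0, 0.5, 0.25
score = term_score(tf, idf, query_norm, field_norm)
print(score, tf * idf ** 2 * query_norm * field_norm)
```

Both printed values agree, which is just the algebra: (idf x queryNorm) x (tf x idf x fieldNorm) = tf x idf^2 x queryNorm x fieldNorm.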

DIH full-import with clean=false is still removing old data

2011-10-04 Thread Pulkit Singhal
Hello, I have a unique dataset of 1,110,000 products, each as its own file. It is split into three different directories as 500,000 and 110,000 files and 500,000. When I run: http://localhost:8983/solr/bbyopen/dataimport?command=full-import&clean=false&commit=true The first 500,000 entries are su

Re: schema changes changes 3.3 to 3.4?

2011-10-04 Thread jo
Interesting... I did not make changes to the default settings, but I will definitely give that a shot. Thanks. I will comment later if I find a solution besides replacing the schema with the default one from 3.3. Thanks, JO

Re: solr 1.4 facet.limit behaviour in merging from several shards

2011-10-04 Thread Yonik Seeley
On Tue, Oct 4, 2011 at 7:13 PM, Chris Hostetter wrote: > > : OK, if SOLR-2403 being related to the bug I described, has been fixed in > : SOLR 3.4 than we are safe, since we are in the process of migration. Is it > : possible to verify this somehow? Is FacetComponent class is the one I should > :

Re: sorting using function query results are notin order

2011-10-04 Thread abhayd
Hi Hoss, I see this in CHANGES.txt: $Id: CHANGES.txt 1148494 2011-07-19 19:25:01Z hossman $ repository root: http://svn.apache.org/repos/asf/lucene/dev/trunk Revision: 1148519 If you let me know what/where to look for, I can send details. Here is the field definition

Re: DIH full-import with clean=false is still removing old data

2011-10-04 Thread Pulkit Singhal
Bah it worked after cleaning it out for the 3rd time, don't know what I did differently this time :( On Tue, Oct 4, 2011 at 8:00 PM, Pulkit Singhal wrote: > Hello, > > I have a unique dataset of 1,110,000 products, each as its own file. > It is split into three different directories as 500,000

A simple query?

2011-10-04 Thread alexw
Hi all, This may seem to be an easy one but I have been struggling to get it working. To simplify things, let's say I have a field that can contain any combination of the 26 alphabetic letters, space delimited: a b b c x y z The search term is a list of user specified letters,

Re: SOLR HttpCache Qtime

2011-10-04 Thread Lance Norskog
Solr supports having the browser cache the results. If your client code supports this caching, or your code goes through an HTTP cacher like Squid, it could return a cached page for a query. Is this what you mean? On Tue, Oct 4, 2011 at 4:55 PM, Nicholas Chase wrote: > Seems to me what you're as

Re: Scoring of DisMax in Solr

2011-10-04 Thread Bill Bell
This seems like a bug to me. On 10/4/11 6:52 PM, "David Ryan" wrote: >Hi, > > >When I examine the score calculation of DisMax in Solr, it looks to me >that DisMax is using tf x idf^2 instead of tf x idf. >Does anyone have insight why tf x idf is not used here? > >Here is the score contributio

Re: A simple query?

2011-10-04 Thread tamanjit.bin...@yahoo.co.in
Hi, Set your default operator to OR i.e. in schema.xml Also keep your fieldType=text i.e. As you would want whitespace tokenization and try your query with () i.e. /select/?q=myfields:(a b)&version=2.2&start=0&rows=2&indent=on This hopefully should solve your problem.