Re: disable echoParam completely for security issues

2011-10-11 Thread abhayd
gr8 that worked!! -- View this message in context: http://lucene.472066.n3.nabble.com/disable-echoParam-completely-for-security-issues-tp3414411p3414859.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: A simple query?

2011-10-11 Thread Chris Hostetter
: This may seem to be an easy one but I have been struggling to get it : working. To simplify things, let's say I have a field that can contain any : combination of the 26 alphabetic letters, space delimited: I suspect you may have "over simplified" your problem in a way that may make some specif

Re: disable echoParam completely for security issues

2011-10-11 Thread Chris Hostetter
: But even if i specify "none" in config file, request parameter overrides it. : Echo back to browser has some issues. the request params only override things specified in the requestHandler definition if they are listed as "defaults" you can also specify "invariants" that are used no matter wh
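A rough illustration of the precedence described in the reply above: request parameters override values listed as "defaults", while values listed as "invariants" win regardless of what the request sends. This is only a toy model in Python, not Solr code; the function name `effective_params` is made up for illustration, and the parameter is spelled `echoParams` here on the assumption that this is the name Solr expects.

```python
def effective_params(defaults, request, invariants):
    """Toy model of handler parameter precedence (not Solr's implementation).

    defaults   -- used only when the request does not supply the parameter
    request    -- parameters sent by the client
    invariants -- always win, whatever the client sends
    """
    merged = dict(defaults)    # start from the handler defaults
    merged.update(request)     # client parameters override defaults
    merged.update(invariants)  # invariants override everything
    return merged


if __name__ == "__main__":
    # Hypothetical handler configuration: echoParams pinned as an invariant.
    handler_defaults = {"rows": "10"}
    handler_invariants = {"echoParams": "none"}

    # The client tries to turn echoing back on and changes rows.
    client_request = {"echoParams": "all", "rows": "5"}

    print(effective_params(handler_defaults, client_request, handler_invariants))
```

With the setting listed only under defaults, the client's value would win, which matches the behaviour reported in the original question.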

disable echoParam completely for security issues

2011-10-11 Thread abhayd
hi I already know abt echoParam=none in request parameter and setting in solr config file. But even if i specify "none" in config file, request parameter overrides it. Echo back to browser has some issues. For instance rows parameter accepts script tag and echo back actually runs the script in

Re: negative boosts for docs with common field value

2011-10-11 Thread Chris Hostetter
: The setup for this question was to simplify the actual environment, : we're not actually demoting popular authors. Well, the better you describe your problem in terms of your *actual* goal, the more likely people can help give you applicable answers... https://people.apache.org/~hossman/#xypr

All boost values are 1.0 in solr

2011-10-11 Thread Marek Bachmann
Hey ho, first of all: I am not sure if this topic belongs to the solr or nutch list, sorry for the double post. For some reason all of the solr documents have a boost value of 1.0. I indexed them using the solrindex command from nutch 1.3. The pages were scored with Webgraph and the output of

Re: negative boosts for docs with common field value

2011-10-11 Thread Rob Brown
The setup for this question was to simplify the actual environment, we're not actually demoting popular authors. Perhaps index-time (negative) boosts are indeed the only way. -- IntelCompute Web Design and Online Marketing http://www.intelcompute.com -Original Message- From: Chris H

Re: is SOLR-2412 production ready?

2011-10-11 Thread abhayd
hi Thanks for the response. I think i would like to make it work with latest trunk version. I really dont understand the problem u have described in read me file. From readme doc it looks like limitation is "decision tree is not useful once u re-index solr". I really didn't understand the firs

Re: join & sort query with key, value pair

2011-10-11 Thread abhayd
hi I already looked at MLT , and seems like a good solution. but problem is sorting. I want to sort results in seq number order

Re: Possible bug in Solr JoinQParserPlugin?!

2011-10-11 Thread Chris Hostetter
: I have the following query : /core1/select?q=*:*&fq={!join from=id to=childIds fromIndex=core2}specials:1&fl=id,name ... : org.apache.solr.search.JoinQParserPlugin$1.parse(JoinQParserPlugin.java:60) : the parse is called for the filterquery on the main core (core1). Not the : core of th

Re: negative boosts for docs with common field value

2011-10-11 Thread Chris Hostetter
: Some searches will obviously be saturated by docs from any given author if : they've simply written more. : : I'd like to give a negative boost to these matches, there-by making sure that : 1 Author doesn't saturate the results just because they've written 500 : documents, compared to others wh

Re: lib directory on 1.4.1 with multi cores and tomcat

2011-10-11 Thread Chris Hostetter
: I've specified shared lib as "lib" in the solr.xml file. My assumption : being this will be the lib under solr-home. : However my cores cannot load classes from any new jar's placed in this : dir after a tomcat restart. Specifying a relative path for sharedLib in your solr.xml file may not be

Re: join & sort query with key, value pair

2011-10-11 Thread Chris Hostetter
: What i want is a query will send video_id and in response i want to see : video with that video_id and all other related videos with keyword in sorted : order by seq start by looking at "MoreLikeThis" https://wiki.apache.org/solr/MoreLikeThis -Hoss

Re: In-document highlighting DocValues?

2011-10-11 Thread Jan Høydahl
Hi, Looking more at the new DocValues for 4.0, they are only per-document, right? So I guess what I'm thinking is to use the good old Payloads per term to store this info. Since that's a single value, we could encode the values as byte[] somehow. But the crucial point here is how to iterate th

Re: Pls help :-) ! calling external ws/db to fetch field instead of own index?

2011-10-11 Thread Jan Høydahl
Hi, I would first of all try the most simple solution, namely re-index bug documents on every change, even if they occur frequently. Solr can handle quite some update load, so I would not call it impossible before I've tried. If all you need the external value for is boosting score, then you co

Re: Replication with an HA master

2011-10-11 Thread Ted Dunning
On Tue, Oct 11, 2011 at 8:17 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > > In the case of using a shared (SAN) index between 2 masters, what happens > if the > > live master fails in such a way that the index remains "locked" (such > > as if some hardware failure and it did not unl

Re: Replication with an HA master

2011-10-11 Thread Ted Dunning
On Tue, Oct 11, 2011 at 6:55 PM, Brandon Ramirez < brandon_rami...@elementk.com> wrote: > Using a shared volume crossed my mind too, but I discarded the idea because > of literature I have read about Lucene performing poorly against remote file > systems. But then I suppose a SAN wouldn't be a re

State of SolrCloud branch

2011-10-11 Thread Jamie Johnson
Is there a roadmap for when the work on the solrcloud will be made merged to trunk? There are some features (like SOLR-2765) which I would like to take advantage of but don't know what other implications there are associated with using the branch in production. Can anyone provide any thoughts/exp

Controlling the order of partial matches based on the position

2011-10-11 Thread aronitin
Hi All, I'm using SOLR/Lucene to index few keywords in a "multivalued" field. The data that is being stored in the indexes is already mined to remove the noise and occurrences and is very precise. All the text mining and filtering steps are already performed before indexing. Whenever a user sea

Re: Replication with an HA master

2011-10-11 Thread Otis Gospodnetic
Hello, - Original Message - > From: Robert Stewart > To: solr-user@lucene.apache.org > Cc: > Sent: Tuesday, October 11, 2011 3:37 PM > Subject: Re: Replication with an HA master > > In the case of using a shared (SAN) index between 2 masters, what happens if > the > live master fails

Re: Pls help :-) ! calling external ws/db to fetch field instead of own index?

2011-10-11 Thread Otis Gospodnetic
Hi, No, Solr out of the box cannot call external services - it searches its own index and returns documents indexed/stored there that match a query. You may be able to write a custom SearchComponent that also pulls in data from external data sources and incorporate those in the results, essentia

Pls help :-) ! calling external ws/db to fetch field instead of own index?

2011-10-11 Thread Ikhsvaku S
Help pls sirs :) On Tue, Oct 11, 2011 at 1:29 PM, Ikhsvaku S wrote: > Hi, > > We were recently investigating use of Solr for querying & indexing our bug > database. We are very happy and most of the fields could be indexed > straightforward. But there are some of the fields that cant be indexed

Re: Replication with an HA master

2011-10-11 Thread Robert Stewart
In the case of using a shared (SAN) index between 2 masters, what happens if the live master fails in such a way that the index remains "locked" (such as if some hardware failure and it did not unlock/close index). Will the other master be able to open/write to the index as new documents are ad

Re: Replication with an HA master

2011-10-11 Thread Otis Gospodnetic
Hello, Yes, you've read about NFS, which is why I gave the example of a SAN (which can have multiple power supplies, controllers, etc.) Yes, should be OK to have multiple Solr instances have the same index open, since only one of them will actually be writing to it, thanks to LB. Otis Sem

Re: capacity planning

2011-10-11 Thread Travis Low
Our plan for the VM is just benchmarking, not production. We will turn off all guest machines, then configure a Solr VM. Then we'll tweak memory and see what effect it has on indexing and searching. Then we'll reconfigure the number of processors used and see what that does. Then again with mor

RE: Replication with an HA master

2011-10-11 Thread Brandon Ramirez
Using a shared volume crossed my mind too, but I discarded the idea because of literature I have read about Lucene performing poorly against remote file systems. But then I suppose a SAN wouldn't be a remote file system in the same sense as an NFS-mounted NAS or similar. Should I be concerned

Re: Replication with an HA master

2011-10-11 Thread Otis Gospodnetic
A few alternatives: * Have the master keep the index on a shared disk (e.g. SAN) * Use LB to easily switch to between masters, potentially even automatically if LB can detect the primary is down Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://

RE: setting a large positionIncrementGap

2011-10-11 Thread Michael Ryan
> Separately: why do you want to make the gap so large? No reason, really. I'm just curious about how it works under the covers. -Michael

Re: capacity planning

2011-10-11 Thread eks dev
Re. "I have little experience with VM servers for search." We had huge performance penalty on VMs, CPU was bottleneck. We couldn't freely run measurements to figure out what the problem really was (hosting was contracted by customer...), but it was something pretty scary, kind of 8-10 times slowe

Re: setting a large positionIncrementGap

2011-10-11 Thread Michael McCandless
This gap is vInt encoded in the index, so you'll use more bytes as you increase it (but only per-additional-field-value, ie, on the transition from one field to another). Also the max the position is allowed to be is roughly 2.1B (Integer.MAX_VALUE), so don't set the gap to something that could ov
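To make the two points in the reply above concrete, here is a small sketch in Python (not Lucene code). It assumes the usual variable-length integer layout of 7 payload bits per byte, and it uses a deliberately simple model in which every additional field value advances the position by the gap plus the number of tokens in the value. The helper names are made up for illustration.

```python
INT_MAX = 2**31 - 1  # Integer.MAX_VALUE, the upper bound on a position


def vint_length(value):
    """Number of bytes a non-negative int needs at 7 payload bits per byte."""
    if value < 0:
        raise ValueError("value must be non-negative")
    length = 1
    while value >= 0x80:  # more than 7 bits left, so another byte is needed
        value >>= 7
        length += 1
    return length


def values_before_overflow(gap, tokens_per_value):
    """Rough count of field values that fit before positions pass INT_MAX.

    Simplifying assumption: each value consumes tokens_per_value positions
    plus one gap.
    """
    return INT_MAX // (gap + tokens_per_value)


if __name__ == "__main__":
    for gap in (100, 1_000_000):
        print(gap, vint_length(gap), values_before_overflow(gap, 10))
```

Under these assumptions a gap of 100 fits in one byte and a gap of 1,000,000 needs three, so the cost per additional value is small; the overflow limit only matters for documents with a very large number of values in one field.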

Re: capacity planning

2011-10-11 Thread Otis Gospodnetic
Hi, To add to what Erik wrote - keep in mind you can compress data before indexing/storing it in Solr so, assuming those PDFs are not compressed under the hood, even if you store your fields for highlighting or other purposes, the resulting index may be smaller than raw PDFs, if you compress th

Re: Architecture and Capacity planning for large Solr index

2011-10-11 Thread Otis Gospodnetic
Hi Rahul, This is unfortunately not enough information for anyone to give you very precise answers, so I'll just give some rough ones: * best disk - SSD :) * CPU - multicore, depends on query complexity, concurrency, etc. * sharded search and failover - start with SolrCloud, there are a couple o

setting a large positionIncrementGap

2011-10-11 Thread Michael Ryan
Is there any negative side-effects of setting a very large positionIncrementGap? For example, I use positionIncrementGap=100 right now - is there any reason for me to not use positionIncrementGap=1, or even greater? I saw a thread from a few months ago asking something like this, but I did

Re: how to retrieve only the updated data from database using solr?

2011-10-11 Thread Shawn Heisey
On 10/11/2011 12:23 AM, nagarjuna wrote: Hi pravesh Thank u for ur reply.. i am using DIH,but right now i dont have timestamp in my database instead of that i have datecreated column which wont updated for the changes .when ever i created some thing it just store

Re: capacity planning

2011-10-11 Thread Toke Eskildsen
Travis Low [t...@4centurion.com] wrote: > Toke, thanks. Comments embedded (hope that's okay): Inline or top-posting? Long discussion, but for mailing lists I clearly prefer the former. [Toke: Estimate characters] > Yes. We estimate each of the 23K DB records has 600 pages of text for the > co

Re: Solr Cloud on solrcloud branch acting strange

2011-10-11 Thread Jamie Johnson
This problem was based on some code that I had changed, branch works as expected, sorry to throw up this flag. On Mon, Oct 10, 2011 at 11:15 PM, Yonik Seeley wrote: > On Sun, Oct 9, 2011 at 11:30 PM, Jamie Johnson wrote: >> I'm doing some work on the solrcloud branch in SVN and am noticing >> so

Re: Query url escape caracters ?

2011-10-11 Thread darul
*ClientUtils.toQueryString()* saved my life ! I was boring for hours before finding solutions with my favourite search engine ;) http://lucene.472066.n3.nabble.com/Does-SOLR-provide-a-java-class-to-perform-url-encoding-td842660.html final SolrQuery newQuery = SolrQueryBuilder.buildQuery(queryPara
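For readers who are not using SolrJ, the same idea (escape each parameter value before appending it to the request URL) can be sketched with the Python standard library. This is not the SolrJ method mentioned above; the host, path, and query values below are hypothetical placeholders.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical query parameters containing characters that must be escaped
# (quotes, spaces, colons, brackets).
params = [
    ("q", 'title:"foo bar" AND price:[10 TO 20]'),
    ("fq", "type:book"),
    ("wt", "json"),
]

# urlencode percent-encodes each value and joins the pairs with "&".
query_string = urlencode(params)

# Hypothetical Solr endpoint; adjust host, port, and path to your setup.
url = "http://localhost:8983/solr/select?" + query_string

if __name__ == "__main__":
    print(url)
    # Decoding the query string gives back the original pairs.
    print(parse_qsl(query_string) == params)
```

Building the query string from key/value pairs, rather than concatenating raw strings, avoids having to remember which characters need escaping.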

Re: Search suggestion with misspellings

2011-10-11 Thread Oliver Beattie
Just realised that I said "Katy Pe" as the example when I actually meant "Katie Pe", apologies —Oliver On 11 October 2011 16:13, Oliver Beattie wrote: > Hi, > > I'm sure this is something that's probably been covered before, and I > shouldn't need to ask. But anyway. I'm trying to build an aut

Re: capacity planning

2011-10-11 Thread Travis Low
Toke, thanks. Comments embedded (hope that's okay): On Tue, Oct 11, 2011 at 10:52 AM, Toke Eskildsen wrote: > > Greetings. I have a paltry 23,000 database records that point to a > > voluminous 300GB worth of PDF, Word, Excel, and other documents. We are > > planning on indexing the records a

Re: Search suggestion with misspellings

2011-10-11 Thread Doug McKenzie
I've just done something similar and rather than using the Spellchecker went for NEdgeGramFilters instead for the suggestions. Worth looking into imo On 11/10/2011 16:13, Oliver Beattie wrote: Hi, I'm sure this is something that's probably been covered before, and I shouldn't need to ask. Bu

Newbie question

2011-10-11 Thread darul
If using CommonsHttpSolrServer query() method with parameter wt=json, when retrieving QueryResponse, how to do to get JSON result output stream ? I do not understand, I can get response.getResults() etc...but no way to find just JSON output stream. Thanks, Jul

Architecture and Capacity planning for large Solr index

2011-10-11 Thread Rahul Warawdekar
Hi All, I am working on a Solr search based project, and would highly appreciate help/suggestions from you all regarding Solr architecture and capacity planning. Details of the project are as follows 1. There are 2 databases from which, data needs to be indexed and made searchable,

Query url escape caracters ?

2011-10-11 Thread darul
Hello, We use SolrJ for building and sending request to Solr server. (working well) On the other part, we want to use HttpClient to request server and get result in Json or Xml result output format. Scenario: - building SolrQuery object with SolrJ. - getting parameters with SolrQuery.toString()

Search suggestion with misspellings

2011-10-11 Thread Oliver Beattie
Hi, I'm sure this is something that's probably been covered before, and I shouldn't need to ask. But anyway. I'm trying to build an autosuggest with org.apache.solr.spelling.suggest.Suggester The content being searched is music artist names, so I need to be able to deal with suggesting things lik

Re: capacity planning

2011-10-11 Thread Toke Eskildsen
On Tue, 2011-10-11 at 14:36 +0200, Travis Low wrote: > Greetings. I have a paltry 23,000 database records that point to a > voluminous 300GB worth of PDF, Word, Excel, and other documents. We are > planning on indexing the records and the documents they point to. I have no > clue on how we can c

Re: capacity planning

2011-10-11 Thread Travis Low
Thanks, Erik! We probably won't use highlighting. Also, documents are added but *never* deleted. Does anyone have comments about memory and CPU resources required for indexing the 300GB of documents in a "reasonable" amount of time? It's okay if the initial indexing takes hours or maybe even da

Re: capacity planning

2011-10-11 Thread Paul Libbrecht
My experience was 10% of the size. Le 11 oct. 2011 à 15:49, Erik Hatcher a écrit : > (roughly 35% the size, generally).
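As back-of-the-envelope arithmetic only: applying the two ratios quoted in this thread (roughly 35% and 10%) to the 300GB figure from the original capacity-planning post gives the range below. Whether either ratio applies to raw PDF/Word files rather than extracted text, and whether stored fields are included, is an assumption here; the thread itself gives no measured result.

```python
def estimate_index_gb(source_gb, percent):
    """Index size estimate as a percentage of the source size."""
    return source_gb * percent / 100


if __name__ == "__main__":
    source_gb = 300  # document volume quoted in the original post
    for label, percent in (("35% rule of thumb", 35), ("10% experience", 10)):
        print(label, estimate_index_gb(source_gb, percent), "GB")
```

The spread between the two estimates is a reminder that the practical answer is to index a sample of the documents and measure.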

Re: capacity planning

2011-10-11 Thread Erik Hatcher
Travis - Whether the index is bigger than the original content depends on what you need to do with it in Solr. One of the primary deciding factors is if you need to use highlighting, which currently requires the fields to be highlighted be stored. Stored fields will take up about the same spa

[Announce] Solr 3.4 with RankingAlgorithm 1.3, NRT support

2011-10-11 Thread Nagendra Nagarajayya
Hi! I am very excited to announce the availability of Solr 3.4 with RankingAlgorithm 1.3. This version supports NRT and can update 10,000 docs / sec (MbArtists Index). MbArtists index is the example used in the Solr 1.4 Enterprise Book, has 43 fields so is quite realistic. RankingAlgorithm

Re: schema changes changes 3.3 to 3.4?

2011-10-11 Thread jo
Thanks, as a general rule a would totally agree, however the way we are using solr we don't want to be attach to any particular schema, we just want it to work with whatever the default is... but I will keep your suggestion in mind in the future thanks everyone.. you guys are always very helpful

Re: is it possible to write delta query without using timestamp column?

2011-10-11 Thread Erik Hatcher
Try here for more details: http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command Any query can be put into deltaQuery - whatever makes the most sense for your database, to select the id's that need updating. Then deltaImportQuery is used to pick up the details of the data to

capacity planning

2011-10-11 Thread Travis Low
Greetings. I have a paltry 23,000 database records that point to a voluminous 300GB worth of PDF, Word, Excel, and other documents. We are planning on indexing the records and the documents they point to. I have no clue on how we can calculate what kind of server we need for this. I imagine the

RE: Newbie: document count and facets

2011-10-11 Thread kenneth hansen
ok, I'll try to answer this myself. Would this be correct to give me the data I need? http://localhost:8080/solr-3.3.0/select/? q=accountId:12345 &start=0 &rows=10 &indent=on &facet=true &facet.date=createdDate &f.createdDate.facet.date.start=NOW/DAYS-1MONTHS &f.createdDate.facet.date.end=NOW

Re: is it possible to write delta query without using timestamp column?

2011-10-11 Thread vighnesh
thanks erik, but give me any examples on writing delta query without using timestamp column. and explain me briefly. thanx in advance.

Re: is it possible to write delta query without using timestamp column?

2011-10-11 Thread Erik Hatcher
Yes. But the query will best work if you have some "delta" criteria. Otherwise you might as well do a full import. Erik On Oct 11, 2011, at 5:36, vighnesh wrote: > hello everyone > > is it possible to write delta querys without using timestamp column in > database table? > > > Thanks

Re: multiple dateranges/timeslots per doc: modeling openinghours.

2011-10-11 Thread Geert-Jan Brits
Op 11 oktober 2011 03:21 schreef Chris Hostetter het volgende: > > : Conceptually > : the Join-approach looks like it would work from paper, although I'm not a > : big fan of introducing a lot of complexity to the frontend / querying > part > : of the solution. > > you lost me there -- i don't see

is it possible to write delta query without using timestamp column?

2011-10-11 Thread vighnesh
hello everyone is it possible to write delta querys without using timestamp column in database table? Thanks in advance. Regards Vighnesh

Re: Interesting DIH challenge

2011-10-11 Thread Chantal Ackermann
Hi Gora, sure, glad to be of help. If you find any problems with the xslt it would be great if you could notify me. I remember having problems with empty fields (see SOLR-1790: https://issues.apache.org/jira/browse/SOLR-1790 ). I think my solution was to make sure that the response of the source

Does solr support this scenario - calling external ws/db to fetch field instead of own index?

2011-10-11 Thread Ikhsvaku S
Hi, We were recently investigating use of Solr for querying & indexing our bug database. We are very happy and most of the fields could be indexed straightforward. But there are some of the fields that cant be indexed as they are changed all the time and we want to incorporate that too in solr que

Re: Possible bug in Solr JoinQParserPlugin?!

2011-10-11 Thread Thijs
Hi Can someone help me confirm this. Or should I create a ticket? Thijs On 7-10-2011 10:10, Thijs wrote: Hi I think I might have found a bug in the JoinQParser. But I want to verify this first before creating a issue. I have two cores with 2 different schema's now I want to join between