Re: ubuntu lucid package

2010-04-30 Thread Gora Mohanty
On Thu, 29 Apr 2010 19:54:49 -0700 (PDT) Otis Gospodnetic wrote: > Pablo, Ubuntu Lucid is *brand* new :) > > try: > find / -name \*solr\* > or > locate solr.war [...] Also, the standard Debian/Ubuntu way of finding out what files a package installed is: dpkg -L <package> Regards, Gora

Indexing metadata in solr using ContentStreamUpdateRequest

2010-04-30 Thread Sandhya Agarwal
Hello, I am using ContentStreamUpdateRequest to index binary documents. At indexing time, I also want to index some additional metadata. I believe this metadata must be provided prefixed with *literal*. For instance, I have a field named “field1”, defined in

AW: Slow Date-Range Queries

2010-04-30 Thread Jan Simon Winkelmann
For now I need them. However, I will most likely create another boolean field to get rid of them (as suggested by Ahmet Arslan), simply because I am switching to Solr 1.4 frange queries. On the topic of frange queries, is there a way to simulate the date range wildcards here?

Re: Any way to get top 'n' queries searched from Solr?

2010-04-30 Thread Peter Sturge
As far as I'm aware, this information isn't stored intrinsically in Solr. We had a similar requirement whereby we needed to keep track of which searches have been performed by particular users. This is more of a security audit requirement than generic searching, but the solution was to audit

Re: ubuntu lucid package

2010-04-30 Thread pablo platt
http://localhost:8080/solr/admin/ gives me the solr admin. thanks On Fri, Apr 30, 2010 at 10:24 AM, Gora Mohanty wrote: > On Thu, 29 Apr 2010 19:54:49 -0700 (PDT) > Otis Gospodnetic wrote: > > > Pablo, Ubuntu Lucid is *brand* new :) > > > > try: > > find / -name \*solr\* > > or > > locate s

RE: Problem with pdf, upgrading Cell

2010-04-30 Thread pk
Mark, did you manage to get it to work? I tried the latest Tika (0.7) command line and successfully parsed the earlier problematic pdf. Then I replaced the Tika-related jars in the Solr-1.4 contrib/extraction/lib folder with the new ones. Now it doesn't throw any exception, but there is no content extraction, only metadata!

RE: Problem with pdf, upgrading Cell

2010-04-30 Thread Sandhya Agarwal
I observed the same issue too, with tika 0.7 jars. It now fails to extract content from documents of any type. Works with tika 0.5 though. Thanks, Sandhya -Original Message- From: pk [mailto:pkal...@gmail.com] Sent: Friday, April 30, 2010 3:17 PM To: solr-user@lucene.apache.org Subject:

Re: Any way to get top 'n' queries searched from Solr?

2010-04-30 Thread pk
Peter, It seems that your solution (SOLR-1872) requires authentication too (with users tracked via your uuid), but my users will be the general public using browsers, and I can't force any such auth restrictions. Also, you didn't mention whether you are already persisting the audit data. Or I may need to extend

Re: Any way to get top 'n' queries searched from Solr?

2010-04-30 Thread MitchK
The simplest way is to send the querystring to your Solr-client *and* to your custom query-fetcher, which could be any database you like. Doing so, you can count how often each query was sent, etc. *And* you can make them searchable by exporting those datasets to another Solr-core. Why an extr
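
The counting step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the name `top_queries` and the in-memory list `log` are hypothetical stand-ins for whatever store the application uses, and lowercasing and trimming the query string is an assumption, not something the thread prescribes.

```python
from collections import Counter


def top_queries(logged_queries, n):
    """Return the n most frequent query strings with their counts."""
    # Light normalization (assumption): trim whitespace and ignore case,
    # so "Solr " and "solr" are counted as the same query.
    normalized = (q.strip().lower() for q in logged_queries)
    counts = Counter(q for q in normalized if q)
    return counts.most_common(n)


# Hypothetical log of query strings captured by the fronting application.
log = ["solr", "lucene", "Solr ", "tika", "lucene", "solr"]
print(top_queries(log, 2))
```

In a real deployment the list would be replaced by a database table or a separate Solr core, as suggested in the message.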

Re: Any way to get top 'n' queries searched from Solr?

2010-04-30 Thread Abdelhamid ABID
Hi, Why don't you just create a filter in the Solr context? That way you can capture the user's q param and persist it. On 4/30/10, pk wrote: > > > Peter, > It seems that your solution (SOLR-1872) requires authentication too (and be > tracked via ur uuid), but my users will be general public using bro

Re: Any way to get top 'n' queries searched from Solr?

2010-04-30 Thread Praveen Agrawal
Thanks Mitch. I have an application fronting Solr for updating, searching, etc., and I'll make use of that to store this info. Thanks to all for the suggestions. On Fri, Apr 30, 2010 at 3:43 PM, MitchK wrote: > > The most simple way is to send the querystring to your Solr-client *and* to > your cu

Re: Any way to get top 'n' queries searched from Solr?

2010-04-30 Thread Peter Sturge
Yes, you're right, SOLR-1872 is for security authorization, and part of this is to audit what users are searching. The reference to this was to show you how your requirement can be accomplished. To have just the auditing and not the security, you'd need to create your own SearchComponent and extra

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Grant Ingersoll
Can you share the PDF it is failing on? FWIW, PDFs are notoriously hard to extract. They come in all shapes and flavors and I've seen many a commercial extractor fail on them too. Have you tried using either Tika standalone or PDFBox standalone? Does the file work there? On Apr 26, 2010, at

Re: Indexing metadata in solr using ContentStreamUpdateRequest

2010-04-30 Thread Grant Ingersoll
What does your schema look like? On Apr 30, 2010, at 3:47 AM, Sandhya Agarwal wrote: > Hello, > > I am using ContentStreamUpdateRequest, to index binary documents. At the time > of indexing the content, I want to be able to index some additional metadata > as well. I believe, this metadata mus

Re: ubuntu lucid package

2010-04-30 Thread Olivier Dobberkau
On 30.04.2010 at 09:24, Gora Mohanty wrote: > Also, the standard Debian/Ubuntu way of finding out what files a > package installed is: > dpkg -l > > Regards, > Gora You might try: # dpkg -L solr-common /. /etc /etc/solr /etc/solr/web.xml /etc/solr/conf /etc/solr/conf/admin-extra.html /etc/s

RE: Indexing metadata in solr using ContentStreamUpdateRequest

2010-04-30 Thread Sandhya Agarwal
Thanks, Grant. I resolved this issue as follows: each of my own metadata fields also requires a mapping between the Tika field and the Solr field, defined either in solrconfig.xml or in the request submitted for indexing. Also, make sure that lowernames = false, i
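
As a rough sketch of the request parameters discussed in this thread, the snippet below builds the `literal.*`, `fmap.*`, and `lowernames` parameters of the extracting request handler as a query string. The helper name `extract_params`, the field names, and the values are hypothetical; the message is truncated, so the poster's exact mapping is not known.

```python
from urllib.parse import urlencode


def extract_params(literals, field_map, lowernames=False):
    """Build extraction request parameters as an ordered dict."""
    params = {}
    # literal.<field>=<value> attaches caller-supplied metadata to the document.
    for name, value in literals.items():
        params["literal." + name] = value
    # fmap.<source>=<target> maps an extracted field name to a schema field.
    for source, target in field_map.items():
        params["fmap." + source] = target
    # The thread reports needing lowernames=false for the custom fields.
    params["lowernames"] = "true" if lowernames else "false"
    return params


# Hypothetical field names and values, for illustration only.
params = extract_params({"field1": "value1"}, {"content": "text"})
print(urlencode(params))
```

With SolrJ, the same name/value pairs would be set on the ContentStreamUpdateRequest rather than encoded by hand.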

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Praveen Agrawal
I did try the standalone version of Tika 0.7, and it extracted pdf content successfully. Then I replaced the Tika-related jars in contrib/extraction/lib of the Solr 1.4 distribution with their newer versions, and now it doesn't extract content from ANY pdf. Earlier (0.4) it was throwing exception for few pdfs, but no

Solr date representation

2010-04-30 Thread Toby White
Don't know if this counts as a bug report or not - it's certainly a corner case, but it's just bitten me. http://wiki.apache.org/solr/IndexingDates suggests that the canonical form of a date is a string like: 1995-12-31T23:59:59Z and says that this is a "restricted form of the canonical r

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Marc Ghorayeb
Hi, No, I didn't get it to work. Just like you, the command-line version of Tika extracts the content correctly, but once included in Solr, no content is extracted. What I have tried so far: - Updating the Tika libraries inside the Solr 1.4 public version, no luck there. - Downloading the latest SVN v

Re: Elevation of of part match

2010-04-30 Thread MitchK
Gert, could you provide the solrconfig- and schema-specifications you have made? If the wiki really means what it says, the behaviour you want should be possible. But that's only what I guess. Btw: The standard definition for the elevation-component is "string" in the example-directory. That me

Re: ubuntu lucid package

2010-04-30 Thread pablo platt
Which parts don't work for you? If there are bugs in the package, it would be great if you could report them to make it better. On Fri, Apr 30, 2010 at 1:50 PM, Olivier Dobberkau wrote: > > On 30.04.2010 at 09:24, Gora Mohanty wrote: > > > Also, the standard Debian/Ubuntu way of finding out what f

RE: benefits of float vs. string

2010-04-30 Thread Nagelberg, Kallin
When using numerical types you can do ranges like 3 < myfield <= 10 , as well as a lot of other interesting mathematical functions that would not be possible with a string type. Thanks for the info Yonik, -Kallin Nagelberg -Original Message- From: Dennis Gearon [mailto:gear...@sbcglobal
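
To illustrate why a range such as 3 < myfield <= 10 needs a numeric type, here is a small sketch comparing numeric and string ordering. It uses plain Python comparisons as an analogy only and does not model Solr's field types or indexing; the sample values and helper names are made up.

```python
values = ["3", "10", "7", "25"]


def in_range_numeric(value, low, high):
    """Range check after converting the stored text to a number."""
    return low < float(value) <= high


def in_range_string(value, low, high):
    """Range check using lexicographic (character-by-character) order."""
    return low < value <= high


numeric_hits = [v for v in values if in_range_numeric(v, 3, 10)]
string_hits = [v for v in values if in_range_string(v, "3", "10")]

print(numeric_hits)
print(string_hits)
```

Because "10" sorts before "3" as text, the string version of the range is empty, while the numeric version finds 7 and 10.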

prefixing with dismax

2010-04-30 Thread Nagelberg, Kallin
Hey, I've been using the dismax query parser so that I can pass a user created search string directly to Solr. Now I'm getting the requirement that something like 'Bo' must match 'Bob', or 'Bob Jo' must match 'Bob Jones'. I can't think of a way to make this happen with Dismax, though it's prett

RE: Elevation of of part match

2010-04-30 Thread MitchK
The elevate.xml-example says: "" Did you make a restart? -- View this message in context: http://lucene.472066.n3.nabble.com/Elevation-of-of-part-match-tp767139p768120.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Grant Ingersoll
Praveen and Marc, Can you share the PDF (feel free to email my private email) that fails in Solr? Thanks, Grant On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote: > > Hi > Nope i didn't get it to work... Just like you, command line version of tika > extracts correctly the content, but once in

How is DeletionPolicy supposed to work?

2010-04-30 Thread Paleo Tek
Hi folks, In moving to 1.4, it was unclear to me how deletionPolicy was supposed to work. I commit/optimize on a build server, then replicate to multiple search servers. I don't need anything fancy for a deletion policy: save one copy, and replicate on copy. But when I used no policy, so

Re: Trouble with parenthesis

2010-04-30 Thread Yonik Seeley
Pure negatives in Lucene syntax don't match anything (Solr currently only fixes this for you if it's a pure negative at the top level, not embedded). Try changing (NOT periodicite:"annuel") to (*:* NOT periodicite:"annuel"). But the second version below where you just removed the parens will be mor
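
The rewrite suggested above can be expressed as a tiny helper that prefixes a pure-negative clause with the match-all query `*:*`. This is a naive string-level sketch, not a query parser: the function name `positive_form` is hypothetical, and it only looks at how the clause begins.

```python
def positive_form(clause):
    """Prefix a clause that starts with a negation with the match-all query."""
    stripped = clause.strip()
    # Naive check (assumption): a clause is "pure negative" if it begins
    # with NOT or with a leading minus sign.
    if stripped.upper().startswith("NOT ") or stripped.startswith("-"):
        return "*:* " + stripped
    return stripped


embedded = "(" + positive_form('NOT periodicite:"annuel"') + ")"
print(embedded)
```

A clause that already has a positive part is returned unchanged, so the helper can be applied to every parenthesized sub-clause before the query is assembled.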

RE: Elevation of of part match

2010-04-30 Thread Villemos, Gert
Yes, I restarted. To make sure, I just did it again. Same result: "archive" elevates, "packet archive" doesn't. G. From: MitchK [mailto:mitc...@web.de] Sent: Fri 4/30/2010 5:02 PM To: solr-user@lucene.apache.org Subject: RE: Elevation of of part match The e

Re: How is DeletionPolicy supposed to work?

2010-04-30 Thread Yonik Seeley
Simply use what the default was in the example solrconfig.xml... there is no need to modify that unless you are doing something advanced. In the config below, you show maxOptimizedCommitsToKeep=1, which will increase index size by always keeping around one optimized commit point. -Yonik Apache Lu

Re: Problem with pdf, upgrading Cell

2010-04-30 Thread Praveen Agrawal
Grant, You can try any of the sample pdfs that come in the /docs folder of the Solr 1.4 distribution. I had tried 'Installing Solr in Tomcat.pdf', 'index.pdf', etc. Only metadata (i.e. stream_size and content_type, apart from my own literals) is indexed, and the content is missing. On Fri, Apr 30, 2010 at 8:52 PM, Gran

Re: StreamingUpdateSolrServer hangs

2010-04-30 Thread Yonik Seeley
On Thu, Apr 29, 2010 at 7:51 PM, Yonik Seeley wrote: > I'm trying to reproduce now... single thread adding documents to a > multithreaded client, StreamingUpdateSolrServer(addr,32,4) > > I'm currently at the 2.5 hour mark and 100M documents - no issues so far. I let it go to 500M docs... everythi

Re: thresholding results by percentage drop from maxScore in lucene/solr

2010-04-30 Thread MitchK
Mike, why don't you order by the number of found items in your facet? If you get too many facets, just throw away those with the smallest values if you don't have enough room for them. I suggest that because you don't know every search case. Sometimes the user does not really know what he is sear

RE: Elevation of of part match

2010-04-30 Thread MitchK
Sorry, since I have no experience with the elevation component, I can't help you with this. Even searching the mailing list turns up no useful information.

Custom SolrQueryRequest/SolrQueryResponse

2010-04-30 Thread Aaron Hiniker
Solr team, Long time, first time-- many thanks for all your work on creating this excellent search appliance. The 40,000ft view of my problem is that I need to execute multiple queries per endpoint invocation, with the results for each query grouped in the response output as such that they wer

Re: Solr date representation

2010-04-30 Thread Chris Hostetter
: then on retrieving that document, I get back the value : : 1-01-01T00:00:00Z : : (ie no preceding zeroes) - which tripped up my date-parsing routines. : Preceding zeroes seem to be universally dropped - all dates before 1000AD seem : to have the equivalent problem. : : Is this a bug in the co
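
For clients tripped up by years returned without leading zeroes, one possible workaround is to zero-pad the year before parsing. The sketch below is a client-side assumption, not a description of how Solr formats dates internally; the name `pad_year` and the regular expression are hypothetical and cover only the `YYYY-MM-DDThh:mm:ssZ` shape quoted in this thread.

```python
import re

# Year of any length, followed by the fixed -MM-DDThh:mm:ss[.fff]Z tail.
_DATE = re.compile(r"^(-?)(\d+)(-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z)$")


def pad_year(value):
    """Return the date string with its year zero-padded to four digits."""
    match = _DATE.match(value)
    if match is None:
        raise ValueError("unrecognized date string: %r" % value)
    sign, year, rest = match.groups()
    return sign + year.zfill(4) + rest


print(pad_year("1-01-01T00:00:00Z"))
print(pad_year("1995-12-31T23:59:59Z"))
```

Dates that already have a four-digit year pass through unchanged, so the helper can be applied to every value before it reaches a strict parser.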