Solr wiki link broken

2010-01-25 Thread Teruhiko Kurosaka
In http://lucene.apache.org/solr/ the wiki tab and the "Docs (wiki)" hypertext in the sidebar (after expansion) both link to http://wiki.apache.org/solr. But the wiki site seems to be broken. The above link took me to a generic help page of the Wiki system. What's going on? Did I just hit t

RE: Solr vs. Compass

2010-01-25 Thread Funtick
Minutello, Nick wrote: > > Maybe spend some time playing with Compass rather than speculating ;) > I spent a few weeks studying the Compass source code three years ago, and the Compass docs said the same then as now: "Compass::Core provides support for two phase commits transa

Re: Indexing TrieDateField Using Lucene

2010-01-25 Thread Yonik Seeley
On Mon, Jan 25, 2010 at 8:03 PM, brad anderson wrote: > I'm trying to create a faster index generator for testing purposes. Using > lucene has helped immensely to increase indexing speed. Have you tried using other indexing methods such as CSV or StreamingUpdateSolrServer? If there are any perfor

Indexing TrieDateField Using Lucene

2010-01-25 Thread brad anderson
Greetings, I'm trying to create a faster index generator for testing purposes. Using lucene has helped immensely to increase indexing speed. However, my schema includes a TrieDateField. I do not know how to correctly index this field type using Lucene's APIs. I've tried the following: new DateFi

Reminder: Seattle Hadoop / HBase / Lucene / NoSQL meetup Jan 27th! Feat. Razorfish

2010-01-25 Thread Bradford Stephens
Greetings, I'm in the Bay Area doing startup-stuff this week, so Nick Dimiduk will be running this meetup again. You can reach him at ndimi...@gmail.com and 614-657-0267 A friendly reminder that the Seattle Hadoop, NoSQL, etc. meetup is on January 27th at University of Washington in the Allen Com

StreamingUpdateSolrServer seems to hang on indexing big batches

2010-01-25 Thread Jake Brownell
Hi, I swapped our indexing process over to the streaming update server, but now I'm seeing places where our indexing code adds several documents, but eventually hangs. It hangs just before the completion message, which comes directly after sending to solr. I found this issue in jira https://is

Analysis tool vs search query

2010-01-25 Thread nonrenewable
Hi, I've run into an issue that I have no way of resolving, since the analysis tool doesn't show me there is an error. I copy the exact field value into the analysis tool and type in the exact query request I'm issuing, and the tool finds it a match. However running the query with that exact s

machine tags, copy fields and pattern tokenizers

2010-01-25 Thread straup
Hi, I am trying to work out how to store, query and facet machine tags [1] in Solr using a combination of copy fields and pattern tokenizer factories. I am still relatively new to Solr so despite feeling like I've gone over the docs, and friends, it's entirely possible I've missed something

Re: determine which value produced a hit in multivalued field type

2010-01-25 Thread Lance Norskog
Thanks Erik, I did not know about the order guarantee for indexed multivalue fields. Timothy, it could be that more than one term matches the query. Highlighting will show you which terms matched your query. You'll have to post-process the results. On Mon, Jan 25, 2010 at 7:26 AM, Harsch, Timothy J.

Re: date math help

2010-01-25 Thread Ahmet Arslan
--- > Why not use a separate facet query for each of these? > &facet.query:[NOW-1DAY TO > NOW]&facet.query:[NOW-7DAYS TO NOW]& etc. Sorry, I forgot to add the field name: &facet.query=LastMod:[NOW-1DAY TO NOW]&facet.query=LastMod:[NOW-7DAYS TO NOW]&
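
The corrected form in this reply (one facet.query parameter per date range, each prefixed with the field name) can be sketched as a small script that builds the request parameters. This is only an illustration, not something posted in the thread: the LastMod field and the NOW-1DAY / NOW-7DAYS ranges come from the message, while the helper name build_facet_params, the extra ranges, and the q/rows values are assumptions.

```python
from urllib.parse import urlencode


def build_facet_params(field, ranges, query="*:*"):
    """Return (name, value) pairs with one facet.query per date-math range.

    `build_facet_params` is a hypothetical helper, not part of any Solr client.
    """
    # rows=0 asks only for the facet counts, not the matching documents.
    params = [("q", query), ("rows", "0"), ("facet", "true")]
    for start in ranges:
        # Each range runs from the date-math expression up to NOW.
        params.append(("facet.query", f"{field}:[{start} TO NOW]"))
    return params


# Ranges beyond the first two are assumed, to mirror "last day / week / 2 weeks / month".
params = build_facet_params(
    "LastMod", ["NOW-1DAY", "NOW-7DAYS", "NOW-14DAYS", "NOW-1MONTH"]
)
query_string = urlencode(params)
print(query_string)
```

The resulting string would be appended to the select URL of whatever Solr instance is in use; each facet.query then comes back with its own count, which is what the "Last day / Last Week / ..." display needs.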

Re: date math help

2010-01-25 Thread Ahmet Arslan
> If I wanted to return documents faceted by LastMod and I > wanted them grouped > by week what would the correct syntax be? > > I would like to eventually format the results to display > this: > > Last day (facet count) > Last Week (facet count) > Last 2 Weeks (facet count) > Last Month (facet c

date math help

2010-01-25 Thread solrquestion6
If I wanted to return documents faceted by LastMod and I wanted them grouped by week what would the correct syntax be? I would like to eventually format the results to display this: Last day (facet count) Last Week (facet count) Last 2 Weeks (facet count) Last Month (facet count) Last 2 Months (

Re: Delete by query

2010-01-25 Thread Lance Norskog
The problem is that negative queries are ignored by Lucene. Why this is still true I have no idea. On Mon, Jan 25, 2010 at 6:23 AM, Noam G. wrote: > > Hi David, > > Thank you very much - that did the trick :-) > > Noam. > -- > View this message in context: > http://old.nabble.com/Delete-by-query

Re: Default value attribute in RSS DIH

2010-01-25 Thread Lance Norskog
There does not seem to be one mentioned in the Wiki or the source. You can do this with JavaScript code in the DIH configuration, or by setting the default in the schema.xml file. On Sun, Jan 24, 2010 at 12:20 PM, David Stuart wrote: > Hey All, > > Can anyone tell me what the attribute name is for

RE: Solr vs. Compass

2010-01-25 Thread Minutello, Nick
Correct. It's not 2PC. It just makes the window for inconsistency quite small ... without the user having to write anything. -N -Original Message- From: Lukas Kahwe Smith [mailto:m...@pooteeweet.org] Sent: 25 January 2010 21:19 To: solr-user@lucene.apache.org Subject: Re: Solr vs. Comp

RE: Solr vs. Compass

2010-01-25 Thread Minutello, Nick
Maybe spend some time playing with Compass rather than speculating ;) If it helps, on the project where I last used compass, we had what I consider to be a small dataset - just a few million documents. Nothing related to indexing/searching took more than a second or 2 - mostly it was 10's or 100's

Re: solr application for website crawling and indexing html, pdf, word, ... files

2010-01-25 Thread Markus Jelsma
Hello Frank, Answers are inline: Frank van Lingen said: > I recently started working with solr and find it easy to setup and > tinker with. > > I now want to scale up my setup and was wondering if there is an > application/component that can do the following (I was not able to find > documentatio

Re: Solr vs. Compass

2010-01-25 Thread Lukas Kahwe Smith
On 25.01.2010, at 22:16, Minutello, Nick wrote: > Sorry, you have completely lost me :/ > > In simple terms, there are times when you want the primary storage > (database) and the Lucene index to be in synch - and updated atomically. > It all depends on the kind of application. Sure. I guess L

RE: Solr vs. Compass

2010-01-25 Thread Minutello, Nick
Sorry, you have completely lost me :/ In simple terms, there are times when you want the primary storage (database) and the Lucene index to be in synch - and updated atomically. It all depends on the kind of application. -N -Original Message- From: Fuad Efendi [mailto:f...@efendi.ca]

Re: solr application for website crawling and indexing html, pdf, word, ... files

2010-01-25 Thread mike anderson
I think you might be looking for Apache Tika. On Mon, Jan 25, 2010 at 3:55 PM, Frank van Lingen wrote: > I recently started working with solr and find it easy to setup and tinker > with. > > I now want to scale up my setup and was wondering if there is an > application/component that can do the

Re: Lock problems: Lock obtain timed out

2010-01-25 Thread mike anderson
I am getting this exception as well, but disk space is not my problem. What else can I do to debug this? The solr log doesn't appear to lend any other clues.. Jan 25, 2010 4:02:22 PM org.apache.solr.core.SolrCore execute INFO: [] webapp=/solr path=/update params={} status=500 QTime=1990 Jan 25, 20

Collating results from multiple indexes

2010-01-25 Thread Aaron McKee
Is there any somewhat convenient way to collate/integrate fields from separate indices during result writing, if the indices use the same unique keys? Basically, some sort of cross-index JOIN? As a bit of background, I have a rather heavyweight dataset of every US business (~25m records, an

solr application for website crawling and indexing html, pdf, word, ... files

2010-01-25 Thread Frank van Lingen
I recently started working with solr and find it easy to setup and tinker with. I now want to scale up my setup and was wondering if there is an application/component that can do the following (I was not able to find documentation on this on the solr site): -Can I send solr an xml document with a

Re: Huge Index - RAM usage?

2010-01-25 Thread Erick Erickson
How much memory are you allocating for the JVM? And are you sure the memory isn't just memory hanging around available for GCing? An interesting test would be to restrict your available memory allowed to the JVM and see if you bumped up against that. Here's a blog post from Mark Miller about

Re: wildcard search and hierarchical faceting

2010-01-25 Thread Erik Hatcher
There are some approaches outlined here that might be of interest: http://wiki.apache.org/solr/HierarchicalFaceting On Jan 24, 2010, at 2:54 AM, Andy wrote: I'd like to provide a hierarchical faceting functionality. An example would be location drill down such as USA -> New Y

Re: AW: Searching for empty fields possible?

2010-01-25 Thread Ahmet Arslan
> I'm not sure, theoretically fields with a null value > (php-side) should end > up not having the field. But then again i don't think it's > relevant just > yet. What bugs me is that if I add the -puid:[* TO *], all > results for > puid:[0 TO *] disappear, even though I am using "OR". - operator

Re: Wildcard Search and Filter in Solr

2010-01-25 Thread Ahmet Arslan
> Hi , > I m trying to use wildcard keywords in my search term and > filter term . but > i didnt get any results. > Searched a lot but could not find any lead . > Can someone help me in this. > i m using solr 1.2.0 and have few records indexed with > vendorName value as > Intel > > In solr admin

DataImportHandler TikaEntityProcessor FieldReaderDataSource

2010-01-25 Thread Shah, Nirmal
Hi, I am fairly new to Solr and would like to use the DIH to pull rich text files (pdfs, etc) from BLOB fields in my database. There was a suggestion made to use the FieldReaderDataSource with the recently committed TikaEntityProcessor. Has anyone accomplished this? This is my configuratio

Master Read Timeout

2010-01-25 Thread Giovanni Fernandez-Kincade
I have a slave that is pulling multiple cores from one master, and I'm very frequently seeing cases where the slave is getting timeouts when fetching from the master: 2010-01-25 11:00:22,819 [pool-3-thread-1] ERROR org.apache.solr.handler.SnapPuller - Master at: http://shredder:8080/solr/Filin

RE: Solr vs. Compass

2010-01-25 Thread Fuad Efendi
> >> Even if "commit" takes 20 minutes? > I've never seen a commit take 20 minutes... (anything taking that long > is broken, perhaps in concept) "index merge" can take from few minutes to few hours. That's why nothing can beat SOLR Master/Slave and sharding for huge datasets. And reopening of I

Re: Huge Index - RAM usage?

2010-01-25 Thread Antonio Lobato
Just indexing. If I shut down Solr, memory usage goes down to 200MB. I've searched the mailing lists, but most situations are with people both searching and indexing. I was under the impression that indexing shouldn't use up so much memory. I'm trying to figure out where all the usage is

Re: LucidGaze, No Data

2010-01-25 Thread Mark Miller
Markus Jelsma wrote: > Hi, > > > Is the list without clue or should i mail Lucid directly? > > > Cheers, > > > >> I have installed and reconfigured everything according to the readme >> supplied with the recent LucidGaze release. Files have been written in the >> gaze directory in SOLR_HOME but

RE: Solr vs. Compass

2010-01-25 Thread Fuad Efendi
> >> Why to embed "indexing" as a transaction dependency? Extremely weird > idea. > There is nothing weird about different use cases requiring different > approaches > > If you're just thinking documents and text search ... then its less of > an issue. > If you have an online application where

Re: Huge Index - RAM usage?

2010-01-25 Thread Erick Erickson
Are you also searching on this machine or just indexing? I'll assume you're certain that it's SOLR that's eating memory, as in you stop the process and your usage drops way down. But if you search the user list for memory, you'll see this kind of thing discussed a bunch of times, along with suggest

RE: Solr configuration issue for sorting on title field

2010-01-25 Thread EL KASMI Hicham
I see! Thanks a lot Eric. == Hicham El Kasmi Université Libre de Bruxelles - Libraries Av. F.D. Roosevelt 50, CP 180 1050 BRUSSELS Belgium Tel: + 32 2 650 25 30 Fax: + 32 2 650 23 91 == -Original Message-

Huge Index - RAM usage?

2010-01-25 Thread Antonio Lobato
Hello everyone! I have a question about indexing a large dataset in Solr and ram usage. I am currently indexing about 160 gigabytes of data to a dedicated indexing server. The data is constantly being fed to Solr, 24/7. The index grows as I prune away old data that is not needed, so th

Re: Solr configuration issue for sorting on title field

2010-01-25 Thread Erick Erickson
Well, stop doing that. The error message is a bit misleading, and was probably in response to the more-frequent error of tokenizing a field then trying to sort on it. The problem is that sorting on something with more than one token is indeterminate. In this case, with more than one title is

RE: Solr configuration issue for sorting on title field

2010-01-25 Thread EL KASMI Hicham
Hi Eric, Yes, we're indexing more than one title per document (document's title, the title of the series or of the journal in which the document was published, etc...). Normally, each time we modify the SOLR config, we drop the old index and create a new one. Thanks,

RE: determine which value produced a hit in multivalued field type

2010-01-25 Thread Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS]
If a simple "no" is the answer I'd be glad if anyone could confirm. Thanks. -Original Message- From: Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] [mailto:timothy.j.har...@nasa.gov] Sent: Friday, January 22, 2010 2:53 PM To: solr-user@lucene.apache.org Subject: determine which value produc

Re: LucidGaze, No Data

2010-01-25 Thread Markus Jelsma
Hi, Is the list without a clue or should I mail Lucid directly? Cheers, >I have installed and reconfigured everything according to the readme > supplied with the recent LucidGaze release. Files have been written in the > gaze directory in SOLR_HOME but the *.log.x.y files are all empty! The rrd

RE: Delete by query

2010-01-25 Thread Noam G.
Hi David, Thank you very much - that did the trick :-) Noam. -- View this message in context: http://old.nabble.com/Delete-by-query-tp27306968p27307336.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Delete by query

2010-01-25 Thread David.Dankwerth
solrServer.deleteByQuery("*:* AND -version_uuid:e04534e2-28db-4420-a5f3-300477872c11"); Should do the trick. -Original Message- From: Noam G. [mailto:noam...@gmail.com] Sent: 25 January 2010 13:58 To: solr-user@lucene.apache.org Subject: Delete by query Hi, I have an index with 3M doc
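
For readers not using SolrJ, the same query can be wrapped in Solr's XML delete message and posted to the update handler. The sketch below only builds the message body; the field name and UUID are the ones from the thread, the helper name delete_all_except is hypothetical, and how the message is posted and committed depends on the deployment.

```python
from xml.sax.saxutils import escape


def delete_all_except(field, keep_value):
    """Build a Solr XML delete message for every document whose field differs from keep_value.

    `delete_all_except` is a hypothetical helper for illustration only.
    """
    # The negative clause is combined with "*:*" so there is a set of
    # documents to subtract from, as in the suggested SolrJ call.
    query = f"*:* AND -{field}:{keep_value}"
    # escape() protects &, < and > in case the value contains XML metacharacters.
    return f"<delete><query>{escape(query)}</query></delete>"


message = delete_all_except("version_uuid", "e04534e2-28db-4420-a5f3-300477872c11")
print(message)
```

A delete-by-query like this only becomes visible to searchers after a commit, so it is worth testing against a copy of the index first.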

Re: Solr configuration issue for sorting on title field

2010-01-25 Thread Erik Hatcher
Are you sending in more than one title per document, by chance? Have you changed your configuration without reindexing the entire collection, possibly? Erik On Jan 25, 2010, at 8:32 AM, EL KASMI Hicham wrote: Thanks Otis, The long message was an attempt to be as detailed as possi

Delete by query

2010-01-25 Thread Noam G.
Hi, I have an index with 3M docs. Each doc has a field called version_uuid. I want to delete all docs whose version_uuid is other than 'e04534e2-28db-4420-a5f3-300477872c11' (for example :-)) This is the query I submit to the SolrServer object using the deleteByQuery method: "NOT version_

Re: understanding termVector output

2010-01-25 Thread Grant Ingersoll
On Jan 22, 2010, at 12:39 PM, Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] wrote: > Hi, > I'm trying to see if I can use termVectors for a use case I have. > Essentially I want to know is: where in the indexed value does the query hit > occur? Solr doesn't currently have support for Span Querie

Wildcard Search and Filter in Solr

2010-01-25 Thread ashokcz
Hi, I'm trying to use wildcard keywords in my search term and filter term, but I didn't get any results. I searched a lot but could not find any lead. Can someone help me with this? These are my schema.xml entries, and this is my field type definition for the text field

RE: Solr configuration issue for sorting on title field

2010-01-25 Thread EL KASMI Hicham
Thanks Otis, The long message was an attempt to be as detailed as possible, so that one could understand our tests. I'm afraid the problem we wanted to describe didn't really come through. These are the relevant entries from our config: =

AW: Searching for empty fields possible?

2010-01-25 Thread Jan-Simon Winkelmann
> Are you indexing an empty value? Or not indexing a field at all? > -field:[* TO *] will match documents that do not have the field at all. I'm not sure, theoretically fields with a null value (php-side) should end up not having the field. But then again i don't think it's relevant just yet. Wha

Re: Searching for empty fields possible?

2010-01-25 Thread Erik Hatcher
Are you indexing an empty value? Or not indexing a field at all? -field:[* TO *] will match documents that do not have the field at all. Erik On Jan 25, 2010, at 7:02 AM, Jan-Simon Winkelmann wrote: Hi there, i have a field defined in my schema as follows: Valid values for the f

Searching for empty fields possible?

2010-01-25 Thread Jan-Simon Winkelmann
Hi there, I have a field defined in my schema as follows: Valid values for the fields are (in theory) any unsigned integer value. Also the field can be empty. My problem is that if I have (puid:[0 TO *] OR -puid:[* TO *]) as filter query I get 0 results, without the "-puid:[* TO *])" part I g
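
A commonly suggested workaround for this symptom is to give the negative clause something to subtract from by pairing it with *:* inside its own parentheses. The sketch below assumes that the nested, purely negative clause is what produces the empty result (the truncated thread does not confirm this); the puid field comes from the message, and the helper name value_or_missing is hypothetical.

```python
def value_or_missing(field, lower="0"):
    """Return a filter query matching docs where field >= lower or the field is absent.

    `value_or_missing` is a hypothetical helper for illustration only.
    """
    has_value = f"{field}:[{lower} TO *]"
    # Anchor the negative clause to all documents so it can stand on its own inside the OR.
    is_missing = f"(*:* -{field}:[* TO *])"
    return f"({has_value} OR {is_missing})"


fq = value_or_missing("puid")
print(fq)
```

The printed string would be passed as the fq parameter in place of the original filter.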

Re: big index vs. lots of small ones

2010-01-25 Thread Thorsten Scherler
On Wed, 2010-01-20 at 08:38 -0800, Marc Sturlese wrote: > Check out this patch which solves the distributed IDF's problem: > https://issues.apache.org/jira/browse/SOLR-1632 > I think it fixes what you are explaining. The price you pay is that there > are 2 requests per shard. If I am not wrong the f

RE: Solr vs. Compass

2010-01-25 Thread Minutello, Nick
>> Why to embed "indexing" as a transaction dependency? Extremely weird idea. There is nothing weird about different use cases requiring different approaches If you're just thinking documents and text search ... then its less of an issue. If you have an online application where the indexing

Re: Index gets deleted after commit?

2010-01-25 Thread Sven Maurmann
DIH is the DataImportHandler. Please consult the two URLs http://wiki.apache.org/solr/DataImportHandler and http://wiki.apache.org/solr/DataImportHandlerFaq for further information. Cheers, Sven --On Monday, January 25, 2010 11:33:59 AM +0200 Bogdan Vatkov wrote: Hi Amit, What is D

Re: Index gets deleted after commit?

2010-01-25 Thread Bogdan Vatkov
Hi Amit, What is DIH? (I am a Solr newbie.) In the meantime I resolved my issue - it was a very stupid one - one of the files in my folder with XMLs (that I send to Solr with the SimplePostTool), and actually the latest created one (so it got executed last each time I run the folder), contained *:*..

Re: Beyond Basic Faceted Search (SOLR-236|SOLR-64|SOLR-792)

2010-01-25 Thread David MARTIN
Hi Kelly, Did you succeed in using these patches? It seems I've got the same need as you: to be able to collapse all product variations (SKU) under a single line (the product). As I'm beginning today with the field collapse patch, I'm still looking for the best solution for this need. Maybe someo