Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Judioo
Hi, I'm new to solr so apologies if the solution is already documented. I have installed and populated a solr index using the examples as a template with a version of the data below. I have XML in the form of 123898-2092099098982 Blu-Ray 2011-05-05T11:25:35+0500

Re: Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Judioo
The data is being imported directly from mysql. The document is however indeed a good starting place. Thanks 2011/5/18 Yury Kats > On 5/18/2011 4:19 PM, Judioo wrote: > > > Any help is greatly appreciated. Pointers to documentation that address > my > > issues is even more

Re: Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Judioo
as is? 2011/5/18 Yury Kats > On 5/18/2011 4:19 PM, Judioo wrote: > > > Any help is greatly appreciated. Pointers to documentation that address > my > > issues is even more helpful. > > I think this would be a good start: > > http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource >

Solr Indexing Patterns

2011-06-03 Thread Judioo
What is the "best practice" method to index the following in Solr: I'm attempting to use solr for a book store site. Each book will have a price but on occasions this will be discounted. The discounted price exists for a defined time period but there may be many discount periods. Each discount wi
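To make the requirement in the question above concrete, here is a minimal sketch in plain Python of a book with several discount periods and a lookup for the price that applies at a given moment. It only illustrates the data shape being asked about; the field names (`price`, `discounts`, `start`, `end`) and the sample values are assumptions, not a Solr schema and not the approach recommended in the thread.

```python
from datetime import datetime, timezone

# Hypothetical book record: field names and values are illustrative only.
book = {
    "title": "Example Book",
    "price": 10.00,
    "discounts": [
        {
            "start": datetime(2011, 6, 1, tzinfo=timezone.utc),
            "end": datetime(2011, 6, 8, tzinfo=timezone.utc),
            "price": 7.50,
        },
        {
            "start": datetime(2011, 7, 1, tzinfo=timezone.utc),
            "end": datetime(2011, 7, 15, tzinfo=timezone.utc),
            "price": 8.00,
        },
    ],
}


def effective_price(book, at):
    """Return the discounted price if `at` falls inside a discount period,
    otherwise the regular price. Periods are treated as [start, end)."""
    for period in book["discounts"]:
        if period["start"] <= at < period["end"]:
            return period["price"]
    return book["price"]
```

Calling `effective_price(book, datetime(2011, 6, 3, tzinfo=timezone.utc))` picks the first discount period, while a date outside both periods falls back to the regular price.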

Re: Solr Indexing Patterns

2011-06-03 Thread Judioo
ply > re-index the book information with a multiValued "discounts" field > and get something similar to your example (&wt=json) > > > Best > Erick > > On Fri, Jun 3, 2011 at 8:38 AM, Judioo wrote: > > What is the "best practice" method to index

Re: Solr Indexing Patterns

2011-06-06 Thread Judioo
discounts currently active * get all books using ISBN retrieved from above search Not that bad. However what happens when I want "all books that are currently on discount in the "horror" genre containing the word 'elm' in the title." The only way I can see in caterin

Re: Solr Indexing Patterns

2011-06-06 Thread Judioo
gets, and unfortunately the only way to figure that out is to test. > > But that's the first approach I'd try. > > Good luck! > Erick > > On Mon, Jun 6, 2011 at 11:42 AM, Judioo wrote: > > On 5 June 2011 14:42, Erick Erickson wrote: > > > >> See:

Re: Solr Indexing Patterns

2011-06-06 Thread Judioo
example of the best method to approach my problem, although Erick has helped me understand the limitations of Solr. Just thought I'd say. On 6 June 2011 20:26, Judioo wrote: > Thanks > > > On 6 June 2011 19:32, Erick Erickson wrote: > >> #Everybody# (includ

Re: Solr Indexing Patterns

2011-06-09 Thread Judioo
d > field-collapsing. Each, IMO, is trying to deal with different aspects of > dealing with hierarchical or multi-class data, or data that is entities > with relationships. ). > > > On 6/6/2011 3:43 PM, Judioo wrote: > >> I do think that Solr would be better served if there was

Pattern: Is there a method of resolving multivalued date ranges into a single document?

2011-06-11 Thread Judioo
Hi All, Question on best methods again :) I have the following type of document. Tron . where theater identifies the place where the film is showing. Each theater is stored in another document. I want to store the timings in the same document as the film details.

Boost Strangeness

2011-06-15 Thread Judioo
Hi, I'm confused about exactly how boost relevancy scores work. Apologies if I am violating this group's etiquette, but I could not find Solr's pastebin anywhere. I have 2 document types but want to return any documents where the requested ID appears. The ID appears in multiple attributes but I w

Re: Boost Strangeness

2011-06-15 Thread Judioo
Apologies I have tried that method as well. /solr/select/?q=b007vty6&defType=dismax&qf=id^10%20parent_id^9%20brand_container_id^8%20series_container_id^8%20subseries_container_id^8%20clip_container_id^1%20clip_episode_id^1&debugQuery=on&fl=id,parent_id,brand_container_id,series_container_id,subser
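For readers who want to reproduce a request like the one above, the following is a small sketch that assembles the dismax parameters from a table of field boosts. The field names and boost values are copied from the `qf` parameter in the message; the helper name `build_dismax_params` is made up for illustration, and the `fl` parameter is left out because it is cut off in the archive snippet.

```python
from urllib.parse import urlencode

# Field boosts copied from the qf parameter in the message above.
FIELD_BOOSTS = {
    "id": 10,
    "parent_id": 9,
    "brand_container_id": 8,
    "series_container_id": 8,
    "subseries_container_id": 8,
    "clip_container_id": 1,
    "clip_episode_id": 1,
}


def build_dismax_params(query, boosts):
    """Build request parameters for a dismax query.

    Hypothetical helper: it joins each field and boost as `field^boost`,
    separated by spaces, which is the format the qf parameter expects.
    """
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    return {"q": query, "defType": "dismax", "qf": qf, "debugQuery": "on"}


params = build_dismax_params("b007vty6", FIELD_BOOSTS)
# urlencode percent-encodes the '^' characters and encodes spaces as '+'.
query_string = urlencode(params)
print("/solr/select/?" + query_string)
```

Keeping the boosts in one table makes it easy to change a single weight and compare the `debugQuery` output between runs.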

Re: Boost Strangeness

2011-06-15 Thread Judioo
so all attributes except 'id' are of type text. I didn't know that about the string type. So is my problem as described (that partial matches are contributing to the calculation), and does defining the field type as string solve this problem? Or is my understanding completely incorrect? Th

Re: Boost Strangeness

2011-06-15 Thread Judioo
String also does not seem to accept spaces. Currently the _id fields can contain multiple ids (using as a multiType alternative). This is why I used the text type. On 15 June 2011 12:16, Judioo wrote: > stored="true"/> > > so all attributes except 'id' are

Re: Boost Strangeness

2011-06-16 Thread Judioo
t > their relative scores. > > Finally, your next e-mail hints at what's happening. If you're > putting multiple tokens in some of these fields, the length > normalization may be causing the matches to score lower. You can > try disabling those calculations (omitNorms=&
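The length-normalization effect mentioned above can be illustrated with a few lines of Python. This is a sketch under the assumption that the default similarity in Lucene of that era scales a field's score by roughly 1 / sqrt(number of terms in the field); it ignores index-time boosts and the lossy single-byte encoding of the norm, so treat the numbers as indicative only.

```python
import math


def length_norm(num_terms):
    """Approximate length norm: 1 / sqrt(number of terms in the field).

    Simplified on purpose: boosts and the lossy byte encoding used when
    the norm is stored in the index are not modelled here.
    """
    return 1.0 / math.sqrt(num_terms)


# A field holding a single id versus a field holding four ids.
single_id = length_norm(1)
four_ids = length_norm(4)
print(single_id, four_ids)
```

Under this approximation a match in the four-id field contributes about half as much as the same match in a single-id field, which is the kind of difference that setting `omitNorms="true"` on the field is meant to remove.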

Boost Strangeness

2011-06-18 Thread Judioo
WONDERFUL! Just reporting back. This document is ACE http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters for explaining what the filters are and how to affect the analyzer. Erick, your statement "First, boosting isn't absolute" played on me, so I continued to investigate boosting. I found

Do unused indexes affect performance?

2011-06-24 Thread Judioo
Hi, As a proof of concept I have imported around 11 million documents into a Solr index. My schema file has multiple fields defined Above being the most important for my question. The average document has around 40 attributes. Each document has: * a minimum of 2 tdate fields (max of 10) *

Replication without configs

2011-06-27 Thread Judioo
I have replicated a Solr instance without configs, as the slave has its own config. The replication has failed. My plan was to use replication to remove the indexes I no longer wish to use, which is why the slave has a different schema.xml file. Does anyone know why the replication has failed? Th

Sorting by value of field

2011-06-29 Thread Judioo
Hi Say I have a field type in multiple documents which can be either type:bike type:boat type:car type:van and I want to order a search to give me documents in the following order type:car type:van type:boat type:bike Is there a way I can do this just using the &sort method? Thanks

Re: Sorting by value of field

2011-06-29 Thread Judioo
Thanks. Yes, this is the workaround I am currently using. Still wondering if the sort method can be used alone. On 29 June 2011 18:34, Michael Ryan wrote: > You could try adding a new int field (like "typeSort") that has the desired > sort values. So when adding a document with type:car, als
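The workaround quoted above can be sketched as a small pre-indexing step. The `typeSort` field name comes from the quoted suggestion and the desired order (car, van, boat, bike) from the original question; the rank numbers, the sample documents, and the helper name `add_type_sort` are assumptions for illustration. The final `sorted` call only simulates locally what a request with `sort=typeSort asc` would be expected to return.

```python
# Desired order from the question: car, van, boat, bike.
# The rank values are arbitrary; only their relative order matters.
TYPE_SORT_ORDER = {"car": 0, "van": 1, "boat": 2, "bike": 3}


def add_type_sort(doc):
    """Return a copy of the document with an integer typeSort field added."""
    enriched = dict(doc)
    enriched["typeSort"] = TYPE_SORT_ORDER[doc["type"]]
    return enriched


# Hypothetical documents, enriched before they are sent to the index.
docs = [
    {"id": "1", "type": "bike"},
    {"id": "2", "type": "car"},
    {"id": "3", "type": "boat"},
    {"id": "4", "type": "van"},
]
enriched = [add_type_sort(doc) for doc in docs]

# Local stand-in for sorting on the new field in ascending order.
ordered = sorted(enriched, key=lambda doc: doc["typeSort"])
print([doc["type"] for doc in ordered])
```

The mapping lives in one place, so changing the desired order means updating the table and re-indexing the affected documents.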