Newbie CVS problem

2008-03-16 Thread tim robertson
Hi All,
I installed Solr today and am trying to get CSV files indexed, but I can't
seem to get any hits.

Using a fresh 1.2 install, I am using the shipped schema and the
books.csv in the example.

It seems to upload ok:

[EMAIL PROTECTED]//Users/timrobertson/dev/apache-solr-nightly/example-tim/exampledocs$
curl http://localhost:8983/solr/update/csv --data-binary @books.csv -H 'Content-type:text/plain; charset=utf-8'


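<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">17</int></lst>
</response>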


But a search for Black returns no results - this is the URL I am using:

http://localhost:8983/solr/select/?q=Black&version=2.2&start=0&rows=10&indent=on


I am a complete newbie, but looking at the schema I thought the Name column
would end up indexed.


Could someone please tell me what I am missing?


Many Thanks


Tim


Re: Newbie CVS problem

2008-03-16 Thread Yonik Seeley
You won't see anything until the results are committed.
Try something like http://localhost:8983/solr/update/csv?commit=true
to commit after adding all the docs.

post.sh in exampledocs also has an example at the end of how to send a
commit command separately.
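
For reference, a minimal Python sketch of both options (this is not taken
from post.sh). It builds the CSV upload with commit=true and, separately, a
standalone commit request. The base URL and content types come from this
thread; the helper names are hypothetical, and posting <commit/> to
/solr/update is assumed to mirror what post.sh does at the end.

```python
import urllib.request

# Base URL as used in this thread; adjust for your install.
SOLR = "http://localhost:8983/solr"


def csv_update_request(csv_bytes, commit=True):
    """Build a POST to the CSV handler, optionally committing in the same call."""
    url = SOLR + "/update/csv" + ("?commit=true" if commit else "")
    headers = {"Content-type": "text/plain; charset=utf-8"}
    return urllib.request.Request(url, data=csv_bytes, headers=headers)


def commit_request():
    """Build a standalone commit, sent as an XML update message."""
    headers = {"Content-type": "text/xml; charset=utf-8"}
    return urllib.request.Request(SOLR + "/update", data=b"<commit/>", headers=headers)
```

Passing either request to urllib.request.urlopen() sends it. Until a commit
goes through, newly added documents stay invisible to searches.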

-Yonik



Re: Newbie CVS problem

2008-03-16 Thread tim robertson
Ah - perfect
Thanks!




Performance of Filter Query

2008-03-16 Thread Fuad Efendi
I just noticed a huge difference in performance between very simple fq-type
queries and standard Lucene queries:
2ms vs. 2ms

where the 'distribution' of the queried single-valued field is extremely low,
such as
fq=country:USA

Standard query  is 1 times faster than less intelligent


Has anyone experienced similar stuff?

It's probably specific to [* TO *] which was stupid in this case...

Thanks


 



Re: AlphaNumeric search in Solr

2008-03-16 Thread solr_user

Hi Yonik,

  Removing the WordDelimiterFilter did the trick.  Now I am able to get
results back for alphanumeric searches.  What other side effects will
removing the WordDelimiterFilter cause?

Thanks


Yonik Seeley wrote:
> 
> On Fri, Mar 14, 2008 at 10:15 PM, solr_user <[EMAIL PROTECTED]> wrote:
>>   No, this index was not generated using Solr.  I just have the index
>>  files without access to the source that generated those files.  Is there
>>  a way that I can change my Solr schema so that it won't split axd110
>>  into two tokens?
> 
> The ideal answer would be to use the analyzer that generated the index.
> If you don't exactly know, remove the WordDelimiterFilter for starts.
> 
> -Yonik
> 
> 

-- 
View this message in context: 
http://www.nabble.com/AlphaNumeric-search-in-Solr-tp16063101p16084533.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: AlphaNumeric search in Solr

2008-03-16 Thread Yonik Seeley
On Sun, Mar 16, 2008 at 5:43 PM, solr_user <[EMAIL PROTECTED]> wrote:
>   Removing the WordDelimiterFilter did the trick.  Now I am able to get
>  results back for alphanumeric search.  What other side effect will removing
>  the WordDelimiterFilter cause.

It all depends on what type of matching you want.  WordDelimiterFilter
is one method of making queries like atari800xl, Atari-800XL, atari
800 XL, Atari 800/XL, etc, all match a document containing Atari
800XL.
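
To illustrate the idea, here is a toy Python approximation of that kind of
splitting. It is only a sketch, not the real filter: the regular expression
and the function name are made up for this example, the actual
WordDelimiterFilter has more options (such as catenating parts), and the
lowercasing shown here is normally done by a separate filter in the
analyzer chain.

```python
import re

# Rough stand-in for the filter's splitting rules: break on punctuation,
# letter/digit boundaries, and lower-to-upper case changes.
_SUBWORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_subwords(text):
    """Return the lowercased sub-word tokens of a piece of text."""
    return [tok.lower() for tok in _SUBWORD.findall(text)]


variants = ["atari800xl", "Atari-800XL", "atari 800 XL", "Atari 800/XL", "Atari 800XL"]
for variant in variants:
    print(variant, "->", split_subwords(variant))
```

Every variant reduces to the same three tokens, which is why they can all
match the same document when the field is analyzed this way at both index
and query time.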

-Yonik


Re: AlphaNumeric search in Solr

2008-03-16 Thread solr_user

Hi Yonik,

  I read some documentation on the WordDelimiterFilter.  Just to clarify my
thinking, I understand that if I use WordDelimiterFilter and search for a
term like axd100 it will break it into two tokens "axd" and "100".  But then
when I do my search should Solr match the documents containing both these
tokens?

  In my application when I try to search for "axd 100" I get several
documents back, but when I search for axd100 with WordDelimiterFilter on, I
don't get back any results.  I was assuming that if WordDelimiterFilter
breaks axd100 into two tokens - "axd" and "100", then the search should
behave exactly as if I was searching for the string "axd 100".

Thanks.







RE: Finding an empty field

2008-03-16 Thread Chris Hostetter

: It was a surprise to discover that 
:   dateorigin_sort:""
: is a syntax error, but
:   dateorigin_sort:["" TO *]
: is legit. This says that there's a bug in the Lucene syntax parser?

If it was, it was in the old parser ... using trunk Solr (with Lucene 
2.3.1) they both work for me.

Did you try dateorigin_sort:["" TO ""] then?
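
Since the quotes and brackets in that query need escaping when it goes into
a URL, here is a small Python sketch that builds the request. The host,
port, and helper name are assumptions for illustration; only the field name
and the query syntax come from this thread.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust for your install.
SOLR_SELECT = "http://localhost:8983/solr/select/"


def select_url(params):
    """Return a select URL with all parameter values percent-encoded."""
    return SOLR_SELECT + "?" + urlencode(params)


# Range query from the empty string to the empty string.
empty_value_url = select_url(
    {"q": 'dateorigin_sort:["" TO ""]', "rows": 10, "indent": "on"}
)
print(empty_value_url)
```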

: .../solr/select/?q=*:*&version=2.2&start=0&rows=0&indent=on&facet=true&facet.field=dateorigin_sort&facet.mincount=0&facet.sort=false

:  
:   
:   0

: Umm... it has an indexed empty value that does not correspond to a
: record?  Is it an unanchored data item in the index? Would optimizing
: make this index data go away?

That's odd ... the facet field code should ignore terms that are only in 
deleted documents (are you sure your "standard" qt doesn't have a 
default or invariant "fq" param in the solrconfig.xml?) but even if I'm 
wrong: yes, optimizing your index will expunge any terms which are only 
associated with deleted docs.



-Hoss



Re: question on xsl sytlesheet and update

2008-03-16 Thread Chris Hostetter

: 1. how to attach my own stylesheet to the output?

You can use the XSLTResponseWriter to get Solr to apply an XSLT before 
writing the response to the client...

http://wiki.apache.org/solr/XsltResponseWriter

...alternately there is a "stylesheet" param that can be used with the 
XMLResponseWriter that just causes it to include a stylesheet instruction 
in the response XML so the client can process it -- but it's fairly 
useless at the moment (it requires you to include your own stylesheet in 
the Solr WAR file or to configure some creative path mapping in your 
servlet container)
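
As a rough sketch of the first option, the Python below builds a select URL
that asks for the XSLT writer. It assumes the writer is registered under
the name "xslt" in solrconfig.xml and that a stylesheet called example.xsl
sits in the conf/xslt directory; both names, the host, and the helper are
assumptions to check against your own setup and the wiki page above.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust for your install.
SOLR_SELECT = "http://localhost:8983/solr/select/"


def xslt_select_url(query, stylesheet="example.xsl"):
    """Return a select URL whose response is transformed by the given stylesheet."""
    params = {"q": query, "wt": "xslt", "tr": stylesheet}
    return SOLR_SELECT + "?" + urlencode(params)


print(xslt_select_url("solr"))
```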

: 2. how the Solr do the update? What kind of update I can ask Solr to do?

That's a pretty vague question ... have you read the tutorial? Have you 
skimmed the main page of the Solr wiki -- particularly the section on 
"Indexing Documents"?

http://lucene.apache.org/solr/tutorial.html
http://wiki.apache.org/solr/



-Hoss



Re: AlphaNumeric search in Solr

2008-03-16 Thread Chris Hostetter

:   I read some documentation on the WordDelimterFilter.  Just to clarify my
: thinking, I understand that if I use WordDelimiterFilter and search for a
: term like axd100 it will break it into two tokens "axd" and "100".  But then
: when I do my search should Solr match the documents containing both these
: tokens?

Text analysis for the purposes of building and searching an inverted index 
like Lucene is all about having "complementary" tokenization/filtering at 
index time and at query time.

Yes, WordDelimiterFilter can help match on the types of things in your 
example, but only using it to analyze your query strings will not provide 
a magic bullet -- you have to have also used it with the appropriate 
settings at index time in order for the correct tokens to be indexed so 
you can find them when you query.

:   In my application when I try to search for "axd 100" I get several
: documents back, but when I search for axd100 with WordDelimiterFilter on, I
: don't get back any results.  I was assuming that if WordDelimiterFilter
: breaks axd100 into two tokens - "axd" and "100", then the search should
: behave exactly as if I was searching for the string "axd 100".

Not exactly, a better approximation would be searching for the *phrase* 
"axd 100" ... but it really depends on how you configure it.

Bottom line: if you don't have control over how your index is built, and 
it isn't built using any configuration of the WordDelimiterFilter, you are 
better off avoiding it at query time.
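
The mismatch can be shown with a toy inverted index in Python. This is only
an illustration under simplified assumptions: the two analyzers, the sample
documents, and the phrase matcher are invented for this sketch and do not
reproduce Lucene's actual analysis or scoring.

```python
import re


def whitespace_analyze(text):
    """Index-time analysis without any word splitting."""
    return text.lower().split()


def split_analyze(text):
    """Analysis that also splits at letter/digit boundaries and punctuation."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z]+|[0-9]+", text)]


def build_index(docs, analyze):
    """Map each token to {doc_id: [positions]}."""
    index = {}
    for doc_id, text in docs.items():
        for pos, tok in enumerate(analyze(text)):
            index.setdefault(tok, {}).setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, tokens):
    """Return ids of docs containing the tokens at consecutive positions."""
    hits = set()
    if not tokens:
        return hits
    for doc_id, positions in index.get(tokens[0], {}).items():
        for start in positions:
            if all(start + i in index.get(tok, {}).get(doc_id, [])
                   for i, tok in enumerate(tokens)):
                hits.add(doc_id)
                break
    return hits


docs = {1: "part axd100 in stock", 2: "axd 100 series"}
raw_index = build_index(docs, whitespace_analyze)
split_index = build_index(docs, split_analyze)

# Splitting only at query time misses doc 1, which holds the single token "axd100".
print(phrase_search(raw_index, split_analyze("axd100")))
# The same analysis on both sides finds both documents.
print(phrase_search(split_index, split_analyze("axd100")))
```

With the unsplit index, the query has to be left unsplit as well to find
document 1, which is the situation described above.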




-Hoss



Re: Empty fields - dynamic

2008-03-16 Thread Chris Hostetter

: Is there a way to specify that a dynamic field cannot have an empty string?
: With static fields, you can enforce this with 'required="true"
: default="-1"'.

Uh... are you sure about that?  required="true" just says there must be a 
value supplied; the empty string is still a value. (I just verified that 
an empty string is a legal value even if required="true".)

: Is there any way to do enforce this in the shipped Solr 1.2?  One could
: write a new custom analyzer that requires input. But is there anything
: available out of the box?

Not that I can think of.


-Hoss