Re: Solr on Tomcat, how to use an external data directory?

2010-05-30 Thread Abdelhamid ABID
.. and to unset dataDir, just leave it blank:
<dataDir>${solr.data.dir:}</dataDir>

On Sun, May 30, 2010 at 12:15 AM, Chris Hostetter
wrote:

>
> : Most likely you forgot to set the data directory in solrconfig.xml;
> : this should help:
> : http://wiki.apache.org/solr/SolrConfigXml#dataDir_parameter
>
> right .. double check what the dataDir setting looks like ... if it's
> unset it uses "data" in your solr instance directory, but if it is set,
> it's (unfortunately) evaluated relative to the "current working directory"
> of your servlet container and some versions of solr had "./data" listed in
> the example solrconfig.xml
>
>
>
> -Hoss
>
>


-- 
Abdelhamid ABID
Software Engineer- J2EE / WEB


Re: Luke browser does not show non-String Solr fields?

2010-05-30 Thread Erick Erickson
The Solr admin page has access to (and uses) the field
definitions you've put in the config file. Luke has no
knowledge of this configuration; you have to choose
your analyzer from the drop-down and select the one
closest to what's in your config file for SOLR. Are you
perhaps using an analyzer in Luke that doesn't
play nice with the definitions in SOLR?

HTH
Erick
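
To make the point about analyzers concrete, here is a minimal toy sketch in Python. It is not Lucene or Luke code: the two "analyzers" and the sample values are hypothetical stand-ins, chosen only to show that a query can find nothing when it is analyzed differently from the way the field was indexed.

```python
# Toy illustration only: these "analyzers" are stand-ins, not Lucene classes.

def keyword_analyzer(text):
    """Keep the whole value as a single term, unchanged."""
    return [text]


def lowercase_whitespace_analyzer(text):
    """Split on whitespace and lowercase each token."""
    return [token.lower() for token in text.split()]


def build_index(values, analyzer):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, value in enumerate(values):
        for term in analyzer(value):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query_text, analyzer):
    """Analyze the query text and return the documents containing every term."""
    terms = analyzer(query_text)
    if not terms:
        return set()
    hits = [index.get(term, set()) for term in terms]
    return set.intersection(*hits)


# Index with one analyzer, then query with the same one and with a different one.
index = build_index(["ABC-123", "xyz 789"], keyword_analyzer)
same = search(index, "ABC-123", keyword_analyzer)
different = search(index, "ABC-123", lowercase_whitespace_analyzer)
```

With the matching analyzer the query term equals the indexed term and the document is found; with the mismatched one the query term becomes "abc-123", which was never indexed, so the result is empty.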


On Sat, May 29, 2010 at 10:55 PM, jlist9  wrote:

> I tried the stand-alone Luke tool (not Luke request handler) to browse
> a solr index and find a few strange things:
>
> 1. Queries like "id:123" which work fine in /solr/admin web interface
> returns nothing in Luke. "*:*" returns everything fine in Luke.
>
> 2. When Luke displays records with query "*:*", it shows the string
> values fine but the numeric fields and date fields shows blank. It shows
> DocID OK, though.
>
> Anyone else has tried Luke on a solr index?
>


Re: strange results with query and hyphened words

2010-05-30 Thread Sascha Szott

Hi Markus,

I was facing the same problem a few days ago and found an explanation in 
the mail archive that clarifies my question regarding the usage of 
Solr's WordDelimiterFilterFactory:


http://markmail.org/message/qoby6kneedtwd42h

Best,
Sascha

markus.rietz...@rzf.fin-nrw.de wrote:

i am wondering why a search term with a hyphen doesn't match.

my search term is "profi-auskunft". in WordDelimiterFilterFactory i have
catenateWords, so my understanding is that profi-auskunft would search
for profiauskunft. when i use the analysis panel in solr admin i see that
profi-auskunft matches a term "profiauskunft".

the analyse will show

Query Analyzer
WhitespaceTokenizerFactory
profi-auskunft
SynonymFilterFactory
profi-auskunft
StopFilterFactory
profi-auskunft

WordDelimiterFilterFactory

term position      1       2
term text          profi   auskunft
                           profiauskunft
term type          word    word
                           word
source start,end   0,5     6,14
                           0,15

LowerCaseFilterFactory
SnowballPorterFilterFactory

why are auskunft and profiauskunft in one column? how do they get
searched?

when i search "profiauskunft" i have 230 hits, when i now search for
"profi-auskunft" i get fewer hits. when i call the search with
debugQuery=on i see

body:"profi (auskunft profiauskunft)"

what does this query mean? profi and "auskunft or profiauskunft"?
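
One way to read that debug output is as a phrase query in which the second position accepts either of two terms: "profi" must be followed directly by "auskunft" or by "profiauskunft". The Python sketch below is a toy model of that reading, not Solr code. The analyze function is a hypothetical stand-in for WordDelimiterFilterFactory with catenateWords, and the sample documents are invented.

```python
def analyze(text):
    """Toy stand-in for word-delimiter analysis with catenated words.

    Returns a list of positions; each position is the set of terms at that slot.
    """
    positions = []
    for token in text.lower().split():
        parts = [part for part in token.split("-") if part]
        if not parts:
            continue
        for part in parts[:-1]:
            positions.append({part})
        last = {parts[-1]}
        if len(parts) > 1:
            # The catenated form shares the position of the last part.
            last.add("".join(parts))
        positions.append(last)
    return positions


def phrase_matches(doc_positions, query_positions):
    """True if, at some offset, each query slot shares a term with the document slot."""
    if not query_positions:
        return False
    span = len(query_positions)
    for start in range(len(doc_positions) - span + 1):
        if all(doc_positions[start + i] & query_positions[i] for i in range(span)):
            return True
    return False


query = analyze("profi-auskunft")
doc_hyphenated = analyze("die profi-auskunft hilft")
doc_joined = analyze("die profiauskunft hilft")

hyphenated_hit = phrase_matches(doc_hyphenated, query)
joined_hit = phrase_matches(doc_joined, query)
```

Under this model a document that contains only the joined word "profiauskunft" has no "profi" term in front of it, so the phrase does not match. That would be consistent with the hyphenated query returning fewer hits than a search for "profiauskunft".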

Re: Luke browser does not show non-String Solr fields?

2010-05-30 Thread jlist9
I find in the Plugins tab that the default is PersianAnalyzer. I switched
to StandardAnalyzer and tried a few different Lucene Compatibility values
but it didn't help :-(

On Sun, May 30, 2010 at 4:40 AM, Erick Erickson  wrote:
> The Solr admin page has access to (and uses) the field
> definitions you've put in the config file. Luke has no
> knowledge of this configuration, you have to choose
> your analyzer from the drop down and select the one
> closest to what's in your config file for SOLR. Are you
> perhaps using an analyzer in Luke that doesn't
> play nice with the definitions in SOLR?
>
> HTH
> Erick
>
>
> On Sat, May 29, 2010 at 10:55 PM, jlist9  wrote:
>
>> I tried the stand-alone Luke tool (not Luke request handler) to browse
>> a solr index and find a few strange things:
>>
>> 1. Queries like "id:123" which work fine in /solr/admin web interface
>> returns nothing in Luke. "*:*" returns everything fine in Luke.
>>
>> 2. When Luke displays records with query "*:*", it shows the string
>> values fine but the numeric fields and date fields shows blank. It shows
>> DocID OK, though.
>>
>> Anyone else has tried Luke on a solr index?
>>
>


Re: Solr on Tomcat, how to use an external data directory?

2010-05-30 Thread jlist9
The JVM arg seems to override that just fine:
-Dsolr.data.dir=/opt/solr/example/data
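
As a rough sketch of how these pieces fit together, the Python below models the ${name:default} placeholder and the directory rules described earlier in this thread. It is an illustration under assumptions, not Solr's actual implementation: the function names are made up, and every path other than /opt/solr/example/data is hypothetical.

```python
import posixpath
import re

# Matches ${name} or ${name:default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def substitute(text, props):
    """Replace ${name:default} with props[name], or with the default if unset."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)
    return _PLACEHOLDER.sub(repl, text)


def resolve_data_dir(configured, instance_dir, cwd):
    """Apply the rules from the thread: blank means <instance dir>/data,
    and a relative value is resolved against the container's working directory."""
    if not configured:
        return posixpath.join(instance_dir, "data")
    if posixpath.isabs(configured):
        return configured
    return posixpath.normpath(posixpath.join(cwd, configured))


instance_dir = "/opt/solr/example/solr"   # hypothetical Solr home
tomcat_cwd = "/usr/share/tomcat"          # hypothetical working directory

unset = substitute("${solr.data.dir:}", {})
overridden = substitute("${solr.data.dir:}",
                        {"solr.data.dir": "/opt/solr/example/data"})

default_dir = resolve_data_dir(unset, instance_dir, tomcat_cwd)
override_dir = resolve_data_dir(overridden, instance_dir, tomcat_cwd)
relative_dir = resolve_data_dir("./data", instance_dir, tomcat_cwd)
```

The relative case is the one that surprises people: "./data" ends up under the servlet container's working directory rather than under the Solr instance directory.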

On Sun, May 30, 2010 at 12:14 AM, Abdelhamid  ABID  wrote:
> .. and to unset dataDir, just leave it blank:
> <dataDir>${solr.data.dir:}</dataDir>
>
> On Sun, May 30, 2010 at 12:15 AM, Chris Hostetter
> wrote:
>
>>
>> : Most likely you forgot to set the data directory in solrconfig.xml;
>> : this should help:
>> : http://wiki.apache.org/solr/SolrConfigXml#dataDir_parameter
>>
>> right .. double check what the dataDir setting looks like ... if it's
>> unset it uses "data" in your solr instance directory, but if it is set,
>> it's (unfortunately) evaluated relative to the "current working directory"
>> of your servlet container and some versions of solr had "./data" listed in
>> the example solrconfig.xml
>>
>>
>>
>> -Hoss
>>
>>
>
>
> --
> Abdelhamid ABID
> Software Engineer- J2EE / WEB
>


Re: Luke browser does not show non-String Solr fields?

2010-05-30 Thread Erick Erickson
Then you have to provide a lot more detail about what you did
and what you're seeing and what you think you should see. You
might review this page:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Sun, May 30, 2010 at 1:41 PM, jlist9  wrote:

> I find in the Plugins tab that the default is PersianAnalyzer. I switched
> to StandardAnalyzer and tried a few different Lucene Compatibility values
> but it didn't help :-(
>
> On Sun, May 30, 2010 at 4:40 AM, Erick Erickson 
> wrote:
> > The Solr admin page has access to (and uses) the field
> > definitions you've put in the config file. Luke has no
> > knowledge of this configuration, you have to choose
> > your analyzer from the drop down and select the one
> > closest to what's in your config file for SOLR. Are you
> > perhaps using an analyzer in Luke that doesn't
> > play nice with the definitions in SOLR?
> >
> > HTH
> > Erick
> >
> >
> > On Sat, May 29, 2010 at 10:55 PM, jlist9  wrote:
> >
> >> I tried the stand-alone Luke tool (not Luke request handler) to browse
> >> a solr index and find a few strange things:
> >>
> >> 1. Queries like "id:123" which work fine in /solr/admin web interface
> >> returns nothing in Luke. "*:*" returns everything fine in Luke.
> >>
> >> 2. When Luke displays records with query "*:*", it shows the string
> >> values fine but the numeric fields and date fields shows blank. It shows
> >> DocID OK, though.
> >>
> >> Anyone else has tried Luke on a solr index?
> >>
> >
>


Re: Luke browser does not show non-String Solr fields?

2010-05-30 Thread jlist9
Oh, here's a modified/improved version of what I described in my first email:

1. Queries like "id:123" work fine in the /solr/admin web interface but
return nothing in Luke. Query "*:*" returns all records fine in Luke. I
expected Luke to return the same result as /solr/admin, since it's
essentially a Lucene query.

2. When Luke displays records with query "*:*", it shows the string
values fine but the numeric fields and date fields show blank. It shows
DocID OK, though. I expect Luke to be able to show non-string values, too.

On Sun, May 30, 2010 at 10:57 AM, Erick Erickson
 wrote:
> Then you have to provide a lot more detail about what you did
> and what you're seeing and what you think you should see. You
> might review this page:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
> On Sun, May 30, 2010 at 1:41 PM, jlist9  wrote:
>
>> I find in the Plugins tab that the default is PersianAnalyzer. I switched
>> to StandardAnalyzer and tried a few different Lucene Compatibility values
>> but it didn't help :-(
>>
>> On Sun, May 30, 2010 at 4:40 AM, Erick Erickson 
>> wrote:
>> > The Solr admin page has access to (and uses) the field
>> > definitions you've put in the config file. Luke has no
>> > knowledge of this configuration, you have to choose
>> > your analyzer from the drop down and select the one
>> > closest to what's in your config file for SOLR. Are you
>> > perhaps using an analyzer in Luke that doesn't
>> > play nice with the definitions in SOLR?
>> >
>> > HTH
>> > Erick
>> >
>> >
>> > On Sat, May 29, 2010 at 10:55 PM, jlist9  wrote:
>> >
>> >> I tried the stand-alone Luke tool (not Luke request handler) to browse
>> >> a solr index and find a few strange things:
>> >>
>> >> 1. Queries like "id:123" which work fine in /solr/admin web interface
>> >> returns nothing in Luke. "*:*" returns everything fine in Luke.
>> >>
>> >> 2. When Luke displays records with query "*:*", it shows the string
>> >> values fine but the numeric fields and date fields shows blank. It shows
>> >> DocID OK, though.
>> >>
>> >> Anyone else has tried Luke on a solr index?
>> >>
>> >
>>
>


Re: Luke browser does not show non-String Solr fields?

2010-05-30 Thread Erick Erickson
No, not nearly enough information.

You haven't shown the SOLR field type definitions.

You haven't provided, say, the output from SOLR if you add &debugQuery=on.

You haven't shown the terms from either SOLR admin or Luke that they
actually see in the index.

You haven't identified the version of SOLR that the index was created with.

You haven't identified the version of Luke you're using.

Both SOLR and Luke use Lucene under the covers, so the problem is almost
certainly your expectations...

Imagine a co-worker from another department has provided you the information
you've provided us. What could you say?

Best
Erick

On Sun, May 30, 2010 at 2:19 PM, jlist9  wrote:

> Oh, here's a modified/improved version of what I described in my first
> email:
>
> 1. Queries like "id:123" work fine in the /solr/admin web interface but
> return nothing in Luke. Query "*:*" returns all records fine in Luke. I
> expected Luke to return the same result as /solr/admin, since it's
> essentially a Lucene query.
>
> 2. When Luke displays records with query "*:*", it shows the string
> values fine but the numeric fields and date fields show blank. It shows
> DocID OK, though. I expect Luke to be able to show non-string values, too.
>
> On Sun, May 30, 2010 at 10:57 AM, Erick Erickson
>  wrote:
> > Then you have to provide a lot more detail about what you did
> > and what you're seeing and what you think you should see. You
> > might review this page:
> > http://wiki.apache.org/solr/UsingMailingLists
> >
> > Best
> > Erick
> >
> > On Sun, May 30, 2010 at 1:41 PM, jlist9  wrote:
> >
> >> I find in the Plugins tab that the default is PersianAnalyzer. I switched
> >> to StandardAnalyzer and tried a few different Lucene Compatibility values
> >> but it didn't help :-(
> >>
> >> On Sun, May 30, 2010 at 4:40 AM, Erick Erickson <erickerick...@gmail.com>
> >> wrote:
> >> > The Solr admin page has access to (and uses) the field
> >> > definitions you've put in the config file. Luke has no
> >> > knowledge of this configuration, you have to choose
> >> > your analyzer from the drop down and select the one
> >> > closest to what's in your config file for SOLR. Are you
> >> > perhaps using an analyzer in Luke that doesn't
> >> > play nice with the definitions in SOLR?
> >> >
> >> > HTH
> >> > Erick
> >> >
> >> >
> >> > On Sat, May 29, 2010 at 10:55 PM, jlist9  wrote:
> >> >
> >> >> I tried the stand-alone Luke tool (not Luke request handler) to browse
> >> >> a solr index and find a few strange things:
> >> >>
> >> >> 1. Queries like "id:123" which work fine in /solr/admin web interface
> >> >> returns nothing in Luke. "*:*" returns everything fine in Luke.
> >> >>
> >> >> 2. When Luke displays records with query "*:*", it shows the string
> >> >> values fine but the numeric fields and date fields shows blank. It shows
> >> >> DocID OK, though.
> >> >>
> >> >> Anyone else has tried Luke on a solr index?
> >> >>
> >> >
> >>
> >
>


TikaEntityProcessor not working?

2010-05-30 Thread Brad Greenlee
Hi. I'm trying to get Solr to index a database in which one column is a 
filename of a PDF document I'd like to index. My configuration looks like this:

[configuration XML missing from the archived message]

I'm using Solr from trunk (as of two days ago). The import process completes 
without errors, and it picks up the columns from the database, but not the 
content from the PDF file. It is definitely trying to access the PDF file, for 
if I give it an incorrect path name, it complains. It doesn't seem to be 
attempting to index the PDF, though, as it completes in about 40ms, whereas if 
I import the PDF via the ExtractingRequestHandler, it takes about 11 seconds to 
index it.

I've also tried the Tika example in example-DIH, and that doesn't seem to index
anything either. Am I doing something wrong, or is this just not working yet?

Cheers,

Brad



Re: Storing different entities in Solr

2010-05-30 Thread Bill Au
There is only one primary key in a single index.  If the ids of your
different document types collide, you can simply add a prefix or suffix
to make them unique.

Bill
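
A minimal Python sketch of the prefix idea, combined with the type field and filter query mentioned earlier in the thread. The field names "id" and "type", the "-" separator, and the helper names are assumptions for illustration; the thread does not prescribe them.

```python
from urllib.parse import parse_qs, urlencode


def make_doc_id(doc_type, source_id):
    """Prefix the source id with the document type so ids cannot collide."""
    return f"{doc_type}-{source_id}"


def make_doc(doc_type, source_id, fields):
    """Build a document dict with a unique id and a type field for filtering."""
    doc = dict(fields)
    doc["id"] = make_doc_id(doc_type, source_id)
    doc["type"] = doc_type
    return doc


def select_params(query, doc_type):
    """Query-string parameters that restrict a search to one document type."""
    return urlencode({"q": query, "fq": f"type:{doc_type}"})


# Two entities share the numeric id 42 in their source tables.
consultant = make_doc("consultant", 42, {"name": "Smith"})
request = make_doc("request", 42, {"title": "Portfolio review"})

params = select_params("name:smith", "consultant")
```

Because the prefixed id is still a single unique key, updates and deletes can target one document without touching the other entity that has the same source id.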

On Fri, May 28, 2010 at 1:12 PM, Moazzam Khan  wrote:

> Thanks for all your answers guys. Requests and consultants have a many
> to many relationship so I can't store request info in a document with
> advisorID as the primary key.
>
> Bill's solution and multicore solutions might be what I am looking
> for. Bill, will I be able to have 2 primary keys (so I can update and
> delete documents)? If yes, can you please give me a link or something
> where I can get more info on this?
>
> Thanks,
> Moazzam
>
>
>
> On Fri, May 28, 2010 at 11:50 AM, Bill Au  wrote:
> > You can keep different types of documents in the same index.  If each
> > document has a type field, you can restrict your searches to specific
> > type(s) of document by using a filter query, which is very fast and
> > efficient.
> >
> > Bill
> >
> > On Fri, May 28, 2010 at 12:28 PM, Nagelberg, Kallin <
> > knagelb...@globeandmail.com> wrote:
> >
> >> Multi-core is an option, but keep in mind if you go that route you will
> >> need to do two searches to correlate data between the two.
> >>
> >> -Kallin Nagelberg
> >>
> >> -----Original Message-----
> >> From: Robert Zotter [mailto:robertzot...@gmail.com]
> >> Sent: Friday, May 28, 2010 12:26 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Storing different entities in Solr
> >>
> >>
> >> Sounds like you'll want to use a multiple core setup. One core for each
> >> type
> >> of "document"
> >>
> >> http://wiki.apache.org/solr/CoreAdmin
> >> --
> >> View this message in context:
> >>
> http://lucene.472066.n3.nabble.com/Storing-different-entities-in-Solr-tp852299p852346.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >
>


Re: Prefix-Search with Stopwords - no results?

2010-05-30 Thread Gert Brinkmann

On 28.05.2010 22:06, Chris Hostetter wrote:

> and one "text_prefix"
> defined similarly but with an additional EdgeNGramTokenFilter used when
> indexing to generate "prefix" tokens. then search those fields using
> dismax...


To be sure that I understand this right:

Am I right that I should not apply stopword filtering to the
EdgeNGramTokenFilter field? Otherwise I would run into the same problems
again, wouldn't I?


Or, if stopword filtering is OK on this field: do you filter the
stopwords before or after EdgeNGram tokenizing?


Thanks,
Gert
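
To show why the order in that question matters, here is a small Python sketch. It does not settle which setup is right for the "text_prefix" field. The stopword list, the gram sizes, and the function names are all hypothetical, and the code only mimics the idea of edge n-grams rather than reproducing the Solr filters.

```python
STOPWORDS = {"a", "an", "the"}  # hypothetical stopword list


def edge_ngrams(token, min_size=1, max_size=10):
    """Return the prefixes of token with lengths from min_size to max_size."""
    longest = min(max_size, len(token))
    return [token[:n] for n in range(min_size, longest + 1)]


def stop_then_ngram(tokens):
    """Drop stopwords first, then generate prefixes of what remains."""
    grams = []
    for token in tokens:
        if token not in STOPWORDS:
            grams.extend(edge_ngrams(token))
    return grams


def ngram_then_stop(tokens):
    """Generate prefixes first, then drop any prefix that is a stopword."""
    grams = []
    for token in tokens:
        grams.extend(g for g in edge_ngrams(token) if g not in STOPWORDS)
    return grams


tokens = ["the", "answer"]
before = stop_then_ngram(tokens)
after = ngram_then_stop(tokens)
```

Filtering first removes "the" entirely, so the prefix "th" can no longer match it. Filtering afterwards keeps "t" and "th" but throws away "a" and "an", which are legitimate prefixes of "answer". Either order loses some prefix tokens in this toy model.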


Re: Luke browser does not show non-String Solr fields?

2010-05-30 Thread jlist9
Sorry. Let me add more info. (I assumed that anyone who tried it would see
the problem right away but that might not be the case.)

> You haven't shown the SOLR field type definitions.

Values of all non-String types in my index are not being shown.
In my case, this includes long, tint and date types.

> You haven't provided, say, the output from SOLR if you add &debugQuery=on.

The Solr side of things seems to work fine. Is this still needed?

> You haven't shown the terms from either SOLR admin or Luke that they
> actually see in the index.

I'm not sure if I understand this question. Basically, if I do a query "id:694"
in /solr/admin, I get the result. But the same query doesn't return anything
in Luke.

> You haven't identified the version of SOLR that the index was created with

Solr 1.4

> You haven't identified the version of Luke you're using.

Luke 1.0.1 (2010-04-01)

> Both SOLR and Luke use Lucene under the covers, so the problem is almost
> certainly your expectations...

This may well be the case. Not knowing the details, I was still surprised
since I expected the same search behavior for simple queries between
solr and luke.

> Imagine a co-worker from another department has provided you the information
> you've provided us. What could you say?

I might give it a try and see the same thing :-D

Thanks

> On Sun, May 30, 2010 at 2:19 PM, jlist9  wrote:
>
>> Oh, here's a modified/improved version of what I described in my first
>> email:
>>
>> 1. Queries like "id:123" work fine in the /solr/admin web interface but
>> return nothing in Luke. Query "*:*" returns all records fine in Luke. I
>> expected Luke to return the same result as /solr/admin, since it's
>> essentially a Lucene query.
>>
>> 2. When Luke displays records with query "*:*", it shows the string
>> values fine but the numeric fields and date fields show blank. It shows
>> DocID OK, though. I expect Luke to be able to show non-string values, too.
>>
>> On Sun, May 30, 2010 at 10:57 AM, Erick Erickson
>>  wrote:
>> > Then you have to provide a lot more detail about what you did
>> > and what you're seeing and what you think you should see. You
>> > might review this page:
>> > http://wiki.apache.org/solr/UsingMailingLists
>> >
>> > Best
>> > Erick
>> >
>> > On Sun, May 30, 2010 at 1:41 PM, jlist9  wrote:
>> >
>> >> I find in the Plugins tab that the default is PersianAnalyzer. I switched
>> >> to StandardAnalyzer and tried a few different Lucene Compatibility values
>> >> but it didn't help :-(
>> >>
>> >> On Sun, May 30, 2010 at 4:40 AM, Erick Erickson <erickerick...@gmail.com>
>> >> wrote:
>> >> > The Solr admin page has access to (and uses) the field
>> >> > definitions you've put in the config file. Luke has no
>> >> > knowledge of this configuration, you have to choose
>> >> > your analyzer from the drop down and select the one
>> >> > closest to what's in your config file for SOLR. Are you
>> >> > perhaps using an analyzer in Luke that doesn't
>> >> > play nice with the definitions in SOLR?
>> >> >
>> >> > HTH
>> >> > Erick
>> >> >
>> >> >
>> >> > On Sat, May 29, 2010 at 10:55 PM, jlist9  wrote:
>> >> >
>> >> >> I tried the stand-alone Luke tool (not Luke request handler) to browse
>> >> >> a solr index and find a few strange things:
>> >> >>
>> >> >> 1. Queries like "id:123" which work fine in /solr/admin web interface
>> >> >> returns nothing in Luke. "*:*" returns everything fine in Luke.
>> >> >>
>> >> >> 2. When Luke displays records with query "*:*", it shows the string
>> >> >> values fine but the numeric fields and date fields shows blank. It shows
>> >> >> DocID OK, though.
>> >> >>
>> >> >> Anyone else has tried Luke on a solr index?
>> >> >>
>> >> >
>> >>
>> >
>>
>