from:"Brian Lamb"

Autocomplete

2011-09-01 Thread Brian Lamb

Hi all,

I've read numerous guides on how to set up autocomplete in Solr, and it works
great the way I have it now. My only complaint is that it only matches the
beginning of the word. For example, if I try to autocomplete "dober", I get
"Doberman" and "Doberman Pincher" but not "Pincher, Doberman". Here is how my
schema is configured:


   
 
 
 
   
   
 
 
   




How can I update my autocomplete so that it will match the middle of a word
as well as the beginning of the word?

Thanks,

Brian Lamb

Re: Autocomplete

2011-09-01 Thread Brian Lamb

I found that if I change



to



I can do autocomplete in the middle of a term.
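
The two configuration snippets above did not survive in the archive, so the exact change is unknown. A common cause of prefix-only matching is an edge n-gram filter, which emits only grams anchored at the start of each token, whereas a plain n-gram filter emits grams from every position. The Python sketch below illustrates that difference only. The function names and gram sizes are illustrative assumptions, not Solr APIs, and the lost configuration may have differed.

```python
def edge_ngrams(token, min_size=1, max_size=25):
    """Grams anchored at the start of the token (prefixes only)."""
    token = token.lower()
    upper = min(max_size, len(token))
    return {token[:n] for n in range(min_size, upper + 1)}


def ngrams(token, min_size=1, max_size=25):
    """Grams starting at every position in the token."""
    token = token.lower()
    grams = set()
    for start in range(len(token)):
        upper = min(max_size, len(token) - start)
        for n in range(min_size, upper + 1):
            grams.add(token[start:start + n])
    return grams


# Treat the whole field value as one token, as an untokenized field would.
value = "Pincher, Doberman"
print("dober" in edge_ngrams(value))  # False: "dober" is not a prefix
print("dober" in ngrams(value))       # True: grams start mid-string too
```

The trade-off is index size: emitting grams from every position produces many more terms per value than prefixes alone.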

Thanks!

Brian Lamb

On Thu, Sep 1, 2011 at 11:27 AM, Brian Lamb
wrote:

> Hi all,
>
> I've read numerous guides on how to set up autocomplete on solr and it
> works great the way I have it now. However, my only complaint is that it
> only matches the beginning of the word. For example, if I try to
> autocomplete "dober", I would only get, "Doberman", "Doberman Pincher" but
> not "Pincher, Doberman". Here is how my schema is configured:
>
>  positionIncrementGap="100">
>
>  
>  
>   maxGramSize="25" />
>
>
>  
>  
>
> 
>
>  stored="true" omitNorms="true" omitTermFreqAndPositions="true" />
>
> How can I update my autocomplete so that it will match the middle of a word
> as well as the beginning of the word?
>
> Thanks,
>
> Brian Lamb
>

Boosting is slow

2011-11-17 Thread Brian Lamb

Hi all,

I have about 20 million records in my solr index. I'm running into a
problem now where doing a boost drastically slows down my search
application. A typical query for me looks something like:

http://localhost:8983/solr/mycore/search/?q=test {!boost
b=product(sum(log(sum(myfield,1)),1),recip(ms(NOW,mydate_field),3.16e-11,1,8))}

I've tried several variations on the boost to see if that was the problem
but even when doing something simple like:

http://localhost:8983/solr/mycore/search/?q=test {!boost b=2}

it is still really slow. Is there a different approach I should be taking?

Thanks,

Brian Lamb

Re: Boosting is slow

2011-11-17 Thread Brian Lamb

Sorry, the query is actually:

http://localhost:8983/solr/mycore/search/?q=test{!boost
b=product(sum(log(sum(myfield,1)),1),recip(ms(NOW,mydate_field),3.16e-11,1,8))}&start=&sort=score+desc,mydate_field+desc&wt=xslt&tr=mysite.xsl
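
To make the boost expression in this query concrete, here is a small Python sketch that evaluates it for a single document. It assumes the documented behavior of Solr's function queries: log(x) is base 10, recip(x,m,a,b) computes a/(m*x+b), and ms(NOW,mydate_field) is the document's age in milliseconds. The sample values (myfield = 9, a one-year-old document) are made up for illustration. The sketch does not explain the slowdown; it only shows what is computed for each matching document.

```python
import math

# About 3.15e10 ms, so 3.16e-11 * age_ms is roughly the age in years.
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000


def recip(x, m, a, b):
    """Solr-style recip: a / (m*x + b)."""
    return a / (m * x + b)


def boost(myfield, age_ms):
    """product(sum(log(sum(myfield,1)),1), recip(age_ms,3.16e-11,1,8))"""
    popularity = math.log10(myfield + 1) + 1
    recency = recip(age_ms, 3.16e-11, 1, 8)
    return popularity * recency


print(boost(myfield=9, age_ms=0))            # brand-new document
print(boost(myfield=9, age_ms=MS_PER_YEAR))  # one-year-old document
```

With these constants the recency factor starts at 1/8 for a new document and falls to roughly 1/9 after a year, so the date term changes the score only gently.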

On Thu, Nov 17, 2011 at 2:59 PM, Brian Lamb
wrote:

> Hi all,
>
> I have about 20 million records in my solr index. I'm running into a
> problem now where doing a boost drastically slows down my search
> application. A typical query for me looks something like:
>
> http://localhost:8983/solr/mycore/search/?q=test {!boost
> b=product(sum(log(sum(myfield,1)),1),recip(ms(NOW,mydate_field),3.16e-11,1,8))}
>
> I've tried several variations on the boost to see if that was the problem
> but even when doing something simple like:
>
> http://localhost:8983/solr/mycore/search/?q=test {!boost b=2}
>
> it is still really slow. Is there a different approach I should be taking?
>
> Thanks,
>
> Brian Lamb
>
>

Re: Boosting is slow

2011-11-18 Thread Brian Lamb

Any ideas on this one?

On Thu, Nov 17, 2011 at 3:53 PM, Brian Lamb
wrote:

> Sorry, the query is actually:
>
> http://localhost:8983/solr/mycore/search/?q=test{!boost
> b=product(sum(log(sum(myfield,1)),1),recip(ms(NOW,mydate_field),3.16e-11,1,8))}&start=&sort=score+desc,mydate_field+desc&wt=xslt&tr=mysite.xsl
>
>
> On Thu, Nov 17, 2011 at 2:59 PM, Brian Lamb wrote:
>
>> Hi all,
>>
>> I have about 20 million records in my solr index. I'm running into a
>> problem now where doing a boost drastically slows down my search
>> application. A typical query for me looks something like:
>>
>> http://localhost:8983/solr/mycore/search/?q=test {!boost
>> b=product(sum(log(sum(myfield,1)),1),recip(ms(NOW,mydate_field),3.16e-11,1,8))}
>>
>> I've tried several variations on the boost to see if that was the problem
>> but even when doing something simple like:
>>
>> http://localhost:8983/solr/mycore/search/?q=test {!boost b=2}
>>
>> it is still really slow. Is there a different approach I should be taking?
>>
>> Thanks,
>>
>> Brian Lamb
>>
>>
>

MySQL data import

2011-12-11 Thread Brian Lamb

Hi all,

I have a few questions about how the MySQL data import works. It seems it
creates a separate connection for each entity I create. Is there any way to
avoid this?

By nature of my schema, I have several multivalued fields. Each one I
populate with a separate entity. Is there a better way to do it? For
example, could I pull in all the single-valued data in one pass and then
come back later and populate the multivalued items?

An alternate approach in some cases would be to do a GROUP_CONCAT and then
populate the multivalued column with some transformation. Is that possible?

Lastly, is it possible to use copyField to copy three regular fields into
one multiValued field and have all the data show up?

Thanks,

Brian Lamb

Re: MySQL data import

2011-12-12 Thread Brian Lamb

Hi all,

Any tips on this one?

Thanks,

Brian Lamb

On Sun, Dec 11, 2011 at 3:54 PM, Brian Lamb
wrote:

> Hi all,
>
> I have a few questions about how the MySQL data import works. It seems it
> creates a separate connection for each entity I create. Is there any way to
> avoid this?
>
> By nature of my schema, I have several multivalued fields. Each one I
> populate with a separate entity. Is there a better way to do it? For
> example, could I pull in all the singular data in one sitting and then come
> back in later and populate with the multivalued items.
>
> An alternate approach in some cases would be to do a GROUP_CONCAT and then
> populate the multivalued column with some transformation. Is that possible?
>
> Lastly, is it possible to use copyField to copy three regular fields into
> one multiValued field and have all the data show up?
>
> Thanks,
>
> Brian Lamb
>

URLDataSource delta import

2011-12-12 Thread Brian Lamb

Hi all,

According to
http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource
a delta-import is not "currently" implemented for URLDataSource. I say
"currently" because I've noticed that such documentation is out of date in
many places. I wanted to see if this feature had been added yet or if there
were plans to do so.

Thanks,

Brian Lamb

Re: MySQL data import

2011-12-12 Thread Brian Lamb

Thanks all. Erick, is there documentation on doing things with SolrJ and a
JDBC connection?

On Mon, Dec 12, 2011 at 1:34 PM, Erick Erickson wrote:

> You might want to consider just doing the whole
> thing in SolrJ with a JDBC connection. When things
> get complex, it's sometimes more straightforward.
>
> Best
> Erick...
>
> P.S. Yes, it's pretty standard to have a single
> field be the destination for several copyField
> directives.
>
> On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty  wrote:
> > On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
> >  wrote:
> >> Hi all,
> >>
> >> I have a few questions about how the MySQL data import works. It seems
> it
> >> creates a separate connection for each entity I create. Is there any
> way to
> >> avoid this?
> >
> > Not sure, but I do not think that it is possible. However, from your
> description
> > below, I think that you are unnecessarily multiplying entities.
> >
> >> By nature of my schema, I have several multivalued fields. Each one I
> >> populate with a separate entity. Is there a better way to do it? For
> >> example, could I pull in all the singular data in one sitting and then
> come
> >> back in later and populate with the multivalued items.
> >
> > Not quite sure as to what you mean. Would it be possible for you
> > to post your schema.xml, and the DIH configuration file? Preferably,
> > put these on pastebin.com, and send us links. Also, you should
> > obfuscate details like access passwords.
> >
> >> An alternate approach in some cases would be to do a GROUP_CONCAT and
> then
> >> populate the multivalued column with some transformation. Is that
> possible?
> > [...]
> >
> > This is how we have been handling it. A complete description would
> > be long, but here is the gist of it:
> > * A transformer will be needed. In this case, we found it easiest
> >  to use a Java-based transformer. Thus, your entity should include
> >  something like
> >   > transformer="com.mycompany.search.solr.handler.JobsNumericTransformer...>
> >  ...
> >  
> >  Here, the class name to be used for the transformer attribute follows
> >  the usual Java rules, and the .jar needs to be made available to Solr.
> > * The SELECT statement for the entity looks something like
> >  select group_concat( myfield SEPARATOR '@||@')...
> >  The separator should be something that does not occur in your
> >  normal data stream.
> > * Within the entity, define
> >   
> > * There are complications involved if NULL values are allowed
> >   for the field, in which case you would need to use COALESCE,
> >   maybe along with CAST
> > * The transformer would look up "myfield", split along the separator,
> >   and populate the multi-valued field.
> >
> > This *is* a little complicated, so I would also like to hear about
> > possible alternatives.
> >
> > Regards,
> > Gora
>
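
The split step Gora describes can be sketched in a few lines. The version below is Python rather than the Java transformer mentioned in the thread, and the function name and the NULL handling are assumptions made for illustration. Only the '@||@' separator and the field name "myfield" come from the message.

```python
SEPARATOR = "@||@"  # separator used in the GROUP_CONCAT example in the thread


def split_multivalued(row, column="myfield", separator=SEPARATOR):
    """Return a copy of the row with `column` split into a list of values.

    A NULL (None), missing, or empty column becomes an empty list, which is
    one way to handle the NULL case without COALESCE on the SQL side.
    """
    result = dict(row)
    raw = result.get(column)
    if not raw:
        result[column] = []
    else:
        # Drop empty pieces that can appear when a value in the group is blank.
        result[column] = [part for part in raw.split(separator) if part]
    return result


row = {"id": 1, "myfield": "red@||@green@||@blue"}
print(split_multivalued(row)["myfield"])
```

One caveat worth checking on the MySQL side: GROUP_CONCAT truncates its result at group_concat_max_len, so long value lists may need that setting raised.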

PHP/Solr library

2012-01-04 Thread Brian Lamb

Hi all,

I've been exploring http://www.php.net/manual/en/book.solr.php as a way to
maintain my index. I already have a PHP script that I use to update a
database so I was hoping to be able to update the database at the same time
I am updating the index.

However, I've been getting the following error when trying to run
$solr_client->commit();

Unsuccessful update request. Response Code 0. (null)

I've tried to find out why I'm getting the error but I cannot find a
reasonable explanation. My guess is that my index is rather large
(22 million records) and the request is timing out, but I cannot confirm
that, nor do I know how to fix it if so.

Any help here would be greatly appreciated.

Thanks,

Brian Lamb

Re: PHP/Solr library

2012-01-04 Thread Brian Lamb

Hi Param,

That's the method I'm switching over from. It seems to work inefficiently
with my setup because the data is spread out over multiple tables. I've
considered creating a simple MySQL table just to maintain the Solr data,
but I wanted to try out this PHP extension first.

But thanks for the suggestion!

Brian Lamb

On Wed, Jan 4, 2012 at 2:58 PM, Sethi, Parampreet <
parampreet.se...@teamaol.com> wrote:

> Hi Brian,
>
> Not exactly solution to your problem. But it may help, you can run Solr
> directly on top of your database, if your schema is simple manipulation of
> the database fields. This way you only need to update the database and
> solr index will be automatically updated with the latest data. I am using
> this in production and it's working pretty neatly.
>
> Here are few helpful links:
> http://wiki.apache.org/solr/DataImportHandler
> http://www.params.me/2011/03/configure-apache-solr-14-with-mysql.html
>
> -param
>
> On 1/4/12 2:50 PM, "Brian Lamb"  wrote:
>
> >Hi all,
> >
> >I've been exploring http://www.php.net/manual/en/book.solr.php as a way
> to
> >maintain my index. I already have a PHP script that I use to update a
> >database so I was hoping to be able to update the database at the same
> >time
> >I am updating the index.
> >
> >However, I've been getting the following error when trying to run
> >$solr_client->commit();
> >
> >Unsuccessful update request. Response Code 0. (null)
> >
> >I've tried looking to see why I'm getting the error but I cannot find a
> >reasonable explanation. My guess is that it is because my index is rather
> >large (22 million records) and thus it is timing out or something like
> >that
> >but I cannot confirm that that is the case nor do I know how to fix it
> >even
> >if it were.
> >
> >Any help here would be greatly appreciated.
> >
> >Thanks,
> >
> >Brian Lamb
>
>

dataimport

2011-02-24 Thread Brian Lamb

Hi all,

First of all, I'm quite new to solr.

I have the server set up and everything appears to work. I set it up so that
the indexed data comes through a mysql connection:


  
db-data-config.xml
   


And here is the contents of db-data-config.xml:


   

  
 
   


When I point my browser at localhost:8983/solr/dataimport, the server
produces the following message:

Feb 24, 2011 8:58:24 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0
QTime=10
Feb 24, 2011 8:58:24 PM org.apache.solr.handler.dataimport.DataImporter
doFullImport
INFO: Starting Full Import
Feb 24, 2011 8:58:24 PM org.apache.solr.handler.dataimport.SolrWriter
readIndexerProperties
INFO: Read dataimport.properties
Feb 24, 2011 8:58:24 PM org.apache.solr.update.DirectUpdateHandler2
deleteAll
INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
Feb 24, 2011 8:58:24 PM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=1
commit{dir=/wwwroot/apps/apache-solr-1.4.1/example/solr/data/index,segFN=segments_p,version=1297781919778,generation=25,filenames=[_n.nrm,
_n.tis, _n.prx, segments_p, _n.fdt, _n.frq, _n.tii, _n.fdx, _n.fnm]
Feb 24, 2011 8:58:24 PM org.apache.solr.core.SolrDeletionPolicy
updateCommits
INFO: newest commit = 1297781919778
Feb 24, 2011 8:58:24 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
call
INFO: Creating a connection for entity id with URL:
jdbc:mysql://localhost/researchsquare_beta_library?characterEncoding=UTF8&zeroDateTimeBehavior=convertToNull
Feb 24, 2011 8:58:25 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
call
INFO: Time taken for getConnection(): 137
Killed

So it looks like for whatever reason, the server crashes trying to do a full
import. When I add a LIMIT clause on the query, it works fine when the LIMIT
is only 250 records but if I try to do 500 records, I get the same message.

The fields types are:

SHOW CREATE TABLE mytable;
CREATE TABLE mytable (
   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `fielda` varchar(650) COLLATE utf8_unicode_ci DEFAULT NULL,
   `fieldb` varchar(500) COLLATE utf8_unicode_ci DEFAULT NULL,
   `fieldc` text COLLATE utf8_unicode_ci,
   `fieldd` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
   PRIMARY KEY (`id`)
);

How can I get Solr to do a full import without crashing? Doing it 250
records at a time is not going to be feasible because there are about 50
records.

Indexed, but cannot search

2011-02-28 Thread Brian Lamb

Hi all,

I was able to get my installation of Solr indexed using dataimport. However,
I cannot seem to get search working. I can verify that the data is there by
going to:

http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on

This gives me the response: <result name="response" numFound="234961" start="0">

But when I go to

http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on

I get the response: 

I know that dog should return some results because it is the first result
when I select all the records. So what am I doing incorrectly that would
prevent me from seeing results?

Sub entities

2011-02-28 Thread Brian Lamb

Hi all,

I was able to get my dataimport to work correctly but I'm a little unclear
as to how an entity nested within another entity affects search results.
When I do a search for all results, it seems only the fields from the
outermost entity are returned. For example, I have the following in my db
config file:


  

  






  

  

  

  


However, specie never shows up in my search results:


  Mammal
  1
  Canis


I had hoped the results would include the species. Can it? If so, what is my
malfunction?

Re: Sub entities

2011-03-01 Thread Brian Lamb

Yes, it looks like I had left off the field (misspelled it actually). I
reran the full import and the fields did properly show up. However, it is
still not working as expected. Using the example below, a result returned
would only list one specie instead of a list of species. I have the
following in my schema.xml file:



I reran the full import but it still lists only one specie instead of
several. Is my declaration above incorrect?

On Tue, Mar 1, 2011 at 3:41 AM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> Brian,
>
> except for your sql-syntax error in the specie_relations-query "SELECT
> specie_id FROMspecie_relations .." (missing whitespace after FROM)
> your config looks okay.
>
> following questions:
> * is there a field named specie in your schema? (otherwise dih will
> silently ignore it)
> * did you check your mysql-query log? to see which queries were
> executed and what their result is?
>
> And, just as quick notice .. there is no need to use  column="foo" name="foo"> (while both attribute have the same value).
>
> Regards
> Stefan
>
> On Mon, Feb 28, 2011 at 9:52 PM, Brian Lamb
>  wrote:
> > Hi all,
> >
> > I was able to get my dataimport to work correctly but I'm a little
> unclear
> > as to how the entity within an entity works in regards to search results.
> > When I do a search for all results, it seems only the outermost responses
> > are returned. For example, I have the following in my db config file:
> >
> > 
> >   > driver="com.mysql.jdbc.Driver"
> >
> url="jdbc:mysql://localhost/db?characterEncoding=UTF8&zeroDateTimeBehavior=convertToNull"
> > user="user" password="password"/>
> >
> >  
> >
> >
> >
> >
> >
> >
> >  
> >
> >  
> >
> >  
> >
> >  
> > 
> >
> > However, specie never shows up in my search results:
> >
> > 
> >  Mammal
> >  1
> >  Canis
> > 
> >
> > I had hoped the results would include the species. Can it? If so, what is
> my
> > malfunction?
> >
>

Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb

Thank you for your reply but the searching is still not working out. For
example, when I go to:

http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on

I get the following as a response:


  
Mammal
1
Canis
  


(plus some other docs but one is enough for this example)

But if I go to
http://localhost:8983/solr/select/?q=type%3AMammal

I only get:



But it seems that should return at least the result I have listed above.
What am I doing incorrectly?

On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:

> q=dog is equivalent to q=text:dog (where the default search field is
> defined as text at the bottom of schema.xml).
>
> If you want to specify a different field, well, you need to tell it :-)
>
> Is that it?
>
> Upayavira
>
> On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
>  wrote:
> > Hi all,
> >
> > I was able to get my installation of Solr indexed using dataimport.
> > However,
> > I cannot seem to get search working. I can verify that the data is there
> > by
> > going to:
> >
> >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> >
> > This gives me the response:  > start="0">
> >
> > But when I go to
> >
> >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> >
> > I get the response: 
> >
> > I know that dog should return some results because it is the first result
> > when I select all the records. So what am I doing incorrectly that would
> > prevent me from seeing results?
> >
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
>
>

Re: Sub entities

2011-03-01 Thread Brian Lamb

Thanks for the help Stefan. It seems removing column="specie" fixed it.

On Tue, Mar 1, 2011 at 11:18 AM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> Brian,
>
> On Tue, Mar 1, 2011 at 4:52 PM, Brian Lamb
>  wrote:
> >  > indexed="true" stored="true" required="false" />
>
> Not sure, but iirc  in this context has no column-Attribute ..
> that should normally not break your solr-configuration.
>
> Are you sure, that your animal has multiple species assigned? Checked
> the Query from the MySQL-Query-Log and verified that it returns more
> than one record?
>
> Otherwise you could enable
> http://wiki.apache.org/solr/DataImportHandler#LogTransformer for your
> dataimport, which outputs a log-row for every record .. just to
> ensure, that your Query-Results is correctly imported
>
> HTH, Regards
> Stefan
>

Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb

Hi all,

The problem was that my fields were defined as type="string" instead of
type="text". Once I corrected that, it seems to be fixed. The only part that
still is not working though is the search across all fields.

For example:

http://localhost:8983/solr/select/?q=type%3AMammal

Now correctly returns the records matching mammal. But if I try to do a
global search across all fields:

http://localhost:8983/solr/select/?q=Mammal
http://localhost:8983/solr/select/?q=text%3AMammal

I get no results returned. Here is how the schema is set up:


text


Thanks to everyone for your help so far. I think this is the last hurdle I
have to jump over.
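
When a field-specific query works but the default-field query does not, it helps to compare how each one is parsed. The Python sketch below only builds the request URLs using the standard library. The helper name is an assumption, the host and handler path are the ones used in this thread, and debugQuery=true is the standard Solr parameter that adds the parsed query to the response.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/select/"


def select_url(query, debug=False):
    """Build a select URL, percent-encoding the query string."""
    params = {"q": query}
    if debug:
        params["debugQuery"] = "true"
    return BASE_URL + "?" + urlencode(params)


print(select_url("type:Mammal"))          # field-specific search
print(select_url("Mammal", debug=True))   # default-field search with debug output
```

Comparing the parsed query in the two debug responses shows which field the unqualified term is actually being searched in.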

On Tue, Mar 1, 2011 at 12:34 PM, Upayavira  wrote:

> Next question, do you have your "type" field set to index="true" in your
> schema?
>
> Upayavira
>
> On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
>  wrote:
> > Thank you for your reply but the searching is still not working out. For
> > example, when I go to:
> >
> > http://localhost:8983/solr/select/?q=*%3A*<
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> >
> >
> > I get the following as a response:
> >
> > 
> >   
> > Mammal
> > 1
> > Canis
> >   
> > 
> >
> > (plus some other docs but one is enough for this example)
> >
> > But if I go to
> > http://localhost:8983/solr/select/?q=type%3A<
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> >
> > Mammal
> >
> > I only get:
> >
> > 
> >
> > But it seems that should return at least the result I have listed above.
> > What am I doing incorrectly?
> >
> > On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
> >
> > > q=dog is equivalent to q=text:dog (where the default search field is
> > > defined as text at the bottom of schema.xml).
> > >
> > > If you want to specify a different field, well, you need to tell it :-)
> > >
> > > Is that it?
> > >
> > > Upayavira
> > >
> > > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> > >  wrote:
> > > > Hi all,
> > > >
> > > > I was able to get my installation of Solr indexed using dataimport.
> > > > However,
> > > > I cannot seem to get search working. I can verify that the data is
> there
> > > > by
> > > > going to:
> > > >
> > > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> > > >
> > > > This gives me the response:  > > > start="0">
> > > >
> > > > But when I go to
> > > >
> > > >
> > >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> > > >
> > > > I get the response: 
> > > >
> > > > I know that dog should return some results because it is the first
> result
> > > > when I select all the records. So what am I doing incorrectly that
> would
> > > > prevent me from seeing results?
> > > >
> > > ---
> > > Enterprise Search Consultant at Sourcesense UK,
> > > Making Sense of Open Source
> > >
> > >
> >
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
>
>

Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb

Oh if only it were that easy :-). I have reindexed since making that change
which is how I was able to get the regular search working. I have not
however been able to get the search across all fields to work.

On Tue, Mar 1, 2011 at 3:01 PM, Markus Jelsma wrote:

> Traditionally, people forget to reindex ;)
>
> > Hi all,
> >
> > The problem was that my fields were defined as type="string" instead of
> > type="text". Once I corrected that, it seems to be fixed. The only part
> > that still is not working though is the search across all fields.
> >
> > For example:
> >
> > http://localhost:8983/solr/select/?q=type%3AMammal
> >
> > Now correctly returns the records matching mammal. But if I try to do a
> > global search across all fields:
> >
> > http://localhost:8983/solr/select/?q=Mammal
> > http://localhost:8983/solr/select/?q=text%3AMammal
> >
> > I get no results returned. Here is how the schema is set up:
> >
> >  > multiValued="true"/>
> > text
> > 
> >
> > Thanks to everyone for your help so far. I think this is the last hurdle
> I
> > have to jump over.
> >
> > On Tue, Mar 1, 2011 at 12:34 PM, Upayavira  wrote:
> > > Next question, do you have your "type" field set to index="true" in
> your
> > > schema?
> > >
> > > Upayavira
> > >
> > > On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
> > >
> > >  wrote:
> > > > Thank you for your reply but the searching is still not working out.
> > > > For example, when I go to:
> > > >
> > > > http://localhost:8983/solr/select/?q=*%3A*<
> > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > > dent=on
> > >
> > > > I get the following as a response:
> > > >
> > > > 
> > > >
> > > >   
> > > >
> > > > Mammal
> > > > 1
> > > > Canis
> > > >
> > > >   
> > > >
> > > > 
> > > >
> > > > (plus some other docs but one is enough for this example)
> > > >
> > > > But if I go to
> > > > http://localhost:8983/solr/select/?q=type%3A<
> > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > > dent=on
> > >
> > > > Mammal
> > > >
> > > > I only get:
> > > >
> > > > 
> > > >
> > > > But it seems that should return at least the result I have listed
> > > > above. What am I doing incorrectly?
> > > >
> > > > On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
> > > > > q=dog is equivalent to q=text:dog (where the default search field
> is
> > > > > defined as text at the bottom of schema.xml).
> > > > >
> > > > > If you want to specify a different field, well, you need to tell it
> > > > > :-)
> > > > >
> > > > > Is that it?
> > > > >
> > > > > Upayavira
> > > > >
> > > > > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> > > > >
> > > > >  wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > I was able to get my installation of Solr indexed using
> dataimport.
> > > > > > However,
> > > > > > I cannot seem to get search working. I can verify that the data
> is
> > >
> > > there
> > >
> > > > > > by
> > >
> > > > > > going to:
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > > dent=on
> > >
> > > > > > This gives me the response:  > > > > > numFound="234961" start="0">
> > > > > >
> > > > > > But when I go to
> > >
> > >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&inde
> > > nt=on
> > >
> > > > > > I get the response:  start="0">
> > > > > >
> > > > > > I know that dog should return some results because it is the
> first
> > >
> > > result
> > >
> > > > > > when I select all the records. So what am I doing incorrectly
> that
> > >
> > > would
> > >
> > > > > > prevent me from seeing results?
> > > > >
> > > > > ---
> > > > > Enterprise Search Consultant at Sourcesense UK,
> > > > > Making Sense of Open Source
> > >
> > > ---
> > > Enterprise Search Consultant at Sourcesense UK,
> > > Making Sense of Open Source
>

Re: Indexed, but cannot search

2011-03-02 Thread Brian Lamb

Here are the relevant parts of schema.xml:


globalField


This is what is returned when I search:


[Response XML with the markup stripped. Surviving values, in order: 0, 1,
Mammal, true; Mammal, Mammal, globalField:mammal, globalField:mammal;
LuceneQParser; timing values 1.0, 1.0, 1.0, then twelve entries of 0.0.]

On Tue, Mar 1, 2011 at 7:57 PM, Markus Jelsma wrote:

> Hmm, please provide analyzer of text and output of debugQuery=true. Anyway,
> if
> field type is fieldType text and the catchall field text is fieldType text
> as well
> and you reindexed, it should work as expected.
>
> > Oh if only it were that easy :-). I have reindexed since making that
> change
> > which is how I was able to get the regular search working. I have not
> > however been able to get the search across all fields to work.
> >
> > On Tue, Mar 1, 2011 at 3:01 PM, Markus Jelsma
> wrote:
> > > Traditionally, people forget to reindex ;)
> > >
> > > > Hi all,
> > > >
> > > > The problem was that my fields were defined as type="string" instead
> of
> > > > type="text". Once I corrected that, it seems to be fixed. The only
> part
> > > > that still is not working though is the search across all fields.
> > > >
> > > > For example:
> > > >
> > > > http://localhost:8983/solr/select/?q=type%3AMammal
> > > >
> > > > Now correctly returns the records matching mammal. But if I try to do
> a
> > > > global search across all fields:
> > > >
> > > > http://localhost:8983/solr/select/?q=Mammal
> > > > http://localhost:8983/solr/select/?q=text%3AMammal
> > > >
> > > > I get no results returned. Here is how the schema is set up:
> > > >
> > > >  > > > multiValued="true"/>
> > > > text
> > > > 
> > > >
> > > > Thanks to everyone for your help so far. I think this is the last
> > > > hurdle
> > >
> > > I
> > >
> > > > have to jump over.
> > > >
> > > > On Tue, Mar 1, 2011 at 12:34 PM, Upayavira  wrote:
> > > > > Next question, do you have your "type" field set to index="true" in
> > >
> > > your
> > >
> > > > > schema?
> > > > >
> > > > > Upayavira
> > > > >
> > > > > On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
> > > > >
> > > > >  wrote:
> > > > > > Thank you for your reply but the searching is still not working
> > > > > > out. For example, when I go to:
> > > > > >
> > > > > > http://localhost:8983/solr/select/?q=*%3A*<
> > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > >
> > > > > dent=on
> > > > >
> > > > > > I get the following as a response:
> > > > > >
> > > > > > 
> > > > > >
> > > > > >   
> > > > > >
> > > > > > Mammal
> > > > > > 1
> > > > > > Canis
> > > > > >
> > > > > >   
> > > > > >
> > > > > > 
> > > > > >
> > > > > > (plus some other docs but one is enough for this example)
> > > > > >
> > > > > > But if I go to
> > > > > > http://localhost:8983/solr/select/?q=type%3A<
> > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > >
> > > > > dent=on
> > > > >
> > > > > > Mammal
> > > > > >
> > > > > > I only get:
> > > > > >
> > > > > > 
> > > > > >
> > > > > > But it seems that should return at least the result I have listed
> > > > > > above. What am I doing incorrectly?
> > > > > >
> > > > > > On Mon, Feb 28, 2011 at 6:57 PM, Upayavira 
> wrote:
> > > > > > > q=dog is equivalent to q=text:dog (where the default search
> field
> > >
> > > is
> > >
> > > > > > > defined as text at the bottom of schema.xml).
> > > > > > >
> > > > > > > If you want to specify a

Re: Indexed, but cannot search

2011-03-02 Thread Brian Lamb

So here's something interesting. I did a delta import this morning and it
looks like I can do a global search across those fields.

I'll do another full import and see if that fixes the problem. I had done a
full import after making this change but it seems like another reindex is in
order.

On Wed, Mar 2, 2011 at 10:31 AM, Markus Jelsma
wrote:

> Please also provide analysis part of fieldType text. You can also use Luke
> to
> inspect the index.
>
> http://localhost:8983/solr/admin/luke?fl=globalField&numTerms=100
>
> On Wednesday 02 March 2011 16:09:33 Brian Lamb wrote:
> > Here are the relevant parts of schema.xml:
> >
> >  > multiValued="true"/>
> > globalField
> > 
> >
> > This is what is returned when I search:
> >
> > 
> > [Response XML with the markup stripped. Surviving values, in order: 0, 1,
> > Mammal, true; Mammal, Mammal, globalField:mammal, globalField:mammal;
> > LuceneQParser; timing values 1.0, 1.0, 1.0, then twelve entries of 0.0.]
> > 
> >
> > On Tue, Mar 1, 2011 at 7:57 PM, Markus Jelsma
> wrote:
> > > Hmm, please provide analyzer of text and output of debugQuery=true.
> > > Anyway, if
> > > field type is fieldType text and the catchall field text is fieldType
> > > text as well
> > > and you reindexed, it should work as expected.
> > >
> > > > Oh if only it were that easy :-). I have reindexed since making that
> > >
> > > change
> > >
> > > > which is how I was able to get the regular search working. I have not
> > > > however been able to get the search across all fields to work.
> > > >
> > > > On Tue, Mar 1, 2011 at 3:01 PM, Markus Jelsma
> > >
> > > wrote:
> > > > > Traditionally, people forget to reindex ;)
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > The problem was that my fields were defined as type="string"
> > > > > > instead
> > >
> > > of
> > >
> > > > > > type="text". Once I corrected that, it seems to be fixed. The
> only
> > >
> > > part
> > >
> > > > > > that still is not working though is the search across all fields.
> > > > > >
> > > > > > For example:
> > > > > >
> > > > > > http://localhost:8983/solr/select/?q=type%3AMammal
> > > > > >
> > > > > > Now correctly returns the records matching mammal. But if I try
> to
> > > > > > do
> > >
> > > a
> > >
> > > > > > global search across all fields:
> > > > > >
> > > > > > http://localhost:8983/solr/select/?q=Mammal
> > > > > > http://localhost:8983/solr/select/?q=text%3AMammal
> > > > > >
> > > > > > I get no results returned. Here is how the schema is set up:
> > > > > >
> > > > > >  > > > > > multiValued="true"/>
> > > > > > text
> > > > > > 
> > > > > >
> > > > > > Thanks to everyone for your help so far. I think this is the last
> > > > > > hurdle
> > > > >
> > > > > I
> > > > >
> > > > > > have to jump over.
> > > > > >
> > > > > > On Tue, Mar 1, 2011 at 12:34 PM, Upayavira 
> wrote:
> > > > > > > Next question, do you have your "type" field set to
> index="true"
> > > > > > > in
> > > > >
> > > > > your
> > > > >
> > > > > > > schema?
> > > > > > >
> > > > > >

Formatting the XML returned

2011-03-02 Thread Brian Lamb

Hi all,

This list has proven itself quite useful since I got started with Solr. I'm
wondering if it is possible to dictate the XML that is returned by a search?
Right now it seems very inefficient in that it is formatted like:

Val
Val

Etc.

I would like to change it so that it reads something like:

Val
Val

Is this possible? If so, how?

Thanks,

Brian Lamb

docBoost

2011-03-08 Thread Brian Lamb

Hi all,

I am using dataimport to create my index and I want to use docBoost to
assign some higher weights to certain docs. I understand the concept behind
docBoost but I haven't been able to find an example anywhere that shows how
to implement it. Assuming the following config file:

Re: dataimport

2011-03-09 Thread Brian Lamb

This has since been fixed. The problem was that there was not enough memory
on the machine. It works just fine now.

On Tue, Mar 8, 2011 at 6:22 PM, Chris Hostetter wrote:

>
> : INFO: Creating a connection for entity id with URL:
> :
> jdbc:mysql://localhost/researchsquare_beta_library?characterEncoding=UTF8&zeroDateTimeBehavior=convertToNull
> : Feb 24, 2011 8:58:25 PM
> org.apache.solr.handler.dataimport.JdbcDataSource$1
> : call
> : INFO: Time taken for getConnection(): 137
> : Killed
> :
> : So it looks like for whatever reason, the server crashes trying to do a
> full
> : import. When I add a LIMIT clause on the query, it works fine when the
> LIMIT
> : is only 250 records but if I try to do 500 records, I get the same
> message.
>
> ...wow.  that's ... weird.
>
> I've never seen a java process just log "Killed" like that.
>
> The only time i've ever seen a process log "Killed" is if it was
> terminated by the os (ie: "kill -9 ")
>
> What OS are you using? how are you running solr? (ie: are you using the
> simple jetty example "java -jar start.jar" or are you using a different
> servlet container?) ... are you absolutely certain your machine doesn't
> have some sort of monitoring in place that kills jobs if they take too
> long, or use too much CPU?
>
>
> -Hoss
>

Sorting

2011-03-09 Thread Brian Lamb

Hi all,

I know that I can add &sort=score desc to the url to sort in descending
order. However, I would like to sort a MoreLikeThis response which returns
records like this:


  
  


I don't want them grouped by result; I would just like to have them all thrown
together and then sorted according to score. I have an XSLT which does put
them all together and returns the following:


  
x.
some_id
  


However it appears that it basically applies the stylesheet to result
name="3" then result name="2".

How can I make it so that with my XSLT, the results appear sorted by
?

Re: docBoost

2011-03-09 Thread Brian Lamb

Anyone have any clue on this one?

On Tue, Mar 8, 2011 at 2:11 PM, Brian Lamb wrote:

> Hi all,
>
> I am using dataimport to create my index and I want to use docBoost to
> assign some higher weights to certain docs. I understand the concept behind
> docBoost but I haven't been able to find an example anywhere that shows how
> to implement it. Assuming the following config file:
>
> 
>   dataSource="animals"
>   pk="id"
>   query="SELECT * FROM animals">
> 
> 
> 
> dataSource="boosts"
>query="SELECT boost_score FROM boosts WHERE animal_id = ${
> animal.id}">
>

Re: docBoost

2011-03-09 Thread Brian Lamb

That makes sense. As a follow up, is there a way to only conditionally use
the boost score? For example, in some cases I want to use the boost score
and in other cases I want all documents to be treated equally.

On Wed, Mar 9, 2011 at 2:42 PM, Jayendra Patil  wrote:

> you can use the ScriptTransformer to perform the boost calculation and
> addition.
> http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer
>
> 
><![CDATA[
>function f1(row)  {
>// Add boost
>row.put('$docBoost',1.5);
>return row;
>}
>]]>
>
> query="select * from X">
>
>    
>
> 
>
> Regards,
> Jayendra
>
>
> On Wed, Mar 9, 2011 at 2:01 PM, Brian Lamb
>  wrote:
> > Anyone have any clue on this on?
> >
> > On Tue, Mar 8, 2011 at 2:11 PM, Brian Lamb <
> brian.l...@journalexperts.com>wrote:
> >
> >> Hi all,
> >>
> >> I am using dataimport to create my index and I want to use docBoost to
> >> assign some higher weights to certain docs. I understand the concept
> behind
> >> docBoost but I haven't been able to find an example anywhere that shows
> how
> >> to implement it. Assuming the following config file:
> >>
> >> 
> >> >>   dataSource="animals"
> >>   pk="id"
> >>   query="SELECT * FROM animals">
> >> 
> >> 
> >> 
> >>  >>dataSource="boosts"
> >>query="SELECT boost_score FROM boosts WHERE animal_id =
> ${
> >> animal.id}">
> >>

Excluding results from more like this

2011-03-09 Thread Brian Lamb

Hi all,

I'm using MoreLikeThis to find similar results but I'd like to exclude
records by the id number. For example, I use the following URL:

http://localhost:8983/solr/search/?q=id:(2 3
5)&mlt=true&mlt.fl=description,id&fl=*,score

How would I exclude record 4 from the MoreLikeThis results?

I tried,

http://localhost:8983/solr/search/?q=id:(2 3
5)&mlt=true&mlt.fl=description,id&fl=*,score&mlt.q=!4

But that still returned record 4 in the MoreLikeThisResults.

Re: Excluding results from more like this

2011-03-09 Thread Brian Lamb

That doesn't seem to do it. Record 4 is still showing up in the MoreLikeThis
results.
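
If no request parameter ends up doing this, one fallback is to drop the
unwanted ids on the client after the response comes back. A minimal sketch in
Python, assuming the MoreLikeThis section has already been parsed into a dict
of lists; the function name and the data shape are hypothetical, not anything
Solr provides:

```python
def exclude_ids(mlt_groups, excluded):
    """Return a copy of the MoreLikeThis groups without the excluded document ids."""
    excluded = {str(doc_id) for doc_id in excluded}
    return {
        source: [doc for doc in docs if str(doc["id"]) not in excluded]
        for source, docs in mlt_groups.items()
    }

# Hypothetical parsed response: keys are the source document ids from q=id:(2 3 5).
mlt_groups = {
    "2": [{"id": 4, "score": 0.5}, {"id": 7, "score": 0.4}],
    "3": [{"id": 4, "score": 0.3}],
    "5": [{"id": 9, "score": 0.2}],
}

filtered = exclude_ids(mlt_groups, [4])
print(filtered)
```

This only hides record 4 from what the application displays; Solr still
computes and returns it.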

On Wed, Mar 9, 2011 at 4:12 PM, Otis Gospodnetic  wrote:

> Brian,
>
> ...?q=id:(2  3 5) -4
>
>
> Otis
> ---
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Brian Lamb 
> > To: solr-user@lucene.apache.org
> > Sent: Wed, March 9, 2011 4:05:10 PM
> > Subject: Excluding results from more like this
> >
> > Hi all,
> >
> > I'm using MoreLikeThis to find similar results but I'd like to  exclude
> > records by the id number. For example, I use the following  URL:
> >
> > http://localhost:8983/solr/search/?q=id:(2  3
> > 5)&mlt=true&mlt.fl=description,id&fl=*,score
> >
> > How would I  exclude record 4 form the MoreLikeThis results?
> >
> > I tried,
> >
> > http://localhost:8983/solr/search/?q=id:(2  3
> > 5)&mlt=true&mlt.fl=description,id&fl=*,score&mlt.q=!4
> >
> > But  that still returned record 4 in the MoreLikeThisResults.
> >
>

Re: docBoost

2011-03-10 Thread Brian Lamb

Okay I think I have the idea:


  
  <![CDATA[
function BoostScores(row) {
  // if searching for recommendations add in the boost score
  if (some_condition) {
    row.put('$docBoost', row.get('boost_score'));
  } // end if(some_condition)

  return row;
} // end function BoostScores(row)
  ]]>
 
 
  
  
  
  
http://localhost/solr/search/?q=dog
Boosted search: http://localhost/solr/search?q=dog&boost=true

To achieve this, would it be applied in the data import handler? If so, what
would I need to put in for some_condition?

Thanks for all the help so far. I truly do appreciate it.

Thanks,

Brian Lamb

On Wed, Mar 9, 2011 at 11:50 PM, Bill Bell  wrote:

> Yes just add if statement based on a field type and do a row.put() only if
> that other value is a certain value.
>
>
>
> On 3/9/11 1:39 PM, "Brian Lamb"  wrote:
>
> >That makes sense. As a follow up, is there a way to only conditionally use
> >the boost score? For example, in some cases I want to use the boost score
> >and in other cases I want all documents to be treated equally.
> >
> >On Wed, Mar 9, 2011 at 2:42 PM, Jayendra Patil
> > >> wrote:
> >
> >> you can use the ScriptTransformer to perform the boost calcualtion and
> >> addition.
> >> http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer
> >>
> >> 
> >><![CDATA[
> >>function f1(row)  {
> >>// Add boost
> >>row.put('$docBoost',1.5);
> >>return row;
> >>}
> >>]]>
> >>
> >> >> query="select * from X">
> >>
> >>
> >>
> >> 
> >>
> >> Regards,
> >> Jayendra
> >>
> >>
> >> On Wed, Mar 9, 2011 at 2:01 PM, Brian Lamb
> >>  wrote:
> >> > Anyone have any clue on this on?
> >> >
> >> > On Tue, Mar 8, 2011 at 2:11 PM, Brian Lamb <
> >> brian.l...@journalexperts.com>wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> I am using dataimport to create my index and I want to use docBoost
> >>to
> >> >> assign some higher weights to certain docs. I understand the concept
> >> behind
> >> >> docBoost but I haven't been able to find an example anywhere that
> >>shows
> >> how
> >> >> to implement it. Assuming the following config file:
> >> >>
> >> >> 
> >> >> >> >>   dataSource="animals"
> >> >>   pk="id"
> >> >>   query="SELECT * FROM animals">
> >> >> 
> >> >> 
> >> >> 
> >> >>  >> >>dataSource="boosts"
> >> >>query="SELECT boost_score FROM boosts WHERE animal_id
> >>=
> >> ${
> >> >> animal.id}">
> >> >>

Re: Sorting

2011-03-10 Thread Brian Lamb

Any ideas on this one?

On Wed, Mar 9, 2011 at 2:00 PM, Brian Lamb wrote:

> Hi all,
>
> I know that I can add &sort=score desc to the url to sort in descending
> order. However, I would like to sort a MoreLikeThis response which returns
> records like this:
>
> 
>   
>   
> 
>
> I don't want them grouped by result; I would just like have them all thrown
> together and then sorted according to score. I have an XSLT which does put
> them altogether and returns the following:
>
> 
>   
> x.
> some_id
>   
> 
>
> However it appears that it basically applies the stylesheet to result
> name="3" then result name="2".
>
> How can I make it so that with my XSLT, the results appear sorted by
> ?
>

Re: Sorting

2011-03-14 Thread Brian Lamb

It doesn't necessarily need to go through an XSLT but the idea remains the
same. I want to have the highest scores first no matter which result they match
with.

So if the results are like this:

  0.439
  1

  0.215
  2

  0.115
  3

  0.539
  4

  0.338
  5

I would want them to be formatted like this:

0.539
4

0.439
1

0.338
5

0.215
2

0.115
3

The way I do it now is to fetch the results and then parse them with PHP to
simulate that but it seems horribly inefficient so I'd like to do it within
Solr if at all possible.
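
For reference, the client-side merge described above is small. This is a
sketch in Python rather than the PHP actually in use, and it assumes the
response has already been parsed into a dict of result groups. The scores and
ids are the ones listed above; the group names and which document sits in
which group are assumed for illustration:

```python
from itertools import chain

def flatten_and_sort(groups):
    """Merge every MoreLikeThis group into one list, highest score first."""
    merged = chain.from_iterable(groups.values())
    return sorted(merged, key=lambda doc: doc["score"], reverse=True)

# Assumed grouping of the five documents shown above.
mlt_groups = {
    "3": [
        {"score": 0.439, "id": 1},
        {"score": 0.215, "id": 2},
        {"score": 0.115, "id": 3},
    ],
    "2": [
        {"score": 0.539, "id": 4},
        {"score": 0.338, "id": 5},
    ],
}

for doc in flatten_and_sort(mlt_groups):
    print(doc["score"], doc["id"])
```

Note that scores from different groups are computed against different source
documents, so comparing them directly is a judgment call.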

On Thu, Mar 10, 2011 at 4:02 PM, Brian Lamb
wrote:

> Any ideas on this one?
>
>
> On Wed, Mar 9, 2011 at 2:00 PM, Brian Lamb 
> wrote:
>
>> Hi all,
>>
>> I know that I can add &sort=score desc to the url to sort in descending
>> order. However, I would like to sort a MoreLikeThis response which returns
>> records like this:
>>
>> 
>>   
>>   
>> 
>>
>> I don't want them grouped by result; I would just like have them all
>> thrown together and then sorted according to score. I have an XSLT which
>> does put them altogether and returns the following:
>>
>> 
>>   
>> x.
>> some_id
>>   
>> 
>>
>> However it appears that it basically applies the stylesheet to result
>> name="3" then result name="2".
>>
>> How can I make it so that with my XSLT, the results appear sorted by
>> ?
>>
>
>

Dynamically boost search scores

2011-03-14 Thread Brian Lamb

Hi all,

I have a field in my schema called boost_score. I would like to set it up so
that if I pass in a certain flag, each document score is boosted by the
number in boost_score.

For example if I use:

http://localhost/solr/search/?q=dog

I would get search results like normal. But if I use:

http://localhost/solr/search?q=dog&boost=true

The score of each document would be boosted by the number in the field
boost_score.

Unfortunately, I have no idea how to actually implement this, but I'm hoping
that's where you all can come in.

Thanks,

Brian Lamb

Re: Dynamically boost search scores

2011-03-15 Thread Brian Lamb

Thank you for the advice. I looked at the page you recommended and came up
with:

http://localhost:8983/solr/search/?q=dog&fl=boost_score,genus,species,score&rows=15&bf=%22ord%28sum%28boost_score,1%29%29^10%22

But it appeared to have no effect. The results were in the same order as they
were when I left off the bf parameter. So what am I doing incorrectly?
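
One possible explanation, offered as a guess rather than a diagnosis: bf is a
DisMax parameter, so it has no effect when the request is parsed by the
standard Lucene query parser, and the value above is also wrapped in encoded
quotes (%22). A sketch in Python of building the request with the dismax
parser switched on only when a boost is wanted. The helper name is
hypothetical, the function is simplified to sum(boost_score,1), and it assumes
the handler has a default field or qf configured:

```python
from urllib.parse import urlencode, parse_qs, urlsplit

BASE = "http://localhost:8983/solr/search/"  # handler path from the URL above

def boosted_url(query, boost=False):
    """Build a search URL; add a dismax boost function only when boost is True."""
    params = {
        "q": query,
        "fl": "boost_score,genus,species,score",
        "rows": 15,
    }
    if boost:
        params["defType"] = "dismax"            # bf is only read by dismax
        params["bf"] = "sum(boost_score,1)^10"  # no surrounding quotes
    return BASE + "?" + urlencode(params)

print(boosted_url("dog"))
print(boosted_url("dog", boost=True))
```

Adding debugQuery=on to either URL shows whether the function actually
contributes to the score.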

Thanks,

Brian Lamb

On Mon, Mar 14, 2011 at 11:45 AM, Markus Jelsma
wrote:

> See boosting documents by function query. This way you can use document's
> boost_score field to affect the final score.
>
> http://wiki.apache.org/solr/FunctionQuery
>
> On Monday 14 March 2011 16:40:42 Brian Lamb wrote:
> > Hi all,
> >
> > I have a field in my schema called boost_score. I would like to set it up
> > so that if I pass in a certain flag, each document score is boosted by
> the
> > number in boost_score.
> >
> > For example if I use:
> >
> > http://localhost/solr/search/?q=dog
> >
> > I would get search results like normal. But if I use:
> >
> > http://localhost/solr/search?q=dog&boost=true
> >
> > The score of each document would be boosted by the number in the field
> > boost_score.
> >
> > Unfortunately, I have no idea how to implement this actually but I'm
> hoping
> > that's where you all can come in.
> >
> > Thanks,
> >
> > Brian Lamb
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
>

Multicore

2011-03-16 Thread Brian Lamb

Hi all,

I am setting up multicore and the schema.xml file in the core0 folder says
not to use that one because it's very stripped down. So I copied the schema
from example/solr/conf but now I am getting a bunch of class not found
exceptions:

SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.KeywordMarkerFilterFactory'

For example.

I also copied over the solrconfig.xml from example/solr/conf and changed all
the lib dir="xxx" paths to go up one directory higher (
instead). I've found that when I use my solrconfig file with the stripped
down schema.xml file, it runs correctly. But when I use the full schema xml
file, I get those errors.

Now this says to me I am not loading a library or two somewhere but I've
looked through the configuration files and cannot see any other place other
than solrconfig.xml where that would be set so what am I doing incorrectly?

Thanks,

Brian Lamb

Adding the suggest component

2011-03-17 Thread Brian Lamb

Hi all,

When I installed Solr, I downloaded what I believe was the most recent
version (1.4.1). I wanted to implement the Suggester (
http://wiki.apache.org/solr/Suggester). I copied and pasted the information
there into my solrconfig.xml file but I'm getting the following error:

Error loading class 'org.apache.solr.spelling.suggest.Suggester'

I read up on this error and found that I needed to checkout a newer version
from SVN. I checked out a full version and copied the contents of
src/java/org/apache/spelling/suggest to the same location on my set up.
However, I am still receiving this error.

Did I not put the files in the right place? What am I doing incorrectly?

Thanks,

Brian Lamb

Re: Adding the suggest component

2011-03-18 Thread Brian Lamb

That does seem like a better solution. I downloaded a recent version and
there were the following files/folders:

build.xml
dev-tools
LICENSE.txt
lucene
NOTICE.txt
README.txt
solr

So I did cp -r solr/* /path/to/solr/stuff/ and started solr. I didn't get
any error message but I only got the following messages:

2011-03-18 14:11:02.016:INFO::Logging to STDERR via
org.mortbay.log.StdErrLog
2011-03-18 14:11:02.240:INFO::jetty-6.1-SNAPSHOT
2011-03-18 14:11:02.284:INFO::Started SocketConnector@0.0.0.0:8983

Whereas before I got a bunch of messages indicating various libraries had
been loaded. Additionally, when I go to http://localhost/solr/admin/, I get
the following message:

HTTP ERROR: 404

Problem accessing /solr/admin. Reason:

NOT_FOUND

What did I do incorrectly?

Thanks,

Brian Lamb


On Fri, Mar 18, 2011 at 9:04 AM, Erick Erickson wrote:

> What do you mean "you copied the contents...to the right place"? If you
> checked out trunk and copied the files into 1.4.1, you have mixed source
> files between disparate versions. All bets are off.
>
> Or do you mean jar files? or???
>
> I'd build the source you checked out (at the Solr level) and use that
> rather
> than try to mix-n-match.
>
> BTW, if you're just starting (as in not in production), you may want to
> consider
> using 3.1, as it's being released even as we speak and has many
> improvements
> over 1.4. You can get a nightly build from here:
> https://builds.apache.org/hudson/view/S-Z/view/Solr/
>
> Best
> Erick
>
> On Thu, Mar 17, 2011 at 3:36 PM, Brian Lamb
>  wrote:
> > Hi all,
> >
> > When I installed Solr, I downloaded the most recent version (1.4.1) I
> > believe. I wanted to implement the Suggester (
> > http://wiki.apache.org/solr/Suggester). I copied and pasted the
> information
> > there into my solrconfig.xml file but I'm getting the following error:
> >
> > Error loading class 'org.apache.solr.spelling.suggest.Suggester'
> >
> > I read up on this error and found that I needed to checkout a newer
> version
> > from SVN. I checked out a full version and copied the contents of
> > src/java/org/apache/spelling/suggest to the same location on my set up.
> > However, I am still receiving this error.
> >
> > Did I not put the files in the right place? What am I doing incorrectly?
> >
> > Thanks,
> >
> > Brian Lamb
> >
>

Re: Adding the suggest component

2011-03-18 Thread Brian Lamb

Sorry, that was a typo on my part.

I was using http://localhost:8983/solr/admin and getting the above error
messages.

On Fri, Mar 18, 2011 at 2:57 PM, Geert-Jan Brits  wrote:

> > 2011-03-18 14:11:02.284:INFO::Started SocketConnector@0.0.0.0:8983
> Solr started on port 8983
>
> instead of this:
> > http://localhost/solr/admin/
>
> try this instead:
> http://localhost:8983/solr/admin/ <http://localhost/solr/admin/>
>
> Cheers,
> Geert-Jan
>
>
>
> 2011/3/18 Brian Lamb 
>
> > That does seem like a better solution. I downloaded a recent version and
> > there were the following files/folders:
> >
> > build.xml
> > dev-tools
> > LICENSE.txt
> > lucene
> > NOTICE.txt
> > README.txt
> > solr
> >
> > So I did cp -r solr/* /path/to/solr/stuff/ and started solr. I didn't get
> > any error message but I only got the following messages:
> >
> > 2011-03-18 14:11:02.016:INFO::Logging to STDERR via
> > org.mortbay.log.StdErrLog
> > 2011-03-18 14:11:02.240:INFO::jetty-6.1-SNAPSHOT
> > 2011-03-18 14:11:02.284:INFO::Started SocketConnector@0.0.0.0:8983
> >
> > Where as before I got a bunch of messages indicating various libraries
> had
> > been loaded. Additionally, when I go to http://localhost/solr/admin/, I
> > get
> > the following message:
> >
> > HTTP ERROR: 404
> >
> > Problem accessing /solr/admin. Reason:
> >
> >NOT_FOUND
> >
> > What did I do incorrectly?
> >
> > Thanks,
> >
> > Brian Lamb
> >
> >
> > On Fri, Mar 18, 2011 at 9:04 AM, Erick Erickson  > >wrote:
> >
> > > What do you mean "you copied the contents...to the right place"? If you
> > > checked out trunk and copied the files into 1.4.1, you have mixed
> source
> > > files between disparate versions. All bets are off.
> > >
> > > Or do you mean jar files? or???
> > >
> > > I'd build the source you checked out (at the Solr level) and use that
> > > rather
> > > than try to mix-n-match.
> > >
> > > BTW, if you're just starting (as in not in production), you may want to
> > > consider
> > > using 3.1, as it's being released even as we speak and has many
> > > improvements
> > > over 1.4. You can get a nightly build from here:
> > > https://builds.apache.org/hudson/view/S-Z/view/Solr/
> > >
> > > Best
> > > Erick
> > >
> > > On Thu, Mar 17, 2011 at 3:36 PM, Brian Lamb
> > >  wrote:
> > > > Hi all,
> > > >
> > > > When I installed Solr, I downloaded the most recent version (1.4.1) I
> > > > believe. I wanted to implement the Suggester (
> > > > http://wiki.apache.org/solr/Suggester). I copied and pasted the
> > > information
> > > > there into my solrconfig.xml file but I'm getting the following
> error:
> > > >
> > > > Error loading class 'org.apache.solr.spelling.suggest.Suggester'
> > > >
> > > > I read up on this error and found that I needed to checkout a newer
> > > version
> > > > from SVN. I checked out a full version and copied the contents of
> > > > src/java/org/apache/spelling/suggest to the same location on my set
> up.
> > > > However, I am still receiving this error.
> > > >
> > > > Did I not put the files in the right place? What am I doing
> > incorrectly?
> > > >
> > > > Thanks,
> > > >
> > > > Brian Lamb
> > > >
> > >
> >
>

Re: Adding the suggest component

2011-03-22 Thread Brian Lamb

Thanks everyone for the advice. I checked out a recent version from SVN and
ran:

ant clean example

This worked just fine. However when I went to start the solr server, I get
this error message:

SEVERE: org.apache.solr.common.SolrException: Error loading class
'org.apache.solr.handler.dataimport.DataImportHandler'

It looks like those files are there:

contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/

But for some reason, they aren't able to be found. Where would I update this
setting and what would I update it to?

Thanks,

Brian Lamb

On Mon, Mar 21, 2011 at 10:15 AM, Erick Erickson wrote:

> OK, I think you're jumping ahead and trying to do
> too many things at once.
>
> What did you download? Source? The distro? The error
> you posted usually happens for me when I haven't
> compiled the "example" target from source. So I'd guess
> you don't have the proper targets built. This assumes you
> downloaded the source via SVN.
>
> If you downloaded a distro, I'd start by NOT copying anything
> anywhere, just go to the example code and start Solr. Make
> sure you have what you think you have.
>
> I've seen "interesting" things get cured by removing the entire
> directory where your servlet container unpacks war files, but
> that's usually in development environments.
>
> When I get in these situations, I usually find it's best to back
> up, do one thing at a time and verify that I get the expected
> results at each step. It's tedious, but
>
> Best
> Erick
>
>
> On Fri, Mar 18, 2011 at 4:18 PM, Ahmet Arslan  wrote:
> >> downloaded a recent version and
> >> > > there were the following files/folders:
> >> > >
> >> > > build.xml
> >> > > dev-tools
> >> > > LICENSE.txt
> >> > > lucene
> >> > > NOTICE.txt
> >> > > README.txt
> >> > > solr
> >> > >
> >> > > So I did cp -r solr/* /path/to/solr/stuff/ and
> >> started solr. I didn't get
> >> > > any error message but I only got the following
> >> messages:
> >
> > How do you start solr? using java -jar start.jar? Did you run 'ant clean
> example' in the solr folder?
> >
> >
> >
> >
>

Re: Adding the suggest component

2011-03-22 Thread Brian Lamb

I found the following in the build.xml file:


   
















  


It looks like the dataimport handler path is correct in there so I don't
understand why it's not being compiled.

I ran ant example again today but I'm still getting the same error.

Thanks,

Brian Lamb

On Tue, Mar 22, 2011 at 11:28 AM, Brian Lamb
wrote:

> Thanks everyone for the advice. I checked out a recent version from SVN and
> ran:
>
> ant clean example
>
> This worked just fine. However when I went to start the solr server, I get
> this error message:
>
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'org.apache.solr.handler.dataimport.DataImportHandler'
>
> It looks like those files are there:
>
> contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/
>
> But for some reason, they aren't able to be found. Where would I update
> this setting and what would I update it to?
>
> Thanks,
>
> Brian Lamb
>
> On Mon, Mar 21, 2011 at 10:15 AM, Erick Erickson 
> wrote:
>
>> OK, I think you're jumping ahead and trying to do
>> too many things at once.
>>
>> What did you download? Source? The distro? The error
>> you posted usually happens for me when I haven't
>> compiled the "example" target from source. So I'd guess
>> you don't have the proper targets built. This assumes you
>> downloaded the source via SVN.
>>
>> If you downloaded a distro, I'd start by NOT copying anything
>> anywhere, just go to the example code and start Solr. Make
>> sure you have what you think you have.
>>
>> I've seen "interesting" things get cured by removing the entire
>> directory where your servlet container unpacks war files, but
>> that's usually in development environments.
>>
>> When I get in these situations, I usually find it's best to back
>> up, do one thing at a time and verify that I get the expected
>> results at each step. It's tedious, but
>>
>> Best
>> Erick
>>
>>
>> On Fri, Mar 18, 2011 at 4:18 PM, Ahmet Arslan  wrote:
>> >> downloaded a recent version and
>> >> > > there were the following files/folders:
>> >> > >
>> >> > > build.xml
>> >> > > dev-tools
>> >> > > LICENSE.txt
>> >> > > lucene
>> >> > > NOTICE.txt
>> >> > > README.txt
>> >> > > solr
>> >> > >
>> >> > > So I did cp -r solr/* /path/to/solr/stuff/ and
>> >> started solr. I didn't get
>> >> > > any error message but I only got the following
>> >> messages:
>> >
>> > How do you start solr? using java -jar start.jar? Did you run 'ant clean
>> example' in the solr folder?
>> >
>> >
>> >
>> >
>>
>
>

Re: Adding the suggest component

2011-03-22 Thread Brian Lamb

Awesome! That fixed that problem. I'm getting another class not found error
but I'll see if I can fix it on my own first.

On Tue, Mar 22, 2011 at 11:56 AM, Ahmet Arslan  wrote:

>
> --- On Tue, 3/22/11, Brian Lamb  wrote:
>
> > From: Brian Lamb 
> > Subject: Re: Adding the suggest component
> > To: solr-user@lucene.apache.org
> > Cc: "Erick Erickson" 
> > Date: Tuesday, March 22, 2011, 5:28 PM
> > Thanks everyone for the advice. I
> > checked out a recent version from SVN and
> > ran:
> >
> > ant clean example
> >
> > This worked just fine. However when I went to start the
> > solr server, I get
> > this error message:
> >
> > SEVERE: org.apache.solr.common.SolrException: Error loading
> > class
> > 'org.apache.solr.handler.dataimport.DataImportHandler'
>
> run 'ant clean dist' and copy trunk/solr/dist/
>
> apache-solr-dataimporthandler-extras-4.0-SNAPSHOT.jar
> apache-solr-dataimporthandler-4.0-SNAPSHOT.jar
>
> to solrHome/lib directory.
>
>
>
>
>
>
>

Re: Adding the suggest component

2011-03-22 Thread Brian Lamb

I fixed a few other exceptions it threw when I started the server but I
don't know how to fix this one:

java.lang.NoClassDefFoundError: Could not initialize class
org.apache.solr.handler.dataimport.DataImportHandler
at java.lang.Class.forName0(Native Method)

java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
at
org.apache.solr.handler.dataimport.DataImportHandler.(DataImportHandler.java:72)

Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)

I've searched Google but haven't been able to find a reason why this happens
and how to fix it.
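
The last line of the trace says org.slf4j.LoggerFactory is not on the
classpath, so a quick first check is whether any slf4j jar is in the lib
directory Solr loads from. A small sketch in Python; the directory path is a
placeholder and the helper is hypothetical:

```python
from pathlib import Path

def find_jars(lib_dir, prefix="slf4j"):
    """List jar file names in lib_dir whose name starts with the given prefix."""
    return sorted(
        path.name
        for path in Path(lib_dir).glob("*.jar")
        if path.name.startswith(prefix)
    )

if __name__ == "__main__":
    # Placeholder path; point this at the lib directory of the Solr home in use.
    jars = find_jars("/path/to/solr/lib")
    print(jars if jars else "no slf4j jars found")
```

An empty result would be consistent with the ClassNotFoundException above,
though it does not rule out other classpath problems.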

Thanks,

Brian Lamb

On Tue, Mar 22, 2011 at 12:54 PM, Brian Lamb
wrote:

> Awesome! That fixed that problem. I'm getting another class not found error
> but I'll see if I can fix it on my own first.
>
>
> On Tue, Mar 22, 2011 at 11:56 AM, Ahmet Arslan  wrote:
>
>>
>> --- On Tue, 3/22/11, Brian Lamb  wrote:
>>
>> > From: Brian Lamb 
>> > Subject: Re: Adding the suggest component
>> > To: solr-user@lucene.apache.org
>> > Cc: "Erick Erickson" 
>> > Date: Tuesday, March 22, 2011, 5:28 PM
>> > Thanks everyone for the advice. I
>> > checked out a recent version from SVN and
>> > ran:
>> >
>> > ant clean example
>> >
>> > This worked just fine. However when I went to start the
>> > solr server, I get
>> > this error message:
>> >
>> > SEVERE: org.apache.solr.common.SolrException: Error loading
>> > class
>> > 'org.apache.solr.handler.dataimport.DataImportHandler'
>>
>> run 'ant clean dist' and copy trunk/solr/dist/
>>
>> apache-solr-dataimporthandler-extras-4.0-SNAPSHOT.jar
>> apache-solr-dataimporthandler-4.0-SNAPSHOT.jar
>>
>> to solrHome/lib directory.
>>
>>
>>
>>
>>
>>
>>
>

Re: Adding the suggest component

2011-03-22 Thread Brian Lamb

That fixed that error as well as the could not initialize Dataimport class
error. Now I'm getting:

org.apache.solr.common.SolrException: Error Instantiating Request Handler,
org.apache.solr.handler.dataimport.DataImportHandler is not a
org.apache.solr.request.SolrRequestHandler

I can't find anything on this one. What I've added to the solrconfig.xml
file matches what's in example-DIH so I don't quite understand what the issue
is here. It sounds to me like it is not declared properly somewhere but I'm
not sure where/why.

Here is the relevant portion of my solrconfig.xml file:

 db-data-config.xml

Thanks for all the help so far. You all have been great.

Brian Lamb

On Tue, Mar 22, 2011 at 3:17 PM, Ahmet Arslan  wrote:

> > java.lang.NoClassDefFoundError: Could not initialize class
> > org.apache.solr.handler.dataimport.DataImportHandler
> > at java.lang.Class.forName0(Native Method)
> >
> > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
> > at
> >
> org.apache.solr.handler.dataimport.DataImportHandler.(DataImportHandler.java:72)
> >
> > Caused by: java.lang.ClassNotFoundException:
> > org.slf4j.LoggerFactory
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> >
>
> You can find slf4j- related jars in \trunk\solr\lib, but this error is
> weird.
>
>
>
>

Re: Adding the suggest component

2011-03-23 Thread Brian Lamb

I'm still confused as to why I'm getting this error. To me it reads that the
.java file was declared incorrectly but I shouldn't need to change those
files so where am I doing something incorrectly?

On Tue, Mar 22, 2011 at 3:40 PM, Brian Lamb
wrote:

> That fixed that error as well as the could not initialize Dataimport class
> error. Now I'm getting:
>
> org.apache.solr.common.SolrException: Error Instantiating Request Handler,
> org.apache.solr.handler.dataimport.DataImportHandler is not a
> org.apache.solr.request.SolrRequestHandler
>
> I can't find anything on this one. What I've added to the solrconfig.xml
> file matches whats in example-DIH so I don't quite understand what the issue
> is here. It sounds to me like it is not declared properly somewhere but I'm
> not sure where/why.
>
> Here is the relevant portion of my solrconfig.xml file:
>
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
>
>  db-data-config.xml
>
> 
>
> Thanks for all the help so far. You all have been great.
>
> Brian Lamb
>
> On Tue, Mar 22, 2011 at 3:17 PM, Ahmet Arslan  wrote:
>
>> > java.lang.NoClassDefFoundError: Could not initialize class
>> > org.apache.solr.handler.dataimport.DataImportHandler
>> > at java.lang.Class.forName0(Native Method)
>> >
>> > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
>> > at
>> >
>> org.apache.solr.handler.dataimport.DataImportHandler.(DataImportHandler.java:72)
>> >
>> > Caused by: java.lang.ClassNotFoundException:
>> > org.slf4j.LoggerFactory
>> > at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>> >
>>
>> You can find slf4j- related jars in \trunk\solr\lib, but this error is
>> weird.
>>
>>
>>
>>
>

Re: Adding the suggest component

2011-03-23 Thread Brian Lamb

Thank you for the suggestion. I followed your advice and was able to get a
version up and running. Thanks again for all the help!

On Wed, Mar 23, 2011 at 1:55 PM, Ahmet Arslan  wrote:

> > I'm still confused as to why I'm
> > getting this error. To me it reads that the
> > .java file was declared incorrectly but I shouldn't need to
> > change those
> > files so where am I doing something incorrectly?
> >
>
> Brian, I think best thing to do is checkout a new clean copy from
> subversion and then do things step by step on this clean copy.
>
>
>
>

Default operator

2011-03-25 Thread Brian Lamb

Hi all,

I know that I can change the default operator in two ways:

1) <solrQueryParser defaultOperator="AND|OR"/>
2) Add q.op=AND

I'm wondering if it is possible to change the default operator for a
specific field only? For example, if I use the URL:

http://localhost:8983/solr/search/?q=animal:german shepherd&type:dog canine

I would want it to effectively be:

http://localhost:8983/solr/search/?q=animal:german AND shepherd&type:dog OR
canine

Other than parsing the URL before I send it out, is there a way to do this?

Thanks,

Brian Lamb
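
One way to get per-field operators without a custom parser is to have the client write the operators explicitly inside each field's group and URL-encode the result. The sketch below assumes a Python client. The helper names `field_clause` and `build_query` are made up for illustration, and joining the two field groups with AND is an assumption, since the example URL does not say how the groups should combine.

```python
from urllib.parse import urlencode

def field_clause(field, words, op):
    """Group the words for one field with an explicit operator,
    e.g. animal:(german AND shepherd)."""
    joined = f" {op} ".join(words)
    return f"{field}:({joined})"

def build_query(clauses, joiner="AND"):
    """Join per-field clauses; `joiner` is an assumption, not from the thread."""
    return f" {joiner} ".join(clauses)

q = build_query([
    field_clause("animal", ["german", "shepherd"], "AND"),
    field_clause("type", ["dog", "canine"], "OR"),
])

# urlencode escapes spaces, colons and parentheses so the whole
# expression travels as a single q parameter.
url = "http://localhost:8983/solr/search/?" + urlencode({"q": q})
print(q)
print(url)
```

Encoding matters here: in the raw URL above, the unescaped `&` before `type:` would end the `q` parameter, so the second clause would never reach the query parser.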

Re: Default operator

2011-03-28 Thread Brian Lamb

Thank you both for your input. I ended up using Ahmet's way because it seems
to fit better with the rest of the application.

On Sat, Mar 26, 2011 at 6:02 AM, lboutros  wrote:

> The other way could be to extend the SolrQueryParser to read a per field
> default operator in the solr config file. Then it should be possible to
> override this functions :
>
> setDefaultOperator
> getDefaultOperator
>
> and this two which are using the default operator :
>
> getFieldQuery
> addClause
>
> The you just have to declare it in the solr config file and configure your
> default operators.
>
> Ludovic.
>
>
>
> -
> Jouve
> France.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Default-operator-tp2732237p2734931.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

String field

2011-03-29 Thread Brian Lamb

Hi all,

I'm a little confused about the string field. I read somewhere that if I
want to do an exact match, I should use a string field. So I made a few
modifications to my schema file:






And did a full import but when I do a search and return all fields, only id
is showing up. The only difference is that id is my primary key field so
that could be why it is showing up but why aren't the others showing up?

Thanks,

Brian Lamb

Re: String field

2011-03-29 Thread Brian Lamb

The full import wasn't spitting out any errors on the web page but in
looking at the logs, there were errors. Correcting those errors solved that
issue.

Thanks,

Brian Lamb

On Tue, Mar 29, 2011 at 2:44 PM, Erick Erickson wrote:

> try the schema browser from the admin page to be sure the fields
> you *think* are in the index really are. Did you do a commit
> after indexing? Did you re-index after the schema changes? Are
> you 100% sure that, if you did re-index, the new fields were in the
> docs submitted?
>
> Best
> Erick
>
> On Tue, Mar 29, 2011 at 11:46 AM, Brian Lamb
>  wrote:
> > Hi all,
> >
> > I'm a little confused about the string field. I read somewhere that if I
> > want to do an exact match, I should use an exact match. So I made a few
> > modifications to my schema file:
> >
> >  required="false"
> > />
> >  indexed="true"
> > stored="true" required="false" />
> >  > required="false" />
> >  > required="false" />
> >
> > And did a full import but when I do a search and return all fields, only
> id
> > is showing up. The only difference is that id is my primary key field so
> > that could be why it is showing up but why aren't the others showing up?
> >
> > Thanks,
> >
> > Brian Lamb
> >
>

Matching on a multi valued field

2011-03-29 Thread Brian Lamb

Hi all,

I have a field set up like this:



And I have some records:

RECORD1

  man's best friend
  pooch


RECORD2

  man's worst enemy
  friend to no one


Now if I do a search such as:
http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND df=common_names}man's
friend

Both records are returned. However, I only want RECORD1 returned. I
understand why RECORD2 is returned but how can I structure my query so that
only RECORD1 is returned?

Thanks,

Brian Lamb

Re: Matching on a multi valued field

2011-03-30 Thread Brian Lamb

Thank you all for your responses. The field had already been set up with
positionIncrementGap=100 so I just needed to add in the slop.
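
Why the gap plus a small slop keeps matches inside one value can be seen in a toy model of token positions. This is a simplified sketch in Python, not Lucene's actual phrase matcher: it assumes whitespace tokenization, lowercasing, and that each new value starts `gap` positions after the previous value's last token. The function names are hypothetical.

```python
def index_positions(values, gap=100):
    """Assign token positions across the values of one multiValued field."""
    positions = {}
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # jump between values, like positionIncrementGap
        for token in value.lower().split():
            pos += 1
            positions.setdefault(token, []).append(pos)
    return positions

def sloppy_match(values, first, second, slop, gap=100):
    """True if the two terms sit within `slop` extra positions of each other."""
    positions = index_positions(values, gap)
    return any(
        abs(p2 - p1) - 1 <= slop
        for p1 in positions.get(first, [])
        for p2 in positions.get(second, [])
    )

record1 = ["man's best friend", "pooch"]
record2 = ["man's worst enemy", "friend to no one"]

print(sloppy_match(record1, "man's", "friend", slop=10))
print(sloppy_match(record2, "man's", "friend", slop=10))
```

In RECORD2 the two terms end up more than 100 positions apart, so any slop smaller than the gap cannot bridge two values. In query terms this corresponds to a quoted phrase with a slop, such as "man's friend"~10 against common_names.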

On Tue, Mar 29, 2011 at 6:32 PM, Juan Pablo Mora  wrote:

> >> A multiValued field
> >> is actually a single field with all data separated with
> positionIncrement.
> >> Try setting that value high enough and use a PhraseQuery.
>
>
> That is true but you cannot do things like:
>
> q="bar* foo*"~10 with default query search.
>
> and if you use dismax you will have the same problems with multivalued
> fields. Imagine the situation:
>
> Doc1:
>field A: ["foo bar","dooh"] 2 values
>
> Doc2:
>field A: ["bar dooh", "whatever"] Another 2 values
>
> the query:
>qt=dismax & qf= fieldA & q = ( bar dooh )
>
> will return both Doc1 and Doc2. The only thing you can do in this situation
> is boost phrase query in Doc2 with parameter pf in order to get Doc2 in the
> first position of the results:
>
> pf = fieldA^1
>
>
> Thanks,
> JP.
>
>
> El 29/03/2011, a las 23:14, Markus Jelsma escribió:
>
> > orly, all replies came in while sending =)
> >
> >> Hi,
> >>
> >> Your filter query is looking for a match of "man's friend" in a single
> >> field. Regardless of analysis of the common_names field, all terms are
> >> present in the common_names field of both documents. A multiValued field
> >> is actually a single field with all data separated with
> positionIncrement.
> >> Try setting that value high enough and use a PhraseQuery.
> >>
> >> That should work
> >>
> >> Cheers,
> >>
> >>> Hi all,
> >>>
> >>> I have a field set up like this:
> >>>
> >>>  indexed="true"
> >>> stored="true" required="false" />
> >>>
> >>> And I have some records:
> >>>
> >>> RECORD1
> >>> 
> >>>
> >>>  man's best friend
> >>>  pooch
> >>>
> >>> 
> >>>
> >>> RECORD2
> >>> 
> >>>
> >>>  man's worst enemy
> >>>  friend to no one
> >>>
> >>> 
> >>>
> >>> Now if I do a search such as:
> >>> http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND
> >>> df=common_names}man's friend
> >>>
> >>> Both records are returned. However, I only want RECORD1 returned. I
> >>> understand why RECORD2 is returned but how can I structure my query so
> >>> that only RECORD1 is returned?
> >>>
> >>> Thanks,
> >>>
> >>> Brian Lamb
>
>

Matching the beginning of a word within a term

2011-03-30 Thread Brian Lamb

Hi all,

I have a field set up like this:



And I have some records:

RECORD1

companion to mankind
pooch


RECORD2

companion to womankind
man's worst enemy


I would like to write a query that will match the beginning of a word within
the term. Here is the query I would use as it exists now:

http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND%20df=common_names}"companion
man"~10

In the above example, I would want to return only RECORD1.

The query as it exists right now is designed to only match records where
both words are present in the same term. So if I changed man to mankind in
the query, RECORD1 will be returned.

Even though the phrases companion and man exist in the same term in RECORD2,
I do not want RECORD2 to be returned because 'man' is not at the beginning
of the word.

How can I achieve this?

Thanks,

Brian Lamb

Re: Matching the beginning of a word within a term

2011-03-31 Thread Brian Lamb

No, I don't really want to break down the words into subwords. In the
example I provided, I would not want "kind" to match either record because
it is not at the beginning of the word even though "kind" appears in both
records as part of a word.

On Wed, Mar 30, 2011 at 4:42 PM, lboutros  wrote:

> Do you want to tokenize subwords based on dictionaries ? A bit like
> disagglutination of german words ?
>
> If so, something like this could help : DictionaryCompoundWordTokenFilter
>
> http://search.lucidimagination.com/search/document/CDRG_ch05_5.8.8
>
> Ludovic
>
>
> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilter.html
>
> 2011/3/30 Brian Lamb [via Lucene] <
> ml-node+2754668-300063934-383...@n3.nabble.com>
>
> > Hi all,
> >
> > I have a field set up like this:
> >
> >  > stored="true" required="false" />
> >
> > And I have some records:
> >
> > RECORD1
> > 
> > companion to mankind
> > pooch
> > 
> >
> > RECORD2
> > 
> > companion to womankind
> > man's worst enemy
> > 
> >
> > I would like to write a query that will match the beginning of a word
> > within
> > the term. Here is the query I would use as it exists now:
> >
> >
> http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND%20df=common_names}
> "companion
> >
> > man"~10
> >
> > In the above example. I would want to return only RECORD1.
> >
> > The query as it exists right now is designed to only match records where
> > both words are present in the same term. So if I changed man to mankind
> in
> > the query, RECORD1 will be returned.
> >
> > Even though the phrases companion and man exist in the same term in
> > RECORD2,
> > I do not want RECORD2 to be returned because 'man' is not at the
> beginning
> > of the word.
> >
> > How can I achieve this?
> >
> > Thanks,
> >
> > Brian Lamb
> >
> >
>
>
> -
> Jouve
> France.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Matching-the-beginning-of-a-word-within-a-term-tp2754668p2755561.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: Matching the beginning of a word within a term

2011-04-04 Thread Brian Lamb

Thank you both for your replies. It looks like EdgeNGramFilter will do the
job nicely. Time to reindex...again.
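
For anyone following along, here is a rough Python model of what a front-edge n-gram filter produces and why it answers the question. It is a sketch under assumptions: whitespace tokenization, lowercasing, and the `min_gram`/`max_gram` defaults are illustrative choices rather than my actual schema settings, and the function names are made up.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Front-edge n-grams of a token: its prefixes from min_gram to max_gram characters."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

def indexed_terms(text, min_gram=1, max_gram=25):
    """All prefix terms that would be indexed for one field value."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(token, min_gram, max_gram))
    return terms

value1 = indexed_terms("companion to mankind")
value2 = indexed_terms("companion to womankind")

print("man" in value1)   # prefix of "mankind"
print("man" in value2)   # only occurs inside "womankind"
print("kind" in value1)  # a suffix, never a prefix
```

The grams would normally be generated at index time only, so that the query term "man" is looked up as typed instead of being expanded into its own prefixes.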

On Fri, Apr 1, 2011 at 8:31 AM, Jan Høydahl  wrote:

> Check out
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
> Don't know if it works with phrases though
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 31. mars 2011, at 16.49, Brian Lamb wrote:
>
> > No, I don't really want to break down the words into subwords. In the
> > example I provided, I would not want "kind" to match either record
> because
> > it is not at the beginning of the word even though "kind" appears in both
> > records as part of a word.
> >
> > On Wed, Mar 30, 2011 at 4:42 PM, lboutros  wrote:
> >
> >> Do you want to tokenize subwords based on dictionaries ? A bit like
> >> disagglutination of german words ?
> >>
> >> If so, something like this could help :
> DictionaryCompoundWordTokenFilter
> >>
> >> http://search.lucidimagination.com/search/document/CDRG_ch05_5.8.8
> >>
> >> Ludovic
> >>
> >>
> >>
> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilter.html
> >>
> >> 2011/3/30 Brian Lamb [via Lucene] <
> >> ml-node+2754668-300063934-383...@n3.nabble.com>
> >>
> >>> Hi all,
> >>>
> >>> I have a field set up like this:
> >>>
> >>>  indexed="true"
> >>> stored="true" required="false" />
> >>>
> >>> And I have some records:
> >>>
> >>> RECORD1
> >>> 
> >>> companion to mankind
> >>> pooch
> >>> 
> >>>
> >>> RECORD2
> >>> 
> >>> companion to womankind
> >>> man's worst enemy
> >>> 
> >>>
> >>> I would like to write a query that will match the beginning of a word
> >>> within
> >>> the term. Here is the query I would use as it exists now:
> >>>
> >>>
> >>
> http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND%20df=common_names}
> >> "companion
> >>>
> >>> man"~10
> >>>
> >>> In the above example. I would want to return only RECORD1.
> >>>
> >>> The query as it exists right now is designed to only match records
> where
> >>> both words are present in the same term. So if I changed man to mankind
> >> in
> >>> the query, RECORD1 will be returned.
> >>>
> >>> Even though the phrases companion and man exist in the same term in
> >>> RECORD2,
> >>> I do not want RECORD2 to be returned because 'man' is not at the
> >> beginning
> >>> of the word.
> >>>
> >>> How can I achieve this?
> >>>
> >>> Thanks,
> >>>
> >>> Brian Lamb
> >>>
> >>>
> >>
> >>
> >> -
> >> Jouve
> >> France.
> >> --
> >> View this message in context:
> >>
> http://lucene.472066.n3.nabble.com/Matching-the-beginning-of-a-word-within-a-term-tp2754668p2755561.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

Re: Matching on a multi valued field

2011-04-04 Thread Brian Lamb

I just noticed Juan's response and I find that I am encountering that very
issue in a few cases. Boosting is a good way to put the more relevant
results at the top, but is it possible to have only the correct results
returned?

On Wed, Mar 30, 2011 at 11:51 AM, Brian Lamb
wrote:

> Thank you all for your responses. The field had already been set up with
> positionIncrementGap=100 so I just needed to add in the slop.
>
>
> On Tue, Mar 29, 2011 at 6:32 PM, Juan Pablo Mora wrote:
>
>> >> A multiValued field
>> >> is actually a single field with all data separated with
>> positionIncrement.
>> >> Try setting that value high enough and use a PhraseQuery.
>>
>>
>> That is true but you cannot do things like:
>>
>> q="bar* foo*"~10 with default query search.
>>
>> and if you use dismax you will have the same problems with multivalued
>> fields. Imagine the situation:
>>
>> Doc1:
>>field A: ["foo bar","dooh"] 2 values
>>
>> Doc2:
>>field A: ["bar dooh", "whatever"] Another 2 values
>>
>> the query:
>>qt=dismax & qf= fieldA & q = ( bar dooh )
>>
>> will return both Doc1 and Doc2. The only thing you can do in this
>> situation is boost phrase query in Doc2 with parameter pf in order to get
>> Doc2 in the first position of the results:
>>
>> pf = fieldA^1
>>
>>
>> Thanks,
>> JP.
>>
>>
>> El 29/03/2011, a las 23:14, Markus Jelsma escribió:
>>
>> > orly, all replies came in while sending =)
>> >
>> >> Hi,
>> >>
>> >> Your filter query is looking for a match of "man's friend" in a single
>> >> field. Regardless of analysis of the common_names field, all terms are
>> >> present in the common_names field of both documents. A multiValued
>> field
>> >> is actually a single field with all data separated with
>> positionIncrement.
>> >> Try setting that value high enough and use a PhraseQuery.
>> >>
>> >> That should work
>> >>
>> >> Cheers,
>> >>
>> >>> Hi all,
>> >>>
>> >>> I have a field set up like this:
>> >>>
>> >>> > indexed="true"
>> >>> stored="true" required="false" />
>> >>>
>> >>> And I have some records:
>> >>>
>> >>> RECORD1
>> >>> 
>> >>>
>> >>>  man's best friend
>> >>>  pooch
>> >>>
>> >>> 
>> >>>
>> >>> RECORD2
>> >>> 
>> >>>
>> >>>  man's worst enemy
>> >>>  friend to no one
>> >>>
>> >>> 
>> >>>
>> >>> Now if I do a search such as:
>> >>> http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND
>> >>> df=common_names}man's friend
>> >>>
>> >>> Both records are returned. However, I only want RECORD1 returned. I
>> >>> understand why RECORD2 is returned but how can I structure my query so
>> >>> that only RECORD1 is returned?
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Brian Lamb
>>
>>
>

MoreLikeThis match

2011-04-07 Thread Brian Lamb

Hi all,

I've been using MoreLikeThis for a while through select:

http://localhost:8983/solr/select/?q=field:more like
this&mlt=true&mlt.fl=field&rows=100&fl=*,score

I was looking over the wiki page today and saw that you can also do this:

http://localhost:8983/solr/mlt/?q=field:more like
this&mlt=true&mlt.fl=field&rows=100

which seems to run faster and do a better job overall. When the results are
returned, they are formatted like this:


  
0
1
  
  

  3.0438285
  5

  
  

  0.1125823
  3


  0.10231556
  8

 ...
  


It seems that it always returns just one result under match, while the number
of results under response is set by the rows parameter. How can I get more
than one result under match?

What I'm trying to do here is whatever is set for field:, I would like to
return the top 100 records that match that search based on more like this.

Thanks,

Brian Lamb
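
As far as I understand the MoreLikeThis handler, the q parameter only selects a source document: the first hit is used, and "match" echoes that one document, while "response" lists the documents found to be similar to it. That would explain why "match" never holds more than one entry. A small Python sketch of reading such a reply follows. It assumes the reply was requested as JSON and parsed into a dict with "match" and "response" sections, and the field names `id` and `score` are assumptions, since the tags in the output above were lost.

```python
def split_mlt(result):
    """Return (source_doc, similar_docs) from a parsed MoreLikeThis reply."""
    matched = result.get("match", {}).get("docs", [])
    source = matched[0] if matched else None
    similar = result.get("response", {}).get("docs", [])
    return source, similar

# Sample shaped like the reply above; the key names are assumed.
sample = {
    "match": {"docs": [{"id": "5", "score": 3.0438285}]},
    "response": {"docs": [{"id": "3", "score": 0.1125823},
                          {"id": "8", "score": 0.10231556}]},
}

source, similar = split_mlt(sample)
print(source["id"], [doc["id"] for doc in similar])
```

If that reading is right, the two scores are not comparable: the one under "match" would come from the q query itself, and the ones under "response" from the query generated out of the source document's terms.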

Re: MoreLikeThis match

2011-04-07 Thread Brian Lamb

Actually, what is the difference between "match" and "response"? It seems
that match always returns one result but I've thrown a few cases at it where
the score of the highest response is higher than the score of match. And
then there are cases where the match score dwarfs the highest response
score.

On Thu, Apr 7, 2011 at 1:30 PM, Brian Lamb wrote:

> Hi all,
>
> I've been using MoreLikeThis for a while through select:
>
> http://localhost:8983/solr/select/?q=field:more like
> this&mlt=true&mlt.fl=field&rows=100&fl=*,score
>
> I was looking over the wiki page today and saw that you can also do this:
>
> http://localhost:8983/solr/mlt/?q=field:more like
> this&mlt=true&mlt.fl=field&rows=100
>
> which seems to run faster and do a better job overall. When the results are
> returned, they are formatted like this:
>
> 
>   
> 0
> 1
>   
>   
> 
>   3.0438285
>   5
> 
>   
>   
> 
>   0.1125823
>   3
> 
> 
>   0.10231556
>   8
> 
>  ...
>   
> 
>
> It seems that it always returns just 1 response under match and response is
> set by the rows parameter. How can I get more than one result under match?
>
> What I'm trying to do here is whatever is set for field:, I would like to
> return the top 100 records that match that search based on more like this.
>
> Thanks,
>
> Brian Lamb
>

Re: MoreLikeThis match

2011-04-08 Thread Brian Lamb

I've looked at both wiki pages and neither really clarifies the difference
between these two. If I copy and paste an existing index value for field and
do an mlt search, it shows up under match but not under response. What is the
difference between these two?

On Thu, Apr 7, 2011 at 2:24 PM, Brian Lamb wrote:

> Actually, what is the difference between "match" and "response"? It seems
> that match always returns one result but I've thrown a few cases at it where
> the score of the highest response is higher than the score of match. And
> then there are cases where the match score dwarfs the highest response
> score.
>
>
> On Thu, Apr 7, 2011 at 1:30 PM, Brian Lamb 
> wrote:
>
>> Hi all,
>>
>> I've been using MoreLikeThis for a while through select:
>>
>> http://localhost:8983/solr/select/?q=field:more like
>> this&mlt=true&mlt.fl=field&rows=100&fl=*,score
>>
>> I was looking over the wiki page today and saw that you can also do this:
>>
>> http://localhost:8983/solr/mlt/?q=field:more like
>> this&mlt=true&mlt.fl=field&rows=100
>>
>> which seems to run faster and do a better job overall. When the results
>> are returned, they are formatted like this:
>>
>> 
>>   
>> 0
>> 1
>>   
>>   
>> 
>>   3.0438285
>>   5
>> 
>>   
>>   
>> 
>>   0.1125823
>>   3
>> 
>> 
>>   0.10231556
>>   8
>> 
>>  ...
>>   
>> 
>>
>> It seems that it always returns just 1 response under match and response
>> is set by the rows parameter. How can I get more than one result under
>> match?
>>
>> What I'm trying to do here is whatever is set for field:, I would like to
>> return the top 100 records that match that search based on more like this.
>>
>> Thanks,
>>
>> Brian Lamb
>>
>
>

Re: MoreLikeThis match

2011-04-11 Thread Brian Lamb

Does anyone have any thoughts on this one?

On Fri, Apr 8, 2011 at 9:26 AM, Brian Lamb wrote:

> I've looked at both wiki pages and none really clarify the difference
> between these two. If I copy and paste an existing index value for field and
> do an mlt search, it shows up under match but not results. What is the
> difference between these two?
>
>
> On Thu, Apr 7, 2011 at 2:24 PM, Brian Lamb 
> wrote:
>
>> Actually, what is the difference between "match" and "response"? It seems
>> that match always returns one result but I've thrown a few cases at it where
>> the score of the highest response is higher than the score of match. And
>> then there are cases where the match score dwarfs the highest response
>> score.
>>
>>
>> On Thu, Apr 7, 2011 at 1:30 PM, Brian Lamb > > wrote:
>>
>>> Hi all,
>>>
>>> I've been using MoreLikeThis for a while through select:
>>>
>>> http://localhost:8983/solr/select/?q=field:more like
>>> this&mlt=true&mlt.fl=field&rows=100&fl=*,score
>>>
>>> I was looking over the wiki page today and saw that you can also do this:
>>>
>>> http://localhost:8983/solr/mlt/?q=field:more like
>>> this&mlt=true&mlt.fl=field&rows=100
>>>
>>> which seems to run faster and do a better job overall. When the results
>>> are returned, they are formatted like this:
>>>
>>> 
>>>   
>>> 0
>>> 1
>>>   
>>>   
>>> 
>>>   3.0438285
>>>   5
>>> 
>>>   
>>>   >> maxScore="0.12775186">
>>> 
>>>   0.1125823
>>>   3
>>> 
>>> 
>>>   0.10231556
>>>   8
>>> 
>>>  ...
>>>   
>>> 
>>>
>>> It seems that it always returns just 1 response under match and response
>>> is set by the rows parameter. How can I get more than one result under
>>> match?
>>>
>>> What I'm trying to do here is whatever is set for field:, I would like to
>>> return the top 100 records that match that search based on more like this.
>>>
>>> Thanks,
>>>
>>> Brian Lamb
>>>
>>
>>
>

MoreLikeThis

2011-04-21 Thread Brian Lamb

Hi all,

I have an mlt search set up on my site with over 2 million records in the
index. Normally, my results look like:


  
0
204
  
  

  Some result.

  
  

  A similar result

...
  


And there are 100 results under response. However, in some cases, there are
no results under "response". Why is this the case and is there anything I
can do about it?

Here is my mlt configuration:


  
title,score
1
100
*,score
   


And here is the URL I use to get results:
http://localhost:8983/solr/mlt/?q=title:Some random title

Any help on this matter would be greatly appreciated. Thanks!

Brian Lamb

Re: MoreLikeThis

2011-04-25 Thread Brian Lamb

It finds something under "match" but just nothing under "response". I tried
turning on debugQuery=on but I did not see anything that jumped out at me as
a bug or anything. Is there some kind of threshold setting that I can tinker
with to see if that is the problem?
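
The thresholds I know of are mlt.mintf (minimum term frequency in the source document) and mlt.mindf (minimum number of documents a term must appear in). With short fields such as titles, most words occur only once, so a minimum term frequency above 1 can leave no terms to search with. Below is a hedged Python sketch of a request that sets both to 1. The function name and its defaults are assumptions for illustration, and the title is quoted so that the whole phrase is searched in the title field.

```python
from urllib.parse import urlencode

def mlt_url(base, query, fields, min_tf=1, min_df=1, rows=100):
    """Build a MoreLikeThis request with explicit term thresholds."""
    params = {
        "q": query,
        "mlt.fl": fields,
        "mlt.mintf": min_tf,          # minimum term frequency in the source doc
        "mlt.mindf": min_df,          # minimum document frequency in the index
        "mlt.match.include": "true",  # keep the source doc under "match"
        "rows": rows,
        "fl": "*,score",
    }
    return base + "?" + urlencode(params)

url = mlt_url("http://localhost:8983/solr/mlt/",
              'title:"Some random title"', "title")
print(url)
```

Values passed on the URL override the handler defaults in solrconfig.xml, so this is a quick way to test whether a threshold is the cause before changing the configuration.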

On Sun, Apr 24, 2011 at 2:37 AM, Grant Ingersoll wrote:

>
> On Apr 21, 2011, at 8:46 PM, Brian Lamb wrote:
>
> > Hi all,
> >
> > I have an mlt search set up on my site with over 2 million records in the
> > index. Normally, my results look like:
> >
> > 
> >  
> >0
> >204
> >  
> >  
> >
> >  Some result.
> >
> >  
> >  
> >
> >  A similar result
> >
> >...
> >  
> > 
> >
> > And there are 100 results under response. However, in some cases, there
> are
> > no results under "response". Why is this the case and is there anything I
> > can do about it?
>
> Is it because it couldn't find anything?  Or are you thinking there is a
> bug?  You might try adding &debugQuery=true and see what gets parsed, etc.
> and then try running that query.
>
>
> >
> > Here is my mlt configuration:
> >
> > 
> >  
> >title,score
> >1
> >100
> >*,score
> >   
> > 
> >
> > And here is the URL I use to get results:
> > http://localhost:8983/solr/mlt/?q=title:Some random title
> >
> > Any help on this matter would be greatly appreciated. Thanks!
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem docs using Solr/Lucene:
> http://www.lucidimagination.com/search
>
>

Negative boost

2011-05-02 Thread Brian Lamb

Hi all,

I understand that the only way to simulate a negative boost is to positively
boost the inverse. I have looked at
http://wiki.apache.org/solr/SolrRelevancyFAQ but I think I am missing
something on the formatting of my query. I am using:

http://localhost:8983/solr/search?q=dog&bq=(*:* -species:Sheltie)^1

In this case, I am trying to search for records about "dog" but to put
records containing "Sheltie" closer to the bottom as I am not really
interested in that. However, the following queries:

http://localhost:8983/solr/search?q=dog
http://localhost:8983/solr/search?q=dog&bq=(*:* -species:Sheltie)^1

Return the exact same set of results with a record about a Sheltie as the
top result each time. What am I doing incorrectly?

Thanks,

Brian Lamb
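
Two things may be worth checking here. First, bq is a dismax parameter, so a handler using the standard query parser would ignore it, which would produce identical results. Second, a boost of ^1 adds very little; the FAQ example uses a large value such as ^999. The Python sketch below builds such a request. It assumes dismax is acceptable for this handler, and the function name is made up.

```python
from urllib.parse import urlencode

def demote_url(base, query, demote_clause, boost=999):
    """Boost every document that does NOT match `demote_clause`."""
    params = {
        "defType": "dismax",  # assumption: bq needs a dismax-style parser
        "q": query,
        "bq": f"(*:* -{demote_clause})^{boost}",
    }
    return base + "?" + urlencode(params)

url = demote_url("http://localhost:8983/solr/search", "dog", "species:Sheltie")
print(url)
```

Adding debugQuery=on to the request should show whether the boost query appears in the parsed query at all.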

Solr security

2011-05-09 Thread Brian Lamb

Hi all,

Is it possible to set up solr so that it will only execute dataimport
commands if they come from localhost?

Right now, my application and my solr installation are on different servers
so any requests are formatted http://domain:8983 instead of
http://localhost:8983. I am concerned that when I launch my application,
there will be the potential for abuse. Is the best solution to have
everything reside on the same server?

What are some other solutions?

Thanks,

Brian Lamb

Re: Solr security

2011-05-10 Thread Brian Lamb

Great posts all. I will give these a look and come up with something based
on these recommendations. I'm sure as I begin implementing something, I will
have more questions arise.

On Tue, May 10, 2011 at 9:00 AM, Anthony Wlodarski <
anth...@tinkertownlabs.com> wrote:

> The WIKI has a loose interpretation of how to set-up Jetty securely.
>  Please take a look at the article I wrote here:
> http://anthonyw.net/2011/04/securing-jetty-and-solr-with-php-authentication/.
>  Even if PHP is not your language that sits on top of Solr you can still use
> the first part of the tutorial.  If you are using Tomcat I would recommend
> looking here:
> http://blog.comtaste.com/2009/02/securing_your_solr_server_on_t.html
>
> Regards,
>
> -Anthony
>
>
> On 05/09/2011 05:28 PM, Jan Høydahl wrote:
>
>> Hi,
>>
>> You can simply configure a firewall on your Solr server to only allow
>> access from your frontend server. Whether you use the built-in software
>> firewall of Linux/Windows/Whatever or use some other FW utility is a choice
>> you need to make. This is by design - you should never ever expose your
>> backend services, whether it's a search server or a database server, to the
>> public.
>>
>> Read more about Solr security on the WIKI:
>> http://wiki.apache.org/solr/SolrSecurity
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>> On 9. mai 2011, at 20.57, Brian Lamb wrote:
>>
>>  Hi all,
>>>
>>> Is it possible to set up solr so that it will only execute dataimport
>>> commands if they come from localhost?
>>>
>>> Right now, my application and my solr installation are on different
>>> servers
>>> so any requests are formatted http://domain:8983 instead of
>>> http://localhost:8983. I am concerned that when I launch my application,
>>> there will be the potential for abuse. Is the best solution to have
>>> everything reside on the same server?
>>>
>>> What are some other solutions?
>>>
>>> Thanks,
>>>
>>> Brian Lamb
>>>
>>
> --
> Anthony Wlodarski
> Lead Software Engineer
> Get2Know.me (http://www.get2know.me)
> Office: 646-285-0500 x217
> Fax: 646-285-0400
>
>

MoreLikeThis PDF search

2011-05-12 Thread Brian Lamb

Hi all,

I've become more and more familiar with the MoreLikeThis handler over the
last several months. I'm curious whether it is possible to do a MoreLikeThis
search by uploading a PDF. I looked at the ExtractingRequestHandler, which
looks like it is used to process PDF files and the like, but is it
possible to combine the two?

Just to be clear, I don't want to send a PDF and have that be a part of the
index. But rather, I'd like to be able to use the PDF as a MoreLikeThis
search.

Thanks,

Brian Lamb

Changing the schema

2011-05-12 Thread Brian Lamb

If I change the field type in my schema, do I need to rebuild the entire
index? I'm at a point now where it takes over a day to do a full import due
to the sheer size of my application and I would prefer not having to reindex
just because I want to make a change somewhere.

Thanks,

Brian Lamb

Re: MoreLikeThis PDF search

2011-05-13 Thread Brian Lamb

Any thoughts on this one?

On Thu, May 12, 2011 at 10:46 AM, Brian Lamb
wrote:

> Hi all,
>
> I've become more and more familiar with the MoreLikeThis handler over the
> last several months. I'm curious whether it is possible to do a MoreLikeThis
> search by uploading a PDF? I looked at the ExtractingRequestHandler and that
> looks like it that is used to process PDF files and the like but is it
> possible to combine the two?
>
> Just to be clear, I don't want to send a PDF and have that be a part of the
> index. But rather, I'd like to be able to use the PDF as a MoreLikeThis
> search.
>
> Thanks,
>
> Brian Lamb
>

Re: MoreLikeThis PDF search

2011-05-17 Thread Brian Lamb

Would I be better off using something like PHP to read the PDF file and
extract the text, and then pass it on to the MoreLikeThis
handler or is there a way it can be done by giving it the PDF directly?
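
One two-step approach that might work, sketched in Python: post the PDF to the extracting handler with extractOnly=true so the text comes back without being indexed, then hand that text to the MoreLikeThis handler as a content stream. The sketch only builds the two URLs. The /update/extract path and the function names are assumptions (the /mlt path is the one used earlier), and I have not confirmed that the two handlers can be chained in a single request.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # assumption: same host and port as before

def extract_only_url():
    """URL to POST the PDF to; extractOnly=true returns the text without indexing it."""
    params = {"extractOnly": "true", "extractFormat": "text"}
    return f"{SOLR}/update/extract?" + urlencode(params)

def mlt_from_text_url(text, fields, rows=100):
    """URL that passes already-extracted text to MoreLikeThis via stream.body."""
    params = {"stream.body": text, "mlt.fl": fields, "rows": rows}
    return f"{SOLR}/mlt?" + urlencode(params)

print(extract_only_url())
print(mlt_from_text_url("text pulled out of the PDF", "title"))
```

For a real document the extracted text would be long, so sending it in a POST body would be safer than putting it on the URL.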

On Fri, May 13, 2011 at 4:54 PM, Brian Lamb
wrote:

> Any thoughts on this one?
>
>
> On Thu, May 12, 2011 at 10:46 AM, Brian Lamb <
> brian.l...@journalexperts.com> wrote:
>
>> Hi all,
>>
>> I've become more and more familiar with the MoreLikeThis handler over the
>> last several months. I'm curious whether it is possible to do a MoreLikeThis
>> search by uploading a PDF? I looked at the ExtractingRequestHandler and that
>> looks like it that is used to process PDF files and the like but is it
>> possible to combine the two?
>>
>> Just to be clear, I don't want to send a PDF and have that be a part of
>> the index. But rather, I'd like to be able to use the PDF as a MoreLikeThis
>> search.
>>
>> Thanks,
>>
>> Brian Lamb
>>
>
>

Disable IDF scoring on certain fields

2011-05-17 Thread Brian Lamb

Hi all,

I have a field defined in my schema.xml file as


   
 
 
   



I would like to disable IDF scoring on this field. I am not interested in
how rare the term is; I only care whether the term is present. The idea is
that if a user searches for "myfield:dog OR myfield:pony", any
document containing dog or pony would be scored identically. If both
showed up, that record would be moved to the top, but all the
records where both showed up would have the same score.

So long story short, how can I disable the idf score for this particular
field?

Thanks,

Brian Lamb
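To make the request concrete, here is a minimal sketch of the arithmetic only, with no Lucene classes involved. classicIdf follows the usual DefaultSimilarity formula, 1 + ln(numDocs / (docFreq + 1)); flatIdf is the constant being asked for. The class name and the document counts are made up for illustration, and idf is only one factor in the final score.

```java
// Stand-in for the idea only: no Lucene classes are used here.
final class FlatIdfSketch {

    /** Classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1)). */
    static float classicIdf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    /** Flat idf: every term counts the same, however rare it is. */
    static float flatIdf(int docFreq, int numDocs) {
        return 1.0f;
    }

    public static void main(String[] args) {
        int numDocs = 1000;    // hypothetical index size
        int dogDocFreq = 500;  // hypothetical: "dog" is common
        int ponyDocFreq = 5;   // hypothetical: "pony" is rare

        // With the classic formula the rare term gets the larger weight.
        System.out.println("classic dog  = " + classicIdf(dogDocFreq, numDocs));
        System.out.println("classic pony = " + classicIdf(ponyDocFreq, numDocs));

        // With a flat idf both terms weigh the same, so only the number of
        // matching terms (and any unrelated boost) separates documents.
        System.out.println("flat dog     = " + flatIdf(dogDocFreq, numDocs));
        System.out.println("flat pony    = " + flatIdf(ponyDocFreq, numDocs));
    }
}
```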

Re: Disable IDF scoring on certain fields

2011-05-17 Thread Brian Lamb

Hi Markus,

I was just looking at overriding DefaultSimilarity so your email was well
timed. The problem I have with it is as you mentioned, it does not seem
possible to do it on a field by field basis. Has anyone had any luck with
doing some of the similarity functions on a field by field basis? I have
need to do more than one of them and from what I can find, it seems that
only computeNorm accounts for the name of the field.

Thanks,

Brian Lamb

On Tue, May 17, 2011 at 3:34 PM, Markus Jelsma
wrote:

> Hi,
>
> Although you can configure per field TF (by omitTermFreqAndPositions) you
> can't
> do this for IDF. If your index is only used for this specific purpose (seems
> like an auto-complete index) then you can override DefaultSimilarity and
> return a static value for IDF. If you still want IDF for other fields then
> i
> think you have a problem because Solr doesn't yet support per-field
> similarity.
>
>
> http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/src/java/org/apache/lucene/search/DefaultSimilarity.java?view=markup
>
> Cheers,
>
> > Hi all,
> >
> > I have a field defined in my schema.xml file as
> >
> >  > positionIncrementGap="1000">
> >
> >  
> >   > maxGramSize="25" side="front" />
> >
> > 
> >  > stored="true" required="false" omitNorms="true" />
> >
> > I would like to disable IDF scoring on this field. I am not interested in
> > how rare the term is, I only care if the term is present or not. The idea
> > is that if a user does a search for "myfield:dog OR myfield:pony", that
> > any document containing dog or pony would be scored identically. In the
> > case that both showed up, that record would be moved to the top but all
> > the records where they both showed up would have the same score.
> >
> > So long story short, how can I disable the idf score for this particular
> > field?
> >
> > Thanks,
> >
> > Brian Lamb
>

Re: Disable IDF scoring on certain fields

2011-05-17 Thread Brian Lamb

Thank you Robert for pointing this out. This is not being used for
autocomplete. I already have another core set up for that :-)

The idea is like I outlined above. I just want a multivalued field that
treats every term in the field the same so that the only way documents
separate themselves is by an unrelated boost and/or matching on multiple
terms in that field.


On Tue, May 17, 2011 at 3:55 PM, Markus Jelsma
wrote:

> Well, if you're experimental you can try trunk as Robert points out it has
> been fixed there. If not, i guess you're stuck with creating another core.
>
> Is this fieldType specifically used for auto-completion? If so, another
> core,
> preferably on another machine, is in my opinion the way to go.
> Auto-completion
> is tough in terms of performance.
>
> Thanks Robert for pointing to the Jira ticket.
>
> Cheers
>
> > Hi Markus,
> >
> > I was just looking at overriding DefaultSimilarity so your email was well
> > timed. The problem I have with it is as you mentioned, it does not seem
> > possible to do it on a field by field basis. Has anyone had any luck with
> > doing some of the similarity functions on a field by field basis? I have
> > need to do more than one of them and from what I can find, it seems that
> > only computeNorm accounts for the name of the field.
> >
> > Thanks,
> >
> > Brian Lamb
> >
> > On Tue, May 17, 2011 at 3:34 PM, Markus Jelsma
> >
> > wrote:
> > > Hi,
> > >
> > > Although you can configure per field TF (by omitTermFreqAndPositions)
> you
> > > can't
> > > do this for IDF. If your index is only used for this specific purpose
> > > (seems like an auto-complete index) then you can override
> > > DefaultSimilarity and return a static value for IDF. If you still want
> > > IDF for other fields then i
> > > think you have a problem because Solr doesn't yet support per-field
> > > similarity.
> > >
> > >
> > >
> http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/src/jav
> > > a/org/apache/lucene/search/DefaultSimilarity.java?view=markup
> > >
> > > Cheers,
> > >
> > > > Hi all,
> > > >
> > > > I have a field defined in my schema.xml file as
> > > >
> > > >  > > > positionIncrementGap="1000">
> > > >
> > > >
> > > >
> > > >  
> > > >   > > >
> > > > maxGramSize="25" side="front" />
> > > >
> > > >
> > > >
> > > > 
> > > >  > > > indexed="true" stored="true" required="false" omitNorms="true" />
> > > >
> > > > I would like to disable IDF scoring on this field. I am not
> interested
> > > > in how rare the term is, I only care if the term is present or not.
> > > > The idea is that if a user does a search for "myfield:dog OR
> > > > myfield:pony", that any document containing dog or pony would be
> > > > scored identically. In the case that both showed up, that record
> would
> > > > be moved to the top but all the records where they both showed up
> > > > would have the same score.
> > > >
> > > > So long story short, how can I disable the idf score for this
> > > > particular field?
> > > >
> > > > Thanks,
> > > >
> > > > Brian Lamb
>

Re: Disable IDF scoring on certain fields

2011-05-18 Thread Brian Lamb

I believe I have applied the patch correctly. However, I cannot seem to
figure out where the similarity class I create should reside. Any tips on
that?

Thanks,

Brian Lamb

On Tue, May 17, 2011 at 4:00 PM, Brian Lamb
wrote:

> Thank you Robert for pointing this out. This is not being used for
> autocomplete. I already have another core set up for that :-)
>
> The idea is like I outlined above. I just want a multivalued field that
> treats every term in the field the same so that the only way documents
> separate themselves is by an unrelated boost and/or matching on multiple
> terms in that field.
>
>
> On Tue, May 17, 2011 at 3:55 PM, Markus Jelsma  > wrote:
>
>> Well, if you're experimental you can try trunk as Robert points out it has
>> been fixed there. If not, i guess you're stuck with creating another core.
>>
>> Is this fieldType specifically used for auto-completion? If so, another
>> core,
>> preferably on another machine, is in my opinion the way to go.
>> Auto-completion
>> is tough in terms of performance.
>>
>> Thanks Robert for pointing to the Jira ticket.
>>
>> Cheers
>>
>> > Hi Markus,
>> >
>> > I was just looking at overriding DefaultSimilarity so your email was
>> well
>> > timed. The problem I have with it is as you mentioned, it does not seem
>> > possible to do it on a field by field basis. Has anyone had any luck
>> with
>> > doing some of the similarity functions on a field by field basis? I have
>> > need to do more than one of them and from what I can find, it seems that
>> > only computeNorm accounts for the name of the field.
>> >
>> > Thanks,
>> >
>> > Brian Lamb
>> >
>> > On Tue, May 17, 2011 at 3:34 PM, Markus Jelsma
>> >
>> > wrote:
>> > > Hi,
>> > >
>> > > Although you can configure per field TF (by omitTermFreqAndPositions)
>> you
>> > > can't
>> > > do this for IDF. If your index is only used for this specific purpose
>> > > (seems like an auto-complete index) then you can override
>> > > DefaultSimilarity and return a static value for IDF. If you still want
>> > > IDF for other fields then i
>> > > think you have a problem because Solr doesn't yet support per-field
>> > > similarity.
>> > >
>> > >
>> > >
>> http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/src/jav
>> > > a/org/apache/lucene/search/DefaultSimilarity.java?view=markup
>> > >
>> > > Cheers,
>> > >
>> > > > Hi all,
>> > > >
>> > > > I have a field defined in my schema.xml file as
>> > > >
>> > > > > > > > positionIncrementGap="1000">
>> > > >
>> > > >
>> > > >
>> > > >  
>> > > >  > > > >
>> > > > maxGramSize="25" side="front" />
>> > > >
>> > > >
>> > > >
>> > > > 
>> > > > > > > > indexed="true" stored="true" required="false" omitNorms="true" />
>> > > >
>> > > > I would like to disable IDF scoring on this field. I am not
>> interested
>> > > > in how rare the term is, I only care if the term is present or not.
>> > > > The idea is that if a user does a search for "myfield:dog OR
>> > > > myfield:pony", that any document containing dog or pony would be
>> > > > scored identically. In the case that both showed up, that record
>> would
>> > > > be moved to the top but all the records where they both showed up
>> > > > would have the same score.
>> > > >
>> > > > So long story short, how can I disable the idf score for this
>> > > > particular field?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Brian Lamb
>>
>
>

Similarity class for an individual field

2011-05-19 Thread Brian Lamb

Hi all,

Based on advice I received on a previous email thread, I applied patch
https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able to
apply a similarity class to certain fields but not all fields.

I ran the following commands:

$ cd 
$ svn up
$ wget https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
$ patch -p0 -i SOLR-2338.patch

And I did not get any errors. I then created my own SimilarityClass
listed below because it isn't very large:

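package org.apache.lucene.misc;
import org.apache.lucene.search.DefaultSimilarity;

// Same as DefaultSimilarity except that idf is a constant.
public class SimpleSimilarity extends DefaultSimilarity {
  public SimpleSimilarity() { super(); }
  // Both arguments are ignored so that rare and common terms weigh the same.
  public float idf(int dont, int care) { return 1; }
}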

As you can see, it isn't very complicated. I'm just trying to remove
the idf from the scoring equation in certain cases.

Next, I make a change to the schema.xml file:


  


And apply that to the field in question:



But I think the patch was not applied correctly. I
restarted and did a full import but the scores are exactly the same.
Also, I tried using the existing SweetSpotSimilarity:

  


But the scores remained unchanged even in that case. At this point,
I'm not quite sure how to debug this to see whether the problem is
with the patch or the similarity class but given that the SweetSpot
similarity class didn't work either, I'm inclined to think it was a
problem with the patch.

Any thoughts on this one?

Thanks,

Brian Lamb

Re: Similarity class for an individual field

2011-05-19 Thread Brian Lamb

Also, I've tried adding:



To the end of the schema file so that it is applied globally but it does not
appear to change the score either. What am I doing incorrectly?

Thanks,

Brian Lamb

On Thu, May 19, 2011 at 2:45 PM, Brian Lamb
wrote:

> Hi all,
>
> Based on advice I received on a previous email thread, I applied patch
> https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able to
> apply a similarity class to certain fields but not all fields.
>
> I ran the following commands:
>
> $ cd 
> $ svn up
> $ wget 
> https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
> $ patch -p0 -i SOLR-2338.patch
>
> And I did not get any errors. I then created my own SimilarityClass listed 
> below because it isn't very large:
>
> package org.apache.lucene.misc;
> import org.apache.lucene.search.DefaultSimilarity;
>
> public class SimpleSimilarity extends DefaultSimilarity {
>   public SimpleSimilarity() { super(); }
>
>   public float idf(int dont, int care) { return 1; }
> }
>
> As you can see, it isn't very complicated. I'm just trying to remove the idf 
> from the scoring equation in certain cases.
>
> Next, I make a change to the schema.xml file:
>
>  omitNorms="true">
>
>   
> 
>
> And apply that to the field in question:
>
>  indexed="true" stored="true" required="false" omitNorms="true" />
>
> But I think the patch was not applied correctly. I restarted
> and did a full import but the scores are exactly the same. Also, I tried 
> using the existing SweetSpotSimilarity:
>  omitNorms="true">
>   
>
> 
>
> But the scores remained unchanged even in that case. At this point, I'm not 
> quite sure how to debug this to see whether the problem is with the patch or 
> the similarity class but given that the SweetSpot similarity class didn't 
> work either, I'm inclined to think it was a problem with the patch.
>
> Any thoughts on this one?
>
> Thanks,
>
> Brian Lamb
>
>
>

Re: Similarity class for an individual field

2011-05-19 Thread Brian Lamb

I tried editing the SweetSpotSimilarity class located at
lucene/contrib/misc/src/java/org/apache/lucene/misc/SweetSpotSimilarity.java
to just return 1 for each function and the score does not change at all.
This has led me to believe that it does not recognize similarity at all. At
this point, all I have for similarity is the line at the end of the file to
apply similarity to all searches but that does not even work. So where am I
going wrong?

Thanks,

Brian Lamb

On Thu, May 19, 2011 at 3:41 PM, Brian Lamb
wrote:

> Also, I've tried adding:
>
> 
>
> To the end of the schema file so that it is applied globally but it does
> not appear to change the score either. What am I doing incorrectly?
>
> Thanks,
>
> Brian Lamb
>
> On Thu, May 19, 2011 at 2:45 PM, Brian Lamb  > wrote:
>
>> Hi all,
>>
>> Based on advice I received on a previous email thread, I applied patch
>> https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able
>> to apply a similarity class to certain fields but not all fields.
>>
>> I ran the following commands:
>>
>> $ cd 
>> $ svn up
>> $ wget 
>> https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
>> $ patch -p0 -i SOLR-2338.patch
>>
>> And I did not get any errors. I then created my own SimilarityClass listed 
>> below because it isn't very large:
>>
>> package org.apache.lucene.misc;
>> import org.apache.lucene.search.DefaultSimilarity;
>>
>> public class SimpleSimilarity extends DefaultSimilarity {
>>   public SimpleSimilarity() { super(); }
>>
>>
>>   public float idf(int dont, int care) { return 1; }
>> }
>>
>> As you can see, it isn't very complicated. I'm just trying to remove the idf 
>> from the scoring equation in certain cases.
>>
>>
>> Next, I make a change to the schema.xml file:
>>
>> > omitNorms="true">
>>
>>
>>   
>> 
>>
>> And apply that to the field in question:
>>
>> > indexed="true" stored="true" required="false" omitNorms="true" />
>>
>>
>> But I think the patch was not applied correctly. I
>> restarted and did a full import but the scores are exactly the same. Also, I 
>> tried using the existing SweetSpotSimilarity:
>> > omitNorms="true">
>>   
>>
>>
>> 
>>
>> But the scores remained unchanged even in that case. At this point, I'm not 
>> quite sure how to debug this to see whether the problem is with the patch or 
>> the similarity class but given that the SweetSpot similarity class didn't 
>> work either, I'm inclined to think it was a problem with the patch.
>>
>>
>> Any thoughts on this one?
>>
>> Thanks,
>>
>> Brian Lamb
>>
>>
>>
>

Re: Similarity class for an individual field

2011-05-20 Thread Brian Lamb

Yes. Was that not what I was supposed to do?

On Thu, May 19, 2011 at 8:26 PM, Koji Sekiguchi  wrote:

> (11/05/20 3:45), Brian Lamb wrote:
>
>> Hi all,
>>
>> Based on advice I received on a previous email thread, I applied patch
>> https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able
>> to
>> apply a similarity class to certain fields but not all fields.
>>
>> I ran the following commands:
>>
>> $ cd
>> $ svn up
>> $ wget
>> https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
>> $ patch -p0 -i SOLR-2338.patch
>>
>> And I did not get any errors. I then created my own SimilarityClass
>>
>
> Brian,
>
> I'm confused what you did because SOLR-2338 has been resolved in March and
> committed
> in trunk, but you did svn up & apply patch in your trunk?
>
> Koji
> --
> http://www.rondhuit.com/en/
>

Re: Similarity class for an individual field

2011-05-20 Thread Brian Lamb

So what was my mistake? I still have not resolved this issue.

On Fri, May 20, 2011 at 11:22 AM, Brian Lamb
wrote:

> Yes. Was that not what I was supposed to do?
>
>
> On Thu, May 19, 2011 at 8:26 PM, Koji Sekiguchi wrote:
>
>> (11/05/20 3:45), Brian Lamb wrote:
>>
>>> Hi all,
>>>
>>> Based on advice I received on a previous email thread, I applied patch
>>> https://issues.apache.org/jira/browse/SOLR-2338. My goal was to be able
>>> to
>>> apply a similarity class to certain fields but not all fields.
>>>
>>> I ran the following commands:
>>>
>>> $ cd
>>> $ svn up
>>> $ wget
>>> https://issues.apache.org/jira/secure/attachment/12475027/SOLR-2338.patch
>>> $ patch -p0 -i SOLR-2338.patch
>>>
>>> And I did not get any errors. I then created my own SimilarityClass
>>>
>>
>> Brian,
>>
>> I'm confused what you did because SOLR-2338 has been resolved in March and
>> committed
>> in trunk, but you did svn up & apply patch in your trunk?
>>
>> Koji
>> --
>> http://www.rondhuit.com/en/
>>
>
>

Similarity

2011-05-23 Thread Brian Lamb

Hi all,

I'm having trouble getting the basic similarity example to work. If you
notice at the bottom of the schema.xml file, there is a line there that is
commented out:



I uncomment that line and replace it with the following:



Which comes natively with lucene. However, the scores before and after
making this change are the same. I did a full import both times but that
didn't seem to help.

I ran svn up on both my solr directory and my lucene directory. Actually, my
lucene directory was not previously under svn so I removed everything in
there and did svn co
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/

So why isn't my installation taking the SweetSpot Similarity change?

Thanks,

Brian Lamb

Re: Similarity

2011-05-23 Thread Brian Lamb

Okay well this is encouraging. I changed SweetSpotSimilarity to
MyClassSimilarity. I created this class in:

lucene/contrib/misc/src/java/org/apache/lucene/misc/

I am getting a ClassNotFoundException when I try to start solr.

Here are the contents of the MyClassSimilarity file:

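package org.apache.lucene.misc;
import org.apache.lucene.search.DefaultSimilarity;

// Same as DefaultSimilarity except that idf is a constant.
public class MyClassSimilarity extends DefaultSimilarity {
  public MyClassSimilarity() { super(); }
  // Both arguments are ignored so that rare and common terms weigh the same.
  public float idf(int a1, int a2) { return 1; }
}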

So then this raises two questions. Why am I getting a ClassNotFoundException
and how can I go about fixing it?

Thanks,

Brian Lamb
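A small sketch of why a source file alone is not enough: a class named in the configuration has to be loaded by its fully qualified name at startup, and that lookup only succeeds if the compiled class is on the classpath (for example inside a jar that Solr loads). The helper isLoadable and the class ClassLookupSketch are hypothetical names used for illustration; the lookup shown is plain Java, not Solr's own loader.

```java
// Illustrates a by-name class lookup, the general mechanism behind a
// ClassNotFoundException. It does not use any Solr or Lucene code.
final class ClassLookupSketch {

    /** Returns true if a compiled class with this name is on the classpath. */
    static boolean isLoadable(String fullyQualifiedName) {
        try {
            Class.forName(fullyQualifiedName);
            return true;
        } catch (ClassNotFoundException e) {
            // The .java file may exist somewhere, but no compiled class was found.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.lang.String"));
        System.out.println(isLoadable("org.apache.lucene.misc.MyClassSimilarity"));
    }
}
```

Run outside of Solr, the second lookup fails for the same basic reason as in the message above: the name is valid but nothing compiled under that name is visible to the class loader.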

On Mon, May 23, 2011 at 3:41 PM, Markus Jelsma
wrote:

> As far as I know, SweetSpotSimilarity needs to be configured. I did use it once
> but
> wrapped a factory around it to configure the sweet spot. It worked just as
> expected and explained in that paper about the subject.
>
> If you use a custom similarity that, for example, caps tf to 1, does it
> then work?
>
>
>
> > Hi all,
> >
> > I'm having trouble getting the basic similarity example to work. If you
> > notice at the bottom of the schema.xml file, there is a line there that
> is
> > commented out:
> >
> > 
> >
> > I uncomment that line and replace it with the following:
> >
> > 
> >
> > Which comes natively with lucene. However, the scores before and after
> > making this change are the same. I did a full import both times but that
> > didn't seem to help.
> >
> > I ran svn up on both my solr directory and my lucene directory. Actually,
> > my lucene directory was not previously under svn so I removed everything
> > in there and did svn co
> > http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/
> >
> > So why isn't my installation taking the SweetSpot Similarity change?
> >
> > Thanks,
> >
> > Brian Lamb
>

Re: Similarity

2011-05-24 Thread Brian Lamb

This did the trick. Thanks!

On Mon, May 23, 2011 at 5:03 PM, Markus Jelsma
wrote:

> Hmm. I don't add code to Apache packages but create my own packages and
> namespaces, build a jar and add it to the lib directory as specified in
> solrconfig. Then you can use the FQCN in the similarity config to point to
> the class.
>
> Maybe it can work when messing inside the apache namespace but then you
> have to build Lucene as well.
>
>
> > Okay well this is encouraging. I changed SweetSpotSimilarity to
> > MyClassSimilarity. I created this class in:
> >
> > lucene/contrib/misc/src/java/org/apache/lucene/misc/
> >
> > I am getting a ClassNotFoundException when I try to start solr.
> >
> > Here are the contents of the MyClassSimilarity file:
> >
> > package org.apache.lucene.misc;
> > import org.apache.lucene.search.DefaultSimilarity;
> >
> > public class MyClassSimilarity extends DefaultSimilarity {
> >   public MyClassSimilarity() { super(); }
> >   public float idf(int a1, int a2) { return 1; }
> > }
> >
> > So then this raises two questions. Why am I getting a
> > ClassNotFoundException and how can I go about fixing it?
> >
> > Thanks,
> >
> > Brian Lamb
> >
> > On Mon, May 23, 2011 at 3:41 PM, Markus Jelsma
> >
> > wrote:
> > > As far as I know, SweetSpotSimilarity needs to be configured. I did use it
> > > once but
> > > wrapped a factory around it to configure the sweet spot. It worked just
> > > as expected and explained in that paper about the subject.
> > >
> > > If you use a custom similarity that, for example, caps tf to 1, does it
> > > then work?
> > >
> > > > Hi all,
> > > >
> > > > I'm having trouble getting the basic similarity example to work. If
> you
> > > > notice at the bottom of the schema.xml file, there is a line there
> that
> > >
> > > is
> > >
> > > > commented out:
> > > >
> > > > 
> > > >
> > > > I uncomment that line and replace it with the following:
> > > >
> > > > 
> > > >
> > > > Which comes natively with lucene. However, the scores before and
> after
> > > > making this change are the same. I did a full import both times but
> > > > that didn't seem to help.
> > > >
> > > > I ran svn up on both my solr directory and my lucene directory.
> > > > Actually, my lucene directory was not previously under svn so I
> > > > removed everything in there and did svn co
> > > > http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/
> > > >
> > > > So why isn't my installation taking the SweetSpot Similarity change?
> > > >
> > > > Thanks,
> > > >
> > > > Brian Lamb
>

Similarity per field

2011-05-25 Thread Brian Lamb

Hi all,

I sent a mail in about this topic a week ago but now that I have more
information about what I am doing, as well as a better understanding of how
the similarity class works, I wanted to start a new thread with a bit more
information about what I'm doing, what I want to do, and how I can make it
work correctly.

I have written a similarity class that I would like applied to a specific
field.

This is how I am defining the fieldType:


   
 
 
   
   


And then I assign a specific field to that fieldType:



Then, I restarted solr and did a fullimport. However, the changes I have
made do not appear to be taking hold. For simplicity, right now I just have
the idf function returning 1. When I do a search with debugQuery=on, the idf
behaves as it normally does. However, when I search on this field, the idf
should be 1 and that is not the case.

To try and nail down where the problem occurs, I commented out the
similarity class definition in the fieldType and added it globally to the
schema file:



Then, I restarted solr and did a fullimport. This time, the idf scores were
all 1. So it seems to me the problem is not with my similarity class but in
trying to apply it to a specific fieldType.

According to https://issues.apache.org/jira/browse/SOLR-2338, this should be
in the trunk now yes? I have run svn up on both my lucene and solr installs
and it still is not recognizing it on a per field basis.

Is the tag different inside a fieldType? Did I not update solr correctly?
Where is my mistake?

Thanks,

Brian Lamb

Re: Similarity per field

2011-05-25 Thread Brian Lamb

I looked at the patch page and saw the files that were changed. I went into
my install and looked at those same files and found that they had indeed
been changed. So it looks like I have the correct version of solr.

On Wed, May 25, 2011 at 1:01 PM, Brian Lamb
wrote:

> Hi all,
>
> I sent a mail in about this topic a week ago but now that I have more
> information about what I am doing, as well as a better understanding of how
> the similarity class works, I wanted to start a new thread with a bit more
> information about what I'm doing, what I want to do, and how I can make it
> work correctly.
>
> I have written a similarity class that I would like applied to a specific
> field.
>
> This is how I am defining the fieldType:
>
>  positionIncrementGap="1000">
>
>  
>   maxGramSize="1" side="front" />
>
>
> 
>
> And then I assign a specific field to that fieldType:
>
>  indexed="true" stored="true" required="false" omitNorms="true" />
>
> Then, I restarted solr and did a fullimport. However, the changes I have
> made do not appear to be taking hold. For simplicity, right now I just have
> the idf function returning 1. When I do a search with debugQuery=on, the idf
> behaves as it normally does. However, when I search on this field, the idf
> should be 1 and that is not the case.
>
> To try and nail down where the problem occurs, I commented out the
> similarity class definition in the fieldType and added it globally to the
> schema file:
>
> 
>
> Then, I restarted solr and did a fullimport. This time, the idf scores were
> all 1. So it seems to me the problem is not with my similarity class but in
> trying to apply it to a specific fieldType.
>
> According to https://issues.apache.org/jira/browse/SOLR-2338, this should
> be in the trunk now yes? I have run svn up on both my lucene and solr
> installs and it still is not recognizing it on a per field basis.
>
> Is the tag different inside a fieldType? Did I not update solr correctly?
> Where is my mistake?
>
> Thanks,
>
> Brian Lamb
>

Edgengram

2011-05-25 Thread Brian Lamb

Hi all,

I'm running into some confusion with the way edgengram works. I have the
field set up as:


   
 
   
   


I've also set up my own similarity class that returns 1 as the idf score.
What I've found this does is if I match a string "abcdefg" against a field
containing "abcdefghijklmnop", then the idf will score that as a 7:

7.0 = idf(myfield: a=51 ab=23 abc=2 abcd=2 abcde=2 abcdef=2 abcdefg=2)

I get why that's happening, but is there a way to avoid that? Do I need to
do a new field type to achieve the desired effect?

Thanks,

Brian Lamb
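A minimal sketch of where the 7 comes from, assuming the query string is run through the same front edge n-gram analysis as the field (minimum gram size 1, as the "a=51" entry in the explain output suggests). Each gram becomes its own term and, judging by the explain line, the per-term idf values are added up, so a constant idf of 1 yields one point per gram. The class and method names are hypothetical and no Lucene code is used.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: reproduces the gram count, not Lucene's scoring.
final class EdgeNGramIdfSketch {

    /** Front edge n-grams of a token, from minGram up to maxGram characters. */
    static List<String> frontEdgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    /** Sum of per-term idf values when every term's idf is the constant 1. */
    static float summedFlatIdf(List<String> terms) {
        float sum = 0f;
        for (int i = 0; i < terms.size(); i++) {
            sum += 1.0f;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> grams = frontEdgeNGrams("abcdefg", 1, 100);
        System.out.println(grams + " -> summed idf " + summedFlatIdf(grams));
    }
}
```

Under these assumptions the summed value grows with the length of the query string, which matches the behavior described above.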

Re: Edgengram

2011-05-27 Thread Brian Lamb

For this, I ended up just changing it to string and using "abcdefg*" to
match. That seems to work so far.

Thanks,

Brian Lamb
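A rough sketch of what the trailing-wildcard match on an unanalyzed string field amounts to: a plain prefix test on the stored value. The names below are hypothetical. Because a string field is not analyzed, the comparison is case-sensitive, so this only works when the generated query uses exactly the same casing as the indexed value.

```java
// Mimics a query such as abcdefg* against a single unanalyzed value.
final class PrefixMatchSketch {

    /** True when the indexed value starts with the query prefix, character for character. */
    static boolean matchesPrefix(String indexedValue, String prefix) {
        return indexedValue.startsWith(prefix);
    }

    public static void main(String[] args) {
        System.out.println(matchesPrefix("abcdefghijklmnop", "abcdefg"));
        // A different case in the indexed value is a different term.
        System.out.println(matchesPrefix("Abcdefghijklmnop", "abcdefg"));
    }
}
```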

On Wed, May 25, 2011 at 4:53 PM, Brian Lamb
wrote:

> Hi all,
>
> I'm running into some confusion with the way edgengram works. I have the
> field set up as:
>
>  positionIncrementGap="1000">
>
>  
> maxGramSize="100" side="front" />
>
> 
>
> I've also set up my own similarity class that returns 1 as the idf score.
> What I've found this does is if I match a string "abcdefg" against a field
> containing "abcdefghijklmnop", then the idf will score that as a 7:
>
> 7.0 = idf(myfield: a=51 ab=23 abc=2 abcd=2 abcde=2 abcdef=2 abcdefg=2)
>
> I get why that's happening, but is there a way to avoid that? Do I need to
> do a new field type to achieve the desired effect?
>
> Thanks,
>
> Brian Lamb
>

Re: Similarity per field

2011-05-27 Thread Brian Lamb

I'm still not having any luck with this. Has anyone actually gotten this to
work so far? I feel like I've followed the directions to the letter but it
just doesn't work.

Thanks,

Brian Lamb

On Wed, May 25, 2011 at 2:48 PM, Brian Lamb
wrote:

> I looked at the patch page and saw the files that were changed. I went into
> my install and looked at those same files and found that they had indeed
> been changed. So it looks like I have the correct version of solr.
>
>
> On Wed, May 25, 2011 at 1:01 PM, Brian Lamb  > wrote:
>
>> Hi all,
>>
>> I sent a mail in about this topic a week ago but now that I have more
>> information about what I am doing, as well as a better understanding of how
>> the similarity class works, I wanted to start a new thread with a bit more
>> information about what I'm doing, what I want to do, and how I can make it
>> work correctly.
>>
>> I have written a similarity class that I would like applied to a specific
>> field.
>>
>> This is how I am defining the fieldType:
>>
>> > positionIncrementGap="1000">
>>
>>  
>>  > maxGramSize="1" side="front" />
>>
>>
>> 
>>
>> And then I assign a specific field to that fieldType:
>>
>> > indexed="true" stored="true" required="false" omitNorms="true" />
>>
>> Then, I restarted solr and did a fullimport. However, the changes I have
>> made do not appear to be taking hold. For simplicity, right now I just have
>> the idf function returning 1. When I do a search with debugQuery=on, the idf
>> behaves as it normally does. However, when I search on this field, the idf
>> should be 1 and that is not the case.
>>
>> To try and nail down where the problem occurs, I commented out the
>> similarity class definition in the fieldType and added it globally to the
>> schema file:
>>
>> 
>>
>> Then, I restarted solr and did a fullimport. This time, the idf scores
>> were all 1. So it seems to me the problem is not with my similarity class
>> but in trying to apply it to a specific fieldType.
>>
>> According to https://issues.apache.org/jira/browse/SOLR-2338, this should
>> be in the trunk now yes? I have run svn up on both my lucene and solr
>> installs and it still is not recognizing it on a per field basis.
>>
>> Is the tag different inside a fieldType? Did I not update solr correctly?
>> Where is my mistake?
>>
>> Thanks,
>>
>> Brian Lamb
>>
>
>

Explain the difference in similarity and similarityProvider

2011-05-30 Thread Brian Lamb

I'm looking over the patch notes from
https://issues.apache.org/jira/browse/SOLR-2338 and I do not understand the
difference between


  param value


and


  is there an echo?


When would I use one over the other?

Thanks,

Brian Lamb

Re: Edgengram

2011-05-31 Thread Brian Lamb

In this particular case, I will be doing a solr search based on user
preferences. So I will not be depending on the user to type "abcdefg". That
will be automatically generated based on user selections.

The contents of the field do not contain spaces and since I am creating the
search parameters, case isn't important either.

Thanks,

Brian Lamb

On Tue, May 31, 2011 at 9:44 AM, Erick Erickson wrote:

> That'll work for your case, although be aware that string types aren't
> analyzed at all,
> so case matters, as do spaces etc.
>
> What is the use-case here? If you explain it a bit there might be
> better answers
>
> Best
> Erick
>
> On Fri, May 27, 2011 at 9:17 AM, Brian Lamb
>  wrote:
> > For this, I ended up just changing it to string and using "abcdefg*" to
> > match. That seems to work so far.
> >
> > Thanks,
> >
> > Brian Lamb
> >
> > On Wed, May 25, 2011 at 4:53 PM, Brian Lamb
> > wrote:
> >
> >> Hi all,
> >>
> >> I'm running into some confusion with the way edgengram works. I have the
> >> field set up as:
> >>
> >>  >> positionIncrementGap="1000">
> >>
> >>  
> >> >> maxGramSize="100" side="front" />
> >>
> >> 
> >>
> >> I've also set up my own similarity class that returns 1 as the idf
> score.
> >> What I've found this does is if I match a string "abcdefg" against a
> field
> >> containing "abcdefghijklmnop", then the idf will score that as a 7:
> >>
> >> 7.0 = idf(myfield: a=51 ab=23 abc=2 abcd=2 abcde=2 abcdef=2 abcdefg=2)
> >>
> >> I get why that's happening, but is there a way to avoid that? Do I need
> to
> >> do a new field type to achieve the desired effect?
> >>
> >> Thanks,
> >>
> >> Brian Lamb
> >>
> >
>

Re: Edgengram

2011-05-31 Thread Brian Lamb


   
 
 
   


I believe I used that link when I initially set up the field and it worked
great (and I'm still using it in other places). In this particular example,
however, it does not appear to be practical for me. I mentioned that I have a
similarity class that returns 1 for the idf, and in the case of an edgengram
the summed idf comes out as 1 * the length of the search string.
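
To make the scoring effect described above concrete, here is a minimal Python sketch (not Solr code). It assumes a front-side edge n-gram filter with minGramSize=1 and a similarity that returns 1 as the idf for every term, as described in this thread; the helper names edge_ngrams and summed_idf are made up for illustration.

```python
def edge_ngrams(term, min_gram=1, max_gram=25):
    """Return the front-side edge n-grams of a term, shortest first."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


def summed_idf(query_terms, indexed_terms, idf=1.0):
    """Sum a constant idf over every query term that is present in the index."""
    return sum(idf for t in query_terms if t in indexed_terms)


# Index-time analysis of the stored value produces one term per prefix.
indexed = set(edge_ngrams("abcdefghijklmnop"))

# If the query is n-grammed as well, every prefix of "abcdefg" matches.
ngrammed_query = edge_ngrams("abcdefg")
print(summed_idf(ngrammed_query, indexed))

# If the query is kept as a single term, only that one term matches.
print(summed_idf(["abcdefg"], indexed))
```

The point of the sketch is only that the number of matching query terms, and therefore the summed idf, grows with the query length when the n-gram filter is also applied at query time.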

Thanks,

Brian Lamb

On Tue, May 31, 2011 at 11:34 AM, bmdakshinamur...@gmail.com <
bmdakshinamur...@gmail.com> wrote:

> Can you specify the analyzer you are using for your queries?
>
> May be you could use a KeywordAnalyzer for your queries so you don't end up
> matching parts of your query.
>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> This should help you.
>
> On Tue, May 31, 2011 at 8:24 PM, Brian Lamb
> wrote:
>
> > In this particular case, I will be doing a solr search based on user
> > preferences. So I will not be depending on the user to type "abcdefg".
> That
> > will be automatically generated based on user selections.
> >
> > The contents of the field do not contain spaces, and since I am creating the
> > search parameters, case isn't important either.
> >
> > Thanks,
> >
> > Brian Lamb
> >
> > On Tue, May 31, 2011 at 9:44 AM, Erick Erickson wrote:
> >
> > > That'll work for your case, although be aware that string types aren't
> > > analyzed at all,
> > > so case matters, as do spaces etc.....
> > >
> > > What is the use-case here? If you explain it a bit there might be
> > > better answers
> > >
> > > Best
> > > Erick
> > >
> > > On Fri, May 27, 2011 at 9:17 AM, Brian Lamb
> > >  wrote:
> > > > For this, I ended up just changing it to string and using "abcdefg*"
> to
> > > > match. That seems to work so far.
> > > >
> > > > Thanks,
> > > >
> > > > Brian Lamb
> > > >
> > > > On Wed, May 25, 2011 at 4:53 PM, Brian Lamb
> > > > wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I'm running into some confusion with the way edgengram works. I have
> > the
> > > >> field set up as:
> > > >>
> > > >>  > > >> positionIncrementGap="1000">
> > > >>
> > > >>  
> > > >> > > >> maxGramSize="100" side="front" />
> > > >>
> > > >> 
> > > >>
> > > >> I've also set up my own similarity class that returns 1 as the idf
> > > score.
> > > >> What I've found this does is if I match a string "abcdefg" against a
> > > field
> > > >> containing "abcdefghijklmnop", then the idf will score that as a 7:
> > > >>
> > > >> 7.0 = idf(myfield: a=51 ab=23 abc=2 abcd=2 abcde=2 abcdef=2
> abcdefg=2)
> > > >>
> > > >> I get why that's happening, but is there a way to avoid that? Do I
> > need
> > > to
> > > > >> do a new field type to achieve the desired effect?
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Brian Lamb
> > > >>
> > > >
> > >
> >
>
>
>
> --
> Thanks and Regards,
> DakshinaMurthy BM
>

Re: Edgengram

2011-06-01 Thread Brian Lamb

Hi Tomás,

Thank you very much for your suggestion. I took another crack at it using
your recommendation and it worked ideally. The only thing I had to change
was


  


to


  


The first did not produce any results but the second worked beautifully.

Thanks!

Brian Lamb

2011/5/31 Tomás Fernández Löbbe 

> ...or also use the LowerCaseTokenizerFactory at query time for consistency,
> but not the edge ngram filter.
>
> 2011/5/31 Tomás Fernández Löbbe 
>
> > Hi Brian, I don't know if I understand what you are trying to achieve. You
> > want the term query "abcdefg" to have an idf of 1 instead of 7? I think using
> > the KeywordTokenizerFactory at query time should work. It would be
> > something like:
> >
> >  > positionIncrementGap="1000">
> >   
> >
> > 
> >  > maxGramSize="25" side="front" />
> >   
> >   
> >   
> >   
> > 
> >
> > this way, at query time "abcdefg" won't be turned to "a ab abc abcd abcde
> > abcdef abcdefg". At index time it will.
> >
> > Regards,
> > Tomás
> >
> >
> > On Tue, May 31, 2011 at 1:07 PM, Brian Lamb <
> brian.l...@journalexperts.com
> > > wrote:
> >
> >>  >> positionIncrementGap="1000">
> >>   
> >> 
> >>  >> maxGramSize="25" side="front" />
> >>   
> >> 
> >>
> >> I believe I used that link when I initially set up the field and it
> worked
> >> great (and I'm still using it in other places). In this particular
> example
> >> however it does not appear to be practical for me. I mentioned that I
> have
> >> a
> >> similarity class that returns 1 for the idf and in the case of an
> >> edgengram,
> >> it returns 1 * length of the search string.
> >>
> >> Thanks,
> >>
> >> Brian Lamb
> >>
> >> On Tue, May 31, 2011 at 11:34 AM, bmdakshinamur...@gmail.com <
> >> bmdakshinamur...@gmail.com> wrote:
> >>
> >> > Can you specify the analyzer you are using for your queries?
> >> >
> >> > May be you could use a KeywordAnalyzer for your queries so you don't
> end
> >> up
> >> > matching parts of your query.
> >> >
> >> >
> >>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> >> > This should help you.
> >> >
> >> > On Tue, May 31, 2011 at 8:24 PM, Brian Lamb
> >> > wrote:
> >> >
> >> > > In this particular case, I will be doing a solr search based on user
> >> > > preferences. So I will not be depending on the user to type
> "abcdefg".
> >> > That
> >> > > will be automatically generated based on user selections.
> >> > >
> >> > > The contents of the field do not contain spaces, and since I am creating
> >> > > the search parameters, case isn't important either.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Brian Lamb
> >> > >
> >> > > On Tue, May 31, 2011 at 9:44 AM, Erick Erickson <
> >> erickerick...@gmail.com
> >> > > >wrote:
> >> > >
> >> > > > That'll work for your case, although be aware that string types
> >> aren't
> >> > > > analyzed at all,
> >> > > > so case matters, as do spaces etc.
> >> > > >
> >> > > > What is the use-case here? If you explain it a bit there might be
> >> > > > better answers
> >> > > >
> >> > > > Best
> >> > > > Erick
> >> > > >
> >> > > > On Fri, May 27, 2011 at 9:17 AM, Brian Lamb
> >> > > >  wrote:
> >> > > > > For this, I ended up just changing it to string and using
> >> "abcdefg*"
> >> > to
> >> > > > > match. That seems to work so far.
> >> > > > >
> >> > > > > Thanks,
> >> > > > >
> >> > > > > Brian Lamb
> >> > > > >
> >> > > > > On Wed, May 25, 2011 at 4:53 PM, Brian Lamb
> >> > > > > wrote:
> >> > > > >
> >> > > > >> Hi all,
> >> > > > >>
> >> > > > >> I'm running into some confusion with the way edgengram works. I
> >> have
> >> > > the
> >> > > > >> field set up as:
> >> > > > >>
> >> > > > >>  >> > > > >> positionIncrementGap="1000">
> >> > > > >>
> >> > > > >>  
> >> > > > >> >> minGramSize="1"
> >> > > > >> maxGramSize="100" side="front" />
> >> > > > >>
> >> > > > >> 
> >> > > > >>
> >> > > > >> I've also set up my own similarity class that returns 1 as the
> >> idf
> >> > > > score.
> >> > > > >> What I've found this does is if I match a string "abcdefg"
> >> against a
> >> > > > field
> >> > > > >> containing "abcdefghijklmnop", then the idf will score that as
> a
> >> 7:
> >> > > > >>
> >> > > > >> 7.0 = idf(myfield: a=51 ab=23 abc=2 abcd=2 abcde=2 abcdef=2
> >> > abcdefg=2)
> >> > > > >>
> >> > > > >> I get why that's happening, but is there a way to avoid that?
> Do
> >> I
> >> > > need
> >> > > > to
> >> > > > >> do a new field type to achieve the desired effect?
> >> > > > >>
> >> > > > >> Thanks,
> >> > > > >>
> >> > > > >> Brian Lamb
> >> > > > >>
> >> > > > >
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Thanks and Regards,
> >> > DakshinaMurthy BM
> >> >
> >>
> >
> >
>

Re: Edgengram

2011-06-01 Thread Brian Lamb

I think in my case LowerCaseTokenizerFactory will be sufficient because
there will never be spaces in this particular field. But thank you for the
useful link!
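
The difference between the two tokenizers discussed in this thread can be approximated outside Solr. The following Python sketch only imitates the behavior described in Erick's message quoted below (splitting on non-letters and lowercasing, versus keeping the whole input as one lowercased token); the function names are hypothetical and this is not how Solr implements either factory.

```python
import re


def lowercase_tokenizer(text):
    """Approximate LowerCaseTokenizer: split on non-letters, lowercase tokens."""
    return [tok.lower() for tok in re.findall(r"[^\W\d_]+", text)]


def keyword_then_lowercase(text):
    """Approximate KeywordTokenizer + LowerCaseFilter: one lowercased token."""
    return [text.lower()]


sample = "Intelligence can't be MeaSurEd"
print(lowercase_tokenizer(sample))
print(keyword_then_lowercase(sample))

# With no spaces or punctuation in the value, both yield a single token.
print(lowercase_tokenizer("AbCdEfG") == keyword_then_lowercase("AbCdEfG"))
```

For a field whose values never contain spaces or punctuation, the two approaches give the same single token, which is why either works in that situation.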

Thanks,

Brian Lamb

On Wed, Jun 1, 2011 at 11:44 AM, Erick Erickson wrote:

> Be a little careful here. LowerCaseTokenizerFactory is different than
> KeywordTokenizerFactory.
>
> LowerCaseTokenizerFactory will give you more than one term. e.g.
> the string "Intelligence can't be MeaSurEd" will give you 5 terms,
> any of which may match. i.e.
> "intelligence", "can", "t", "be", "measured".
> whereas KeywordTokenizerFactory followed, by, say LowerCaseFilter
> would give you exactly one token:
> "intelligence can't be measured".
>
> So searching for "measured" would get a hit in the first case but
> not in the second. Searching for "intellig*" would hit both.
>
> Neither is better, just make sure they do what you want!
>
> This page will help a lot:
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.LowerCaseTokenizerFactory
> as will the admin/analysis page.
>
> Best
> Erick
>
> On Wed, Jun 1, 2011 at 10:43 AM, Brian Lamb
>  wrote:
> > Hi Tomás,
> >
> > Thank you very much for your suggestion. I took another crack at it using
> > your recommendation and it worked ideally. The only thing I had to change
> > was
> >
> > 
> >  
> > 
> >
> > to
> >
> > 
> >  
> > 
> >
> > The first did not produce any results but the second worked beautifully.
> >
> > Thanks!
> >
> > Brian Lamb
> >
> > 2011/5/31 Tomás Fernández Löbbe 
> >
> >> ...or also use the LowerCaseTokenizerFactory at query time for
> consistency,
> >> but not the edge ngram filter.
> >>
> >> 2011/5/31 Tomás Fernández Löbbe 
> >>
> >> > Hi Brian, I don't know if I understand what you are trying to achieve. You
> >> > want the term query "abcdefg" to have an idf of 1 instead of 7? I think using
> >> > the KeywordTokenizerFactory at query time should work. It would be
> >> > something like:
> >> >
> >> >  >> > positionIncrementGap="1000">
> >> >   
> >> >
> >> > 
> >> >  >> > maxGramSize="25" side="front" />
> >> >   
> >> >   
> >> >   
> >> >   
> >> > 
> >> >
> >> > this way, at query time "abcdefg" won't be turned to "a ab abc abcd
> abcde
> >> > abcdef abcdefg". At index time it will.
> >> >
> >> > Regards,
> >> > Tomás
> >> >
> >> >
> >> > On Tue, May 31, 2011 at 1:07 PM, Brian Lamb <
> >> brian.l...@journalexperts.com
> >> > > wrote:
> >> >
> >> >>  >> >> positionIncrementGap="1000">
> >> >>   
> >> >> 
> >> >>  >> >> maxGramSize="25" side="front" />
> >> >>   
> >> >> 
> >> >>
> >> >> I believe I used that link when I initially set up the field and it
> >> worked
> >> >> great (and I'm still using it in other places). In this particular
> >> example
> >> >> however it does not appear to be practical for me. I mentioned that I
> >> have
> >> >> a
> >> >> similarity class that returns 1 for the idf and in the case of an
> >> >> edgengram,
> >> >> it returns 1 * length of the search string.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Brian Lamb
> >> >>
> >> >> On Tue, May 31, 2011 at 11:34 AM, bmdakshinamur...@gmail.com <
> >> >> bmdakshinamur...@gmail.com> wrote:
> >> >>
> >> >> > Can you specify the analyzer you are using for your queries?
> >> >> >
> >> >> > May be you could use a KeywordAnalyzer for your queries so you
> don't
> >> end
> >> >> up
> >> >> > matching parts of your query.
> >> >> >
> >> >> >
> >> >>
> >>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> >> >> > This should

Searching using a PDF

2011-06-01 Thread Brian Lamb

Is it possible to do a search based on a PDF file? I know it's possible to
update the index with a PDF, but can you do just a regular search with it?

Thanks,

Brian Lamb

Re: Searching using a PDF

2011-06-02 Thread Brian Lamb

I mean instead of typing http://localhost:8983/?q=mysearch, I would send a
PDF file with the contents of "mysearch" and search based on that. I am
leaning toward handling this before it hits solr however.

Thanks,

Brian Lamb

On Wed, Jun 1, 2011 at 3:52 PM, Erick Erickson wrote:

> I'm not quite sure what you mean by "regular search". When
> you index a PDF (Presumably through Tika or Solr Cell) the text
> is indexed into your index and you can certainly search that. Additionally,
> there may be meta data indexed in specific fields (e.g. author,
> date modified, etc).
>
> But what does "search based on a PDF file" mean in your context?
>
> Best
> Erick
>
> On Wed, Jun 1, 2011 at 3:41 PM, Brian Lamb
>  wrote:
> > Is it possible to do a search based on a PDF file? I know it's possible to
> > update the index with a PDF but can you do just a regular search with it?
> >
> > Thanks,
> >
> > Brian Lamb
> >
>

Default query parser operator

2011-06-06 Thread Brian Lamb

Hi all,

Is it possible to change the query parser operator for a specific field
without having to explicitly type it in the search field?

For example, I'd like to use:

http://localhost:8983/solr/search/?q=field1:word token field2:parser syntax


instead of

http://localhost:8983/solr/search/?q=field1:word AND token field2:parser
syntax

But, I only want it to be applied to field1, not field2 and I want the
operator to always be AND unless the user explicitly types in OR.

Thanks,

Brian Lamb

Re: Default query parser operator

2011-06-07 Thread Brian Lamb

I feel like this should be fairly easy to do, but I just don't see anything
in the documentation on how to do this. Perhaps I am using the wrong search
parameters.

On Mon, Jun 6, 2011 at 12:19 PM, Brian Lamb
wrote:

> Hi all,
>
> Is it possible to change the query parser operator for a specific field
> without having to explicitly type it in the search field?
>
> For example, I'd like to use:
>
> http://localhost:8983/solr/search/?q=field1:word token field2:parser
> syntax
>
> instead of
>
> http://localhost:8983/solr/search/?q=field1:word AND token field2:parser
> syntax
>
> But, I only want it to be applied to field1, not field2 and I want the
> operator to always be AND unless the user explicitly types in OR.
>
> Thanks,
>
> Brian Lamb
>

Re: Default query parser operator

2011-06-07 Thread Brian Lamb

Hi Jonathan,

Thank you for your reply. Your point about my example is a good one. So let
me try to restate using your example. Suppose I want to apply AND to any
search terms within field1.

Then

field1:foo field2:bar field1:baz field2:bom

would be written as

http://localhost:8983/solr/?q=field1:foo OR field2:bar OR field1:baz OR
field2:bom

But if they were written together like:

http://localhost:8983/solr/?q=field1:(foo baz) field2:(bar bom)

I would want it to be

http://localhost:8983/solr/?q=field1:(foo AND baz) OR field2:(bar OR bom)

But it sounds like you are saying that would not be possible.

Thanks,

Brian Lamb

On Tue, Jun 7, 2011 at 11:27 AM, Jonathan Rochkind  wrote:

> Nope, not possible.
>
> I'm not even sure what it would mean semantically. If you had default
> operator "OR" ordinarily, but default operator "AND" just for "field2", then
> what would happen if you entered:
>
> field1:foo field2:bar field1:baz field2:bom
>
> Where the heck would the ANDs and ORs go?  The operators are BETWEEN the
> clauses that specify fields, they don't belong to a field. In general, the
> operators are part of the query as a whole, not any specific field.
>
> In fact, I'd be careful of your example query:
>    q=field1:foo bar field2:baz
>
> I don't think that means what you think it means, I don't think the
> "field1" applies to the "bar" in that case. Although I could be wrong, but
> you definitely want to check it.  You need "field1:foo field1:bar", or set
> the default field for the query to "field1", or use parens (although that
> will change the execution strategy and ranking): q=field1:(foo bar)   
>
> At any rate, even if there's a way to specify this so it makes sense, no,
> Solr/lucene doesn't support any such thing.
>
>
>
>
> On 6/7/2011 10:56 AM, Brian Lamb wrote:
>
>> I feel like this should be fairly easy to do but I just don't see anything
>> in the documentation on how to do this. Perhaps I am using the wrong
>> search
>> parameters.
>>
>> On Mon, Jun 6, 2011 at 12:19 PM, Brian Lamb
>> wrote:
>>
>>  Hi all,
>>>
>>> Is it possible to change the query parser operator for a specific field
>>> without having to explicitly type it in the search field?
>>>
>>> For example, I'd like to use:
>>>
>>> http://localhost:8983/solr/search/?q=field1:word token field2:parser
>>> syntax
>>>
>>> instead of
>>>
>>> http://localhost:8983/solr/search/?q=field1:word AND token field2:parser
>>> syntax
>>>
>>> But, I only want it to be applied to field1, not field2 and I want the
>>> operator to always be AND unless the user explicitly types in OR.
>>>
>>> Thanks,
>>>
>>> Brian Lamb
>>>
>>>

Re: Default query parser operator

2011-06-10 Thread Brian Lamb

It could. It would be a little bit clunky, but that's the direction I'm
heading.
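
As a rough idea of what such front-end logic might look like, here is a minimal Python sketch. It assumes the application already has the user's terms grouped per field; the function name build_query and its parameters are hypothetical. It does no escaping of special query characters and does not handle a user explicitly typing OR, both of which a real front end would need to consider.

```python
def build_query(field_terms, and_fields=()):
    """Build a query string from a {field: [terms]} mapping.

    Terms inside a field listed in and_fields are joined with AND; terms in
    all other fields are joined with OR. The per-field clauses themselves are
    joined with OR.
    """
    clauses = []
    for field, terms in field_terms.items():
        op = " AND " if field in and_fields else " OR "
        clauses.append(f"{field}:({op.join(terms)})")
    return " OR ".join(clauses)


query = build_query(
    {"field1": ["foo", "baz"], "field2": ["bar", "bom"]},
    and_fields={"field1"},
)
print(query)
```

The resulting string would still need to be URL-encoded before being sent as the q parameter.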

On Tue, Jun 7, 2011 at 6:05 PM, lee carroll wrote:

> Hi Brian could your front end app do this field query logic?
>
> (assuming you have an app in front of solr)
>
>
>
> On 7 June 2011 18:53, Jonathan Rochkind  wrote:
> > There's no feature in Solr to do what you ask, no. I don't think.
> >
> > On 6/7/2011 1:30 PM, Brian Lamb wrote:
> >>
> >> Hi Jonathan,
> >>
> >> Thank you for your reply. Your point about my example is a good one. So
> >> let
> >> me try to restate using your example. Suppose I want to apply AND to any
> >> search terms within field1.
> >>
> >> Then
> >>
> >> field1:foo field2:bar field1:baz field2:bom
> >>
> >> would be written as
> >>
> >> http://localhost:8983/solr/?q=field1:foo OR field2:bar OR field1:baz OR
> >> field2:bom
> >>
> >> But if they were written together like:
> >>
> >> http://localhost:8983/solr/?q=field1:(foo baz) field2:(bar bom)
> >>
> >> I would want it to be
> >>
> >> http://localhost:8983/solr/?q=field1:(foo AND baz) OR field2:(bar OR
> bom)
> >>
> >> But it sounds like you are saying that would not be possible.
> >>
> >> Thanks,
> >>
> >> Brian Lamb
> >>
> >> On Tue, Jun 7, 2011 at 11:27 AM, Jonathan Rochkind
> >>  wrote:
> >>
> >>> Nope, not possible.
> >>>
> >>> I'm not even sure what it would mean semantically. If you had default
> >>> operator "OR" ordinarily, but default operator "AND" just for "field2",
> >>> then
> >>> what would happen if you entered:
> >>>
> >>> field1:foo field2:bar field1:baz field2:bom
> >>>
> >>> Where the heck would the ANDs and ORs go?  The operators are BETWEEN
> the
> >>> clauses that specify fields, they don't belong to a field. In general,
> >>> the
> >>> operators are part of the query as a whole, not any specific field.
> >>>
> >>> In fact, I'd be careful of your example query:
> >>>    q=field1:foo bar field2:baz
> >>>
> >>> I don't think that means what you think it means, I don't think the
> >>> "field1" applies to the "bar" in that case. Although I could be wrong,
> >>> but
> >>> you definitely want to check it.  You need "field1:foo field1:bar", or
> >>> set
> >>> the default field for the query to "field1", or use parens (although
> that
> >>> will change the execution strategy and ranking): q=field1:(foo bar)
> >>> 
> >>>
> >>> At any rate, even if there's a way to specify this so it makes sense,
> no,
> >>> Solr/lucene doesn't support any such thing.
> >>>
> >>>
> >>>
> >>>
> >>> On 6/7/2011 10:56 AM, Brian Lamb wrote:
> >>>
> >>>> I feel like this should be fairly easy to do but I just don't see
> >>>> anything
> >>>> in the documentation on how to do this. Perhaps I am using the wrong
> >>>> search
> >>>> parameters.
> >>>>
> >>>> On Mon, Jun 6, 2011 at 12:19 PM, Brian Lamb
> >>>> wrote:
> >>>>
> >>>>  Hi all,
> >>>>>
> >>>>> Is it possible to change the query parser operator for a specific
> field
> >>>>> without having to explicitly type it in the search field?
> >>>>>
> >>>>> For example, I'd like to use:
> >>>>>
> >>>>> http://localhost:8983/solr/search/?q=field1:word token field2:parser
> >>>>> syntax
> >>>>>
> >>>>> instead of
> >>>>>
> >>>>> http://localhost:8983/solr/search/?q=field1:word AND token
> >>>>> field2:parser
> >>>>> syntax
> >>>>>
> >>>>> But, I only want it to be applied to field1, not field2 and I want
> the
> >>>>> operator to always be AND unless the user explicitly types in OR.
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Brian Lamb
> >>>>>
> >>>>>
> >
>

Reject URL requests unless from localhost for dataimport

2011-06-24 Thread Brian Lamb

Hi all,

My solr server is currently set up at www.mysite.com:8983/solr. I would like
to keep this for the time being but I would like to restrict users from
going to www.mysite.com:8983/solr/dataimport. In that case, I would only
want to be able to do localhost:8983/solr/dataimport. Is this possible? If
so, where should I look for a guide?

Thanks,

Brian Lamb

Records disappearing

2011-06-28 Thread Brian Lamb

Hi all,

I'm seeing some weird behavior with my dataimport script. Because of memory
issues, I've taken to doing delta imports by running a fullimport with
clean=false. My dataimport config file is set up like:


  




  


I've found that one record (possibly more that I haven't noticed) keeps
disappearing from the index. I will do a fullimport&clean=false and search,
and the record will be there. I'll search again a few hours later and it's
still there. But then, all of a sudden, it's gone. I don't know what is
triggering that one record's disappearance, but it is quite annoying. Any
ideas what's going on?

Thanks,

Brian Lamb

Rounding errors in solr

2011-07-22 Thread Brian Lamb

Hi all,

I've noticed some peculiar scoring issues going on in my application. For
example, I have a field that is multivalued and has several records that
have the same value. For example,


  National Society of Animal Lovers
  Nat. Soc. of Ani. Lov.


I have about 300 records with that exact value.

Now, when I do a search for references:(national society animal lovers), I
get the following results:

252
159
82
452
105

When I do a search for references:(nat soc ani lov), I get the results
ordered differently:

510
122
501
82
252

When I load all the records that match, I notice that at some point, the
scores aren't the same but differ by only a little:

1.471928 in one and the one before it was 1.471929

I turned on debugQuery=on and the scores for each of those two records are
exactly the same. Therefore, I think there is some kind of rounding error
going on.

Is there a way I can fix this?

Alternatively, can I sort by a rounded version of the score? I tried
sort=round(score,5) but I get the following message:

Can't determine Sort Order: 'round(score,5) ', pos=5

I also tried sort=sum(score,1) just to see if I was using round
incorrectly but I get an error message there too saying score is not a
recognized field.

Please help!

Thanks,

Brian Lamb

Re: Rounding errors in solr

2011-07-25 Thread Brian Lamb

Yes, and that's causing some problems in my application. Is there a way to
truncate the 7th decimal place with regard to sorting by the score?
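
One possibility, sketched here under assumptions rather than as a Solr feature, is to re-sort on the client after fetching the results: round each score to a fixed number of decimals and break ties on a stable key. The "id" and "score" field names and the sample documents are assumptions made for illustration, and the approach only works when all of the matching documents being compared are fetched.

```python
def sort_by_rounded_score(docs, digits=5):
    """Sort by score rounded to `digits` decimals (descending), then by id."""
    return sorted(docs, key=lambda d: (-round(d["score"], digits), d["id"]))


docs = [
    {"id": 252, "score": 1.471928},
    {"id": 82, "score": 1.471929},
    {"id": 510, "score": 1.2},
]

# The first two scores are equal once rounded, so the id decides their order.
for doc in sort_by_rounded_score(docs):
    print(doc["id"], round(doc["score"], 5))
```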

On Fri, Jul 22, 2011 at 4:27 PM, Yonik Seeley wrote:

> On Fri, Jul 22, 2011 at 4:11 PM, Brian Lamb
>  wrote:
> > I've noticed some peculiar scoring issues going on in my application. For
> > example, I have a field that is multivalued and has several records that
> > have the same value. For example,
> >
> > 
> >  National Society of Animal Lovers
> >  Nat. Soc. of Ani. Lov.
> > 
> >
> > I have about 300 records with that exact value.
> >
> > Now, when I do a search for references:(national society animal lovers),
> I
> > get the following results:
> >
> > 252
> > 159
> > 82
> > 452
> > 105
> >
> > When I do a search for references:(nat soc ani lov), I get the results
> > ordered differently:
> >
> > 510
> > 122
> > 501
> > 82
> > 252
> >
> > When I load all the records that match, I notice that at some point, the
> > scores aren't the same but differ by only a little:
> >
> > 1.471928 in one and the one before it was 1.471929
>
> 32 bit floats only have 7 decimal digits of precision, and in floating
> point land (a+b+c) can be slightly different than (c+b+a)
>
> -Yonik
> http://www.lucidimagination.com
>
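
The floating-point behavior mentioned in the quoted reply can be reproduced with a small Python sketch. Python floats are 64-bit, so the sketch rounds every intermediate result to 32 bits with the standard struct module; the helper names f32 and add32 are made up, and the values are chosen to make the effect large rather than to mimic real scores.

```python
import struct


def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add32(x, y):
    """Add two numbers, rounding the result to 32-bit precision."""
    return f32(f32(x) + f32(y))


a, b, c = 1e8, -1e8, 1.0

left = add32(add32(a, b), c)   # (a + b) + c
right = add32(a, add32(b, c))  # a + (b + c)

# The same three values summed in a different order give different results.
print(left, right, left == right)
```

With real scores the discrepancy is far smaller, typically in the last significant digit, which is consistent with the 1.471928 versus 1.471929 difference reported above.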
