Re: data-import run by cron job without waiting for the end of the previous one

2008-09-23 Thread sunnyfr

When I try without the adaptive parameter I get an OOME:

HTTP Status 500 - Java heap space java.lang.OutOfMemoryError: Java heap space


Shalin Shekhar Mangar wrote:
> 
> On Mon, Sep 22, 2008 at 9:19 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
> 
>>
>> Hi,
>> There is something weird:
>> I've set up a cron job every 5 minutes which hits the delta-import URL, and
>> it works fine.
>> The point is, it doesn't seem to check every record for updating
>> or creating a new one, because every 5 minutes the delta-import is started
>> again (even if the previous delta-import is not done).
>>
> 
> That should not be happening. Why do you feel it is starting again without
> waiting for the previous import to finish?
> 
> 
>>
>> idle
>>
>> 0:2:23.885
>> 1
>> 1863146
>> 0
>> 0
>> 2008-09-22 17:40:01
>> 2008-09-22 17:40:01
>>
>>
> 
> I'm confused by this output. How frequently do you update your database?
> How many rows are modified in the database in that 5-minute period?
> 
> What is the type of the last-modified column in your database that you use
> for identifying the deltas?
> 
> 
>>
>> and I wonder if it comes from my data-config file parameters,
>> which include the adaptive setting:
>>
>> <dataSource driver="com.mysql.jdbc.Driver"
>>             url="jdbc:mysql://master.books.com/books"
>>             user="solr"
>>             password="tah1Axie"
>>             batchSize="-1"
>>             responseBuffering="adaptive"/>
>>
>> Thanks,
>>
> 
> The part on responseBuffering is not applicable for MySQL so you can
> remove
> that.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 
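
For reference, a MySQL-only version of that dataSource might look like the
sketch below (the responseBuffering attribute dropped, everything else as
posted above). DataImportHandler treats batchSize="-1" as a cue to set the
JDBC fetch size to Integer.MIN_VALUE, which the MySQL driver interprets as
row-by-row streaming instead of buffering the whole result set; that is
usually the fix for this kind of OOME:

<dataSource driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://master.books.com/books"
            user="solr"
            password="tah1Axie"
            batchSize="-1"/>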

-- 
View this message in context: 
http://www.nabble.com/data-import-runned-by-cron-job-withou-wating-the-end-of-the-previous-one-tp19610823p19622383.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Solr Using

2008-09-23 Thread Dinesh Gupta





Hi Otis,

Currently I am creating indexes from a standalone Java program.

I am preparing the data using queries and building the index from it.

We have written a function like the one below.

I have a large number of products and we want to use this at production level.

Please provide me with samples or tutorials.


/**
 * Builds the Lucene Document for one product, or returns null if the
 * product should be discarded.
 *
 * @param pbi
 * @throws DAOException
 */
protected Document prepareLuceneDocument(Ismpbi pbi) throws DAOException {
    long start = System.currentTimeMillis();
    Long prn = pbi.getPbirfnum();
    if (!isValidProduct(pbi)) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn + " not a valid product.");
        discarded++;
        return null;
    }

    IsmpptDAO pptDao = new IsmpptDAO();
    Set categoryList = new HashSet(pptDao.findByProductCategories(prn));

    Iterator iter = categoryList.iterator();
    Set directCategories = new HashSet();
    while (iter.hasNext()) {
        Object[] obj = (Object[]) iter.next();
        Long categoryId = (Long) obj[0];
        String categoryName = (String) obj[1];
        directCategories.add(new CategoryRecord(categoryId, categoryName));
    }

    if (directCategories.size() == 0) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn
                    + " not placed in any category directly [ismppt].");
        discarded++;
        return null;
    }

    // Get all the categories for the direct categories - contains
    // CategoryRecord objects
    Set categories = getCategories(directCategories, prn);
    Set categoryIds = new HashSet(); // All category ids

    Iterator it = categories.iterator();
    while (it.hasNext()) {
        CategoryRecord rec = (CategoryRecord) it.next();
        categoryIds.add(rec.getId());
    }

    // All categories so far TOTAL (direct + parent categories)
    if (categoryIds.size() == 0) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn
                    + " direct categories are not placed under other categories.");
        discarded++;
        return null;
    }

    Set catalogues = getCatalogues(prn);
    if (catalogues.size() != 0) {
        if (logger.isDebugEnabled())
            logger.debug("[" + prn + "]-> Total Direct PCC Catalogues ["
                    + collectionToStringNew(catalogues) + "]");
    }

    getCatalogueWithAllChildInCCR(prn, categoryIds, catalogues);
    if (catalogues.size() == 0) {
        if (logger.isDebugEnabled())
            logger.debug("Product Discarded " + prn + " not attached with any catalogue");
        discarded++;
        return null;
    }

    String productDirectCategories = collectionToString(directCategories);
    String productAllCategories = collectionToString(categories);
    String productAllCatalogues = collectionToStringNew(catalogues);

    String categoryNames = getCategoryNames(categories);

    if (logger.isInfoEnabled())
        logger.info("TO Document Product " + pbi.getPbirfnum() + " Dir Categories "
                + productDirectCategories + " All Categories "
                + productAllCategories + " And Catalogues "
                + productAllCatalogues);

    // Release references before building the document
    directCategories = null;
    categories = null;
    catalogues = null;

    Document document = new ProductDocument().toDocument(pbi,
            productAllCategories, productAllCatalogues,
            productDirectCategories, categoryNames);

    categoryNames = null;
    pbi = null;
    productAllCatalogues = null;
    productAllCategories = null;
    productDirectCategories = null;

    long time = System.currentTimeMillis() - start;
    if (time > longestIndexTime) {
        longestIndexTime = time;
    }
    return document;
}



> Date: Mon, 22 Sep 2008 22:10:16 -0700
> From: [EMAIL PROTECTED]
> Subject: Re: Solr Using
> To: solr-user@lucene.apache.org
> 
> Dinesh,
> 
> Please have a look at the Solr tutorial first.
> Then have a look at the new DataImportHandler - there is a very detailed page 
> about it on the Wiki.
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
> > From: Dinesh Gupta <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, September 23, 2008 1:02:34 AM
> > Subject: Solr Using
> > 
> > 
> > 
> > Hi All,
> > 
> > I am new to Solr. I have been using Lucene for the last 2 years.
> > 
> > We create Lucene indexes for our database.
> > 
> > Please help me migrate to Solr.
> > 
> > How can I achieve this?
> > 
> > If anyone has an idea, please help.
> > 
> > Thanks In Advance.
> > 
> > 
> > Regards,
> > Dinesh Gupta
> > 

Re: Searching for future or "null" dates

2008-09-23 Thread Michael Lackhoff
On 23.09.2008 00:30 Chris Hostetter wrote:

> : Here is what I was able to get working with your help.
> : 
> : (productId:(102685804)) AND liveDate:[* TO NOW] AND ((endDate:[NOW TO *]) OR
> : ((*:* -endDate:[* TO *])))
> : 
> : the *:* is what I was missing.
> 
> Please, PLEASE ... do yourself a favor and stop using "AND" and "OR" ...  
> food will taste better, flowers will smell fresher, and the world will be 
> a happy shiny place...
> 
> +productId:102685804 +liveDate:[* TO NOW] +(endDate:[NOW TO *] (*:* 
> -endDate:[* TO *]))

I would also like to follow your advice but don't know how to do it with
defaultOperator="AND". What I am missing is the equivalent to OR:
AND: +
NOT: -
OR: ???
I didn't find anything on the Solr or Lucene query syntax pages. If
there is such an equivalent then I guess the query would become:
productId:102685804 liveDate:[* TO NOW] (endDate:[NOW TO *] (*:*
-endDate:[* TO *]))

I switched to the AND-default because that is the default in my web
frontend so I don't have to change logic. What should I do in this
situation? Go back to the OR-default?

It is not so much this example I am after, but I have a syntax translator
in my application that must be able to handle similar expressions and I
want to keep it simple and still have tasty food ;-)

-Michael


Re: Solr Using

2008-09-23 Thread Shalin Shekhar Mangar
Hi Dinesh,

Your code is hardly useful to us since we don't know what you are trying to
achieve or what all those Dao classes do.

Look at the Solr tutorial first -- http://lucene.apache.org/solr/
Use the SolrJ client for communicating with Solr server --
http://wiki.apache.org/solr/Solrj
Also take a look at DataImportHandler which can help avoid all this code --
http://wiki.apache.org/solr/DataImportHandler

If you face any problem, first search this mailing list through markmail.org or
nabble.com to find previous posts related to your issue. If you don't find
anything helpful, post specific questions here and we will help answer them.
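
A minimal SolrJ indexing sketch (the field names are only placeholders for
whatever you define in schema.xml, and the URL assumes the default example
port):

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

CommonsHttpSolrServer server =
    new CommonsHttpSolrServer("http://localhost:8983/solr"); // throws MalformedURLException
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "12345");            // your unique key
doc.addField("ttl", "A product title"); // any other schema field
server.add(doc);
server.commit();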

On Tue, Sep 23, 2008 at 3:56 PM, Dinesh Gupta <[EMAIL PROTECTED]>wrote:

>
>
>
>
>
> Hi Otis,
>
> Currently I am creating indexes from a standalone Java program.
>
> I am preparing the data using queries and building the index from it.
>
> We have written a function like the one below.
>
> I have a large number of products and we want to use this at production level.
>
> Please provide me with samples or tutorials.
>
>
> /**
> *
> *
> * @param pbi
> * @throws DAOException
> */
>protected Document prepareLuceneDocument(Ismpbi pbi) throws DAOException
> {
>long start = System.currentTimeMillis();
>Long prn = pbi.getPbirfnum();
>if (!isValidProduct(pbi)) {
>if(logger.isDebugEnabled())
>logger.debug("Product Discarded" + prn+ " not a valid
> product. ");
>discarded++;
>return null;
>}
>
>IsmpptDAO pptDao = new IsmpptDAO();
>Set categoryList = new HashSet(pptDao.findByProductCategories(prn));
>
>Iterator iter = categoryList.iterator();
>Set directCategories = new HashSet();
>while (iter.hasNext()) {
>Object[] obj = (Object[]) iter.next();
>Long categoryId = (Long) obj[0];
>String categoryName = (String) obj[1];
>directCategories.add(new CategoryRecord(categoryId,
> categoryName));
>}
>
>if (directCategories.size() == 0) {
>if(logger.isDebugEnabled())
>logger.debug("Product Discarded" + prn
>+ " not placed in any category directly [ismppt].");
>discarded++;
>return null;
>}
>
>// Get all the categories for the direct categories - contains
>// CategoryRecord objects
>Set categories = getCategories(directCategories, prn);
>Set categoryIds = new HashSet(); // All category ids
>
>Iterator it = categories.iterator();
>while (it.hasNext()) {
>CategoryRecord rec = (CategoryRecord) it.next();
>categoryIds.add(rec.getId());
>}
>
>//All categories so far TOTAL (direct+parent categories)
>if (categoryIds.size() == 0) {
>if(logger.isDebugEnabled())
>logger.debug("Product Discarded" + prn+ " direct categories
> are not placed under other categories.");
>discarded++;
>return null;
>}
>
>Set catalogues = getCatalogues(prn);
>if (catalogues.size()!=0){
>if(logger.isDebugEnabled())
>logger.debug("[" + prn + "]-> Total Direct PCC Catalogues ["
> + collectionToStringNew(catalogues) +"]");
>}
>
>getCatalogueWithAllChildInCCR(prn, categoryIds, catalogues);
>if (catalogues.size() == 0) {
>if(logger.isDebugEnabled())
>logger.debug("Product Discarded " + prn+ " not attached with
> any catalogue");
>discarded++;
>return null;
>}
>
>String productDirectCategories =
> collectionToString(directCategories);
>String productAllCategories = collectionToString(categories);
>String productAllCatalogues = collectionToStringNew(catalogues);
>
>String categoryNames = getCategoryNames(categories);
>
>if(logger.isInfoEnabled())
>logger.info("TO Document Product " + pbi.getPbirfnum() + " Dir
> Categories " +
>  productDirectCategories + " All Categories "
>+ productAllCategories + " And Catalogues "
>+ productAllCatalogues);
>
>directCategories = null;
>categories=null;
>catalogues=null;
>
>
>Document document = new ProductDocument().toDocument(pbi,
>productAllCategories, productAllCatalogues,
>productDirectCategories, categoryNames);
>
>categoryNames =null;
>pbi=null;
>productAllCatalogues =null;
>productAllCategories =null;
>productDirectCategories=null;
>categoryNames=null;
>
>long time = System.currentTimeMillis() - start;
>if (time > longestIndexTime) {
>longestIndexTime = time;
>}
>return document;
>}
>
>
>
> > Date: Mon, 22 Sep 2008 22:10:16 -0700
> > From: [EMAIL PROTECTED]
> > Subject: Re: Solr Using
> > To: solr-user@lucene.a

Lucene index

2008-09-23 Thread Dinesh Gupta

Hi,
Currently we are using the Lucene API to create the index.

It creates the index in a directory with three files, like

xxx.cfs, deletable & segments.

If I create Lucene indexes from Solr, will these files be created or not?

Please give me an example with a MySQL database instead of hsqldb.

Regards,
Dinesh

_
Movies, sports & news! Get your daily entertainment fix, only on live.com
http://www.live.com/?scope=video&form=MICOAL

EmbeddedSolrServer and the MultiCore functionality

2008-09-23 Thread Aleksander M. Stensby
Hello everyone, I'm new to Solr (have been using Lucene for a few years  
now). We are looking into Solr and have heard many good things about the  
project:)


I have a few questions regarding the EmbeddedSolrServer in Solrj and the  
MultiCore features... I've tried to find answers to this in the archives  
but have not succeeded.
The thing is, I want to be able to use the Embedded server to access  
multiple cores on one machine, and I would like to at least have the  
possibility to access the lucene indexes without http. In particular I'm  
wondering if it is possible to do the "shards" (distributed search)  
approach using the embedded server, without using http requests.


Let's say I register 2 cores to a container and init my embedded server
like this:

CoreContainer container = new CoreContainer();
container.register("core1", core1, false);
container.register("core2", core2, false);
server = new EmbeddedSolrServer(container, "core1");
Then queries performed on my server will return results from core1... and
if I do server = new EmbeddedSolrServer(container, "core2"), the results will
come from core2.


If I have Solr up and running and do something like this:
query.set("shards", "localhost:8080/solr/core0,localhost:8080/solr/core1");

I will get the results from both cores, obviously...

But is there a way to do this without using shards and accessing the cores  
through http?
I presume it would/should be possible to do the same thing directly  
against the cores, but my question is really if this has been implemented  
already / is it possible?



Thanks in advance for any replies!

Best regards,
 Aleksander


--
Aleksander M. Stensby
Senior Software Developer
Integrasco A/S
+47 41 22 82 72
[EMAIL PROTECTED]


Re: Lucene index

2008-09-23 Thread Shalin Shekhar Mangar
On Tue, Sep 23, 2008 at 5:33 PM, Dinesh Gupta <[EMAIL PROTECTED]>wrote:

>
> Hi,
> Currently we are using the Lucene API to create the index.
>
> It creates the index in a directory with three files, like
>
> xxx.cfs, deletable & segments.
>
> If I create Lucene indexes from Solr, will these files be created or
> not?


The Lucene index will be created under solr_home, inside the data/index
directory.


> Please give me an example with a MySQL database instead of hsqldb.
>

If you are talking about DataImportHandler then there is no difference in
the configuration except for using the MySql driver instead of hsqldb.
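
For example, a minimal data-config.xml for MySQL might look like the sketch
below; the table and column names are made up, and only the driver class and
URL format are MySQL-specific:

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="user" password="password"/>
  <document>
    <entity name="product" query="SELECT id, title FROM product">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
    </entity>
  </document>
</dataConfig>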

-- 
Regards,
Shalin Shekhar Mangar.


RE: Lucene index

2008-09-23 Thread Dinesh Gupta

Hi Shalin Shekhar,

Let me explain my issue.

I have some tables in my database like

Product
Category 
Catalogue
Keywords
Seller
Brand
Country_city_group
etc.
I have a class that represents a product document as follows:

Document doc = new Document();
// Keywords which can be used directly for search
doc.add(new Field("id", (String) data.get("PRN"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));

// Sorting fields
String priceString = (String) data.get("Price");
if (priceString == null)
    priceString = "0";
long price = 0;
try {
    price = (long) Double.parseDouble(priceString);
} catch (Exception e) {
    // fall back to 0 on unparsable prices
}

doc.add(new Field("prc", NumberUtils.pad(price),
        Field.Store.YES, Field.Index.UN_TOKENIZED));

Date createDate = (Date) data.get("CreateDate");
if (createDate == null) createDate = new Date();
doc.add(new Field("cdt", String.valueOf(createDate.getTime()),
        Field.Store.NO, Field.Index.UN_TOKENIZED));

Date modiDate = (Date) data.get("ModiDate");
if (modiDate == null) modiDate = new Date();
doc.add(new Field("mdt", String.valueOf(modiDate.getTime()),
        Field.Store.NO, Field.Index.UN_TOKENIZED));

// Additional fields for search
doc.add(new Field("bnm", (String) data.get("Brand"),
        Field.Store.YES, Field.Index.TOKENIZED));
doc.add(new Field("bnm1", (String) data.get("Brand1"),
        Field.Store.NO, Field.Index.UN_TOKENIZED));
doc.add(new Field("bid", (String) data.get("BrandId"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("grp", (String) data.get("Group"),
        Field.Store.NO, Field.Index.TOKENIZED));
doc.add(new Field("gid", (String) data.get("GroupId"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("snm", (String) data.get("Seller"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("sid", (String) data.get("SellerId"),
        Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.add(new Field("ttl", (String) data.get("Title"),
        Field.Store.YES, Field.Index.TOKENIZED));

String title1 = (String) data.get("Title");
title1 = removeSpaces(title1);
doc.add(new Field("ttl1", title1, Field.Store.NO, Field.Index.UN_TOKENIZED));
doc.add(new Field("ttl2", title1, Field.Store.NO, Field.Index.TOKENIZED));

// ColumnC - Product Sequence
String productSeq = (String) data.get("ProductSeq");
if (productSeq == null) productSeq = "";
doc.add(new Field("seq", productSeq, Field.Store.NO, Field.Index.UN_TOKENIZED));

// Newly added description fields
doc.add(new Field("sdc", (String) data.get("SpecialDescription"),
        Field.Store.NO, Field.Index.TOKENIZED));
doc.add(new Field("kdc", (String) data.get("KeywordDescription"),
        Field.Store.NO, Field.Index.TOKENIZED));

// ColumnB - Product Category and parent categories
doc.add(new Field("cts", (String) data.get("Categories"),
        Field.Store.YES, Field.Index.TOKENIZED));

// ColumnB - Product direct categories
doc.add(new Field("dct", (String) data.get("DirectCategories"),
        Field.Store.YES, Field.Index.TOKENIZED));

// ColumnC - Product Catalogues
doc.add(new Field("clg", (String) data.get("Catalogues"),
        Field.Store.YES, Field.Index.TOKENIZED));

// Product Delivery Cities
doc.add(new Field("dcty", (String) data.get("DelCities"),
        Field.Store.YES, Field.Index.TOKENIZED));

// Additional Information: Top Selling Count
String sellerCount = ((Long) data.get("SellCount")).toString();
doc.add(new Field("bsc", sellerCount, Field.Store.YES, Field.Index.TOKENIZED));


I am preparing the data by querying the database.
Please tell me how I can migrate my logic to Solr.

Optimise while uploading?

2008-09-23 Thread Geoff Hopson
Hi,

Probably a stupid question with the obvious answer, but if I am
running a Solr master and accepting updates, do I have to stop the
updates when I start the optimise of the index? Or will optimise just
take the latest snapshot and work on that independently of the
incoming updates?

Really enjoying Solr, BTW. Nice job!

Thanks
Geoff


Re: snapshot.yyyymmdd ... can't find them?

2008-09-23 Thread sunnyfr

Yes, indeed it was a problem with the path.. thanks a lot.
I just didn't get this part: "If you turn up your logging to FINE" - what
does that mean?

Huge thanks for your answer,



hossman wrote:
> 
> 
> : And I did change my config file :
> : 
> : 

commit

2008-09-23 Thread sunnyfr

Hi,

I don't know why, when I start a commit manually, it doesn't fire the
snapshooter. I did it manually because no snapshot was created, and if I run
snapshooter manually it works.

so my autocommit is activated (I think):

<autoCommit>
  <maxDocs>1</maxDocs>
  <maxTime>1000</maxTime>
</autoCommit>

My snapshooter too:

<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">./data/solr/book/logs/snapshooter</str>
  <str name="dir">data/solr/book/bin</str>
  <bool name="wait">true</bool>
  <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
  <arr name="env"> <str>MYVAR=val1</str> </arr>
</listener>

Updates are done on the server:
delta-import
idle

1513
574
0
2008-09-23 16:00:01
2008-09-23 16:00:01
2008-09-23 16:00:37
2008-09-23 16:00:37
216

Indexing completed. Added/Updated: 216 documents. Deleted 0 documents.

2008-09-23 16:01:29
0:1:28.667


and everything is in the right place I think, my paths are good ...


-- 
View this message in context: 
http://www.nabble.com/commit-tp19628500p19628500.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Lucene index

2008-09-23 Thread Shalin Shekhar Mangar
Hi Dinesh,

This seems straightforward for Solr. You can use the embedded jetty server
for a start. Look at the tutorial on how to get started.

You'll need to modify the schema.xml to define all the fields that you want
to index. The wiki page at http://wiki.apache.org/solr/SchemaXml is a good
start on how to do that. Each field in your code will have a counterpart in
the schema.xml with appropriate flags (indexed/stored/tokenized etc.)

Once that is complete, try to modify the DataImportHandler's hsqldb example
for your mysql database.
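
As a sketch of what that mapping might look like for a few of the fields in
your code (UN_TOKENIZED roughly corresponds to the example schema's "string"
type, TOKENIZED to "text"; the flags below are assumptions to adjust):

<field name="id"  type="string" indexed="true" stored="true"/>
<field name="prc" type="string" indexed="true" stored="true"/>
<field name="bnm" type="text"   indexed="true" stored="true"/>
<field name="ttl" type="text"   indexed="true" stored="true"/>
<field name="cdt" type="string" indexed="true" stored="false"/>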

On Tue, Sep 23, 2008 at 7:01 PM, Dinesh Gupta <[EMAIL PROTECTED]>wrote:

>
> Hi Shalin Shekhar,
>
> Let me explain my issue.
>
> I have some tables in my database like
>
> Product
> Category
> Catalogue
> Keywords
> Seller
> Brand
> Country_city_group
> etc.
> I have a class that represents a product document as follows:
>
> Document doc = new Document();
>// Keywords which can be used directly for search
>doc.add(new Field("id",(String)
> data.get("PRN"),Field.Store.YES,Field.Index.UN_TOKENIZED));
>
>// Sorting fields]
>String priceString = (String) data.get("Price");
>if (priceString == null)
>priceString = "0";
>long price = 0;
>try {
>price = (long) Double.parseDouble(priceString);
>} catch (Exception e) {
>
>}
>
>doc.add(new
> Field("prc",NumberUtils.pad(price),Field.Store.YES,Field.Index.UN_TOKENIZED));
>Date createDate = (Date) data.get("CreateDate");
>if (createDate == null) createDate = new Date();
>
>doc.add(new Field("cdt",String.valueOf(createDate.getTime()),
> Field.Store.NO,Field.Index.UN_TOKENIZED));
>
>Date modiDate = (Date) data.get("ModiDate");
>if (modiDate == null) modiDate = new Date();
>
>doc.add(new Field("mdt",String.valueOf(modiDate.getTime()),
> Field.Store.NO,Field.Index.UN_TOKENIZED));
>//doc.add(Field.UnStored("cdt",
> String.valueOf(createDate.getTime(;
>
>// Additional fields for search
>doc.add(new Field("bnm",(String)
> data.get("Brand"),Field.Store.YES,Field.Index.TOKENIZED));
>doc.add(new Field("bnm1",(String) data.get("Brand1"),Field.Store.NO
> ,Field.Index.UN_TOKENIZED));
>//doc.add(Field.Text("bnm", (String) data.get("Brand")));
> //Tokenized and Unstored
>doc.add(new Field("bid",(String)
> data.get("BrandId"),Field.Store.YES,Field.Index.UN_TOKENIZED));
>//doc.add(Field.Keyword("bid", (String) data.get("BrandId"))); //
> untokenized &
>doc.add(new Field("grp",(String) data.get("Group"),Field.Store.NO
> ,Field.Index.TOKENIZED));
>//doc.add(Field.Text("grp", (String) data.get("Group")));
>doc.add(new Field("gid",(String)
> data.get("GroupId"),Field.Store.YES,Field.Index.UN_TOKENIZED));
>//doc.add(Field.Keyword("gid", (String) data.get("GroupId"))); //New
>doc.add(new Field("snm",(String)
> data.get("Seller"),Field.Store.YES,Field.Index.UN_TOKENIZED));
>//doc.add(Field.Text("snm", (String) data.get("Seller")));
>doc.add(new Field("sid",(String)
> data.get("SellerId"),Field.Store.YES,Field.Index.UN_TOKENIZED));
>//doc.add(Field.Keyword("sid", (String) data.get("SellerId"))); //
> New
>doc.add(new Field("ttl",(String)
> data.get("Title"),Field.Store.YES,Field.Index.TOKENIZED));
>//doc.add(Field.UnStored("ttl", (String) data.get("Title"), true));
>
>String title1 = (String) data.get("Title");
>title1 = removeSpaces(title1);
>doc.add(new Field("ttl1",title1,Field.Store.NO
> ,Field.Index.UN_TOKENIZED));
>
>doc.add(new Field("ttl2",title1,Field.Store.NO
> ,Field.Index.TOKENIZED));
>//doc.add(Field.UnStored("ttl", (String) data.get("Title"), true));
>
>// ColumnC - Product Sequence
>String productSeq = (String) data.get("ProductSeq");
>if (productSeq == null) productSeq = "";
>doc.add(new Field("seq",productSeq,Field.Store.NO
> ,Field.Index.UN_TOKENIZED));
>//doc.add(Field.Keyword("seq", productSeq));
>
>// New Added
>doc.add(new Field("sdc",(String) data.get("SpecialDescription"),
> Field.Store.NO,Field.Index.TOKENIZED));
>//doc.add(Field.UnStored("sdc", (String)
> data.get("SpecialDescription"),true));
>doc.add(new Field("kdc", (String) data.get("KeywordDescription"),
> Field.Store.NO,Field.Index.TOKENIZED));
>//doc.add(Field.UnStored("kdc", (String)
> data.get("KeywordDescription"),true));
>
>// ColumnB - Product Category and parent categories
>doc.add(new Field("cts",(String)
> data.get("Categories"),Field.Store.YES,Field.Index.TOKENIZED));
>//doc.add(Field.Text("cts", (String) data.get("Categories")));
>
>// ColumnB - Product Category and parent categories //Raman
>doc.add(new Field("dct",(String)
> data.get("DirectCategories"),Field.Store.YES,Field.Index.TOKENIZED));
>//doc.add(Field.Text("dct", (S

Re: Optimise while uploading?

2008-09-23 Thread Shalin Shekhar Mangar
On Tue, Sep 23, 2008 at 7:06 PM, Geoff Hopson <[EMAIL PROTECTED]>wrote:

>
> Probably a stupid question with the obvious answer, but if I am
> running a Solr master and accepting updates, do I have to stop the
> updates when I start the optimise of the index? Or will optimise just
> take the latest snapshot and work on that independently of the
> incoming updates?


Usually an optimize is performed at the end of the indexing operation.
However, an optimize operation will block incoming update requests until it
completes.

Snapshots are a different story. Solr does not even know about any snapshots
-- all operations are performed on the main index only. If you look under
the hood, it is the snapshooter shell script that creates the snapshot
directories.
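
For example, from SolrJ the usual pattern is something like this (a sketch;
"server" is whatever SolrServer instance you already use for updates):

server.add(docs);      // keep adding documents while indexing
server.commit();       // make the changes visible
server.optimize();     // optimize once at the end; further updates block until this returns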

-- 
Regards,
Shalin Shekhar Mangar.


Re: commit

2008-09-23 Thread Shalin Shekhar Mangar
On Tue, Sep 23, 2008 at 7:36 PM, sunnyfr <[EMAIL PROTECTED]> wrote:

>
> My snapshooter too:
>
> <listener event="postCommit" class="solr.RunExecutableListener">
>   <str name="exe">./data/solr/book/logs/snapshooter</str>
>   <str name="dir">data/solr/book/bin</str>
>   <bool name="wait">true</bool>
>   <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
>   <arr name="env"> <str>MYVAR=val1</str> </arr>
> </listener>
>
> and everything is in the right place I think, my paths are good ...
>
>
Those paths look strange. Are you sure your snapshooter script is inside a
directory named "logs"?

Try giving absolute paths to the snapshooter script in the "exe" section.
Also, put the absolute path to the bin directory in the "dir" section and
try again.
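
Something like the sketch below, assuming the script really lives in
/data/solr/book/bin (adjust to wherever snapshooter actually is on disk):

<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">/data/solr/book/bin/snapshooter</str>
  <str name="dir">/data/solr/book/bin</str>
  <bool name="wait">true</bool>
</listener>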

-- 
Regards,
Shalin Shekhar Mangar.


Re: commit

2008-09-23 Thread sunnyfr

Right, my bad, it was the bin directory. But even when I fire a commit, no
snapshot is created??
Does it check the number of documents even when I fire it manually? And
another question: I don't remember having put the path to commit in the conf
file, but even manually it doesn't work.

[EMAIL PROTECTED]:/# ./data/solr/book/bin/commit -V
+ [[ -n '' ]]
+ [[ -z 8180 ]]
+ [[ -z localhost ]]
+ [[ -z solr ]]
+ curl_url=http://localhost:8180/solr/update
+ fixUser -V
+ [[ -z root ]]
++ whoami
+ [[ root != root ]]
++ who -m
++ cut '-d ' -f1
++ sed '-es/^.*!//'
+ oldwhoami=root
+ [[ root == '' ]]
+ setStartTime
+ [[ Linux == \S\u\n\O\S ]]
++ date +%s
+ start=1222180545
+ logMessage started by root
++ timeStamp
++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2008/09/23 16:35:45 started by root
+ [[ -n '' ]]
+ logMessage command: ./data/solr/book/bin/commit -V
++ timeStamp
++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2008/09/23 16:35:45 command: ./data/solr/book/bin/commit -V
+ [[ -n '' ]]
++ curl http://localhost:8180/solr/update -s -H 'Content-type:text/xml;
charset=utf-8' -d ''
+ rs=''
+ [[ 0 != 0 ]]
+ echo ''
+ grep ' 
> On Tue, Sep 23, 2008 at 7:36 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
> 
>>
>> My snapshooter too:
>>
>> <listener event="postCommit" class="solr.RunExecutableListener">
>>   <str name="exe">./data/solr/book/logs/snapshooter</str>
>>   <str name="dir">data/solr/book/bin</str>
>>   <bool name="wait">true</bool>
>>   <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
>>   <arr name="env"> <str>MYVAR=val1</str> </arr>
>> </listener>
>>
>> and everything is in the right place I think, my paths are good ...
>>
>>
> Those paths look strange. Are you sure your snapshooter script is inside a
> directory named "logs"?
> 
> Try giving absolute paths to the snapshooter script in the "exe" section.
> Also, put the absolute path to the bin directory in the "dir" section and
> try again.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/commit-tp19628500p19629217.html
Sent from the Solr - User mailing list archive at Nabble.com.



Refresh of synonyms.txt without reload

2008-09-23 Thread Batzenmann

Hi,

I'm quite new to Solr and I'm looking for a way to extend the list of
synonyms used at query time without having to reload the config. What I've
found so far are the two threads linked below, neither of which really
helped me out.
Especially the MultiCore solution seems a little bit too much for 'just
reloading' the synonyms..

Right now I would choose a solution where I'd extend the
SynonymFilterFactory with a parameter for an interval at which it would look
for an update of the synonyms source file (synonyms.txt).
If the file has been updated, the synonym map would be rebuilt, and from that
point on the new synonyms would be included in the query analysis.

Is this a valid approach? Would someone else find this useful too?

cheers, Axel

http://www.nabble.com/SolrCore%2C-reload%2C-synonyms-not-reloaded-td19339767.html
(Multiple Solr Cores)
http://www.nabble.com/Re%3A-Is-it-possible-to-add-synonyms-run-time--td15089111.html
(Re: Is it possible to add synonyms run time?)
-- 
View this message in context: 
http://www.nabble.com/Refresh-of-synonyms.txt-without-reload-tp19629361p19629361.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Refresh of synonyms.txt without reload

2008-09-23 Thread Walter Underwood
This is probably not useful because synonyms work better at index time
than at query time. Reloading synonyms also requires reindexing all
the affected documents.
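
For reference, moving synonyms to index time just means putting the
SynonymFilterFactory into the index-time analyzer chain in schema.xml, along
these lines (a sketch based on the example schema's text type):

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>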

wunder

On 9/23/08 7:45 AM, "Batzenmann" <[EMAIL PROTECTED]> wrote:

> 
> Hi,
> 
> I'm quite new to Solr and I'm looking for a way to extend the list of
> synonyms used at query time without having to reload the config. What I've
> found so far are the two threads linked below, neither of which really
> helped me out.
> Especially the MultiCore solution seems a little bit too much for 'just
> reloading' the synonyms..
> 
> Right now I would choose a solution where I'd extend the
> SynonymFilterFactory with a parameter for an interval at which it would look
> for an update of the synonyms source file (synonyms.txt).
> If the file has been updated, the synonym map would be rebuilt, and from that
> point on the new synonyms would be included in the query analysis.
> 
> Is this a valid approach? Would someone else find this useful too?
> 
> cheers, Axel
> 
> http://www.nabble.com/SolrCore%2C-reload%2C-synonyms-not-reloaded-td19339767.html
> (Multiple Solr Cores)
> http://www.nabble.com/Re%3A-Is-it-possible-to-add-synonyms-run-time--td15089111.html
> (Re: Is it possible to add synonyms run time?)



Re: EmbeddedSolrServer and the MultiCore functionality

2008-09-23 Thread Ryan McKinley


If I have Solr up and running and do something like this:
   query.set("shards", "localhost:8080/solr/core0,localhost:8080/solr/core1");

I will get the results from both cores, obviously...

But is there a way to do this without using shards and accessing the  
cores through http?
I presume it would/should be possible to do the same thing directly  
against the cores, but my question is really if this has been  
implemented already / is it possible?




not implemented...

Check line 384 of SearchHandler.java
  SolrServer server = new CommonsHttpSolrServer(url, client);

it defaults to CommonsHttpSolrServer.

This could easily change to EmbeddedSolrServer, but I'm not sure it is
a very common use case...


why would you have multiple shards on the same machine?

ryan
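
For what it's worth, a sketch of the in-process alternative: one
EmbeddedSolrServer per core, query each directly (no HTTP), and merge the
hits client-side. There is no distributed merge here, so cross-core sorting
and faceting are up to you:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.core.CoreContainer;

CoreContainer container = new CoreContainer();
container.register("core1", core1, false);   // SolrCore instances as earlier in this thread
container.register("core2", core2, false);
SolrServer s1 = new EmbeddedSolrServer(container, "core1");
SolrServer s2 = new EmbeddedSolrServer(container, "core2");

SolrQuery q = new SolrQuery("my search terms");
QueryResponse r1 = s1.query(q);
QueryResponse r2 = s2.query(q);
// naive merge: concatenate the result lists; real code would re-sort by score
r1.getResults().addAll(r2.getResults());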




RE: deleting record from the index using deleteByQuery method

2008-09-23 Thread Kashyap, Raghu
Thanks for your response Chris.

I do see the reviewid in the index through Luke. I guess what I am
confused about is the field cumulative_delete. Does this have any
significance for whether the delete was a success or not? Also, shouldn't
the deleteByQuery method return a different status code depending on whether
the delete was successful?

-Raghu 


-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Monday, September 22, 2008 11:30 PM
To: solr-user@lucene.apache.org
Subject: Re: deleting record from the index using deleteByQuery method


:  I am trying to delete a record from the index using SolrJ. When I
: execute it I get a status of 0 which means success. I see that the
: "cummulative_deletbyquery" count increases by 1 and also the "commit"
: count increases by one. I don't see any decrease on the "numDocs"
count.
: When I query it back I do see that record again. 

I'm not positive, but i don't think deleting by query will error if no 
documents matched the query -- so just because it succeeds doesn't mean
it 
actually deleted anything ... are you sure '"rev.id:" + reviewId'
matches 
on the document you are trying to delete?  does that search find it
using 
the default handler?  (is there any analyzer weirdness?)



-Hoss
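
One quick way to check from SolrJ (using the same rev.id query from earlier
in the thread) is to run the query first and look at numFound before
deleting; a sketch:

SolrQuery q = new SolrQuery("rev.id:" + reviewId);
long hits = server.query(q).getResults().getNumFound();
if (hits == 0) {
    // the delete will "succeed" but match nothing; check the field name/analysis
}
server.deleteByQuery("rev.id:" + reviewId);
server.commit();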





DataImport troubleshooting

2008-09-23 Thread KyleMorrison

I have searched the forum and the internet at large to find an answer to my
simple problem, but have been unable to find one. I am trying to get a simple
dataimport to work. I have Solr installed on an Apache
server on Unix. I am able to commit and search for files using the usual
Simple* tools. These files begin with ... and so on.

On the data import, I have inserted

<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/R1/home/shoshana/kyle/Documents/data-config.xml</str>
  </lst>
</requestHandler>

into solrconfig, and the data-config looks like this:

<dataConfig>
  <dataSource type="HttpDataSource"
      baseUrl="http://helix.ccb.sickkids.ca:8080/" encoding="UTF-8" />
  <document>
    <entity processor="XPathEntityProcessor"
        forEach="/iProClassDatabase/iProClassEntry/"
        url="/R1/home/shoshana/kyle/Documents/exampleIproResult.xml">
      <field xpath="/iProClassDatabase/iProClassEntry/GENERAL_INFORMATION/Protein_Name_and_ID/UniProtKB/UniProtKB_Accession" />
      <field xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Enzyme_Function/EC/Nomenclature" />
      <field xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Bibliography/References/PMID" />
      <field xpath="/iProClassDatabase/iProClassEntry/SEQUENCE/Sequence_Length" />
    </entity>
  </document>
</dataConfig>

I apologize for the ugly xml. Nonetheless, when I go to
http://host:8080/solr/dataimport, I get a 404, and when I go to
http://host:8080/solr/admin/dataimport.jsp and try to "debug", nothing
happens. I have edited out the host name because I don't know if the
employer would be ok with it. Any guidance?

Thanks in advance,
Kyle
-- 
View this message in context: 
http://www.nabble.com/DataImport-troubleshooting-tp19630990p19630990.html
Sent from the Solr - User mailing list archive at Nabble.com.



SolrUpdateServlet Warning

2008-09-23 Thread Gregg
I've got a small configuration question. When posting docs via SolrJ, I get
the following warning in the Solr logs:

WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters:
wt=xml&version=2.2
  If you are using solrj, make sure to register a request handler to /update
rather then use this servlet.
  Add: 
to your solrconfig.xml

I have an update handler configured in solrconfig.xml as follows:



What's the preferred solution? Should I comment out the SolrUpdateServlet in
solr's web.xml? My Solr server is running at /solr, if that helps.

Thanks.

Gregg


Re: SolrUpdateServlet Warning

2008-09-23 Thread Ryan McKinley


On Sep 23, 2008, at 12:35 PM, Gregg wrote:

I've got a small configuration question. When posting docs via  
SolrJ, I get

the following warning in the Solr logs:

WARNING: The @Deprecated SolrUpdateServlet does not accept query  
parameters:

wt=xml&version=2.2
 If you are using solrj, make sure to register a request handler to / 
update

rather then use this servlet.
 Add: class="solr.XmlUpdateRequestHandler" >

to your solrconfig.xml

I have an update handler configured in solrconfig.xml as follows:





are you sure?

check http://localhost:8983/solr/admin/stats.jsp
and search for XmlUpdateRequestHandler
make sure it is registered to /update


What's the preferred solution? Should I comment out the  
SolrUpdateServlet in

solr's web.xml? My Solr server is running at /solr, if that helps.



that will definitely work, but it should not be necessary to crack  
open the .war file.



ryan


Re: DataImport troubleshooting

2008-09-23 Thread Shalin Shekhar Mangar
Are there any exceptions in the log file when you start Solr?

On Tue, Sep 23, 2008 at 9:31 PM, KyleMorrison <[EMAIL PROTECTED]> wrote:

>
> I have searched the forum and the internet at large to find an answer to my
> simple problem, but have been unable. I am trying to get a simple
> dataimport
> to work, and have not been able to. I have Solr installed on an Apache
> server on Unix. I am able to commit and search for files using the usual
> Simple* tools. These files begin with ... and so on.
>
> On the data import, I have inserted
>   class="org.apache.solr.handler.dataimport.DataImportHandler">
>
>   name="config">/R1/home/shoshana/kyle/Documents/data-config.xml
>
>  
>
> into solrconfig, and the data import looks like this:
> 
> baseUrl="http://helix.ccb.sickkids.ca:8080/"; encoding="UTF-8" />
>
> forEach="/iProClassDatabase/iProClassEntry/"
> url="/R1/home/shoshana/kyle/Documents/exampleIproResult.xml">
>
> xpath="/iProClassDatabase/iProClassEntry/GENERAL_INFORMATION/Protein_Name_and_ID/UniProtKB/UniProtKB_Accession">
>
> xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Enzyme_Function/EC/Nomenclature"
> />
>
> xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Bibliography/References/PMID"
> />
> xpath="/iProClassDatabase/iProClassEntry/SEQUENCE/Sequence_Length" />
>
>
> 
>
> I apologize for the ugly xml. Nonetheless, when I go to
> http://host:8080/solr/dataimport, I get a 404, and when I go to
> http://host:8080/solr/admin/dataimport.jsp and try to "debug", nothing
> happens. I have editted out the host name because I don't know if the
> employer would be ok with it. Any guidance?
>
> Thanks in advance,
> Kyle
> --
> View this message in context:
> http://www.nabble.com/DataImport-troubleshooting-tp19630990p19630990.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Regards,
Shalin Shekhar Mangar.


Re: Precision issue with sum() function

2008-09-23 Thread water4u99

Problem with the spam filter - removing some text - re-posting.

water4u99 wrote:
> 
> Hi,
> 
> Some additional clue as to where the issue is: the computed number changes
> when there is an additional query term in the query request.
> 
> Ex1: .../select/?q=_val_:%22sum(stockPrice_f,10.00)%22&fl=*,score
> This yields a correct answer - 38.0 where the stockPrice_f dynamic field
> has the value of 28.0
> 
> However, when there is another query term - the answer changes.
> Ex2:
> .../select/?q=PRICE_MIN:20%20_val_:%22sum(stockPrice_f,10.00)%22&fl=*,score
> 
> This yields an incorrect answer: 36.41818
> 
> The config is straight out of the examples/ directory with only my own
> field definitions.
> 
> Thanks if anyone can explain or help.
> 
> 
> 
> 
> water4u99 wrote:
>> 
>> Hi,
>> 
>> I have indexed a dynamic field in the <doc> as: <float name="stockPrice_f">28.00</float>.
>> It is visible in my query.
>> However, when I issue a query with a function: ...
>> _val_:"sum(stockPrice_f, 10.00)"&fl=*,score
>> I received the output of: 36.41818
>> There were no other computations.
>> 
>> Can any one help on why the answer is off.
>> 
>> Thank you.
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Precision-issue-with-sum%28%29-function-tp19616287p19633206.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to use copyfield with dynamicfield?

2008-09-23 Thread Erik Hatcher

Simply set "text" to be multivalued (one for each *_t field).

Erik
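
In schema.xml terms, that advice amounts to something like the sketch below
(attribute values here are assumptions based on the error later in this
thread):

<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="*_t" dest="text"/>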

On Sep 22, 2008, at 1:08 PM, Jon Drukman wrote:


I have a dynamicField declaration:

<dynamicField name="*_t" type="text" indexed="true" stored="true"/>

I want to copy any *_t's into a text field for searching with
dismax. As it is, it appears you can't search dynamicfields this way.

I tried adding a copyField:

<copyField source="*_t" dest="text"/>

I do have a text field in my schema:

<field name="text" type="text" indexed="true" stored="false"/>

However I get 400 errors whenever I try to update a record with  
entries in the *_t.



INFO: /update  0 2
Sep 22, 2008 10:04:40 AM org.apache.solr.core.SolrException log
SEVERE: org.apache.solr.core.SolrException: ERROR: multiple values encountered
for non multiValued field text: first='Centennial Dr, Oakland, CA' second=''
        at org.apache.solr.update.DocumentBuilder.addSingleField(DocumentBuilder.java:62)


I'm going to guess that the copyField with a wildcard is not  
allowed. If that is true, how does one deal with the situation where  
you want to allow new fields AND have them searchable?


-jsd-




Re: DataImport troubleshooting

2008-09-23 Thread KyleMorrison

Thank you for the help. The problem was actually just stupidity on my part: it
seems I was running the wrong startup and shutdown scripts for the server,
and thus the server was not actually getting restarted. I restarted the server
properly and I can at least access those pages. I'm getting some wonky output,
but I assume this will be sorted out.

Kyle



Shalin Shekhar Mangar wrote:
> 
> Are there any exceptions in the log file when you start Solr?
> 
> On Tue, Sep 23, 2008 at 9:31 PM, KyleMorrison <[EMAIL PROTECTED]> wrote:
> 
>>
>> I have searched the forum and the internet at large to find an answer to
>> my
>> simple problem, but have been unable. I am trying to get a simple
>> dataimport
>> to work, and have not been able to. I have Solr installed on an Apache
>> server on Unix. I am able to commit and search for files using the usual
>> Simple* tools. These files begin with ... and so on.
>>
>> On the data import, I have inserted
>>  > class="org.apache.solr.handler.dataimport.DataImportHandler">
>>
>>  > name="config">/R1/home/shoshana/kyle/Documents/data-config.xml
>>
>>  
>>
>> into solrconfig, and the data import looks like this:
>> 
>>> baseUrl="http://helix.ccb.sickkids.ca:8080/"; encoding="UTF-8" />
>>
>>> forEach="/iProClassDatabase/iProClassEntry/"
>> url="/R1/home/shoshana/kyle/Documents/exampleIproResult.xml">
>>>
>> xpath="/iProClassDatabase/iProClassEntry/GENERAL_INFORMATION/Protein_Name_and_ID/UniProtKB/UniProtKB_Accession">
>>>
>> xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Enzyme_Function/EC/Nomenclature"
>> />
>>>
>> xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Bibliography/References/PMID"
>> />
>>> xpath="/iProClassDatabase/iProClassEntry/SEQUENCE/Sequence_Length" />
>>
>>
>> 
>>
>> I apologize for the ugly xml. Nonetheless, when I go to
>> http://host:8080/solr/dataimport, I get a 404, and when I go to
>> http://host:8080/solr/admin/dataimport.jsp and try to "debug", nothing
>> happens. I have editted out the host name because I don't know if the
>> employer would be ok with it. Any guidance?
>>
>> Thanks in advance,
>> Kyle
>> --
>> View this message in context:
>> http://www.nabble.com/DataImport-troubleshooting-tp19630990p19630990.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/DataImport-troubleshooting-tp19630990p19635170.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Precision issue with sum() function

2008-09-23 Thread Grant Ingersoll
Try adding a debugQuery=true parameter on to see if that helps you  
decipher what is going on.


FWIW, the _val_ boost is a factor in scoring, but it isn't the only  
factor.   Perhaps you're seeing the document score factor in as well?


-Grant

On Sep 22, 2008, at 6:37 PM, water4u99 wrote:



Hi,

I have indexed a dynamic field in the <doc> as: <float name="stockPrice_f">28.00</float>.
It is visible in my query.
However, when I issue a query with a function: ...
_val_:"sum(stockPrice_f, 10.00)"&fl=*,score
I received the output of: 36.41818
There were no other computations.

Can any one help on why the answer is off.

Thank you.
--
View this message in context: 
http://www.nabble.com/Precision-issue-with-sum%28%29-function-tp19616287p19616287.html
Sent from the Solr - User mailing list archive at Nabble.com.






Re: SolrUpdateServlet Warning

2008-09-23 Thread Gregg
This turned out to be a fairly pedestrian bug on my part: I had "/update"
appended to the Solr base URL when I was adding docs via SolrJ.

Thanks for the help.

--Gregg

On Tue, Sep 23, 2008 at 12:42 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:

>
> On Sep 23, 2008, at 12:35 PM, Gregg wrote:
>
>  I've got a small configuration question. When posting docs via SolrJ, I
>> get
>> the following warning in the Solr logs:
>>
>> WARNING: The @Deprecated SolrUpdateServlet does not accept query
>> parameters:
>> wt=xml&version=2.2
>>  If you are using solrj, make sure to register a request handler to
>> /update
>> rather then use this servlet.
>>  Add: > >
>> to your solrconfig.xml
>>
>> I have an update handler configured in solrconfig.xml as follows:
>>
>> 
>>
>>
> are you sure?
>
> check http://localhost:8983/solr/admin/stats.jsp
> and search for XmlUpdateRequestHandler
> make sure it is registered to /update
>
>
>  What's the preferred solution? Should I comment out the SolrUpdateServlet
>> in
>> solr's web.xml? My Solr server is running at /solr, if that helps.
>>
>>
> that will definitely work, but it should not be necessary to crack open the
> .war file.
>
>
> ryan
>


Highlight Fragments

2008-09-23 Thread David Snelling
Ok, I'm very frustrated. I've tried every configuration and every parameter I
can, and I cannot get fragments to show up in the highlighting in Solr (no
fragments at the bottom, no <em> highlights in the text). I must be
missing something but I'm just not sure what it is.

/select/?qt=standard&q=crayon&hl=true&hl.fl=synopsis,shortdescription&hl.fragmenter=gap&hl.snippets=3&debugQuery=true

And I get a highlight segment, but no fragments or phrase highlighting.

My goal - if I'm doing this completely wrong - is to get Google-like
snippets of text around the query term (or at minimum to highlight the query
term itself).

Results:

synopsis
true
3
gap
crayon
synopsis
standard
true
2.1


...
.
..
-- 
"hic sunt dracones"


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Make sure the fields you're trying to highlight are stored in your schema
(e.g. <field name="synopsis" ... stored="true"/>)



David Snelling-2 wrote:
> 
> Ok, I'm very frustrated. I've tried every configuraiton I can and
> parameters
> and I cannot get fragments to show up in the highlighting in solr. (no
> fragments at the bottom or highlights  in the text. I must be
> missing something but I'm just not sure what it is.
> 
> /select/?qt=standard&q=crayon&hl=true&hl.fl=synopsis,shortdescription&hl.fragmenter=gap&hl.snippets=3&debugQuery=true
> 
> And I get highlight segment, but no fragments or phrase highlighting.
> 
> My goal - if I'm doing this completely wrong - is to get google like
> snippets of text around the query term (or at mimimum to highlight the
> query
> term itself).
> 
> Results:
> 
> synopsis
> true
> 3
> gap
> crayon
> synopsis
> standard
> true
> 2.1
> 
> 
> −
> 
> −
> ...
> .
> ..
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> "hic sunt dracones"
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19636915.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Highlight Fragments

2008-09-23 Thread David Snelling
This is the configuration for the two fields I have tried:

<field name="synopsis" type="string" indexed="true" stored="true"/>
<field name="shortdescription" type="string" indexed="true" stored="true" compressed="true"/>


On Tue, Sep 23, 2008 at 1:59 PM, wojtekpia <[EMAIL PROTECTED]> wrote:

>
> Make sure the fields you're trying to highlight are stored in your schema
> (e.g. )
>
>
>
> David Snelling-2 wrote:
> >
> > Ok, I'm very frustrated. I've tried every configuraiton I can and
> > parameters
> > and I cannot get fragments to show up in the highlighting in solr. (no
> > fragments at the bottom or highlights  in the text. I must be
> > missing something but I'm just not sure what it is.
> >
> >
> /select/?qt=standard&q=crayon&hl=true&hl.fl=synopsis,shortdescription&hl.fragmenter=gap&hl.snippets=3&debugQuery=true
> >
> > And I get highlight segment, but no fragments or phrase highlighting.
> >
> > My goal - if I'm doing this completely wrong - is to get google like
> > snippets of text around the query term (or at mimimum to highlight the
> > query
> > term itself).
> >
> > Results:
> > 
> > synopsis
> > true
> > 3
> > gap
> > crayon
> > synopsis
> > standard
> > true
> > 2.1
> > 
> > 
> > −
> > 
> > −
> > ...
> > .
> > ..
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> >
> > --
> > "hic sunt dracones"
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Highlight-Fragments-tp19636705p19636915.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"hic sunt dracones"


using BoostingTermQuery

2008-09-23 Thread Ensdorf Ken
Hi-

I'm new to Solr, and I'm trying to figure out the best way to configure it to 
use BoostingTermQuery in the scoring mechanism.  Do I need to create a custom 
query parser?  All I want is the default parser behavior except to get the 
custom term boost from the Payload data.  Thanks!

-Ken


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Try a query where you're sure to get something to highlight in one of your
highlight fields, for example:

/select/?qt=standard&q=synopsis:crayon&hl=true&hl.fl=synopsis,shortdescription



David Snelling-2 wrote:
> 
> This is the configuration for the two fields I have tried:
>
> <field name="synopsis" type="string" indexed="true" stored="true"/>
> <field name="shortdescription" type="string" indexed="true" stored="true"
> compressed="true"/>
> 
> 
> 
> On Tue, Sep 23, 2008 at 1:59 PM, wojtekpia <[EMAIL PROTECTED]> wrote:
> 
>>
>> Make sure the fields you're trying to highlight are stored in your schema
>> (e.g. )
>>
>>
>>
>> David Snelling-2 wrote:
>> >
>> > Ok, I'm very frustrated. I've tried every configuraiton I can and
>> > parameters
>> > and I cannot get fragments to show up in the highlighting in solr. (no
>> > fragments at the bottom or highlights  in the text. I must be
>> > missing something but I'm just not sure what it is.
>> >
>> >
>> /select/?qt=standard&q=crayon&hl=true&hl.fl=synopsis,shortdescription&hl.fragmenter=gap&hl.snippets=3&debugQuery=true
>> >
>> > And I get highlight segment, but no fragments or phrase highlighting.
>> >
>> > My goal - if I'm doing this completely wrong - is to get google like
>> > snippets of text around the query term (or at mimimum to highlight the
>> > query
>> > term itself).
>> >
>> > Results:
>> > 
>> > synopsis
>> > true
>> > 3
>> > gap
>> > crayon
>> > synopsis
>> > standard
>> > true
>> > 2.1
>> > 
>> > 
>> > −
>> > 
>> > −
>> > ...
>> > .
>> > ..
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> >
>> > --
>> > "hic sunt dracones"
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Highlight-Fragments-tp19636705p19636915.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> "hic sunt dracones"
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19637261.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: using BoostingTermQuery

2008-09-23 Thread Grant Ingersoll
At this point, it's roll your own.  I'd love to see the BTQ in Solr  
(and Spans!), but I wonder if it makes sense w/o better indexing side  
support.  I assume you are rolling your own Analyzer, right?  Spans  
and payloads are this huge untapped area for better search!



On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote:


Hi-

I'm new to Solr, and I'm trying to figure out the best way to  
configure it to use BoostingTermQuery in the scoring mechanism.  Do  
I need to create a custom query parser?  All I want is the default  
parser behavior except to get the custom term boost from the Payload  
data.  Thanks!


-Ken


--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









Re: using BoostingTermQuery

2008-09-23 Thread Otis Gospodnetic
It may be too early to say this but I'll say it anyway :)
There should be a juicy case study that includes payloads, BTQ, and Spans in 
the upcoming Lucene in Action 2.  I can't wait to see it, personally.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Grant Ingersoll <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, September 23, 2008 5:29:05 PM
> Subject: Re: using BoostingTermQuery
> 
> At this point, it's roll your own.  I'd love to see the BTQ in Solr  
> (and Spans!), but I wonder if it makes sense w/o better indexing side  
> support.  I assume you are rolling your own Analyzer, right?  Spans  
> and payloads are this huge untapped area for better search!
> 
> 
> On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote:
> 
> > Hi-
> >
> > I'm new to Solr, and I'm trying to figure out the best way to  
> > configure it to use BoostingTermQuery in the scoring mechanism.  Do  
> > I need to create a custom query parser?  All I want is the default  
> > parser behavior except to get the custom term boost from the Payload  
> > data.  Thanks!
> >
> > -Ken
> 
> --
> Grant Ingersoll
> http://www.lucidimagination.com
> 
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ



RE: using BoostingTermQuery

2008-09-23 Thread Ensdorf Ken

> At this point, it's roll your own.

That's where I'm getting bogged down - I'm confused by the various queryparser 
classes in lucene and solr and I'm not sure exactly what I need to override.  
Do you know of an example of something similar to what I'm doing that I could 
use as a reference?

> I'd love to see the BTQ in Solr
> (and Spans!), but I wonder if it makes sense w/o better indexing side
> support.  I assume you are rolling your own Analyzer, right?

Yup - I'm pretty sure I have that side figured out.  My input contains terms
marked up with a score (i.e. 'software?7'). I just needed to create a
TokenFilter that parses out the suffix and sets the Payload on the token.

>  Spans and payloads are this huge untapped area for better search!

Completely agree - we do a lot with keyword searching, and we use this type of 
thing in our existing search implementation.  Thanks for the quick response!

> On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote:
>
> > Hi-
> >
> > I'm new to Solr, and I'm trying to figure out the best way to
> > configure it to use BoostingTermQuery in the scoring mechanism.  Do
> > I need to create a custom query parser?  All I want is the default
> > parser behavior except to get the custom term boost from the Payload
> > data.  Thanks!
> >
> > -Ken
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
>
>
>
>
>
>
>


Re: Highlight Fragments

2008-09-23 Thread David Snelling
Hmmm. That doesn't actually return anything, which is odd, because I know
the term is in the field if I run a query without specifying the field.

http://qasearch.donorschoose.org/select/?q=synopsis:students

returns nothing

http://qasearch.donorschoose.org/select/?q=students

returns items with the query term in the synopsis field.

This may be what's causing issues, but I'm not sure why it's not working. We
use this live and run very complex queries, including facets, that work fine.

www.donorschoose.org



On Tue, Sep 23, 2008 at 2:20 PM, wojtekpia <[EMAIL PROTECTED]> wrote:

>
> Try a query where you're sure to get something to highlight in one of your
> highlight fields, for example:
>
>
> /select/?qt=standard&q=synopsis:crayon&hl=true&hl.fl=synopsis,shortdescription
>
>
>
> David Snelling-2 wrote:
> >
> > This is the configuration for the two fields I have tried on
> >
> > <field name="synopsis" type="string" indexed="true" stored="true"/>
> > <field name="shortdescription" type="string" indexed="true" stored="true"
> > compressed="true"/>
> >
> >
> >
> > On Tue, Sep 23, 2008 at 1:59 PM, wojtekpia <[EMAIL PROTECTED]> wrote:
> >
> >>
> >> Make sure the fields you're trying to highlight are stored in your
> schema
> >> (e.g. <field name="..." stored="true" .../>)
> >>
> >>
> >>
> >> David Snelling-2 wrote:
> >> >
> >> > Ok, I'm very frustrated. I've tried every configuration and every
> >> > parameter I can, and I cannot get fragments to show up in the
> >> > highlighting in Solr (no fragments at the bottom, no <em> highlights
> >> > in the text). I must be missing something but I'm just not sure what it is.
> >> >
> >> >
> >>
> /select/?qt=standard&q=crayon&hl=true&hl.fl=synopsis,shortdescription&hl.fragmenter=gap&hl.snippets=3&debugQuery=true
> >> >
> >> > And I get highlight segment, but no fragments or phrase highlighting.
> >> >
> >> > My goal - if I'm doing this completely wrong - is to get Google-like
> >> > snippets of text around the query term (or at minimum to highlight the
> >> > query term itself).
> >> >
> >> > Results:
> >> > [response XML mangled by the archive - the echoed params were
> >> > hl.fl=synopsis, hl=true, hl.snippets=3, hl.fragmenter=gap, q=crayon,
> >> > qt=standard, debugQuery=true, version=2.1, and the highlighting
> >> > section came back empty]
> >> >
> >> > --
> >> > "hic sunt dracones"
> >> >
> >> >
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/Highlight-Fragments-tp19636705p19636915.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >
> >
> > --
> > "hic sunt dracones"
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Highlight-Fragments-tp19636705p19637261.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
"hic sunt dracones"


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Your fields are all of string type. String fields aren't tokenized or
analyzed, so you have to match the entire text of those fields to actually
get a match. Try the following:

/select/?q=firstname:Kathryn&hl=on&hl.fl=firstname

The reason you're seeing results with just q=students, but not
q=synopsis:students is because you're copying the synopsis field into your
field named 'text', which is of type 'text', which does get tokenized and
analyzed, and 'text' is your default search field.

The reason you don't see any highlights with the following query is because
your 'text' field isn't stored:

/select/?q=text:students&hl=on&hl.fl=text
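
In schema.xml terms, the fix for that last case is to store the catch-all
field - roughly this sketch against the stock example schema (stored copies
cost disk space, and a reindex is required):

<field name="text" type="text" indexed="true" stored="true" multiValued="true"/>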





David Snelling-2 wrote:
> 
> Hmmm. That doesn't actually return anything, which is odd, because I know
> the term is in the field if I run a query without specifying the field.
> 
> http://qasearch.donorschoose.org/select/?q=synopsis:students
> 
> returns nothing
> 
> http://qasearch.donorschoose.org/select/?q=students
> 
> returns items with the query term in the synopsis field.
> 
> This may be what's causing issues, but I'm not sure why it's not working. We
> use this live and run very complex queries, including facets, that work fine.
> 
> www.donorschoose.org
> 
> -- 
> "hic sunt dracones"
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19637801.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Highlight Fragments

2008-09-23 Thread David Snelling
Ok, thanks, that makes a lot of sense now.
So, how should I be storing the text for the synopsis or shortdescription
fields so that it gets tokenized? Should the type be text instead of string?


Thank you very much for the help by the way.


On Tue, Sep 23, 2008 at 2:49 PM, wojtekpia <[EMAIL PROTECTED]> wrote:

>
> Your fields are all of string type. String fields aren't tokenized or
> analyzed, so you have to match the entire text of those fields to actually
> get a match. Try the following:
>
> /select/?q=firstname:Kathryn&hl=on&hl.fl=firstname
>
> The reason you're seeing results with just q=students, but not
> q=synopsis:students is because you're copying the synopsis field into your
> field named 'text', which is of type 'text', which does get tokenized and
> analyzed, and 'text' is your default search field.
>
> The reason you don't see any highlights with the following query is because
> your 'text' field isn't stored.
>
> /select/?q=text:students&hl=on&hl.fl=text
>
>


-- 
"hic sunt dracones"


Re: Highlight Fragments

2008-09-23 Thread wojtekpia

Yes, you can use text (or some custom derivative of it) for your fields. 
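
Concretely, the two field definitions from earlier in the thread would become
something like this (a sketch; existing documents must be reindexed after the
change):

<field name="synopsis" type="text" indexed="true" stored="true"/>
<field name="shortdescription" type="text" indexed="true" stored="true"
compressed="true"/>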


David Snelling-2 wrote:
> 
> Ok, thanks, that makes a lot of sense now.
> So, how should I be storing the text for the synopsis or shortdescription
> fields so that it gets tokenized? Should the type be text instead of string?
> 
> 
> Thank you very much for the help by the way.
> 
> -- 
> "hic sunt dracones"
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Highlight-Fragments-tp19636705p19638296.html
Sent from the Solr - User mailing list archive at Nabble.com.



Snappuller taking up CPU on master

2008-09-23 Thread rahul_k123

Hi,

I am using snappuller to sync my slave with the master. I am not using the
rsync daemon; I am running rsync over a remote shell.

When I serve requests from the master while snappuller is running (after
optimization the total index is around 4 GB, so it transfers the whole
index), performance is very bad, actually causing timeouts.



Any ideas why this happens?


Any suggestions will help.


Thanks.
-- 
View this message in context: 
http://www.nabble.com/Snappuller-taking-up-CPU-on-master-tp19638474p19638474.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: using BoostingTermQuery

2008-09-23 Thread Grant Ingersoll


On Sep 23, 2008, at 5:39 PM, Ensdorf Ken wrote:

>> At this point, it's roll your own.
>
> That's where I'm getting bogged down - I'm confused by the various
> query parser classes in Lucene and Solr and I'm not sure exactly what
> I need to override.  Do you know of an example of something similar
> to what I'm doing that I could use as a reference?

I'm no QueryParser expert, but I would probably start with the default
query parser in Solr (LuceneQParser), and then progress a bit to the
DisMax one.  I'd ask specific questions based on what you see there.
If you get far enough along, you may consider asking for help on the
java-user list as well.
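
To make that concrete, here is a hypothetical sketch against the Solr 1.3
QParser API - the class name is invented, the term splitting is naive, and you
still need a Similarity whose scorePayload() decodes your payload bytes:

import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.payloads.BoostingTermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

public class PayloadQParserPlugin extends QParserPlugin {
    public void init(NamedList args) {}

    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
        return new QParser(qstr, localParams, params, req) {
            public Query parse() throws ParseException {
                BooleanQuery bq = new BooleanQuery();
                // one BoostingTermQuery per whitespace-separated term;
                // "text" stands in for whatever field carries the payloads
                for (String t : getString().split("\\s+")) {
                    bq.add(new BoostingTermQuery(new Term("text", t)),
                           BooleanClause.Occur.SHOULD);
                }
                return bq;
            }
        };
    }
}

It would be registered in solrconfig.xml with <queryParser name="payload"
class="PayloadQParserPlugin"/> and selected per request with defType=payload.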






>> I'd love to see the BTQ in Solr (and Spans!), but I wonder if it
>> makes sense w/o better indexing side support.  I assume you are
>> rolling your own Analyzer, right?
>
> Yup - I'm pretty sure I have that side figured out.  My input
> contains terms marked up with a score (e.g. 'software?7'); I just
> needed to create a TokenFilter that parses out the suffix and sets
> the Payload on the token.

Cool.  Patch?

>> Spans and payloads are this huge untapped area for better search!
>
> Completely agree - we do a lot with keyword searching, and we use
> this type of thing in our existing search implementation.  Thanks
> for the quick response!










--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









Re: Snappuller taking up CPU on master

2008-09-23 Thread Otis Gospodnetic
Hi,

Can't tell with certainty without looking, but my guess would be slow disk, 
high IO, and a large number of processes waiting for IO (run vmstat and look at 
the "wa" column).

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: rahul_k123 <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, September 23, 2008 6:56:48 PM
> Subject: Snappuller taking up CPU on master
> 
> 
> Hi,
> 
> I am using snappuller to sync my slave with the master. I am not using the
> rsync daemon; I am running rsync over a remote shell.
> 
> When I serve requests from the master while snappuller is running (after
> optimization the total index is around 4 GB, so it transfers the whole
> index), performance is very bad, actually causing timeouts.
> 
> 
> 
> Any ideas why this happens?
> 
> 
> Any suggestions will help.
> 
> 
> Thanks.
> -- 
> View this message in context: 
> http://www.nabble.com/Snappuller-taking-up-CPU-on-master-tp19638474p19638474.html
> Sent from the Solr - User mailing list archive at Nabble.com.



solr score

2008-09-23 Thread sanraj25

hi,
  How do I give more weight to the most frequently searched words in Solr?

What functionality does the Apache Solr module provide for this?
I have a list of the most frequently searched words on my site, and I need to
highlight those words. From the net I found out that the 'score' is used for
this purpose. Is that true?
Does anybody know about this?
Please help me.

with Regards,
Santhanaraj R

-- 
View this message in context: 
http://www.nabble.com/solr-score-tp19642046p19642046.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Snappuller taking up CPU on master

2008-09-23 Thread rahul_k123

Hi,

Thanks for the reply.

I am not using Solr for indexing and serving search requests; I am using
only the scripts for replication.

Yes, it looks like I/O, but my question is how to handle this problem - is
there an optimal way to achieve this?


Thanks.




Otis Gospodnetic wrote:
> 
> Hi,
> 
> Can't tell with certainty without looking, but my guess would be slow
> disk, high IO, and a large number of processes waiting for IO (run vmstat
> and look at the "wa" column).
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Snappuller-taking-up-CPU-on-master-tp19638474p19642053.html
Sent from the Solr - User mailing list archive at Nabble.com.



Error running query inside data-config.xml

2008-09-23 Thread con

Hi Guys
 I am trying to take values by connecting two tables. My data-config.xml
looks like:

[the data-config.xml was mangled by the archive; it defined a parent entity
with a nested child entity pulling fields - including a SALARY column - from
a second table]

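For reference, a two-table data-config of the shape described usually looks
like the following sketch (the driver, URL, and every table/column name except
SALARY are invented):

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/hr" user="solr" password="secret"/>
  <document>
    <entity name="employee" query="select EMP_ID, NAME from EMPLOYEE">
      <field column="EMP_ID" name="id"/>
      <field column="NAME" name="name"/>
      <entity name="payroll"
              query="select SALARY from PAYROLL where EMP_ID='${employee.EMP_ID}'">
        <field column="SALARY" name="salary"/>
      </entity>
    </entity>
  </document>
</dataConfig>
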
If I try to index the values from a single table, it works fine. Is there
anything wrong in the above configuration?

Thanks in advance
-- 
View this message in context: 
http://www.nabble.com/Error-running-query-inside-data-config.xml-tp19642540p19642540.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Error running query inside data-config.xml

2008-09-23 Thread Shalin Shekhar Mangar
Since you have not given any information about your schema, we cannot help
with the queries.

What do you mean by "error running query"? Do you get an exception, or no
values for the inner entity's fields?

On Wed, Sep 24, 2008 at 11:34 AM, con <[EMAIL PROTECTED]> wrote:

>
> Hi Guys
>  I am trying to take values by connecting two tables. My data-config.xml
> looks like:
>
> [the data-config.xml was mangled by the archive; only a field for the
> SALARY column survives]
>
>
>
> If I try to index the values from a single table, it is working fine. Is
> there anything wrong in the above configuration:
>
> Thanks in advance
> --
> View this message in context:
> http://www.nabble.com/Error-running-query-inside-data-config.xml-tp19642540p19642540.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Regards,
Shalin Shekhar Mangar.


help required: how to design a large scale solr system

2008-09-23 Thread Ben Shlomo, Yatir
Hi!

I am already using Solr 1.2 and am happy with it.

In a new project with a very tight deadline (10 development days from
today) I need to set up a more ambitious system in terms of scale.
Here is the spec:

* I need to index about 60,000,000 documents.

* Each document has 11 textual fields to be indexed & stored and 4 more
fields to be stored only.

* Most fields are short (2-14 characters); however, 2 indexed fields can
be up to 1KB and another stored field is up to 1KB.

* On average every document is about 0.5 KB to be stored and 0.4 KB to be
indexed.

* The SLA for data freshness is a full nightly re-index (I cannot obtain
incremental update/delete lists of the modified documents).

* The SLA for query time is 5 seconds.

* The number of expected queries is 2-3 queries per second.

* The queries are simple: a combination of Boolean operations and name
searches (no fancy fuzzy searches and Levenshtein distances, no faceting,
etc.).

* I have a 64-bit Dell 2950 4-CPU machine (2 dual cores) with RAID 10,
200 GB HD space, and 8 GB of RAM.

* The documents are not given to me explicitly - I am given raw documents
in RAM, one by one, from which I create my document in RAM; then I can
either HTTP-post it to index it directly or append it to a TSV file for
later indexing.

* Each document has a unique ID.

I have a few directions I am thinking about.

The simple approach

* Have one Solr instance index the entire document set (from files). I am
afraid this will take too much time.

Direction 1

* Create TSV files from all the documents - this will take around 3-4 hours.

* Partition the documents into several subsets (how many should I choose?).

* Run multiple Solr instances on the same machine.

* Let each Solr instance concurrently index its subset.

* At the end, merge all the indices using the IndexMergeTool (how much time
will it take?) - see the merge sketch below.
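
For the merge step, a minimal sketch of what IndexMergeTool does, using the
Lucene 2.x API (the paths are placeholders, and all partitions must use the
same schema and analyzers):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MergePartitions {
    public static void main(String[] args) throws Exception {
        // args[0] is the target index; the remaining args are the partitions
        IndexWriter writer = new IndexWriter(
                FSDirectory.getDirectory(args[0]), new StandardAnalyzer(), true);
        Directory[] parts = new Directory[args.length - 1];
        for (int i = 1; i < args.length; i++) {
            parts[i - 1] = FSDirectory.getDirectory(args[i]);
        }
        writer.addIndexes(parts);  // merges (and optimizes) the partitions
        writer.close();
    }
}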

 

Direction 2

* Like the previous one, but instead of using the IndexMergeTool, use
distributed search with shards (upgrading to Solr 1.3).

Directions 3, 4

* Like the previous directions, only avoid using TSV files entirely and
directly index the documents from RAM - see the SolrJ sketch below.
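
A minimal sketch of that with the Solr 1.3 SolrJ client (the URL and field
names are placeholders for the real raw-document feed):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class DirectIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        for (int i = 0; i < 1000; i++) {  // stand-in for the raw-document stream
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("title", "document " + i);
            server.add(doc);  // batching via server.add(Collection) is faster
        }
        server.commit();
    }
}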

Questions:

* Which direction do you recommend in order to meet the SLAs in the
fastest way?

* Since I have RAID on the machine, can I gain performance by running
multiple Solr instances on the same machine, or will only multiple
machines help me?

* What's the minimal number of machines I should require (I might get
more, weaker machines)?

* How many concurrent indexers are recommended?

* Do you agree that the bottleneck is the indexing time?

Any help is appreciated.

Thanks in advance,

yatir

 



Solr Using

2008-09-23 Thread Dinesh Gupta

Which version of Tomcat is required?

I installed JBoss 4.0.2, which ships with Tomcat 5.5.9.

The JSP pages are not compiling; it's giving a syntax error.

I can't move from JBoss 4.0.2.

Please help.

Regards,
Dinesh Gupta

> Date: Tue, 23 Sep 2008 19:36:22 +0530
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene index
> 
> Hi Dinesh,
> 
> This seems straightforward for Solr. You can use the embedded jetty server
> for a start. Look at the tutorial on how to get started.
> 
> You'll need to modify the schema.xml to define all the fields that you want
> to index. The wiki page at http://wiki.apache.org/solr/SchemaXml is a good
> start on how to do that. Each field in your code will have a counterpart in
> the schema.xml with appropriate flags (indexed/stored/tokenized etc.)
> 
> Once that is complete, try to modify the DataImportHandler's hsqldb example
> for your mysql database.
> 
> On Tue, Sep 23, 2008 at 7:01 PM, Dinesh Gupta <[EMAIL PROTECTED]>wrote:
> 
> >
> > Hi Shalin Shekhar,
> >
> > Let me explain my issue.
> >
> > I have some tables in my database like
> >
> > Product
> > Category
> > Catalogue
> > Keywords
> > Seller
> > Brand
> > Country_city_group
> > etc.
> > I have a class that represent  product document as
> >
> > Document doc = new Document();
> >// Keywords which can be used directly for search
> >doc.add(new Field("id",(String)
> > data.get("PRN"),Field.Store.YES,Field.Index.UN_TOKENIZED));
> >
> >// Sorting fields]
> >String priceString = (String) data.get("Price");
> >if (priceString == null)
> >priceString = "0";
> >long price = 0;
> >try {
> >price = (long) Double.parseDouble(priceString);
> >} catch (Exception e) {
> >
> >}
> >
> >doc.add(new
> > Field("prc",NumberUtils.pad(price),Field.Store.YES,Field.Index.UN_TOKENIZED));
> >Date createDate = (Date) data.get("CreateDate");
> >if (createDate == null) createDate = new Date();
> >
> >doc.add(new Field("cdt",String.valueOf(createDate.getTime()),
> > Field.Store.NO,Field.Index.UN_TOKENIZED));
> >
> >Date modiDate = (Date) data.get("ModiDate");
> >if (modiDate == null) modiDate = new Date();
> >
> >doc.add(new Field("mdt",String.valueOf(modiDate.getTime()),
> > Field.Store.NO,Field.Index.UN_TOKENIZED));
> >//doc.add(Field.UnStored("cdt",
> > String.valueOf(createDate.getTime())));
> >
> >// Additional fields for search
> >doc.add(new Field("bnm",(String)
> > data.get("Brand"),Field.Store.YES,Field.Index.TOKENIZED));
> >doc.add(new Field("bnm1",(String) data.get("Brand1"),Field.Store.NO
> > ,Field.Index.UN_TOKENIZED));
> >//doc.add(Field.Text("bnm", (String) data.get("Brand")));
> > //Tokenized and Unstored
> >doc.add(new Field("bid",(String)
> > data.get("BrandId"),Field.Store.YES,Field.Index.UN_TOKENIZED));
> >//doc.add(Field.Keyword("bid", (String) data.get("BrandId"))); //
> > untokenized &
> >doc.add(new Field("grp",(String) data.get("Group"),Field.Store.NO
> > ,Field.Index.TOKENIZED));
> >//doc.add(Field.Text("grp", (String) data.get("Group")));
> >doc.add(new Field("gid",(String)
> > data.get("GroupId"),Field.Store.YES,Field.Index.UN_TOKENIZED));
> >//doc.add(Field.Keyword("gid", (String) data.get("GroupId"))); //New
> >doc.add(new Field("snm",(String)
> > data.get("Seller"),Field.Store.YES,Field.Index.UN_TOKENIZED));
> >//doc.add(Field.Text("snm", (String) data.get("Seller")));
> >doc.add(new Field("sid",(String)
> > data.get("SellerId"),Field.Store.YES,Field.Index.UN_TOKENIZED));
> >//doc.add(Field.Keyword("sid", (String) data.get("SellerId"))); //
> > New
> >doc.add(new Field("ttl",(String)
> > data.get("Title"),Field.Store.YES,Field.Index.TOKENIZED));
> >//doc.add(Field.UnStored("ttl", (String) data.get("Title"), true));
> >
> >String title1 = (String) data.get("Title");
> >title1 = removeSpaces(title1);
> >doc.add(new Field("ttl1",title1,Field.Store.NO
> > ,Field.Index.UN_TOKENIZED));
> >
> >doc.add(new Field("ttl2",title1,Field.Store.NO
> > ,Field.Index.TOKENIZED));
> >//doc.add(Field.UnStored("ttl", (String) data.get("Title"), true));
> >
> >// ColumnC - Product Sequence
> >String productSeq = (String) data.get("ProductSeq");
> >if (productSeq == null) productSeq = "";
> >doc.add(new Field("seq",productSeq,Field.Store.NO
> > ,Field.Index.UN_TOKENIZED));
> >//doc.add(Field.Keyword("seq", productSeq));
> >
> >// New Added
> >doc.add(new Field("sdc",(String) data.get("SpecialDescription"),
> > Field.Store.NO,Field.Index.TOKENIZED));
> >//doc.add(Field.UnStored("sdc", (String)
> > data.get("SpecialDescription"),true));
> >doc.add(new Field("kdc", (String) data.get("KeywordDescription"),
> > Field.S