Re: data-import run by cron job without waiting for the end of the previous one
When I try without the adaptive parameter I get an OOME: HTTP Status 500 - Java heap space java.lang.OutOfMemoryError: Java heap space

Shalin Shekhar Mangar wrote:
>
> On Mon, Sep 22, 2008 at 9:19 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>> There is something weird: I have planned a cron job every 5 minutes which
>> hits the delta-import URL, and it works fine.
>> The point is: it does not look like it checks every record for updating
>> or creating new ones, because every 5 minutes the delta-import is started
>> again (as if the previous delta-import were not done).
>
> That should not be happening. Why do you feel it is starting again without
> waiting for the previous import to finish?
>
>> idle
>> 0:2:23.885
>> 1
>> 1863146
>> 0
>> 0
>> 2008-09-22 17:40:01
>> 2008-09-22 17:40:01
>
> I'm confused by this output. How frequently do you update your database?
> How many rows are modified in the database in that 5 minute period?
>
> What is the type of your last-modified column in the database which you
> use for identifying the deltas?
>
>> And I wonder if it comes from my data-config file parameters,
>> which include "adaptive":
>>
>> <dataSource driver="com.mysql.jdbc.Driver"
>>             url="jdbc:mysql://master.books.com/books"
>>             user="solr"
>>             password="tah1Axie"
>>             batchSize="-1"
>>             responseBuffering="adaptive"/>
>>
>> Thanks,
>
> The part on responseBuffering is not applicable to MySQL, so you can
> remove it.
>
> --
> Regards,
> Shalin Shekhar Mangar.
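For context, responseBuffering="adaptive" is a Microsoft SQL Server JDBC driver property, which is why it has no effect on MySQL. A minimal sketch of the MySQL-specific streaming setup for DataImportHandler (the connection details here are placeholders): batchSize="-1" makes DIH set the statement fetch size to Integer.MIN_VALUE, the MySQL Connector/J idiom for streaming rows instead of buffering the whole result set in memory, which is what typically produces the heap-space error above:

    <dataSource driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://dbhost/books"
                user="solr"
                password="..."
                batchSize="-1"/>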
RE: Solr Using
Hi Otis,

Currently I am creating indexes from a standalone Java program. I prepare the data with queries and build the documents to index; the function below is what we use today. We have a large number of products and we want to use this at production level. Please provide me a sample or tutorial.

    /**
     * @param pbi
     * @throws DAOException
     */
    protected Document prepareLuceneDocument(Ismpbi pbi) throws DAOException {
        long start = System.currentTimeMillis();
        Long prn = pbi.getPbirfnum();
        if (!isValidProduct(pbi)) {
            if (logger.isDebugEnabled())
                logger.debug("Product Discarded " + prn + " not a valid product.");
            discarded++;
            return null;
        }

        IsmpptDAO pptDao = new IsmpptDAO();
        Set categoryList = new HashSet(pptDao.findByProductCategories(prn));
        Iterator iter = categoryList.iterator();
        Set directCategories = new HashSet();
        while (iter.hasNext()) {
            Object[] obj = (Object[]) iter.next();
            Long categoryId = (Long) obj[0];
            String categoryName = (String) obj[1];
            directCategories.add(new CategoryRecord(categoryId, categoryName));
        }

        if (directCategories.size() == 0) {
            if (logger.isDebugEnabled())
                logger.debug("Product Discarded " + prn + " not placed in any category directly [ismppt].");
            discarded++;
            return null;
        }

        // Get all the categories for the direct categories - contains CategoryRecord objects
        Set categories = getCategories(directCategories, prn);
        Set categoryIds = new HashSet(); // All category ids
        Iterator it = categories.iterator();
        while (it.hasNext()) {
            CategoryRecord rec = (CategoryRecord) it.next();
            categoryIds.add(rec.getId());
        }

        // All categories so far TOTAL (direct + parent categories)
        if (categoryIds.size() == 0) {
            if (logger.isDebugEnabled())
                logger.debug("Product Discarded " + prn + " direct categories are not placed under other categories.");
            discarded++;
            return null;
        }

        Set catalogues = getCatalogues(prn);
        if (catalogues.size() != 0) {
            if (logger.isDebugEnabled())
                logger.debug("[" + prn + "]-> Total Direct PCC Catalogues [" + collectionToStringNew(catalogues) + "]");
        }

        getCatalogueWithAllChildInCCR(prn, categoryIds, catalogues);
        if (catalogues.size() == 0) {
            if (logger.isDebugEnabled())
                logger.debug("Product Discarded " + prn + " not attached with any catalogue");
            discarded++;
            return null;
        }

        String productDirectCategories = collectionToString(directCategories);
        String productAllCategories = collectionToString(categories);
        String productAllCatalogues = collectionToStringNew(catalogues);
        String categoryNames = getCategoryNames(categories);

        if (logger.isInfoEnabled())
            logger.info("TO Document Product " + pbi.getPbirfnum()
                    + " Dir Categories " + productDirectCategories
                    + " All Categories " + productAllCategories
                    + " And Catalogues " + productAllCatalogues);

        directCategories = null;
        categories = null;
        catalogues = null;

        Document document = new ProductDocument().toDocument(pbi,
                productAllCategories, productAllCatalogues,
                productDirectCategories, categoryNames);

        categoryNames = null;
        pbi = null;
        productAllCatalogues = null;
        productAllCategories = null;
        productDirectCategories = null;

        long time = System.currentTimeMillis() - start;
        if (time > longestIndexTime) {
            longestIndexTime = time;
        }
        return document;
    }

> Date: Mon, 22 Sep 2008 22:10:16 -0700
> From: [EMAIL PROTECTED]
> Subject: Re: Solr Using
> To: solr-user@lucene.apache.org
>
> Dinesh,
>
> Please have a look at the Solr tutorial first.
> Then have a look at the new DataImportHandler - there is a very detailed
> page about it on the Wiki.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Dinesh Gupta <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, September 23, 2008 1:02:34 AM
> > Subject: Solr Using
> >
> > Hi All,
> >
> > I am new to Solr. I have been using Lucene for the last 2 years.
> > We create Lucene indexes from our database.
> > Please help me migrate to Solr. How can I achieve this?
> > If anyone has an idea, please help.
> >
> > Thanks in advance.
> >
> > Regards,
> > Dinesh Gupta
Re: Searching for future or "null" dates
On 23.09.2008 00:30 Chris Hostetter wrote:
> : Here is what I was able to get working with your help.
> :
> : (productId:(102685804)) AND liveDate:[* TO NOW] AND ((endDate:[NOW TO *]) OR
> : ((*:* -endDate:[* TO *])))
> :
> : the *:* is what I was missing.
>
> Please, PLEASE ... do yourself a favor and stop using "AND" and "OR" ...
> food will taste better, flowers will smell fresher, and the world will be
> a happy shiny place...
>
> +productId:102685804 +liveDate:[* TO NOW] +(endDate:[NOW TO *] (*:*
> -endDate:[* TO *]))

I would also like to follow your advice but don't know how to do it with
defaultOperator="AND". What I am missing is the equivalent to OR:

AND: +
NOT: -
OR: ???

I didn't find anything on the Solr or Lucene query syntax pages. If there is
such an equivalent, then I guess the query would become:

productId:102685804 liveDate:[* TO NOW] (endDate:[NOW TO *] (*:* -endDate:[* TO *]))

I switched to the AND-default because that is the default in my web frontend,
so I don't have to change logic. What should I do in this situation? Go back
to the OR-default? It is not so much this example I am after, but I have a
syntax translator in my application that must be able to handle similar
expressions, and I want to keep it simple and still have tasty food ;-)

-Michael
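A note on the question above (an observation about Lucene's query parser, not an answer given in this thread): the default operator applies only between clauses that carry no explicit operator, so the explicit OR keyword should still work under defaultOperator="AND". There is no +/- style prefix for an optional ("should") clause, so under an AND default the query would keep one explicit OR:

    productId:102685804 liveDate:[* TO NOW] (endDate:[NOW TO *] OR (*:* -endDate:[* TO *]))

Here the three top-level clauses are all required by the AND default, while the parenthesized group contains two optional alternatives joined by the explicit OR.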
Re: Solr Using
Hi Dinesh,

Your code is hardly useful to us since we don't know what you are trying to achieve or what all those DAO classes do.

Look at the Solr tutorial first -- http://lucene.apache.org/solr/

Use the SolrJ client for communicating with the Solr server -- http://wiki.apache.org/solr/Solrj

Also take a look at DataImportHandler, which can help avoid all this code -- http://wiki.apache.org/solr/DataImportHandler

If you face any problem, first search this mailing list through markmail.org or nabble.com to find previous posts related to your issue. If you don't find anything helpful, post specific questions here which we will help answer.

On Tue, Sep 23, 2008 at 3:56 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
>
> Hi Otis,
>
> Currently I am creating indexes from a standalone Java program.
> [prepareLuceneDocument() code quoted in full in the previous message]
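As a starting point, a minimal SolrJ sketch of what the indexing side can shrink to once the fields are defined in schema.xml (URL and field names are placeholders, assuming a running Solr 1.3 server with SolrJ on the classpath):

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class IndexOneProduct {
        public static void main(String[] args) throws Exception {
            // Point at the running Solr instance (placeholder URL)
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "12345");                    // unique key
            doc.addField("ttl", "Example product title");   // hypothetical field names
            doc.addField("bnm", "Example brand");

            server.add(doc);    // send the document
            server.commit();    // make it searchable
        }
    }

All the field-type decisions (tokenized or not, stored or not) move out of the Java code and into schema.xml.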
Lucene index
Hi,

Currently we are using the Lucene API to create an index. It creates the index in a directory with three files, like xxx.cfs, deletable & segments.

If I create Lucene indexes from Solr, will these files be created or not?

Please give me an example with a MySQL database instead of hsqldb.

Regards,
Dinesh
EmbeddedSolrServer and the MultiCore functionality
Hello everyone,

I'm new to Solr (have been using Lucene for a few years now). We are looking into Solr and have heard many good things about the project :) I have a few questions regarding the EmbeddedSolrServer in Solrj and the MultiCore features... I've tried to find answers to this in the archives but have not succeeded.

The thing is, I want to be able to use the embedded server to access multiple cores on one machine, and I would like to at least have the possibility to access the Lucene indexes without HTTP. In particular I'm wondering if it is possible to do the "shards" (distributed search) approach using the embedded server, without using HTTP requests.

Let's say I register 2 cores to a container and init my embedded server like this:

CoreContainer container = new CoreContainer();
container.register("core1", core1, false);
container.register("core2", core2, false);
server = new EmbeddedSolrServer(container, "core1");

Then queries performed on my server will return results from core1... and if I do ... = new EmbeddedSolrServer(container, "core2"), the results will come from core2.

If I have Solr up and running and do something like this:

query.set("shards", "localhost:8080/solr/core0,localhost:8080/solr/core1");

I will get the results from both cores, obviously... But is there a way to do this without using shards and accessing the cores through HTTP? I presume it would/should be possible to do the same thing directly against the cores, but my question is really: has this been implemented already / is it possible?

Thanks in advance for any replies!

Best regards,
Aleksander

--
Aleksander M. Stensby
Senior Software Developer
Integrasco A/S
+47 41 22 82 72
[EMAIL PROTECTED]
Re: Lucene index
On Tue, Sep 23, 2008 at 5:33 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
>
> Hi,
> Currently we are using the Lucene API to create an index.
> It creates the index in a directory with three files, like
> xxx.cfs, deletable & segments.
> If I create Lucene indexes from Solr, will these files be created or not?

The Lucene index will be created in the solr_home, inside the data/index directory.

> Please give me an example with a MySQL database instead of hsqldb.

If you are talking about DataImportHandler, then there is no difference in the configuration except for using the MySQL driver instead of hsqldb.

--
Regards,
Shalin Shekhar Mangar.
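For illustration, a minimal MySQL data-config.xml could look like this (table, columns, and connection details are hypothetical; relative to the hsqldb example, only the driver and url change):

    <dataConfig>
      <dataSource driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://localhost/mydb"
                  user="solr" password="secret"/>
      <document>
        <entity name="product" query="SELECT id, title, brand FROM product">
          <field column="id"    name="id"/>
          <field column="title" name="ttl"/>
          <field column="brand" name="bnm"/>
        </entity>
      </document>
    </dataConfig>

The entity/field mapping works exactly the same way regardless of which JDBC driver is used.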
RE: Lucene index
Hi Shalin Shekhar,

Let me explain my issue. I have some tables in my database, such as:

Product, Category, Catalogue, Keywords, Seller, Brand, Country_city_group, etc.

I have a class that builds the product document like this:

    Document doc = new Document();
    // Keywords which can be used directly for search
    doc.add(new Field("id", (String) data.get("PRN"), Field.Store.YES, Field.Index.UN_TOKENIZED));

    // Sorting fields
    String priceString = (String) data.get("Price");
    if (priceString == null)
        priceString = "0";
    long price = 0;
    try {
        price = (long) Double.parseDouble(priceString);
    } catch (Exception e) {
    }
    doc.add(new Field("prc", NumberUtils.pad(price), Field.Store.YES, Field.Index.UN_TOKENIZED));

    Date createDate = (Date) data.get("CreateDate");
    if (createDate == null) createDate = new Date();
    doc.add(new Field("cdt", String.valueOf(createDate.getTime()), Field.Store.NO, Field.Index.UN_TOKENIZED));

    Date modiDate = (Date) data.get("ModiDate");
    if (modiDate == null) modiDate = new Date();
    doc.add(new Field("mdt", String.valueOf(modiDate.getTime()), Field.Store.NO, Field.Index.UN_TOKENIZED));

    // Additional fields for search
    doc.add(new Field("bnm", (String) data.get("Brand"), Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("bnm1", (String) data.get("Brand1"), Field.Store.NO, Field.Index.UN_TOKENIZED));
    doc.add(new Field("bid", (String) data.get("BrandId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("grp", (String) data.get("Group"), Field.Store.NO, Field.Index.TOKENIZED));
    doc.add(new Field("gid", (String) data.get("GroupId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("snm", (String) data.get("Seller"), Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("sid", (String) data.get("SellerId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("ttl", (String) data.get("Title"), Field.Store.YES, Field.Index.TOKENIZED));

    String title1 = (String) data.get("Title");
    title1 = removeSpaces(title1);
    doc.add(new Field("ttl1", title1, Field.Store.NO, Field.Index.UN_TOKENIZED));
    doc.add(new Field("ttl2", title1, Field.Store.NO, Field.Index.TOKENIZED));

    // ColumnC - Product Sequence
    String productSeq = (String) data.get("ProductSeq");
    if (productSeq == null) productSeq = "";
    doc.add(new Field("seq", productSeq, Field.Store.NO, Field.Index.UN_TOKENIZED));

    // New Added
    doc.add(new Field("sdc", (String) data.get("SpecialDescription"), Field.Store.NO, Field.Index.TOKENIZED));
    doc.add(new Field("kdc", (String) data.get("KeywordDescription"), Field.Store.NO, Field.Index.TOKENIZED));

    // ColumnB - Product Category and parent categories
    doc.add(new Field("cts", (String) data.get("Categories"), Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field("dct", (String) data.get("DirectCategories"), Field.Store.YES, Field.Index.TOKENIZED));

    // ColumnC - Product Catalogues
    doc.add(new Field("clg", (String) data.get("Catalogues"), Field.Store.YES, Field.Index.TOKENIZED));

    // Product Delivery Cities
    doc.add(new Field("dcty", (String) data.get("DelCities"), Field.Store.YES, Field.Index.TOKENIZED));

    // Additional Information - Top Selling Count
    String sellerCount = ((Long) data.get("SellCount")).toString();
    doc.add(new Field("bsc", sellerCount, Field.Store.YES, Field.Index.TOKENIZED));

I am preparing the data by querying the database. Please tell me how I can migrate my logic to Solr.
Optimise while uploading?
Hi, Probably a stupid question with the obvious answer, but if I am running a Solr master and accepting updates, do I have to stop the updates when I start the optimise of the index? Or will optimise just take the latest snapshot and work on that independently of the incoming updates? Really enjoying Solr, BTW. Nice job! Thanks Geoff
Re: snapshot.yyyymmdd ... can't found them?
Yes, indeed it was a problem with the path... thanks a lot!

Just didn't get this part: "If you turn up your logging to 'FINE'" -- what does that mean?

Huge thanks for your answer,

hossman wrote:
>
> : And I did change my config file :
commit
Hi,

I don't know why, when I start a commit manually, it doesn't fire the snapshooter. I did it manually because no snapshot was created, and when I run snapshooter by hand it works.

So my autocommit is activated (I think):

<autoCommit>
  <maxDocs>1</maxDocs>
  <maxTime>1000</maxTime>
</autoCommit>

My snapshooter too:

<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">./data/solr/book/logs/snapshooter</str>
  <str name="dir">data/solr/book/bin</str>
  <bool name="wait">true</bool>
  <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
  <arr name="env"> <str>MYVAR=val1</str> </arr>
</listener>

Updates are done on the server:

delta-import
idle
1513
574
0
2008-09-23 16:00:01
2008-09-23 16:00:01
2008-09-23 16:00:37
2008-09-23 16:00:37
216
Indexing completed. Added/Updated: 216 documents. Deleted 0 documents.
2008-09-23 16:01:29
0:1:28.667

And everything is at the right place, I think; my paths are good...
Re: Lucene index
Hi Dinesh,

This seems straightforward for Solr. You can use the embedded Jetty server for a start; look at the tutorial on how to get started.

You'll need to modify the schema.xml to define all the fields that you want to index. The wiki page at http://wiki.apache.org/solr/SchemaXml is a good start on how to do that. Each field in your code will have a counterpart in the schema.xml with appropriate flags (indexed/stored/tokenized etc.)

Once that is complete, try to modify the DataImportHandler's hsqldb example for your MySQL database.

On Tue, Sep 23, 2008 at 7:01 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
>
> Hi Shalin Shekhar,
>
> Let me explain my issue.
> [field-building code quoted in full in the previous message]
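To make the mapping concrete, a sketch of what a few of those fields could look like in schema.xml (field names taken from the code above; the type choices are assumptions, not a definitive mapping): UN_TOKENIZED fields map naturally to the "string" type, TOKENIZED ones to "text", and the zero-padded price can use a sortable numeric type such as "slong" instead of NumberUtils.pad():

    <field name="id"  type="string" indexed="true" stored="true"/>
    <field name="prc" type="slong"  indexed="true" stored="true"/>
    <field name="bnm" type="text"   indexed="true" stored="true"/>
    <field name="ttl" type="text"   indexed="true" stored="true"/>
    <field name="cts" type="text"   indexed="true" stored="true"/>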
Re: Optimise while uploading?
On Tue, Sep 23, 2008 at 7:06 PM, Geoff Hopson <[EMAIL PROTECTED]>wrote: > > Probably a stupid question with the obvious answer, but if I am > running a Solr master and accepting updates, do I have to stop the > updates when I start the optimise of the index? Or will optimise just > take the latest snapshot and work on that independently of the > incoming updates? Usually an optimize is performed at the end of the indexing operation. However, an optimize operation will block incoming update requests until it completes. Snapshots are a different story. Solr does not even know about any snapshots -- all operations are performed on the main index only. If you look under the hoods, it is the snapshooter shell script which creates the snapshot directories. -- Regards, Shalin Shekhar Mangar.
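For reference, a sketch of how an explicit optimize is usually sent (assuming the standard XML update handler; host and port are the example defaults):

    curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' --data-binary '<optimize/>'

Per the note above, the call blocks until the optimize finishes, and updates posted during that window wait for it to complete.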
Re: commit
On Tue, Sep 23, 2008 at 7:36 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
>
> My snapshooter too:
>
> <listener event="postCommit" class="solr.RunExecutableListener">
>   <str name="exe">./data/solr/book/logs/snapshooter</str>
>   <str name="dir">data/solr/book/bin</str>
>   <bool name="wait">true</bool>
>   <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
>   <arr name="env"> <str>MYVAR=val1</str> </arr>
> </listener>
>
> and everything is at the right place, I think; my paths are good ...

Those paths look strange. Are you sure your snapshooter script is inside a directory named "logs"?

Try giving absolute paths to the snapshooter script in the "exe" section. Also, put the absolute path to the bin directory in the "dir" section and try again.

--
Regards,
Shalin Shekhar Mangar.
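Concretely, that advice amounts to something like the following (the /var/data paths are placeholders for wherever the Solr home actually lives; the stock snapshooter script ships in the bin directory, not logs):

    <listener event="postCommit" class="solr.RunExecutableListener">
      <str name="exe">/var/data/solr/book/bin/snapshooter</str>
      <str name="dir">/var/data/solr/book/bin</str>
      <bool name="wait">true</bool>
    </listener>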
Re: commit
Right, my bad, it was the bin directory. But even when I fire a commit, no snapshot is created?? Does it check the number of documents even when I fire it manually? And another question: I don't remember having put the path for commit in the conf file, but even manually it doesn't work:

[EMAIL PROTECTED]:/# ./data/solr/book/bin/commit -V
+ [[ -n '' ]]
+ [[ -z 8180 ]]
+ [[ -z localhost ]]
+ [[ -z solr ]]
+ curl_url=http://localhost:8180/solr/update
+ fixUser -V
+ [[ -z root ]]
++ whoami
+ [[ root != root ]]
++ who -m
++ cut '-d ' -f1
++ sed '-es/^.*!//'
+ oldwhoami=root
+ [[ root == '' ]]
+ setStartTime
+ [[ Linux == \S\u\n\O\S ]]
++ date +%s
+ start=1222180545
+ logMessage started by root
++ timeStamp
++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2008/09/23 16:35:45 started by root
+ [[ -n '' ]]
+ logMessage command: ./data/solr/book/bin/commit -V
++ timeStamp
++ date '+%Y/%m/%d %H:%M:%S'
+ echo 2008/09/23 16:35:45 command: ./data/solr/book/bin/commit -V
+ [[ -n '' ]]
++ curl http://localhost:8180/solr/update -s -H 'Content-type:text/xml; charset=utf-8' -d '<commit/>'
+ rs=''
+ [[ 0 != 0 ]]
+ echo ''
+ grep '

Shalin Shekhar Mangar wrote:
>
> On Tue, Sep 23, 2008 at 7:36 PM, sunnyfr <[EMAIL PROTECTED]> wrote:
> [quoted message elided; see above]
Refresh of synonyms.txt without reload
Hi,

I'm quite new to Solr, and I'm looking for a way to extend the list of synonyms used at query time without having to reload the config. What I've found so far are the two threads linked below, neither of which really helped me out. Especially the MultiCore solution seems a little bit too much for 'just reloading' the synonyms.

Right now I would choose a solution where I'd extend the SynonymFilterFactory with a parameter for an interval at which it would look for an update of the synonyms source file (synonyms.txt). In case of an updated file, the SynonymMap would be updated, and from that point on the new synonyms would be included in the query analysis.

Is this a valid approach? Would someone else find this useful too?

cheers, Axel

http://www.nabble.com/SolrCore%2C-reload%2C-synonyms-not-reloaded-td19339767.html (SolrCore, reload, synonyms not reloaded - Multiple Solr Cores)
http://www.nabble.com/Re%3A-Is-it-possible-to-add-synonyms-run-time--td15089111.html (Re: Is it possible to add synonyms run time?)
Re: Refresh of synonyms.txt without reload
This is probably not useful because synonyms work better at index time than at query time. Reloading synonyms also requires reindexing all the affected documents.

wunder

On 9/23/08 7:45 AM, "Batzenmann" <[EMAIL PROTECTED]> wrote:
> [quoted message elided; see above]
Re: EmbeddedSolrServer and the MultiCore functionality
> If I have Solr up and running and do something like this:
>
> query.set("shards", "localhost:8080/solr/core0,localhost:8080/solr/core1");
>
> I will get the results from both cores, obviously... But is there a way
> to do this without using shards and accessing the cores through http?
> I presume it would/should be possible to do the same thing directly
> against the cores, but my question is really if this has been
> implemented already / is it possible?

Not implemented... Check line 384 of SearchHandler.java:

SolrServer server = new CommonsHttpSolrServer(url, client);

It defaults to CommonsHttpSolrServer. This could easily change to EmbeddedSolrServer, but I'm not sure it is a very common use case... why would you have multiple shards on the same machine?

ryan
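In the meantime, one can query each core directly through the embedded server and merge on the client side; a rough sketch under the setup from the first post (no HTTP involved, but note this is a naive concatenation, not the distributed merge that the shards parameter performs):

    SolrQuery query = new SolrQuery("title:lucene");

    // one embedded server per registered core, sharing the same container
    EmbeddedSolrServer core1Server = new EmbeddedSolrServer(container, "core1");
    EmbeddedSolrServer core2Server = new EmbeddedSolrServer(container, "core2");

    QueryResponse r1 = core1Server.query(query);
    QueryResponse r2 = core2Server.query(query);

    // naive client-side merge: scores come from independent indexes
    // and are not directly comparable across cores
    SolrDocumentList merged = new SolrDocumentList();
    merged.addAll(r1.getResults());
    merged.addAll(r2.getResults());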
RE: deleting record from the index using deleteByQuery method
Thanks for your response, Chris. I do see the reviewid in the index through Luke. I guess what I am confused about is the field cumulative_delete: does it have any significance as to whether the delete was a success or not? Also, shouldn't the method deleteByQuery return a different status code based on whether the delete was successful or not?

-Raghu

-----Original Message-----
From: Chris Hostetter [mailto:[EMAIL PROTECTED]]
Sent: Monday, September 22, 2008 11:30 PM
To: solr-user@lucene.apache.org
Subject: Re: deleting record from the index using deleteByQuery method

: I am trying to delete a record from the index using SolrJ. When I
: execute it I get a status of 0 which means success. I see that the
: "cummulative_deletbyquery" count increases by 1 and also the "commit"
: count increases by one. I don't see any decrease on the "numDocs" count.
: When I query it back I do see that record again.

I'm not positive, but I don't think deleting by query will error if no documents matched the query -- so just because it succeeds doesn't mean it actually deleted anything ... are you sure '"rev.id:" + reviewId' matches on the document you are trying to delete? Does that search find it using the default handler? (Is there any analyzer weirdness?)

-Hoss
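One way to check from SolrJ whether the delete actually matched anything (a sketch; "server" is an already-configured SolrServer and "rev.id" is the field from the post above): the returned status only signals that the request was processed, so compare numFound before and after:

    // how many documents currently match?
    SolrQuery q = new SolrQuery("rev.id:" + reviewId);
    long before = server.query(q).getResults().getNumFound();

    server.deleteByQuery("rev.id:" + reviewId);  // status 0 even if nothing matched
    server.commit();

    long after = server.query(q).getResults().getNumFound();
    // before > 0 && after == 0 means the delete really removed the document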
DataImport troubleshooting
I have searched the forum and the internet at large to find an answer to my simple problem, but have been unable. I am trying to get a simple data import to work, and have not been able to. I have Solr installed on an Apache server on Unix. I am able to commit and search for files using the usual Simple* tools. These files begin with ... and so on.

On the data import, I have inserted this into solrconfig.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/R1/home/shoshana/kyle/Documents/data-config.xml</str>
  </lst>
</requestHandler>

and the data import config looks like this:

<dataConfig>
  <dataSource ... baseUrl="http://helix.ccb.sickkids.ca:8080/" encoding="UTF-8" />
  <document>
    <entity ... forEach="/iProClassDatabase/iProClassEntry/"
            url="/R1/home/shoshana/kyle/Documents/exampleIproResult.xml">
      <field ... xpath="/iProClassDatabase/iProClassEntry/GENERAL_INFORMATION/Protein_Name_and_ID/UniProtKB/UniProtKB_Accession"/>
      <field ... xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Enzyme_Function/EC/Nomenclature"/>
      <field ... xpath="/iProClassDatabase/iProClassEntry/CROSS_REFERENCES/Bibliography/References/PMID"/>
      <field ... xpath="/iProClassDatabase/iProClassEntry/SEQUENCE/Sequence_Length"/>
    </entity>
  </document>
</dataConfig>

I apologize for the ugly XML. Nonetheless, when I go to http://host:8080/solr/dataimport, I get a 404, and when I go to http://host:8080/solr/admin/dataimport.jsp and try to "debug", nothing happens. I have edited out the host name because I don't know if the employer would be OK with it. Any guidance?

Thanks in advance,
Kyle
SolrUpdateServlet Warning
I've got a small configuration question. When posting docs via SolrJ, I get the following warning in the Solr logs:

WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters: wt=xml&version=2.2
If you are using solrj, make sure to register a request handler to /update rather then use this servlet.
Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" /> to your solrconfig.xml

I have an update handler configured in solrconfig.xml as follows:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />

What's the preferred solution? Should I comment out the SolrUpdateServlet in solr's web.xml? My Solr server is running at /solr, if that helps.

Thanks.

Gregg
Re: SolrUpdateServlet Warning
On Sep 23, 2008, at 12:35 PM, Gregg wrote:

> I've got a small configuration question. When posting docs via SolrJ, I get
> the following warning in the Solr logs:
>
> WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters:
> wt=xml&version=2.2
> If you are using solrj, make sure to register a request handler to /update
> rather then use this servlet.
> Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
> to your solrconfig.xml
>
> I have an update handler configured in solrconfig.xml as follows:
>
> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />

Are you sure?

Check http://localhost:8983/solr/admin/stats.jsp and search for XmlUpdateRequestHandler; make sure it is registered to /update.

> What's the preferred solution? Should I comment out the SolrUpdateServlet
> in solr's web.xml? My Solr server is running at /solr, if that helps.

That will definitely work, but it should not be necessary to crack open the .war file.

ryan
Re: DataImport troubleshooting
Are there any exceptions in the log file when you start Solr?

On Tue, Sep 23, 2008 at 9:31 PM, KyleMorrison <[EMAIL PROTECTED]> wrote:
>
> I have searched the forum and the internet at large to find an answer to my
> simple problem, but have been unable.
> [rest of the message quoted in full above]

--
Regards,
Shalin Shekhar Mangar.
Re: Precision issue with sum() function
Problem with the spam filter - removing some text - re-posting.

water4u99 wrote:
>
> Hi,
>
> Some additional clue as to where the issue is: the computed number changes
> when there is an additional query term in the query request.
>
> Ex1: .../select/?q=_val_:%22sum(stockPrice_f,10.00)%22&fl=*,score
> This yields the correct answer - 38.0, where the stockPrice_f dynamic field
> has the value of 28.0.
>
> However, when there is another query term, the answer changes.
> Ex2:
> .../select/?q=PRICE_MIN:20%20_val_:%22sum(stockPrice_f,10.00)%22&fl=*,score
>
> This yields an incorrect answer: 36.41818
>
> The config is straight out of the examples/ directory with only my own
> field definitions.
>
> Thanks if anyone can explain or help.
>
> water4u99 wrote:
>>
>> Hi,
>>
>> I have indexed a dynamic field stockPrice_f with the value 28.00.
>> It is visible in my query.
>> However, when I issue a query with a function:
>> ... _val_:"sum(stockPrice_f, 10.00)"&fl=*,score
>> I received the output of: 36.41818
>> There were no other computations.
>>
>> Can anyone help on why the answer is off?
>>
>> Thank you.
Re: How to use copyfield with dynamicfield?
Simply set "text" to be multivalued (one for each *_t field). Erik On Sep 22, 2008, at 1:08 PM, Jon Drukman wrote: I have a dynamicField declaration: I want to copy any *_t's into a text field for searching with dismax. As it is, it appears you can't search dynamicfields this way. I tried adding a copyField: I do have a text field in my schema: However I get 400 errors whenever I try to update a record with entries in the *_t. INFO: /update 0 2 Sep 22, 2008 10:04:40 AM org.apache.solr.core.SolrException log SEVERE: org.apache.solr.core.SolrException: ERROR: multiple values encountered for non multiValued field text: first='Centennial Dr, Oakland, CA' second='' at org .apache .solr.update.DocumentBuilder.addSingleField(DocumentBuilder.java:62) I'm going to guess that the copyField with a wildcard is not allowed. If that is true, how does one deal with the situation where you want to allow new fields AND have them searchable? -jsd-
Re: DataImport troubleshooting
Thank you for the help. The problem was actually just stupidity on my part: it seems I was running the wrong startup and shutdown scripts for the server, and thus the server was never actually getting restarted with the new config. I restarted the server properly and I can at least access those pages. I'm getting some wonky output, but I assume this will be sorted out.

Kyle

Shalin Shekhar Mangar wrote:
>
> Are there any exceptions in the log file when you start Solr?
> [rest of the message quoted in full above]
Re: Precision issue with sum() function
Try adding a debugQuery=true parameter to see if that helps you decipher what is going on.

FWIW, the _val_ boost is a factor in scoring, but it isn't the only factor. Perhaps you're seeing the document score factor in as well?

-Grant

On Sep 22, 2008, at 6:37 PM, water4u99 wrote:
> [original message quoted in full above]
Re: SolrUpdateServlet Warning
This turned out to be a fairly pedestrian bug on my part: I had "/update" appended to the Solr base URL when I was adding docs via SolrJ. Thanks for the help.

--Gregg

On Tue, Sep 23, 2008 at 12:42 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> [previous message quoted in full above]
Highlight Fragments
Ok, I'm very frustrated. I've tried every configuration and every parameter I can, and I cannot get fragments to show up in the highlighting in Solr (no fragments at the bottom or highlights in the text). I must be missing something, but I'm just not sure what it is.

/select/?qt=standard&q=crayon&hl=true&hl.fl=synopsis,shortdescription&hl.fragmenter=gap&hl.snippets=3&debugQuery=true

And I get a highlight segment, but no fragments or phrase highlighting.

My goal - if I'm doing this completely wrong - is to get Google-like snippets of text around the query term (or at minimum to highlight the query term itself).

Results: the response just echoes the parameters above (hl.fl=synopsis, hl=true, hl.snippets=3, hl.fragmenter=gap, q=crayon, qt=standard, version=2.1) followed by an empty highlighting section.

--
"hic sunt dracones"
Re: Highlight Fragments
Make sure the fields you're trying to highlight are stored in your schema (e.g. stored="true" on the field definition).

David Snelling-2 wrote:
> [original message quoted in full above]
Re: Highlight Fragments
This is the configuration for the two fields I have tried it on:

<field name="synopsis" ... stored="true"/>
<field name="shortdescription" ... compressed="true"/>

On Tue, Sep 23, 2008 at 1:59 PM, wojtekpia <[EMAIL PROTECTED]> wrote:
>
> Make sure the fields you're trying to highlight are stored in your schema
> (e.g. stored="true" on the field definition).
> [rest of the message quoted above]

--
"hic sunt dracones"
using BoostingTermQuery
Hi- I'm new to Solr, and I'm trying to figure out the best way to configure it to use BoostingTermQuery in the scoring mechanism. Do I need to create a custom query parser? All I want is the default parser behavior except to get the custom term boost from the Payload data. Thanks! -Ken
Re: Highlight Fragments
Try a query where you're sure to get something to highlight in one of your highlight fields, for example:

/select/?qt=standard&q=synopsis:crayon&hl=true&hl.fl=synopsis,shortdescription

David Snelling-2 wrote:
> [previous messages quoted in full above]
Re: using BoostingTermQuery
At this point, it's roll your own. I'd love to see the BTQ in Solr (and Spans!), but I wonder if it makes sense w/o better indexing side support. I assume you are rolling your own Analyzer, right? Spans and payloads are this huge untapped area for better search! On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote: Hi- I'm new to Solr, and I'm trying to figure out the best way to configure it to use BoostingTermQuery in the scoring mechanism. Do I need to create a custom query parser? All I want is the default parser behavior except to get the custom term boost from the Payload data. Thanks! -Ken -- Grant Ingersoll http://www.lucidimagination.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ
Re: using BoostingTermQuery
It may be too early to say this but I'll say it anyway :) There should be a juicy case study that includes payloads, BTQ, and Spans in the upcoming Lucene in Action 2. I can't wait to see it, personally. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Grant Ingersoll <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Tuesday, September 23, 2008 5:29:05 PM > Subject: Re: using BoostingTermQuery > > At this point, it's roll your own. I'd love to see the BTQ in Solr > (and Spans!), but I wonder if it makes sense w/o better indexing side > support. I assume you are rolling your own Analyzer, right? Spans > and payloads are this huge untapped area for better search! > > > On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote: > > > Hi- > > > > I'm new to Solr, and I'm trying to figure out the best way to > > configure it to use BoostingTermQuery in the scoring mechanism. Do > > I need to create a custom query parser? All I want is the default > > parser behavior except to get the custom term boost from the Payload > > data. Thanks! > > > > -Ken > > -- > Grant Ingersoll > http://www.lucidimagination.com > > Lucene Helpful Hints: > http://wiki.apache.org/lucene-java/BasicsOfPerformance > http://wiki.apache.org/lucene-java/LuceneFAQ
RE: using BoostingTermQuery
> At this point, it's roll your own.

That's where I'm getting bogged down - I'm confused by the various query parser classes in Lucene and Solr, and I'm not sure exactly what I need to override. Do you know of an example of something similar to what I'm doing that I could use as a reference?

> I'd love to see the BTQ in Solr
> (and Spans!), but I wonder if it makes sense w/o better indexing side
> support. I assume you are rolling your own Analyzer, right?

Yup - I'm pretty sure I have that side figured out. My input contains terms marked up with a score (i.e. 'software?7'); I just needed to create a TokenFilter that parses out the suffix and sets the Payload on the token.

> Spans and payloads are this huge untapped area for better search!

Completely agree - we do a lot with keyword searching, and we use this type of thing in our existing search implementation.

Thanks for the quick response!

> On Sep 23, 2008, at 5:12 PM, Ensdorf Ken wrote:
> [original question quoted in full above]
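For what it's worth, a sketch of what such a filter can look like against the Lucene 2.3/2.4-era Token API (the '?' separator and the single-byte payload encoding follow the convention in the post above, not any fixed API):

    import java.io.IOException;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.index.Payload;

    public final class BoostSuffixFilter extends TokenFilter {

        public BoostSuffixFilter(TokenStream input) {
            super(input);
        }

        public Token next(Token reusableToken) throws IOException {
            Token token = input.next(reusableToken);
            if (token == null) return null;

            String term = new String(token.termBuffer(), 0, token.termLength());
            int sep = term.lastIndexOf('?');
            if (sep > 0) {
                // "software?7" -> term "software", payload byte 7
                // (assumes a well-formed numeric suffix in the 0-127 range)
                byte boost = (byte) Integer.parseInt(term.substring(sep + 1));
                token.setTermBuffer(term.substring(0, sep).toCharArray(), 0, sep);
                token.setPayload(new Payload(new byte[] { boost }));
            }
            return token;
        }
    }

On the query side, the payload only affects ranking if the query is a BoostingTermQuery and the Similarity's scorePayload() is overridden to decode that byte back into a boost.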
Re: Highlight Fragments
Hmmm. That doesn't actually return anything, which is odd, because I know it's in the field if I do a query without specifying the field.

http://qasearch.donorschoose.org/select/?q=synopsis:students

returns nothing.

http://qasearch.donorschoose.org/select/?q=students

returns items with the query term in the synopsis field.

This may be causing issues, but I'm not sure why it's not working. We use this live and do very complex queries, including facets, that work fine.

www.donorschoose.org

On Tue, Sep 23, 2008 at 2:20 PM, wojtekpia <[EMAIL PROTECTED]> wrote:
>
> Try a query where you're sure to get something to highlight in one of your
> highlight fields, for example:
>
> /select/?qt=standard&q=synopsis:crayon&hl=true&hl.fl=synopsis,shortdescription
> [rest of the thread quoted above]

--
"hic sunt dracones"
Re: Highlight Fragments
Your fields are all of string type. String fields aren't tokenized or analyzed, so you have to match the entire text of those fields to actually get a match. Try the following:

/select/?q=firstname:Kathryn&hl=on&hl.fl=firstname

The reason you're seeing results with just q=students, but not q=synopsis:students, is because you're copying the synopsis field into your field named 'text', which is of type 'text', which does get tokenized and analyzed, and 'text' is your default search field.

The reason you don't see any highlights with the following query is because your 'text' field isn't stored:

select/?q=text:students&hl=on&hl.fl=text

David Snelling-2 wrote:
>
> Hmmm. That doesn't actually return anything, which is odd, because I know
> the term is in the field if I do a query without specifying the field:
>
> http://qasearch.donorschoose.org/select/?q=synopsis:students
>
> returns nothing, while
>
> http://qasearch.donorschoose.org/select/?q=students
>
> returns items with the query term in the synopsis field.
>
> www.donorschoose.org
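In schema.xml terms, the fix is to declare the fields with the tokenized 'text' type and keep them stored, so queries match individual words and the highlighter has raw text to fragment. A sketch, assuming the 'text' fieldType from the example schema that ships with Solr (the compressed attribute is carried over from the original configuration):

    <field name="synopsis"         type="text" indexed="true" stored="true"/>
    <field name="shortdescription" type="text" indexed="true" stored="true" compressed="true"/>

Note that after changing a field's type you have to re-index for the change to take effect.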
Re: Highlight Fragments
Ok, thanks, that makes a lot of sense now. So how should I be storing the text for the synopsis or shortdescription fields so it gets tokenized - should the type be text instead of string?

Thank you very much for the help, by the way.

On Tue, Sep 23, 2008 at 2:49 PM, wojtekpia <[EMAIL PROTECTED]> wrote:
>
> Your fields are all of string type. String fields aren't tokenized or
> analyzed, so you have to match the entire text of those fields to actually
> get a match. Try the following:
>
> /select/?q=firstname:Kathryn&hl=on&hl.fl=firstname
>
> The reason you're seeing results with just q=students, but not
> q=synopsis:students, is because you're copying the synopsis field into your
> field named 'text', which is of type 'text', which does get tokenized and
> analyzed, and 'text' is your default search field.
>
> The reason you don't see any highlights with the following query is because
> your 'text' field isn't stored:
>
> select/?q=text:students&hl=on&hl.fl=text

--
"hic sunt dracones"
Re: Highlight Fragments
Yes, you can use text (or some custom derivative of it) for your fields.

David Snelling-2 wrote:
>
> Ok, thanks, that makes a lot of sense now. So how should I be storing the
> text for the synopsis or shortdescription fields so it gets tokenized -
> should the type be text instead of string?
>
> Thank you very much for the help, by the way.
Snappuller taking up CPU on master
Hi,

I am using snappuller to sync my slave with the master. I am not using the rsync daemon; I am running rsync over a remote shell.

When I serve requests from the master while snappuller is running (after optimization the total index is around 4 GB, so it transfers the whole index), performance is very bad, to the point of causing timeouts.

Any ideas why this happens? Any suggestions will help.

Thanks.
Re: using BoostingTermQuery
On Sep 23, 2008, at 5:39 PM, Ensdorf Ken wrote:

>> At this point, it's roll your own.
>
> That's where I'm getting bogged down - I'm confused by the various query
> parser classes in Lucene and Solr, and I'm not sure exactly what I need to
> override. Do you know of an example of something similar to what I'm doing
> that I could use as a reference?

I'm no QueryParser expert, but I would probably start w/ the default query parser in Solr (LuceneQParser), and then progress a bit to the DisMax one. I'd ask specific questions based on what you see there. If you get far enough along, you may consider asking for help on the java-user list as well.

>> I'd love to see the BTQ in Solr (and Spans!), but I wonder if it makes
>> sense w/o better indexing side support. I assume you are rolling your
>> own Analyzer, right?
>
> Yup - I'm pretty sure I have that side figured out. My input contains
> terms marked up with a score (e.g. 'software?7'); I just needed to create
> a TokenFilter that parses out the suffix and sets the Payload on the token.

Cool. Patch?

>> Spans and payloads are this huge untapped area for better search!
>
> Completely agree - we do a lot with keyword searching, and we use this
> type of thing in our existing search implementation. Thanks for the quick
> response!

--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
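To make "start w/ the default query parser" slightly more concrete: on the Lucene side, one common approach is to override getFieldQuery() so single terms come back as BoostingTermQuery. A rough, untested sketch against the Lucene 2.x API (the class name is invented, and wiring it into Solr would still need a QParserPlugin or custom request handler around it):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.queryParser.ParseException;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.payloads.BoostingTermQuery;

    /**
     * Sketch: same parsing behavior as the default parser, but single-term
     * field queries come back as BoostingTermQuery, whose scoring consults
     * Similarity.scorePayload() for the per-term payload boost.
     */
    public class PayloadQueryParser extends QueryParser {
      public PayloadQueryParser(String defaultField, Analyzer analyzer) {
        super(defaultField, analyzer);
      }

      protected Query getFieldQuery(String field, String queryText) throws ParseException {
        Query q = super.getFieldQuery(field, queryText);
        if (q instanceof TermQuery) {
          // swap in the payload-aware query; phrases etc. pass through unchanged
          return new BoostingTermQuery(((TermQuery) q).getTerm());
        }
        return q;
      }
    }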
Re: Snappuller taking up CPU on master
Hi,

Can't tell with certainty without looking, but my guess would be slow disk, high IO, and a large number of processes waiting for IO (run vmstat and look at the "wa" column).

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
> From: rahul_k123 <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, September 23, 2008 6:56:48 PM
> Subject: Snappuller taking up CPU on master
>
> Hi,
>
> I am using snappuller to sync my slave with the master. I am not using the
> rsync daemon; I am running rsync over a remote shell.
>
> When I serve requests from the master while snappuller is running (after
> optimization the total index is around 4 GB, so it transfers the whole
> index), performance is very bad, to the point of causing timeouts.
>
> Any ideas why this happens? Any suggestions will help.
>
> Thanks.
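For reference, that check is just the following (illustrative; the exact column layout varies between platforms):

    vmstat 5
    procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
     r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa

A consistently high number in the last ("wa") column while snappuller runs would confirm the box is stalled on disk I/O rather than on actual CPU work.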
solr score
hi,

How can I give more weight to frequently searched words in Solr? What functionality in the Apache Solr module supports this?

I have a list of the most frequently searched words on my site, and I need to highlight those words. From the net I found out that the 'score' is used for this purpose. Is that true? Does anybody know about this? Please help me.

with Regards,
Santhanaraj R
Re: Snappuller taking up CPU on master
Hi,

Thanks for the reply. I am not using Solr itself for indexing and serving search requests; I am using only the scripts for replication.

Yes, it looks like I/O, but my question is how to handle this problem, and whether there is an optimal way to achieve this.

Thanks.

Otis Gospodnetic wrote:
>
> Hi,
>
> Can't tell with certainty without looking, but my guess would be slow
> disk, high IO, and a large number of processes waiting for IO (run vmstat
> and look at the "wa" column).
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Error running query inside data-config.xml
Hi Guys

I am trying to pull values by joining two tables. My data-config.xml looks like:

If I try to index the values from a single table, it works fine. Is there anything wrong in the above configuration?

Thanks in advance
Re: Error running query inside data-config.xml
Since you have not given any information about your schema, we cannot help with the queries. What do you mean by "error running query"? Do you get an exception, or just no values for the inner entity's fields?

On Wed, Sep 24, 2008 at 11:34 AM, con <[EMAIL PROTECTED]> wrote:
>
> Hi Guys
>
> I am trying to pull values by joining two tables. My data-config.xml
> looks like:
>
> column="SALARY" />
>
> If I try to index the values from a single table, it works fine. Is
> there anything wrong in the above configuration?
>
> Thanks in advance

--
Regards,
Shalin Shekhar Mangar.
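For what it's worth, a two-table DataImportHandler config normally nests the child entity inside the parent and references the parent's column with a ${...} placeholder. A sketch with made-up table and column names (only SALARY is taken from the quoted config; driver and url are placeholders to fill in):

    <dataConfig>
      <dataSource driver="..." url="..." user="..." password="..."/>
      <document>
        <entity name="emp" query="select EMP_ID, NAME from EMP">
          <field column="EMP_ID" name="id"/>
          <field column="NAME" name="name"/>
          <!-- inner entity runs once per outer row, joined via ${emp.EMP_ID} -->
          <entity name="pay" query="select SALARY from PAYROLL where EMP_ID='${emp.EMP_ID}'">
            <field column="SALARY" name="salary"/>
          </entity>
        </entity>
      </document>
    </dataConfig>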
help required: how to design a large scale solr system
Hi!

I am already using Solr 1.2 and happy with it. In a new project with a very tight deadline (10 development days from today) I need to set up a more ambitious system in terms of scale. Here is the spec:

* I need to index about 60,000,000 documents
* Each document has 11 textual fields to be indexed & stored, and 4 more fields to be stored only
* Most fields are short (2-14 characters); however, 2 indexed fields can be up to 1KB, and another stored field is up to 1KB
* On average every document is about 0.5KB to be stored and 0.4KB to be indexed
* The SLA for data freshness is a full nightly re-index (I cannot obtain incremental update/delete lists of the modified documents)
* The SLA for query time is 5 seconds
* The number of expected queries is 2-3 queries per second
* The queries are simple: a combination of Boolean operations and name searches (no fancy fuzzy searches or Levenshtein distances, no faceting, etc.)
* I have a 64-bit Dell 2950 4-CPU machine (2 dual cores) with RAID 10, 200GB HD space, and 8GB RAM
* The documents are not given to me explicitly - I am given raw documents in RAM, one by one, from which I create my document in RAM; then I can either http-post it to index it directly, or append it to a TSV file for later indexing
* Each document has a unique ID

I have a few directions I am thinking about.

The simple approach:
* Have one Solr instance index the entire document set (from files). I am afraid this will take too much time.

Direction 1:
* Create TSV files from all the documents - this will take around 3-4 hours
* Partition the documents into several subsets (how many should I choose?)
* Run multiple Solr instances on the same machine
* Let each Solr instance concurrently index the appropriate subset
* At the end, merge all the indices using the IndexMergeTool (how much time will that take?)

Direction 2:
* Like the previous, but instead of using the IndexMergeTool, use distributed search with shards, upgrading to Solr 1.3 (see the example request after this message)

Directions 3, 4:
* Like the previous directions, only avoid using TSV files at all and directly index the documents from RAM

Questions:
* Which direction do you recommend in order to meet the SLAs in the fastest way?
* Since I have RAID on the machine, can I gain performance by using multiple Solr instances on the same machine, or will only multiple machines help?
* What's the minimal number of machines I should require (I might get more, weaker machines)?
* How many concurrent indexers are recommended?
* Do you agree that the bottleneck is the indexing time?

Any help is appreciated. Thanks in advance,
yatir
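For Direction 2, Solr 1.3's distributed search is specified per request: each partition is an ordinary Solr core, and the client (or a front-end Solr) fans the query out with the shards parameter. A sketch with placeholder host names:

    http://host1:8983/solr/select?shards=host1:8983/solr,host2:8983/solr,host3:8983/solr&q=ipod&start=0&rows=10

Note that the shards values omit the http:// prefix. Each nightly re-index can then rebuild the partitions in parallel, with no merge step at the end.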
Solr Using
Which version of Tomcat is required? I installed JBoss 4.0.2, which ships with Tomcat 5.5.9. The JSP pages are not compiling; I'm getting a syntax error. I can't move off JBoss 4.0.2. Please help.

Regards,
Dinesh Gupta

> Date: Tue, 23 Sep 2008 19:36:22 +0530
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: Lucene index
>
> Hi Dinesh,
>
> This seems straightforward for Solr. You can use the embedded jetty server
> for a start. Look at the tutorial on how to get started.
>
> You'll need to modify the schema.xml to define all the fields that you want
> to index. The wiki page at http://wiki.apache.org/solr/SchemaXml is a good
> start on how to do that. Each field in your code will have a counterpart in
> the schema.xml with appropriate flags (indexed/stored/tokenized etc.)
>
> Once that is complete, try to modify the DataImportHandler's hsqldb example
> for your mysql database.
>
> On Tue, Sep 23, 2008 at 7:01 PM, Dinesh Gupta <[EMAIL PROTECTED]> wrote:
>
> > Hi Shalin Shekhar,
> >
> > Let me explain my issue.
> >
> > I have some tables in my database like
> >
> > Product
> > Category
> > Catalogue
> > Keywords
> > Seller
> > Brand
> > Country_city_group
> > etc.
> >
> > I have a class that represents a product document as:
> >
> > Document doc = new Document();
> > // Keywords which can be used directly for search
> > doc.add(new Field("id", (String) data.get("PRN"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >
> > // Sorting fields
> > String priceString = (String) data.get("Price");
> > if (priceString == null)
> >     priceString = "0";
> > long price = 0;
> > try {
> >     price = (long) Double.parseDouble(priceString);
> > } catch (Exception e) {
> > }
> > doc.add(new Field("prc", NumberUtils.pad(price), Field.Store.YES, Field.Index.UN_TOKENIZED));
> >
> > Date createDate = (Date) data.get("CreateDate");
> > if (createDate == null) createDate = new Date();
> > doc.add(new Field("cdt", String.valueOf(createDate.getTime()), Field.Store.NO, Field.Index.UN_TOKENIZED));
> >
> > Date modiDate = (Date) data.get("ModiDate");
> > if (modiDate == null) modiDate = new Date();
> > doc.add(new Field("mdt", String.valueOf(modiDate.getTime()), Field.Store.NO, Field.Index.UN_TOKENIZED));
> >
> > // Additional fields for search
> > doc.add(new Field("bnm", (String) data.get("Brand"), Field.Store.YES, Field.Index.TOKENIZED));
> > doc.add(new Field("bnm1", (String) data.get("Brand1"), Field.Store.NO, Field.Index.UN_TOKENIZED));
> > doc.add(new Field("bid", (String) data.get("BrandId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> > doc.add(new Field("grp", (String) data.get("Group"), Field.Store.NO, Field.Index.TOKENIZED));
> > doc.add(new Field("gid", (String) data.get("GroupId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> > doc.add(new Field("snm", (String) data.get("Seller"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> > doc.add(new Field("sid", (String) data.get("SellerId"), Field.Store.YES, Field.Index.UN_TOKENIZED));
> > doc.add(new Field("ttl", (String) data.get("Title"), Field.Store.YES, Field.Index.TOKENIZED));
> >
> > String title1 = (String) data.get("Title");
> > title1 = removeSpaces(title1);
> > doc.add(new Field("ttl1", title1, Field.Store.NO, Field.Index.UN_TOKENIZED));
> > doc.add(new Field("ttl2", title1, Field.Store.NO, Field.Index.TOKENIZED));
> >
> > // ColumnC - Product Sequence
> > String productSeq = (String) data.get("ProductSeq");
> > if (productSeq == null) productSeq = "";
> > doc.add(new Field("seq", productSeq, Field.Store.NO, Field.Index.UN_TOKENIZED));
> >
> > // New Added
> > doc.add(new Field("sdc", (String) data.get("SpecialDescription"), Field.Store.NO, Field.Index.TOKENIZED));
> > doc.add(new Field("kdc", (String) data.get("KeywordDescription"), Field.S
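To illustrate the quoted advice about schema counterparts: each Field above maps to a <field/> declaration in schema.xml, with UN_TOKENIZED roughly corresponding to the 'string' type and TOKENIZED to 'text'. A hypothetical sketch covering a few of the fields (type names assume Solr's example schema):

    <field name="id"  type="string" indexed="true" stored="true"/>
    <field name="prc" type="string" indexed="true" stored="true"/>  <!-- zero-padded price, sortable -->
    <field name="cdt" type="string" indexed="true" stored="false"/>
    <field name="bnm" type="text"   indexed="true" stored="true"/>
    <field name="ttl" type="text"   indexed="true" stored="true"/>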