Re: Best way to index wordpress blogs in solr

2014-10-08 Thread Jack Krupansky
The LucidWorks product has built-in crawler support, so you could crawl one or more web sites. http://lucidworks.com/product/fusion/ -- Jack Krupansky -----Original Message----- From: Vishal Sharma Sent: Tuesday, October 7, 2014 2:08 PM To: solr-user@lucene.apache.org Subject: Best way to inde

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Ahmet Arslan
Hi Vishal, If you find Nutch heavy-weight, consider using http://manifoldcf.apache.org Ahmet On Wednesday, October 8, 2014 1:54 AM, Vishal Sharma wrote: Hey Jorge, I guess Nutch can help me. Thanks for this. I am sure I should be able to configure it to crawl only specific portions of the si

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Vishal Sharma
Hey Jorge, I guess Nutch can help me. Thanks for this. I am sure I should be able to configure it to crawl only specific portions of the site. Vishal Sharma, TL, Grazitti Interactive. T: +1 650 641 1754 E: vish...@grazitti.com www.grazitti.com

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Vishal Sharma
OK, not a problem. Thanks anyway. Vishal Sharma, TL, Grazitti Interactive. T: +1 650 641 1754 E: vish...@grazitti.com www.grazitti.com

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Alexandre Rafalovitch
I have not used Swift before. I just heard of it. Sorry. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 7 October 2014 17:

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Jorge Luis Betancourt Gonzalez
If you’re talking about a generic web crawl you could use something like Nutch [1]. Keep in mind that this is a full web crawler, and it does a pretty good job. I’ve been using it for more than 2 years now and I’m very happy, although I don’t crawl just a couple of sites but a wider spectrum

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Vishal Sharma
Hey Alex, Do you have a fair comparison of Solr and Swift, either something you have read or from your own experience of using them? I would want to use that before I start building everything from scratch in my future implementations. Vishal Sharma, TL, Grazitti Interactive. T: +1 650 641 1754 E:

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Vishal Sharma
Makes sense. I'll just dive in now. Thanks so much. Vishal Sharma, TL, Grazitti Interactive. T: +1 650 641 1754 E: vish...@grazitti.com www.grazitti.com

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Alexandre Rafalovitch
I am pretty sure Swift is not Solr. That's why I was asking whether you were starting from scratch. As to the other items, please re-read my original response. Solr has an example reading in RSS feeds, you could probably use that. Or a generic XML using DataImportHandler's mapping. Or directly fro
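A rough sketch of the RSS route mentioned above, in Python rather than Solr's own RSS example or DataImportHandler: it parses a WordPress-style RSS 2.0 feed and maps each item to a flat document, then posts the batch to Solr's JSON update handler. The field names (`id`, `title`, `url`, `content`, `source`) and the update URL in the usage note are assumptions for illustration, not anything confirmed in this thread; they would have to match the actual schema and core.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module; WordPress feeds commonly carry the
# full post body in <content:encoded> under this namespace.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def rss_to_solr_docs(rss_xml):
    """Map every <item> of an RSS 2.0 document (a string) to a flat dict.

    The dict keys are hypothetical Solr field names.
    """
    root = ET.fromstring(rss_xml)
    docs = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        # Prefer the full body; fall back to the summary if it is absent.
        body = item.findtext(CONTENT_NS + "encoded")
        if body is None:
            body = item.findtext("description", default="")
        docs.append({
            # Use the guid as the unique key, or the link if there is no guid.
            "id": item.findtext("guid", default=link).strip(),
            "title": item.findtext("title", default="").strip(),
            "url": link,
            "content": body.strip(),
            # Marks where the document came from in a shared flat schema.
            "source": "wordpress",
        })
    return docs


def post_to_solr(docs, update_url):
    """POST the documents as a JSON array to a Solr update handler URL."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

To use it, fetch the feed text (WordPress sites usually expose one under `/feed/`), pass it to `rss_to_solr_docs`, and hand the result to `post_to_solr` with something like `http://localhost:8983/solr/blog/update?commit=true`, where the host and the core name `blog` are placeholders. A feed only lists recent posts, so this suits incremental updates more than a full back-fill.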

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Vishal Sharma
Hey Alex, Thanks for the prompt response. Here is what I am trying to solve: I am showing search results from content coming from 3 different places on a single site. I have done that by pumping all this content into a Solr server running on a single flat schema, using the different APIs of these pl

Re: Best way to index wordpress blogs in solr

2014-10-07 Thread Alexandre Rafalovitch
On 7 October 2014 14:08, Vishal Sharma wrote: > Hi, > > I am trying to get some help on finding out if there is any best practice > to index WordPress blogs in a Solr index? Can someone help with the architecture > I should be setting up? > > Do I need to write separate scripts to crawl WordPress and t