Re: Indexing Wikipedia

2015-12-04 Thread Paul Libbrecht
Simply... some fields are not stored, so they can only be searched (being indexed) but are not returned in results? (title and text in the tutorial you refer to). Are these the missing fields? Paul > Kate Kas > 5 December 2015 00:23 > Hi, > > i tried to index .xml files from wi

Indexing Wikipedia

2015-12-04 Thread Kate Kas
Hi, I tried to index .xml files from Wikipedia articles (https://dumps.wikimedia.org/enwiki/20150702/) using the method proposed by the Solr tutorial (https://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia). I think that some fields are not indexed, because when I use

Re: Indexing Wikipedia

2012-07-08 Thread vineet yadav
Hi, I would recommend indexing the Wikipedia XML dump. Check out the DataImportHandler example of indexing Wikipedia (http://wiki.apache.org/solr/DataImportHandler#Example%3a_Indexing_wikipedia). Thanks, Vineet Yadav On Sun, Jul 8, 2012 at 9:15 AM, kiran kumar wrote: > Hi, > In our office w

Indexing Wikipedia

2012-07-07 Thread kiran kumar
Hi, In our office we have a Wikipedia-style wiki set up on the intranet, and I want to index it. From what I have read recently, all the wiki pages are stored in a database, and the schema follows the standard MediaWiki layout. I am also thinking of whether to use xmldumper to dump all the wiki pages

Re: Indexing Wikipedia with Solr/Lucene

2012-05-13 Thread András Bártházi
Hi, Using the RegexTransformer? I guess you can make a regular expression for the wikipedia text field to extract category and external links. Bye, Andras 2012/5/13 vineet yadav > Hi all, > I want to create Lucene/Solr index of
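
To illustrate the kind of regular expressions suggested above, here is a minimal sketch in Python. It is not the RegexTransformer configuration itself; it only shows patterns that might pull category names and bracketed external links out of the Wikipedia text field. The function names and the sample text are made up for illustration, and the patterns assume plain `[[Category:Name|sortkey]]` and `[http://... label]` markup, so real dumps will likely need more cases.

```python
import re

# Assumed wikitext form: [[Category:Name]] or [[Category:Name|sort key]]
CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)

# Assumed wikitext form: [http://example.org optional label]
# The lookbehind skips internal links, which start with a double bracket.
EXTERNAL_LINK_RE = re.compile(r"(?<!\[)\[(https?://[^\s\]]+)(?:\s+[^\]]*)?\]")


def extract_categories(wikitext):
    """Return the category names found in a page's wikitext (hypothetical helper)."""
    return CATEGORY_RE.findall(wikitext)


def extract_external_links(wikitext):
    """Return the URLs of bracketed external links (hypothetical helper)."""
    return EXTERNAL_LINK_RE.findall(wikitext)


if __name__ == "__main__":
    sample = (
        "Solr is a search server. [http://lucene.apache.org/solr/ Official site] "
        "[[Category:Search engines]] [[Category:Apache Software Foundation|Solr]] "
        "[[Lucene]]"
    )
    print(extract_categories(sample))
    print(extract_external_links(sample))
```

The same patterns, adjusted for Java regex syntax, could in principle be tried inside a RegexTransformer field definition, but that mapping is untested here.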

Fwd: Indexing Wikipedia with Solr/Lucene

2012-05-13 Thread vineet yadav
Hi all, I want to create a Lucene/Solr index of the Wikipedia XML dump. I used the Solr example (http://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia) to index the Wikipedia XML dump. Since in Wikipedia the category and external links are part of the Wikipedia text, I am not able to index category a

success with indexing Wikipedia - lessons learned

2011-10-21 Thread Fred Zimmerman
http://business.zimzaz.com/wordpress/2011/10/how-to-clone-wikipedia-mirror-and-index-wikipedia-with-solr/