On 03/24/2010 10:40 AM, Erik Hatcher wrote:
I've got a couple of questions for the community...
* what's the simplest way to get Solr up and running with a
relatively richly schema'd index of a Wikipedia dump?
What I'm looking for is something as easy as this:
java
Just throwing this out there ... I recently saw something I found pretty
interesting from CMU ...
http://csunplugged.org/activities
The search algorithm exercise was focused on a Battleship lookup I think.
- Jon
On Mar 24, 2010, at 10:40 AM, Erik Hatcher wrote:
> I've got a couple of quest
On Mar 24, 2010, at 1:53 PM, Andrzej Bialecki wrote:
> On 2010-03-24 16:15, Markus Jelsma wrote:
> >> A bit off-topic but how about Nutch grabbing some content and have it indexed
>> in Solr?
>
> The problem is not with collecting and submitting the documents, the problem
> is with parsing the Wik
: My goal is to index wikipedia in order to demonstrate search to a class of
: middle school kids that I've volunteered to teach for a couple of hours.
: Which brings me to my next question...
Twitter data is a little easier to ingest than the Wikipedia markup
(the json based streaming AP
On 2010-03-24 16:15, Markus Jelsma wrote:
A bit off-topic but how about Nutch grabbing some content and have it indexed
in Solr?
The problem is not with collecting and submitting the documents, the
problem is with parsing the Wikimedia markup embedded in XML.
WikipediaTokenizer from Lucene con
This is brilliant. I love it!
Is a computer game a document? How about each level, each room, each player?
If you want some fancy linguistics besides stemming, try compounding or what I
call "one word or two?" English loves to glom words together.
schoolroom or school room?
babysitter, baby-sit
Erik:
In a former incarnation, I thought I was going to teach 6th graders. Until I
found out I can't deal with 25 kids for 6 hours at a stretch for years on
end
My thoughts, presented in a "feel free to ignore but this is what I'd do"
spirit.
There are some random thoughts below, but here's w
A bit off-topic but how about Nutch grabbing some content and have it indexed
in Solr?
On Wednesday 24 March 2010 16:08:43 Christopher Laux wrote:
> Hi Erik,
>
> I'm working on Wikipedia search and use Solr. Afaik it can't easily be
> done. The Wikipedia XML dump only provided the page title and
Hi Erik,
I'm working on Wikipedia search and use Solr. Afaik it can't easily be
done. The Wikipedia XML dump only provided the page title and author
in terms of data one would search for. The rest requires parsing the
MediaWiki markup, for which there is no good parser freely available
(still writing
Hey Erik,
One thing to think about (and I'm no expert at middle school kids) would be
to relate search somehow to a topic they are interested in. My 12 year old
nephew loves the NBA, so if I were to talk to him about search, I would try
and relate it to e.g., NBA.com, or understanding the differen