Hello Jack and Anuj,

2012/5/28 Jack Krupansky <j...@basetechnology.com>:
> The Twitter API extracts hash tag and user mentions for you, in addition to
> giving you the full raw text. You'll have to read up on the Twitter API.

That's what I thought just after hitting "send" on the message above ;-) I am pretty sure the Twitter API format maps very nicely onto a suitable input format for Solr, if it is not already good enough to feed into Solr directly. I am a bit unlucky here because I have been provided with only the raw text of about 1.5 million tweets, so I would have to write a few lines of code to restore at least user mentions, hashtags and URLs.

2012/5/28 Anuj Kumar <anujs...@gmail.com>:
> This is a bit old but provides good information for schema design-
> http://www.readwriteweb.com/archives/this_is_what_a_tweet_looks_like.php
>
> Found this link as well- https://gist.github.com/702360
>
> The types of the field may depend on the search requirements.

Anuj, you provide very interesting links here, thanks, even though those kinds of specifics might already be present in the Twitter API docs. Once I am done with my first Solr setup, I might set up the whole pipeline (getting the Twitter feeds myself) on my machines, so that I can exploit the whole information content provided by Twitter.

Cheers,
Giovanni
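
P.S. For the record, the "few lines of code" for restoring user mentions, hashtags and URLs from raw tweet text could look roughly like the sketch below. It is only a first approximation: the regular expressions are my own simplified assumptions and not the rules the Twitter API uses for its entities, and the function name `extract_entities` is made up for this example.

```python
import re

# Simplified patterns (assumptions, not Twitter's official entity rules).
# A mention is '@' plus 1-15 word characters, not preceded by a word
# character or another '@' (so e-mail addresses are skipped).
MENTION_RE = re.compile(r"(?<![\w@])@(\w{1,15})\b")
# A hashtag is '#' plus word characters, not preceded by a word character,
# '#' or '&' (the latter avoids HTML entities such as '&#39;').
HASHTAG_RE = re.compile(r"(?<![\w#&])#(\w+)")
# A URL is anything non-blank after http:// or https://. Trailing
# punctuation is NOT stripped in this sketch.
URL_RE = re.compile(r"https?://\S+")


def extract_entities(text):
    """Return the mentions, hashtags and URLs found in raw tweet text."""
    urls = URL_RE.findall(text)
    # Blank out URLs first so that '#fragment' parts are not taken as hashtags.
    stripped = URL_RE.sub(" ", text)
    return {
        "mentions": MENTION_RE.findall(stripped),
        "hashtags": HASHTAG_RE.findall(stripped),
        "urls": urls,
    }


if __name__ == "__main__":
    tweet = "RT @jack: indexing tweets with #Solr http://example.com/a#b"
    print(extract_entities(tweet))
```

Each of the three lists could then probably go into its own multi-valued field next to the raw text, with the field types chosen according to the search requirements, as Anuj pointed out.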