Greetings, I am trying to index hashtags from Twitter. These are tokens that start with a # symbol, followed by any number of alphanumeric characters.
Examples:
1. #jane
2. #Jane
3. #Jane!

At a high level I'd like to be able to:
1. differentiate between, say, #jane and #jane!
2. differentiate between a hashtag such as #jane and a regular text token jane
3. ask for variations on #jane. By this I mean that #jane?, #jane!!!, and #jane!?!?? are all variations of #jane.

I'd appreciate pointers on what my considerations should be when I attempt the above.

Thanks,
MM.
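
P.S. To make the three requirements concrete, here is a rough Python sketch of the behavior I have in mind. It is only an illustration and is not tied to any particular indexing library. The regex, the `analyze` and `variations` names, and the choice to lowercase the shared "base" form are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is '#', one or more ASCII alphanumerics, then an
# optional run of '!' / '?'. A plain word is a run of alphanumerics.
TOKEN_RE = re.compile(r"(#?)([A-Za-z0-9]+)([!?]*)")


def analyze(text):
    """Return a list of (exact, base, is_hashtag) tuples for the text."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        hash_mark, word, suffix = match.groups()
        if hash_mark:
            # Exact form keeps '#' and the punctuation, so '#jane' != '#jane!'.
            exact = "#" + word + suffix
            # Base form drops the punctuation; every variation shares it.
            base = "#" + word.lower()
            tokens.append((exact, base, True))
        else:
            # Plain text token: no '#', trailing punctuation discarded.
            tokens.append((word, word.lower(), False))
    return tokens


def variations(texts, tag):
    """Collect every exact hashtag form that shares the base form of `tag`."""
    by_base = defaultdict(set)
    for text in texts:
        for exact, base, is_hashtag in analyze(text):
            if is_hashtag:
                by_base[base].add(exact)
    return sorted(by_base["#" + tag.lstrip("#").lower()])
```

The idea is that each hashtag would be indexed twice: once under its exact form (requirements 1 and 2) and once under a base form (requirement 3), while plain words never receive the # prefix. Whether #jane and #Jane should share a base form is an open question; the sketch assumes they do.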