Hello, I am looking into document classification for categorizing HTML documents. I see that Solr/Lucene with MoreLikeThis is well suited to finding documents similar to a given document.
I was able to do classification using the Lucene + MoreLikeThis example. Next, I looked into how to host Solr on Amazon EC2. I see that Bitnami provides AMI images for this. However, there are 4000+ AMI IDs to select from, and I am not sure which one to use. Could you please let me know which image is the correct one to use in this case? Alternatively, how can I create a new image with Tomcat + Solr and save it for future use? Thanks, Rajesh