First of all, I'd like to apologize in advance for being a pretty raw newbie when it comes to search technologies, so please bear with me!
The situation: My company has a system that moderates 15-character free-form text fields. We have a dictionary of words in our database that are banned for various legal reasons (profanity, copyright issues, etc.). Our system does an initial check when the user is entering their choices for these fields and auto-rejects anything that matches or comes close to matching (based on phonetics, purposeful misspellings, etc.) anything on our banned list. Once the order is placed, the system checks again to see if the fields are exact matches to anything on our "auto-approve" list (held in the same database as the banned list) and passes those on through. Items that do not match either list are moved to a review queue where a customer service rep (CSR) manually reviews them. During the review the CSR can add a word to either list, which prevents future orders using the newly added value from needing review.

My question: Currently our system simply holds all the words in a hash map in memory, but we're worried about scalability. I've been asked to find out more about Solr and how it compares to Endeca, which another of our departments uses but which I'm not very familiar with. I've been reading the wiki and other articles I've found online, and it seems like there's a lot of feature overlap between Solr and Endeca; the main difference just seems to be cost. Endeca also seems to have better support for real-time searches and a stricter sorting algorithm. On the other hand, it sounds like Solr re-indexes quickly enough for my purposes, and its sorting algorithm can be tweaked to match what I need.

Are there any other technical differences between the two if used in the scenario I described above? Also, are there any important hardware footprint differences? I'm no admin, but I believe our system runs on JBoss on a Solaris box, last I checked.

Any help or insight you guys can provide would help greatly. Thanks!
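
To make the flow described above concrete, here is a minimal in-memory sketch in Java. It is only an illustration under stated assumptions, not the actual system: the class name `ModerationSketch`, its methods, and the `Decision` enum are all hypothetical, and the "close to matching" check is reduced to a crude normalization (lowercasing, undoing a few common character substitutions, stripping non-letters) because the post does not describe the real phonetic matching.

```java
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the banned / auto-approve / review flow.
class ModerationSketch {

    enum Decision { REJECTED, APPROVED, NEEDS_REVIEW }

    // Banned words are stored in normalized form; approved values are stored verbatim.
    private final Set<String> banned = ConcurrentHashMap.newKeySet();
    private final Set<String> approved = ConcurrentHashMap.newKeySet();

    // Assumption: a stand-in for the real phonetic / misspelling logic.
    static String normalize(String text) {
        String s = text.toLowerCase(Locale.ROOT)
                .replace('0', 'o').replace('1', 'i').replace('3', 'e')
                .replace('4', 'a').replace('5', 's')
                .replace('@', 'a').replace('$', 's');
        return s.replaceAll("[^a-z]", "");
    }

    // A CSR adds a word to the banned list.
    void ban(String word) {
        String n = normalize(word);
        if (!n.isEmpty()) {          // an empty pattern would match everything
            banned.add(n);
        }
    }

    // A CSR adds a value to the auto-approve list (exact matches only).
    void approve(String text) {
        approved.add(text);
    }

    // Linear scan: cost grows with the size of the banned list.
    boolean isBanned(String text) {
        String n = normalize(text);
        for (String b : banned) {
            if (n.contains(b)) {
                return true;
            }
        }
        return false;
    }

    // Banned check first, then exact auto-approve match, otherwise manual review.
    Decision classify(String text) {
        if (isBanned(text)) {
            return Decision.REJECTED;
        }
        if (approved.contains(text)) {
            return Decision.APPROVED;
        }
        return Decision.NEEDS_REVIEW;
    }
}
```

In this sketch the auto-approve lookup is a constant-time set lookup, while the near-match check walks every banned word on each call. That second part is the piece a search index would be expected to take over as the lists grow.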
--
View this message in context: http://lucene.472066.n3.nabble.com/Endeca-vs-Solr-tp832826p832826.html
Sent from the Solr - User mailing list archive at Nabble.com.