Hi Everyone,

We are looking for someone to help us build a similarity engine. Here are some 
preliminary specs for the project.

1) We want to be able to show similar posts when a user posts a new block of 
text. A good example of this is StackOverflow. When a user tries to ask a new 
question, the system displays similar questions.

2) This is for a messaging system, so indexing/analysis should preferably 
happen at the time of posting, not later.

3) The posts are going to be less than 1000 characters.

4) We anticipate having millions of posts, so the solution should consider 
sharding the indexes across many machines.

5) The solution can be delivered as a standalone Java SE solution that can be 
run from the command line; no web development is necessary.

6) We expect clean APIs.
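
To make points 1, 2 and 4 concrete, here is a rough sketch of the kind of API 
we have in mind. It is only an illustration, not a required design: the class 
name SimilarityIndex, its methods, the Jaccard token-overlap score and the 
in-memory "shards" chosen by post id are all assumptions made for this example. 
A real solution would likely use a proper search library and put the shards on 
separate machines.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: inverted index + Jaccard similarity, sharded by post id. */
class SimilarityIndex {
    /** One shard: token -> ids of posts containing it, plus each post's token set. */
    private static final class Shard {
        final Map<String, Set<Long>> postings = new HashMap<>();
        final Map<Long, Set<String>> tokensByPost = new HashMap<>();
    }

    private final List<Shard> shards = new ArrayList<>();

    SimilarityIndex(int shardCount) {
        for (int i = 0; i < shardCount; i++) shards.add(new Shard());
    }

    /** Lowercases the text and splits it on non-alphanumeric characters. */
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Jaccard similarity: |intersection| / |union| of the two token sets. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 0.0;
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int union = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / union;
    }

    /** Indexes a post at posting time; the shard is picked from the post id. */
    void add(long postId, String text) {
        Shard shard = shards.get((int) Math.floorMod(postId, (long) shards.size()));
        Set<String> tokens = tokenize(text);
        shard.tokensByPost.put(postId, tokens);
        for (String t : tokens) {
            shard.postings.computeIfAbsent(t, k -> new HashSet<>()).add(postId);
        }
    }

    /** Queries every shard; returns up to `limit` post ids, most similar first. */
    List<Long> findSimilar(String text, int limit) {
        Set<String> query = tokenize(text);
        Map<Long, Double> scores = new HashMap<>();
        for (Shard shard : shards) {
            // Candidates are posts sharing at least one token with the query.
            Set<Long> candidates = new HashSet<>();
            for (String t : query) {
                candidates.addAll(shard.postings.getOrDefault(t, Collections.emptySet()));
            }
            for (long id : candidates) {
                scores.put(id, jaccard(query, shard.tokensByPost.get(id)));
            }
        }
        List<Long> ranked = new ArrayList<>(scores.keySet());
        // Highest score first; ties are broken by ascending post id.
        ranked.sort((a, b) -> {
            int c = Double.compare(scores.get(b), scores.get(a));
            return c != 0 ? c : Long.compare(a, b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        SimilarityIndex index = new SimilarityIndex(4);
        index.add(1L, "How do I shard a Lucene index across machines?");
        index.add(2L, "Best way to parse JSON in Java");
        index.add(3L, "Sharding a search index on many machines");
        System.out.println(index.findSimilar("How to shard an index across many machines", 2));
    }
}
```

The point of the sketch is the shape of the interface: add() does all indexing 
work when the post arrives, and findSimilar() fans out to every shard and 
merges the scores. The scoring method itself is open for discussion.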

Thanks,

Drew
