Hello, I am looking for ideas/suggestions to solve the use case I am working on.
We have a couple of fields in the schema besides id: business_email and personal_email. We need to return records based on whether their business and personal emails are unique.

A record is unique if neither its business nor its personal email appears in any other record. A record is non-unique if either its business or its personal email appears in another record; in that case all records sharing that email are non-unique.

E.g. considering the documents below:

- For unique records, only id=1 should be returned (since john.doe is not present in any other record's personal or business email).
- For non-unique records, id=2,3 should be returned (since isabel.dora is present in multiple records; it doesn't matter whether it appears in the business or the personal email).

Documents
===
{id:1,business_email_s:john....@abc.com,personal_email_s:john....@abc.com}
{id:2,business_email_s:isabel.d...@abc.com}
{id:3,personal_email_s:isabel.d...@abc.com}

I am able to solve this using a Streaming Expression query, but I am not sure whether performance will become a bottleneck, as the streaming expression is quite big. So I am looking for different ideas, such as de-duplication or handling this during ingestion/pre-processing, without impacting performance much.

Thanks,
Susheel
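
The uniqueness criteria described above can be sketched as a small pre-processing pass in Python. This is only an illustration under assumptions, not a Solr feature or the streaming expression mentioned above: the function name split_unique and the in-memory list of dicts are hypothetical, the email values are kept truncated exactly as they appear in the sample documents, and a record with no email at all is treated as unique.

```python
from collections import Counter

# Field names taken from the sample documents.
EMAIL_FIELDS = ("business_email_s", "personal_email_s")


def record_emails(doc):
    """Distinct emails of one record, ignoring missing fields."""
    return {doc[f] for f in EMAIL_FIELDS if doc.get(f)}


def split_unique(docs):
    """Return (unique_ids, non_unique_ids) for a list of records.

    Hypothetical helper: a record is non-unique if any of its emails
    also appears in another record, in either email field.
    """
    # Count the number of records each email appears in. Using a set per
    # record means the same email in both fields of one record counts once.
    counts = Counter()
    for doc in docs:
        counts.update(record_emails(doc))

    unique_ids, non_unique_ids = [], []
    for doc in docs:
        if any(counts[email] > 1 for email in record_emails(doc)):
            non_unique_ids.append(doc["id"])
        else:
            unique_ids.append(doc["id"])
    return unique_ids, non_unique_ids


# Sample documents; addresses are truncated as in the original post.
docs = [
    {"id": 1, "business_email_s": "john....@abc.com",
     "personal_email_s": "john....@abc.com"},
    {"id": 2, "business_email_s": "isabel.d...@abc.com"},
    {"id": 3, "personal_email_s": "isabel.d...@abc.com"},
]

unique_ids, non_unique_ids = split_unique(docs)
print(unique_ids, non_unique_ids)
```

The two-pass shape (count first, classify second) is the point of the sketch: the same per-email record count could be computed once at ingestion and stored on each document as a flag, so that the query side only filters on that flag. Whether that fits depends on how often records are updated, since a new record can turn a previously unique one into a non-unique one.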