DeDuplication

Deduplication helps to avoid indexing duplicate records in the search engine.

Deduplication is an optional query parameter that can be passed in Parse and Index API requests. When it is passed in a request, duplicate records are not indexed in the search engine.

Duplicate records are checked only when checkDuplicate is set to true. Duplicate checks can be applied at the field level or at the output level. By default, when no fields are set, the check runs at the output level: any record that generates the same output as an existing record is considered a duplicate.

At the output level, the output is compared using its MD5 checksum hash. The other way is to set a field, or a combination of fields, that identifies a duplicate record. Set the update parameter to true to update the record stored under the existing ID. By default update is false, which means duplicate records are not updated and the old values are kept.
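The two duplicate checks and the update flag can be sketched as follows. This is only an illustration of the behaviour described above, not the service's implementation: the names ToyIndex and dedup_key, the in-memory dictionaries, and the way the output is serialized before hashing are all assumptions.

```python
import hashlib
import json


def dedup_key(output, fields=None):
    """Return an MD5 hex digest that identifies a record for duplicate checks."""
    if fields:
        # Field-level check: only the values of the chosen fields matter.
        basis = [output.get(name) for name in fields]
    else:
        # Output-level check: the whole output is hashed.
        basis = output
    # sort_keys keeps the serialization stable, so equal outputs hash equally.
    canonical = json.dumps(basis, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


class ToyIndex:
    """Hypothetical in-memory stand-in for the search engine index."""

    def __init__(self):
        self.records = {}    # record ID -> output
        self.key_to_id = {}  # dedup key -> ID of the first record with that key
        self._next_id = 1

    def add(self, output, check_duplicate=False, fields=None, update=False):
        key = dedup_key(output, fields)
        if check_duplicate and key in self.key_to_id:
            existing_id = self.key_to_id[key]
            if update:
                # update = true: new data replaces the old, the ID stays the same.
                self.records[existing_id] = output
                return "updated", existing_id
            # update = false (default): the new entry is ignored.
            return "ignored", existing_id
        record_id = self._next_id
        self._next_id += 1
        self.records[record_id] = output
        self.key_to_id.setdefault(key, record_id)
        return "indexed", record_id
```

With fields set to ["Email", "FullName"], two records that share those values but differ in another field count as duplicates; with no fields set, they are distinct because their full outputs hash differently.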

API request Parameters

Request parameters for Deduplication are described below:
Name | Type | Description | Remarks
deDuplication | Object | Holds the deduplication settings listed below (checkDuplicate, update, fields). | Optional
checkDuplicate | Boolean | true or false. Duplicate records are checked only when this is set to true. | Required
update | Boolean | Default is false, which means that if a duplicate record exists, the new entry is ignored. If set to true, the old record is updated with the new data, but its ID remains the same. | Optional
fields | String Array | List of fields whose combination is used to detect duplicates. If it is not set, deduplication checks the checksum hash of the output. | Optional

JSON request with Deduplication

"deDuplication":
    {
      "checkDuplicate":true,
      "fields":["Email","FullName"],
      "update":true
    }
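
The object above can also be assembled programmatically. The helper below is a minimal sketch: the function name build_deduplication is hypothetical, and how the resulting object is attached to a Parse and Index request is not shown here, so treat it as an assumption to adapt to your client code.

```python
import json


def build_deduplication(check_duplicate, fields=None, update=False):
    """Build the deDuplication object using the parameter names from the table."""
    # checkDuplicate is the only required setting and must be a boolean.
    if not isinstance(check_duplicate, bool):
        raise TypeError("checkDuplicate must be true or false")
    block = {"checkDuplicate": check_duplicate, "update": bool(update)}
    if fields:
        # Omitting fields leaves the check at the output (checksum) level.
        block["fields"] = list(fields)
    return {"deDuplication": block}


if __name__ == "__main__":
    payload = build_deduplication(True, ["Email", "FullName"], update=True)
    print(json.dumps(payload, indent=2))
```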