Deduplication
Deduplication helps to avoid indexing duplicate records in the search engine.
Deduplication is an optional parameter (query) that you can pass in the Parse and Index API request. When it is passed, duplicate records are not indexed in the search engine.
Within Deduplication, you must set checkDuplicate = true; duplicate records are checked only when it is true. You can set duplicate checks at the field level or at the output level. By default the check is at the output level: if no fields are set, any record that generates the same output as an existing record is considered a duplicate.
At the output level, duplicates are detected with the MD5 checksum hash of the output. Alternatively, you can set a field or a combination of fields to identify duplicate records. Set the update parameter to true to update the existing record under its existing ID. By default it is false, which means duplicate records are not updated and the old value is kept.
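The two duplicate checks described above can be sketched in Python. This is only an illustration of the idea, not the service's actual implementation: the helper names `dedup_key` and `index_record`, the in-memory `index` dictionary, and the JSON canonicalization before hashing are all assumptions. It models the case where checkDuplicate is true.

```python
import hashlib
import json


def dedup_key(record, fields=None):
    """Return an MD5 hex digest used to decide whether two records are duplicates.

    Hypothetical helper: the real service may build its checksum differently.
    """
    if fields:
        # Field-level check: hash only the chosen combination of fields.
        basis = {name: record.get(name) for name in fields}
    else:
        # Output-level check (default): hash the whole output.
        basis = record
    canonical = json.dumps(basis, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def index_record(index, record, record_id, fields=None, update=False):
    """Add a record to `index` (a dict mapping dedup key -> (id, record)).

    Returns "indexed", "updated", or "ignored".
    """
    key = dedup_key(record, fields)
    if key not in index:
        index[key] = (record_id, record)
        return "indexed"
    if update:
        # Duplicate found: replace the data but keep the existing ID.
        existing_id, _ = index[key]
        index[key] = (existing_id, record)
        return "updated"
    # Duplicate found and update is false: keep the old value.
    return "ignored"
```

With fields set to Email and FullName, two records that differ only in another field (for example a phone number) produce the same key, so the second one is ignored unless update is true.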
API Request Parameters
Name | Type | Description | Remarks |
---|---|---|---|
deDuplication | Object | Container for the deduplication settings: checkDuplicate, update, and fields. | Optional |
checkDuplicate | Boolean | Set to true to check for duplicate records; duplicates are checked only when this is true. | Required |
update | Boolean | Default is false, which means that if a duplicate record exists, the new entry is ignored. If set to true, the old record is updated with the new data, but its ID remains the same. | Optional |
fields | String Array | List of fields whose combination is used to detect duplicates. If it is not set, deduplication checks the checksum hash of the output. | Optional |
JSON request with Deduplication
"deDuplication":
{
  "checkDuplicate": true,
  "fields": ["Email", "FullName"],
  "update": true
}
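
As a rough sketch, the same fragment can be built in Python before it is merged into a Parse and Index request body. The helper name `dedup_options` is hypothetical, and the other request fields (and where exactly this object sits in the full request) are not shown in this section, so they are left out here.

```python
import json


def dedup_options(fields=None, update=False, check_duplicate=True):
    """Build the deDuplication object (hypothetical helper, not part of the API)."""
    options = {"checkDuplicate": check_duplicate, "update": update}
    if fields:
        # Omit "fields" to fall back to the output-level checksum check.
        options["fields"] = list(fields)
    return options


# Only the deDuplication part of the request is shown here.
payload = {"deDuplication": dedup_options(["Email", "FullName"], update=True)}
print(json.dumps(payload, indent=2))
```

Calling `dedup_options()` with no arguments yields an object without fields and with update set to false, which corresponds to the default output-level check where duplicates are ignored.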