CSV Remove Duplicates API Documentation

Overview

The CSV Remove Duplicates API removes duplicate rows from a CSV file.
It supports three modes of operation:

  • Body Mode (Synchronous): Send CSV content directly in the request body.
  • URL Mode (Asynchronous): Provide a remote file URL for processing.
  • File Mode (Asynchronous): Provide a previously uploaded file reference for processing.

The API can detect the delimiter automatically or use a custom one. You can also specify whether the CSV has a header row and how duplicates are handled.


Endpoint (POST)

POST https://api.apidatatools.com/csv-remove-duplicates-api


Headers

Header                Type    Required  Description
x-api-key             string  Yes       Your API key for authentication.
x-source-type         string  Optional  Input source type: body (default), url, or file.
x-has-header          string  Optional  Whether the first row is a header. Accepts 1, true, or yes (default), or 0, false, or no.
x-delimiter           string  Optional  Custom delimiter, e.g. comma (,), semicolon (;), tab (\t), or pipe (|). Auto-detected if not provided.
x-duplicate-handling  string  Optional  How to handle duplicates: first (default), last, or none.
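
As an illustration of how these headers fit together, the sketch below assembles them in Python and rejects values the table does not list. The helper name build_headers and its client-side validation are hypothetical conveniences, not part of the API; only the header names and accepted values come from the table above.

```python
# Hypothetical helper: header names and accepted values are taken from the
# table above; the function itself and its validation are illustrative only.
ALLOWED_SOURCE_TYPES = {"body", "url", "file"}
ALLOWED_DUPLICATE_HANDLING = {"first", "last", "none"}


def build_headers(api_key, source_type="body", has_header=True,
                  delimiter=None, duplicate_handling="first"):
    """Return the request headers as a plain dict of strings."""
    if source_type not in ALLOWED_SOURCE_TYPES:
        raise ValueError(f"unsupported x-source-type: {source_type!r}")
    if duplicate_handling not in ALLOWED_DUPLICATE_HANDLING:
        raise ValueError(
            f"unsupported x-duplicate-handling: {duplicate_handling!r}")
    headers = {
        "x-api-key": api_key,
        "x-source-type": source_type,
        "x-has-header": "true" if has_header else "false",
        "x-duplicate-handling": duplicate_handling,
    }
    # Leave x-delimiter out entirely so the API falls back to auto-detection.
    if delimiter is not None:
        headers["x-delimiter"] = delimiter
    return headers
```

Calling build_headers("YOUR_API_KEY") yields the documented defaults; passing delimiter=";" adds the x-delimiter header.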

Accepted File Extensions

Mode       Allowed Extensions
URL Mode   .txt, .csv, .log
File Mode  .txt, .csv, .log
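
A client can check the extension before submitting an asynchronous request and avoid a round trip that would end in INVALID_FILE_EXTENSION. The sketch below is one way to do that. The function name is hypothetical, and it assumes the check is case-insensitive and applies to the path portion of a URL (ignoring any query string); the API's exact rules may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions listed in the table above for URL Mode and File Mode.
ALLOWED_EXTENSIONS = {".txt", ".csv", ".log"}


def has_allowed_extension(reference):
    """Accept either a remote URL or an uploaded file reference.

    ASSUMPTION: case-insensitive match on the path only; the server-side
    rule is not documented here.
    """
    path = urlparse(reference).path
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS
```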

Input Example (Body Mode)

Headers

x-source-type: body
x-has-header: true
x-duplicate-handling: first

Body

name,age,city
Alice,30,New York
Bob,25,London
Alice,30,New York
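
To make the x-duplicate-handling options concrete, here is a local Python sketch applied to the sample body above. It is not the API's implementation. It assumes that first keeps the first occurrence of each row, last keeps the last occurrence, and none drops every row that occurs more than once; the meaning of none in particular is an assumption, so verify it against real responses before relying on it.

```python
import csv
import io
from collections import Counter


def dedupe_rows(csv_text, has_header=True, duplicate_handling="first"):
    """Local illustration only; the semantics of each mode are assumed."""
    rows = [tuple(r) for r in csv.reader(io.StringIO(csv_text)) if r]
    header, data = (rows[:1], rows[1:]) if has_header else ([], rows)
    if duplicate_handling == "first":
        # dict preserves insertion order, so first occurrences win.
        kept = list(dict.fromkeys(data))
    elif duplicate_handling == "last":
        # Reverse, keep first occurrences, then restore original order.
        kept = list(dict.fromkeys(reversed(data)))[::-1]
    elif duplicate_handling == "none":
        # ASSUMPTION: "none" removes all copies of any duplicated row.
        counts = Counter(data)
        kept = [r for r in data if counts[r] == 1]
    else:
        raise ValueError(f"unknown mode: {duplicate_handling!r}")
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(header + kept)
    return out.getvalue()
```

With first, the sample reduces to the header plus the Alice and Bob rows, which matches the preview shown in the successful Body Mode response.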


Input Example (URL Mode)

Headers

x-source-type: url
x-has-header: true
x-duplicate-handling: last

Body

{
  "url": "https://example.com/sample.csv"
}


Input Example (File Mode)

Headers

x-source-type: file
x-has-header: true
x-duplicate-handling: none

Body

{
  "file": "user123/upl_abc123/sample.csv"
}


Example Request

Synchronous (Body Input)

curl -X POST "https://api.apidatatools.com/csv-remove-duplicates-api" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "x-source-type: body" \
  -H "x-has-header: true" \
  -H "x-duplicate-handling: first" \
  -d "name,age,city
Alice,30,New York
Bob,25,London
Alice,30,New York"
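
The same synchronous call can be sketched in Python with only the standard library. The function names are hypothetical, and encoding the body as UTF-8 is an assumption since the expected encoding is not stated here. Like the curl command, the sketch does not set a Content-Type header explicitly.

```python
import json
import urllib.request

ENDPOINT = "https://api.apidatatools.com/csv-remove-duplicates-api"


def build_body_request(api_key, csv_text, duplicate_handling="first"):
    """Build (but do not send) a Body Mode request."""
    return urllib.request.Request(
        ENDPOINT,
        data=csv_text.encode("utf-8"),  # ASSUMPTION: UTF-8 body
        method="POST",
        headers={
            "x-api-key": api_key,
            "x-source-type": "body",
            "x-has-header": "true",
            "x-duplicate-handling": duplicate_handling,
        },
    )


def send_request(request, timeout=30):
    """Send the request and return (status_code, parsed_json).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so the
    error codes listed under Error Handling surface as exceptions here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))
```

Splitting construction from sending keeps the request easy to inspect or log before any network call is made.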

Asynchronous (Remote File URL)

curl -X POST "https://api.apidatatools.com/csv-remove-duplicates-api" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "x-source-type: url" \
  -H "x-has-header: true" \
  -d '{"url":"https://example.com/sample.csv"}'

Asynchronous (Uploaded File Reference)

curl -X POST "https://api.apidatatools.com/csv-remove-duplicates-api" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "x-source-type: file" \
  -H "x-has-header: true" \
  -d '{"file":"user123/upl_abc123/sample.csv"}'
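
Both asynchronous modes send a small JSON body whose single key matches the source type, so one Python helper can cover them. This is a sketch with a hypothetical function name. It mirrors the curl commands above, which do not set a Content-Type header; whether the API requires one is not stated here.

```python
import json
import urllib.request

ENDPOINT = "https://api.apidatatools.com/csv-remove-duplicates-api"


def build_async_request(api_key, source_type, reference, has_header=True):
    """Build (but do not send) a URL Mode or File Mode request.

    source_type is "url" or "file"; reference is the remote URL or the
    uploaded file reference, respectively.
    """
    if source_type not in ("url", "file"):
        raise ValueError("source_type must be 'url' or 'file'")
    # The JSON key is the same word as the source type: {"url": ...}
    # or {"file": ...}, as in the input examples.
    body = json.dumps({source_type: reference}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            "x-source-type": source_type,
            "x-has-header": "true" if has_header else "false",
        },
    )
```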

Example Response

Successful (Body Mode)

Status Code: 200 OK

{
  "status": "success",
  "request_id": "b7a1e1e2-9c3f-4a9b-9b1b-1d3f4a2b9e8f",
  "file": "https://downloads.apidatatools.com/apidatatools_convert_abc123.csv",
  "preview": "name,age,city\nAlice,30,New York\nBob,25,London\n"
}

Async Job Accepted

Status Code: 202 Accepted

{
  "status": "accepted",
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status_url": "https://api.apidatatools.com/jobs/c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "request_id": "b7a1e1e2-9c3f-4a9b-9b1b-1d3f4a2b9e8f"
}
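
A caller usually has to tell the two success shapes apart: a 200 response already carries the output file, while a 202 response only carries a job to follow up on. The Python sketch below shows one way to branch on them. The function name and the returned tuples are illustrative; the field names come from the examples above.

```python
def classify_response(status_code, payload):
    """Return ("done", file_url) or ("pending", status_url).

    payload is the parsed JSON body of the POST response.
    """
    if status_code == 200 and payload.get("status") == "success":
        return "done", payload["file"]
    if status_code == 202 and payload.get("status") == "accepted":
        return "pending", payload["status_url"]
    raise RuntimeError(
        f"unexpected response: HTTP {status_code}, "
        f"status={payload.get('status')!r}")
```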


Error Handling

Error Code                    HTTP Status  Description                                      Example
INVALID_CSV                   400          CSV content is empty or cannot be parsed.        {"status":"error","error":"INVALID_CSV","details":{"message":"CSV content is empty."}}
CSV_DUPLICATE_REMOVAL_FAILED  400          Duplicate removal failed due to malformed data.  {"status":"error","error":"CSV_DUPLICATE_REMOVAL_FAILED","details":{"message":"Failed to remove duplicates from CSV."}}
INVALID_URL                   400          URL is missing or invalid.                       {"status":"error","error":"INVALID_URL","details":{"message":"Missing or invalid 'url'."}}
URL_UNREACHABLE               400          URL could not be reached.                        {"status":"error","error":"URL_UNREACHABLE","details":{"message":"Could not reach URL."}}
URL_NOT_OK                    400          URL returned a non-200 response.                 {"status":"error","error":"URL_NOT_OK","details":{"message":"URL returned HTTP 404, expected 200."}}
INVALID_FILE                  400          File path missing or invalid.                    {"status":"error","error":"INVALID_FILE","details":{"message":"Missing or invalid 'file'."}}
FILE_UNAVAILABLE              400          File not accessible in storage.                  {"status":"error","error":"FILE_UNAVAILABLE","details":{"message":"Could not access file."}}
INVALID_FILE_EXTENSION        400          File extension not supported.                    {"status":"error","error":"INVALID_FILE_EXTENSION","details":{"message":"Invalid or unsupported file extension."}}
FILE_TOO_LARGE                413          File exceeds plan limit.                         {"status":"error","error":"FILE_TOO_LARGE","details":{"message":"File exceeds plan limit."}}
PAYLOAD_TOO_LARGE             413          Request body exceeds allowed size.               {"status":"error","error":"PAYLOAD_TOO_LARGE","details":{"message":"Request body exceeds plan limit."}}
INTERNAL_ERROR                500          Unexpected internal error.                       {"status":"error","error":"INTERNAL_ERROR","details":{"message":"Failed to process your request"}}
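
All of the error bodies above share one shape (status, error, details.message), so a client can turn them into a single exception type. The Python sketch below shows one possible approach. The ApiError class and parse_error function are hypothetical names, and the fallback to UNKNOWN for bodies that are not in this shape is an assumption.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status, the error code, and the detail message."""

    def __init__(self, http_status, code, message):
        super().__init__(f"{code} (HTTP {http_status}): {message}")
        self.http_status = http_status
        self.code = code
        self.message = message


def parse_error(http_status, body_text):
    """Build an ApiError from an error response body."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # ASSUMPTION: fall back to "UNKNOWN" when the body is not in the
    # documented shape (for example, an HTML page from a proxy).
    code = payload.get("error", "UNKNOWN")
    details = payload.get("details")
    message = details.get("message", "") if isinstance(details, dict) else ""
    return ApiError(http_status, code, message)
```

With urllib, a 4xx or 5xx response arrives as urllib.error.HTTPError; its code attribute and the decoded result of its read() method can be passed straight to parse_error.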

Async Job Status

GET https://api.apidatatools.com/jobs/{job_id}

Example Status Response (Completed Job)

Status Code: 200 OK

{
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status": "success",
  "created_at": 1712345678,
  "updated_at": 1712345690,
  "result": {
    "status": "success",
    "file": "https://downloads.apidatatools.com/apidatatools_convert_abc123.csv",
    "request_id": "b7a1e1e2-9c3f-4a9b-9b1b-1d3f4a2b9e8f",
    "preview": "name,age,city\nAlice,30,New York\nBob,25,London\n"
  }
}

Queued Example

{
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status": "queued",
  "message": "Your job is being processed.",
  "retry_after": 2
}

Failed Example

{
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status": "failed",
  "error": {
    "code": "INVALID_CSV",
    "message": "Failed to parse CSV.",
    "details": {"message": "CSV content cannot be empty or whitespace."}
  }
}
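
The three status payloads above suggest a simple polling loop: fetch the status, return the result on success, raise on failed, and otherwise wait and retry. The Python sketch below shows that loop. It assumes retry_after is expressed in seconds and treats any status other than success or failed as still in progress; both are assumptions. The fetch_status callable is a hypothetical hook that takes the status URL and returns the parsed JSON, which keeps the loop independent of any HTTP library.

```python
import time


def wait_for_job(fetch_status, status_url, max_attempts=30,
                 default_delay=2.0, sleep=time.sleep):
    """Poll status_url until the job succeeds or fails.

    fetch_status(status_url) must return the parsed JSON status payload.
    """
    for _ in range(max_attempts):
        job = fetch_status(status_url)
        status = job.get("status")
        if status == "success":
            return job["result"]
        if status == "failed":
            error = job.get("error", {})
            raise RuntimeError(
                f"{error.get('code')}: {error.get('message')}")
        # ASSUMPTION: anything else (e.g. "queued") means "not done yet",
        # and retry_after is a delay in seconds.
        sleep(job.get("retry_after", default_delay))
    raise TimeoutError(f"job not finished after {max_attempts} attempts")
```

The max_attempts cap keeps a stuck job from blocking the caller indefinitely; tune it to the file sizes you expect.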


Notes for Developers

  • The API supports synchronous and asynchronous processing.
  • For large files or remote URLs, use x-source-type: url or x-source-type: file to trigger asynchronous processing.
  • Each response includes a unique request_id for traceability.
  • For asynchronous jobs, use the status_url to check job progress or retrieve results.
  • Output files are temporarily hosted and automatically expire after a retention period defined by your plan.