CSV Remove Duplicates API Documentation
Overview
The CSV Remove Duplicates API removes duplicate rows from a CSV file.
It supports three modes of operation:
- Body Mode (Synchronous): Send CSV content directly in the request body.
- URL Mode (Asynchronous): Provide a remote file URL for processing.
- File Mode (Asynchronous): Provide a previously uploaded file reference for processing.
The API auto-detects the delimiter unless you supply a custom one. You can also specify whether the CSV has a header row and how duplicates are handled.
Endpoint (POST)
POST https://api.apidatatools.com/csv-remove-duplicates-api
Headers
| Header | Type | Required | Description |
|---|---|---|---|
| `x-api-key` | string | Yes | Your API key for authentication. |
| `x-source-type` | string | Optional | Input source type: `body` (default), `url`, or `file`. |
| `x-has-header` | string | Optional | Indicates if the first row is a header. Accepts `1`, `true`, `yes` (default) or `0`, `false`, `no`. |
| `x-delimiter` | string | Optional | Custom delimiter (e.g., `,`, `;`, `\t`, `\|`). If not provided, the delimiter is auto-detected. |
| `x-duplicate-handling` | string | Optional | How to handle duplicates: `first` (default), `last`, or `none`. |
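The header table can be mirrored in a small client-side helper that rejects unsupported values before a request is sent. The following is a minimal sketch: `build_headers` and its parameter names are hypothetical, and only the header names and allowed values come from the table above.

```python
# Allowed values copied from the header table; everything else is illustrative.
SOURCE_TYPES = {"body", "url", "file"}
DUPLICATE_HANDLING = {"first", "last", "none"}


def build_headers(api_key, source_type="body", has_header=True,
                  delimiter=None, duplicate_handling="first"):
    """Return a dict of request headers (hypothetical helper, not part of the API)."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unsupported x-source-type: {source_type!r}")
    if duplicate_handling not in DUPLICATE_HANDLING:
        raise ValueError(f"unsupported x-duplicate-handling: {duplicate_handling!r}")
    headers = {
        "x-api-key": api_key,
        "x-source-type": source_type,
        "x-has-header": "true" if has_header else "false",
        "x-duplicate-handling": duplicate_handling,
    }
    # Leave x-delimiter out entirely so the API falls back to auto-detection.
    if delimiter is not None:
        headers["x-delimiter"] = delimiter
    return headers
```

Calling `build_headers("YOUR_API_KEY", source_type="url", delimiter=";")` yields a dict that can be passed to any HTTP client.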
Accepted File Extensions
| Mode | Allowed Extensions |
|---|---|
| URL Mode | .txt, .csv, .log |
| File Mode | .txt, .csv, .log |
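A client can check the extension locally before submitting a URL or file reference, which avoids a round trip that would end in `INVALID_FILE_EXTENSION`. This is a sketch under assumptions: `has_allowed_extension` is a hypothetical helper, and the case-insensitive comparison is a guess, since the documentation does not say how the server treats upper-case extensions.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions listed in the table above for both URL Mode and File Mode.
ALLOWED_EXTENSIONS = {".txt", ".csv", ".log"}


def has_allowed_extension(reference):
    """Check a URL or file reference against the documented extensions."""
    # urlparse drops any query string; a plain file reference is returned as the path.
    path = urlparse(reference).path
    # Lower-casing is an assumption; the server-side check may be stricter.
    return PurePosixPath(path).suffix.lower() in ALLOWED_EXTENSIONS
```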
Input Example (Body Mode)
Headers
x-source-type: body
x-has-header: true
x-duplicate-handling: first
Body
name,age,city
Alice,30,New York
Bob,25,London
Alice,30,New York
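To make the effect on the sample body concrete, here is a local sketch of row deduplication. It is not the service's implementation: `dedupe_csv` is a hypothetical function, rows are compared as exact tuples of fields, and the meaning of `last` (keep the final occurrence, in its original position) is an assumption. The `none` option is left out because the documentation does not define its behavior.

```python
import csv
import io


def dedupe_csv(text, has_header=True, keep="first", delimiter=","):
    """Drop repeated rows locally; an illustration only, not the API's algorithm."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = (rows[:1], rows[1:]) if has_header else ([], rows)
    if keep == "first":
        ordered = data
    elif keep == "last":
        # Walk backwards so the final occurrence of each row is seen first.
        ordered = data[::-1]
    else:
        raise ValueError("only 'first' and 'last' are modeled in this sketch")
    seen = set()
    kept = []
    for row in ordered:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    if keep == "last":
        kept.reverse()  # restore the original row order
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerows(header + kept)
    return out.getvalue()
```

With the body above and `keep="first"`, the sketch returns the same text as the `preview` field in the documented success response.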
Input Example (URL Mode)
Headers
x-source-type: url
x-has-header: true
x-duplicate-handling: last
Body
{
  "url": "https://example.com/sample.csv"
}
Input Example (File Mode)
Headers
x-source-type: file
x-has-header: true
x-duplicate-handling: none
Body
{
  "file": "user123/upl_abc123/sample.csv"
}
Example Request
Synchronous (Body Input)
curl -X POST "https://api.apidatatools.com/csv-remove-duplicates-api" \
-H "x-api-key: YOUR_API_KEY" \
-H "x-source-type: body" \
-H "x-has-header: true" \
-H "x-duplicate-handling: first" \
-d "name,age,city
Alice,30,New York
Bob,25,London
Alice,30,New York"
Asynchronous (Remote File URL)
curl -X POST "https://api.apidatatools.com/csv-remove-duplicates-api" \
-H "x-api-key: YOUR_API_KEY" \
-H "x-source-type: url" \
-H "x-has-header: true" \
-d '{"url":"https://example.com/sample.csv"}'
Asynchronous (Input File)
curl -X POST "https://api.apidatatools.com/csv-remove-duplicates-api" \
-H "x-api-key: YOUR_API_KEY" \
-H "x-source-type: file" \
-H "x-has-header: true" \
-d '{"file":"user123/upl_abc123/sample.csv"}'
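The three curl calls differ only in the `x-source-type` header and in how the body is encoded: raw CSV text for `body`, and a one-key JSON object for `url` or `file`. The sketch below builds an equivalent request with Python's standard library without sending it. `build_request` is a hypothetical helper and UTF-8 encoding is an assumption. No `Content-Type` header is set because the curl examples do not set one either.

```python
import json
import urllib.request

ENDPOINT = "https://api.apidatatools.com/csv-remove-duplicates-api"


def build_request(api_key, source_type, payload, has_header=True):
    """Build (but do not send) a POST request for one of the three modes."""
    headers = {
        "x-api-key": api_key,
        "x-source-type": source_type,
        "x-has-header": "true" if has_header else "false",
    }
    if source_type == "body":
        # Body Mode: the CSV text itself is the request body.
        data = payload.encode("utf-8")
    elif source_type in ("url", "file"):
        # URL and File Mode: a JSON object keyed by the source type.
        data = json.dumps({source_type: payload}).encode("utf-8")
    else:
        raise ValueError(f"unsupported source type: {source_type!r}")
    return urllib.request.Request(ENDPOINT, data=data, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen` would send it. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so the error body has to be read from the exception.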
Example Response
Successful (Body Mode)
Status Code: 200 OK
{
  "status": "success",
  "request_id": "b7a1e1e2-9c3f-4a9b-9b1b-1d3f4a2b9e8f",
  "file": "https://downloads.apidatatools.com/apidatatools_convert_abc123.csv",
  "preview": "name,age,city\nAlice,30,New York\nBob,25,London\n"
}
Async Job Accepted
Status Code: 202 Accepted
{
  "status": "accepted",
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status_url": "https://api.apidatatools.com/jobs/c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "request_id": "b7a1e1e2-9c3f-4a9b-9b1b-1d3f4a2b9e8f"
}
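Because the same endpoint can answer with either a finished result (200) or an accepted job (202), client code usually needs to branch on the response. The sketch below shows one way to do that using only the fields in the two examples above. `interpret_response` and the shape of its return value are hypothetical.

```python
def interpret_response(status_code, body):
    """Normalize a parsed JSON response into a 'done' or 'pending' dict."""
    if status_code == 200 and body.get("status") == "success":
        # Synchronous result: the output file is ready for download.
        return {
            "done": True,
            "file": body["file"],
            "preview": body.get("preview"),
            "request_id": body.get("request_id"),
        }
    if status_code == 202 and body.get("status") == "accepted":
        # Asynchronous job: the caller should poll status_url.
        return {
            "done": False,
            "job_id": body["job_id"],
            "status_url": body["status_url"],
            "request_id": body.get("request_id"),
        }
    raise ValueError(f"unexpected response: HTTP {status_code}, status={body.get('status')!r}")
```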
Error Handling
| Error Code | HTTP Status | Description | Example |
|---|---|---|---|
| `INVALID_CSV` | 400 | CSV content is empty or cannot be parsed. | `{"status":"error","error":"INVALID_CSV","details":{"message":"CSV content is empty."}}` |
| `CSV_DUPLICATE_REMOVAL_FAILED` | 400 | Duplicate removal failed due to malformed data. | `{"status":"error","error":"CSV_DUPLICATE_REMOVAL_FAILED","details":{"message":"Failed to remove duplicates from CSV."}}` |
| `INVALID_URL` | 400 | URL is missing or invalid. | `{"status":"error","error":"INVALID_URL","details":{"message":"Missing or invalid 'url'."}}` |
| `URL_UNREACHABLE` | 400 | URL could not be reached. | `{"status":"error","error":"URL_UNREACHABLE","details":{"message":"Could not reach URL."}}` |
| `URL_NOT_OK` | 400 | URL returned a non-200 response. | `{"status":"error","error":"URL_NOT_OK","details":{"message":"URL returned HTTP 404, expected 200."}}` |
| `INVALID_FILE` | 400 | File path missing or invalid. | `{"status":"error","error":"INVALID_FILE","details":{"message":"Missing or invalid 'file'."}}` |
| `FILE_UNAVAILABLE` | 400 | File not accessible in storage. | `{"status":"error","error":"FILE_UNAVAILABLE","details":{"message":"Could not access file."}}` |
| `INVALID_FILE_EXTENSION` | 400 | File extension not supported. | `{"status":"error","error":"INVALID_FILE_EXTENSION","details":{"message":"Invalid or unsupported file extension."}}` |
| `FILE_TOO_LARGE` | 413 | File exceeds plan limit. | `{"status":"error","error":"FILE_TOO_LARGE","details":{"message":"File exceeds plan limit."}}` |
| `PAYLOAD_TOO_LARGE` | 413 | Request body exceeds allowed size. | `{"status":"error","error":"PAYLOAD_TOO_LARGE","details":{"message":"Request body exceeds plan limit."}}` |
| `INTERNAL_ERROR` | 500 | Unexpected internal error. | `{"status":"error","error":"INTERNAL_ERROR","details":{"message":"Failed to process your request"}}` |
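All documented errors share one shape: an `error` code plus a `details.message` string. A client can therefore turn any error body into a single exception type. The following is a sketch only: `ApiError` and `parse_error` are hypothetical names, and the `UNKNOWN` fallback for bodies that are not in the documented shape is an assumption.

```python
import json


class ApiError(Exception):
    """Carries the documented error code alongside the HTTP status."""

    def __init__(self, http_status, code, message):
        super().__init__(f"{code} (HTTP {http_status}): {message}")
        self.http_status = http_status
        self.code = code
        self.message = message


def parse_error(http_status, raw_body):
    """Build an ApiError from a raw response body, tolerating malformed input."""
    try:
        payload = json.loads(raw_body)
    except (TypeError, ValueError):
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    details = payload.get("details")
    if not isinstance(details, dict):
        details = {}
    # "UNKNOWN" is a placeholder of this sketch, not an API error code.
    return ApiError(http_status, payload.get("error", "UNKNOWN"), details.get("message", ""))
```

Callers can then branch on `err.code`, for example to treat `FILE_TOO_LARGE` differently from `INVALID_CSV`.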
Async Job Status
GET https://api.apidatatools.com/jobs/{job_id}
Example Status Response for Async
Status Code: 200 OK
{
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status": "success",
  "created_at": 1712345678,
  "updated_at": 1712345690,
  "result": {
    "status": "success",
    "file": "https://downloads.apidatatools.com/apidatatools_convert_abc123.csv",
    "request_id": "b7a1e1e2-9c3f-4a9b-9b1b-1d3f4a2b9e8f",
    "preview": "name,age,city\nAlice,30,New York\nBob,25,London\n"
  }
}
Queued Example
{
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status": "queued",
  "message": "Your job is being processed.",
  "retry_after": 2
}
Failed Example
{
  "job_id": "c2f8a3b4-1d2e-4a6b-9f8a-1b2c3d4e5f6a",
  "status": "failed",
  "error": {
    "code": "INVALID_CSV",
    "message": "Failed to parse CSV.",
    "details": {"message": "CSV content cannot be empty or whitespace."}
  }
}
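The three status payloads above suggest a simple polling loop: fetch the job, return on `success`, raise on `failed`, and otherwise wait for `retry_after` seconds. The sketch below takes the HTTP call as an injected function, since the documentation does not state which headers the status endpoint requires. `wait_for_job` is hypothetical, the 60-second timeout and 2-second fallback delay are arbitrary, and treating every status other than `success` or `failed` as still in progress is an assumption.

```python
import time


def wait_for_job(fetch_status, status_url, timeout=60.0, sleep=time.sleep):
    """Poll until the job succeeds or fails.

    fetch_status: caller-supplied function that GETs status_url and
    returns the parsed JSON as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(status_url)
        status = job.get("status")
        if status == "success":
            return job["result"]
        if status == "failed":
            error = job.get("error", {})
            raise RuntimeError(f"{error.get('code')}: {error.get('message')}")
        # Anything else (for example "queued") is treated as still in progress.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job did not finish within {timeout} seconds")
        # Honor the server's hint when present; 2 seconds is an arbitrary fallback.
        sleep(job.get("retry_after", 2))
```

Injecting `fetch_status` and `sleep` also keeps the loop easy to exercise in tests without network access or real delays.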
Notes for Developers
- The API supports synchronous and asynchronous processing.
- For large files or remote URLs, use `x-source-type: url` or `x-source-type: file` to trigger asynchronous processing.
- Each response includes a unique `request_id` for traceability.
- For asynchronous jobs, use the `status_url` to check job progress or retrieve results.
- Output files are temporarily hosted and automatically expire after a retention period defined by your plan.