CSV File Extractor¶
Extracts structured rows from a CSV file. The separator, enclosure, and escape characters are configurable, as is the number of rows extracted per bulk.
Configuration¶
| Name | Type | Description |
|---|---|---|
| `filename` | Attribute | Path to the CSV file to be processed |
| `bulkSize` | Attribute | Number of rows extracted at once (default: 1000) |
| `separator` | Attribute | Character separating values in the CSV (default: |
| `enclosure` | Attribute | Character wrapping values containing the separator (default: |
| `escape` | Attribute | Character escaping special characters inside enclosed values (default: |
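To illustrate how the separator, enclosure, and escape characters interact, here is a minimal Python sketch built on the standard `csv` module. It is not the extractor's implementation: `parse_rows` is a hypothetical helper, and its default values are copied from the example pipeline configuration on this page, not from the extractor's own defaults.

```python
import csv
import io


def parse_rows(text, separator=",", enclosure='"', escape="\\"):
    """Parse CSV text into lists of strings (hypothetical helper).

    The defaults are assumptions taken from the example configuration,
    not confirmed defaults of the extractor.
    """
    reader = csv.reader(
        io.StringIO(text),
        delimiter=separator,   # character between values
        quotechar=enclosure,   # character wrapping values that contain the separator
        escapechar=escape,     # character escaping special characters inside values
    )
    return list(reader)


# A semicolon-separated sample: the first value contains the separator,
# the second contains an escaped enclosure character.
sample = 'name;note\n"Camera; compact";"5\\" screen"\n'
rows = parse_rows(sample, separator=";")
```

In this sample the separator inside the enclosed first value is kept as data, and the escaped enclosure in the second value becomes a literal quote character.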
Example¶
Input data¶
| category_key | category_product_order | abstract_sku |
|---|---|---|
| digital-cameras | 16 | 001 |
| digital-cameras | 22 | 002 |
| digital-cameras | 34 | 003 |
Pipeline configuration¶
<csv-file-extractor filename="data/import/products.csv" bulkSize="1000" separator="," enclosure='"' escape="\\" />
Output data¶
| category_key | category_product_order | abstract_sku |
|---|---|---|
| digital-cameras | 16 | 001 |
| digital-cameras | 22 | 002 |
| digital-cameras | 34 | 003 |
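The mapping from header names to row values in this example can be sketched in Python with `csv.DictReader`. This is only an illustration under assumptions: the raw file content in `CSV_TEXT` is reconstructed from the input table above, `extract_rows` is a hypothetical name, and the extractor itself may represent rows differently.

```python
import csv
import io

# Assumed raw content of the example file, reconstructed from the input table.
CSV_TEXT = (
    "category_key,category_product_order,abstract_sku\n"
    "digital-cameras,16,001\n"
    "digital-cameras,22,002\n"
    "digital-cameras,34,003\n"
)


def extract_rows(text):
    """Return one dict per data row, keyed by the header row (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = extract_rows(CSV_TEXT)
```

In this sketch every value stays a string, so an SKU such as `001` keeps its leading zeros, which matches the output table.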
Error Handling¶
| Error | Description |
|---|---|
| File Not Found | Throws an exception if the file does not exist. |
| Invalid CSV Format | Fails if the file lacks a header row. |
| Incorrect Encoding | Parsing issues may arise with non-UTF-8 files. |
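The three error cases can be checked before running a pipeline. The Python sketch below is one hedged way to do that; `check_csv` and the exception types it raises are assumptions for illustration and do not reflect the exceptions the extractor actually throws.

```python
import csv
import os


def check_csv(path, separator=","):
    """Return the header row, or raise for the error cases listed above.

    Hypothetical pre-flight check. It can only detect an empty file or a
    blank first line as a missing header, and it decodes only the start
    of the file, so invalid bytes further in may go unnoticed.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"CSV file not found: {path}")
    try:
        with open(path, encoding="utf-8", newline="") as handle:
            header = next(csv.reader(handle, delimiter=separator), None)
    except UnicodeDecodeError as exc:
        raise ValueError(f"{path} is not valid UTF-8") from exc
    if not header:
        raise ValueError(f"{path} has no header row")
    return header
```

A file that has data rows but no header cannot be told apart from a valid file this way, because its first data row would simply be read as the header.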
Performance Considerations¶
| Consideration | Description |
|---|---|
| Use an appropriate `bulkSize` | Small sizes improve memory efficiency but increase processing time. |
| Ensure CSV encoding is UTF-8 | Avoids character misinterpretation. |
| Validate CSV before extraction | Ensures data integrity. |
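The `bulkSize` trade-off is easier to see in code. The following Python sketch reads rows lazily and groups them into bulks; `iter_bulks` is a hypothetical helper and only approximates the idea, not how the extractor is implemented.

```python
import csv
import io
from itertools import islice


def iter_bulks(handle, bulk_size=1000):
    """Yield lists of at most bulk_size row dicts, reading the file lazily."""
    reader = csv.DictReader(handle)
    while True:
        # Only one bulk is held in memory at a time.
        bulk = list(islice(reader, bulk_size))
        if not bulk:
            return
        yield bulk


# Ten data rows under a single "id" header, split into bulks of four.
data = "id\n" + "".join(f"{i}\n" for i in range(10))
sizes = [len(bulk) for bulk in iter_bulks(io.StringIO(data), bulk_size=4)]
```

A smaller bulk size keeps fewer rows in memory at once but produces more bulks, so any per-bulk overhead in later pipeline steps is paid more often.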
Key Features¶
| Feature | Description |
|---|---|
| Supports Custom Separators & Enclosures | Configurable for different CSV formats. |
| Handles Bulk Extraction | Optimized for large datasets. |
| Efficient Memory Usage | Processes rows in chunks. |
| Ensures Data Consistency | Maps headers correctly to data. |
| Error Handling & Logging | Detects file errors and invalid formats. |