# JSON File Extractor
Extracts structured data from a JSON file and transforms it into rows for ETL processing. The extractor supports bulk processing and integrates with pipeline configurations and console commands.
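Conceptually, the extractor reads a JSON array and hands its elements downstream in fixed-size bulks. The Python sketch below illustrates that row/bulk model only; it is not the component's real API, and the function and parameter names are made up for illustration:

```python
import json
from typing import Any, Dict, Iterator, List

def extract_bulks(filename: str, bulk_size: int = 200) -> Iterator[List[Dict[str, Any]]]:
    """Illustrative only: yield rows from a top-level JSON array in bulks."""
    with open(filename, encoding="utf-8") as handle:
        rows = json.load(handle)  # the real extractor streams instead of loading all rows
    for start in range(0, len(rows), bulk_size):
        yield rows[start:start + bulk_size]
```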
## Configuration

| Name | Type | Description |
|---|---|---|
| `filename` | Attribute | Path to the JSON file to be processed |
| `bulkSize` | Attribute | Number of rows extracted at once (default: 200) |
## Example

### Input data

Example JSON file:
```json
[
  {"id": 1, "name": "Product A", "price": 100},
  {"id": 2, "name": "Product B", "price": 200},
  {"id": 3, "name": "Product C", "price": 300},
  {"id": 4, "name": "Product D", "price": 400}
]
```
### Pipeline configuration

```xml
<json-file filename="data/import/common/common/products.json" bulkSize="200" />
```
### Output data

| id | name | price |
|---|---|---|
| 1 | Product A | 100 |
| 2 | Product B | 200 |
| 3 | Product C | 300 |
| 4 | Product D | 400 |
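With only four input rows and a `bulkSize` of 200, the extractor emits all rows in a single bulk; a smaller `bulkSize` splits the output into several bulks. Reusing the illustrative `extract_bulks` sketch from above (again, not the component's real API):

```python
# Hypothetical call; the path is the one from the pipeline configuration.
bulks = list(extract_bulks("data/import/common/common/products.json", bulk_size=2))
# Two bulks of two rows each:
# [[{"id": 1, ...}, {"id": 2, ...}], [{"id": 3, ...}, {"id": 4, ...}]]
```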
## Error Handling

| Error | Description |
|---|---|
| File Not Found | Throws an exception: “File does not exist” |
| Invalid JSON Format | Throws an exception when decoding fails |
| Bulk Processing Issue | Ensures correct batch sizes based on `bulkSize` |
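A minimal sketch of the first two failure modes, assuming exception behaviour comparable to the table above (the concrete exception classes depend on the framework):

```python
import json
import os

def load_rows(filename: str) -> list:
    # File Not Found: fail fast before attempting to parse.
    if not os.path.isfile(filename):
        raise FileNotFoundError(f"File does not exist: {filename}")
    with open(filename, encoding="utf-8") as handle:
        try:
            return json.load(handle)
        except json.JSONDecodeError as error:
            # Invalid JSON Format: decoding failures surface as exceptions.
            raise ValueError(f"Invalid JSON in {filename}: {error}") from error
```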
## Performance Considerations

| Consideration | Description |
|---|---|
| Use an appropriate `bulkSize` | Smaller bulks reduce memory usage but increase processing time. |
| Ensure JSON is correctly formatted | Avoids exceptions during processing. |
| Stream Processing | Uses streaming to avoid loading the entire file into memory. |
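The streaming behaviour can be approximated with an incremental JSON parser; the sketch below uses the third-party `ijson` package to show the technique, which may differ from the component's actual implementation:

```python
import ijson  # third-party incremental JSON parser: pip install ijson

def stream_bulks(filename: str, bulk_size: int = 200):
    """Yield bulks without loading the whole file into memory."""
    with open(filename, "rb") as handle:
        bulk = []
        for row in ijson.items(handle, "item"):  # iterate top-level array items lazily
            bulk.append(row)
            if len(bulk) == bulk_size:
                yield bulk
                bulk = []
        if bulk:  # flush the final, possibly smaller bulk
            yield bulk
```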
## Key Features

| Feature | Description |
|---|---|
| Supports Bulk Extraction | Processes data in chunks. |
| Handles JSON Validation | Prevents errors due to invalid input. |
| Optimized for Large JSON Files | Uses streaming to reduce memory usage. |
| Pipeline, Console & Facade Integration | Supports multiple input sources. |