# JSON File Extractor
Extracts structured data from a JSON file and transforms it into rows for ETL processing. The extractor supports bulk processing and integrates with pipeline configurations and console commands.
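Conceptually, the extractor reads a JSON array and hands its elements downstream in fixed-size bulks. The Python sketch below illustrates that row/bulk model only; it is not the component's real API, and the function and parameter names are made up for illustration:

```python
import json
from typing import Any, Dict, Iterator, List

def extract_bulks(filename: str, bulk_size: int = 200) -> Iterator[List[Dict[str, Any]]]:
    """Illustrative only: yield rows from a top-level JSON array in bulks."""
    with open(filename, encoding="utf-8") as handle:
        rows = json.load(handle)  # the real extractor streams instead of loading all rows
    for start in range(0, len(rows), bulk_size):
        yield rows[start:start + bulk_size]
```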
## Configuration

| Name | Type | Description |
|---|---|---|
| `filename` | Attribute | Path to the JSON file to be processed |
| `bulkSize` | Attribute | Number of rows extracted at once (default: 200) |
## Example

### Input data

Example JSON file:
```json
[
  {"id": 1, "name": "Product A", "price": 100},
  {"id": 2, "name": "Product B", "price": 200},
  {"id": 3, "name": "Product C", "price": 300},
  {"id": 4, "name": "Product D", "price": 400}
]
```
### Pipeline configuration

```xml
<json-file filename="data/import/common/common/products.json" bulkSize="200" />
```
### Output data

| id | name | price |
|---|---|---|
| 1 | Product A | 100 |
| 2 | Product B | 200 |
| 3 | Product C | 300 |
| 4 | Product D | 400 |
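With only four input rows and a `bulkSize` of 200, the extractor emits all rows in a single bulk; a smaller `bulkSize` splits the output into several bulks. Reusing the illustrative `extract_bulks` sketch from above (again, not the component's real API):

```python
# Hypothetical call; the path is the one from the pipeline configuration.
bulks = list(extract_bulks("data/import/common/common/products.json", bulk_size=2))
# Two bulks of two rows each:
# [[{"id": 1, ...}, {"id": 2, ...}], [{"id": 3, ...}, {"id": 4, ...}]]
```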
## Error Handling

| Error | Description |
|---|---|
| File Not Found | Throws an exception: “File does not exist” |
| Invalid JSON Format | Throws an exception when decoding fails |
| Bulk Processing Issue | Ensures correct batch sizes based on `bulkSize` |
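A minimal sketch of the first two failure modes, assuming exception behaviour comparable to the table above (the concrete exception classes depend on the framework):

```python
import json
import os

def load_rows(filename: str) -> list:
    # File Not Found: fail fast before attempting to parse.
    if not os.path.isfile(filename):
        raise FileNotFoundError(f"File does not exist: {filename}")
    with open(filename, encoding="utf-8") as handle:
        try:
            return json.load(handle)
        except json.JSONDecodeError as error:
            # Invalid JSON Format: decoding failures surface as exceptions.
            raise ValueError(f"Invalid JSON in {filename}: {error}") from error
```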
## Performance Considerations

| Consideration | Description |
|---|---|
| Use an appropriate `bulkSize` | Smaller bulks reduce memory usage but increase processing time. |
| Ensure JSON is correctly formatted | Avoids exceptions during processing. |
| Stream Processing | Uses streaming to avoid loading the entire file into memory. |
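The streaming behaviour can be approximated with an incremental JSON parser; the sketch below uses the third-party `ijson` package to show the technique, which may differ from the component's actual implementation:

```python
import ijson  # third-party incremental JSON parser: pip install ijson

def stream_bulks(filename: str, bulk_size: int = 200):
    """Yield bulks without loading the whole file into memory."""
    with open(filename, "rb") as handle:
        bulk = []
        for row in ijson.items(handle, "item"):  # iterate top-level array items lazily
            bulk.append(row)
            if len(bulk) == bulk_size:
                yield bulk
                bulk = []
        if bulk:  # flush the final, possibly smaller bulk
            yield bulk
```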
## Key Features

| Feature | Description |
|---|---|
| Supports Bulk Extraction | Processes data in chunks. |
| Handles JSON Validation | Prevents errors due to invalid input. |
| Optimized for Large JSON Files | Uses streaming to reduce memory usage. |
| Pipeline, Console & Facade Integration | Supports multiple input sources. |