JSON File Extractor

Extracts structured data from a JSON file and transforms it into rows for ETL processing. It supports bulk processing and integrates with pipeline configurations and console commands.

Configuration

| Name     | Type      | Description                                     |
|----------|-----------|-------------------------------------------------|
| filename | Attribute | Path to the JSON file to be processed           |
| bulkSize | Attribute | Number of rows extracted at once (default: 200) |

Example

Input data

Example JSON file:

[
    {"id": 1, "name": "Product A", "price": 100},
    {"id": 2, "name": "Product B", "price": 200},
    {"id": 3, "name": "Product C", "price": 300},
    {"id": 4, "name": "Product D", "price": 400}
]

Pipeline configuration

<json-file filename="data/import/common/common/products.json" bulkSize="200" />

Output data

| id | name      | price |
|----|-----------|-------|
| 1  | Product A | 100   |
| 2  | Product B | 200   |
| 3  | Product C | 300   |
| 4  | Product D | 400   |
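The bulk behavior above can be sketched in a few lines of Python (the extractor itself is PHP-based; the function name `extract_bulks` is illustrative, not part of the actual API):

```python
import json
from typing import Any, Iterator, List


def extract_bulks(filename: str, bulk_size: int = 200) -> Iterator[List[Any]]:
    """Read a top-level JSON array from `filename` and yield it in
    chunks of at most `bulk_size` rows, mimicking the bulkSize attribute."""
    with open(filename) as f:
        rows = json.load(f)
    for start in range(0, len(rows), bulk_size):
        yield rows[start:start + bulk_size]
```

A final partial bulk is yielded when the row count is not an exact multiple of `bulk_size`.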

Error Handling

| Error                 | Description                                  |
|-----------------------|----------------------------------------------|
| File Not Found        | Throws an exception: “File does not exist”   |
| Invalid JSON Format   | Throws an exception when decoding fails      |
| Bulk Processing Issue | Ensures correct batch sizes based on bulkSize |
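The first two checks can be sketched in Python as follows (exception types and messages here are illustrative; the real extractor raises its own exception classes):

```python
import json
import os
from typing import Any, List


def load_json_file(filename: str) -> List[Any]:
    """Validate the file before decoding, mirroring the error table above."""
    if not os.path.exists(filename):
        # File Not Found: fail fast with a clear message
        raise FileNotFoundError(f'File does not exist: "{filename}"')
    with open(filename) as f:
        try:
            # Invalid JSON Format: json.JSONDecodeError is raised on bad input
            return json.load(f)
        except json.JSONDecodeError as err:
            raise ValueError(f"Invalid JSON in {filename}: {err}") from err
```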

Performance Considerations

| Consideration                      | Description                                                     |
|------------------------------------|-----------------------------------------------------------------|
| Use an appropriate `bulkSize`      | Small sizes improve memory efficiency but increase processing time. |
| Ensure JSON is correctly formatted | Avoids exceptions during processing.                            |
| Stream processing                  | Uses JsonMachine to process large JSON files efficiently.       |
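To illustrate the streaming idea, here is a minimal Python sketch that decodes items of a top-level JSON array one at a time instead of materializing the whole document as a single structure. It works in the same spirit as JsonMachine's streaming parser, though this toy version still keeps the raw text in memory:

```python
import json
from typing import Any, Iterator


def stream_array(text: str) -> Iterator[Any]:
    """Incrementally decode the items of a top-level JSON array,
    yielding one decoded item at a time."""
    decoder = json.JSONDecoder()
    pos = text.index("[") + 1          # step inside the top-level array
    while True:
        # skip whitespace and element separators between items
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text) or text[pos] == "]":
            return                     # reached the end of the array
        item, pos = decoder.raw_decode(text, pos)
        yield item
```

Because each item is decoded independently, a caller can stop early or batch items as they arrive, keeping peak memory proportional to one item rather than the whole array.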

Key Features

| Feature                                | Description                          |
|----------------------------------------|--------------------------------------|
| Supports Bulk Extraction               | Processes data in chunks.            |
| Handles JSON Validation                | Prevents errors due to invalid input. |
| Optimized for Large JSON Files         | Uses streaming to reduce memory usage. |
| Pipeline, Console & Facade Integration | Supports multiple input sources.     |