About

The bulk writer library provides three different strategies for inserting or updating data in bulk into a database: Single, Multiple, and Auto. The library supports both PostgreSQL and MySQL.

Single Strategy

The Single strategy executes a single INSERT statement with conflict handling (ON DUPLICATE KEY UPDATE in MySQL or ON CONFLICT DO UPDATE in PostgreSQL). This strategy is highly efficient because it reduces the number of queries executed, making it suitable for large datasets.

When is the Single Strategy Used?

  • When the target database supports handling multiple unique constraints in conflict resolution.

  • When performance is a priority, and a single bulk operation is preferable.

  • When allowing auto-increment values to increase on updates is acceptable.

Automatic Fallback to Multiple Strategy

If the database adapter does not support multiple unique constraints in conflict resolution and the table has multiple unique constraints, the library automatically falls back to the Multiple strategy to ensure correctness.

Using Single strategy without increasing auto-increment values

When using MySQL or MariaDB databases, InnoDB can be configured to to prevent the auto-increment value from increasing on failed insert attempts. To do so, set the innodb_autoinc_lock_mode to 0 in the database configuration file: docker/context/mysql/my.cnf. This should only be used if you are certain that the changed configuration won’t cause any issues with other parts of the application.

Multiple Strategy

The Multiple strategy separates the import process into two distinct operations: updates and inserts. This prevents unintended auto-incrementing of primary keys when updates occur and ensures better control over how existing and new data are handled.

When is the Multiple Strategy Used?

  • When the database does not support multiple unique constraints in conflict resolution.

  • When updating existing records before inserting new ones is necessary.

  • When auto-increment values must remain unchanged for updates.

How it Works

  • The library first selects all unique columns and values from the existing data.

  • The data is compared with the incoming dataset to determine which rows need to be updated versus inserted.

  • Updates are executed first to modify existing records.

  • Inserts are performed only for new records.

This ensures that updates do not trigger unintended auto-increments, maintaining data integrity.

Auto Strategy

The Auto strategy dynamically selects the appropriate strategy at runtime based on the BulkWriterQueryTransfer Object.

How Auto Strategy Works

  • If Auto is configured, the strategy is determined based on the value in the Transfer Object.

  • If the Transfer Object’s strategy is set to null, the Single strategy is used by default.

  • This allows any loader to be flexible and set the strategy in the transfer object to decide which one to use.

Benefits of Auto Strategy

  • Flexibility: Different tables or data structures can use different strategies dynamically.

  • Customization: Developers can define logic to choose the best approach for a specific data import.

Conclusion

Choosing the right strategy depends on the database capabilities and the data structure:

  • Use Single when performance and efficiency are a priority.

  • Use Multiple when auto-incrementing issues arise due to unique constraints.

  • Use Auto when strategy selection needs to be determined dynamically via a Transfer Object.