Optimizing repository.saveAll() JPA Hibernate
Problem. Invoking repository.saveAll(
) has to insert into DB by batches, but it doesn't.
Result of the investigation.
Current realisation inserts new records one by one. Batching doesn't work because of IDENTITY
id generator.
Hibernate batch processing works only with SEQUENCE
id generator. And with properties:spring.jpa.properties.hibernate.jdbc.batch_size=30
spring.jpa.properties.hibernate.order_inserts=true
Links: https://docs.jboss.org/hibernate/orm/4.3/manual/en-US/html/ch15.html
Solution:
Move to SEQUENCE
id generator.
Create the my_table_id_seq
for table my_table
. Or create new in case there is not exist.
Then restart it from the latest existing primary key.
SELECT SETVAL('my_table_id_seq', (SELECT MAX(id) + 1 FROM my_table));
Test 1: saving of 50_000 records took 20 seconds.
Set up with identity generator:
1) MyEntity
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(name = "id")
private Long id;
2) properties.yamlspring.jpa.properties.hibernate.jdbc.batch_size=30
spring.jpa.properties.hibernate.order_inserts=true
In the log you can see that batchSize = 0. There is 50_000 logs entries. Batching doesn't work.
{"name":"Batch-Insert-Logger", "connection":5, "time":0, "success":true, "type":"Prepared", "batch":false, "querySize":1, "batchSize":0, "query":["insert into my_table (created,id,full_name,updated) values (?,?,?,?,)"], "params":[["2025-03-06 12:07:07.010665","698606","John Doe","2025-03-06 12:07:07.010666"]]}
Test 2: saving of 50_000 records took 6 seconds.
Set up with sequence generator:
1) MyEntity
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "mySeqGen")
@SequenceGenerator(name = "mySeqGen", sequenceName = "my_table_id_seq", allocationSize = 1)
@Column(name = "id")
private Long id;
2) properties.yamlspring.jpa.properties.hibernate.jdbc.batch_size=30
spring.jpa.properties.hibernate.order_inserts=true
In the log you can see that batchSize = 30. There is 117 logs entries with batch size = 30 and 1 log entry with batch size = 20. Batching works.
{"name":"Batch-Insert-Logger", "connection":5, "time":0, "success":true, "type":"Prepared", "batch":true, "querySize":1, "batchSize":30, "query":["insert into my_table (created,id,full_name,updated) values (?,?,?,?)"], "params": []}