
Chunk handling

Most of the time, a step is a read-process-write task in which the data is processed in subsets of a fixed size. This is called chunk handling. Each chunk acts as a checkpoint during processing. When a step fails, the current chunk is rolled back, while the work done on all previous chunks remains committed. On restart (if restart is enabled, which requires a database-persisted repository), the job resumes at the chunk where the failure happened.

In the following example, the step processes the records supplied by the reader in chunks of 1000, as set by the item-count attribute.

Example 3.2. XML Job Configuration

	...
	<step id="TitleUpdateStep">
			<chunk item-count="1000">
					<reader ref="TitleUpdateStep/ReadTitles" />
					<processor ref="TitleUpdateStep/Processor" />
					<writer ref="TitleUpdateStep/UpdateTitles" />
			</chunk>
	</step>
	...
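
To make the checkpoint behaviour concrete, the chunk loop can be sketched in a few lines of Python. This is only an illustration of the idea, not Summer Batch code: the names run_step, write, fail_on and target are hypothetical, the "repository" is reduced to a returned position, and the writer is assumed to be all-or-nothing for a chunk.

```python
from typing import Callable, List, Tuple


def run_step(
    records: List[int],
    item_count: int,
    write: Callable[[List[int]], None],
    start: int = 0,
) -> Tuple[int, bool]:
    """Process records chunk by chunk.

    Returns (checkpoint, completed), where checkpoint is the index of the
    first record that has not been committed yet.
    """
    position = start
    while position < len(records):
        chunk = records[position:position + item_count]
        try:
            write(chunk)  # assumed atomic: the whole chunk or nothing
        except Exception:
            # The failed chunk is discarded; earlier chunks stay committed.
            return position, False
        position += len(chunk)  # the chunk boundary is the new checkpoint
    return position, True


# Hypothetical target and a writer that fails while record 5 is "bad".
target: List[int] = []
fail_on = {5}


def write(chunk: List[int]) -> None:
    if fail_on & set(chunk):
        raise RuntimeError("simulated failure")
    target.extend(chunk)


records = list(range(10))

# First run: the second chunk [3, 4, 5] fails, so only [0, 1, 2] is kept.
first_checkpoint, first_done = run_step(records, 3, write)
target_after_failure = list(target)

# Restart from the saved checkpoint once the problem is fixed.
fail_on.clear()
checkpoint, done = run_step(records, 3, write, start=first_checkpoint)
```

The restart does not reprocess the first chunk: it begins at the checkpoint returned by the failed run, which is the role the database-persisted repository plays in a real job.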
			

Item Reader

The Item Reader is the step phase that retrieves data from a given source (database, file, etc.). It supplies items one at a time until the source is exhausted, at which point it returns null to signal that reading is complete.
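
The reader contract described above can be illustrated with a small Python sketch, where None stands in for null. The ListReader class and its read method are hypothetical names chosen for this example and do not reflect the actual Summer Batch interfaces.

```python
from typing import Iterable, List, Optional


class ListReader:
    """Hypothetical reader that serves items from an in-memory source."""

    def __init__(self, source: Iterable[str]) -> None:
        self._iterator = iter(source)

    def read(self) -> Optional[str]:
        # Return the next item, or None once the source is exhausted.
        # A None value inside the source would therefore end reading early.
        return next(self._iterator, None)


reader = ListReader(["a", "b"])
items: List[str] = []
while True:
    item = reader.read()
    if item is None:  # end-of-data signal
        break
    items.append(item)
```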

Item Processor

The Item Processor is the step phase that processes the data retrieved by the reader. It can be used for any kind of manipulation: filtering based on business logic, field updates, or complete transformation into a different kind of element. It returns the result of the processing, which may be the initial element as is, the initial element with updates, or a completely different element. If it returns null, the read element is ignored and is not passed to the writer, which filters the data read from the source.

If no processor is supplied, the read data is passed to the writer as is.
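
The three processor outcomes (transform, filter, pass through) can be sketched as follows in Python, with None standing in for null. The function names process_items and upper_non_empty are hypothetical and only illustrate the semantics; they are not part of Summer Batch.

```python
from typing import Callable, List, Optional


def process_items(
    items: List[str],
    processor: Optional[Callable[[str], Optional[str]]] = None,
) -> List[str]:
    """Apply an optional processor and drop items for which it returns None."""
    results: List[str] = []
    for item in items:
        # Without a processor, the item is handed on unchanged.
        result = item if processor is None else processor(item)
        if result is not None:  # None means "ignore this item"
            results.append(result)
    return results


def upper_non_empty(title: str) -> Optional[str]:
    """Hypothetical business rule: skip blank titles, upper-case the rest."""
    title = title.strip()
    return title.upper() if title else None


titles = ["dune", "  ", "emma"]
processed = process_items(titles, upper_non_empty)
unchanged = process_items(titles)
```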

Item Writer

The Item Writer is the final step phase, which writes items to a target (database, file, etc.). It processes the elements given by the processor chunk by chunk, which enables the rollback mechanism explained above.
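
As a rough illustration of chunk-by-chunk writing, the Python sketch below buffers items and calls the writer once per full chunk, plus once for the final, possibly smaller, chunk. The names write_in_chunks and calls are hypothetical, and the sketch leaves out the transaction handling that a real writer would rely on.

```python
from typing import Callable, Iterable, List


def write_in_chunks(
    items: Iterable[int],
    item_count: int,
    write: Callable[[List[int]], None],
) -> None:
    """Group items into chunks of item_count and write each chunk in one call."""
    chunk: List[int] = []
    for item in items:
        chunk.append(item)
        if len(chunk) == item_count:
            write(chunk)
            chunk = []  # start a fresh chunk after each write
    if chunk:
        write(chunk)  # remaining items form a last, shorter chunk


# Record every call the writer receives to show the chunk boundaries.
calls: List[List[int]] = []
write_in_chunks(range(5), 2, lambda chunk: calls.append(list(chunk)))
```

Because the writer sees a whole chunk per call, a failure can be confined to that one call, which is what makes rolling back a single chunk possible.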
