Migrate documentation to Antora

Issue #4422
This commit is contained in:
Rob Winch
2023-07-20 17:07:16 -05:00
committed by Mahmoud Ben Hassine
parent e36a44788d
commit 2e8d5063f7
149 changed files with 9472 additions and 9114 deletions

30
.github/workflows/deploy-docs.yml vendored Normal file
View File

@@ -0,0 +1,30 @@
name: Deploy Docs
on:
  push:
    branches-ignore: [ gh-pages ]
    tags: '**'
  repository_dispatch:
    types: request-build-reference # legacy
  workflow_dispatch:
permissions:
  actions: write
jobs:
  build:
    runs-on: ubuntu-latest
    if: github.repository_owner == 'spring-projects'
    steps:
    - name: Checkout
      uses: actions/checkout@v3
      with:
        ref: docs-build
        fetch-depth: 1
    - name: Dispatch (partial build)
      if: github.ref_type == 'branch'
      env:
        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: gh workflow run deploy-docs.yml -r $(git rev-parse --abbrev-ref HEAD) -f build-refname=${{ github.ref_name }}
    - name: Dispatch (full build)
      if: github.ref_type == 'tag'
      env:
        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: gh workflow run deploy-docs.yml -r $(git rev-parse --abbrev-ref HEAD)

5
.gitignore vendored
View File

@@ -25,3 +25,8 @@ out
/.gradletasknamecache
**/*.flattened-pom.xml
node
node_modules
package-lock.json
package.json

View File

@@ -60,12 +60,11 @@ Please note that some integration tests are based on Docker, so please make sure
To generate the reference documentation, run the following commands:
```
$ ./mvnw javadoc:aggregate
$ cd spring-batch-docs
$ ../mvnw site
$ ../mvnw antora:antora
```
The reference documentation can be found in `spring-batch-docs/target`.
The reference documentation can be found in `spring-batch-docs/target/antora/site`.
## Using Docker

View File

@@ -136,9 +136,7 @@
<groovy.version>3.0.19</groovy.version>
<!-- documentation dependencies -->
<asciidoctorj-pdf.version>1.6.2</asciidoctorj-pdf.version> <!-- FIXME build failure with version 2.3.9 -->
<asciidoctorj-epub.version>1.5.1</asciidoctorj-epub.version>
<spring-asciidoctor-backends.version>0.0.6</spring-asciidoctor-backends.version>
<io.spring.maven.antora-version>0.0.3</io.spring.maven.antora-version>
<!-- plugin versions -->
<maven-compiler-plugin.version>3.11.0</maven-compiler-plugin.version>

View File

@@ -0,0 +1,40 @@
# PACKAGES antora@3.2.0-alpha.2 @antora/atlas-extension@1.0.0-alpha.1 @antora/collector-extension@1.0.0-alpha.3 @springio/antora-extensions@1.1.0-alpha.2 @asciidoctor/tabs@1.0.0-alpha.12 @opendevise/antora-release-line-extension@1.0.0-alpha.2
#
# The purpose of this Antora playbook is to build the docs in the current branch.
antora:
  extensions:
    - '@springio/antora-extensions/partial-build-extension'
    - require: '@springio/antora-extensions/inject-collector-cache-config-extension'
    - '@antora/collector-extension'
    - '@antora/atlas-extension'
    - require: '@springio/antora-extensions/root-component-extension'
      root_component_name: 'batch'
site:
  title: Spring Batch Reference
  url: https://docs.spring.io/spring-batch/reference
content:
  sources:
    - url: ..
      branches: HEAD
      start_path: spring-batch-docs
      worktrees: true
asciidoc:
  attributes:
    page-pagination: ''
    hide-uri-scheme: '@'
    tabs-sync-option: '@'
    chomp: 'all'
  extensions:
    - '@asciidoctor/tabs'
    - '@springio/asciidoctor-extensions'
  sourcemap: true
urls:
  latest_version_segment: ''
runtime:
  log:
    failure_level: warn
    format: pretty
ui:
  bundle:
    url: https://github.com/spring-io/antora-ui-spring/releases/download/v0.3.3/ui-bundle.zip
    snapshot: true

View File

@@ -0,0 +1,11 @@
name: batch
version: true
title: Spring Batch Documentation
nav:
  - modules/ROOT/nav.adoc
ext:
  collector:
    run:
      command: mvn process-resources -pl spring-batch-docs -am
    scan:
      dir: ./target/classes/antora-resources

View File

@@ -0,0 +1,62 @@
* xref:index.adoc[]
* xref:spring-batch-intro.adoc[]
* xref:spring-batch-architecture.adoc[]
* xref:whatsnew.adoc[]
* xref:domain.adoc[]
* xref:job.adoc[]
** xref:job/configuring.adoc[]
** xref:job/java-config.adoc[]
** xref:job/configuring-repository.adoc[]
** xref:job/configuring-launcher.adoc[]
** xref:job/running.adoc[]
** xref:job/advanced-meta-data.adoc[]
* xref:step.adoc[]
** xref:step/chunk-oriented-processing.adoc[]
*** xref:step/chunk-oriented-processing/configuring.adoc[]
*** xref:step/chunk-oriented-processing/inheriting-from-parent.adoc[]
*** xref:step/chunk-oriented-processing/commit-interval.adoc[]
*** xref:step/chunk-oriented-processing/restart.adoc[]
*** xref:step/chunk-oriented-processing/configuring-skip.adoc[]
*** xref:step/chunk-oriented-processing/retry-logic.adoc[]
*** xref:step/chunk-oriented-processing/controlling-rollback.adoc[]
*** xref:step/chunk-oriented-processing/transaction-attributes.adoc[]
*** xref:step/chunk-oriented-processing/registering-item-streams.adoc[]
*** xref:step/chunk-oriented-processing/intercepting-execution.adoc[]
** xref:step/tasklet.adoc[]
** xref:step/controlling-flow.adoc[]
** xref:step/late-binding.adoc[]
* xref:readersAndWriters.adoc[]
** xref:readers-and-writers/item-reader.adoc[]
** xref:readers-and-writers/item-writer.adoc[]
** xref:readers-and-writers/item-stream.adoc[]
** xref:readers-and-writers/delegate-pattern-registering.adoc[]
** xref:readers-and-writers/flat-files.adoc[]
*** xref:readers-and-writers/flat-files/field-set.adoc[]
*** xref:readers-and-writers/flat-files/file-item-reader.adoc[]
*** xref:readers-and-writers/flat-files/file-item-writer.adoc[]
** xref:readers-and-writers/xml-reading-writing.adoc[]
** xref:readers-and-writers/json-reading-writing.adoc[]
** xref:readers-and-writers/multi-file-input.adoc[]
** xref:readers-and-writers/database.adoc[]
** xref:readers-and-writers/reusing-existing-services.adoc[]
** xref:readers-and-writers/process-indicator.adoc[]
** xref:readers-and-writers/custom.adoc[]
** xref:readers-and-writers/item-reader-writer-implementations.adoc[]
* xref:processor.adoc[]
* xref:scalability.adoc[]
* xref:repeat.adoc[]
* xref:retry.adoc[]
* xref:testing.adoc[]
* xref:common-patterns.adoc[]
* xref:spring-batch-integration.adoc[]
** xref:spring-batch-integration/namespace-support.adoc[]
** xref:spring-batch-integration/launching-jobs-through-messages.adoc[]
** xref:spring-batch-integration/available-attributes-of-the-job-launching-gateway.adoc[]
** xref:spring-batch-integration/sub-elements.adoc[]
* xref:monitoring-and-metrics.adoc[]
* xref:tracing.adoc[]
* Appendices
** xref:appendix.adoc[]
** xref:schema-appendix.adoc[]
** xref:transaction-appendix.adoc[]
** xref:glossary.adoc[]

View File

@@ -1,14 +1,12 @@
:toc: left
:toclevels: 4
[[listOfReadersAndWriters]]
include::attributes.adoc[]
[appendix]
== List of ItemReaders and ItemWriters
[[list-of-itemreaders-and-itemwriters]]
= List of ItemReaders and ItemWriters
[[itemReadersAppendix]]
=== Item Readers
== Item Readers
.Available Item Readers
[options="header"]
@@ -77,7 +75,7 @@ This reader stores message offsets in the execution context to support restart c
[[itemWritersAppendix]]
=== Item Writers
== Item Writers
.Available Item Writers
[options="header"]

View File

@@ -1,15 +1,8 @@
:toc: left
:toclevels: 4
[[commonPatterns]]
include::attributes.adoc[]
== Common Batch Patterns
ifndef::onlyonetoggle[]
include::toggle.adoc[]
endif::onlyonetoggle[]
[[common-batch-patterns]]
= Common Batch Patterns
Some batch jobs can be assembled purely from off-the-shelf components in Spring Batch.
For instance, the `ItemReader` and `ItemWriter` implementations can be configured to
@@ -25,7 +18,7 @@ These examples primarily feature the listener interfaces. It should be noted tha
`ItemReader` or `ItemWriter` can implement a listener interface as well, if appropriate.
[[loggingItemProcessingAndFailures]]
=== Logging Item Processing and Failures
== Logging Item Processing and Failures
A common use case is the need for special handling of errors in a step, item by item,
perhaps logging to a special channel or inserting a record into a database. A
@@ -52,11 +45,31 @@ public class ItemFailureLoggerListener extends ItemListenerSupport {
Having implemented this listener, it must be registered with a step.
[role="xmlContent"]
The following example shows how to register a listener with a step in XML:
[tabs]
====
Java::
+
The following example shows how to register a listener with a step in Java:
+
.Java Configuration
[source, java]
----
@Bean
public Step simpleStep(JobRepository jobRepository) {
    return new StepBuilder("simpleStep", jobRepository)
            ...
            .listener(new ItemFailureLoggerListener())
            .build();
}
----
XML::
+
The following example shows how to register a listener with a step in XML:
+
.XML Configuration
[source, xml, role="xmlContent"]
[source, xml]
----
<step id="simpleStep">
...
@@ -68,20 +81,8 @@ The following example shows how to register a listener with a step in XML:
</step>
----
[role="javaContent"]
The following example shows how to register a listener with a step Java:
====
.Java Configuration
[source, java, role="javaContent"]
----
@Bean
public Step simpleStep(JobRepository jobRepository) {
    return new StepBuilder("simpleStep", jobRepository)
            ...
            .listener(new ItemFailureLoggerListener())
            .build();
}
----
IMPORTANT: If your listener does anything in an `onError()` method, it must be inside
a transaction that is going to be rolled back. If you need to use a transactional
@@ -90,7 +91,7 @@ transaction to that method (see Spring Core Reference Guide for details), and gi
propagation attribute a value of `REQUIRES_NEW`.
[[stoppingAJobManuallyForBusinessReasons]]
=== Stopping a Job Manually for Business Reasons
== Stopping a Job Manually for Business Reasons
Spring Batch provides a `stop()` method through the `JobOperator` interface, but this is
really for use by the operator rather than the application programmer. Sometimes, it is
@@ -141,11 +142,32 @@ of the `CompletionPolicy` strategy that signals a complete batch when the item t
processed is `null`. A more sophisticated completion policy could be implemented and
injected into the `Step` through the `SimpleStepFactoryBean`.
[role="xmlContent"]
The following example shows how to inject a completion policy into a step in XML:
[tabs]
====
Java::
+
The following example shows how to inject a completion policy into a step in Java:
+
.Java Configuration
[source, java]
----
@Bean
public Step simpleStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
    return new StepBuilder("simpleStep", jobRepository)
            .<String, String>chunk(new SpecialCompletionPolicy(), transactionManager)
            .reader(reader())
            .writer(writer())
            .build();
}
----
XML::
+
The following example shows how to inject a completion policy into a step in XML:
+
.XML Configuration
[source, xml, role="xmlContent"]
[source, xml]
----
<step id="simpleStep">
<tasklet>
@@ -157,21 +179,8 @@ The following example shows how to inject a completion policy into a step in XML
<bean id="completionPolicy" class="org.example...SpecialCompletionPolicy"/>
----
[role="javaContent"]
The following example shows how to inject a completion policy into a step in Java:
====
.Java Configuration
[source, java, role="javaContent"]
----
@Bean
public Step simpleStep(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
    return new StepBuilder("simpleStep", jobRepository)
            .<String, String>chunk(new SpecialCompletionPolicy(), transactionManager)
            .reader(reader())
            .writer(writer())
            .build();
}
----
An alternative is to set a flag in the `StepExecution`, which is checked by the `Step`
implementations in the framework in between item processing. To implement this
@@ -204,7 +213,7 @@ When the flag is set, the default behavior is for the step to throw a
so this is always an abnormal ending to a job.
[[addingAFooterRecord]]
=== Adding a Footer Record
== Adding a Footer Record
Often, when writing to flat files, a "`footer`" record must be appended to the end of the
file, after all processing has been completed. This can be achieved using the
@@ -212,27 +221,16 @@ file, after all processing has be completed. This can be achieved using the
(and its counterpart, the `FlatFileHeaderCallback`) are optional properties of the
`FlatFileItemWriter` and can be added to an item writer.
[role="xmlContent"]
The following example shows how to use the `FlatFileHeaderCallback` and the
`FlatFileFooterCallback` in XML:
.XML Configuration
[source, xml, role="xmlContent"]
----
<bean id="itemWriter" class="org.spr...FlatFileItemWriter">
    <property name="resource" ref="outputResource" />
    <property name="lineAggregator" ref="lineAggregator"/>
    <property name="headerCallback" ref="headerCallback" />
    <property name="footerCallback" ref="footerCallback" />
</bean>
----
[role="javaContent"]
[tabs]
====
Java::
+
The following example shows how to use the `FlatFileHeaderCallback` and the
`FlatFileFooterCallback` in Java:
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public FlatFileItemWriter<String> itemWriter(Resource outputResource) {
@@ -246,6 +244,26 @@ public FlatFileItemWriter<String> itemWriter(Resource outputResource) {
}
----
XML::
+
The following example shows how to use the `FlatFileHeaderCallback` and the
`FlatFileFooterCallback` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="itemWriter" class="org.spr...FlatFileItemWriter">
    <property name="resource" ref="outputResource" />
    <property name="lineAggregator" ref="lineAggregator"/>
    <property name="headerCallback" ref="headerCallback" />
    <property name="footerCallback" ref="footerCallback" />
</bean>
----
====
The footer callback interface has just one method that is called when the footer must be
written, as shown in the following interface definition:
@@ -259,7 +277,7 @@ public interface FlatFileFooterCallback {
----
[[writingASummaryFooter]]
==== Writing a Summary Footer
=== Writing a Summary Footer
A common requirement involving footer records is to aggregate information during the
output process and to append this information to the end of the file. This footer often
@@ -311,28 +329,15 @@ In order for the `writeFooter` method to be called, the `TradeItemWriter` (which
implements `FlatFileFooterCallback`) must be wired into the `FlatFileItemWriter` as the
`footerCallback`.
[role="xmlContent"]
The following example shows how to wire the `TradeItemWriter` in XML:
.XML Configuration
[source, xml, role="xmlContent"]
----
<bean id="tradeItemWriter" class="..TradeItemWriter">
    <property name="delegate" ref="flatFileItemWriter" />
</bean>
<bean id="flatFileItemWriter" class="org.spr...FlatFileItemWriter">
    <property name="resource" ref="outputResource" />
    <property name="lineAggregator" ref="lineAggregator"/>
    <property name="footerCallback" ref="tradeItemWriter" />
</bean>
----
[role="javaContent"]
[tabs]
====
Java::
+
The following example shows how to wire the `TradeItemWriter` in Java:
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public TradeItemWriter tradeItemWriter() {
@@ -354,6 +359,29 @@ public FlatFileItemWriter<String> flatFileItemWriter(Resource outputResource) {
}
----
XML::
+
The following example shows how to wire the `TradeItemWriter` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="tradeItemWriter" class="..TradeItemWriter">
    <property name="delegate" ref="flatFileItemWriter" />
</bean>
<bean id="flatFileItemWriter" class="org.spr...FlatFileItemWriter">
    <property name="resource" ref="outputResource" />
    <property name="lineAggregator" ref="lineAggregator"/>
    <property name="footerCallback" ref="tradeItemWriter" />
</bean>
----
====
The way that the `TradeItemWriter` has been written so far functions correctly only if
the `Step` is not restartable. This is because the class is stateful (since it stores the
`totalAmount`), but the `totalAmount` is not persisted to the database. Therefore, it
@@ -381,7 +409,7 @@ starting point for processing, allowing the `TradeItemWriter` to pick up on rest
it left off the previous time the `Step` was run.
[[drivingQueryBasedItemReaders]]
=== Driving Query Based ItemReaders
== Driving Query Based ItemReaders
In the link:readersAndWriters.html[chapter on readers and writers], database input using
paging was discussed. Many database vendors, such as DB2, have extremely pessimistic
@@ -393,7 +421,7 @@ by iterating over keys, rather than the entire object that needs to be returned,
following image illustrates:
.Driving Query Job
image::{batch-asciidoc}images/drivingQueryExample.png[Driving Query Job, scaledwidth="60%"]
image::drivingQueryExample.png[Driving Query Job, scaledwidth="60%"]
As you can see, the example shown in the preceding image uses the same 'FOO' table as was
used in the cursor-based example. However, rather than selecting the entire row, only the
@@ -402,14 +430,14 @@ from `read`, an `Integer` is returned. This number can then be used to query for
'details', which is a complete `Foo` object, as shown in the following image:
.Driving Query Example
image::{batch-asciidoc}images/drivingQueryJob.png[Driving Query Example, scaledwidth="60%"]
image::drivingQueryJob.png[Driving Query Example, scaledwidth="60%"]
An `ItemProcessor` should be used to transform the key obtained from the driving query
into a full `Foo` object. An existing DAO can be used to query for the full object based
on the key.
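
The key-then-lookup flow described above can be sketched in plain Java, independent of Spring Batch. This is only an illustration under stated assumptions: `DrivingQuerySketch`, `Foo`, `FooDao`, `keyToFoo`, and `process` are hypothetical names, not Spring Batch API. The `Function` stands in for the `ItemProcessor`, and the list of keys stands in for the result of the driving query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the driving query pattern; none of these types are Spring Batch API.
class DrivingQuerySketch {

    // Stand-in for the full domain object stored in the 'FOO' table.
    static final class Foo {
        final int id;
        final String name;

        Foo(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public String toString() {
            return "Foo{id=" + id + ", name=" + name + "}";
        }
    }

    // Stand-in for an existing DAO that loads a full Foo by its key.
    interface FooDao {
        Foo findById(int id);
    }

    // Plays the role of the ItemProcessor: turns a key into the full object.
    static Function<Integer, Foo> keyToFoo(FooDao dao) {
        return key -> dao.findById(key);
    }

    // Plays the role of the step loop: the "reader" yields keys only,
    // and each key is resolved to a full Foo one at a time.
    static List<Foo> process(List<Integer> keys, Function<Integer, Foo> processor) {
        List<Foo> result = new ArrayList<>();
        for (Integer key : keys) {
            result.add(processor.apply(key));
        }
        return result;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the database table.
        Map<Integer, Foo> table = Map.of(1, new Foo(1, "first"), 2, new Foo(2, "second"));
        FooDao dao = id -> table.get(id);
        List<Integer> keys = List.of(1, 2); // what the driving query would return
        System.out.println(process(keys, keyToFoo(dao)));
    }
}
```

Resolving the full object only in the processing stage means the reader holds nothing but keys, which is the point of the pattern when a long-lived cursor over full rows is undesirable.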
[[multiLineRecords]]
=== Multi-Line Records
== Multi-Line Records
While it is usually the case with flat files that each record is confined to a single
line, it is common that a file might have records spanning multiple lines with multiple
@@ -434,32 +462,15 @@ there are, the `ItemReader` must be careful to always read an entire record. In
do this, a custom `ItemReader` should be implemented as a wrapper for the
`FlatFileItemReader`.
[role="xmlContent"]
The following example shows how to implement a custom `ItemReader` in XML:
.XML Configuration
[source, xml, role="xmlContent"]
----
<bean id="itemReader" class="org.spr...MultiLineTradeItemReader">
    <property name="delegate">
        <bean class="org.springframework.batch.item.file.FlatFileItemReader">
            <property name="resource" value="data/iosample/input/multiLine.txt" />
            <property name="lineMapper">
                <bean class="org.spr...DefaultLineMapper">
                    <property name="lineTokenizer" ref="orderFileTokenizer"/>
                    <property name="fieldSetMapper" ref="orderFieldSetMapper"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
----
[role="javaContent"]
[tabs]
====
Java::
+
The following example shows how to implement a custom `ItemReader` in Java:
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public MultiLineTradeItemReader itemReader() {
@@ -482,6 +493,33 @@ public FlatFileItemReader flatFileItemReader() {
}
----
XML::
+
The following example shows how to implement a custom `ItemReader` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="itemReader" class="org.spr...MultiLineTradeItemReader">
    <property name="delegate">
        <bean class="org.springframework.batch.item.file.FlatFileItemReader">
            <property name="resource" value="data/iosample/input/multiLine.txt" />
            <property name="lineMapper">
                <bean class="org.spr...DefaultLineMapper">
                    <property name="lineTokenizer" ref="orderFileTokenizer"/>
                    <property name="fieldSetMapper" ref="orderFieldSetMapper"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
----
====
To ensure that each line is tokenized properly, which is especially important for
fixed-length input, the `PatternMatchingCompositeLineTokenizer` can be used on the
delegate `FlatFileItemReader`. See
@@ -490,29 +528,15 @@ Writers chapter] for more details. The delegate reader then uses a
`PassThroughFieldSetMapper` to deliver a `FieldSet` for each line back to the wrapping
`ItemReader`.
[role="xmlContent"]
The following example shows how to ensure that each line is properly tokenized in XML:
.XML Content
[source, xml, role="xmlContent"]
----
<bean id="orderFileTokenizer" class="org.spr...PatternMatchingCompositeLineTokenizer">
    <property name="tokenizers">
        <map>
            <entry key="HEA*" value-ref="headerRecordTokenizer" />
            <entry key="FOT*" value-ref="footerRecordTokenizer" />
            <entry key="NCU*" value-ref="customerLineTokenizer" />
            <entry key="BAD*" value-ref="billingAddressLineTokenizer" />
        </map>
    </property>
</bean>
----
[role="javaContent"]
[tabs]
====
Java::
+
The following example shows how to ensure that each line is properly tokenized in Java:
+
.Java Content
[source, java, role="javaContent"]
[source, java]
----
@Bean
public PatternMatchingCompositeLineTokenizer orderFileTokenizer() {
@@ -532,6 +556,29 @@ public PatternMatchingCompositeLineTokenizer orderFileTokenizer() {
}
----
XML::
+
The following example shows how to ensure that each line is properly tokenized in XML:
+
.XML Content
[source, xml]
----
<bean id="orderFileTokenizer" class="org.spr...PatternMatchingCompositeLineTokenizer">
    <property name="tokenizers">
        <map>
            <entry key="HEA*" value-ref="headerRecordTokenizer" />
            <entry key="FOT*" value-ref="footerRecordTokenizer" />
            <entry key="NCU*" value-ref="customerLineTokenizer" />
            <entry key="BAD*" value-ref="billingAddressLineTokenizer" />
        </map>
    </property>
</bean>
----
====
This wrapper has to be able to recognize the end of a record so that it can continually
call `read()` on its delegate until the end is reached. For each line that is read, the
wrapper should build up the item to be returned. Once the footer is reached, the item can
@@ -572,7 +619,7 @@ public Trade read() throws Exception {
----
[[executingSystemCommands]]
=== Executing System Commands
== Executing System Commands
Many batch jobs require that an external command be called from within the batch job.
Such a process could be kicked off separately by the scheduler, but the advantage of
@@ -582,24 +629,15 @@ need to be split up into multiple jobs as well.
Because the need is so common, Spring Batch provides a `Tasklet` implementation for
calling system commands.
[role="xmlContent"]
The following example shows how to call an external command in XML:
.XML Configuration
[source, xml, role="xmlContent"]
----
<bean class="org.springframework.batch.core.step.tasklet.SystemCommandTasklet">
    <property name="command" value="echo hello" />
    <!-- 5 second timeout for the command to complete -->
    <property name="timeout" value="5000" />
</bean>
----
[role="javaContent"]
[tabs]
====
Java::
+
The following example shows how to call an external command in Java:
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public SystemCommandTasklet tasklet() {
@@ -612,8 +650,27 @@ public SystemCommandTasklet tasklet() {
}
----
XML::
+
The following example shows how to call an external command in XML:
+
.XML Configuration
[source, xml]
----
<bean class="org.springframework.batch.core.step.tasklet.SystemCommandTasklet">
    <property name="command" value="echo hello" />
    <!-- 5 second timeout for the command to complete -->
    <property name="timeout" value="5000" />
</bean>
----
====
[[handlingStepCompletionWhenNoInputIsFound]]
=== Handling Step Completion When No Input is Found
== Handling Step Completion When No Input is Found
In many batch scenarios, finding no rows in a database or file to process is not
exceptional. The `Step` is simply considered to have found no work and completes with 0
@@ -647,7 +704,7 @@ is the case, an exit code `FAILED` is returned, indicating that the `Step` shoul
Otherwise, `null` is returned, which does not affect the status of the `Step`.
[[passingDataToFutureSteps]]
=== Passing Data to Future Steps
== Passing Data to Future Steps
It is often useful to pass information from one step to another. This can be done through
the `ExecutionContext`. The catch is that there are two `ExecutionContexts`: one at the
@@ -688,41 +745,15 @@ also, optionally, be configured with a list of exit code patterns for which the
should occur (`COMPLETED` is the default). As with all listeners, it must be registered
on the `Step`.
[role="xmlContent"]
The following example shows how to promote a step to the `Job` `ExecutionContext` in XML:
.XML Configuration
[source, xml, role="xmlContent"]
----
<job id="job1">
    <step id="step1">
        <tasklet>
            <chunk reader="reader" writer="savingWriter" commit-interval="10"/>
        </tasklet>
        <listeners>
            <listener ref="promotionListener"/>
        </listeners>
    </step>
    <step id="step2">
        ...
    </step>
</job>
<beans:bean id="promotionListener" class="org.spr....ExecutionContextPromotionListener">
    <beans:property name="keys">
        <list>
            <value>someKey</value>
        </list>
    </beans:property>
</beans:bean>
----
[role="xmlContent"]
[tabs]
====
Java::
+
The following example shows how to promote a step to the `Job` `ExecutionContext` in Java:
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public Job job1(JobRepository jobRepository) {
@@ -752,6 +783,41 @@ public ExecutionContextPromotionListener promotionListener() {
}
----
XML::
+
The following example shows how to promote a step to the `Job` `ExecutionContext` in XML:
+
.XML Configuration
[source, xml]
----
<job id="job1">
    <step id="step1">
        <tasklet>
            <chunk reader="reader" writer="savingWriter" commit-interval="10"/>
        </tasklet>
        <listeners>
            <listener ref="promotionListener"/>
        </listeners>
    </step>
    <step id="step2">
        ...
    </step>
</job>
<beans:bean id="promotionListener" class="org.spr....ExecutionContextPromotionListener">
    <beans:property name="keys">
        <list>
            <value>someKey</value>
        </list>
    </beans:property>
</beans:bean>
----
====
Finally, the saved values must be retrieved from the `Job` `ExecutionContext`, as shown
in the following example:

View File

@@ -1,13 +1,7 @@
:toc: left
:toclevels: 4
[[domainLanguageOfBatch]]
== The Domain Language of Batch
= The Domain Language of Batch
include::attributes.adoc[]
ifndef::onlyonetoggle[]
include::toggle.adoc[]
endif::onlyonetoggle[]
To any experienced batch architect, the overall concepts of batch processing used in
Spring Batch should be familiar and comfortable. There are "`Jobs`" and "`Steps`" and
@@ -33,7 +27,7 @@ creation of simple to complex batch applications, with the infrastructure and ex
to address very complex processing needs.
.Batch Stereotypes
image::{batch-asciidoc}images/spring-batch-reference-model.png[Figure 2.1: Batch Stereotypes, scaledwidth="60%"]
image::spring-batch-reference-model.png[Figure 2.1: Batch Stereotypes, scaledwidth="60%"]
The preceding diagram highlights the key concepts that make up the domain language of
Spring Batch. A `Job` has one to many steps, each of which has exactly one `ItemReader`,
@@ -41,7 +35,8 @@ one `ItemProcessor`, and one `ItemWriter`. A job needs to be launched (with
`JobLauncher`), and metadata about the currently running process needs to be stored (in
`JobRepository`).
=== Job
[[job]]
== Job
This section describes stereotypes relating to the concept of a batch job. A `Job` is an
entity that encapsulates an entire batch process. As is common with other Spring
@@ -50,7 +45,7 @@ configuration. This configuration may be referred to as the "`job configuration`
`Job` is only the top of an overall hierarchy, as shown in the following diagram:
.Job Hierarchy
image::{batch-asciidoc}images/job-heirarchy.png[Job Hierarchy, scaledwidth="60%"]
image::job-heirarchy.png[Job Hierarchy, scaledwidth="60%"]
In Spring Batch, a `Job` is simply a container for `Step` instances. It combines multiple
steps that logically belong together in a flow and allows for configuration of properties
@@ -60,49 +55,17 @@ global to all steps, such as restartability. The job configuration contains:
* Definition and ordering of `Step` instances.
* Whether or not the job is restartable.
ifdef::backend-spring-html[]
[role="javaContent"]
[tabs]
====
Java::
+
For those who use Java configuration, Spring Batch provides a default implementation of
the `Job` interface in the form of the `SimpleJob` class, which creates some standard
functionality on top of `Job`. When using Java-based configuration, a collection of
builders is made available for the instantiation of a `Job`, as the following
example shows:
[source, java, role="javaContent"]
----
@Bean
public Job footballJob(JobRepository jobRepository) {
return new JobBuilder("footballJob", jobRepository)
.start(playerLoad())
.next(gameLoad())
.next(playerSummarization())
.build();
}
----
[role="xmlContent"]
For those who use XML configuration, Spring Batch provides a default implementation of the
`Job` interface in the form of the `SimpleJob` class, which creates some standard
functionality on top of `Job`. However, the batch namespace abstracts away the need to
instantiate it directly. Instead, you can use the `<job>` element, as the
following example shows:
[source, xml, role="xmlContent"]
----
<job id="footballJob">
<step id="playerload" next="gameLoad"/>
<step id="gameLoad" next="playerSummarization"/>
<step id="playerSummarization"/>
</job>
----
endif::backend-spring-html[]
ifdef::backend-pdf[]
Spring Batch provides a default implementation of the `Job` interface in the form of the
`SimpleJob` class, which creates some standard functionality on top of `Job`. When using
Java-based configuration, a collection of builders are made available for the
instantiation of a `Job`, as the following example shows:
+
[source, java]
----
@Bean
@@ -115,10 +78,14 @@ public Job footballJob(JobRepository jobRepository) {
}
----
However, when using XML configuration, the batch namespace abstracts away the need to
instantiate it directly. Instead, you can use the `<job>` element, as the following
example shows:
XML::
+
For those who use XML configuration, Spring Batch provides a default implementation of the
`Job` interface in the form of the `SimpleJob` class, which creates some standard
functionality on top of `Job`. However, the batch namespace abstracts away the need to
instantiate it directly. Instead, you can use the `<job>` element, as the
following example shows:
+
[source, xml]
----
<job id="footballJob">
@@ -127,9 +94,15 @@ example shows:
<step id="playerSummarization"/>
</job>
----
endif::backend-pdf[]
==== JobInstance
====
[[jobinstance]]
=== JobInstance
A `JobInstance` refers to the concept of a logical job run. Consider a batch job that
should be run once at the end of the day, such as the `EndOfDay` `Job` from the preceding
@@ -156,7 +129,7 @@ beginning,`" and using an existing instance generally means "`start from where y
off`".
[[jobParameters]]
==== JobParameters
=== JobParameters
Having discussed `JobInstance` and how it differs from `Job`, the natural question to ask
is: "`How is one `JobInstance` distinguished from another?`" The answer is:
@@ -165,7 +138,7 @@ job. They can be used for identification or even as reference data during the ru
following image shows:
.Job Parameters
image::{batch-asciidoc}images/job-stereotypes-parameters.png[Job Parameters, scaledwidth="60%"]
image::job-stereotypes-parameters.png[Job Parameters, scaledwidth="60%"]
In the preceding example, where there are two instances, one for January 1st and another
for January 2nd, there is really only one `Job`, but it has two `JobParameter` objects:
@@ -178,7 +151,8 @@ NOTE: Not all job parameters are required to contribute to the identification of
`JobInstance`. By default, they do so. However, the framework also allows the submission
of a `Job` with parameters that do not contribute to the identity of a `JobInstance`.
==== JobExecution
[[jobexecution]]
=== JobExecution
A `JobExecution` refers to the technical concept of a single attempt to run a Job. An
execution may end in failure or success, but the `JobInstance` corresponding to a given
@@ -344,7 +318,8 @@ in both the `JobInstance` and `JobParameters` tables and two extra entries in th
NOTE: Column names may have been abbreviated or removed for the sake of clarity and
formatting.
=== Step
[[step]]
== Step
A `Step` is a domain object that encapsulates an independent, sequential phase of a batch
job. Therefore, every `Job` is composed entirely of one or more steps. A `Step` contains
@@ -358,9 +333,10 @@ with a `Job`, a `Step` has an individual `StepExecution` that correlates with a
`JobExecution`, as the following image shows:
.Job Hierarchy With Steps
image::{batch-asciidoc}images/jobHeirarchyWithSteps.png[Figure 2.1: Job Hierarchy With Steps, scaledwidth="60%"]
image::jobHeirarchyWithSteps.png[Figure 2.1: Job Hierarchy With Steps, scaledwidth="60%"]
==== StepExecution
[[stepexecution]]
=== StepExecution
A `StepExecution` represents a single attempt to execute a `Step`. A new `StepExecution`
is created each time a `Step` is run, similar to `JobExecution`. However, if a step fails
@@ -427,7 +403,8 @@ back.
|The number of times `write` has failed, resulting in a skipped item.
|===
=== ExecutionContext
[[executioncontext]]
== ExecutionContext
An `ExecutionContext` represents a collection of key/value pairs that are persisted and
controlled by the framework to give developers a place to store persistent
@@ -557,7 +534,8 @@ As noted in the comment, `ecStep` does not equal `ecJob`. They are two different
`ExecutionContexts`. The one scoped to the `Step` is saved at every commit point in the
`Step`, whereas the one scoped to the Job is saved in between every `Step` execution.
=== JobRepository
[[jobrepository]]
== JobRepository
`JobRepository` is the persistence mechanism for all of the stereotypes mentioned earlier.
It provides CRUD operations for `JobLauncher`, `Job`, and `Step` implementations. When a
@@ -565,20 +543,28 @@ It provides CRUD operations for `JobLauncher`, `Job`, and `Step` implementations
the course of execution, `StepExecution` and `JobExecution` implementations are persisted
by passing them to the repository.
[role="xmlContent"]
The Spring Batch XML namespace provides support for configuring a `JobRepository` instance
with the `<job-repository>` tag, as the following example shows:
[source, xml, role="xmlContent"]
----
<job-repository id="jobRepository"/>
----
[role="javaContent"]
[tabs]
====
Java::
+
When using Java configuration, the `@EnableBatchProcessing` annotation provides a
`JobRepository` as one of the components that is automatically configured.
=== JobLauncher
XML::
+
The Spring Batch XML namespace provides support for configuring a `JobRepository` instance
with the `<job-repository>` tag, as the following example shows:
+
[source, xml]
----
<job-repository id="jobRepository"/>
----
====
[[joblauncher]]
== JobLauncher
`JobLauncher` represents a simple interface for launching a `Job` with a given set of
`JobParameters`, as the following example shows:
@@ -596,23 +582,26 @@ public JobExecution run(Job job, JobParameters jobParameters)
It is expected that implementations obtain a valid `JobExecution` from the
`JobRepository` and execute the `Job`.
=== ItemReader
[[itemreader]]
== ItemReader
`ItemReader` is an abstraction that represents the retrieval of input for a `Step`, one
item at a time. When the `ItemReader` has exhausted the items it can provide, it
indicates this by returning `null`. You can find more details about the `ItemReader` interface and its
various implementations in
<<readersAndWriters.adoc#readersAndWriters,Readers And Writers>>.
xref:readersAndWriters.adoc[Readers And Writers].
=== ItemWriter
[[itemwriter]]
== ItemWriter
`ItemWriter` is an abstraction that represents the output of a `Step`, one batch or chunk
of items at a time. Generally, an `ItemWriter` has no knowledge of the input it should
receive next and knows only the item that was passed in its current invocation. You can find more
details about the `ItemWriter` interface and its various implementations in
<<readersAndWriters.adoc#readersAndWriters,Readers And Writers>>.
xref:readersAndWriters.adoc[Readers And Writers].
=== ItemProcessor
[[itemprocessor]]
== ItemProcessor
`ItemProcessor` is an abstraction that represents the business processing of an item.
While the `ItemReader` reads one item, and the `ItemWriter` writes one item, the
@@ -620,16 +609,18 @@ While the `ItemReader` reads one item, and the `ItemWriter` writes one item, the
If, while processing the item, it is determined that the item is not valid, returning
`null` indicates that the item should not be written out. You can find more details about the
`ItemProcessor` interface in
<<readersAndWriters.adoc#readersAndWriters,Readers And Writers>>.
xref:readersAndWriters.adoc[Readers And Writers].
[role="xmlContent"]
=== Batch Namespace
[[batch-namespace]]
== Batch Namespace
Many of the domain concepts listed previously need to be configured in a Spring
`ApplicationContext`. While there are implementations of the interfaces above that you can
use in a standard bean definition, a namespace has been provided for ease of
configuration, as the following example shows:
[source, xml, role="xmlContent"]
----
<beans:beans xmlns="http://www.springframework.org/schema/batch"
@@ -652,8 +643,8 @@ xsi:schemaLocation="
</beans:beans>
----
[role="xmlContent"]
As long as the batch namespace has been declared, any of its elements can be used. You can find more
information on configuring a Job in <<job.adoc#configureJob,Configuring and
Running a Job>>. You can find more information on configuring a `Step` in
<<step.adoc#configureStep,Configuring a Step>>.
information on configuring a Job in xref:job.adoc[Configuring and Running a Job].
You can find more information on configuring a `Step` in
xref:step.adoc[Configuring a Step].

View File

@@ -1,9 +1,11 @@
[[glossary]]
[appendix]
== Glossary
[[glossary]]
= Glossary
[glossary]
=== Spring Batch Glossary
[[spring-batch-glossary]]
== Spring Batch Glossary
Batch::
An accumulation of business transactions over time.

View File

@@ -0,0 +1,3 @@
[[spring-batch-reference-documentation]]
= Spring Batch - Reference Documentation
:page-section-summary-toc: 1

View File

@@ -0,0 +1,46 @@
= Overview
// ======================================================================================
The reference documentation is divided into several sections:
[horizontal]
xref:spring-batch-intro.adoc[Spring Batch Introduction] :: Background, usage
scenarios, and general guidelines.
xref:spring-batch-architecture.adoc[Spring Batch Architecture] :: Spring Batch
architecture, general batch principles, batch processing strategies.
xref:whatsnew.adoc[What's new in Spring Batch 5.1] :: New features introduced in version 5.1.
xref:domain.adoc[The Domain Language of Batch] :: Core concepts and abstractions
of the Batch domain language.
xref:job.adoc[Configuring and Running a Job] :: Job configuration, execution, and
administration.
xref:step.adoc[Configuring a Step] :: Step configuration, different types of steps, and
controlling step flow.
xref:readersAndWriters.adoc[Item reading and writing] :: `ItemReader`
and `ItemWriter` interfaces and how to use them.
xref:processor.adoc[Item processing] :: `ItemProcessor` interface and how to use it.
xref:scalability.adoc#scalability[Scaling and Parallel Processing] :: Multi-threaded steps,
parallel steps, remote chunking, and partitioning.
xref:repeat.adoc#repeat[Repeat] :: Completion policies and exception handling of repetitive actions.
xref:retry.adoc#retry[Retry] :: Retry and backoff policies of retryable operations.
xref:testing.adoc[Unit Testing] :: Job and Step testing facilities and APIs.
xref:common-patterns.adoc#commonPatterns[Common Patterns] :: Common batch processing patterns
and guidelines.
xref:spring-batch-integration.adoc[Spring Batch Integration] :: Integration
between Spring Batch and Spring Integration projects.
xref:monitoring-and-metrics.adoc[Monitoring and metrics] :: Batch jobs
monitoring and metrics.
xref:tracing.adoc[Tracing] :: Tracing with Micrometer.
The following appendices are available:
[horizontal]
xref:appendix.adoc#listOfReadersAndWriters[List of ItemReaders and ItemWriters] :: List of
all provided item readers and writers.
xref:schema-appendix.adoc#metaDataSchema[Meta-Data Schema] :: Core tables used by the Batch
domain model.
xref:transaction-appendix.adoc#transactions[Batch Processing and Transactions] :: Transaction
boundaries, propagation, and isolation levels used in Spring Batch.
xref:glossary.adoc#glossary[Glossary] :: Glossary of common terms, concepts, and vocabulary of
the Batch domain.

View File

@@ -0,0 +1,22 @@
[[configureJob]]
= Configuring and Running a Job
:page-section-summary-toc: 1
ifndef::onlyonetoggle[]
endif::onlyonetoggle[]
In the xref:domain.adoc[domain section], the overall
architecture design was discussed, using the following diagram as a
guide:
.Batch Stereotypes
image::spring-batch-reference-model.png[Figure 2.1: Batch Stereotypes, scaledwidth="60%"]
While the `Job` object may seem like a simple
container for steps, you must be aware of many configuration options.
Furthermore, you must consider many options about
how a `Job` can be run and how its metadata can be
stored during that run. This chapter explains the various configuration
options and runtime concerns of a `Job`.

View File

@@ -0,0 +1,556 @@
[[advancedMetaData]]
= Advanced Metadata Usage
So far, both the `JobLauncher` and `JobRepository` interfaces have been
discussed. Together, they represent the simple launching of a job and basic
CRUD operations of batch domain objects:
.Job Repository
image::job-repository.png[Job Repository, scaledwidth="60%"]
A `JobLauncher` uses the
`JobRepository` to create new
`JobExecution` objects and run them.
`Job` and `Step` implementations
later use the same `JobRepository` for basic updates
of the same executions during the running of a `Job`.
The basic operations suffice for simple scenarios. However, in a large batch
environment with hundreds of batch jobs and complex scheduling
requirements, more advanced access to the metadata is required:
.Advanced Job Repository Access
image::job-repository-advanced.png[Job Repository Advanced, scaledwidth="80%"]
The `JobExplorer` and
`JobOperator` interfaces, which are discussed
in the coming sections, add functionality for querying and controlling the metadata.
[[queryingRepository]]
== Querying the Repository
The most basic need before any advanced features is the ability to
query the repository for existing executions. This functionality is
provided by the `JobExplorer` interface:
[source, java]
----
public interface JobExplorer {
List<JobInstance> getJobInstances(String jobName, int start, int count);
JobExecution getJobExecution(Long executionId);
StepExecution getStepExecution(Long jobExecutionId, Long stepExecutionId);
JobInstance getJobInstance(Long instanceId);
List<JobExecution> getJobExecutions(JobInstance jobInstance);
Set<JobExecution> findRunningJobExecutions(String jobName);
}
----
As is evident from its method signatures, `JobExplorer` is a read-only version of
the `JobRepository`, and, like the `JobRepository`, it can be easily configured by using a
factory bean.
[tabs]
====
Java::
+
The following example shows how to configure a `JobExplorer` in Java:
+
.Java Configuration
[source, java]
----
...
// This would reside in your DefaultBatchConfiguration extension
@Bean
public JobExplorer jobExplorer() throws Exception {
JobExplorerFactoryBean factoryBean = new JobExplorerFactoryBean();
factoryBean.setDataSource(this.dataSource);
return factoryBean.getObject();
}
...
----
XML::
+
The following example shows how to configure a `JobExplorer` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="jobExplorer" class="org.spr...JobExplorerFactoryBean"
p:dataSource-ref="dataSource" />
----
====
xref:job/configuring-repository.adoc#repositoryTablePrefix[Earlier in this chapter], we noted that you can modify the table prefix
of the `JobRepository` to allow for different versions or schemas. Because
the `JobExplorer` works with the same tables, it also needs the ability to set a prefix.
[tabs]
====
Java::
+
The following example shows how to set the table prefix for a `JobExplorer` in Java:
+
.Java Configuration
[source, java]
----
...
// This would reside in your DefaultBatchConfiguration extension
@Bean
public JobExplorer jobExplorer() throws Exception {
JobExplorerFactoryBean factoryBean = new JobExplorerFactoryBean();
factoryBean.setDataSource(this.dataSource);
factoryBean.setTablePrefix("SYSTEM.");
return factoryBean.getObject();
}
...
----
XML::
+
The following example shows how to set the table prefix for a `JobExplorer` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="jobExplorer" class="org.spr...JobExplorerFactoryBean"
p:tablePrefix="SYSTEM."/>
----
====
[[jobregistry]]
== JobRegistry
A `JobRegistry` (and its parent interface, `JobLocator`) is not mandatory, but it can be
useful if you want to keep track of which jobs are available in the context. It is also
useful for collecting jobs centrally in an application context when they have been created
elsewhere (for example, in child contexts). You can also use custom `JobRegistry` implementations
to manipulate the names and other properties of the jobs that are registered.
The framework provides only one implementation, `MapJobRegistry`, which is based on a simple
map from job name to job instance.
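The map-based idea can be sketched in plain Java. The following is a simplified model, not the source of the framework's registry: `SimpleRegistry`, its method names, and the exception types are assumptions chosen so that the example runs without Spring Batch on the classpath.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of a registry backed by a map from job name to job.
class SimpleRegistry<J> {

    private final Map<String, J> jobs = new ConcurrentHashMap<>();

    // Registers a job under a unique name; a second registration
    // with the same name is rejected instead of silently replacing the first.
    void register(String name, J job) {
        if (jobs.putIfAbsent(name, job) != null) {
            throw new IllegalStateException("A job named '" + name + "' is already registered");
        }
    }

    // Removes the job registered under the given name, if any.
    void unregister(String name) {
        jobs.remove(name);
    }

    // Looks a job up by name; fails if no job with that name was registered.
    J getJob(String name) {
        J job = jobs.get(name);
        if (job == null) {
            throw new IllegalArgumentException("No job named '" + name + "' is registered");
        }
        return job;
    }

    // Returns a snapshot of the registered job names.
    Set<String> getJobNames() {
        return Set.copyOf(jobs.keySet());
    }
}
```

Because the name is the map key, job names must be unique within one registry, which is the constraint that the registration mechanisms in the following sections have to respect.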
[tabs]
====
Java::
+
When using `@EnableBatchProcessing`, a `JobRegistry` is provided for you.
The following example shows how to configure your own `JobRegistry`:
+
[source, java]
----
...
// This is already provided via the @EnableBatchProcessing but can be customized via
// overriding the bean in the DefaultBatchConfiguration
@Override
@Bean
public JobRegistry jobRegistry() throws Exception {
return new MapJobRegistry();
}
...
----
XML::
+
The following example shows how to include a `JobRegistry` for a job defined in XML:
+
[source, xml]
----
<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry" />
----
====
You can populate a `JobRegistry` in either of two ways: by using
a bean post processor or by using a registrar lifecycle component. The coming
sections describe these two mechanisms.
[[jobregistrybeanpostprocessor]]
=== JobRegistryBeanPostProcessor
This is a bean post-processor that can register all jobs as they are created.
[tabs]
====
Java::
+
The following example shows how to include the `JobRegistryBeanPostProcessor` for a job
defined in Java:
+
.Java Configuration
[source, java]
----
@Bean
public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(JobRegistry jobRegistry) {
JobRegistryBeanPostProcessor postProcessor = new JobRegistryBeanPostProcessor();
postProcessor.setJobRegistry(jobRegistry);
return postProcessor;
}
----
XML::
+
The following example shows how to include the `JobRegistryBeanPostProcessor` for a job
defined in XML:
+
.XML Configuration
[source, xml]
----
<bean id="jobRegistryBeanPostProcessor" class="org.spr...JobRegistryBeanPostProcessor">
<property name="jobRegistry" ref="jobRegistry"/>
</bean>
----
====
Although it is not strictly necessary, the post-processor in the
example has been given an `id` so that it can be included in child
contexts (for example, as a parent bean definition) and cause all jobs created
there to also be registered automatically.
[[automaticjobregistrar]]
=== AutomaticJobRegistrar
This is a lifecycle component that creates child contexts and registers jobs from those
contexts as they are created. One advantage of doing this is that, while the job names in
the child contexts still have to be globally unique in the registry, their dependencies
can have "`natural`" names. So, for example, you can create a set of XML configuration files
that each have only one Job but that all have different definitions of an `ItemReader` with the
same bean name, such as `reader`. If all those files were imported into the same context,
the reader definitions would clash and override one another, but, with the automatic
registrar, this is avoided. This makes it easier to integrate jobs that have been contributed from
separate modules of an application.
[tabs]
====
Java::
+
The following example shows how to include the `AutomaticJobRegistrar` for a job defined
in Java:
+
.Java Configuration
[source, java]
----
@Bean
public AutomaticJobRegistrar registrar() {
AutomaticJobRegistrar registrar = new AutomaticJobRegistrar();
registrar.setJobLoader(jobLoader());
registrar.setApplicationContextFactories(applicationContextFactories());
registrar.afterPropertiesSet();
return registrar;
}
----
XML::
+
The following example shows how to include the `AutomaticJobRegistrar` for a job defined
in XML:
+
.XML Configuration
[source, xml]
----
<bean class="org.spr...AutomaticJobRegistrar">
<property name="applicationContextFactories">
<bean class="org.spr...ClasspathXmlApplicationContextsFactoryBean">
<property name="resources" value="classpath*:/config/job*.xml" />
</bean>
</property>
<property name="jobLoader">
<bean class="org.spr...DefaultJobLoader">
<property name="jobRegistry" ref="jobRegistry" />
</bean>
</property>
</bean>
----
====
The registrar has two mandatory properties: an array of
`ApplicationContextFactory` (created from a
convenient factory bean in the preceding example) and a
`JobLoader`. The `JobLoader`
is responsible for managing the lifecycle of the child contexts and
registering jobs in the `JobRegistry`.
The `ApplicationContextFactory` is
responsible for creating the child context. The most common usage
is (as in the preceding example) to use a
`ClassPathXmlApplicationContextFactory`. One of
the features of this factory is that, by default, it copies some of the
configuration down from the parent context to the child. So, for
instance, you need not redefine the
`PropertyPlaceholderConfigurer` or AOP
configuration in the child, provided it should be the same as the
parent.
You can use `AutomaticJobRegistrar` in
conjunction with a `JobRegistryBeanPostProcessor`
(as long as you also use `DefaultJobLoader`).
For instance, this might be desirable if there are jobs
defined in the main parent context as well as in the child
locations.
[[JobOperator]]
== JobOperator
As previously discussed, the `JobRepository`
provides CRUD operations on the metadata, and the
`JobExplorer` provides read-only operations on the
metadata. However, those operations are most useful when used together
to perform common monitoring tasks such as stopping, restarting, or
summarizing a Job, as is commonly done by batch operators. Spring Batch
provides these types of operations in the
`JobOperator` interface:
[source, java]
----
public interface JobOperator {
List<Long> getExecutions(long instanceId) throws NoSuchJobInstanceException;
List<Long> getJobInstances(String jobName, int start, int count)
throws NoSuchJobException;
Set<Long> getRunningExecutions(String jobName) throws NoSuchJobException;
String getParameters(long executionId) throws NoSuchJobExecutionException;
Long start(String jobName, String parameters)
throws NoSuchJobException, JobInstanceAlreadyExistsException;
Long restart(long executionId)
throws JobInstanceAlreadyCompleteException, NoSuchJobExecutionException,
NoSuchJobException, JobRestartException;
Long startNextInstance(String jobName)
throws NoSuchJobException, JobParametersNotFoundException, JobRestartException,
JobExecutionAlreadyRunningException, JobInstanceAlreadyCompleteException;
boolean stop(long executionId)
throws NoSuchJobExecutionException, JobExecutionNotRunningException;
String getSummary(long executionId) throws NoSuchJobExecutionException;
Map<Long, String> getStepExecutionSummaries(long executionId)
throws NoSuchJobExecutionException;
Set<String> getJobNames();
}
----
The preceding operations represent methods from many different interfaces, such as
`JobLauncher`, `JobRepository`, `JobExplorer`, and `JobRegistry`. For this reason, the
provided implementation of `JobOperator` (`SimpleJobOperator`) has many dependencies.
[tabs]
====
Java::
+
The following example shows a typical bean definition for `SimpleJobOperator` in Java:
+
[source, java]
----
/**
* All injected dependencies for this bean are provided by the @EnableBatchProcessing
* infrastructure out of the box.
*/
@Bean
public SimpleJobOperator jobOperator(JobExplorer jobExplorer,
JobRepository jobRepository,
JobRegistry jobRegistry,
JobLauncher jobLauncher) {
SimpleJobOperator jobOperator = new SimpleJobOperator();
jobOperator.setJobExplorer(jobExplorer);
jobOperator.setJobRepository(jobRepository);
jobOperator.setJobRegistry(jobRegistry);
jobOperator.setJobLauncher(jobLauncher);
return jobOperator;
}
----
XML::
+
The following example shows a typical bean definition for `SimpleJobOperator` in XML:
+
[source, xml]
----
<bean id="jobOperator" class="org.spr...SimpleJobOperator">
<property name="jobExplorer">
<bean class="org.spr...JobExplorerFactoryBean">
<property name="dataSource" ref="dataSource" />
</bean>
</property>
<property name="jobRepository" ref="jobRepository" />
<property name="jobRegistry" ref="jobRegistry" />
<property name="jobLauncher" ref="jobLauncher" />
</bean>
----
====
As of version 5.0, the `@EnableBatchProcessing` annotation automatically registers a job operator bean
in the application context.
NOTE: If you set the table prefix on the job repository, do not forget to set it on the job explorer as well.
[[JobParametersIncrementer]]
== JobParametersIncrementer
Most of the methods on `JobOperator` are
self-explanatory, and you can find more detailed explanations in the
https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/launch/JobOperator.html[Javadoc of the interface]. However, the
`startNextInstance` method is worth noting. This
method always starts a new instance of a `Job`.
This can be extremely useful if there are serious issues in a
`JobExecution` and the `Job`
needs to be started over again from the beginning. Unlike
`JobLauncher` (which requires a new
`JobParameters` object that triggers a new
`JobInstance` if the parameters are different from
any previous set of parameters), the
`startNextInstance` method uses the
`JobParametersIncrementer` tied to the
`Job` to force the `Job` to a
new instance:
[source, java]
----
public interface JobParametersIncrementer {
JobParameters getNext(JobParameters parameters);
}
----
The contract of `JobParametersIncrementer` is
that, given a xref:domain.adoc#jobParameters[JobParameters]
object, it returns the "`next`" `JobParameters`
object by incrementing any necessary values it may contain. This
strategy is useful because the framework has no way of knowing what
changes to the `JobParameters` make it the "`next`"
instance. For example, if the only value in
`JobParameters` is a date and the next instance
should be created, should that value be incremented by one day or one
week (if the job is weekly, for instance)? The same can be said for any
numerical values that help to identify the `Job`,
as the following example shows:
[source, java]
----
public class SampleIncrementer implements JobParametersIncrementer {
public JobParameters getNext(JobParameters parameters) {
if (parameters==null || parameters.isEmpty()) {
return new JobParametersBuilder().addLong("run.id", 1L).toJobParameters();
}
long id = parameters.getLong("run.id",1L) + 1;
return new JobParametersBuilder().addLong("run.id", id).toJobParameters();
}
}
----
In this example, the value with a key of `run.id` is used to
discriminate between `JobInstances`. If the
`JobParameters` passed in is null or empty, it can be
assumed that the `Job` has never been run before
and, thus, its initial state can be returned. However, if not, the old
value is obtained, incremented by one, and returned.
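The date question raised earlier (one day or one week?) can be illustrated the same way. The following is a minimal sketch under stated assumptions: a plain `Map` stands in for `JobParameters` so that the example runs without Spring Batch on the classpath, and the class name `DailyDateIncrementer` and the key `schedule.date` are hypothetical.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for JobParameters.
class DailyDateIncrementer {

    static final String KEY = "schedule.date"; // assumed parameter name

    private final LocalDate firstRun;

    DailyDateIncrementer(LocalDate firstRun) {
        this.firstRun = firstRun;
    }

    // Returns the parameters of the "next" instance: the configured first run
    // date when there is no previous date, otherwise the previous date plus one day.
    Map<String, LocalDate> getNext(Map<String, LocalDate> previous) {
        Map<String, LocalDate> next = new HashMap<>();
        LocalDate last = (previous == null) ? null : previous.get(KEY);
        if (last == null) {
            next.put(KEY, firstRun);
        } else {
            next.putAll(previous);
            next.put(KEY, last.plusDays(1));
        }
        return next;
    }
}
```

A weekly job would call `plusWeeks(1)` instead of `plusDays(1)`. That choice is business knowledge, which is why the framework delegates it to the incrementer.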
[tabs]
====
Java::
+
For jobs defined in Java, you can associate an incrementer with a `Job` through the
`incrementer` method provided in the builders, as follows:
+
[source, java]
----
@Bean
public Job footballJob(JobRepository jobRepository) {
return new JobBuilder("footballJob", jobRepository)
.incrementer(sampleIncrementer())
...
.build();
}
----
XML::
+
For jobs defined in XML, you can associate an incrementer with a `Job` through the
`incrementer` attribute in the namespace, as follows:
+
[source, xml]
----
<job id="footballJob" incrementer="sampleIncrementer">
...
</job>
----
====
[[stoppingAJob]]
== Stopping a Job
One of the most common use cases of
`JobOperator` is gracefully stopping a
Job:
[source, java]
----
Set<Long> executions = jobOperator.getRunningExecutions("sampleJob");
jobOperator.stop(executions.iterator().next());
----
The shutdown is not immediate, since there is no way to force
immediate shutdown, especially if the execution is currently in
developer code that the framework has no control over, such as a
business service. However, as soon as control is returned to the
framework, it sets the status of the current
`StepExecution` to
`BatchStatus.STOPPED`, saves it, and does the same
for the `JobExecution` before finishing.
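Why the stop is not immediate can be shown with a small model. The following sketch is not Spring Batch code: `StoppableLoop`, its `Status` enum, and the `businessCode` callback are hypothetical names that only mimic the idea of a stop flag being checked once control returns from developer code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of a cooperative stop; this is not Spring Batch code.
class StoppableLoop {

    enum Status { STARTED, STOPPED, COMPLETED }

    private volatile boolean stopRequested;
    private volatile Status status = Status.STARTED;
    private final List<String> processed = new ArrayList<>();

    // Only records the request; it cannot interrupt business code that is already running.
    void stop() {
        stopRequested = true;
    }

    // Runs the business code for each item and checks the stop flag only
    // after that code has returned control to the loop.
    Status run(List<String> items, Consumer<String> businessCode) {
        for (String item : items) {
            businessCode.accept(item);
            processed.add(item);
            if (stopRequested) {
                status = Status.STOPPED;
                return status;
            }
        }
        status = Status.COMPLETED;
        return status;
    }

    // Returns the items whose business code ran to completion.
    List<String> processed() {
        return processed;
    }
}
```

In this model, a stop requested while an item is being processed takes effect only after that item finishes; the remaining items are never touched and the final status is `STOPPED`.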
[[aborting-a-job]]
== Aborting a Job
A job execution that is `FAILED` can be
restarted (if the `Job` is restartable). A job execution whose status is
`ABANDONED` cannot be restarted by the framework.
The `ABANDONED` status is also used in step
executions to mark them as skippable in a restarted job execution. If a
job is running and encounters a step that has been marked
`ABANDONED` in the previous failed job execution, it
moves on to the next step (as determined by the job flow definition
and the step execution exit status).
If the process died (`kill -9` or server
failure), the job is, of course, not running, but the `JobRepository` has
no way of knowing because no one told it before the process died. You
have to tell it manually that you know that the execution either failed
or should be considered aborted (change its status to
`FAILED` or `ABANDONED`). This is
a business decision, and there is no way to automate it. Change the
status to `FAILED` only if it is restartable and you know that the restart data is valid.

View File

@@ -0,0 +1,120 @@
[[configuringJobLauncher]]
= Configuring a JobLauncher
[tabs]
====
Java::
+
When you use `@EnableBatchProcessing`, a `JobLauncher` is provided for you.
This section describes how to configure your own.
XML::
+
// FIXME what is the XML equivalent?
====
The most basic implementation of the `JobLauncher` interface is the `TaskExecutorJobLauncher`.
Its only required dependency is a `JobRepository` (needed to obtain an execution).
[tabs]
====
Java::
+
The following example shows a `TaskExecutorJobLauncher` in Java:
+
.Java Configuration
[source, java]
----
...
@Bean
public JobLauncher jobLauncher() throws Exception {
TaskExecutorJobLauncher jobLauncher = new TaskExecutorJobLauncher();
jobLauncher.setJobRepository(jobRepository);
jobLauncher.afterPropertiesSet();
return jobLauncher;
}
...
----
XML::
+
The following example shows a `TaskExecutorJobLauncher` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.TaskExecutorJobLauncher">
<property name="jobRepository" ref="jobRepository" />
</bean>
----
====
Once a xref:domain.adoc[JobExecution] is obtained, it is passed to the
execute method of `Job`, ultimately returning the `JobExecution` to the caller, as
the following image shows:
.Job Launcher Sequence
image::job-launcher-sequence-sync.png[Job Launcher Sequence, scaledwidth="60%"]
The sequence is straightforward and works well when launched from a scheduler. However,
issues arise when trying to launch from an HTTP request. In this scenario, the launching
needs to be done asynchronously so that the `TaskExecutorJobLauncher` returns immediately to its
caller. This is because it is not good practice to keep an HTTP request open for the
amount of time needed by long-running processes (such as batch jobs). The following image shows
an example sequence:
.Asynchronous Job Launcher Sequence
image::job-launcher-sequence-async.png[Async Job Launcher Sequence, scaledwidth="60%"]
You can configure the `TaskExecutorJobLauncher` to allow for this scenario by configuring a
`TaskExecutor`.
[tabs]
====
Java::
+
The following Java example configures a `TaskExecutorJobLauncher` to return immediately:
+
.Java Configuration
[source, java]
----
@Bean
public JobLauncher jobLauncher() {
TaskExecutorJobLauncher jobLauncher = new TaskExecutorJobLauncher();
jobLauncher.setJobRepository(jobRepository());
jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
jobLauncher.afterPropertiesSet();
return jobLauncher;
}
----
XML::
+
The following XML example configures a `TaskExecutorJobLauncher` to return immediately:
+
.XML Configuration
[source, xml]
----
<bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.TaskExecutorJobLauncher">
<property name="jobRepository" ref="jobRepository" />
<property name="taskExecutor">
<bean class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
</property>
</bean>
----
====
You can use any implementation of the Spring `TaskExecutor`
interface to control how jobs are asynchronously
executed.
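As a rough sketch of that flexibility, the following configuration swaps the `SimpleAsyncTaskExecutor` for a bounded thread pool. The class name, bean names, and pool settings are illustrative assumptions rather than recommended values, and the `JobRepository` is assumed to be defined elsewhere in the context:

```java
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.TaskExecutorJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

// Hypothetical configuration class; names and sizes are assumptions for illustration.
@Configuration
public class AsyncLauncherConfiguration {

    @Bean
    public TaskExecutor jobLauncherTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);      // threads kept alive for launching jobs
        executor.setMaxPoolSize(4);       // upper bound once the queue is full
        executor.setQueueCapacity(10);    // launches waiting for a free thread
        executor.setThreadNamePrefix("job-launcher-");
        return executor;                  // initialized by the container as a bean
    }

    @Bean
    public JobLauncher asyncJobLauncher(JobRepository jobRepository,
            TaskExecutor jobLauncherTaskExecutor) throws Exception {
        TaskExecutorJobLauncher jobLauncher = new TaskExecutorJobLauncher();
        jobLauncher.setJobRepository(jobRepository);
        jobLauncher.setTaskExecutor(jobLauncherTaskExecutor);
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }
}
```

A bounded pool limits how many jobs run concurrently, whereas `SimpleAsyncTaskExecutor` starts a new thread for every launch. If another `JobLauncher` bean already exists in the context, you may need a qualifier to select this one.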

View File

@@ -0,0 +1,262 @@
[[configuringJobRepository]]
= Configuring a JobRepository
As described earlier, the xref:job.adoc[`JobRepository`] is used for basic CRUD operations of the various persisted
domain objects within Spring Batch, such as `JobExecution` and `StepExecution`.
It is required by many of the major framework features, such as the `JobLauncher`,
`Job`, and `Step`.
// FIXME: This did not quite convert properly
[tabs]
====
Java::
+
When using `@EnableBatchProcessing`, a `JobRepository` is provided for you.
This section describes how to configure your own.
+
Other than the `dataSource` and the `transactionManager`, none of the configuration options listed earlier are required.
If they are not set, the defaults shown earlier are used.
The max `varchar` length defaults to `2500`, which is the length of the long `VARCHAR` columns in the
xref:schema-appendix.adoc#metaDataSchemaOverview[sample schema scripts].
XML::
+
The batch namespace abstracts away many of the implementation details of the
`JobRepository` implementations and their collaborators. However, there are still a few
configuration options available, as the following example shows:
+
.XML Configuration
[source, xml]
----
<job-repository id="jobRepository"
data-source="dataSource"
transaction-manager="transactionManager"
isolation-level-for-create="SERIALIZABLE"
table-prefix="BATCH_"
max-varchar-length="1000"/>
----
+
Other than the `id`, none of the configuration options listed earlier are required. If they are
not set, the defaults shown earlier are used.
The `max-varchar-length` defaults to `2500`, which is the length of the long
`VARCHAR` columns in the xref:schema-appendix.adoc#metaDataSchemaOverview[sample schema scripts].
====
[[txConfigForJobRepository]]
== Transaction Configuration for the JobRepository
If the namespace or the provided `FactoryBean` is used, transactional advice is
automatically created around the repository. This is to ensure that the batch metadata,
including state that is necessary for restarts after a failure, is persisted correctly.
The behavior of the framework is not well defined if the repository methods are not
transactional. The isolation level in the `create*` method attributes is specified
separately to ensure that, when jobs are launched, if two processes try to launch
the same job at the same time, only one succeeds. The default isolation level for that
method is `SERIALIZABLE`, which is quite aggressive. `READ_COMMITTED` usually works equally
well. `READ_UNCOMMITTED` is fine if two processes are not likely to collide in this
way. However, since a call to the `create*` method is quite short, it is unlikely that
`SERIALIZABLE` causes problems, as long as the database platform supports it. You
can nonetheless override this setting.
[tabs]
====
Java::
+
The following example shows how to override the isolation level in Java:
+
.Java Configuration
[source, java]
----
@Configuration
@EnableBatchProcessing(isolationLevelForCreate = "ISOLATION_REPEATABLE_READ")
public class MyJobConfiguration {
// job definition
}
----
XML::
+
The following example shows how to override the isolation level in XML:
+
.XML Configuration
[source, xml]
----
<job-repository id="jobRepository"
isolation-level-for-create="REPEATABLE_READ" />
----
====
If the namespace is not used, you must also configure the
transactional behavior of the repository by using AOP.
[tabs]
====
Java::
+
The following example shows how to configure the transactional behavior of the repository
in Java:
+
.Java Configuration
[source, java]
----
@Bean
public TransactionProxyFactoryBean baseProxy() {
TransactionProxyFactoryBean transactionProxyFactoryBean = new TransactionProxyFactoryBean();
Properties transactionAttributes = new Properties();
transactionAttributes.setProperty("*", "PROPAGATION_REQUIRED");
transactionProxyFactoryBean.setTransactionAttributes(transactionAttributes);
transactionProxyFactoryBean.setTarget(jobRepository());
transactionProxyFactoryBean.setTransactionManager(transactionManager());
return transactionProxyFactoryBean;
}
----
XML::
+
The following example shows how to configure the transactional behavior of the repository
in XML:
+
.XML Configuration
[source, xml]
----
<aop:config>
<aop:advisor
pointcut="execution(* org.springframework.batch.core..*Repository+.*(..))"
advice-ref="txAdvice" />
</aop:config>
<tx:advice id="txAdvice" transaction-manager="transactionManager">
<tx:attributes>
<tx:method name="*" />
</tx:attributes>
</tx:advice>
----
+
You can use the preceding fragment nearly as is, with almost no changes. Remember also to
include the appropriate namespace declarations and to make sure `spring-tx` and `spring-aop`
(or the whole of Spring) are on the classpath.
====
[[repositoryTablePrefix]]
== Changing the Table Prefix
Another modifiable property of the `JobRepository` is the table prefix of the meta-data
tables. By default, they are all prefaced with `BATCH_`. `BATCH_JOB_EXECUTION` and
`BATCH_STEP_EXECUTION` are two examples. However, there are potential reasons to modify this
prefix. If the schema names need to be prepended to the table names or if more than one
set of metadata tables is needed within the same schema, the table prefix needs to
be changed.
[tabs]
====
Java::
+
The following example shows how to change the table prefix in Java:
+
.Java Configuration
[source, java]
----
@Configuration
@EnableBatchProcessing(tablePrefix = "SYSTEM.TEST_")
public class MyJobConfiguration {
// job definition
}
----
XML::
+
The following example shows how to change the table prefix in XML:
+
.XML Configuration
[source, xml]
----
<job-repository id="jobRepository"
table-prefix="SYSTEM.TEST_" />
----
====
Given the preceding changes, every query to the metadata tables is prefixed with
`SYSTEM.TEST_`. `BATCH_JOB_EXECUTION` is referred to as `SYSTEM.TEST_JOB_EXECUTION`.
NOTE: Only the table prefix is configurable. The table and column names are not.
[[nonStandardDatabaseTypesInRepository]]
== Non-standard Database Types in a Repository
If you use a database platform that is not in the list of supported platforms, you
may be able to use one of the supported types, if the SQL variant is close enough. To do
this, you can use the raw `JobRepositoryFactoryBean` instead of the namespace shortcut and
use it to set the database type to the closest match.
[tabs]
====
Java::
+
The following example shows how to use `JobRepositoryFactoryBean` to set the database type
to the closest match in Java:
+
.Java Configuration
[source, java]
----
@Bean
public JobRepository jobRepository() throws Exception {
JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
factory.setDataSource(dataSource);
factory.setDatabaseType("db2");
factory.setTransactionManager(transactionManager);
return factory.getObject();
}
----
XML::
+
The following example shows how to use `JobRepositoryFactoryBean` to set the database type
to the closest match in XML:
+
.XML Configuration
[source, xml]
----
<bean id="jobRepository" class="org...JobRepositoryFactoryBean">
<property name="databaseType" value="db2"/>
<property name="dataSource" ref="dataSource"/>
</bean>
----
====
If the database type is not specified, the `JobRepositoryFactoryBean` tries to
auto-detect the database type from the `DataSource`.
The major differences between platforms are
mainly accounted for by the strategy for incrementing primary keys, so
it is often necessary to override the
`incrementerFactory` as well (by using one of the standard
implementations from the Spring Framework).
If even that does not work or if you are not using an RDBMS, the
only option may be to implement the various `Dao`
interfaces that the `SimpleJobRepository` depends
on and wire one up manually in the normal Spring way.

View File

@@ -0,0 +1,316 @@
[[configuringAJob]]
= Configuring a Job
There are multiple implementations of the xref:job.adoc[`Job`] interface. However,
these implementations are abstracted behind either the provided builders (for Java configuration) or the XML
namespace (for XML-based configuration). The following example shows both Java and XML configuration:
[tabs]
====
Java::
+
[source, java]
----
@Bean
public Job footballJob(JobRepository jobRepository) {
return new JobBuilder("footballJob", jobRepository)
.start(playerLoad())
.next(gameLoad())
.next(playerSummarization())
.build();
}
----
+
A `Job` (and, typically, any `Step` within it) requires a `JobRepository`. The
configuration of the `JobRepository` is handled through the xref:job/java-config.adoc[`Java Configuration`].
+
The preceding example illustrates a `Job` that consists of three `Step` instances. The job-related
builders can also contain other elements that help with parallelization (`Split`),
declarative flow control (`Decision`), and externalization of flow definitions (`Flow`).
XML::
+
There are multiple implementations of the xref:job.adoc[`Job`]
interface. However, the namespace abstracts away the differences in configuration. It has
only three required dependencies: a name, a `JobRepository`, and a list of `Step` instances.
The following example creates a `footballJob`:
+
[source, xml]
----
<job id="footballJob">
<step id="playerload" parent="s1" next="gameLoad"/>
<step id="gameLoad" parent="s2" next="playerSummarization"/>
<step id="playerSummarization" parent="s3"/>
</job>
----
+
The preceding example uses a parent bean definition to create the steps.
See the section on xref:step.adoc[step configuration]
for more options when declaring specific step details inline. The XML namespace
defaults to referencing a repository with an `id` of `jobRepository`, which
is a sensible default. However, you can explicitly override this default:
+
[source, xml]
----
<job id="footballJob" job-repository="specialRepository">
<step id="playerload" parent="s1" next="gameLoad"/>
<step id="gameLoad" parent="s2" next="playerSummarization"/>
<step id="playerSummarization" parent="s3"/>
</job>
----
+
In addition to steps, a job configuration can contain other elements
that help with parallelization (`<split>`),
declarative flow control (`<decision>`), and
externalization of flow definitions
(`<flow/>`).
====
[[restartability]]
== Restartability
One key issue when executing a batch job concerns the behavior of a `Job` when it is
restarted. The launching of a `Job` is considered to be a "`restart`" if a `JobExecution`
already exists for the particular `JobInstance`. Ideally, all jobs should be able to start
up where they left off, but there are scenarios where this is not possible.
_In this scenario, it is entirely up to the developer to ensure that a new `JobInstance` is created._
However, Spring Batch does provide some help. If a `Job` should never be
restarted but should always be run as part of a new `JobInstance`, you can set the
restartable property to `false`.
[tabs]
====
Java::
+
The following example shows how to set the `restartable` field to `false` in Java:
+
.Java Configuration
[source, java]
----
@Bean
public Job footballJob(JobRepository jobRepository) {
return new JobBuilder("footballJob", jobRepository)
.preventRestart()
...
.build();
}
----
XML::
+
The following example shows how to set the `restartable` field to `false` in XML:
+
.XML Configuration
[source, xml]
----
<job id="footballJob" restartable="false">
...
</job>
----
====
To phrase it another way, setting `restartable` to `false` means "`this
`Job` does not support being started again`". Restarting a `Job` that is not
restartable causes a `JobRestartException` to
be thrown.
The following JUnit code causes the exception to be thrown:
[source, java]
----
SimpleJob job = new SimpleJob();
job.setRestartable(false);
JobParameters jobParameters = new JobParameters();
JobExecution firstExecution = jobRepository.createJobExecution(job, jobParameters);
jobRepository.saveOrUpdate(firstExecution);
try {
jobRepository.createJobExecution(job, jobParameters);
fail();
}
catch (JobRestartException e) {
// expected
}
----
The first attempt to create a
`JobExecution` for a non-restartable
job causes no issues. However, the second
attempt throws a `JobRestartException`.
[[interceptingJobExecution]]
== Intercepting Job Execution
During the course of the execution of a
`Job`, it may be useful to be notified of various
events in its lifecycle so that custom code can be run.
`SimpleJob` allows for this by calling a
`JobExecutionListener` at the appropriate time:
[source, java]
----
public interface JobExecutionListener {
void beforeJob(JobExecution jobExecution);
void afterJob(JobExecution jobExecution);
}
----
You can add `JobExecutionListener` instances to a `SimpleJob` by setting listeners on the job.
[tabs]
====
Java::
+
The following example shows how to add a listener method to a Java job definition:
+
.Java Configuration
[source, java]
----
@Bean
public Job footballJob(JobRepository jobRepository) {
return new JobBuilder("footballJob", jobRepository)
.listener(sampleListener())
...
.build();
}
----
XML::
+
The following example shows how to add a listener element to an XML job definition:
+
.XML Configuration
[source, xml]
----
<job id="footballJob">
<step id="playerload" parent="s1" next="gameLoad"/>
<step id="gameLoad" parent="s2" next="playerSummarization"/>
<step id="playerSummarization" parent="s3"/>
<listeners>
<listener ref="sampleListener"/>
</listeners>
</job>
----
====
Note that the `afterJob` method is called regardless of the success or
failure of the `Job`. If you need to determine success or failure, you can get that information
from the `JobExecution`:
[source, java]
----
public void afterJob(JobExecution jobExecution){
if (jobExecution.getStatus() == BatchStatus.COMPLETED ) {
//job success
}
else if (jobExecution.getStatus() == BatchStatus.FAILED) {
//job failure
}
}
----
The annotations corresponding to this interface are:
* `@BeforeJob`
* `@AfterJob`
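As a minimal sketch of the annotation-based style, the following class uses both annotations instead of implementing `JobExecutionListener`. The class name, method names, and log messages are hypothetical and only serve the illustration:

```java
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.annotation.AfterJob;
import org.springframework.batch.core.annotation.BeforeJob;

// Hypothetical listener: a plain class whose methods are picked up through annotations.
public class SampleAnnotatedListener {

    @BeforeJob
    public void announceStart(JobExecution jobExecution) {
        System.out.println("Starting job " + jobExecution.getJobInstance().getJobName());
    }

    @AfterJob
    public void reportOutcome(JobExecution jobExecution) {
        // Called for both successful and failed jobs, so check the status explicitly.
        if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
            System.out.println("Job completed");
        }
        else if (jobExecution.getStatus() == BatchStatus.FAILED) {
            System.out.println("Job failed with exit status " + jobExecution.getExitStatus());
        }
    }
}
```

An instance of such a class can be passed to the `listener(...)` method of the job builder in the same way as an interface-based listener.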
[[inheritingFromAParentJob]]
[role="xmlContent"]
[[inheriting-from-a-parent-job]]
== Inheriting from a Parent Job
ifdef::backend-pdf[]
This section applies only to XML based configuration, as Java configuration provides better
reuse capabilities.
endif::backend-pdf[]
[role="xmlContent"]
If a group of Jobs share similar but not
identical configurations, it may help to define a "`parent`"
`Job` from which the concrete
`Job` instances can inherit properties. Similar to class
inheritance in Java, a "`child`" `Job` combines
its elements and attributes with the parent's.
[role="xmlContent"]
In the following example, `baseJob` is an abstract
`Job` definition that defines only a list of
listeners. The `Job` (`job1`) is a concrete
definition that inherits the list of listeners from `baseJob` and merges
it with its own list of listeners to produce a
`Job` with two listeners and one
`Step` (`step1`).
[source, xml]
----
<job id="baseJob" abstract="true">
<listeners>
<listener ref="listenerOne"/>
</listeners>
</job>
<job id="job1" parent="baseJob">
<step id="step1" parent="standaloneStep"/>
<listeners merge="true">
<listener ref="listenerTwo"/>
</listeners>
</job>
----
[role="xmlContent"]
See the section on <<inheritingFromParentStep,Inheriting from a Parent Step>>
for more detailed information.
[[jobparametersvalidator]]
== JobParametersValidator
A job declared in the XML namespace or using any subclass of
`AbstractJob` can optionally declare a validator for the job parameters at
runtime. This is useful when, for instance, you need to assert that a job
is started with all its mandatory parameters. There is a
`DefaultJobParametersValidator` that you can use to constrain combinations
of simple mandatory and optional parameters. For more complex
constraints, you can implement the interface yourself.
[tabs]
====
Java::
+
The configuration of a validator is supported through the Java builders:
+
[source, java]
----
@Bean
public Job job1(JobRepository jobRepository) {
return new JobBuilder("job1", jobRepository)
.validator(parametersValidator())
...
.build();
}
----
XML::
+
The configuration of a validator is supported through the XML namespace through a child
element of the job, as the following example shows:
+
[source, xml]
----
<job id="job1" parent="baseJob3">
<step id="step1" parent="standaloneStep"/>
<validator ref="parametersValidator"/>
</job>
----
+
You can specify the validator as a reference (as shown earlier) or as a nested bean
definition in the `beans` namespace.
====
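The `parametersValidator` referenced in the preceding examples could, for instance, be a `DefaultJobParametersValidator` bean. The following sketch assumes one required and one optional parameter. The configuration class name and the parameter names `schedule.date` and `vendor.id` are illustrative assumptions:

```java
import org.springframework.batch.core.JobParametersValidator;
import org.springframework.batch.core.job.DefaultJobParametersValidator;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical configuration class holding the validator bean.
@Configuration
public class ValidatorConfiguration {

    @Bean
    public JobParametersValidator parametersValidator() {
        // Assumed parameter names; replace them with the keys your job expects.
        String[] requiredKeys = { "schedule.date" };
        String[] optionalKeys = { "vendor.id" };
        return new DefaultJobParametersValidator(requiredKeys, optionalKeys);
    }
}
```

With this validator in place, launching the job without `schedule.date` should fail validation before any step runs. For constraints that go beyond required and optional keys, implementing `JobParametersValidator` directly is likely the better fit.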

View File

@@ -0,0 +1,105 @@
[[javaConfig]]
= Java Configuration
Spring 3 brought the ability to configure applications with Java instead of XML. As of
Spring Batch 2.2.0, you can configure batch jobs by using the same Java configuration.
There are three components for the Java-based configuration: the `@EnableBatchProcessing`
annotation and two builders.
The `@EnableBatchProcessing` annotation works similarly to the other `@Enable*` annotations in the
Spring family. In this case, `@EnableBatchProcessing` provides a base configuration for
building batch jobs. Within this base configuration, instances of `StepScope` and `JobScope` are
created, in addition to a number of beans being made available to be autowired:
* `JobRepository`: a bean named `jobRepository`
* `JobLauncher`: a bean named `jobLauncher`
* `JobRegistry`: a bean named `jobRegistry`
* `JobExplorer`: a bean named `jobExplorer`
* `JobOperator`: a bean named `jobOperator`
The default implementation provides the beans mentioned in the preceding list and requires a `DataSource`
and a `PlatformTransactionManager` to be provided as beans within the context. The data source and transaction
manager are used by the `JobRepository` and `JobExplorer` instances. By default, the data source named `dataSource`
and the transaction manager named `transactionManager` will be used. You can customize any of these beans by using
the attributes of the `@EnableBatchProcessing` annotation. The following example shows how to provide a
custom data source and transaction manager:
[source, java]
----
@Configuration
@EnableBatchProcessing(dataSourceRef = "batchDataSource", transactionManagerRef = "batchTransactionManager")
public class MyJobConfiguration {
@Bean
public DataSource batchDataSource() {
return new EmbeddedDatabaseBuilder().setType(EmbeddedDatabaseType.HSQL)
.addScript("/org/springframework/batch/core/schema-hsqldb.sql")
.generateUniqueName(true).build();
}
@Bean
public JdbcTransactionManager batchTransactionManager(DataSource dataSource) {
return new JdbcTransactionManager(dataSource);
}
@Bean
public Job job(JobRepository jobRepository) {
return new JobBuilder("myJob", jobRepository)
//define job flow as needed
.build();
}
}
----
NOTE: Only one configuration class needs to have the `@EnableBatchProcessing` annotation. Once
you have a class annotated with it, you have all of the configuration described earlier.
Starting from v5.0, an alternative, programmatic way of configuring base infrastructure beans
is provided through the `DefaultBatchConfiguration` class. This class provides the same beans
provided by `@EnableBatchProcessing` and can be used as a base class to configure batch jobs.
The following snippet is a typical example of how to use it:
[source, java]
----
@Configuration
class MyJobConfiguration extends DefaultBatchConfiguration {
@Bean
public Job job(JobRepository jobRepository) {
return new JobBuilder("job", jobRepository)
// define job flow as needed
.build();
}
}
----
The data source and transaction manager will be resolved from the application context
and set on the job repository and job explorer. You can customize the configuration
of any infrastructure bean by overriding the corresponding getter. For instance, the following
example shows how to customize the character encoding:
[source, java]
----
@Configuration
class MyJobConfiguration extends DefaultBatchConfiguration {
@Bean
public Job job(JobRepository jobRepository) {
return new JobBuilder("job", jobRepository)
// define job flow as needed
.build();
}
@Override
protected Charset getCharset() {
return StandardCharsets.ISO_8859_1;
}
}
----
NOTE: `@EnableBatchProcessing` should *not* be used with `DefaultBatchConfiguration`. You should
either use the declarative way of configuring Spring Batch through `@EnableBatchProcessing`,
or use the programmatic way of extending `DefaultBatchConfiguration`, but not both ways at
the same time.

View File

@@ -0,0 +1,281 @@
[[runningAJob]]
= Running a Job
At a minimum, launching a batch job requires two things: the
`Job` to be launched and a
`JobLauncher`. Both can be contained within the same
context or different contexts. For example, if you launch jobs from the
command line, a new JVM is instantiated for each `Job`. Thus, every
job has its own `JobLauncher`. However, if
you run from within a web container that is within the scope of an
`HttpRequest`, there is usually one
`JobLauncher` (configured for asynchronous job
launching) that multiple requests invoke to launch their jobs.
[[runningJobsFromCommandLine]]
== Running Jobs from the Command Line
If you want to run your jobs from an enterprise
scheduler, the command line is the primary interface. This is because
most schedulers (with the exception of Quartz, unless using
`NativeJob`) work directly with operating system
processes, primarily kicked off with shell scripts. There are many ways
to launch a Java process besides a shell script, such as Perl, Ruby, or
even build tools, such as Ant or Maven. However, because most people
are familiar with shell scripts, this example focuses on them.
[[commandLineJobRunner]]
=== The CommandLineJobRunner
Because the script launching the job must kick off a Java
Virtual Machine, there needs to be a class with a `main` method to act
as the primary entry point. Spring Batch provides an implementation
that serves this purpose:
`CommandLineJobRunner`. Note
that this is just one way to bootstrap your application. There are
many ways to launch a Java process, and this class should in no way be
viewed as definitive. The `CommandLineJobRunner`
performs four tasks:
* Load the appropriate `ApplicationContext`.
* Parse command line arguments into `JobParameters`.
* Locate the appropriate job based on arguments.
* Use the `JobLauncher` provided in the application context to launch the job.
All of these tasks are accomplished with only the arguments passed in.
The following table describes the required arguments:
.CommandLineJobRunner arguments
|===============
|`jobPath`|The location of the XML file that is used to
create an `ApplicationContext`. This file
should contain everything needed to run the complete
`Job`.
|`jobName`|The name of the job to be run.
|===============
These arguments must be passed in, with the path first and the name second. All arguments
after these are considered to be job parameters, are turned into a `JobParameters` object,
and must be in the format of `name=value`.
[tabs]
====
Java::
+
The following example shows a date passed as a job parameter to a job defined in Java:
+
[source]
----
bash$ java CommandLineJobRunner io.spring.EndOfDayJobConfiguration endOfDay schedule.date=2007-05-05,java.time.LocalDate
----
XML::
+
The following example shows a date passed as a job parameter to a job defined in XML:
+
[source]
----
bash$ java CommandLineJobRunner endOfDayJob.xml endOfDay schedule.date=2007-05-05,java.time.LocalDate
----
====
[NOTE]
=====
By default, the `CommandLineJobRunner` uses a `DefaultJobParametersConverter` that implicitly converts
key/value pairs to identifying job parameters. However, you can explicitly specify
which job parameters are identifying and which are not by suffixing them with `true` or `false`, respectively.
In the following example, `schedule.date` is an identifying job parameter, while `vendor.id` is not:
[source]
----
bash$ java CommandLineJobRunner endOfDayJob.xml endOfDay \
schedule.date=2007-05-05,java.time.LocalDate,true \
vendor.id=123,java.lang.Long,false
----
[source]
----
bash$ java CommandLineJobRunner io.spring.EndOfDayJobConfiguration endOfDay \
schedule.date=2007-05-05,java.time.LocalDate,true \
vendor.id=123,java.lang.Long,false
----
You can override this behavior by using a custom `JobParametersConverter`.
=====
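To make the `name=value,type,identifying` notation concrete, the following stand-alone sketch splits such arguments into their parts. It is not the `DefaultJobParametersConverter` implementation: the class `JobParameterNotation`, its `Parsed` holder, and the `java.lang.String` fallback for a missing type are assumptions made for this illustration only:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a plain-Java sketch of the notation, not Spring Batch code.
class JobParameterNotation {

    // Hypothetical holder for one parsed command line argument.
    static final class Parsed {
        final String value;
        final String type;
        final boolean identifying;

        Parsed(String value, String type, boolean identifying) {
            this.value = value;
            this.type = type;
            this.identifying = identifying;
        }
    }

    static Map<String, Parsed> parse(String... args) {
        Map<String, Parsed> result = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq < 1) {
                throw new IllegalArgumentException("Expected name=value but got: " + arg);
            }
            String name = arg.substring(0, eq);
            // Split into at most three parts: value, type, identifying flag.
            String[] parts = arg.substring(eq + 1).split(",", 3);
            String value = parts[0];
            // Assumption: a missing type is treated as java.lang.String.
            String type = parts.length > 1 ? parts[1] : "java.lang.String";
            // A parameter is identifying unless the flag is present and not "true".
            boolean identifying = parts.length < 3 || Boolean.parseBoolean(parts[2]);
            result.put(name, new Parsed(value, type, identifying));
        }
        return result;
    }
}
```

Calling `JobParameterNotation.parse("vendor.id=123,java.lang.Long,false")` yields an entry named `vendor.id` with the value `123`, the type `java.lang.Long`, and the identifying flag set to `false`. The sketch does not handle values that themselves contain commas.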
[tabs]
====
Java::
+
In most cases, you would want to use a manifest to declare your `main` class in a jar. However,
for simplicity, the class was used directly. This example uses the `EndOfDay`
example from the xref:domain.adoc[The Domain Language of Batch]. The first
argument is `io.spring.EndOfDayJobConfiguration`, which is the fully qualified class name
of the configuration class that contains the `Job`. The second argument, `endOfDay`, represents
the job name. The final argument, `schedule.date=2007-05-05,java.time.LocalDate`, is converted
into a `JobParameter` object of type `java.time.LocalDate`.
+
The following example shows a sample configuration for `endOfDay` in Java:
+
[source, java]
----
@Configuration
@EnableBatchProcessing
public class EndOfDayJobConfiguration {
@Bean
public Job endOfDay(JobRepository jobRepository, Step step1) {
return new JobBuilder("endOfDay", jobRepository)
.start(step1)
.build();
}
@Bean
public Step step1(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
return new StepBuilder("step1", jobRepository)
.tasklet((contribution, chunkContext) -> null, transactionManager)
.build();
}
}
----
XML::
+
In most cases, you would want to use a manifest to declare your `main` class in a jar. However,
for simplicity, the class was used directly. This example uses the `EndOfDay`
example from the xref:domain.adoc[The Domain Language of Batch]. The first
argument is `endOfDayJob.xml`, which is the Spring ApplicationContext that contains the
`Job`. The second argument, `endOfDay`, represents the job name. The final argument,
`schedule.date=2007-05-05,java.time.LocalDate`, is converted into a `JobParameter` object of type
`java.time.LocalDate`.
+
The following example shows a sample configuration for `endOfDay` in XML:
+
[source, xml]
----
<job id="endOfDay">
<step id="step1" parent="simpleStep" />
</job>
<!-- Launcher details removed for clarity -->
<beans:bean id="jobLauncher"
class="org.springframework.batch.core.launch.support.TaskExecutorJobLauncher" />
----
====
The preceding example is overly simplistic, since there are many more requirements to
run a batch job in Spring Batch in general, but it serves to show the two main
requirements of the `CommandLineJobRunner`: `Job` and `JobLauncher`.
[[exitCodes]]
=== Exit Codes
When launching a batch job from the command-line, an enterprise
scheduler is often used. Most schedulers are fairly dumb and work only
at the process level. This means that they only know about some
operating system process (such as a shell script that they invoke).
In this scenario, the only way to communicate back to the scheduler
about the success or failure of a job is through return codes. A
return code is a number that is returned to a scheduler by the process
to indicate the result of the run. In the simplest case, 0 is
success and 1 is failure. However, there may be more complex
scenarios, such as "`If job A returns 4, kick off job B, and, if it returns 5, kick
off job C.`" This type of behavior is configured at the scheduler level,
but it is important that a processing framework such as Spring Batch
provide a way to return a numeric representation of the exit code
for a particular batch job. In Spring Batch, this is encapsulated
within an `ExitStatus`, which is covered in more
detail in Chapter 5. For the purposes of discussing exit codes, the
only important thing to know is that an
`ExitStatus` has an exit code property that is
set by the framework (or the developer) and is returned as part of the
`JobExecution` returned from the
`JobLauncher`. The
`CommandLineJobRunner` converts this string value
to a number by using the `ExitCodeMapper`
interface:
[source, java]
----
public interface ExitCodeMapper {
public int intValue(String exitCode);
}
----
The essential contract of an
`ExitCodeMapper` is that, given a string exit
code, a number representation will be returned. The default
implementation used by the job runner is the `SimpleJvmExitCodeMapper`
that returns 0 for completion, 1 for generic errors, and 2 for any job
runner errors such as not being able to find a
`Job` in the provided context. If anything more
complex than the three values above is needed, a custom
implementation of the `ExitCodeMapper` interface
must be supplied. Because the
`CommandLineJobRunner` is the class that creates
an `ApplicationContext` and, thus, cannot be
'wired together', any values that need to be overwritten must be
autowired. This means that if an implementation of
`ExitCodeMapper` is found within the `BeanFactory`,
it is injected into the runner after the context is created. All
that needs to be done to provide your own
`ExitCodeMapper` is to declare the implementation
as a root level bean and ensure that it is part of the
`ApplicationContext` that is loaded by the
runner.
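The scheduler scenario described earlier (different return codes triggering different follow-up jobs) can be sketched with a custom mapper. To keep the example compilable on its own, the interface is re-declared locally. In an application, you would implement `org.springframework.batch.core.launch.support.ExitCodeMapper` instead. The class name, the exit code strings, and the numbers `4` and `5` are assumptions chosen for illustration:

```java
import java.util.Map;

// Local stand-in for Spring Batch's ExitCodeMapper, declared here only so that
// the sketch compiles without Spring Batch on the classpath.
interface ExitCodeMapper {
    int intValue(String exitCode);
}

// Hypothetical mapper: the exit code strings and numeric values are illustrative.
class SchedulerExitCodeMapper implements ExitCodeMapper {

    private static final int GENERIC_ERROR = 1;

    private final Map<String, Integer> codes = Map.of(
            "COMPLETED", 0,
            "FAILED", GENERIC_ERROR,
            "COMPLETED WITH SKIPS", 4,
            "NO INPUT", 5);

    @Override
    public int intValue(String exitCode) {
        // Immutable maps reject null keys, so handle null before the lookup.
        if (exitCode == null) {
            return GENERIC_ERROR;
        }
        // Unmapped exit codes fall back to the generic error value.
        return codes.getOrDefault(exitCode, GENERIC_ERROR);
    }
}
```

Declared as a root-level bean in the context that the runner loads, a mapper along these lines would let the scheduler branch on `4` or `5` while still treating any unknown exit code as a failure.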
[[runningJobsFromWebContainer]]
== Running Jobs from within a Web Container
Historically, offline processing (such as batch jobs) has been
launched from the command-line, as described earlier. However, there are
many cases where launching from an `HttpRequest` is
a better option. Such use cases include reporting, ad-hoc job
running, and web application support. Because a batch job (by definition)
is long running, the most important concern is to launch the
job asynchronously:
.Asynchronous Job Launcher Sequence From Web Container
image::launch-from-request.png[Async Job Launcher Sequence from web container, scaledwidth="60%"]
The controller in this case is a Spring MVC controller. See the
Spring Framework Reference Guide for more about https://docs.spring.io/spring/docs/current/spring-framework-reference/web.html#mvc[Spring MVC].
The controller launches a `Job` by using a
`JobLauncher` that has been configured to launch
xref:job/running.adoc#runningJobsFromWebContainer[asynchronously], which
immediately returns a `JobExecution`. The
`Job` is likely still running. However, this
nonblocking behavior lets the controller return immediately, which
is required when handling an `HttpRequest`. The following listing
shows an example:
[source, java]
----
@Controller
public class JobLauncherController {
@Autowired
JobLauncher jobLauncher;
@Autowired
Job job;
@RequestMapping("/jobLauncher.html")
public void handle() throws Exception{
jobLauncher.run(job, new JobParameters());
}
}
----

View File

@@ -1,17 +1,14 @@
:toc: left
:toclevels: 4
[[monitoring-and-metrics]]
== Monitoring and metrics
= Monitoring and metrics
include::attributes.adoc[]
Since version 4.2, Spring Batch provides support for batch monitoring and metrics
based on link:$$https://micrometer.io/$$[Micrometer]. This section describes
which metrics are provided out-of-the-box and how to contribute custom metrics.
[[built-in-metrics]]
=== Built-in metrics
== Built-in metrics
Metrics collection does not require any specific configuration. All metrics provided
by the framework are registered in
@@ -32,7 +29,7 @@ under the `spring.batch` prefix. The following table explains all the metrics in
NOTE: The `status` tag can be either `SUCCESS` or `FAILURE`.
[[custom-metrics]]
=== Custom metrics
== Custom metrics
If you want to use your own metrics in your custom components, we recommend using
Micrometer APIs directly. The following is an example of how to time a `Tasklet`:
@@ -70,7 +67,7 @@ public class MyTimedTasklet implements Tasklet {
----
[[disabling-metrics]]
=== Disabling Metrics
== Disabling Metrics
Metrics collection is a concern similar to logging. Disabling logs is typically
done by configuring the logging library, and this is no different for metrics.
@@ -85,14 +82,4 @@ Metrics.globalRegistry.config().meterFilter(MeterFilter.denyNameStartsWith("spri
----
See Micrometer's link:$$http://micrometer.io/docs/concepts#_meter_filters$$[reference documentation]
for more details.
[[tracing]]
== Tracing
As of version 5, Spring Batch provides tracing through Micrometer's `Observation` API. By default, tracing is enabled
when using `@EnableBatchProcessing`. Spring Batch will create a trace for each job execution and a span for each
step execution.
If you do not use `EnableBatchProcessing`, you need to register a `BatchObservabilityBeanPostProcessor` in your
application context, which will automatically setup Micrometer's observability in your jobs and steps beans.
for more details.


@@ -1,15 +1,8 @@
:toc: left
:toclevels: 4
[[itemProcessor]]
== Item processing
= Item processing
include::attributes.adoc[]
ifndef::onlyonetoggle[]
include::toggle.adoc[]
endif::onlyonetoggle[]
The <<readersAndWriters.adoc#readersAndWriters,ItemReader and ItemWriter interfaces>> are both very useful for their specific
The xref:readersAndWriters.adoc[ItemReader and ItemWriter interfaces] are both very useful for their specific
tasks, but what if you want to insert business logic before writing? One option for both
reading and writing is to use the composite pattern: Create an `ItemWriter` that contains
another `ItemWriter` or an `ItemReader` that contains another `ItemReader`. The following
@@ -90,21 +83,13 @@ objects, throwing an exception if any other type is provided. Similarly, the
`FooProcessor` throws an exception if anything but a `Foo` is provided. The
`FooProcessor` can then be injected into a `Step`, as the following example shows:
.XML Configuration
[source, xml, role="xmlContent"]
----
<job id="ioSampleJob">
<step name="step1">
<tasklet>
<chunk reader="fooReader" processor="fooProcessor" writer="barWriter"
commit-interval="2"/>
</tasklet>
</step>
</job>
----
[tabs]
====
Java::
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public Job ioSampleJob(JobRepository jobRepository) {
@@ -124,11 +109,28 @@ public Step step1(JobRepository jobRepository, PlatformTransactionManager transa
}
----
XML::
+
.XML Configuration
[source, xml]
----
<job id="ioSampleJob">
<step name="step1">
<tasklet>
<chunk reader="fooReader" processor="fooProcessor" writer="barWriter"
commit-interval="2"/>
</tasklet>
</step>
</job>
----
====
A difference between `ItemProcessor` and `ItemReader` or `ItemWriter` is that an `ItemProcessor`
is optional for a `Step`.
[[chainingItemProcessors]]
=== Chaining ItemProcessors
== Chaining ItemProcessors
Performing a single transformation is useful in many scenarios, but what if you want to
"`chain`" together multiple `ItemProcessor` implementations? You can do so by using
@@ -185,31 +187,13 @@ compositeProcessor.setDelegates(itemProcessors);
Just as with the previous example, you can configure the composite processor into the
`Step`:
.XML Configuration
[source, xml, role="xmlContent"]
----
<job id="ioSampleJob">
<step name="step1">
<tasklet>
<chunk reader="fooReader" processor="compositeItemProcessor" writer="foobarWriter"
commit-interval="2"/>
</tasklet>
</step>
</job>
<bean id="compositeItemProcessor"
class="org.springframework.batch.item.support.CompositeItemProcessor">
<property name="delegates">
<list>
<bean class="..FooProcessor" />
<bean class="..BarProcessor" />
</list>
</property>
</bean>
----
[tabs]
====
Java::
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public Job ioSampleJob(JobRepository jobRepository) {
@@ -242,8 +226,37 @@ public CompositeItemProcessor compositeProcessor() {
}
----
XML::
+
.XML Configuration
[source, xml]
----
<job id="ioSampleJob">
<step name="step1">
<tasklet>
<chunk reader="fooReader" processor="compositeItemProcessor" writer="foobarWriter"
commit-interval="2"/>
</tasklet>
</step>
</job>
<bean id="compositeItemProcessor"
class="org.springframework.batch.item.support.CompositeItemProcessor">
<property name="delegates">
<list>
<bean class="..FooProcessor" />
<bean class="..BarProcessor" />
</list>
</property>
</bean>
----
====
[[filteringRecords]]
=== Filtering Records
== Filtering Records
One typical use for an item processor is to filter out records before they are passed to
the `ItemWriter`. Filtering is an action distinct from skipping. Skipping indicates that
@@ -263,9 +276,9 @@ the `ItemWriter`. An exception thrown from the `ItemProcessor` results in a
skip.
[[validatingInput]]
=== Validating Input
== Validating Input
The <<readersAndWriters.adoc#readersAndWriters,ItemReaders and ItemWriters>> chapter discusses multiple approaches to parsing input.
The xref:readersAndWriters.adoc[ItemReaders and ItemWriters] chapter discusses multiple approaches to parsing input.
Each major implementation throws an exception if it is not "`well formed.`" The
`FixedLengthTokenizer` throws an exception if a range of data is missing. Similarly,
attempting to access an index in a `RowMapper` or `FieldSetMapper` that does not exist or
@@ -291,22 +304,13 @@ The contract is that the `validate` method throws an exception if the object is
and returns normally if it is valid. Spring Batch provides a
`ValidatingItemProcessor`, as the following bean definition shows:
.XML Configuration
[source, xml, role="xmlContent"]
----
<bean class="org.springframework.batch.item.validator.ValidatingItemProcessor">
<property name="validator" ref="validator" />
</bean>
<bean id="validator" class="org.springframework.batch.item.validator.SpringValidator">
<property name="validator">
<bean class="org.springframework.batch.sample.domain.trade.internal.validator.TradeValidator"/>
</property>
</bean>
----
[tabs]
====
Java::
+
.Java Configuration
[source, java, role="javaContent"]
[source, java]
----
@Bean
public ValidatingItemProcessor itemProcessor() {
@@ -327,6 +331,25 @@ public SpringValidator validator() {
}
----
XML::
+
.XML Configuration
[source, xml]
----
<bean class="org.springframework.batch.item.validator.ValidatingItemProcessor">
<property name="validator" ref="validator" />
</bean>
<bean id="validator" class="org.springframework.batch.item.validator.SpringValidator">
<property name="validator">
<bean class="org.springframework.batch.sample.domain.trade.internal.validator.TradeValidator"/>
</property>
</bean>
----
====
You can also use the `BeanValidatingItemProcessor` to validate items annotated with
the Bean Validation API (JSR-303) annotations. For example, consider the following type `Person`:
@@ -367,7 +390,7 @@ public BeanValidatingItemProcessor<Person> beanValidatingItemProcessor() throws
----
[[faultTolerant]]
=== Fault Tolerance
== Fault Tolerance
When a chunk is rolled back, items that have been cached during reading may be
reprocessed. If a step is configured to be fault-tolerant (typically by using skip or


@@ -0,0 +1,188 @@
[[customReadersWriters]]
= Creating Custom ItemReaders and ItemWriters
So far, this chapter has discussed the basic contracts of reading and writing in Spring
Batch and some common implementations for doing so. However, these are all fairly
generic, and there are many potential scenarios that may not be covered by out-of-the-box
implementations. This section shows, by using a simple example, how to create a custom
`ItemReader` and `ItemWriter` implementation and implement their contracts correctly. The
`ItemReader` also implements `ItemStream`, in order to illustrate how to make a reader or
writer restartable.
[[customReader]]
== Custom `ItemReader` Example
For the purpose of this example, we create a simple `ItemReader` implementation that
reads from a provided list. We start by implementing the most basic contract of
`ItemReader`, the `read` method, as shown in the following code:
[source, java]
----
public class CustomItemReader<T> implements ItemReader<T> {
List<T> items;
public CustomItemReader(List<T> items) {
this.items = items;
}
public T read() throws Exception, UnexpectedInputException,
NonTransientResourceException, ParseException {
if (!items.isEmpty()) {
return items.remove(0);
}
return null;
}
}
----
The preceding class takes a list of items and returns them one at a time, removing each
from the list. When the list is empty, it returns `null`, thus satisfying the most basic
requirements of an `ItemReader`, as illustrated in the following test code:
[source, java]
----
List<String> items = new ArrayList<>();
items.add("1");
items.add("2");
items.add("3");
ItemReader<String> itemReader = new CustomItemReader<>(items);
assertEquals("1", itemReader.read());
assertEquals("2", itemReader.read());
assertEquals("3", itemReader.read());
assertNull(itemReader.read());
----
[[restartableReader]]
=== Making the `ItemReader` Restartable
The final challenge is to make the `ItemReader` restartable. Currently, if processing is
interrupted and begins again, the `ItemReader` must start at the beginning. This is
actually valid in many scenarios, but it is sometimes preferable that a batch job
restarts where it left off. The key discriminant is often whether the reader is stateful
or stateless. A stateless reader does not need to worry about restartability, but a
stateful one has to try to reconstitute its last known state on restart. For this reason,
we recommend that you keep custom readers stateless if possible, so you need not worry
about restartability.
If you do need to store state, then the `ItemStream` interface should be used:
[source, java]
----
public class CustomItemReader<T> implements ItemReader<T>, ItemStream {

    private static final String CURRENT_INDEX = "current.index";

    List<T> items;
    int currentIndex = 0;

    public CustomItemReader(List<T> items) {
        this.items = items;
    }

    public T read() throws Exception, UnexpectedInputException,
            ParseException, NonTransientResourceException {
        if (currentIndex < items.size()) {
            return items.get(currentIndex++);
        }
        return null;
    }

    // Restore the saved position, if any, when the stream is opened.
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        if (executionContext.containsKey(CURRENT_INDEX)) {
            currentIndex = (int) executionContext.getLong(CURRENT_INDEX);
        }
        else {
            currentIndex = 0;
        }
    }

    // Save the current position so that a restart can resume from it.
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        executionContext.putLong(CURRENT_INDEX, currentIndex);
    }

    public void close() throws ItemStreamException {}
}
----
On each call to the `ItemStream` `update` method, the current index of the `ItemReader`
is stored in the provided `ExecutionContext` under the key `current.index`. When the
`ItemStream` `open` method is called, the `ExecutionContext` is checked to see if it
contains an entry with that key. If the key is found, the current index is moved to
that location. This is a fairly trivial example, but it still meets the general contract:
[source, java]
----
ExecutionContext executionContext = new ExecutionContext();
((ItemStream)itemReader).open(executionContext);
assertEquals("1", itemReader.read());
((ItemStream)itemReader).update(executionContext);
List<String> items = new ArrayList<>();
items.add("1");
items.add("2");
items.add("3");
itemReader = new CustomItemReader<>(items);
((ItemStream)itemReader).open(executionContext);
assertEquals("2", itemReader.read());
----
Most `ItemReaders` have much more sophisticated restart logic. The
`JdbcCursorItemReader`, for example, stores the row ID of the last processed row in the
cursor.
It is also worth noting that the key used within the `ExecutionContext` should not be
trivial. That is because the same `ExecutionContext` is used for all `ItemStreams` within
a `Step`. In most cases, simply prepending the key with the class name should be enough
to guarantee uniqueness. However, in the rare cases where two of the same type of
`ItemStream` are used in the same step (which can happen if two files are needed for
output), a more specific name is needed. For this reason, many of the Spring Batch
`ItemReader` and `ItemWriter` implementations have a `setName()` method that lets this
key name be overridden.
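The effect of prefixing the key can be seen in a plain-Java sketch. This is an illustration only: a `Map` stands in for the `ExecutionContext`, and `NamedCounter` (with its `name` argument) is a hypothetical class, not part of Spring Batch:

[source, java]
----
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an ItemStream that saves a single counter.
class NamedCounter {

    private final String key;
    private long count;

    NamedCounter(String name) {
        // Prefix the key with a caller-supplied name so that two instances
        // of the same class do not overwrite each other's state.
        this.key = name + ".current.index";
    }

    void increment() {
        count++;
    }

    // Analogous to ItemStream#update: save state under this instance's key.
    void update(Map<String, Object> context) {
        context.put(key, count);
    }

    // Analogous to ItemStream#open: restore state if the key is present.
    void open(Map<String, Object> context) {
        Object saved = context.get(key);
        count = (saved == null) ? 0L : (Long) saved;
    }

    long getCount() {
        return count;
    }
}

class UniqueKeySketch {

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        NamedCounter first = new NamedCounter("firstFileWriter");
        NamedCounter second = new NamedCounter("secondFileWriter");
        first.increment();
        second.increment();
        second.increment();
        first.update(context);
        second.update(context);
        // Each instance has its own entry in the shared context.
        System.out.println(context);
    }
}
----

Had both instances used the same bare key, the second call to `update` would have replaced the first instance's saved value.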
[[customWriter]]
== Custom `ItemWriter` Example
Implementing a custom `ItemWriter` is similar in many ways to the `ItemReader` example
above but differs enough to warrant its own example. Adding restartability, however, is
essentially the same, so it is not covered in this example. As with the
`ItemReader` example, a `List` is used in order to keep the example as simple as
possible:
[source, java]
----
public class CustomItemWriter<T> implements ItemWriter<T> {

    List<T> output = TransactionAwareProxyFactory.createTransactionalList();

    public void write(Chunk<? extends T> items) throws Exception {
        // Chunk is an Iterable, not a Collection, so copy its underlying list.
        output.addAll(items.getItems());
    }

    public List<T> getOutput() {
        return output;
    }
}
----
[[restartableWriter]]
=== Making the `ItemWriter` Restartable
To make the `ItemWriter` restartable, we would follow the same process as for the
`ItemReader`, adding and implementing the `ItemStream` interface to synchronize the
execution context. In the example, we might have to count the number of items processed
and add that as a footer record. If we needed to do that, we could implement
`ItemStream` in our `ItemWriter` so that the counter was reconstituted from the execution
context if the stream was re-opened.
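The footer-counter idea can be sketched in plain Java. This is a sketch under stated assumptions rather than a Spring Batch implementation: a `Map` stands in for the `ExecutionContext`, the `open`, `write`, and `update` methods only mirror the shape of the `ItemStream` and `ItemWriter` contracts, and `CountingWriterSketch` and its key name are hypothetical:

[source, java]
----
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CountingWriterSketch {

    // Key prefixed with the class name to keep it distinct in a shared context.
    private static final String WRITTEN_COUNT = "CountingWriterSketch.written.count";

    private final List<String> output = new ArrayList<>();
    private long writtenCount;

    // Restore the counter from a previous run, if one was saved.
    void open(Map<String, Object> context) {
        Object saved = context.get(WRITTEN_COUNT);
        writtenCount = (saved == null) ? 0L : (Long) saved;
    }

    void write(List<String> items) {
        output.addAll(items);
        writtenCount += items.size();
    }

    // Save the counter so that a restart can pick it up.
    void update(Map<String, Object> context) {
        context.put(WRITTEN_COUNT, writtenCount);
    }

    // The footer reflects items written across all runs, not only this one.
    String footer() {
        return "TOTAL=" + writtenCount;
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();

        CountingWriterSketch firstRun = new CountingWriterSketch();
        firstRun.open(context);
        firstRun.write(List.of("a", "b"));
        firstRun.update(context);

        // Simulate a restart with a new writer instance and the saved context.
        CountingWriterSketch restarted = new CountingWriterSketch();
        restarted.open(context);
        restarted.write(List.of("c"));
        System.out.println(restarted.footer()); // prints TOTAL=3
    }
}
----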
In many realistic cases, a custom `ItemWriter` either delegates to another writer that is
itself restartable (for example, when writing to a file) or writes to a transactional
resource, in which case it is stateless and does not need to be restartable.
If your writer is stateful, implement `ItemStream` as
well as `ItemWriter`. Remember also that the client of the writer needs to be aware of
the `ItemStream`, so you may need to register it as a stream in the configuration.


@@ -0,0 +1,747 @@
[[database]]
= Database
As in most enterprise application styles, a database is the central storage mechanism for
batch. However, batch differs from other application styles in the sheer size of the
datasets with which the system must work. If a SQL statement returns 1 million rows, the
result set probably holds all returned results in memory until all rows have been read.
Spring Batch provides two types of solutions for this problem:
* xref:readers-and-writers/database.adoc#cursorBasedItemReaders[Cursor-based `ItemReader` Implementations]
* xref:readers-and-writers/database.adoc#pagingItemReaders[Paging `ItemReader` Implementations]
[[cursorBasedItemReaders]]
== Cursor-based `ItemReader` Implementations
Using a database cursor is generally the default approach of most batch developers,
because it is the database's solution to the problem of 'streaming' relational data. The
Java `ResultSet` class is essentially an object oriented mechanism for manipulating a
cursor. A `ResultSet` maintains a cursor to the current row of data. Calling `next` on a
`ResultSet` moves this cursor to the next row. The Spring Batch cursor-based `ItemReader`
implementation opens a cursor on initialization and moves the cursor forward one row for
every call to `read`, returning a mapped object that can be used for processing. The
`close` method is then called to ensure all resources are freed up. Spring's core
`JdbcTemplate` avoids the need for an explicit `close` by using the callback pattern: it
maps every row in the `ResultSet` and closes it before returning control to the caller.
In batch, however, the cursor must stay open until the step is complete. The following image shows a
generic diagram of how a cursor-based `ItemReader` works. Note that, while the example
uses SQL (because SQL is so widely known), any technology could implement the basic
approach.
.Cursor Example
image::cursorExample.png[Cursor Example, scaledwidth="60%"]
This example illustrates the basic pattern. Given a 'FOO' table, which has three columns:
`ID`, `NAME`, and `BAR`, select all rows with an ID greater than 1 but less than 7. This
puts the beginning of the cursor (row 1) on ID 2. The result of this row should be a
completely mapped `Foo` object. Calling `read()` again moves the cursor to the next row,
which is the `Foo` with an ID of 3. The results of these reads are written out after each
`read`, allowing the objects to be garbage collected (assuming no instance variables are
maintaining references to them).
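The same walk-through can be mimicked in plain Java. The following is only a sketch: an `Iterator` stands in for the database cursor, a stream filter stands in for the SQL `WHERE` clause, and `Foo` and `CursorReaderSketch` are hypothetical names rather than Spring Batch classes:

[source, java]
----
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical mapped object for one row of the FOO table.
final class Foo {

    final int id;
    final String name;
    final String bar;

    Foo(int id, String name, String bar) {
        this.id = id;
        this.name = name;
        this.bar = bar;
    }
}

class CursorReaderSketch {

    private final Iterator<Foo> cursor;

    CursorReaderSketch(List<Foo> table) {
        // Stands in for opening a cursor on: ID > 1 AND ID < 7.
        this.cursor = table.stream()
                .filter(foo -> foo.id > 1 && foo.id < 7)
                .iterator();
    }

    // Move the cursor forward one row per call; null signals the end of the data.
    Foo read() {
        return cursor.hasNext() ? cursor.next() : null;
    }

    public static void main(String[] args) {
        List<Foo> table = new ArrayList<>();
        for (int id = 1; id <= 8; id++) {
            table.add(new Foo(id, "name" + id, "bar" + id));
        }
        CursorReaderSketch reader = new CursorReaderSketch(table);
        System.out.println(reader.read().id); // prints 2
        System.out.println(reader.read().id); // prints 3
    }
}
----

Only one `Foo` is handed out per call, so a caller that writes each item out and drops its reference never holds the whole result in memory.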
[[JdbcCursorItemReader]]
=== `JdbcCursorItemReader`
`JdbcCursorItemReader` is the JDBC implementation of the cursor-based technique. It works
directly with a `ResultSet` and requires an SQL statement to run against a connection
obtained from a `DataSource`. The following database schema is used as an example:
[source, sql]
----
CREATE TABLE CUSTOMER (
ID BIGINT IDENTITY PRIMARY KEY,
NAME VARCHAR(45),
CREDIT FLOAT
);
----
Many people prefer to use a domain object for each row, so the following example uses an
implementation of the `RowMapper` interface to map a `CustomerCredit` object:
[source, java]
----
public class CustomerCreditRowMapper implements RowMapper<CustomerCredit> {
public static final String ID_COLUMN = "id";
public static final String NAME_COLUMN = "name";
public static final String CREDIT_COLUMN = "credit";
public CustomerCredit mapRow(ResultSet rs, int rowNum) throws SQLException {
CustomerCredit customerCredit = new CustomerCredit();
customerCredit.setId(rs.getInt(ID_COLUMN));
customerCredit.setName(rs.getString(NAME_COLUMN));
customerCredit.setCredit(rs.getBigDecimal(CREDIT_COLUMN));
return customerCredit;
}
}
----
Because `JdbcCursorItemReader` shares key interfaces with `JdbcTemplate`, it is useful to
see an example of how to read in this data with `JdbcTemplate`, in order to contrast it
with the `ItemReader`. For the purposes of this example, assume there are 1,000 rows in
the `CUSTOMER` table. The first example uses `JdbcTemplate`:
[source, java]
----
// For simplicity's sake, assume a dataSource has already been obtained
JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
List<CustomerCredit> customerCredits = jdbcTemplate.query("SELECT ID, NAME, CREDIT from CUSTOMER",
        new CustomerCreditRowMapper());
----
After running the preceding code snippet, the `customerCredits` list contains 1,000
`CustomerCredit` objects. In the query method, a connection is obtained from the
`DataSource`, the provided SQL is run against it, and the `mapRow` method is called for
each row in the `ResultSet`. Contrast this with the approach of the
`JdbcCursorItemReader`, shown in the following example:
[source, java]
----
JdbcCursorItemReader<CustomerCredit> itemReader = new JdbcCursorItemReader<>();
itemReader.setDataSource(dataSource);
itemReader.setSql("SELECT ID, NAME, CREDIT from CUSTOMER");
itemReader.setRowMapper(new CustomerCreditRowMapper());
int counter = 0;
ExecutionContext executionContext = new ExecutionContext();
itemReader.open(executionContext);
// Count only the items actually returned; read() returns null at the end of the data.
while (itemReader.read() != null) {
    counter++;
}
itemReader.close();
After running the preceding code snippet, the counter equals 1,000. If the code above had
put the returned `customerCredit` into a list, the result would have been exactly the
same as with the `JdbcTemplate` example. However, the big advantage of the `ItemReader`
is that it allows items to be 'streamed'. The `read` method can be called once, the item
can be written out by an `ItemWriter`, and then the next item can be obtained with
`read`. This allows item reading and writing to be done in 'chunks' and committed
periodically, which is the essence of high performance batch processing. Furthermore, it
is easily configured for injection into a Spring Batch `Step`.
[tabs]
====
Java::
+
The following example shows how to inject an `ItemReader` into a `Step` in Java:
+
.Java Configuration
[source, java]
----
@Bean
public JdbcCursorItemReader<CustomerCredit> itemReader() {
return new JdbcCursorItemReaderBuilder<CustomerCredit>()
.dataSource(this.dataSource)
.name("creditReader")
.sql("select ID, NAME, CREDIT from CUSTOMER")
.rowMapper(new CustomerCreditRowMapper())
.build();
}
----
XML::
+
The following example shows how to inject an `ItemReader` into a `Step` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="itemReader" class="org.spr...JdbcCursorItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="sql" value="select ID, NAME, CREDIT from CUSTOMER"/>
<property name="rowMapper">
<bean class="org.springframework.batch.sample.domain.CustomerCreditRowMapper"/>
</property>
</bean>
----
====
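The chunked read-and-write cycle that streaming makes possible can be sketched in plain Java. This is a minimal illustration, not how a Spring Batch `Step` is implemented: `ChunkLoopSketch` and `process` are hypothetical names, an `Iterator` stands in for the `ItemReader`, and appending a chunk to a list stands in for writing it and committing:

[source, java]
----
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ChunkLoopSketch {

    // Reads items one at a time, groups them into chunks of chunkSize, and
    // hands each full chunk to the sink. Returns the number of "commits".
    static <T> int process(Iterator<T> source, int chunkSize, List<List<T>> sink) {
        int commits = 0;
        List<T> chunk = new ArrayList<>();
        while (source.hasNext()) {
            chunk.add(source.next());
            if (chunk.size() == chunkSize) {
                sink.add(chunk);          // "write" the chunk
                commits++;                // commit point
                chunk = new ArrayList<>();
            }
        }
        // Write whatever is left over as a final, smaller chunk.
        if (!chunk.isEmpty()) {
            sink.add(chunk);
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add(i);
        }
        List<List<Integer>> written = new ArrayList<>();
        int commits = process(items.iterator(), 4, written);
        System.out.println(commits); // prints 3
    }
}
----

At no point does the loop hold more than one chunk of items, which is what keeps memory use bounded regardless of how many rows the query returns.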
[[JdbcCursorItemReaderProperties]]
==== Additional Properties
Because there are so many varying options for opening a cursor in Java, there are many
properties on the `JdbcCursorItemReader` that can be set, as described in the following
table:
.JdbcCursorItemReader Properties
|===============
|ignoreWarnings|Determines whether or not SQLWarnings are logged or cause an exception.
The default is `true` (meaning that warnings are logged).
|fetchSize|Gives the JDBC driver a hint as to the number of rows that should be fetched
from the database when more rows are needed by the `ResultSet` object used by the
`ItemReader`. By default, no hint is given.
|maxRows|Sets the limit for the maximum number of rows the underlying `ResultSet` can
hold at any one time.
|queryTimeout|Sets the number of seconds the driver waits for a `Statement` object to
run. If the limit is exceeded, a `DataAccessException` is thrown. (Consult your driver
vendor documentation for details).
|verifyCursorPosition|Because the same `ResultSet` held by the `ItemReader` is passed to
the `RowMapper`, it is possible for users to call `ResultSet.next()` themselves, which
could cause issues with the reader's internal count. Setting this value to `true` causes
an exception to be thrown if the cursor position is not the same after the `RowMapper`
call as it was before.
|saveState|Indicates whether or not the reader's state should be saved in the
`ExecutionContext` provided by `ItemStream#update(ExecutionContext)`. The default is
`true`.
|driverSupportsAbsolute|Indicates whether the JDBC driver supports
setting the absolute row on a `ResultSet`. It is recommended that this is set to `true`
for JDBC drivers that support `ResultSet.absolute()`, as it may improve performance,
especially if a step fails while working with a large data set. Defaults to `false`.
|useSharedExtendedConnection|Indicates whether the connection
used for the cursor should be used by all other processing, thus sharing the same
transaction. If this is set to `false`, then the cursor is opened with its own connection
and does not participate in any transactions started for the rest of the step processing.
If you set this flag to `true` then you must wrap the DataSource in an
`ExtendedConnectionDataSourceProxy` to prevent the connection from being closed and
released after each commit. When you set this option to `true`, the statement used to
open the cursor is created with both 'READ_ONLY' and 'HOLD_CURSORS_OVER_COMMIT' options.
This allows holding the cursor open over transaction start and commits performed in the
step processing. To use this feature, you need a database that supports this and a JDBC
driver supporting JDBC 3.0 or later. Defaults to `false`.
|===============
[[HibernateCursorItemReader]]
=== `HibernateCursorItemReader`
Just as normal Spring users make important decisions about whether or not to use ORM
solutions, which affect whether or not they use a `JdbcTemplate` or a
`HibernateTemplate`, Spring Batch users have the same options.
`HibernateCursorItemReader` is the Hibernate implementation of the cursor technique.
Hibernate's usage in batch has been fairly controversial. This has largely been because
Hibernate was originally developed to support online application styles. However, that
does not mean it cannot be used for batch processing. The easiest approach for solving
this problem is to use a `StatelessSession` rather than a standard session. This removes
all of the caching and dirty checking Hibernate employs and that can cause issues in a
batch scenario. For more information on the differences between stateless and normal
Hibernate sessions, refer to the documentation of your specific Hibernate release. The
`HibernateCursorItemReader` lets you declare an HQL statement and pass in a
`SessionFactory`, which will pass back one item per call to read in the same basic
fashion as the `JdbcCursorItemReader`. The following example configuration uses the same
'customer credit' example as the JDBC reader:
[source, java]
----
HibernateCursorItemReader<CustomerCredit> itemReader = new HibernateCursorItemReader<>();
itemReader.setQueryString("from CustomerCredit");
// For simplicity's sake, assume sessionFactory has already been obtained.
itemReader.setSessionFactory(sessionFactory);
itemReader.setUseStatelessSession(true);
int counter = 0;
ExecutionContext executionContext = new ExecutionContext();
itemReader.open(executionContext);
// Count only the items actually returned; read() returns null at the end of the data.
while (itemReader.read() != null) {
    counter++;
}
itemReader.close();
This configured `ItemReader` returns `CustomerCredit` objects in exactly the same manner
as described for the `JdbcCursorItemReader`, assuming Hibernate mapping files have been
created correctly for the `CUSTOMER` table. The `useStatelessSession` property defaults
to `true` but has been set here to draw attention to the ability to switch it on or off.
It is also worth noting that the fetch size of the underlying cursor can be set with the
`fetchSize` property. As with `JdbcCursorItemReader`, configuration is
straightforward.
[tabs]
====
Java::
+
The following example shows how to inject a Hibernate `ItemReader` in Java:
+
.Java Configuration
[source, java]
----
@Bean
public HibernateCursorItemReader<CustomerCredit> itemReader(SessionFactory sessionFactory) {
return new HibernateCursorItemReaderBuilder<CustomerCredit>()
.name("creditReader")
.sessionFactory(sessionFactory)
.queryString("from CustomerCredit")
.build();
}
----
XML::
+
The following example shows how to inject a Hibernate `ItemReader` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="itemReader"
class="org.springframework.batch.item.database.HibernateCursorItemReader">
<property name="sessionFactory" ref="sessionFactory" />
<property name="queryString" value="from CustomerCredit" />
</bean>
----
====
[[StoredProcedureItemReader]]
=== `StoredProcedureItemReader`
Sometimes it is necessary to obtain the cursor data by using a stored procedure. The
`StoredProcedureItemReader` works like the `JdbcCursorItemReader`, except that, instead
of running a query to obtain a cursor, it runs a stored procedure that returns a cursor.
The stored procedure can return the cursor in three different ways:
* As a returned `ResultSet` (used by SQL Server, Sybase, DB2, Derby, and MySQL).
* As a ref-cursor returned as an out parameter (used by Oracle and PostgreSQL).
* As the return value of a stored function call.
[tabs]
====
Java::
+
The following Java example configuration uses the same 'customer credit' example as
earlier examples:
+
.Java Configuration
[source, java]
----
@Bean
public StoredProcedureItemReader reader(DataSource dataSource) {
StoredProcedureItemReader reader = new StoredProcedureItemReader();
reader.setDataSource(dataSource);
reader.setProcedureName("sp_customer_credit");
reader.setRowMapper(new CustomerCreditRowMapper());
return reader;
}
----
//TODO: Fix the above config to use a builder once we have one for it.
XML::
+
The following XML example configuration uses the same 'customer credit' example as earlier
examples:
+
.XML Configuration
[source, xml]
----
<bean id="reader" class="o.s.batch.item.database.StoredProcedureItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="procedureName" value="sp_customer_credit"/>
<property name="rowMapper">
<bean class="org.springframework.batch.sample.domain.CustomerCreditRowMapper"/>
</property>
</bean>
----
====
The preceding example relies on the stored procedure to provide a `ResultSet` as a
returned result (option 1 from earlier).
If the stored procedure returned a `ref-cursor` (option 2), then we would need to provide
the position of the out parameter that is the returned `ref-cursor`.
[tabs]
====
Java::
+
The following example shows how to work with the first parameter being a ref-cursor in
Java:
+
.Java Configuration
[source, java]
----
@Bean
public StoredProcedureItemReader reader(DataSource dataSource) {
StoredProcedureItemReader reader = new StoredProcedureItemReader();
reader.setDataSource(dataSource);
reader.setProcedureName("sp_customer_credit");
reader.setRowMapper(new CustomerCreditRowMapper());
reader.setRefCursorPosition(1);
return reader;
}
----
XML::
+
The following example shows how to work with the first parameter being a ref-cursor in
XML:
+
.XML Configuration
[source, xml]
----
<bean id="reader" class="o.s.batch.item.database.StoredProcedureItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="procedureName" value="sp_customer_credit"/>
<property name="refCursorPosition" value="1"/>
<property name="rowMapper">
<bean class="org.springframework.batch.sample.domain.CustomerCreditRowMapper"/>
</property>
</bean>
----
====
If the cursor was returned from a stored function (option 3), we would need to set the
`function` property to `true`. It defaults to `false`.
[tabs]
====
Java::
+
The following example shows how to set the `function` property to `true` in Java:
+
.Java Configuration
[source, java]
----
@Bean
public StoredProcedureItemReader reader(DataSource dataSource) {
StoredProcedureItemReader reader = new StoredProcedureItemReader();
reader.setDataSource(dataSource);
reader.setProcedureName("sp_customer_credit");
reader.setRowMapper(new CustomerCreditRowMapper());
reader.setFunction(true);
return reader;
}
----
XML::
+
The following example shows how to set the `function` property to `true` in XML:
+
.XML Configuration
[source, xml]
----
<bean id="reader" class="o.s.batch.item.database.StoredProcedureItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="procedureName" value="sp_customer_credit"/>
<property name="function" value="true"/>
<property name="rowMapper">
<bean class="org.springframework.batch.sample.domain.CustomerCreditRowMapper"/>
</property>
</bean>
----
====
In all of these cases, we need to define a `RowMapper` as well as a `DataSource` and the
actual procedure name.
If the stored procedure or function takes parameters, they must be declared and
set by using the `parameters` property. The following example, for Oracle, declares three
parameters. The first one is the `out` parameter that returns the ref-cursor, and the
second and third are `in` parameters that each take a value of type `INTEGER`.
[tabs]
====
Java::
+
The following example shows how to work with parameters in Java:
+
.Java Configuration
[source, java]
----
@Bean
public StoredProcedureItemReader reader(DataSource dataSource) {
    List<SqlParameter> parameters = new ArrayList<>();
    parameters.add(new SqlOutParameter("newId", OracleTypes.CURSOR));
    parameters.add(new SqlParameter("amount", Types.INTEGER));
    parameters.add(new SqlParameter("custId", Types.INTEGER));

    StoredProcedureItemReader reader = new StoredProcedureItemReader();
    reader.setDataSource(dataSource);
    reader.setProcedureName("spring.cursor_func");
    reader.setParameters(parameters);
    reader.setRefCursorPosition(1);
    reader.setRowMapper(rowMapper());
    reader.setPreparedStatementSetter(parameterSetter());
    return reader;
}
----
XML::
+
The following example shows how to work with parameters in XML:
+
.XML Configuration
[source, xml]
----
<bean id="reader" class="o.s.batch.item.database.StoredProcedureItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="procedureName" value="spring.cursor_func"/>
<property name="parameters">
<list>
<bean class="org.springframework.jdbc.core.SqlOutParameter">
<constructor-arg index="0" value="newid"/>
<constructor-arg index="1">
<util:constant static-field="oracle.jdbc.OracleTypes.CURSOR"/>
</constructor-arg>
</bean>
<bean class="org.springframework.jdbc.core.SqlParameter">
<constructor-arg index="0" value="amount"/>
<constructor-arg index="1">
<util:constant static-field="java.sql.Types.INTEGER"/>
</constructor-arg>
</bean>
<bean class="org.springframework.jdbc.core.SqlParameter">
<constructor-arg index="0" value="custid"/>
<constructor-arg index="1">
<util:constant static-field="java.sql.Types.INTEGER"/>
</constructor-arg>
</bean>
</list>
</property>
<property name="refCursorPosition" value="1"/>
<property name="rowMapper" ref="rowMapper"/>
<property name="preparedStatementSetter" ref="parameterSetter"/>
</bean>
----
====
In addition to the parameter declarations, we need to specify a `PreparedStatementSetter`
implementation that sets the parameter values for the call. This works the same as for
the `JdbcCursorItemReader` above. All the additional properties listed in
xref:readers-and-writers/database.adoc#JdbcCursorItemReaderProperties[Additional Properties] apply to the `StoredProcedureItemReader` as well.
[[pagingItemReaders]]
== Paging `ItemReader` Implementations
An alternative to using a database cursor is running multiple queries where each query
fetches a portion of the results. We refer to this portion as a page. Each query must
specify the starting row number and the number of rows that we want returned in the page.
[[JdbcPagingItemReader]]
=== `JdbcPagingItemReader`
One implementation of a paging `ItemReader` is the `JdbcPagingItemReader`. The
`JdbcPagingItemReader` needs a `PagingQueryProvider` responsible for providing the SQL
queries used to retrieve the rows making up a page. Since each database has its own
strategy for providing paging support, we need to use a different `PagingQueryProvider`
for each supported database type. There is also the `SqlPagingQueryProviderFactoryBean`
that auto-detects the database that is being used and determines the appropriate
`PagingQueryProvider` implementation. This simplifies the configuration and is the
recommended best practice.
The `SqlPagingQueryProviderFactoryBean` requires that you specify a `select` clause and a
`from` clause. You can also provide an optional `where` clause. These clauses and the
required `sortKey` are used to build an SQL statement.
NOTE: It is important to have a unique key constraint on the `sortKey` to guarantee that
no data is lost between executions.
After the reader has been opened, it passes back one item per call to `read` in the same
basic fashion as any other `ItemReader`. The paging happens behind the scenes when
additional rows are needed.
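
The buffering described above can be pictured with a plain-Java sketch. It is not the `JdbcPagingItemReader` implementation: `PagingReaderSketch`, its in-memory `table` list (standing in for rows ordered by the sort key), and the `queriesRun` counter are hypothetical names introduced only for illustration.

[source, java]
----
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a paging reader; not part of Spring Batch.
final class PagingReaderSketch<T> {

    private final List<T> table;   // stands in for rows ordered by the sort key
    private final int pageSize;
    private int nextOffset = 0;    // starting row of the next page query
    private int queriesRun = 0;    // how many page "queries" have been issued
    private boolean lastPageSeen = false;
    private Iterator<T> page = Collections.emptyIterator();

    PagingReaderSketch(List<T> table, int pageSize) {
        this.table = table;
        this.pageSize = pageSize;
    }

    // One "query": returns the rows in [offset, offset + pageSize).
    private List<T> fetchPage(int offset) {
        queriesRun++;
        int end = Math.min(offset + pageSize, table.size());
        return offset >= end ? Collections.<T>emptyList() : table.subList(offset, end);
    }

    // Returns one item per call; fetches the next page only when the buffer is empty.
    T read() {
        if (!page.hasNext() && !lastPageSeen) {
            List<T> rows = fetchPage(nextOffset);
            nextOffset += pageSize;
            lastPageSeen = rows.size() < pageSize; // a short page means no more rows
            page = rows.iterator();
        }
        return page.hasNext() ? page.next() : null; // null signals the end of the input
    }

    int queriesRun() {
        return queriesRun;
    }
}
----

With five rows and a page size of 2, six calls to `read()` return the five rows followed by `null`, while only three page queries are issued.
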
[tabs]
====
Java::
+
The following Java example configuration uses a similar 'customer credit' example as the
cursor-based `ItemReaders` shown previously:
+
.Java Configuration
[source, java]
----
@Bean
public JdbcPagingItemReader itemReader(DataSource dataSource, PagingQueryProvider queryProvider) {
Map<String, Object> parameterValues = new HashMap<>();
parameterValues.put("status", "NEW");
return new JdbcPagingItemReaderBuilder<CustomerCredit>()
.name("creditReader")
.dataSource(dataSource)
.queryProvider(queryProvider)
.parameterValues(parameterValues)
.rowMapper(customerCreditMapper())
.pageSize(1000)
.build();
}
@Bean
public SqlPagingQueryProviderFactoryBean queryProvider() {
SqlPagingQueryProviderFactoryBean provider = new SqlPagingQueryProviderFactoryBean();
provider.setSelectClause("select id, name, credit");
provider.setFromClause("from customer");
provider.setWhereClause("where status=:status");
provider.setSortKey("id");
return provider;
}
----
XML::
+
The following XML example configuration uses a similar 'customer credit' example as the
cursor-based `ItemReaders` shown previously:
+
.XML Configuration
[source, xml]
----
<bean id="itemReader" class="org.spr...JdbcPagingItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="queryProvider">
<bean class="org.spr...SqlPagingQueryProviderFactoryBean">
<property name="selectClause" value="select id, name, credit"/>
<property name="fromClause" value="from customer"/>
<property name="whereClause" value="where status=:status"/>
<property name="sortKey" value="id"/>
</bean>
</property>
<property name="parameterValues">
<map>
<entry key="status" value="NEW"/>
</map>
</property>
<property name="pageSize" value="1000"/>
<property name="rowMapper" ref="customerMapper"/>
</bean>
----
====
This configured `ItemReader` returns `CustomerCredit` objects using the `RowMapper`,
which must be specified. The 'pageSize' property determines the number of entities read
from the database for each query run.
The 'parameterValues' property can be used to specify a `Map` of parameter values for the
query. If you use named parameters in the `where` clause, the key for each entry should
match the name of the named parameter. If you use a traditional '?' placeholder, then the
key for each entry should be the number of the placeholder, starting with 1.
[[JpaPagingItemReader]]
=== `JpaPagingItemReader`
Another implementation of a paging `ItemReader` is the `JpaPagingItemReader`. JPA does
not have a concept similar to the Hibernate `StatelessSession`, so we have to use other
features provided by the JPA specification. Since JPA supports paging, this is a natural
choice when it comes to using JPA for batch processing. After each page is read, the
entities become detached and the persistence context is cleared, to allow the entities to
be garbage collected once the page is processed.
The `JpaPagingItemReader` lets you declare a JPQL statement and pass in an
`EntityManagerFactory`. It then passes back one item per call to `read` in the same basic
fashion as any other `ItemReader`. The paging happens behind the scenes when additional
entities are needed.
[tabs]
====
Java::
+
The following Java example configuration uses the same 'customer credit' example as the
JDBC reader shown previously:
+
.Java Configuration
[source, java]
----
@Bean
public JpaPagingItemReader itemReader() {
return new JpaPagingItemReaderBuilder<CustomerCredit>()
.name("creditReader")
.entityManagerFactory(entityManagerFactory())
.queryString("select c from CustomerCredit c")
.pageSize(1000)
.build();
}
----
XML::
+
The following XML example configuration uses the same 'customer credit' example as the
JDBC reader shown previously:
+
.XML Configuration
[source, xml]
----
<bean id="itemReader" class="org.spr...JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
<property name="queryString" value="select c from CustomerCredit c"/>
<property name="pageSize" value="1000"/>
</bean>
----
====
This configured `ItemReader` returns `CustomerCredit` objects in the exact same manner as
described for the `JdbcPagingItemReader` above, assuming the `CustomerCredit` object has the
correct JPA annotations or ORM mapping file. The 'pageSize' property determines the
number of entities read from the database for each query execution.
[[databaseItemWriters]]
== Database ItemWriters
While both flat files and XML files have a specific `ItemWriter` instance, there is no exact equivalent
in the database world. This is because transactions provide all the needed functionality.
`ItemWriter` implementations are necessary for files because they must act as if they're transactional,
keeping track of written items and flushing or clearing at the appropriate times.
Databases have no need for this functionality, since the write is already contained in a
transaction. Users can create their own DAOs that implement the `ItemWriter` interface or
use one from a custom `ItemWriter` that's written for generic processing concerns. Either
way, they should work without any issues. One thing to look out for is the performance
and error handling characteristics that come with batching the outputs. This is most
common when using Hibernate as an `ItemWriter`, but the same issues can arise when using
JDBC batch mode. Batching database output does not have any inherent flaws, assuming we
are careful to flush and there are no errors in the data. However, any errors while
writing can cause confusion, because there is no way to know which individual item caused
an exception or even if any individual item was responsible, as illustrated in the
following image:
.Error On Flush
image::errorOnFlush.png[Error On Flush, scaledwidth="60%"]
If items are buffered before being written, any errors are not thrown until the buffer is
flushed just before a commit. For example, assume that 20 items are written per chunk,
and the 15th item throws a `DataIntegrityViolationException`. As far as the `Step`
is concerned, all 20 items are written successfully, since there is no way to know that an
error occurs until they are actually written. Once `Session#flush()` is called, the
buffer is emptied and the exception is hit. At this point, there is nothing the `Step`
can do. The transaction must be rolled back. Normally, this exception might cause the
item to be skipped (depending upon the skip/retry policies), and then it is not written
again. However, in the batched scenario, there is no way to know which item caused the
issue. The whole buffer was being written when the failure happened. The only way to
solve this issue is to flush after each item, as shown in the following image:
.Error On Write
image::errorOnWrite.png[Error On Write, scaledwidth="60%"]
This is a common use case, especially when using Hibernate, and the simple guideline for
implementations of `ItemWriter` is to flush on each call to `write()`. Doing so allows
for items to be skipped reliably, with Spring Batch internally taking care of the
granularity of the calls to `ItemWriter` after an error.
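
The guideline can be illustrated with a small plain-Java sketch. Everything in it is hypothetical: `BufferingSession` only imitates a session that checks constraints on flush, and `writeChunk` is a rough imitation of retrying a failed chunk one item at a time. In a real job, the rollback and the re-processing are handled by the framework, not by user code.

[source, java]
----
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins; none of these types belong to Spring Batch or Hibernate.
final class FlushOnWriteSketch {

    // Buffers saves and only checks them on flush, like a batching session.
    static final class BufferingSession {
        private final List<String> buffer = new ArrayList<>();
        final List<String> stored = new ArrayList<>();

        void save(String item) {
            buffer.add(item);
        }

        void flush() {
            try {
                for (String item : buffer) {
                    if (item.isEmpty()) {
                        throw new IllegalStateException("constraint violated");
                    }
                }
                stored.addAll(buffer);
            } finally {
                buffer.clear(); // a failed flush discards the whole buffer
            }
        }
    }

    // Flushes before returning, so a bad item fails inside this call.
    static void write(BufferingSession session, List<String> items) {
        for (String item : items) {
            session.save(item);
        }
        session.flush();
    }

    // After a failed chunk, retries one item per call to find the culprit.
    static List<String> writeChunk(BufferingSession session, List<String> chunk) {
        List<String> skipped = new ArrayList<>();
        try {
            write(session, chunk);
        } catch (IllegalStateException chunkFailure) {
            for (String item : chunk) {
                try {
                    write(session, Collections.singletonList(item));
                } catch (IllegalStateException itemFailure) {
                    skipped.add(item);
                }
            }
        }
        return skipped;
    }
}
----

Because `write` flushes before it returns, the failure surfaces inside the call that contains the bad item, which is what makes it possible to narrow the error down to a single item.
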


@@ -0,0 +1,89 @@
[[delegatePatternAndRegistering]]
= The Delegate Pattern and Registering with the Step
Note that the `CompositeItemWriter` is an example of the delegation pattern, which is
common in Spring Batch. The delegates themselves might implement callback interfaces,
such as `StepListener`. If they do and if they are being used in conjunction with Spring
Batch Core as part of a `Step` in a `Job`, then they almost certainly need to be
registered manually with the `Step`. A reader, writer, or processor that is directly
wired into the `Step` gets registered automatically if it implements `ItemStream` or a
`StepListener` interface. However, because the delegates are not known to the `Step`,
they need to be injected as listeners or streams (or both if appropriate).
[tabs]
====
Java::
+
The following example shows how to inject a delegate as a stream in Java:
+
.Java Configuration
[source, java]
----
@Bean
public Job ioSampleJob(JobRepository jobRepository, Step step1) {
return new JobBuilder("ioSampleJob", jobRepository)
.start(step1)
.build();
}
@Bean
public Step step1(JobRepository jobRepository, PlatformTransactionManager transactionManager) {
return new StepBuilder("step1", jobRepository)
.<String, String>chunk(2, transactionManager)
.reader(fooReader())
.processor(fooProcessor())
.writer(compositeItemWriter())
.stream(barWriter())
.build();
}
@Bean
public CustomCompositeItemWriter compositeItemWriter() {
CustomCompositeItemWriter writer = new CustomCompositeItemWriter();
writer.setDelegate(barWriter());
return writer;
}
@Bean
public BarWriter barWriter() {
return new BarWriter();
}
----
XML::
+
The following example shows how to inject a delegate as a stream in XML:
+
.XML Configuration
[source, xml]
----
<job id="ioSampleJob">
<step id="step1">
<tasklet>
<chunk reader="fooReader" processor="fooProcessor" writer="compositeItemWriter"
commit-interval="2">
<streams>
<stream ref="barWriter" />
</streams>
</chunk>
</tasklet>
</step>
</job>
<bean id="compositeItemWriter" class="...CustomCompositeItemWriter">
<property name="delegate" ref="barWriter" />
</bean>
<bean id="barWriter" class="...BarWriter" />
----
====
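
Why the delegate has to be registered explicitly can be seen in a plain-Java sketch. All types here (`StreamSketch`, `BarWriterSketch`, `CompositeWriterSketch`, `StepSketch`) are hypothetical stand-ins that only mimic the registration rule described above; they are not Spring Batch classes.

[source, java]
----
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins that only mimic the registration rule.
final class DelegateRegistrationSketch {

    interface StreamSketch {
        void open();
    }

    // The delegate holds a resource that must be opened before writing.
    static final class BarWriterSketch implements StreamSketch {
        boolean opened = false;

        @Override
        public void open() {
            opened = true;
        }
    }

    // The composite wraps the delegate but is not itself a stream.
    static final class CompositeWriterSketch {
        final BarWriterSketch delegate;

        CompositeWriterSketch(BarWriterSketch delegate) {
            this.delegate = delegate;
        }
    }

    // The step only opens the components it has been told about.
    static final class StepSketch {
        private final List<StreamSketch> streams = new ArrayList<>();

        // Mimics automatic registration of a directly wired component.
        void wire(Object component) {
            if (component instanceof StreamSketch) {
                streams.add((StreamSketch) component);
            }
        }

        // Mimics explicit registration of a stream.
        void stream(StreamSketch stream) {
            streams.add(stream);
        }

        void open() {
            for (StreamSketch stream : streams) {
                stream.open();
            }
        }
    }
}
----

Wiring only the composite leaves the delegate unopened, because the step cannot see inside the composite. Once the delegate is passed to `stream(...)`, it is opened along with the step.
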


@@ -0,0 +1,11 @@
[[flatFiles]]
= Flat Files
:page-section-summary-toc: 1
One of the most common mechanisms for interchanging bulk data has always been the flat
file. Unlike XML, which has an agreed upon standard for defining how it is structured
(XSD), anyone reading a flat file must understand ahead of time exactly how the file is
structured. In general, all flat files fall into two types: delimited and fixed length.
Delimited files are those in which fields are separated by a delimiter, such as a comma.
Fixed-length files have fields that are a set length.


@@ -0,0 +1,30 @@
[[fieldSet]]
= The `FieldSet`
When working with flat files in Spring Batch, regardless of whether it is for input or
output, one of the most important classes is the `FieldSet`. Many architectures and
libraries contain abstractions for helping you read in from a file, but they usually
return a `String` or an array of `String` objects. This really only gets you halfway
there. A `FieldSet` is Spring Batch's abstraction for enabling the binding of fields from
a file resource. It allows developers to work with file input in much the same way as
they would work with database input. A `FieldSet` is conceptually similar to a JDBC
`ResultSet`. A `FieldSet` requires only one argument: a `String` array of tokens.
Optionally, you can also configure the names of the fields so that the fields may be
accessed either by index or name as patterned after `ResultSet`, as shown in the following
example:
[source, java]
----
String[] tokens = new String[]{"foo", "1", "true"};
FieldSet fs = new DefaultFieldSet(tokens);
String name = fs.readString(0);
int value = fs.readInt(1);
boolean booleanValue = fs.readBoolean(2);
----
There are many more options on the `FieldSet` interface, such as `Date`, `long`,
`BigDecimal`, and so on. The biggest advantage of the `FieldSet` is that it provides
consistent parsing of flat file input. Rather than each batch job parsing differently in
potentially unexpected ways, it can be consistent, both when handling errors caused by a
format exception and when doing simple data conversions.


@@ -0,0 +1,660 @@
[[flatFileItemReader]]
= `FlatFileItemReader`
A flat file is any type of file that contains at most two-dimensional (tabular) data.
Reading flat files in the Spring Batch framework is facilitated by the class called
`FlatFileItemReader`, which provides basic functionality for reading and parsing flat
files. The two most important required dependencies of `FlatFileItemReader` are
`Resource` and `LineMapper`. The `LineMapper` interface is explored more in the next
sections. The resource property represents a Spring Core `Resource`. Documentation
explaining how to create beans of this type can be found in
link:$$https://docs.spring.io/spring/docs/current/spring-framework-reference/core.html#resources$$[Spring
Framework, Chapter 5. Resources]. Therefore, this guide does not go into the details of
creating `Resource` objects beyond showing the following simple example:
[source, java]
----
Resource resource = new FileSystemResource("resources/trades.csv");
----
In complex batch environments, the directory structures are often managed by the Enterprise Application Integration (EAI)
infrastructure, where drop zones for external interfaces are established for moving files
from FTP locations to batch processing locations and vice versa. File moving utilities
are beyond the scope of the Spring Batch architecture, but it is not unusual for batch
job streams to include file moving utilities as steps in the job stream. The batch
architecture only needs to know how to locate the files to be processed. Spring Batch
begins the process of feeding the data into the pipe from this starting point. However,
link:$$https://projects.spring.io/spring-integration/$$[Spring Integration] provides many
of these types of services.
The other properties in `FlatFileItemReader` let you further specify how your data is
interpreted, as described in the following table:
.`FlatFileItemReader` Properties
[options="header"]
|===============
|Property|Type|Description
|comments|String[]|Specifies line prefixes that indicate comment rows.
|encoding|String|Specifies what text encoding to use. The default value is `UTF-8`.
|lineMapper|`LineMapper`|Converts a `String` to an `Object` representing the item.
|linesToSkip|int|Number of lines to ignore at the top of the file.
|recordSeparatorPolicy|RecordSeparatorPolicy|Used to determine where the line endings are
and do things like continue over a line ending if inside a quoted string.
|resource|`Resource`|The resource from which to read.
|skippedLinesCallback|LineCallbackHandler|Interface that passes the raw line content of
the lines in the file to be skipped. If `linesToSkip` is set to 2, then this interface is
called twice.
|strict|boolean|In strict mode, the reader throws an exception on `ExecutionContext` if
the input resource does not exist. Otherwise, it logs the problem and continues.
|===============
[[lineMapper]]
== `LineMapper`
As with `RowMapper`, which takes a low-level construct such as `ResultSet` and returns
an `Object`, flat file processing requires the same construct to convert a `String` line
into an `Object`, as shown in the following interface definition:
[source, java]
----
public interface LineMapper<T> {
T mapLine(String line, int lineNumber) throws Exception;
}
----
The basic contract is that, given the current line and the line number with which it is
associated, the mapper should return a resulting domain object. This is similar to
`RowMapper`, in that each line is associated with its line number, just as each row in a
`ResultSet` is tied to its row number. This allows the line number to be tied to the
resulting domain object for identity comparison or for more informative logging. However,
unlike `RowMapper`, the `LineMapper` is given a raw line which, as discussed above, only
gets you halfway there. The line must be tokenized into a `FieldSet`, which can then be
mapped to an object, as described later in this document.
[[lineTokenizer]]
== `LineTokenizer`
An abstraction for turning a line of input into a `FieldSet` is necessary because there
can be many formats of flat file data that need to be converted to a `FieldSet`. In
Spring Batch, this interface is the `LineTokenizer`:
[source, java]
----
public interface LineTokenizer {
FieldSet tokenize(String line);
}
----
The contract of a `LineTokenizer` is such that, given a line of input (in theory the
`String` could encompass more than one line), a `FieldSet` representing the line is
returned. This `FieldSet` can then be passed to a `FieldSetMapper`. Spring Batch contains
the following `LineTokenizer` implementations:
* `DelimitedLineTokenizer`: Used for files where fields in a record are separated by a
delimiter. The most common delimiter is a comma, but pipes or semicolons are often used
as well.
* `FixedLengthTokenizer`: Used for files where fields in a record are each a "fixed
width". The width of each field must be defined for each record type.
* `PatternMatchingCompositeLineTokenizer`: Determines which `LineTokenizer` among a list of
tokenizers should be used on a particular line by checking against a pattern.
[[fieldSetMapper]]
== `FieldSetMapper`
The `FieldSetMapper` interface defines a single method, `mapFieldSet`, which takes a
`FieldSet` object and maps its contents to an object. This object may be a custom DTO, a
domain object, or an array, depending on the needs of the job. The `FieldSetMapper` is
used in conjunction with the `LineTokenizer` to translate a line of data from a resource
into an object of the desired type, as shown in the following interface definition:
[source, java]
----
public interface FieldSetMapper<T> {
T mapFieldSet(FieldSet fieldSet) throws BindException;
}
----
The pattern used is the same as the `RowMapper` used by `JdbcTemplate`.
[[defaultLineMapper]]
== `DefaultLineMapper`
Now that the basic interfaces for reading in flat files have been defined, it becomes
clear that three basic steps are required:
. Read one line from the file.
. Pass the `String` line into the `LineTokenizer#tokenize()` method to retrieve a
`FieldSet`.
. Pass the `FieldSet` returned from tokenizing to a `FieldSetMapper`, returning the
result from the `ItemReader#read()` method.
The two interfaces described above represent two separate tasks: converting a line into a
`FieldSet` and mapping a `FieldSet` to a domain object. Because the input of a
`LineTokenizer` matches the input of the `LineMapper` (a line), and the output of a
`FieldSetMapper` matches the output of the `LineMapper`, a default implementation that
uses both a `LineTokenizer` and a `FieldSetMapper` is provided. The `DefaultLineMapper`,
shown in the following class definition, represents the behavior most users need:
[source, java]
----
public class DefaultLineMapper<T> implements LineMapper<T>, InitializingBean {
private LineTokenizer tokenizer;
private FieldSetMapper<T> fieldSetMapper;
public T mapLine(String line, int lineNumber) throws Exception {
return fieldSetMapper.mapFieldSet(tokenizer.tokenize(line));
}
public void setLineTokenizer(LineTokenizer tokenizer) {
this.tokenizer = tokenizer;
}
public void setFieldSetMapper(FieldSetMapper<T> fieldSetMapper) {
this.fieldSetMapper = fieldSetMapper;
}
}
----
The above functionality is provided in a default implementation, rather than being built
into the reader itself (as was done in previous versions of the framework) to allow users
greater flexibility in controlling the parsing process, especially if access to the raw
line is needed.
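
One reason access to the raw line matters is error reporting. The following plain-Java sketch composes a tokenizing step and a mapping step and, on failure, reports both the line number and the raw line. The `Mapper` interface and the `compose` method are hypothetical mirrors of the contract shown earlier, written with `java.util.function.Function` so that the sketch does not depend on Spring Batch.

[source, java]
----
import java.util.function.Function;

final class LineMappingSketch {

    // Hypothetical mirror of the LineMapper contract shown earlier.
    interface Mapper<T> {
        T mapLine(String line, int lineNumber);
    }

    // Composes a tokenizer and a field mapper, keeping the raw line for error reports.
    static <T> Mapper<T> compose(Function<String, String[]> tokenizer,
                                 Function<String[], T> fieldMapper) {
        return (line, lineNumber) -> {
            try {
                return fieldMapper.apply(tokenizer.apply(line));
            } catch (RuntimeException e) {
                throw new IllegalArgumentException(
                        "Cannot map line " + lineNumber + ": [" + line + "]", e);
            }
        };
    }
}
----

Keeping the two steps as separate collaborators means either one can be swapped without touching the other, which is the same design choice the `DefaultLineMapper` makes.
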
[[simpleDelimitedFileReadingExample]]
== Simple Delimited File Reading Example
The following example illustrates how to read a flat file with an actual domain scenario.
This particular batch job reads in football players from the following file:
----
ID,lastName,firstName,position,birthYear,debutYear
AbduKa00,Abdul-Jabbar,Karim,rb,1974,1996
AbduRa00,Abdullah,Rabih,rb,1975,1999
AberWa00,Abercrombie,Walter,rb,1959,1982
AbraDa00,Abramowicz,Danny,wr,1945,1967
AdamBo00,Adams,Bob,te,1946,1969
AdamCh00,Adams,Charlie,wr,1979,2003
----
The contents of this file are mapped to the following
`Player` domain object:
[source, java]
----
public class Player implements Serializable {
private String ID;
private String lastName;
private String firstName;
private String position;
private int birthYear;
private int debutYear;
public String toString() {
return "PLAYER:ID=" + ID + ",Last Name=" + lastName +
",First Name=" + firstName + ",Position=" + position +
",Birth Year=" + birthYear + ",DebutYear=" +
debutYear;
}
// setters and getters...
}
----
To map a `FieldSet` into a `Player` object, a `FieldSetMapper` that returns players needs
to be defined, as shown in the following example:
[source, java]
----
protected static class PlayerFieldSetMapper implements FieldSetMapper<Player> {
public Player mapFieldSet(FieldSet fieldSet) {
Player player = new Player();
player.setID(fieldSet.readString(0));
player.setLastName(fieldSet.readString(1));
player.setFirstName(fieldSet.readString(2));
player.setPosition(fieldSet.readString(3));
player.setBirthYear(fieldSet.readInt(4));
player.setDebutYear(fieldSet.readInt(5));
return player;
}
}
----
The file can then be read by correctly constructing a `FlatFileItemReader` and calling
`read`, as shown in the following example:
[source, java]
----
FlatFileItemReader<Player> itemReader = new FlatFileItemReader<>();
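itemReader.setResource(new FileSystemResource("resources/players.csv"));
// Skip the header line (ID,lastName,...) so that it is not mapped to a Player
itemReader.setLinesToSkip(1);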
DefaultLineMapper<Player> lineMapper = new DefaultLineMapper<>();
//DelimitedLineTokenizer defaults to comma as its delimiter
lineMapper.setLineTokenizer(new DelimitedLineTokenizer());
lineMapper.setFieldSetMapper(new PlayerFieldSetMapper());
itemReader.setLineMapper(lineMapper);
itemReader.open(new ExecutionContext());
Player player = itemReader.read();
----
Each call to `read` returns a new
`Player` object from each line in the file. When the end of the file is
reached, `null` is returned.
[[mappingFieldsByName]]
== Mapping Fields by Name
There is one additional piece of functionality that is allowed by both
`DelimitedLineTokenizer` and `FixedLengthTokenizer` and that is similar in function to a
JDBC `ResultSet`. The names of the fields can be injected into either of these
`LineTokenizer` implementations to increase the readability of the mapping function.
First, the column names of all fields in the flat file are injected into the tokenizer,
as shown in the following example:
[source, java]
----
tokenizer.setNames(new String[] {"ID", "lastName", "firstName", "position", "birthYear", "debutYear"});
----
A `FieldSetMapper` can use this information as follows:
[source, java]
----
public class PlayerMapper implements FieldSetMapper<Player> {
public Player mapFieldSet(FieldSet fs) {
if (fs == null) {
return null;
}
Player player = new Player();
player.setID(fs.readString("ID"));
player.setLastName(fs.readString("lastName"));
player.setFirstName(fs.readString("firstName"));
player.setPosition(fs.readString("position"));
player.setDebutYear(fs.readInt("debutYear"));
player.setBirthYear(fs.readInt("birthYear"));
return player;
}
}
----
[[beanWrapperFieldSetMapper]]
== Automapping FieldSets to Domain Objects
For many, having to write a specific `FieldSetMapper` is as cumbersome as writing
a specific `RowMapper` for a `JdbcTemplate`. Spring Batch makes this easier by providing
a `FieldSetMapper` that automatically maps fields by matching a field name with a setter
on the object using the JavaBean specification.
[tabs]
====
Java::
+
Again using the football example, the `BeanWrapperFieldSetMapper` configuration looks like
the following snippet in Java:
+
.Java Configuration
[source, java]
----
@Bean
public FieldSetMapper fieldSetMapper() {
BeanWrapperFieldSetMapper fieldSetMapper = new BeanWrapperFieldSetMapper();
fieldSetMapper.setPrototypeBeanName("player");
return fieldSetMapper;
}
@Bean
@Scope("prototype")
public Player player() {
return new Player();
}
----
XML::
+
Again using the football example, the `BeanWrapperFieldSetMapper` configuration looks like
the following snippet in XML:
+
.XML Configuration
[source, xml]
----
<bean id="fieldSetMapper"
class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="player" />
</bean>
<bean id="player"
class="org.springframework.batch.sample.domain.Player"
scope="prototype" />
----
====
For each entry in the `FieldSet`, the mapper looks for a corresponding setter on a new
instance of the `Player` object (for this reason, prototype scope is required) in the
same way the Spring container looks for setters matching a property name. Each available
field in the `FieldSet` is mapped, and the resultant `Player` object is returned, with no
code required.
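
The name-to-setter matching can be pictured with a short reflection sketch. This is not how `BeanWrapperFieldSetMapper` is implemented: `SetterMappingSketch`, its simplified `Player` class, and the `map` method are hypothetical, the fields arrive as a plain `Map` instead of a `FieldSet`, and only `String` and `int` setters are handled.

[source, java]
----
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical sketch of matching field names to JavaBean setters.
final class SetterMappingSketch {

    public static final class Player {
        private String lastName;
        private int birthYear;

        public void setLastName(String lastName) { this.lastName = lastName; }
        public void setBirthYear(int birthYear) { this.birthYear = birthYear; }
        public String getLastName() { return lastName; }
        public int getBirthYear() { return birthYear; }
    }

    // For each field name, finds "set" + capitalized name and invokes it on the target.
    static <T> T map(Map<String, String> fields, T target) {
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String name = field.getKey();
            String setterName = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            for (Method method : target.getClass().getMethods()) {
                if (method.getName().equals(setterName) && method.getParameterCount() == 1) {
                    Object value = field.getValue();
                    if (method.getParameterTypes()[0] == int.class) {
                        value = Integer.valueOf(field.getValue().trim()); // simple conversion
                    }
                    try {
                        method.invoke(target, value);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException("Cannot set property " + name, e);
                    }
                }
            }
        }
        return target;
    }
}
----

The sketch receives an existing target object. The real mapper instead obtains a fresh instance for every line, which is why the bean it refers to must be prototype-scoped.
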
[[fixedLengthFileFormats]]
== Fixed Length File Formats
So far, only delimited files have been discussed in much detail. However, they represent
only half of the file reading picture. Many organizations that use flat files use fixed
length formats. An example fixed length file follows:
----
UK21341EAH4121131.11customer1
UK21341EAH4221232.11customer2
UK21341EAH4321333.11customer3
UK21341EAH4421434.11customer4
UK21341EAH4521535.11customer5
----
While this looks like one large field, it actually represents four distinct fields:
. ISIN: Unique identifier for the item being ordered - 12 characters long.
. Quantity: Number of the item being ordered - 3 characters long.
. Price: Price of the item - 5 characters long.
. Customer: ID of the customer ordering the item - 9 characters long.
When configuring the `FixedLengthTokenizer`, each of these lengths must be provided
in the form of ranges.
[tabs]
=====
Java::
+
The following example shows how to define ranges for the `FixedLengthTokenizer` in
Java:
+
.Java Configuration
[source, java]
----
@Bean
public FixedLengthTokenizer fixedLengthTokenizer() {
FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
tokenizer.setNames("ISIN", "Quantity", "Price", "Customer");
tokenizer.setColumns(new Range(1, 12),
new Range(13, 15),
new Range(16, 20),
new Range(21, 29));
return tokenizer;
}
----
XML::
+
The following example shows how to define ranges for the `FixedLengthTokenizer` in
XML:
+
.XML Configuration
[source, xml]
----
<bean id="fixedLengthLineTokenizer"
class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
<property name="names" value="ISIN,Quantity,Price,Customer" />
<property name="columns" value="1-12, 13-15, 16-20, 21-29" />
</bean>
----
+
[NOTE]
====
Supporting the preceding syntax for ranges requires that a specialized property editor,
`RangeArrayPropertyEditor`, be configured in the `ApplicationContext`. However, this bean
is automatically declared in an `ApplicationContext` where the batch namespace is used.
====
=====
Because the `FixedLengthTokenizer` uses the same `LineTokenizer` interface as
discussed above, it returns the same `FieldSet` as if a delimiter had been used. This
lets the same approaches be used in handling its output, such as using the
`BeanWrapperFieldSetMapper`.
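
How the ranges carve up a line can be checked with plain substrings. The helper below is a hypothetical illustration, not the tokenizer's implementation. It assumes that ranges are 1-based and inclusive, as in the configuration above, and returns the fields in a map keyed by name.

[source, java]
----
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it mimics 1-based, inclusive column ranges with plain substrings.
final class FixedLengthSketch {

    static Map<String, String> tokenize(String line, String[] names, int[][] ranges) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            int start = ranges[i][0]; // first column, counting from 1
            int end = ranges[i][1];   // last column, inclusive
            fields.put(names[i], line.substring(start - 1, end));
        }
        return fields;
    }
}
----

Applied to the first sample line, `UK21341EAH4121131.11customer1`, with the ranges 1-12, 13-15, 16-20, and 21-29, the sketch yields `UK21341EAH41`, `211`, `31.11`, and `customer1`.
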
[[prefixMatchingLineMapper]]
== Multiple Record Types within a Single File
All of the file reading examples up to this point have made a key assumption for
simplicity's sake: all of the records in a file have the same format. However, this may
not always be the case. It is very common that a file might have records with different
formats that need to be tokenized differently and mapped to different objects. The
following excerpt from a file illustrates this:
----
USER;Smith;Peter;;T;20014539;F
LINEA;1044391041ABC037.49G201XX1383.12H
LINEB;2134776319DEF422.99M005LI
----
In this file we have three types of records, "USER", "LINEA", and "LINEB". A "USER" line
corresponds to a `User` object. "LINEA" and "LINEB" both correspond to `Line` objects,
though a "LINEA" has more information than a "LINEB".
The `ItemReader` reads each line individually, but we must specify different
`LineTokenizer` and `FieldSetMapper` objects so that the `ItemWriter` receives the
correct items. The `PatternMatchingCompositeLineMapper` makes this easy by allowing maps
of patterns to `LineTokenizers` and patterns to `FieldSetMappers` to be configured.
[tabs]
====
Java::
+
.Java Configuration
[source, java]
----
@Bean
public PatternMatchingCompositeLineMapper orderFileLineMapper() {
PatternMatchingCompositeLineMapper lineMapper =
new PatternMatchingCompositeLineMapper();
Map<String, LineTokenizer> tokenizers = new HashMap<>(3);
tokenizers.put("USER*", userTokenizer());
tokenizers.put("LINEA*", lineATokenizer());
tokenizers.put("LINEB*", lineBTokenizer());
lineMapper.setTokenizers(tokenizers);
Map<String, FieldSetMapper> mappers = new HashMap<>(2);
mappers.put("USER*", userFieldSetMapper());
mappers.put("LINE*", lineFieldSetMapper());
lineMapper.setFieldSetMappers(mappers);
return lineMapper;
}
----
XML::
+
The following example shows how to configure the `PatternMatchingCompositeLineMapper` in
XML:
+
.XML Configuration
[source, xml]
----
<bean id="orderFileLineMapper"
class="org.spr...PatternMatchingCompositeLineMapper">
<property name="tokenizers">
<map>
<entry key="USER*" value-ref="userTokenizer" />
<entry key="LINEA*" value-ref="lineATokenizer" />
<entry key="LINEB*" value-ref="lineBTokenizer" />
</map>
</property>
<property name="fieldSetMappers">
<map>
<entry key="USER*" value-ref="userFieldSetMapper" />
<entry key="LINE*" value-ref="lineFieldSetMapper" />
</map>
</property>
</bean>
----
====
In this example, "LINEA" and "LINEB" have separate `LineTokenizer` instances, but they both use
the same `FieldSetMapper`.
The `PatternMatchingCompositeLineMapper` uses the `PatternMatcher#match` method
in order to select the correct delegate for each line. The `PatternMatcher` allows for
two wildcard characters with special meaning: the question mark ("?") matches exactly one
character, while the asterisk ("\*") matches zero or more characters. Note that, in the
preceding configuration, all patterns end with an asterisk, making them effectively
prefixes to lines. The `PatternMatcher` always matches the most specific pattern
possible, regardless of the order in the configuration. So if "LINE*" and "LINEA*" were
both listed as patterns, "LINEA" would match pattern "LINEA*", while "LINEB" would match
pattern "LINE*". Additionally, a single asterisk ("*") can serve as a default by matching
any line not matched by any other pattern.
[tabs]
====
Java::
+
The following example shows how to match a line not matched by any other pattern in Java:
+
.Java Configuration
[source, java]
----
...
tokenizers.put("*", defaultLineTokenizer());
...
----
XML::
+
The following example shows how to match a line not matched by any other pattern in XML:
+
.XML Configuration
[source, xml]
----
<entry key="*" value-ref="defaultLineTokenizer" />
----
====
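The wildcard rules described above can be illustrated with a small, self-contained sketch in plain Java. It is not the Spring Batch `PatternMatcher` implementation: the `WildcardSketch` class and its `matches` and `select` methods are hypothetical, and "most specific" is approximated here by trying longer patterns first, which is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class WildcardSketch {

    // '?' matches exactly one character; '*' matches zero or more characters.
    static boolean matches(String pattern, String text) {
        int p = 0;      // position in the pattern
        int t = 0;      // position in the text
        int star = -1;  // index of the most recent '*' in the pattern
        int mark = 0;   // text position where that '*' started matching
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p++;
                mark = t;
            }
            else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;
                t++;
            }
            else if (star >= 0) {
                // Backtrack: let the last '*' absorb one more character.
                p = star + 1;
                t = ++mark;
            }
            else {
                return false;
            }
        }
        // Any trailing '*' in the pattern may match the empty string.
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }

    // Assumption for this sketch only: longer patterns are tried first.
    static <S> S select(Map<String, S> delegates, String line) {
        List<String> patterns = new ArrayList<>(delegates.keySet());
        patterns.sort(Comparator.<String>comparingInt(String::length).reversed());
        for (String pattern : patterns) {
            if (matches(pattern, line)) {
                return delegates.get(pattern);
            }
        }
        throw new IllegalStateException("No pattern matches line: " + line);
    }
}
```

With the keys `LINE*`, `LINEA*`, and `*`, this sketch sends a "LINEA" record to the `LINEA*` delegate, a "LINEB" record to the `LINE*` delegate, and any other line to the `*` default, regardless of the order in which the map was populated.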
There is also a `PatternMatchingCompositeLineTokenizer` that can be used for tokenization
alone.
It is also common for a flat file to contain records that each span multiple lines. To
handle this situation, a more complex strategy is required. A demonstration of this
common pattern can be found in the `multiLineRecords` sample.
[[exceptionHandlingInFlatFiles]]
== Exception Handling in Flat Files
There are many scenarios when tokenizing a line may cause exceptions to be thrown. Many
flat files are imperfect and contain incorrectly formatted records. Many users choose to
skip these erroneous lines while logging the issue, the original line, and the line
number. These logs can later be inspected manually or by another batch job. For this
reason, Spring Batch provides a hierarchy of exceptions for handling parse exceptions:
`FlatFileParseException` and `FlatFileFormatException`. `FlatFileParseException` is
thrown by the `FlatFileItemReader` when any errors are encountered while trying to read a
file. `FlatFileFormatException` is thrown by implementations of the `LineTokenizer`
interface and indicates a more specific error encountered while tokenizing.
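The skip-and-log approach mentioned above can be sketched in plain Java as follows. This is not how `FlatFileItemReader` or the Spring Batch skip mechanism is implemented. The `SkipAndLogSketch` class, its `parse` method, and the comma delimiter are hypothetical and only show the idea of recording the line number and the original input of each rejected record.

```java
import java.util.ArrayList;
import java.util.List;

class SkipAndLogSketch {

    // Returns the well-formed records and appends one message per skipped line to 'problems'.
    static List<String[]> parse(List<String> lines, int expectedTokens, List<String> problems) {
        List<String[]> records = new ArrayList<>();
        int lineNumber = 0;
        for (String line : lines) {
            lineNumber++;
            String[] tokens = line.split(",", -1); // -1 keeps trailing empty tokens
            if (tokens.length != expectedTokens) {
                problems.add("line " + lineNumber + ": expected " + expectedTokens
                        + " tokens but found " + tokens.length + ", input=[" + line + "]");
                continue; // skip the malformed line and keep reading
            }
            records.add(tokens);
        }
        return records;
    }
}
```

The collected messages can then be inspected manually or handed to another process, as described above.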
[[incorrectTokenCountException]]
=== `IncorrectTokenCountException`
Both `DelimitedLineTokenizer` and `FixedLengthLineTokenizer` have the ability to specify
column names that can be used for creating a `FieldSet`. However, if the number of column
names does not match the number of columns found while tokenizing a line, the `FieldSet`
cannot be created, and an `IncorrectTokenCountException` is thrown, which contains the
number of tokens encountered, and the number expected, as shown in the following example:
[source, java]
----
tokenizer.setNames(new String[] {"A", "B", "C", "D"});
try {
    tokenizer.tokenize("a,b,c");
}
catch (IncorrectTokenCountException e) {
    assertEquals(4, e.getExpectedCount());
    assertEquals(3, e.getActualCount());
}
----
Because the tokenizer was configured with 4 column names but only 3 tokens were found in
the line, an `IncorrectTokenCountException` was thrown.
[[incorrectLineLengthException]]
=== `IncorrectLineLengthException`
Files formatted in a fixed-length format have additional requirements when parsing
because, unlike a delimited format, each column must strictly adhere to its predefined
width. If the total line length does not equal the upper bound of the last configured
column range, an exception is thrown, as shown in the following example:
[source, java]
----
tokenizer.setColumns(new Range[] { new Range(1, 5),
                                   new Range(6, 10),
                                   new Range(11, 15) });
try {
    tokenizer.tokenize("12345");
    fail("Expected IncorrectLineLengthException");
}
catch (IncorrectLineLengthException ex) {
    assertEquals(15, ex.getExpectedLength());
    assertEquals(5, ex.getActualLength());
}
----
The configured ranges for the tokenizer above are: 1-5, 6-10, and 11-15. Consequently,
the expected total length of the line is 15. However, in the preceding example, a line of length 5
was passed in, causing an `IncorrectLineLengthException` to be thrown. Throwing an
exception here rather than only mapping the first column allows the processing of the
line to fail earlier and with more information than it would contain if it failed while
trying to read in column 2 in a `FieldSetMapper`. However, there are scenarios where the
length of the line is not always constant. For this reason, validation of line length can
be turned off via the 'strict' property, as shown in the following example:
[source, java]
----
tokenizer.setColumns(new Range[] { new Range(1, 5), new Range(6, 10) });
tokenizer.setStrict(false);
FieldSet tokens = tokenizer.tokenize("12345");
assertEquals("12345", tokens.readString(0));
assertEquals("", tokens.readString(1));
----
The preceding example is almost identical to the one before it, except that
`tokenizer.setStrict(false)` was called. This setting tells the tokenizer to not enforce
line lengths when tokenizing the line. A `FieldSet` is now correctly created and
returned. However, it contains only empty tokens for the remaining values.
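The difference between strict and non-strict tokenization can be shown with a minimal sketch in plain Java. It is not the `FixedLengthLineTokenizer` implementation: `FixedLengthSketch` and its `tokenize` method are hypothetical, ranges are modeled as `{min, max}` integer pairs (1-based, inclusive, ascending), and a generic `IllegalArgumentException` stands in for `IncorrectLineLengthException`.

```java
import java.util.ArrayList;
import java.util.List;

class FixedLengthSketch {

    // Each range is {min, max}, 1-based and inclusive, listed in ascending order.
    static List<String> tokenize(String line, int[][] ranges, boolean strict) {
        int expectedLength = ranges[ranges.length - 1][1];
        if (strict && line.length() != expectedLength) {
            throw new IllegalArgumentException(
                    "Expected line length " + expectedLength + " but was " + line.length());
        }
        List<String> tokens = new ArrayList<>();
        for (int[] range : ranges) {
            // Clamp both ends so that a short line yields empty tokens instead of failing.
            int start = Math.min(range[0] - 1, line.length());
            int end = Math.min(range[1], line.length());
            tokens.add(line.substring(start, end));
        }
        return tokens;
    }
}
```

In strict mode the length check fails fast, before any column is read. In non-strict mode a short line is still tokenized, and the columns that lie beyond its end come back as empty strings.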


@@ -0,0 +1,445 @@
[[flatFileItemWriter]]
= `FlatFileItemWriter`
Writing out to flat files has the same problems and issues that reading in from a file
must overcome. A step must be able to write either delimited or fixed length formats in a
transactional manner.
[[lineAggregator]]
== `LineAggregator`
Just as the `LineTokenizer` interface is necessary to take a `String` and turn it into a
`FieldSet`, file writing must have a way to aggregate multiple fields into a single string
for writing to a file. In Spring Batch, this is the `LineAggregator`, shown in the
following interface definition:
[source, java]
----
public interface LineAggregator<T> {
    public String aggregate(T item);
}
----
The `LineAggregator` is the logical opposite of `LineTokenizer`. `LineTokenizer` takes a
`String` and returns a `FieldSet`, whereas `LineAggregator` takes an `item` and returns a
`String`.
[[PassThroughLineAggregator]]
=== `PassThroughLineAggregator`
The most basic implementation of the `LineAggregator` interface is the
`PassThroughLineAggregator`, which assumes that the object is already a string or that
its string representation is acceptable for writing, as shown in the following code:
[source, java]
----
public class PassThroughLineAggregator<T> implements LineAggregator<T> {
    public String aggregate(T item) {
        return item.toString();
    }
}
----
The preceding implementation is useful if direct control of creating the string is
required but the advantages of a `FlatFileItemWriter`, such as transaction and restart
support, are necessary.
[[SimplifiedFileWritingExample]]
== Simplified File Writing Example
Now that the `LineAggregator` interface and its most basic implementation,
`PassThroughLineAggregator`, have been defined, the basic flow of writing can be
explained:
. The object to be written is passed to the `LineAggregator` in order to obtain a
`String`.
. The returned `String` is written to the configured file.
The following excerpt from the `FlatFileItemWriter` expresses this in code:
[source, java]
----
public void write(T item) throws Exception {
    write(lineAggregator.aggregate(item) + LINE_SEPARATOR);
}
----
[tabs]
====
Java::
+
In Java, a simple example of configuration might look like the following:
+
.Java Configuration
[source, java]
----
@Bean
public FlatFileItemWriter<Foo> itemWriter() {
    return new FlatFileItemWriterBuilder<Foo>()
            .name("itemWriter")
            .resource(new FileSystemResource("target/test-outputs/output.txt"))
            .lineAggregator(new PassThroughLineAggregator<>())
            .build();
}
----
XML::
+
In XML, a simple example of configuration might look like the following:
+
.XML Configuration
[source, xml]
----
<bean id="itemWriter" class="org.spr...FlatFileItemWriter">
    <property name="resource" value="file:target/test-outputs/output.txt" />
    <property name="lineAggregator">
        <bean class="org.spr...PassThroughLineAggregator"/>
    </property>
</bean>
----
====
[[FieldExtractor]]
== `FieldExtractor`
The preceding example may be useful for the most basic uses of writing to a file.
However, most users of the `FlatFileItemWriter` have a domain object that needs to be
written out and, thus, must be converted into a line. In file reading, the following was
required:
. Read one line from the file.
. Pass the line into the `LineTokenizer#tokenize()` method, in order to retrieve a
`FieldSet`.
. Pass the `FieldSet` returned from tokenizing to a `FieldSetMapper`, returning the
result from the `ItemReader#read()` method.
File writing has similar but inverse steps:
. Pass the item to be written to the writer.
. Convert the fields on the item into an array.
. Aggregate the resulting array into a line.
Because there is no way for the framework to know which fields from the object need to
be written out, a `FieldExtractor` must be written to accomplish the task of turning the
item into an array, as shown in the following interface definition:
[source, java]
----
public interface FieldExtractor<T> {
    Object[] extract(T item);
}
----
Implementations of the `FieldExtractor` interface should create an array from the fields
of the provided object, which can then be written out with a delimiter between the
elements or as part of a fixed-width line.
[[PassThroughFieldExtractor]]
=== `PassThroughFieldExtractor`
There are many cases where a collection, such as an array, `Collection`, or `FieldSet`,
needs to be written out. "Extracting" an array from one of these collection types is
straightforward: the collection is simply converted to an array. The
`PassThroughFieldExtractor` should be used in this scenario. Note that, if the object
passed in is not a type of collection, the `PassThroughFieldExtractor` returns an array
containing solely the item to be extracted.
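The pass-through behavior described above can be sketched in plain Java as follows. This is an illustration under simplifying assumptions, not the `PassThroughFieldExtractor` source: `PassThroughExtractionSketch` is a hypothetical class, and it handles only arrays and `java.util.Collection` (a `FieldSet` is left out because it would require Spring Batch types).

```java
import java.util.Collection;

class PassThroughExtractionSketch {

    static Object[] extract(Object item) {
        if (item instanceof Object[]) {
            // Already an array of objects: hand it back unchanged.
            return (Object[]) item;
        }
        if (item instanceof Collection) {
            // A collection is converted to an array of its elements.
            return ((Collection<?>) item).toArray();
        }
        // Anything else becomes a single-element array.
        return new Object[] { item };
    }
}
```

Calling `extract` with a two-element list yields a two-element array, while calling it with a single `String` yields an array that contains only that string.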
[[BeanWrapperFieldExtractor]]
=== `BeanWrapperFieldExtractor`
As with the `BeanWrapperFieldSetMapper` described in the file reading section, it is
often preferable to configure how to convert a domain object to an object array, rather
than writing the conversion yourself. The `BeanWrapperFieldExtractor` provides this
functionality, as shown in the following example:
[source, java]
----
BeanWrapperFieldExtractor<Name> extractor = new BeanWrapperFieldExtractor<>();
extractor.setNames(new String[] { "first", "last", "born" });
String first = "Alan";
String last = "Turing";
int born = 1912;
Name n = new Name(first, last, born);
Object[] values = extractor.extract(n);
assertEquals(first, values[0]);
assertEquals(last, values[1]);
assertEquals(born, values[2]);
----
This extractor implementation has only one required property: the names of the fields to
map. Just as the `BeanWrapperFieldSetMapper` needs field names to map fields on the
`FieldSet` to setters on the provided object, the `BeanWrapperFieldExtractor` needs names
to map to getters for creating an object array. It is worth noting that the order of the
names determines the order of the fields within the array.
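To make the name-to-getter mapping concrete, the following sketch uses plain Java reflection. It is not how `BeanWrapperFieldExtractor` is implemented (that class relies on Spring's bean wrapper infrastructure); `GetterExtractionSketch` and its nested `Name` class are hypothetical stand-ins, and only simple `getXxx()` accessors are supported.

```java
import java.lang.reflect.Method;

class GetterExtractionSketch {

    // Hypothetical domain object, standing in for the Name class used above.
    public static class Name {
        private final String first;
        private final String last;
        private final int born;

        Name(String first, String last, int born) {
            this.first = first;
            this.last = last;
            this.born = born;
        }

        public String getFirst() { return first; }
        public String getLast() { return last; }
        public int getBorn() { return born; }
    }

    // Resolves each name to a getXxx() method; the output order follows the order of 'names'.
    static Object[] extract(Object item, String... names) {
        Object[] values = new Object[names.length];
        for (int i = 0; i < names.length; i++) {
            String getter = "get" + Character.toUpperCase(names[i].charAt(0)) + names[i].substring(1);
            try {
                Method method = item.getClass().getMethod(getter);
                values[i] = method.invoke(item);
            }
            catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("No readable property: " + names[i], e);
            }
        }
        return values;
    }
}
```

Passing the names in a different order, for example `"last", "first", "born"`, changes the order of the resulting array accordingly.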
[[delimitedFileWritingExample]]
== Delimited File Writing Example
The most basic flat file format is one in which all fields are separated by a delimiter.
This can be accomplished using a `DelimitedLineAggregator`. The following example writes
out a simple domain object that represents a credit to a customer account:
[source, java]
----
public class CustomerCredit {
    private int id;
    private String name;
    private BigDecimal credit;
    //getters and setters removed for clarity
}
----
Because a domain object is being used, an implementation of the `FieldExtractor`
interface must be provided, along with the delimiter to use.
[tabs]
====
Java::
+
The following example shows how to use the `FieldExtractor` with a delimiter in Java:
+
.Java Configuration
[source, java]
----
@Bean
public FlatFileItemWriter<CustomerCredit> itemWriter(Resource outputResource) throws Exception {
    BeanWrapperFieldExtractor<CustomerCredit> fieldExtractor = new BeanWrapperFieldExtractor<>();
    fieldExtractor.setNames(new String[] {"name", "credit"});
    fieldExtractor.afterPropertiesSet();
    DelimitedLineAggregator<CustomerCredit> lineAggregator = new DelimitedLineAggregator<>();
    lineAggregator.setDelimiter(",");
    lineAggregator.setFieldExtractor(fieldExtractor);
    return new FlatFileItemWriterBuilder<CustomerCredit>()
            .name("customerCreditWriter")
            .resource(outputResource)
            .lineAggregator(lineAggregator)
            .build();
}
----
XML::
+
The following example shows how to use the `FieldExtractor` with a delimiter in XML:
+
.XML Configuration
[source, xml]
----
<bean id="itemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <property name="resource" ref="outputResource" />
    <property name="lineAggregator">
        <bean class="org.spr...DelimitedLineAggregator">
            <property name="delimiter" value=","/>
            <property name="fieldExtractor">
                <bean class="org.spr...BeanWrapperFieldExtractor">
                    <property name="names" value="name,credit"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
----
====
In the previous example, the `BeanWrapperFieldExtractor` described earlier in this
chapter is used to turn the name and credit fields within `CustomerCredit` into an object
array, which is then written out with commas between each field.
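The final aggregation step, joining the extracted values with a delimiter, can be sketched in plain Java as shown below. `DelimitedAggregationSketch` is a hypothetical helper and not the `DelimitedLineAggregator` source. It assumes that each field is rendered with `String.valueOf`.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class DelimitedAggregationSketch {

    // Joins the string form of each extracted field with the given delimiter.
    static String aggregate(Object[] fields, String delimiter) {
        return Arrays.stream(fields)
                .map(field -> String.valueOf(field))
                .collect(Collectors.joining(delimiter));
    }
}
```

For a customer named "Alan" with a credit of `42.50`, extracting `name` and `credit` and aggregating with a comma would produce the line `Alan,42.50` in this sketch.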
[tabs]
====
Java::
+
// FIXME: in the existing docs this is displayed for XML too but there is no config below it
It is also possible to use the `FlatFileItemWriterBuilder.DelimitedBuilder` to
automatically create the `BeanWrapperFieldExtractor` and `DelimitedLineAggregator`
as shown in the following example:
+
.Java Configuration
[source, java]
----
@Bean
public FlatFileItemWriter<CustomerCredit> itemWriter(Resource outputResource) throws Exception {
    return new FlatFileItemWriterBuilder<CustomerCredit>()
            .name("customerCreditWriter")
            .resource(outputResource)
            .delimited()
            .delimiter("|")
            .names(new String[] {"name", "credit"})
            .build();
}
----
XML::
+
// FIXME: what is the XML config
+
There is no XML equivalent of using `FlatFileItemWriterBuilder`.
====
[[fixedWidthFileWritingExample]]
== Fixed Width File Writing Example
Delimited is not the only type of flat file format. Many prefer to use a set width for
each column to delineate between fields, which is usually referred to as 'fixed width'.
Spring Batch supports this in file writing with the `FormatterLineAggregator`.
[tabs]
====
Java::
+
Using the same `CustomerCredit` domain object described above, it can be configured as
follows in Java:
+
.Java Configuration
[source, java]
----
@Bean
public FlatFileItemWriter<CustomerCredit> itemWriter(Resource outputResource) throws Exception {
    BeanWrapperFieldExtractor<CustomerCredit> fieldExtractor = new BeanWrapperFieldExtractor<>();
    fieldExtractor.setNames(new String[] {"name", "credit"});
    fieldExtractor.afterPropertiesSet();
    FormatterLineAggregator<CustomerCredit> lineAggregator = new FormatterLineAggregator<>();
    lineAggregator.setFormat("%-9s%-2.0f");
    lineAggregator.setFieldExtractor(fieldExtractor);
    return new FlatFileItemWriterBuilder<CustomerCredit>()
            .name("customerCreditWriter")
            .resource(outputResource)
            .lineAggregator(lineAggregator)
            .build();
}
----
XML::
+
Using the same `CustomerCredit` domain object described above, it can be configured as
follows in XML:
+
.XML Configuration
[source, xml]
----
<bean id="itemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <property name="resource" ref="outputResource" />
    <property name="lineAggregator">
        <bean class="org.spr...FormatterLineAggregator">
            <property name="fieldExtractor">
                <bean class="org.spr...BeanWrapperFieldExtractor">
                    <property name="names" value="name,credit" />
                </bean>
            </property>
            <property name="format" value="%-9s%-2.0f" />
        </bean>
    </property>
</bean>
----
====
Most of the preceding example should look familiar. However, the value of the format
property is new.
[tabs]
====
Java::
+
The following example shows the format property in Java:
+
[source, java]
----
...
FormatterLineAggregator<CustomerCredit> lineAggregator = new FormatterLineAggregator<>();
lineAggregator.setFormat("%-9s%-2.0f");
...
----
XML::
+
The following example shows the format property in XML:
+
[source, xml]
----
<property name="format" value="%-9s%-2.0f" />
----
====
The underlying implementation is built using the same
`Formatter` added as part of Java 5. The Java
`Formatter` is based on the
`printf` functionality of the C programming
language. Most details on how to configure a formatter can be found in
the Javadoc of link:$$https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html$$[Formatter].
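To see what the format string used above produces, the following sketch applies it directly with `String.format`. `FormatSketch` and `formatLine` are hypothetical names, and `Locale.ROOT` is an assumption added here so that the result does not depend on the default locale. The sketch bypasses `FormatterLineAggregator` entirely.

```java
import java.math.BigDecimal;
import java.util.Locale;

class FormatSketch {

    // %-9s   : the name, left-justified and padded to at least 9 characters
    // %-2.0f : the credit with no decimal places, left-justified, at least 2 characters wide
    static String formatLine(String name, BigDecimal credit) {
        return String.format(Locale.ROOT, "%-9s%-2.0f", name, credit);
    }
}
```

Note that the widths in a format string are minimum widths. A name longer than nine characters is not truncated by `%-9s`, so a truly fixed-width layout needs a precision as well (for example `%-9.9s`).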
[tabs]
====
Java::
+
It is also possible to use the `FlatFileItemWriterBuilder.FormattedBuilder` to
automatically create the `BeanWrapperFieldExtractor` and `FormatterLineAggregator`
as shown in the following example:
+
.Java Configuration
[source, java]
----
@Bean
public FlatFileItemWriter<CustomerCredit> itemWriter(Resource outputResource) throws Exception {
    return new FlatFileItemWriterBuilder<CustomerCredit>()
            .name("customerCreditWriter")
            .resource(outputResource)
            .formatted()
            .format("%-9s%-2.0f")
            .names(new String[] {"name", "credit"})
            .build();
}
----
XML::
+
// FIXME: What is the XML equivalent
====
[[handlingFileCreation]]
== Handling File Creation
`FlatFileItemReader` has a very simple relationship with file resources. When the reader
is initialized, it opens the file (if it exists), and throws an exception if it does not.
File writing isn't quite so simple. At first glance, it seems like a similar
straightforward contract should exist for `FlatFileItemWriter`: If the file already
exists, throw an exception, and, if it does not, create it and start writing. However,
potentially restarting a `Job` can cause issues. In normal restart scenarios, the
contract is reversed: If the file exists, start writing to it from the last known good
position, and, if it does not, throw an exception. However, what happens if the file name
for this job is always the same? In this case, you would want to delete the file if it
exists, unless it's a restart. Because of this possibility, the `FlatFileItemWriter`
contains the property, `shouldDeleteIfExists`. Setting this property to true causes an
existing file with the same name to be deleted when the writer is opened.
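The contract described in this section can be summarized as a small decision function. The sketch below only paraphrases the preceding paragraph and is not the logic of `FlatFileItemWriter` itself. The `FileCreationSketch` class, its `Action` enum, and the `onOpen` method are hypothetical.

```java
class FileCreationSketch {

    enum Action { CREATE, DELETE_AND_CREATE, APPEND_FROM_LAST_POSITION, FAIL }

    // Decides what to do with the output file when the writer is opened.
    static Action onOpen(boolean fileExists, boolean restart, boolean shouldDeleteIfExists) {
        if (restart) {
            // On restart the file must already exist so that writing can resume.
            return fileExists ? Action.APPEND_FROM_LAST_POSITION : Action.FAIL;
        }
        if (!fileExists) {
            return Action.CREATE;
        }
        // First run with an existing file: delete it only if explicitly allowed.
        return shouldDeleteIfExists ? Action.DELETE_AND_CREATE : Action.FAIL;
    }
}
```

The third flag only matters on a fresh run against an existing file, which is exactly the "same file name every time" scenario described above.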


@@ -0,0 +1,313 @@
[[itemReaderAndWriterImplementations]]
= Item Reader and Writer Implementations
In this section, we will introduce you to readers and writers that have not already been
discussed in the previous sections.
[[decorators]]
== Decorators
In some cases, a user needs specialized behavior to be appended to a pre-existing
`ItemReader`. Spring Batch offers some out-of-the-box decorators that can add
additional behavior to your `ItemReader` and `ItemWriter` implementations.
Spring Batch includes the following decorators:
* xref:readers-and-writers/item-reader-writer-implementations.adoc#synchronizedItemStreamReader[`SynchronizedItemStreamReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#singleItemPeekableItemReader[`SingleItemPeekableItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#synchronizedItemStreamWriter[`SynchronizedItemStreamWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#multiResourceItemWriter[`MultiResourceItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#classifierCompositeItemWriter[`ClassifierCompositeItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#classifierCompositeItemProcessor[`ClassifierCompositeItemProcessor`]
[[synchronizedItemStreamReader]]
=== `SynchronizedItemStreamReader`
When using an `ItemReader` that is not thread safe, Spring Batch offers the
`SynchronizedItemStreamReader` decorator, which can be used to make the `ItemReader`
thread safe. Spring Batch provides a `SynchronizedItemStreamReaderBuilder` to construct
an instance of the `SynchronizedItemStreamReader`.
For example, the `FlatFileItemReader` is *not* thread-safe and cannot be used in
a multi-threaded step. This reader can be decorated with a `SynchronizedItemStreamReader`
in order to use it safely in a multi-threaded step. Here is an example of how to decorate
such a reader:
[source, java]
----
@Bean
public SynchronizedItemStreamReader<Person> itemReader() {
    FlatFileItemReader<Person> flatFileItemReader = new FlatFileItemReaderBuilder<Person>()
            // set reader properties
            .build();
    return new SynchronizedItemStreamReaderBuilder<Person>()
            .delegate(flatFileItemReader)
            .build();
}
----
[[singleItemPeekableItemReader]]
=== `SingleItemPeekableItemReader`
Spring Batch includes a decorator that adds a peek method to an `ItemReader`. This peek
method lets the user peek one item ahead. Repeated calls to `peek` return the same
item, and this is the next item returned from the `read` method. Spring Batch provides a
`SingleItemPeekableItemReaderBuilder` to construct an instance of the
`SingleItemPeekableItemReader`.
NOTE: The `peek` method of `SingleItemPeekableItemReader` is not thread-safe, because it would not
be possible to honor the peek in multiple threads. Only one of the threads that peeked
would get that item in the next call to read.
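The peek contract can be sketched in a few lines of plain Java. `PeekableReaderSketch` is a hypothetical class that wraps a `java.util.Iterator` instead of an `ItemReader` and returns `null` at the end of input. It is meant only to show the single-item buffer, not the actual `SingleItemPeekableItemReader` code, and, like the real decorator, it is not thread-safe.

```java
import java.util.Iterator;

class PeekableReaderSketch<T> {

    private final Iterator<T> delegate;
    private T next;          // item buffered by peek(), if any
    private boolean peeked;  // true while 'next' holds a buffered item

    PeekableReaderSketch(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    // Returns the upcoming item without consuming it, or null at end of input.
    T peek() {
        if (!peeked) {
            next = delegate.hasNext() ? delegate.next() : null;
            peeked = true;
        }
        return next;
    }

    // Returns the buffered item if there is one, otherwise reads from the delegate.
    T read() {
        if (peeked) {
            peeked = false;
            T item = next;
            next = null;
            return item;
        }
        return delegate.hasNext() ? delegate.next() : null;
    }
}
```

Because the buffer holds a single item, two threads that both call `peek` would see the same item, but only one of them would receive it from the following `read`, which is the problem the note above describes.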
[[synchronizedItemStreamWriter]]
=== `SynchronizedItemStreamWriter`
When using an `ItemWriter` that is not thread safe, Spring Batch offers the
`SynchronizedItemStreamWriter` decorator, which can be used to make the `ItemWriter`
thread safe. Spring Batch provides a `SynchronizedItemStreamWriterBuilder` to construct
an instance of the `SynchronizedItemStreamWriter`.
For example, the `FlatFileItemWriter` is *not* thread-safe and cannot be used in
a multi-threaded step. This writer can be decorated with a `SynchronizedItemStreamWriter`
in order to use it safely in a multi-threaded step. Here is an example of how to decorate
such a writer:
[source, java]
----
@Bean
public SynchronizedItemStreamWriter<Person> itemWriter() {
    FlatFileItemWriter<Person> flatFileItemWriter = new FlatFileItemWriterBuilder<Person>()
            // set writer properties
            .build();
    return new SynchronizedItemStreamWriterBuilder<Person>()
            .delegate(flatFileItemWriter)
            .build();
}
----
[[multiResourceItemWriter]]
=== `MultiResourceItemWriter`
The `MultiResourceItemWriter` wraps a `ResourceAwareItemWriterItemStream` and creates a new
output resource when the count of items written in the current resource exceeds the
`itemCountLimitPerResource`. Spring Batch provides a `MultiResourceItemWriterBuilder` to
construct an instance of the `MultiResourceItemWriter`.
[[classifierCompositeItemWriter]]
=== `ClassifierCompositeItemWriter`
The `ClassifierCompositeItemWriter` calls one of a collection of `ItemWriter`
implementations for each item, based on a router pattern implemented through the provided
`Classifier`. The implementation is thread-safe if all delegates are thread-safe. Spring
Batch provides a `ClassifierCompositeItemWriterBuilder` to construct an instance of the
`ClassifierCompositeItemWriter`.
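The router pattern described above can be sketched without Spring Batch as follows. `RoutingWriterSketch` is hypothetical: it uses `java.util.function.Function` in place of the `Classifier` and `java.util.function.Consumer` in place of the delegate `ItemWriter`, purely as stand-ins for illustration.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class RoutingWriterSketch<T> {

    // Maps each item to the delegate that should handle it.
    private final Function<T, Consumer<T>> classifier;

    RoutingWriterSketch(Function<T, Consumer<T>> classifier) {
        this.classifier = classifier;
    }

    // Routes every item to exactly one delegate chosen by the classifier.
    void write(List<T> items) {
        for (T item : items) {
            classifier.apply(item).accept(item);
        }
    }
}
```

The sketch keeps no state of its own beyond the classifier, which mirrors the statement above that thread safety depends on the delegates.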
[[classifierCompositeItemProcessor]]
=== `ClassifierCompositeItemProcessor`
The `ClassifierCompositeItemProcessor` is an `ItemProcessor` that calls one of a
collection of `ItemProcessor` implementations, based on a router pattern implemented
through the provided `Classifier`. Spring Batch provides a
`ClassifierCompositeItemProcessorBuilder` to construct an instance of the
`ClassifierCompositeItemProcessor`.
[[messagingReadersAndWriters]]
== Messaging Readers And Writers
Spring Batch offers the following readers and writers for commonly used messaging systems:
* xref:readers-and-writers/item-reader-writer-implementations.adoc#amqpItemReader[`AmqpItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#amqpItemWriter[`AmqpItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#jmsItemReader[`JmsItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#jmsItemWriter[`JmsItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#kafkaItemReader[`KafkaItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#kafkaItemWriter[`KafkaItemWriter`]
[[amqpItemReader]]
=== `AmqpItemReader`
The `AmqpItemReader` is an `ItemReader` that uses an `AmqpTemplate` to receive or convert
messages from an exchange. Spring Batch provides an `AmqpItemReaderBuilder` to construct
an instance of the `AmqpItemReader`.
[[amqpItemWriter]]
=== `AmqpItemWriter`
The `AmqpItemWriter` is an `ItemWriter` that uses an `AmqpTemplate` to send messages to
an AMQP exchange. Messages are sent to the nameless exchange if the name is not specified in
the provided `AmqpTemplate`. Spring Batch provides an `AmqpItemWriterBuilder` to
construct an instance of the `AmqpItemWriter`.
[[jmsItemReader]]
=== `JmsItemReader`
The `JmsItemReader` is an `ItemReader` for JMS that uses a `JmsTemplate`. The template
should have a default destination, which is used to provide items for the `read()`
method. Spring Batch provides a `JmsItemReaderBuilder` to construct an instance of the
`JmsItemReader`.
[[jmsItemWriter]]
=== `JmsItemWriter`
The `JmsItemWriter` is an `ItemWriter` for JMS that uses a `JmsTemplate`. The template
should have a default destination, which is used to send items in `write(List)`. Spring
Batch provides a `JmsItemWriterBuilder` to construct an instance of the `JmsItemWriter`.
[[kafkaItemReader]]
=== `KafkaItemReader`
The `KafkaItemReader` is an `ItemReader` for an Apache Kafka topic. It can be configured
to read messages from multiple partitions of the same topic. It stores message offsets
in the execution context to support restart capabilities. Spring Batch provides a
`KafkaItemReaderBuilder` to construct an instance of the `KafkaItemReader`.
[[kafkaItemWriter]]
=== `KafkaItemWriter`
The `KafkaItemWriter` is an `ItemWriter` for Apache Kafka that uses a `KafkaTemplate` to
send events to a default topic. Spring Batch provides a `KafkaItemWriterBuilder` to
construct an instance of the `KafkaItemWriter`.
[[databaseReaders]]
== Database Readers
Spring Batch offers the following database readers:
* xref:readers-and-writers/item-reader-writer-implementations.adoc#Neo4jItemReader[`Neo4jItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#mongoItemReader[`MongoItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#hibernateCursorItemReader[`HibernateCursorItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#hibernatePagingItemReader[`HibernatePagingItemReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#repositoryItemReader[`RepositoryItemReader`]
[[Neo4jItemReader]]
=== `Neo4jItemReader`
The `Neo4jItemReader` is an `ItemReader` that reads objects from the graph database Neo4j
by using a paging technique. Spring Batch provides a `Neo4jItemReaderBuilder` to
construct an instance of the `Neo4jItemReader`.
[[mongoItemReader]]
=== `MongoItemReader`
The `MongoItemReader` is an `ItemReader` that reads documents from MongoDB by using a
paging technique. Spring Batch provides a `MongoItemReaderBuilder` to construct an
instance of the `MongoItemReader`.
[[hibernateCursorItemReader]]
=== `HibernateCursorItemReader`
The `HibernateCursorItemReader` is an `ItemStreamReader` for reading database records
built on top of Hibernate. It executes the HQL query and then, when initialized, iterates
over the result set as the `read()` method is called, successively returning an object
corresponding to the current row. Spring Batch provides a
`HibernateCursorItemReaderBuilder` to construct an instance of the
`HibernateCursorItemReader`.
[[hibernatePagingItemReader]]
=== `HibernatePagingItemReader`
The `HibernatePagingItemReader` is an `ItemReader` for reading database records built on
top of Hibernate and reading only up to a fixed number of items at a time. Spring Batch
provides a `HibernatePagingItemReaderBuilder` to construct an instance of the
`HibernatePagingItemReader`.
[[repositoryItemReader]]
=== `RepositoryItemReader`
The `RepositoryItemReader` is an `ItemReader` that reads records by using a
`PagingAndSortingRepository`. Spring Batch provides a `RepositoryItemReaderBuilder` to
construct an instance of the `RepositoryItemReader`.
[[databaseWriters]]
== Database Writers
Spring Batch offers the following database writers:
* xref:readers-and-writers/item-reader-writer-implementations.adoc#neo4jItemWriter[`Neo4jItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#mongoItemWriter[`MongoItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#repositoryItemWriter[`RepositoryItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#hibernateItemWriter[`HibernateItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#jdbcBatchItemWriter[`JdbcBatchItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#jpaItemWriter[`JpaItemWriter`]
[[neo4jItemWriter]]
=== `Neo4jItemWriter`
The `Neo4jItemWriter` is an `ItemWriter` implementation that writes to a Neo4j database.
Spring Batch provides a `Neo4jItemWriterBuilder` to construct an instance of the
`Neo4jItemWriter`.
[[mongoItemWriter]]
=== `MongoItemWriter`
The `MongoItemWriter` is an `ItemWriter` implementation that writes to a MongoDB store
using an implementation of Spring Data's `MongoOperations`. Spring Batch provides a
`MongoItemWriterBuilder` to construct an instance of the `MongoItemWriter`.
[[repositoryItemWriter]]
=== `RepositoryItemWriter`
The `RepositoryItemWriter` is an `ItemWriter` wrapper for a `CrudRepository` from Spring
Data. Spring Batch provides a `RepositoryItemWriterBuilder` to construct an instance of
the `RepositoryItemWriter`.
[[hibernateItemWriter]]
=== `HibernateItemWriter`
The `HibernateItemWriter` is an `ItemWriter` that uses a Hibernate session to save or
update entities that are not part of the current Hibernate session. Spring Batch provides
a `HibernateItemWriterBuilder` to construct an instance of the `HibernateItemWriter`.
[[jdbcBatchItemWriter]]
=== `JdbcBatchItemWriter`
The `JdbcBatchItemWriter` is an `ItemWriter` that uses the batching features from
`NamedParameterJdbcTemplate` to execute a batch of statements for all items provided.
Spring Batch provides a `JdbcBatchItemWriterBuilder` to construct an instance of the
`JdbcBatchItemWriter`.
[[jpaItemWriter]]
=== `JpaItemWriter`
The `JpaItemWriter` is an `ItemWriter` that uses a JPA `EntityManagerFactory` to merge
any entities that are not part of the persistence context. Spring Batch provides a
`JpaItemWriterBuilder` to construct an instance of the `JpaItemWriter`.
[[specializedReaders]]
== Specialized Readers
Spring Batch offers the following specialized readers:
* xref:readers-and-writers/item-reader-writer-implementations.adoc#ldifReader[`LdifReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#mappingLdifReader[`MappingLdifReader`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#avroItemReader[`AvroItemReader`]
[[ldifReader]]
=== `LdifReader`
The `LdifReader` reads LDIF (LDAP Data Interchange Format) records from a `Resource`,
parses them, and returns an `LdapAttributes` object for each `read` executed. Spring Batch
provides a `LdifReaderBuilder` to construct an instance of the `LdifReader`.
[[mappingLdifReader]]
=== `MappingLdifReader`
The `MappingLdifReader` reads LDIF (LDAP Data Interchange Format) records from a
`Resource`, parses them, and then maps each LDIF record to a POJO (Plain Old Java Object).
Each read returns a POJO. Spring Batch provides a `MappingLdifReaderBuilder` to construct
an instance of the `MappingLdifReader`.
[[avroItemReader]]
=== `AvroItemReader`
The `AvroItemReader` reads serialized Avro data from a `Resource`.
Each read returns an instance of the type specified by a Java class or Avro schema.
The reader can optionally be configured for input that does or does not embed an Avro schema.
Spring Batch provides an `AvroItemReaderBuilder` to construct an instance of the `AvroItemReader`.
[[specializedWriters]]
== Specialized Writers
Spring Batch offers the following specialized writers:
* xref:readers-and-writers/item-reader-writer-implementations.adoc#simpleMailMessageItemWriter[`SimpleMailMessageItemWriter`]
* xref:readers-and-writers/item-reader-writer-implementations.adoc#avroItemWriter[`AvroItemWriter`]
[[simpleMailMessageItemWriter]]
=== `SimpleMailMessageItemWriter`
The `SimpleMailMessageItemWriter` is an `ItemWriter` that can send mail messages. It
delegates the actual sending of messages to an instance of `MailSender`. Spring Batch
provides a `SimpleMailMessageItemWriterBuilder` to construct an instance of the
`SimpleMailMessageItemWriter`.
[[avroItemWriter]]
=== `AvroItemWriter`
The `AvroItemWriter` serializes Java objects to a `WritableResource` according to the given type or schema.
The writer can optionally be configured to embed an Avro schema in the output.
Spring Batch provides an `AvroItemWriterBuilder` to construct an instance of the `AvroItemWriter`.
[[specializedProcessors]]
== Specialized Processors
Spring Batch offers the following specialized processors:
* xref:readers-and-writers/item-reader-writer-implementations.adoc#scriptItemProcessor[`ScriptItemProcessor`]
[[scriptItemProcessor]]
=== `ScriptItemProcessor`
The `ScriptItemProcessor` is an `ItemProcessor` that passes the current item to the
provided script and returns the result of the script. Spring
Batch provides a `ScriptItemProcessorBuilder` to construct an instance of the
`ScriptItemProcessor`.

@@ -0,0 +1,48 @@
[[itemReader]]
= `ItemReader`
Although a simple concept, an `ItemReader` is the means for providing data from many
different types of input. The most general examples include:
* Flat File: Flat-file item readers read lines of data from a flat file that typically
describes records with fields of data defined by fixed positions in the file or delimited
by some special character (such as a comma).
* XML: XML `ItemReaders` process XML independently of technologies used for parsing,
mapping and validating objects. Input data allows for the validation of an XML file
against an XSD schema.
* Database: A database resource is accessed to return result sets, which can be mapped to
objects for processing. The default SQL `ItemReader` implementations invoke a `RowMapper`
to return objects, keep track of the current row if restart is required, store basic
statistics, and provide some transaction enhancements that are explained later.
There are many more possibilities, but we focus on the basic ones for this chapter. A
complete list of all available `ItemReader` implementations can be found in
xref:appendix.adoc#listOfReadersAndWriters[Appendix A].
`ItemReader` is a basic interface for generic
input operations, as shown in the following interface definition:
[source, java]
----
public interface ItemReader<T> {
T read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException;
}
----
The `read` method defines the most essential contract of the `ItemReader`. Calling it
returns one item or `null` if no more items are left. An item might represent a line in a
file, a row in a database, or an element in an XML file. It is generally expected that
these are mapped to a usable domain object (such as `Trade`, `Foo`, or others), but there
is no requirement in the contract to do so.
It is expected that implementations of the `ItemReader` interface are forward only.
However, if the underlying resource is transactional (such as a JMS queue) then calling
`read` may return the same logical item on subsequent calls in a rollback scenario. It is
also worth noting that a lack of items to process by an `ItemReader` does not cause an
exception to be thrown. For example, a database `ItemReader` that is configured with a
query that returns 0 results returns `null` on the first invocation of `read`.
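
The contract described above can be sketched with a minimal in-memory reader. This is an illustration under stated assumptions, not a Spring Batch class: `SimpleItemReader` is a local stand-in for `ItemReader` (declared only so that the example compiles without the library), and `ListBackedReader` is a hypothetical name. The sketch shows the forward-only behavior and the `null` return that signals the end of input.

```java
import java.util.Iterator;
import java.util.List;

// Local stand-in for Spring Batch's ItemReader (assumption: declared here
// only so the sketch compiles without the library on the classpath).
interface SimpleItemReader<T> {
    T read() throws Exception;
}

// Hypothetical forward-only reader over an in-memory list.
class ListBackedReader<T> implements SimpleItemReader<T> {
    private final Iterator<T> iterator;

    ListBackedReader(List<T> items) {
        this.iterator = items.iterator();
    }

    @Override
    public T read() {
        // Returning null (rather than throwing) signals that input is exhausted.
        return iterator.hasNext() ? iterator.next() : null;
    }
}

class ListBackedReaderDemo {
    public static void main(String[] args) {
        ListBackedReader<String> reader = new ListBackedReader<>(List.of("a", "b"));
        String item;
        // The caller loops until read() returns null.
        while ((item = reader.read()) != null) {
            System.out.println("read: " + item);
        }
        // With no input at all, the very first call already returns null.
        System.out.println(new ListBackedReader<String>(List.of()).read());
    }
}
```

A caller therefore needs no special handling for empty input: the same loop terminates immediately.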

@@ -0,0 +1,38 @@
[[itemStream]]
= `ItemStream`
Both `ItemReaders` and `ItemWriters` serve their individual purposes well, but there is a
common concern among both of them that necessitates another interface. In general, as
part of the scope of a batch job, readers and writers need to be opened, closed, and
require a mechanism for persisting state. The `ItemStream` interface serves that purpose,
as shown in the following example:
[source, java]
----
public interface ItemStream {
void open(ExecutionContext executionContext) throws ItemStreamException;
void update(ExecutionContext executionContext) throws ItemStreamException;
void close() throws ItemStreamException;
}
----
Before describing each method, we should mention the `ExecutionContext`. Clients of an
`ItemReader` that also implements `ItemStream` should call `open` before any calls to
`read`, in order to open any resources such as files or to obtain connections. A similar
restriction applies to an `ItemWriter` that implements `ItemStream`. As mentioned in
Chapter 2, if expected data is found in the `ExecutionContext`, it may be used to start
the `ItemReader` or `ItemWriter` at a location other than its initial state. Conversely,
`close` is called to ensure that any resources allocated during open are released safely.
`update` is called primarily to ensure that any state currently being held is stored in
the provided `ExecutionContext`. This method is called before committing, so that the
current state is persisted in the database.
In the special case where the client of an `ItemStream` is a `Step` (from the Spring
Batch Core), an `ExecutionContext` is created for each `StepExecution` to allow users to
store the state of a particular execution, with the expectation that it is returned if
the same `JobInstance` is started again. For those familiar with Quartz, the semantics
are very similar to a Quartz `JobDataMap`.
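
The `open`, `update`, and `close` lifecycle described above can be sketched as follows. This is a minimal illustration under stated assumptions: a plain `Map` stands in for `ExecutionContext` so that the example has no dependencies, and `RestartableListReader` and the key `reader.position` are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical restartable reader. A plain Map stands in for Spring Batch's
// ExecutionContext (assumption made so the sketch has no dependencies).
class RestartableListReader {
    private static final String POSITION_KEY = "reader.position"; // illustrative key

    private final List<String> items;
    private int position;

    RestartableListReader(List<String> items) {
        this.items = items;
    }

    // open: resume from saved state if present, otherwise start at the beginning.
    void open(Map<String, Object> executionContext) {
        Object saved = executionContext.get(POSITION_KEY);
        position = (saved instanceof Integer) ? (Integer) saved : 0;
    }

    String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    // update: copy the current state into the context before a commit.
    void update(Map<String, Object> executionContext) {
        executionContext.put(POSITION_KEY, position);
    }

    // close: release resources; an in-memory list has nothing to release.
    void close() {
    }
}

class RestartableListReaderDemo {
    public static void main(String[] args) {
        List<String> data = List.of("a", "b", "c");
        Map<String, Object> context = new HashMap<>();

        RestartableListReader first = new RestartableListReader(data);
        first.open(context);
        first.read();          // consumes the first item
        first.update(context); // the position after the first item is saved
        first.close();

        // A new instance opened with the same context resumes where the first stopped.
        RestartableListReader second = new RestartableListReader(data);
        second.open(context);
        System.out.println(second.read());
        second.close();
    }
}
```

Keeping the position in the context rather than only in a field is what allows a later execution to continue from the last committed point.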

@@ -0,0 +1,30 @@
[[itemWriter]]
= `ItemWriter`
`ItemWriter` is similar in functionality to an `ItemReader` but with inverse operations.
Resources still need to be located, opened, and closed but they differ in that an
`ItemWriter` writes out, rather than reading in. In the case of databases or queues,
these operations may be inserts, updates, or sends. The format of the serialization of
the output is specific to each batch job.
As with `ItemReader`,
`ItemWriter` is a fairly generic interface, as shown in the following interface definition:
[source, java]
----
public interface ItemWriter<T> {
void write(Chunk<? extends T> items) throws Exception;
}
----
As with `read` on `ItemReader`, `write` provides the basic contract of `ItemWriter`. It
attempts to write out the chunk of items passed in as long as it is open. Because it is
generally expected that items are 'batched' together into a chunk and then output, the
interface accepts a chunk of items, rather than an item by itself. After writing out the
items, any flushing that may be necessary can be performed before returning from the `write`
method. For example, if writing to a Hibernate DAO, multiple calls to write can be made,
one for each item. The writer can then call `flush` on the Hibernate session before
returning.
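
The write-then-flush pattern described above can be sketched with an in-memory writer. This is an illustration under stated assumptions rather than library code: `SimpleItemWriter` is a local stand-in for `ItemWriter`, a `List` stands in for `Chunk`, and `BufferingWriter` is a hypothetical name. Each item is handled individually, and a single flush happens before `write` returns.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for Spring Batch's ItemWriter; a List stands in for Chunk
// (assumptions made so the sketch compiles without the library).
interface SimpleItemWriter<T> {
    void write(List<? extends T> items) throws Exception;
}

// Hypothetical writer that buffers each item, then flushes once per chunk.
class BufferingWriter<T> implements SimpleItemWriter<T> {
    private final List<T> buffer = new ArrayList<>();
    private final List<T> flushed = new ArrayList<>();
    private int flushCount;

    @Override
    public void write(List<? extends T> items) {
        for (T item : items) {
            buffer.add(item); // analogous to one DAO call per item
        }
        flush();              // a single flush before returning
    }

    private void flush() {
        flushed.addAll(buffer);
        buffer.clear();
        flushCount++;
    }

    List<T> flushedItems() {
        return flushed;
    }

    int flushCount() {
        return flushCount;
    }
}

class BufferingWriterDemo {
    public static void main(String[] args) {
        BufferingWriter<String> writer = new BufferingWriter<>();
        writer.write(List.of("a", "b", "c"));
        writer.write(List.of("d"));
        System.out.println(writer.flushedItems() + " in " + writer.flushCount() + " flushes");
    }
}
```

Flushing once per chunk rather than once per item keeps the number of round trips proportional to the number of chunks.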

Some files were not shown because too many files have changed in this diff.