903 lines
37 KiB
XML
903 lines
37 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
|
|
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
|
|
<chapter id="springBatchIntegration" xreflabel="Spring Batch Integration">
|
|
<title>Spring Batch Integration</title>
|
|
|
|
<sect1 id="spring-batch-integration-introduction">
|
|
<title>Spring Batch Integration Introduction</title>
|
|
<para>
|
|
Many users of Spring Batch may encounter requirements that are
|
|
outside the scope of Spring Batch, yet may be efficiently and
|
|
concisely implemented using Spring Integration. Conversely, Spring
|
|
Batch users may encounter Spring Batch requirements and need a way
|
|
to efficiently integrate both frameworks. In this context several
|
|
patterns and use-cases emerge and Spring Batch Integration will
|
|
address those requirements.
|
|
</para>
|
|
<para>
|
|
The line between Spring Batch and Spring Integration is not always
|
|
clear, but there are guidelines that one can follow. Principally,
|
|
these are: think about granularity, and apply common patterns. Some
|
|
of those common patterns are described in this reference manual
|
|
section.
|
|
</para>
|
|
<para>
|
|
Adding messaging to a batch process enables automation of
|
|
operations, and also separation and strategizing of key concerns.
|
|
For example a message might trigger a job to execute, and then the
|
|
sending of the message can be exposed in a variety of ways. Or when
|
|
a job completes or fails that might trigger a message to be sent,
|
|
and the consumers of those messages might have operational concerns
|
|
that have nothing to do with the application itself. Messaging can
|
|
also be embedded in a job, for example reading or writing items for
|
|
processing via channels. Remote partitioning and remote chunking
|
|
provide methods to distribute workloads over an number of workers.
|
|
</para>
|
|
<para>
|
|
Some key concepts that we will cover are:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<link linkend="namespace-support">Namespace Support</link>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<link linkend="launching-batch-jobs-through-messages">Launching
|
|
Batch Jobs through Messages</link>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<link linkend="providing-feedback-with-informational-messages">Providing
|
|
Feedback with Informational Messages</link>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<link linkend="asynchronous-processors">Asynchronous
|
|
Processors</link>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<link linkend="externalizing-batch-process-execution">Externalizing
|
|
Batch Process Execution</link>
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<sect2 id="namespace-support">
|
|
<title>Namespace Support</title>
|
|
<para>
|
|
Since Spring Batch Integration 1.3, dedicated XML Namespace
|
|
support was added, with the aim to provide an easier configuration
|
|
experience. In order to activate the namespace, add the following
|
|
namespace declarations to your Spring XML Application Context
|
|
file:
|
|
</para>
|
|
<programlisting language="xml"><beans xmlns="http://www.springframework.org/schema/beans"
|
|
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
|
|
xmlns:batch-int="http://www.springframework.org/schema/batch-integration"
|
|
xsi:schemaLocation="
|
|
http://www.springframework.org/schema/batch-integration
|
|
http://www.springframework.org/schema/batch-integration/spring-batch-integration.xsd">
|
|
|
|
...
|
|
|
|
</beans></programlisting>
|
|
<para>
|
|
A fully configured Spring XML Application Context file for Spring
|
|
Batch Integration may look like the following:
|
|
</para>
|
|
<programlisting language="xml"><beans xmlns="http://www.springframework.org/schema/beans"
|
|
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
|
|
xmlns:int="http://www.springframework.org/schema/integration"
|
|
xmlns:batch="http://www.springframework.org/schema/batch"
|
|
xmlns:batch-int="http://www.springframework.org/schema/batch-integration"
|
|
xsi:schemaLocation="
|
|
http://www.springframework.org/schema/batch-integration
|
|
http://www.springframework.org/schema/batch-integration/spring-batch-integration.xsd
|
|
http://www.springframework.org/schema/batch
|
|
http://www.springframework.org/schema/batch/spring-batch.xsd
|
|
http://www.springframework.org/schema/beans
|
|
http://www.springframework.org/schema/beans/spring-beans.xsd
|
|
http://www.springframework.org/schema/integration
|
|
http://www.springframework.org/schema/integration/spring-integration.xsd">
|
|
|
|
...
|
|
|
|
</beans></programlisting>
|
|
<para>
|
|
Appending version numbers to the referenced XSD file is also
|
|
allowed but, as a version-less declaration will always use the
|
|
latest schema, we generally don't recommend appending the version
|
|
number to the XSD name. Adding a version number, for instance,
|
|
would create possibly issues when updating the Spring Batch
|
|
Integration dependencies as they may require more recent versions
|
|
of the XML schema.
|
|
</para>
|
|
</sect2>
|
|
<sect2 id="launching-batch-jobs-through-messages">
|
|
<title>Launching Batch Jobs through Messages</title>
|
|
<para>
|
|
When starting batch jobs using the core Spring Batch API you
|
|
basically have 2 options:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Command line via the <classname>CommandLineJobRunner</classname>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Programatically via either
|
|
<classname>JobOperator.start()</classname> or
|
|
<classname>JobLauncher.run()</classname>.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<para>
|
|
For example, you may want to use the
|
|
<classname>CommandLineJobRunner</classname> when invoking Batch Jobs
|
|
using a shell script. Alternatively, you may use the
|
|
<classname>JobOperator</classname> directly, for example when using
|
|
Spring Batch as part of a web application. However, what about
|
|
more complex use-cases? Maybe you need to poll a remote (S)FTP
|
|
server to retrieve the data for the Batch Job. Or your application
|
|
has to support multiple different data sources simultaneously. For
|
|
example, you may receive data files not only via the web, but also
|
|
FTP etc. Maybe additional transformation of the input files is
|
|
needed before invoking Spring Batch.
|
|
</para>
|
|
<para>
|
|
Therefore, it would be much more powerful to execute the batch job
|
|
using Spring Integration and its numerous adapters. For example,
|
|
you can use a <emphasis>File Inbound Channel Adapter</emphasis> to
|
|
monitor a directory in the file-system and start the Batch Job as
|
|
soon as the input file arrives. Additionally you can create Spring
|
|
Integration flows that use multiple different adapters to easily
|
|
ingest data for your Batch Jobs from multiple sources
|
|
simultaneously using configuration only. Implementing all these
|
|
scenarios with Spring Integration is easy as it allow for an
|
|
decoupled event-driven execution of the
|
|
<classname>JobLauncher</classname>.
|
|
</para>
|
|
<para>
|
|
Spring Batch Integration provides the
|
|
<classname>JobLaunchingMessageHandler</classname> class that you can
|
|
use to launch batch jobs. The input for the
|
|
<classname>JobLaunchingMessageHandler</classname> is provided by a
|
|
Spring Integration message, which payload is of type
|
|
<classname>JobLaunchRequest</classname>. This class is a wrapper around the Job
|
|
that needs to be launched as well as the <classname>JobParameters</classname>
|
|
necessary to launch the Batch job.
|
|
</para>
|
|
<para>
|
|
The following image illustrates the typical Spring Integration
|
|
message flow in order to start a Batch job. The
|
|
<ulink url="http://www.eaipatterns.com/toc.html">EIP (Enterprise IntegrationPatterns) website</ulink>
|
|
provides a full overview of messaging icons and their descriptions.
|
|
</para>
|
|
|
|
<mediaobject>
|
|
<imageobject role="html">
|
|
<imagedata align="center"
|
|
fileref="images/launch-batch-job.png"
|
|
format="PNG" scale="100"/>
|
|
</imageobject>
|
|
<imageobject role="fo">
|
|
<imagedata align="center"
|
|
fileref="images/launch-batch-job.png"
|
|
format="PNG" scale="60"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
|
|
<sect3 id="transforming-a-file-into-a-joblaunchrequest">
|
|
<title>Transforming a file into a JobLaunchRequest</title>
|
|
<programlisting language="java">package io.spring.sbi;
|
|
|
|
import org.springframework.batch.core.Job;
|
|
import org.springframework.batch.core.JobParametersBuilder;
|
|
import org.springframework.batch.integration.launch.JobLaunchRequest;
|
|
import org.springframework.integration.annotation.Transformer;
|
|
import org.springframework.messaging.Message;
|
|
|
|
import java.io.File;
|
|
|
|
public class FileMessageToJobRequest {
|
|
private Job job;
|
|
private String fileParameterName;
|
|
|
|
public void setFileParameterName(String fileParameterName) {
|
|
this.fileParameterName = fileParameterName;
|
|
}
|
|
|
|
public void setJob(Job job) {
|
|
this.job = job;
|
|
}
|
|
|
|
@Transformer
|
|
public JobLaunchRequest toRequest(Message<File> message) {
|
|
JobParametersBuilder jobParametersBuilder =
|
|
new JobParametersBuilder();
|
|
|
|
jobParametersBuilder.addString(fileParameterName,
|
|
message.getPayload().getAbsolutePath());
|
|
|
|
return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
|
|
}
|
|
}</programlisting>
|
|
</sect3>
|
|
<sect3 id="the-jobexecution-response">
|
|
<title>The JobExecution Response</title>
|
|
<para>
|
|
When a Batch Job is being executed, a
|
|
<classname>JobExecution</classname> instance is returned. This
|
|
instance can be used to determine the status of an execution. If
|
|
a <classname>JobExecution</classname> was able to be created
|
|
successfully, it will always be returned, regardless of whether
|
|
or not the actual execution was successful.
|
|
</para>
|
|
<para>
|
|
The exact behavior on how the <classname>JobExecution</classname>
|
|
instance is returned depends on the provided
|
|
<classname>TaskExecutor</classname>. If a
|
|
<classname role="strong">synchronous</classname> (single-threaded)
|
|
<classname>TaskExecutor</classname> implementation is used, the
|
|
<classname>JobExecution</classname> response is only returned
|
|
<classname>after</classname> the job completes. When using an
|
|
<classname role="strong">asynchronous</classname>
|
|
<classname>TaskExecutor</classname>, the
|
|
<classname>JobExecution</classname> instance is returned
|
|
immediately. Users can then take the <classname>id</classname> of
|
|
<classname>JobExecution</classname> instance
|
|
(<classname>JobExecution.getJobId()</classname>) and query the
|
|
<classname>JobRepository</classname> for the job's updated status
|
|
using the <classname>JobExplorer</classname>. For more
|
|
information, please refer to the <classname>Spring
|
|
Batch</classname> reference documentation on
|
|
<ulink url="http://docs.spring.io/spring-batch/reference/html/configureJob.html#queryingRepository">Querying
|
|
the Repository</ulink>.
|
|
</para>
|
|
<para>
|
|
The following configuration will create a file
|
|
<classname>inbound-channel-adapter</classname> to listen for CSV
|
|
files in the provided directory, hand them off to our
|
|
transformer (<classname>FileMessageToJobRequest</classname>),
|
|
launch the job via the <emphasis>Job Launching
|
|
Gateway</emphasis> then simply log the output of the
|
|
<classname>JobExecution</classname> via the
|
|
<classname>logging-channel-adapter</classname>.
|
|
</para>
|
|
</sect3>
|
|
<sect3 id="spring-batch-integration-configuration">
|
|
<title>Spring Batch Integration Configuration</title>
|
|
<programlisting language="xml"><int:channel id="inboundFileChannel"/>
|
|
<int:channel id="outboundJobRequestChannel"/>
|
|
<int:channel id="jobLaunchReplyChannel"/>
|
|
|
|
<int-file:inbound-channel-adapter id="filePoller"
|
|
channel="inboundFileChannel"
|
|
directory="file:/tmp/myfiles/"
|
|
filename-pattern="*.csv">
|
|
<int:poller fixed-rate="1000"/>
|
|
</int-file:inbound-channel-adapter>
|
|
|
|
<int:transformer input-channel="inboundFileChannel"
|
|
output-channel="outboundJobRequestChannel">
|
|
<bean class="io.spring.sbi.FileMessageToJobRequest">
|
|
<property name="job" ref="personJob"/>
|
|
<property name="fileParameterName" value="input.file.name"/>
|
|
</bean>
|
|
</int:transformer>
|
|
|
|
<batch-int:job-launching-gateway request-channel="outboundJobRequestChannel"
|
|
reply-channel="jobLaunchReplyChannel"/>
|
|
|
|
<int:logging-channel-adapter channel="jobLaunchReplyChannel"/></programlisting>
|
|
<para>
|
|
Now that we are polling for files and launching jobs, we need to
|
|
configure for example our Spring Batch
|
|
<classname>ItemReader</classname> to utilize found file
|
|
represented by the job parameter "input.file.name":
|
|
</para>
|
|
</sect3>
|
|
<sect3 id="example-itemreader-configuration">
|
|
<title>Example ItemReader Configuration</title>
|
|
<programlisting language="xml"><bean id="itemReader" class="org.springframework.batch.item.file.FlatFileItemReader"
|
|
scope="step">
|
|
<property name="resource" value="file://#{jobParameters['input.file.name']}"/>
|
|
...
|
|
</bean></programlisting>
|
|
<para>
|
|
The main points of interest here are injecting the value of
|
|
<classname role="strong">#{jobParameters['input.file.name']}</classname>
|
|
as the Resource property value and setting the ItemReader bean
|
|
to be of <emphasis>Step scope</emphasis> to take advantage of
|
|
the late binding support which allows access to the
|
|
<classname>jobParameters</classname> variable.
|
|
</para>
|
|
<sect4 id="available-attributes-of-the-job-launching-gateway">
|
|
<title>Available Attributes of the Job-Launching Gateway</title>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<classname role="strong">id</classname> Identifies the
|
|
underlying Spring bean definition, which is an instance of
|
|
either:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<classname>EventDrivenConsumer</classname>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<classname>PollingConsumer</classname>
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<para>
|
|
The exact implementation depends on whether the component's
|
|
input channel is a:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<classname>SubscribableChannel</classname> or
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<classname>PollableChannel</classname>
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<classname role="strong">auto-startup</classname>
|
|
Boolean flag to indicate that the endpoint should start automatically on
|
|
startup. The default is <emphasis>true</emphasis>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<classname role="strong">request-channel</classname>
|
|
The input <classname>MessageChannel</classname> of this endpoint.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<classname role="strong">reply-channel</classname> <classname>Message Channel</classname>
|
|
to which the resulting <classname>JobExecution</classname> payload will be sent.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<classname role="strong">reply-timeout</classname>
|
|
Allows you to specify how long this gateway will wait for the reply message
|
|
to be sent successfully to the reply channel before throwing
|
|
an exception. This attribute only applies when the channel
|
|
might block, for example when using a bounded queue channel
|
|
that is currently full. Also, keep in mind that when sending to a
|
|
<classname>DirectChannel</classname>, the invocation will occur
|
|
in the sender's thread. Therefore, the failing of the send
|
|
operation may be caused by other components further downstream.
|
|
The <classname>reply-timeout</classname> attribute maps to the
|
|
<classname>sendTimeout</classname> property of the underlying
|
|
<classname>MessagingTemplate</classname> instance. The attribute
|
|
will default, if not specified, to<emphasis>-1</emphasis>,
|
|
meaning that by default, the Gateway will wait indefinitely.
|
|
The value is specified in milliseconds.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<classname role="strong">job-launcher</classname>
|
|
Pass in a
|
|
custom
|
|
<classname>JobLauncher</classname>
|
|
bean reference. This
|
|
attribute is optional. If not specified the adapter will
|
|
re-use the instance that is registered under the id
|
|
<classname>jobLauncher</classname>. If no default instance
|
|
exists an exception is thrown.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<classname role="strong">order</classname>
|
|
Specifies the order
|
|
for invocation when this endpoint is connected as a subscriber
|
|
to a <classname>SubscribableChannel</classname>.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</sect4>
|
|
<sect4 id="sub-elements">
|
|
<title>Sub-Elements</title>
|
|
<para>
|
|
When this Gateway is receiving messages from a
|
|
<classname>PollableChannel</classname>, you must either provide
|
|
a global default Poller or provide a Poller sub-element to the
|
|
<classname>Job Launching Gateway</classname>:
|
|
</para>
|
|
<programlisting language="xml"><batch-int:job-launching-gateway request-channel="queueChannel"
|
|
reply-channel="replyChannel" job-launcher="jobLauncher">
|
|
<int:poller fixed-rate="1000"/>
|
|
</batch-int:job-launching-gateway></programlisting>
|
|
</sect4>
|
|
</sect3>
|
|
</sect2>
|
|
<sect2 id="providing-feedback-with-informational-messages">
|
|
<title>Providing Feedback with Informational Messages</title>
|
|
<para>
|
|
As Spring Batch jobs can run for long times, providing progress
|
|
information will be critical. For example, stake-holders may want
|
|
to be notified if a some or all parts of a Batch Job has failed.
|
|
Spring Batch provides support for this information being gathered
|
|
through:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Active polling or
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Event-driven, using listeners.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<para>
|
|
When starting a Spring Batch job asynchronously, e.g. by using the
|
|
<classname>Job Launching Gateway</classname>, a
|
|
<classname>JobExecution</classname> instance is returned. Thus,
|
|
<classname>JobExecution.getJobId()</classname> can be used to
|
|
continuously poll for status updates by retrieving updated
|
|
instances of the <classname>JobExecution</classname> from the
|
|
<classname>JobRepository</classname> using the
|
|
<classname>JobExplorer</classname>. However, this is considered
|
|
sub-optimal and an event-driven approach should be preferred.
|
|
</para>
|
|
<para>
|
|
Therefore, Spring Batch provides listeners such as:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
StepListener
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
ChunkListener
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
JobExecutionListener
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<para>
|
|
In the following example, a Spring Batch job was configured with a
|
|
<classname>StepExecutionListener</classname>. Thus, Spring
|
|
Integration will receive and process any step before/after step
|
|
events. For example, the received
|
|
<classname>StepExecution</classname> can be inspected using a
|
|
<classname>Router</classname>. Based on the results of that
|
|
inspection, various things can occur for example routing a message
|
|
to a Mail Outbound Channel Adapter, so that an Email notification
|
|
can be sent out based on some condition.
|
|
</para>
|
|
<mediaobject>
|
|
<imageobject role="html">
|
|
<imagedata align="center"
|
|
fileref="images/handling-informational-messages.png"
|
|
format="PNG" scale="100"/>
|
|
</imageobject>
|
|
<imageobject role="fo">
|
|
<imagedata align="center"
|
|
fileref="images/handling-informational-messages.png"
|
|
format="PNG" scale="60"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
<para>
|
|
Below is an example of how a listener is configured to send a
|
|
message to a <classname>Gateway</classname> for
|
|
<classname>StepExecution</classname> events and log its output to a
|
|
<classname>logging-channel-adapter</classname>:
|
|
</para>
|
|
<para>
|
|
First create the notifications integration beans:
|
|
</para>
|
|
<programlisting language="xml"><int:channel id="stepExecutionsChannel"/>
|
|
|
|
<int:gateway id="notificationExecutionsListener"
|
|
service-interface="org.springframework.batch.core.StepExecutionListener"
|
|
default-request-channel="stepExecutionsChannel"/>
|
|
|
|
<int:logging-channel-adapter channel="stepExecutionsChannel"/></programlisting>
|
|
<para>
|
|
Then modify your job to add a step level listener:
|
|
</para>
|
|
<programlisting language="xml"><job id="importPayments">
|
|
<step id="step1">
|
|
<tasklet ../>
|
|
<chunk ../>
|
|
<listeners>
|
|
<listener ref="notificationExecutionsListener"/>
|
|
</listeners>
|
|
</tasklet>
|
|
...
|
|
</step>
|
|
</job></programlisting>
|
|
</sect2>
|
|
<sect2 id="asynchronous-processors">
|
|
<title>Asynchronous Processors</title>
|
|
<para>
|
|
Asynchronous Processors help you to to scale the processing of
|
|
items. In the asynchronous processor use-case, an
|
|
<classname>AsyncItemProcessor</classname> serves as a dispatcher,
|
|
executing the <classname>ItemProcessor</classname>'s logic for an
|
|
item on a new thread. The <classname>Future</classname> is passed to
|
|
the <classname>AsynchItemWriter</classname> to be written once the
|
|
processor completes.
|
|
</para>
|
|
<para>
|
|
Therefore, you can increase performance by using asynchronous item
|
|
processing, basically allowing you to implement
|
|
<emphasis>fork-join</emphasis> scenarios. The
|
|
<classname>AsyncItemWriter</classname> will gather the results and
|
|
write back the chunk as soon as all the results become available.
|
|
</para>
|
|
<para>
|
|
Configuration of both the <classname>AsyncItemProcessor</classname>
|
|
and <classname>AsyncItemWriter</classname> are simple, first the
|
|
<classname>AsyncItemProcessor</classname>:
|
|
</para>
|
|
<programlisting language="xml"><bean id="processor"
|
|
class="org.springframework.batch.integration.async.AsyncItemProcessor">
|
|
<property name="delegate">
|
|
<bean class="your.ItemProcessor"/>
|
|
</property>
|
|
<property name="taskExecutor">
|
|
<bean class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
|
|
</property>
|
|
</bean></programlisting>
|
|
<para>
|
|
The property "<classname>delegate</classname>" is actually
|
|
a reference to your <classname>ItemProcessor</classname> bean and
|
|
the "<classname>taskExecutor</classname>" property is a
|
|
reference to the <classname>TaskExecutor</classname> of your choice.
|
|
</para>
|
|
<para>
|
|
Then we configure the <classname>AsyncItemWriter</classname>:
|
|
</para>
|
|
<programlisting language="xml"><bean id="itemWriter"
|
|
class="org.springframework.batch.integration.async.AsyncItemWriter">
|
|
<property name="delegate">
|
|
<bean id="itemWriter" class="your.ItemWriter"/>
|
|
</property>
|
|
</bean></programlisting>
|
|
<para>
|
|
Again, the property "<classname>delegate</classname>" is
|
|
actually a reference to your <classname>ItemWriter</classname> bean.
|
|
</para>
|
|
</sect2>
|
|
<sect2 id="externalizing-batch-process-execution">
|
|
<title>Externalizing Batch Process Execution</title>
|
|
<para>
|
|
The integration approaches discussed so far suggest use-cases
|
|
where Spring Integration wraps Spring Batch like an outer-shell.
|
|
However, Spring Batch can also use Spring Integration internally.
|
|
Using this approach, Spring Batch users can delegate the
|
|
processing of items or even chunks to outside processes. This
|
|
allows you to offload complex processing. Spring Batch Integration
|
|
provides dedicated support for:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Remote Chunking
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Remote Partitioning
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<sect3 id="remote-chunking">
|
|
<title>Remote Chunking</title>
|
|
<mediaobject>
|
|
<imageobject role="html">
|
|
<imagedata align="center"
|
|
fileref="images/remote-chunking-sbi.png"
|
|
format="PNG" scale="100"/>
|
|
</imageobject>
|
|
<imageobject role="fo">
|
|
<imagedata align="center"
|
|
fileref="images/remote-chunking-sbi.png"
|
|
format="PNG" scale="60"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
<para>
|
|
Taking things one step further, one can also externalize the
|
|
chunk processing using the
|
|
<classname>ChunkMessageChannelItemWriter</classname> which is
|
|
provided by Spring Batch Integration which will send items out
|
|
and collect the result. Once sent, Spring Batch will continue the
|
|
process of reading and grouping items, without waiting for the results.
|
|
Rather it is the responsibility of the <classname>ChunkMessageChannelItemWriter</classname>
|
|
to gather the results and integrate them back into the Spring Batch process.
|
|
</para>
|
|
<para>
|
|
Using Spring Integration you have full
|
|
control over the concurrency of your processes, for instance by
|
|
using a <classname>QueueChannel</classname> instead of a
|
|
<classname>DirectChannel</classname>. Furthermore, by relying on
|
|
Spring Integration's rich collection of Channel Adapters (E.g.
|
|
JMS or AMQP), you can distribute chunks of a Batch job to
|
|
external systems for processing.
|
|
</para>
|
|
<para>
|
|
A simple job with a step to be remotely chunked would have a
|
|
configuration similar to the following:
|
|
</para>
|
|
<programlisting language="xml"><job id="personJob">
|
|
<step id="step1">
|
|
<tasklet>
|
|
<chunk reader="itemReader" writer="itemWriter" commit-interval="200"/>
|
|
</tasklet>
|
|
...
|
|
</step>
|
|
</job></programlisting>
|
|
<para>
|
|
The ItemReader reference would point to the bean you would like
|
|
to use for reading data on the master. The ItemWriter reference
|
|
points to a special ItemWriter
|
|
"<classname>ChunkMessageChannelItemWriter</classname>"
|
|
as described above. The processor (if any) is left off the
|
|
master configuration as it is configured on the slave. The
|
|
following configuration provides a basic master setup. It's
|
|
advised to check any additional component properties such as
|
|
throttle limits and so on when implementing your use case.
|
|
</para>
|
|
<programlisting language="xml"><bean id="connectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
|
|
<property name="brokerURL" value="tcp://localhost:61616"/>
|
|
</bean>
|
|
|
|
<int-jms:outbound-channel-adapter id="requests" destination-name="requests"/>
|
|
|
|
<bean id="messagingTemplate"
|
|
class="org.springframework.integration.core.MessagingTemplate">
|
|
<property name="defaultChannel" ref="requests"/>
|
|
<property name="receiveTimeout" value="2000"/>
|
|
</bean>
|
|
|
|
<bean id="itemWriter"
|
|
class="org.springframework.batch.integration.chunk.ChunkMessageChannelItemWriter"
|
|
scope="step">
|
|
<property name="messagingOperations" ref="messagingTemplate"/>
|
|
<property name="replyChannel" ref="replies"/>
|
|
</bean>
|
|
|
|
<bean id="chunkHandler"
|
|
class="org.springframework.batch.integration.chunk.RemoteChunkHandlerFactoryBean">
|
|
<property name="chunkWriter" ref="itemWriter"/>
|
|
<property name="step" ref="step1"/>
|
|
</bean>
|
|
|
|
<int:channel id="replies">
|
|
<int:queue/>
|
|
</int:channel>
|
|
|
|
<int-jms:message-driven-channel-adapter id="jmsReplies"
|
|
destination-name="replies"
|
|
channel="replies"/></programlisting>
|
|
<para>
|
|
This configuration provides us with a number of beans. We
|
|
configure our messaging middleware using ActiveMQ and
|
|
inbound/outbound JMS adapters provided by Spring Integration. As
|
|
shown, our <classname>itemWriter</classname> bean which is
|
|
referenced by our job step utilizes the
|
|
<classname>ChunkMessageChannelItemWriter</classname> for writing chunks over the
|
|
configured middleware.
|
|
</para>
|
|
<para>
|
|
Now lets move on to the slave configuration:
|
|
</para>
|
|
<programlisting language="xml"><bean id="connectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
|
|
<property name="brokerURL" value="tcp://localhost:61616"/>
|
|
</bean>
|
|
|
|
<int:channel id="requests"/>
|
|
<int:channel id="replies"/>
|
|
|
|
<int-jms:message-driven-channel-adapter id="jmsIn"
|
|
destination-name="requests"
|
|
channel="requests"/>
|
|
|
|
<int-jms:outbound-channel-adapter id="outgoingReplies"
|
|
destination-name="replies"
|
|
channel="replies">
|
|
</int-jms:outbound-channel-adapter>
|
|
|
|
<int:service-activator id="serviceActivator"
|
|
input-channel="requests"
|
|
output-channel="replies"
|
|
ref="chunkProcessorChunkHandler"
|
|
method="handleChunk"/>
|
|
|
|
<bean id="chunkProcessorChunkHandler"
|
|
class="org.springframework.batch.integration.chunk.ChunkProcessorChunkHandler">
|
|
<property name="chunkProcessor">
|
|
<bean class="org.springframework.batch.core.step.item.SimpleChunkProcessor">
|
|
<property name="itemWriter">
|
|
<bean class="io.spring.sbi.PersonItemWriter"/>
|
|
</property>
|
|
<property name="itemProcessor">
|
|
<bean class="io.spring.sbi.PersonItemProcessor"/>
|
|
</property>
|
|
</bean>
|
|
</property>
|
|
</bean></programlisting>
|
|
<para>
|
|
Most of these configuration items should look familiar from the
|
|
master configuration. Slaves do not need access to things like
|
|
the Spring Batch <classname>JobRepository</classname> nor access
|
|
to the actual job configuration file. The main bean of interest
|
|
is the
|
|
"<classname>chunkProcessorChunkHandler</classname>". The
|
|
<classname>chunkProcessor</classname> property of
|
|
<classname>ChunkProcessorChunkHandler</classname> takes a
|
|
configured <classname>SimpleChunkProcessor</classname> which is
|
|
where you would provide a reference to your
|
|
<classname>ItemWriter</classname> and optionally your
|
|
<classname>ItemProcessor</classname> that will run on the slave
|
|
when it receives chunks from the master.
|
|
</para>
|
|
<para>
|
|
For more information, please also consult the Spring Batch
|
|
manual, specifically the chapter on
|
|
<ulink url="http://docs.spring.io/spring-batch/reference/html/scalability.html#remoteChunking">Remote
|
|
Chunking</ulink>.
|
|
</para>
|
|
</sect3>
|
|
<sect3 id="remote-partitioning">
|
|
<title>Remote Partitioning</title>
|
|
<mediaobject>
|
|
<imageobject role="html">
|
|
<imagedata align="center"
|
|
fileref="images/remote-partitioning.png"
|
|
format="PNG" scale="100"/>
|
|
</imageobject>
|
|
<imageobject role="fo">
|
|
<imagedata align="center"
|
|
fileref="images/remote-partitioning.png"
|
|
format="PNG" scale="60"/>
|
|
</imageobject>
|
|
</mediaobject>
|
|
<para>
|
|
Remote Partitioning, on the other hand, is useful when the
|
|
problem is not the processing of items, but the associated I/O
|
|
represents the bottleneck. Using Remote Partitioning, work can
|
|
be farmed out to slaves that execute complete Spring Batch
|
|
steps. Thus, each slave has its own
|
|
<classname>ItemReader</classname>,
|
|
<classname>ItemProcessor</classname> and
|
|
<classname>ItemWriter</classname>. For this purpose, Spring Batch
|
|
Integration provides the
|
|
<classname>MessageChannelPartitionHandler</classname>.
|
|
</para>
|
|
<para>
|
|
This implementation of the <classname>PartitionHandler</classname>
|
|
interface uses <classname>MessageChannel</classname> instances to
|
|
send instructions to remote workers and receive their responses.
|
|
This provides a nice abstraction from the transports (E.g. JMS
|
|
or AMQP) being used to communicate with the remote workers.
|
|
</para>
|
|
<para>
|
|
The reference manual section
|
|
<ulink url="http://docs.spring.io/spring-batch/reference/html/scalability.html#partitioning">Remote
|
|
Partitioning</ulink> provides an overview of the concepts and
|
|
components needed to configure Remote Partitioning and shows an
|
|
example of using the default
|
|
<classname>TaskExecutorPartitionHandler</classname> to partition
|
|
in separate local threads of execution. For Remote Partitioning
|
|
to multiple JVM's, two additional components are required:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Remoting fabric or grid environment
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
A PartitionHandler implementation that supports the desired
|
|
remoting fabric or grid environment
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
<para>
|
|
Similar to Remote Chunking JMS can be used as the "remoting
|
|
fabric" and the PartitionHandler implementation to be used
|
|
as described above is the
|
|
<classname>MessageChannelPartitionHandler</classname>. The example
|
|
shown below assumes an existing partitioned job and focuses on
|
|
the <classname>MessageChannelPartitionHandler</classname> and JMS
|
|
configuration:
|
|
</para>
|
|
<programlisting language="xml"><bean id="partitionHandler"
|
|
class="org.springframework.batch.integration.partition.MessageChannelPartitionHandler">
|
|
<property name="stepName" value="step1"/>
|
|
<property name="gridSize" value="3"/>
|
|
<property name="replyChannel" ref="outbound-replies"/>
|
|
<property name="messagingOperations">
|
|
<bean class="org.springframework.integration.core.MessagingTemplate">
|
|
<property name="defaultChannel" ref="outbound-requests"/>
|
|
<property name="receiveTimeout" value="100000"/>
|
|
</bean>
|
|
</property>
|
|
</bean>
|
|
|
|
<int:channel id="outbound-requests"/>
|
|
<int-jms:outbound-channel-adapter destination="requestsQueue"
|
|
channel="outbound-requests"/>
|
|
|
|
<int:channel id="inbound-requests"/>
|
|
<int-jms:message-driven-channel-adapter destination="requestsQueue"
|
|
channel="inbound-requests"/>
|
|
|
|
<bean id="stepExecutionRequestHandler"
|
|
class="org.springframework.batch.integration.partition.StepExecutionRequestHandler">
|
|
<property name="jobExplorer" ref="jobExplorer"/>
|
|
<property name="stepLocator" ref="stepLocator"/>
|
|
</bean>
|
|
|
|
<int:service-activator ref="stepExecutionRequestHandler" input-channel="inbound-requests"
|
|
output-channel="outbound-staging"/>
|
|
|
|
<int:channel id="outbound-staging"/>
|
|
<int-jms:outbound-channel-adapter destination="stagingQueue"
|
|
channel="outbound-staging"/>
|
|
|
|
<int:channel id="inbound-staging"/>
|
|
<int-jms:message-driven-channel-adapter destination="stagingQueue"
|
|
channel="inbound-staging"/>
|
|
|
|
<int:aggregator ref="partitionHandler" input-channel="inbound-staging"
|
|
output-channel="outbound-replies"/>
|
|
|
|
<int:channel id="outbound-replies">
|
|
<int:queue/>
|
|
</int:channel>
|
|
|
|
<bean id="stepLocator"
|
|
class="org.springframework.batch.integration.partition.BeanFactoryStepLocator" /></programlisting>
|
|
<para>
|
|
Also ensure the partition <classname>handler</classname> attribute
|
|
maps to the <classname>partitionHandler</classname> bean:
|
|
</para>
|
|
<programlisting language="xml"><job id="personJob">
|
|
<step id="step1.master">
|
|
<partition partitioner="partitioner" handler="partitionHandler"/>
|
|
...
|
|
</step>
|
|
</job></programlisting>
|
|
</sect3>
|
|
</sect2>
|
|
</sect1>
|
|
</chapter>
|