369 lines
17 KiB
XML
369 lines
17 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
|
|
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
|
|
<chapter id="retry">
|
|
<title>Retry</title>
|
|
|
|
<section id="retryTemplate">
|
|
<title>RetryTemplate</title>
|
|
|
|
<note>
|
|
<para>The retry functionality was pulled out of Spring Batch as of 2.2.0.
|
|
It is now part of a new library, Spring Retry.</para>
|
|
</note>
|
|
|
|
<para>To make processing more robust and less prone to failure, sometimes
|
|
it helps to automatically retry a failed operation in case it might
|
|
succeed on a subsequent attempt. Errors that are susceptible to this kind
|
|
of treatment are transient in nature. For example a remote call to a web
|
|
service or RMI service that fails because of a network glitch or a
|
|
<classname>DeadLockLoserException</classname> in a database update may
|
|
resolve themselves after a short wait. To automate the retry of such
|
|
operations Spring Batch has the <classname>RetryOperations</classname>
|
|
strategy. The <classname>RetryOperations</classname> interface looks like
|
|
this:</para>
|
|
|
|
<programlisting language="java">public interface RetryOperations {
|
|
|
|
<T> T execute(RetryCallback<T> retryCallback) throws Exception;
|
|
|
|
<T> T execute(RetryCallback<T> retryCallback, RecoveryCallback<T> recoveryCallback)
|
|
throws Exception;
|
|
|
|
<T> T execute(RetryCallback<T> retryCallback, RetryState retryState)
|
|
throws Exception, ExhaustedRetryException;
|
|
|
|
<T> T execute(RetryCallback<T> retryCallback, RecoveryCallback<T> recoveryCallback,
|
|
RetryState retryState) throws Exception;
|
|
|
|
}</programlisting>
|
|
<para>The basic callback is a simple interface that allows you to
|
|
insert some business logic to be retried:</para>
|
|
|
|
<programlisting language="java">public interface RetryCallback<T> {
|
|
|
|
T doWithRetry(RetryContext context) throws Throwable;
|
|
|
|
}</programlisting>
|
|
<para>The callback is executed and if it fails (by throwing an
|
|
<classname>Exception</classname>), it will be retried until either it is
|
|
successful, or the implementation decides to abort. There are a number of
|
|
overloaded <methodname>execute</methodname> methods in the
|
|
<classname>RetryOperations</classname> interface dealing with various use
|
|
cases for recovery when all retry attempts are exhausted, and also with
|
|
retry state, which allows clients and implementations to store information
|
|
between calls (more on this later).</para>
|
|
|
|
<para>The simplest general purpose implementation of
|
|
<classname>RetryOperations</classname> is
|
|
<classname>RetryTemplate</classname>. It could be used like this</para>
|
|
|
|
<programlisting language="java">RetryTemplate template = new RetryTemplate();
|
|
|
|
TimeoutRetryPolicy policy = new TimeoutRetryPolicy();
|
|
policy.setTimeout(30000L);
|
|
|
|
template.setRetryPolicy(policy);
|
|
|
|
Foo result = template.execute(new RetryCallback<Foo>() {
|
|
|
|
public Foo doWithRetry(RetryContext context) {
|
|
// Do stuff that might fail, e.g. webservice operation
|
|
return result;
|
|
}
|
|
|
|
});</programlisting>
|
|
|
|
<para>In the example we execute a web service call and return the result
|
|
to the user. If that call fails then it is retried until a timeout is
|
|
reached.</para>
|
|
|
|
<section id="retryContext">
|
|
<title>RetryContext</title>
|
|
|
|
<para>The method parameter for the <classname>RetryCallback</classname>
|
|
is a <classname>RetryContext</classname>. Many callbacks will simply
|
|
ignore the context, but if necessary it can be used as an attribute bag
|
|
to store data for the duration of the iteration.</para>
|
|
|
|
<para>A <classname>RetryContext</classname> will have a parent context
|
|
if there is a nested retry in progress in the same thread. The parent
|
|
context is occasionally useful for storing data that need to be shared
|
|
between calls to <methodname>execute</methodname>.</para>
|
|
</section>
|
|
|
|
<section id="recoveryCallback">
|
|
<title>RecoveryCallback</title>
|
|
|
|
<para>When a retry is exhausted the
|
|
<classname>RetryOperations</classname> can pass control to a different
|
|
callback, the <classname>RecoveryCallback</classname>. To use this
|
|
feature clients just pass in the callbacks together to the same method,
|
|
for example:</para>
|
|
|
|
<programlisting language="java">Foo foo = template.execute(new RetryCallback<Foo>() {
|
|
public Foo doWithRetry(RetryContext context) {
|
|
// business logic here
|
|
},
|
|
new RecoveryCallback<Foo>() {
|
|
Foo recover(RetryContext context) throws Exception {
|
|
// recover logic here
|
|
}
|
|
});</programlisting>
|
|
<para>If the business logic does not succeed before the template
|
|
decides to abort, then the client is given the chance to do some
|
|
alternate processing through the recovery callback.</para>
|
|
</section>
|
|
|
|
<section id="statelessRetry">
|
|
<title>Stateless Retry</title>
|
|
|
|
<para>In the simplest case, a retry is just a while loop: the
|
|
<classname>RetryTemplate</classname> can just keep trying until it
|
|
either succeeds or fails. The <classname>RetryContext</classname>
|
|
contains some state to determine whether to retry or abort, but this
|
|
state is on the stack and there is no need to store it anywhere
|
|
globally, so we call this stateless retry. The distinction between
|
|
stateless and stateful retry is contained in the implementation of the
|
|
<classname>RetryPolicy</classname> (the
|
|
<classname>RetryTemplate</classname> can handle both). In a stateless
|
|
retry, the callback is always executed in the same thread on retry as
|
|
when it failed.</para>
|
|
</section>
|
|
|
|
<section id="statefulRetry">
|
|
<title>Stateful Retry</title>
|
|
|
|
<para>Where the failure has caused a transactional resource to become
|
|
invalid, there are some special considerations. This does not apply to a
|
|
simple remote call because there is no transactional resource (usually),
|
|
but it does sometimes apply to a database update, especially when using
|
|
Hibernate. In this case it only makes sense to rethrow the exception
|
|
that called the failure immediately so that the transaction can roll
|
|
back and we can start a new valid one.</para>
|
|
|
|
<para>In these cases a stateless retry is not good enough because the
|
|
re-throw and roll back necessarily involve leaving the
|
|
<code>RetryOperations.execute()</code> method and potentially losing the
|
|
context that was on the stack. To avoid losing it we have to introduce a
|
|
storage strategy to lift it off the stack and put it (at a minimum) in
|
|
heap storage. For this purpose Spring Batch provides a storage strategy
|
|
<classname>RetryContextCache</classname> which can be injected into the
|
|
<classname>RetryTemplate</classname>. The default implementation of the
|
|
<classname>RetryContextCache</classname> is in memory, using a simple
|
|
<classname>Map</classname>. Advanced usage with multiple processes in a
|
|
clustered environment might also consider implementing the
|
|
<classname>RetryContextCache</classname> with a cluster cache of some
|
|
sort (though, even in a clustered environment this might be
|
|
overkill).</para>
|
|
|
|
<para>Part of the responsibility of the
|
|
<classname>RetryOperations</classname> is to recognize the failed
|
|
operations when they come back in a new execution (and usually wrapped
|
|
in a new transaction). To facilitate this, Spring Batch provides the
|
|
<classname>RetryState</classname> abstraction. This works in conjunction
|
|
with a special <classname>execute</classname> methods in the
|
|
<classname>RetryOperations</classname>.</para>
|
|
|
|
<para>The way the failed operations are recognized is by identifying the
|
|
state across multiple invocations of the retry. To identify the state,
|
|
the user can provide an <classname>RetryState</classname> object that is
|
|
responsible for returning a unique key identifying the item. The
|
|
identifier is used as a key in the
|
|
<classname>RetryContextCache</classname>.</para>
|
|
|
|
<warning>
|
|
<para>Be very careful with the implementation of
|
|
<code>Object.equals()</code> and <code>Object.hashCode()</code> in the
|
|
key returned by <classname>RetryState</classname>. The best advice is
|
|
to use a business key to identify the items. In the case of a JMS
|
|
message the message ID can be used.</para>
|
|
</warning>
|
|
|
|
<para>When the retry is exhausted there is also the option to handle the
|
|
failed item in a different way, instead of calling the
|
|
<classname>RetryCallback</classname> (which is presumed now to be likely
|
|
to fail). Just like in the stateless case, this option is provided by
|
|
the <classname>RecoveryCallback</classname>, which can be provided by
|
|
passing it in to the <classname>execute</classname> method of
|
|
<classname>RetryOperations</classname>.</para>
|
|
|
|
<para>The decision to retry or not is actually delegated to a regular
|
|
<classname>RetryPolicy</classname>, so the usual concerns about limits
|
|
and timeouts can be injected there (see below).</para>
|
|
</section>
|
|
</section>
|
|
|
|
<section id="retryPolicies">
|
|
<title>Retry Policies</title>
|
|
|
|
<para>Inside a <classname>RetryTemplate</classname> the decision to retry
|
|
or fail in the <methodname>execute</methodname> method is determined by a
|
|
<classname>RetryPolicy</classname> which is also a factory for the
|
|
<classname>RetryContext</classname>. The
|
|
<classname>RetryTemplate</classname> has the responsibility to use the
|
|
current policy to create a <classname>RetryContext</classname> and pass
|
|
that in to the <classname>RetryCallback</classname> at every attempt.
|
|
After a callback fails the <classname>RetryTemplate</classname> has to
|
|
make a call to the <classname>RetryPolicy</classname> to ask it to update
|
|
its state (which will be stored in the
|
|
<classname>RetryContext</classname>), and then it asks the policy if
|
|
another attempt can be made. If another attempt cannot be made (e.g. a
|
|
limit is reached or a timeout is detected) then the policy is also
|
|
responsible for handling the exhausted state. Simple implementations will
|
|
just throw <classname>RetryExhaustedException</classname> which will cause
|
|
any enclosing transaction to be rolled back. More sophisticated
|
|
implementations might attempt to take some recovery action, in which case
|
|
the transaction can remain intact.</para>
|
|
|
|
<tip>
|
|
<para>Failures are inherently either retryable or not - if the same
|
|
exception is always going to be thrown from the business logic, it
|
|
doesn't help to retry it. So don't retry on all exception types - try to
|
|
focus on only those exceptions that you expect to be retryable. It's not
|
|
usually harmful to the business logic to retry more aggressively, but
|
|
it's wasteful because if a failure is deterministic there will be time
|
|
spent retrying something that you know in advance is fatal.</para>
|
|
</tip>
|
|
|
|
<para>Spring Batch provides some simple general purpose implementations of
|
|
stateless <classname>RetryPolicy</classname>, for example a
|
|
<classname>SimpleRetryPolicy</classname>, and the
|
|
<classname>TimeoutRetryPolicy</classname> used in the example
|
|
above.</para>
|
|
|
|
<para>The <classname>SimpleRetryPolicy</classname> just allows a retry on
|
|
any of a named list of exception types, up to a fixed number of times. It
|
|
also has a list of "fatal" exceptions that should never be retried, and
|
|
this list overrides the retryable list so that it can be used to give
|
|
finer control over the retry behavior:</para>
|
|
|
|
<programlisting language="java">SimpleRetryPolicy policy = new SimpleRetryPolicy();
|
|
// Set the max retry attempts
|
|
policy.setMaxAttempts(5);
|
|
// Retry on all exceptions (this is the default)
|
|
policy.setRetryableExceptions(new Class[] {Exception.class});
|
|
// ... but never retry IllegalStateException
|
|
policy.setFatalExceptions(new Class[] {IllegalStateException.class});
|
|
|
|
// Use the policy...
|
|
RetryTemplate template = new RetryTemplate();
|
|
template.setRetryPolicy(policy);
|
|
template.execute(new RetryCallback<Foo>() {
|
|
public Foo doWithRetry(RetryContext context) {
|
|
// business logic here
|
|
}
|
|
});</programlisting>
|
|
|
|
<para>There is also a more flexible implementation called
|
|
<classname>ExceptionClassifierRetryPolicy</classname>, which allows the
|
|
user to configure different retry behavior for an arbitrary set of
|
|
exception types though the <classname>ExceptionClassifier</classname>
|
|
abstraction. The policy works by calling on the classifier to convert an
|
|
exception into a delegate <classname>RetryPolicy</classname>, so for
|
|
example, one exception type can be retried more times before failure than
|
|
another by mapping it to a different policy.</para>
|
|
|
|
<para>Users might need to implement their own retry policies for more
|
|
customized decisions. For instance, if there is a well-known,
|
|
solution-specific, classification of exceptions into retryable and not
|
|
retryable.</para>
|
|
</section>
|
|
|
|
<section id="backoffPolicies">
|
|
<title>Backoff Policies</title>
|
|
|
|
<para>When retrying after a transient failure it often helps to wait a bit
|
|
before trying again, because usually the failure is caused by some problem
|
|
that will only be resolved by waiting. If a
|
|
<classname>RetryCallback</classname> fails, the
|
|
<classname>RetryTemplate</classname> can pause execution according to the
|
|
<classname>BackoffPolicy</classname> in place.</para>
|
|
|
|
<programlisting language="java">public interface BackoffPolicy {
|
|
|
|
BackOffContext start(RetryContext context);
|
|
|
|
void backOff(BackOffContext backOffContext)
|
|
throws BackOffInterruptedException;
|
|
|
|
}</programlisting>
|
|
<para>A <classname>BackoffPolicy</classname> is free to implement
|
|
the backOff in any way it chooses. The policies provided by Spring Batch
|
|
out of the box all use <code>Object.wait()</code>. A common use case is to
|
|
backoff with an exponentially increasing wait period, to avoid two retries
|
|
getting into lock step and both failing - this is a lesson learned from
|
|
the ethernet. For this purpose Spring Batch provides the
|
|
<classname>ExponentialBackoffPolicy</classname>.</para>
|
|
</section>
|
|
|
|
<section id="retryListeners">
|
|
<title>Listeners</title>
|
|
|
|
<para>Often it is useful to be able to receive additional callbacks for
|
|
cross cutting concerns across a number of different retries. For this
|
|
purpose Spring Batch provides the <classname>RetryListener</classname>
|
|
interface. The <classname>RetryTemplate</classname> allows users to
|
|
register <classname>RetryListener</classname>s, and they will be given
|
|
callbacks with the <classname>RetryContext</classname> and
|
|
<classname>Throwable</classname> where available during the
|
|
iteration.</para>
|
|
|
|
<para>The interface looks like this:</para>
|
|
|
|
<programlisting language="java">public interface RetryListener {
|
|
|
|
void open(RetryContext context, RetryCallback<T> callback);
|
|
|
|
void onError(RetryContext context, RetryCallback<T> callback, Throwable e);
|
|
|
|
void close(RetryContext context, RetryCallback<T> callback, Throwable e);
|
|
}</programlisting>
|
|
<para>The <methodname>open</methodname> and
|
|
<methodname>close</methodname> callbacks come before and after the entire
|
|
retry in the simplest case and <methodname>onError</methodname> applies to
|
|
the individual <classname>RetryCallback</classname> calls. The
|
|
<methodname>close</methodname> method might also receive a
|
|
<classname>Throwable</classname>; if there has been an error it is the
|
|
last one thrown by the <classname>RetryCallback</classname>.</para>
|
|
|
|
<para>Note that when there is more than one listener, they are in a list,
|
|
so there is an order. In this case <methodname>open</methodname> will be
|
|
called in the same order while <methodname>onError</methodname> and
|
|
<methodname>close</methodname> will be called in reverse order.</para>
|
|
</section>
|
|
|
|
<section id="declarativeRetry">
|
|
<title>Declarative Retry</title>
|
|
|
|
<para>Sometimes there is some business processing that you know you want
|
|
to retry every time it happens. The classic example of this is the remote
|
|
service call. Spring Batch provides an AOP interceptor that wraps a method
|
|
call in a <classname>RetryOperations</classname> for just this purpose.
|
|
The <classname>RetryOperationsInterceptor</classname> executes the
|
|
intercepted method and retries on failure according to the
|
|
<classname>RetryPolicy</classname> in the provided
|
|
<classname>RepeatTemplate</classname>.</para>
|
|
|
|
<para>Here is an example of declarative iteration using the Spring AOP
|
|
namespace to repeat a service call to a method called
|
|
<methodname>remoteCall</methodname> (for more detail on how to configure
|
|
AOP interceptors see the Spring User Guide):</para>
|
|
|
|
<programlisting language="xml"><aop:config>
|
|
<aop:pointcut id="transactional"
|
|
expression="execution(* com..*Service.remoteCall(..))" />
|
|
<aop:advisor pointcut-ref="transactional"
|
|
advice-ref="retryAdvice" order="-1"/>
|
|
</aop:config>
|
|
|
|
<bean id="retryAdvice"
|
|
class="org.springframework.batch.retry.interceptor.RetryOperationsInterceptor"/></programlisting>
|
|
|
|
<para>The example above uses a default
|
|
<classname>RetryTemplate</classname> inside the interceptor. To change the
|
|
policies or listeners, you only need to inject an instance of
|
|
<classname>RetryTemplate</classname> into the interceptor.</para>
|
|
</section>
|
|
</chapter>
|