<html><head>
      <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <title>6.&nbsp;ItemReaders and ItemWriters</title><link rel="stylesheet" type="text/css" href="css/manual-multipage.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"><link rel="home" href="index.html" title="Spring Batch - Reference Documentation"><link rel="up" href="index.html" title="Spring Batch - Reference Documentation"><link rel="prev" href="configureStep.html" title="5.&nbsp;Configuring a Step"><link rel="next" href="scalability.html" title="7.&nbsp;Scaling and Parallel Processing"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">6.&nbsp;ItemReaders and ItemWriters</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="configureStep.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="scalability.html">Next</a></td></tr></table><hr></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="readersAndWriters" href="#readersAndWriters"></a>6.&nbsp;ItemReaders and ItemWriters</h1></div></div></div><p>All batch processing can be described in its most simple form as
  reading in large amounts of data, performing some type of calculation or
  transformation, and writing the result out. Spring Batch provides three key
  interfaces to help perform bulk reading and writing:
  <code class="classname">ItemReader</code>, <code class="classname">ItemProcessor</code> and
  <code class="classname">ItemWriter</code>.</p><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="itemReader" href="#itemReader"></a>6.1&nbsp;ItemReader</h2></div></div></div><p>Although a simple concept, an <code class="classname">ItemReader</code> is
    the means for providing data from many different types of input. The most
    general examples include: </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Flat File - Flat file ItemReaders read lines of data from a
          flat file that typically describe records with fields of data
          defined by fixed positions in the file or delimited by some special
          character (e.g. a comma).</p></li><li class="listitem"><p>XML - XML ItemReaders process XML independently of
          technologies used for parsing, mapping and validating objects. Input
          data allows for the validation of an XML file against an XSD
          schema.</p></li><li class="listitem"><p>Database - A database resource is accessed to return
          resultsets which can be mapped to objects for processing. The
          default SQL ItemReaders invoke a <code class="classname">RowMapper</code> to
          return objects, keep track of the current row if restart is
          required, store basic statistics, and provide some transaction
          enhancements that will be explained later.</p></li></ul></div><p>There are many more possibilities, but we'll focus on the
    basic ones for this chapter. A complete list of all available ItemReaders
    can be found in Appendix A.</p><p><code class="classname">ItemReader</code> is a basic interface for generic
    input operations:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> ItemReader&lt;T&gt; {

    T read() <span class="hl-keyword">throws</span> Exception, UnexpectedInputException, ParseException;

}</pre><p>The <code class="methodname">read</code> method defines the most essential
    contract of the <code class="classname">ItemReader</code>; calling it returns one
    Item or null if no more items are left. An item might represent a line in
    a file, a row in a database, or an element in an XML file. It is generally
    expected that these will be mapped to a usable domain object (e.g. Trade,
    Foo, etc.), but there is no requirement in the contract to do so.</p><p>It is expected that implementations of the
    <code class="classname">ItemReader</code> interface will be forward only. However,
    if the underlying resource is transactional (such as a JMS queue) then
    calling read may return the same logical item on subsequent calls in a
    rollback scenario. It is also worth noting that a lack of items to process
    by an <code class="classname">ItemReader</code> will not cause an exception to be
    thrown. For example, a database <code class="classname">ItemReader</code> that is
    configured with a query that returns 0 results will simply return null on
    the first invocation of <code class="methodname">read</code>.</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="itemWriter" href="#itemWriter"></a>6.2&nbsp;ItemWriter</h2></div></div></div><p><code class="classname">ItemWriter</code> is similar in functionality to an
    <code class="classname">ItemReader</code>, but with inverse operations. Resources
    still need to be located, opened and closed but they differ in that an
    <code class="classname">ItemWriter</code> writes out, rather than reading in. In
    the case of databases or queues these may be inserts, updates, or sends.
    The format of the serialization of the output is specific to each batch
    job.</p><p>As with <code class="classname">ItemReader</code>,
    <code class="classname">ItemWriter</code> is a fairly generic interface:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> ItemWriter&lt;T&gt; {

    <span class="hl-keyword">void</span> write(List&lt;? <span class="hl-keyword">extends</span> T&gt; items) <span class="hl-keyword">throws</span> Exception;

}</pre><p>As with <code class="methodname">read</code> on
    <code class="classname">ItemReader</code>, <code class="methodname">write</code> provides
    the basic contract of <code class="classname">ItemWriter</code>; it will attempt
    to write out the list of items passed in as long as it is open. Because it
    is generally expected that items will be 'batched' together into a chunk
    and then output, the interface accepts a list of items, rather than an
    item by itself. After writing out the list, any flushing that may be
    necessary can be performed before returning from the write method. For
    example, if writing to a Hibernate DAO, multiple calls to write can be
    made, one for each item. The writer can then call flush on the Hibernate
    Session before returning.</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="itemProcessor" href="#itemProcessor"></a>6.3&nbsp;ItemProcessor</h2></div></div></div><p>The <code class="classname">ItemReader</code> and
    <code class="classname">ItemWriter</code> interfaces are both very useful for
    their specific tasks, but what if you want to insert business logic before
    writing? One option for both reading and writing is to use the composite
    pattern: create an <code class="classname">ItemWriter</code> that contains another
    <code class="classname">ItemWriter</code>, or an <code class="classname">ItemReader</code>
    that contains another <code class="classname">ItemReader</code>. For
    example:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> CompositeItemWriter&lt;T&gt; <span class="hl-keyword">implements</span> ItemWriter&lt;T&gt; {

    ItemWriter&lt;T&gt; itemWriter;

    <span class="hl-keyword">public</span> CompositeItemWriter(ItemWriter&lt;T&gt; itemWriter) {
        <span class="hl-keyword">this</span>.itemWriter = itemWriter;
    }

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> write(List&lt;? <span class="hl-keyword">extends</span> T&gt; items) <span class="hl-keyword">throws</span> Exception {
        <span class="hl-comment">//Add business logic here</span>
        itemWriter.write(items);
    }

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> setDelegate(ItemWriter&lt;T&gt; itemWriter){
        <span class="hl-keyword">this</span>.itemWriter = itemWriter;
    }
}</pre><p>The class above contains another <code class="classname">ItemWriter</code>
    to which it delegates after having provided some business logic. This
    pattern could easily be used for an <code class="classname">ItemReader</code> as
    well, perhaps to obtain more reference data based upon the input that was
    provided by the main <code class="classname">ItemReader</code>. It is also useful
    if you need to control the call to <code class="classname">write</code> yourself.
    However, if you only want to 'transform' the item passed in for writing
    before it is actually written, there isn't much need to call
    <code class="methodname">write</code> yourself: you just want to modify the item.
    For this scenario, Spring Batch provides the
    <code class="classname">ItemProcessor</code> interface:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> ItemProcessor&lt;I, O&gt; {

    O process(I item) <span class="hl-keyword">throws</span> Exception;
}</pre><p>An <code class="classname">ItemProcessor</code> is very simple; given one
    object, transform it and return another. The provided object may or may
    not be of the same type. The point is that business logic may be applied
    within process, and is completely up to the developer to create. An
    <code class="classname">ItemProcessor</code> can be wired directly into a step.
    For example, assume an <code class="classname">ItemReader</code> provides a
    class of type Foo, and it needs to be converted to type Bar before being
    written out. An <code class="classname">ItemProcessor</code> can be written that
    performs the conversion:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> Foo {}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> Bar {
    <span class="hl-keyword">public</span> Bar(Foo foo) {}
}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> FooProcessor <span class="hl-keyword">implements</span> ItemProcessor&lt;Foo,Bar&gt;{
    <span class="hl-keyword">public</span> Bar process(Foo foo) <span class="hl-keyword">throws</span> Exception {
        <span class="hl-comment">//Perform simple transformation, convert a Foo to a Bar</span>
        <span class="hl-keyword">return</span> <span class="hl-keyword">new</span> Bar(foo);
    }
}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> BarWriter <span class="hl-keyword">implements</span> ItemWriter&lt;Bar&gt;{
    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> write(List&lt;? <span class="hl-keyword">extends</span> Bar&gt; bars) <span class="hl-keyword">throws</span> Exception {
        <span class="hl-comment">//write bars</span>
    }
}</pre><p>In the very simple example above, there is a class
    <code class="classname">Foo</code>, a class <code class="classname">Bar</code>, and a
    class <code class="classname">FooProcessor</code> that adheres to the
    <code class="classname">ItemProcessor</code> interface. The transformation is
    simple, but any type of transformation could be done here. The
    <code class="classname">BarWriter</code> will be used to write out
    <code class="classname">Bar</code> objects, throwing an exception if any other
    type is provided. Similarly, the <code class="classname">FooProcessor</code> will
    throw an exception if anything but a <code class="classname">Foo</code> is
    provided. The <code class="classname">FooProcessor</code> can then be injected
    into a <code class="classname">Step</code>:</p><pre class="programlisting"><span class="hl-tag">&lt;job</span> <span class="hl-attribute">id</span>=<span class="hl-value">"ioSampleJob"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;step</span> <span class="hl-attribute">name</span>=<span class="hl-value">"step1"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;tasklet&gt;</span>
            <span class="hl-tag">&lt;chunk</span> <span class="hl-attribute">reader</span>=<span class="hl-value">"fooReader"</span> <span class="hl-attribute">processor</span>=<span class="hl-value">"fooProcessor"</span> <span class="hl-attribute">writer</span>=<span class="hl-value">"barWriter"</span>
                   <span class="hl-attribute">commit-interval</span>=<span class="hl-value">"2"</span><span class="hl-tag">/&gt;</span>
        <span class="hl-tag">&lt;/tasklet&gt;</span>
    <span class="hl-tag">&lt;/step&gt;</span>
<span class="hl-tag">&lt;/job&gt;</span></pre><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="chainingItemProcessors" href="#chainingItemProcessors"></a>6.3.1&nbsp;Chaining ItemProcessors</h3></div></div></div><p>Performing a single transformation is useful in many scenarios,
      but what if you want to 'chain' together multiple
      <code class="classname">ItemProcessor</code>s? This can be accomplished using
      the composite pattern mentioned previously. To extend the previous
      single-transformation example, <code class="classname">Foo</code> will be
      transformed to <code class="classname">Bar</code>, which will be transformed to
      <code class="classname">Foobar</code> and written out:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> Foo {}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> Bar {
    <span class="hl-keyword">public</span> Bar(Foo foo) {}
}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> Foobar{
    <span class="hl-keyword">public</span> Foobar(Bar bar) {}
}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> FooProcessor <span class="hl-keyword">implements</span> ItemProcessor&lt;Foo,Bar&gt;{
    <span class="hl-keyword">public</span> Bar process(Foo foo) <span class="hl-keyword">throws</span> Exception {
        <span class="hl-comment">//Perform simple transformation, convert a Foo to a Bar</span>
        <span class="hl-keyword">return</span> <span class="hl-keyword">new</span> Bar(foo);
    }
}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> BarProcessor <span class="hl-keyword">implements</span> ItemProcessor&lt;Bar,Foobar&gt;{
    <span class="hl-keyword">public</span> Foobar process(Bar bar) <span class="hl-keyword">throws</span> Exception {
        <span class="hl-comment">//Perform simple transformation, convert a Bar to a Foobar</span>
        <span class="hl-keyword">return</span> <span class="hl-keyword">new</span> Foobar(bar);
    }
}

<span class="hl-keyword">public</span> <span class="hl-keyword">class</span> FoobarWriter <span class="hl-keyword">implements</span> ItemWriter&lt;Foobar&gt;{
    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> write(List&lt;? <span class="hl-keyword">extends</span> Foobar&gt; items) <span class="hl-keyword">throws</span> Exception {
        <span class="hl-comment">//write items</span>
    }
}</pre><p>A <code class="classname">FooProcessor</code> and
      <code class="classname">BarProcessor</code> can be 'chained' together to give
      the resultant <code class="classname">Foobar</code>:</p><pre class="programlisting">CompositeItemProcessor&lt;Foo,Foobar&gt; compositeProcessor =
                                      <span class="hl-keyword">new</span> CompositeItemProcessor&lt;Foo,Foobar&gt;();
List itemProcessors = <span class="hl-keyword">new</span> ArrayList();
itemProcessors.add(<span class="hl-keyword">new</span> FooProcessor());
itemProcessors.add(<span class="hl-keyword">new</span> BarProcessor());
compositeProcessor.setDelegates(itemProcessors);</pre><p>Just as with the previous example, the composite processor can be
      configured into the <code class="classname">Step</code>:</p><pre class="programlisting"><span class="hl-tag">&lt;job</span> <span class="hl-attribute">id</span>=<span class="hl-value">"ioSampleJob"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;step</span> <span class="hl-attribute">name</span>=<span class="hl-value">"step1"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;tasklet&gt;</span>
            <span class="hl-tag">&lt;chunk</span> <span class="hl-attribute">reader</span>=<span class="hl-value">"fooReader"</span> <span class="hl-attribute">processor</span>=<span class="hl-value">"compositeProcessor"</span> <span class="hl-attribute">writer</span>=<span class="hl-value">"foobarWriter"</span>
                   <span class="hl-attribute">commit-interval</span>=<span class="hl-value">"2"</span><span class="hl-tag">/&gt;</span>
        <span class="hl-tag">&lt;/tasklet&gt;</span>
    <span class="hl-tag">&lt;/step&gt;</span>
<span class="hl-tag">&lt;/job&gt;</span>

<span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"compositeItemProcessor"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.support.CompositeItemProcessor"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"delegates"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;list&gt;</span>
            <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"..FooProcessor"</span><span class="hl-tag"> /&gt;</span>
            <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"..BarProcessor"</span><span class="hl-tag"> /&gt;</span>
        <span class="hl-tag">&lt;/list&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="filiteringRecords" href="#filiteringRecords"></a>6.3.2&nbsp;Filtering Records</h3></div></div></div><p>One typical use for an item processor is to filter out records
      before they are passed to the ItemWriter. Filtering is an action
      distinct from skipping; skipping indicates that a record is invalid
      whereas filtering simply indicates that a record should not be
      written.</p><p>For example, consider a batch job that reads a file containing
      three different types of records: records to insert, records to update,
      and records to delete. If record deletion is not supported by the
      system, then we would not want to send any "delete" records to the
      <code class="classname">ItemWriter</code>. But, since these records are not
      actually bad records, we would want to filter them out, rather than
      skip. As a result, the ItemWriter would receive only "insert" and
      "update" records.</p><p>To filter a record, one simply returns null from the
      <code class="classname">ItemProcessor</code>. The framework will detect that the
      result is null and avoid adding that item to the list of records
      delivered to the <code class="classname">ItemWriter</code>. As usual, an
      exception thrown from the <code class="classname">ItemProcessor</code> will
      result in a skip.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="faultTolerant" href="#faultTolerant"></a>6.3.3&nbsp;Fault Tolerance</h3></div></div></div><p>When a chunk is rolled back, items that have been cached
          during reading may be reprocessed.  If a step is configured to
          be fault tolerant (uses skip or retry processing typically),
          any ItemProcessor used should be implemented in a way that is
          idempotent.  Typically that would consist of performing no changes
          on the input item for the ItemProcessor and only updating the
          instance that is the result.</p></div></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="itemStream" href="#itemStream"></a>6.4&nbsp;ItemStream</h2></div></div></div><p>Both <code class="classname">ItemReader</code>s and
    <code class="classname">ItemWriter</code>s serve their individual purposes well,
    but there is a common concern among both of them that necessitates another
    interface. In general, as part of the scope of a batch job, readers and
    writers need to be opened, closed, and require a mechanism for persisting
    state:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> ItemStream {

    <span class="hl-keyword">void</span> open(ExecutionContext executionContext) <span class="hl-keyword">throws</span> ItemStreamException;

    <span class="hl-keyword">void</span> update(ExecutionContext executionContext) <span class="hl-keyword">throws</span> ItemStreamException;

    <span class="hl-keyword">void</span> close() <span class="hl-keyword">throws</span> ItemStreamException;
}</pre><p>Before describing each method, we should mention the
    <code class="classname">ExecutionContext</code>. Clients of an
    <code class="classname">ItemReader</code> that also implement
    <code class="classname">ItemStream</code> should call
    <code class="methodname">open</code> before any calls to
    <code class="methodname">read</code> in order to open any resources such as files
    or to obtain connections. A similar restriction applies to an
    <code class="classname">ItemWriter</code> that implements
    <code class="classname">ItemStream</code>. As mentioned in Chapter 2, if expected
    data is found in the <code class="classname">ExecutionContext</code>, it may be
    used to start the <code class="classname">ItemReader</code> or
    <code class="classname">ItemWriter</code> at a location other than its initial
    state. Conversely, <code class="methodname">close</code> will be called to ensure
    that any resources allocated during <code class="methodname">open</code> will be
    released safely. <code class="methodname">update</code> is called primarily to
    ensure that any state currently being held is loaded into the provided
    <code class="classname">ExecutionContext</code>. This method will be called before
    committing, to ensure that the current state is persisted in the
    database.</p><p>In the special case where the client of an
    <code class="classname">ItemStream</code> is a <code class="classname">Step</code> (from
    the Spring Batch Core), an <code class="classname">ExecutionContext</code> is
    created for each <code class="classname">StepExecution</code> to allow users to
    store the state of a particular execution, with the expectation that it
    will be returned if the same <code class="classname">JobInstance</code> is started
    again. For those familiar with Quartz, the semantics are very similar to a
    Quartz <code class="classname">JobDataMap</code>.</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="delegatePatternAndRegistering" href="#delegatePatternAndRegistering"></a>6.5&nbsp;The Delegate Pattern and Registering with the Step</h2></div></div></div><p>Note that the <code class="classname">CompositeItemWriter</code> is an
    example of the delegation pattern, which is common in Spring Batch. The
    delegates themselves might implement callback interfaces, such as <code class="classname">StepListener</code>.
    If they do, and they are being used in conjunction with Spring Batch Core
    as part of a <code class="classname">Step</code> in a <code class="classname">Job</code>,
    then they almost certainly need to be registered manually with the
    <code class="classname">Step</code>. A reader, writer, or processor that is
    directly wired into the Step will be registered automatically if it
    implements <code class="classname">ItemStream</code> or a
    <code class="classname">StepListener</code> interface. But because the delegates
    are not known to the <code class="classname">Step</code>, they need to be injected
    as listeners or streams (or both if appropriate):</p><pre class="programlisting"><span class="hl-tag">&lt;job</span> <span class="hl-attribute">id</span>=<span class="hl-value">"ioSampleJob"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;step</span> <span class="hl-attribute">name</span>=<span class="hl-value">"step1"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;tasklet&gt;</span>
            <span class="hl-tag">&lt;chunk</span> <span class="hl-attribute">reader</span>=<span class="hl-value">"fooReader"</span> <span class="hl-attribute">processor</span>=<span class="hl-value">"fooProcessor"</span> <span class="hl-attribute">writer</span>=<span class="hl-value">"compositeItemWriter"</span>
                   <span class="hl-attribute">commit-interval</span>=<span class="hl-value">"2"</span><span class="hl-tag">&gt;</span>
                    <span class="hl-tag">&lt;streams&gt;</span>
                    <span class="hl-tag">&lt;stream</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"barWriter"</span><span class="hl-tag"> /&gt;</span>
                <span class="hl-tag">&lt;/streams&gt;</span>
            <span class="hl-tag">&lt;/chunk&gt;</span>
        <span class="hl-tag">&lt;/tasklet&gt;</span>
    <span class="hl-tag">&lt;/step&gt;</span>
<span class="hl-tag">&lt;/job&gt;</span>

<span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"compositeItemWriter"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"...CustomCompositeItemWriter"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"delegate"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"barWriter"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>

<span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"barWriter"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"...BarWriter"</span><span class="hl-tag"> /&gt;</span></pre></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="flatFiles" href="#flatFiles"></a>6.6&nbsp;Flat Files</h2></div></div></div><p>One of the most common mechanisms for interchanging bulk data has
    always been the flat file. Unlike XML, which has an agreed upon standard
    for defining how it is structured (XSD), anyone reading a flat file must
    understand ahead of time exactly how the file is structured. In general,
    all flat files fall into two types: Delimited and Fixed Length. Delimited
    files are those in which fields are separated by a delimiter, such as a
    comma. Fixed Length files have fields that are a set length.</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="fieldSet" href="#fieldSet"></a>6.6.1&nbsp;The FieldSet</h3></div></div></div><p>When working with flat files in Spring Batch, regardless of
      whether it is for input or output, one of the most important classes is
      the <code class="classname">FieldSet</code>. Many architectures and libraries
      contain abstractions for helping you read in from a file, but they
      usually return a String or an array of Strings. This really only gets
      you halfway there. A <code class="classname">FieldSet</code> is Spring Batch&#8217;s
      abstraction for enabling the binding of fields from a file resource. It
      allows developers to work with file input in much the same way as they
      would work with database input. A <code class="classname">FieldSet</code> is
      conceptually very similar to a JDBC <code class="classname">ResultSet</code>.
      FieldSets only require one argument, a <code class="classname">String</code>
      array of tokens. Optionally, you can also configure the names of the
      fields so that the fields may be accessed either by index or by name, as
      patterned after <code class="classname">ResultSet</code>:</p><pre class="programlisting">String[] tokens = <span class="hl-keyword">new</span> String[]{<span class="hl-string">"foo"</span>, <span class="hl-string">"1"</span>, <span class="hl-string">"true"</span>};
FieldSet fs = <span class="hl-keyword">new</span> DefaultFieldSet(tokens);
String name = fs.readString(<span class="hl-number">0</span>);
<span class="hl-keyword">int</span> value = fs.readInt(<span class="hl-number">1</span>);
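
<span class="hl-comment">// Sketch (assumption): the same tokens with field names supplied, so that values</span>
<span class="hl-comment">// can also be read by name. The names "name", "value" and "flag" are hypothetical.</span>
FieldSet namedFs = <span class="hl-keyword">new</span> DefaultFieldSet(tokens, <span class="hl-keyword">new</span> String[]{<span class="hl-string">"name"</span>, <span class="hl-string">"value"</span>, <span class="hl-string">"flag"</span>});
String nameByName = namedFs.readString(<span class="hl-string">"name"</span>);
<span class="hl-keyword">int</span> valueByName = namedFs.readInt(<span class="hl-string">"value"</span>);

<span class="hl-comment">// Back to index-based access on the original FieldSet</span>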
<span class="hl-keyword">boolean</span> booleanValue = fs.readBoolean(<span class="hl-number">2</span>);</pre><p>There are many more options on the <code class="classname">FieldSet</code>
      interface, such as <code class="classname">Date</code>, long,
      <code class="classname">BigDecimal</code>, etc. The biggest advantage of the
      <code class="classname">FieldSet</code> is that it provides consistent parsing
      of flat file input. Rather than each batch job parsing differently in
      potentially unexpected ways, it can be consistent, both when handling
      errors caused by a format exception, or when doing simple data
      conversions.</p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="flatFileItemReader" href="#flatFileItemReader"></a>6.6.2&nbsp;FlatFileItemReader</h3></div></div></div><p>A flat file is any type of file that contains at most
      two-dimensional (tabular) data. Reading flat files in the Spring Batch
      framework is facilitated by the class
      <code class="classname">FlatFileItemReader</code>, which provides basic
      functionality for reading and parsing flat files. The two most important
      required dependencies of <code class="classname">FlatFileItemReader</code> are
      <code class="classname">Resource</code> and <code class="classname">LineMapper</code>.
      The <code class="classname">LineMapper</code> interface will be
      explored more in the next sections. The resource property represents a
      Spring Core <code class="classname">Resource</code>. Documentation explaining
      how to create beans of this type can be found in <a class="ulink" href="http://docs.spring.io/spring/docs/3.2.x/spring-framework-reference/html/resources.html" target="_top"><em class="citetitle">Spring
      Framework, Chapter 5. Resources</em></a>. Therefore, this
      guide will not go into the details of creating
      <code class="classname">Resource</code> objects. However, a simple example of a
      file system resource can be found below:
      </p><pre class="programlisting">Resource resource = <span class="hl-keyword">new</span> FileSystemResource(<span class="hl-string">"resources/trades.csv"</span>);</pre><p>In complex batch environments the directory structures are often
      managed by the EAI infrastructure, where drop zones for external
      interfaces are established for moving files from FTP locations to batch
      processing locations and vice versa. File moving utilities are beyond
      the scope of the Spring Batch architecture, but it is not unusual for
      batch job streams to include file moving utilities as steps in the job
      stream. The batch architecture only needs to know
      how to locate the files to be processed. Spring Batch begins the process
      of feeding the data into the pipe from this starting point. However,
      <a class="ulink" href="http://projects.spring.io/spring-integration/" target="_top"><em class="citetitle">Spring
      Integration</em></a> provides many of these types of
      services.</p><p>The other properties in <code class="classname">FlatFileItemReader</code>
      allow you to further specify how your data will be interpreted: </p><div class="table"><a name="d5e2230" href="#d5e2230"></a><p class="title"><b>Table&nbsp;6.1.&nbsp;FlatFileItemReader Properties</b></p><div class="table-contents"><table summary="FlatFileItemReader Properties" style="border-collapse: collapse;border-top: 0.5pt solid ; border-bottom: 0.5pt solid ; border-left: 0.5pt solid ; border-right: 0.5pt solid ; "><colgroup><col align="center"><col><col></colgroup><thead><tr><th style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="center">Property</th><th style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="center">Type</th><th style="border-bottom: 0.5pt solid ; " align="center">Description</th></tr></thead><tbody><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">comments</td><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">String[]</td><td style="border-bottom: 0.5pt solid ; " align="left">Specifies line prefixes that indicate
                comment rows</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">encoding</td><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">String</td><td style="border-bottom: 0.5pt solid ; " align="left">Specifies what text encoding to use -
                default is "ISO-8859-1"</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">lineMapper</td><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">LineMapper</td><td style="border-bottom: 0.5pt solid ; " align="left">Converts a <code class="classname">String</code>
                to an <code class="classname">Object</code> representing the
                item.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">linesToSkip</td><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">int</td><td style="border-bottom: 0.5pt solid ; " align="left">Number of lines to ignore at the top of
                the file</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">recordSeparatorPolicy</td><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">RecordSeparatorPolicy</td><td style="border-bottom: 0.5pt solid ; " align="left">Used to determine where the line endings
                are and do things like continue over a line ending if inside a
                quoted string.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">resource</td><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">Resource</td><td style="border-bottom: 0.5pt solid ; " align="left">The resource from which to read.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">skippedLinesCallback</td><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; " align="left">LineCallbackHandler</td><td style="border-bottom: 0.5pt solid ; " align="left">Interface which passes the raw line
                content of the lines in the file to be skipped. If linesToSkip
                is set to 2, then this interface will be called twice.</td></tr><tr><td style="border-right: 0.5pt solid ; " align="left">strict</td><td style="border-right: 0.5pt solid ; " align="left">boolean</td><td style="" align="left">In strict mode, the reader will throw an
                exception when <code class="methodname">open(ExecutionContext)</code> is called if the input resource does not
                exist.</td></tr></tbody></table></div></div><p><br class="table-break"></p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="lineMapper" href="#lineMapper"></a>LineMapper</h4></div></div></div><p>As with <code class="classname">RowMapper</code>, which takes a low
        level construct such as <code class="classname">ResultSet</code> and returns
        an <code class="classname">Object</code>, flat file processing requires the
        same construct to convert a <code class="classname">String</code> line into an
        <code class="classname">Object</code>:
        </p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> LineMapper&lt;T&gt; {

    T mapLine(String line, <span class="hl-keyword">int</span> lineNumber) <span class="hl-keyword">throws</span> Exception;

}</pre><p>The basic contract is that, given the current line and the line
        number with which it is associated, the mapper should return a
        resulting domain object. This is similar to
        <code class="classname">RowMapper</code> in that each line is associated with
        its line number, just as each row in a
        <code class="classname">ResultSet</code> is tied to its row number. This
        allows the line number to be tied to the resulting domain object for
        identity comparison or for more informative logging. However, unlike
        <code class="classname">RowMapper</code>, the
        <code class="classname">LineMapper</code> is given a raw line which, as
        discussed above, only gets you halfway there. The line must be
        tokenized into a <code class="classname">FieldSet</code>, which can then be
        mapped to an object, as described below.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="lineTokenizer" href="#lineTokenizer"></a>LineTokenizer</h4></div></div></div><p>An abstraction for turning a line of input into a
        <code class="classname">FieldSet</code> is necessary because there can be many
        formats of flat file data that need to be converted to a
        <code class="classname">FieldSet</code>. In Spring Batch, this interface is
        the <code class="classname">LineTokenizer</code>:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> LineTokenizer {

    FieldSet tokenize(String line);

}</pre><p>The contract of a <code class="classname">LineTokenizer</code> is such
        that, given a line of input (in theory the
        <code class="classname">String</code> could encompass more than one line), a
        <code class="classname">FieldSet</code> representing the line will be
        returned. This <code class="classname">FieldSet</code> can then be passed to a
        <code class="classname">FieldSetMapper</code>. Spring Batch contains the
        following <code class="classname">LineTokenizer</code> implementations:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="classname">DelimitedLineTokenizer</code> - Used for
            files where fields in a record are separated by a delimiter. The
            most common delimiter is a comma, but pipes or semicolons are
            often used as well.</p></li><li class="listitem"><p><code class="classname">FixedLengthTokenizer</code> - Used for files
            where fields in a record are each a 'fixed width'. The width of
            each field must be defined for each record type.</p></li><li class="listitem"><p><code class="classname">PatternMatchingCompositeLineTokenizer</code>
            - Determines which among a list of
            <code class="classname">LineTokenizer</code>s should be used on a
            particular line by checking against a pattern.</p></li></ul></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="fieldSetMapper" href="#fieldSetMapper"></a>FieldSetMapper</h4></div></div></div><p>The <code class="classname">FieldSetMapper</code> interface defines a
        single method, <code class="methodname">mapFieldSet</code>, which takes a
        <code class="classname">FieldSet</code> object and maps its contents to an
        object. This object may be a custom DTO, a domain object, or a simple
        array, depending on the needs of the job. The
        <code class="classname">FieldSetMapper</code> is used in conjunction with the
        <code class="classname">LineTokenizer</code> to translate a line of data from
        a resource into an object of the desired type:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> FieldSetMapper&lt;T&gt; {

    T mapFieldSet(FieldSet fieldSet);

}</pre><p>The pattern used is the same as the
        <code class="classname">RowMapper</code> used by
        <code class="classname">JdbcTemplate</code>.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="defaultLineMapper" href="#defaultLineMapper"></a>DefaultLineMapper</h4></div></div></div><p>Now that the basic interfaces for reading in flat files have
        been defined, it becomes clear that three basic steps are
        required:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Read one line from the file.</p></li><li class="listitem"><p>Pass the string line into the
              <code class="methodname">LineTokenizer#tokenize</code>() method, in
              order to retrieve a <code class="classname">FieldSet</code>.</p></li><li class="listitem"><p>Pass the <code class="classname">FieldSet</code> returned from
              tokenizing to a <code class="classname">FieldSetMapper</code>, returning
              the result from the <code class="methodname">ItemReader#read</code>()
              method.</p></li></ol></div><p>The two interfaces described above represent two separate tasks:
        converting a line into a <code class="classname">FieldSet</code>, and mapping
        a <code class="classname">FieldSet</code> to a domain object. Because the
        input of a <code class="classname">LineTokenizer</code> matches the input of
        the <code class="classname">LineMapper</code> (a line), and the output of a
        <code class="classname">FieldSetMapper</code> matches the output of the
        <code class="classname">LineMapper</code>, a default implementation that uses
        both a <code class="classname">LineTokenizer</code> and
        <code class="classname">FieldSetMapper</code> is provided. The
        <code class="classname">DefaultLineMapper</code> represents the behavior most
        users will need:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> DefaultLineMapper&lt;T&gt; <span class="hl-keyword">implements</span> LineMapper&lt;T&gt;, InitializingBean {

    <span class="hl-keyword">private</span> LineTokenizer tokenizer;

    <span class="hl-keyword">private</span> FieldSetMapper&lt;T&gt; fieldSetMapper;

    <span class="hl-keyword">public</span> T mapLine(String line, <span class="hl-keyword">int</span> lineNumber) <span class="hl-keyword">throws</span> Exception {
        <span class="bold"><strong>return fieldSetMapper.mapFieldSet(tokenizer.tokenize(line));</strong></span>
    }

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> setLineTokenizer(LineTokenizer tokenizer) {
        <span class="hl-keyword">this</span>.tokenizer = tokenizer;
    }

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> setFieldSetMapper(FieldSetMapper&lt;T&gt; fieldSetMapper) {
        <span class="hl-keyword">this</span>.fieldSetMapper = fieldSetMapper;
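    }

    <span class="hl-comment">// Required by InitializingBean: fail fast if a collaborator was not injected</span>
    <span class="hl-comment">// (Assert is org.springframework.util.Assert)</span>
    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> afterPropertiesSet() {
        Assert.notNull(tokenizer, <span class="hl-string">"The LineTokenizer must be set"</span>);
        Assert.notNull(fieldSetMapper, <span class="hl-string">"The FieldSetMapper must be set"</span>);
    }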
}</pre><p>The above functionality is provided in a default implementation,
        rather than being built into the reader itself (as was done in
        previous versions of the framework) in order to allow users greater
        flexibility in controlling the parsing process, especially if access
        to the raw line is needed.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="simpleDelimitedFileReadingExample" href="#simpleDelimitedFileReadingExample"></a>Simple Delimited File Reading Example</h4></div></div></div><p>The following example will be used to illustrate this using an
        actual domain scenario. This particular batch job reads in football
        players from the following file:
        </p><pre class="programlisting">ID,lastName,firstName,position,birthYear,debutYear
AbduKa00,Abdul-Jabbar,Karim,rb,1974,1996
AbduRa00,Abdullah,Rabih,rb,1975,1999
AberWa00,Abercrombie,Walter,rb,1959,1982
AbraDa00,Abramowicz,Danny,wr,1945,1967
AdamBo00,Adams,Bob,te,1946,1969
AdamCh00,Adams,Charlie,wr,1979,2003</pre><p>The contents of this file will be mapped to the following
        <code class="classname">Player</code> domain object:
        </p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> Player <span class="hl-keyword">implements</span> Serializable {

    <span class="hl-keyword">private</span> String ID;
    <span class="hl-keyword">private</span> String lastName;
    <span class="hl-keyword">private</span> String firstName;
    <span class="hl-keyword">private</span> String position;
    <span class="hl-keyword">private</span> <span class="hl-keyword">int</span> birthYear;
    <span class="hl-keyword">private</span> <span class="hl-keyword">int</span> debutYear;

    <span class="hl-keyword">public</span> String toString() {
        <span class="hl-keyword">return</span> <span class="hl-string">"PLAYER:ID="</span> + ID + <span class="hl-string">",Last Name="</span> + lastName +
            <span class="hl-string">",First Name="</span> + firstName + <span class="hl-string">",Position="</span> + position +
            <span class="hl-string">",Birth Year="</span> + birthYear + <span class="hl-string">",DebutYear="</span> +
            debutYear;
    }

    <span class="hl-comment">// setters and getters...</span>
}</pre><p>In order to map a <code class="classname">FieldSet</code> into a
        <code class="classname">Player</code> object, a
        <code class="classname">FieldSetMapper</code> that returns players needs to be
        defined:</p><pre class="programlisting"><span class="hl-keyword">protected</span> <span class="hl-keyword">static</span> <span class="hl-keyword">class</span> PlayerFieldSetMapper <span class="hl-keyword">implements</span> FieldSetMapper&lt;Player&gt; {
    <span class="hl-keyword">public</span> Player mapFieldSet(FieldSet fieldSet) {
        Player player = <span class="hl-keyword">new</span> Player();

        player.setID(fieldSet.readString(<span class="hl-number">0</span>));
        player.setLastName(fieldSet.readString(<span class="hl-number">1</span>));
        player.setFirstName(fieldSet.readString(<span class="hl-number">2</span>));
        player.setPosition(fieldSet.readString(<span class="hl-number">3</span>));
        player.setBirthYear(fieldSet.readInt(<span class="hl-number">4</span>));
        player.setDebutYear(fieldSet.readInt(<span class="hl-number">5</span>));

        <span class="hl-keyword">return</span> player;
    }
}</pre><p>The file can then be read by correctly constructing a
        <code class="classname">FlatFileItemReader</code> and calling
        <code class="methodname">read</code>:</p><pre class="programlisting">FlatFileItemReader&lt;Player&gt; itemReader = <span class="hl-keyword">new</span> FlatFileItemReader&lt;Player&gt;();
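itemReader.setResource(<span class="hl-keyword">new</span> FileSystemResource(<span class="hl-string">"resources/players.csv"</span>));
<span class="hl-comment">// Skip the header row (ID,lastName,...) so it is not mapped as a Player</span>
itemReader.setLinesToSkip(<span class="hl-number">1</span>);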
<span class="hl-comment">//DelimitedLineTokenizer defaults to comma as its delimiter</span>
DefaultLineMapper&lt;Player&gt; lineMapper = <span class="hl-keyword">new</span> DefaultLineMapper&lt;Player&gt;();
lineMapper.setLineTokenizer(<span class="hl-keyword">new</span> DelimitedLineTokenizer());
lineMapper.setFieldSetMapper(<span class="hl-keyword">new</span> PlayerFieldSetMapper());
itemReader.setLineMapper(lineMapper);
itemReader.open(<span class="hl-keyword">new</span> ExecutionContext());
Player player = itemReader.read();</pre><p>Each call to <code class="methodname">read</code> will return a new
        Player object from each line in the file. When the end of the file is
        reached, null will be returned.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="mappingFieldsByName" href="#mappingFieldsByName"></a>Mapping Fields by Name</h4></div></div></div><p>There is one additional piece of functionality that is allowed
        by both <code class="classname">DelimitedLineTokenizer</code> and
        <code class="classname">FixedLengthTokenizer</code> that is similar in
        function to a JDBC <code class="classname">ResultSet</code>. The names of the
        fields can be injected into either of these
        <code class="classname">LineTokenizer</code> implementations to increase the
        readability of the mapping function. First, the column names of all
        fields in the flat file are injected into the tokenizer:</p><pre class="programlisting">tokenizer.setNames(<span class="hl-keyword">new</span> String[] {<span class="hl-string">"ID"</span>, <span class="hl-string">"lastName"</span>,<span class="hl-string">"firstName"</span>,<span class="hl-string">"position"</span>,<span class="hl-string">"birthYear"</span>,<span class="hl-string">"debutYear"</span>});          </pre><p>A <code class="classname">FieldSetMapper</code> can use this information
        as follows:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> PlayerMapper <span class="hl-keyword">implements</span> FieldSetMapper&lt;Player&gt; {
    <span class="hl-keyword">public</span> Player mapFieldSet(FieldSet fs) {

       <span class="hl-keyword">if</span>(fs == null){
           <span class="hl-keyword">return</span> null;
       }

       Player player = <span class="hl-keyword">new</span> Player();
       player.setID(fs.readString(<span class="bold"><strong>"ID"</strong></span>));
       player.setLastName(fs.readString(<span class="bold"><strong>"lastName"</strong></span>));
       player.setFirstName(fs.readString(<span class="bold"><strong>"firstName"</strong></span>));
       player.setPosition(fs.readString(<span class="bold"><strong>"position"</strong></span>));
       player.setDebutYear(fs.readInt(<span class="bold"><strong>"debutYear"</strong></span>));
       player.setBirthYear(fs.readInt(<span class="bold"><strong>"birthYear"</strong></span>));

       <span class="hl-keyword">return</span> player;
   }
}</pre></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="beanWrapperFieldSetMapper" href="#beanWrapperFieldSetMapper"></a>Automapping FieldSets to Domain Objects</h4></div></div></div><p>For many, having to write a specific
        <code class="classname">FieldSetMapper</code> is as cumbersome as
        writing a specific <code class="classname">RowMapper</code> for a
        <code class="classname">JdbcTemplate</code>. Spring Batch makes this easier by
        providing a <code class="classname">FieldSetMapper</code> that automatically
        maps fields by matching a field name with a setter on the object using
        the JavaBean specification. Again using the football example, the
        <code class="classname">BeanWrapperFieldSetMapper</code> configuration looks
        like the following:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"fieldSetMapper"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"prototypeBeanName"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"player"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>

<span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"player"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.sample.domain.Player"</span>
      <span class="hl-attribute">scope</span>=<span class="hl-value">"prototype"</span><span class="hl-tag"> /&gt;</span></pre><p>For each entry in the <code class="classname">FieldSet</code>, the
        mapper will look for a corresponding setter on a new instance of the
        <code class="classname">Player</code> object (for this reason, prototype scope
        is required) in the same way the Spring container will look for
        setters matching a property name. Each available field in the
        <code class="classname">FieldSet</code> will be mapped, and the resultant
        <code class="classname">Player</code> object will be returned, with no code
        required.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="fixedLengthFileFormats" href="#fixedLengthFileFormats"></a>Fixed Length File Formats</h4></div></div></div><p>So far only delimited files have been discussed in much detail.
        However, they represent only half of the file reading picture. Many
        organizations that use flat files use fixed length formats. An example
        fixed length file is below:</p><pre class="programlisting">UK21341EAH4121131.11customer1
UK21341EAH4221232.11customer2
UK21341EAH4321333.11customer3
UK21341EAH4421434.11customer4
UK21341EAH4521535.11customer5</pre><p>While this looks like one large field, it actually represents four
        distinct fields:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>ISIN: Unique identifier for the item being ordered - 12
            characters long.</p></li><li class="listitem"><p>Quantity: Number of this item being ordered - 3 characters
            long.</p></li><li class="listitem"><p>Price: Price of the item - 5 characters long.</p></li><li class="listitem"><p>Customer: Id of the customer ordering the item - 9
            characters long.</p></li></ol></div><p>When configuring the
        <code class="classname">FixedLengthTokenizer</code>, each of these lengths
        must be provided in the form of ranges:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"fixedLengthLineTokenizer"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.file.transform.FixedLengthTokenizer"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"names"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"ISIN,Quantity,Price,Customer"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"columns"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"1-12, 13-15, 16-20, 21-29"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>Because the <code class="classname">FixedLengthLineTokenizer</code> uses
        the same <code class="classname">LineTokenizer</code> interface as discussed
        above, it will return the same <code class="classname">FieldSet</code> as if a
        delimiter had been used. This allows the same approaches to be used in
        handling its output, such as using the
        <code class="classname">BeanWrapperFieldSetMapper</code>.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="images/note.png"></td><th align="left">Note</th></tr><tr><td align="left" valign="top"><p>Supporting the above syntax for ranges requires that a
            specialized property editor,
            <code class="classname">RangeArrayPropertyEditor</code>, be configured in
            the <code class="classname">ApplicationContext</code>. However, this bean
            is automatically declared in an
            <code class="classname">ApplicationContext</code> where the batch
            namespace is used.</p></td></tr></table></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="prefixMatchingLineMapper" href="#prefixMatchingLineMapper"></a>Multiple Record Types within a Single File</h4></div></div></div><p>All of the file reading examples up to this point have made
        a key assumption for simplicity's sake: all of the records in a file
        have the same format. However, this may not always be the case. It is
        very common that a file might have records with different formats that
        need to be tokenized differently and mapped to different objects. The
        following excerpt from a file illustrates this:</p><pre class="programlisting">USER;Smith;Peter;;T;20014539;F
LINEA;1044391041ABC037.49G201XX1383.12H
LINEB;2134776319DEF422.99M005LI</pre><p>In this file we have three types of records, "USER", "LINEA",
        and "LINEB". A "USER" line corresponds to a User object. "LINEA" and
        "LINEB" both correspond to Line objects, though a "LINEA" has more
        information than a "LINEB".</p><p>The <code class="classname">ItemReader</code> will read each line
        individually, but we must specify different
        <code class="classname">LineTokenizer</code> and
        <code class="classname">FieldSetMapper</code> objects so that the
        <code class="classname">ItemWriter</code> will receive the correct items. The
        <code class="classname">PatternMatchingCompositeLineMapper</code> makes this
        easy by allowing maps of patterns to
        <code class="classname">LineTokenizer</code>s and patterns to
        <code class="classname">FieldSetMapper</code>s to be configured:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"orderFileLineMapper"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...PatternMatchingCompositeLineMapper"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"tokenizers"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;map&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"USER*"</span> <span class="hl-attribute">value-ref</span>=<span class="hl-value">"userTokenizer"</span><span class="hl-tag"> /&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"LINEA*"</span> <span class="hl-attribute">value-ref</span>=<span class="hl-value">"lineATokenizer"</span><span class="hl-tag"> /&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"LINEB*"</span> <span class="hl-attribute">value-ref</span>=<span class="hl-value">"lineBTokenizer"</span><span class="hl-tag"> /&gt;</span>
        <span class="hl-tag">&lt;/map&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"fieldSetMappers"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;map&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"USER*"</span> <span class="hl-attribute">value-ref</span>=<span class="hl-value">"userFieldSetMapper"</span><span class="hl-tag"> /&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"LINE*"</span> <span class="hl-attribute">value-ref</span>=<span class="hl-value">"lineFieldSetMapper"</span><span class="hl-tag"> /&gt;</span>
        <span class="hl-tag">&lt;/map&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>In this example, "LINEA" and "LINEB" have separate
        <code class="classname">LineTokenizer</code>s but they both use the same
        <code class="classname">FieldSetMapper</code>.</p><p>The <code class="classname">PatternMatchingCompositeLineMapper</code>
        makes use of the <code class="classname">PatternMatcher</code>'s
        <code class="classname">match</code> method in order to select the correct
        delegate for each line. The <code class="classname">PatternMatcher</code>
        allows for two wildcard characters with special meaning: the question
        mark ("?") will match exactly one character, while the asterisk ("*")
        will match zero or more characters. Note that in the configuration
        above, all patterns end with an asterisk, making them effectively
        prefixes to lines. The <code class="classname">PatternMatcher</code> will
        always match the most specific pattern possible, regardless of the
        order in the configuration. So if "LINE*" and "LINEA*" were both
        listed as patterns, "LINEA" would match pattern "LINEA*", while
        "LINEB" would match pattern "LINE*". Additionally, a single asterisk
        ("*") can serve as a default by matching any line not matched by any
        other pattern.</p><pre class="programlisting"><span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"*"</span> <span class="hl-attribute">value-ref</span>=<span class="hl-value">"defaultLineTokenizer"</span><span class="hl-tag"> /&gt;</span></pre><p>There is also a
        <code class="classname">PatternMatchingCompositeLineTokenizer</code> that can
        be used for tokenization alone.</p><p>It is also common for a flat file to contain records that each
        span multiple lines. To handle this situation, a more complex strategy
        is required. A demonstration of this common pattern can be found in
        <a class="xref" href="patterns.html#multiLineRecords" title="11.5&nbsp;Multi-Line Records">Section&nbsp;11.5, &#8220;Multi-Line Records&#8221;</a>.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="exceptionHandlingInFlatFiles" href="#exceptionHandlingInFlatFiles"></a>Exception Handling in Flat Files</h4></div></div></div><p>There are many scenarios when tokenizing a line may cause
        exceptions to be thrown. Many flat files are imperfect and contain
        records that aren't formatted correctly. Many users choose to skip
        these erroneous lines, logging the issue, the original line, and the line
        number. These logs can later be inspected manually or by another batch
        job. For this reason, Spring Batch provides a hierarchy of exceptions
        for handling parse exceptions:
        <code class="classname">FlatFileParseException</code> and
        <code class="classname">FlatFileFormatException</code>.
        <code class="classname">FlatFileParseException</code> is thrown by the
        <code class="classname">FlatFileItemReader</code> when any errors are
        encountered while trying to read a file.
        <code class="classname">FlatFileFormatException</code> is thrown by
        implementations of the <code class="classname">LineTokenizer</code> interface,
        and indicates a more specific error encountered while
        tokenizing.</p><div class="section"><div class="titlepage"><div><div><h5 class="title"><a name="incorrectTokenCountException" href="#incorrectTokenCountException"></a>IncorrectTokenCountException</h5></div></div></div><p>Both <code class="classname">DelimitedLineTokenizer</code> and
          <code class="classname">FixedLengthLineTokenizer</code> have the ability to
          specify column names that can be used for creating a
          <code class="classname">FieldSet</code>. However, if the number of column
          names doesn't match the number of columns found while tokenizing a
          line, the <code class="classname">FieldSet</code> can't be created, and an
          <code class="classname">IncorrectTokenCountException</code> is thrown, which
          contains the number of tokens encountered and the number
          expected:</p><pre class="programlisting">tokenizer.setNames(<span class="hl-keyword">new</span> String[] {<span class="hl-string">"A"</span>, <span class="hl-string">"B"</span>, <span class="hl-string">"C"</span>, <span class="hl-string">"D"</span>});

<span class="hl-keyword">try</span> {
    tokenizer.tokenize(<span class="hl-string">"a,b,c"</span>);
}
<span class="hl-keyword">catch</span>(IncorrectTokenCountException e){
    assertEquals(<span class="hl-number">4</span>, e.getExpectedCount());
    assertEquals(<span class="hl-number">3</span>, e.getActualCount());
}</pre><p>Because the tokenizer was configured with 4 column names, but
          only 3 tokens were found in the line, an
          <code class="classname">IncorrectTokenCountException</code> was
          thrown.</p></div><div class="section"><div class="titlepage"><div><div><h5 class="title"><a name="incorrectLineLengthException" href="#incorrectLineLengthException"></a>IncorrectLineLengthException</h5></div></div></div><p>Files formatted in a fixed length format have additional
          requirements when parsing because, unlike a delimited format, each
          column must strictly adhere to its predefined width. If the total
          line length doesn't match the upper bound of the last configured column, an
          exception is thrown:</p><pre class="programlisting">tokenizer.setColumns(<span class="hl-keyword">new</span> Range[] { <span class="hl-keyword">new</span> Range(<span class="hl-number">1</span>, <span class="hl-number">5</span>),
                                   <span class="hl-keyword">new</span> Range(<span class="hl-number">6</span>, <span class="hl-number">10</span>),
                                   <span class="hl-keyword">new</span> Range(<span class="hl-number">11</span>, <span class="hl-number">15</span>) });
<span class="hl-keyword">try</span> {
    tokenizer.tokenize(<span class="hl-string">"12345"</span>);
    fail(<span class="hl-string">"Expected IncorrectLineLengthException"</span>);
}
<span class="hl-keyword">catch</span> (IncorrectLineLengthException ex) {
    assertEquals(<span class="hl-number">15</span>, ex.getExpectedLength());
    assertEquals(<span class="hl-number">5</span>, ex.getActualLength());
}</pre><p>The configured ranges for the tokenizer above are: 1-5, 6-10,
          and 11-15, thus the total length of the line expected is 15.
          However, in this case a line of length 5 was passed in, causing an
          <code class="classname">IncorrectLineLengthException</code> to be thrown.
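</p><p>As a rough illustration of this check, the plain-Java sketch below computes the expected length from the configured ranges and rejects a line of any other length. The class <code class="classname">StrictLengthSketch</code>, its methods, and the use of <code class="classname">IllegalArgumentException</code> are assumptions made for the example; it is not the <code class="classname">FixedLengthLineTokenizer</code> implementation.</p><pre class="programlisting">class StrictLengthSketch {

    // Hypothetical stand-in for the strict check, assuming closed ranges
    // given as {min, max} pairs; not the Spring Batch code.
    static int expectedLength(int[][] ranges) {
        int widest = 0;
        for (int[] range : ranges) {
            widest = Math.max(widest, range[1]); // largest upper bound wins
        }
        return widest;
    }

    // Rejects any line whose length differs from the expected length.
    static void checkLength(String line, int[][] ranges) {
        int expected = expectedLength(ranges);
        if (line.length() != expected) {
            throw new IllegalArgumentException(
                "Expected length " + expected + " but was " + line.length());
        }
    }

    public static void main(String[] args) {
        int[][] ranges = { {1, 5}, {6, 10}, {11, 15} };
        checkLength("123456789012345", ranges); // 15 characters: accepted
        try {
            checkLength("12345", ranges);       // 5 characters: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}</pre><p>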
          Throwing an exception here rather than only mapping the first column
          allows the processing of the line to fail earlier, and with more
          information than it would if it failed while trying to read in
          column 2 in a <code class="classname">FieldSetMapper</code>. However, there
          are scenarios where the length of the line isn't always constant.
          For this reason, validation of line length can be turned off via the
          'strict' property:</p><pre class="programlisting">tokenizer.setColumns(<span class="hl-keyword">new</span> Range[] { <span class="hl-keyword">new</span> Range(<span class="hl-number">1</span>, <span class="hl-number">5</span>), <span class="hl-keyword">new</span> Range(<span class="hl-number">6</span>, <span class="hl-number">10</span>) });
<span class="bold"><strong>tokenizer.setStrict(false);</strong></span>
FieldSet tokens = tokenizer.tokenize(<span class="hl-string">"12345"</span>);
assertEquals(<span class="hl-string">"12345"</span>, tokens.readString(<span class="hl-number">0</span>));
assertEquals(<span class="hl-string">""</span>, tokens.readString(<span class="hl-number">1</span>));</pre><p>The above example is almost identical to the one before it,
          except that tokenizer.setStrict(false) was called. This setting
          tells the tokenizer to not enforce line lengths when tokenizing the
          line. A <code class="classname">FieldSet</code> is now correctly created and
          returned. However, it will only contain empty tokens for the
          remaining values.</p></div></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="flatFileItemWriter" href="#flatFileItemWriter"></a>6.6.3&nbsp;FlatFileItemWriter</h3></div></div></div><p>Writing out to flat files has the same problems and issues that
      reading in from a file must overcome. A step must be able to write out
      in either delimited or fixed length formats in a transactional
      manner.</p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="lineAggregator" href="#lineAggregator"></a>LineAggregator</h4></div></div></div><p>Just as the <code class="classname">LineTokenizer</code> interface is
        necessary to take a <code class="classname">String</code> and turn it into a
        <code class="classname">FieldSet</code>, file writing must have a way to
        aggregate multiple fields into a single string for writing to a file.
        In Spring Batch this is the
        <code class="classname">LineAggregator</code>:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> LineAggregator&lt;T&gt; {

    <span class="hl-keyword">public</span> String aggregate(T item);

}</pre><p>The <code class="classname">LineAggregator</code> is the opposite of a
        <code class="classname">LineTokenizer</code>.
        <code class="classname">LineTokenizer</code> takes a
        <code class="classname">String</code> and returns a
        <code class="classname">FieldSet</code>, whereas
        <code class="classname">LineAggregator</code> takes an
        <code class="classname">item</code> and returns a
        <code class="classname">String</code>.</p><div class="section"><div class="titlepage"><div><div><h5 class="title"><a name="PassThroughLineAggregator" href="#PassThroughLineAggregator"></a>PassThroughLineAggregator</h5></div></div></div><p>The most basic implementation of the LineAggregator interface
          is the <code class="classname">PassThroughLineAggregator</code>, which
          simply assumes that the object is already a string, or that its
          string representation is acceptable for writing:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> PassThroughLineAggregator&lt;T&gt; <span class="hl-keyword">implements</span> LineAggregator&lt;T&gt; {

    <span class="hl-keyword">public</span> String aggregate(T item) {
        <span class="hl-keyword">return</span> item.toString();
    }
}</pre><p>The above implementation is useful if direct control of
          creating the string is required, but the advantages of a
          <code class="classname">FlatFileItemWriter</code>, such as transaction and
          restart support, are necessary.</p></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="SimplifiedFileWritingExample" href="#SimplifiedFileWritingExample"></a>Simplified File Writing Example</h4></div></div></div><p>Now that the <code class="classname">LineAggregator</code> interface and
        its most basic implementation,
        <code class="classname">PassThroughLineAggregator</code>, have been defined,
        the basic flow of writing can be explained:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>The object to be written is passed to the
            <code class="classname">LineAggregator</code> in order to obtain a
            <code class="classname">String</code>.</p></li><li class="listitem"><p>The returned <code class="classname">String</code> is written to the
            configured file.</p></li></ol></div><p>The following excerpt from the
        <code class="classname">FlatFileItemWriter</code> expresses this in
        code:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">void</span> write(T item) <span class="hl-keyword">throws</span> Exception {
    write(lineAggregator.aggregate(item) + LINE_SEPARATOR);
}</pre><p>A simple configuration would look like the following:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemWriter"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...FlatFileItemWriter"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"resource"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"file:target/test-outputs/output.txt"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"lineAggregator"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...PassThroughLineAggregator"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="FieldExtractor" href="#FieldExtractor"></a>FieldExtractor</h4></div></div></div><p>The above example may be useful for the most basic uses of
        writing to a file. However, most users of the
        <code class="classname">FlatFileItemWriter</code> will have a domain object
        that needs to be written out, and thus must be converted into a line.
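</p><p>As a rough picture of that conversion, the plain-Java sketch below first pulls the fields out of an item and then joins them into one line. The <code class="classname">Credit</code> class and both helper methods are hypothetical names used only for this illustration; they are not part of Spring Batch.</p><pre class="programlisting">import java.math.BigDecimal;

class DomainObjectToLineSketch {

    // Hypothetical domain object, used only for this illustration.
    static final class Credit {
        final String name;
        final BigDecimal amount;

        Credit(String name, BigDecimal amount) {
            this.name = name;
            this.amount = amount;
        }
    }

    // Step 1: pull the fields to be written out of the item, in output order.
    static Object[] extract(Credit item) {
        return new Object[] { item.name, item.amount };
    }

    // Step 2: join the extracted fields into a single line.
    static String aggregate(Object[] fields, String delimiter) {
        StringBuilder line = new StringBuilder();
        String separator = "";
        for (Object field : fields) {
            line.append(separator).append(field);
            separator = delimiter; // every field after the first is prefixed
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Credit credit = new Credit("Smith", new BigDecimal("100.50"));
        System.out.println(aggregate(extract(credit), ",")); // Smith,100.50
    }
}</pre><p>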
        In file reading, the following was required:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Read one line from the file.</p></li><li class="listitem"><p>Pass the string line into the
              <code class="methodname">LineTokenizer#tokenize</code>() method, in
              order to retrieve a <code class="classname">FieldSet</code>.</p></li><li class="listitem"><p>Pass the <code class="classname">FieldSet</code> returned from
              tokenizing to a <code class="classname">FieldSetMapper</code>, returning
              the result from the <code class="methodname">ItemReader#read</code>()
              method.</p></li></ol></div><p>File writing has similar but inverse steps:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>Pass the item to be written to the writer.</p></li><li class="listitem"><p>Convert the fields on the item into an array.</p></li><li class="listitem"><p>Aggregate the resulting array into a line.</p></li></ol></div><p>Because there is no way for the framework to know which fields
        from the object need to be written out, a
        <code class="classname">FieldExtractor</code> must be written to accomplish
        the task of turning the item into an array:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> FieldExtractor&lt;T&gt; {

    Object[] extract(T item);

}</pre><p>Implementations of the <code class="classname">FieldExtractor</code>
        interface should create an array from the fields of the provided
        object, which can then be written out with a delimiter between the
        elements, or as part of a field-width line.</p><div class="section"><div class="titlepage"><div><div><h5 class="title"><a name="PassThroughFieldExtractor" href="#PassThroughFieldExtractor"></a>PassThroughFieldExtractor</h5></div></div></div><p>There are many cases where a collection, such as an array,
          <code class="classname">Collection</code>, or
          <code class="classname">FieldSet</code>, needs to be written out.
          "Extracting" an array from one of these collection types is very
          straightforward: simply convert the collection to an array.
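</p><p>The idea can be sketched in a few lines of plain Java. The class <code class="classname">PassThroughSketch</code> below is a simplified, hypothetical stand-in that handles only arrays and <code class="classname">Collection</code>s, wrapping anything else in a single-element array; it is not the Spring Batch class.</p><pre class="programlisting">import java.util.Arrays;
import java.util.Collection;

class PassThroughSketch {

    // Simplified stand-in for pass-through extraction (illustration only).
    static Object[] extract(Object item) {
        if (item instanceof Object[]) {
            return (Object[]) item;                // already an array: return it as is
        }
        if (item instanceof Collection) {
            return ((Collection) item).toArray();  // collection: copy into an array
        }
        return new Object[] { item };              // anything else: single-element array
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(extract(Arrays.asList("a", "b")))); // [a, b]
        System.out.println(Arrays.toString(extract("a")));                      // [a]
    }
}</pre><p>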
          Therefore, the <code class="classname">PassThroughFieldExtractor</code>
          should be used in this scenario. Note that if the
          object passed in is not a type of collection, then the
          <code class="classname">PassThroughFieldExtractor</code> will return an
          array containing solely the item to be extracted.</p></div><div class="section"><div class="titlepage"><div><div><h5 class="title"><a name="BeanWrapperFieldExtractor" href="#BeanWrapperFieldExtractor"></a>BeanWrapperFieldExtractor</h5></div></div></div><p>As with the <code class="classname">BeanWrapperFieldSetMapper</code>
          described in the file reading section, it is often preferable to
          configure how to convert a domain object to an object array, rather
          than writing the conversion yourself. The
          <code class="classname">BeanWrapperFieldExtractor</code> provides just this
          type of functionality:</p><pre class="programlisting">BeanWrapperFieldExtractor&lt;Name&gt; extractor = <span class="hl-keyword">new</span> BeanWrapperFieldExtractor&lt;Name&gt;();
extractor.setNames(<span class="hl-keyword">new</span> String[] { <span class="hl-string">"first"</span>, <span class="hl-string">"last"</span>, <span class="hl-string">"born"</span> });

String first = <span class="hl-string">"Alan"</span>;
String last = <span class="hl-string">"Turing"</span>;
<span class="hl-keyword">int</span> born = <span class="hl-number">1912</span>;

Name n = <span class="hl-keyword">new</span> Name(first, last, born);
Object[] values = extractor.extract(n);

assertEquals(first, values[<span class="hl-number">0</span>]);
assertEquals(last, values[<span class="hl-number">1</span>]);
assertEquals(born, values[<span class="hl-number">2</span>]);</pre><p>This extractor implementation has only one required property,
          the names of the fields to map. Just as the
          <code class="classname">BeanWrapperFieldSetMapper</code> needs field names
          to map fields on the <code class="classname">FieldSet</code> to setters on
          the provided object, the
          <code class="classname">BeanWrapperFieldExtractor</code> needs names to map
          to getters for creating an object array. It is worth noting that the
          order of the names determines the order of the fields within the
          array.</p></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="delimitedFileWritingExample" href="#delimitedFileWritingExample"></a>Delimited File Writing Example</h4></div></div></div><p>The most basic flat file format is one in which all fields are
        separated by a delimiter. This can be accomplished using a
        <code class="classname">DelimitedLineAggregator</code>. The example below
        writes out a simple domain object that represents a credit to a
        customer account:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> CustomerCredit {

    <span class="hl-keyword">private</span> <span class="hl-keyword">int</span> id;
    <span class="hl-keyword">private</span> String name;
    <span class="hl-keyword">private</span> BigDecimal credit;

    <span class="hl-comment">//getters and setters removed for clarity</span>
}</pre><p>Because a domain object is being used, an implementation of the
        FieldExtractor interface must be provided, along with the delimiter to
        use:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemWriter"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.file.FlatFileItemWriter"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"resource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"outputResource"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"lineAggregator"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...DelimitedLineAggregator"</span><span class="hl-tag">&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"delimiter"</span> <span class="hl-attribute">value</span>=<span class="hl-value">","</span><span class="hl-tag">/&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"fieldExtractor"</span><span class="hl-tag">&gt;</span>
                <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...BeanWrapperFieldExtractor"</span><span class="hl-tag">&gt;</span>
                    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"names"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"name,credit"</span><span class="hl-tag">/&gt;</span>
                <span class="hl-tag">&lt;/bean&gt;</span>
            <span class="hl-tag">&lt;/property&gt;</span>
        <span class="hl-tag">&lt;/bean&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>In this case, the
        <code class="classname">BeanWrapperFieldExtractor</code> described earlier in
        this chapter is used to turn the name and credit fields within
        <code class="classname">CustomerCredit</code> into an object array, which is
        then written out with commas between each field.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="fixedWidthFileWritingExample" href="#fixedWidthFileWritingExample"></a>Fixed Width File Writing Example</h4></div></div></div><p>Delimited is not the only type of flat file format. Many prefer
        to use a set width for each column to delineate between fields, which
        is usually referred to as 'fixed width'. Spring Batch supports this in
        file writing via the <code class="classname">FormatterLineAggregator</code>.
        Using the same <code class="classname">CustomerCredit</code> domain object
        described above, it can be configured as follows:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemWriter"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.file.FlatFileItemWriter"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"resource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"outputResource"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"lineAggregator"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...FormatterLineAggregator"</span><span class="hl-tag">&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"fieldExtractor"</span><span class="hl-tag">&gt;</span>
                <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...BeanWrapperFieldExtractor"</span><span class="hl-tag">&gt;</span>
                    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"names"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"name,credit"</span><span class="hl-tag"> /&gt;</span>
                <span class="hl-tag">&lt;/bean&gt;</span>
            <span class="hl-tag">&lt;/property&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"format"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"%-9s%-2.0f"</span><span class="hl-tag"> /&gt;</span>
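            <span class="hl-comment">&lt;!-- Illustrative reading of the format, assuming java.util.Formatter semantics:
                 %-9s   left-justifies the name in a field 9 characters wide;
                 %-2.0f prints the credit with no decimal places, left-justified
                        in a field at least 2 characters wide. --&gt;</span>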
        <span class="hl-tag">&lt;/bean&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>Most of the above example should look familiar. However, the
        value of the format property is new:</p><pre class="programlisting"><span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"format"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"%-9s%-2.0f"</span><span class="hl-tag"> /&gt;</span></pre><p>The underlying implementation is built using the same
        <code class="classname">Formatter</code> added as part of Java 5. The Java
        <code class="classname">Formatter</code> is based on the
        <code class="methodname">printf</code> functionality of the C programming
        language. Most details on how to configure a formatter can be found in
        the javadoc of <a class="ulink" href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/Formatter.html" target="_top"><em class="citetitle">Formatter</em></a>.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="handlingFileCreation" href="#handlingFileCreation"></a>Handling File Creation</h4></div></div></div><p><code class="classname">FlatFileItemReader</code> has a very simple
        relationship with file resources. When the reader is initialized, it
        opens the file if it exists, and throws an exception if it does not.
        File writing isn't quite so simple. At first glance it seems like a
        similarly straightforward contract should exist for
        <code class="classname">FlatFileItemWriter</code>: if the file already exists,
        throw an exception, and if it does not, create it and start writing.
        However, potentially restarting a <code class="classname">Job</code> can cause
        issues. In normal restart scenarios, the contract is reversed: if the
        file exists, start writing to it from the last known good position,
        and if it does not, throw an exception. However, what happens if the
        file name for this job is always the same? In this case, you would
        want to delete the file if it exists, unless it's a restart. Because
        of this possibility, the <code class="classname">FlatFileItemWriter</code>
        contains the property, <code class="methodname">shouldDeleteIfExists</code>.
        Setting this property to true will cause an existing file with the
        same name to be deleted when the writer is opened.</p></div></div></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="xmlReadingWriting" href="#xmlReadingWriting"></a>6.7&nbsp;XML Item Readers and Writers</h2></div></div></div><p>Spring Batch provides transactional infrastructure for both reading
    XML records and mapping them to Java objects as well as writing Java
    objects as XML records.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note: Constraints on streaming XML"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="images/note.png"></td><th align="left">Constraints on streaming XML</th></tr><tr><td align="left" valign="top"><p>The StAX API is used for I/O as other standard XML parsing APIs do
      not fit batch processing requirements (DOM loads the whole input into
      memory at once and SAX controls the parsing process allowing the user
      only to provide callbacks).</p></td></tr></table></div><p>Let's take a closer look at how XML input and output work in Spring
    Batch. First, there are a few concepts that vary from file reading and
    writing but are common across Spring Batch XML processing. With XML
    processing, instead of lines of records (FieldSets) that need to be
    tokenized, it is assumed an XML resource is a collection of 'fragments'
    corresponding to individual records:</p><div class="mediaobject" align="center"><img src="images/xmlinput.png" align="middle"><div class="caption"><p>Figure 3.1: XML Input</p></div></div><p>The 'trade' tag is defined as the 'root element' in the scenario
    above. Everything between '&lt;trade&gt;' and '&lt;/trade&gt;' is
    considered one 'fragment'. Spring Batch uses Object/XML Mapping (OXM) to
    bind fragments to objects. However, Spring Batch is not tied to any
    particular XML binding technology. Typical use is to delegate to <a class="ulink" href="http://docs.spring.io/spring-ws/site/reference/html/oxm.html" target="_top"><em class="citetitle">Spring
    OXM</em></a>, which provides uniform abstraction for the most
    popular OXM technologies. The dependency on Spring OXM is optional and you
    can choose to implement Spring Batch specific interfaces if desired. The
    relationship to the technologies that OXM supports can be shown as the
    following:</p><div class="mediaobject" align="center"><img src="images/oxm-fragments.png" align="middle"><div class="caption"><p>Figure 3.2: OXM Binding</p></div></div><p>Now with an introduction to OXM and how one can use XML fragments to
    represent records, let's take a closer look at readers and writers.</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="StaxEventItemReader" href="#StaxEventItemReader"></a>6.7.1&nbsp;StaxEventItemReader</h3></div></div></div><p>The <code class="classname">StaxEventItemReader</code> configuration
      provides a typical setup for the processing of records from an XML input
      stream. First, let's examine a set of XML records that the
      <code class="classname">StaxEventItemReader</code> can process.</p><pre class="programlisting"><span class="hl-directive" style="color: maroon">&lt;?xml version="1.0" encoding="UTF-8"?&gt;</span>
<span class="hl-tag">&lt;records&gt;</span>
    <span class="hl-tag">&lt;trade</span> <span class="hl-attribute">xmlns</span>=<span class="hl-value">"http://springframework.org/batch/sample/io/oxm/domain"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;isin&gt;</span>XYZ0001<span class="hl-tag">&lt;/isin&gt;</span>
        <span class="hl-tag">&lt;quantity&gt;</span>5<span class="hl-tag">&lt;/quantity&gt;</span>
        <span class="hl-tag">&lt;price&gt;</span>11.39<span class="hl-tag">&lt;/price&gt;</span>
        <span class="hl-tag">&lt;customer&gt;</span>Customer1<span class="hl-tag">&lt;/customer&gt;</span>
    <span class="hl-tag">&lt;/trade&gt;</span>
    <span class="hl-tag">&lt;trade</span> <span class="hl-attribute">xmlns</span>=<span class="hl-value">"http://springframework.org/batch/sample/io/oxm/domain"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;isin&gt;</span>XYZ0002<span class="hl-tag">&lt;/isin&gt;</span>
        <span class="hl-tag">&lt;quantity&gt;</span>2<span class="hl-tag">&lt;/quantity&gt;</span>
        <span class="hl-tag">&lt;price&gt;</span>72.99<span class="hl-tag">&lt;/price&gt;</span>
        <span class="hl-tag">&lt;customer&gt;</span>Customer2c<span class="hl-tag">&lt;/customer&gt;</span>
    <span class="hl-tag">&lt;/trade&gt;</span>
    <span class="hl-tag">&lt;trade</span> <span class="hl-attribute">xmlns</span>=<span class="hl-value">"http://springframework.org/batch/sample/io/oxm/domain"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;isin&gt;</span>XYZ0003<span class="hl-tag">&lt;/isin&gt;</span>
        <span class="hl-tag">&lt;quantity&gt;</span>9<span class="hl-tag">&lt;/quantity&gt;</span>
        <span class="hl-tag">&lt;price&gt;</span>99.99<span class="hl-tag">&lt;/price&gt;</span>
        <span class="hl-tag">&lt;customer&gt;</span>Customer3<span class="hl-tag">&lt;/customer&gt;</span>
    <span class="hl-tag">&lt;/trade&gt;</span>
<span class="hl-tag">&lt;/records&gt;</span></pre><p>To be able to process the XML records the following is needed:
      </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Root Element Name - Name of the root element of the fragment
            that constitutes the object to be mapped. The example
            configuration demonstrates this with the value of trade.</p></li><li class="listitem"><p>Resource - Spring Resource that represents the file to be
            read.</p></li><li class="listitem"><p><code class="classname">Unmarshaller</code> - Unmarshalling
            facility provided by Spring OXM for mapping the XML fragment to an
            object.</p></li></ul></div><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemReader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.xml.StaxEventItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"fragmentRootElementName"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"trade"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"resource"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"data/iosample/input/input.xml"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"unmarshaller"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"tradeMarshaller"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>Notice that in this example we have chosen to use an
      <code class="classname">XStreamMarshaller</code> which accepts an alias passed
      in as a map with the first key and value being the name of the fragment
      (i.e. root element) and the object type to bind. Then, similar to a
      <code class="classname">FieldSet</code>, the names of the other elements that
      map to fields within the object type are described as key/value pairs in
      the map. In the configuration file we can use a Spring configuration
      utility to describe the required alias as follows:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"tradeMarshaller"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.oxm.xstream.XStreamMarshaller"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"aliases"</span><span class="hl-tag">&gt;</span>
<span class="bold"><strong>        &lt;util:map id="aliases"&gt;
            &lt;entry key="trade"
                   value="org.springframework.batch.sample.domain.Trade" /&gt;
            &lt;entry key="price" value="java.math.BigDecimal" /&gt;
            &lt;entry key="customer" value="java.lang.String" /&gt;
        &lt;/util:map&gt;</strong></span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>On input the reader reads the XML resource until it recognizes
      that a new fragment is about to start (by matching the tag name by
      default). The reader creates a standalone XML document from the fragment
      (or at least makes it appear so) and passes the document to a
      deserializer (typically a wrapper around a Spring OXM
      <code class="classname">Unmarshaller</code>) to map the XML to a Java
      object.</p><p>In summary, this procedure is analogous to the following scripted
      Java code which uses the injection provided by the Spring
      configuration:</p><pre class="programlisting"><span class="hl-comment">// xmlResource is assumed to be a String holding the XML input</span>
StaxEventItemReader&lt;Trade&gt; xmlStaxEventItemReader = <span class="hl-keyword">new</span> StaxEventItemReader&lt;Trade&gt;();
Resource resource = <span class="hl-keyword">new</span> ByteArrayResource(xmlResource.getBytes());

Map&lt;String, String&gt; aliases = <span class="hl-keyword">new</span> HashMap&lt;String, String&gt;();
aliases.put(<span class="hl-string">"trade"</span>,<span class="hl-string">"org.springframework.batch.sample.domain.Trade"</span>);
aliases.put(<span class="hl-string">"price"</span>,<span class="hl-string">"java.math.BigDecimal"</span>);
aliases.put(<span class="hl-string">"customer"</span>,<span class="hl-string">"java.lang.String"</span>);
XStreamMarshaller unmarshaller = <span class="hl-keyword">new</span> XStreamMarshaller();
unmarshaller.setAliases(aliases);
xmlStaxEventItemReader.setUnmarshaller(unmarshaller);
xmlStaxEventItemReader.setResource(resource);
xmlStaxEventItemReader.setFragmentRootElementName(<span class="hl-string">"trade"</span>);
xmlStaxEventItemReader.open(<span class="hl-keyword">new</span> ExecutionContext());

<span class="hl-comment">// read() returns null once the input is exhausted</span>
Trade trade = xmlStaxEventItemReader.read();

<span class="hl-keyword">while</span> (trade != null) {
    System.out.println(trade);
    trade = xmlStaxEventItemReader.read();
}</pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="StaxEventItemWriter" href="#StaxEventItemWriter"></a>6.7.2&nbsp;StaxEventItemWriter</h3></div></div></div><p>Output works symmetrically to input. The
      <code class="classname">StaxEventItemWriter</code> needs a
      <code class="classname">Resource</code>, a marshaller, and a <code class="literal">rootTagName</code>. A Java
      object is passed to a marshaller (typically a standard Spring OXM
      <code class="classname">Marshaller</code>) which writes to a
      <code class="classname">Resource</code> using a custom event writer that filters
      the <code class="classname">StartDocument</code> and
      <code class="classname">EndDocument</code> events produced for each fragment by
      the OXM tools. We'll show this in an example using the
      <code class="classname">MarshallingEventWriterSerializer</code>. The Spring
      configuration for this setup looks as follows:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemWriter"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.xml.StaxEventItemWriter"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"resource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"outputResource"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"marshaller"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"customerCreditMarshaller"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rootTagName"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"customers"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"overwriteOutput"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"true"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>The configuration sets up the three required properties and
      optionally sets <code class="literal">overwriteOutput</code> to true, which (as mentioned earlier in the
      chapter) specifies whether an existing file can be overwritten. Note
      that the marshaller used for the writer is the same kind of
      <code class="classname">XStreamMarshaller</code> as
      the one used in the reading example from earlier in the chapter:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"customerCreditMarshaller"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.oxm.xstream.XStreamMarshaller"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"aliases"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;util:map</span> <span class="hl-attribute">id</span>=<span class="hl-value">"aliases"</span><span class="hl-tag">&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"customer"</span>
                   <span class="hl-attribute">value</span>=<span class="hl-value">"org.springframework.batch.sample.domain.CustomerCredit"</span><span class="hl-tag"> /&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"credit"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"java.math.BigDecimal"</span><span class="hl-tag"> /&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"name"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"java.lang.String"</span><span class="hl-tag"> /&gt;</span>
        <span class="hl-tag">&lt;/util:map&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>To summarize with a Java example, the following code illustrates
      all of the points discussed, demonstrating the programmatic setup of the
      required properties:</p><pre class="programlisting">StaxEventItemWriter&lt;CustomerCredit&gt; staxItemWriter = <span class="hl-keyword">new</span> StaxEventItemWriter&lt;CustomerCredit&gt;();
FileSystemResource resource = <span class="hl-keyword">new</span> FileSystemResource(<span class="hl-string">"data/outputFile.xml"</span>);

Map&lt;String, String&gt; aliases = <span class="hl-keyword">new</span> HashMap&lt;String, String&gt;();
aliases.put(<span class="hl-string">"customer"</span>,<span class="hl-string">"org.springframework.batch.sample.domain.CustomerCredit"</span>);
aliases.put(<span class="hl-string">"credit"</span>,<span class="hl-string">"java.math.BigDecimal"</span>);
aliases.put(<span class="hl-string">"name"</span>,<span class="hl-string">"java.lang.String"</span>);
<span class="hl-comment">// Declared as XStreamMarshaller because setAliases is not part of the Marshaller interface</span>
XStreamMarshaller marshaller = <span class="hl-keyword">new</span> XStreamMarshaller();
marshaller.setAliases(aliases);

staxItemWriter.setResource(resource);
staxItemWriter.setMarshaller(marshaller);
staxItemWriter.setRootTagName(<span class="hl-string">"customers"</span>);
staxItemWriter.setOverwriteOutput(true);

ExecutionContext executionContext = <span class="hl-keyword">new</span> ExecutionContext();
staxItemWriter.open(executionContext);
CustomerCredit credit = <span class="hl-keyword">new</span> CustomerCredit();
credit.setCredit(<span class="hl-keyword">new</span> BigDecimal(<span class="hl-string">"11.39"</span>));
credit.setName(<span class="hl-string">"Customer1"</span>);
<span class="hl-comment">// write accepts a list of items</span>
staxItemWriter.write(Collections.singletonList(credit));</pre></div></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="multiFileInput" href="#multiFileInput"></a>6.8&nbsp;Multi-File Input</h2></div></div></div><p>It is a common requirement to process multiple files within a single
    <code class="classname">Step</code>. Assuming the files all have the same
    formatting, the <code class="classname">MultiResourceItemReader</code> supports
    this type of input for both XML and flat file processing. Consider the
    following files in a directory:</p><pre class="programlisting">file-1.txt  file-2.txt  ignored.txt</pre><p>file-1.txt and file-2.txt are formatted the same and for business
    reasons should be processed together. The
    <code class="classname">MultiResourceItemReader</code> can be used to read in both
    files by using wildcards:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"multiResourceReader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...MultiResourceItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"resources"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"classpath:data/input/file-*.txt"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"delegate"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"flatFileItemReader"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>The referenced delegate is a simple
    <code class="classname">FlatFileItemReader</code>. The above configuration will
    read input from both files, handling rollback and restart scenarios. It
    should be noted that, as with any <code class="classname">ItemReader</code>,
    adding extra input (in this case a file) could cause potential issues when
    restarting. It is recommended that batch jobs work with their own
    individual directories until completed successfully.</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="database" href="#database"></a>6.9&nbsp;Database</h2></div></div></div><p>Like most enterprise application styles, a database is the central
    storage mechanism for batch. However, batch differs from other application
    styles due to the sheer size of the datasets with which the system must
    work. If a SQL statement returns 1 million rows, the result set probably
    holds all returned results in memory until all rows have been read. Spring
    Batch provides two types of solutions for this problem: Cursor and Paging
    database ItemReaders.</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="cursorBasedItemReaders" href="#cursorBasedItemReaders"></a>6.9.1&nbsp;Cursor Based ItemReaders</h3></div></div></div><p>Using a database cursor is generally the default approach of most
      batch developers, because it is the database's solution to the problem
      of 'streaming' relational data. The Java
      <code class="classname">ResultSet</code> interface is essentially an
      object-oriented mechanism for manipulating a cursor. A
      <code class="classname">ResultSet</code> maintains a cursor to the current row
      of data. Calling <code class="methodname">next</code> on a
      <code class="classname">ResultSet</code> moves this cursor to the next row.
      Spring Batch cursor based ItemReaders open a cursor on
      initialization, and move the cursor forward one row for every call to
      <code class="methodname">read</code>, returning a mapped object that can be
      used for processing. The <code class="methodname">close</code> method will then
      be called to ensure all resources are freed up. The Spring core
      <code class="classname">JdbcTemplate</code> gets around this problem by using
      the callback pattern to completely map all rows in a
      <code class="classname">ResultSet</code> and close before returning control back
      to the method caller. However, in batch this must wait until the step is
      complete. Below is a generic diagram of how a cursor based
      <code class="classname">ItemReader</code> works, and while a SQL statement is
      used as an example since it is so widely known, any technology could
      implement the basic approach:</p><div class="mediaobject" align="center"><img src="images/cursorExample.png" align="middle"></div><p>This example illustrates the basic pattern. Given a 'FOO' table,
      which has three columns: ID, NAME, and BAR, select all rows with an ID
      greater than 1 but less than 7. This puts the beginning of the cursor
      (row 1) on ID 2. The result of this row should be a completely mapped
      Foo object. Calling <code class="methodname">read</code>() again moves the
      cursor to the next row, which is the Foo with an ID of 3. The results of
      these reads will be written out after each
      <code class="methodname">read</code>, thus allowing the objects to be garbage
      collected (assuming no instance variables are maintaining references to
      them).</p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="JdbcCursorItemReader" href="#JdbcCursorItemReader"></a>JdbcCursorItemReader</h4></div></div></div><p><code class="classname">JdbcCursorItemReader</code> is the Jdbc
        implementation of the cursor based technique. It works directly with a
        <code class="classname">ResultSet</code> and requires a SQL statement to run
        against a connection obtained from a
        <code class="classname">DataSource</code>. The following database schema will
        be used as an example:</p><pre class="programlisting"><span class="hl-keyword">CREATE</span> <span class="hl-keyword">TABLE</span> CUSTOMER (
   ID <span class="hl-keyword">BIGINT</span> <span class="hl-keyword">IDENTITY</span> <span class="hl-keyword">PRIMARY</span> <span class="hl-keyword">KEY</span>,
   <span class="hl-keyword">NAME</span> <span class="hl-keyword">VARCHAR</span>(<span class="hl-number">45</span>),
   CREDIT <span class="hl-keyword">FLOAT</span>
);</pre><p>Many people prefer to use a domain object for each row, so we'll
        use an implementation of the <code class="classname">RowMapper</code>
        interface to map a <code class="classname">CustomerCredit</code>
        object:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> CustomerCreditRowMapper <span class="hl-keyword">implements</span> RowMapper {

    <span class="hl-keyword">public</span> <span class="hl-keyword">static</span> <span class="hl-keyword">final</span> String ID_COLUMN = <span class="hl-string">"id"</span>;
    <span class="hl-keyword">public</span> <span class="hl-keyword">static</span> <span class="hl-keyword">final</span> String NAME_COLUMN = <span class="hl-string">"name"</span>;
    <span class="hl-keyword">public</span> <span class="hl-keyword">static</span> <span class="hl-keyword">final</span> String CREDIT_COLUMN = <span class="hl-string">"credit"</span>;

    <span class="hl-keyword">public</span> Object mapRow(ResultSet rs, <span class="hl-keyword">int</span> rowNum) <span class="hl-keyword">throws</span> SQLException {
        CustomerCredit customerCredit = <span class="hl-keyword">new</span> CustomerCredit();

        customerCredit.setId(rs.getInt(ID_COLUMN));
        customerCredit.setName(rs.getString(NAME_COLUMN));
        customerCredit.setCredit(rs.getBigDecimal(CREDIT_COLUMN));

        <span class="hl-keyword">return</span> customerCredit;
    }
}</pre><p>Because <code class="classname">JdbcTemplate</code> is so familiar to
        users of Spring, and the <code class="classname">JdbcCursorItemReader</code>
        shares key interfaces with it, it is useful to see an example of how
        to read in this data with <code class="classname">JdbcTemplate</code>, in
        order to contrast it with the <code class="classname">ItemReader</code>. For
        the purposes of this example, let's assume there are 1,000 rows in the
        CUSTOMER table. The first example will be using
        <code class="classname">JdbcTemplate</code>:</p><pre class="programlisting"><span class="hl-comment">// For simplicity's sake, assume a dataSource has already been obtained</span>
JdbcTemplate jdbcTemplate = <span class="hl-keyword">new</span> JdbcTemplate(dataSource);
List customerCredits = jdbcTemplate.query(<span class="hl-string">"SELECT ID, NAME, CREDIT from CUSTOMER"</span>,
                                          <span class="hl-keyword">new</span> CustomerCreditRowMapper());</pre><p>After running this code snippet the customerCredits list will
        contain 1,000 <code class="classname">CustomerCredit</code> objects. In the
        query method, a connection will be obtained from the
        <code class="classname">DataSource</code>, the provided SQL will be run
        against it, and the <code class="methodname">mapRow</code> method will be
        called for each row in the <code class="classname">ResultSet</code>. Let's
        contrast this with the approach of the
        <code class="classname">JdbcCursorItemReader</code>:</p><pre class="programlisting">JdbcCursorItemReader itemReader = <span class="hl-keyword">new</span> JdbcCursorItemReader();
itemReader.setDataSource(dataSource);
itemReader.setSql(<span class="hl-string">"SELECT ID, NAME, CREDIT from CUSTOMER"</span>);
itemReader.setRowMapper(<span class="hl-keyword">new</span> CustomerCreditRowMapper());
<span class="hl-keyword">int</span> counter = <span class="hl-number">0</span>;
ExecutionContext executionContext = <span class="hl-keyword">new</span> ExecutionContext();
itemReader.open(executionContext);
<span class="hl-comment">// read() returns null once the cursor is exhausted, so only non-null items are counted</span>
Object customerCredit = itemReader.read();
<span class="hl-keyword">while</span> (customerCredit != null) {
    counter++;
    customerCredit = itemReader.read();
}
itemReader.close();</pre><p>After running this code snippet the counter will equal 1,000. If
        the code above had put the returned customerCredit into a list, the
        result would have been exactly the same as with the
        <code class="classname">JdbcTemplate</code> example. However, the big
        advantage of the <code class="classname">ItemReader</code> is that it allows
        items to be 'streamed'. The <code class="methodname">read</code> method can
        be called once, and the item written out via an
        <code class="classname">ItemWriter</code>, and then the next item obtained via
        <code class="methodname">read</code>. This allows item reading and writing to
        be done in 'chunks' and committed periodically, which is the essence
        of high performance batch processing. Furthermore, it is very easily
        configured for injection into a Spring Batch
        <code class="classname">Step</code>:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemReader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...JdbcCursorItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"dataSource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"dataSource"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"sql"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"select ID, NAME, CREDIT from CUSTOMER"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rowMapper"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.sample.domain.CustomerCreditRowMapper"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
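    <span class="hl-comment">&lt;!-- Optional tuning, shown only as a sketch: the values are illustrative assumptions,
         and the properties themselves are described in the table below --&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"fetchSize"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"100"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"verifyCursorPosition"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"true"</span><span class="hl-tag">/&gt;</span>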
<span class="hl-tag">&lt;/bean&gt;</span></pre><div class="section"><div class="titlepage"><div><div><h5 class="title"><a name="JdbcCursorItemReaderProperties" href="#JdbcCursorItemReaderProperties"></a>Additional Properties</h5></div></div></div><p>Because there are so many varying options for opening a cursor
          in Java, there are many properties on the
          <code class="classname">JdbcCursorItemReader</code> that can be set:</p><div class="table"><a name="d5e2752" href="#d5e2752"></a><p class="title"><b>Table&nbsp;6.2.&nbsp;JdbcCursorItemReader Properties</b></p><div class="table-contents"><table summary="JdbcCursorItemReader Properties" style="border-collapse: collapse;border-top: 0.5pt solid ; border-bottom: 0.5pt solid ; border-left: 0.5pt solid ; border-right: 0.5pt solid ; "><colgroup><col><col></colgroup><tbody><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">ignoreWarnings</td><td style="border-bottom: 0.5pt solid ; ">Determines whether or not SQLWarnings are logged or
                  cause an exception - default is true</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">fetchSize</td><td style="border-bottom: 0.5pt solid ; ">Gives the Jdbc driver a hint as to the number of rows
                  that should be fetched from the database when more rows are
                  needed by the <code class="classname">ResultSet</code> object used
                  by the <code class="classname">ItemReader</code>. By default, no
                  hint is given.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">maxRows</td><td style="border-bottom: 0.5pt solid ; ">Sets the limit for the maximum number of rows the
                  underlying <code class="classname">ResultSet</code> can hold at any
                  one time.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">queryTimeout</td><td style="border-bottom: 0.5pt solid ; ">Sets the number of seconds the driver will wait for a
                  <code class="classname">Statement</code> object to execute.
                  If the limit is exceeded, a
                  <code class="classname">DataAccessException</code> is thrown.
                  (Consult your driver vendor documentation for
                  details).</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">verifyCursorPosition</td><td style="border-bottom: 0.5pt solid ; ">Because the same <code class="classname">ResultSet</code>
                  held by the <code class="classname">ItemReader</code> is passed to
                  the <code class="classname">RowMapper</code>, it is possible for
                  users to call <code class="methodname">ResultSet.next</code>()
                  themselves, which could cause issues with the reader's
                  internal count. Setting this value to true will cause an
                  exception to be thrown if the cursor position is not the
                  same after the <code class="classname">RowMapper</code> call as it
                  was before.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">saveState</td><td style="border-bottom: 0.5pt solid ; ">Indicates whether or not the reader's state should be
                  saved in the <code class="classname">ExecutionContext</code>
                  provided by
                  <code class="methodname">ItemStream#update</code>(<code class="classname">ExecutionContext</code>).
                  The default value is true.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">driverSupportsAbsolute</td><td style="border-bottom: 0.5pt solid ; ">Defaults to false. Indicates whether the Jdbc driver
                  supports setting the absolute row on a
                  <code class="classname">ResultSet</code>. It is recommended that
                  this is set to true for Jdbc drivers that support
                  <code class="methodname">ResultSet.absolute</code>() as it may
                  improve performance, especially if a step fails while
                  working with a large data set.</td></tr><tr><td style="border-right: 0.5pt solid ; ">useSharedExtendedConnection</td><td style="">Defaults to false. Indicates whether the connection
                  used for the cursor should be used by all other processing
                  thus sharing the same transaction. If this is set to false,
                  which is the default, then the cursor will be opened using
                  its own connection and will not participate in any
                  transactions started for the rest of the step processing. If
                  you set this flag to true then you must wrap the
                  <code class="classname">DataSource</code> in an
                  <code class="classname">ExtendedConnectionDataSourceProxy</code> to
                  prevent the connection from being closed and released after
                  each commit. When you set this option to true then the
                  statement used to open the cursor will be created with both
                  'READ_ONLY' and 'HOLD_CURSORS_OVER_COMMIT' options. This
                  allows holding the cursor open over transaction start and
                  commits performed in the step processing. To use this
                  feature you need a database that supports this and a Jdbc
                  driver supporting Jdbc 3.0 or later.</td></tr></tbody></table></div></div><br class="table-break"></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="HibernateCursorItemReader" href="#HibernateCursorItemReader"></a>HibernateCursorItemReader</h4></div></div></div><p>Just as normal Spring users make important decisions about
        whether or not to use ORM solutions, which affect whether or not they
        use a <code class="classname">JdbcTemplate</code> or a
        <code class="classname">HibernateTemplate</code>, Spring Batch users have the
        same options. <code class="classname">HibernateCursorItemReader</code> is the
        Hibernate implementation of the cursor technique. Hibernate's usage in
        batch has been fairly controversial. This has largely been because
        Hibernate was originally developed to support online application
        styles. However, that doesn't mean it can't be used for batch
        processing. The easiest approach for solving this problem is to use a
        <code class="classname">StatelessSession</code> rather than a standard
        session. This removes all of the caching and dirty checking Hibernate
        employs that can cause issues in a batch scenario. For more
        information on the differences between stateless and normal Hibernate
        sessions, refer to the documentation of your specific Hibernate
        release. The <code class="classname">HibernateCursorItemReader</code> allows
        you to declare an HQL statement and pass in a
        <code class="classname">SessionFactory</code>, which will pass back one item
        per call to <code class="methodname">read</code> in the same basic fashion as
        the <code class="classname">JdbcCursorItemReader</code>. Below is an example
        configuration using the same 'customer credit' example as the JDBC
        reader:</p><pre class="programlisting">HibernateCursorItemReader itemReader = <span class="hl-keyword">new</span> HibernateCursorItemReader();
itemReader.setQueryString(<span class="hl-string">"from CustomerCredit"</span>);
<span class="hl-comment">//For simplicity sake, assume sessionFactory already obtained.</span>
itemReader.setSessionFactory(sessionFactory);
itemReader.setUseStatelessSession(true);
<span class="hl-keyword">int</span> counter = <span class="hl-number">0</span>;
ExecutionContext executionContext = <span class="hl-keyword">new</span> ExecutionContext();
itemReader.open(executionContext);
<span class="hl-comment">// read() returns null once the cursor is exhausted, so only non-null items are counted</span>
Object customerCredit = itemReader.read();
<span class="hl-keyword">while</span> (customerCredit != null) {
    counter++;
    customerCredit = itemReader.read();
}
itemReader.close();</pre><p>This configured <code class="classname">ItemReader</code> will return
        <code class="classname">CustomerCredit</code> objects in the exact same manner
        as described by the <code class="classname">JdbcCursorItemReader</code>,
        assuming Hibernate mapping files have been created correctly for the
        Customer table. The 'useStatelessSession' property defaults to true,
        but has been added here to draw attention to the ability to switch it
        on or off. It is also worth noting that the fetch size of the
        underlying cursor can be set via the fetchSize property. As with
        <code class="classname">JdbcCursorItemReader</code>, configuration is
        straightforward:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemReader"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.database.HibernateCursorItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"sessionFactory"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"sessionFactory"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"queryString"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"from CustomerCredit"</span><span class="hl-tag"> /&gt;</span>
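    <span class="hl-comment">&lt;!-- Optional, shown only as a sketch: useStatelessSession already defaults to true,
         and the fetchSize value is an illustrative assumption --&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"useStatelessSession"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"true"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"fetchSize"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"100"</span><span class="hl-tag"> /&gt;</span>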
<span class="hl-tag">&lt;/bean&gt;</span></pre></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="StoredProcedureItemReader" href="#StoredProcedureItemReader"></a>StoredProcedureItemReader</h4></div></div></div><p>Sometimes it is necessary to obtain the cursor data using a
        stored procedure. The <code class="classname">StoredProcedureItemReader</code>
        works like the <code class="classname">JdbcCursorItemReader</code> except that
        instead of executing a query to obtain a cursor we execute a stored
        procedure that returns a cursor. The stored procedure can return the
        cursor in three different ways:</p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>as a returned ResultSet (used by SQL Server, Sybase, DB2,
            Derby and MySQL)</p></li><li class="listitem"><p>as a ref-cursor returned as an out parameter (used by Oracle
            and PostgreSQL)</p></li><li class="listitem"><p>as the return value of a stored function call</p></li></ol></div><p>Below is a basic example configuration using the same 'customer
        credit' example as earlier:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"reader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"o.s.batch.item.database.StoredProcedureItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"dataSource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"dataSource"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"procedureName"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"sp_customer_credit"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rowMapper"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.sample.domain.CustomerCreditRowMapper"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>
</pre><p>This example relies on the stored procedure to provide a
        ResultSet as a returned result (option 1 above). </p><p>If the stored procedure returned a ref-cursor (option 2) then we
        would need to provide the position of the out parameter that is the
        returned ref-cursor. Here is an example where the first parameter is
        the returned ref-cursor:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"reader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"o.s.batch.item.database.StoredProcedureItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"dataSource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"dataSource"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"procedureName"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"sp_customer_credit"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"refCursorPosition"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"1"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rowMapper"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.sample.domain.CustomerCreditRowMapper"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>
</pre><p>If the cursor was returned from a stored function (option 3) we
        would need to set the property "<code class="varname">function</code>" to
        <code class="literal">true</code>. It defaults to <code class="literal">false</code>. Here
        is what that would look like:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"reader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"o.s.batch.item.database.StoredProcedureItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"dataSource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"dataSource"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"procedureName"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"sp_customer_credit"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"function"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"true"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rowMapper"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.sample.domain.CustomerCreditRowMapper"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>
</pre><p>In all of these cases we need to define a
        <code class="classname">RowMapper</code> as well as a
        <code class="classname">DataSource</code> and the actual procedure
        name.</p><p>If the stored procedure or function takes in parameters, then they
        must be declared and set via the parameters property. Here is an
        example for Oracle that declares three parameters. The first one is
        the out parameter that returns the ref-cursor; the second and third
        are in parameters that each take a value of type INTEGER:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"reader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"o.s.batch.item.database.StoredProcedureItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"dataSource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"dataSource"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"procedureName"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"spring.cursor_func"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"parameters"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;list&gt;</span>
            <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.jdbc.core.SqlOutParameter"</span><span class="hl-tag">&gt;</span>
                <span class="hl-tag">&lt;constructor-arg</span> <span class="hl-attribute">index</span>=<span class="hl-value">"0"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"newid"</span><span class="hl-tag">/&gt;</span>
                <span class="hl-tag">&lt;constructor-arg</span> <span class="hl-attribute">index</span>=<span class="hl-value">"1"</span><span class="hl-tag">&gt;</span>
                    <span class="hl-tag">&lt;util:constant</span> <span class="hl-attribute">static-field</span>=<span class="hl-value">"oracle.jdbc.OracleTypes.CURSOR"</span><span class="hl-tag">/&gt;</span>
                <span class="hl-tag">&lt;/constructor-arg&gt;</span>
            <span class="hl-tag">&lt;/bean&gt;</span>
            <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.jdbc.core.SqlParameter"</span><span class="hl-tag">&gt;</span>
                <span class="hl-tag">&lt;constructor-arg</span> <span class="hl-attribute">index</span>=<span class="hl-value">"0"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"amount"</span><span class="hl-tag">/&gt;</span>
                <span class="hl-tag">&lt;constructor-arg</span> <span class="hl-attribute">index</span>=<span class="hl-value">"1"</span><span class="hl-tag">&gt;</span>
                    <span class="hl-tag">&lt;util:constant</span> <span class="hl-attribute">static-field</span>=<span class="hl-value">"java.sql.Types.INTEGER"</span><span class="hl-tag">/&gt;</span>
                <span class="hl-tag">&lt;/constructor-arg&gt;</span>
            <span class="hl-tag">&lt;/bean&gt;</span>
            <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.jdbc.core.SqlParameter"</span><span class="hl-tag">&gt;</span>
                <span class="hl-tag">&lt;constructor-arg</span> <span class="hl-attribute">index</span>=<span class="hl-value">"0"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"custid"</span><span class="hl-tag">/&gt;</span>
                <span class="hl-tag">&lt;constructor-arg</span> <span class="hl-attribute">index</span>=<span class="hl-value">"1"</span><span class="hl-tag">&gt;</span>
                    <span class="hl-tag">&lt;util:constant</span> <span class="hl-attribute">static-field</span>=<span class="hl-value">"java.sql.Types.INTEGER"</span><span class="hl-tag">/&gt;</span>
                <span class="hl-tag">&lt;/constructor-arg&gt;</span>
            <span class="hl-tag">&lt;/bean&gt;</span>
        <span class="hl-tag">&lt;/list&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"refCursorPosition"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"1"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rowMapper"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"rowMapper"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"preparedStatementSetter"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"parameterSetter"</span><span class="hl-tag">/&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>In addition to the parameter declarations we need to specify a
        <code class="classname">PreparedStatementSetter</code> implementation that
        sets the parameter values for the call. This works the same as for the
        <code class="classname">JdbcCursorItemReader</code> above. All the additional
        properties listed in <a class="xref" href="readersAndWriters.html#JdbcCursorItemReaderProperties" title="Additional Properties">the section called &#8220;Additional Properties&#8221;</a>
        apply to the <code class="classname">StoredProcedureItemReader</code> as well.
        </p></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="pagingItemReaders" href="#pagingItemReaders"></a>6.9.2&nbsp;Paging ItemReaders</h3></div></div></div><p>An alternative to using a database cursor is executing multiple
      queries where each query brings back a portion of the results. We
      refer to this portion as a page. Each query that is executed must
      specify the starting row number and the number of rows that we want
      returned for the page.</p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="JdbcPagingItemReader" href="#JdbcPagingItemReader"></a>JdbcPagingItemReader</h4></div></div></div><p>One implementation of a paging <code class="classname">ItemReader</code>
        is the <code class="classname">JdbcPagingItemReader</code>. The
        <code class="classname">JdbcPagingItemReader</code> needs a
        <code class="classname">PagingQueryProvider</code> responsible for providing
        the SQL queries used to retrieve the rows making up a page. Since each
        database has its own strategy for providing paging support, we need to
        use a different <code class="classname">PagingQueryProvider</code> for each
        supported database type. There is also the
        <code class="classname">SqlPagingQueryProviderFactoryBean</code> that will
        auto-detect the database that is being used and determine the
        appropriate <code class="classname">PagingQueryProvider</code> implementation.
        This simplifies the configuration and is the recommended best
        practice.</p><p>The <code class="classname">SqlPagingQueryProviderFactoryBean</code>
        requires that you specify a select clause and a from clause. You can
        also provide an optional where clause. These clauses will be used to
        build an SQL statement combined with the required sortKey.</p><p>After the reader has been opened, it will pass back one item per
        call to <code class="methodname">read</code> in the same basic fashion as any
        other <code class="classname">ItemReader</code>. The paging happens behind the
        scenes when additional rows are needed.</p><p>Below is an example configuration using a similar 'customer
        credit' example as the cursor based ItemReaders above:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemReader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...JdbcPagingItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"dataSource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"dataSource"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"queryProvider"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...SqlPagingQueryProviderFactoryBean"</span><span class="hl-tag">&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"selectClause"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"select id, name, credit"</span><span class="hl-tag">/&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"fromClause"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"from customer"</span><span class="hl-tag">/&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"whereClause"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"where status=:status"</span><span class="hl-tag">/&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"sortKey"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"id"</span><span class="hl-tag">/&gt;</span>
        <span class="hl-tag">&lt;/bean&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"parameterValues"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;map&gt;</span>
            <span class="hl-tag">&lt;entry</span> <span class="hl-attribute">key</span>=<span class="hl-value">"status"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"NEW"</span><span class="hl-tag">/&gt;</span>
        <span class="hl-tag">&lt;/map&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"pageSize"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"1000"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rowMapper"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"customerMapper"</span><span class="hl-tag">/&gt;</span>
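    &lt;!-- Assumed variation, not part of the original example: with a traditional '?'
         placeholder (whereClause "where status=?"), the parameterValues entry would be
         keyed by the placeholder number instead: &lt;entry key="1" value="NEW"/&gt; --&gt;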
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>This configured <code class="classname">ItemReader</code> will return
        <code class="classname">CustomerCredit</code> objects using the
        <code class="classname">RowMapper</code> that must be specified. The
        'pageSize' property determines the number of entities read from the
        database for each query execution.</p><p>The 'parameterValues' property can be used to specify a Map of
        parameter values for the query. If you use named parameters in the
        where clause the key for each entry should match the name of the named
        parameter. If you use a traditional '?' placeholder then the key for
        each entry should be the number of the placeholder, starting with
        1.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="JpaPagingItemReader" href="#JpaPagingItemReader"></a>JpaPagingItemReader</h4></div></div></div><p>Another implementation of a paging
        <code class="classname">ItemReader</code> is the
        <code class="classname">JpaPagingItemReader</code>. JPA doesn't have a concept
        similar to the Hibernate <code class="classname">StatelessSession</code> so we
        have to use other features provided by the JPA specification. Since
        JPA supports paging, this is a natural choice when it comes to using
        JPA for batch processing. After each page is read, the entities will
        become detached and the persistence context will be cleared in order
        to allow the entities to be garbage collected once the page is
        processed.</p><p>The <code class="classname">JpaPagingItemReader</code> allows you to
        declare a JPQL statement and pass in an
        <code class="classname">EntityManagerFactory</code>. It will then pass back
        one item per call to <code class="methodname">read</code> in the same basic
        fashion as any other <code class="classname">ItemReader</code>. The paging
        happens behind the scenes when additional entities are needed. Below
        is an example configuration using the same 'customer credit' example
        as the JDBC reader above:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemReader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...JpaPagingItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"entityManagerFactory"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"entityManagerFactory"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"queryString"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"select c from CustomerCredit c"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"pageSize"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"1000"</span><span class="hl-tag">/&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>This configured <code class="classname">ItemReader</code> will return
        <code class="classname">CustomerCredit</code> objects in the exact same manner
        as described by the <code class="classname">JdbcPagingItemReader</code> above,
        assuming the CustomerCredit class has the correct JPA annotations or ORM
        mapping file. The 'pageSize' property determines the number of
        entities read from the database for each query execution.</p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="IbatisPagingItemReader" href="#IbatisPagingItemReader"></a>IbatisPagingItemReader</h4></div></div></div><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="images/note.png"></td><th align="left">Note</th></tr><tr><td align="left" valign="top">This reader is deprecated as of Spring Batch 3.0.</td></tr></table></div><p>If you use IBATIS for your data access then you can use the
        <code class="classname">IbatisPagingItemReader</code> which, as the name
        indicates, is an implementation of a paging
        <code class="classname">ItemReader</code>. IBATIS doesn't have direct support
        for reading rows in pages but by providing a couple of standard
        variables you can add paging support to your IBATIS queries.</p><p>Here is an example of a configuration for an
        <code class="classname">IbatisPagingItemReader</code> reading CustomerCredits
        as in the examples above:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemReader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...IbatisPagingItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"sqlMapClient"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"sqlMapClient"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"queryId"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"getPagedCustomerCredits"</span><span class="hl-tag">/&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"pageSize"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"1000"</span><span class="hl-tag">/&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>The <code class="classname">IbatisPagingItemReader</code> configuration
        above references an IBATIS query called "getPagedCustomerCredits".
        Here is an example of what that query should look like for
        MySQL.</p><pre class="programlisting"><span class="hl-tag">&lt;select</span> <span class="hl-attribute">id</span>=<span class="hl-value">"getPagedCustomerCredits"</span> <span class="hl-attribute">resultMap</span>=<span class="hl-value">"customerCreditResult"</span><span class="hl-tag">&gt;</span>
    select id, name, credit from customer order by id asc LIMIT #_skiprows#, #_pagesize#
<span class="hl-tag">&lt;/select&gt;</span></pre><p>The <code class="classname">_skiprows</code> and
        <code class="classname">_pagesize</code> variables are provided by the
        <code class="classname">IbatisPagingItemReader</code> and there is also a
        <code class="classname">_page</code> variable that can be used if necessary.
        The syntax for the paging queries varies with the database used. Here
        is an example for Oracle (unfortunately we need to use CDATA for some
        operators since this belongs in an XML document):</p><pre class="programlisting"><span class="hl-tag">&lt;select</span> <span class="hl-attribute">id</span>=<span class="hl-value">"getPagedCustomerCredits"</span> <span class="hl-attribute">resultMap</span>=<span class="hl-value">"customerCreditResult"</span><span class="hl-tag">&gt;</span>
    select * from (
      select * from (
        select t.id, t.name, t.credit, ROWNUM ROWNUM_ from customer t order by id
      ) where ROWNUM_ <span class="hl-tag">&lt;![CDATA[</span> &gt; <span class="hl-tag">]]&gt;</span> ( #_page# * #_pagesize# )
    ) where ROWNUM <span class="hl-tag">&lt;![CDATA[</span> &lt;= <span class="hl-tag">]]&gt;</span> #_pagesize#
<span class="hl-tag">&lt;/select&gt;</span></pre></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="databaseItemWriters" href="#databaseItemWriters"></a>6.9.3&nbsp;Database ItemWriters</h3></div></div></div><p>While both Flat Files and XML have specific ItemWriters, there is
      no exact equivalent in the database world. This is because transactions
      provide all the functionality that is needed. ItemWriters are necessary
      for files because they must act as if they're transactional, keeping
      track of written items and flushing or clearing at the appropriate
      times. Databases have no need for this functionality, since the write is
      already contained in a transaction. Users can create their own DAOs that
      implement the <code class="classname">ItemWriter</code> interface or use one
      from a custom <code class="classname">ItemWriter</code> that's written for
      generic processing concerns. Either way, they should work without any
      issues. One thing to look out for is the trade-off between performance and
      error handling that comes with batching the outputs. This is most
      common when using Hibernate as an <code class="classname">ItemWriter</code>, but
      the same issues can arise when using JDBC batch mode. Batching database
      output doesn't have any inherent flaws, assuming we are careful to flush
      and there are no errors in the data. However, any errors while writing
      out can cause confusion because there is no way to know which individual
      item caused an exception, or even if any individual item was
      responsible, as illustrated below:</p><div class="mediaobject" align="center"><img src="images/errorOnFlush.png" align="middle"></div><p>If items are buffered before being written out, any
      errors encountered will not be thrown until the buffer is flushed just
      before a commit. For example, let's assume that 20 items will be written
      per chunk, and the 15th item throws a DataIntegrityViolationException.
      As far as the Step is concerned, all 20 items will be written out
      successfully, since there's no way to know that an error will occur
      until they are actually written out. Once
      <code class="classname">Session#</code><code class="methodname">flush</code>() is
      called, the buffer will be emptied and the exception will be thrown. At
      this point, there's nothing the <code class="classname">Step</code> can do: the
      transaction must be rolled back. Normally, this exception might cause
      the item to be skipped (depending upon the skip/retry policies), and
      then it won't be written out again. However, in the batched scenario,
      there's no way for the <code class="classname">Step</code> to know which item caused the issue, because the whole
      buffer was being written out when the failure happened. The only way to
      solve this issue is to flush after each item:</p><div class="mediaobject" align="center"><img src="images/errorOnWrite.png" align="middle"></div><p>This is a common use case, especially when using Hibernate, and
      the simple guideline for implementations of
      <code class="classname">ItemWriter</code> is to flush on each call to
      <code class="methodname">write()</code>. Doing so allows for items to be
      skipped reliably, with Spring Batch taking care internally of the
      granularity of the calls to <code class="classname">ItemWriter</code> after an
      error.</p></div></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="reusingExistingServices" href="#reusingExistingServices"></a>6.10&nbsp;Reusing Existing Services</h2></div></div></div><p>Batch systems are often used in conjunction with other application
    styles. The most common is an online system, but it may also support
    integration or even a thick client application by moving necessary bulk
    data that each application style uses. For this reason, it is common that
    many users want to reuse existing DAOs or other services within their
    batch jobs. The Spring container itself makes this fairly easy by allowing
    any necessary class to be injected. However, there may be cases where the
    existing service needs to act as an <code class="classname">ItemReader</code> or
    <code class="classname">ItemWriter</code>, either to satisfy the dependency of
    another Spring Batch class, or because it truly is the main
    <code class="classname">ItemReader</code> for a step. It is fairly trivial to
    write an adapter class for each service that needs wrapping, but because
    it is such a common concern, Spring Batch provides implementations:
    <code class="classname">ItemReaderAdapter</code> and
    <code class="classname">ItemWriterAdapter</code>. Both classes implement the
    standard Spring method-invoking delegate pattern and are fairly simple
    to set up. Below is an example of the reader:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemReader"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.adapter.ItemReaderAdapter"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"targetObject"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"fooService"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"targetMethod"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"generateFoo"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>

<span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"fooService"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.sample.FooService"</span><span class="hl-tag"> /&gt;</span></pre><p>One important point to note is that the contract of the targetMethod
    must be the same as the contract for <code class="methodname">read</code>: when
    exhausted it returns null; otherwise it returns an <code class="classname">Object</code>.
    Anything else will prevent the framework from knowing when processing
    should end, either causing an infinite loop or incorrect failure,
    depending upon the implementation of the
    <code class="classname">ItemWriter</code>. The <code class="classname">ItemWriter</code>
    implementation is equally simple:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"itemWriter"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.adapter.ItemWriterAdapter"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"targetObject"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"fooService"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"targetMethod"</span> <span class="hl-attribute">value</span>=<span class="hl-value">"processFoo"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>

<span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"fooService"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.sample.FooService"</span><span class="hl-tag"> /&gt;</span>
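
&lt;!-- Sketch only: the signatures below are assumptions about the sample FooService,
     inferred from the targetMethod values used with the reader and writer adapters
     (Foo is a hypothetical item type):
       public Foo generateFoo()         returns the next Foo, or null when exhausted
       public void processFoo(Foo foo)  is invoked once for each item passed to write --&gt;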
</pre></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="validatingInput" href="#validatingInput"></a>6.11&nbsp;Validating Input</h2></div></div></div><p>During the course of this chapter, multiple approaches to parsing
    input have been discussed. Each major implementation will throw an
    exception if it is not 'well-formed'. The
    <code class="classname">FixedLengthTokenizer</code> will throw an exception if a
    range of data is missing. Similarly, attempting to access an index in a
    <code class="classname">RowMapper</code> or <code class="classname">FieldSetMapper</code>
    that doesn't exist or is in a different format than the one expected will
    cause an exception to be thrown. All of these types of exceptions will be
    thrown before <code class="methodname">read</code> returns. However, they don't
    address the issue of whether or not the returned item is valid. For
    example, if one of the fields is an age, it obviously cannot be negative.
    A negative age will still parse correctly, because the field exists and is a number, so no
    exception is thrown. Since there is already a plethora of validation
    frameworks, Spring Batch does not attempt to provide yet another, but
    rather provides a very simple interface that can be implemented by any
    number of frameworks:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">interface</span> Validator {

    <span class="hl-keyword">void</span> validate(Object value) <span class="hl-keyword">throws</span> ValidationException;

}</pre><p>The contract is that the <code class="methodname">validate</code> method
    will throw an exception if the object is invalid, and return normally if
    it is valid. Spring Batch provides an out-of-the-box
    <code class="classname">ItemProcessor</code>, the <code class="classname">ValidatingItemProcessor</code>:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.validator.ValidatingItemProcessor"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"validator"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"validator"</span><span class="hl-tag"> /&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span>

<span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"validator"</span>
      <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.item.validator.SpringValidator"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"validator"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"orderValidator"</span>
              <span class="hl-attribute">class</span>=<span class="hl-value">"org.springmodules.validation.valang.ValangValidator"</span><span class="hl-tag">&gt;</span>
            <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"valang"</span><span class="hl-tag">&gt;</span>
                <span class="hl-tag">&lt;value&gt;</span>
                    <span class="hl-tag">&lt;![CDATA[</span>
           { orderId : ? &gt; 0 AND ? &lt;= 9999999999 : 'Incorrect order ID' : 'error.order.id' }
           { totalLines : ? = size(lineItems) : 'Bad count of order lines'
                                              : 'error.order.lines.badcount'}
           { customer.registered : customer.businessCustomer = FALSE OR ? = TRUE
                                 : 'Business customer must be registered'
                                 : 'error.customer.registration'}
           { customer.companyName : customer.businessCustomer = FALSE OR ? HAS TEXT
                                  : 'Company name for business customer is mandatory'
                                  :'error.customer.companyname'}
                    <span class="hl-tag">]]&gt;</span>
                <span class="hl-tag">&lt;/value&gt;</span>
            <span class="hl-tag">&lt;/property&gt;</span>
        <span class="hl-tag">&lt;/bean&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>This example shows a simple
    <code class="classname">ValangValidator</code> that is used to validate an order
    object. The intent is not to show Valang functionality as much as to show
    how a validator could be added.</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="process-indicator" href="#process-indicator"></a>6.12&nbsp;Preventing State Persistence</h2></div></div></div><p>By default, all of the <code class="classname">ItemReader</code> and
    <code class="classname">ItemWriter</code> implementations store their current
    state in the <code class="classname">ExecutionContext</code> before it is
    committed. However, this may not always be the desired behavior. For
    example, many developers choose to make their database readers
    'rerunnable' by using a process indicator. An extra column is added to the
    input data to indicate whether or not it has been processed. When a
    particular record is being read (or written out) the processed flag is
    flipped from false to true. The SQL statement can then contain an extra
    condition in the where clause, such as "where PROCESSED_IND = false",
    thereby ensuring that only unprocessed records will be returned in the
    case of a restart. In this scenario, it is preferable to not store any
    state, such as the current row number, since it will be irrelevant upon
    restart. For this reason, all readers and writers include the 'saveState'
    property:</p><pre class="programlisting"><span class="hl-tag">&lt;bean</span> <span class="hl-attribute">id</span>=<span class="hl-value">"playerSummarizationSource"</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.spr...JdbcCursorItemReader"</span><span class="hl-tag">&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"dataSource"</span> <span class="hl-attribute">ref</span>=<span class="hl-value">"dataSource"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"rowMapper"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;bean</span> <span class="hl-attribute">class</span>=<span class="hl-value">"org.springframework.batch.sample.PlayerSummaryMapper"</span><span class="hl-tag"> /&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
    <span class="bold"><strong>&lt;property name="saveState" value="false" /&gt;</strong></span>
    <span class="hl-tag">&lt;property</span> <span class="hl-attribute">name</span>=<span class="hl-value">"sql"</span><span class="hl-tag">&gt;</span>
        <span class="hl-tag">&lt;value&gt;</span>
            SELECT games.player_id, games.year_no, SUM(COMPLETES),
            SUM(ATTEMPTS), SUM(PASSING_YARDS), SUM(PASSING_TD),
            SUM(INTERCEPTIONS), SUM(RUSHES), SUM(RUSH_YARDS),
            SUM(RECEPTIONS), SUM(RECEPTIONS_YARDS), SUM(TOTAL_TD)
            from games, players where players.player_id =
            games.player_id group by games.player_id, games.year_no
        <span class="hl-tag">&lt;/value&gt;</span>
    <span class="hl-tag">&lt;/property&gt;</span>
<span class="hl-tag">&lt;/bean&gt;</span></pre><p>The <code class="classname">ItemReader</code> configured above will not make
    any entries in the <code class="classname">ExecutionContext</code> for any
    executions in which it participates.</p></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="customReadersWriters" href="#customReadersWriters"></a>6.13&nbsp;Creating Custom ItemReaders and
    ItemWriters</h2></div></div></div><p>So far in this chapter the basic contracts that exist for reading
    and writing in Spring Batch and some common implementations have been
    discussed. However, these are all fairly generic, and there are many
    potential scenarios that may not be covered by out of the box
    implementations. This section will show, using a simple example, how to
    create a custom <code class="classname">ItemReader</code> and
    <code class="classname">ItemWriter</code> implementation and implement their
    contracts correctly. The <code class="classname">ItemReader</code> will also
    implement <code class="classname">ItemStream</code>, in order to illustrate how to
    make a reader or writer restartable.</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="customReader" href="#customReader"></a>6.13.1&nbsp;Custom ItemReader Example</h3></div></div></div><p>For the purpose of this example, a simple
      <code class="classname">ItemReader</code> implementation that reads from a
      provided list will be created. We'll start out by implementing the most
      basic contract of <code class="classname">ItemReader</code>,
      <code class="methodname">read</code>:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> CustomItemReader&lt;T&gt; <span class="hl-keyword">implements</span> ItemReader&lt;T&gt;{

    List&lt;T&gt; items;

    <span class="hl-keyword">public</span> CustomItemReader(List&lt;T&gt; items) {
        <span class="hl-keyword">this</span>.items = items;
    }

    <span class="hl-keyword">public</span> T read() <span class="hl-keyword">throws</span> Exception, UnexpectedInputException,
       ParseException {

        <span class="hl-keyword">if</span> (!items.isEmpty()) {
            <span class="hl-keyword">return</span> items.remove(<span class="hl-number">0</span>);
        }
        <span class="hl-keyword">return</span> null;
    }
}</pre><p>This very simple class takes a list of items, and returns them one
      at a time, removing each from the list. When the list is empty, it
      returns null, thus satisfying the most basic requirements of an
      <code class="classname">ItemReader</code>, as illustrated below:</p><pre class="programlisting">List&lt;String&gt; items = <span class="hl-keyword">new</span> ArrayList&lt;String&gt;();
items.add(<span class="hl-string">"1"</span>);
items.add(<span class="hl-string">"2"</span>);
items.add(<span class="hl-string">"3"</span>);

ItemReader&lt;String&gt; itemReader = <span class="hl-keyword">new</span> CustomItemReader&lt;String&gt;(items);
assertEquals(<span class="hl-string">"1"</span>, itemReader.read());
assertEquals(<span class="hl-string">"2"</span>, itemReader.read());
assertEquals(<span class="hl-string">"3"</span>, itemReader.read());
assertNull(itemReader.read());</pre><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="restartableReader" href="#restartableReader"></a>Making the <code class="classname">ItemReader</code>
        Restartable</h4></div></div></div><p>The final challenge now is to make the
        <code class="classname">ItemReader</code> restartable. Currently, if the power
        goes out, and processing begins again, the
        <code class="classname">ItemReader</code> must start at the beginning. This is
        actually valid in many scenarios, but it is sometimes preferable that
        a batch job starts where it left off. The key discriminant is often
        whether the reader is stateful or stateless. A stateless reader does
        not need to worry about restartability, but a stateful one has to try
        and reconstitute its last known state on restart. For this reason, we
        recommend that you keep custom readers stateless if possible, so you
        don't have to worry about restartability.</p><p>If you do need to store state, then the
        <code class="classname">ItemStream</code> interface should be used:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> CustomItemReader&lt;T&gt; <span class="hl-keyword">implements</span> ItemReader&lt;T&gt;, ItemStream {

    List&lt;T&gt; items;
    <span class="hl-keyword">int</span> currentIndex = <span class="hl-number">0</span>;
    <span class="hl-keyword">private</span> <span class="hl-keyword">static</span> <span class="hl-keyword">final</span> String CURRENT_INDEX = <span class="hl-string">"current.index"</span>;

    <span class="hl-keyword">public</span> CustomItemReader(List&lt;T&gt; items) {
        <span class="hl-keyword">this</span>.items = items;
    }

    <span class="hl-keyword">public</span> T read() <span class="hl-keyword">throws</span> Exception, UnexpectedInputException,
        ParseException {

        <span class="hl-keyword">if</span> (currentIndex &lt; items.size()) {
            <span class="hl-keyword">return</span> items.get(currentIndex++);
        }

        <span class="hl-keyword">return</span> null;
    }

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> open(ExecutionContext executionContext) <span class="hl-keyword">throws</span> ItemStreamException {
        <span class="hl-keyword">if</span>(executionContext.containsKey(CURRENT_INDEX)){
            currentIndex = (<span class="hl-keyword">int</span>) executionContext.getLong(CURRENT_INDEX);
        }
        <span class="hl-keyword">else</span>{
            currentIndex = <span class="hl-number">0</span>;
        }
    }

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> update(ExecutionContext executionContext) <span class="hl-keyword">throws</span> ItemStreamException {
        executionContext.putLong(CURRENT_INDEX, currentIndex);
    }

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> close() <span class="hl-keyword">throws</span> ItemStreamException {}
}</pre><p>On each call to the <code class="classname">ItemStream</code>
        <code class="methodname">update</code> method, the current index of the
        <code class="classname">ItemReader</code> will be stored in the provided
        <code class="classname">ExecutionContext</code> with a key of 'current.index'.
        When the <code class="classname">ItemStream</code> <code class="methodname">open</code>
        method is called, the <code class="classname">ExecutionContext</code> is
        checked to see if it contains an entry with that key. If the key is
        found, then the current index is moved to that location. This is a
        fairly trivial example, but it still meets the general
        contract:</p><pre class="programlisting">ExecutionContext executionContext = <span class="hl-keyword">new</span> ExecutionContext();
((ItemStream)itemReader).open(executionContext);
assertEquals(<span class="hl-string">"1"</span>, itemReader.read());
((ItemStream)itemReader).update(executionContext);

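// Simulate a restart: create a new reader over the same data and reopen it
// with the saved ExecutionContext.
List&lt;String&gt; items = <span class="hl-keyword">new</span> ArrayList&lt;String&gt;();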
items.add(<span class="hl-string">"1"</span>);
items.add(<span class="hl-string">"2"</span>);
items.add(<span class="hl-string">"3"</span>);
itemReader = <span class="hl-keyword">new</span> CustomItemReader&lt;String&gt;(items);

((ItemStream)itemReader).open(executionContext);
assertEquals(<span class="hl-string">"2"</span>, itemReader.read());</pre><p>Most ItemReaders have much more sophisticated restart logic. The
        <code class="classname">JdbcCursorItemReader</code>, for example, stores the
        row id of the last processed row in the Cursor.</p><p>It is also worth noting that the key used within the
        <code class="classname">ExecutionContext</code> should not be trivial. That is
        because the same <code class="classname">ExecutionContext</code> is used for
        all <code class="classname">ItemStream</code>s within a
        <code class="classname">Step</code>. In most cases, simply prefixing the key
        with the class name should be enough to guarantee uniqueness. However,
        in the rare cases where two of the same type of
        <code class="classname">ItemStream</code> are used in the same step (which can
        happen if two files are needed for output), a more specific name is
        required. For this reason, many of the Spring Batch
        <code class="classname">ItemReader</code> and
        <code class="classname">ItemWriter</code> implementations have a
        <code class="methodname">setName</code>() property that allows this key name
        to be overridden.</p></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="customWriter" href="#customWriter"></a>6.13.2&nbsp;Custom ItemWriter Example</h3></div></div></div><p>Implementing a custom <code class="classname">ItemWriter</code> is similar
      in many ways to the <code class="classname">ItemReader</code> example above, but
      differs in enough ways as to warrant its own example. However, adding
      restartability is essentially the same, so it won't be covered in this
      example. As with the <code class="classname">ItemReader</code> example, a
      <code class="classname">List</code> will be used in order to keep the example as
      simple as possible:</p><pre class="programlisting"><span class="hl-keyword">public</span> <span class="hl-keyword">class</span> CustomItemWriter&lt;T&gt; <span class="hl-keyword">implements</span> ItemWriter&lt;T&gt; {

    List&lt;T&gt; output = TransactionAwareProxyFactory.createTransactionalList();

    <span class="hl-keyword">public</span> <span class="hl-keyword">void</span> write(List&lt;? <span class="hl-keyword">extends</span> T&gt; items) <span class="hl-keyword">throws</span> Exception {
        output.addAll(items);
    }

    <span class="hl-keyword">public</span> List&lt;T&gt; getOutput() {
        <span class="hl-keyword">return</span> output;
    }
}</pre><div class="section"><div class="titlepage"><div><div><h4 class="title"><a name="restartableWriter" href="#restartableWriter"></a>Making the <code class="classname">ItemWriter</code>
        Restartable</h4></div></div></div><p>To make the ItemWriter restartable we would follow the same
        process as for the <code class="classname">ItemReader</code>, adding and
        implementing the <code class="classname">ItemStream</code> interface to
        synchronize the execution context. In the example we might have to
        count the number of items processed and add that as a footer record.
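        </p><p>A counter of this kind could be made restartable along the following
        lines. This is only a sketch: a plain <code class="classname">Map</code> stands in for the
        <code class="classname">ExecutionContext</code>, the class does not implement the real
        Spring Batch interfaces, and the class name, key and footer format are
        assumptions made for illustration:</p><pre class="programlisting">import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CountingItemWriter&lt;T&gt; {

    // Key prefixed with the class name to keep it unique within a step.
    static final String COUNT_KEY = "CountingItemWriter.written.count";

    private long writtenCount = 0;

    // Mirrors ItemStream.open: restore the counter saved before a failure.
    public void open(Map&lt;String, Long&gt; context) {
        writtenCount = context.getOrDefault(COUNT_KEY, 0L);
    }

    // Mirrors ItemWriter.write: a real writer would also output the items.
    public void write(List&lt;? extends T&gt; items) {
        writtenCount += items.size();
    }

    // Mirrors ItemStream.update: save the counter with each commit.
    public void update(Map&lt;String, Long&gt; context) {
        context.put(COUNT_KEY, writtenCount);
    }

    // Hypothetical footer format built from the running total.
    public String footer() {
        return "TOTAL:" + writtenCount;
    }

    public static void main(String[] args) {
        Map&lt;String, Long&gt; context = new HashMap&lt;String, Long&gt;();

        CountingItemWriter&lt;String&gt; first = new CountingItemWriter&lt;String&gt;();
        first.open(context);
        first.write(Arrays.asList("a", "b"));
        first.update(context);

        // Simulated restart: a new writer resumes from the saved count.
        CountingItemWriter&lt;String&gt; second = new CountingItemWriter&lt;String&gt;();
        second.open(context);
        second.write(Arrays.asList("c"));
        System.out.println(second.footer());
    }
}</pre><p>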
        If we needed to do that, we could implement
        <code class="classname">ItemStream</code> in our
        <code class="classname">ItemWriter</code> so that the counter was
        reconstituted from the execution context if the stream was
        re-opened.</p><p>In many realistic cases, a custom ItemWriter also delegates to
        another writer that is itself restartable (e.g. when writing to a
        file), or else it writes to a transactional resource and so doesn't need
        to be restartable because it is stateless. When you have a stateful
        writer, you should make sure to implement
        <code class="classname">ItemStream</code> as well as
        <code class="classname">ItemWriter</code>. Remember also that the client of
        the writer needs to be aware of the <code class="classname">ItemStream</code>,
        so you may need to register it as a stream in the configuration
        xml.</p></div></div></div></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="configureStep.html">Prev</a>&nbsp;</td><td width="20%" align="center">&nbsp;</td><td width="40%" align="right">&nbsp;<a accesskey="n" href="scalability.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">5.&nbsp;Configuring a Step&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;7.&nbsp;Scaling and Parallel Processing</td></tr></table></div></body></html>