<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
|
|
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
|
|
<appendix id="metaDataSchema">
|
|
<title>Meta-Data Schema</title>
|
|
|
|
<section id="metaDataSchemaOverview">
|
|
<title>Overview</title>
|
|
|
|
<para>The Spring Batch Meta-Data tables closely match the Domain
objects that represent them in Java. For example,
<classname>JobInstance</classname>, <classname>JobExecution</classname>,
<classname>JobParameters</classname>, and
<classname>StepExecution</classname> map to BATCH_JOB_INSTANCE,
BATCH_JOB_EXECUTION, BATCH_JOB_EXECUTION_PARAMS, and BATCH_STEP_EXECUTION,
respectively. <classname>ExecutionContext</classname> maps to both
BATCH_JOB_EXECUTION_CONTEXT and BATCH_STEP_EXECUTION_CONTEXT. The
<classname>JobRepository</classname> is responsible for saving
each Java object into its correct table. This appendix describes
the meta-data tables in detail, along with many of the design decisions
that were made when creating them. When viewing the table creation
statements below, keep in mind that the data types used are
as generic as possible. Spring Batch provides many schemas as examples,
and their data types vary because individual database
vendors handle data types differently. Below is an ERD model of all six tables and
their relationships to one another:</para>
|
|
|
|
<mediaobject>
|
|
<imageobject role="html">
|
|
<imagedata align="center" fileref="images/meta-data-erd.png"
|
|
scale="100" />
|
|
</imageobject>
|
|
|
|
<imageobject role="fo">
|
|
<imagedata align="center" fileref="images/meta-data-erd.png"
|
|
scale="45" />
|
|
</imageobject>
|
|
</mediaobject>
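<para>The relationships in the diagram can also be read as a join. The
following query is a minimal sketch that assumes only the generic table and
column names used in this appendix. It is intended for ad hoc inspection and
is not a statement issued by the framework:</para>

<programlisting language="sql">-- One row per step execution, with its parent job execution and job instance.
SELECT JI.JOB_NAME,
       JE.JOB_EXECUTION_ID,
       JE.STATUS AS JOB_STATUS,
       SE.STEP_NAME,
       SE.STATUS AS STEP_STATUS
FROM BATCH_JOB_INSTANCE JI
JOIN BATCH_JOB_EXECUTION JE ON JE.JOB_INSTANCE_ID = JI.JOB_INSTANCE_ID
JOIN BATCH_STEP_EXECUTION SE ON SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
ORDER BY JE.JOB_EXECUTION_ID, SE.STEP_EXECUTION_ID;</programlisting>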
|
|
|
|
<section id="exampleDDLScripts">
|
|
<title>Example DDL Scripts</title>
|
|
|
|
<para>The Spring Batch Core JAR file contains example
|
|
scripts to create the relational tables for a number of database
|
|
platforms (which are in turn auto-detected by the job repository factory
|
|
bean or namespace equivalent). These scripts can be used as is, or
|
|
modified with additional indexes and constraints as desired. The file
|
|
names are in the form <literal>schema-*.sql</literal>, where "*" is the
|
|
short name of the target database platform. The scripts are in
|
|
the package <literal>org.springframework.batch.core</literal>.</para>
|
|
</section>
|
|
|
|
<section id="metaDataVersion">
|
|
<title>Version</title>
|
|
|
|
<para>Many of the database tables discussed in this appendix contain a
version column. This column is important because Spring Batch employs an
optimistic locking strategy when dealing with updates to the database.
This means that each time a record is 'touched' (updated), the value in
the version column is incremented by one. When the repository goes back
to save the value, if the version number has changed, it
throws an <classname>OptimisticLockingFailureException</classname>,
indicating there has been an error with concurrent access. This check is
necessary because, even though different batch jobs may be running on
different machines, they all use the same database tables.</para>
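<para>The pattern can be sketched in SQL as follows. This is an illustration
of optimistic locking against BATCH_JOB_EXECUTION under the generic schema in
this appendix, not necessarily the exact statements the framework issues. The
<literal>?</literal> placeholders stand for bound parameters:</para>

<programlisting language="sql">-- 1. Read the row and remember the version that was seen.
SELECT VERSION, STATUS
FROM BATCH_JOB_EXECUTION
WHERE JOB_EXECUTION_ID = ?;

-- 2. Update only if the version is still the one that was read,
--    and increment it in the same statement.
UPDATE BATCH_JOB_EXECUTION
SET STATUS = ?,
    LAST_UPDATED = ?,
    VERSION = VERSION + 1
WHERE JOB_EXECUTION_ID = ?
  AND VERSION = ?;

-- 3. If the update count is 0, another process changed the row in the
--    meantime, and the caller treats this as a concurrent access failure.</programlisting>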
|
|
</section>
|
|
|
|
<section id="metaDataIdentity">
|
|
<title>Identity</title>
|
|
|
|
<para>BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, and BATCH_STEP_EXECUTION
each contain columns ending in _ID. These fields act as primary keys for
their respective tables. However, they are not database-generated keys;
they are generated by separate sequences. This is necessary
because, after inserting one of the domain objects into the database, the
key it is given needs to be set on the actual object so that it can be
uniquely identified in Java. Newer database drivers (JDBC 3.0 and up)
support this feature with database-generated keys, but rather than
requiring it, sequences are used. Each variation of the schema
contains some form of the following:</para>
|
|
|
|
<programlisting language="sql">CREATE SEQUENCE BATCH_STEP_EXECUTION_SEQ;
CREATE SEQUENCE BATCH_JOB_EXECUTION_SEQ;
CREATE SEQUENCE BATCH_JOB_SEQ;</programlisting>
|
|
|
|
<para>Many database vendors don't support sequences. In these cases,
|
|
work-arounds are used, such as the following for MySQL:</para>
|
|
|
|
<programlisting language="sql">CREATE TABLE BATCH_STEP_EXECUTION_SEQ (ID BIGINT NOT NULL) ENGINE=InnoDB;
INSERT INTO BATCH_STEP_EXECUTION_SEQ values(0);
CREATE TABLE BATCH_JOB_EXECUTION_SEQ (ID BIGINT NOT NULL) ENGINE=InnoDB;
INSERT INTO BATCH_JOB_EXECUTION_SEQ values(0);
CREATE TABLE BATCH_JOB_SEQ (ID BIGINT NOT NULL) ENGINE=InnoDB;
INSERT INTO BATCH_JOB_SEQ values(0);</programlisting>
|
|
|
|
<para>In the above case, a table is used in place of each sequence. The
Spring core class <classname>MySQLMaxValueIncrementer</classname>
then increments the single column in this table to provide similar
functionality.</para>
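<para>As a rough sketch of how such a table can stand in for a sequence, the
following MySQL statements obtain the next value by using the built-in
<literal>LAST_INSERT_ID</literal> function. This illustrates the general
technique and is not guaranteed to match the statements issued by the
incrementer class:</para>

<programlisting language="sql">-- Increment the single row and record the new value for this connection.
UPDATE BATCH_JOB_SEQ SET ID = LAST_INSERT_ID(ID + 1);

-- Read back the value recorded above. The result is per connection, so
-- concurrent callers each see the value produced by their own update.
SELECT LAST_INSERT_ID();</programlisting>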
|
|
</section>
|
|
</section>
|
|
|
|
<section id="metaDataBatchJobInstance">
|
|
<title>BATCH_JOB_INSTANCE</title>
|
|
|
|
<para>The BATCH_JOB_INSTANCE table holds all information relevant to a
|
|
<classname>JobInstance</classname>, and serves as the top of the overall
|
|
hierarchy. The following generic DDL statement is used to create
|
|
it:</para>
|
|
|
|
<programlisting language="sql">CREATE TABLE BATCH_JOB_INSTANCE (
  JOB_INSTANCE_ID BIGINT PRIMARY KEY,
  VERSION BIGINT,
  JOB_NAME VARCHAR(100) NOT NULL,
  JOB_KEY VARCHAR(2500)
);</programlisting>
|
|
|
|
<para>Below are descriptions of each column in the table:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>JOB_INSTANCE_ID: The unique id that will identify the instance,
|
|
which is also the primary key. The value of this column should be
|
|
obtainable by calling the <methodname>getId</methodname> method on
|
|
<classname>JobInstance</classname>.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>VERSION: See above section.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>JOB_NAME: Name of the job obtained from the
|
|
<classname>Job</classname> object. Because it is required to identify
|
|
the instance, it must not be null.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>JOB_KEY: A serialization of the
|
|
<classname>JobParameters</classname> that uniquely identifies separate
|
|
instances of the same job from one another.
|
|
(<classname>JobInstances</classname> with the same job name must have
|
|
different <classname>JobParameters</classname>, and thus, different
|
|
JOB_KEY values).</para>
|
|
</listitem>
|
|
</itemizedlist>
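<para>Because JOB_NAME and JOB_KEY together identify an instance, a lookup by
those two columns returns at most one row when the data is consistent. The
following query is a sketch for inspection purposes, with
<literal>?</literal> standing for bound parameters:</para>

<programlisting language="sql">-- Find the instance, if any, for a given job name and job key.
SELECT JOB_INSTANCE_ID, VERSION, JOB_NAME
FROM BATCH_JOB_INSTANCE
WHERE JOB_NAME = ?
  AND JOB_KEY = ?;</programlisting>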
|
|
</section>
|
|
|
|
<section id="metaDataBatchJobParams">
|
|
<title>BATCH_JOB_EXECUTION_PARAMS</title>
|
|
|
|
<para>The BATCH_JOB_EXECUTION_PARAMS table holds all information relevant to the
<classname>JobParameters</classname> object. It contains 0 or more
key/value pairs passed to a <classname>Job</classname> and serves as a record of the parameters
with which a job was run. For each parameter that contributes to the generation of a job's identity,
the IDENTIFYING flag is set to true. Note that the table has been
denormalized. Rather than creating a separate table for each type, there
is one table with a column indicating the type:</para>
|
|
|
|
<programlisting language="sql">CREATE TABLE BATCH_JOB_EXECUTION_PARAMS (
  JOB_EXECUTION_ID BIGINT NOT NULL,
  TYPE_CD VARCHAR(6) NOT NULL,
  KEY_NAME VARCHAR(100) NOT NULL,
  STRING_VAL VARCHAR(250),
  DATE_VAL DATETIME DEFAULT NULL,
  LONG_VAL BIGINT,
  DOUBLE_VAL DOUBLE PRECISION,
  IDENTIFYING CHAR(1) NOT NULL,
  constraint JOB_EXEC_PARAMS_FK foreign key (JOB_EXECUTION_ID)
  references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);</programlisting>
|
|
|
|
<para>Below are descriptions for each column:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>JOB_EXECUTION_ID: Foreign key from the BATCH_JOB_EXECUTION table
that indicates the job execution to which the parameter entry belongs.
Note that multiple rows (i.e., key/value pairs) may exist for
each execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>TYPE_CD: String representation of the type of value stored,
|
|
which can be either a string, date, long, or double. Because the type
|
|
must be known, it cannot be null.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>KEY_NAME: The parameter key.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>STRING_VAL: Parameter value, if the type is string.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>DATE_VAL: Parameter value, if the type is date.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>LONG_VAL: Parameter value, if the type is a long.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>DOUBLE_VAL: Parameter value, if the type is double.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>IDENTIFYING: Flag indicating if the parameter contributed to the identity of the related <classname>JobInstance</classname>.</para>
|
|
</listitem>
|
|
</itemizedlist>
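<para>Since each parameter occupies one row and only the value column that
matches TYPE_CD is populated, the parameters of a single execution can be
listed with a query such as the following sketch:</para>

<programlisting language="sql">-- List every parameter recorded for one job execution.
-- For each row, only the *_VAL column that matches TYPE_CD is expected
-- to hold the parameter value.
SELECT KEY_NAME,
       TYPE_CD,
       STRING_VAL,
       DATE_VAL,
       LONG_VAL,
       DOUBLE_VAL,
       IDENTIFYING
FROM BATCH_JOB_EXECUTION_PARAMS
WHERE JOB_EXECUTION_ID = ?
ORDER BY KEY_NAME;</programlisting>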
|
|
|
|
<para>It is worth noting that there is no primary key for this table. This
|
|
is simply because the framework has no use for one, and thus doesn't
|
|
require it. If a user so chooses, one may be added with a database
|
|
generated key, without causing any issues to the framework itself.</para>
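<para>For example, a surrogate key could be added on MySQL roughly as
follows. The column name <literal>PARAM_ID</literal> is hypothetical, and the
syntax for database-generated keys differs on other platforms:</para>

<programlisting language="sql">-- MySQL-specific sketch: add an auto-generated primary key column.
-- PARAM_ID is an illustrative name, not part of the Spring Batch schema.
ALTER TABLE BATCH_JOB_EXECUTION_PARAMS
  ADD COLUMN PARAM_ID BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY;</programlisting>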
|
|
</section>
|
|
|
|
<section id="metaDataBatchJobExecution">
|
|
<title>BATCH_JOB_EXECUTION</title>
|
|
|
|
<para>The BATCH_JOB_EXECUTION table holds all information relevant to the
|
|
<classname>JobExecution</classname> object. Every time a
|
|
<classname>Job</classname> is run there will always be a new
|
|
<classname>JobExecution</classname>, and a new row in this table:</para>
|
|
|
|
<programlisting language="sql">CREATE TABLE BATCH_JOB_EXECUTION (
  JOB_EXECUTION_ID BIGINT PRIMARY KEY,
  VERSION BIGINT,
  JOB_INSTANCE_ID BIGINT NOT NULL,
  CREATE_TIME TIMESTAMP NOT NULL,
  START_TIME TIMESTAMP DEFAULT NULL,
  END_TIME TIMESTAMP DEFAULT NULL,
  STATUS VARCHAR(10),
  EXIT_CODE VARCHAR(20),
  EXIT_MESSAGE VARCHAR(2500),
  LAST_UPDATED TIMESTAMP,
  JOB_CONFIGURATION_LOCATION VARCHAR(2500) NULL,
  constraint JOB_INSTANCE_EXECUTION_FK foreign key (JOB_INSTANCE_ID)
  references BATCH_JOB_INSTANCE(JOB_INSTANCE_ID)
);</programlisting>
|
|
|
|
<para>Below are descriptions for each column:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>JOB_EXECUTION_ID: Primary key that uniquely identifies this
|
|
execution. The value of this column is obtainable by calling the
|
|
<methodname>getId</methodname> method of the
|
|
<classname>JobExecution</classname> object.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>VERSION: See above section.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>JOB_INSTANCE_ID: Foreign key from the BATCH_JOB_INSTANCE table
|
|
indicating the instance to which this execution belongs. There may be
|
|
more than one execution per instance.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>CREATE_TIME: Timestamp representing the time that the execution
|
|
was created.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>START_TIME: Timestamp representing the time the execution was
|
|
started.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>END_TIME: Timestamp representing the time the execution was
|
|
finished, regardless of success or failure. An empty value in this
|
|
column even though the job is not currently running indicates that
|
|
there has been some type of error and the framework was unable to
|
|
perform a last save before failing.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>STATUS: Character string representing the status of the
|
|
execution. This may be COMPLETED, STARTED, etc. The object
|
|
representation of this column is the
|
|
<classname>BatchStatus</classname> enumeration.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>EXIT_CODE: Character string representing the exit code of the
|
|
execution. In the case of a command line job, this may be converted
|
|
into a number.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>EXIT_MESSAGE: Character string representing a more detailed
|
|
description of how the job exited. In the case of failure, this might
|
|
include as much of the stack trace as is possible.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>LAST_UPDATED: Timestamp representing the last time this
|
|
execution was persisted.</para>
|
|
</listitem>
|
|
</itemizedlist>
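<para>The END_TIME description above suggests a simple check for executions
that may have ended abnormally. The following query is a sketch. Whether a
listed execution is actually still running has to be confirmed by other
means:</para>

<programlisting language="sql">-- Executions with no recorded end time: either still running, or
-- terminated before the framework could perform a last save.
SELECT JOB_EXECUTION_ID,
       JOB_INSTANCE_ID,
       START_TIME,
       STATUS,
       LAST_UPDATED
FROM BATCH_JOB_EXECUTION
WHERE END_TIME IS NULL
ORDER BY CREATE_TIME;</programlisting>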
|
|
</section>
|
|
|
|
<section id="metaDataBatchStepExecution">
|
|
<title>BATCH_STEP_EXECUTION</title>
|
|
|
|
<para>The BATCH_STEP_EXECUTION table holds all information relevant to the
|
|
<classname>StepExecution</classname> object. This table is very similar in
|
|
many ways to the BATCH_JOB_EXECUTION table and there will always be at
|
|
least one entry per <classname>Step</classname> for each
|
|
<classname>JobExecution</classname> created:</para>
|
|
|
|
<programlisting language="sql">CREATE TABLE BATCH_STEP_EXECUTION (
  STEP_EXECUTION_ID BIGINT PRIMARY KEY,
  VERSION BIGINT NOT NULL,
  STEP_NAME VARCHAR(100) NOT NULL,
  JOB_EXECUTION_ID BIGINT NOT NULL,
  START_TIME TIMESTAMP NOT NULL,
  END_TIME TIMESTAMP DEFAULT NULL,
  STATUS VARCHAR(10),
  COMMIT_COUNT BIGINT,
  READ_COUNT BIGINT,
  FILTER_COUNT BIGINT,
  WRITE_COUNT BIGINT,
  READ_SKIP_COUNT BIGINT,
  WRITE_SKIP_COUNT BIGINT,
  PROCESS_SKIP_COUNT BIGINT,
  ROLLBACK_COUNT BIGINT,
  EXIT_CODE VARCHAR(20),
  EXIT_MESSAGE VARCHAR(2500),
  LAST_UPDATED TIMESTAMP,
  constraint JOB_EXECUTION_STEP_FK foreign key (JOB_EXECUTION_ID)
  references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);</programlisting>
|
|
|
|
<para>Below are descriptions for each column:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>STEP_EXECUTION_ID: Primary key that uniquely identifies this
|
|
execution. The value of this column should be obtainable by calling
|
|
the <methodname>getId</methodname> method of the
|
|
<classname>StepExecution</classname> object.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>VERSION: See above section.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>STEP_NAME: The name of the step to which this execution
|
|
belongs.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>JOB_EXECUTION_ID: Foreign key from the BATCH_JOB_EXECUTION table
|
|
indicating the JobExecution to which this StepExecution belongs. There
|
|
may be only one <classname>StepExecution</classname> for a given
|
|
<classname>JobExecution</classname> for a given
|
|
<classname>Step</classname> name.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>START_TIME: Timestamp representing the time the execution was
|
|
started.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>END_TIME: Timestamp representing the time the execution was
|
|
finished, regardless of success or failure. An empty value in this
|
|
column even though the job is not currently running indicates that
|
|
there has been some type of error and the framework was unable to
|
|
perform a last save before failing.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>STATUS: Character string representing the status of the
|
|
execution. This may be COMPLETED, STARTED, etc. The object
|
|
representation of this column is the
|
|
<classname>BatchStatus</classname> enumeration.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>COMMIT_COUNT: The number of times in which the step has
|
|
committed a transaction during this execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>READ_COUNT: The number of items read during this
|
|
execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>FILTER_COUNT: The number of items filtered out of this
|
|
execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>WRITE_COUNT: The number of items written and committed during
|
|
this execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>READ_SKIP_COUNT: The number of items skipped on read during this
|
|
execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>WRITE_SKIP_COUNT: The number of items skipped on write during
|
|
this execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>PROCESS_SKIP_COUNT: The number of items skipped during
|
|
processing during this execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>ROLLBACK_COUNT: The number of rollbacks during this execution.
|
|
Note that this count includes each time rollback occurs, including
|
|
rollbacks for retry and those in the skip recovery procedure.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>EXIT_CODE: Character string representing the exit code of the
|
|
execution. In the case of a command line job, this may be converted
|
|
into a number.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>EXIT_MESSAGE: Character string representing a more detailed
|
|
description of how the job exited. In the case of failure, this might
|
|
include as much of the stack trace as is possible.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>LAST_UPDATED: Timestamp representing the last time this
|
|
execution was persisted.</para>
|
|
</listitem>
|
|
</itemizedlist>
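<para>The counters described above can be summarized per step for a single
job execution. The following query is a sketch for reporting purposes.
<literal>COALESCE</literal> is used because the count columns are nullable in
the generic DDL:</para>

<programlisting language="sql">-- Item and transaction counts for each step of one job execution.
SELECT STEP_NAME,
       STATUS,
       READ_COUNT,
       FILTER_COUNT,
       WRITE_COUNT,
       COALESCE(READ_SKIP_COUNT, 0)
         + COALESCE(PROCESS_SKIP_COUNT, 0)
         + COALESCE(WRITE_SKIP_COUNT, 0) AS TOTAL_SKIP_COUNT,
       COMMIT_COUNT,
       ROLLBACK_COUNT
FROM BATCH_STEP_EXECUTION
WHERE JOB_EXECUTION_ID = ?
ORDER BY START_TIME;</programlisting>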
|
|
</section>
|
|
|
|
<section id="metaDataBatchJobExecutionContext">
|
|
<title>BATCH_JOB_EXECUTION_CONTEXT</title>
|
|
|
|
<para>The BATCH_JOB_EXECUTION_CONTEXT table holds all information relevant
to a <classname>Job</classname>'s
<classname>ExecutionContext</classname>. There is exactly one
<classname>Job</classname> <classname>ExecutionContext</classname> per
<classname>JobExecution</classname>, and it contains all of the job-level
data that is needed for a particular job execution. This data typically
represents the state that must be retrieved after a failure so that a
<classname>JobInstance</classname> can 'start from where it left
off'.</para>
|
|
|
|
<programlisting language="sql">CREATE TABLE BATCH_JOB_EXECUTION_CONTEXT (
  JOB_EXECUTION_ID BIGINT PRIMARY KEY,
  SHORT_CONTEXT VARCHAR(2500) NOT NULL,
  SERIALIZED_CONTEXT CLOB,
  constraint JOB_EXEC_CTX_FK foreign key (JOB_EXECUTION_ID)
  references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);</programlisting>
|
|
|
|
<para>Below are descriptions for each column:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>JOB_EXECUTION_ID: Foreign key representing the
<classname>JobExecution</classname> to which the context belongs.
Because this column is also the primary key, there is at most one row
for a given execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>SHORT_CONTEXT: A string version of the
|
|
SERIALIZED_CONTEXT.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>SERIALIZED_CONTEXT: The entire context, serialized.</para>
|
|
</listitem>
|
|
</itemizedlist>
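<para>To see the job-level state that a restart would have available, the
context can be joined back to its execution. The following query is a sketch
for inspection only:</para>

<programlisting language="sql">-- Job-level contexts for all executions of one job instance, newest first.
SELECT JE.JOB_EXECUTION_ID,
       JE.STATUS,
       JEC.SHORT_CONTEXT
FROM BATCH_JOB_EXECUTION JE
JOIN BATCH_JOB_EXECUTION_CONTEXT JEC
  ON JEC.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
WHERE JE.JOB_INSTANCE_ID = ?
ORDER BY JE.CREATE_TIME DESC;</programlisting>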
|
|
</section>
|
|
|
|
<section id="metaDataBatchStepExecutionContext">
|
|
<title>BATCH_STEP_EXECUTION_CONTEXT</title>
|
|
|
|
<para>The BATCH_STEP_EXECUTION_CONTEXT table holds all information
relevant to a <classname>Step</classname>'s
<classname>ExecutionContext</classname>. There is exactly one
<classname>ExecutionContext</classname> per
<classname>StepExecution</classname>, and it contains all of the data that
needs to be persisted for a particular step execution. This data typically
represents the state that must be retrieved after a failure so that a
<classname>JobInstance</classname> can 'start from where it left
off'.</para>
|
|
|
|
<programlisting language="sql">CREATE TABLE BATCH_STEP_EXECUTION_CONTEXT (
  STEP_EXECUTION_ID BIGINT PRIMARY KEY,
  SHORT_CONTEXT VARCHAR(2500) NOT NULL,
  SERIALIZED_CONTEXT CLOB,
  constraint STEP_EXEC_CTX_FK foreign key (STEP_EXECUTION_ID)
  references BATCH_STEP_EXECUTION(STEP_EXECUTION_ID)
);</programlisting>
|
|
|
|
<para>Below are descriptions for each column:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>STEP_EXECUTION_ID: Foreign key representing the
<classname>StepExecution</classname> to which the context belongs.
Because this column is also the primary key, there is at most one row
for a given execution.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>SHORT_CONTEXT: A string version of the
|
|
SERIALIZED_CONTEXT.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>SERIALIZED_CONTEXT: The entire context, serialized.</para>
|
|
</listitem>
|
|
</itemizedlist>
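<para>Step-level contexts can be inspected in the same way by joining to
BATCH_STEP_EXECUTION. The following query is a sketch that lists the persisted
state of every step in one job execution:</para>

<programlisting language="sql">-- Step-level contexts for all steps of one job execution.
SELECT SE.STEP_EXECUTION_ID,
       SE.STEP_NAME,
       SE.STATUS,
       SEC.SHORT_CONTEXT
FROM BATCH_STEP_EXECUTION SE
JOIN BATCH_STEP_EXECUTION_CONTEXT SEC
  ON SEC.STEP_EXECUTION_ID = SE.STEP_EXECUTION_ID
WHERE SE.JOB_EXECUTION_ID = ?
ORDER BY SE.START_TIME;</programlisting>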
|
|
</section>
|
|
|
|
<section id="metaDataArchiving">
|
|
<title>Archiving</title>
|
|
|
|
<para>Because there are entries in multiple tables every time a batch job
|
|
is run, it is common to create an archive strategy for the meta-data
|
|
tables. The tables themselves are designed to show a record of what
|
|
happened in the past, and generally won't affect the run of any job, with
|
|
a couple of notable exceptions pertaining to restart:</para>
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>The framework will use the meta-data tables to determine if a
|
|
particular JobInstance has been run before. If it has been run, and
|
|
the job is not restartable, then an exception will be thrown.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>If an entry for a JobInstance is removed without having
|
|
completed successfully, the framework will think that the job is new,
|
|
rather than a restart.</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>If a job is restarted, the framework will use any data that has
|
|
been persisted to the ExecutionContext to restore the Job's state.
|
|
Therefore, removing any entries from this table for jobs that haven't
|
|
completed successfully will prevent them from starting at the correct
|
|
point if run again.</para>
|
|
</listitem>
|
|
</itemizedlist>
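<para>One possible purge procedure that respects these constraints is
sketched below. It is an assumption-laden illustration, not a script shipped
with Spring Batch: the staging table <literal>ARCHIVE_CANDIDATE</literal> is
hypothetical, rows are deleted rather than copied to an archive, and only
instances with a COMPLETED execution older than a cutoff are selected. Deletes
run from child tables to parent tables so that the foreign keys shown in this
appendix are not violated:</para>

<programlisting language="sql">-- Hypothetical staging table holding the instances to purge.
CREATE TABLE ARCHIVE_CANDIDATE (JOB_INSTANCE_ID BIGINT PRIMARY KEY);

-- Select only instances that completed successfully before the cutoff.
INSERT INTO ARCHIVE_CANDIDATE (JOB_INSTANCE_ID)
SELECT DISTINCT JOB_INSTANCE_ID
FROM BATCH_JOB_EXECUTION
WHERE STATUS = 'COMPLETED'
  AND END_TIME &lt; ?;

-- Children first: step contexts, then step executions.
DELETE FROM BATCH_STEP_EXECUTION_CONTEXT
WHERE STEP_EXECUTION_ID IN (
  SELECT SE.STEP_EXECUTION_ID
  FROM BATCH_STEP_EXECUTION SE
  JOIN BATCH_JOB_EXECUTION JE ON JE.JOB_EXECUTION_ID = SE.JOB_EXECUTION_ID
  JOIN ARCHIVE_CANDIDATE AC ON AC.JOB_INSTANCE_ID = JE.JOB_INSTANCE_ID);

DELETE FROM BATCH_STEP_EXECUTION
WHERE JOB_EXECUTION_ID IN (
  SELECT JE.JOB_EXECUTION_ID
  FROM BATCH_JOB_EXECUTION JE
  JOIN ARCHIVE_CANDIDATE AC ON AC.JOB_INSTANCE_ID = JE.JOB_INSTANCE_ID);

-- Job-level context and parameters.
DELETE FROM BATCH_JOB_EXECUTION_CONTEXT
WHERE JOB_EXECUTION_ID IN (
  SELECT JE.JOB_EXECUTION_ID
  FROM BATCH_JOB_EXECUTION JE
  JOIN ARCHIVE_CANDIDATE AC ON AC.JOB_INSTANCE_ID = JE.JOB_INSTANCE_ID);

DELETE FROM BATCH_JOB_EXECUTION_PARAMS
WHERE JOB_EXECUTION_ID IN (
  SELECT JE.JOB_EXECUTION_ID
  FROM BATCH_JOB_EXECUTION JE
  JOIN ARCHIVE_CANDIDATE AC ON AC.JOB_INSTANCE_ID = JE.JOB_INSTANCE_ID);

-- Finally the executions and the instances themselves.
DELETE FROM BATCH_JOB_EXECUTION
WHERE JOB_INSTANCE_ID IN (SELECT JOB_INSTANCE_ID FROM ARCHIVE_CANDIDATE);

DELETE FROM BATCH_JOB_INSTANCE
WHERE JOB_INSTANCE_ID IN (SELECT JOB_INSTANCE_ID FROM ARCHIVE_CANDIDATE);

DROP TABLE ARCHIVE_CANDIDATE;</programlisting>

<para>Note that once an instance has been removed, the framework no longer
has a record that it ran, so launching the same job with the same identifying
parameters creates a new instance.</para>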
|
|
</section>
|
|
|
|
<section id="multiByteCharacters">
|
|
<title>International and Multi-byte Characters</title>
|
|
|
|
<para>If you are using multi-byte character sets (e.g. Chinese or Cyrillic)
in your business processing, then those characters might need to be
persisted in the Spring Batch schema. Many users find that
simply changing the schema to double the length of the <literal>VARCHAR</literal>
columns is enough. Others prefer to configure the <link
linkend="configuringJobRepository"><classname>JobRepository</classname></link> with
<literal>max-varchar-length</literal> set to half the value of the
<literal>VARCHAR</literal> column length. Some users have also reported that
they use <literal>NVARCHAR</literal> in place of <literal>VARCHAR</literal>
in their schema definitions. The best result will depend on the database
platform and the way the database server has been configured locally.</para>
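<para>As an illustration of the first approach, the following MySQL
statements double the length of the EXIT_MESSAGE columns from the generic
DDL. The syntax is MySQL-specific and is shown only as a sketch. Other
platforms use different <literal>ALTER TABLE</literal> forms, and the other
<literal>VARCHAR</literal> columns would need the same treatment:</para>

<programlisting language="sql">-- MySQL-specific sketch: widen EXIT_MESSAGE from 2500 to 5000 characters.
ALTER TABLE BATCH_JOB_EXECUTION MODIFY EXIT_MESSAGE VARCHAR(5000);
ALTER TABLE BATCH_STEP_EXECUTION MODIFY EXIT_MESSAGE VARCHAR(5000);</programlisting>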
|
|
</section>
|
|
|
|
<section id="recommendationsForIndexingMetaDataTables">
|
|
<title>Recommendations for Indexing Meta-Data Tables</title>
|
|
|
|
<para>Spring Batch provides DDL samples for the meta-data tables in the
Core jar file for several common database platforms. Index declarations
are not included in that DDL because there are too many variations in how
users may want to index, depending on their precise platform, local
conventions, and the business requirements of how the jobs will be
operated. The table below indicates which columns are
used in a WHERE clause by the DAO implementations provided by
Spring Batch, and how frequently they might be used, so that individual
projects can make their own decisions about indexing.</para>
|
|
|
|
<table>
|
|
<title>Where clauses in SQL statements (excluding primary keys) and
|
|
their approximate frequency of use.</title>
|
|
|
|
<tgroup cols="3">
|
|
<tbody>
|
|
<row>
|
|
<entry>Default Table Name</entry>
|
|
|
|
<entry>Where Clause</entry>
|
|
|
|
<entry>Frequency</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>BATCH_JOB_INSTANCE</entry>
|
|
|
|
<entry>JOB_NAME = ? and JOB_KEY = ?</entry>
|
|
|
|
<entry>Every time a job is launched</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>BATCH_JOB_EXECUTION</entry>
|
|
|
|
<entry>JOB_INSTANCE_ID = ?</entry>
|
|
|
|
<entry>Every time a job is restarted</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>BATCH_EXECUTION_CONTEXT</entry>
|
|
|
|
<entry>EXECUTION_ID = ? and KEY_NAME = ?</entry>
|
|
|
|
<entry>On commit interval, a.k.a. chunk</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>BATCH_STEP_EXECUTION</entry>
|
|
|
|
<entry>VERSION = ?</entry>
|
|
|
|
<entry>On commit interval, a.k.a. chunk (and at start and end of
|
|
step)</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>BATCH_STEP_EXECUTION</entry>
|
|
|
|
<entry>STEP_NAME = ? and JOB_EXECUTION_ID = ?</entry>
|
|
|
|
<entry>Before each step execution</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
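<para>Based on the WHERE clauses listed above, a project might start from
index declarations such as the following. The index names are hypothetical,
and whether each index is worthwhile depends on the platform and workload.
JOB_KEY is left out of the first index because a
<literal>VARCHAR(2500)</literal> column can exceed index key length limits on
some platforms:</para>

<programlisting language="sql">-- Looked up every time a job is launched.
CREATE INDEX BATCH_JOB_INST_NAME_IDX
  ON BATCH_JOB_INSTANCE (JOB_NAME);

-- Looked up every time a job is restarted.
CREATE INDEX BATCH_JOB_EXEC_INST_IDX
  ON BATCH_JOB_EXECUTION (JOB_INSTANCE_ID);

-- Looked up before each step execution.
CREATE INDEX BATCH_STEP_EXEC_NAME_IDX
  ON BATCH_STEP_EXECUTION (STEP_NAME, JOB_EXECUTION_ID);</programlisting>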
|
|
</section>
|
|
</appendix>
|