diff --git a/docs/src/site/docbook/reference/images/meta-data-erd.png b/docs/src/site/docbook/reference/images/meta-data-erd.png
new file mode 100755
index 000000000..2a7179068
Binary files /dev/null and b/docs/src/site/docbook/reference/images/meta-data-erd.png differ
diff --git a/docs/src/site/docbook/reference/index.xml b/docs/src/site/docbook/reference/index.xml
index e133beb63..ed1d2fb57 100644
--- a/docs/src/site/docbook/reference/index.xml
+++ b/docs/src/site/docbook/reference/index.xml
@@ -48,6 +48,8 @@
+
+
\ No newline at end of file
diff --git a/docs/src/site/docbook/reference/schema-appendix.xml b/docs/src/site/docbook/reference/schema-appendix.xml
new file mode 100644
index 000000000..871a731d0
--- /dev/null
+++ b/docs/src/site/docbook/reference/schema-appendix.xml
@@ -0,0 +1,448 @@
+
+
+
+ Meta-Data Schema
+
+
+ Overview
+
+ The Spring Batch Meta-Data tables very closely match the domain
+ objects that represent them in Java. For example, JobInstance,
+ JobExecution, JobParameters, StepExecution, and ExecutionContext map to
+ BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_JOB_PARAMS,
+ BATCH_STEP_EXECUTION, and BATCH_STEP_EXECUTION_CONTEXT, respectively. The
+ JobRepository is responsible for saving
+ each Java object into its correct table. The following appendix
+ describes the meta-data tables in detail, along with many of the design
+ decisions that were made when creating them. When viewing the various
+ table creation statements below, it is important to realize that the
+ datatypes used are as generic as possible. Spring Batch provides many
+ schemas as examples, which all have varying datatypes due to quirks in
+ individual database vendors' handling of data types. Below is an ERD model
+ of all 5 tables and their relationships to one another:
+
+
+
+
+
+
+
+
+ Version
+
+ Many of the database tables discussed in this appendix contain a
+ version column. This column is important because Spring Batch employs an
+ optimistic locking strategy when dealing with updates to the database.
+ This means that each time a record is 'touched' (updated) the value in
+ the version column is incremented by one. When the repository goes back
+ to save the value, if the version number has changed it will
+ throw OptimisticLockingFailureException,
+ indicating there has been an error with concurrent access. This check is
+ necessary because, even though different batch jobs may be running on
+ different machines, they are all using the same database tables.
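The version check described above can be illustrated with a minimal in-memory sketch. It is not Spring Batch code: the class `VersionedStoreSketch`, its methods, and `StaleVersionException` are hypothetical names chosen for illustration, and a real repository would perform the comparison in the database (for instance with an update that matches on both the id and the expected version).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a table with a VERSION column. */
final class VersionedStoreSketch {

    /** Hypothetical exception signalling that another writer updated the row first. */
    static final class StaleVersionException extends RuntimeException {
        StaleVersionException(String message) {
            super(message);
        }
    }

    private final Map<Long, Long> versions = new HashMap<>();
    private final Map<Long, String> statuses = new HashMap<>();

    /** Inserts a new row with version 0. */
    void insert(long id, String status) {
        versions.put(id, 0L);
        statuses.put(id, status);
    }

    String statusOf(long id) {
        return statuses.get(id);
    }

    /**
     * Updates the row only if the caller still holds the current version,
     * then increments the version by one and returns the new version.
     */
    long update(long id, long expectedVersion, String newStatus) {
        long current = versions.get(id);
        if (current != expectedVersion) {
            throw new StaleVersionException(
                "Row " + id + " is at version " + current
                    + " but the caller expected " + expectedVersion);
        }
        statuses.put(id, newStatus);
        versions.put(id, current + 1);
        return current + 1;
    }
}
```

In this sketch, a second writer that still holds version 0 after the first writer has saved is rejected, and the first writer's value is preserved.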
+
+
+
+ Identity
+
+ BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, and BATCH_STEP_EXECUTION
+ each contain columns ending in _ID, which act as primary keys for their
+ respective tables. However, they are not database generated keys, but
+ rather are generated by separate sequences. This is necessary because
+ after inserting one of the domain objects into the database, the key it
+ is given needs to be set on the actual object, so that it can be
+ uniquely identified in Java. Newer database drivers (JDBC 3.0 and up)
+ support this feature with database generated keys, but rather than
+ requiring it, sequences were used. Each variation of the schema will
+ contain some form of the following:
+
+ CREATE SEQUENCE BATCH_STEP_EXECUTION_SEQ;
+CREATE SEQUENCE BATCH_JOB_EXECUTION_SEQ;
+CREATE SEQUENCE BATCH_JOB_SEQ;
+
+ Many database vendors don't support sequences. In these
+ cases, workarounds are used, such as the following for MySQL:
+
+ CREATE TABLE BATCH_STEP_EXECUTION_SEQ (ID BIGINT NOT NULL) type=MYISAM;
+INSERT INTO BATCH_STEP_EXECUTION_SEQ values(0);
+CREATE TABLE BATCH_JOB_EXECUTION_SEQ (ID BIGINT NOT NULL) type=MYISAM;
+INSERT INTO BATCH_JOB_EXECUTION_SEQ values(0);
+CREATE TABLE BATCH_JOB_SEQ (ID BIGINT NOT NULL) type=MYISAM;
+INSERT INTO BATCH_JOB_SEQ values(0);
+
+ In the above case, a table is used in place of each sequence. The
+ Spring core class MySQLMaxValueIncrementer will
+ then increment the one column in this table in order to give similar
+ functionality.
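To make the key-assignment flow above concrete, here is a minimal sketch that replaces the sequence (or the single-row sequence table) with an in-memory counter. All names in it (`SequenceSketch`, `SequenceTable`, `JobInstanceStub`) are hypothetical; it only shows the idea of fetching the next value first and then setting it on the domain object, not how Spring's incrementer classes are implemented.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: obtain a key from a sequence, then set it on the object. */
final class SequenceSketch {

    /** Stands in for a one-row table such as BATCH_JOB_SEQ, seeded with 0. */
    static final class SequenceTable {
        private final AtomicLong id = new AtomicLong(0);

        /** Increments the single stored value and returns the new value. */
        long nextValue() {
            return id.incrementAndGet();
        }
    }

    /** Hypothetical domain object that needs its key before it is inserted. */
    static final class JobInstanceStub {
        private Long id;
        private final String jobName;

        JobInstanceStub(String jobName) {
            this.jobName = jobName;
        }

        Long getId() {
            return id;
        }

        String getJobName() {
            return jobName;
        }

        void setId(Long id) {
            this.id = id;
        }
    }

    /** Assigns the next sequence value to the object so it is identifiable in Java. */
    static JobInstanceStub assignKey(SequenceTable sequence, JobInstanceStub instance) {
        instance.setId(sequence.nextValue());
        return instance;
    }
}
```

Because the key is known before the row is written, the object can be identified in Java without relying on driver support for returning database generated keys.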
+
+
+
+
+ BATCH_JOB_INSTANCE
+
+ The BATCH_JOB_INSTANCE table holds all information relevant to a
+ JobInstance, and serves as the top of the overall
+ hierarchy. The following generic DDL statement is used to create
+ it:
+
+ CREATE TABLE BATCH_JOB_INSTANCE (
+ JOB_INSTANCE_ID BIGINT PRIMARY KEY ,
+ VERSION BIGINT,
+ JOB_NAME VARCHAR(100) NOT NULL ,
+ JOB_KEY VARCHAR(2500)
+);
+
+ Below are descriptions of each column in the table:
+
+
+
+ JOB_INSTANCE_ID: The unique id that will identify the instance,
+ which is also the primary key. The value of this column should be
+ obtainable by calling the getId method on
+ JobInstance.
+
+
+
+ VERSION: See above section.
+
+
+
+ JOB_NAME: Name of the job obtained from the
+ Job object. Because it is required to identify
+ the instance, it must not be null.
+
+
+
+ JOB_KEY: A serialization of the
+ JobParameters that uniquely identifies separate
+ instances of the same job from one another.
+ (JobInstances with the same job name
+
+
+
+
+
+ BATCH_JOB_PARAMS
+
+ The BATCH_JOB_PARAMS table holds all information relevant to the
+ JobParameters object. It contains 0 or more key/value pairs that together
+ uniquely identify a JobInstance and serve as a
+ record of the parameters a job was run with. It should be noted that the
+ table has been denormalized. Rather than creating a separate table for
+ each type, there is one table with a column indicating the type:
+
+ CREATE TABLE BATCH_JOB_PARAMS (
+ JOB_INSTANCE_ID BIGINT NOT NULL ,
+ TYPE_CD VARCHAR(6) NOT NULL ,
+ KEY_NAME VARCHAR(100) NOT NULL ,
+ STRING_VAL VARCHAR(250) ,
+ DATE_VAL TIMESTAMP DEFAULT NULL,
+ LONG_VAL BIGINT ,
+ DOUBLE_VAL DOUBLE PRECISION,
+ constraint JOB_INSTANCE_PARAMS_FK foreign key (JOB_INSTANCE_ID)
+ references BATCH_JOB_INSTANCE(JOB_INSTANCE_ID)
+);
+
+ Below are descriptions for each column:
+
+
+
+ JOB_INSTANCE_ID: Foreign key from the BATCH_JOB_INSTANCE table
+ that indicates the job instance the parameter entry belongs to. It
+ should be noted that multiple rows (i.e., key/value pairs) may exist for
+ each instance.
+
+
+
+ TYPE_CD: String representation of the type of value stored,
+ which can be either a character string, date, long, or double. Because
+ the type must be known, it cannot be null.
+
+
+
+ KEY_NAME: The Parameter key.
+
+
+
+ STRING_VAL: Parameter value, if the type is string.
+
+
+
+ DATE_VAL: Parameter value, if the type is date.
+
+
+
+ LONG_VAL: Parameter value, if the type is a long.
+
+
+
+ DOUBLE_VAL: Parameter value, if the type is double.
+
+
+
+ It is worth noting that there is no primary key for this table. This
+ is simply because the framework has no use for one, and thus doesn't
+ require it. If a user so chooses, one may be added with a database
+ generated key, without causing any issues to the framework itself.
+
+
+
+ BATCH_JOB_EXECUTION
+
+ The BATCH_JOB_EXECUTION table holds all information relevant to the
+ JobExecution object. Every time a
+ Job is run there will always be a new
+ JobExecution, and a new row in this table:
+
+ CREATE TABLE BATCH_JOB_EXECUTION (
+ JOB_EXECUTION_ID BIGINT PRIMARY KEY ,
+ VERSION BIGINT,
+ JOB_INSTANCE_ID BIGINT NOT NULL,
+ START_TIME TIMESTAMP DEFAULT NULL,
+ END_TIME TIMESTAMP DEFAULT NULL,
+ STATUS VARCHAR(10),
+ CONTINUABLE CHAR(1),
+ EXIT_CODE VARCHAR(20),
+ EXIT_MESSAGE VARCHAR(2500),
+ constraint JOB_INSTANCE_EXECUTION_FK foreign key (JOB_INSTANCE_ID)
+ references BATCH_JOB_INSTANCE(JOB_INSTANCE_ID)
+) ;
+
+ Below are descriptions for each column:
+
+
+
+ JOB_EXECUTION_ID: Primary key that uniquely identifies this
+ execution. The value of this column should be obtainable by calling
+ the getId method of the
+ JobExecution object.
+
+
+
+ VERSION: See above section.
+
+
+
+ JOB_INSTANCE_ID: Foreign key from the BATCH_JOB_INSTANCE table
+ indicating the instance to which this execution belongs. There may be
+ more than one execution per instance.
+
+
+
+ START_TIME: Timestamp representing the time the execution was
+ started.
+
+
+
+ END_TIME: Timestamp representing the time the execution was
+ finished, regardless of success or failure. An empty value in this
+ column even though the job is not currently running indicates that
+ there has been some type of error and the framework was unable to
+ perform a last save before failing.
+
+
+
+ STATUS: Character string representing the status of the
+ execution. This may be COMPLETED, STARTED, etc. The object
+ representation of this column is the
+ BatchStatus enumeration.
+
+
+
+ CONTINUABLE: Character indicating whether or not the execution
+ is currently able to continue. 'Y' for yes and 'N' for no.
+
+
+
+ EXIT_CODE: Character string representing the exit code of the
+ execution. In the case of a command line job, this may be converted
+ into a number.
+
+
+
+ EXIT_MESSAGE: Character string representing a more detailed
+ description of how the job exited. In the case of failure, this might
+ include as much of the stack trace as is possible.
+
+
+
+
+
+ BATCH_STEP_EXECUTION
+
+ The BATCH_STEP_EXECUTION table holds all information relevant to the
+ StepExecution object. This table is very similar in
+ many ways to the BATCH_JOB_EXECUTION table and there will always be at
+ least one entry per Step for each
+ JobExecution created:
+
+ CREATE TABLE BATCH_STEP_EXECUTION (
+ STEP_EXECUTION_ID BIGINT PRIMARY KEY ,
+ VERSION BIGINT NOT NULL,
+ STEP_NAME VARCHAR(100) NOT NULL,
+ JOB_EXECUTION_ID BIGINT NOT NULL,
+ START_TIME TIMESTAMP NOT NULL ,
+ END_TIME TIMESTAMP DEFAULT NULL,
+ STATUS VARCHAR(10),
+ COMMIT_COUNT BIGINT ,
+ ITEM_COUNT BIGINT ,
+ CONTINUABLE CHAR(1),
+ EXIT_CODE VARCHAR(20),
+ EXIT_MESSAGE VARCHAR(2500),
+ constraint JOB_EXECUTION_STEP_FK foreign key (JOB_EXECUTION_ID)
+ references BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
+) ;
+
+ Below are descriptions for each column:
+
+
+
+ STEP_EXECUTION_ID: Primary key that uniquely identifies this
+ execution. The value of this column should be obtainable by calling
+ the getId method of the
+ StepExecution object.
+
+
+
+ VERSION: See above section.
+
+
+
+ STEP_NAME: The name of the step to which this execution
+ belongs.
+
+
+
+ JOB_EXECUTION_ID: Foreign key from the BATCH_JOB_EXECUTION table
+ indicating the JobExecution to which this StepExecution belongs. There
+ may be only one StepExecution for a given
+ JobExecution for a given
+ Step name.
+
+
+
+ START_TIME: Timestamp representing the time the execution was
+ started.
+
+
+
+ END_TIME: Timestamp representing the time the execution was
+ finished, regardless of success or failure. An empty value in this
+ column even though the job is not currently running indicates that
+ there has been some type of error and the framework was unable to
+ perform a last save before failing.
+
+
+
+ STATUS: Character string representing the status of the
+ execution. This may be COMPLETED, STARTED, etc. The object
+ representation of this column is the
+ BatchStatus enumeration.
+
+
+
+ COMMIT_COUNT: The number of times in which the step has
+ committed a transaction during this execution.
+
+
+
+ ITEM_COUNT: The number of items that have been written out
+ during this execution.
+
+
+
+ CONTINUABLE: Character indicating whether or not the execution
+ is currently able to continue. 'Y' for yes and 'N' for no.
+
+
+
+ EXIT_CODE: Character string representing the exit code of the
+ execution. In the case of a command line job, this may be converted
+ into a number.
+
+
+
+ EXIT_MESSAGE: Character string representing a more detailed
+ description of how the job exited. In the case of failure, this might
+ include as much of the stack trace as is possible.
+
+
+
+
+
+ BATCH_STEP_EXECUTION_CONTEXT
+
+ The BATCH_STEP_EXECUTION_CONTEXT table holds all information
+ relevant to an ExecutionContext. There is exactly
+ one ExecutionContext per
+ StepExecution, and it contains all user defined
+ key/value pairs that need to be persisted for a particular job run. This data
+ is usually state information that must be retrieved after a failure
+ so that a JobInstance can 'start from where it left off'. As with the
+ BATCH_JOB_PARAMS table, this table has been denormalized and uses a column
+ to determine the type:
+
+ CREATE TABLE BATCH_STEP_EXECUTION_CONTEXT (
+ STEP_EXECUTION_ID BIGINT NOT NULL ,
+ TYPE_CD VARCHAR(6) NOT NULL ,
+ KEY_NAME VARCHAR(1000) NOT NULL ,
+ STRING_VAL VARCHAR(1000) ,
+ DATE_VAL TIMESTAMP DEFAULT NULL ,
+ LONG_VAL VARCHAR(10) ,
+ DOUBLE_VAL DOUBLE PRECISION ,
+ OBJECT_VAL BLOB,
+ constraint STEP_EXECUTION_CONTEXT_FK foreign key (STEP_EXECUTION_ID)
+ references BATCH_STEP_EXECUTION(STEP_EXECUTION_ID)
+) ;
+
+ Below are descriptions for each column:
+
+
+
+ STEP_EXECUTION_ID: Foreign key representing the
+ StepExecution to which the context belongs.
+ There may be more than one row associated to a given
+ StepExecution.
+
+
+
+ TYPE_CD: String representation of the type of value stored,
+ which can be either a character string, date, long, or double. Because
+ the type must be known, it cannot be null.
+
+
+
+ KEY_NAME: The Parameter key.
+
+
+
+ STRING_VAL: Parameter value, if the type is string.
+
+
+
+ DATE_VAL: Parameter value, if the type is date.
+
+
+
+ LONG_VAL: Parameter value, if the type is a long.
+
+
+
+ DOUBLE_VAL: Parameter value, if the type is double.
+
+
+
+ OBJECT_VAL: Parameter value, if the type is a blob.
+
+
+
+ When an ExecutionContext is stored, values that are one of the well
+ known types above will be stored as their respective type. Any unknown
+ type will be serialized to a blob and stored in the OBJECT_VAL column. As
+ with BATCH_JOB_PARAMS, there is no primary key for this table. This is
+ simply because the framework has no use for one, and thus doesn't require
+ it. If a user so chooses, one may be added with a database generated key,
+ without causing any issues to the framework itself.
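The type-based storage rule described above can be sketched as a small dispatch on the Java type of each value. This is only an illustration under stated assumptions: the class `ContextColumnSketch` and its methods are hypothetical, the column names come from the DDL above, and treating `java.util.Date` as the date type and plain Java serialization as the blob format are assumptions rather than confirmed framework behavior.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Date;

/** Hypothetical sketch of choosing a value column for one context entry. */
final class ContextColumnSketch {

    /** Returns the column that would hold the value; unknown types fall back to the blob. */
    static String columnFor(Object value) {
        if (value instanceof String) {
            return "STRING_VAL";
        }
        if (value instanceof Date) {
            return "DATE_VAL";
        }
        if (value instanceof Long) {
            return "LONG_VAL";
        }
        if (value instanceof Double) {
            return "DOUBLE_VAL";
        }
        return "OBJECT_VAL";
    }

    /** Serializes a value of an unknown type to bytes, as it might be stored in OBJECT_VAL. */
    static byte[] toBlob(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Note that in this sketch an `Integer` is not one of the well known types and would therefore be routed to OBJECT_VAL; whether the framework widens such values is not stated in the text above.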
+
+
\ No newline at end of file