= Spring Cloud Data Flow Schedule Migration
This project migrates existing schedules that use the SchedulerTaskLauncher, created with
Spring Cloud Data Flow 2.3.0.RELEASE and 2.4.0.RELEASE, to the new format. The new format allows the platform scheduler
to launch the task directly, restoring the behavior that existed prior to 2.3.0.RELEASE.
This is a Spring Boot application that uses Spring Batch to create a workflow
that migrates the schedules. The workflow is a single-step Spring Batch Job that does the following:
* Read - Retrieves all schedules from the scheduler.
* Process - Extracts the app properties from the SchedulerTaskLauncher instance and the associated platform job.
* Write - Deploys artifacts if required and creates the new schedule. Once the migrated
schedule has been created, the old schedule is destroyed.
If the application fails with an exception, you can re-execute the
application and it will pick up where it left off. Thanks Spring Batch! :-)
== Build the project
=== Services Required
In order to migrate the existing schedules, the Schedule Migrator requires the following services:
1. Access to the database where Spring Cloud Data Flow stores its Task Definitions. For Cloud Foundry, the database must be bound to the Schedule Migrator.
2. Access to the Scheduling Agent. For Cloud Foundry, the PCF Scheduler must be bound to the Schedule Migrator.
=== Running the Maven command
```
./mvnw clean package
```
== Running The Project For Cloud Foundry
=== Prerequisites
Spring Cloud Data Flow 2.3+ must be installed and running prior to launching the migration app.
=== Launching your migration
1) Create a manifest.yml file in a work directory.
```
---
applications:
- name: schedulemigrator
  host: schedulemigrator
  memory: 1G
  disk_quota: 1G
  instances: 0
  path: <location of the jar>
  env:
    SPRING_APPLICATION_NAME: schedulemigrator
    spring_cloud_deployer_cloudfoundry_url: <CF API URL>
    spring_cloud_deployer_cloudfoundry_org: <Your org>
    spring_cloud_deployer_cloudfoundry_space: <your space>
    spring_cloud_deployer_cloudfoundry_username: <Your CF user name>
    spring_cloud_deployer_cloudfoundry_password: <Your CF password>
    spring_cloud_deployer_cloudfoundry_skipSslValidation: <true/false>
    spring_cloud_deployer_cloudfoundry_services: <your SCDF database service,your scheduler service>
    spring_cloud_scheduler_cloudfoundry_schedulerUrl: <URL to scheduler service>
    spring_profiles_active: cloudfoundry
    dataflowServerUri: <Your SCDF version 2.3+ server URI>
    spring_cloud_task_closecontextEnabled: true
    remoteRepositories_repo1_url: https://repo.spring.io/libs-snapshot
  services:
  - <Your SCDF database service>
  - <Your SCDF scheduler service>
```
2) From the command line, use the cf CLI to log in to the org and space whose schedules you will migrate:
```
cf login -a <your CF API endpoint>
```
Or, if you need to skip SSL validation:
```
cf login -a <your CF API endpoint> --skip-ssl-validation
```
3) Now push the schedulemigrator from the directory where the manifest.yml is present:
```
cf push
```
4) To start the migration:
From the `dataflow-migrate-schedules` directory, launch `runMigration.sh` using the commands below:
```
chmod +x scripts/runMigration.sh
./scripts/runMigration.sh
```
=== Picking which schedules to migrate
Use the `scheduleNamesToMigrate` property to specify a comma-delimited list of
the schedules you wish to migrate. If you don't specify this property,
all schedules will be migrated. For example:
```
./scripts/runMigration.sh --scheduleNamesToMigrate=schedulename-scdf-taskdefname,anotherschedulename-scdf-taskdefname
```
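When the list of schedules is long, it can be easier to assemble the comma-delimited value in the shell than to type it by hand. The following is a minimal sketch; `join_schedule_names` is a hypothetical helper and is not part of this project.

```shell
# Hypothetical helper (not part of the project): joins its arguments with
# commas so the result can be passed to --scheduleNamesToMigrate.
join_schedule_names() {
    joined=""
    for name in "$@"; do
        if [ -z "$joined" ]; then
            joined="$name"
        else
            joined="$joined,$name"
        fi
    done
    printf '%s\n' "$joined"
}

# Build the argument from the two example schedule names used in this README.
names=$(join_schedule_names schedulename-scdf-taskdefname anotherschedulename-scdf-taskdefname)
echo "--scheduleNamesToMigrate=$names"
```

The printed argument can then be appended to the `./scripts/runMigration.sh` command line.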
=== Limiting one Scheduler to run at a time
If only one `schedulemigrator` should run at a time, set the `spring.cloud.task.single-instance-enabled` property to `true`. This stops other executions of the `schedulemigrator` until the currently running instance completes.
To enable this feature, use the `runMigration.sh` script as follows:
```
./scripts/runMigration.sh --spring.cloud.task.single-instance-enabled=true
```
=== Previously Pushed Apps
The Cloud Schedule Migration app deletes the SchedulerTaskLauncher and its associated schedule.
NOTE: If a single task definition has more than one schedule, the task definition
droplet created for the first migrated schedule will be used for all scheduled tasks of that task definition.
== Running The Project For Kubernetes
=== Prerequisites
* Spring Cloud Data Flow 2.3+ must be installed and running prior to launching the migration app.
* Spring Cloud Data Flow must be using the `exec` entry point for scheduling apps.
=== Launching your migration
1) Establish the following environment variables:
```
export spring_datasource_url=<The Spring Cloud Data Flow database URL to be used by the migration tool>
export spring_datasource_username=<The Spring Cloud Data Flow database username to be used by the migration tool>
export spring_datasource_password=<The Spring Cloud Data Flow database password to be used by the migration tool>
export spring_datasource_driverClassName=<The Spring Cloud Data Flow database driverClassName to be used by the migration tool>
export KUBERNETES_NAMESPACE=<The namespace for the scheduled apps>
export dbDriverClassName=<The Spring Cloud Data Flow database driverClassName to be used by the migrated tasks>
export dbUrl=<The Spring Cloud Data Flow database URL to be used by the migrated tasks>
export dbUserName=<The Spring Cloud Data Flow database user name to be used by the migrated tasks>
export dbPassword=<The Spring Cloud Data Flow database password to be used by the migrated tasks>
export dataflowUrl=<The URL to the current Spring Cloud Data Flow Server>
```
NOTE: There are two sets of datasource properties. The `spring_datasource_*`
properties are used by the migration tool, while the `db*` properties
set the database connection information that is required by the migrated tasks.
The database connection URL used by the migration tool can be different from the
one used by the tasks.
2) Configure the environment to access the cluster where the schedules are located.
NOTE: In some cases, if `KUBECONFIG` contains a list of kubeconfig files, the application may not select the proper kubeconfig file.
In these cases, set `KUBECONFIG` so that it points to the proper config file in the `$HOME/.kube/` directory.
3) Be sure to establish a port forward to the database that Spring Cloud Data Flow is using,
thus allowing the migration tool to gather information about task definitions.
For example, if using MySQL: `kubectl port-forward <mysql pod name> 3306:3306`
4) To start the migration:
From the `dataflow-migrate-schedules` directory, launch `runKubernetesMigration.sh` using the commands below:
```
chmod +x scripts/runKubernetesMigration.sh
./scripts/runKubernetesMigration.sh
```
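Because the Kubernetes migration depends on ten environment variables and on a usable `KUBECONFIG`, a quick pre-flight check before running the script can catch omissions early. The sketch below is illustrative only: `missing_vars` and `kubeconfig_is_list` are hypothetical helpers, not part of this project, and the variable names are the ones listed in step 1.

```shell
# Pre-flight sketch (not part of the project): reports which of the
# environment variables from step 1 are not exported or are empty.
REQUIRED_VARS="spring_datasource_url spring_datasource_username \
spring_datasource_password spring_datasource_driverClassName \
KUBERNETES_NAMESPACE dbDriverClassName dbUrl dbUserName dbPassword dataflowUrl"

missing_vars() {
    # Prints each name from the argument list that is unset, not exported, or empty.
    for var_name in "$@"; do
        if [ -z "$(printenv "$var_name")" ]; then
            printf '%s\n' "$var_name"
        fi
    done
}

kubeconfig_is_list() {
    # Succeeds when KUBECONFIG holds several colon-separated files,
    # the situation the NOTE in step 2 warns about.
    case "${KUBECONFIG:-}" in
        *:*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Word splitting of REQUIRED_VARS is intended here.
missing=$(missing_vars $REQUIRED_VARS)
if [ -n "$missing" ]; then
    echo "Missing environment variables:" $missing
fi
if kubeconfig_is_list; then
    echo "KUBECONFIG lists several files; consider pointing it at a single file."
fi
```

The check only reads the environment, so it is safe to run repeatedly before launching `runKubernetesMigration.sh`.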
=== Picking which schedules to migrate
By default, `runKubernetesMigration.sh` migrates all schedules.
However, if only a specific set of schedules needs to be migrated, use the
`scheduleNamesToMigrate` property to specify a comma-delimited list of
the schedules you wish to migrate. For example:
```
./scripts/runKubernetesMigration.sh --scheduleNamesToMigrate=schedulename-scdf-taskdefname,anotherschedulename-scdf-taskdefname
```
=== Limiting one Scheduler to run at a time
If only one `schedulemigrator` should run at a time, set the `spring.cloud.task.single-instance-enabled` property to `true`. This stops other executions of the `schedulemigrator` until the currently running instance completes.
To enable this feature, use the `runKubernetesMigration.sh` script as follows:
```
./scripts/runKubernetesMigration.sh --spring.cloud.task.single-instance-enabled=true
```
== Configuring the Schedule Migration
The following properties configure how the scheduler migrator will migrate the schedules.
* schedulerToken - The token (default `scdf-`) is used by SchedulerTaskLauncher as a delimiter
to separate the schedule name of each schedule into 2 components: `base schedule name`
and `task name`. The migration tool uses this value to identify the schedules to be migrated.
* taskLauncherPrefix - The prefix used by the SchedulerTaskLauncher to mark the properties for the launched apps. Default: `tasklauncher`
* scheduleNamesToMigrate - Comma-delimited list of schedules to migrate. If empty, all schedules will be migrated.
* composedTaskRunnerRegisteredAppName - The registered application name for the composed task runner. Default: `composed-task-runner`
* dataflowUrl - The URL of the Spring Cloud Data Flow Server against which migrated composed task runners should execute task launch commands.
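To make the role of `schedulerToken` concrete, the sketch below splits the example schedule name used earlier in this README at the default token. It is illustrative only and does not reproduce the migration tool's actual parsing: splitting at the first occurrence of the token and dropping the hyphen in front of it are both assumptions, and the three function names are hypothetical.

```shell
# Illustrative only: the migration tool's real parsing logic may differ.
SCHEDULER_TOKEN="scdf-"   # default value of schedulerToken

is_migratable() {
    # Succeeds when the schedule name contains the token.
    case "$1" in
        *"$SCHEDULER_TOKEN"*) return 0 ;;
        *) return 1 ;;
    esac
}

base_schedule_name() {
    # Text before the first token; the trailing hyphen is dropped (assumption).
    base=${1%%"$SCHEDULER_TOKEN"*}
    printf '%s\n' "${base%-}"
}

task_name() {
    # Text after the first token.
    printf '%s\n' "${1#*"$SCHEDULER_TOKEN"}"
}

name="schedulename-scdf-taskdefname"
if is_migratable "$name"; then
    echo "base schedule name: $(base_schedule_name "$name")"
    echo "task name: $(task_name "$name")"
fi
```

Under these assumptions, the example name yields `schedulename` as the base schedule name and `taskdefname` as the task name, while a name without the token would be skipped.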
=== Database Configuration for Kubernetes Migration
* dbUserName - The user name of the database that contains the task definitions for schedules to be migrated.
* dbPassword - The password of the database that contains the task definitions for schedules to be migrated.
* dbUrl - The url to the database that contains the task definitions for schedules to be migrated.
* dbDriverClassName - The driver class name to use for the database that contains the task definitions for schedules to be migrated.
== Supported Databases
The supported databases are listed https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#configuration-local-rdbms[here].