DB Modernization

Blu Age Database and data Modernization (BDM) helps you to modernize
any type of database.

Blu Age DB Modernization

Its main goal is the modernization of the databases of legacy applications (non-normalized relational databases, file-based databases, XML databases, …), which naturally implies re-normalizing the schemas. It also integrates fully with the other Blu Age products for legacy software modernization, so that both the database and the code that uses it are modernized together.

BDM can thereby generate the scripts that create the new relational database, the data-transfer scripts to be used with Talend Open Studio (www.talend.com), and the UML2 model that can be exploited with Blu Age V3.6 and later.

What’s more, BDM can also be used to maintain the modernized databases, by managing the various SQL artifacts and adding tables, columns, indexes, constraints, and so on.

This tool will be particularly useful for:

  • Database Administrators,
  • Functional Domain Experts,
  • Developers,
  • Software and Information System Architects,
  • Modernization experts.

Introduction

A dedicated training class is available. For more detailed information, please refer to www.bluage.com. This entire document is protected by French and US law, particularly regarding ownership and intellectual property. All intellectual and industrial property rights (copyright, patents, trademarks), including in particular the rights of exploitation, reproduction and extraction, on any medium, of all or part of the data, files and material it contains, as well as the rights of representation, modification, adaptation or translation, on any medium, of all or part of the document itself, are reserved exclusively for Netfective Technology and its potential third-party licensors. In addition, all trademarks and logos represented herein are the property of their respective owners.

Getting Started

Installing

Installing the reverse BDM tool is done by executing the downloaded executable. Required prerequisites: a JDK 1.6 must be installed along with Talend 4.2 (http://www.talend.com).

Launching

The reverse application is launched via BDM.exe or a shortcut to the executable. After the initial loading, you are requested to specify the location of your workspace.

Choose a location for the first launch. On subsequent launches, the most recently used workspaces will be suggested in the selection tool.

 

Main Icons

The main toolbar icons provide the following actions:

  • Collapse the entire tree structure
  • Link the current editor to the tree structure
  • Show or hide the categories
  • Create a repository
  • Create a connection profile
  • Open an SQL “scrapbook” (query and execution editor)
  • Display elements by their “databaseName” (modernized name)
  • Display elements by their “modelName” (name in the UML2 model)
  • Start recording a macro
  • Configure the display

Choosing a perspective

Once the application is launched, go to Window > Open Perspective > Other… Then choose the “Database Development” perspective.

 

It is also a good idea to open the “Package Explorer” view so as to make it easier to look at open projects. (Window > Show View > Other… then Java > Package Explorer)

Initializing the BDM Project

Creating a repository

The first step consists in creating a repository. To that end, click on the icon of the “Data Source Explorer” view. The following pop-up appears:

 

There are four ways to create repositories.

Create

“Create a Workspace Based Connection Profile Repository” allows you to create a new repository. One then has to configure the connection not only to the source database but also to the modernized database.

Once it is validated, three connections appear:

  • “Connected” corresponds to the connection to the source database; it is used to visualize its structure and to query its data (see the SQL Editor).
  • “Legacy” is initially identical to “Connected”, but it is the working connection where all the modernization work is done. The connection properties allow a first filter (tables, schemas, etc.), after which you can switch to offline mode (right-click > Work Offline). This operation consists in recovering the entire model of the database locally, in preparation for the future modernizations.
  • “Modernized”: connection to the modernized database. It is used to execute the DDL scripts that create the artifacts in the new database and to run queries once the transfer has taken place (see SQL Editor).

Executing this wizard will also lead to the creation of an Eclipse project that corresponds to the repository. (the link is made using the name that is common to both artefacts). This project will contain the folders that correspond to the “legacy” connection, the connection information, the resulting artifacts (UML model, audit reports, Talend scripts), the modernization routines being used, the generated SQL scripts, etc…
As far as collaborative work is concerned, this is the project that is sent to the source manager. Once the project is recovered, the repository can be recovered by using the “Import” wizard.
The Package Explorer view allows one to visualize the different projects whereas the Data Source Explorer view allows one to visualize the different repositories.

The GlobalRefactorings folder will contain all the refactorings of global type that have been created.
The SingleRefactorings folder will contain all the single (unitary) refactorings that have been created.
The Stereotypes folder will contain all the stereotypes used in this repository.
Those 3 elements will be further described later in this document.
The Migration Variables folder contains all the variables used for the migration.
The Sequences folder (visible or not according to the type of database) contains all the sequences for that schema.
The Stored Procedures folder will contain all the registered procedures.
The Tables folder will contain all the tables for that schema.

Modernize

“Modernize An Existing Workspace Based Connection Profile Repository” allows you to carry out the modernization work iteratively.

The idea is to split the problem into smaller parts so that each can be handled in isolation. The first modernization often deals with renaming and with the management of types or dates. A following iteration would then add new SQL artifacts (foreign keys, indexes, and so on) to the repository produced by the first modernization.

In fact, at each new step we modernize the database produced in the previous step. This wizard can just as easily be used to modernize a database that is already in production.

The resulting Legacy connection will therefore correspond to the modernized connection of the repository that is currently being modernized. You still have to configure the connection to the new modernized database (often the same database but with a different schema).


A new type of connection appears: “Summary”:

In connected mode, this shows the connection to the first-iteration repository of the modernization project. In a COBOL project, this connection is empty because it is not needed.

In offline mode, it corresponds to the sum of all the modernization modifications from the first-iteration repository to the last modernization step, except the current modernization, so as to preserve full traceability and to allow the production of scripts for a complete modernization.

Import

This wizard is mainly used when recovering a modernization project from a source manager: the project must already exist in the workspace – select it to start the import. To share a repository through a source manager, one need only add the project corresponding to the repository (the names are identical) to the source manager.

The connection profile name will automatically be the same as the imported project.

Merge

This wizard allows one to unify two distinct modernization repositories. It can be used when the source database is made up of heterogeneous data sources.

Connecting to a relational database

Choose the type of connection from among the available choices. The connection will automatically be named after the repository.

Whenever connecting for the first time to a database of a certain type, it will be necessary to configure the appropriate drivers (base versions, the corresponding JAR files, etc…).

Once the driver has been defined, all that is left is to fill in the connection parameters in the usual way. The Test Connection button then allows you to make sure that no errors have been made.

Connecting to an RDBMS
Fig. 1. Legacy Connection type
Fig. 2. New Driver - choosing the type and the name
Fig. 3. Indicating the driver jar file
Fig. 4. Connecting to an Oracle base

Connection to a legacy Pacbase database

This type of connection requires the BRM tool (BluAge Reverse Modeling) and a corresponding extraction project.

Choose “Pacbase Connection Profile Wizard”, Fig 2.

The first field on the next screen is the path to the “vap.ecore” file to be used. This file is usually found in the BRM project, in the reverse > PAFAssociationsPosAbs2VAP folder.

The second element determines whether or not this is a batch project.

All that is left to be done is to create the connection to the modernized base in the usual way.

Note that the “Legacy” connection must still be switched to offline mode before starting the data modernization work, even though the filters do not apply to this type of connection.

Connection to a legacy Pacbase database
Fig. 1. Project created with BRM
Fig. 2. Choosing the connection type
Fig. 3. Choosing the Pacbase project
Fig. 4. vap.ecore file in the folder

Connection to a legacy COBOL database

Choose “Cobol Connection Profile Wizard”, Fig 1.

All that is left to be done is to create the connection to the modernized base in the usual way.

Once the project is created, the COBOL data definition can be imported to create the legacy tables. To do that, open the copybook file in an editor, select one or more segments, and right-click > Import into a BDM Project. You can enter a logical name for this file; it will be used as the business domain.

All imported data definitions will be present in the imported folder, for future reference.

Note that the “Legacy” connection must still be switched to offline mode before starting the data modernization work, even though the filters do not apply to this type of connection.

Connection to a legacy COBOL database
Fig. 1. Choosing the connection type
Fig. 2. Select the segment definition
Fig. 3. Enter a logical name

Filtering of connections

When a connection to a database is established, all of its schemas and tables are retrieved. If you only want to keep certain elements, use the connection properties to configure the schema or table filters. It is important to do this before switching the “Legacy” connection to offline mode, so that only the necessary elements are kept. This step is not needed for Pacbase or COBOL connections, since for these connections the unnecessary elements are deleted directly.

These filters can also be defined for “Connected” and “Modernized” connections if need be.

Filtering of connections
Fig. 1. Schema filter
Fig. 2. Table Filter

“Connected” and “Modernized” Connections

The “Connected” and “Modernized” connections allow one to access the content of these schemas (sequences, tables, views…). The main purpose of these connections is to visualize the structure of the source database and of the resulting modernized database, and to examine their data.

 

From the tree structure, it then becomes possible to visualize and edit the data from a table (Data > Edit), load data from a file (Data > Load…), extract the data into a file (Data > Extract).

“Legacy” Connection

The “Legacy” connection, once it is in offline mode (“Work Offline”), allows additional elements to appear in the schemas, the most significant of which are the refactoring rules and the stereotypes. The contextual menu (right-click on an element) offers different actions depending on the selected element. It is possible to create an element (table, column, constraint, stereotype, …), to launch refactorings (Refactoring > …), to generate the script for the creation of a database (Generate DDL), to generate Talend scripts (Transform to Talend) or the analysis files for the database (Generate Audit Reports).

To make things easier, the elements can be displayed in the tree structure with their modernized names instead of their original names: either the name in the future database (databaseName) or the name in the UML model used by Blu Age (modelName). Simply click on the corresponding toolbar icon to switch to the databaseName or to the modelName display.

Database Audit

Generating the analysis

BDM allows you to analyze all the elements of a schema of the “Legacy” connection. This functionality is especially useful for measuring the extent of the changes that need to be made to the initial database at the start of the modernization. It can also be used at the end of the modernization to check that no problems are left to address.

 

The result of the analysis takes the form of Excel files, each linked to an audit rule. They are stored in the project corresponding to the repository, organized by rule severity, and can therefore be viewed from the Package Explorer view.

The audit rules

Error

  • “PK_With_Nullable_Columns” locates all the primary keys for which at least one of the member columns accepts the null value, which is not coherent for a primary key.
  • “Repeating_Group_With_Different_Type_Definition” lists the columns of a single table whose names form a kind of list (CLIENT1, CLIENT2, CLIENT3, …) but whose types may differ. Whether this is an error or intentional, these types will have to be homogenized so the data can be modernized together.
  • “Repeating_Group_With_Same_Type_Definition” lists the columns of a single table whose names form a kind of list (CLIENT1, CLIENT2, CLIENT3, …) and which all share the same type. This list will have to be factorized into a column of a child table.
  • “Similar_Columns_Not_Linked” lists the columns from different tables that have the same name and the same type but which are not linked to each other (no foreign key). This is typical of data redundancy.
  • “Unique_Indexes_With_Nullable_Columns” locates all the unique indexes that have a column accepting null values, which is not coherent for a unique index.

Performance

  • “Index_Columns_Contained_By_Another_Index” indicates all the indexes whose columns are already indexed by another index. The index may be useless.
  • “Index_With_Too_Many_Columns” shows the indexes that index too many columns (4 or more). This can degrade performance.
  • “Table_With_Too_Many_Indexes” shows the tables that have too many indexes (5 or more). Too many indexes can hurt performance.
  • “Table_Without_Index_On_FK” lists the tables having foreign keys whose columns are not indexed. It is often preferable to index them to avoid performance issues.

Warning

  • “Non_Unique_Index_With_Unique_Behavior” locates the indexes not declared unique whose columns contain the primary key. These indexes should therefore be unique.
  • “Similar_Columns_With_Different_Types” lists all the columns across tables that bear the same name but not the same type. This most likely stems from redundancy combined with a type error.
  • “Table_With_No_Alternate_Key” lists the tables whose only primary key is made up of a numeric, automatically incremented column. These tables have no business key.
  • “Table_With_No_Relationship” indicates the tables that have no link to any other table. Such a table is probably useless in the schema.
  • “Table_Without_PK_Candidate” shows all the tables that have no constraint guaranteeing the uniqueness of their records.

Validating elements of the base

Launching a validation

The validation rules allow you to know whether the base is correct, by taking into account the modifications that were made during the modernization. It is possible to launch a validation on any element of the base. The validation will also be applied to the elements contained within the validated element.

The validation errors appear in the “Validation View”, Fig 2.

It is also possible to filter the rules that are used (Window > Preferences > Model Validation > Constraints > SQL Constraints), Fig 3.

 

Launching a validation
Fig. 1. Contextual menu of a file
Fig. 1. Contextual menu of a file
Fig. 2. Result from validating a table
Fig. 2. Result from validating a table
Fig. 3. Filter for validation rules
Fig. 3. Filter for validation rules

Handling validation errors

From the “Validation View”, several actions are available. Double-clicking on a line opens the offending element so that the appropriate changes can be made with assistance. Alternatively, the contextual menu allows you to launch the refactoring tied to the validation rule that corresponds to the error (the refactoring must be explicitly connected to the validation rule; for more details see Refactoring), to ignore the error, or to delete the element that produces the error (when possible).

 

Validation rules

SQLObject

  • “ModelNameValidation” produces an error if the element does not have a Model Name or if another element in the same container already has the same name.
  • “DatabaseNameValidation” produces an error if the element has no Database Name or if another element in the same container already has the same name.

Table rules

  • “PrimaryKeyValidation” yields an error if the table does not have a primary key.
  • “TableWithNoRelationshipValidation” yields an error if the table has no relation to the other tables.
  • “TooManyIndexesValidation” yields an error if the table contains too many indexes (6 or more).
  • “TableWithoutPKCandidatesValidation” yields an error if the table has no means of guaranteeing the uniqueness of its records: it has no unique constraint and no unique index.
  • “RepeatingGroupWithSameTypeValidation” yields an error if the table has columns whose names form a sequence and whose types are the same. These repetitions must be factorized into a column of a child table.
  • “RepeatingGroupWithDifferentTypeValidation” yields an error if the table has columns whose names form a sequence and whose types are not the same. The types must be homogenized.
  • “TableWithNoAlternateKeyValidation” yields an error if the table only has a primary key made up of a numeric column and has no business key.

Column rules

  • “SimilarColumnsNotLinkedValidation” yields an error if columns with the same name in different tables have no relation connecting them (the tables have no foreign key that connects them).

Primary Key Rules

  • “PrimaryKeyColumnsNullableValidation” yields an error if all the columns that make up the primary key accept null values.

Foreign Key Rules

  • “TableWithoutIndexOnFKValidation” yields an error if the columns of the foreign key are not indexed.

Index Rules

  • “UniqueIndexColumnsNullableValidation” yields an error if the columns of the index all accept null values.
  • “NonUniqueIndexWithUniqueBehaviorValidation” yields an error if the index is not declared to be unique and nevertheless contains the columns of a unique constraint.
  • “IndexWithTooManyColumnsValidation” yields an error if the index has too many columns.
  • “IndexColumnsContainedByAnotherIndexValidation” yields an error if the columns of the index are already indexed by another.
  • “IndexWithAllColumnsToDeleteValidation” yields an error if the columns of the index all need to be eliminated.

Routines

Creation

A routine is a Java class used to implement specific processing during the transfer. Routines make it possible to manipulate column data. They must be created in the package provided for that purpose within the project (src > routines). The project directly references the standard Talend libraries, which can therefore be used in the routines you create.

For example, consider a routine containing a function that receives a boolean stored as a string and returns the corresponding integer (1 = true, 0 = false).
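
The routine code shown in Fig. 2 is not reproduced here. As an illustration only, such a routine could look like the following sketch (the class name, method name and the accepted legacy values are assumptions, not the exact code from the figure):

    package routines;

    public class BooleanConversion {

        /**
         * Converts a boolean stored as a string in the legacy column into an
         * integer for the modernized column (1 = true, 0 = false).
         */
        public static Integer toInteger(String legacyValue) {
            if (legacyValue == null) {
                return 0;
            }
            // "Y", "TRUE" and "1" are treated as true; adapt this to the legacy encoding.
            String value = legacyValue.trim().toUpperCase();
            return ("Y".equals(value) || "TRUE".equals(value) || "1".equals(value)) ? 1 : 0;
        }
    }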

Creation of a routine
Fig. 1. Package explorer – routines
Fig. 2. Code example from a routine

External libraries

It is possible to call external libraries from your routines. Just put the JAR in the lib folder of the project and add the library to the classpath.

Use

The routine can be called from a table or column as follows: “routines.name_routine.name_method(parameters)”

With the previous example, the routine can be used on a column that is stereotyped as “boolean”, whose original type is “VARCHAR(S)” and whose modernized type is “NUMBER(1)”. During the data transfer, the column expression will be used.
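
With the hypothetical BooleanConversion routine sketched earlier, the column expression would follow the “routines.name_routine.name_method(parameters)” pattern, for example:

    // Hypothetical column expression; the parameter stands for the legacy column value.
    routines.BooleanConversion.toInteger(legacyValue)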

Use of a routine
Fig. 1. Example of how to use a routine in a column
Fig. 2. Column expression that calls the routine

SQL Editor

Creating a Scrapbook

Click on the icon. A tab with connection profile parameters appears. To indicate the database the script will work against, fill in the type of connection, the name of the connection and the database using the drop-down menus.

Simply choosing the connection name automatically selects the right type of connection.

Creating a Scrapbook
Fig. 1. Choosing the type of connection
Fig. 2. Choosing the connection name
Fig. 3. Choosing the database

Writing and launching queries

When the connection profile has been filled, the auto completion (Ctrl + space) facilitates the construction of SQL queries. The queries can be launched individually or collectively, either with the contextual menu (right-click) or with corresponding keyboard shortcuts, Writing and launching queries, Fig 1.

The result is displayed in the “SQL Results” view, Fig 2.

The “Status” tab indicates whether or not the query has been successful. If not, the reason for the failure will be displayed. The “Result” tab displays the result of the query. The entire history is stored and allows one to visualize the results from previous queries.

Writing and launching queries
Fig. 1. Contextual menu of the editor - right-click
Fig. 2. Result of the query
Fig. 3. Example of a failed query

Producing DDL

The production of database scripts is launched from the contextual menu of the base. This operation can be launched at any level of the tree structure. The generated artifacts will depend on the element on which the wizard has been launched, Producing DDL Fig 1.

In the pop-up that appears, first define the format of the statements to be generated, Fig 2.

Then, define the types of SQL artifacts to create. It is advisable to create the various constraints only once the data has been transferred, Fig 3.

The statements appear in the pop-up. Choose whether you want them to be executed directly (check “Run DDL on server”) or displayed in the SQL editor (check “Open DDL file for editing”). You can, of course, check both, Fig 4.

If you checked “Run DDL on server”, you are asked to choose a connection. If no connection has been defined yet, create one using the “New…” button.

All you have left to do is to validate by clicking on “Finish”.

If you selected “Open DDL file for editing”, the instructions are displayed on the SQL editor. You can then edit and execute the script as you wish (see Writing and Launching Queries)

Producing DDL
Fig. 1. Contextual menu of the base - right-click
Fig. 2. Choice of the format for the instructions of the base-creation script
Fig. 3. Choice of the elements to create
Fig. 4. Generated script

Data Migration Scripts

Generating Talend Scripts

Important: before starting the data migration, the elements of the modernized database must be created (see Producing DDL). Nevertheless, you are strongly encouraged to wait until the data has been transferred before adding constraints to the tables.

The contextual menu of the base allows you to create scripts for data transfer. One must select “Transform to Talend”, Generating Scripts Fig 1.

A project with migration scripts (or jobs) is created and can be seen in the “Package Explorer” view, Fig 2.

Generating Scripts
Fig. 1. Contextual menu of the base
Fig. 2. Talend Project in the - Package Explorer - view

Talend Open Studio (migration script)

Launch when Talend is integrated in BDM (Recommended)

Talend Open Studio can be integrated with BDM, which lets you launch Talend directly on the current working project.

To launch Talend, right-click on the Database element and select Import to Talend. Import to Talend is only visible if the Talend scripts were previously generated by clicking on Transform to Talend.

Launching Talend
Fig. 1. Contextual menu of the base - right-click
Fig. 2. Choosing the name for the Talend Project

Launch when Talend is not integrated

Launch Talend Open Studio (if necessary, consult the Talend documentation, [1]).

Choose a workspace (“Referential”) or make one via the “…” button.

Import the project generated by BDM via “Importing existing project(s) locally” and click “Ok”. Select the “TALEND” project that was generated previously.

Confirm by clicking on “Open”.

Principles

The migration is done in multiple steps defined by the types of jobs that are created. There are three types of jobs, distributed over the following folders:

  • Migrations: transfer the data into temporary tables that contain both the legacy data and the modernized data. Both are needed to populate the foreign keys (for instance when a foreign key takes its values from a column whose values are then deleted).
  • ForeignKeys: fill in the added foreign keys in the temporary tables.
  • Finalizations: transfer the data from the temporary tables into the final tables.

Generated Jobs

The jobs are executed in the following order: Migrations, ForeignKeys, and Finalizations. A final job called “Migration” calls all the jobs in the right order, so that everything can be run as a single job.

Note: since the Talend scripts are generated, you should not change any element in Talend. You are strongly encouraged to make your edits in BDM so that your changes are not lost if the scripts ever have to be regenerated. Nevertheless, during the development and tuning phase, you can still make changes in Talend in order to test a solution quickly.

Launching a job

To start a job, simply click on “Execute” in the “Execute” view. You can launch multiple jobs in parallel to accelerate the whole process. Otherwise, you can just launch the “Migration” job and wait until it’s all done. You can of course open jobs, view them and even edit them. In that case, you are highly encouraged to propagate your edits to the BDM format so that your changes do not get lost if scripts ever have to be re-generated.

 

Context Variables

Context groups are also generated. Each group contains context variables that will be used in your jobs.

Contexts groups

In the general context group, you can limit the number of records to migrate or change the log folder.

In the LegacyConnection and ModernizedConnection context groups, you can redefine the connection parameters for the source and the modernized database. Just open the file you are interested in and redefine the desired parameters on the second screen.

General context
Connection context

It is possible to create other context variables that will be used in your jobs. In the “expression” attribute of columns or tables, just use “context.[variable name]” and a context variable will be created in the Talend “context” file. This is useful, for example, to parameterize information dynamically. These variables will then be taken into account during job execution.
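
For instance (the variable name is purely illustrative), using the following in a column or table expression would create a corresponding context variable in the Talend “context” file:

    // Hypothetical expression fragment; "sourceSchema" is an invented context variable name.
    context.sourceSchema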

Custom components

When you use the Talend workspace for the first time, it is necessary to specify the location of the custom components and to load them; otherwise an error will occur when you open a job, indicating that a component is not recognized. Go to Window > Preferences, then to Talend > Components, and indicate the location of the “components” folder in the Talend project that has been created. Then click “Apply”.

Setting window

Talend Documentation

For more information on Talend Open Studio, check out http://www.talend.com .

Deploy and Execute scripts

Talend offers the possibility to execute scripts in headless mode. Jobs can be run via an Ant task or scheduled through continuous integration.

Development Environment

Exporting the jobs

The first step is to export the jobs with Talend.

Unselect the global jobs but select all the other jobs, then export the jobs to an archive file in zip format, Fig. 2. Then unzip this file, which contains the job JAR files. For a COBOL project, you have to manually copy the copybooks folder (myProject/LegacyMyProject/TALEND/copybooks) into the exported jobs folder (myExportFolder/Migrationname/myProject_talend), Fig. 3.

Export Talend Jobs
Fig. 1. Contextual menu
Fig. 2. Export the Talend jobs
Fig. 3. Export tree structure
Example of build.properties file

During the export to Talend, a build.properties.gen file is generated. It is used as the basis for writing the build.properties file used during job execution. In this file, you just have to fill in talendProjectName and folderName, i.e. the Talend project name and the folder where the Talend scripts have been exported. Then copy the content into the final build.properties file.

Note: you can also customize the other properties, such as the connection information, the data file location, or the log folder. For information, ${basedir} refers to the current working directory containing this properties file.
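
As an illustration, a minimal build.properties could look like the sketch below. Only talendProjectName, folderName and ${basedir} come from this section; the values shown are assumptions and should be taken from the generated build.properties.gen:

    # Name of the Talend project that was exported
    talendProjectName=MYPROJECT
    # Folder where the exported Talend jobs were unzipped
    folderName=${basedir}/myExportFolder/Migrationname/myProject_talend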

Executing the job

To run the job from the command line, launch the build.xml with Ant by simply typing ant. Apache Ant must be correctly configured; see ant.apache.org for more information.

 

SQL prescripts and postscripts folder

Before and after the migration, you have the possibility to execute SQL scripts. This is especially useful if tables, sequences, indexes or other elements need to be created in the modernized database before launching the data migration. In the same way, executing SQL scripts afterwards is useful to add constraints or rebuild indexes on the modernized database.

Just place the .sql files in the desired folder (prescripts are executed before the migration, postscripts after), and the Ant task will execute all the SQL files present in these folders.

Using Continuous Integration

This part describes how to use continuous integration to launch migrations - Available Soon.

MagicDraw (UML model)

Generating the UML model

The UML2 model that corresponds to the database can be used for documentation purposes or as an entry point into Blu Age Reverse Modeling or Blu Age Forward Engineering (for more details, check out the documentation for those products).

The model is generated via the contextual menu of the base, by selecting “Transform to UML2”, Generating the UML model Fig 1.

The generated files are visible from the “Package Explorer”, Fig 2.

Fig. 1. Contextual menu of the base
Fig. 2. UML generated files

Note: you have the possibility to generate the VOs (Value Objects). This is particularly useful if the model is the basis for a screen application modernization with Blu Age. If you generate the VOs, one VO will be created and associated with each Business Object.

Importing the model into MagicDraw (made by No Magic Inc.)

MagicDraw can be integrated with BDM, which lets you open MagicDraw within BDM, directly on the current working project.

To import the model, right-click on the Database element and select Import to MagicDraw, Fig 1. BDM will open the model in the MagicDraw perspective, Fig 2. Import to MagicDraw is only visible if the UML2 model was previously generated by clicking on Transform to UML2.

If you need to use this model with Blu Age Forward Engineering, you have to save your model into the mdzip format: Click File > Save As, and select Packed MagicDraw File Format in the File Type dropdown list Fig 3.

Import to MagicDraw
Fig. 1. Menu to import the UML model
Fig. 2. MagicDraw perspective
Fig. 3. Save As Packed MagicDraw File Format

The elements

Decorator

It is possible to activate and deactivate the element decorators (additional information about an element on its label or icon). Go to Window > Preferences. Then go to General > Appearance > Label Decorations and then check or uncheck the decorators that you would like to keep.

Generated

BDM is very tightly linked to the use of refactorings. Nevertheless, to every rule there is an exception.
As far as handling exceptions is concerned, the “Generated” checkboxes that follow each piece of information are very important.
They tell you if the corresponding information was provided automatically by a refactoring or not.
By modifying any of this information, you escape the “Generated” mode: the information will no longer be modified by any future refactoring.
To shift back into “Generated” mode, you need only re-check the checkbox and launch the corresponding refactoring.

Legacy and Added

All the elements of a base are either “Legacy” [the element exists in the original base] or “Added” [the element is added during the modernization]. “Legacy” elements contain attributes allowing them to differentiate their original state from the modernized one. This allows them to maintain a trace of the starting base. A typical example is the management of element names. A “Legacy” element has a “name” field that corresponds to its original name. But it also has the “databaseName” field that allows it to define its name in the future base. We can then use the visualization mode of the modernized names (see “Legacy” Connection) to work more easily with the new names. For the “Added” elements, the name and the modernized name are identical (BDM will automatically modify the other field during the modification of one of two names). If one does not want a “Legacy” element to appear in the modernized base, one should not eliminate it (which is impossible anyway) but rather check the “to delete” attribute. An added element can be directly deleted. During the modernization of a repository, the “Added” elements are transformed into “Legacy” elements in the new repository.

Schema

A schema only has a “name” attribute.

If MagicDraw is integrated with BDM, double-clicking on a schema provides more information: you have access to graphical representations of the objects in the database and their relations. The Schema tab displays a diagram with only the table names; their attributes can be added by clicking the plus sign over the tables. The Full Schema tab directly displays a diagram with all the tables and their attributes.

Actions

Migration Variable

A migration variable is a variable that can be used in expressions to provide more possibilities. During a migration, the variables are populated at the very beginning, before everything else.

Migration Variable
Migration Variable SQL
  • Name: Name of the variable
  • Type: Type of the variable. It can be a String, BigDecimal, Date, List of String or List of BigDecimal.
  • Connection: Legacy or Modernized. Legacy is chosen in most cases, since the value of this variable usually comes from the legacy database.

In the SQL Expression tab, the SQL expression is used to populate the value. The SQL statement must return only one column, and if it returns multiple rows, the type of the variable should be a List type. You can test the SQL expression directly in this editor by executing the statement as in the SQL Scrapbook editor (right-click > Execute).

The variable can then simply be used in an expression (as a column expression or a table condition expression).

 

Actions

  • Add a variable: Right-Click on Migration Variable Folder and select Add a variable.
  • Remove a variable: Select the variable in the Data Source Explorer tree, and press the Del key.

Sequence

  • Name: Name of the sequence
  • Database Name: Modernized name of the sequence.
  • Start Value: Starting value (BigInteger)
  • Increment: Increment value (BigInteger)
  • Minimum: Minimal value (BigInteger)
  • Maximum: Maximal value (BigInteger)

For the “LegacySequence”, the modernized values are those that will be used in the new base.

Actions

  • Add a Sequence: Right-click on Sequence folder and select Add a Sequence.
  • Remove a Sequence: In the Sequence properties, click the trash icon at the top right, or press the del key on the selected Sequence in the tree.

Stereotype

Creating a Stereotype

The BDM stereotypes work in the same way as the UML stereotypes. They allow you to categorize different elements in light of their use in refactorings or simply for visualization. A stereotype can only be applied to a single type of element. Nevertheless, since many types inherit from others, child types can be stereotyped in the same way their parent type is. Since a stereotype is not an SQL element, it does not have the “Legacy” and “Added” concepts.

  • Name: Name of the stereotype.
  • Color: Color associated with the stereotype.
  • Type: Type of element to which the stereotype can be applied.

To assign a stereotype, open the element to be stereotyped and click on the S icon.

Activated: allows you to deactivate the stereotype if it should not be applied to the element. This is particularly useful when a refactoring automatically applies stereotypes and you later realize that the element does not need them. A stereotype can have its own refactorings, which apply only to elements bearing that stereotype. Stereotypes are usually used to identify dates, booleans or occurs elements.

Creating a TaggedValue

A TaggedValue allows you to define a variable that is tied to the stereotype commonly referred to as a tag. This tag is going to play the role of an additional attribute of the stereotyped element. A stereotype can have multiple TaggedValues.

  • Name: Name of the tag.

To define the value, click on the name of the stereotype in the list of stereotypes applied to the element, then edit the value that you want to change. This value is a character string.

A TaggedValue is useful when a refactoring will handle the stereotyped elements. For example, in the case of a boolean modernization, you can specify the legacy value that represents true.

Actions

  • Add a Stereotype: Right-click on Stereotype folder and select Add a stereotype.
  • Add a Stereotype Refactoring: Right-click on the Refactorings folder inside a stereotype and select Add a stereotype refactoring.
  • Add a Tagged Value: Right-click on the Tagged Values folder inside a stereotype and select Add a tagged value
  • Remove a Stereotype: In the Stereotype properties, click the trash icon top right, or press the del key on the selected element in the tree.

Table

  • Name: Name of a table.
  • Database Name: Modernized name of the table.
  • Model Name: Name in the UML model of the table.
  • Domain: Business domain in the UML model.
  • Table Constraints: List of the table constraints (primary key, foreign key, unique constraint)
  • Index: List of table indices.
  • To Delete: Indicates if the table is kept in the modernized database.
  • Comment: Table comments
  • Expression: The Java expression to keep the records in the table.

Actions

  • Add a Table: Right-click on Tables folder and select Add a Table
  • Execute a refactoring: Right-click on the table element, select Refactorings, then select the desired refactoring to run.
  • Produce a DDL file for the selected table: Right-click on the table element, and select Generate DDL.
  • Remove a Table: The table needs to be an Added table. In the table properties, click the trash icon at the top right, or press the del key.

Column

  • Name: Name of the column.
  • Database Name: Modernized name of the column.
  • Model Name: Name of the column in the UML model.
  • Default Value: Default value of the column.
  • Nullable: Accept or not the null value.
  • Table: Column container.
  • Type: Actual type of the column (the type that will be used in the modernized base)
  • Legacy Type: Type in the old base.
  • Auto Generated: The database will populate the value; particularly useful for computing identifiers, depending on the DBMS.
  • To Delete: Indicates if the column is kept in the modernized database.
  • Move To: Modernized column where the data of this column will be transferred (if this table is to be deleted).
  • Expression: The Java expression to modernize the column data.
  • To Trim: Trim the legacy value. Note that the trim is performed before the Expression field is evaluated.

Advanced expressions

The expression field for Column or Table elements is very powerful.

In BDM, all numbers are handled by the BigDecimal type (Fig. 1).

User routines (or Talend routines directly) can be called (Fig. 2).

Migration variables can be used; the variable must be initialized first (Fig. 3).

You can also directly use a context variable (Fig. 4). If a context variable is used, BDM will automatically create an entry in the Talend scripts (see Context Variables). It has to be defined before launching the migration with Talend.
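
The corresponding figures are not reproduced here; the following sketch illustrates the four cases (all column, routine, variable and context names are invented for the illustration):

    // 1. Table condition using a number (numbers are handled as BigDecimal):
    //      AMOUNT.compareTo(new java.math.BigDecimal("100")) > 0
    // 2. Expression calling a user routine (or a Talend routine):
    //      routines.BooleanConversion.toInteger(FLAG)
    // 3. Expression using a migration variable (the variable must be initialized first):
    //      MAX_CUSTOMER_ID
    // 4. Expression using a context variable (an entry is generated in the Talend context file):
    //      context.migrationDate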

Advanced expressions
Fig. 1. Table Condition using number
Fig. 2. Expression using talend routine
Fig. 3. Expression using a migration variable
Fig. 4. Expression using context variable

Actions

  • Add a Column: Right-click on Column folder in a table, and select Add a Column
  • Execute a refactoring: Right-click on the column element, select Refactorings, then select the desired refactoring to run.
  • Remove a Column: The column needs to be an Added Column: In the column properties, click the trash icon top right, or press the del key on the selected column in the tree.

Primary Key

  • Name: Name of the primary key
  • Database name: Modernized name of the primary key.
  • Members: List of columns that make up the primary key.

Actions

  • Add a Primary Key: Right-click on Constraints folder in the desired table and select Add a Primary Key.
  • Remove a Primary Key: In the Primary Key properties, click the trash icon top right, or press the del key on the selected primary key in the tree

Unique Constraint

  • Name: Name of the Unique Constraint
  • Database name: Modernized name of the unique constraint.
  • Members: List of columns that make up the unique constraint.

Actions

  • Add an Unique Constraint: Right-click on Constraints folder in the desired table and select Add an Unique Constraint.
  • Remove a Unique Constraint: In the Unique Constraint properties, click the trash icon top right, or press the del key on the selected unique constraint in the tree

Foreign Key

  • Name: Name of the foreign key.
  • Database Name: Modernized name of the foreign key.
  • Target Table: Table containing the constrained columns.
  • Unique Constraint: Unique constraint of the target table (if defined, the Unique Index is undefined).
  • Unique Index: Unique index of the target table (if defined, the Unique Constraint is undefined).
  • Members: Columns that make up the foreign key (in agreement with Constraint Members)
  • Constraint Members: Columns in the foreign key constraints (in agreement with Members).
  • Child Role Name: Name of the role of the child table (the table containing the foreign key)
  • Child Multiplicity: UML multiplicity of the child table.
  • Parent Role Name: Name of the role of the parent table (table constraint)
  • Parent Multiplicity: UML multiplicity of the parent table.
  • Aggregation: the type of the relation (none, shared, composite).

The SQL Expression tab allows you to express the mapping used to populate a foreign key. You write the condition just as you would the WHERE clause of a query.

SQL expression of a foreign key

Actions

  • Add a Foreign Key: Right-click on Constraints folder in the desired table and select Add a Foreign Key.
  • Remove a Foreign Key: In the Foreign Key properties, click the trash icon top right, or press the del key on the selected foreign key in the tree

Index

  • Name: Name of the index
  • Database name: Modernized name of the index.
  • Unique: Determines whether the indexed records are unique or not.
  • Columns: List of the indexed columns.

Actions

  • Add an Index: Right-click on Indexes folder in the desired table and select Add an Index.
  • Remove an Index: In the Index properties, click the trash icon top right, or press the del key on the selected index in the tree

Refactoring

Refactorings allow tasks to be automated through batch processing. This is especially useful to modernize a set of elements in the same way, such as changing column types, renaming columns, changing conditions, and more.

Single Refactoring

Definition of a single refactoring

A single refactoring is launched from an element to which it can be applied.

Attributes are:

  • Name: Name of refactoring
  • Description: Description of the refactoring
  • Type: Type of element to which the refactoring can be applied.
  • Validation Rule: Validation rule to which the refactoring is tied.
  • Pre Check: Opens the code tab that checks the element before the refactoring is launched.
  • Final Check: Opens the code tab that checks the parameters of the refactoring before the latter is launched.
  • Refactoring: Opens the tab towards the body of the refactoring

Global Refactoring

Definition of a global refactoring

Global refactoring can be used in two ways:

  • As a single refactoring. It will only be applied to the selected elements.
  • From anywhere, it will be applied to all the elements of the schema in accordance with the parameters.

Attributes are:

  • Name: Name of refactoring
  • Description: Description of refactoring
  • Type: Type of element to which the refactoring can be applied.
  • Validation Rule: Validation rule to which the refactoring is tied.
  • Refactoring: Opens the tab to the body of the refactoring

Stereotype Refactoring

Stereotype refactorings are defined for a single stereotype and apply only to elements bearing that stereotype. They can be used in two ways:

Definition of a stereotype refactoring
  • As a single refactoring. It will only be applied to the selected elements that bear the corresponding stereotype.
  • From the refactoring definition, it will be applied to all the elements of the schema that bear the corresponding stereotype.

Attributes are:

  • Name: Name of refactoring
  • Description: Description of refactoring
  • Refactoring: Opens the tab to the body of the refactoring

Initialize refactoring using Recording Macro

A macro is used to generate code for designing a refactoring by recording the user's manual actions. To start and stop the recording, click on the record icon. All the modifications made during the recording are cancelled when it ends, and the corresponding code is generated. The generated code is not meant to be used as is, but it can serve as a foundation for creating refactorings.

Parameters

To create a parameter for a refactoring, one must right-click on the “Parameters” folder of the refactoring and select “Add a parameter” Fig. 1. The name and description of the parameter are required. Choosing a type allows BDM to provide a list with only the corresponding elements at the launching of the refactoring Fig. 4.

Refactoring parameters
Fig. 1. Contextual menu of a refactoring settings
Fig. 2. Parameter definition
Fig. 3. Js code for the refactoring
Fig. 4. Select the parameter value

If you have already generated the code before creating the parameters, you will have to add them to the declaration of the “refactor” function, Fig. 3. The parameters are character strings. When the refactoring is launched, you will be asked to provide them.

Code

For more details on this part, refer to the BDM JSDoc. Here is an example of a single refactoring called “Example” with a parameter called “parameter”, Code Fig 1.

First of all, here’s the code for the precheck that checks that the column name is not empty (otherwise it records an error), Fig 2.
For the final check, the code checks that the parameter is not empty, otherwise it registers an error, Fig 3.
Finally the refactoring code consists in concatenating the name of the column with the parameter and assigning it to the databaseName, Fig 4.

Refactor is the main function being called; “this” corresponds to the element on which the refactoring is applied, and add suffix is an additional function. (For more details on the code, see the API documentation.)

Code
Fig. 1. Example – main information
Fig. 2. Example – PreCheck
Fig. 3. Example – FinalCheck
Fig. 4. Example – Refactoring

Launch the refactoring

To launch a refactoring, open the contextual menu of the element (right-click) and go to Refactorings. You then have the choice between global and simple refactorings. If you choose a global refactoring, the selected element does not matter because the refactoring will be applied to all the elements. If you choose a simple refactoring, only the selected element will be affected. Note that a global refactoring also appears in the list of simple refactorings if its type corresponds to the selected element, Launch the refactoring Fig 1. Here is an example with a pre-check error:

The refactoring is cancelled.

Here is an example of an error of warning type from the final check, Fig 4.

You have the choice either to continue, even with an error in your parameters, in which case the refactoring will be executed, or to go back, in which case you can redefine the parameter, Fig 5.

When defining the parameter, you are given the choice to either validate or preview the changes. This preview shows the changes that the refactoring will make, Fig 6.

On the left, you can find the modified base, and on the right you can see the original base. You need only click on “OK” to launch the refactoring.

Launch the refactoring
Fig. 1. Contextual menu to launch a global refactoring
Fig. 2. Contextual menu to launch a simple refactoring
Fig. 3. Error during the pre-check
Fig. 4. Error during the FinalCheck
Fig. 5. Defining the parameter
Fig. 6. Refactoring preview

Note: you can cancel the changes just after a refactoring has been executed. Before doing anything else, click anywhere in the Data Source Explorer to select it, then click Edit > Undo the executed refactoring.

Recovering data from a workbook

It is possible to use an Excel spreadsheet (.xls) in a refactoring. To do so, use the “getWorkbook(file_path)” method. The required parameter is a character string containing the file path. It is recommended that you create a separate folder in the project for this sort of file.

Folder with Excel file

Here’s the code to open the spreadsheet:

Open the spreadsheet

(Check out the API documentation for more information on how to manipulate the file data)

Base Refactorings

SQLObject

  • “ModelNameRefactoring” adds the modelName selected by the user.
  • “DatabaseNameRefactoring” adds the databaseName selected by the user.

Table

  • “PrimaryKeyRefactoring” creates a new primary key made up of columns whose names are selected by the user.
  • “TableWithNoRelationshipRefactoring” sets up the table for deletion.
  • “TableWithTooManyIndexesRefactoring” deletes the indices whose names are selected by the user.
  • “TableWithoutPKCandidatesRefactoring” creates a primary key made up of columns whose names are selected by the user.
  • “RepeatingGroupWithSameTypeRefactoring” creates a child table whose columns are the repeating column.
  • “RepeatingGroupWithDifferentTypeRefactoring” assigns the type selected by the user.
  • “TableWithNoAlternateKeyRefactoring” creates a unique constraint with the columns whose names are selected by the user.

Column

  • “TypeRefactoring” assigns the type selected by the user.
  • “SimilarColumnsWithDifferentTypesRefactoring” assigns the type selected by the user.
  • “SimilarColumnsNotLinkedRefactoring” creates a foreign key with the column that constrains the similar column that is part of the primary key.

Primary Key

  • “PrimaryKeyColumnsNullableRefactoring” removes the “nullable” attribute from the columns whose names are selected by the user.

Foreign Key

  • “TableWithoutIndexOnFKRefactoring” creates an index over the columns of the foreign key.

Index

  • “UniqueIndexColumnsNullableRefactoring” removes the “nullable” attribute from the columns whose names are selected by the user.
  • “NonUniqueIndexWithUniqueBehaviorRefactoring” makes the index unique.
  • “IndexWithTooManyColumnsRefactoring” dissociates from the index the columns whose names are selected by the user.
  • “IndexColumnsContainedByAnotherIndexRefactoring” sets up the index for deletion.
  • “IndexWithAllColumnsToDeleteRefactoring” sets up the index for deletion.

Legacy Cobol/Pacbase Database modernization

Goals

The modernization of a legacy base must be done in several steps. The first step consists only in going from the file-based base to an SQL relational base that keeps all the COBOL information. Segments are replaced by tables and rubrics by columns. The file data should be transferable to the SQL base without loss of information.

The next steps are then typical modernizations of relational databases (renaming, creating/deleting elements, changing types, …) (see Modernize). In the case of Pacbase, the language is exploited as much as possible to increase automation and recover semantics (renaming information, links between tables, etc.).

The following part presents the elements and the processes needed for the first step of the modernization.

The elements

PacbaseTable / CobolTable

In BDM, the segments are represented by “PacbaseTable” or “CobolTable”. This element is similar to a “LegacyTable”. It has an additional element called “Option” that contains the parameters needed to extract the data that correspond to its data file. It will become a table in the modernized base.

Some segments are found as “Group” in other segments or only contain information relative to the structure of data in the files. In that case, it will be necessary to delete these segments.

  • Properties are similar to “LegacyTable”.
  • Length: read-only; the number of characters of the legacy record.
Table Option

The Option element is associated with a “PacbaseTable” / “CobolTable”. It allows you to define which data from the files will feed into the future table.

Example of Table Options
  • Name: the name of the option.
  • File: the logical name of the data source in the files that supply the table. This name will be associated with a file. If this file supplies multiple tables, the name must be identical throughout the table options.
  • Header Size: the number of characters of the header of the data file, preceding the first data record.
  • New Line Size: the number of characters between each record of the data file.
  • Line Header Size: the number of characters preceding each record. In fact, two records are separated by a number of characters equal to the sum of the new line size and the line header size.
  • Variable length: Indicates whether the record can have a variable length or not.
  • Path: the physical path of the data file.
  • Isolate: when the same data file feeds a lot of tables, check this checkbox to generate one Talend job per table. Use this to avoid compilation errors in the generated Talend classes.
  • Encoding: The file encoding. For EBCDIC files, the encoding is “Cp037”. You also have the possibility to define and use your own charset.
  • Condition: the condition used to filter data from the files. It is necessary when the same file supplies several tables, so that only the relevant data is recovered. In the explorer, this condition appears next to the name of the table. The expression is a regular expression (see the example below).
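As a hypothetical illustration (the record codes and table names are invented, not taken from a real Pacbase file), a data file whose records start with a two-character record-type code could feed two tables by giving each table option its own condition:

    Condition on the CUSTOMER table option:  ^01.*
    Condition on the ORDER table option:     ^02.*

Records starting with “01” feed CUSTOMER and records starting with “02” feed ORDER; because the two expressions are exclusive, no record is read twice.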

PacbaseColumn / CobolColumn

A call to a rubric is represented by a “PacbaseColumn” / “CobolColumn”. This element is similar to a LegacyColumn. It has the following additional attributes:

  • ParsingType: the type used when reading the file in order to transfer the data. In principle, it should be identical to the “LegacyType”. Nevertheless, it sometimes happens that the specified data type does not correspond to the data actually present in the files. This error is often detected during the data transfer, when the data that has been read does not have the right type to be inserted into the new column.
  • Signed: indicates whether the values in the rubric call are signed or not.
  • Offset: read-only; the position of the first character of the column data in the record.
  • Length: read-only; the number of characters of the legacy rubric.

A rubric call becomes a column in the modernized base.

Types

In the Pacbase files, there are many data types:

  • X: strings of characters
  • 9: classical numeric
  • 3-8: compact numeric

Each type has a size, e.g. X(10) is a character string of size 10. The length is automatically computed and indicated in the column properties.
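As a hypothetical worked example (the rubric names are invented), a record made of CUST-ID 9(6), CUST-NAME X(30) and BALANCE 9(7) gives the following read-only values:

    CUST-ID    9(6)    offset 0    length 6
    CUST-NAME  X(30)   offset 6    length 30
    BALANCE    9(7)    offset 36   length 7

The record length is 6 + 30 + 7 = 43 characters.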

Group

“Group” (identified by the group icon) is a “PacbaseColumn”/”CobolColumn” that can contain other “Group” or “PacbaseColumn”/”CobolColumn”.

Generally, these groups are marked as “To Delete” while their fields are kept (Fig. 1). However, if you wish to keep a group, uncheck its “To Delete” attribute; in this case, all the elements contained in the group will be marked as deleted (Fig. 2). If you also wish to keep some of the elements contained in the group, uncheck the “To Delete” attribute on the wanted elements. Conversely, if you mark the “Group” as “To Delete” again, all contained elements are kept and their “To Delete” attribute is unchecked (Fig. 1).

Handling Groups
Fig. 1. Group is deleted, fields are kept
Fig. 2. Group is kept, fields are deleted

To give a better representation of the final table, the fields present in a group are replicated at the first level of the table, following the group.

Redefine

In the legacy data files, the same area of a record can be reused to contain heterogeneous data in order to save space. For example, a customer may be either a company or a person, but never both. Reusing (redefining) the same area saves space, instead of having both company and person fields with one of them always empty. In such a case, a flag must be used to determine how to read the record: either parse a company name or a person name.
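As a hypothetical illustration (the names and sizes are invented), such a record could be laid out as follows:

    CUSTOMER-TYPE      X(1)     flag: "C" = company, "P" = person
    COMPANY-NAME       X(40)    read when CUSTOMER-TYPE = "C"
    PERSON-LAST-NAME   X(20)    these two fields redefine the same
    PERSON-FIRST-NAME  X(20)    40-character area, read when CUSTOMER-TYPE = "P"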

In BDM, “Redefine” (identified by the redefine icon) is similar to a “PacbaseTable”/”CobolTable” except that it can be created in a “PacbaseTable”/”CobolTable”, a “Group” or a “Redefine”.

Below is an example with two redefines.

Example of multiple redefines
is the graphical representation for
Redefine

Compared to a “PacbaseTable”/”CobolTable”, a redefine indicates the column that it redefines. This information is also shown in the Data Source Explorer tree, following the redefine name.

A Redefine also has a “To Table” attribute that indicates whether a separate table should be created for this “Redefine”. In general, if a redefine is transformed into a table, the same operation is done for the other “Redefine” elements of the same group.

Options

The Option element is associated with a “Redefine”. It allows you to define which data from the files will feed into the future table.

Example of redefine condition

Unlike the Table Option, only the Condition option is useful here. The condition is used to determine when the redefine should be used to read the data, depending on another column for example. The expression is a regular expression, and the condition appears next to the redefine name in the Data Source Explorer tree.

Warning: Take care that the conditions are mutually exclusive so that only one redefine applies, in order to avoid errors and duplicates: if two or more redefines match the same area, the following data will be out of phase.

Note: Sometimes no condition is required because only one redefine definition is used to read the data in the file. In this case, just mark the other redefines as “To Delete” and keep only one redefine, for the reason given above.

Handling Occurs and Double Occurs

During COBOL to Database transformation

In COBOL, file records have a fixed layout. To handle lists, one has to use OCCURS structures, in which the maximum number of elements is fixed.

Double occurs structure
is the graphical representation for
Copybook with double occurs

Each field or group inside an OCCURS is suffixed by a number: one number for the first level of OCCURS, two numbers for the second level, as illustrated below.
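As a hypothetical illustration (the names are invented), an ADDRESS group declared with OCCURS 3 that contains a PHONE field declared with OCCURS 2 would produce:

    ADDRESS_1, ADDRESS_2, ADDRESS_3                     first level: one suffix
    PHONE_1_1, PHONE_1_2, PHONE_2_1, ..., PHONE_3_2     second level: two suffixes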

Second step: Modernization

Check if the phone_address column in double occurs is not blank

Usually, the fields in an OCCURS will be moved to another table (but not necessarily: you can decide to keep several columns in the same table for functional reasons). To do that, all the occurrences of the same field are marked as “Move To” one unique column in another table. For information, “Move To” means that a new instance (record) of the destination table will be created for each occurrence.

In the table or column expression fields, you can use the _index suffix or _index_double_index to select the value of the current column during the transfer. It is particularly useful, for example, to check when the value is blank, or to add an index attribute to handle an ordered list.

Automatic Modifications for PACBASE connection

When creating the connection, a set of modifications is done automatically:

  • Add a primary key to each table with a column and its associated sequence
  • Add foreign keys for each linking information (ex. Hierarchical description in PACBASE)
  • Add a “PacbaseColumn” named “DUMMY”, marked as deleted, because it is only used to read the file during the transfer.
  • By default, all groups should be deleted (see Group for more details on this).

By default, the “PacbaseColumn” elements named “Filler” should also be deleted. Generally, they only served to fill the extra space that was allocated to the data when the data did not fill it.

Collaborative Work

Sharing a Project

For collaborative work, one must use a version controller. Above all else, remember to refresh the project, otherwise you risk running into problems when the files have to be committed. To share the project, go into the “Package Explorer” view, open the contextual menu of the project (right-click) and go to Team > Share Project (Fig. 1).

The following pop-up opens (Fig. 2).

First choose the version controller that you want to use; in this example we use CVS. If you do not have a defined repository, you will be asked to define one: define the parameters necessary for the repository (Fig. 3).

You can then choose the name of the project on the repository (Fig. 4).

Click on “Finish” to commit all the files (Fig. 5).

The commit wizard opens. Click on “Next”, add a comment if you wish and click on “Finish”. Reminder: if you have a problem, you might have forgotten to refresh your project before the commit.

Sharing a Project
Fig. 1. Contextual menu of the project
Fig. 2. Choosing the version controller
Fig. 3. Parametrizing the repository
Fig. 4. Choosing the name of the project on the repository
Fig. 5. Commit the files

Import a project

To import a project, open your source manager perspective and check out the project.

Once the project has been imported, create an “import” repository and choose the project you just checked out (see Import a Workspace Based Connection Profile Repository).

Synchronization and Commit

To synchronize with the project on the repository, open the contextual menu of the base of the “Legacy” connection (right-click) and select “Synchronize”. Don’t forget to refresh the project before getting started (Fig. 1).

The connection switches to connected mode and the appropriate perspective opens. You need only select the files that you want to commit, right-click, select “Commit” and validate, optionally adding a comment (Fig. 2 and Fig. 3).

Finally, when it’s all done, remember to switch back to offline mode (right-click > Work Offline).
Warning: Do not go directly through the project in the “Package Explorer” view to start synchronization.

(For more details on the use of version control, check out the appropriate documentation)

Synchronization and Commit
Fig. 1. Contextual menu of the base
Fig. 2. Contextual menu of the CVS perspective
Fig. 3. Validation of the commit

Test Driven Development

Presentation

Example of a FitNesse test page in success

The aim of Test Driven Development (TDD) during a database migration is to help ensure that the resulting database matches the specifications.

For this purpose, FitNesse is used and provides the following advantages:

  • It provides documentation to the user in wiki format; pages can simply be edited.
  • It provides a test automation tool.
  • It is directly integrated with BDM.

Deployed on a FitNesse server, the test page can be edited and then kept for future reference. It can also be executed like a test plan: the user can see whether the tests run successfully or check whether there is any error.

After a test execution, a new Output Captured link is available: it provides more information and execution logs (useful when the test fails).

Getting started with FitNesse

When a BDM project is created, FitNesse features are automatically available.

Tree structure for Tests

Launch the FitNesse Server

To launch the FitNesse server, just right-click on the start-fitnesse.xml file then select Run As > Ant Build (Fig. 1). FitNesse will start on port 8090. To access the deployed pages, open a web browser at “http://localhost:8090” (Fig. 2).

Launch the FitNesse Server
Fig. 1. Contextual menu to start FitNesse
Fig. 2. New FitNesse page

To stop the FitNesse server, just right-click on the stop-fitnesse.xml file then select Run As > Ant Build. FitNesse will stop.

Add a new test page

To add a new page, you just have to click on Add in the home page, then edit your page. The page can be edited directly in Rich Text, or in Plain Text if you prefer.

Add a new test page
Fig. 1. To add a new page
Fig. 2. Editing the new page
Fig. 3. Page created

Once created, you can continue to edit the page by clicking on Edit.

Run a test

In a test page, just click on the Test link at the top of the page; FitNesse will launch the test plan.

FitNesse Documentation

FitNesse provides more wiki features such as search, versioning, change history, etc.

For more information on FitNesse, see The FitNesse UserGuide.

Using Templates in pages

In the context of the data migration, a set of templates is available in BDM to customize the page and perform operations and verifications on the database.

Here is the list of available templates:

  • EmptyTableTemplate: empties the specified table (performs a delete SQL order)
  • ExecuteSqlScriptTemplate: executes the SQL script given as parameter
  • InsertTableTemplate: inserts records into the specified table (performs an insert SQL order)
  • LaunchTheMigrationTemplate: executes the migration job
  • VerifyTableTemplate: checks that the records are present in the database (performs a select SQL order)
  • OneTableFullMigrationTemplate: entire test page for a one-table migration
  • TwoTablesFullMigrationTemplate: entire test page for a two-table migration

To import a template in the page, edit the page, select the desired template in the list and click on the Insert Template button.

To use the templates, you have to import the bdm fixtures in your pages; just add the following four lines at the beginning of your page:

!*< Hidden

!|Import     |

|bdm.fixtures|

*!

Example of a FitNesse test source page

For advanced users, the test page can also be edited in BDM. With this editor, you can directly format the code (right-click in the editor > FitNesse > Format), which aligns the columns in the templates.

Once the template is imported, you have to customize it (See screenshot for example of use).

  • Select the destination database on which to perform the order: leave only Legacy or Modernized, not both.
  • Specify the table (replace the ${table_name} occurrence) when needed.
  • Specify the sql script name (replace the ${table_name} occurrence) when needed.
  • Add columns when required (replace the ${columns} occurrence and separate columns with the | character).
  • Add records to multiply the test data.

Note: During a table verification, you can add ? after a column name (without a space) to exclude that column from the select; leave the first column or the key without ?. This avoids errors due to a different ordering of the result set.
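As a hypothetical illustration (the column names are invented), a column row of a verification template in which only CUSTOMER_ID is kept in the select while the other columns are simply verified could look like:

    |CUSTOMER_ID|NAME?|CITY?|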

Save Database results under FitNesse format

BDM offers the possibility to directly save a database result set in FitNesse format. It is particularly useful to add database verifications or data insertions to your test page, using a template.

In the result tab view of an SQL request (see SQL Editor), you can right-click and select Save > All results (Fig. 1). In the Save popup, change the file type to FitNesse File and select a destination project (Fig. 2). The results are automatically put into the clipboard in a template format; you just have to paste them into a test page (Fig. 3).

At the end, you have to customize the template: choose InsertInATable or VerifyATable for the action, choose Legacy or Modernized for the environment and enter the table name. For example, to verify records in the Customer table in the modernized environment:

!|Action(InsertInATable or VerifyATable)|Environment(Legacy or Modernized)|Table name|

becomes

!|VerifyATable|Modernized|CUSTOMER|

Fig. 1. Database Results
Fig. 2. File selection
Fig. 3. Import to the test page and format

Database preparation

Since this is unit testing, the database should be used only for the tests. This makes it possible to reinitialize the database whenever desired and, above all, to reset it before launching the tests.

Usually, at the beginning of the test, you should execute an SQL script (called by the ExecuteASqlScript template), such as the reset script sketched below. This kind of file can be put in the file folder of the test tree.
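As a minimal sketch (the script and table names are hypothetical), such a reset script could simply empty the tables used by the test before the migration is launched:

    -- reset-modernized.sql (hypothetical name): empty the tables touched by the test
    DELETE FROM ORDER_LINE;
    DELETE FROM CUSTOMER;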

External Files for the test pages
Fig. 1. Sql script file
Fig. 2. Example of ExecuteASqlScript template