Course Module 5. Spring - Lecture: Initializing DataSource in Spring

The org.springframework.jdbc.datasource.init package provides support for initializing an existing DataSource. The embedded database support tools provide one option for creating and initializing a DataSource for an application. However, sometimes you may need to initialize an instance that is running on a server.

Initializing a database using Spring XML

If you need to initialize a database and it is possible to specify a link to the DataSource bean, then you can use the initialize-database tag in the spring-jdbc namespace:

<jdbc:initialize-database data-source="dataSource">
    <jdbc:script location="classpath:com/foo/sql/db-schema.sql"/>
    <jdbc:script location="classpath:com/foo/sql/db-test-data.sql"/>
</jdbc:initialize-database>

The previous example runs the two specified scripts against the database. The first script creates the schema, and the second populates the tables with the control data set. Script locations can also be wildcarded in the usual Ant style used for resources in Spring (eg classpath*:/com/foo/**/sql/*-data.sql). If you use a template, the scripts are executed in the lexical order of their URL or file name.

The default logic of the database initializer is to unconditionally execute the specified scripts. This is not always what is required - for example, if the scripts are run against a database that already has audit data. The likelihood of accidentally deleting data is reduced by following the common pattern (shown earlier): first creating tables and then inserting data. The first step will fail if the tables already exist.

However, to give you more control over the creation and deletion of existing data, the XML namespace provides several additional capabilities. The first is a flag to turn initialization on and off. You can set it according to your environment (for example, extract a boolean value from system properties or from an environment bean). In the following example, the value is obtained from a system property:

<jdbc:initialize-database data-source="dataSource"
    enabled="#{systemProperties.INITIALIZE_DATABASE}"> 
    <jdbc:script location="..."/>
</jdbc:initialize-database>

Get the value for enabled from the system property INITIALIZE_DATABASE.

The second option for controlling what happens to existing data is to be more resilient to failure. To do this, you can control the initializer's ability to ignore certain errors in the SQL it runs from scripts, as shown in the following example:

<jdbc:initialize-database data-source="dataSource" ignore-failures="DROPS">
    <jdbc:script location="..."/>
</jdbc:initialize-database>

In the previous example, we are saying that we expect scripts to sometimes be executed against an empty database, and those scripts have some DROP statements that will therefore not be executed. Thus, SQL DROP queries with errors will be ignored, but other failures will result in an exception being thrown. This is useful if the SQL dialect does not support DROP ... IF EXISTS (or similar), but you need to unconditionally remove all control data before recreating. In this case, the first script is usually a set of DROP statements, followed by a set of CREATE statements.

The ignore-failures parameter can be set to NONE (default), DROPS (ignore erroneous DROP statements ) or ALL (ignore all failures).

Each instruction must be separated by a ; character or a new line if the ; character is not present in the script at all. You can control this globally or by script, as shown in the following example:

<jdbc:initialize-database data-source="dataSource" separator="@@"> 
    <jdbc:script location="classpath:com/myapp/sql/db-schema.sql" separator=";"/> 
    <jdbc:script location="classpath:com/myapp/sql/db-test-data-1.sql"/>
    <jdbc:script location="classpath:com/myapp/sql/db-test-data-2.sql"/>
</jdbc:initialize-database>

Install separator scripts in @@.
Set the separator for db-schema.sql to ;.

In this example, two test-data scripts use @@ as statement delimiters, and only db-schema.sql uses ;. This configuration sets the default delimiter to be @@ and overrides this value for the db-schema script.

If you want more control than using an XML namespace, you can use DataSourceInitializer directly and define it as a component in your application.

Initializing other database-dependent components

A large class of applications (those that do not use a database before starting the context in Spring) can use the database initializer without additional hurdles. If your application is not one of these, you may need to read the rest of this section.

The database initializer depends on the DataSource instance and executes the scripts provided in the initialization callback (similar to init-method in the XML bean definition, a method with the @PostConstruct annotation in a component or the afterPropertiesSet() method in a component that implements InitializingBean). If other beans depend on the same data source and use the data source in the initialization callback, a problem may occur because the data has not yet been initialized. A common example is a cache that initializes without delay and loads data from the database when the application starts.

To work around this problem, there are two ways: change the cache initialization strategy to a later stage, or ensure that the database initializer is initialized first.

Changing the cache initialization strategy can be easy if the application is under your control and not otherwise. Some suggestions on how to accomplish this:

Make the cache initialize later on first use, which will improve application startup time.
Let your cache, or a separate component that initializes the cache, implement Lifecycle or SmartLifecycle. When an application context starts, you can automatically start SmartLifecycle by setting its autoStartup flag, and you can manually start Lifecycle by calling ConfigurableApplicationContext.start() for the enclosing context.
Use ApplicationEvent from Spring or a similar custom observer mechanism to trigger cache initialization. ContextRefreshedEvent is always published by the context when it is ready to be used (after all beans have been initialized), so this is often a useful method (this is how SmartLifecycle works by default).

Making sure that the database initializer is initialized first is also easy. Some thoughts on how to do this:

You will use Spring's default BeanFactory logic, which is that beans are initialized in the order they are registered. This can be easily accomplished by adopting the common practice of using a set of <import/> elements in the XML configuration that organize your application modules and ensure that the database and database initialization are listed first.
Separate the DataSource from the business components that use it and control the order in which they run by placing them in separate ApplicationContext instances (for example, the parent context contains the DataSource, and the child is the business components). This structure is common in Spring web applications, but can be used more widely.