Advanced Data Load with MySQL Shell

Over the last few years, I have become quite smitten with MySQL Shell. For those who may not be familiar with MySQL Shell, it is a new(ish) command line interface (CLI) for connecting to and managing MySQL instances. During a recent episode of Inside MySQL: Sakila Speaks, Fred and I talked to Miguel Araujo about many of the helpful (and lesser known) features of MySQL Shell. This post is the ninth in a series about these “hidden gem” features.
The Problem
There is no ‘problem’ for this post, but I wanted to keep the formatting I have used for other MySQL Shell Hidden Gems posts. I have also discussed dumping and loading data with MySQL Shell in previous posts, so today, I want to cover some of the options available when loading data using MySQL Shell.
The Solution
Once again, not really a ‘solution’, but I want to stick to the formatting of related posts.
First, let’s take a look at some of the options we can use when loading data. We can use these options as part of a call to util.loadDump()
. This function’s last (and optional) argument is the options configuration block in JSON format. For example, if we wanted to load data from a dump that was on our local system, we would use a command that resembles:
I will discuss only some options, those I find interesting and helpful.
Enabling local_infile
Before running any of the data load commands, we need to ensure that the global system variable named local_infile
is set to ON
. To check if this variable has the correct value, run the SQL command:
The output from this command might resemble the following:
If the result shows the value as ‘OFF’, we must change it. To change the value of this variable, run the following SQL command:
After setting the variable, we should recheck the value by running:
The value should now show as ON
.
Dry Ryn
If we have a very large database dump or are moving data to a HeatWave MySQL instance, we may want to ensure everything will work as expected before we do the actual load. To do a load with this option, we add dryRun: true
. For example:
When I run this command against my local MySQL instance, I see the following results:
As we can see, there were no warnings when loading this dump, and it should execute without issue on the MySQL instance we are connected to.
Threading
I have talked about this before, but it bears repeating. MySQL Shell can do multi-threaded loads. This option makes the process of loading data faster. By default, MySQL Shell uses four threads. Here is the syntax:
The output from this command would resemble the text below.
When watching the progress of the data load, we may see that the number of threads used is lower than the number we specify. While MySQL Shell will try to use the number of threads we specify, sometimes fewer threads are used. Also, don’t think that more threads will always mean better performance. That may not be the case.
Load Progress File
The progressFile
option allows us to specify the location of a local progress state file for tracking load progress. If the dump file is local to the MySQl instance to which we are loading the data, this file is created automatically in the source directory for the dump. The name of the file will be in the format: load-progress-<server-uuid>.json
.
The file output will resemble the following:
Skipping the Bin Log
The’ skipBinLog’ option can increase performance when loading a large data set. When set to true
, this option tells MySQl Shell to issue a SET sql_bin_log=0
statement before loading the data. This will only work for on-premise MySQL instances. If you set this option to true
when loading data into a HeatWave MySQL instance, it will throw an error.
Here is an example of using this option:
The output for this command will be similar to what we saw in the threading example above.
Reset Progress
Suppose I run the command util.loadDump("~/dumps/example2", {skipBinLog: true})
again, I will see the following message:
We see a message that a load progress file was detected, and the load will resume from where it left off. If there is an error or some other kind of disruption when we first try to load the data, the load will continue from that point. If the dump succeeded the first time, nothing will be executed.
We can use the resetProgess
option to tell MySQL Shell to start the progress from the beginning of the data load. We must first remove all database objects created in the previous load attempt to use this option. These objects include schemas, tables, users, views, triggers, routines, and events. If we don’t, we will get an error (unless we use the ignoreExistingObjects
option…more on that in a bit.).
Here is an example of using this option:
And here is what the output will resemble if we do not remove any previously created objects.
The output shows a list of items that have already been created.
Ignoring Existing Objects
As I noted above, we can use the ignoreExistingObjects
option to re-run a data load and not get any errors when we encounter an existing object. Here is a command that uses the resetProgress
and ignoreExistingObjects
options (after I dropped a few schemas loaded with the previous command).
The output of this command would look like the text below.
The output will show what objects were skipped because they already exist.
Loading From Oracle Cloud
One of my favorite features of MySQL Shell is the ability to load data from an Oracle Cloud storage bucket. There are several options available to us to take advantage of this feature.
This example assumes you have installed the OCI CLI.
Depending on how the OCI CLI is configured on your system, you may also need to include your bucket’s namespace
.
Here is the output from this command.
This image shows the progress file created when we loaded the dump. It was added to our OCI bucket in a folder named ‘example5’.
Creating Invisible Primary Keys
The createInvisiblePKs
option will add primary keys in invisible columns for every table in the dump that does not have a primary key defined. If the dump was created using the createInvisiblePKs
option, the createInvisiblePKs
option of the load is automatically set to true
.
Wrap Up
The utilities in MySQL Shell for loading data from a MySQL database are an excellent combination of ease and power. We can use options to perform a dry run to ensure everything will run as expected, reset the progress of a load if it was interrupted or needs to be re-run, ignore any existing objects in the target MySQL instance, and load data from a dump stored in an OCI storage bucket. Check out the MySQL Shell Load Documentation to learn more about the different load options available in MySQL Shell.
Photo by Michael Fousert on Unsplash
Related Entries
- Running External Scripts in MySQL Shell
- Executing System Commands in MySQL Shell
- Getting Help in MySQL Shell
- Sandboxes in MySQL Shell
- Server Upgrade Checks with MySQL Shell
- MySQL Connection Status with MySQL Shell
- Managing MySQL Shell Configuration Options
- Advanced Data Dump with MySQL Shell
- MySQL Shell CLI Integration
- MySQL Shell Table Export Utility
- Copy Data Directly To A MySQL Instance With MySQL Shell
- Create an InnoDB Replica Set With MySQL Shell
- Using MySQL Shell to Get MySQL Diagnostic Information
- Customizing The MySQL Shell Prompt
- Extending MySQL Shell with Startup Scripts
- Extending MySQL Shell with Plugins
- Doing a 'Incremental' Dump With MySQL Shell