Exporting LOBs fast using EXPDP in parallel

Recently I had a customer who was struggling to export their 5TB database without hitting ORA-01555 errors. The export was running for over 2 days to reasonably fast storage so I couldn’t realistically set UNDO_RETENTION to 48 hours plus! The export was running in parallel already but looking through the logs I could see nearly all the time was spent on a large 1.5TB LOB table. The problem is that EXPDP cannot export LOB tables in parallel so we’ve got a huge amount of data to shift serially. At this point it was obvious that ‘tuning’ the export wasn’t going to achieve the seismic time saving we needed so it was time to get creative.

One of the features of EXPDP is the ability to set a point in time SCN to ensure consistency. You can also use a where clause in a query to filter data. Combining these 2 options allow us to export multiple subsets of data consistently even if they are taken with different EXPDP jobs.

My solution was to export the LOB table separately from the rest of the data and then to further split the LOB table into pieces by rowids and export each piece individually making sure the FLASHBACK_SCN was the same for each export. My initial test commands looked like this:

expdp <user>/<password> TABLES=employees 
QUERY='employees:"where mod(dbms_rowid.rowid_block_number(rowid), 20) = 1"' 
directory=DATA_PUMP_DIR dumpfile=EXPORT_1.dmp logfile=LOGFILE_1.log 
FLASHBACK_SCN=53001181934 expdp <user>/<password> dumpfile = EXPORT_%U.DMP directory = DATA_PUMP_DIR exclude =TABLE:"='EMPLOYEES'" logfile = EXPORT.LOG parallel=8

This worked great and completed in just under an hour. To explain what’s going on here, this example of the where clause might help:

SQL> select count(*) from employees where mod(dbms_rowid.rowid_block_number(rowid), 20) = 1;
COUNT(*)
----------
1206

SQL> select count(*) from employees where mod(dbms_rowid.rowid_block_number(rowid), 20) = 2;
COUNT(*)
----------
1298

SQL> select count(*) from employees where mod(dbms_rowid.rowid_block_number(rowid), 20) = 11;
COUNT(*)
----------
1251

SQL> select count(*) from employees where mod(dbms_rowid.rowid_block_number(rowid), 20) = 15;
COUNT(*)
----------
1297

After the test run of just 1/20th of the total data I needed to script something to loop through the same command 20 times changing the numbers. As I could see me needing something similar again in the future I created a version for both Windows and Linux:

Linux:

#!/bin/bash
pieces=20
for ((i=0;i<=19;i++));
do
expdp <user>/<password> TABLES=employees 
QUERY='employees:"where mod(dbms_rowid.rowid_block_number(rowid), ${pieces}) = ${i}"' 
directory=DATA_PUMP_DIR dumpfile=EXPORT_${i}.dmp logfile=LOGFILE_${i}.log 
FLASHBACK_SCN=53001181934 & done

Windows. The start /b command works like nohup in linux so stops expdp ‘grabbing’ your session:

@echo off
set pieces=20                                                                     
for /L %a in (1,1,20) do (
start /b expdp <user>/<password> TABLES=employees 
QUERY='employees:"where mod(dbms_rowid.rowid_block_number(rowid), %pieces) = %a"' 
directory=DATA_PUMP_DIR dumpfile=EXPORT_%a.dmp logfile=LOGFILE_%a.log 
FLASHBACK_SCN= 53001181934     
) 

 

Leave a Reply