Poor write performance is quite often due to a low degree of parallelism on the write side. This is particularly problematic for operations like copy (cp), where each write is essentially synchronous. The issue is that when a file is created or ‘extended’ (more blocks are allocated to it), which is the case when doing a copy, the filesystem meta-data is altered. In the case of cp, each block is taken from the free pool and allocated to the file.
A customer of mine was copying a large file, around 300GB if I remember correctly, and the copy was taking several hours. The utilisation of the storage was very low, even though service times (as measured by iostat on Solaris) were also low. I used ‘dd’ to chop the file into 4 pieces (using ‘seek’ to index into the file) and so created a parallel copy. This technique improved the copy performance by 50%.
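As a rough illustration of that manual four-way split, here is a minimal sketch. The function name `four_way_copy` and the 1 MiB block size are assumptions made for the example, not what was run at the customer site. It uses the POSIX `skip=` and `seek=` operands rather than the Solaris-specific ones, and `conv=notrunc` so that one dd does not truncate what another has written. Note that `wc -c` may read the whole file on some systems, so a size taken from `ls -l` could be preferable for very large files.

```shell
#!/bin/bash
# Sketch: copy a file in four parallel pieces with dd.
# four_way_copy is a hypothetical helper; bs is an assumed I/O size.
four_way_copy() {
    local src="$1" dst="$2"
    local bs=$((1024 * 1024))              # 1 MiB per dd I/O (assumption)
    local size blocks count i
    size=$(wc -c < "$src")                 # file size in bytes
    blocks=$(( (size + bs - 1) / bs ))     # total blocks, rounded up
    count=$(( (blocks + 3) / 4 ))          # blocks per piece, rounded up
    : > "$dst"                             # create/empty the destination once
    for i in 0 1 2 3; do
        # skip= positions the input, seek= positions the output.
        # conv=notrunc keeps each dd from truncating the shared output file.
        dd if="$src" of="$dst" bs="$bs" count="$count" \
           skip=$((i * count)) seek=$((i * count)) conv=notrunc 2>/dev/null &
    done
    wait                                   # block until all four pieces finish
}
```

Calling `four_way_copy bigfile bigfile.copy` and then running `cmp bigfile bigfile.copy` is a simple way to check that the pieces line up.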
I wanted to create a scripted version that could be run on any file regardless of size. The script is below.
#!/bin/bash
#Parallel copy.
BLOCKSIZE=512
IO_IN_MB=1
let pcpBSIZE=$IO_IN_MB*1024*1024
let THREADS=4
DD=/bin/dd
SOURCE=$1
DEST=$2
echo Source is $1
echo Dest is $2
#rm $DEST
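# ls -s reports allocated blocks; the unit is assumed to be BLOCKSIZE bytes
# (POSIX specifies 512, GNU ls defaults to 1024).
SIZEINBLOCKS=`ls -s $1|cut -d' ' -f 1`
echo Size of $SOURCE is $SIZEINBLOCKS blocks
let SIZEINBYTES=$SIZEINBLOCKS*BLOCKSIZE
let CHUNK=SIZEINBYTES/$THREADS
echo Size of $SOURCE is $SIZEINBYTES bytes
echo Size of chunk=$CHUNK
let COUNT=$CHUNK/$pcpBSIZE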
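let loop=0
while ((loop<THREADS-1))
do
  # Offsets are multiples of COUNT so that consecutive chunks are contiguous;
  # dividing CHUNK*loop by the block size can round down and leave gaps.
  let OFFSET=$COUNT*$loop
  # conv=notrunc guards against one dd truncating data written by another.
  # iseek/oseek are accepted by Solaris dd; elsewhere use the POSIX skip= and seek=.
  $DD if=$SOURCE of=$DEST bs=$pcpBSIZE count=$COUNT iseek=$OFFSET oseek=$OFFSET conv=notrunc &
  let loop=loop+1
done
#Special case, the last dd does until EOF
#let loop=loop+1
let OFFSET=$COUNT*$loop
$DD if=$SOURCE of=$DEST bs=$pcpBSIZE iseek=$OFFSET oseek=$OFFSET conv=notrunc
wait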
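To make the chunk arithmetic easier to follow, here is a small sketch that only prints the plan without copying anything. `chunk_plan` is a hypothetical helper, and it assumes each offset is taken as a multiple of the per-chunk block count so that the pieces are contiguous. Each output line gives the dd index, the starting block, and the block count, with the last dd running to EOF.

```shell
#!/bin/bash
# chunk_plan SIZE_BYTES THREADS BS
# Prints "index start_block count" for each dd; the last one reads to EOF.
chunk_plan() {
    size=$1 threads=$2 bs=$3
    count=$(( size / threads / bs ))       # whole blocks per chunk, rounded down
    i=0
    while [ "$i" -lt $((threads - 1)) ]; do
        echo "$i $((i * count)) $count"
        i=$((i + 1))
    done
    echo "$i $((i * count)) EOF"           # last dd picks up the remainder
}

# A 10 MiB file, 4 dd processes, 1 MiB blocks:
chunk_plan $((10 * 1024 * 1024)) 4 $((1024 * 1024))
# 0 0 2
# 1 2 2
# 2 4 2
# 3 6 EOF
```

Because the per-chunk count is rounded down, the final dd ends up with the largest share (four blocks here against two for the others). That is harmless for correctness but worth knowing when judging how evenly the work is spread.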