So I finished Freakonomics in just over a day. This is pretty impressive for me since, most of the time, I fall asleep before I’ve finished the second page of any book I try to read. Anyhow, I was surprised how much I enjoyed the book, and how applicable it was to what we do as engineers every day, which is to sift through lots of data looking for the clue that will give the answer to the problem at hand. Often we’re looking at the same data that other engineers have seen and trying to interpret it in a different way, or indeed looking at data which has previously been overlooked. In the case of Freakonomics the data is not code, errors or performance statistics, but nor is it ‘monetary’ data. It’s Sumo wrestlers’ win/loss ratios against the same opponents (looking for corruption in Sumo) and other stuff too, like ‘why do drug dealers still live with their mums?’. Entertaining, and well worth a read.
- Author: gary
- Published: Jan 21st, 2007
- Category: Learning, Uncategorized
- Comments: None
A sideways look at data.
Creating snapshots using passwordless ssh
These instructions show how to set up a user so that he can use ssh to create snapshots on a filer. The user can ONLY carry out snapshot operations; he cannot log in interactively, and any other command (even a simple ‘version’) will fail. This sort of thing might be useful if you were writing a script, which could run under this username.
References:
Roles and RBAC on NetApp filers
Secure Admin (Requires NOW login)
First set up ssh, and make sure root can get to the filer via ssh.
(1) On the filer, setup ssh, accept the defaults....

    filer2> secureadmin setup ssh
    SSH Setup
    ---------
    Determining if SSH Setup has already been done before...no

    SSH server supports both ssh1.x and ssh2.0 protocols.

    SSH server needs two RSA keys to support ssh1.x protocol. The host key is
    generated and saved to file /etc/sshd/ssh_host_key during setup. The server
    key is re-generated every hour when SSH server is running.

    SSH server needs a RSA host key and a DSA host key to support ssh2.0 protocol.
    The host keys are generated and saved to /etc/sshd/ssh_host_rsa_key and
    /etc/sshd/ssh_host_dsa_key files respectively during setup.

    SSH Setup will now ask you for the sizes of the host and server keys.
      For ssh1.0 protocol, key sizes must be between 384 and 2048 bits.
      For ssh2.0 protocol, key sizes must be between 768 and 2048 bits.
      The size of the host and server keys must differ by at least 128 bits.

    Please enter the size of host key for ssh1.x protocol [768] :
    Please enter the size of server key for ssh1.x protocol [512] :
    Please enter the size of host keys for ssh2.0 protocol [768] :

    You have specified these parameters:
            host key size = 768 bits
            server key size = 512 bits
            host key size for ssh2.0 protocol = 768 bits
    Is this correct? [yes]

    Setup will now generate the host keys in the background. It will take a
    few minutes. After Setup is finished you can start SSH server with
    command 'secureadmin enable ssh'. A syslog message will be generated
    when Setup is complete.

    Thu Jan 11 12:32:21 GMT [filer2: rc:info]: SSH Setup: SSH Setup is done. Host keys are stored in /etc/sshd/ssh_host_key, /etc/sshd/ssh_host_rsa_key and /etc/sshd/ssh_host_dsa_key.
    filer2>

(2) Start ssh on the filer.

    filer2> secureadmin enable ssh

(3) Attempt a standard ssh login as root user.

    gjl-powerbook:~ garylittle$ ssh -l root filer2
    The authenticity of host 'filer2 (192.168.1.104)' can't be established.
    RSA key fingerprint is 9b:99:37:9f:21:c1:09:1f:45:82:25:fd:5c:d8:99:a1.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'filer2,192.168.1.104' (RSA) to the list of known hosts.
    root@filer2's password:

(4) Check remote execution (use version command)

    gjl-powerbook:~ garylittle$ ssh -l root filer2 version
    root@filer2's password:
    NetApp Release 7.1: Fri Dec 23 00:48:41 PST 2005
Now we set up another user to use passwordless login.
(5) Enable passwordless login.

5.1 Create a user if required, using the non-privileged group 'Users'.

    filer2> useradmin user add garylittle -g Users
    New password:
    Retype new password:
    User added.

5.2 Create a role called 'snaps' defining what he can do. In this case, manipulate snapshots only.

    filer2> useradmin role add snaps -c "CLI Snapshots" -a cli-snap*

5.3 Assign that role to a group.

    filer2> useradmin group add cli-snapshot-group -r snaps
    Group added.

5.4 Assign the user to the group.

    filer2> useradmin user modify garylittle -f -g cli-snapshot-group
    User modified.
    filer2> useradmin user list garylittle
    Name: garylittle
    Info:
    Rid: 131072
    Groups: cli-snapshot-group
    Full Name:
    Allowed Capabilities: cli-snap*
    Password min/max age in days: 0/4294967295
    Status: enabled

5.5 Generate a public/private key pair for the user, on the system that they want to log in from. In my case this is my laptop.

    gjl-powerbook:~/.ssh garylittle$ ssh-keygen -t rsa -b 1024
    Generating public/private rsa key pair.
    Enter file in which to save the key (/Users/garylittle/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /Users/garylittle/.ssh/id_rsa.
    Your public key has been saved in /Users/garylittle/.ssh/id_rsa.pub.
    The key fingerprint is:
    64:2b:29:40:53:34:83:a2:b0:b4:5e:10:3f:a2:b5:5b garylittle@gjl-powerbook.local
    gjl-powerbook:~/.ssh garylittle$ ssh-keygen -t dsa -b 1024
    Generating public/private dsa key pair.
    Enter file in which to save the key (/Users/garylittle/.ssh/id_dsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /Users/garylittle/.ssh/id_dsa.
    Your public key has been saved in /Users/garylittle/.ssh/id_dsa.pub.
    The key fingerprint is:
    b7:0a:e1:f0:51:f4:b0:e5:c0:6a:06:da:6d:0b:f7:16 garylittle@gjl-powerbook.local

5.6 Add the public keys to the filer. In my case I had mounted /vol/vol0 onto /mnt on my local machine. I had to create the directories /etc/sshd/garylittle manually.

    gjl-powerbook:~/.ssh garylittle$ cat id_rsa.pub >> /mnt/etc/sshd/garylittle/.ssh/authorized_keys
    gjl-powerbook:~/.ssh garylittle$ cat id_dsa.pub >> /mnt/etc/sshd/garylittle/.ssh/authorized_keys

(6) Test... executing a snap list works fine...

    gjl-powerbook:~/.ssh garylittle$ ssh filer2 snap list
    Volume ora9test_dbf
    working...

      %/used       %/total  date          name
    ----------  ----------  ------------  --------
      0% ( 0%)    0% ( 0%)  Jan 04 14:28  volsnap
      0% ( 0%)    0% ( 0%)  Dec 15 16:00  hourly.0
      0% ( 0%)    0% ( 0%)  Oct 20 18:10  DBsnap1
      0% ( 0%)    0% ( 0%)  Oct 20 18:06  adhoc
      0% ( 0%)    0% ( 0%)  Oct 20 16:00  hourly.1
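Step 5.6, including the manual directory creation, could be scripted along these lines. This is only a sketch: `install_filer_key` is a hypothetical helper name, and it assumes the filer's root volume is already mounted on the local machine (at /mnt in my case).

```shell
#!/bin/bash
# Sketch only: append a public key to a filer user's authorized_keys file.
# Assumes the filer's root volume is mounted locally (e.g. /vol/vol0 on /mnt).
# install_filer_key is a hypothetical helper, not a NetApp tool.
install_filer_key() {
    local mount_point=$1 user=$2 pubkey=$3
    local key_dir="$mount_point/etc/sshd/$user/.ssh"

    # The per-user directories do not exist by default, so create them first.
    mkdir -p "$key_dir" || return 1

    # Append rather than overwrite, so several keys (RSA, DSA) can coexist.
    cat "$pubkey" >> "$key_dir/authorized_keys"
}
```

For example, `install_filer_key /mnt garylittle ~/.ssh/id_rsa.pub` followed by the same call for `id_dsa.pub` mirrors the two `cat` commands in step 5.6.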
Check that the user only has the privileges we defined:

    gjl-powerbook:~ garylittle$ ssh filer2
    Connection to filer2 closed.
    gjl-powerbook:~ garylittle$ ssh filer2 version
    Permission denied, user garylittle does not have access to version

But we can create snapshots:

    gjl-powerbook:~ garylittle$ ssh filer2 snap create vol0 ssh-snap
    creating snapshot...
    gjl-powerbook:~ garylittle$ ssh filer2 snap list vol0
    Volume vol0
    working...

      %/used       %/total  date          name
    ----------  ----------  ------------  --------
      0% ( 0%)    0% ( 0%)  Jan 11 13:45  ssh-snap
      1% ( 1%)    0% ( 0%)  Dec 15 16:00  hourly.0
      2% ( 1%)    1% ( 0%)  Oct 20 16:00  hourly.1
      2% ( 1%)    1% ( 0%)  Oct 19 16:00  hourly.2
      3% ( 1%)    1% ( 0%)  Oct 17 16:00  hourly.3
      4% ( 1%)    1% ( 0%)  Sep 17 16:00  hourly.4
      5% ( 1%)    2% ( 0%)  Sep 15 16:00  hourly.5
    gjl-powerbook:~ garylittle$
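As a rough idea of the kind of script this enables, here is a minimal sketch that creates a timestamped snapshot through the restricted account. The `FILER` and `SSH` variables, the function names and the naming scheme are all my own illustrative choices; the only filer command used is the `snap create <volume> <name>` form shown above.

```shell
#!/bin/bash
# Sketch only: create a timestamped snapshot via the restricted ssh account.
# FILER and SSH are illustrative variables and can be overridden from the environment.
FILER=${FILER:-filer2}
SSH=${SSH:-ssh}

# Build a snapshot name such as ssh-snap.20070111-1345 (naming scheme is an assumption).
snap_name() {
    local prefix=$1 stamp=$2
    echo "$prefix.$stamp"
}

# Run 'snap create <volume> <name>' on the filer.
create_snapshot() {
    local volume=$1 name
    name=$(snap_name ssh-snap "$(date +%Y%m%d-%H%M)")
    # With the public key installed, this runs without a password prompt.
    $SSH "$FILER" snap create "$volume" "$name"
}
```

Calling `create_snapshot vol0` from cron would then take a snapshot without anyone typing a password, and without the account being able to do anything else on the filer.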
Analysing NetApp sysstat PT1: The CP columns
The CP type is displayed in the CP/ty column in the output of sysstat. The column contains two pieces of data: the first character is the cause of the CP (the CP type) and the second character is the ‘phase’. In the output below, the first row shows a CP type of “T” and a phase of “f”. The second row shows a CP type of “:”, which just means that the same CP was still ongoing the next time sysstat sampled the internal CP counters. We get a bit more insight into this process from the CP/time column, which represents the proportion of the sample time that was spent in the CP. So, in the case where we are sampling every second, the CP took about half a second (54%) of the first sample, and then continued into the second sample for 17% of one second. We can therefore assume that the CP started in the second half of the first sample period, and continued a little into the next.
    filer*> sysstat -x 1
       CP  CP  Disk   FCP  iSCSI         FCP kB/s      iSCSI kB/s
     time  ty  util                    in      out      in     out
      54%  Tf   92%   183      0       4   145228       0       0
      17%   :   96%   201      0       5   149234       0       0
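The reasoning above (split the CP ty value into two characters, then add up the CP time across consecutive samples) can be sketched in a few lines of shell. The helper names are hypothetical, and the sketch assumes the values have already been picked out of the sysstat output.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical.

# Split a CP ty value: first character is the type, second is the phase.
cp_type()  { echo "${1:0:1}"; }
cp_phase() { echo "${1:1:1}"; }

# Approximate length of a CP that spans several samples:
# sum of (CP time percentage x sample interval in seconds).
cp_duration() {
    local interval=$1; shift
    printf '%s\n' "$@" |
        awk -v i="$interval" '{ sub(/%/, ""); t += $1 * i / 100 } END { printf "%.2f\n", t }'
}
```

With the two rows above, `cp_type Tf` gives `T`, `cp_phase Tf` gives `f`, and `cp_duration 1 54% 17%` gives `0.71`, i.e. the whole CP took roughly 0.7 seconds.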
The table below describes the CP types as per the sysstat manpage. In a system with little or no incoming data, a CP is still generated by a timer which fires once every 10 seconds; this is CP type “T”.
When there is a high incoming data rate, the filer tries to free up resources before they become exhausted, since exhaustion would mean that the user sees high write latencies. When everything is working well, as is normally the case, the incoming write latency is the time it takes to write the user’s data into NVRAM (plus the network round trip time). This is how the filer is able to achieve extremely high write rates, even for a random IO pattern. If the filer is not able to keep up with the incoming workload, it will sometimes show CP type “B”, which can mean that the filer is continually in a state of CP, and the user workload can see higher latencies as a result.
The log full CP literally means that the NVRAM log is 50% full, and so the filer must write this dirty data out to disk. Whilst the CP is happening, the other 50% of the NVRAM is used to accept more incoming data, and so the clients should see the same low latency as when there is no CP ongoing.
A CP type of “H” indicates that the filer has a large number of dirty buffers in the system, and even though the NVRAM is not yet 50% full, the filer issues a CP in order to free up RAM. This CP type is sometimes seen on filers with a small amount of RAM, and a large incoming write rate with a small IO size to random offsets.
CP type “Z” is often caused by snapshot creation or deletion. Remember that snapshot deletion can be triggered by several conditions besides a user typing “snap delete” at the console. Examples of ‘automatic’ deletion are:
- Snapshot deletion to recover space on the aggregate
- Snapshot deletion, to maintain a specific snap schedule
| CP Types | CP Phases |
|---|---|
| B – Back to back CPs (CP generated CP) | 0 – Initializing |
| b – Deferred back to back CPs (CP generated CP) | n – Processing normal files |
| F – CP caused by full NVLog | s – Processing special files |
| H – CP caused by high water mark | f – Flushing modified data to disk |
| L – CP caused by low water mark | v – Flushing modified superblock to disk |
| S – CP caused by snapshot operation | |
| T – CP caused by timer | |
| U – CP caused by flush | |
| Z – CP caused by internal sync | |
| : continuation of CP from previous interval | |
| # continuation of CP from previous interval, and the NVLog for the next CP is now full, so that the next CP will be of type B | |
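If you are scanning a lot of sysstat output, the table can be turned into a small lookup. This is just a sketch: the function names are mine, and the descriptions are copied from the table above.

```shell
#!/bin/bash
# Sketch only: decode a CP ty value (e.g. "Tf") using the table above.
# Function names are hypothetical.
describe_cp_type() {
    case "$1" in
        B) echo "Back to back CPs (CP generated CP)" ;;
        b) echo "Deferred back to back CPs (CP generated CP)" ;;
        F) echo "CP caused by full NVLog" ;;
        H) echo "CP caused by high water mark" ;;
        L) echo "CP caused by low water mark" ;;
        S) echo "CP caused by snapshot operation" ;;
        T) echo "CP caused by timer" ;;
        U) echo "CP caused by flush" ;;
        Z) echo "CP caused by internal sync" ;;
        :) echo "continuation of CP from previous interval" ;;
        '#') echo "continuation of CP, NVLog for the next CP is full" ;;
        *) echo "unknown CP type" ;;
    esac
}

describe_cp_phase() {
    case "$1" in
        0) echo "Initializing" ;;
        n) echo "Processing normal files" ;;
        s) echo "Processing special files" ;;
        f) echo "Flushing modified data to disk" ;;
        v) echo "Flushing modified superblock to disk" ;;
        *) echo "unknown CP phase" ;;
    esac
}

# First character is the type; the second, if present, is the phase.
describe_cp() {
    local ty=${1:0:1} phase=${1:1:1}
    if [ -n "$phase" ]; then
        echo "$(describe_cp_type "$ty"); $(describe_cp_phase "$phase")"
    else
        describe_cp_type "$ty"
    fi
}
```

For example, `describe_cp Tf` prints `CP caused by timer; Flushing modified data to disk`.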
A quicker cp (Results)
I tested my cp script (which I called pcp, for ‘parallel cp’) against a NetApp NFS server from a Solaris client. The best results vs standard ‘cp’ were when the FS was mounted ‘-o forcedirectio’. Since forcedirectio is often used by DB hosts, and DB servers typically use very large files, the performance of cp can be quite important. During my tests I saw roughly a 100% improvement in throughput, i.e. it took half the time to copy a 2GB file using ‘pcp’ vs standard ‘cp’.
Here are some graphed results which I took from the raw iostat.
Standard Solaris cp
Vs pcp using 1MB IO’s and 8 dd ‘threads’.
A Quicker copy (cp)
Quite often poor write performance is due to a low degree of parallelism on the write side. This is particularly problematic for operations like copy (cp) where each write is essentially synchronous. The issue is that when a file is created or ‘extended’ (more blocks are allocated to a file) which is the case when doing a copy, the filesystem meta-data is altered. In the case of cp, a block is taken from the free pool and allocated to a file.
A customer of mine was copying a large file (around 300GB if I remember correctly) and the copy was taking several hours. The utilisation of the storage was very low, and service times (as measured by iostat on Solaris) were also low, which suggested the storage itself was not the bottleneck. I used ‘dd’ to chop up the file into 4 pieces (using ‘seek’ to index into the file) and so created a parallel copy. This technique improved the copy performance by 50%.
I wanted to create a scripted version that could be run on any file regardless of size. The script is below.
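    #!/bin/bash
    # Parallel copy: split the file into THREADS chunks and copy each with its own dd.
    BLOCKSIZE=512                       # unit reported by 'ls -s' (512-byte blocks on Solaris)
    IO_IN_MB=1
    let "pcpBSIZE=IO_IN_MB*1024*1024"   # dd block size in bytes
    let THREADS=4
    DD=/bin/dd
    SOURCE=$1
    DEST=$2
    echo "Source is $SOURCE"
    echo "Dest is $DEST"
    #rm $DEST

    # awk copes with the leading spaces that 'ls -s' may print; cut -d' ' does not.
    SIZEINBLOCKS=`ls -s "$SOURCE" | awk '{print $1}'`
    echo "Size of $SOURCE is $SIZEINBLOCKS blocks"
    let "SIZEINBYTES=SIZEINBLOCKS*BLOCKSIZE"
    let "CHUNK=SIZEINBYTES/THREADS"
    echo "Size of $SOURCE is $SIZEINBYTES"
    echo "Size of chunk=$CHUNK"

    # Chunk length in dd blocks. Offsets are whole multiples of COUNT, so the
    # chunks neither overlap nor leave gaps when CHUNK is not a multiple of pcpBSIZE.
    let "COUNT=CHUNK/pcpBSIZE"

    let loop=0
    while ((loop<THREADS-1))
    do
        let "OFFSET=COUNT*loop"
        # conv=notrunc stops one dd from truncating data written by another.
        $DD if="$SOURCE" of="$DEST" bs=$pcpBSIZE count=$COUNT iseek=$OFFSET oseek=$OFFSET conv=notrunc &
        let loop=loop+1
    done

    # Special case: the last dd runs until EOF, so it picks up any remainder.
    let "OFFSET=COUNT*loop"
    $DD if="$SOURCE" of="$DEST" bs=$pcpBSIZE iseek=$OFFSET oseek=$OFFSET conv=notrunc
    wait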
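Before pointing something like this at a 300GB file, it can help to look at the chunk layout without copying anything. The sketch below prints one way of laying out the dd offsets and counts (both in units of the dd block size) so that the chunks meet exactly; `plan_chunks` is a hypothetical helper and is not part of the script itself.

```shell
#!/bin/bash
# Sketch only: print the dd offset and count for each chunk, in units of bs,
# without copying any data. plan_chunks is a hypothetical helper name.
plan_chunks() {
    local size_bytes=$1 bs=$2 threads=$3
    # Whole dd blocks per chunk; any remainder is left to the last chunk.
    local count=$(( size_bytes / threads / bs ))
    local i
    for (( i = 0; i < threads - 1; i++ )); do
        echo "chunk $i: offset=$(( count * i )) count=$count"
    done
    # The last chunk has no count: dd reads from its offset until EOF.
    echo "chunk $i: offset=$(( count * i )) count=EOF"
}
```

For a 2GB file with 1MB IOs and 4 threads, `plan_chunks 2147483648 1048576 4` shows chunks starting at blocks 0, 512, 1024 and 1536. After a real copy, `cmp` on the source and destination is a cheap way to confirm that nothing was missed.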

