super-user

Practical computer science for performance analysis and debugging.

VMware performance for gurus.

TAGS: None

I am definitely not a guru yet. But one day….

VMware Performance for Gurus – A Tutorial from Richard McDougall

Predicting disk cache hits for 100% random access.

TAGS: None

Let’s say I have a cache of 512G. My block size is 4K, so that’s 134,217,728 entries. How many blocks will I need to read to fill the cache? Well, if I read data sequentially, then obviously I just need to read 512G worth of files. But what if I read random blocks? Most caches will try to cache randomly read blocks, since sequential reads get the least benefit from disk caching.

So, if I read that same 512G randomly, how many blocks will end up in cache? Not 512G, because some of those random blocks will be ‘re-hits’ of blocks that were already cached.

It turns out that, by simulation, about 0.6321 of the entire cache ends up filled (about 323.6G). Repeated simulations show that the ratio is pretty constant. So, is there something magical about the ratio 0.6321 (rather than 0.666, which was my guess)?

  • Example output.
    Garys-Nutanix-MBP13:Versions garylittle$ python ~/Dropbox/scratch/cachehit.py 
    Re-Hit ratio 0.3676748
    Miss (Insert) ratio 0.6323252
    

    Result of 4 trials…

    print (0.6322271+0.6320339+0.6322528+0.6320873)/4
    0.632150275
    

    Is there anything interesting about that value?
    http://www.wolframalpha.com/input/?i=0.6321
    Tells us that the value 0.6321 can be more-or-less represented as.

    1-(1/e)
    

    Furthermore we see http://www.wolframalpha.com/input/?i=1-1%2Fe

    Series representation.

    I can’t figure out which of the above series representations actually explains the cache hit behavior, but it makes sense to have something to do with factorials, since the more data we read in, the higher the chance that the next read will actually be a hit in the cache rather than inserting a new value.

    If anyone can explain the underlying math to this effect, I would be very interested. Looks like it’s related to http://en.wikipedia.org/wiki/Derangement
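
    One way to see where 1-(1/e) comes from, assuming every read picks one of the N blocks uniformly and independently: a given block is skipped by a single read with probability (1 - 1/N), so it is skipped by all N reads with probability (1 - 1/N)^N, which tends to 1/e as N grows. The expected filled fraction is therefore 1 - (1 - 1/N)^N. A minimal sketch (the function name is made up for illustration):

```python
import math

def expected_fill_ratio(n):
    """Expected fraction of an n-slot cache occupied after n uniformly random reads."""
    # One read misses a given slot with probability (1 - 1/n);
    # n independent reads all miss it with probability (1 - 1/n)**n.
    return 1.0 - (1.0 - 1.0 / n) ** n

for n in (10, 1000, 10000000):
    print(n, expected_fill_ratio(n))

print("1 - 1/e =", 1.0 - 1.0 / math.e)

# The 512G cache with 4K blocks from the example above, in G of cache filled.
print(512 * expected_fill_ratio(134217728))
```

    The value settles close to 1-(1/e) even for modest N, which would explain why repeated simulations give such a constant ratio.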

    Thanks go to Matti Vanninen for pointing out that 0.6321 was somehow magical.

    Here’s code to simulate the cache in Python. This causes python to malloc about 400M of memory.

    Garys-Nutanix-MBP13:Versions garylittle$ python ~/Dropbox/scratch/cachehit.py 
    Re-Hit ratio 0.3677729
    Miss (Insert) ratio 0.6322271
    
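    import random
    
    # 10 million entries; each slot is 0 (empty) or 1 (cached).
    cachesize=10000000
    hit=0.0
    miss=0.0
    cache=[0]*cachesize
    
    # Issue as many random reads as there are slots in the cache.
    for i in range(0,cachesize):
            b=random.randint(0,cachesize-1)
            if cache[b] == 1:
                    # Block already cached: a re-hit.
                    hit+=1
            else:
                    # Block not cached yet: insert it.
                    cache[b]=1
                    miss+=1
    
    # Printing a single formatted string behaves the same in Python 2 and 3.
    print("Re-Hit ratio %s" % (hit/cachesize))
    print("Miss (Insert) ratio %s" % (miss/cachesize))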
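
    A lower-memory variant of the same simulation, as a sketch: a bytearray holds one byte per slot instead of one list entry per slot, and wrapping the loop in a function makes it easy to check that the ratio stays put across cache sizes. The function name and the seed parameter are illustrative choices, not part of the original script.

```python
import random

def fill_ratio(cachesize, seed=None):
    """Issue cachesize uniformly random reads; return the fraction of slots filled."""
    rng = random.Random(seed)
    cache = bytearray(cachesize)      # one byte per slot, all zero to start
    miss = 0
    for _ in range(cachesize):
        b = rng.randrange(cachesize)
        if not cache[b]:
            cache[b] = 1              # first touch: insert into the cache
            miss += 1
    return miss / float(cachesize)

for size in (1000, 100000, 1000000):
    print(size, fill_ratio(size, seed=1))
```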
    
  • Show status of aggregate creation / disk zero status.


    For the most part, you only really care about disk zeroing if you’re waiting for an aggregate to be created on disks that were previously used on an old aggregate that was only just destroyed.

    To see how long disk-zeroing is taking, use the command “aggr status -r”

    filer-6280*> aggr status -r
    Aggregate aggr1_large (creating, raid_dp, initializing) (block checksums)
      Plex /aggr1_large/plex0 (offline, empty, active)
    
      Targeted to traditional volume or aggregate but not yet assigned to a raid group
          RAID Disk Device          HA  SHELF BAY CHAN Pool Type  RPM     
          --------- ------          ------------- ---- ---- ---- ----- --------------    --------------
          pending   3a.00.8         3a    0   8   SA:A   0   SAS 10000 (zeroing, 71% done)
          pending   3a.00.14        3a    0   14  SA:A   0   SAS 10000 (zeroing, 69% done)
          pending   3a.00.16        3a    0   16  SA:A   0   SAS 10000 (zeroing, 70% done)
          pending   3a.00.18        3a    0   18  SA:A   0   SAS 10000 (zeroing, 69% done)
          pending   3a.00.20        3a    0   20  SA:A   0   SAS 10000 (zeroing, 69% done)
          pending   3a.00.22        3a    0   22  SA:A   0   SAS 10000 (zeroing, 71% done)
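
    If you capture that output (over ssh, say), the per-disk progress can be pulled out with a regular expression. A minimal sketch, assuming the line layout shown above; the function name is hypothetical, and only the “(zeroing, NN% done)” form seen in this output is handled.

```python
import re

# Matches lines such as:
#   pending   3a.00.8   3a  0  8  SA:A  0  SAS 10000 (zeroing, 71% done)
# Group 1 is the device name (second column), group 2 the percentage.
ZEROING = re.compile(
    r"^[ \t]*\S+[ \t]+(\S+)[ \t].*\(zeroing, (\d+)% done\)", re.MULTILINE)

def zeroing_progress(aggr_status_output):
    """Return {device: percent_done} for every disk reported as zeroing."""
    return dict((dev, int(pct)) for dev, pct in ZEROING.findall(aggr_status_output))

sample = """
      pending   3a.00.8         3a    0   8   SA:A   0   SAS 10000 (zeroing, 71% done)
      pending   3a.00.14        3a    0   14  SA:A   0   SAS 10000 (zeroing, 69% done)
      pending   3a.00.16        3a    0   16  SA:A   0   SAS 10000 (zeroing, 70% done)
"""
progress = zeroing_progress(sample)
print("slowest disk: %d%% done" % min(progress.values()))
```

    The aggregate cannot come online until the slowest disk finishes, so the minimum is the number worth watching.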
     
    • Published: Jan 9th, 2013
    • Category: how-to
    • Comments: 5

    Netapp API hacking with python


    As a non-programmer, I’ve always been reluctant to use anything with acronyms like API and SDK, relying instead on issuing a full command line using rsh or ssh. That works for a while, until you want to start doing things like checking for errors, and until you get fed up with 90% of the script being dedicated to parsing the output. NetApp filers have a reasonable API that can be used to get both sysadmin data (number and fullness of volumes) and performance analysis numbers (number of ops and the response times). Best of all, the SDK provides libraries for both Perl and Python. I have switched almost entirely to Python for anything that needs more than a few lines of automation. In Python, all you have to do is import 2 libraries and you can start using the API.

    The API can be downloaded from the NetApp support site now.netapp.com under the ‘Download Software’ link.

    The API is implemented at the lowest level by sending XML-based RPC calls over HTTP to the filer. Inside NetApp, any new functionality must provide API (aka ZAPI) access, so learning the API should be a good investment. It helps to know that the implementation is XML, since retrieving the data returned from the filer follows the tortuous access pattern familiar to anyone who has used XML in the past.

    I have used the API in my lab to monitor disk usage during long term testing, and in my previous life as a consultant implemented a scheduler to manage database snapshots. Once you’re used to accessing the API, it’s much easier than sending CLI commands over ssh/rsh.

    Here are the steps to get the API

    Download the API (ontapi SDK)

  • Head to http://support.netapp.com/NOW/cgi-bin/software/

    You’re looking for
    NetApp Manageability SDK

  • Select “All platforms” then hit “GO”.
  • Click the button “View & Download”
  • Fill in some sort of form…. fill in all the fields, otherwise you’ll have to start over. For some reason, the form only talks about Perl, Java, C and .Net. After that a download link will appear, there’s a license to click through – and eventually you’ll be able to download the SDK. When I downloaded, the tarball was 89 MB.
  • Click on yet another hyperlink “Thank you for completing the registration form. To continue with your download click here”
  • Scroll to the bottom of the page, and click the “CONTINUE” link (yes, painful isn’t it?)
  • Now you’ll have to read the EULA… then hit “Accept” (if you can live with the EULA)
  • Maddeningly, there is yet another link to click, that implies you need to login elsewhere… but actually clicking the link “Log in to the NetApp Support Site and click NetApp Manageability SDK.” will actually start the download.

    !!Now the download will actually start!!

  • Once the tarball/zip file has downloaded on your client machine, go and find it, extract it to somewhere sensible as you normally would.

    The file is called “netapp-manageability-sdk-5.0R1.zip” on my Mac.

    Inside the ‘lib’ directory, is python/NetApp. These are the python modules that we’ll be using.

    DfmErrno.py
    NaElement.py
    NaErrno.py
    NaServer.py
    
  • Now, regarding the documentation. At the top level of the directory structure, of the unzipped file – there is a file SDK_help.htm. If I open this file with the Chrome browser – then I get a mostly blank page. If I open the file in Safari browser, then I get a decent help screen.
  • To see some examples
  • Home > NetApp Manageability SDK > Sample Codes > Data ONTAP sample codes
  • Now, let’s run the ZAPI ‘hello world’, return the name of the filer. Obviously – this is just as easy to do with the CLI – but once we start to get into iterating over tens or hundreds of volumes, or other structured data – the power of using the API will become obvious.

    For now though, let’s start with something simple…

    lovebox:[~] $ export PYTHONPATH=$PYTHONPATH:~/Downloads/netapp-manageability-sdk-5.0R1/lib/python/NetApp/
    
    lovebox:[~] $ ipython
    
    In [1]: from NaElement import *
    In [2]: from NaServer import *
    In [3]: server=NaServer("gjlfiler.mylab.netapp.com",1,6)
    In [4]: server.set_admin_user('root',"root")
    In [5]: cmd = NaElement('system-get-info')
    In [6]: out=server.invoke_elem(cmd)
    In [7]: system_info=out.child_get("system-info")
    In [8]: system_info.child_get_string("system-name")
    
    Out[8]: u'gjlfiler'
    
    from NaElement import *
    from NaServer import *
    server=NaServer("gjlfiler.mylab.netapp.com",1,6)
    server.set_admin_user('root',"root")
    cmd = NaElement('system-get-info')
    out=server.invoke_elem(cmd)
    system_info=out.child_get("system-info")
    system_info.child_get_string("system-name")
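
    Since error checking was one of the reasons for moving to the API, here is a hedged sketch of how a result could be checked before it is unpacked. It assumes the object returned by invoke_elem() exposes results_status() and results_reason(), as the SDK’s sample code suggests; verify that against your SDK version. FakeResult is a hypothetical stand-in so the sketch runs without a filer, and the failure text is invented for the demo.

```python
class FakeResult(object):
    """Hypothetical stand-in for the NaElement returned by invoke_elem()."""
    def __init__(self, status, reason=""):
        self._status = status
        self._reason = reason

    def results_status(self):
        return self._status

    def results_reason(self):
        return self._reason


def check_result(out):
    """Return out unchanged, or raise if the filer reported a failure."""
    # Assumption: results_status() is "failed" when the call did not succeed.
    if out.results_status() == "failed":
        raise RuntimeError("ZAPI call failed: %s" % out.results_reason())
    return out


ok = check_result(FakeResult("passed"))
try:
    check_result(FakeResult("failed", "example failure reason"))
except RuntimeError as err:
    print(err)
```

    With the real SDK the call would look like check_result(server.invoke_elem(cmd)) before any child_get() calls.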
    

    Something more tricky

    Regrettably, the SDK documentation no longer contains the ONTAP portion of the API, IOW what calls I can make to the filer, and what it will respond with. To access that documentation, go to:

    https://communities.netapp.com/community/interfaces_and_tools/developer/apidoc

    Click the link for Data ONTAP API Documentation, and download the zipfile (currently that filename is netapp-manageability-sdk-ontap-api-documentation.zip)

    Again, I was unable to view the html doc with Chrome for some reason, but Firefox works OK.

    So, let’s try something else – how about a short script to get the size of each volume in the system.

  • Open the documentation folder, and fire open “SDK_help.htm”.
  • Go to the index tab, and find the “Volume” section – click on “volume-list-info”, because that just sounds like it might be what we want.

    We see from the documentation that the function returns a list of volumes, in a structure named “volumes”, because that’s the name given in the “Output Name” field in the API documentation.

    Input Name   Range   Type                Description
    verbose              boolean, optional   If set to “true”, more detailed volume information is returned. If not supplied or set to “false”, this extra information is not returned.
    volume               string, optional    The name of the volume for which we want status information. If not supplied, then we want status for all volumes on the filer. Note that if status information for more than 20 volumes is desired, the volume-list-info-iter-* zapis will be more efficient and should be used instead.

    Output Name  Range   Type                Description
    volumes              volume-info[]       List of volumes and their status information.
  • We can tell that the return type is going to be a list because of the [ ] at the end of the type description e.g. volume-info[]. Click on the volume-info[] link to see what is in the list.
  • One of the elements is “size-total” which is what we want.

    So, now we know that we want to grab the volume-info[] list (which we can guess is a list of each volume in the system, and each item in the list has information about the volume).

    So, as before we do some setup to reach the filer, and this time we’re going to issue the command volume-list-info.

    In [1]: import sys
    
    In [2]: sys.path.append("/opt/netapp/netapp-manageability-sdk-5.0R1/lib/python/")
    
    In [3]: from NaElement import *
    
    In [4]: from NaServer import *
    
    In [6]: filer=NaServer("gjlfiler.mylab.netapp.com",1,6)
    
    In [7]: filer.set_admin_user('root','root')
    
    In [8]: cmd = NaElement("volume-list-info")
    
    In [10]: ret = filer.invoke_elem(cmd)
    
    In [11]: ret
    Out[11]: 
    
  • The object ‘ret’, is a container, which we know contains the output “volumes” – but we need to access it via the magical accessors. How do we know the magic words to use? Well, we know from the API document that the output name as returned by the API is “volumes” and that it is of type list (because of the []). So, we ask the container (i.e. the whole XML returned from the filer) to provide us the “volumes” object.

    Above we see that ‘ret’ is just an NaElement instance; we want the specific ‘volumes’ object. There might be more than one object returned to us (although in this case there is not), so we have to unpack the volumes object – just like in the simple example we did first.

    
    In [12]: volumes = ret.child_get("volumes")
    
    In [13]: volumes
    Out[13]: 
    

    So, now we have an object that should be a list of volumes. But unfortunately we can’t just access that as a real python list – we have to use the specific accessors.

  • We know that volumes contains a list of per-volume information.
  • To access the individual items of the list we need to call children_get()
  • By looking at the definition of ‘volume-info’ we can see that there is a field called “name” which is a string.
  • To extract a particular field from the per-volume structure we issue child_get_string()
      volume-info
         vol
         vol
         vol --- size
               --- name
               --- etc,
         vol
         vol
    
    
    In [14]: for vol in volumes.children_get():
        ...:     print vol.child_get_string("name")
        ...:     
    db2_fv
    db3_fv
    db4_fv
    db5_fv
    db6_fv
    log1_fv
    log2_fv
    log3_fv
    log4_fv
    log5_fv
    log6_fv
    db1_fv
    vol0
    

    Since we also want the size – we can find that field in the volume-info structure, figure out the type and use the correct accessor. In this case we can make a guess that size-total is what we want, and that its type is integer and so we end up with this :-

    for vol in volumes.children_get():
        print vol.child_get_string("name")
        print vol.child_get_int("size-total")
        
    db2_fv
    3006477107200
    db3_fv
    3006477107200
    db4_fv
    3006477107200
    db5_fv
    3006477107200
    db6_fv
    3006477107200
    log1_fv
    21474836480
    log2_fv
    21474836480
    log3_fv
    21474836480
    log4_fv
    21474836480
    log5_fv
    21474836480
    log6_fv
    21474836480
    db1_fv
    3006477107200
    vol0
    476002729984
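
    The raw byte counts are hard to read, so a natural next step is to collect them into a dictionary and convert to GiB. The sketch below uses only the accessors shown above (children_get, child_get_string, child_get_int); FakeVolume and FakeVolumes are hypothetical stand-ins for the SDK objects so that it runs without a filer, fed with three of the sizes printed above.

```python
class FakeVolume(object):
    """Hypothetical stand-in for one volume-info NaElement."""
    def __init__(self, name, size_total):
        self._fields = {"name": name, "size-total": size_total}

    def child_get_string(self, key):
        return str(self._fields[key])

    def child_get_int(self, key):
        return int(self._fields[key])


class FakeVolumes(object):
    """Hypothetical stand-in for the "volumes" NaElement."""
    def __init__(self, vols):
        self._vols = vols

    def children_get(self):
        return list(self._vols)


def volume_sizes_gib(volumes):
    """Map volume name -> size-total in GiB."""
    gib = 1024.0 ** 3
    return dict((vol.child_get_string("name"), vol.child_get_int("size-total") / gib)
                for vol in volumes.children_get())


sample = FakeVolumes([FakeVolume("db1_fv", 3006477107200),
                      FakeVolume("log1_fv", 21474836480),
                      FakeVolume("vol0", 476002729984)])
for name, size in sorted(volume_sizes_gib(sample).items()):
    print("%-8s %8.1f GiB" % (name, size))
```

    Against a real filer, the ‘volumes’ object from the session above would be passed in place of the sample.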
    

    The documentation that tells you about child_get() etc. is in the SDK documentation, in the following section

    Home > NetApp Manageability SDK > Programming Guide > SDK Core APIs > Python Core APIs > Input Output Management APIs
    
  • Thoughts on “Latency Numbers Every Programmer Should Know”

    TAGS: None

    Last week a number of twitter users were posting about a very cool looking interactive latency predicterizor. In the small world of computer performance nerds, it was a veritable tsunami of attention. My twitter skills are so feeble that I cannot figure out how to determine the total number of tweets – but as of now (Wed Jan 2nd 5:32 EST) the most recent tweet was 19 minutes ago. Fascinating.

    The interactive paper is here http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html

    At face value, it seemed that some of the numbers were really quite low. For instance I was pretty surprised to see the projected random read latency from an SSD at 17 micro-seconds (uSec). What’s really nice about this graphic is that the sources for the numbers presented are contained in the javascript source. In fact that’s really really nice. The numbers for flash/SSD are taken from a UCSD paper presented at FAST 2012, which is itself quite an interesting read.

    Flash paper : http://cseweb.ucsd.edu/users/swanson/papers/FAST2012BleakFlash.pdf

    From the interactive diagram – which states main memory read latency of 1uSec, you’d assume that reading from an SSD is 17x slower than main memory. Personally I think that’s WAY off for a couple of reasons. Firstly, and least interestingly – the 17uSec is for a direct access to the NAND cell itself. In practice SSDs are packaged with a Flash Translation Layer (FTL) which translates something like a linear address range (LBA) into mappings to the flash memories. The FAST paper (above) pegs the FTL latency at 30 uSec – which seems high relative to the 17 uSec response from the NAND memory (the comments in the code, and the paper itself, seem to peg the NAND response time at 20 uSec – but the interactive tool shows me 17 uSec for “2012”).

    The more interesting consideration – particularly given the title “Latency Numbers every _programmer_ should know” is that, as a programmer – I can to a large extent expect to achieve a 1 uSec response from main memory when I attempt to read some value. However, there’s no way that I will ever get the 17 uSec (or even a 17+39 uSec) response from SSD. The reason is that, as a user I cannot access that SSD directly. For most programmers – the SSD will be accessed via a filesystem, then a device driver.

    A programmer’s access to the filesystem is almost always on the other side of a system call, which means a trap or interrupt call, saving stack pointers and setting up addresses for buffers etc. Typically the data will be read from SSD into the memory of the host computer, and then returned to the user/programmer. Even with DMA and other zero-copy techniques there will be many reads/writes to main memory to set up the system call.

    When we think about spinning-disk accesses of ~4 milliseconds (ms) or 4,000 uSec, we can more-or-less gloss over the cost of setting up the system call and moving the data through the kernel and back to the user, because the overhead of moving this mechanical instrument dwarfs the other costs. But with SSDs that setup cost starts to impact how quickly a user-land application can really access data.
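
    A rough way to get a feel for that per-call overhead is to time reads of a block that is already in the page cache, so that no device access is involved at all. The sketch below assumes a Unix-like system and Python 3 (os.pread is not available on Windows); the function name is made up, and interpreter overhead is included in the figure, so treat the result as an upper bound on the system call cost rather than a precise measurement.

```python
import os
import tempfile
import time

def mean_read_latency_usec(path, block_size=4096, reads=10000):
    """Average wall-clock time of one pread() of block_size bytes at offset 0, in uSec."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.pread(fd, block_size, 0)          # warm the page cache first
        start = time.perf_counter()
        for _ in range(reads):
            os.pread(fd, block_size, 0)      # one system call per iteration
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / reads * 1e6

# Scratch file holding a single 4K block.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * 4096)
    scratch = f.name
try:
    print("per-read cost from page cache: %.2f uSec" % mean_read_latency_usec(scratch))
finally:
    os.remove(scratch)
```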

    Thinking Clearly about Performance – ACM Queue article.


    Great article from Cary Millsap, covers performance analysis in general – not specific to Oracle.

    Oracle performance diagnostics (Session centric)


    Anyone who deals with systems performance in an enterprise environment will inevitably need to deal with an Oracle database performance problem. These databases often make up part of the largest, most complex and business-critical infrastructures. In the past I have always looked at DB performance issues from a systemic perspective – looking at the database in the large. Below are a couple of articles that I found while looking through some old Oracle magazines. They take a more session-centric view of performance – diagnosing specific wait events that might be causing a particular user session to appear slow.

    Examples include finding the oracle session which belongs to a particular unix PID, and a bunch of related stuff using the v$ tables.


    Effects of Windows power management on storage performance.

    TAGS: None

    Something very odd happened on my Jetstress box. I disconnected my iSCSI LUNs in order to do an aggregate snap-restore, and when I restarted the test the oprate achieved from Jetstress (with the exact same parameters) was around 1/2 of what it had achieved just a few minutes earlier. It looks like power-savings kicked in while the LUNs were offline.

    Here’s the short take-away. With the Windows 2008 power setting at “Balanced” I achieve only 1/2 the throughput (oprate) that I do when the power setting in Windows is “High performance”.

    with “Balanced” power setting

    CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                           in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
    13%      0      0      0    1870   24859  70727   57600      0       0      0     0s    97%    0%  -    34%       0      0   1870       0      0   24475  70414
     8%      0      0      0    1141   11407  58021   48496      0       0      0     0s    98%    0%  -    26%       4      0   1137       0      0   11148  57799
    29%      0      0      0    1054    8284  55235   50180  16878       0      0   >60     99%   22%  Ts   28%       0      0   1054       0      0    8050  55034
    16%      0      0      0     981   11293  50431   64100 168960       0      0   >60     99%  100%  :f   43%       0      0    981       0      0   11064  50233
    11%      0      0      0     935    9829  49900   44216 106192       0      0   >60     98%   94%  :    33%       0      0    935       0      0    9608  49709
    10%      0      0      0    1059   13088  51064   36503      0       0      0   >60     98%    0%  -    23%       0      0   1059       0      0   12946  50885
     8%      0      0      0    1052    9332  53281   57060      0       0      0   >60     98%    0%  -    27%       4      0   1048       0      0    9001  53051
     9%      0      0      0    1287   12829  58028   43288      0       0      0     1s    98%    0%  -    25%       0      0   1287       0      0   12555  57799
    11%      0      0      0    1529   13086  66611   59088      0       0      0     1s    97%    0%  -    32%       0      0   1529       0      0   12771  66351
    10%      0      0      0    1387   12685  66571   56900      0       0      0     1s    98%    0%  -    33%       0      0   1387       0      0   12382  66314
     8%      0      0      0    1037   12370  51644   38640      0       0      0     1s    98%    0%  -    25%       0      0   1037       0      0   12131  51442
     9%      0      0      0    1118   10402  53779   52108      0       0      0     1s    98%    0%  -    27%       4      0   1114       0      0   10158  53576
     9%      0      0      0    1172   12642  57266   47236      0       0      0     2s    98%    0%  -    26%       0      0   1172       0      0   12378  57569
     9%      0      0      0    1305   12714  58096   48824      0       0      0     2s    97%    0%  -    30%       0      0   1305       0      0   12441  57344
    11%      0      0      0    1044   11125  51985   46134      8       0      0     2s    98%    2%  Tn   25%       0      0   1044       0      0   10889  51782
    

    with “High performance” power setting

    perfdisk-6280-1*> sysstat -x 1
    CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                           in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
    14%      0      0      0    2198   19282  85558   70372      0       0      0     2s    97%    0%  -    39%       0      0   2198       0      0   18777  85225
    15%      0      0      0    2312   24365  80659   67488      0       0      0     3s    97%    0%  -    45%       0      0   2312       0      0   23848  80314
    16%      0      0      0    2469   22900  88474   66532      0       0      0     3s    97%    0%  -    41%       0      0   2469       0      0   22350  88105
    16%      0      0      0    2560   24089  96448   78876      0       0      0     2s    97%    0%  -    44%       0      0   2560       0      0   23511  96068
    15%      0      0      0    2298   23169  80427   62648      0       0      0     2s    97%    0%  -    36%       5      0   2293       0      0   22662  80085
    15%      0      0      0    2387   25311  85323   70716      0       0      0     2s    97%    0%  -    39%       0      0   2387       0      0   24780  84963
    18%      0      0      0    2810   25211  93821   73608      0       0      0     2s    97%    0%  -    41%       0      0   2810       0      0   24608  93422
    16%      0      0      0    2847   31037  94805   72640      0       0      0     2s    97%    0%  -    41%       0      0   2847       0      0   30418  94392
    60%      0      0      0    2161   31831  78890   86160 107040       0      0     1s    99%   72%  Tf   49%       0      0   2161       0      0   31311  78545
    19%      0      0      0    2284   29344  84730   68596 146432       0      0     1s    97%  100%  :f   49%       8      0   2276       0      0   28798  84369
    18%      0      0      0    2347   26055  85583   68028 157888       0      0     2s    97%  100%  :f   56%       0      0   2347       0      0   25518  85230
    17%      0      0      0    2107   18594  79451   66024 117756       0      0     2s    97%   99%  :    48%       0      0   2107       0      0   18114  79135
    14%      0      0      0    2066   19512  81547   69632      0       0      0     2s    97%    0%  -    39%       2      0   2064       0      0   19028  81228
    20%      0      0      0    3301   26889  90424   70632      0       0      0     2s    97%    0%  -    42%     695      0   2606       0      0   26315  90038
    17%      0      0      0    2560   27279  85441   70044      0       0      0     3s    97%    0%  -    41%       9      0   2551       0      0   26722  85062
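
    To put a number on the difference, the “Total” ops column from the two sysstat captures above can be averaged. A small sketch, with the values copied by hand from those captures:

```python
from statistics import mean

# "Total" ops/s column, copied from the two sysstat captures.
balanced = [1870, 1141, 1054, 981, 935, 1059, 1052, 1287,
            1529, 1387, 1037, 1118, 1172, 1305, 1044]
high_performance = [2198, 2312, 2469, 2560, 2298, 2387, 2810, 2847,
                    2161, 2284, 2347, 2107, 2066, 3301, 2560]

ratio = mean(balanced) / mean(high_performance)
print("Balanced:         %.0f ops/s" % mean(balanced))
print("High performance: %.0f ops/s" % mean(high_performance))
print("Balanced / High performance = %.2f" % ratio)
```

    The ratio works out to roughly 0.49, which is where the “1/2 the throughput” figure comes from.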
    

    The longer story

    After some head scratching and googling, I found that the power scheme on the Windows box was set to “balanced” – which seemed a bit odd for a server OS. So, I switched it to “High Performance” and almost instantaneously the throughput to the filer doubled (back to what it was previously).

    It seems that the most likely culprit is that the “balanced” power option manages power to the PCI bus as well as CPU. My guess is that when I disconnected the iSCSI luns – the PCI card (Intel 10GbE) went idle and the power-saving mode kicked in. For some reason, it never went back to full-power even though I was once again using the 10GbE card.

    One of the things that makes the power-saving issue a little tricky is that although I’m not moving a LOT of data over the network – I AM dependent on achieving low latency to meet my oprate (IOW I don’t have a lot of concurrency). I wonder how much work has been done to measure the effect of power saving mode on latency at low loads.

    It was lucky for me that I stumbled across the power savings mode – the only evidence I had that the problem lay at the Windows side was a large unexplained delta in the latency seen by the Windows host – and the latency attributable to the filer.

    Create a sysstat like command for any filer statistic.


    Here’s a neat trick to create a sysstat-like output for any statistic available in the counter manager. In this example I chose some counters that were relevant to my Jetstress testing. Here’s the output:

    ----------------iSCSI-------------  --Disk--  --CPU--  ram   EC  hdd  ssd  hyaA hyaB hyaC
       ops   lat     Rd_lat     Wr_lat  read_lat     all     
        /s    ms         ms         ms        ms       %    %    %    %    %    %    %    %
       937  3.12       4.28       0.17      8.87      31   74    0   25    0    0    0    0
       886  2.84       4.30       0.29      8.68      32   74    0   25    0    0    0    0
      1846  2.97       5.04       0.19      9.33      48   68    0   31    0    0    0    0
       866  3.07       4.58       0.14      8.75      31   73    0   26    0    0    0    0
       619  3.34       4.20       0.15     10.00      25   78    0   21    0    0    0    0
       558  3.43       4.19       0.12     10.05      29   80    0   20    0    0    0    0
       905  2.70       4.41       0.19      9.95      30   77    0   22    0    0    0    0
       828  2.70       4.25       0.16      9.78      28   77    0   22    0    0    0    0
      1336  2.77       5.19       0.26     10.90      99   74    0   25    0    0    0    0
      1078  3.03       4.53       0.18      9.57      45   74    0   25    0    0    0    0
       654  2.09       3.54       0.18     11.00      29   83    0   16    0    0    0    0
       749  3.13       4.27       0.16      9.23      26   75    0   24    0    0    0    0
    

    To the left I have some iSCSI stats, since I connected the filer to my Windows box via iSCSI. I show the number of iSCSI ops/second, followed by the average latency. In the next two columns I break out the iSCSI read and write latencies separately. Next I display the average time taken to return a read from disk.

    Next I show the total CPU usage. This column will show anything from 0 to 100 times the number of CPUs, so if you have a filer with 4 CPUs the range is 0-400.

    The next section attempts to show where the blocks are being read from.

    ram = buffer cache
    EC = flash cache
    hdd = from a physical spinning disk
    ssd = from a physical SSD disk

    Then there are several Flash Pool (hybrid aggregate) counters that I do not fully understand. And since I am not using flash pools I didn’t spend time trying to find out what the counters mean.

    To get this sort of report on your own filer, all you have to do is create an XML file of the correct format and upload the file to your filer’s /etc/stats/preset directory. The easiest way to do that is to mount the filer’s /vol/vol0 on some friendly unix machine. Below is the XML file I used to generate the output shown above.

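    Then on the filer, type “stats show -p” followed by the name of the preset file without the .xml extension. In my case I called the file iscsi.xml, so I type “stats show -p iscsi”.

    [XML preset listing: the markup did not survive extraction; only the numeric values 5, 5, 10, 10, 9, 7 and 4 remain.]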

    filebench “reuse” parameter does not work on NFS mountpoint


    Opened bug 3581710 on Sourceforge to cover this behavior. For some reason, filebench seems not to honor the reuse parameter correctly if the filebench dataset is created in the mountpoint root (e.g. /a). The test itself seems to work OK, but the datafile(s) is always removed. However, if a subdir is created in the root, e.g. /a/dir1, then the reuse parameter works as expected.

    Bug 3581710

    © 2009 super-user. All Rights Reserved.
