When collecting network traces to solve performance problems, we need to balance the need to capture enough data to re-construct the upper protocols with the need to limit the amount of trace data collected overall. When capturing NFS traffic, the packet size can grow to be quite large, since the NFS packets will contain the “payload” data i.e. the data that the userland application is reading or writing.
Most of the tools used to collect network trace data support the ‘snaplen’ option which controls how much data to capture per packet. For NFS traces a snaplen of 256 bytes seems to work well. With a snaplen of 256 the trace analysers will have enough data to reconstruct the filehandles, offset and IO size. Tools like tshark will be able to match the NFS response to the request and so you can see the response time from the NFS server.
In this trace below, the total response time is from the first NFS READ Call (packet 13) until the final RPC Continuation in the sequence. In NFS the NFS V3 READ Reply (Packet 14) is just the beginning of the NFS server response. The bulk of the payload is in the “RPC Continuation” packets. The number of Continuation packets will depend on the size of the NFS request and the size of the Ethernet frames. In the trace below, the IO size is 32K, the Ethernet segment size is 1500 and the IO stream is single threaded.
I use the tshark option “-R rpc” to show the “NFS READ calls” as well as the “RPC Continuation” responses from the NFS server.
The command line.
lovebox:test_snaplen gjl$ tshark -r snap256_from_open -R rpc|head -35
The trace.
13 3.529598 192.168.10.3 -> 192.168.10.4 NFS V3 READ Call, FH:0x44281a4d Offset:0 Len:32768
14 3.529844 192.168.10.4 -> 192.168.10.3 NFS V3 READ Reply (Call In 13) Len:32768[Unreassembled Packet]
15 3.529856 192.168.10.4 -> 192.168.10.3 RPC Continuation
17 3.529872 192.168.10.4 -> 192.168.10.3 RPC Continuation
18 3.529882 192.168.10.4 -> 192.168.10.3 RPC Continuation
20 3.529896 192.168.10.4 -> 192.168.10.3 RPC Continuation
21 3.529905 192.168.10.4 -> 192.168.10.3 RPC Continuation
23 3.529921 192.168.10.4 -> 192.168.10.3 RPC Continuation
24 3.529930 192.168.10.4 -> 192.168.10.3 RPC Continuation
26 3.529944 192.168.10.4 -> 192.168.10.3 RPC Continuation
27 3.529955 192.168.10.4 -> 192.168.10.3 RPC Continuation
29 3.529970 192.168.10.4 -> 192.168.10.3 RPC Continuation
30 3.529980 192.168.10.4 -> 192.168.10.3 RPC Continuation
32 3.529997 192.168.10.4 -> 192.168.10.3 RPC Continuation
33 3.530005 192.168.10.4 -> 192.168.10.3 RPC Continuation
35 3.530019 192.168.10.4 -> 192.168.10.3 RPC Continuation
36 3.530029 192.168.10.4 -> 192.168.10.3 RPC Continuation
38 3.530044 192.168.10.4 -> 192.168.10.3 RPC Continuation
39 3.530053 192.168.10.4 -> 192.168.10.3 RPC Continuation
41 3.530068 192.168.10.4 -> 192.168.10.3 RPC Continuation
42 3.530077 192.168.10.4 -> 192.168.10.3 RPC Continuation
44 3.530092 192.168.10.4 -> 192.168.10.3 RPC Continuation
45 3.530100 192.168.10.4 -> 192.168.10.3 RPC Continuation
47 3.530115 192.168.10.4 -> 192.168.10.3 RPC Continuation
48 3.530197 192.168.10.3 -> 192.168.10.4 NFS V3 READ Call, FH:0x44281a4d Offset:32768 Len:32768
If you are processing someone elses trace data, one clue that the snaplen was too short is that the NFS reads do not contain the offset or Len values. In this example the snaplen was 192. Notice that packet 27 does not show the offset of the read, compare to packet 13 in the above trace, where the snaplen was 256.
17 30.646984 192.168.10.3 -> 192.168.10.4 NFS V3 ACCESS Call, FH:0x34d96998
18 30.647233 192.168.10.4 -> 192.168.10.3 NFS V3 ACCESS Reply (Call In 17)
20 30.647279 192.168.10.3 -> 192.168.10.4 NFS V3 GETATTR Call, FH:0x44281a4d
21 30.647472 192.168.10.4 -> 192.168.10.3 NFS V3 GETATTR Reply (Call In 20) Regular File mode:0666 uid:0 gid:0
22 30.647501 192.168.10.3 -> 192.168.10.4 NFS V3 ACCESS Call, FH:0x44281a4d
23 30.647727 192.168.10.4 -> 192.168.10.3 NFS V3 ACCESS Reply (Call In 22)
27 30.648254 192.168.10.3 -> 192.168.10.4 NFS V3 READ Call[Packet size limited during capture]
28 30.648539 192.168.10.4 -> 192.168.10.3 NFS V3 READ Reply (Call In 27) Len:32768[Packet size limited during capture]
In wireshark, there are a number of settings which control how wireshark/tshark attempts to re-assemble rpc/tcp/nfs packets. Enabling “more aggressive” RPC decoding seemed to mess up the ability to show the RPC continuation packets. Here are the RPC settings from the wireshark preferences used to generate this data. These prefs also affect the way tshark works.
lovebox:test_snaplen gjl$ grep rpc ~/.wireshark/preferences
dcerpc.desegment_dcerpc: TRUE
dcerpc.reassemble_dcerpc: TRUE
rpc.desegment_rpc_over_tcp: TRUE
rpc.defragment_rpc_over_tcp: TRUE
rpc.max_tcp_pdu_size: 4194304
rpc.dissect_unknown_programs: FALSE
rpc.find_fragment_start: FALSE
rpcap.desegment_pdus: TRUE
rpcap.decode_content: TRUE
rpcap.linktype: 0