Mark6 playback performance
Playback speeds have been obtained using the EHT 2015 data. Speeds were determined by doing local reads from the vdifuse-mounted data, e.g.:
dd if=/mark6-01_fuse/... of=/dev/null bs=1M count=5000
The measurements were made over several scans and then averaged.
Results:
machine | speed (averaged) |
---|---|
mark6-01 | |
mark6-02 | PV: 3120 Mbps |
mark6-03 | AZ: 1275 Mbps |
mark6-04 | 1224 Mbps |
mark6-05 | 1190 Mbps |
mark6-06 | |
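The procedure described above (read part of several scans with dd, then average the rates) could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and it assumes GNU dd, whose summary line ends in "copied, <seconds> s, <rate>". Setting LC_ALL=C keeps that summary in English with a decimal point.

```shell
# dd_summary_to_mbps: reads a GNU dd summary line on stdin, e.g.
#   5242880000 bytes (5.2 GB, 4.9 GiB) copied, 10 s, 524 MB/s
# and prints the rate in Mbps (10^6 bit/s) from the byte count and duration.
dd_summary_to_mbps() {
    awk '{ sec = 0
           for (i = 2; i <= NF; i++) if ($i == "s,") sec = $(i - 1)
           if (sec > 0) printf "%.1f\n", $1 * 8 / sec / 1e6 }'
}

# read_rate_mbps FILE COUNT: reads COUNT blocks of 1 MiB from FILE into
# /dev/null and prints the achieved rate in Mbps.
read_rate_mbps() {
    LC_ALL=C dd if="$1" of=/dev/null bs=1M count="$2" 2>&1 |
        tail -n 1 | dd_summary_to_mbps
}

# average: prints the arithmetic mean of the numbers on stdin (one per line).
average() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n }'
}

# measure_scans COUNT FILE...: measures every FILE and prints the mean rate.
measure_scans() {
    local count=$1 f
    shift
    for f in "$@"; do
        read_rate_mbps "$f" "$count"
    done | average
}
```

Calling measure_scans with 5000 as the first argument, followed by the scan files under the FUSE mount point, would mirror the dd invocation shown above. Repeated reads of the same file may be served from the page cache, so each scan should be read only once per run.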
RDMA Testing
(todo: add 'dd' performance in mark6-03 2-module SMA scan)
The pure file-to-Infiniband connectivity performance can be tested with various RDMA-based file transfer utilities. One compact transfer utility is https://github.com/JeffersonLab/hdrdmacp. Building it requires the CentOS package rdma-core-devel.
sudo yum install rdma-core-devel
git clone https://github.com/JeffersonLab/hdrdmacp
cd hdrdmacp; g++ -I . -g -std=c++11 -o hdrdmacp *.cc -libverbs -lz
The server to receive files can be started e.g. on fxmanager, specifying a buffer set of 4 buffers (-n 4) x 4M each (-m 4):
./hdrdmacp -s -n 4 -m 4
Transfer speed from a FUSE-mounted 2 x 8-disk Mark6 module pair can be tested with e.g.
ssh oper@mark6-04
cd ~/jwagner/hdrdmacp/
fuseMk6 -r '/mnt/disks/[12]/*/band1/' /`hostname -s`_fuse/b1/12
vdifuse -a /tmp/label.cache -xm6sg -xrate=125000 -v /mark6-04_fuse/vdifuse_12/ /mnt/disks/[12]/*/band1/
./hdrdmacp -n 4 -m 4 /mark6-04_fuse/b1/12/e18g27_Sw_117-0737.vdif fxmanager:/dev/null
./hdrdmacp -n 4 -m 4 /mark6-04_fuse/vdifuse_12/sequences/e18g27/Sw/117-0737.vdif fxmanager:/dev/null
Performance of RDMA from a local file into a remote (or local) /dev/null:
Client | Server (dest.) | Rate (fuseMk6->rdmacp->dest) | Rate (vdifuse->rdmacp->dest) |
---|---|---|---|
mark6-04 | fxmanager:/dev/null | Transferred 308.29 GB in 198.791 sec (12.4066 Gbps) | Transferred 308.29 GB in 267.357 sec (9.2248 Gbps) |
mark6-04 | mark6-04:/dev/null | Transferred 308.29 GB in 207.402 sec (11.8915 Gbps) | Transferred 308.29 GB in 283.05 sec (8.71336 Gbps) |
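To compare the two FUSE layers, the rates in the table can be recomputed from the reported size and duration, and the vdifuse rate expressed as a fraction of the fuseMk6 rate. The helpers below are an illustrative sketch (the function names are not part of hdrdmacp), and they assume that "GB" in the summary line means 10^9 bytes, which reproduces the reported Gbps figures.

```shell
# rdma_gbps: reads hdrdmacp summary lines of the form
#   Transferred 308.29 GB in 198.791 sec (12.4066 Gbps)
# on stdin and recomputes the rate from the size and duration fields.
rdma_gbps() {
    awk '$1 == "Transferred" && $3 == "GB" && $6 == "sec" {
             printf "%.4f\n", $2 * 8 / $5
         }'
}

# percent_of A B: prints A as a percentage of B with one decimal place.
percent_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}
```

With the numbers from the table, percent_of 9.2248 12.4066 gives 74.4 and percent_of 8.71336 11.8915 gives 73.3, i.e. in these two runs the vdifuse path reached roughly three quarters of the fuseMk6 rate.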
Test: Swapping modules
To determine whether the different playback speeds are due to differences between the Mark6 units or tied to the data recorded on the modules, two sets of modules (PV, AZ) were swapped:
machine | speed (averaged) |
---|---|
mark6-02 | AZ: 1272 Mbps |
mark6-03 | PV: 3669 Mbps |
Playback performance seems to be tied to the data on the module. Need to repeat the playback speed measurements with recently recorded data (e.g. from the DBBC3 recordings in the lab).
Comparison: Fuse/Gather
Mark6 files were gathered on the fly and piped through dd:
./jwagner/kvnvdiftools/gather-stdout/gather /mnt/disks/[1234]/*/data/bf114a_Lm_142-0628.vdif - | dd of=/dev/null bs=1M count=100000
Results:
90000+10000 records in
90000+10000 records out
99921920768 bytes (100 GB) copied, 43.7195 s, 2.3 GB/s
Gathering yields much higher performance (about 18 Gbps, i.e. 2.3 GB/s x 8 bit/byte) than vdifuse (about 1.4 Gbps).
Using fuseMk6 instead of vdifuse:
fuseMk6 -r "/mnt/disks/[12]/*/data/" /home/oper/ftmp/
Found 258 scans, and 258 entries in JSON
dd if=/home/oper/ftmp/c22gl_Cr_081-0000.vdif of=/dev/null bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 0.480969 s, 2.2 GB/s
dd if=/home/oper/ftmp/w27us_Cr_086-1830.vdif of=/dev/null bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 0.464167 s, 2.3 GB/s
dd if=/home/oper/ftmp/w27us_Cr_086-1821.vdif of=/dev/null bs=1M count=15000
15728640000 bytes (16 GB) copied, 5.56799 s, 2.8 GB/s
iostat
On mark6-01, iostat reports the following:
Device | tps | kB_read/s | kB_wrtn/s | kB_read | kB_wrtn |
---|---|---|---|---|---|
sda | 12.12 | 1767.14 | 0.01 | 486307174 | 4096 |
sdb | 12.12 | 1767.17 | 0.01 | 486315780 | 4096 |
sdc | 12.12 | 1767.22 | 0.01 | 486328745 | 4096 |
sdd | 12.12 | 1767.18 | 0.01 | 486316915 | 4096 |
sde | 12.12 | 1767.13 | 0.01 | 486305259 | 4096 |
sdf | 12.12 | 1767.13 | 0.01 | 486304968 | 4096 |
sdg | 11.94 | 1768.72 | 0.01 | 486741075 | 4096 |
sdh | 12.12 | 1767.09 | 0.01 | 486292692 | 4096 |
sdi | 12.12 | 1767.13 | 0.01 | 486304984 | 4096 |
sdj | 11.93 | 1768.20 | 0.01 | 486597699 | 4096 |
sdk | 11.93 | 1767.90 | 0.01 | 486516615 | 4096 |
sdl | 11.95 | 1770.25 | 0.01 | 487163049 | 4096 |
sdm | 11.93 | 1767.95 | 0.01 | 486530620 | 4096 |
sdn | 11.93 | 1767.94 | 0.01 | 486527604 | 4096 |
sdo | 11.93 | 1767.94 | 0.01 | 486526314 | 4096 |
sdp | 11.93 | 1767.86 | 0.01 | 486506020 | 4096 |
sdr | 0.00 | 0.02 | 0.01 | 6119 | 4096 |
sds | 11.94 | 1767.81 | 0.01 | 486490497 | 4096 |
sdt | 11.94 | 1767.80 | 0.01 | 486487765 | 4096 |
sdu | 11.95 | 1769.34 | 0.01 | 486911815 | 4096 |
sdv | 0.00 | 0.02 | 0.01 | 6117 | 4096 |
sdw | 0.00 | 0.02 | 0.01 | 6121 | 4096 |
sdx | 0.00 | 0.02 | 0.01 | 6119 | 4096 |
sdy | 0.00 | 0.02 | 0.01 | 6290 | 4096 |
sdz | 0.00 | 0.02 | 0.01 | 6116 | 4096 |
sdaa | 12.07 | 1767.16 | 0.01 | 486313319 | 4096 |
sdab | 0.00 | 0.02 | 0.01 | 6119 | 4096 |
sdac | 0.00 | 0.02 | 0.01 | 6117 | 4096 |
sdad | 12.06 | 1767.11 | 0.01 | 486298721 | 4096 |
sdae | 12.06 | 1767.13 | 0.01 | 486304411 | 4096 |
sdaf | 12.06 | 1767.11 | 0.01 | 486300109 | 4096 |
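The I/O performance of some disks is much lower than expected. The slow devices, i.e. those with a read rate of 0.02 kB/s in the iostat output above, are sdr, sdv, sdw, sdx, sdy, sdz, sdab and sdac, all of them in modules 2 and 3. The following mount logic applies:
Module 1: g j k l m n o p
Module 2: y aa ab ac ad ae af ag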
Module 3: r s t u v w x z
Module 4: a b c d e f g h i
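Picking the slow devices out of the iostat output by eye is error-prone with this many disks. A small filter along the following lines could do it automatically. It is a sketch that assumes the column layout shown above (device name in column 1, kB_read/s in column 3), so newer sysstat versions with additional columns may need a different column index. The function name and threshold are arbitrary choices.

```shell
# slow_devices THRESHOLD
# Reads `iostat -d -k` style lines on stdin and prints the names of sd*
# devices whose kB_read/s (third column) is below THRESHOLD.
slow_devices() {
    awk -v limit="$1" '$1 ~ /^sd[a-z]+$/ && $3 + 0 < limit + 0 { print $1 }'
}
```

A possible invocation is iostat -d -k | slow_devices 100. Note that the first report printed by iostat contains averages since boot; giving iostat an interval and a count (for example iostat -d -k 5 2) and looking at the second report shows the rates during an ongoing playback instead.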
Repeat speed measurements on Mark6 lab machines
Mark6 machines in the correlator cluster have a Red Hat-based OS installation. To check whether the differences in playback speed reported by Haystack and measured in Bonn are due to OS-specific differences, the speed tests were repeated on Mark6 machines running the original Debian installation.
Results: Playback speed < 1 Gbps
So the OS does not seem to be the reason for the slow playback speeds.
General IO tuning
Take a look at: http://cromwell-intl.com/linux/perfo...ing/disks.html
The I/O scheduler should probably be set to noop on all Mark6 machines.
Tested setting the I/O scheduler to noop on mark6-05: no measurable difference in read performance.
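For reference, the I/O scheduler is selected per block device through /sys/block/<device>/queue/scheduler, where the active entry is shown in square brackets (for example "noop deadline [cfq]"). The helpers below are a sketch of how the setting could be inspected and changed on all sd* devices; the function names are illustrative, and on kernels that use the multi-queue block layer the equivalent of noop is called none.

```shell
# active_scheduler: reads the contents of a queue/scheduler file on stdin,
# e.g. "noop deadline [cfq]", and prints the bracketed (active) entry.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# show_schedulers: prints "<device> <active scheduler>" for every sd* device.
show_schedulers() {
    local f dev
    for f in /sys/block/sd*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#/sys/block/}
        dev=${dev%%/*}
        printf '%s %s\n' "$dev" "$(active_scheduler < "$f")"
    done
}

# set_scheduler NAME: selects scheduler NAME on every sd* device.
# Must be run as root; the setting does not persist across reboots.
set_scheduler() {
    local f
    for f in /sys/block/sd*/queue/scheduler; do
        if [ -w "$f" ]; then
            echo "$1" > "$f"
        fi
    done
}
```

Running show_schedulers before and after set_scheduler noop confirms whether the change took effect on every module disk.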
Hyperthreading
Repeated the tests with Hyperthreading enabled and disabled. No significant difference in results.
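When repeating such a comparison, it helps to record whether Hyperthreading was actually active during each run. One way, sketched here with a hypothetical helper name, is to read the "Thread(s) per core" line from lscpu; a value of 2 or more indicates that Hyperthreading is enabled.

```shell
# threads_per_core: reads `lscpu` output on stdin and prints the value of the
# "Thread(s) per core" line.
threads_per_core() {
    awk -F: '/^Thread\(s\) per core/ { gsub(/[ \t]/, "", $2); print $2 }'
}
```

A possible invocation is LC_ALL=C lscpu | threads_per_core, where LC_ALL=C ensures the English field name is printed.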