Typical Pure backup and restore performance  

tkaiser
(@tkaiser)
Active Member
Joined: 4 months ago
Posts: 5
06/08/2019 1:44 pm  

Hi,

We recently switched from the Pure appliance (VM) to Pure on a bare-metal installation (16-core Xeon with 10GbE). After resolving some network problems, we're now up and running with full 10GbE connectivity between our Pure server and all 3 ESXi servers. According to iperf3, without any tuning at all, we're above 7 Gbit/sec, which is fine [1].

Nevertheless, individual backup runs do not exceed 27 MB/sec (we already took initialization overhead into account). With multiple backups running in parallel we have achieved a total of 146 MB/s (8 parallel backups), so the local storage on the Pure server doesn't seem to be the bottleneck, but something is limiting the throughput of each individual backup.
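
To put these numbers side by side, here is a rough back-of-the-envelope calculation in Python. It only uses figures from this post (27 MB/s for a single backup, 146 MB/s for 8 parallel backups, 7.19 Gbit/s from the iperf3 summary in [1]). The decimal conversion (1 Gbit/s = 125 MB/s) and the function names are assumptions for illustration; the backup tool may count megabytes differently.

```python
# Assumption: decimal units, i.e. 1 Gbit/s = 1000 Mbit/s = 125 MB/s.
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a rate in Gbit/s to MB/s."""
    return gbit_per_s * 1000 / 8


def link_share(throughput_mb_s: float, link_gbit_s: float) -> float:
    """Fraction of the measured link rate that a backup actually uses."""
    return throughput_mb_s / gbit_to_mbyte_per_s(link_gbit_s)


# Figures taken from this post.
SINGLE_BACKUP_MB_S = 27.0    # best single backup run
PARALLEL_TOTAL_MB_S = 146.0  # total of 8 parallel backups
PARALLEL_STREAMS = 8
IPERF_GBIT_S = 7.19          # iperf3 summary, ESXi -> Pure server

if __name__ == "__main__":
    link = gbit_to_mbyte_per_s(IPERF_GBIT_S)
    per_stream = PARALLEL_TOTAL_MB_S / PARALLEL_STREAMS
    print(f"link rate measured by iperf3: {link:.2f} MB/s")
    print(f"average per parallel backup:  {per_stream:.2f} MB/s")
    print(f"single backup share of link:  {link_share(SINGLE_BACKUP_MB_S, IPERF_GBIT_S):.1%}")
    print(f"8 parallel backups share:     {link_share(PARALLEL_TOTAL_MB_S, IPERF_GBIT_S):.1%}")
```

Under these assumptions a single backup uses roughly 3% of what iperf3 measured, and the 8 parallel backups together about 16%, so raw network bandwidth does not look like the limiting factor.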

We're also seeing tons of these errors in rhttpproxy.log on all affected ESXi servers:

SSL Handshake failed for stream <SSL(<io_obj p:0x033f05e8, h:13, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:43451'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)

Is this expected, or do we need to search for SSL anomalies on our side?

What backup performance do others achieve?

Thanks in advance,

Thomas

 

[1] Quick iperf3 test with 'iperf3 -s' running on the Pure server:

[root@dell-2:~] cd /usr/lib/vmware/vsan/bin
[root@dell-2:/usr/lib/vmware/vsan/bin] ./iperf3 -c 192.168.21.104 -V
iperf 3.1.6
VMkernel dell-2.a-o.intern 6.5.0 #1 SMP Release build-13932383 Jun 7 2019 21:08:10 x86_64
Control connection MSS 1448
Time: Tue, 06 Aug 2019 09:42:29 GMT
Connecting to host 192.168.21.104, port 5201
Cookie: dell-2.a-o.intern.1565084549.494779.
TCP MSS: 1448 (default)
[ 4] local 192.168.21.160 port 43802 connected to 192.168.21.104 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test
iperf3: getsockopt - Function not implemented
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 902 MBytes 7.57 Gbits/sec 8626536 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 1.00-2.00 sec 928 MBytes 7.79 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 2.00-3.00 sec 872 MBytes 7.32 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 3.00-4.00 sec 943 MBytes 7.90 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 4.00-5.00 sec 808 MBytes 6.78 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 5.00-6.00 sec 930 MBytes 7.81 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 6.00-7.00 sec 885 MBytes 7.42 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 7.00-8.00 sec 716 MBytes 6.01 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 8.00-9.00 sec 706 MBytes 5.92 Gbits/sec 0 0.00 Bytes
iperf3: getsockopt - Function not implemented
[ 4] 9.00-10.00 sec 880 MBytes 7.38 Gbits/sec 4286340760 0.00 Bytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 8.37 GBytes 7.19 Gbits/sec 0 sender
[ 4] 0.00-10.00 sec 8.37 GBytes 7.19 Gbits/sec receiver
CPU Utilization: local/sender 72.2% (72.2%u removed link ), remote/receiver 2.3% (0.2%u removed link )
snd_tcp_congestion newreno

iperf Done.
[root@dell-2:/usr/lib/vmware/vsan/bin] ./iperf3 -R -c 192.168.21.104 -V
iperf 3.1.6
VMkernel dell-2.a-o.intern 6.5.0 #1 SMP Release build-13932383 Jun 7 2019 21:08:10 x86_64
Control connection MSS 1448
Time: Tue, 06 Aug 2019 09:42:44 GMT
Connecting to host 192.168.21.104, port 5201
Reverse mode, remote host 192.168.21.104 is sending
Cookie: dell-2.a-o.intern.1565084564.934636.
TCP MSS: 1448 (default)
[ 4] local 192.168.21.160 port 35172 connected to 192.168.21.104 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test
iperf3: getsockopt - Function not implemented
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 1.06 GBytes 9.10 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 1.00-2.00 sec 1.08 GBytes 9.31 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 2.00-3.00 sec 1.09 GBytes 9.35 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 3.00-4.00 sec 1.09 GBytes 9.35 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 4.00-5.00 sec 1.09 GBytes 9.34 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 5.00-6.00 sec 1.09 GBytes 9.34 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 6.00-7.00 sec 1.09 GBytes 9.35 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 7.00-8.00 sec 1.09 GBytes 9.32 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 8.00-9.00 sec 1.08 GBytes 9.30 Gbits/sec
iperf3: getsockopt - Function not implemented
[ 4] 9.00-10.00 sec 1.08 GBytes 9.30 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 10.8 GBytes 9.31 Gbits/sec 0 sender
[ 4] 0.00-10.00 sec 10.8 GBytes 9.31 Gbits/sec receiver
CPU Utilization: local/receiver 106.9% (106.9%u removed link ), remote/sender 10.3% (0.3%u removed link )
rcv_tcp_congestion newreno

iperf Done.
This topic was modified 3 months ago by tkaiser

marijan
(@marijan)
Member Admin
Joined: 5 months ago
Posts: 18
06/08/2019 3:32 pm  

Hi Thomas,

Seeing tons of SSL errors is not expected, so I would definitely look into that. From what I managed to find online, it could be that the SSL handshake timeout is simply too short and that the connection keeps being dropped. This old post in the VMware forums reports the very same message that you are receiving, and the solution seems to be increasing the timeouts, so you can check that article: https://communities.vmware.com/thread/463256

If this does not help, I will create a small test utility for you tomorrow to see if we can get better network transfer times by modifying connection settings. For network transfer, Pure uses the official VMware libraries, so there are certain limitations we must accept, but there are various settings we can try to increase performance.


tkaiser
(@tkaiser)
Active Member
Joined: 4 months ago
Posts: 5
06/08/2019 4:17 pm  
Posted by: marijan

This old post in the VMware forums reports the very same message that you are receiving and the solution seems to be increasing the timeouts so you can check that article: https://communities.vmware.com/thread/463256

I gave it a try. The setting still works with 6.5; I tried it first with 120 seconds and then even with 240 seconds:

[root@dell-1:~] grep handshake  removed link  
2019-08-06T13:56:44.237Z info hostd[9502B10] [Originator@6876 sub=Default] Vmacore::InitSSL: handshakeTimeoutUs = 120000000
2019-08-06T14:04:51.827Z info hostd[9D5AB10] [Originator@6876 sub=Default] Vmacore::InitSSL: handshakeTimeoutUs = 240000000
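
As a quick cross-check that the new values were really picked up, the two hostd lines above can be converted back to seconds. This is a minimal sketch: it assumes the `Us` suffix in `handshakeTimeoutUs` means microseconds (which matches the 120 and 240 seconds that were configured), and the function name and regular expression are only illustrative.

```python
import re

# Matches the leading timestamp and the numeric value after "handshakeTimeoutUs =".
PATTERN = re.compile(r"^(\S+)\s.*handshakeTimeoutUs\s*=\s*(\d+)")


def handshake_timeouts(lines):
    """Return (timestamp, timeout in seconds) for every InitSSL log line."""
    result = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # Assumption: the logged value is in microseconds.
            result.append((match.group(1), int(match.group(2)) / 1_000_000))
    return result


# The two log lines quoted above.
SAMPLE = [
    "2019-08-06T13:56:44.237Z info hostd[9502B10] [Originator@6876 sub=Default] Vmacore::InitSSL: handshakeTimeoutUs = 120000000",
    "2019-08-06T14:04:51.827Z info hostd[9D5AB10] [Originator@6876 sub=Default] Vmacore::InitSSL: handshakeTimeoutUs = 240000000",
]

if __name__ == "__main__":
    for timestamp, seconds in handshake_timeouts(SAMPLE):
        print(f"{timestamp}: handshake timeout = {seconds:.0f} s")
```

Read this way, the log confirms that hostd started with a 120 second and later a 240 second handshake timeout.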

Unfortunately, to no avail. I stopped Pure on the dedicated server using stop.sh, waited a minute for the messages on the ESXi server to stop, then started it again using start.sh on the Pure server, only to see 4 occurrences 3 seconds later on the ESXi server:

2019-08-06T14:05:55.692Z warning rhttpproxy[5536B70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x0346f310, h:41, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:38999'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)
2019-08-06T14:05:55.692Z warning rhttpproxy[55F9B70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x051d1bd8, h:42, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:33525'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)
2019-08-06T14:05:55.722Z warning rhttpproxy[4E1DB70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x050049d8, h:41, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:41855'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)
2019-08-06T14:05:55.723Z warning rhttpproxy[53F1B70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x04f48088, h:42, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:45051'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)

I waited another 2 minutes, then started a backup manually, and now I have the very same error messages multiple times per second in rhttpproxy.log. So this is obviously not related to a timeout, or am I missing something?
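
To quantify "multiple times per second" and to see which peer triggers the failures, the rhttpproxy lines can be tallied with a few lines of Python. This is only a sketch: the regular expression is derived from the four lines quoted above, and treating the second TCP endpoint (the one with the high port) as the connecting client is an assumption. 192.168.21.104 is the Pure server address used in the iperf3 test in [1].

```python
import re
from collections import Counter

# Pattern derived from the quoted log lines; other rhttpproxy formats may differ.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+Z\s+warning\s+rhttpproxy\[[^\]]+\].*"
    r"SSL Handshake failed for stream .*?"
    r"<TCP '(?P<local>[^']+)'>, <TCP '(?P<peer>[^']+)'>"
)


def summarize(lines):
    """Count handshake failures per second and per peer IP address."""
    per_second = Counter()
    per_peer = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        per_second[match.group("ts")] += 1
        # Assumption: the second endpoint is the remote client; drop its port.
        per_peer[match.group("peer").rsplit(":", 1)[0]] += 1
    return per_second, per_peer


# The four warnings quoted above.
SAMPLE = [
    "2019-08-06T14:05:55.692Z warning rhttpproxy[5536B70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x0346f310, h:41, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:38999'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)",
    "2019-08-06T14:05:55.692Z warning rhttpproxy[55F9B70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x051d1bd8, h:42, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:33525'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)",
    "2019-08-06T14:05:55.722Z warning rhttpproxy[4E1DB70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x050049d8, h:41, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:41855'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)",
    "2019-08-06T14:05:55.723Z warning rhttpproxy[53F1B70] [Originator@6876 sub=Default] SSL Handshake failed for stream <SSL(<io_obj p:0x04f48088, h:42, <TCP '192.168.21.130:443'>, <TCP '192.168.21.104:45051'>>)>: N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read)",
]

if __name__ == "__main__":
    per_second, per_peer = summarize(SAMPLE)
    for second, count in sorted(per_second.items()):
        print(f"{second}: {count} failed handshakes")
    for peer, count in per_peer.most_common():
        print(f"{peer}: {count} failed handshakes")
```

For the four sample lines, all failures fall into the same second and all come from 192.168.21.104. Running the same tally over the full log during a backup would show whether the failure rate follows the backup activity.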

