Improving multipath routing of TCP flows by network exploration
2019, IEEE Access
https://doi.org/10.1109/ACCESS.2019.2893412
Abstract
Ethernet switched networks are widely used in enterprise and data center networks. However, they have some drawbacks, chiefly that, because they must prevent loops, they cannot exploit multipath topologies to balance traffic. Several multipath routing proposals use link-state protocols and Equal Cost Multi-Path routing (ECMP) to distribute the load over multiple paths, but these proposals are complex and prone to flow collisions that may degrade performance. This paper studies TCP-Path, a protocol that takes a different approach: it uses a distributed network exploration mechanism, based on broadcasting the TCP-SYN packet, to identify and select the fastest available path to the destination host on the fly. Our evaluation shows that it improves on ECMP by up to 70% in throughput for elephant flows and by up to 60% in flow completion time for mouse flows. Network exploration thus offers a better yet simple alternative to ECMP-based solutions for multipath topologies. We also study TCP-Path for elephant flows (TFE), which restricts TCP-Path to elephant flows in order to reduce the exploration broadcast overhead and the size of the forwarding tables, thereby improving scalability. Although elephant flows represent a small fraction (about 5%) of total flows, they have a major impact on overall performance, as our evaluation shows. TFE reduces both the overhead incurred during path setup and the size of the forwarding tables by a factor of almost 20. Moreover, it achieves results close to those of TCP-Path for elephant flows, especially under high load, and yields significant improvements for all types of flow at medium and high load levels.
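The exploration idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a toy event-driven model in which the SYN is flooded over every link, each node keeps only the first copy it receives, and later copies are dropped. The function name `explore`, the `links` dictionary, and the fixed per-link delays are all hypothetical; in a real network the delay of each copy would come from actual propagation and queuing rather than from a table.

```python
import heapq
from itertools import count


def explore(links, src, dst):
    """Flood a SYN from src and return the path taken by the first copy
    to reach dst, or None if dst is unreachable.

    links: dict mapping (node_a, node_b) -> delay (assumed, illustrative).
    """
    # Build an undirected adjacency map: node -> list of (neighbor, delay).
    adj = {}
    for (a, b), delay in links.items():
        adj.setdefault(a, []).append((b, delay))
        adj.setdefault(b, []).append((a, delay))

    came_from = {}   # node -> neighbor that delivered the first SYN copy
    seq = count()    # tie-breaker so heap entries never compare node names
    events = [(0.0, next(seq), src, None)]  # (arrival time, seq, node, previous)

    while events:
        t, _, node, prev = heapq.heappop(events)
        if node in came_from:
            continue  # a faster copy already arrived here; drop the duplicate
        came_from[node] = prev
        if node == dst:
            break
        # Re-broadcast the SYN to every neighbor that has not seen it yet.
        for nbr, delay in adj.get(node, []):
            if nbr not in came_from:
                heapq.heappush(events, (t + delay, next(seq), nbr, node))

    if dst not in came_from:
        return None

    # Walk back from dst along the "first copy came from" pointers.
    path = [dst]
    while path[-1] != src:
        path.append(came_from[path[-1]])
    return path[::-1]


# Hypothetical topology: the direct s1-s2 link is slow (for example, congested).
links = {
    ("h1", "s1"): 1, ("s1", "s2"): 5, ("s1", "s3"): 1,
    ("s3", "s2"): 1, ("s2", "h2"): 1,
}
print(explore(links, "h1", "h2"))  # ['h1', 's1', 's3', 's2', 'h2']
```

In this toy topology the copy that detours through `s3` arrives first, so the selected path avoids the slow direct link even though it has more hops. This mirrors the "fastest available path" selection the abstract describes, under the stated assumptions.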