Pick Your TOR
March 8, 2011
Top of Rack (TOR) switches are becoming the mainstream networking gear in data centers. In the old days, TOR switches lacked features and speed, and could only handle simple tasks. With the rapid advance of off-the-shelf chip technology, many TOR switches now have more features and better performance than legacy chassis switches.
This change has obviously caught the attention of the big vendors. For example, Cisco has released the Nexus 5000 and 7000 as its TOR product lines. Juniper has released the EX2400, EX2500, and recently the QFX3500. These TOR switches are more or less similar to each other: all of them offer high performance and rich protocol support.
Since TORs are more or less the same, how do Cisco and Juniper protect their margins? Well, the incumbents have all started packaging these TOR switches into proprietary “data center architectures”. Cisco has created Data Center 3.0 with Unified Fabric, while Juniper has created QFabric.
So, why TOR? Well, it is low cost, easy to deploy, and, most importantly, easy to scale in a fabric architecture. There are many exciting new innovations that use TORs to build a fabric, such as OpenFlow, Pica8 Xorplus, and TRILL. These open efforts, whether open source or open standard, will bring true scalability to data centers.
With all that said, how do we compare different TORs?
- Bandwidth – This is the key performance factor. Not every TOR is non-blocking, meaning not every switch can forward traffic at full line rate on all ports at the same time.
- Packet buffer – Some applications, such as storage, require deep packet buffers to handle the jitter of the network. It is good to have a bigger buffer on your TOR. However, packet buffers are really expensive, so there is always a trade-off between the cost and the size of the buffer.
- Protocol support – Depending on the application, users might need different protocols. As more features and protocols are added to a TOR, its management complexity and cost also increase. It is NOT a good idea to assume that one type of switch can perform all the functions in your data center. It is also a bad idea to rely on proprietary protocols to scale your data center.
- Latency – Some applications, such as financial transactions, are sensitive to latency. A uni-speed switch (such as the 48x10GE Pronto 3780) can operate in cut-through mode, with latency in the hundred-nanosecond range. A multi-speed switch (such as the 48GE+4x10GE Pronto 3290), however, must operate in store-and-forward mode, because a frame has to be received in full before it can be sent out on a port of a different speed. Its latency is therefore in the microsecond range.
- Management – It is increasingly important for a TOR to support various management interfaces. In the old days, users only managed switches through the CLI, SNMP, or a web interface. There will be more ways to manage TORs, such as NetFlow or IPMI, especially in a scalable data center. Having an extensible and open management interface is important.
- Cost – The cost of a TOR can vary widely. Even though most TORs use the same (or similar) chips from off-the-shelf vendors (such as Broadcom, Marvell, and Fulcrum), their prices can differ by a factor of 10. Why? Name-brand TORs run outdated embedded systems as their OS instead of a modern OS like Linux, which adds a lot of unnecessary software cost. Another reason is that some TORs come with a lot of high-cost software you don’t use or need in your data center.
- Does it physically fit into your data center – Make sure the TOR fits into the racks of your data center. Do you plan to install the switch backward at the top of your rack? If so, you need a different airflow direction than typical switches provide. Do you need redundant power? Do you need all data ports on the same side? All these factors might influence your choice of TOR.
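To put rough numbers on the bandwidth and latency points above, here is a minimal Python sketch. The function names are made up for illustration, and it makes two simplifying assumptions: that the ports of a TOR can be split cleanly into server-facing downlinks and uplinks, and that the dominant cost of store-and-forward is the time needed to receive the whole frame before forwarding it. Real switches add lookup and queuing time on top, so treat the results as lower bounds rather than vendor figures.

```python
def oversubscription_ratio(downlink_gbps, uplink_gbps):
    """Total server-facing bandwidth divided by total uplink bandwidth.

    A ratio above 1.0 means the downlinks can offer more traffic than
    the uplinks can carry (hypothetical helper, not a vendor metric).
    """
    if uplink_gbps <= 0:
        raise ValueError("uplink bandwidth must be positive")
    return downlink_gbps / uplink_gbps


def serialization_delay_us(frame_bytes, link_gbps):
    """Microseconds needed to receive one full frame on a link.

    Store-and-forward must wait at least this long before the frame
    can leave; cut-through can start forwarding much earlier.
    """
    if link_gbps <= 0:
        raise ValueError("link speed must be positive")
    bits = frame_bytes * 8
    # 1 Gbps moves 1000 bits per microsecond.
    return bits / (link_gbps * 1000)


if __name__ == "__main__":
    # A 48GE + 4x10GE layout: 48 Gbps of downlinks, 40 Gbps of uplinks.
    print(oversubscription_ratio(48 * 1, 4 * 10))
    # A 1500-byte frame arriving on a GE port versus a 10GE port.
    print(serialization_delay_us(1500, 1))
    print(serialization_delay_us(1500, 10))
```

Under these assumptions, a 48GE+4x10GE layout is oversubscribed 1.2:1 toward the uplinks, and a 1500-byte frame takes 12 microseconds just to arrive on a GE port. That is why a store-and-forward switch with GE ports cannot get below the microsecond range, no matter how fast the chip is.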