Public cloud architecture and its performance impact on enterprises

Author at TechGenyz Cloud

Most enterprises making cloud choices focus on three main things: global data center presence, services offered, and pricing.

However, what has been missing, and unavailable to enterprises, is unbiased data and independent views on two crucial things:

  • Public cloud performance.
  • Underlying architecture of public clouds.

These are crucial to both the end-user experience and back-end application architecture. Public cloud providers like to keep these details private, so enterprises often end up making decisions based on cognitive biases and gut feelings.

Let me try to give you a comparative look at the most significant cloud architectural differences and similarities between these five cloud providers – AWS, Microsoft Azure, Google Cloud, Alibaba, and IBM.

No two public cloud platforms are built alike. Cloud connectivity architecture determines how users around the globe access computing resources deployed in the public cloud, which makes it a crucial factor in the platform’s performance. How the cloud is architected directly impacts the end-user experience.

Architectural and connectivity differences between the five cloud providers result in varying levels of Internet exposure. As we all know – the Internet is a best-effort network that is vulnerable to security threats, DDoS attacks, congestion, and infrastructure outages – so relying on the Internet increases unpredictability in performance, creates risk for cloud strategy, and raises operational complexity.

Bidirectional traces and analysis show some significant contrasts in cloud connectivity architectures between AWS, Azure, GCP, IBM Cloud, and Alibaba Cloud, primarily around the level of Internet exposure in the end-to-end network paths.

AWS and Alibaba love the Internet! Traffic destined for AWS and Alibaba Cloud regions (data centers) enters their respective backbones closest to the target region. This is a marked difference from how GCP and Azure handle incoming traffic: traffic enters the GCP and Azure backbones closest to the end user, regardless of the destination region. IBM takes a hybrid approach to cloud connectivity, with some regions relying purely on the IBM backbone and others relying primarily on Internet connectivity to transport user traffic to its hosting regions.
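To make the difference concrete, here is a minimal sketch that models where traffic enters a provider backbone and how much of the path is left on the public Internet. The `Path` class, the `build_path` helper, and the 50 km "edge" distance are all hypothetical, chosen only for illustration; they are not measurements from any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One user-to-region path, split into Internet and backbone distance."""
    internet_km: float
    backbone_km: float

    @property
    def internet_exposure(self) -> float:
        """Fraction of the path that rides the public Internet (0.0 to 1.0)."""
        total = self.internet_km + self.backbone_km
        return self.internet_km / total if total else 0.0


def build_path(total_km: float, ingress: str, edge_km: float = 50.0) -> Path:
    """Split a path according to where traffic enters the provider backbone.

    ingress="near_user":   traffic joins the backbone close to the end user
                           (the pattern described for GCP and Azure).
    ingress="near_region": traffic stays on the Internet until it is close to
                           the target region (the pattern described for AWS
                           and Alibaba Cloud).
    edge_km is a hypothetical distance between the ingress point and the
    endpoint it sits next to.
    """
    if ingress not in ("near_user", "near_region"):
        raise ValueError(f"unknown ingress policy: {ingress!r}")
    edge = min(edge_km, total_km)
    if ingress == "near_user":
        return Path(internet_km=edge, backbone_km=total_km - edge)
    return Path(internet_km=total_km - edge, backbone_km=edge)


if __name__ == "__main__":
    for policy in ("near_user", "near_region"):
        path = build_path(total_km=7000.0, ingress=policy)
        print(f"{policy}: {path.internet_exposure:.0%} of the path on the Internet")
```

The same physical distance yields very different Internet exposure depending only on the ingress policy, which is the architectural contrast at the heart of this comparison.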

The resulting exposure to the Internet subjects deployments on Internet-reliant providers to greater operational challenges and risks, especially in regions with less stable Internet performance, such as Asia.

Although relying on the provider’s backbone typically results in lower latency and thus better performance, the absence of a direct path through the backbone sometimes results in circuitous routing and higher latency.

Google Cloud still has some significant global gaps in its underlying network architecture. Notably, traffic from Europe and Africa takes 2.5-3x longer to get to India because it is routed through the GCP backbone in the US first. It appears that GCP has no direct route (or only limited capacity) from Europe to Asia, which makes GCP slower from Europe to India.
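A rough back-of-the-envelope sketch shows why a detour through the US is so costly. It compares great-circle distances only and assumes light travels about 200 km per millisecond in optical fiber. The city coordinates and the US waypoint (Ashburn, Virginia) are assumptions made for illustration; real cable routes are longer and the actual GCP path is not documented here.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # light in optical fiber covers roughly 200 km per millisecond


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def one_way_ms(*waypoints):
    """Idealized one-way propagation delay along consecutive waypoints."""
    km = sum(great_circle_km(p, q) for p, q in zip(waypoints, waypoints[1:]))
    return km / FIBER_KM_PER_MS


LONDON = (51.5074, -0.1278)
MUMBAI = (19.0760, 72.8777)
US_EAST = (39.0438, -77.4874)  # hypothetical waypoint: Ashburn, Virginia

direct = one_way_ms(LONDON, MUMBAI)
detour = one_way_ms(LONDON, US_EAST, MUMBAI)
print(f"direct ~{direct:.0f} ms, via US ~{detour:.0f} ms, ratio {detour / direct:.1f}x")
```

Even with this idealized geometry, the detour ratio lands in the same 2.5-3x range as the observed slowdown, which suggests that path length alone can account for most of the gap.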

The most likely reason Amazon uses the Internet so extensively while other players use their own backbones has to do with how these providers evolved. Google and Microsoft have technical expertise in building and managing large subsea backbone networks. AWS, the leader in public cloud offerings, has focused more on rapid delivery to market and less on building out a massive network. However, given its increasing profitability, it is highly likely that its architecture will change soon.

Enterprises considering a move to the public cloud should consider connectivity architectures to evaluate their appetite for risk while striking a balance with features and functionality. Enterprises should also be aware that even though public cloud backbones are each maintained by a single vendor, they are still multi-tenant service infrastructures that typically don’t offer SLAs.

Furthermore, public cloud connectivity architectures continuously evolve and can be subject to abrupt changes at the discretion of the provider. While all public cloud providers rely on the public Internet to a certain extent, their level of dependence on the Internet varies greatly, and this can have downstream impacts on the enterprises they serve. Simply put, the less time traffic spends riding the public Internet, the more reliable and stable an experience enterprises can expect.

Alibaba and AWS take an Internet-intensive approach, and hence their performance is less predictable.

GCP takes a backbone-friendly approach, so it generally provides stable and predictable performance across the globe. However, its circuitous subsea backbone route between Europe and Asia results in much higher network latency, which affects European users connecting to workloads in GCP’s hosting region in Mumbai, India.

Azure also takes a backbone-friendly approach when connecting European users with workloads hosted in Asian regions, and it has the most direct subsea routes from Europe to Asia. This results in much lower end-to-end latency for users in Europe accessing workloads hosted in Microsoft Azure’s Mumbai region.

However, if you are serving customers in China from a Singapore hosting region, Alibaba Cloud shows the best latency while IBM is almost 3x slower in the same geography.

The Great Firewall imposes a performance toll on all cloud provider traffic entering and exiting China. Traffic to and from China, irrespective of which cloud hosting region it is destined for or originating from, is subject to high packet loss. In contrast, traffic that stays within China does not experience this packet loss.

Enterprises hesitant to choose a China-based hosting region with a cloud provider have other viable options that offer reasonable latency. Data-driven decisions enable enterprises to pick the optimal cloud provider and hosting regions to serve customers in China. Singapore and Hong Kong are two viable hosting regions with optimal network latency from China. Alibaba Cloud has the best network latency between China and Hong Kong, outperforming both Azure and IBM.

Enterprises expanding their global presence in the Asia-Pacific market are challenged by varying and unpredictable performance. Sitting between Chinese citizens and the global Internet is the Great Firewall of China, a sophisticated content filtering machine. Employing a multitude of censorship tools, such as IP blocking, DNS tampering and hijacking, deep packet inspection, and keyword filtering, the Great Firewall is designed to ensure that online content aligns with the government party line.

Privacy and ethics concerns aside, one of the drawbacks of this system is a vast reduction in performance. For users connecting from China, Alibaba Cloud performs best in both Singapore and Hong Kong in terms of network latency, predictability, and packet loss, but not in India. If, for some reason, Alibaba Cloud is not your first choice, Azure performs equally well across all three hosting regions without compromising speed, predictability, or packet loss.
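The three criteria used here (latency, predictability, and packet loss) are easy to compute from your own probe results. The following is a minimal sketch, assuming each probe yields a round-trip time in milliseconds or `None` when the probe was lost. The function names, the use of standard deviation as the predictability measure, and the ranking order (loss first, then median latency, then spread) are illustrative choices, not the methodology behind the comparison above.

```python
import statistics
from typing import Dict, List, Optional, Sequence


def summarize(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize probe results; None marks a lost probe."""
    sent = len(rtts_ms)
    if sent == 0:
        raise ValueError("no probes to summarize")
    received = [r for r in rtts_ms if r is not None]
    return {
        # Typical latency: the median is robust to occasional spikes.
        "median_ms": statistics.median(received) if received else None,
        # Predictability: a smaller spread means more consistent latency.
        "spread_ms": statistics.pstdev(received) if received else None,
        # Packet loss as a percentage of probes sent.
        "loss_pct": 100.0 * (sent - len(received)) / sent,
    }


def rank(results: Dict[str, Sequence[Optional[float]]]) -> List[str]:
    """Order candidates by packet loss, then median latency, then spread."""
    summaries = {name: summarize(samples) for name, samples in results.items()}

    def key(name: str):
        s = summaries[name]
        worst = float("inf")  # candidates with no replies sort last
        return (
            s["loss_pct"],
            s["median_ms"] if s["median_ms"] is not None else worst,
            s["spread_ms"] if s["spread_ms"] is not None else worst,
        )

    return sorted(summaries, key=key)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    samples = {
        "provider_a": [40.0, 42.0, 41.0, None],
        "provider_b": [35.0, 36.0, 35.0, 36.0],
    }
    for name in rank(samples):
        print(name, summarize(samples[name]))
```

Feeding in samples collected from the user locations you actually care about replaces gut feeling with a repeatable comparison.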

Key takeaways 

All public clouds are built differently and take different approaches to connecting end users with hosting regions (workloads): backbone-friendly (GCP, Azure), Internet-centric (AWS, Alibaba), and hybrid (IBM).

Cloud routing architectures, and hence routing preferences, continue to vary. AWS relies heavily on the Internet and therefore has lower performance stability in Asia. GCP has limited connectivity from Europe to Asia and hence is 2.5-3x slower from Europe to India. Alibaba Cloud has the best network latency and performance between China and Hong Kong or Singapore.

The Great Firewall of China imposes a performance toll on all cloud provider traffic entering and exiting China.

Some advice 

Enterprises should use a data-driven approach to avoid cognitive biases before making cloud investment decisions.

Enterprises should consider the organization’s tolerance for Internet exposure and evaluate the risk.

Trust but verify, and avoid assumptions in the cloud.
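One simple way to verify for yourself is to time TCP connections from your own user locations to an endpoint you have deployed in each candidate region. The sketch below uses only the Python standard library; the function names and the example address are hypothetical, and TCP connect time is only a rough proxy for one network round trip (it also includes DNS resolution if you pass a hostname instead of an IP address).

```python
import socket
import time
from typing import List, Optional


def tcp_connect_ms(host: str, port: int, timeout: float = 3.0) -> Optional[float]:
    """Return the time in ms to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Timeouts, refusals, and unreachable hosts all count as a lost probe.
        return None


def sample_connect_ms(host: str, port: int, count: int = 10,
                      pause_s: float = 0.2) -> List[Optional[float]]:
    """Collect several connect-time samples, pausing briefly between probes."""
    samples = []
    for _ in range(count):
        samples.append(tcp_connect_ms(host, port))
        time.sleep(pause_s)
    return samples


if __name__ == "__main__":
    # Hypothetical target: replace with an endpoint you run in the region under test.
    print(sample_connect_ms("192.0.2.10", 443, count=3))
```

Running the same probe from several user locations, at different times of day, gives a far better basis for a decision than any provider's marketing material.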
