Distributed Hybrid MultiCloud Mesh with VNS3 and LNKe

Distributed Hybrid MultiCloud Mesh with VNS3 and LNKe

As cloud adoption continues to ramp up in 2022, with Gartner projecting another 21.7% growth in cloud spend this year, companies are maturing beyond their initial workload migrations to single cloud vendors. Whether to create resiliency due to the now not so uncommon major outages we have seen in the past few years, to tailor their many application environments to changing business requirements, or to migrate to new cloud vendors whose offering is the best fit. However, in order to realize these opportunities, companies need a consistent network layer that is uncoupled from any one cloud vendors specific dependancies. No matter which cloud you choose, achieving this goal requires utilizing third party network solutions. Such a solution should ideally facilitate connectivity to data-centers, remote users, and IOT devices as well.

Cohesive Networks VNS3 cloud edge security controllers can create the backbone across all of your public cloud vendors in an easy to manage and secure mesh, with LNKe connecting all of your virtual private networks. This gives you a fully transitive network across all of your cloud real estate, running at performative speeds with built in failover and self healing mesh capabilities. Granular IPSec cloud edge configurations allow you to connect corporate data centers, partner networks and vendor access, regardless of their hardware. Policy enforcement is consistent across the network and has been simplified for ease of management. With our comprehensive firewall you can easily define people, groups and network objects to allow your remote workforce to securely connect at the edge closest to their physical location. In short, with VNS3 and LNKe, you can create a full network mesh consistent with your needs that can grow to anywhere that you need to be and scale with your deploments.

Please reach out to the Cohesive Networks sales and solutions team at contactme@www.cohesive.net to further the discussion with any interests that you may have. We are always happy to help.

News Roundup: Week of Dec 26, 2021

News Roundup: Week of Dec 26, 2021

Could Continuing AWS Outages Give Rise to Distributed Cloud Deployments?

Widespread disruption of high-use internet services was recently experienced as a result of the third AWS outage in the span of a month. AWS reported this latest disruption was caused by “a power outage at a data center in Northern Virginia” which saw giants like Hulu and Slack offline for about two and a half hours. A recent article from The Washington Post suggests that having a cloud deployment with a singular, critical point of failure creates opportunities for widespread outages, in a world where distributed cloud deployments can offer you some protection from these outages. As “the cloud’s increasing intricacy and demands” continue to increase, and companies continue to migrate and develop in the cloud, the potential for outages caused by the “over-centralization” of infrastructure into heavily-used AWS regions also increases.

Azure App Service Insecurity Exposing Source Code Since 2017

A recently discovered insecurity in the Azure App Service has “exposed the source code of applications written in PHP, Python, Ruby, and Node” and has been prevalent since September 2017. SC Magazine purports that this security flaw was first widely reported to the public by The Wiz on Oct. 7, 2021, and Microsoft has since updated it’s security recommendations document and mitigated the default behavior that caused this issue. Further research suggests that this vulnerability was likely not a well-kept secret and would have been widely exploited during the purported four year window of this vulnerability. We recommend double-checking your deployments against these new recommendations to ensure that your source code isn’t vulnerable.

Security Attacks Likely to Continue to Increase in 2022

2020 and 2021 have been marred by an increase in the commonality and sophistication of security attacks on companies as we all navigate the uncharted waters of remote work, and address the new connectivity and security concerns that have surfaced as a result of this necessary transition. A recent article from Bloomberg law suggest that some of the most damaging attacks have targeted backbone systems and solutions, such as the Microsoft Exchange software attacks that affected many companies in 2021. Alarmingly, many of the “exploits used in the first quarter of 2021 are still being used today” which only serves to create added pressure on both the solutions providers and companies that build critical systems upon such backbones solutions. These attacks are complemented by more ‘traditional’ phishing attacks, “which remains one of the highest-volume types of vulnerabilities” across all business sectors. Having proper security procedures and communication channels in place is more important than ever, and the criticality of such considerations will only increase as we move into 2022.

JEDI Becomes JWCC With Decision Target of Q3 2022

In the wake of four years of legal challenges and congressional inquiries, The JEDI contract has been replaced with a new framework, the Joint Warfighter Cloud Compatibility (JWCC), “from which to deliver commercial cloud services to Defense personnel.” The Pentagon “issued formal solicitations for JWCC” to AWS, Microsoft, Google, and Oracle, effectively leveling the playing field for the biggest US cloud providers. According to Nextgov “The Pentagon plans to make JWCC awards in the third quarter of fiscal 2022” which could bring some interesting infrastructure developments from these cloud providers.
VNS3 LNKe: Creating Cloud-Agnostic Transitive Networks Without a lot of Fuss

VNS3 LNKe: Creating Cloud-Agnostic Transitive Networks Without a lot of Fuss

Cohesive Networks has been helping our customers build robust transit networks on public cloud infrastructure since our early days. Doing so on VNS3 technology gives you secure and observable methods consistent across cloud providers and other virtualization platforms. Up until recently we achieved this by creating site to site IPSec tunnels into our federated mesh backbone. This approach, while robust due to BGP failover capabilities, adds quite a lot of complexity. Each of these connections have unique peering addresses and autonomous system numbers (ASN), as well as peer access lists to configure and manage. Which brings us to our new offering, the VNS3 LNKe controller. LNKe controllers are simple to set up while still providing robust failover capabilities.

The VNS3 LNKe controller is one of Cohesive Networks latest offerings. It’s been designed to provide a low cost, easy to deploy, method of connecting your private cloud networks, regardless of the provider. Let’s take a look at the mechanics of it.

VNS3 can be deployed in a peered mesh topology, where by all of the members of the mesh exchange connection and routing information with all of the other members of the mesh across encrypted peering links. These mesh peers can be situated in any cloud provider and in any region. This is the hub in your typical hub and spoke model. The difference being that VNS3 hub, or mesh, components can exist in many different locations, while still being aware of all of the other components. Extending the hub
simply entails adding new peers. This hub can be as little as one or two VNS3 controllers to many tens of controllers spanning across your cloud vendors regions. Within this mesh you have full visibility and attestability of network flows.

    Now to connect your various networks into the mesh so as to facilitate your transitive network. LNKe is a light weight variant, thats has been designed to work with the encrypted overlay networking capabilities of VNS3. It uses the cryptographic key architecture to create a tunnel from the LNKe controller to the closest mesh controller. This link can be established through a VPC peering link between the connecting VPC or over public IP. You simply have to deploy the LNKe controller into the connecting VPC and push the VNS3 client pack to it. This gives it a unique overlay address that the hub mesh is aware of.

    The LNKe can be configured to have failover hub members that it will connect to should any failure occur. On the hub members that it is configured to connect to we then create route entries for the LNKe’s network. This route is pointed at the overlay IP that has been associated with the LNKe controller. While these are effectively static entries, VNS3 will only ever enable the one that is actively connected to. We call this dynamic static routing.

    On the connected VPC you can set your subnet route of 0.0.0.0/0 to point to the LNKe controller, since LNKe can also serve duty as your NAT gateway. In this way any traffic that is bound for other connected networks will traverse into the hub, where as non transit network traffic can get out as needed.

    This solution gives you a lot of flexibility in managing your network connections. You have full firewall capabilities to restrict and shape traffic. You can transform traffic should you have overlapping CIDRs. You can combine other connections into the mesh such as remote workforces or data center connectivity. You can inject network function virtualization like NIDS and WAF. You end up with a network control plane that works the same across all cloud providers that is cost effective and easy to deploy and mange.

    How to Replace your NAT Gateway with VNS3 NATe: Part II

    How to Replace your NAT Gateway with VNS3 NATe: Part II

    In part I of our NATe post we discussed economic advantages to replacing your AWS NAT Gateways with Cohesive Networks VNS3 NATe devices. We also walked through how to deploy them. In this follow up post I want to dig in a little into the some of the incredibly useful capabilities that VNS3 NATe provides.

    In part I we discussed how to replace your current NAT Gateway. One of the steps in that process was to repoint any VPC route table rules from the NAT gateway to the Elastic Network Interface (ENI) of the Elastic IP (EIP) that we moved to the VNS3 NATe ec2 instance. This was done for consistency so that if you had any rules in place that referenced that IP, you would remain intact. Unfortunately, due to the mechanics of AWS NAT Gateway, you can not reassign its EIP while it is running. So we had to delete it in order to free up the EIP. This part of the operation introduced 15 to 30 seconds of down time. With VNS3 NATe devices you can easily reassign an EIP without the need to delete an instance. Our upgrade path is to launch a new instance and move the EIP to it. Since we set our VPC route table rules to point to the ENI of the EIP, our routes will follow.

    Another powerful feature of VNS3 NATe is the container plugin system. All VNS3 devices have a plugin system based around the docker subsystem. This allows our users to inject any open source, commercial or proprietary software into the network. Coupled with the VNS3 comprehensive firewall section, this becomes a versatile and powerful feature. In the case of a NAT device, there are some serious security concerns that can be addressed by leveraging this plugin system.

    VNS3 Plugin System

    Suricata, the Network Intrusion Detection and Prevention System (IDS/IPS) developed by the Open Information Security Foundation (OISF) for the United States Department of Homeland Security (USDHS), is a powerful addition to a VNS3 NATe device. Modern day hacking is all about exfiltration of data. That data has to exit your network and the device that is in path is your NAT device. Additionally malware will often attempt to download additional software kits from the internet, this traffic also traverses through your NATe device. In both of the these scenarios Suricata has powerful features to analyze your traffic and identify suspicious activityand files. You can find directions for setting up Suricata on VNS3 here:

    https://docs.cohesive.net/docs/network-edge-plugins/nids/

    Additionally, Cohesive Networks has developed a Brightcloud Category Based Web Filtering Plugin. Brightcloud have created a distinct list of categories that all domain names have been divided into. The plugin takes advantage of this categorization in three ways:

    • An allow list – A comma separated list of categories that are allowed. The presence of the allow_list.txt will block all traffic, only allowed categories will be permitted.
    • A deny list – A comma separated list of categories that are denied. The presence of the deny_list.txt will allow all traffic, only the specified categories will be forbidden.
    • An exclusion list – A comma separated list of URLs that are not to be considered by the plugin.

    These are just a few examples of software plugins that can run on VNS3 NATe. Others that will work are Snort, Bro (Zeek), Security Onion. Our Sales and Support teams are always happy to assist you in determining the right approach and assisting you with your implementation.

    How to Replace Your NAT Gateway with VNS3 NATe

    How to Replace Your NAT Gateway with VNS3 NATe

    From Wikipedia:

    “Network address translation (NAT) is a method of mapping an IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device. The technique was originally used to avoid the need to assign a new address to every host when a network was moved, or when the upstream Internet service provider was replaced, but could not route the networks address space. It has become a popular and essential tool in conserving global address space in the face of IPv4 address exhaustion. One Internet-routable IP address of a NAT gateway can be used for an entire private network.”

    Cohesive Networks introduced the NATe offering into our VNS3 lineup of network devices back in March. It lowers operational costs while adding functionality and increasing visibility. Easily deployable and managed, it should be a no brainer once you consider its functional gains and lower spend rate. Some of our large customers have already started the migration and are seeing savings in the tens, hundreds and thousands of thousands of dollars.

    The AWS NAT Gateways provide bare bones functionality at a premium cost. They simply provide a drop in NAT function on a per availability zone basis within your VPC, nothing more.No visibility, no egress controls, and lots of hidden costs. You get charged between $0.045 and $0.093 an hour depending on the region. You get charged the same per gigabyte of data that they ‘process’, meaning data coming in and going out. That’s it, and it can really add up. A VPC with two availability zones will cost you $788.40 a year before data tax in the least expensive regions, going up to double that in the most expensive regions. Now consider that across tens, hundreds or thousands of VPCs. That’s some real money.

    With Cohesive Networks VNS3 NATe you can run the same two availability zones on two t3a.nano instances with 1 year reserved instances as low as $136.66 per a year, with no data tax as ec2 instances do not incur inbound data fees. It is about a sixth to a tenth of the price depending on the region you are running in.

    As a Solutions Architect at Cohesive Networks I’ve worked with enterprise customers around the world and understand the difficulty and challenges to change existing architecture and cloud design. Using cloud vender prescribed architecture is not always easy to replace as there are up and down stream dependancies. The really nice thing about swapping your NAT Gateways with VNS3 NATe devices is that it is really a drop in replacement for a service that is so well defined. It can be programatically accomplished to provide near zero downtime replacement. Then you can start to build upon all the new things that VNS3 NATe gives you.

    The process of replacement is very straight forward:

    1. First you deploy a VNS3 NATe for each availability zone that you have in your VPC in a public subnet.
    2. Configure its security group to allow all traffic from the subnet CIDR ranges of your private subnets.
    3. You do not need to install a key pair.
    4. Once launched turn off source / destination checking under instance networking.
    5. Next you will repoint any VPC route table rules, typically 0.0.0.0/0, from the existing NAT Gateway to the Elastic Network Interface of the Elastic IP that is attached to your NAT Gateway.
    6. Delete the NAT Gateway so as o free up the Elastic IP.
    7. Finally, associate the Elastic IP to your VNS3 NATe instance.

    The only downtime will be the 30 or so seconds that it takes to delete the NAT Gateway.

    One safeguard we always recommend to our customers to set up a Cloud Watch Recovery Alarm on all VNS3 instances. This will protect your AWS instances from any underlining hardware and hypervisor failures. Which will give you effectively the same uptime assurancesas services like NAT Gateway. If the instance “goes away” the alarm will trigger an automatic recovery, including restoring the elastic ip, so that VPC route table rules remain intact as well as state as restoration occurs from run time cache.

    Now you can log into your VNS3 NATe device by going to:

    https://<elastic ip>:8000
    
    usename: vnscubed
    
    password: <ec2 instance-id>
    VNS3 AWS Transitive Routing Deployment

    Head over to the Network Sniffer page from the link on the left had side of the page and set up a trace for your private subnet range to get visibility into your NATe traffic.

    VNS3 Transit Ipsec deployment

    Cloud Costs and the “Economically Defined Architecture”​

    Cloud Costs and the “Economically Defined Architecture”​

    FinOps, Cloud Costs, Repatriation, and most recently the in-depth post by Martin Casado and Sarah Wang from a16z on the cloud/cost paradox are all topics in the news reflecting that we have reached the “Holy Cow, I am spending a ton on cloud!” inflection point.

    As I discussed in previous posts there are key distinctions between running “in the cloud” and “on the cloud”; the ability to tune, tweak and manage costs are some of these distinctions. At the end of the day the Cloud Service Providers are businesses out to maximize revenue, not missionaries proselytizing a new design center. Yes, cloud is the newest, and increasingly dominant design center, but never forget cloud service providers are businesses out to maximize revenue.

    They are experts at presenting to their customers the “economically defined architecture”. Through their documentation, training, certifications, sku-ing, and feature segmentation they guide the cloud customer to acceptable, best-practice recognizable, deployment architectures. In fact, due to the consistency of the aforementioned assets (training, docs, etc..) they have most likely raised the bar in terms of overall deployment quality for many enterprises. All that said, the amalgam of capabilities and the manner deployed ALSO maximize their revenue. This is not surprising.

    When you create a deployment architecture guided by cloud service provider user interface wizards and devops automation template best practices, it is quite possible you are getting a number of constructs your application doesn’t need, despite the fact that some applications might need them. Is this a global scale application and you hope to be the new Netflix? Or is this a supply chain application shared by 12 partners in the midwestern automated sprinkler space?

    The cloud service providers are making specific architectural recommendations, that easily convert to IT deployment decisions, that are integrated to their pricing strategies. It is a level of alignment probably not seen in the previous design cycle where you would hire a consultancy to help with a data center deployment of vendor product XYZ.

    Cohesive has provided cloud connectivity and security, helping customers get to, through and across the clouds since 2008, as such we have seen a lot of both customer migrations and greenfield development. There are an array of capabilities many of these applications do NOT need. To put it another way “not all of them need all of” cloud vendor-provided autoscale, load balancers, direct connects, transit gateways, nat-gateways, BGP, complex service roles/permissions, global network funnels, lambdas, etc.. Yet, due to a combination of the path of least resistance, peer conformity and even “cool kid behavior”, your team might feel reticence to dispense with any of these constructs. Practically speaking, there is a strong chance that declining default practices and recommendations could create personal risk for them. What if there is an issue, and the initial blowback narrative of an outage is “well Bob said we didn’t need a cloud load balancer”? Even if the root cause analysis shows it wasn’t the load balancing or lack thereof that caused issues – does “Bob” ever get truly “cleared of the charges”.

    At the end of the day cloud service providers are extracting billions with a “B” from their customers for what is without argument high quality deployment architectures. But those architectures have some of their own complexity in that they are possibly more complex than what many customers need. Even if you set aside the runtime costs of those components, and some of the inherent complexity, there is another “gotcha”, which is “data taxes”. Increasingly newer offerings have both runtime (price per hour or price per invocation costs for usage) but also added on data taxes on top of the existing cloud egress charges. This is where 3rd party offerings from vendors like Cohesive, (certainly not limited to us) can make a substantial cost difference. Our view on the data taxes is that they are an awful lot of icing on some pretty thin pieces of cake.

    I do want to note that our whole business has been built around cloud, and we ourselves use a lot of it, in fact too much of it from a pure costing point of view. We trade off speed for $$$, and while sometimes hard to quantify, we believe in the payoff. Cloud lets us move fast, and when you do so you leave bits of cost lying about via both neglect and opportunism. For the most part we do this with eyes wide open.

    That to us is the crux of it. Controlling cloud costs requires a clear headed view by IT management of what their options are. What is the cost of conformity to cloud service provider economically defined architecture? What is the cost of non-conformity? If you are trying to save money, increase your non-conformity. If you are trying to reduce a measure of both actual and perceived risk increase your conformity. BUT, when you are trying to figure out how to cut cloud spend by seven figures (or a multiple of that), understand the structure of the tradeoffs you need to make.