Some Top-Down Service Lifecycle Modeling and Orchestration Commentary

Most of you know that I’ve been an advocate of intent-modeled, model-driven networking for almost two decades. This approach would divide a service into functional/deployable elements, each represented by a “black box” intent model responsible for meeting an SLA. The approach has some major advantages, in my view, and also a few complications. Operators generally agree with this view.

I believe that a model-based approach, one aligned with seminal work done by the TMF on what was called “NGOSS Contract”, is the only way to make service lifecycle automation, service creation, and service-to-resource virtualization mapping work. I’ve laid out some of my thoughts on this before, but some operators have told me they’d like a somewhat more top-down view, one aligned with current infrastructure and service realities. This is my attempt to respond.

The first and perhaps paramount factor is the relationship between functional division and “administrative control”. For the sake of discussion, let’s say that an API through which it’s possible to exercise operational control over a set of service elements is an administrative control point. Such a control point would let an operations process create multiple “functions”, which are visible features that can be coerced through the control point. Obviously, the control point could represent a current management system or a new management tool.

My approach to modeling is also based on a division between the logical/functional and the actual/behavioral, meaning that it has a “service domain” and a “resource domain”. The former expresses the relationship between features and services, and the latter expresses how features map to the behavior of actual resources, including both software and hardware. The “bottom” of the service model has to “bind” to the “top” of the resource model in some way, and that binding is the basis for “deployment”. Once deployment has been completed, the fulfillment of the service-level agreement is based on the enforcement of the SLAs passed down the models (service and resource).

The service-level top, which creates the actual overall SLA, has to “decompose” that SLA into subordinate SLAs for its child model elements, and each of them in turn must do the same. Each model element represents a kind of contract to meet its derived SLA. The model element relies on one of two things to enforce the contract—a notice from a child element that it has failed its own SLA (and by implication, no notice means it hasn’t) or, for a resource model, an event from a resource within that shows a fault that must either be remedied or reported.
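
As a rough illustration of the decomposition idea (the element names, the serial-chain assumption, and the even split of the latency budget are mine, for illustration only; a real decomposition policy would be authored per model element):

```python
# Sketch: a parent model element decomposes its SLA into subordinate SLAs,
# one per child element. Splitting the latency budget evenly assumes the
# children form a serial chain; bandwidth must be carried by every child.

from dataclasses import dataclass

@dataclass
class SLA:
    bandwidth_mbps: float   # each child must carry the full service bandwidth
    latency_ms: float       # latency budget is divided among serial children

def decompose(parent: SLA, n_children: int) -> list[SLA]:
    """Derive one subordinate SLA ('contract') per child element."""
    return [SLA(parent.bandwidth_mbps, parent.latency_ms / n_children)
            for _ in range(n_children)]

service_sla = SLA(bandwidth_mbps=20, latency_ms=30)
child_slas = decompose(service_sla, 3)
# Each of the three children now holds a contract for 20 Mbps at 10 ms.
```

Each child element would then either meet its derived contract or report a failure upward, exactly as the text describes.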

The binding between service model and resource model is based on supported resource “behaviors”, which are functionalities or capabilities that the resource administration has committed to support. The resource model for a behavior would then be divided based on the administrative control points through which the behavior was controlled. There might be one such point, or there might be many, depending on just how the collective set of resources was managed.

The reason for the domain division is to loosely couple services to resources. With this approach, it would be possible to create a service model that consumed a behavior set that could then be realized by any combination of resource providers, without modification. Anyone could author “service models” to fit a set of behaviors and anyone would be able to advertise those behaviors and thus support the services. This could be used to focus standards activities on behavior definition.

Another feature of this approach is “functional equivalence”. Any implementation of a behavior could be mapped to a service element that consumed it, which means that you could deploy both features based on network behaviors—router networks—and features created by hosting software instances. In fact, you could even have something based on a totally manual process, so that if actual field provisioning of something was needed, you could reflect that in the way a behavior was implemented.

To return to SLAs, each model element in both domains has the common responsibility to either meet the SLA it commits to, to remedy its performance within the SLA, or report an SLA failure. During service creation, or “deployment/redeployment”, each model element has a responsibility to select a child element that can meet its SLA requirements on deployment, and to re-select if the previously selected element reports a failure. The SLA would necessarily include three things—the service parameters expected, the SLA terms, and the location(s) where the connections to the element were made and where the SLA would be expected to be enforced. “I need a VPN with 20 Mbps capacity, latency x, packet loss y, at locations 1, 2, and 3”. That “contract” would then be offered to child elements or translated into resource control parameters and actions.
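
That three-part contract is easy to visualize in code. Here’s a sketch using the VPN example above; the field names and the acceptance rule are illustrative assumptions, not drawn from any standard schema:

```python
# Sketch of the three-part "contract": service parameters, SLA terms, and the
# locations where connections are made and the SLA is to be enforced.

vpn_contract = {
    "service_params": {"type": "VPN", "capacity_mbps": 20},
    "sla_terms": {"latency_ms": 25, "packet_loss_pct": 0.1},
    "locations": ["site-1", "site-2", "site-3"],
}

def can_fulfill(offer: dict, contract: dict) -> bool:
    """A child element (or resource binding) should accept the contract only
    if it covers all the locations and meets or beats every SLA term."""
    return (set(contract["locations"]) <= set(offer["locations"])
            and offer["latency_ms"] <= contract["sla_terms"]["latency_ms"]
            and offer["packet_loss_pct"] <= contract["sla_terms"]["packet_loss_pct"])
```

During deployment, the parent would evaluate candidate children with a check like this, and re-run the selection if a chosen child later reports an SLA failure.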

At one level, this sounds complicated. If, for example, we had a service model that contained a dozen elements, we would have a dozen management processes “running” during the lifecycle of the service. If the resource model contained a half-dozen elements, there would then be 18 such processes. Some could argue that this is a lot of management activity, and it is.

But what is the actual process? It’s a state/event table or graph that references what are almost surely microservices or functions that run only when an event is recognized. A service or resource “architect” who builds the model would either build or identify each process referenced. Many of the processes would be common across all models, particularly in the service domain where deployment/redeployment is based on common factors. I’ve actually built “services” based on this approach, and process creation wasn’t a big deal, but some might think it’s an issue.
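
A state/event table of this kind is compact enough to sketch directly. The states, events, and handler functions below are invented for illustration (plain functions standing in for the microservices the text mentions):

```python
# Sketch of a model element's state/event table: each cell names a process
# that runs only when that event arrives in that state.

def deploy(ctx): ctx["log"].append("deploying"); return "ACTIVE"
def redeploy(ctx): ctx["log"].append("redeploying"); return "ACTIVE"
def report_fault(ctx): ctx["log"].append("fault reported"); return "FAILED"

STATE_EVENT_TABLE = {
    ("ORDERED", "ACTIVATE"): deploy,
    ("ACTIVE", "CHILD_SLA_FAIL"): redeploy,
    ("ACTIVE", "UNRECOVERABLE"): report_fault,
}

def handle(state: str, event: str, ctx: dict) -> str:
    """The generic event handler: look up the cell, run the process."""
    process = STATE_EVENT_TABLE.get((state, event))
    return process(ctx) if process else state  # unrecognized events are ignored

ctx = {"log": []}
state = handle("ORDERED", "ACTIVATE", ctx)    # deploys, state becomes "ACTIVE"
state = handle(state, "CHILD_SLA_FAIL", ctx)  # redeploys, stays "ACTIVE"
```

Note that the handler itself is trivial and generic; all the element-specific behavior lives in the table and the processes it references, which is what makes processes reusable across models.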

The upside of this approach is that each model element is essentially an autonomous application that’s event-driven and that can be run anywhere. An event-handling process is needed to receive events, consult the state/event reference, and activate the designated process, but even that “process” could be a process set with an instance in multiple places. In my own test implementation, I had a single service-domain process “owned” by the seller of the service, and resource-domain processes “owned” by each administrative domain that offered behaviors. This is possible because my presumption was (and is) that model elements could generate events only to their own parent or child elements.

The place where special considerations are needed is the binding point between the domains. A bottom-level service model element has to exchange events with the top-level behavior element in the resource domain. In my implementation, the binding process was separate and provided for the event exchange between what were two different event queues.

This approach also conserves lower-level management processes in the resource domain, processes that are likely already in place. All that’s needed is to wrap the API of an administrative control point in an intent model (a resource model element) that can coerce behaviors and you can then “advertise” them to bind with services. This is possible at any level, and at multiple levels, meaning that if there is some over-arching service management system in place, that system could advertise behaviors on behalf of what it controls, and so could any lower-level control APIs.
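
Here’s a minimal sketch of that wrapping idea. The `ControlPointAPI` class is a hypothetical stand-in for whatever management system is already in place; the behavior name and parameters are likewise invented:

```python
# Sketch: wrap an existing administrative control point's API in an
# intent-model element that advertises a behavior and coerces it on demand.

class ControlPointAPI:
    """Stand-in for an existing, already-deployed management interface."""
    def provision_path(self, a: str, b: str, mbps: int) -> bool:
        return mbps <= 100  # pretend capacity limit for illustration

class BehaviorElement:
    """Intent-model wrapper: exposes a named behavior, hides API details."""
    def __init__(self, api: ControlPointAPI):
        self.api = api
        self.behavior = {"name": "ethernet-connect", "max_mbps": 100}

    def advertise(self) -> dict:
        return self.behavior  # published so service models can bind to it

    def coerce(self, a: str, b: str, mbps: int) -> bool:
        # Translate the intent into the control point's own operations.
        return self.api.provision_path(a, b, mbps)

element = BehaviorElement(ControlPointAPI())
ok = element.coerce("site-1", "site-2", 20)
```

The same wrapper pattern works at any level of the management hierarchy, which is the point: an over-arching management system and a low-level control API could each advertise behaviors this way.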

For those who wonder about ONAP and my negative views on it, I’m hoping this blog helps explain my objections. ONAP is, IMHO, a telco view of how cloud-centric service lifecycle management would work. It’s monolithic and it doesn’t address many of the issues I’ve noted because it doesn’t rely on intent modeling or service/resource models based on intent models. I don’t think that ONAP was architected correctly at the start, I don’t believe they want to fix it (any more than the NFV ISG really wants to fix NFV), and I don’t believe it could be fixed even if they wanted to, without starting over.

I’m not saying that my approach is the only one that would work, just that I believe it would work and that I’ve done some proof-of-concept development to prove out the major points. I’d love to see some vendor or body take up the issue from the top down, as I have, and I’d be happy to chat with a group that takes on that responsibility.

Will Crypto-Crashes also Crash Web3 and the Metaverse?

Crypto has surely had its problems recently, even major problems. Meta has been under pressure too, and there are signs that the Web3 hype wave has already crested. Is there a connection between these events, either in the sense that there’s a common underlying issue, or in the sense that one problem area might feed concerns in the others? Maybe; there are both technical and non-technical factors linking these market pieces.

Let’s start with crypto, which IMHO is a kind of do-it-yourself bubble. In the world of investments, nearly everything that’s commonly traded is both regulated and backed by some “intrinsic” value. If you buy a share of stock, you are buying a piece of the ownership of the company the stock represents. If you buy a bond, you’re buying a promise to pay back, plus interest, from a specific player. Currency is backed by the country that issues it. The fact that there is real underlying value to an investment means that there’s a brake on just how much it can be hyped; the relationship between the underlying value and the price can limit enthusiasm. Not so with crypto.

A crypto-coin is intrinsically worth nothing; pegging crypto to a real asset doesn’t guarantee convertibility. Nobody stands behind some underlying asset with crypto. It’s worth whatever the market is willing to pay, and that means that enthusiasm can bid the price of crypto up without setting off the same kind of alarm bells that would be set off if a stock were similarly bid up. Yes, you can have stock bubbles (the so-called meme stocks were a recent example), but they don’t last long, and there are objective metrics you can use (price/earnings ratio, for example) that let you spot trouble…spot a bubble.

I say “do-it-yourself bubble” because with crypto, not only do you have an asset with no intrinsic value, you have an uncontrollable quantity of it. Crypto mining is a bit like any other kind of mining in that it generates more of something. In the past, both silver and gold were somewhat scarce, but over time silver was mined more successfully and the quantity of silver increased. As that happened, the ratio of gold-to-silver pricing altered sharply, which demonstrates that the value of something is related to its scarcity. More crypto-coins being mined has that same effect. The higher crypto goes, the smarter it is to mine some, which means that rising demand for crypto also expands its supply, eroding the very scarcity that supports the price.

There are also persistent questions about blockchain as a means of ensuring authenticity. A recent DARPA study says that blockchain is subject to tamper risks, which of course has always been known. The essential truth of blockchain authentication is that it relies on more “honest” nodes processing chains than “dishonest” nodes trying to create a fraud. It’s hard to know when that’s true, especially if nation-states are among the bad actors. But while this risk exists, I’ve not seen credible indications that it’s impacted crypto or created any issues with authentication.
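
That honest-versus-dishonest balance can actually be quantified; the original Bitcoin whitepaper works out, via a gambler’s-ruin argument, the probability that an attacker holding a fraction q of total processing power ever overtakes a chain z blocks ahead. A quick sketch (this is specific to proof-of-work chains, and is used here only to illustrate why the majority balance matters):

```python
# Sketch of the honest-majority math from the Bitcoin whitepaper: the chance
# an attacker with fraction q of total processing power catches up from z
# blocks behind is (q/p)^z when honest power p = 1 - q is in the majority.

def catch_up_probability(q: float, z: int) -> float:
    p = 1.0 - q                      # honest fraction
    if q >= p:                       # a dishonest majority always wins
        return 1.0
    return (q / p) ** z

# With 10% attacker power, a 6-block deficit is vanishingly hard to overcome;
# at 45%, the same deficit is overcome roughly 30% of the time.
low = catch_up_probability(0.10, 6)
high = catch_up_probability(0.45, 6)
```

The sharp sensitivity to q is the point: the guarantee degrades gracefully only while honest power stays comfortably in the majority, which is exactly the condition that’s hard to verify when nation-states may be among the bad actors.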

My point here is that the problem with cryptocurrency hasn’t been the technology per se, it’s been the fact that there’s nothing behind it. Yes, blockchain underpins crypto, Web3 and most metaverse strategies, but the current problems with crypto aren’t directly linked to issues with blockchain, so they wouldn’t directly impact other blockchain-based concepts.

Not directly linked, but does everyone who’s looking at these three technical developments see it that way? Crypto is worth what the collective market believes it is, which means that it’s almost a social phenomenon. What happens when it’s negatively socialized?

Venture capitalists are bubble-makers too. They pick a concept, capitalize a bunch of companies to support it, use that stable of companies to generate a lot of media hype, and ride that hype to an exit for a few of their companies at a very high multiple. Web3 is an example of that, and so is the metaverse. Every investor wants in on the Next Big Thing, every Valley technical worker wants to work for the Next Big Startup, and every publication wants to capitalize on the Next Big Story. You can see how this turns into “The Emperor’s New Clothes” very quickly.

There are basic problems with crypto, as I’ve already noted above. There are basic issues with Web3 too, relating to whether any form of validation-by-consensus offers any meaningful validation, because you have no recourse against the masses if there’s something wrong. The metaverse has issues of both scale and value, and also some compelling business challenges that arise out of trying to solve the value problem. My view is that these issues are more than enough to call these technologies into question, to give us not only a right but an obligation to demand some proof of value. The suitability of the underlying technology is a consideration only if what that technology underlies is meaningful. If it’s not, no amount of technology is going to change things.

There is a kernel of real value in cryptocurrency, as a concept. There’s a kernel of real value in distributed validation and authentication as Web3 proposes, and there’s a kernel of real value in the metaverse concept. The problem is that realizing that real value is going to be expensive and time-consuming, just as building a real business would be. It’s a lot easier to figure out a way to create a hype wave, a bubble, and make money off that. Greed and lack of negative social or regulatory pressure almost guarantee that bubble-making will win over value realization.

The question is whether, given the bubble-state of our three technologies, there’s any hope that efforts to realize the real value will continue once the bubble bursts.

We’ve had real challenges in computing and networking over the last couple decades. Some have been addressed, and others never really got much attention, and some could be addressed by technology developments related to crypto, Web3, and the metaverse. I would contend that part of the problem is that when Bubble A bursts, those who made money promoting it are more likely to move on to Bubble B than to go back and invest in realizing the real benefits of that first bubble-creating technology.

The metaverse is a poster child for this issue. It’s pretty obvious from stories like THIS that “Meta says its ultimate goal with its VR hardware is to make a comfortable, compact headset with visual fidelity that’s ‘indistinguishable from reality’.” Why? Because that’s what the metaverse needs, and that’s a daunting challenge in itself, but it’s not the end.

The metaverse needs edge computing. It needs very low-latency connections between edge points, creating low-latency global reach. Anything less means that users will run up against barriers to immersion more often, and more often easily becomes too often. That’s bad news, but there’s also a touch of good news. With the metaverse, there is a player (Meta, obviously) who could actually drive the changes needed. It would be a bold move on their part, but if it paid off it would pay off big.

Web3 also has issues, but while its issues may be simpler, there is no big player dedicated to resolving them. In fact, the whole technology is anti-big-player. John Adams once marveled at the difficulty of getting thirteen clocks to strike at the same time. Try getting thirteen hundred startups to collectively support a complex vision. One problem with decentralization is that you decentralize benefits.

Our problem here isn’t blockchain technology or even the business/technical challenges of creating the kind of ecosystem that Web3 or the metaverse need to truly succeed. It’s bubbles. What separates us from the age of tech innovation isn’t that the founders of the Internet or the personal computer or the smartphone were smarter, but that in that time, success required actually doing something, not just claiming you would and then taking the first exit ramp. There are plenty of opportunities to recreate that old model of tech success, and maybe some smart VC will follow one.

Google’s Private 5GaaS: A Good Move but Needs Better Singing

Google is entering the private 5G space, so the story goes, but what it’s likely doing is a bit more profound than that. Does it mean that “private 5G” is going to take off? Does that mean that public 5G is in even more trouble than some have said? Is something even more potentially disruptive going on? Let’s see.

We all know what public 5G is; the next generation of mobile broadband, voice, and messaging. Defining private 5G isn’t as easy these days. Some vendors believe that it’s enterprise deployment of 5G technology to create their own network. Some believe that it’s really “private-5G-as-a-service”, and that’s pretty clearly what Google is promoting. How does p5GaaS (to coin an acronym of convenience) work and is it really a game-changer? That’s the deeper view of my opening questions.

Let’s start with what I think is a fundamental truth, which is that the average enterprise has no solid reason to consider private 5G. We know that because there have been private wireless versions of 4G available for years, and they haven’t exactly swept the market. The truth is that the average enterprise is perfectly fine with WiFi and public carrier wireless services, because that’s what they now use.

I’ve chatted with enterprises who do have a justification for private 5G, and they tend to be companies with spread-out campus operations. I’ve seen it in factory settings, in transportation hubs, warehousing, and so forth. In nearly all the cases where a “justification” has turned into an actual deployment, there’s been a need to support both reliable telephony and data, and it’s also likely there’s an element of mobility in their need for data connectivity. Those who argue that IoT will be the driver of private 5G are right in that it is a driver, but wrong in supposing that it’s a driver that will impact a substantial market segment.

Given this, what the heck are vendors thinking with the private 5G push? Answer: Wall Street creds. Many vendors and startups benefited from the 5G hype wave, but 5G has turned out to be what really should have been expected all along, a simple evolution of cellular communications. The big revenue explosion that many had been talking up was clearly not materializing. Solution? Find another reason to say that big bucks were coming. Thus, private 5G explosion.

Of course, as I’ve noted, private 5G wasn’t destined to explode. The cost and complexity of creating your own cellular network would be justified for only that small segment of enterprises with specialized communications needs. But given that, suppose that you could offer 5G in as-a-service form? Lower the bar on the cost and complexity challenge and maybe you’d get some more buyers. It was (and is) a new slant, mixing cloud and private 5G, so the media was in. The Street might be OK with it too.

Cloud providers like Amazon, Google, and Microsoft don’t really need the hypothetical p5GaaS revenue, though, so why would they push the concept? Answer: They want to host public 5G elements, and do so as an on-ramp to becoming the player who fulfills the requirements of the “telco cloud”, which is really more about edge computing than about 5G.

There are credible, if yet unproven, applications that demand processing hosted closer to the point of activity, meaning “the edge”. The common challenge for all of them is the “first telephone” problem. Who buys the first telephone, given that there’s nobody for them to call? There are plenty of things that edge computing might do, but who will plan to do them, absent any available edge computing services? Who will offer those services, absent anyone really planning to use them? Bootstrapping the edge is complicated, but 5G promised at least a way of getting things started.

5G requires hosted virtual functions, which obviously requires something to host them on. Most of the requirements would focus on the metro area, where user concentrations are high enough to justify resource pools and where user-to-hosting latency is low. Those same factors are the underpinning of the generalized applications for edge computing, so 5G-justified resource pools could also serve as edge computing resources, making edge services available and encouraging the planning of applications that require them.

Google, despite very strong cloud technology, hasn’t been able to gain much, if any, market share in cloud computing services. The last thing they need is an emerging edge computing opportunity that they can’t address. There is clearly interest among the network operators for 5G as a service, but much of the reason for that is that the operators themselves don’t want to capitalize telco cloud, given that so far 5G is mostly a radio access network (RAN) play. That suggests that the current telco opportunity for 5GaaS could be very small.

The best solution to that, from the cloud provider perspective, is p5GaaS, but in a particular form. Google’s approach is to offer p5GaaS as an application of Google Distributed Cloud Edge. This is Google’s “cloud-on-the-prem” model, announced last year, and it extends the Google cloud right onto customers’ premises, using customer-provided hardware. Google joins its competitors in deploying a model of edge computing that’s slaved to their cloud, yet capitalized by the buyer. If GDCE proves popular enough, even if that’s just in select locations, Google can easily push its own hosting toward the edge to tap the additional potential revenue.

Seen in this light, Google’s push for p5GaaS makes a lot of sense. They cannot afford to let other cloud providers ride private 5G to extend their cloud onto customers’ premises. It is possible that there will be no third-party edge computing deployed at all; that all that will happen is that the cloud-as-a-platform will end up extending outward to customer equipment. If that happens, could the cloud become effectively the data center platform? Companies like Red Hat and VMware, after all, want the data center platform to become the cloud platform.

The other possible strategy Google may be pursuing is simply dominating the telco opportunity. As I’ve noted in other blogs, telcos in my experience have seen Google as less of a threat than the other cloud providers. Google is the only cloud provider that Tier Ones have solicited me to contact on their behalf about a relationship (Google rebuffed the initiative at the time, but that was before the cloud wave). Among the cloud providers, AWS has the startup businesses, Microsoft generally has the enterprises, but Google lacks a big core constituency. Telcos might make a very nice one.

Of course, all of this demonstrates Google’s technology more than it validates their marketing/positioning, and that seems to be their problem with the cloud. I would argue that Google has already done more for cloud technology than any other company, and it’s gotten less for its efforts. I’m not sure what’s going on with Google in the engagement process, but they need to fix that just as much as they need to advance their cloud capabilities.

How Did IBM Buck the Tech Downturn?

Let’s face it, this hasn’t been a good quarter, or even a good year, for tech. Given that, how is it that an IT company that’s been around longer than most of today’s technology professionals have been alive seems to be doing more than OK? IBM seems to be bucking the downturn. What can we learn from that?

One question that comes to mind is how IBM could buck the supply-chain problem that others are complaining about. The obvious answer is that the key product ingredient in IBM’s success is software, and in particular Red Hat software. There’s no supply chain issue with software, and Red Hat has unquestionably reshaped IBM, but IBM has assets of its own in play too, which leads us to the first lesson.

Since the 1990s, IBM has consistently been the vendor with the largest “strategic influence” in our surveys. We define strategic influence as the ability to shape customer technology plans in a way that conforms to the vendor’s own view of the market, and to its product offerings. At the same time, strong strategic influence seems to ensure relevant and accurate feedback from account teams to senior vendor management, which helps with positioning and messaging.

You can see that with IBM’s “hybrid cloud” slant. The fact is that the number of enterprises who aren’t committed long-term to a hybrid cloud is below the level of statistical significance for my surveys. Despite this, the slant on cloud computing that most vendors take is based on the media, and so they talk about “multi-cloud”, which is something enterprises see as a defensive tactic rather than a strategic direction. From IBM’s quarterly earnings call: “Hybrid cloud is all about providing a platform that can straddle multiple public clouds, private cloud and on-premise properties that our clients typically have.” That simple statement shows how IBM has made their cloud story resonate.

Another benefit of strategic influence is the opportunity to sell consulting and other professional services. Many enterprises have commented that having an IBM team on site has made IBM a kind of de facto IT partner. Who better to turn to when you need outside resources to augment your own team, or for a task you have no internal skills to address? I’ve seen this in accounts where my own consulting brought me into contact with IBM people.

The effects of all of this are clear when you look at the numbers. IBM’s overall revenues were up 11%, but software revenue was up 15%, consulting revenue up 17%, Red Hat revenue up 21%, and hybrid-cloud-associated revenue up 25%. Another interesting revenue number that’s my transition into the next IBM point is that transaction processing revenue was up 31%.

IBM has strategic influence dominance in no small part because IBM is the vendor most involved in the core business applications of major enterprises in major verticals. The old adage, so old that I remember it from my early days in programming, was “Nobody ever got fired for buying IBM.” With vertical-market expertise that’s literally unparalleled in the industry, IBM can count on C-level engagement, and that protects those who make a pro-IBM decision. Add to this the fact that one class of application that’s immune from outside pressure is the core business stuff. You don’t have a business at all if you don’t invest there.

One interesting point raised by IBM’s strategic-influence success is whether IBM’s challenge with marketing over the last two decades is linked to its strategic influence. Strategic influence for vendors tends to be linked to sales-centricity; Cisco among network vendors has the highest strategic influence. That raises the question of whether IBM’s current success is due to a market-behavior shift that suddenly values sales over marketing. It may be, but there’s a bit more to it.

I think it is true that IBM’s resilience in 2022 is attributable to the strategic-interest factors I’ve just noted, but I also think that they’ve been perhaps a bit smarter than I had expected in the way they’ve integrated Red Hat. IBM has made no bones about how critical Red Hat is to their future, both on earnings calls and in other conferences and media comments.

What the current quarter shows relative to IBM and Red Hat is that IBM appears to have added value to Red Hat, created more value for its customers based on Red Hat features, and at the same time not interfered in any noticeable way with Red Hat’s own trajectory. They also appear to have been successful in leveraging Red Hat to gain additional sales traction and strategic influence beyond their original (pre-acquisition) stable of accounts. All that is good news for IBM…and for Red Hat.

The quarter may demonstrate that IBM is at least capable of, if not intent on, balancing its strategic influence against Red Hat’s marketing potential. In good times, when tech is stronger overall, they could expect to add to their bottom line through the evangelism of Red Hat’s stuff, and when times are tough (like now) they can still rely on their strategic base.

Looking at lessons for the broader market, the one I think is most important is that hybrid cloud is the cloud. Attempts to gain strong account traction with another message, including “multi-cloud”, are almost surely a bad idea, and it could be really bad if competitors manage to figure out how to do hybrid cloud while you wallow in the multi-cloud media and PR machine. Red Hat’s own website features hybrid cloud prominently, and while it would be easy to say that’s IBM’s influence in play, remember that IBM’s hybrid play is largely directed at its own strategic accounts. Red Hat doesn’t have to sing that song, yet they do.

That sets up the question of what the tail and the dog might be in a hybrid relationship. IBM’s success suggests that in a major-market enterprise and in core business applications, the data center is very much the tail. Otherwise why pick your key strategic partner from the data center side of the hybrid? Could this mean that, for enterprise cloud use at least, a data center player with dominant strategic influence could drive the cloud more than cloud providers? If so, it would be huge for IBM.

Not so huge for others, of course. Ironically, the vendor who has the most to fear from IBM’s success might well be the vendor who has the most strategic influence in the network space—Cisco. There are two reasons. First, network strategic influence doesn’t translate to CxO IT strategic influence. If it did, Cisco would have turned in a better quarter, and would have been able to get its software initiatives going better. Second, Red Hat might end up poisoning Cisco’s well, directly or indirectly.

Data center networking is in some ways the only dominating piece of “capital equipment networking” that remains. People don’t buy routers to build a WAN. However, there is already pressure on the switching market from white-box solutions. I think that the Broadcom deal for VMware could boost VMware’s strategic influence in the data center, and they’re already second to IBM. Might Broadcom leverage that to push white-box switches (with their chips, or perhaps even their own switches) in competition with network vendors like Cisco? Could Red Hat be a competing source of open-source switching software? Could a lot of credible backers of white boxes shift data center switching decisively?

This might be the time we find that out. Enterprises are obviously having trouble getting incremental dollars for networking, and almost all of those I’ve chatted with on the topic say that they would be more likely to embrace white-box data centers if they were backed by a big player. They don’t necessarily need that player to be a big network vendor. This seems to present another argument that network equipment vendors can’t simply assume that the future will be a linear descendant of the past. Times change.

Optimization, Virtualization, and Orchestration

What makes virtualization, whether it be IT or network, work? The best definition for virtualization, IMHO, is that it’s a technology set that creates a behavioral abstraction of infrastructure that behaves like the real infrastructure would. To make that true, you need an abstraction and a realization, the latter being a mapping of a virtual representation of a service (hosting, connectivity) to real infrastructure.

If we accept this, then we could accept that a pure virtual world, where everything consumed a virtual service or feature, would require some intense realization. Further, if we assume that our virtual service is a high-level service that (like nearly every service these days) involved layers of resources that had to be virtualized, from servers to networking, we could assume that the process of optimizing our realization would require we consider multiple resource types at once. The best cloud configuration is the one that creates the best price/performance when all the costs and capabilities, including hosting hardware, software, and network, are considered in realizing our abstraction.
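As a concrete (if deliberately minimal) illustration of the abstraction/realization split, here's a Python sketch; the class names and the canned resource catalog are my own inventions, not any real orchestration API:

```python
from dataclasses import dataclass

@dataclass
class Realization:
    """Concrete mapping of one virtual requirement onto real infrastructure."""
    resource_id: str
    kind: str  # "hosting" or "connectivity"

class VirtualService:
    """Behavioral abstraction: looks like the real service to its consumer."""
    def __init__(self, name, realizer):
        self.name = name
        self._realizer = realizer   # strategy mapping abstraction -> resources
        self.realization = None

    def deploy(self, requirements):
        # Realization: bind the abstraction to actual infrastructure.
        self.realization = self._realizer(requirements)
        return self.realization

def simple_realizer(requirements):
    # Hypothetical catalog: pick a canned resource per requirement kind.
    catalog = {"hosting": "vm-cluster-1", "connectivity": "vpn-7"}
    return [Realization(catalog[k], k) for k in requirements]

svc = VirtualService("front-end", simple_realizer)
bindings = svc.deploy(["hosting", "connectivity"])
```

A real realizer would, of course, be where all the optimization lives; the point of the sketch is only that the consumer sees the abstraction, never the mapping.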

Consider this issue. We have a company with a hundred branch offices. The company runs applications that each office must access, and they expect to run those applications at least in part (the front-end piece for sure) in the cloud/edge. When it’s time to deploy an instance of an application, where’s the best place to realize the abstraction that the cloud represents? It depends on the location of the users, the location of the hosting options, the network connectivity available, the cost of all the resources…you get the picture. Is it possible to pick optimum hosting first, then optimum networking to serve it? Some, even most of the time, perhaps. Not always. In some cases, the best place to run the application will depend on the cost of getting to that location as well as the cost of running on it. In some cases, the overall QoE will vary depending on the network capabilities of locations whose costs may be similar.
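To see why serial optimization can pick the wrong spot, consider a toy Python example with two hypothetical metro sites and two branches; the costs are invented to make the point:

```python
# Invented costs: hosting cost per site, plus network cost per branch.
sites = {
    "metro-east": {"hosting": 40.0, "net": {"b1": 10.0, "b2": 10.0}},
    "metro-west": {"hosting": 45.0, "net": {"b1": 2.0, "b2": 3.0}},
}
branches = ["b1", "b2"]

def total_cost(site):
    # Joint optimization: hosting and network considered at once, not serially.
    s = sites[site]
    return s["hosting"] + sum(s["net"][b] for b in branches)

# Picking "optimum hosting first" chooses east; the joint optimum is west.
cheapest_hosting = min(sites, key=lambda c: sites[c]["hosting"])
joint_optimum = min(sites, key=total_cost)
```

With these numbers, hosting-first picks metro-east (40 versus 45), but once network cost to the branches is counted, metro-west is cheaper overall (50 versus 60), which is exactly the "not always" case.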

We have “orchestration” today, the task of deploying components on resource pools. One of the implicit assumptions of most orchestration is the concept of “resource equivalence” within the pools, meaning that you can pick a resource from the pool without much regard for its location or specific technical details. But even today, that concept is under pressure because resource pools may be distributed geographically and be served by different levels of connectivity. There’s every reason to believe that things like edge computing will put that principle of equivalence under fatal pressure.

The “ideal” model for orchestration would be one where a deployment or redeployment was requested by providing a set of parameters that locate the users, establish cost goals, and define the QoE requirements. From that, the software would find the best place, taking into account all the factors that the parameters described. Further, the software would then create the component/connection/user relationships needed to make the application accessible. It’s possible, sort of, to do much of this today, but only by taking some points for granted. Some of those points are likely to require more attention down the line.

I think that for this kind of orchestration to work, we need to presume that there’s a model, and software that decomposes it. This is essential because a service is made up of multiple interdependent things and somehow both the things and the dependencies have to be expressed. I did some extensive tutorial presentations on a model-based approach in ExperiaSphere, and I’ve also blogged about it many times, so I won’t repeat all that here.

One key element in the approach I described is that there’s a separate “service domain” and “resource domain”, meaning that there is a set of models that describes a service as a set of functional elements and another set that describes how resource “behaviors” are bound to the bottom layer of those service elements. The goal was to make service definitions independent of implementation details, and to permit late binding of suitable resource behaviors to services. If a service model element’s bound resource behavior (as advertised by the resource owner/manager) broke, another compatible resource behavior could be bound to replace it.
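Here's a minimal Python sketch of the late-binding idea; the Behavior/ServiceElement names and the failure/rebind flow are my own illustrative constructs, not ExperiaSphere code:

```python
class Behavior:
    """Advertised resource capability, published by its owner/manager."""
    def __init__(self, name, feature, healthy=True):
        self.name, self.feature, self.healthy = name, feature, healthy

class ServiceElement:
    """Bottom-layer service-domain element; binds late to a resource behavior."""
    def __init__(self, feature):
        self.feature = feature
        self.bound = None

    def bind(self, behaviors):
        # Late binding: any healthy behavior advertising the feature will do.
        for b in behaviors:
            if b.feature == self.feature and b.healthy:
                self.bound = b
                return b
        raise LookupError(f"no compatible behavior for {self.feature}")

catalog = [Behavior("vpn-a", "vpn"), Behavior("vpn-b", "vpn")]
elem = ServiceElement("vpn")
first = elem.bind(catalog)        # binds to vpn-a
first.healthy = False             # the bound behavior "breaks"
replacement = elem.bind(catalog)  # rebinds to a compatible alternative
```

The service definition never names vpn-a or vpn-b; it names only the feature, which is what keeps it independent of implementation details.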

This could offer a way for “complex” orchestration to work above cloud- and network-specific orchestration. The service models could, based on their parameters, select the optimum placement and connection model, and then pass the appropriate parameters to the cloud/network orchestration tool to actually do the necessary deploying and connecting. It would be unnecessary then for the existing cloud/network orchestration tools to become aware of the service-to-resource constraints and optimizations.

The potential problem with this approach is that the higher orchestration layer would have to be able to relate its specific requirements to a specific resource request. For example, if a server in a given city was best, the higher-level orchestrator would have to “know” that, first, and second be able to tell the cloud orchestrator to deploy to that city. In order to pick that city, it would have to know both the hosting capabilities there and the network capabilities there. This means that what I’ve called “behaviors”, advertised resource capabilities, would have to be published so that the higher-layer orchestrator could use them. These behaviors, being as they are products of lower-level orchestration, would then have to drive that lower-layer orchestration to fulfill their promise.
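A toy sketch of that two-level flow, in Python; the advertised parameters, city names, and the stub cloud orchestrator are all hypothetical:

```python
# Advertised behaviors: each city publishes hosting and network capability.
advertised = {
    "london": {"cpu_free": 120, "latency_ms": 18},
    "paris":  {"cpu_free": 40,  "latency_ms": 9},
}

def pick_city(advertised, need_cpu, max_latency):
    # Higher-layer orchestration: filter on both hosting AND network capability.
    candidates = [c for c, a in advertised.items()
                  if a["cpu_free"] >= need_cpu and a["latency_ms"] <= max_latency]
    if not candidates:
        return None
    return min(candidates, key=lambda c: advertised[c]["latency_ms"])

def cloud_orchestrate(city, app):
    # Stub lower-layer orchestrator: it receives a placement decision,
    # not the service-level reasoning behind it.
    return f"deployed {app} in {city}"

city = pick_city(advertised, need_cpu=30, max_latency=15)
result = cloud_orchestrate(city, "front-end")
```

London has ample CPU but misses the latency bound, so the higher layer picks Paris and simply tells the cloud orchestrator where to deploy, which is the division of labor described above.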

That exposes what I think is the biggest issue for the complex orchestration of fully virtualized services—advertising capabilities. If that isn’t done, then there’s no way for the higher layer to make decisions or the lower-layer processes to carry them out faithfully. The challenge is to frame “connectivity” in a way that allows it to be related to the necessary endpoints, costs included.

Deploying connectivity is also an issue. It’s the “behaviors” that bind things together, but if network and hosting are interdependent, how do we express the interdependence? If the higher-layer orchestration selects an optimal hosting point based on advertised behaviors, how does the decision also create connectivity? If the network can be presumed to be fully connective by default, then no connection provisioning is required, but if there are any requirements for explicit connection, or placement or removal of barriers imposed for security reasons, then it’s necessary to know where things have been put in order to facilitate these steps.
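The hosting/connectivity interdependence might be sketched like this in Python; the step strings and the "fully connective by default" flag are illustrative assumptions:

```python
def realize(city, endpoints, network_is_open=True):
    """One placement decision drives two kinds of action: hosting and,
    if the network isn't fully connective by default, explicit connection."""
    steps = [f"host:{city}"]
    if not network_is_open:
        # Explicit provisioning has to know where things were put.
        steps += [f"connect:{e}->{city}" for e in endpoints]
    return steps

open_net = realize("paris", ["b1", "b2"])
closed_net = realize("paris", ["b1", "b2"], network_is_open=False)
```

In the open-network case the hosting step stands alone; in the explicit-connection case the connection steps are derived from the same placement decision, which is the interdependence the text describes.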

It’s possible that these issues will favor the gradual incorporation of network services into the cloud, simply because a single provider of hosting and connectivity can resolve the issues without the need to develop international standards and practices, things that take all too long to develop at best. It’s also possible that a vendor or, better yet, an open-source body might consider all these factors and advance a solution. If I had to bet on how this might develop, I’d put my money on Google in the cloud, or on Red Hat/IBM or VMware among vendors.

Something is needed, because the value of virtualization can’t be achieved without the ability to orchestrate everything. A virtual resource in one space, like hosting, can’t be nailed to the ground by a fixed resource relationship in another, like networking, and optimization demands that we consider all costs, not just a few.

There’s a New Flow Optimization Algorithm; What Might Need It?

One problem that networks have posed from the first is how to optimize them. An optimum network, of course, is in the eye of the beholder; you have to have a standard you’re trying to meet to talk about optimization. Networks can be optimized by flow and by cost; most experts have always believed that the same process could do both, and algorithms have evolved to provide network optimization since the dawn of the Internet.

One challenge with optimization is the time it takes to do it, particularly given that the state of a network isn’t static. Traffic uses resources, things fail, and errors get made. IP networks have generally been designed to “adapt” to conditions, something that involves “convergence” on a new topology or optimality goal. That takes time, during which networks might not only be sub-optimal, they might even fail to deliver some packets.

A new development (scientific paper here) seems to show promise in this area. Even the first of my references is hardly easy to understand, and the research paper itself is beyond almost everyone but a mathematician, so I won’t dwell on the details, but rather on the potential impacts.

Convergence time and flow/cost optimization accuracy are critical for networks. The bigger the network, and the more often condition changes impact cost/performance, the harder it is to come up with the best answer in time to respond to changes. This problem was the genesis for “software-defined networks” or SDN. SDN in its pure form advocates replacing the protocol exchanges through which routers find optimum routes (“adaptive routing”) with a centralized route management process (the SDN controller). Google’s core network is probably the largest deployment of SDN today.

It’s centralized route management that enables algorithmic responses to network conditions. Centralized management requires that you have a network map that shows nodes and trunks, and that you can determine the state of each of the elements in the map. If you can do that, then you can determine the optimum route map and distribute it to the nodes.
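Centralized route computation over a node/trunk map is classically done with something like Dijkstra's algorithm; here's a minimal Python sketch in which a failed trunk simply drops out of the topology before routes are recomputed (the map and costs are invented):

```python
import heapq

def shortest_routes(trunks, source):
    """Dijkstra over a node/trunk map; 'down' trunks are excluded up front."""
    graph = {}
    for (a, b), (cost, up) in trunks.items():
        if up:  # only live trunks enter the topology
            graph.setdefault(a, []).append((b, cost))
            graph.setdefault(b, []).append((a, cost))
    dist, heap = {source: 0}, [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

# Map: trunk -> (cost, up?). The direct a-c trunk is expensive.
trunks = {("a", "b"): (1, True), ("b", "c"): (1, True), ("a", "c"): (5, True)}
before = shortest_routes(trunks, "a")   # c reached via b
trunks[("b", "c")] = (1, False)         # trunk state changes: b-c fails
after = shortest_routes(trunks, "a")    # c now via the direct trunk
```

The controller's job is then to distribute the resulting route map to the nodes; the recompute-on-state-change step is exactly where convergence time matters.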

Obviously, we already have multiple strategies for defining optimum routes, and my first reference says that the new approach is really needed only for very large networks, implying that it’s not needed for current networks. There are reasons to agree with that, IMHO, but also reasons to question it.

The largest network we have today is the Internet, but the Internet is a network of networks and not a network in itself. Each Autonomous System (AS) has its own network, and each “peers” with others to exchange traffic. There are a limited number of peering points, and the optimization processes for the Internet work at multiple levels: (in simple terms) within an AS and between ASs. If we look at the way that public networks are built and regulated, it’s hard to see how “more” Internet usage would build the complexity of optimization all that much, and it’s hard to see how anyone would build a network large enough to need the new algorithm.

But…and this is surely a speculative “but”…recall that the need for optimization efficiency depends on both the size of the network and the pace of things that drive a need to re-optimize. The need will also depend on the extent to which network performance, meaning QoS, needs to be controlled. If you can accept a wide range of QoS parameter values, you can afford to wait a bit for an optimum route map. If you have very rigorous service SLAs, then you may need a result faster.

We know of things that likely need more tightly constrained QoS. IoT is one example, and the metaverse another. What we don’t know is whether any of these things actually represent any major network service opportunity, or whether the other things needed to realize these applications can be delivered. An application of networking is almost certainly a swirling mix of technologies, some of which are on the network and not in it. The failure to realize a connectivity/QoS goal could certainly kill an application, but just having the connectivity and QoS needed for a given application doesn’t mean that application will automatically meet its business case overall. We need more information before we could declare that QoS demands could justify a new way of optimizing network routes, but there are other potential drivers of a new optimization model.

Network faults, meaning the combination of node, trunk, and human-error problems, can drive a need to redefine a route map. If you had a very faulty network, it might make sense to worry more about how fast you could redraw your map, providing that there wasn’t such a high level of problems that no alternative routes were available. My intuition tells me that before you’d reach the point where existing route optimization algorithms didn’t work, you’d have no customers. I think we could scratch this potential driver.

There’s a related one that may be of more value. The reason why faults drive re-optimizing is that they change topology. Suppose dynamic topology changes were created some other way? Satellite services based on low-orbit satellites, unlike services based on geostationary satellites, are likely to exhibit variable availability based on the position of all the satellites and the location of the sources and destinations of traffic. These new satellite options also often have variable QoS (latency in particular) depending on just how traffic hops based on current satellite positions relative to each other. Increased reliance on low-earth-orbit satellites could mean that better route optimization performance would be a benefit, particularly where specific QoS needs have to be met.
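A toy Python illustration of the satellite effect; the visibility schedule is invented, but it shows how the same pair of ground stations sees different latency depending on which relays happen to be overhead:

```python
def links_at(t):
    # Hypothetical visibility schedule: which satellite relays are usable
    # at time step t. Real constellations change this continuously.
    if t % 2 == 0:
        return {("gw1", "sat1"): 4, ("sat1", "gw2"): 4}
    return {("gw1", "sat2"): 7, ("sat2", "gw2"): 7}

def path_latency(t):
    # Each schedule here exposes a single two-hop path, so the end-to-end
    # latency is just the sum of the link latencies.
    return sum(links_at(t).values())

even_slot = path_latency(0)   # via sat1
odd_slot = path_latency(1)    # via sat2
```

Nothing has "failed" between the two time steps, yet the route map, and the delivered QoS, are different, which is why a fast re-optimizer could matter here even in a fault-free network.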

Then there’s security, and denial-of-service attacks in particular. Could such an attack be thwarted by changing the route map to isolate the sources? There’s no reliable data on just how many DoS attacks are active at a given time, but surely it’s a large number. However, the fact that there is no reliable data illustrates that we’d need to capture more information about DoS and security in order to make them a factor in justifying enhanced route optimization and control. Jury’s out here.

Where does this lead us? I think that anything that enhances our ability to create an optimum route map for a network is a good thing. I think that the new approach is likely to be adopted by companies who, like Google, rely already on SDN and central route/topology management. I don’t think that, as of this moment, it would be of enough benefit to be a wholesale driver of centralized SDN, though it would likely increase the number of use cases for it.

The biggest potential driver of a new route optimization model, though, is cloud/edge computing, and the reason goes back to changes in topology. You can change traffic patterns by network faults or changes, but a bigger source of change is the unification of network and hosting optimization. The biggest network, the Internet, is increasingly integrating with the biggest data center, the cloud. Edge computing (if it develops) will increase the number of hosting points radically; even without competitive overbuild, my model says we could deploy over 40,000 new hosting points, and, considering everything, 100,000. Future applications will need to not only pick a path between the ever-referenced Point A and Point B, but they’ll also have to pick where one or both points are located. Could every transitory edge relationship end up generating a need for optimal routes? It’s possible. I’m going to blog more about this issue later this week.

The strategy this new optimization algorithm defines is surely complex, even obtuse, and it’s not going to have any impact until it’s productized. The need for it is still hazy and evolving, but I think it will develop, and it’s going to be interesting to see just what vendors and network operators move to take advantage of it. I’ll keep you posted.

Cloud-Native, Composable Services, and Network Functions

I get a lot of comments and feedback from vendors, enterprises, and network operators. Recently, there’s been an uptick on topics related to Network Functions Virtualization, NFV. Some of this has apparently come out of the increased activity around Google’s Nephio initiative, which aims at creating a Kubernetes-based management and orchestration framework. Some has come out of the realization that 5G deployments will increasingly commit operators to an approach to function hosting, and many realize that the work of the NFV ISG, while seminal in conceptualizing the value proposition and approach, isn’t the technical answer.

It’s important to get a “technical answer” to network function hosting because, whatever the mechanism, it’s clear that the industry is moving toward being able to compose services from functional elements. That means that it will be necessary to define service models and then orchestrate their deployment. If we’re going to go through that admittedly complex process, we don’t want to do it more than once. We need a model of service orchestration and management that fits all.

What, then, is “all?” One interesting thread that runs through the comments is that we aren’t entering the debates on function hosting with a consistent set of definitions and goals. It’s difficult to get to the right answers without asking the right questions, and more difficult if you can’t agree on basic terminology, so let’s try to unravel some of the knots that these comments have uncovered.

The first point is that while the great majority of operators seem to accept the need for “cloud-native” function hosting, they don’t have a solid definition for the term. That’s not surprising given that the same could be said for the industry at large. In fact, the “purist” definition and the popular meaning of the term seem to be diverging.

When I queried the operators who used the term, most said that “cloud-native” meant “based on native cloud computing technology”. Containers, Kubernetes, and the like were their examples of what went into a cloud-native function hosting framework. To cloud people, the term usually means “stateless and microservice-based”. In an earlier blog, I noted that many of the “network functions” that needed to be hosted didn’t fit with that purist definition. They shouldn’t be required to, because to do so would likely result in service failures.

Stateless microservices are a great strategy to support a human-interactive application and some other event-driven applications, including the control-plane part of networking. They’re not suitable, IMHO, for data plane applications because splitting functionality and introducing network latency between functional elements will impact performance and actually reduce reliability.

Containers are probably a good, general, way of hosting network functions. The big advantage of containers is that they’re a kind of self-describing load unit, which means that it’s possible to configure the deployed elements using generalized tools rather than requiring tweaking by human operations management. However, containers aren’t as valuable where something is going to be deployed in a static way on specific devices rather than on a pool of resources. Containers also impose a higher overhead, meaning they may not be ideal for flow-through traffic.

What we really need for function hosting is a universal deployment and management framework that works for containers, virtual machines, bare metal servers, and white boxes. That framework has to be able to deploy network functions and also non-network functions, because even the Internet, when considered as a service, has both. I’d argue that there are already initiatives that satisfy that for Kubernetes, and that’s one reason I am so interested in the Nephio project and its Kubernetes-centricity. What Nephio proposes to do is to create hosting generalization without sacrificing common orchestration/management, and that’s critical.

This point is important when addressing another theme of the comments, which is that achieving “cloud-native” (or whatever you’d like to call it) for function hosting is something that has to come out of the work of the NFV ISG. I disagree with that, strongly, and the reason is that I believe that the early work of the ISG has made it difficult (if not impossible) for it to embrace the Kubernetes-centricity that function hosting needs.

If future network services are ever going to be differentiable, genuinely different, then that’s going to have to be achieved through a unification of network functions and the functions that create and sustain experiences. Networking today, from the perspective of the user, is all about experience delivery. The front-end of those experiences is already largely hosted in the cloud, where cloud-centric practices (including/especially Kubernetes) prevail.

Despite the fact that operators believe that function hosting will enrich their services and make them more differentiable (raising revenues), they have spent little time trying to identify what specific functions would accomplish those goals. As I pointed out in my blog yesterday, they seem to believe that abstract function hosting will do the job, which is essentially a claim that providing a mechanism to do something is equivalent to actually doing it. They haven’t confronted specific service examples that mingle network functions and other hosting. In fact, they’ve really not considered network functions themselves, on any broad scale other than that outlined in places like the 5G standards. That’s likely why they aren’t seeing Kubernetes-centricity as being as pivotal as it is.

Operator standards make glaciers seem to move at breathtaking speeds by comparison. The cloud has done the opposite, compressing application development schedules and creating an explosion of tools and techniques to support the ever-changing needs of businesses. If Internet-centric experience delivery is what’s really driving the cloud and networking, then can telcos afford to have their own function hosting framework lagging so far behind what’s driving services? I don’t think so.

The pace of cloud progress can be traced to two things. First, there is a compelling mission for the cloud, even though it’s not the mission most think of when they hear “cloud computing”. It’s not about replacing the data center, but about extending applications’ user relationships out of the data center and closer to “the Internet”. Second, the cloud has produced application development, hosting, deployment, and management tools and techniques that favor rapid progress. They’ve done that by replacing “standards” with open-source projects.

A few operators, and more vendors, tell me that the “standards” activity itself is creating a problem. There’s an established operator reliance on standards and on their work in defining them. For mobile networking, the 3GPP has almost literally ruled in defining how equipment works and interoperates. This flies in the face of the fact that IP networking is utterly dominant today, and that the IETF, and not any carrier-centric standards body, drives IP specifications. It’s not unreasonable to see the rise of the Open RAN (O-RAN) initiatives as an indicator that even the 3GPP may be losing influence. However, “losing” doesn’t mean “lost”, and so we can’t hope for quick progress in turning operators away from their own standards initiatives. Especially when there are a lot of people in operator organizations whose career is based on those initiatives.

Vendors, I think, can and must be the solution here. I’ve advocated an approach of blowing kisses at ETSI NFV while working hard on developing a true cloud-centric function hosting framework. I think (I hope) that Nephio does just that, but what will decide the fate of Nephio, and perhaps that of function hosting and even telecom, is how vendors support the initiative, and that’s not something that’s easy to assess.

Nephio is a hybrid of cloud and telecom in more than just a technical sense. Most of the truly seminal elements of the cloud were created because a primary vendor did something and made it open source. Rarely has a collection of vendors somehow come together on a concept successfully. The old “horse-designed-by-committee” analogy sure seems to apply. Vendors not only make up the majority of participants in open network-industry bodies, they tend to supply the majority of the resources. What happens next for “cloud-native” in telecom depends on how those vendor participants work to advance something useful, fast enough to matter to a rapidly changing market.

Getting Telecom Beyond the Dumb Pipe

Many people have heard the “A rose by any other name…” quote. Let me offer a network technology slant on that, which is “A dumb pipe created by any technology is still a dumb pipe”. Given that we’ve got operator and vendor commentary that takes the opposite stance, we apparently need to look a bit at why technology doesn’t trump mission for either wireline or wireless. We also need to look at why new technologies could create new services, but not because they smarten dumb pipes.

People and companies share a lot of things, and one is a resistance to change. The longer a given practice has been followed, a given strategy accepted, the harder it is to displace it. Network operators have been offering the service of connectivity for what’s surely longer than anyone on the planet has been alive. It’s no wonder that they see every “new” service as some new spin on connectivity.

Wireless, meaning cellular telephony and broadband, is new in the sense that it’s not tethered, and so becomes a personal information conduit that follows the user around (as long as they don’t forget their phone, of course). That proved to be a valuable capability, and so it’s prospered, but at the same time it validated the operator preconception that if you looked hard and long enough, some comfortable extension to connectivity services would ride in to save the day. All the commentary on 5G, IMHO, stems from that preconception.

Fundamentally, 5G is just an upgrade to mobile networking. I’ve had a 5G phone for some time, and it’s hardly been life-changing, or in truth even offered a noticeably different experience. Then, of course, many would argue that making a mobile phone call wasn’t a noticeably different experience either, it was only the context that was different. I think that’s why so much of 5G interest early on focused on IoT. If you had to connect billions of devices in addition to billions of people, you might well need new and revolutionary technology. But it’s clear by now that we don’t have that need today, and we may never have it.

How about things like network slicing? You can get your own virtual private cellular network, you can create separate networks for separate missions. Isn’t that worth something? Hearken back to the 70s when we saw voice services based on a network-hosted PBX, something called “Centrex”. It got some play, but it didn’t change operators’ fortunes, even in modernized IP-PBX form. As far as having mission-specific networks, that’s useful only if there are missions that ordinary broadband Internet won’t support, that users need support for, and that regulatory policies won’t declare to be a form of paid prioritization.

The driver behind this is simple, IMHO: mobile services are no longer the dependable source of high margins they were in the past. Connectivity monetizes whatever creates it, because it’s a means to an end and not an end in itself. That’s the bad news for 5G and dumb pipes. Is there any good news? Maybe.

5G takes a step, perhaps a decisive step, toward unifying computing and network technology as the framework for services. The sad truth is that when operators and vendors try to use 5G as a crutch to making dumb pipes valuable, they’re ignoring its potential to make dumb pipes smart, and that is the future of connectivity if there’s any future beyond commoditization.

The Internet is an important indicator here. It’s a network, but first and foremost it’s an experience host. The value of the Internet lies only partly in its near-universal connectivity. The other part, the important part, is the support for what people want to be connected with. It shifted us from connection as the experience to connection with the experience. There’s a good argument to be made that the network and the experience are always one, but that means that when the experience is beyond connection, the network has to somehow integrate as tightly with it as possible so as to rebuild its own value. That means unifying connectivity and experience hosting, and that is something 5G could advance, though at this point it would likely have to be an indirect sort of support, in three areas.

The first area is identity routing. From the first, there’s been a discussion in IP networks as to how we address things, or what things we’re addressing. In traditional networks, there is a “network service access point” or NSAP, and this is what the network sees as a user address. In mobile networks, mobility means that the relationship between a user and an NSAP has to be more agile, and in the IETF there have been a number of location-independent routing notions floating about. Mobility management a la 5G (and earlier) is helpful for addressing users who move within a relatively contained area. It would be better to have a strategy that would address users who are “mobile” through any geography, and users who are “portable” in that they may operate from a variety of locations.
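An identity-routing registry might, at its simplest, look like this Python sketch; the user and NSAP strings are hypothetical, and a real design would of course need secure, distributed resolution:

```python
class IdentityRegistry:
    """Maps a stable user identity to its current network attachment point,
    so the identity, not the NSAP, is what services address."""
    def __init__(self):
        self._attach = {}

    def attach(self, user, nsap):
        # User moved or came online here; identity stays, locator changes.
        self._attach[user] = nsap

    def resolve(self, user):
        return self._attach.get(user)

reg = IdentityRegistry()
reg.attach("alice", "nsap:paris-07")
reg.attach("alice", "nsap:tokyo-02")  # mobility across any geography
```

The separation means a "portable" user who shows up in a new location needs only a new attach, not a new identity, which is the property the mobile-management schemes approximate within their own coverage areas.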

The second area is layer relationship management. IP has a data plane and a control plane. 5G has its own “control plane” and considers both IP planes to be its “user plane”. We also have the venerable OSI 7-layer model, which has been steadily augmented with sublayers and which doesn’t conform to the layer structure of IP. We really need to go back to basics here and redefine how networks mix control and data, end-to-end and per-hop, and so forth. Maybe we need to accept that there is no standard layer set, and adopt a model that allows for arbitrary layering. This is what I think “intent modeling” might do, for example.

The third area is session awareness. All networking is justified by experience delivery of some sort, and an experience is a relationship between a consumer and a provider that endures for some reasonable period. The network analog of an experience relationship is a session, something the OSI model defines as living in the fifth layer. How do we dependably understand when a session is being established? There are inspection approaches (Juniper’s Session Smart Routing is among the best, if not the best, examples of this) but could we make session boundaries explicit? If so, we could identify session requirements in terms of service features, which would then permit dynamic mapping to services.
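Explicit session boundaries might look something like this minimal Python sketch, with declared requirements mapped to (hypothetical) service features at setup time rather than inferred by inspection:

```python
def open_session(requirements, offerings):
    """Explicit session setup: declared needs map to advertised service
    features; anything unmatched is surfaced instead of silently dropped."""
    features = {need: offerings[need] for need in requirements if need in offerings}
    missing = [need for need in requirements if need not in offerings]
    return {"features": features, "missing": missing}

# Hypothetical feature catalog a network might advertise.
offerings = {"latency": "low-latency-path", "encryption": "ipsec-tunnel"}
session = open_session(["latency", "encryption"], offerings)
```

Because the requirements are explicit at the session boundary, the mapping to services can be dynamic per session, which is exactly what inspection-based approaches have to reconstruct after the fact.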

There is a lot of potential in the integration of hosting and networking, for operators and for the industry at large. The problem is that we’ve accepted goals for that integration that are vague, insipid, conventional, unrealistic, and sometimes all of the above. If we really want to make networks more than dumb pipes, we need to use function hosting to attack areas where connectivity and experiences merge, and that starts by identifying where those points are. Some good discussion on this, endorsed/sponsored by operators, would be very helpful. It could even be essential for the industry’s overall health.

Is the Broadcom deal for VMware a smart, even pivotal, move?

OK, Broadcom is buying VMware, and most of the comments I’ve seen from industry analysts or the Street have been, well, doubtful. I guess that makes it fair play that I have doubts about their doubts. There are potential issues raised by the deal, but they’re not spectacularly different from the issues raised by any large M&A. There are also potential signals of a shift in the industry that could be very important indeed. I’ll start with the issues and move on to the signals.

The first issue is the classic “channel competition”. Broadcom is primarily a hardware company, and its products are used in a lot of devices that also run higher-level software. VMware, as a major supplier of higher-level software, will surely compete with some of the stuff current Broadcom customers offer. Some might see that competition as a reason to seek an alternate chip supplier. I don’t see this as a big deal; Broadcom has offered cards and devices as well as chips, so there’s always been a bit of an overlap between levels of their products and offerings of others that are based in part on Broadcom elements. In networking, recall that they bought Brocade earlier, so there’s also been a bit of overlap in the networking space.

One way Broadcom could address this issue is to visibly embrace standard architectures. The P4 standard for switching chips is an example; Broadcom has its own model, but P4 is backed by the ONF and others. An open interface between lower-level hardware components and higher-level software, including VMware’s, would ease any concerns software firms might have about supporting Broadcom hardware in products that compete with VMware.

The second issue is the potential dilution of management attention, which some industry and financial analysts see as a risk to VMware customers. This is a harder issue to dismiss because it’s based on something very subtle, but it’s hard for me to understand why Broadcom would make its largest-ever acquisition and then poison the customer base of the company they bought.

Moving on to signals now, the most obvious is that the deal suggests a shift toward a single-source play for a complete product, breaking away from a tendency for the industry to separate IT hardware and IT software in terms of suppliers. Broadcom could, in theory, offer a package of technology that would be mutually supporting, and that could work against competitors who offer only piece parts. This issue is particularly important because VMware is strong in the data center and hopes to be strong in networking, and Broadcom has hardware components/chips in both these spaces.

Why, you could rightfully ask, is a single-source play for all the elements of an IT device important? Most enterprise buyers could answer that one for you. It’s been increasingly difficult for enterprises to hire and retain highly qualified technology teams, because most such people think their prospects are brighter working for a vendor. Every year, things like virtualization and white-box networking come along and make it more challenging to stay ahead of critical tech developments. The users find it harder to integrate the pieces of technology needed to deploy a unified IT or network device, so they want to lean on vendors. Vendors are happy to prop things up, as long as it doesn’t hurt their own bottom line. The more skin they have in the game, and the more likely they actually have all the pieces needed themselves, the more likely it is that they’ll be willing to do the requisite propping.

Networking, even the network interface to servers, is already heavily dependent on custom silicon and network adapter cards. It’s also obviously dependent on software, and so it’s a place where creating a total solution could save users headaches and at the same time create better margins for the vendor who creates that total solution.

Differentiation, or lack of it, also enters in here. You can justify higher prices and margins if you offer something others don’t have. If there is no “something” that’s a recognized differentiator, then price is what matters. The more technology elements you can provide to build a usable unit of deployed IT or networking, the more likely you can hold off competitors where differentiation is difficult, because your own margins are strong.

All of this is a sort-of-selfish justification for the deal, I admit, but there’s also a broader industry signal that may be in play here. When a network or IT device is assembled from the hardware and software pieces provided by multiple players, there is a real risk that the sum of the business goals of all the players doesn’t lead to an optimum-from-the-user-perspective solution. Software vendors want to sell their software, and hardware vendors have a likewise-self-interested vision of the market. Could a company that sold both, and drove development in both spaces, find it easier to create an optimum hardware/software partnership? I wonder.

Virtualization, the cloud, the edge, white-box networking, 5G, and even things like AI are all dependent on a highly symbiotic hardware/software relationship. The question is whether a vendor like the newly combined Broadcom/VMware could create that symbiosis better and quicker than a competitive market for the pieces of the solution. Competition doesn’t necessarily generate innovation, and that’s especially true when the competitors are bent on protecting their current market incumbencies.

Custom chips are increasingly the foundation of innovation, certainly for hardware and even credibly for devices overall. Can we envision realistic white boxes or AI without them? One of the most successful of all chip companies, Nvidia, has both chip drivers and additional software offerings. I think you can make a case for the same strategy in networking and IT, which means that the Broadcom acquisition of VMware could be good for innovation.

A final point is that vendors like Cisco and Juniper have developed custom silicon, entered into alliances on silicon and silicon photonics (Juniper, in particular), and so forth. That makes the Broadcom/VMware decision look like less of an outlier, and in fact raises the question of whether Broadcom and VMware, separately, could be competitive in the kind of market that this sudden chip interest says is now developing.

The value of the deal may tie back to some of the points I made yesterday on Cisco’s quarter. Differentiation in networking is becoming more difficult, so the space is threatened with price commoditization. When that happens, it’s not uncommon to try to combine a product area with one whose value and pricing power are higher. In other words, the earlier point I made on the ecosystemic value of the deal may be its best justification. If the other potential values are also realized, then the deal could be very good indeed.

Thoughts on Cisco’s Business Trajectory, and on Networking

Let’s face it, Cisco’s quarter was bad, and nothing management says can alter that. Supply chain issues may have been a factor, but it’s hard to justify the miss and the weak guidance with that excuse alone (one, by the way, that all the vendors with weak quarters have been using). They did much better just a quarter ago, after all. Cisco, Cisco’s competitors, and everyone in the networking industry need to take stock here, and reflect anew on some of the points I made in my blog on Cisco’s prior quarter.

If we want to look beyond the now-classic supply chain excuses, there are two sources of revenue/profit issues that Cisco and others face. The first is differentiation, needed to sustain pricing power and margins. The second is return on investment, which is needed to get any additional budget for network gear, or to sustain current spending levels if they come under pressure. Both seem to be at work here.

The challenge for network vendors is that networking at the device level has been commoditizing for ages. A router is based on many broadly accepted standards, making it difficult to claim any great feature differentiation if you stick to the basic function of pushing packets around. The vendors responded to that by creating “ecosystems” of products that reflected the reality that networking today is a complex assemblage of stuff that buyers find hard to integrate. And, of course, with creative marketing and sales.

To me, the big question raised by Cisco’s quarter is whether even the network-ecosystem approach is running out of gas, and if so why. The answer to the latter may be easier to see than the answer to the former.

I’ve mentioned many times that Cisco really wants to be a “fast follower” in technology. They want to exploit proven opportunities more than evangelize new stuff in the hope it will catch on. That’s understandable in a sales-driven company; you don’t want your sales force pushing something that turns out to be a dud, both because it hurts their credibility and because it overhangs sales of current-generation stuff. To me, a problem with ecosystem credibility is most likely to lie with a shortage of exciting ecosystems. You can’t differentiate with old stuff in the world of ecosystems, any more than you can in the world of devices.

If you ask enterprises and service providers whether they believe that networking is changing, almost 100% say it is. If you ask whether the changes are radical, just short of 90% say that’s also true. I don’t have up-to-the-minute data on the point, but last fall about two-thirds of enterprises and three-quarters of operators said their vendors weren’t offering “new” or “novel” solutions to their network problems and challenges. So let me get this straight; networking is changing radically and vendors aren’t changing their stuff to keep up, right? It sure seems so.

The popular ecosystem strategies for Cisco and other vendors have tended to center on things like network management and operations or network security. These things are important, of course, but they’re ecosystemic product sets long recognized and offered. The changing network issues that buyers are referencing can’t be the same stuff that’s been around for a decade or more. What then are they?

Networking has had its share of transformations, particularly for enterprises, and the enterprise transformations have been tied to shifts in the network services offered them. In the past, we saw a shift from networking built from user-provided nodes and leased lines to IP VPNs. In the present, we’re seeing a series of shifts created by the cloud.

Enterprises use networks to connect users (employees, customers/prospects, partners) with information and application resources. The cloud has transformed where the “front ends” of these information/application resources are hosted, and by doing so has changed both the network connections for the users and those for the rest of the applications and databases involved. If we were to assume that the popular view that “everything will move to the cloud” were correct (note that I don’t subscribe to that view), then networking would be nothing more than the Internet for access to cloud apps. Even steps short of that extreme would surely give cloud providers a much greater role in enterprise networks.

Most vendors, including Cisco, have focused on “multi-cloud”, which isn’t the real problem, but which has the advantage of being easy to promote in the media and follow up with sales. There is certainly a shift in networking going on that’s driven by the cloud in general, but nobody is really pushing it.

Edge computing, which is a subset of cloud computing, would magnify the number of things that the cloud would do, and thus magnify the impact on network services. The impact on networking would be greatest if one of the drivers of edge computing were to be an increased use of “the edge” to host network functions, as 5G proposes to do. Since this kind of impact would be most likely confined to metro centers, I’ve tended to call this a “metro” shift. Cisco rival Juniper did an announcement on “Cloud Metro” a year ago.

I think what’s going on here is simple. Cisco, not surprisingly, isn’t anxious to tout a change in networking that would validate cloud providers rather than their traditional network operator customers. Not only that, cloud providers are more willing to embrace white-box technology or SDN for their networking, and neither favors vendors like Cisco.

We can now attack the question of whether ecosystem differentiation is running out of gas, because it’s also a good transition into the second potential challenge—difficulties with network ROI. Where a new ecosystem could have represented business value-add, the failure to develop it removes justifications for projects, and with them the spending authority those projects would have carried. If Cisco and others were able to develop new benefits for networking, that would drive new spending. To the extent that new ecosystems were able to generate new benefits, they could help Cisco boost its numbers, but we’ve already seen that things like the cloud would more likely reduce spending than boost it.

This reflects the real challenge for Cisco and others in the space. What’s really needed is a new set of network benefits, and that’s a problem because every network vendor has focused on the notion that connectivity is the only real goal of the network. We’ve achieved connectivity. To find other network benefits, we’d have to find things to do with networks that step beyond basic connectivity. That almost surely involves going “up the stack” and more into applications.

This circles back to the cloud, too. The cloud is winning the battle of new benefits, which is advancing computing by distributing it, and making more of the network a between-cloud-stuff proposition than a separate entity. Cisco has offered servers and software for years, but it’s unlikely a push for cloud hosting gear would offset any network revenue challenges.

We can sum up the ROI issue with some data. Back up 20 years, and we find that network budgets were almost equally balanced between “sustaining spending” on current infrastructure and “project spending” designed to add business value. Since then, most of the projects have focused on cutting sustaining costs, not adding new business value. That means that networks have been under constant budget pressure for decades now, and we’re probably seeing this exacerbated today because of economic uncertainties.

You can’t expect users to spend more annually to sustain the same set of benefits, particularly if there’s hope of spending less. That hope materializes in things like white-box competition and discount pressure on vendors like Cisco. The only sure way to fix this is to make networks do more, not just cost less.
