Achieving Hyperscale Analysis: A Deep Dive with Ocient

*Hyperscale analytics comes with a unique set of challenges. Ocient outlines what that means.*

What does it take to manage increasingly complex data processing requirements? The kind that artificial intelligence, machine learning, and 5G need? CDInsights’ Elisabeth Strenger sat down with Chris Gladwin, co-founder and CEO of Ocient, and Jenna Boller, Director of Marketing, to talk about where hyperscale analysis stands now and how companies can balance the ever-present question of cost with what it takes to process truly massive amounts of data.

CDI: One of the things that RTInsights and CDInsights are very interested in is the changes in the requirements for performing AI-based actions in an application. What is your experience with what customers require in terms of access rates and results speeds?

Chris Gladwin: This is something Ocient is really good at. When I started my career, the very first thing I did out of college was work for a big aerospace company. I spent a lot of my career as a “professional customer,” evaluating IT products for enterprises and understanding what enterprises need.

It’s hard for customers to express their requirements, and it’s hard to hear and interpret them. But I think that’s what we’re good at. Most companies are trying to commoditize everything and put it in a cloud form where you can swipe a credit card, and two minutes later, you have a storage container, and off you go. That’s fine, and a lot of the market wants that.

But there are also times when requirements are complex, and it takes a high-bandwidth dialogue. Our average customers are multimillion-dollar customers, and we spend hundreds of hours with them to understand what they want. So our focus is really on hyperscale analysis. The requirements [our customers] were asking us to meet were unmet: analyzing data at hyperscale, which means not just storing a petabyte but analyzing a petabyte, on average, every time they run a query.

So we focus on “interactive time,” which means the time that an analyst or person or an application is waiting for the answer.

CDI: Having been in the real-time operating system space in my early days at Red Hat with IoT, I know the formal definitions of real-time are very different from what people are seeing now. So I love the idea of “interactive time.” I used to use something called operational time, which is even slower, right?

Gladwin: They’re talking like the speed of light for real time. And that’s easy to do when you’re pulling a value out of an index, or it’s in your results cache. What’s hard is when that isn’t the case.

When you look under the hood at the database, it’s not “go to this location, grab the value, get it back.” It’s: “Oh, I gotta have a million parallel tasks that get kicked off, and then there’s going to be all these little intermediate result sets that get added together.”

And you have to do all that, and 2.7 seconds later, here you go. That’s what we do. That’s novel. Just five years ago, the hardware that would enable that software to do that did not exist. Yeah, you could solve these problems with billion-dollar supercomputers. That’s easy. But most people don’t have a billion dollars or $500 million to spend on a supercomputer.

So then what do you do? You need parallelization at every layer of your stack, starting down at the memory allocator and going all the way up to the SQL prompt. Even on a single small cluster, you might have a million parallel tasks in flight. That is a very, very different architecture than anything else that was built before.
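As an editorial illustration of the pattern Gladwin describes (many parallel tasks whose small intermediate results get added together), here is a minimal Python sketch. It is not Ocient's implementation; the function names `partial_sum` and `parallel_average` and the `num_tasks` parameter are hypothetical, and a thread pool stands in for a real cluster scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """One task: scan a slice of rows and return an intermediate result (count, total)."""
    return len(chunk), sum(chunk)


def parallel_average(values, num_tasks=8):
    """Split values into slices, aggregate each slice in parallel, then merge the partials."""
    if not values:
        raise ValueError("values must not be empty")
    size = -(-len(values) // num_tasks)  # ceiling division: rows per task
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # Merge step: the "little intermediate result sets" are added together.
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return total / count
```

The key design point is that each task returns a small mergeable summary (a count and a total) rather than raw rows, so the final merge stays cheap no matter how many tasks run.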

Jenna Boller: The other thing that was mentioned was cost. Where we excel is in continuous analysis and movement of data. So not just the queries, but also loading data, extracting it, and sending it somewhere. That’s where the cost can get out of control in the cloud. It’s not just running a report once a day or once a week; it’s constantly using processing power. So [we] build in some layer of predictability because a lot of our customers are not incentivized to run the analytics they need at scale because of the cost.

We’re focused on removing that constraint.

CDI: I did a bit of reading on the Ocient cloud. Can you even do the kind of hyperscale analytics you’re talking about on the three big clouds in the US? Is it even feasible?

Chris: Sometimes it is, but it would be prohibitively expensive.

CDI: Consistency and predictability are vitally important. Is there anything you’d like to say about why something like the Ocient cloud or, if you’re the US government, your own cloud is of value to customers working at hyperscale?

Chris: The predictability of knowing what you’re going to spend is super important. It’s not good for anyone’s career to implement something and then come rolling in with the bill at five times what was budgeted. If it’s usage-based, that can and does happen when you start doing hyperscale analysis. The other is that the cost will be much lower for an Ocient system. One reason is we physically have to do less stuff.

We think of everything in terms of what we call an entitlement. We look at what all the hardware is physically capable of. You can always solve the problem with money, but how do you optimize costs? Our job as software engineers is to keep everything at 85% capacity, but that costs money. So one thing is you are wildly efficient in using every available resource.

The other thing is there are some techniques that we employ that also save cost — one of them is zero-copy reliability. This was a set of techniques that we developed that Cleversafe, my prior company, has patents on. Instead of making copies, we virtualized data. We encode and decode in real time as you write and read. Copies are expensive, especially when it’s a petabyte or an exabyte. If you want to maintain a given level of reliability, you have to make an accelerating number of copies. It is much cheaper to use math to encode and decode in real time.
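To make the idea of using math instead of copies concrete, here is a toy Python sketch based on single XOR parity. This is an assumption chosen for illustration only: it is far simpler than the patented techniques Gladwin refers to, it tolerates the loss of just one shard, and the names `xor_bytes`, `encode`, and `recover` are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_shards):
    """Return the data shards plus one parity shard (the XOR of all data shards)."""
    parity = reduce(xor_bytes, data_shards)
    return list(data_shards) + [parity]


def recover(shards, missing_index):
    """Rebuild the shard at missing_index by XOR-ing every surviving shard."""
    survivors = [s for i, s in enumerate(shards) if i != missing_index]
    return reduce(xor_bytes, survivors)


stored = encode([b"abcd", b"efgh", b"ijkl"])  # 3 data shards + 1 parity shard
rebuilt = recover(stored, 1)                  # pretend shard 1 was lost
```

In this toy layout, four shards are stored for three shards of data, so the raw storage is 4/3 of the logical size, yet any single lost shard can be recomputed. A full second copy would cost 2 times the logical size for the same single-failure protection.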

We encode and decode data as needed to get reliability like we’re making multiple copies. But generally speaking, the total amount of physical storage we need is not three times or five times. It’s 1.3 times to start with, and that saves a huge amount of money. Our design goal is to have the best price-performance possible. As a result, we were able to do that.
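The arithmetic behind the 1.3 figure can be sketched as follows. The interview gives only the 1.3 times number; the 10 data shards and 3 parity shards below are an assumed, hypothetical layout that happens to produce it, not Ocient's published configuration, and the function names are illustrative.

```python
def replication_overhead(copies):
    """Raw-to-logical storage ratio when keeping full copies of the data."""
    return float(copies)


def erasure_overhead(data_shards, parity_shards):
    """Raw-to-logical storage ratio for a (data + parity) erasure-coded layout."""
    return (data_shards + parity_shards) / data_shards


def raw_petabytes(logical_pb, overhead):
    """Physical storage needed to hold logical_pb petabytes at a given overhead."""
    return logical_pb * overhead


# Hypothetical layout: 10 data shards + 3 parity shards (assumption, not from the source).
coded = raw_petabytes(1.0, erasure_overhead(10, 3))
# Triple replication of the same petabyte.
copied = raw_petabytes(1.0, replication_overhead(3))
```

Under these assumptions, one logical petabyte needs about 1.3 PB of raw storage when encoded, versus 3 PB when stored as three full copies.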

CDI: The early adopters of non-scientific hyperscale computing seem to have been the telcos and ad tech, but which industries do you see as the next to be interested in hyperscale processing and analytics?

Chris: One area we’re seeing is automotive. Every time a new car is produced in a country like the United States, it’s replacing, on average, a car that’s 20 years old and doesn’t make a lot of data. But the new car has hundreds of computers in there, and it makes tons and tons of data.

So that industry is shifting and will increasingly become one of the largest creators of data. The fleet vehicle manufacturers themselves, as they’re transitioning from internal combustion to electric technologies, know everything there is to know about internal combustion engines, but they’re just learning about batteries and electric power. It’s easy to build a car that runs on a battery. What’s hard is manufacturing hundreds of millions of these super high-quality [cars], with lithium that has been mined from different places at different times, under different weather conditions and different driving patterns, and making all that reliable all the time. The only way you can look at all that data is at hyperscale.

Jenna: And there’s a knock-on effect. Once one thing is transformed, things around it get transformed, and then they’re leveraging more data as well.

Chris: When I speak generally on data growth, I always challenge the audience to name one thing where the new version makes less data than the old version. So far, no one’s met the challenge.

It’s this never-ending acceleration. And you can’t stop it. In every enterprise, there’s the line of business people, and they’re like, “Oh my God, this new microscope is amazing. Let’s use it.” And then the IT people always get stuck with the same thing: oh, it’s ten times as much. You don’t get ten times as much budget, but you’ve got to figure it out. That is always the dilemma of IT.

CDI: And metadata is increasing too. That’s completely hidden from the business user. What do you see as the next big trend or challenge that will affect your customers or even your company’s direction?

Chris: 5G. What consumers will experience is a big speed increase, and it opens up all these new capabilities like high-res video. But what the telcos face is the first major upgrade of the whole back-end infrastructure in decades. It’s gonna go faster. They’ll know much more about what’s happening in the network and have much more ability to route traffic, manage it, and optimize it.

But that comes at the cost of the amount of data they have to deal with. It just went up by at least an order of magnitude. So for a telecom network performance optimizer, this is amazing. But then there’s somebody else at the telecom whose job is to be the data analysis platform team, and they’ve got a huge challenge ahead of them. The scale just exploded. So that’s the big thing we see. And I’d argue it’s the largest capital investment in human history. It’s in the trillions of dollars.

I don’t know what number two is. So it’s a big deal, and it affects everything. It affects telcos. It affects airlines. Everything will be affected by 5G.

Chris: I’m going to make one concluding point. What’s exciting for Ocient right now is we spent years creating this new architecture and coming out with an initial product a couple of years ago. But the phase we’re in now is where each release, each customer engagement, offers major new chunks of functionality. So it’s a really fun time for us, and it’s fun for our customers too because [we’re doing] cool new stuff that was never possible.

Jenna: I also just wanted to clarify one thing. We do enable our customers to run in the cloud. When we’re architecting at hyperscale, we have to look for every efficiency. At the petabyte scale, inefficiency obviously can create so much more waste. So whether it’s consolidating more workloads and data into Ocient or not having to copy the data as much, that’s where you can see we’ve done a lot to drive efficiencies. So if you are running Ocient in Google Cloud or AWS, you’re getting the benefit of lowering your cost that way because it’s a more efficient platform engine.

CDI: They don’t only get the benefit in the Ocient cloud; they also receive those benefits in other clouds.

Jenna: We’re deployment agnostic. Our data centers run on 100% renewable energy. We just wanted to open up whatever works best for our customers.