After completing a $100 million Series C financing round, VAST Data, a New York City data storage company with customers in life sciences and other verticals, has ascended to the proverbial rarefied air of “unicorn” startups with a valuation of $1.2 billion.
Yet despite soaring financially, VAST Data’s ambitions are more down to earth. The developer of an all-flash data storage platform—the world’s first to use quad-level cell optimized flash architecture—says it will use the new funding to expand operations overseas, grow its workforce, and step up development of innovative products.
The result is a platform that is both fast and “extremely large and scalable,” while also being cost-effective, according to VAST Data CEO and co-founder Renen Hallak.
“Wherever customers require fast access to a lot of data, that’s where we shine. And in the life science space, we see that in research institutes around genomics research. We see that with cryo-EM [cryo-electron microscopy] data, with electromagnetic microscopes. We see that with brain imaging research that a few of our customers perform,” Hallak tells GEN Edge.
VAST Data has 146 employees right now and is seeking to fill 100 open positions. “Once that happens, then we’ll hire more. I expect us to double ourselves over the next year or two,” Hallak tells GEN Edge.
Hallak says the company seeks to grow beyond the U.S. With its first sales team in the U.K., the company will expand in Europe, then Australia, South Africa and Japan. “Wherever there are organizations that have a need for large scale storage with a very high performance, we intend to be over the next two years.”
VAST Data plans to expand overseas through its “channel partners” or value-added resellers: “100% of our deals were done through a reseller in the US and we expect to continue that as we expand,” Hallak says. “We intend to partner with them and leverage their relationships and abilities in order to require less of our people on the ground.” Typically, VAST Data expands into a market by installing a sales team consisting of a salesperson and a sales engineer. According to Hallak, many more such teams will be created as the company expands internationally.
Same proportion, no more tiers
Hallak said the company intends to maintain up to half its total business in the life sciences and genomics space. VAST Data specializes in large-scale storage systems designed to render the hard drive and storage tiering obsolete by making flash infrastructure affordable for all classes of data.
“Our average sales price so far has been above a million dollars,” Hallak said. Pricing depends on the customer, the data being stored and the size of the data deployment. “We start at a petabyte and we grow into tens and hundreds of petabytes. The potential is to grow into the exabyte range as well under a single namespace, and at the same time provide very good access to that data even though it’s so much capacity.”
“Rather than spending most of their effort on shuffling data back and forth between a fast tier and a cheap tier, [VAST] decided to create a single tier out of cost-effective solid-state devices. This led to a very flat, scalable architecture that completely omitted one of the most challenging pieces of most storage platforms,” said Chris Dwan, an independent technology consultant whose experience includes IT leadership roles at the New York Genome Center and the Broad Institute.
In data storage, Dwan explains, the usual assumption is that low-cost data storage devices are slower than their more expensive cousins, leading to architectures where the bulk of capacity is provided using slower, high-capacity, low-cost (per terabyte) devices. To obtain performance, relatively small amounts of expensive, high-speed capacity are added.
“This leaves storage engineers with the challenge of shuffling data back and forth between a fast vs. a ‘cheap’ tier. While this challenge exists within single storage solutions, it also manifests when enterprises buy different products and spread their data across multiple technology stacks. Employees then have to guess future usage patterns and to place data appropriately,” Dwan said. “Consumer grade (cheap) solid state storage have comparable latency and streaming read performance with higher end equipment. What they lack is the ability to erase and re-write data very many times.”
Hallak says VAST makes a trade-off between price and performance in a way “that really allows our customers to transform the way they do business. It reduces the amount of time that they need to wait for their computation significantly, and it allows them to run those analytics processes on a lot more data than they could before, which qualitatively improves the results.”
Buying increased durability
“The insight that first got my attention about VAST is that solid-state storage follows a different pattern. Rather than more money per terabyte buying you additional speed, it buys increased durability in terms of re-writing data on the same device,” Dwan added.
What this means for life sciences users is that VAST Data’s storage systems are designed to enable bioinformatics applications at any scale to benefit from the speed, input-output (I/O) consistency, and simplicity of all-flash parallel file system storage. The company’s systems eliminate the tiering of data across a complex, pyramidal hierarchy of storage systems, each focused on either fast I/O or large capacity. They replace that hierarchy with a single system designed to be as fast as, or faster than, today’s tier-one all-flash systems, yet cost-effective enough that all of a customer’s life science research data can be retrieved from one fast tier of flash.
VAST’s data storage negates the oft-heard argument that cloud solutions are always better, faster, and cheaper than on-premises solutions. “VAST came up with a very compelling financial model that had many of the positive features of public clouds, including year over year price improvements, elastic pricing based on usage, and continual improvement of their technology,” Dwan said. “It was really that, business acumen, that re-opened the door to an apples-to-apples comparison of the technology.”
VAST Data’s systems incorporate Intel’s 3D XPoint (3D Cross-Point), a non-volatile memory technology intended to make the company’s systems faster than alternative flash systems at a cost on par with, or lower than, hard drive-based systems.
NIH, Harvard Med School, and Ginkgo
Among VAST Data’s dozens of customers are big-name life sciences institutions and companies, including the NIH, Harvard Medical School, and Ginkgo Bioworks, whose cell programming platform specializes in synthetic biology applications through the design of custom microbes for customers across multiple industries.
Ginkgo was VAST Data’s first customer. The companies connected when one of VAST Data’s channel partners was helping Ginkgo build out its infrastructure. The partner told Ginkgo about a proof-of-concept deployment of VAST Data’s storage system at “one of the major life science organizations in the Boston area,” which VAST won’t name.
“Ginkgo heard that we were running a beta there, and they obviously talked among colleagues and what they told that reseller was that they heard VAST was the future, and they don’t want to invest in anything that’s the past,” Hallak recalled. “We were actually not generally available at that point in time, so they waited four or five months until our GA [general availability] date, so that they could be the first one to buy us.”
Dave Treff, Ginkgo’s Head of Information Technology and Development Operations (DevOps), said that just as his company needs to work with laboratory equipment manufacturers on equipment that won’t be available for two or three years, “it’s very natural for us to want the same thing with our high speed storage.”
“That’s why we went to VAST, because its system was newer. It looked like they were solving the problem correctly, that is as a communications problem rather than as a memory problem. They were doing it all in software, and the price point was very compelling,” Treff said.
“Very, very helpful”
Ginkgo first used VAST Data’s system for high-speed storage of its Jupyter notebooks, open-source web applications that allow users to create and share documents that contain live code, equations, visualizations and narrative text. “They were very, very helpful and very attentive to us with support while we were doing that,” Treff said of VAST Data. “Their support has been top notch.”
Ginkgo last year raised $290 million in Series E financing, with plans to use the capital toward expanding synthetic biology applications for its cell programming platform.
Another VAST Data customer is Zebra Medical, whose software analyzes medical imaging data in real time with human-level accuracy using a proprietary database of millions of scans.
Headquartered in New York City, VAST Data was founded in 2016 by Hallak, who was formerly XtremIO VP of R&D, and two other co-founders: Shachar Fienblit, VP of R&D, who previously served as Kaminario’s CTO; and Jeff Denworth, VP of Products, who most recently led marketing for CTERA. The company began shipping its first GA product in November of 2018, and formally launched in February 2019.
VAST Data’s Universal Storage system has had five major feature releases since product launch. The fifth came on April 30, when the company launched Version 3.0 of Universal Storage with more than 20 new features and updates designed for easier deployment and scaling, enhanced user behavior monitoring, and performance improvements:
- SMB support for Windows and macOS applications: VAST has developed its own SMB server stack to power high-throughput Windows and Mac applications, a solution designed to give enterprise customers seamless multi-protocol access between NFS and SMB as well as the performance needed by media and entertainment organizations.
- Cloud-based backups: VAST’s new Snap-to-Object feature allows customers to protect their critical data assets by snapshotting data to another VAST system, an on-premises S3 storage system, or the cloud service of their choosing.
- Native encryption at rest: Version 3.0 encrypts user data at rest using FIPS-class AES-256 when it is stored to 3D XPoint and QLC flash.
- Enhanced data reduction for unstructured data: This release enables VAST to showcase its unique Similarity-Based Data Reduction and delivers dramatic storage efficiency gains to customers who have never experienced any data reduction from their legacy storage options, a change that VAST says will make flash affordable for all applications.
VAST Data has earned several trade press honors, including Storage Magazine and SearchStorage’s 2019 Storage Product of the Year, inclusion in CRN’s 10 Hottest Data Storage Startups of 2019, and Channel Partners 2020 Channel Influencer.
The $100 million in Series C financing has more than doubled VAST Data’s total funding, to $180 million. Most of that capital, the company’s “war chest” of $140 million, is available to satisfy global customer demand for next-gen infrastructure and to enable data-driven applications through continued innovation, Hallak said. The company has yet to tap into its $40 million in Series B financing raised in February 2019.
“This expanded war chest lays out a clear path to break even, allowing us to thoughtfully invest in our business and maintain forward momentum, to innovate and capture market share across boom times as well as times of market turbulence,” Denworth stated on the company’s blog.
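The funding figures reported above can be cross-checked with a little arithmetic. The short Python sketch below uses only the dollar amounts given in the article. It assumes, as the text implies but does not state outright, that the $140 million war chest is the new Series C money plus the untouched Series B. The function and variable names are illustrative only.

```python
# Figures in millions of US dollars, as reported in the article.
SERIES_C = 100          # new Series C round
TOTAL_RAISED = 180      # total funding after the Series C
UNSPENT_SERIES_B = 40   # Series B money the company says it has not touched
WAR_CHEST = 140         # available capital cited by the company


def funding_summary(series_c, total_raised, unspent_series_b):
    """Derive pre-round funding, the growth multiple, and available capital.

    Assumption (not stated in the article): available capital equals
    the Series C proceeds plus the unspent Series B.
    """
    raised_before = total_raised - series_c
    multiple = total_raised / raised_before
    available = series_c + unspent_series_b
    return raised_before, multiple, available


raised_before, multiple, available = funding_summary(
    SERIES_C, TOTAL_RAISED, UNSPENT_SERIES_B
)
print(f"Raised before Series C: ${raised_before}M")
print(f"Total funding multiple: {multiple:.2f}x")
print(f"Available capital: ${available}M (reported: ${WAR_CHEST}M)")
```

Under that assumption, the derived figures agree with the article: funding before the round was $80 million, so the total "more than doubled," and the available capital matches the reported $140 million.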
Next47, a global venture firm backed by Siemens, led the Series C financing, with participation by past investors as well as new investors—including 83North, Commonfund Capital, Dell Technologies Capital, Goldman Sachs, Greenfield Partners, Mellanox Capital and Norwest Venture Partners.