Distributed File Storage System For Machine Learning: Interview with Bjorn Kolbeck, the CEO of Quobyte

Quobyte offers what it describes as the first distributed file system with a TensorFlow plugin. With Quobyte software, you can store data for every stage of machine learning on one high-performance system. Björn Kolbeck, CEO of Quobyte, tells us more in this interview.

Could you please tell us a little more about yourself?

I am Björn Kolbeck, the CEO of Quobyte. I became involved with Quobyte after employment with Google, where my colleague Felix Hupfeld and I noticed the ease with which surprisingly small teams were operating the storage infrastructure fueling the tech giant’s servers and services. After conducting research and development for the distributed file system XtreemFS, we realized the approach would allow for additional innovative solutions.

The file system framework decouples storage from the physical hardware, liberating it by using smart algorithms. The result, through the development efforts at Quobyte, is a unified software storage system that is self-managing, reliable and scalable.

What is Quobyte?

Building on a decade of research and experience with open-source distributed file system technology and from working on Google’s infrastructure, Quobyte delivers on the promise of software defined storage for the world’s most demanding application environments including High Performance Computing (HPC), Life Sciences, Financial Services, and Electronic Design Automation (EDA). Quobyte’s hyperscaler parallel distributed file system unifies file, block, and object storage, allowing customers to easily replace storage silos with a single, scalable storage system to significantly save manpower, money, and time spent on storage management.

What unique features and services is Quobyte bringing to the Machine Learning market?

Quobyte allows computing professionals to move custom machine learning models into production quickly to better support these unique workloads from the data center to the cloud to the edge. The system can be used to train models locally on sample data sets and use the Google Cloud Platform for training at scale because Quobyte runs on-prem and in the cloud. Additionally, because it bypasses the kernel entirely, Quobyte works with both current and older versions of Linux, providing a full range of flexible deployment options for use in ML.

What is the market size of the Machine Learning market and why are Quobyte services in demand?

International Data Corporation (IDC) forecasts that spending on AI and machine learning will grow to $57.6 billion by 2021. Additionally, Deloitte Global predicts that enterprises will continue increasing their research, investment, and piloting of machine learning programs. This reflects how machine learning is sharpening companies' insight into how to grow faster and more profitably.

As more and more businesses look to leverage ML to increase innovation, achieve a faster time to market and provide a more positive customer experience, there is an increasing need for storage infrastructures that offer the higher performance and flexibility these workloads need. Quobyte offers high performance, broad platform support and flexible deployment options that are well positioned to help companies handle bigger data sets, achieve more accurate results and run ML workloads in any environment.

Tell us more about the High-Performance Computing and Big Data storage problems and the solutions from Quobyte?

Because of the complexity and unique throughput requirements of machine learning and big data environments, very expensive, customized storage systems have been required to meet the needs of these workloads. With Quobyte, such systems are no longer required to get the most out of these application areas. Quobyte is a single storage system that addresses many different performance profiles, including the high-throughput, low-latency requirement of ML’s model training stage, as well as large block sequential, small block random or mixed general workloads.

Quobyte supports the broadest set of access protocols and clients, such as S3, Linux, Hadoop, Windows and NFS for greater platform flexibility and more complete data ingest and preparation. Data is readily available at any stage all within a single global namespace and all managed through Quobyte’s intuitive management console.

Tell us more about Quobyte’s one storage system for all stages in a machine learning workflow?

With Quobyte, operators have the ability to leverage HDD and SSD to get the best price-performance ratio without cumbersome tiering. Additionally, because much of the machine-generated data uses a sequential naming convention that makes it ideal for prefetching, the prefetching of training data by Quobyte allows for substantial performance improvements over alternative approaches.
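To illustrate why sequential naming lends itself to prefetching, here is a minimal Python sketch. It is not Quobyte's implementation, which the interview does not describe; the function name `predict_next_names` and the file names are hypothetical. It simply guesses which files are likely to be read next by incrementing the last run of digits in a file name, which is the kind of hint a prefetcher could act on.

```python
import os
import re


def predict_next_names(name, count=3):
    """Guess the next `count` file names in a numbered sequence.

    Hypothetical helper for illustration only: it increments the last
    run of digits in the file stem and keeps the zero padding.
    """
    stem, ext = os.path.splitext(name)
    runs = list(re.finditer(r"[0-9]+", stem))
    if not runs:
        return []  # no counter in the name, nothing to predict

    last = runs[-1]
    start, end = last.span()
    width = end - start          # preserve zero padding, e.g. 0098 -> 0099
    number = int(last.group())

    return [
        f"{stem[:start]}{number + step:0{width}d}{stem[end:]}{ext}"
        for step in range(1, count + 1)
    ]


if __name__ == "__main__":
    # A reader that just opened img_0098.jpg would likely want these next.
    print(predict_next_names("img_0098.jpg"))
```

A real prefetcher would also have to decide how far ahead to read and when to stop if the guesses turn out wrong; this sketch covers only the name prediction step.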

The platform’s infinite scalability allows users to grow storage in terms of throughput and capacity when they need it and the system can adapt as ML project requirements change – oftentimes more quickly than anticipated. In an ML deployment, disks or servers can be quickly and easily added when needed to provide more capacity or performance without any interruption to applications or services and the system’s multi-tenancy provides strong security by allowing users to define isolated namespaces and physical separation of data/workloads inside the same cluster.

Tell us more about Quobyte’s solution for Financial Services?

In one particular use case, Quobyte is supporting High Frequency Trading (HFT) infrastructure, where nanosecond-driven applications literally convert time into money. Because financial service providers' profits are being squeezed by similarly equipped rivals, deploying the most powerful storage system available puts market participants in an effective “arms race” against their competition. Quobyte helps customers stay ahead in this ultra-competitive market by maximizing IOPS while driving down latency.

Capable of supporting simultaneous parallel file access from hundreds of compute nodes, a Quobyte storage cluster significantly reduces time spent on backtesting. The system is far easier to operate than a number of alternatives and delivers unlimited data capacity, non-disruptive live updates, and 1,000+ node scalability, so that financial services operations never outgrow the Quobyte investment.

Could you tell us about your team and customer support?

The Quobyte team includes the industry’s most educated experts on the subject of HPC-class software defined storage, helping business customers to understand the advantages available to them as users of this unique technology. The company’s global support team is on call for enterprise customers should their expertise be required when configuring high-throughput storage environments for their data-intensive application requirements.

How safe is Quobyte? Would you like to talk about your legal and security measures?

Quobyte storage is designed for highly secure and compliant environments and includes both IP-based and X.509-based access controls, integrated certificate management, management authentication via LDAP, OpenStack Keystone or integrated user management, S3 authentication via LDAP, OpenStack Keystone or password files, multi-tenancy on the logical and physical level, at-rest data encryption, and NFSv4 ACLs.

Do you have more information for our readers?

In addition to machine learning applications, the Quobyte data center file system is ideal for HPC and big data, life sciences and bio IT, container infrastructures, virtual machine infrastructures and OpenStack, and is a preferred platform for multi-tenanted environments deployed by hosting and IT service providers.

For more information, visit the Quobyte website.
