A guide to computational storage: Boosting performance for SSD storage arrays

Sagar Nangare is assistant manager marketing at Calsoft Inc. Calsoft is a leading software product engineering services company specializing in Storage, Networking, Virtualization and Cloud business verticals. Calsoft provides End-to-End Product Development, Quality Assurance Sustenance, Solution Engineering and Professional Services expertise to assist customers in achieving their product development and business goals.

With IoT devices proliferating and fast 5G wireless on the horizon, enterprises are moving towards edge-based infrastructure deployment. Computational storage is a new storage technology powering the storage and compute part of edge infrastructure. With computational storage, enterprises as well as telecom service providers will be able to support a massive amount of data processing, mainly at edge nodes.

A few companies have started offering computational storage solutions to businesses and organisations. Let’s look at the computational storage concept in depth and at how it is backed by communities and tech vendors.

The need for computational storage

Currently, most developments in the technology domain focus on delivering intelligent digital user experiences in real time. This requires the data centre or infrastructure stack to perform at the highest level, equipped with the latest hardware resources and computational processing techniques. Artificial intelligence/machine learning (AI/ML) and analytics techniques are being moved into the data centre to make digital devices intelligent.

As a result, we have seen many new data centre technologies evolve to boost data centre performance. We have also seen legacy HDDs replaced by flash-based SSD arrays; the use of NVMe and FPGAs to speed up data access in storage devices; the use of GPUs in hyper-scale data centres; and so on. Overall, we are witnessing the emergence of High-Performance Computing (HPC) systems that support the processing of huge amounts of data.

This leads to two demands that keep growing as digital transformation moves forward. First, AI/ML and analytics applications need faster access to data than traditional storage systems currently provide.

Second, data processing demands will keep increasing with the growth of IoT and edge computing. Moreover, the huge volume of data generated by 5G networks will further drive IoT and edge use cases.

Although most data centres are now equipped with all-flash storage arrays, organisations still face bottlenecks in meeting the ever-growing processing demands of AI/ML and big data applications.

This is where computational storage comes in.

What is computational storage and why do we need it?

Computational storage is a technique for moving at least some processing closer to, or into, the storage devices themselves. It is also referred to as ‘in-situ’ or ‘in-storage’ processing.

Generally, data has to move between the storage layer and the CPU, which delays the response to input queries. Computational storage addresses this for the real-time processing requirements of AI/ML and analytics applications. Such high-performance computing applications can be hosted within the storage itself, which reduces resource consumption and costs and achieves higher throughput for latency-sensitive applications. Additionally, computational storage helps reduce the power consumed by data centre resources.
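The effect of moving processing into the storage device can be sketched with a small simulation. This is only an illustration, not a real drive API: the `SimulatedDrive` class, its method names and the fixed 4 KB record size are all assumptions made for the example. It counts how many bytes cross the link to the host when the host filters the data itself, compared with when the drive filters the data and returns only the matches.

```python
from dataclasses import dataclass

# Assumed fixed record size, chosen only for illustration.
RECORD_SIZE_BYTES = 4096


@dataclass
class SimulatedDrive:
    """Hypothetical stand-in for a drive; not a real device interface."""
    records: list
    bytes_sent_to_host: int = 0

    def read_all(self):
        # Conventional path: every record crosses the link to the host.
        self.bytes_sent_to_host += len(self.records) * RECORD_SIZE_BYTES
        return list(self.records)

    def filter_in_storage(self, predicate):
        # In-storage path: the drive applies the filter itself,
        # so only matching records cross the link.
        matches = [r for r in self.records if predicate(r)]
        self.bytes_sent_to_host += len(matches) * RECORD_SIZE_BYTES
        return matches


def is_match(value):
    # Example query: keep one record in every hundred.
    return value % 100 == 0


records = list(range(10_000))

# Host-side filtering: read everything, then filter on the CPU.
host_drive = SimulatedDrive(records)
host_result = [r for r in host_drive.read_all() if is_match(r)]

# In-storage filtering: only the matches are transferred.
csd_drive = SimulatedDrive(records)
csd_result = csd_drive.filter_in_storage(is_match)

print("host-side bytes moved: ", host_drive.bytes_sent_to_host)
print("in-storage bytes moved:", csd_drive.bytes_sent_to_host)
```

Both paths return the same records; the difference lies in how much data has to travel to the host, which depends entirely on how selective the query is.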

The core reason computational storage benefits data centres is the mismatch between the aggregate internal bandwidth of the storage devices and the bandwidth of the PCI links that connect them to the host CPU. To understand how this mismatch arises in a hyper-scale data centre, let’s take the example of the server architecture proposed by Azure and Facebook at OpenCompute.

In this proposed server, 64 SSDs are attached to one CPU host through PCI links. As shown in the proposed server block diagram, the 64 SSDs are connected through PCI links of 16 lanes. Each SSD has 16 flash channels for data access, giving it a total internal flash bandwidth of 8.5 GB/s. Across the 64 SSDs, this adds up to an aggregate internal bandwidth of 544 GB/s (64 × 8.5 GB/s). The bandwidth of the PCI links, however, is limited to 16 GB/s, which is 34 times lower. This is a huge mismatch in the path of data to the host CPU. In such cases, in-situ processing can be applied so that the most critical high-performance applications move to the SSDs.
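The arithmetic behind this mismatch can be checked in a few lines. The figures are the ones quoted for the proposed server; the function and constant names are only illustrative.

```python
# Figures quoted in the text for the proposed server.
NUM_SSDS = 64
PER_SSD_FLASH_GBPS = 8.5   # internal flash bandwidth of one SSD, in GB/s
HOST_LINK_GBPS = 16.0      # PCI link bandwidth to the host CPU, in GB/s


def aggregate_flash_bandwidth(num_ssds, per_ssd_gbps):
    """Total internal flash bandwidth across all SSDs, in GB/s."""
    return num_ssds * per_ssd_gbps


def mismatch_ratio(aggregate_gbps, host_link_gbps):
    """How many times the flash bandwidth exceeds the host link bandwidth."""
    return aggregate_gbps / host_link_gbps


total = aggregate_flash_bandwidth(NUM_SSDS, PER_SSD_FLASH_GBPS)
ratio = mismatch_ratio(total, HOST_LINK_GBPS)

print(f"aggregate flash bandwidth: {total} GB/s")
print(f"host link bandwidth:       {HOST_LINK_GBPS} GB/s")
print(f"mismatch:                  {ratio}x")
```

In other words, the drives can collectively read data about 34 times faster than the host link can carry it, so any work done inside the drives avoids that bottleneck.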

SNIA standards and market development

A global storage community, SNIA, has formed a Computational Storage Technical Work Group (TWG) to promote the interoperability of computational storage devices, and to define interface standards for system deployment, provisioning, management, and security. The TWG includes storage product companies such as Arm, Eideticom, Inspur, Lenovo, Micron Technology, NetApp, NGD Systems, Nyriad, Samsung Electronics, ScaleFlux, SK Hynix, Western Digital Corporation, and Xilinx. 

SNIA has defined the following three standards for implementing computational storage in any type of server, whether in a small or medium-sized enterprise data centre or in a hyperscale data centre.

Computational Storage Drive (CSD): A component that provides persistent data storage and computational services

Computational Storage Processor (CSP): A component that provides computational services to a storage system without providing persistent storage

Computational Storage Array (CSA): A collection of computational storage drives, computational storage processors and/or storage devices, combined with a body of control software
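One way to read the three definitions above is as a simple classification by capability. The sketch below is only an illustration of that reading: the class names, field names and the `classify` helper are hypothetical and are not part of any SNIA interface.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical model of a component, described by its two capabilities."""
    name: str
    persistent_storage: bool
    computational_services: bool


def classify(component):
    """Map a component's capabilities onto the definitions above."""
    if component.persistent_storage and component.computational_services:
        return "CSD"             # storage plus compute
    if component.computational_services:
        return "CSP"             # compute without persistent storage
    if component.persistent_storage:
        return "storage device"  # plain storage, no compute
    return "unknown"


@dataclass
class ComputationalStorageArray:
    """A CSA: a collection of components plus a body of control software."""
    components: list
    control_software: str

    def summary(self):
        # Count how many components of each kind the array contains.
        return dict(Counter(classify(c) for c in self.components))


array = ComputationalStorageArray(
    components=[
        Component("drive-0", persistent_storage=True, computational_services=True),
        Component("drive-1", persistent_storage=True, computational_services=True),
        Component("accelerator-0", persistent_storage=False, computational_services=True),
        Component("ssd-0", persistent_storage=True, computational_services=False),
    ],
    control_software="example-controller",  # placeholder name
)

print(array.summary())
```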

Several R&D efforts are under way, and researchers are developing proofs of concept (POCs) to test the standards defined by SNIA on high-performance computing applications. For example, CSD has been demonstrated with project Catalina.

What's more, some of the core members of SNIA’s Computational Storage TWG have already started offering solutions. These vendors are NGD Systems, Samsung, ScaleFlux, Eideticom, and Nyriad.

Conclusion

Computational storage standards will be a welcome addition, given the growing demand to process data through high-performance computing applications. This type of in-storage embedded processing will come in different forms and approaches, which can be offered with NVMe-based architecture to boost SSD-stacked servers.
