The cleaning and decluttering that come with the seasonal change of spring also make us think about a fresh outlook for our data and the storage technologies that will house it in the future.
As springtime arrives, many of us turn to organizing our lives for the new season. Often the focus is a good cleaning that gives our homes and offices a new shine and sparkle. You can also do a bit of a ‘data spring clean’, revisiting your data practices and habits. Seasonal refreshing also makes us think about the technologies that house our data and the new horizons that lie ahead. What kind of storage tech will our digital lives run on in the future?
An explosion of data needs across many different realms of business and society makes it an exciting time for research and development in the storage world. A recent blog post by Vijay Chidambaram, who works in the Systems and Storage Lab research group at the University of Texas at Austin, highlights how new technologies under development are driving change. It’s the season for new growth and beginnings, so let’s look at how new applications with new requirements, as well as breakthrough storage technologies, will bring fresh approaches to the ways we use and store our data.
Persistent memory has finally started to reach the market and provides a next-generation way to achieve both high capacity and low latency. In traditional storage terms, solid-state drives (SSDs) and hard disk drives (HDDs) have always been slow compared to memory, or DRAM (Dynamic Random Access Memory), but DRAM sizes are limited and the data stored is volatile (it goes away when the power does). Products like Intel’s Optane DC Persistent Memory are called storage-class memory because they are accessed like memory, decreasing latency significantly compared to SSDs and HDDs, but can also serve as storage thanks to capacities closer to those of flash-memory technologies. The line between memory and storage is increasingly blurry with this technology, and finding the ideal ‘sweet spot’ for specific implementations could have a significant impact on storage systems going forward.
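To make the “accessed like memory” idea concrete, here is a minimal sketch in Python. On a system with persistent memory, applications typically memory-map a file on a DAX-enabled filesystem and then read and write it with plain loads and stores rather than per-access system calls; the file path below is a made-up stand-in, and an ordinary file is used so the sketch runs anywhere.

```python
import mmap
import os

# Stand-in path; on real hardware, persistent memory might be exposed as a
# file on a DAX-mounted filesystem or a device like /dev/dax0.0.
path = "pmem_demo.bin"

# Create a small backing file (on real pmem this region would be persistent).
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

with open(path, "r+b") as f:
    buf = mmap.mmap(f.fileno(), 4096)
    buf[0:5] = b"hello"   # byte-addressable store: no write() syscall per access
    buf.flush()           # on pmem, flushing CPU caches makes the store durable
    data = bytes(buf[0:5])  # byte-addressable load
    buf.close()

os.remove(path)
print(data)
```

The contrast with block storage is that an SSD or HDD would require a read or write of an entire block through the kernel I/O stack, while the mapped region above is touched one byte at a time.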
Researchers at the University of Southampton have been working with Microsoft to develop a new data format that encodes information in tiny structures within glass. The medium has the potential to provide high density (hundreds of terabytes on a piece only millimeters thick) as well as long-lasting durability, helping with archival needs for the increasing amounts of data the world needs to store and maintain. Traditional formats like hard drives and tape have degradation issues that silica-based glass technology does not. Progress has been demonstrated with a neat proof of concept in conjunction with Warner Bros.: storing (and retrieving) a copy of the 1978 classic film Superman on one of the experimental glass discs. While it won’t come to market any time soon, we may see this become an option later in the decade for cloud storage providers and companies looking to preserve libraries of content.
The technological possibilities go further still: a technique developed by a team at the University of Washington details a system that can encode, store and retrieve data using DNA molecules. DNA efficiently stores all kinds of information about genes and the various elements of our own living systems, millions of times more densely than any current digital storage, so repurposing it to hold digital data opens up a wide range of possibilities to store large amounts (215 petabytes per gram!) and preserve it for hundreds of years. This is done by mapping the strings of 1s and 0s that make up digital data onto the four basic building blocks of DNA sequences and synthesizing those ‘nucleotides’ into molecules in a test tube. Accessing the data afterward, while possible, may suffer latency challenges due to its organic nature, but the density and longevity of DNA storage could make it a viable storage medium down the road.
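The core mapping idea can be sketched in a few lines: every pair of bits corresponds to one of DNA’s four bases. This is only an illustration of the principle; real DNA storage pipelines layer on error-correcting codes and avoid troublesome sequences (such as long runs of a single base), details this toy version skips.

```python
# Toy mapping: 2 bits per base. The pairing chosen here is arbitrary.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a DNA base sequence, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Recover the original bytes from a base sequence."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

strand = encode(b"Hi")
print(strand)          # 'H' = 01001000 -> CAGA, 'i' = 01101001 -> CGGC
print(decode(strand))  # round-trips back to b'Hi'
```

Each byte becomes just four bases, which hints at where the extraordinary density figures come from once those bases are synthesized into physical molecules.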
As new applications are created in the modern computing world, they find new ways to use and challenge storage systems.
Machine learning is a good example of a developing area where new computing methodologies require new approaches to storage. Machine learning is a subfield of artificial intelligence that focuses on analyzing large data sets to build models for use in problem-solving and automation applications. Building those models, known as training, requires a lot of data and a lot of processing power. For this purpose, a specialized processor called a Graphics Processing Unit (GPU) is used instead of the traditional Central Processing Unit (CPU) found in a desktop or server environment, because GPUs are designed to process the large arrays of data common to graphics and scientific computing. In practice, GPUs can consume data so fast that they end up waiting while storage systems fetch the large amounts of data they need. Reducing this latency is a challenge that new technologies narrowing the gap between storage and processor might be able to address, making machine learning more efficient.
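A common software-side mitigation for this stall is prefetching: load the next batch from storage on a background thread while the accelerator works on the current one. The sketch below simulates that overlap with made-up delays standing in for storage fetches and GPU work; none of the names come from a real training framework.

```python
import queue
import threading
import time

def load_batch(i):
    time.sleep(0.01)            # stand-in for a slow storage fetch
    return f"batch-{i}"

def compute(batch):
    time.sleep(0.01)            # stand-in for GPU training work
    return f"trained on {batch}"

def prefetcher(n, q):
    # Runs on its own thread, so storage I/O overlaps with compute.
    for i in range(n):
        q.put(load_batch(i))
    q.put(None)                 # sentinel: no more batches

q = queue.Queue(maxsize=2)      # small buffer of ready-to-go batches
threading.Thread(target=prefetcher, args=(4, q), daemon=True).start()

results = []
while (batch := q.get()) is not None:
    results.append(compute(batch))

print(results[-1])
```

With the buffer in place, a batch is usually already waiting when compute finishes; without it, every step would pay the full fetch latency before any work could begin.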
The burgeoning use of blockchain technology in various applications presents challenges to the way data is stored securely. The blockchain-based platform Ethereum places great importance on secure authentication of the data it stores, which means every bit and byte accessed on the platform requires scrutiny. The decentralized and encrypted nature of blockchain technology means that achieving security verification on every bit requires intensive read and write operations. Of course, when any kind of sensitive information is involved you want authentication every step of the way, but this causes bottlenecks during processing as data is called, authenticated and accessed. Blockchain applications and technologies like Ethereum are ever-evolving, and improving storage systems to support their data security needs is an area of ongoing development.
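To see why every access costs extra I/O and hashing, here is a minimal sketch of the hash-tree pattern behind blockchain data authentication: chunks are hashed, and the hashes are combined pairwise into a single root that commits to all of the data. (Ethereum actually uses a more elaborate Merkle Patricia trie; this plain binary Merkle tree just illustrates the idea.)

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(chunks):
    """Hash every chunk, then combine hashes pairwise up to a single root."""
    level = [h(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash when the count is odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

chunks = [b"block-0", b"block-1", b"block-2", b"block-3"]
root = merkle_root(chunks)

# Tampering with any chunk changes the root, so a reader holding only the
# root can detect modified data -- at the price of re-hashing on every check.
tampered = [b"block-0", b"block-X", b"block-2", b"block-3"]
print(merkle_root(chunks) == root)
print(merkle_root(tampered) == root)
```

The security property is exactly what creates the storage bottleneck the paragraph describes: verifying a chunk means reading and hashing its way back up the tree, multiplying I/O operations per access.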
Scientific computing has become a huge deal as we look to computers to solve data-intensive problems, but applications are faced with storage technology challenges. High-performance database researcher Spyros Blanas writes:
“Although scientific computing continuously shatters records for floating point performance, the I/O capabilities of scientific computers significantly lag on-premise datacenters and the cloud.”
In many systems it is storage, not processor speed, that causes slowdowns. Depending on the requirements – some scientific computing needs to access data randomly from very large sets, while other work reads records in order from small areas – the challenges presented and the approaches taken differ. Scientific computing may have different needs than a business storage system that is simply archiving large quantities of data, and evolving storage technologies may work to address those needs as we continue to tackle new areas of research.
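The two access patterns just mentioned can be sketched side by side: sequential reads walk through a contiguous region, while random reads seek all over a large file. The record size and file name below are invented for illustration; the sketch demonstrates the patterns themselves, not their very different costs on real hardware.

```python
import os
import random

path = "dataset.bin"
record = 64                                  # made-up fixed record size in bytes

# Build a small stand-in dataset of 1000 fixed-size records.
with open(path, "wb") as f:
    f.write(os.urandom(record * 1000))

with open(path, "rb") as f:
    # Sequential pattern: read the first 10 records in order.
    sequential = [f.read(record) for _ in range(10)]

    # Random pattern: seek to 10 arbitrary records scattered across the file.
    rng = random.Random(0)                   # fixed seed for repeatability
    random_reads = []
    for i in rng.sample(range(1000), 10):
        f.seek(i * record)
        random_reads.append(f.read(record))

os.remove(path)
print(len(sequential), len(random_reads))
```

On an HDD the random pattern pays a mechanical seek per record, while an SSD narrows the gap considerably, which is one reason the right storage technology depends so heavily on the workload.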
The technologies that house our data and the new applications that utilize it are changing. Modern advances in storage technology, from memory to new materials, will combine with improvements to storage system approaches. As a new spring arrives at the start of a new decade, we can find many areas where applications will require these new technologies. We can envision many new horizons for the way we store and access our data, and one fact remains in this digital world: it’s the data that’s important!