Generative AI spurs new demand for enterprise SSDs


Generative AI (gen AI) will propel the next S-curve for the semiconductor industry (“Generative AI: The next S-curve for the semiconductor industry?,” McKinsey, March 29, 2024). Realizing the projected productivity gains from gen AI across the economy, up to 2.1 percent of total global revenues (The economic potential of generative AI: The next productivity frontier, McKinsey, June 14, 2023), will require significantly more computing power, servers, and memory, all of it dependent on chips.

Sixty-five percent of enterprises have adopted gen AI applications, up from 33 percent in 2023 (“The state of AI in early 2024: Gen AI adoption spikes and starts to generate value,” QuantumBlack, AI by McKinsey, May 30, 2024). As adoption has increased, high-bandwidth memory (HBM) has emerged as a core semiconductor component for gen AI servers. However, low-latency, nonvolatile NAND flash memory, already the preferred storage medium for gen AI workloads, is equally vital. NAND chips are assembled into solid state drives (SSDs) to store text, images, videos, and other file types in gen AI servers. This article focuses on the enterprise SSD (eSSD) portion of the market: SSDs used within servers and data storage units (see sidebar, “Forecasting methodology”).

Gen AI’s widespread implementation is boosting eSSD sales because SSDs outperform hard disk drives (HDDs) in speed, latency, reliability, and energy efficiency. As AI data proliferates, demand for multiple secure, localized copies of that data creates a flywheel effect, further accelerating eSSD market growth. This symbiotic relationship between gen AI and eSSD storage is reshaping the data storage landscape.

Training and inference fuel fast growth for eSSDs

Under our baseline scenario, the total eSSD market will grow 35 percent annually, from 181 exabytes (EB) in 2024 to 1,078 EB in 2030, driven primarily by the gen AI training and inference segments (Exhibit 1). The largest driver of this growth is the deployment of AI inference servers and retrieval-augmented generation (RAG) databases as large companies adopt gen AI use cases. SSD content per inference server is also expected to multiply as models become larger and more complex. Finally, SSDs are expected to gain market share from HDDs, given their superior performance for AI workloads.
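As an arithmetic check, the 35 percent figure is the compound annual growth rate implied by the endpoint projections over the six-year span:

(1,078 EB / 181 EB)^(1/6) − 1 ≈ 0.35, or 35 percent per year

The same calculation reproduces the segment growth rates cited below: 62 percent per year for training (127 EB in 2030 from 7 EB in 2024) and 105 percent per year for inference (447 EB from 6 EB).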

Exhibit 1: AI inference will drive enterprise solid state drive market growth to 2030.

Advances in AI models fuel training segment growth

Training is the process of teaching gen AI models to make predictions or inferences by identifying patterns in curated data sets. SSD demand to support training is accelerating as the number, size, and complexity of large language models (LLMs) and their training data sets increase. We project this market to grow 62 percent per year, from seven EB in 2024 to 127 EB in 2030. Including both training servers and external databases, we expect training server and storage unit shipments to double, from 0.2 million in 2024 to 0.4 million in 2030, as more tech-forward companies train models of growing size and complexity.

Many experts expect LLM parameters (the factors that define how LLMs operate) to double every three years before plateauing in 2030, when the supply of high-quality text-based training data is exhausted. Although LLM size varies widely, we expect an average LLM training server will need 100 terabytes of SSD storage in 2030, up from 30 terabytes today. Beyond 2027, the expert consensus is that multimodal models (MMMs), which generate images, video, audio, and text, will enter the mainstream. MMM training servers will need significantly more SSD storage, likely hundreds of terabytes, depending on the mix of video and images that must be stored and accessed. For both MMMs and LLMs, we expect the external databases used to store raw training data will also require hundreds of terabytes of SSD storage.
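To make the per-server figures concrete, the back-of-the-envelope sizing sketch below translates parameter counts into terabytes. The parameter count, precision, optimizer multiple, and number of retained checkpoints are illustrative assumptions, not inputs to our forecast:

```python
# Back-of-the-envelope SSD sizing for an LLM training server.
# All inputs are illustrative assumptions, not forecast parameters.

def checkpoint_size_tb(params: float, bytes_per_param: float = 2.0) -> float:
    """Size of one model checkpoint in terabytes (FP16 = 2 bytes/param)."""
    return params * bytes_per_param / 1e12

def training_storage_tb(params: float,
                        optimizer_multiple: float = 6.0,
                        checkpoints_retained: int = 10) -> float:
    """Rough SSD footprint: retained checkpoints plus optimizer state.

    optimizer_multiple approximates optimizer state as a multiple of the
    FP16 weights; for Adam with FP32 master weights plus two moments,
    state is roughly 12 bytes per parameter, six times the 2-byte weights.
    """
    ckpt = checkpoint_size_tb(params)
    return ckpt * checkpoints_retained + ckpt * optimizer_multiple

# A hypothetical 1-trillion-parameter model:
print(checkpoint_size_tb(1e12))   # ~2.0 TB per checkpoint
print(training_storage_tb(1e12))  # ~32 TB before staged training data and logs
```

Under these assumptions, a single trillion-parameter model’s checkpoints and optimizer state alone occupy tens of terabytes before any training data is staged locally, consistent with the roughly 100-terabyte-per-server trajectory described above.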

SSDs’ high-speed read capabilities make them indispensable for AI training use cases. They are also crucial for storing model checkpoints—backups that safeguard against system failures. Using SSDs instead of slower HDDs for checkpoints significantly reduces training interruptions, thereby optimizing the overall AI model development process.
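As a minimal sketch of this checkpointing pattern, assuming a PyTorch training loop (the mount point /nvme/checkpoints and the save interval are hypothetical):

```python
import torch

def save_checkpoint(model, optimizer, step, path="/nvme/checkpoints"):
    """Write a recoverable training snapshot to fast local SSD storage.

    Writing to a local SSD keeps the pause in training short; the file
    can then be copied to slower bulk storage asynchronously.
    """
    torch.save(
        {
            "step": step,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        },
        f"{path}/step_{step:08d}.pt",
    )

# Inside the training loop, checkpoint every 1,000 steps (illustrative):
# if step % 1_000 == 0:
#     save_checkpoint(model, optimizer, step)
```

The faster each snapshot completes, the shorter the interruption to training, which is why checkpoint storage speed matters as much as capacity.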

Inference segment growth propelled by enterprise adoption

Most large organizations will adopt inferencing, the process of using a trained model to analyze data and generate responses and content. We project demand for inference servers will grow 105 percent per year, from six EB in 2024 to 447 EB in 2030. The most significant driver is inference server and storage unit shipments (including RAG), which are expected to increase more than tenfold over this period, from 0.3 million to 3.6 million, as large companies across sectors adopt inferencing. A portion of these companies are also expected to implement RAG to improve the accuracy and specificity of their model outputs.

Most shipments will go to hyperscalers—leading cloud service providers—that are heavily investing in inference servers for enterprise customers. SSD demand in this segment will come from three sources:

RAG. RAG is a nascent technology that assembles companies’ own data into vectorized databases, which models then consult at inference time, improving the accuracy and specificity of outputs; a minimal sketch of this retrieval flow appears after this list. For example, a pharmaceutical company can conduct drug discovery more effectively using a model that accesses 20 years of proprietary drug discovery research. RAG requires two forms of storage: active storage, a large repository of a company’s “useful” data, and vector databases, which organize active data before it is fed to LLMs. Gen AI experts estimate that about 25 percent of large enterprises will adopt RAG by 2030, although uncertainty remains because the technology is still in the early stages of deployment. Bit demand is negligible today but could grow to 260 EB over this period at that adoption rate. A major driver of growth is duplication: both active storage and vector databases are incremental copies of a company’s existing data, organized into a new format. These databases are also duplicated across geographic regions so that inferencing can be conducted locally at speed, significantly increasing storage demand.

Content created by inferencing. We project storage demand of 141 EB in 2030, with 90 percent earmarked for storing AI-generated entertainment media (for example, videos and images used by online content creators). Because MMMs cannot yet generate video quickly at scale, this forecast is harder to pin down.

Inference servers. Inference servers require less storage than training servers because they store model parameters, not training data. We project storage per inference server will grow from five terabytes today to 35 terabytes in 2030, translating to 46 EB of bit demand. Even then, storage per server will remain below the 90 terabytes needed for training servers.
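To illustrate the retrieval mechanics behind RAG, referenced in the first item above, the following is a minimal, self-contained sketch. The toy corpus and hashed bag-of-words embedding are stand-ins for a company’s active storage and a production embedding model; a real deployment would use a dedicated vector database:

```python
import numpy as np

# Toy corpus standing in for a company's "active storage."
DOCS = [
    "Compound A showed strong binding affinity in 2019 trials.",
    "Compound B failed phase I due to hepatotoxicity.",
    "Manufacturing yield for compound A improved 12% in 2021.",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hashed bag of words, normalized to unit length.

    A real deployment would call an embedding model; this stand-in only
    preserves word overlap, which is enough to demonstrate retrieval.
    """
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# "Vector database": precomputed embeddings of the active store.
INDEX = np.stack([embed(d) for d in DOCS])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine score)."""
    scores = INDEX @ embed(query)
    return [DOCS[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved passages would be prepended to the LLM prompt.
print(retrieve("binding affinity of compound A"))
```

Note that both the document copies and their embeddings live alongside the company’s original data, which is the duplication effect driving the incremental bit demand described above.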

Alternative scenarios would affect NAND demand

Gen AI is a new and fast-moving technology that may evolve in unpredictable ways. Variations from our baseline assumptions could lead to alternative upside or downside scenarios (Exhibit 2).

Exhibit 2: The size of the enterprise solid state drive market in 2030 will depend on which of three scenarios materializes.

Gen AI server deployment. Companies across the economy could rein in IT capital expenditures if they are short on cash or if high interest rates persist. Likewise, current constraints on data center deployment (particularly access to power) could scale back hyperscalers’ server deployments. Last, stricter data protection regulations could make compliance difficult for enterprises, inhibiting adoption.

RAG adoption. RAG will be the largest driver of inference-related eSSD demand, but the technology is still nascent. In our upside scenario, 40 percent of enterprises adopt RAG to support inferencing; in our downside scenario, only 15 percent do, as enterprises struggle to achieve a return on investment.

MMM adoption. In our base-case scenario, MMMs remain in the R&D phase until 2028. The maturation of these models will grow the eSSD market for training servers, databases, and content storage. In our downside scenario, MMMs are not commercially adopted until 2030.


Under any scenario, we project sustained market growth, supporting an optimistic outlook for NAND and eSSD manufacturers. However, achieving this growth will require continued at-scale investment in gen AI infrastructure from hyperscalers and enterprises.
