Tonight Nvidia reports. Retail is buying. Institutions are selling. I'm in this for the long term. You should be too.

Nvidia will report excellent earnings this evening — at 5 PM. CEO Jensen Huang will be on Bloomberg Radio and TV shortly after — at 6:30 PM.

The stock market is weird. It may react to Nvidia's earnings by knocking the stock down (it's down today) or boosting it.

I personally don’t care, because I’m in Nvidia for the long term. It’s a core stock holding of mine — my largest one.

The key to keeping it as a core long-term holding is understanding what Nvidia does, why it's going to continue to do well, and hence why I should continue to own the stock.

The past 50 years have seen increasingly powerful computers, and more powerful computers let you do more useful things. I've written extensively about AI's impact on everything from self-driving cars to new drugs invented solely inside a computer.

So?

NO OTHER FIRM has benefited from the boom in artificial intelligence (AI) as much as Nvidia. Since January 2023 the chipmaker’s share price has surged by almost 450%. With the total value of its shares approaching $2trn, Nvidia is now America’s third-most valuable firm, behind Microsoft and Apple. Its revenues for the most recent quarter were $22bn, up from $6bn in the same period last year. Most analysts expect that Nvidia, which controls more than 95% of the market for specialist AI chips, will continue to grow at a blistering pace for the foreseeable future. What makes its chips so special?

Nvidia’s AI chips, also known as graphics processing units (GPUs) or “accelerators”, were initially designed for video games. They use parallel processing, breaking each computation into smaller chunks, then distributing them among multiple “cores”—the brains of the processor—in the chip. This means that a GPU can run calculations far faster than it would if it completed tasks sequentially. This approach is ideal for gaming: lifelike graphics require countless pixels to be rendered simultaneously on the screen. Nvidia’s high-performance chips now account for four-fifths of gaming GPUs.

Happily for Nvidia, its chips have found much wider uses: cryptocurrency mining, self-driving cars and, most important, training of AI models. Machine-learning algorithms, which underpin AI, rely on deep learning, which is built on artificial neural networks. In these networks computers extract rules and patterns from massive datasets. Training a network involves large-scale computations—but because the tasks can be broken into smaller chunks, parallel processing is an ideal way to speed things up. A high-performance GPU can have more than a thousand cores, so it can handle thousands of calculations at the same time.

Once Nvidia realised that its accelerators were highly efficient at training AI models, it focused on optimising them for that market. Its chips have kept pace with ever more complex AI models: in the decade to 2023 Nvidia increased the speed of its computations 1,000-fold.

But Nvidia’s soaring valuation is not just because of faster chips. Its competitive edge extends to two other areas. One is networking. As AI models continue to grow, the data centres running them need thousands of GPUs lashed together to boost processing power (most computers use just a handful). Nvidia connects its GPUs through a high-performance network based on products from Mellanox, a supplier of networking technology that it acquired in 2019 for $7bn. This allows it to optimise the performance of its network of chips in a way that competitors can’t match.

Nvidia’s other strength is CUDA, a software platform that allows customers to fine-tune the performance of its processors. Nvidia has been investing in this software since the mid-2000s, and has long encouraged developers to use it to build and test AI applications. This has made CUDA the de facto industry standard.

Nvidia’s juicy profit margins and the rapid growth of the AI accelerator market—projected to reach $400bn per year by 2027—have attracted competitors. Amazon and Alphabet are crafting AI chips for their data centres. Other big chipmakers and startups also want a slice of Nvidia’s business. In December 2023 Advanced Micro Devices, another chipmaker, unveiled a chip that by some measures is roughly twice as powerful as Nvidia’s most advanced chip.

But even building better hardware may not be enough. Nvidia dominates AI chipmaking because it offers the best chips, the best networking kit and the best software. Any competitor hoping to displace the semiconductor behemoth will need to beat it in all three areas. That will be a tall order.■

The Economist published this piece on February 24, 2024. Since then Nvidia’s market cap has risen to over $3 trillion.

I asked my favorite technology researcher, Richard Grigonis, a question:

What is Nvidia’s Relevance in AI’s Future?

The trend toward smaller, more efficient versions of Large Language Models (LLMs) means that AI models can run on less powerful hardware, not just high-end specialized chips. Under such scenarios, who provides the chips and the hardware? Can a company like Nvidia provide the low-end, mid-range and high-end chips needed for AI going forward? Let’s see…

Nvidia is currently the dominant player in the AI hardware industry, particularly in the realm of Graphics Processing Units (GPUs), which are used for training and running large-scale AI models. Indeed, their GPUs are the backbone of most AI research and commercial applications today. Nvidia’s GPUs provide the almost unbelievable computational power to handle the massive parallel processing required by Large Language Models (LLMs) and other AI tasks.

Nvidia’s Current Position Across Hardware Tiers

High-End Chips: Nvidia’s A100, H100 and H200 GPUs – and now the Blackwell, said to be the world’s most powerful chip – are the go-to hardware for high-end AI applications. These chips are used in data centers around the world to train massive AI models that require extensive computational resources.

Mid-Range Chips: For the mid-range market, Nvidia offers products like the A30 and T4 GPUs, which provide a balance between performance and cost. These are often used for inference tasks and in scenarios where the full power of an A100 isn’t necessary.

Low-End Chips: On the lower end, Nvidia’s Jetson series provides AI capabilities for edge devices, robotics, and IoT (Internet of Things) applications. These chips are more affordable and energy-efficient, making AI accessible for smaller-scale projects, real-time applications, and even PCs and laptops.

Competition and Alternatives

While Nvidia is currently the leader, the landscape is evolving rapidly. Companies like AMD and Intel, along with more recent startups such as Graphcore with its Intelligence Processing Units (IPUs) and Cerebras with its Wafer-Scale Engine (WSE) technology, are developing specialized chips designed for AI workloads. Google’s Tensor Processing Units (TPUs) are another example of hardware specifically tailored for AI tasks, offering competition to Nvidia’s GPUs in certain niches.

Efficiency Trends and General-Purpose Chips

The AI field is also seeing a trend towards developing more efficient models that can run on less specialized hardware. Techniques like quantization, model pruning, and distillation are making it possible to deploy AI models on less powerful, general-purpose chips. For instance, smaller AI models optimized for edge devices might not need the power of a high-end GPU and could run effectively on general-purpose CPUs (made by companies like Intel) or even low-power ARM processors.

We see that the push for more efficient AI models is driven by:

  1. The need to deploy AI in resource-constrained environments (e.g., mobile devices, IoT).
  2. Reducing computational costs for large-scale deployments.
  3. Addressing environmental concerns related to AI’s energy consumption.
  4. Enabling real-time AI applications.

Techniques for improving efficiency include:

  1. Model Compression:

+ Pruning: Removing unnecessary neurons or connections.
+ Quantization: Reducing the precision of model parameters (see the short sketch after this list).
+ Knowledge Distillation: Training smaller models to mimic larger ones.

  2. Architectural Innovations:

+ Designing inherently efficient architectures.
+ Attention mechanisms and sparse computations.

  3. Hardware-Software Co-Design:

+ Optimizing models for specific hardware platforms.
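
To make the quantization idea concrete, here is a minimal sketch using PyTorch's built-in dynamic quantization. The tiny two-layer network and its dimensions are illustrative assumptions of mine, not anything Nvidia or a specific model actually uses; the point is simply that shrinking weights from 32-bit floats to 8-bit integers cuts a model's size roughly fourfold and lets it run comfortably on an ordinary CPU.

```python
# A minimal sketch of post-training dynamic quantization in PyTorch.
# The tiny two-layer model below is a stand-in, used for illustration only.
import os
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.ReLU(),
    nn.Linear(3072, 768),
)

# Replace Linear weights with int8; activations are quantized on the fly.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    """Serialize the model and report its size on disk in megabytes."""
    torch.save(m.state_dict(), "tmp.pt")
    mb = os.path.getsize("tmp.pt") / 1e6
    os.remove("tmp.pt")
    return mb

print(f"fp32 model: {size_mb(model):.1f} MB")
print(f"int8 model: {size_mb(quantized):.1f} MB")  # roughly 4x smaller
```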

Examples of Efficient Models:

  1. MobileNet: Designed for mobile and embedded vision applications, MobileNetV2 reduces computation and model size. Best of all, it can run on smartphones and other low-power devices.
  2. DistilBERT: A computationally “distilled” version of BERT (Bidirectional Encoder Representations from Transformers). DistilBERT is 40% smaller and 60% faster, while retaining 97% of BERT’s performance on language understanding tasks. It also enables deployment of BERT-like models on less powerful hardware (a minimal usage example follows this list).
  3. EfficientNet: This is a family of image classification models that systematically scale width, depth, and resolution. It can achieve state-of-the-art accuracy with fewer parameters than previous models.
  4. Small LLMs: Google’s Gemma models are lightweight, open models derived from the same technology powering the much larger Gemini models. Despite their 2B- and 7B-parameter sizes, Gemma models have demonstrated that even “small” language models can perform well on many tasks.
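
As a quick illustration of the DistilBERT example above, the sketch below runs a publicly available DistilBERT sentiment-analysis checkpoint on a plain CPU using the Hugging Face Transformers library; the sample sentence is made up, and the output shown in the comment is only approximate.

```python
# Running DistilBERT sentiment analysis on a plain CPU with Hugging Face
# Transformers; no data-center GPU is required for inference at this scale.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    device=-1,  # -1 selects the CPU
)

print(classifier("Nvidia's earnings beat expectations again."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```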

Impact on Hardware Requirements:

+ Smaller models require less RAM, making them suitable for devices with limited memory
+ Efficient models can run on less powerful CPUs or low-end GPUs, expanding deployment options
+ Models like MobileNet enable AI directly on edge devices, reducing reliance on cloud computing
+ Less computation translates to lower power consumption, crucial for battery-powered devices
+ Lower hardware requirements make AI more accessible to smaller companies and individual developers
+ Some applications use efficient models for real-time tasks on-device, with occasional use of more powerful cloud-based models

Efficiency and New Technology Change the Landscape

The current explosion in hardware diversity has yielded increased use of mobile Systems-on-a-Chip (SoCs), Field-Programmable Gate Arrays (FPGAs), and specialized AI accelerators alongside traditional CPUs and GPUs.

On the software side, there’s a growing recognition of the importance of optimized inference engines (e.g., TensorRT, OpenVINO) to maximize efficiency on various hardware.
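
To show the load-then-run pattern these engines share, here is a minimal sketch using ONNX Runtime as a simpler stand-in for TensorRT or OpenVINO (which have their own APIs). The file name model.onnx is a placeholder, and the 1×3×224×224 input shape assumes an image-classification model that was exported to ONNX beforehand.

```python
# A stand-in example of an optimized inference engine: ONNX Runtime.
# "model.onnx" is a placeholder path; the 1x3x224x224 input shape assumes an
# image-classification model exported to ONNX beforehand.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)  # shape of the model's first output
```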

As efficiency increases and more new hardware appears, there will be a shift towards more “on-device AI,” thus changing the balance of cloud vs. edge computing.

The diversity in hardware also requires support for heterogeneous computing: systems combining different types of processors to balance performance and efficiency.

While these trends don’t eliminate the need for high-performance hardware in AI research and large-scale applications, they are expanding the range of devices capable of running AI models. This evolution is creating new opportunities and challenges for hardware manufacturers, software developers, and AI researchers alike, driving innovation across the entire AI ecosystem.

How This Impacts Nvidia and Its Future Relevance

All of this raises the possibility that, in the future, as AI models become more efficient, the reliance on specialized AI chips could decrease. This question gets to the heart of Nvidia’s prospects in the rapidly evolving AI landscape. The efficiency trends in AI models obviously have implications for Nvidia’s business model and technological relevance. Let’s analyze this relationship:

  1. Diversification of Product Portfolio:

+ Challenge: As more efficient models can run on less powerful hardware, demand for high-end GPUs might decrease in some segments.

+ Nvidia’s Response: Nvidia has been proactively diversifying its product line. Beyond high-end GPUs, they offer solutions like the Jetson series of embedded computing boards for edge AI and have been developing ARM-based CPUs. This diversification helps Nvidia remain relevant across the spectrum of AI hardware needs.

  2. Software Ecosystem:

+ Advantage: While one tends to focus on its hardware, Nvidia’s software ecosystem (CUDA, cuDNN, etc.) is also a big part of their dominance. Even as models become more efficient, optimizing them for Nvidia hardware often yields the best performance.

+ Future Focus: Nvidia will likely continue to invest heavily in its software stack, ensuring that even smaller models run most efficiently on their hardware.

  3. AI Accelerators:

+ Opportunity: The trend towards efficient models is driving the development of specialized AI accelerators. Nvidia’s expertise in parallel processing puts them in a strong position to develop these.

+ Example: Nvidia’s Tensor Cores, specialized for AI workloads, demonstrate their ability to adapt to changing computational needs (a short sketch of targeting Tensor Cores in software follows this list).

  4. Edge AI and IoT:

+ Challenge: Efficient models enable more AI to run on edge devices, potentially reducing the need for cloud-based GPU clusters.

+ Nvidia’s Strategy: Products like the Jetson series and EGX platform show Nvidia’s commitment to edge AI, allowing them to compete in this growing market. Nvidia has solutions in the world of enterprise edge, embedded edge, and industrial edge. And in April 2024, Nvidia acquired the Israel-based, AI edge-cloud management platform Run:ai.

  5. Research and Development:

+ Continuous Innovation: Nvidia invests heavily in R&D to stay ahead of efficiency trends. They’re not just hardware providers but active contributors to AI research.

+ Collaboration: Partnerships with research institutions and tech companies help Nvidia align their hardware development with emerging AI techniques.

  6. Market Segmentation:

+ High-End Persistence: While efficient models are growing, cutting-edge AI research and large-scale deployments still require powerful hardware, where Nvidia excels.

+ Balanced Approach: Nvidia can cater to both high-performance needs and the growing market for efficient, lower-powered AI solutions.

  7. Energy Efficiency:

+ Growing Concern: As AI’s energy consumption becomes a bigger issue, Nvidia’s focus on improving the performance-per-watt of their GPUs becomes crucial.

+ Green AI: Nvidia’s advancements in energy-efficient computing could position them as leaders in sustainable AI infrastructure.

  8. Inference Optimization:

+ Opportunity: As models become more efficient, the focus shifts from training to inference optimization. Nvidia’s TensorRT and other tools are well-positioned for this shift.

  9. Heterogeneous Computing:

+ Future Trend: As we saw previously, AI systems may increasingly use a mix of CPUs, GPUs, and specialized accelerators. Nvidia’s broad product range and expertise in parallel computing position them well for this scenario.

  10. Acquisitions and Partnerships:

+ Strategic Moves: Nvidia’s attempted $40 billion acquisition of ARM (abandoned in 2022) and other partnerships demonstrate their awareness of the need to adapt to changing hardware requirements.
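
Before turning to the outlook, here is a brief sketch of point 3 above (Tensor Cores): on recent Nvidia GPUs, PyTorch's automatic mixed precision runs half-precision matrix multiplications on Tensor Cores. The matrix sizes are arbitrary and chosen purely for illustration.

```python
# Mixed-precision matrix multiply with PyTorch autocast. On recent Nvidia
# GPUs, float16 matmuls like this are executed on Tensor Cores; the sizes
# below are arbitrary and chosen purely for illustration.
import torch

if torch.cuda.is_available():
    a = torch.randn(4096, 4096, device="cuda")
    b = torch.randn(4096, 4096, device="cuda")

    with torch.autocast(device_type="cuda", dtype=torch.float16):
        c = a @ b  # runs in float16 where supported

    print(c.dtype)  # torch.float16
else:
    print("No CUDA device found; this sketch targets Nvidia GPUs.")
```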

Future Outlook for Nvidia:

Nvidia’s extraordinary dominance in the AI hardware market is built on a foundation of high-performance GPUs, a robust software ecosystem, and a product range that spans from data centers to edge devices. As we have seen, however, the AI hardware landscape is rapidly evolving, driven by several key factors:

  1. Increasing competition from established tech giants and innovative startups.
  2. The trend towards more efficient AI models that can run on less powerful hardware.
  3. Growing demand for energy-efficient AI solutions.
  4. Geopolitical factors affecting chip production and distribution.

While the trend towards more efficient AI models and competing hardware presents challenges to Nvidia’s traditional high-performance GPU business, the company has shown remarkable adaptability. Their diversified product range, strong software ecosystem, investments in edge AI and specialized accelerators, and continued innovation in high-performance computing all contribute to maintaining their relevance and future growth.

++++++

Nvidia is introducing a new, stronger, faster chip called Blackwell, which is up to 30 times faster than previous models. Blackwell delivers up to 20 petaflops and contains 208 billion transistors, more than the H100’s 80 billion — the H100 is the chip they’re selling now.

If your head is spinning, so is mine.
Tune in at 5 PM EST tonight. You can register for the webcast on Nvidia’s site.
See you there — Harry Newton