Apple bets on visual AI: a new breakthrough in computing power from edge devices to the cloud

2026-03-11 04:45:42

Imagine that, one day soon, the following scenarios become a real part of our daily lives:

When mealtime arrives, you stand at a busy food street intersection, feeling a bit overwhelmed. At that moment, AI glasses act like a caring food guide, clearly highlighting in your view: “Look, that small shop with the blue sign on your left has the best rice noodle rolls on this street. Go in and try them.”

While driving on an unfamiliar road, navigation no longer just says “Turn right in 300 meters,” but kindly reminds you: “See that small building with the red roof? It’s very eye-catching. Turn right at the next intersection beside it.”

Suddenly, a small household appliance at home breaks down, and you’re clueless about what to do. Don’t worry—just place it in front of a smart speaker with a camera, and it can instantly identify the fault and patiently teach you how to repair it.

All these high-tech scenes rely on a technology called “Visual AI.” Simply put, Visual AI allows AI to break free from screens, grow “eyes,” and truly integrate into our lives. It is an important branch of physical AI, with the ultimate goal of perceiving the real world and assisting people in their daily routines.

Most of these changes began around 2026, known as the “Year of Physical AI,” with Visual AI being a key focus for tech companies competing for dominance.

In that year, Lenovo launched Maxwell, positioned as an “AI Perception Companion.” It’s like a keen observer that can “see” everything the user sees through a camera and accurately understand the surrounding environment. The company also released ThinkBook Plus Gen 7 Auto Twist, a PC with AI visual tracking that automatically adjusts its screen angle based on the user’s position—perfect for video conferences and creative presentations. Additionally, their new concept smart glasses, Lenovo AI Glasses Concept, debuted at CES that year.

Meanwhile, OpenAI, a star startup in the large model field, began making strides in visual AI devices. The camera-equipped smart speaker mentioned earlier was one of their AI terminal devices revealed that year. Of course, OpenAI also has plans for smart glasses.

On February 22 of that year, Apple CEO Tim Cook announced that Visual Intelligence would be Apple’s next “major breakthrough.”

Yes, the same Apple that once single-handedly opened the era of smartphones. This time, Apple isn’t just expanding its product line but aiming to redefine terminal devices in the AI era. This move signals that multimodal AI will fully explode in the terminal device space.

As terminal devices evolve from “able to listen and speak” to “able to see and think,” a new cycle of computing power from edge to cloud is quietly beginning. For tech giants like Lenovo, with their deep investments in hybrid AI, this opens a golden period in which the value of full-stack computing capabilities is being re-evaluated.

The New Turning Point for Visual AI

For a long time, visual AI has been applied in many fields, but due to limited user awareness, immature technology, and incomplete ecosystems, large-scale deployment has been elusive.

However, with its vast user base, strong product iteration capabilities, and comprehensive ecosystem, Apple plans to integrate visual AI features into daily work, content creation, smart travel, and other scenarios through iPhone, Mac, Vision Pro, and future lightweight wearables.

Apple’s approach to visual AI is not just about adding features but about treating it as a “defining function” for next-generation hardware products. From the camera-equipped AirPods expected around late 2026, to fashionable AI wearable devices, to the all-weather AI companion smart glasses planned for 2027, Apple is building around “seeing the world” to reshape human-computer interaction.

This method of enabling Siri to understand user needs and environmental context through visual scenarios will have a profound “educational effect” on the global terminal market. When industry leaders establish visual interaction as a standard feature, user acceptance of multimodal AI will significantly increase, activating demand across the entire visual AI terminal and application markets—marking the official arrival of a market inflection point.

This trend aligns closely with Lenovo’s terminal strategy. The previously mentioned ThinkBook Plus Gen 7 Auto Twist and Lenovo AI Glasses Concept already cater to different user groups, including personal and business users.

Maxwell in particular, the AI perception companion concept, can continuously gather panoramic contextual data, truly “see what you see, hear what you hear,” and provide real-time insights and personalized suggestions. This design philosophy echoes Apple’s pendant-style products and further confirms that “environment-aware AI terminals” are becoming an industry consensus.

Unlike Apple, Lenovo not only focuses on consumer scenarios but also has deep experience in enterprise AI solutions. While Apple educates the market through consumer products, enterprise demand for multimodal interaction is shifting from novelty to necessity. Fields like retail product recognition, manufacturing equipment inspection, and medical imaging screening will benefit from mature visual AI, enabling large-scale deployment.

Lenovo’s extensive experience in the enterprise market has led to a comprehensive suite of visual AI solutions across industries. Backed by its intelligent IT engine, Lenovo can provide full-process services from visual data collection and analysis to application deployment. As Apple promotes a mature visual AI ecosystem, it will further amplify enterprise demand, offering more opportunities for Lenovo’s solutions.

Under the leadership of these two global terminal giants, visual AI is entering a dual-market explosion—both consumer and enterprise.

Moreover, in today’s data security environment, visual AI involves vast amounts of privacy-sensitive images and videos. The industry broadly agrees on the need to maximize privacy protection, reduce network latency, and improve the interaction experience. The choices Apple and Lenovo have made in deploying visual AI address precisely these core concerns.

Apple emphasizes “device-side processing” and “privacy-first” principles, developing visual models that run locally on iPhone, allowing Siri to see and control applications without uploading sensitive data to the cloud. Lenovo’s “Personal AI Twin” vision also stresses building a local user knowledge base, enabling AI to understand user habits and preferences on the device, only calling on cloud models when necessary. When these two industry leaders converge on “edge intelligence,” this model is poised to become the definitive direction for terminal AI development over the next decade.
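The "local first, cloud only when necessary" pattern described above can be illustrated with a minimal sketch. This is not Apple's or Lenovo's implementation; the class name, the complexity score, and the threshold are all hypothetical, chosen only to show how a routing decision of this kind might be expressed.

```python
from dataclasses import dataclass

# Hypothetical threshold: tasks scored above it are treated as too heavy
# for the on-device model. The value is illustrative only.
LOCAL_COMPLEXITY_LIMIT = 0.6


@dataclass
class VisualRequest:
    """A hypothetical visual AI request arriving from a camera-equipped device."""
    description: str
    contains_private_content: bool
    estimated_complexity: float  # assumed score from 0.0 (trivial) to 1.0 (very hard)


def choose_target(request: VisualRequest, cloud_allowed: bool = True) -> str:
    """Return "device" or "cloud" for a request under a privacy-first policy."""
    # Assumption: privacy-sensitive frames never leave the device.
    if request.contains_private_content:
        return "device"
    # Light tasks stay local to avoid network latency.
    if request.estimated_complexity <= LOCAL_COMPLEXITY_LIMIT:
        return "device"
    # Heavy, non-sensitive tasks may use a cloud model when permitted.
    return "cloud" if cloud_allowed else "device"


if __name__ == "__main__":
    requests = [
        VisualRequest("read a street sign", False, 0.2),
        VisualRequest("diagnose a broken appliance", False, 0.9),
        VisualRequest("summarize a family photo", True, 0.9),
    ]
    for r in requests:
        print(f"{r.description}: {choose_target(r)}")
```

In this sketch the privacy check comes before the complexity check, so a sensitive request stays on the device even when a cloud model would handle it better. That ordering is one possible reading of "privacy-first," not a documented design.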

The New Boom in Full-Stack Computing Power

At its core, AI industry development is about transforming computing resources into intelligent productivity that empowers production and daily life and creates more convenient consumer experiences. Seen this way, the large-scale deployment of visual AI essentially means a comprehensive surge in demand for computing power.

Compared to text-only AI, visual AI imposes more systemic requirements on the computing infrastructure. On the one hand, edge devices need stronger local inference capabilities to handle real-time visual processing. On the other hand, cloud and edge must support complex multimodal model computations and low-latency responses. This “edge-cloud collaboration” is reshaping the entire computing industry chain. Therefore, visual AI is not only a celebration for terminal device manufacturers but also a full-stack industry opportunity for providers like Lenovo.

On the device side, upgrade demand is being actively stimulated. Real-time image generation, video analysis, and AR interactions all require higher NPU/GPU computing power, memory bandwidth, and ISP processing capabilities, accelerating the upgrade from traditional terminals to AI-enabled devices.

AI PCs, as the core platform for visual AI interaction, are expected to see explosive growth. Canalys predicts that by 2028, the global AI PC penetration rate could reach 79.7%, with shipments soaring. Lenovo, a leading global PC manufacturer, will see its scale and technological advantages further amplified. With the world’s largest terminal shipment volume and accumulated AI PC expertise, it stands to benefit directly from this upgrade wave.

Furthermore, mass procurement of low-power AI chips and high-performance NPU components by AI terminal giants will expand capacity and reduce costs, lowering their R&D and manufacturing expenses and solidifying their leadership in edge computing.

In the cloud and at the edge, inference computing power is becoming a new growth driver. Cloud data centers will handle large-scale visual model training and massive visual data storage and analysis, while edge computing will support scenario-specific low-latency inference. This will drive cloud service providers and enterprises to increase investments in cloud inference and edge infrastructure. Lenovo’s “One Horizontal, Five Vertical” infrastructure strategy aligns well with these market needs.

Following this logic, Lenovo’s release at CES 2026 of inference-optimized servers like SR675i and SR650i, along with its joint AI cloud super-factory with NVIDIA, demonstrates foresight about AI computing demand. These products, with high performance, efficiency, and reliability, combined with a heterogeneous intelligent computing platform, precisely match large-scale visual AI training and inference needs.

Lenovo’s ThinkEdge series edge servers and AI gateways bring AI models closer to data sources, meeting enterprise demands for low latency, low power, and high security in real-time inference. As visual AI applications expand into industrial inspection, customer flow analysis, and autonomous driving perception, Lenovo’s edge computing products will see greater deployment opportunities.

In summary, the computing power demand of visual AI is driving the industry from “hardware supply” to “full-stack computing services.” Enterprise needs are shifting from simple hardware like servers and chips to comprehensive solutions covering hardware, software optimization, resource scheduling, and operations. Lenovo’s “end-edge-cloud” full-stack infrastructure is entering a critical valuation phase.

Lenovo’s value in the AI era lies in providing end-to-end services—from hardware (servers, storage, networking), to resource scheduling (heterogeneous intelligent computing platform), to computing services (building intelligent centers, operations, AI solutions). It can tailor personalized computing solutions based on visual AI scenarios, helping clients reduce costs and improve efficiency.

Specifically, on the device side, Lenovo can leverage AI PCs, smartphones, and smart glasses to capture consumer and commercial markets; in the cloud and at the edge, it can meet enterprise demands with its infrastructure and solutions; at the service level, full-stack computing services will add further value.

This “end-edge-cloud” collaborative full-stack computing capability not only allows Lenovo to fully benefit from the growth of the visual AI industry but also helps the company transition from a traditional hardware manufacturer to a “Global Hybrid AI Leader,” continuously increasing corporate value.

From this perspective, Apple’s focus on visual AI is not just a strategic upgrade but a strong signal of the entire AI industry shifting from “technological iteration” to “scenario implementation.” As computing power is the core support for visual AI deployment, it has become the industry’s key battleground. In the future, as visual AI scenarios deepen and computing demands grow, companies with full-stack computing capabilities will stand out, becoming the main drivers of AI industry scaling.
