Embedded world 2024: MCUs and SoCs

Embedded world showcases the latest advances in processors, focusing on edge AI challenges across a broad range of applications.

The annual embedded world 2024 Exhibition & Conference in Nuremberg, Germany, April 9-11, showcases the latest innovations in embedded system development. Key themes at this year’s show include edge AI, embedded vision, IoT, electronic displays, connectivity, safety and security, IC and IP design and embedded OS.

This year’s keynote addresses the challenges of AI at the edge, a growing trend highlighted by the latest product launches for microcontrollers (MCUs) and systems-on-chip (SoCs). Improving AI/ML performance is a big driver for both startups and leading chipmakers, especially as many product designers look to add AI capabilities. Embedded.com’s 2023 embedded market survey found that 26% of respondents are currently using embedded AI and 24% are considering it. For ML model–based capabilities, 23% are currently using them and 24% are considering them.

The global AI chips market is expected to reach $257.6 billion by 2033, according to IDTechEx. The market researcher also projects the global AI chips market for edge devices to grow to $22 billion by 2034, driven by consumer electronics, industrial and automotive applications.

RISC-V processors are also making headway into edge AI, with chip shipments expected to reach 129 million by 2030, according to ABI Research.

The latest processor technologies focus on providing higher integration, improved performance and lower power consumption, while delivering greater flexibility, scalability and ease of design. Still, the overriding goal is to make it easier for engineers to integrate these improved and more complex devices into their new product designs. Also, many of these devices are cost- and performance-optimized for specific applications, so designers don’t pay for performance that is not required for their designs.

Here is a sampling of those product launches.

Accelerating edge AI

New edge AI processors and platforms are meeting AI/ML workload demands by tackling the biggest challenges: providing enough compute for the target application while keeping power efficiency, power consumption and security in check. These are all critical in edge devices, from consumer wearables and smartphones to industrial IoT, smart-home devices and intelligent vehicles.

Alif Semiconductor’s Balletto MCUs (Source: Alif Semiconductor)

Alif Semiconductor has launched its Balletto family, claiming the industry’s first Bluetooth Low Energy (BLE) and Matter wireless MCU with a neural co-processor for AI/ML workloads. The new Balletto MCUs leverage the company’s advanced MCU architecture, which uses DSP acceleration and a dedicated NPU for low-power execution of AI/ML workloads, and its aiPM power management technology, which dynamically powers only the logic and associated memory in use to deliver the lowest overall system power consumption. The aiPM technology offers four system-level power modes, including a stop mode that draws just 700 nA.

The Balletto MCU, housed in an ultra-small WLCSP package, supports AI/ML functions such as speech recognition, adaptive noise cancellation, vocal targeting and beamforming in true wireless stereo (TWS) earbuds, as well as sensor fusion in lifestyle wristbands and other space-constrained devices.

The Balletto chip includes an Arm Cortex-M55 core, which achieves an EEMBC CoreMark score of 704 at 160 MHz (4.4 CoreMark/MHz), with the Arm Helium M-profile vector extension (MVE) providing a 500% improvement in DSP performance. It also features an Arm Ethos-U55 neural processing unit (NPU), which can perform up to 46 GOPS, along with 2 MB of tightly coupled memory (TCM).

Alif said the neural network processing performance is up to 15× better than that of a Cortex-M4 processor, which benefits audio encoding/decoding functions.

In addition, thanks to its own separate processor and memory, Balletto’s radio supports concurrent Bluetooth LE 5.3 and Thread operations via a single on-chip antenna, enabling use in home automation networks and systems that support the Matter protocol. Other features include a receive sensitivity of -101 dBm, a dual-PA architecture, up to 77 GPIOs and advanced algorithms that help avoid interference from other devices and manage coexistence with other protocols sharing the 2.4-GHz band.

The Balletto MCUs also leverage Alif’s state-of-the-art multi-layered security fabric, first introduced in Alif Semiconductor’s Ensemble family of MCUs and fusion processors.

Infineon Technologies AG has launched its next-generation PSOC Edge MCUs, expanding the family it first introduced in November 2023 for ML at the edge. The three new series of PSOC Edge MCUs—E81, E83 and E84—offer a range of performance, feature and memory options for ML-enabled IoT, consumer and industrial applications. The PSOC Edge E8 series devices come with a complete set of system design tools and software.

Infineon’s PSOC Edge MCUs (Source: Infineon Technologies Inc.)

The PSOC Edge E8x MCUs are based on the high-performance Arm Cortex-M55 with Helium DSP support, paired with the Arm Ethos-U55, alongside an Arm Cortex-M33 and Infineon’s ultra-low-power NNLite, a proprietary hardware accelerator for neural networks. All three series support extensive peripheral sets, on-chip memory and hardware security, with a variety of connectivity options including USB HS/FS with PHY, CAN, Ethernet, Wi-Fi 6, Bluetooth, Bluetooth LE and Matter.

The PSOC Edge E81 uses the Arm Helium DSP technology along with the Infineon NNLite accelerator. The PSOC Edge E83 and E84 integrate the Arm Ethos-U55 micro-NPU, which provides a 480× improvement in ML performance compared with existing Cortex-M systems, according to Infineon, and they also support the NNLite neural network accelerator for ML applications in the low-power compute domain.

Hardware design support is comprehensive, including an evaluation base board with an Arduino expansion header, sensor suite, Bluetooth LE connectivity for provisioning and Wi-Fi for smartphone and cloud connectivity. It is also supported by Infineon’s ModusToolbox software platform, which provides development tools, libraries and embedded runtime assets. ModusToolbox supports a range of use cases including consumer IoT, industrial, smart home and wearables.

Target applications include human machine interface (HMI) in appliances and industrial devices, smart home and security systems, robotics and wearables. Infineon will demo the new PSOC Edge MCUs at the Infineon booth (hall 4A, stand #138) and the Arm booth (hall 4, stand #504).

Arm is supporting these new edge-AI designs, including the Alif and Infineon chips, with its new Ethos-U85 NPU, which delivers a 4× performance improvement for edge-AI applications such as factory automation and commercial or smart-home cameras, and with its new IoT reference design platform, the Corstone-320, which delivers embedded IP with virtual hardware for voice, audio and vision systems.

This third-generation NPU is also Arm’s most efficient Ethos NPU to date, offering 20% higher power efficiency than its predecessor along with the 4× performance boost. It scales from 128 to 2,048 MAC units, or roughly 4 tera operations per second (TOPS) at 1 GHz (2,048 MACs × 2 operations per MAC × 1 GHz ≈ 4,096 GOPS).

Ethos-U85 supports both transformer networks and convolutional neural networks (CNNs) for AI inference. Transformer networks will drive new applications, particularly in vision and generative AI (GenAI) use cases such as understanding videos, filling in missing parts of images or analyzing data from multiple cameras for image classification and object detection, Arm said.

In addition, for higher-performance IoT systems targeting use cases such as industrial machine vision, wearables and consumer robotics, the Ethos-U85 works with Arm’s Armv9 Cortex-A CPUs to accelerate ML tasks and bring power-efficient edge inference to a broader range of higher-performing devices, the company added.

Arm said the Ethos-U85 uses the same toolchain as earlier Ethos-U NPUs, so partners can leverage existing investments. It also provides support for AI frameworks such as TensorFlow Lite and PyTorch.
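
For Ethos-U targets, the published flow is to quantize a model to int8, compile it with Arm’s Vela compiler and then run the resulting model through TensorFlow Lite for Microcontrollers (TFLM), which dispatches the NPU-mapped portions to the Ethos-U driver through a custom operator. The sketch below illustrates only that last inference step; the model symbol, arena size, operator list and the exact MicroInterpreter constructor are assumptions that vary by model and TFLM release.

```cpp
// Minimal TFLM inference sketch for a Vela-compiled, int8-quantized model
// (illustrative only; not Arm's reference code).
#include <cstddef>
#include <cstdint>
#include <cstring>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

extern const unsigned char g_model_data[];  // Vela output linked into flash (assumed symbol)

constexpr int kArenaSize = 100 * 1024;      // sized per model; assumption
alignas(16) static uint8_t tensor_arena[kArenaSize];

int run_inference(const int8_t* input, size_t in_len, int8_t* output, size_t out_len) {
  const tflite::Model* model = tflite::GetModel(g_model_data);

  static tflite::MicroMutableOpResolver<3> resolver;
  resolver.AddEthosU();          // runs the NPU-mapped subgraph via the Ethos-U driver
  resolver.AddFullyConnected();  // example ops Vela may leave on the CPU
  resolver.AddSoftmax();

  static tflite::MicroInterpreter interpreter(model, resolver, tensor_arena, kArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

  std::memcpy(interpreter.input(0)->data.int8, input, in_len);
  if (interpreter.Invoke() != kTfLiteOk) return -2;
  std::memcpy(output, interpreter.output(0)->data.int8, out_len);
  return 0;
}
```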

The Arm Corstone-320 IoT Reference Design Platform combines the new Ethos-U85 NPU with the high-performance Arm Cortex-M85 CPU and the Mali-C55 image signal processor (ISP) for a broad range of edge AI applications in voice, audio and vision. Examples include real-time image classification and object recognition, as well as voice assistants with natural-language translation on smart speakers. The platform includes software, tools and support, including Arm Virtual Hardware.

Also expanding AI to embedded edge devices is Intel’s new series of edge-optimized Intel Core Ultra, Intel Core and Intel Atom processors and discrete Intel Arc graphics processing units (GPUs).

The Core Ultra processors offer up to 5.02× better image classification inference performance compared with 14th Gen Intel Core desktop processors. They integrate the Intel Arc GPU and an NPU (Intel AI Boost) in LGA packaging. They also offer up to 3.13× faster AI performance and up to 3.85× lower power for AI and graphics workloads compared with the previous generation.

The new SoC targets GenAI and demanding graphics workloads at the edge for retail, education, smart cities and industrial customers. Application examples include GenAI-enabled kiosk and smart point-of-sale systems in brick-and-mortar retailers, interactive whiteboards for enhanced in-classroom experiences and AI vision-enhanced industrial devices for manufacturing and roadside units.

The Core processors combine the GPU power of 13th Gen Intel Core mobile processors with LGA packaging, targeting system scalability and speed to deployment, Intel said. These processors deliver up to 2.57× greater graphics performance compared to 13th Gen Intel Core desktop processors by leveraging up to three times more graphics execution units. This SoC also features the company’s performance hybrid architecture with Intel Thread Director.

The Atom x7000C Series processors offer up to eight Efficient-cores with Intel Turbo Boost Technology, up to a 2.4-GHz processor base frequency and LPDDR5/DDR5/DDR4 memory to process more packets for enterprise networking and telecommunications devices. These features enable telecommunications businesses to use built-in deep learning inference capabilities to support the detection of zero-day threats, boost packet and control-plane processing for OpenSSL/IPSec using native instruction sets and leverage Intel security features to harden networks, the company said. Deep learning inference capabilities include Intel Deep Learning Boost (Intel DL Boost), Intel Advanced Vector Extensions 2 (Intel AVX2) with INT8 support and the OpenVINO toolkit (validation to be completed in 2024).

Targeting industrial and manufacturing applications, the Atom x7000RE Series processors feature built-in deep learning inference capabilities, up to eight Efficient-cores (E-cores) and up to 32 graphics execution units in a ruggedized, power-efficient 6-W to 12-W BGA package, offering up to 9.83× the image classification performance of the Atom x6000RE Series. They also offer up to 1.49× faster single-thread and up to 1.61× faster multithread performance compared with the x6000RE Series. Memory options include LPDDR5, DDR5 and DDR4 with support for In-Band Error Correction Code (IBECC).

The new Atom processors support fanless designs to enable Industry 4.0 automation for applications such as AI-automated machine tending, warehouse autonomous mobile robots (AMRs), in-line visual inspection for quality control and ruggedized industrial PCs. Deep learning inference capabilities include integrated Intel UHD Graphics, along with Intel DL Boost, Intel AVX2 with INT8 support and the OpenVINO toolkit.
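
Across these Intel CPU, GPU and NPU targets, OpenVINO presents a common inference API. The following is a minimal, illustrative C++ sketch rather than Intel’s reference code; the model file name and the device string (“CPU”, “GPU” or, where the driver is present, “NPU”) are assumptions the developer would substitute.

```cpp
// Minimal OpenVINO 2.0 C++ inference sketch (illustrative; file names and
// device selection are placeholders).
#include <openvino/openvino.hpp>
#include <iostream>

int main() {
  ov::Core core;

  // Load a model exported to OpenVINO IR (hypothetical file names).
  std::shared_ptr<ov::Model> model = core.read_model("model.xml");

  // Pick the execution device: "CPU", "GPU" (integrated or discrete Arc) or "NPU".
  ov::CompiledModel compiled = core.compile_model(model, "GPU");

  ov::InferRequest request = compiled.create_infer_request();
  ov::Tensor input = request.get_input_tensor();
  // ... fill input.data<float>() with preprocessed data here ...

  request.infer();

  ov::Tensor output = request.get_output_tensor();
  std::cout << "output elements: " << output.get_size() << std::endl;
  return 0;
}
```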

The Intel Arc GPU for Edge is a discrete GPU that improves performance and edge-AI capabilities on legacy Intel Core systems, providing accelerated AI, media and graphics processing. It is backed by an open, standards-based software stack for greater design flexibility.

Startups also are bringing powerful processors to edge devices. Hailo Technologies Ltd., which recently closed a new $120-million funding round, bringing its total funding to $340 million, unveiled its Hailo-10 AI accelerator for GenAI at embedded world. The new AI accelerator is designed to process large language models (LLMs) at low power consumption for the PC, automotive and robotics industries.

Hailo’s Hailo-10 AI accelerator (Source: Hailo Technologies Ltd.)

The high-performance GenAI accelerator enables users to own and run GenAI applications locally without registering for cloud-based GenAI services, Hailo said. It also eliminates the network latency that can degrade GenAI performance, delivers privacy by keeping personal information anonymized and supports sustainability by reducing reliance on cloud data centers.

Focused on performance-to-cost and performance-to-power ratios, the Hailo-10 can run popular GenAI models such as Llama2-7B at up to 10 tokens per second (TPS) while drawing under 5 W. Processing Stable Diffusion 2.1, a popular model that generates images from text prompts, the Hailo-10 is rated at under 5 seconds per image in the same ultra-low-power envelope, the company said.

The Hailo-10 uses the same software suite as the Hailo-8 AI accelerators and the Hailo-15 AI vision processors, enabling seamless integration of AI capabilities across multiple edge devices and platforms.

The Hailo-10 is capable of up to 40 TOPS, which the company calls a new performance standard for edge-AI accelerators, and is claimed to be faster and more energy-efficient than integrated NPU solutions, delivering at least 2× the performance at half the power of Intel’s Core Ultra NPU, according to recently published benchmarks.

Hailo will start shipping samples in the second quarter of 2024. Early applications include PCs and automotive infotainment systems. The company is exhibiting at embedded world in Hall 1, stand 126 and at the ISC West exhibition, Las Vegas, in booth #31065.


Low-power AI for IoT

Pairing vector acceleration with high power efficiency for on-device AI inferencing without a dedicated NPU, Ambiq’s next-generation Apollo series significantly boosts performance for IoT endpoints. Ambiq debuted the Apollo510, the first member of the Apollo5 SoC family for IoT endpoint devices, at embedded world.

Ambiq’s Apollo510 MCU (Source: Ambiq)

The company describes the new Apollo family as a “complete overhaul of hardware and software,” leveraging the Arm Cortex-M55 CPU with Arm Helium to reach processing speeds of up to 250 MHz. The Arm Helium technology supports up to eight MACs per cycle as well as half-, full- and double-precision floating-point operations, making it suited for AI calculations in addition to general signal-processing operations.
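
Because CMSIS-DSP builds targeting Helium-capable cores use MVE instructions internally, ordinary library calls pick up the vector speedup without source changes. The fragment below is a generic illustration of that pattern, not Ambiq-specific code; the buffer sizes and contents are placeholders.

```cpp
// Generic CMSIS-DSP sketch (not Ambiq-specific): when built for a Cortex-M55
// with MVE enabled, arm_dot_prod_f32() is implemented with Helium vector
// instructions, so the same call gains the DSP speedup the core provides.
#include "arm_math.h"

#define BLOCK_SIZE 256

static float32_t coeffs[BLOCK_SIZE];   // e.g., filter taps (placeholder data)
static float32_t samples[BLOCK_SIZE];  // e.g., a block of audio samples

float32_t weighted_sum(void) {
  float32_t result;
  arm_dot_prod_f32(coeffs, samples, BLOCK_SIZE, &result);
  return result;
}
```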

The Apollo510 achieves up to 10× lower latency while reducing energy consumption by around 2× and delivering 30× better power efficiency compared with the previous-generation Apollo4. These gains enable designers to run AI/ML workloads concurrently with complex graphics, sophisticated voice applications and always-on voice/sensor processing on battery-powered devices.

Ambiq said it is the most efficient semiconductor on the market to operate with the Arm Cortex-M55 as well as the company’s most energy-efficient and highest-performance product.

Thanks to its improved power efficiency and the company’s Subthreshold Power Optimized Technology (SPOT) platform, the Apollo510 can run many of today’s endpoint AI workloads, including low-power sensor monitoring, always-on voice commands and telco-quality audio enhancement. This means IoT devices that perform AI/ML inferencing, such as wearables, digital health devices, AR/VR glasses, factory automation and remote monitoring equipment, gain power-budget headroom while adding more capabilities through the Apollo510’s SPOT-optimized design.

The Apollo510 also increases memory capacity over the previous generation, with 4 MB of on-chip NVM and 3.75 MB of on-chip SRAM and TCM. For extra-large neural network models or graphics assets, the Apollo510 offers high-bandwidth off-chip interfaces, each capable of peak throughput of up to 500 MB/s and sustained throughput of more than 300 MB/s.

Other features include a 2.5D GPU with vector graphics acceleration, offering a 3.5× overall performance enhancement over the Apollo4 Plus family, and support for Memory in Pixel (MiP) displays, typically found in the lowest-power products, Ambiq said.

The new chip also offers advanced security with the secureSPOT platform, together with the Arm TrustZone technology with a physical unclonable function (PUF), tamper-resistant OTP and secure peripherals.

The Apollo510 MCU is currently sampling, with general availability expected in the fourth quarter of 2024. Target applications include wearables, digital health, agriculture, smart homes and buildings, predictive maintenance and factory automation. The Apollo510 MCU won an embedded world award in the hardware category. The company is exhibiting in hall 3, stand 301.

Synaptics Inc. announced two new products at embedded world: the Synaptics Astra platform with the SL-Series of embedded AI-native IoT processors and the Astra Machina Foundation Series development kit. The SL-Series brings AI directly to products, addressing data-privacy and latency challenges while delivering the compute capability and power consumption required for a range of consumer, enterprise and industrial edge IoT applications.

Synaptics’ Astra SL-Series embedded AI-native IoT processors (Source: Synaptics Inc.)

The Astra AI-native compute platform combines scalable hardware, unified software and an adaptive open-source AI framework with a partner-based ecosystem. The multi-core Linux or Android SoCs are based on Arm Cortex-A CPUs and feature hardware accelerators for edge inferencing and multimedia processing across audio, video, vision, image, voice and speech. They will soon be joined by the company’s power-optimized, AI-enabling SR-Series of MCUs.

The SL-Series comprises three devices—the SL1680, SL1640 and SL1620—each offering performance levels targeted at specific applications, all with out-of-the-box AI.

The high-efficiency SL1680 is based on a quad-core Arm Cortex-A73 64-bit CPU, a 7.9 TOPS NPU, a feature-rich GPU and a multimedia accelerator pipeline. Applications include home and industrial control, smart appliances, home security gateways, digital signage, displays, point-of-sale systems and scanners.

Optimized for cost and power, the SL1640 integrates a quad-core Arm Cortex-A55 processor, a 1.6+ TOPS NPU and a GE9920 GPU. It is suited for smart home appliances, enterprise conferencing, smart speakers, displays and signage, consumer and industrial control panels.

Offering a feature-rich GPU for advanced graphics and AI acceleration, the SL1620 features a quad-core Arm Cortex-A55 CPU subsystem, high-performance audio algorithms and dual-display support. Target applications include enterprise multimedia conferencing, smart appliances, home security gateways, digital signage, displays, point-of-sale systems and smart speakers.

Synaptics’ Astra Machina Foundation Series development kit (Source: Synaptics Inc.)

The Astra Machina Foundation Series development kit supports the SL-Series. The kit is designed for both AI beginners and experts to leverage the AI capabilities and the processing and graphics performance of the SL-Series as well as the wireless connectivity of Synaptics’ SYN43711 and SYN43752 Wi-Fi and Bluetooth combo SoCs.

The SL-Series processors are available now. The Astra Machina Foundation Series development kit will be available in the second quarter of 2024.  Synaptics is exhibiting at embedded world in hall 4A, stand 259.

MCUs: Entry level to high performance

Renesas Electronics Corp. announced its new entry-level, low-cost RA0 MCU series based on the 32-MHz Arm Cortex-M23 processor. Claiming the industry’s lowest overall power consumption for general-purpose 32-bit MCUs, the RA0 devices consume only 84.3 μA/MHz of current in active mode and 0.82 mA in sleep mode. The new devices also include a software standby mode that reduces power consumption by a further 99% to just 0.2 µA, Renesas said. Memory includes up to 64 KB of integrated code flash and 12 KB of SRAM.

Renesas’ RA0 MCU series (Source: Renesas Electronics Corp.)

The new MCUs also feature a fast wake-up, high-speed on-chip oscillator (HOCO), making them suited for applications such as battery-operated consumer electronics, small appliances, industrial system control and building automation. The high-precision (±1.0%) HOCO improves baud-rate accuracy, eliminates the need for a standalone oscillator and maintains its precision over a temperature range of -40°C to 105°C. This wide operating range also eliminates the need for time-consuming “trimming,” even after the reflow process, Renesas said.
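
To see why the ±1.0% figure matters, a rough worst-case UART error budget adds the baud-divider rounding error to the oscillator tolerance. The sketch below is a generic illustration that assumes a conventional 16×-oversampling UART, not Renesas register-level math.

```cpp
// Generic illustration (not Renesas-specific register math): worst-case UART
// baud error is roughly the integer-divider rounding error plus the clock
// tolerance, assuming a conventional 16x-oversampling UART.
#include <cmath>
#include <cstdio>

double worst_case_baud_error_pct(double f_clk_hz, double target_baud, double osc_tol_pct) {
  double divider = std::round(f_clk_hz / (16.0 * target_baud));   // integer baud divider
  double actual_baud = f_clk_hz / (16.0 * divider);
  double rounding_err_pct = std::fabs(actual_baud - target_baud) / target_baud * 100.0;
  return rounding_err_pct + osc_tol_pct;                          // add oscillator tolerance
}

int main() {
  // 32-MHz clock, 9600 baud, +/-1.0% tolerance: roughly 1.2% worst case, well
  // inside the ~2% per-side budget commonly used for asynchronous UART links.
  std::printf("%.2f %%\n", worst_case_baud_error_pct(32e6, 9600.0, 1.0));
  return 0;
}
```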

The first group in the RA0 series, the RA0E1 Group, is now shipping, with a feature set targeting cost-sensitive applications. The devices offer a wide operating-voltage range of 1.6 V to 5.5 V, so customers don’t need a level shifter or regulator in 5-V systems, Renesas said. Other features include integrated timers, serial communications, analog functions, safety functions and HMI functionality.

Security features include a true random number generator (TRNG) and AES libraries for IoT applications, including encryption, as well as safety functions and an IEC 60730 self-test library. The series offers several package options, including a tiny 3 × 3-mm 16-lead QFN, 24- and 32-lead QFNs, a 20-pin LSSOP and a 32-pin LQFP.

For design and development, the new RA0E1 Group MCUs are supported by Renesas’ Flexible Software Package (FSP), which also eases migration to larger RA devices; the RA0E1 Fast Prototyping Board; and a Winning Combination HVAC Environment Monitor Module for Public Buildings design built with vetted components. Samples and kits are available directly from Renesas or through distributors. Renesas is demonstrating the new RA0 MCUs in hall 1, stand 234.

For higher-performance applications, STMicroelectronics has expanded its STM32H7 family of MCUs, delivering improved performance, scalability and security features that make the devices easier for designers to integrate into new designs while raising the performance of smart devices for factories, buildings, infrastructure and eHealth.

STMicroelectronics’ STM32H7 MCUs (Source: STMicroelectronics)

The new STM32H7 MCUs pair the highest-performing Arm Cortex-M core ST has yet announced (a Cortex-M7 running at up to 600 MHz) with minimal on-chip memory and high-speed external interfaces. The new devices include the STM32H7R3/S3 general-purpose MCUs and the STM32H7R7/S7 with enhanced graphics-handling capabilities. Developers can share software between the two lines for faster development and time to market for new products.

Both the STM32H7R and STM32H7S MCUs provide the advanced security features required for IoT applications, including protection against physical attacks, memory protection, code isolation to protect the application at runtime and platform authentication. The STM32H7S devices offer additional security by integrating an immutable root of trust, debug authentication and hardware cryptographic accelerators that support the latest algorithms to prevent unauthorized access to code and data. Thanks to these features, the new devices target up to SESIP3 and PSA Level 3 certifications for cyber protection.

ST has also integrated its NeoChrom GPU, which enables MPU-like graphical user interfaces (GUIs) with rich colors, animation and 3D-like effects. The on-chip display controller lets the MCUs handle vibrant, colorful, high-definition user interfaces that would be too taxing for smaller microcontrollers, ST said, and running the UI uses only about 10% of the main CPU’s performance, enabling smartphone-like user experiences while the device also runs demanding applications such as edge AI, communication and real-time control.

The new STM32H7 devices embed bootflash memory and SRAM on the chip, while application code and data are stored in off-chip memory ICs. ST said the bootflash ensures an easier and more secure startup and enables application development using familiar tools and STM32 software packs. The STM32Cube software and tools help developers set up the boot system and locate their code in external memories.
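
As a generic illustration of this execute-in-place arrangement (not ST’s shipped linker script or STM32Cube output), code destined for memory-mapped external flash can be tagged with a section attribute that the linker script then maps to the external-memory address range; the section name below is hypothetical.

```cpp
// Generic GCC-style placement sketch (hypothetical section name, not ST's
// delivered linker script): the linker script maps ".ext_flash_text" to the
// memory-mapped external-flash address range.
#define EXT_FLASH_CODE __attribute__((section(".ext_flash_text")))

EXT_FLASH_CODE
int heavy_feature(int x) {
  // Large, non-latency-critical code executes in place from external flash,
  // while the on-chip bootflash holds only the secure startup code.
  return x * x + 1;
}
```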

Power management is also integrated on-chip, in contrast with typical MPUs that require an external power-management IC (PMIC), ST said.

The STM32H7S8-DK demonstration and development platform, as well as the extendable NUCLEO-H7S3L8 MCU board, are available.  Volume production is expected to start in April 2024. STMicroelectronics is exhibiting in hall 4A, stand 148.

Related embedded world product launches:

Variscite launches i.MX 95-based SOM

Infineon unveils next-gen PSOC Edge MCUs

Reference design demos USB PD 3.1 up to 240 W

Development kit delivers edge-AI SBC

Ultrasonic ToF sensor targets IoT and robotics

Ceva launches multi-protocol wireless platform IPs

Micron claims first quad-port SSD for SDVs
