New Inference Framework Speeds up LLMs Without Raising Costs

Large language models (LLMs) are some of today’s most impactful technologies. They’re what make advanced chatbots and generative AI possible, but as their functionality grows, so too do their costs and complexity. A new framework from Stanford researchers could change that.

In a recent research paper, a team unveiled a modular inference framework called Archon. Inference is the stage where LLMs draw on what they learned in training to generate responses or make predictions on new data. It demands considerable computation, so it is often slow, expensive, or both. Archon speeds it up without raising costs. Read more.
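The modular idea behind frameworks like Archon is composing inference-time components, for example fanning a prompt out to several models and then ranking the candidate responses. A minimal sketch of that "generate then rank" pattern follows; the function names and the stand-in model callables are illustrative assumptions, not Archon's actual API.

```python
# Hedged sketch of an inference-time "generate then rank" pipeline.
# `models` are hypothetical callables standing in for LLM endpoints;
# `score_fn` stands in for a judge/ranking component.

def generate_candidates(prompt, models):
    # Fan the prompt out to each model and collect candidate responses.
    return [model(prompt) for model in models]

def rank_candidates(candidates, score_fn):
    # Order candidates best-first according to the scoring component.
    return sorted(candidates, key=score_fn, reverse=True)

def composed_inference(prompt, models, score_fn):
    # Compose the two modules: ensemble generation, then ranking.
    candidates = generate_candidates(prompt, models)
    return rank_candidates(candidates, score_fn)[0]

if __name__ == "__main__":
    # Toy usage: two stand-in "models" and a length-based judge.
    models = [
        lambda p: p + " (short answer)",
        lambda p: p + " (a longer, more detailed answer)",
    ]
    print(composed_inference("Q:", models, score_fn=len))
```

In a real system each stage would be a separate model call, which is where an inference framework can trade latency against cost by running the generation step concurrently.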

Power Modules Result in Smaller (and Lighter) Vehicles

For automakers, weight reduction in electric vehicles (EVs) is at or near the top of the engineering priority list. EVs are typically heavier than their non-electric counterparts, so even when weight savings come in small increments, every little bit counts.

Three new dc-to-dc converter power modules from Vicor can remove significant weight by shrinking some components or eliminating them entirely. Read more.

Third Party IP Block Licensing from Sondrel

Sondrel has made its in-house IP available for licensing, including a suite of IP blocks for standard SoC management that handle device start-up, clock and reset control, and power-domain management. The SoC Management Suite is divided into three parts: the PMU (Power Management Unit), the URG (Universal Reset Generator), and the UCG (Universal Clock Generator). Read more.

Embedded AI & Machine Learning, a property of Embedded Computing Design, is the leading source of "how-to" technical articles, videos, blogs, conferences, and podcasts for embedded engineers. Topics covered include AI, AIoT, Machine Learning, Computer Vision, Security, and more.

View the Latest White Papers

Edge Computing Boosts Retail Applications Through Improved Image Recognition

Unlock Ultimate SSD Reliability: Discover How Advanced Technology Protects your Critical Data

Optimizing IoT Project Performance in the Hyperscale Era

AI-powered Six-Sided Product Case Inspection at ADLINK Factory

Lower your power consumption for battery-operated smart devices

Using off-the-shelf transformers to drive SiC FETs

Speaking Opportunities

IoT Device Security Conference

Electronica 2024 Opportunities

Marketer’s Guide to electronica

electronica Best in Show Award Submissions

Content

2024 Embedded Computing Editorial Calendar

2024 AI/Machine Learning Content Calendar

2024 Automotive Content Calendar

2024 Security Content Calendar

Content Creation Services

Events

FMS: The Future of Memory and Storage

Computex

CES

RISC-V Summit

Content/Lead Strategies

Guide to Content Marketing in the Electronics Space

Guide to Generating Leads in the Embedded Space

New Products

New Product/News Submissions

Product of the Week

NEW: Application Highlight

NEW: Datasheets

Other Opportunities

Dev Kit Weekly

Embedded Toolbox

Embedded Solutions Video

Podcasts

E-newsletters

#ew24 #ai #iot #aiot #embeddedsystems #machinelearning #computex2024 #electronicafair #riscvsummit #embeddedworld #ewna #ewna24 #ewconference #ewnorthamerica

Godwin Josh

Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer

1mo

The focus on efficiency gains in LLMs through new inference frameworks raises questions about whether this prioritizes speed over other crucial aspects like explainability and bias mitigation. Recent research on "AI for Social Good" emphasizes the need for ethical considerations alongside performance improvements. How would these power modules impact the accessibility of electric vehicles in developing nations with limited infrastructure?
