AI models that once cost millions of dollars a month to run could soon fit into the three-by-five-inch phone in your pocket.
That may be a bit of an exaggeration, but recent technological advancements have brought edge AI—a method of running AI models directly on devices—back into the limelight. Engineers from Carnegie Mellon University, the University of Washington, Shanghai Jiao Tong University and AI startup OctoML recently collaborated to get large language models, or LLMs, running on iPhones, Android phones, PCs and web browsers.
If that research can be translated into everyday reality—a big if, to be sure—it could dramatically expand how and where people use generative AI. Today, most AI models run in the cloud, where their developers rent hefty servers from providers like Microsoft Azure, AWS, and Google Cloud to power the millions of queries those models might receive every day.