Mythic Redefines Edge AI by Combining Analog Processing and Flash Memory
A seemingly unique and fruitful approach, analog processing has been the key to Mythic AI’s success. Its newest processor could be a strong contender in the world of edge computing.
The rise of artificial intelligence and edge computing has created an intense demand for new computing platforms that provide as much performance as possible while consuming as little power as possible. This demand has pushed engineers everywhere to reevaluate computing as we know it.
Mythic AI is a company that is taking an entirely novel approach to edge computing, ditching digital compute in favor of analog computing. Calling the results impressive may be an understatement, and with over $160 million in funding, the company is poised to make waves in the industry.
The power efficiency of Mythic's Analog Compute. Image [modified] used courtesy of Mythic
Now, Mythic is making those waves with today's release of its brand new analog matrix processor (AMP).
In-memory Compute + Analog Processing
Mythic's approach to edge computing is unique in two ways.
First and foremost, Mythic is one of the few companies, at this moment, to use analog processing as a solution to edge computing. Analog processing is an older technology that theoretically could perform better than digital but has historically been plagued by large component sizes, which ultimately limit speed and scalability.
Mythic, however, has found a way to sidestep this limitation by marrying analog processing with embedded flash memory. This combination gives Mythic the performance of analog along with the speed and compute density of flash.
An analog computing cell. Image used courtesy of Mythic
Secondly, Mythic's approach ditches the von Neumann architecture and instead opts for in-memory computing. As chips have scaled down over the years, research has shown that the interconnect has become the bottleneck in system performance and power consumption.
Specifically, data movement in and out of memory has become one of the biggest power consumers on ICs. In-memory compute instead allows Mythic to create extremely low-power, low-latency computing architectures that are not bogged down by the memory bottleneck.
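To make the idea concrete, here is a minimal, idealized sketch of an analog in-memory matrix-vector multiply. It is not Mythic's circuit. It assumes the textbook model in which each stored weight behaves as a conductance, each input arrives as a voltage, and each output column sums the resulting currents on a shared wire. The function name `analog_matvec` and all values are hypothetical.

```python
# Idealized model of analog in-memory matrix-vector multiplication.
# Assumption (not from the article): each stored weight acts as a
# conductance G, each input is applied as a voltage V, and each output
# column sums the per-cell currents I = G * V on a shared wire.

def analog_matvec(conductances, voltages):
    """Return the summed output current of each column.

    conductances[i][j] is the cell linking input row i to output column j.
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for row, v in zip(conductances, voltages):
        for j, g in enumerate(row):
            currents[j] += g * v  # Ohm's law per cell, summed per column
    return currents


# Hypothetical 3-input, 2-output array; values are for illustration only.
G = [[1.0, 0.5],
     [0.0, 2.0],
     [3.0, 1.0]]
V = [0.2, 0.1, 0.3]
I = analog_matvec(G, V)
print(I)
```

The point of the sketch is that the weights never move: the multiply-accumulate happens where the weights are stored, so only the inputs and outputs cross the array boundary.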
Exploring the M1076
Mythic today announced its newest analog matrix processor, the M1076.
Image of the M1076. Image used courtesy of Mythic
The M1076 is the company's follow-up to its earlier M1108, this time offering even lower power consumption and a smaller footprint at the cost of some performance.
Still, the M1076 is highly performant: Mythic claims up to 25 TOPS at a power consumption between 3 and 4 W, for a staggering best case of 8.33 TOPS/W.
In comparison, an NVIDIA Xavier AGX GPU peaks at 32 TOPS at 30 W and occupies an area of 8700 mm^2. The M1076, on the other hand, comes in at a small 360 mm^2 and offers comparable performance at roughly 1/10 the power consumption.
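The efficiency figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this article; the helper name `tops_per_watt` is just for illustration.

```python
# Efficiency arithmetic using only the figures quoted in the article.

def tops_per_watt(tops, watts):
    """Compute efficiency in TOPS per watt."""
    return tops / watts


m1076_tops = 25.0
m1076_power_w = (3.0, 4.0)            # quoted range: 3 to 4 W
xavier_tops, xavier_power_w = 32.0, 30.0

m1076_best = tops_per_watt(m1076_tops, m1076_power_w[0])
m1076_worst = tops_per_watt(m1076_tops, m1076_power_w[1])
xavier_eff = tops_per_watt(xavier_tops, xavier_power_w)

# Fraction of the Xavier AGX power budget that the M1076 draws.
power_ratio_low = m1076_power_w[0] / xavier_power_w
power_ratio_high = m1076_power_w[1] / xavier_power_w

print(f"M1076:      {m1076_worst:.2f} to {m1076_best:.2f} TOPS/W")
print(f"Xavier AGX: {xavier_eff:.2f} TOPS/W")
print(f"Power ratio: {power_ratio_low:.2f} to {power_ratio_high:.2f}")
```

On these numbers the M1076 delivers about 78% of the Xavier AGX's peak TOPS while drawing between 10% and 13% of its power, which is where the "1/10 the power" comparison comes from at the 3 W end of the range.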
Explicitly designed for AI, the M1076 is optimized to run standard, complex deep neural networks for applications like object classification, object detection, depth estimation, and more.
Architecturally, the processor utilizes an array of 76 AMP tiles, each containing a Mythic Analog Compute Engine (Mythic ACE™). According to Mythic, the chip as a whole can store up to 80 M DNN weights, allowing the device to run multiple DNNs entirely on-chip with no need for external DRAM.
Further, the M1076 supports multiple data types, including INT4, INT8, and INT16, which makes it look like a promising platform for TinyML solutions.
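For readers unfamiliar with these integer data types, the following is a minimal sketch of symmetric quantization, the generic technique of mapping floating-point weights onto signed integers of a given width. It is not Mythic's toolchain or API; the function names and the sample weights are hypothetical.

```python
# Generic symmetric per-tensor quantization sketch.
# Assumption: this is NOT Mythic's toolchain, only an illustration of
# what INT4, INT8, and INT16 weight precision means.

def int_range(bits):
    """Return (min, max) for a signed two's-complement integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def quantize(weights, bits):
    """Map floats to signed integers sharing one scale factor."""
    _, qmax = int_range(bits)
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs else 1.0
    # Clamp symmetrically so the largest magnitude maps to +/- qmax.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from quantized integers."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical example weights
for bits in (4, 8, 16):
    q, scale = quantize(weights, bits)
    restored = dequantize(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"INT{bits}: range={int_range(bits)} max error={worst:.6f}")
```

Lower bit widths make each weight cheaper to store but increase the rounding error, which is the trade-off a developer weighs when choosing between INT4, INT8, and INT16.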
Mythic Moving Forward
Mythic has released M.2 accelerator cards, including the ME1076 and the MM1076, as well as the processor itself.
Mythic's MM1076 and ME1076. Image used courtesy of Mythic
With the help of these accelerators, Mythic hopes to make its technology easier to integrate and to see it widely adopted in industry.
The work at Mythic is inspiring, and the results show that the technology has the potential to change the landscape of computing as we know it.
This sounds like a nice little AI system. With the current design I can see how it can be used as a general purpose processor.
However, I wonder if it is in the works to take the design one step further and connect the ‘pixels’ of an image sensor directly to the inputs of this type of chip. Each ‘pixel’ is already analog. This could then feed a whole image all at once, and the image could be continuously changing. This would eliminate the steps of reading the analog value of each ‘pixel’, converting it to digital, creating an array of digital values, and then copying those to the flash memory as the input of the chip.
Is it not possible to feed the data to the inputs in such a way?
If that can be done, then the neural net could be configured by the flash memory, and the output could stay analog and be converted only when digital is needed.
This may sound a little vague, but it should be possible to create a system that is not tied to a clock. In other words, processing would not be measured in frames per second, but would instead run as fast as the analog circuitry can handle.
Sounds like fun to me. 😊