Some fear that the development of electronics will be stunted at the end of Moore's law. Here's a retrospective look at how hardware and software have been used to approach the problem of transistor counts on silicon chips.

Gordon Moore wrote a paper in 1965 introducing what became known as Moore’s law, which states that the number of transistors on a silicon chip doubles every year (later revised to every two years). However, the laws of physics are starting to challenge this doubling trend, and transistor counts are approaching a limit.
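To get a feel for how quickly such doubling compounds, here is a minimal sketch in Python. The function name and the ten-year horizon are assumptions chosen purely for illustration; the starting figure of 2300 transistors is the Intel 4004 count quoted later in this article, and the two doubling periods are the one-year and two-year versions of the law.

```python
def projected_transistors(start_count, years, doubling_period_years):
    """Transistor count after `years` if the count doubles every `doubling_period_years`."""
    return start_count * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    start = 2300  # Intel 4004 transistor count quoted in this article
    for period in (1.0, 2.0):
        count = projected_transistors(start, years=10, doubling_period_years=period)
        print(f"doubling every {period:g} year(s): {count:,.0f} transistors after 10 years")
```

This is an idealized projection, not a record of real chips. The point is only that halving the doubling period squares the growth factor over the same time span.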

How will the industry and consumers alike cope with an upper limit on the computing power of their devices? What can we do as engineers and designers to mitigate this inevitable problem?

 

The Problem with Transistor Count

For the past 50 years, designers have relied on rising transistor counts as the main contributor to technological advances. The Intel 4004, the first commercially available microprocessor, enabled the creation of computers with a transistor count of just 2300 and a data width of four bits. As time progressed, IC manufacturers were able to increase the number of transistors on a single semiconductor device, which led to more powerful processors such as the 8080, Z80, and 68000. With each step, computer designs became more complex and more capable, gaining full-color graphics, faster bus protocols, and better networking.

While many of these advances were contributions from software and hardware engineers alike, the main driving force has been the rise in transistor count. It is this reliance on transistor count that must be addressed soon, or technology could hit a brick wall while designers wait for scientists to find a solution.

 

Intel 4004 helped to revolutionize the planet. Image courtesy Thomas Nguyen [CC BY-SA 4.0]

 

So what has changed? Why are smaller transistors harder to make? In the past, size reduction was a matter of using shorter-wavelength UV light, higher-contrast masks, and better manufacturing techniques such as spin-coated layers. Shrinking a transistor's features from 10 μm to 5 μm is relatively trivial, whereas shrinking them from 5 nm to 4 nm is a very difficult task. When a transistor is measured in nanometers, feature sizes start to become atomic (i.e., they consist of fewer than 100 atoms in any one direction). For example, atoms typically range from 0.1 nm to 0.5 nm in size, which would make a square 4 nm feature as small as 8 by 8 atoms.
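The arithmetic behind that estimate can be checked with a few lines of Python. This is a rough sketch only: the function name is hypothetical, and it treats atoms as touching spheres of a single diameter, which ignores real crystal spacing.

```python
def atoms_across(feature_nm, atom_nm):
    """Approximate number of atoms spanning a feature, assuming atoms sit edge to edge."""
    return round(feature_nm / atom_nm)


if __name__ == "__main__":
    # Atomic diameters at both ends of the 0.1 nm to 0.5 nm range given above.
    for atom_nm in (0.1, 0.5):
        n = atoms_across(4, atom_nm)
        print(f"4 nm feature with {atom_nm} nm atoms: {n} by {n} atoms")
    # For comparison, an older 10 um (10,000 nm) feature with the largest atoms.
    print("10 um feature with 0.5 nm atoms:", atoms_across(10_000, 0.5), "atoms across")
```

Even with the smallest atoms in that range, a 4 nm feature is only a few dozen atoms wide, well under the 100-atom threshold mentioned above.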

Quantum effects also become a problem when dealing with atomic-sized features. For example, quantum tunneling involves electrons passing through insulating barriers and thus contributing to current draw. This current draw increases the temperature of the silicon and thus reduces the overall efficiency.

So unless scientists can find new methods for silicon production or exotic new materials, it is up to us as engineers to do what we do best: solve problems with the resources we have. Luckily, there is a method that can be employed right now which could not only produce faster machines but also make an eight-core Intel i7 look stupidly overpowered.

 

The Software Solution

A researcher at the Allen Institute for Artificial Intelligence, colloquially known as AI2, has developed a brilliant solution, reminiscent of the ZX Spectrum and Commodore 64 from the 1980s. Dr. Farhadi, in conjunction with his research team, has developed a program that can identify thousands of different objects. This may not sound that impressive until you see what computer it runs on.

The team's coding methods have produced software that runs 58 times faster and uses less than 1/32 the memory rival programs use. All of this runs on a $5 Raspberry Pi.

Instead of relying on high-end processors and large memory models, the team took the time to develop incredibly clever algorithms while keeping memory usage to a minimum and routines as fast as possible. It is this thinking that may produce “faster computers” in the future and help to drive further technological advances.
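The article does not describe the team's actual method, so the following is not their code. It is only a generic illustration, with hypothetical function names, of one way a program can trade precision for memory and speed: storing just the sign of each value as a single bit instead of a 32-bit number, and replacing multiplications with an XOR and a bit count.

```python
def pack_signs(values):
    """Pack the sign of each value into one integer (bit i is 1 if values[i] >= 0)."""
    packed = 0
    for i, v in enumerate(values):
        if v >= 0:
            packed |= 1 << i
    return packed


def binary_dot(a_bits, b_bits, count):
    """Dot product of two +1/-1 vectors stored as packed sign bits.

    Positions where the bits agree contribute +1 and positions where they
    differ contribute -1, so the result is count - 2 * mismatches.
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return count - 2 * mismatches


if __name__ == "__main__":
    a = [0.7, -1.2, 3.0, -0.1]   # signs: +1, -1, +1, -1
    b = [-0.5, -2.0, 1.0, 4.0]   # signs: -1, -1, +1, +1
    print(binary_dot(pack_signs(a), pack_signs(b), len(a)))
```

On hardware with 32-bit words, 32 such sign bits fit in the space of one 32-bit value. Python integers are arbitrary precision, so the saving here is conceptual rather than measured.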

So why is this similar to the good old days? In the past, computer programmers using machines like the ZX Spectrum and Commodore 64 had very little memory and processing power. Because of this, extreme planning and care were needed to ensure that their programs would run as fast as possible while leaving memory for other tasks or temporary storage. Instead of using the inbuilt BASIC to code games (which was slow and bulky), programmers would use assembly language to take full advantage of the hardware and processing power available to them. This is one of the reasons why my Amstrad PCW 8512, which uses a Z80 CPU, is a fully featured word processor that is complete enough to save documents, check spelling, print, and even style text.

 

Amstrad PCW 8512, a near complete word processor with a Z80! Image courtesy of Johann H. Addicks [CC-BY-SA-3.0]

 

But, over time, companies and developers wanted faster turnaround with software, which brought in higher-level languages such as C, C++, and Java. While these languages are incredibly useful and help to produce software quickly, they also cause issues with optimization and memory requirements. Modern compilers are good at optimization (if the correct options are selected), but even then programmers often make no attempt to optimize their own code.

This lack of care in programming has resulted in a heavy dependence on powerful processors, larger memory models, and faster bus protocols. Programmers in the future may have to use high-level languages in conjunction with assembler to ensure that routines are as fast and as well optimized as possible.

However, not all the blame can be put on the programmers. Hardware engineers are also part of the problem.

 

The Hardware Solution

In the past, CPUs such as the Z80 were very good at doing basic tasks such as file sorting, data lookup, and memory manipulation. Such processors were also very good at basic integer arithmetic such as adding and subtracting, but could they handle numbers with decimal points (i.e., floating point numbers)? The answer is no. They were terrible at such operations because the hardware had no floating point support, so those operations had to be coded in software.
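To show what "coded in software" means, here is a minimal sketch of floating point arithmetic built only from integer shifts, adds, and multiplies. It is written in Python for readability, not as a reproduction of any real Z80 routine. The (mantissa, exponent) pair format, the 16-bit mantissa width, and all function names are assumptions made for this illustration.

```python
MANT_BITS = 16  # assumed mantissa width, chosen only for illustration


def normalize(mant, exp):
    """Keep |mant| below 2**MANT_BITS by dropping low bits (truncation toward zero)."""
    sign = -1 if mant < 0 else 1
    mant = abs(mant)
    while mant >= (1 << MANT_BITS):
        mant >>= 1
        exp += 1
    return sign * mant, exp


def soft_add(a, b):
    """Add two (mantissa, exponent) numbers, each meaning mantissa * 2**exponent."""
    (ma, ea), (mb, eb) = a, b
    e = min(ea, eb)  # align both mantissas to the smaller exponent
    return normalize((ma << (ea - e)) + (mb << (eb - e)), e)


def soft_mul(a, b):
    """Multiply two (mantissa, exponent) numbers: multiply mantissas, add exponents."""
    (ma, ea), (mb, eb) = a, b
    return normalize(ma * mb, ea + eb)


def to_float(x):
    """Convert to a native float for display only; the arithmetic above is integer-only."""
    mant, exp = x
    return mant * 2.0 ** exp


if __name__ == "__main__":
    x = (3, -1)  # 3 * 2**-1 = 1.5
    y = (9, -2)  # 9 * 2**-2 = 2.25
    print(to_float(soft_add(x, y)), to_float(soft_mul(x, y)))
```

Every addition needs an alignment step and every result needs normalizing, which is why such routines were slow on processors that could only work on small integers.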

To overcome this problem, it was not uncommon to see a co-processor: a specialized piece of hardware designed to handle specific operations. One example was the Intel 80287 FPU, which was specifically designed to perform floating point operations. This took pressure off the software in both speed and memory usage.

Modern CPUs do have inbuilt FPUs and other useful functions but there are still many functions that could and probably should be implemented in silicon. For example, encryption and decryption are common tasks performed by computers when backing up data, sending private information over a network connection, and acquiring network access. So instead of performing encryption in software, it could be moved to a secure cryptoprocessor which would remove some of the security features needed in programs.

 

Like the GPU, let's bring back the co-processor! Image courtesy of Konstantin Lanzet [CC-BY-SA-3.0] via the CPU collection

 

Moving features onto dedicated hardware is not the only way to cope with a limited transistor count. One possibility, albeit an extreme one, would be to abandon the industry-standard IBM PC architecture and develop a whole new architecture centered around high-speed computing and modern hardware. IBM PC backward compatibility means that software which runs on Windows XP or Windows 7 can run on a Windows 10 system, but it also means that MS-DOS can run on an Intel i7. There needs to be a point where old programs and operating systems are left in the past so new technology can bring about faster computers and higher productivity.

 

Summary

Transistor real estate is becoming very expensive, and programs such as operating systems are demanding ever more processing power while offering little in return. Unless engineers band together and start to think about how we can improve technology with the tools we already have, we can expect to see stagnation in computing capabilities as a whole.

 

Comments


  • sjgallagher2 2017-01-26

    I disagree with a few aspects of this article. The reason that software is abstracted is not exactly to make things “easier”, but rather to make things “less difficult”. Primarily, higher level languages make code simpler, and thus reduce the number of severe bugs. Software projects can be massive, relying on many other tools that may be changing. Using “clever” algorithms is not a good solution - it is just asking to be undone by unforeseen problems. For example, in the cryptography community it is common knowledge that you should never use an algorithm that is private, because private algorithms are going to be rife with issues that you may not know about (mathematical, software-related, etc). The well known and reused algorithms known to all the public have been tested and analysed to death.

    This is the idea behind software reuse, which increases software complexity, and thus requires high levels of abstraction in order to perform correctly. Higher levels of complexity and abstraction require more and more memory. Hence the need for high computing power. Some problems simply cannot be avoided, like certain simulations which must be solved analytically in software. If you have one million frames of a fluid simulation and each frame simulates billions of particles, you have incredible numbers of calculations (whether simple arithmetic or not) to perform. In software and computer hardware, there is no way around this. You must either simulate using the computer, or perform experiments. But one huge reason for simulation is to save cost and reduce interference. Think about simulating an IC with SPICE before sending it to the manufacturing line. If computers didn’t have the computing power they do, simulating large ICs would be impossible, and simply “experimenting” with it is infeasible outside of running a production-quantity line.

    The idea is sound - as Moore’s law slows to a halt, engineers need to work on various methods for improving computation speeds. Perhaps new architectures, power and clock management, logic units, or even AI integration. But these are optimizations, and will not likely be nearly as dramatic as increasing transistor density. Your article did a fair job of explaining this. But the examples used are not good, and the ideas about software becoming ultra-optimized with low level programming are definitely not good. We’ll quickly return to days where your computer could run a maximum of 20 days before it crashes.