
Discussions at chip design conferences rarely get heated. But a year ago at the International Symposium on Physical Design (ISPD), things got out of hand. Observers described it as a “trainwreck” and an “ambush”. The crux of the clash was whether Google’s AI solution to one of chip design’s thornier problems was really better than humans or state-of-the-art algorithms. It pitted established male EDA experts against two young female Google computer scientists, and the underlying argument had already led to the firing of one Google researcher.
This year at that same conference, a leader in the field, IEEE Fellow Andrew Kahng, hoped to put an end to the acrimony once and for all. He and colleagues at the University of California, San Diego delivered what he called “an open and transparent assessment” of Google’s reinforcement learning approach. Using Google’s open-source version of its process, called Circuit Training, and reverse-engineering some parts that were not clear enough for Kahng’s team, they set reinforcement learning against a human designer, commercial software, and state-of-the-art academic algorithms. Kahng declined to speak with IEEE Spectrum for this article, but he spoke to engineers last week at ISPD, which was held virtually.
In most cases, Circuit Training was not the winner, but it was competitive. That’s especially notable given that the experiments did not allow Circuit Training to use its signature ability: improving its performance by learning from other chip designs.
“Our goal has been clarity of understanding that will allow the community to move on,” he told engineers. Only time will tell whether it worked.
The Hows and the Whens
The problem in question is called placement. Basically, it is the process of determining where chunks of logic or memory should be placed on a chip in order to maximize the chip’s operating frequency while minimizing its power consumption and the area it takes up. Finding an optimal solution to this puzzle is among the most difficult problems around, with more possible permutations than the game Go.
But Go was ultimately defeated by a type of AI called deep reinforcement learning, and that’s just what former Google Brain researchers Azalia Mirhoseini and Anna Goldie applied to the placement problem. The scheme, then called Morpheus, treats placing large pieces of circuitry, called macros, as a game, learning to find an optimal solution. (The locations of macros have an outsized impact on the chip’s characteristics. In Circuit Training and Morpheus, a separate algorithm fills in the gaps with the smaller parts, called standard cells. Other methods use the same process for both macros and standard cells.)
Briefly, this is how it works: The chip’s design file starts as what’s called a netlist, which specifies which macros and cells are connected to which others and under what constraints. The standard cells are then collected into clusters to help speed up the training process. Circuit Training then starts placing the macros on the chip “canvas” one at a time. When the last one is down, a separate algorithm fills in the gaps with the standard cells, and the system spits out a quick evaluation of the attempt, encompassing the length of the wiring (longer is worse), how densely packed it is (more dense is worse), and how congested the wiring is (you guessed it, worse). Called proxy cost, this acts like the score would in a reinforcement learning system that was figuring out how to play a video game. The score is used as feedback to adjust the neural network, and it tries again. Wash, rinse, repeat. When the system has finally learned its task, commercial software does a full evaluation of the complete placement, generating the kind of metrics that chip designers care about, such as area, power consumption, and constraints on frequency.
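To make the scoring step concrete, here is a minimal Python sketch of a proxy cost that combines the three ingredients named above: wire length, density, and congestion. It is an illustration under stated assumptions, not Circuit Training’s actual code. The weights, the function names, and the use of half-perimeter wire length as the wiring estimate are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyWeights:
    # Hypothetical weights chosen for illustration only.
    wirelength: float = 1.0
    density: float = 0.5
    congestion: float = 0.5


def half_perimeter_wirelength(nets, positions):
    """Estimate wiring length as the sum, over all nets, of the
    half-perimeter of the bounding box around each net's pins.
    (Assumed estimator; a common stand-in for routed wire length.)"""
    total = 0.0
    for net in nets:
        xs = [positions[name][0] for name in net]
        ys = [positions[name][1] for name in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def proxy_cost(wirelength, density, congestion, weights=ProxyWeights()):
    """Weighted sum of the three penalties; lower is better."""
    return (weights.wirelength * wirelength
            + weights.density * density
            + weights.congestion * congestion)


def reward(wirelength, density, congestion, weights=ProxyWeights()):
    """A reinforcement learning agent maximizes reward, so the
    proxy cost is negated to turn 'lower is better' into a score."""
    return -proxy_cost(wirelength, density, congestion, weights)


# Toy example: three blocks on a canvas, connected by two nets.
positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
nets = [["a", "b"], ["b", "c"]]
wl = half_perimeter_wirelength(nets, positions)
score = reward(wl, density=0.8, congestion=0.4)
```

In a training loop, a score like this would be computed once per completed placement and fed back to adjust the neural network before the next attempt.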
Mirhoseini and Goldie published the results and method of Morpheus in Nature in June 2021, following a seven-month review process. (Kahng was reviewer #3.) And the technique was used to design more than one generation of Google’s TPU AI accelerator chips. (So yes, data you used today may have been processed by an AI running on a chip partly designed by an AI. But that’s increasingly the case as EDA vendors such as Cadence and Synopsys go all in on AI-assisted chip design.) In January 2022, they released an open-source version, Circuit Training, on GitHub. But Kahng and others claim that even this version was not complete enough to reproduce the research.
In response to the Nature publication, a separate group of engineers, mostly inside Google, began research aimed at what they believed to be a better way of comparing reinforcement learning to established algorithms. But this was no friendly rivalry. According to press reports, its leader, Satrajit Chatterjee, repeatedly undermined Mirhoseini and Goldie personally and was fired for it in 2022.
While still at Google, Chatterjee’s team produced a paper titled “Stronger Baselines”, critical of the research published in Nature. He sought to have it presented at a conference, but after review by an independent resolution committee, Google refused. After his termination, an early version of the paper was leaked via an anonymous Twitter account just ahead of ISPD in 2022, leading to the public confrontation.
Benchmarks, Baselines, and Reproducibility
When IEEE Spectrum spoke with EDA experts following ISPD 2022, detractors had three interrelated concerns: benchmarks, baselines, and reproducibility.
Benchmarks are openly available blocks of circuitry that researchers test their new algorithms on. The benchmarks available when Google began its work were already about two decades old, and their relevance to modern chips is debated. University of Calgary professor Laleh Behjat compares it to planning a modern city versus planning a 17th century one. The infrastructure needed for each is different, she says. However, others point out that there is no way for the research community to progress without everyone testing on the same set of benchmarks.
Instead of the benchmarks available at the time, the Nature paper focused on doing the placement for Google’s TPU, a complex and cutting-edge chip whose design is not available to researchers outside of Google. The leaked “Stronger Baselines” work placed TPU blocks but also used the old benchmarks. While Kahng’s new work also did placements for the old benchmarks, its main focus was on three more modern designs, two of which are newly available, including a multicore RISC-V processor.
Baselines are the state-of-the-art algorithms your new system competes against. The Nature paper compared a human expert using a commercial tool to reinforcement learning and to the leading academic algorithm of the time, RePlAce. Stronger Baselines contended that the Nature work did not properly execute RePlAce and that another algorithm, simulated annealing, needed to be compared as well. (To be fair, simulated annealing results appeared in the addendum to the Nature paper.)
But it’s the reproducibility bit that Kahng was really focused on. He claims that Circuit Training, as it was posted to GitHub, fell short of allowing an independent group to fully reproduce the procedure. So his team took it upon themselves to reverse engineer what they saw as missing parts and parameters.
Importantly, Kahng’s group publicly documented the progress, code, datasets, and procedure as an example of how such work can enhance reproducibility. In a first, they even managed to persuade EDA software companies Cadence and Synopsys to allow the publication of the high-level scripts used in the experiments. “This was an absolute watershed moment for our field,” said Kahng.
The UCSD effort, which is referred to simply as MacroPlacement, was not meant to be a one-to-one redo of either the Nature paper or the leaked Stronger Baselines work. Besides using modern public benchmarks unavailable in 2020 and 2021, MacroPlacement compares Circuit Training (though not the most recent version) to a new commercial tool, Cadence’s Innovus concurrent macro placer (CMP), and to a method developed at Nvidia called AutoDMP that is so new it was only publicly introduced at ISPD 2023 minutes before Kahng spoke.
Reinforcement Learning vs. Everybody
Kahng’s paper reports results on the three modern benchmark designs implemented using two technologies: NanGate45, which is open source, and GF12, which is a commercial GlobalFoundries FinFET process. (The TPU results reported in Nature used even more advanced process technologies.) Kahng’s team measured the same six metrics Mirhoseini and Goldie did in their Nature paper: area, routed wire length, power, two timing metrics, and the previously mentioned proxy cost. (Proxy cost is not an actual metric used in production, but it was included to mirror the Nature paper.) The results were mixed.
As it did in the original Nature paper, reinforcement learning beat RePlAce on most metrics for which there was a head-to-head comparison. (RePlAce did not produce an answer for the largest of the three designs.) Versus simulated annealing, Circuit Training won more than it lost on the production metrics.
For these experiments, the big winners were the newest entrants, CMP and AutoDMP, which delivered the best metrics in more cases than any other method.
In the tests meant to match Stronger Baselines, using older benchmarks, both RePlAce and simulated annealing almost always beat reinforcement learning. But these results report only one production metric, wire length, so they don’t present a complete picture, argue Mirhoseini and Goldie.
A Lack of Learning
Understandably, Mirhoseini and Goldie have their own criticisms of the MacroPlacement work, but perhaps the most important is that it did not use neural networks that had been pretrained on other chip designs, robbing their method of its main advantage. Circuit Training, “unlike any of the other methods presented, can learn from experience, producing better placements more quickly with every problem it sees,” they wrote in an email.
But in the MacroPlacement experiments, each Circuit Training result came from a neural network that had never seen a design before. “This is analogous to resetting AlphaGo before each match… and then forcing it to learn how to play Go from scratch every time it faced a new opponent!”
The results from the Nature paper bear this out, showing that the more blocks of TPU circuitry the system learned from, the better it placed macros for a block of circuitry it had not yet seen. It also showed that a reinforcement learning system that had been pretrained could produce a placement in six hours of the same quality as an untrained one after 40 hours.
New Controversy?
Kahng’s ISPD presentation emphasized a particular discrepancy between the methods described in Nature and those of the open-source version, Circuit Training. Recall that, as a preprocessing step, the reinforcement learning method gathers up the standard cells into clusters. In Circuit Training that step is enabled by commercial EDA software that outputs the netlist (which cells and macros are connected to each other) and an initial placement of the components.
According to Kahng, the existence of an initial placement in the Nature work was unknown to him even as a reviewer of the paper. According to Goldie, generating the initial placement, called physical synthesis, is standard industry practice because it guides the creation of the netlist, the input for macro placers. All placement methods in both Nature and MacroPlacement were given the same input netlists.
Does the initial placement somehow give reinforcement learning an advantage? Yes, according to Kahng. But it’s not clear from the experiments so far to what extent or even why. His group did experiments that fed three different impossible initial placements into Circuit Training and compared them to a real placement. Routed wire lengths for the impossible versions were between 7 and 10 percent worse.
Mirhoseini and Goldie counter that the initial placement information is only used for clustering standard cells, which reinforcement learning does not place. The macro-placing reinforcement learning portion has no knowledge of the initial placement, they say. What’s more, providing impossible initial placements may be like taking a sledgehammer to the standard cell clustering step and therefore giving the reinforcement learning system a false reward signal. “Kahng has introduced a disadvantage, not removed an advantage,” they write.
Kahng suggests that more carefully designed experiments are forthcoming.
Moving On
This dispute has certainly had consequences, most of them negative. Chatterjee is locked in a wrongful termination lawsuit with Google. Kahng and his team have spent a great deal of time and effort reconstructing work done, perhaps several times, years ago. After spending years fending off criticism from unpublished and unrefereed research, Goldie and Mirhoseini, whose goal was to help improve chip design, have left a field of engineering that has historically struggled to attract female talent. Since August 2022 they have been at Anthropic working on reinforcement learning for large language models.
If there’s a bright side, it’s that Kahng’s effort offers a model for open and reproducible research and added to the store of openly available tools to push this part of chip design forward. That said, Mirhoseini and Goldie’s group at Google had already made an open-source version of their research, which is not common for industry research and required some non-trivial engineering work.
Despite all the drama, the use of machine learning in general, and reinforcement learning in particular, in chip design has only spread. More than one group was able to build on Morpheus even before it was made open source. And machine learning is assisting in ever-growing aspects of commercial EDA tools, such as those from Synopsys and Cadence.
But all that good could have happened without the unpleasantness.
To Probe Further:
The MacroPlacement project is extensively documented on GitHub.
Google’s Circuit Training code is available on GitHub.
Andrew Kahng has documented his involvement with the Nature paper. Nature published the peer review file in 2022.