Engineering Community Portal
Welcome – From the Editor
Welcome to the Engineering Portal on MERLOT. Here, you will find lots of resources on a wide variety of topics ranging from aerospace engineering to petroleum engineering to help you with your teaching and research.
As you scroll this page, you will find many Engineering resources. This includes the most recently added Engineering material and members; journals and publications and Engineering education alerts and twitter feeds.
Showcase
Over 150 emeddable or downloadable 3D Simulations in the subject area of Automation, Electro/Mechanical, Process Control, and Renewable Energy. Short 3-7 minute simulations that cover a range of engineering topics to help students understand conceptual engineering topics.
Each video is hosted on Vimeo and can be played, embedded, or downloaded for use in the classroom or online. Other option includes an embeddable HTML player created in Storyline with review questions for each simulation that reinforce the concepts learned.
Made possible under a Department of Labor grant. Extensive storyboard/ scripting work with instructors and industry experts to ensure content is accurate and up to date.
New Materials
-
Operaciones con Señales
This is a simulator learning object addresses topics in Electrical Engineering. It belongs to the collection Simulações...
-
Risk and Reliability for Engineers | TU Delft OPEN Textbooks
This book covers a wide range of topics that involve the use of probability to solve problems in engineering design and...
-
Mechanics & Science of Materials - ENGR 330
This open textbook is being utilized in an engineering course for undergraduate or graduate students by Joshua Steimel at...
-
Erie Canalway National Heritage Corridor :: Learn
The Erie Canal carried more than cargo. It was the Internet of its day, carrying people and ideas and bringing the young...
-
Combinación de Funciones en Tiempo Discreto
This is a simulator learning object addresses topics in Electrical Engineering. It belongs to the collection Simulações...
-
Engineers & STEM: 9 Qui est l’inventeur de l’autocuiseur?
Cette vidéo montre comment un ingénieur a contribué au développement de l’autocuiseur.
-
Fracture Mechanics: An Engineering Primer | TU Delft OPEN Textbooks
In this second edition, which is the result of numerous revisions, updates and additions, the authors cover the basic...
-
Engineers & STEM: 9 Who was the inventor of the pressure cooker?
This video shows how an engineer contributed to the development of the pressure cooker.
-
Engineers & STEM: 9 ¿Quién fue el inventor de la olla a presión?
Este video muestra cómo un ingeniero contribuyó al desarrollo de la olla a presión.
-
Combinación de Funciones en Tiempo Continuo
This is a simulator learning object addresses topics in Electrical Engineering. It belongs to the collection Simulações...
-
Engineers & STEM: 9 Quem foi o inventor da panela de pressão?
Este vídeo mostra como um engenheiro contribuiu para o desenvolvimento da panela de pressão..
-
Control Actions
This is a simulator learning object addresses topics in Electrical Engineering. It belongs to the collection Simulações...
-
Engineering Physics II / PHY 205
This open-source textbook is being utilized in a physics course for undergraduate students by Cynthia Trevisan at...
-
Ações de Controle
This is a simulator learning object addresses topics in Electrical Engineering. It belongs to the collection Simulações...
-
Radio Systems Engineering, Revised First Edition
Using a systems framework, this textbook provides a clear and comprehensive introduction to the performance, analysis,...
-
Understanding the Area Moment of Inertia
The area moment of inertia (also called the second moment of area) defines the resistance of a cross-section to bending,...
-
Understanding Young's Modulus
Young's modulus is a crucial mechanical property in engineering, as it defines the stiffness of a material and tells us...
-
Understanding Poisson's Ratio
In this video I take a detailed look at Poisson's ratio, a really important material property which helps describe how a...
-
Introductory Dynamics: 2D Kinematics and Kinetics of Point Masses and Rigid Bodies | TU Delft OPEN Textbooks
Motion is all around us, the universe is full of moving matter and this motion is surprisingly predictable. The field of...
-
Operaciones con Números Complejos
This is a simulator learning object addresses topics in Electrical Engineering. It belongs to the collection Simulações...
-
Essential Fluids with MATLAB and Octave - Pasrt 1 (theory)
This book covers the requisite theory for the basic study of fluid mechanics at low speeds. This book is unique in that...
-
Understanding Conduction and the Heat Equation
In this video we take a look at conduction and the heat equation. Fourier's law is used to calculate the rate at which...
-
Understanding Thermal Radiation
Get access to my bonus video Understanding Dimensional Analysis on Nebula -...
-
Understanding Aerodynamic Lift
Humanity has long been obsessed with heavier-than-air flight, and to this day it remains a topic that is shrouded in a...
New Members
-
CIndy FurseUniversity of Utah
-
Sakthi Sanna university
-
Esakkinathan Banna university
-
Bartosz DulskiDTU
-
Umit GuresHarvard University
-
Sheila HillUniversity System of Georgia - Kennesaw State University
-
Emer HayesXiamen University
-
Jhon Adamlemondecommunications
-
Bob SuttonPurdue University System - School of Technology at South Bend Elkhart
-
Camden StokerWorkplace
-
Hélio Carlos BortolonPetrobras
-
Alfonso Rodríguez-PeñaUniversidad del Atlántico
-
Fana FilliMekelle University
-
Umer DrazMSM
-
Sundararajan VenkatadriagaramUniversity of California System - Riverside
-
Gamachu WakoyaMattu University
-
Yohannes BelachewHydraulic and Water resources engineering
-
Oleksandr HybaloUzhnu
-
Garrett OdomUniversity of Central Oklahoma
-
Kushal AdhikariJuniata College
-
selim ahmednew york
-
Nilesh ChoudharyIIT Tirupati
-
Energeia Dual Fuelenergas
Materials by Discipline
- Aerospace and Aeronautical Engineering (310)
- Agricultural and Biological Engineering (60)
- Audio Engineering (5)
- Biomedical Engineering (66)
- Chemical Engineering (229)
- Civil Engineering (639)
- Computer Engineering (394)
- Electrical Engineering (1363)
- Engineering Science (30)
- Environmental Engineering (169)
- Geological Engineering (255)
- Industrial and Systems (133)
- Manufacturing Engineering (101)
- Materials Science and Engineering (386)
- Mechanical Engineering (970)
- Mining Engineering (9)
- Nuclear Engineering (69)
- Ocean Engineering (14)
- Petroleum Engineering (24)
Journals & Publications
- Journal of Engineering Education
- European Journal of Engineering Education
- Advances in Engineering Education
- International Journal of Engineering Education
- Chemical Engineering Education
- IEEE Transactions on Education
- Journal of Civil Engineering Education
- International Journal of Mechanical Engineering Education
- International Journal of Electrical Engineering Education
Engineering on the Web
-
At India's second-largest engineering co, gaps emerge in electoral bond funding | Mint
Mar 18, 2024 05:17 PM PDT
-
Nvidia Announces GR00T, a Foundation Model For Humanoids
Mar 18, 2024 04:27 PM PDTNvidia’s ongoing GTC developer conference in San Jose is, unsurprisingly, almost entirely about AI this year. But in between the AI developments, Nvidia has also made a couple of significant robotics announcements. First, there’s Project GR00T (with each letter and number pronounced individually so as not to invoke the wrath of Disney), a foundation model for humanoid robots. And secondly, Nvidia has committed to be the founding platinum member of the Open Source Robotics Alliance, a new initiative from the Open Source Robotics Foundation intended to make sure that the Robot Operating System (ROS), a collection of open-source software libraries and tools, has the support that it needs to flourish. GR00T First, let’s talk about GR00T (short for “Generalist Robot 00 Technology”). The way that Nvidia presenters enunciated it letter-by-letter during their talks strongly suggests that in private they just say “Groot.” So the rest of us can also just say “Groot” as far as I’m concerned. As a “general-purpose foundation model for humanoid robots,” GR00T is intended to provide a starting point for specific humanoid robots to do specific tasks. As you might expect from something being presented for the first time at an Nvidia keynote, it’s awfully vague at the moment, and we’ll have to get into it more later on. Here’s pretty much everything useful that Nvidia has told us so far: “Building foundation models for general humanoid robots is one of the most exciting problems to solve in AI today,” said Jensen Huang, founder and CEO of NVIDIA. “The enabling technologies are coming together for leading roboticists around the world to take giant leaps towards artificial general robotics.” Robots powered by GR00T... will be designed to understand natural language and emulate movements by observing human actions—quickly learning coordination, dexterity and other skills in order to navigate, adapt and interact with the real world. This sounds good, but that “will be” is doing a lot of heavy lifting. Like, there’s a very significant “how” missing here. More specifically, we’ll need a better understanding of what’s underlying this foundation model—is there real robot data under there somewhere, or is it based on a massive amount of simulation? Are the humanoid robotic companies involved contributing data to improve GR00T, or instead training their own models based on it? It’s certainly notable that Nvidia is name-dropping most of the heavy-hitters in commercial humanoids, including 1X Technologies, Agility Robotics, Apptronik, Boston Dynamics, Figure AI, Fourier Intelligence, Sanctuary AI, Unitree Robotics, and XPENG Robotics. We’ll be able to check in with some of those folks directly this week to hopefully learn more. On the hardware side, Nvidia is also announcing a new computing platform called Jetson Thor: Jetson Thor was created as a new computing platform capable of performing complex tasks and interacting safely and naturally with people and machines. It has a modular architecture optimized for performance, power and size. The SoC includes a next-generation GPU based on NVIDIA Blackwell architecture with a transformer engine delivering 800 teraflops of 8-bit floating point AI performance to run multimodal generative AI models like GR00T. With an integrated functional safety processor, a high-performance CPU cluster and 100GB of ethernet bandwidth, it significantly simplifies design and integration efforts. Speaking of Nvidia’s Blackwell architecture—today the company also unveiled its B200 Blackwell GPU. And to round out the announcements, the chip foundry TSMC and Synopsys, an electronic design automation company, each said they will be moving Nvidia’s inverse lithography tool, cuLitho, into production. The Open Source Robotics Alliance The other big announcement is actually from the Open Source Robotics Foundation, which is launching the Open Source Robotics Alliance (OSRA), a “new initiative to strengthen the governance of our open-source robotics software projects and ensure the health of the Robot Operating System (ROS) Suite community for many years to come.” Nvidia is an inaugural platinum member of the OSRA, but they’re not alone—other platinum members include Intrinsic and Qualcomm. Other significant members include Apex, Clearpath Robotics, Ekumen, eProsima, PickNik, Silicon Valley Robotics, and Zettascale. “The [Open Source Robotics Foundation] had planned to restructure its operations by broadening community participation and expanding its impact in the larger ROS ecosystem,” explains Vanessa Yamzon Orsi, CEO of the Open Source Robotics Foundation. “The sale of [Open Source Robotics Corporation] was the first step towards that vision, and the launch of the OSRA is the next big step towards that change.” We had time for a brief Q&A with Orsi to better understand how this will affect the ROS community going forward. You structured the OSRA to have a mixed membership and meritocratic model like the Linux Foundation—what does that mean, exactly? Vanessa Yamzon Orsi: We have modeled the OSRA to allow for paths to participation in its activities through both paid memberships (for organizations and their representatives) and the community members who support the projects through their contributions. The mixed model enables participation in the way most appropriate for each organization or individual: contributing funding as a paying member, contributing directly to project development, or both. What are some benefits for the ROS ecosystem that we can look forward to through OSRA? Orsi: We expect the OSRA to benefit the OSRF’s projects in three significant ways. By providing a stable stream of funding to cover the maintenance and development of the ROS ecosystem. By encouraging greater community involvement in development through open processes and open, meritocratic status achievement. By bringing greater community involvement in governance and ensuring that all stakeholders have a voice in decision-making. Why will this be a good thing for ROS users?Orsi: The OSRA will ensure that ROS and the suite of open source projects under the stewardship of Open Robotics will continue to be supported and strengthened for years to come. By providing organized governance and oversight, clearer paths to community participation, and financial support, it will provide stability and structure to the projects while enabling continued development.
-
Koford Engineering now offers 36-mm high-performance motor - Design World
Mar 18, 2024 03:24 PM PDT
-
Rollins receives MUEAO Citation of Merit Award for service to College, profession
Mar 18, 2024 03:08 PM PDT
-
Mizzou presents Missouri Honor Award to Boeing for support of student success
Mar 18, 2024 03:08 PM PDT
-
Ansys and NVIDIA Pioneer Next Era of Computer-Aided Engineering - PR Newswire
Mar 18, 2024 03:01 PM PDT
-
Anil Goel Named Chief Engineering Officer of Nielsen - KXAN
Mar 18, 2024 02:52 PM PDT
-
Lake County commissioners laud Lakeland's Engineering Building expansion, renovation
Mar 18, 2024 02:48 PM PDT
-
BA has a new Engineering Director | Operations - City of Broken Arrow
Mar 18, 2024 02:39 PM PDT
-
Engineering management alumna lands position at Aqua Pennsylvania - Penn State
Mar 18, 2024 02:03 PM PDT
-
Seminar: Simultaneous Space Mission Design, Control & Guidance - Mar. 20
Mar 18, 2024 02:01 PM PDT
-
Engineering Firm Moves HQ to Cypress - Orange County Business Journal
Mar 18, 2024 01:54 PM PDT
-
Once Again, Student Team Selected for NASA BIG Idea Challenge | News
Mar 18, 2024 01:31 PM PDT
-
Women in Engineering | CWIEME Berlin
Mar 18, 2024 01:17 PM PDT
-
Engineering The Trade | tastylive
Mar 18, 2024 12:40 PM PDT
-
Researchers at UMass Amherst and Texas A&M Collaborate to Develop New Antibacterial ...
Mar 18, 2024 12:22 PM PDT
-
Profs awarded chairs and funds from Canadian govt. | Engineering - University of Waterloo
Mar 18, 2024 12:20 PM PDT
-
OceanSound-backed Gannett Fleming snaps up water and transportation engineering firm DEC
Mar 18, 2024 11:29 AM PDT
-
Gonzales engineering firm receives national award for Fish Bayou project - Donaldsonville Chief
Mar 18, 2024 11:23 AM PDT
-
Post-Doctoral Fellow in Planetary Science and Geodynamics - Queen's University
Mar 18, 2024 10:37 AM PDT
-
Spirax-Sarco Engineering falls Monday, underperforms market - MarketWatch
Mar 18, 2024 10:17 AM PDT
-
Nvidia Unveils Blackwell, Its Next GPU
Mar 17, 2024 06:15 PM PDTToday at Nvidia’s developer conference, GTC 2024, the company revealed its next GPU, the B200. The B200 is capable of delivering four times the training performance, up to 30 times the inference performance, and up to 25 times better energy efficiency, compared to its predecessor, the Hopper H100 GPU. Based on the new Blackwell architecture, the GPU can be combined with the company’s Grace CPUs to form a new generation of DGX SuperPOD computers capable of up to 11.5 billion billion floating point operations (exaflops) of AI computing using a new, low-precision number format. “Blackwell is a new class of AI superchip,” says Ian Buck, Nvidia’s vice president of high-performance computing and hyperscale. Nvidia named the GPU architecture for mathematician David Harold Blackwell, the first Black inductee into the U.S. National Academy of Sciences. The B200 is composed of about 1600 square millimeters of processor on two silicon dies that are linked in the same package by a 10 terabyte per second connection, so they perform as if they were a single 208-billion-transistor chip. Those slices of silicon are made using TSMC’s N4P chip technology, which provides a 6 percent performance boost over the N4 technology used to make Hopper architecture GPUs, like the H100. Like Hopper chips, the B200 is surrounded by high-bandwidth memory, increasingly important to reducing the latency and energy consumption of large AI models. B200’s memory is the latest variety, HBM3e, and it totals 192 GB (up from 141 GB for the second generation Hopper chip, H200). Additionally, the memory bandwidth is boosted to 8 terabytes per second from the H200’s 4.8 TB/s. Smaller Numbers, Faster Chips Chipmaking technology did some of the job in making Blackwell, but its what the GPU does with the transistors that really makes the difference. In explaining Nvidia’s AI success to computer scientists last year at IEEE Hot Chips, Nvidia chief scientist Bill Dally said that the majority came from using fewer and fewer bits to represent numbers in AI calculations. Blackwell continues that trend. It’s predecessor architecture, Hopper, was the first instance of what Nvidia calls the transformer engine. It’s a system that examines each layer of a neural network and determines whether it could be computed using lower-precision numbers. Specifically, Hopper can use floating point number formats as small as 8 bits. Smaller numbers are faster and more energy efficient to compute, require less memory and memory bandwidth, and the logic required to do the math takes up less silicon. “With Blackwell, we have taken a step further,” says Buck. The new architecture has units that do matrix math with floating point numbers just 4 bits wide. What’s more, it can decide to deploy them on parts of each neural network layer, not just entire layers like Hopper. “Getting down to that level of fine granularity is a miracle in itself,” says Buck. NVLink and Other Features Among the other architectural insights Nvidia revealed about Blackwell are that it incorporates a dedicated “engine” devoted to the GPU’s reliability, availability, and serviceability. According to Nvidia, it uses an AI-based system to run diagnostics and forecast reliability issues, with the aim of increasing up time and helping massive AI systems run uninterrupted for weeks at a time, a period often needed to train large language models. Nvidia also included systems to help keep AI models secure and to decompress data to speed database queries and data analytics. Finally, Blackwell incorporates Nvidia’s fifth generation computer interconnect technology NVLink, which now delivers 1.8 terabytes per second bidirectionally between GPUs and allows for high-speed communication among up to 576 GPUs. Hopper’s version of NVLink could only reach half that bandwidth. SuperPOD and Other Computers NVLink’s bandwidth is key to building large-scale computers from Blackwell, capable of crunching through trillion-parameter neural network models. The base computing unit is called the DGX GB200. Each of those include 36 GB200 superchips. These are modules that include a Grace CPU and two Blackwell GPUs, all connected together with NVLink. The Grace Blackwell superchip is two Blackwell GPUs and a Grace CPU in the same module.Nvidia Eight DGX GB200s can be connected further via NVLINK to form a 576-GPU supercomputer called a DGX SuperPOD. Nvidia says such a computer can blast through 11.5 exaflops using 4-bit precision calculations. Systems of tens of thousands of GPUs are possible using the company’s Quantum Infiniband networking technology. The company says to expect SuperPODs and other Nvidia computers to become available later this year. Meanwhile, chip foundry TSMC and electronic design automation company Synopsys each announced that they would be moving Nvidia’s inverse lithography tool, cuLitho, into production. Lastly, the Nvidia announced a new foundation model for humanoid robots called GR00T.
-
How Ultrasound Became Ultra Small
Mar 17, 2024 07:00 AM PDTA startling change in medical ultrasound is working its way through hospitals and physicians’ offices. The long-standing, state-of-the-art ultrasound machine that’s pushed around on a cart, with cables and multiple probes dangling, is being wheeled aside permanently in favor of handheld probes that send images to a phone. These devices are small enough to fit in a lab coat pocket and flexible enough to image any part of the body, from deep organs to shallow veins, with sweeping 3D views, all with a single probe. And the AI that accompanies them may soon make these devices operable by untrained professionals in any setting—not just trained sonographers in clinics. The first such miniaturized, handheld ultrasound probe arrived on the market in 2018, from Butterfly Network in Burlington, Mass. Last September, Exo Imaging in Santa Clara, Calif., launched a competing version. Making this possible is silicon ultrasound technology, built using a type of microelectromechanical system (MEMS) that crams 4,000 to 9,000 transducers—the devices that convert electrical signals into sound waves and back again—onto a 2-by-3-centimeter silicon chip. By integrating MEMS transducer technology with sophisticated electronics on a single chip, these scanners not only replicate the quality of traditional imaging and 3D measurements but also open up new applications that were impossible before. How does ultrasound work? To understand how researchers achieved this feat, it’s helpful to know the basics of ultrasound technology. Ultrasound probes use transducers to convert electrical energy to sound waves that penetrate the body. The sound waves bounce off the body’s soft tissue and echo back to the probe. The transducer then converts the echoed sound waves to electrical signals, and a computer translates the data into an image that can be viewed on a screen. Conventional ultrasound probes contain transducer arrays made from slabs of piezoelectric crystals or ceramics such as lead zirconium titanate (PZT). When hit with pulses of electricity, these slabs expand and contract and generate high-frequency ultrasound waves that bounce around within them. Ultrasound technology has historically required bulky machinery with multiple probes. Julian Kevin Zakaras/Fairfax Media/Getty Images To be useful for imaging, the ultrasound waves need to travel out of the slabs and into the soft tissue and fluid of the patient’s body. This is not a trivial task. Capturing the echo of those waves is like standing next to a swimming pool and trying to hear someone speaking under the water. The transducer arrays are thus built from layers of material that smoothly transition in stiffness from the hard piezoelectric crystal at the center of the probe to the soft tissue of the body. The frequency of energy transferred into the body is determined mainly by the thickness of the piezoelectric layer. A thinner layer transfers higher frequencies, which allow smaller, higher-resolution features to be seen in an ultrasound image, but only at shallow depths. The lower frequencies of thicker piezoelectric material travel further into the body but deliver lower resolutions. As a result, several types of ultrasound probes are needed to image various parts of the body, with frequencies that range from 1 to 10 megahertz. To image large organs deep in the body or a baby in the womb, physicians use a 1- to 2-MHz probe, which can provide 2- to 3-millimeter resolution and can reach up to 30 cm into the body. To image blood flow in arteries in the neck, physicians typically use an 8- to 10-MHz probe. How MEMS transformed ultrasound The need for multiple probes along with the lack of miniaturization meant that conventional medical ultrasound systems resided in a heavy, boxy machine lugged around on a cart. The introduction of MEMS technology changed that. Over the last three decades MEMS has allowed manufacturers in an array of industries to create precise, extremely sensitive components at a microscopic scale. This advance has enabled the fabrication of high-density transducer arrays that can produce frequencies in the full 1- to 10-MHz range, allowing imaging of a wide range of depths in the body, all with one probe. MEMS technology also helped miniaturize additional components so that everything fits in the handheld probe. When coupled with the computing power of a smartphone, this eliminated the need for a bulky cart. The first MEMS-based silicon ultrasound prototypes emerged in the mid-1990s when the excitement of MEMS as a new technology was peaking. The key element of these early transducers was the vibrating micromachined membrane, which allowed the devices to generate vibrations in much the same way that banging on a drum creates sound waves in the air. Exo Imaging developed a handheld ultrasound machine using piezoelectric micromachined ultrasonic transducer (PMUT) technology.Exo Imaging Two architectures emerged. One of them, called the capacitive micromachined ultrasonics transducer, or CMUT, is named for its simple capacitor-like structures. Stanford University electrical engineer Pierre Khuri-Yakub and colleagues demonstrated the first versions. The CMUT is based on electrostatic forces in a capacitor formed by two conductive plates separated by a small gap. One plate—the micromachined membrane mentioned before—is made of silicon or silicon nitride with a metal electrode. The other—typically a micromachined silicon wafer substrate—is thicker and more rigid. When a voltage is applied, placing opposite charges on the membrane and substrate, attractive forces pull and flex the membrane toward the substrate. When an oscillating voltage is added, that changes the force, causing the membrane to vibrate, like a struck drumhead. When the membrane is in contact with the human body, the vibrations send ultrasound frequency waves into the tissue. How much ultrasound is generated or detected depends on the gap between the membrane and the substrate, which needs to be about one micrometer or less. Micromachining techniques made that kind of precision possible. The other MEMS-based architecture is called the piezoelectric micromachined ultrasonic transducer, or PMUT, and it works like a miniaturized version of a smoke alarm buzzer. These buzzers consist of two layers: a thin metal disk fixed around its periphery and a thin, smaller piezoelectric disk bonded on top of the metal disk. When voltages are applied to the piezoelectric material, it expands and contracts in thickness and from side to side. Because the lateral dimension is much larger, the piezo disk diameter changes more significantly and in the process bends the whole structure. In smoke alarms, these structures are typically 4 cm in diameter, and they’re what generates the shrieking sound of the alarm, at around 3 kilohertz. When the membrane is scaled down to 100 μm in diameter and 5 to 10 μm in thickness, the vibration moves up into megahertz frequencies, making it useful for medical ultrasound. Honeywell in the early 1980s developed the first micromachined sensors using piezoelectric thin films built on silicon diaphragms. The first PMUTs operating at ultrasound frequencies didn’t emerge until 1996, from the work of materials scientist Paul Muralt at the Swiss Federal Institute of Technology Lausanne (EPFL), in Switzerland. Early years of CMUT A big challenge with CMUTs was getting them to generate enough pressure to send sound waves deep into the body and receive the echoes coming back. The membrane’s motion was limited by the exceedingly small gap between the membrane and the substrate. This constrained the amplitude of the sound waves that could be generated. Combining arrays of CMUT devices with different dimensions into a single probe to increase the frequency range also compromised the pressure output because it reduced the probe area available for each frequency. Butterfly Network developed a handheld ultrasound machine using capacitive micromachined ultrasonic transducer (CMUT) technology.Butterfly The solution to these problems came from Khuri-Yakub’s lab at Stanford University. In experiments in the early 2000s, the researchers found that increasing the voltage on CMUT-like structures caused the electrostatic forces to overcome the restoring forces of the membrane. As a result, the center of the membrane collapses onto the substrate. A collapsed membrane seemed disastrous at first but turned out to be a way of making CMUTs both more efficient and more tunable to different frequencies. The efficiency increased because the gap around the contact region was very small, increasing the electric field there. And the pressure increased because the large doughnut-shaped region around the edge still had a good range of motion. What’s more, the frequency of the device could be adjusted simply by changing the voltage. This, in turn, allowed a single CMUT ultrasound probe to produce the entire ultrasound frequency range needed for medical diagnostics with high efficiency. Butterfly Network From there, it took more than a decade to understand and model the complicated electromechanical behavior of CMUT arrays and iron out the manufacturing. Modeling these devices was tricky because thousands of individual membranes interacted in each CMUT array. On the manufacturing side, the challenges involved finding the right materials and developing the processes needed to produce smooth surfaces and a consistent gap thickness. For example, the thin dielectric layer that separates the conductive membrane and the substrate must withstand about 100 volts at a thickness of 1 μm. If the layer has defects, charges can be injected into it, and the device can short at the edges or when the membrane touches the substrate, killing the device or at least degrading its performance. Eventually, though, MEMS foundries such as Philips Engineering Solutions in Eindhoven, Netherlands, and Taiwan Semiconductor Manufacturing Co. (TSMC), in Hsinchu, developed solutions to these problems. Around 2010, these companies began producing reliable, high-performance CMUTs. Early development of PMUTs Early PMUT designs also had trouble generating enough pressure to work for medical ultrasound. But they could bang out enough to be useful in some consumer applications, such as gesture detection and proximity sensors. In such “in-air ultrasound” uses, bandwidth isn’t critical, and frequencies can be below 1 MHz. In 2015, PMUTs for medical applications got an unexpected boost with the introduction of large 2D matrix arrays for fingerprint sensing in mobile phones. In the first demonstration of this approach, researchers at the University of California, Berkeley, and the University of California, Davis, connected around 2,500 PMUT elements to CMOS electronics and placed them under a silicone rubberlike layer. When a fingertip was pressed to the surface, the prototype measured the amplitudes of the reflected signals at 20 MHz to distinguish the ridges in the fingertip from the air pockets between them. This was an impressive demonstration of integrating PMUTs and electronics on a silicon chip, and it showed that large 2D PMUT arrays could produce a high enough frequency to be useful for imaging of shallow features. But to make the jump to medical ultrasound, PMUT technology needed more bandwidth, more output pressure, and piezoelectric thin films with better efficiency. Help came from semiconductor companies such as ST Microelectronics, based in Geneva, which figured out how to integrate PZT thin films on silicon membranes. These films require extra processing steps to maintain their properties. But the improvement in performance made the cost of the extra steps worthwhile. To achieve a larger pressure output, the piezoelectric layer needed to be thick enough to allow the film to sustain the high voltages required for good ultrasound images. But increased thickness leads to a more rigid membrane, which reduces the bandwidth. One solution was to use an oval-shaped PMUT membrane that effectively combined several membranes of different sizes into one. This is similar to changing the length of guitar strings to generate different tones. The oval membrane provides strings of multiple lengths on the same structure with its narrow and wide sections. To efficiently vibrate wider and narrower parts of the membrane at different frequencies, electrical signals are applied to multiple electrodes placed on corresponding regions of the membrane. This approach allowed PMUTs to be efficient over a wider frequency range. From academia to the real world In the early 2000s, researchers began to push CMUT technology for medical ultrasound out of the lab and into commercial development. Stanford University spun out several startups aimed at this market. And leading medical ultrasound imaging companies such as GE, Philips, Samsung, and Hitachi began developing CMUT technology and testing CMUT-based probes. But it wasn’t until 2011 that CMUT commercialization really began to make progress. That year, a team with semiconductor electronics experience founded Butterfly Network. The 2018 introduction of the IQ Probe was a transformative event. It was the first handheld ultrasound probe that could image the whole body with a 2D imaging array and generate 3D image data. About the size of a TV remote and only slightly heavier, the probe was initially priced at US $1,999—one-twentieth the cost of a full-size, cart-carried machine. Around the same time, Hitachi in Tokyo and Kolo Medical in Suzhou, China (formerly in San Jose, Calif.), commercialized CMUT-based probes for use with conventional ultrasound systems. But neither has the same capabilities as Butterfly’s. For example, the CMUT and electronics aren’t integrated on the same silicon chip, which means the probes have 1D arrays rather than 2D. That limits the system’s ability to generate images in 3D, which is necessary in advanced diagnostics, such as determining bladder volume or looking at simultaneous orthogonal views of the heart. Exo Imaging’s September 2023 launch of its handheld probe, the Exo Iris, marked the commercial debut of PMUTs for medical ultrasound. Developed by a team with experience in semiconductor electronics and integration, the Exo Iris is about the same size and weight as Butterfly’s IQ Probe. Its $3,500 price is comparable to Butterfly’s latest model, the IQ+, priced at $2,999. The ultrasound MEMS chips in these probes, at 2 by 3 cm, are some of the largest silicon chips with combined electromechanical and electronic functionality. The size and complexity impose production challenges in terms of the uniformity of the devices and the yield. These handheld devices operate at low power, so the probe’s battery is lightweight, lasts for several hours of continuous use while the device is connected to a cellphone or tablet, and has a short charging time. To make the output data compatible with cellphones and tablets, the probe’s main chip performs digitization and some signal processing and encoding. Two MEMS ultrasound architectures have emerged. In the capacitive micromachined ultrasonics transducer (CMUT) design, attractive forces pull and flex the membrane toward the substrate. When an oscillating voltage is added, the membrane vibrates like a struck drumhead. Increasing the voltage causes the electrostatic forces to overcome the restoring forces of the membrane, causing the membrane to collapse onto the substrate. In the piezoelectric micromachined ultrasonic transducer (PMUT) architecture, voltages applied to the piezoelectric material cause it to expand and contract in thickness and from side to side. Because the lateral dimension is much larger, the piezo disk diameter changes significantly, bending the whole structure. To provide 3D information, these handheld probes take multiple 2D slices of the anatomy and then use machine learning and AI to construct the necessary 3D data. Built-in AI-based algorithms can also help doctors and nurses precisely place needles in desired locations, such as in challenging vasculature or in other tissue for biopsies. The AI developed for these probes is so good that it may be possible for professionals untrained in ultrasound, such as nurse midwives, to use the portable probes to determine the gestational age of a fetus, with accuracy similar to that of a trained sonographer, according to a 2022 study in NEJM Evidence. The AI-based features could also make the handheld probes useful in emergency medicine, in low-income settings, and for training medical students. Just the beginning for MEMS ultrasound This is only the beginning for miniaturized ultrasound. Several of the world’s largest semiconductor foundries, including TSMC and ST Microelectronics, now do MEMS ultrasound chip production on 300 and 200 mm wafers, respectively. In fact, ST Microelectronics recently formed a dedicated “Lab-in-Fab” in Singapore for thin-film piezoelectric MEMS, to accelerate the transition from proofs of concept to volume production. Philips Engineering Solutions offers CMUT fabrication for CMUT-on-CMOS integration, and Vermon in Tours, France, offers commercial CMUT design and fabrication. That means startups and academic groups now have access to the base technologies that will make a new level of innovation possible at a much lower cost than 10 years ago. With all this activity, industry analysts expect ultrasound MEMS chips to be integrated into many different medical devices for imaging and sensing. For instance, Butterfly Network, in collaboration with Forest Neurotech, is developing MEMS ultrasound for brain-computer interfacing and neuromodulation. Other applications include long-term, low-power wearable devices, such as heart, lung, and brain monitors, and muscle-activity monitors used in rehabilitation. In the next five years, expect to see miniature passive medical implants with ultrasound MEMS chips, in which power and data are remotely transferred using ultrasound waves. Eventually, these handheld ultrasound probes or wearable arrays could be used not only to image the anatomy but also to read out vital signs like internal pressure changes due to tumor growth or deep-tissue oxygenation after surgery. And ultrasound fingerprint-like sensors could one day be used to measure blood flow and heart rate. One day, wearable or implantable versions may enable the generation of passive ultrasound images while we sleep, eat, and go about our lives.
-
China and Norway Lead the World’s EV Switchover
Mar 16, 2024 10:14 AM PDTThe U.S. government recently backed down from enacting tough new measures that would have forced automakers to quadruple their sales of electric vehicles by 2030. Even if Washington hadn’t buckled to outside pressure, the U.S. ambitions for 2030 would not have been exceptional. The move would have raised market share of all-electric vehicles in the U.S. to a level still well below 20 percent. Meanwhile, there is a growing group of nations with their sights set much higher. China, for one, is expected to meet its own 2030 EV adoption target: 40 percent of vehicles sold. By decade’s end, China is expected to be selling only EVs in regions like the island province of Hainan. Norway, more ambitiously still, aims to eliminate sales of new ICE vehicles by 2025. (Eighty percent of new vehicles sold there, as of 2022, are EVs.) It stands to reason that Norway is far ahead of the rest of the world in terms of EV adoption. Norway has been working, with a consistent program of government funding and incentives, toward getting EVs on its roads since the 1990s. Early government investment in charging infrastructure went a long way toward soothing the range anxiety that made car buyers in other places reluctant to make the switch to battery power from gasoline or diesel. Globally, according to research by the Rocky Mountain Institute, EVs will comprise two-thirds of the world’s car sales by 2030. However, according to the World Resources Institute, “EVs need to account for 75 percent to 95 percent of passenger vehicle sales by 2030 in order to meet international climate goals aimed at keeping global warming to 1.5 degrees C (2.7 degrees F).” According to the WRI’s analysis, above, Iceland, Sweden, the Netherlands, and China are the leading EV adopters after Norway. But as of 2022, there was still a major gap between the top spot and the countries trailing behind, the WRI found. Forty-one percent of Iceland’s auto sales, 32 percent of Sweden’s, 24 percent of the Netherlands’, and 22 percent of China’s were EVs. The nations in this group, however, have made pledges that would narrow the gap by 2030. Analysts are optimistic that electric vehicle sales will reach the levels necessary to help avert climate disaster. WRI adds that because the average annual growth rate in EV sales was 65 percent over the past five years, the world needs an average annual growth rate of only 31 percent through 2032. Who’s aiming to achieve what by decade’s end? Thirty-three countries are signatories of the Global Commercial Vehicle Drive to Zero agreement for heavy- and medium-duty vehicles like tractor-trailers, buses, and box trucks. The group’s member states are “working together to enable 100 percent zero-emission new truck and bus sales by 2040 with an interim goal of 30 percent zero-emission vehicle sales by 2030.” More than a dozen European nations are signatories of the pact; their membership dovetails with the European Union’s promise to reduce the continent’s average vehicle CO2 emissions by 45 percent by 2030 and 90 percent by 2040. Germany has not signed on to the Drive to Zero agreement, But that hasn’t stopped it from pursuing its own set of ambitious goals. The German government wants all new vehicles for its government-owned fleet to be “environmentally friendly drive technologies” by 2030, and has set a 2025 deadline by which at least half of those vehicles will be EVs. (For more on what other countries’ plans are, check out the International Energy Agency’s Global EV Policy Explorer page.) Demand for battery-powered vehicles has risen steadily as advances in battery technology and production have brought the purchase prices of EVs down. EVs are at the point where their sticker prices have fallen or will soon fall below those of comparable vehicles with internal combustion engines. According to analysis by the Energy Innovation and System Transition project, that milestone will likely be reached this year in Europe. Cost parity between battery- and petrol-powered vehicles will happen by 2026 in the U.S. and 2027 in India, say EEIST researchers. If those sunny forecasts hold up, they certainly won’t hinder other national EV adoption goals. The EU has targeted a fivefold boost of EV presence on its roads, from roughly 8 million today to 40 million by 2030. To make sure EV availability won’t fall short, Europe is planning to turn to Chinese manufacturers as a backstop. The EU says its countries will import more than a million EVs a year from China in order to help the continent reach its environmental targets. Meanwhile, strong government policy and financial incentives from these countries are laying the groundwork for a more robust EV industry to hit the marketplace as the cost of EV ownership continues to fall. Not to be outdone, India’s government has enacted an enhanced EV adoption strategy featuring generous incentives it predicts will allow electric vehicle sales there to catch up with those in China and EU nations by 2030. Downsides to the forecasts But not all skies are sunny. Though it’s clear that ess carbon in the atmosphere coming from tailpipe exhaust is a win for the planet, not everyone shares the belief that all-electric transportation is the panacea it is chalked up to be. Among those suggesting a measured approach that takes factors such as the local availability of natural resources into account is economist David S. Rapson, a professor at the University of California, Davis. “It is quite possible that, absent technological advancement, the costs of mitigating greenhouse gasses through electrification can rise above current estimates of the social cost of carbon or, more significantly, above alternative approaches to mitigating climate change,” says Rapson. “If such an outcome does arise, policies that rigidly adhere to 100 percent [EV adoption] targets could prove extremely costly and ultimately counterproductive.” Pushback against such government targets is happening beyond U.S. borders. Canada ‘s Liberal government issued a draft in 2022 calling for 20 percent of new light vehicles sold there to be zero-emission vehicles by 2026. The plan seeks to raise that figure to 60 percent by 2030. But Canada might have a fight on its hands that mirrors what the U.S. government was up against before its about-face. Tim Reuss, president of the Canadian Automobile Dealers Association, told Wards Auto that, “With the current high interest rates and high inflation severely impacting consumer affordability, many consumers lack the means to purchase EVs, as evidenced by the rising inventory levels on our members’ lots. Instead of attempting to dictate what individuals have to purchase, we suggest government focus on creating the right set of circumstances to stimulate demand.” Canada might be forced to lower its EV adoption trajectory before all is said and done. Russia’s government, which will likely see little if any pushback, says it is instituting measures that will result in electric vehicles comprising 10 percent of the country’s overall vehicle production by 2030.
-
How Zipline Designed Its Droid Delivery System
Mar 15, 2024 01:08 PM PDTAbout a year ago, Zipline introduced Platform 2, an approach to precision urban drone delivery that combines a large hovering drone with a smaller package-delivery “Droid.” Lowered on a tether from the belly of its parent Zip drone, the Droid contains thrusters and sensors (plus a 2.5- to 3.5-kilogram payload) to reliably navigate itself to a delivery area of just one meter in diameter. The Zip, meanwhile, safely remains hundreds of meters up. After depositing its payload, the Droid rises back up to the drone on its tether, and off they go. At first glance, the sensor and thruster-packed Droid seems complicated enough to be bordering on impractical, especially when you consider the relative simplicity of other drone delivery solutions, which commonly just drop the package itself on a tether from a hovering drone. I’ve been writing about robots long enough that I’m suspicious of robotic solutions that appear to be overengineered, since that’s always a huge temptation with robotics. Like, is this really the best way of solving a problem, or is it just the coolest way? We know the folks at Zipline pretty well, though, and they’ve certainly made creative engineering work for them, as we saw when we visited one of their “nests” in rural Rwanda. So as Zipline nears the official launch of Platform 2, we spoke with Zipline cofounder and CTO Keenan Wyrobek, Platform 2 lead Zoltan Laszlo, and industrial designer Gregoire Vandenbussche to understand exactly why they think this is the best way of solving precision urban drone delivery. First, a quick refresher. Here’s what the delivery sequence with the vertical takeoff and landing (VTOL) Zip and the Droid looks like: The system has a service radius of about 16 kilometers (10 miles), and it can make deliveries to outdoor spaces of “any meaningful size.” Visual sensors on the Droid find the delivery site and check for obstacles on the way down, while the thrusters compensate for wind and movement of the parent drone. Since the big VTOL Zip remains well out of the way, deliveries are fast, safe, and quiet. But it takes two robots to pull off the delivery rather than just one. On the other end is the infrastructure required to load and charge these drones. Zipline’s Platform 1 drones require a dedicated base with relatively large launch and recovery systems. With Platform 2, the drone drops the Droid into a large chute attached to the side of a building so that the Droid can be reloaded, after which it pulls the Droid out again and flies off to make the delivery: “We think it’s the best delivery experience. Not the best drone delivery experience, the best delivery experience,” Zipline’s Wyrobek tells us. That may be true, but the experience also has to be practical and sustainable for Zipline to be successful, so we asked the Zipline team to explain the company’s approach to precision urban delivery. Zipline on: Approach to drone delivery Concept for Droid design Designing for cuteness Making pinpoint deliveries IEEE Spectrum: What problems is Platform 2 solving, and why is it necessary to solve those problems in this specific way? Keenan Wyrobek: There are literally billions of last-mile deliveries happening every year in [the United States] alone, and our customers have been asking for years for something that can deliver to their homes. With our long-range platform, Platform 1, we can float a package down into your yard on a parachute, but that takes some space. And so one half of the big design challenge was how to get our deliveries precise enough, while the other half was to develop a system that will bolt on to existing facilities, which Platform 1 doesn’t do. Zoltan Laszlo: Platform 1 can deliver within an area of about two parking spaces. As we started to actually look at the data in urban areas using publicly available lidar surveys, we found that two parking spaces serves a bit more than half the market. We want to be a universal delivery service. But with a delivery area of 1 meter in diameter, which is what we’re actually hitting in our delivery demonstrations for Platform 2, that gets us into the high 90s for the percentage of people that we can deliver to. Wyrobek: When we say “urban,” what we’re talking about is three-story sprawl, which is common in many large cities around the world. And we wanted to make sure that our deliveries could be precise enough for places like that. There are some existing solutions for precision aerial delivery that have been operating at scale with some success, typically by winching packages to the ground from a VTOL drone. Why develop your own technique rather than just going with something that has already been shown to work? Laszlo: Winching down is the natural extension of being able to hover in place, and when we first started, we were like, “Okay, we’re just going to winch down. This will be great, super easy.” So we went to our test site in Half Moon Bay [on the Northern California coast] and built a quick prototype of a winch system. But as soon as we lowered a box down on the winch, the wind started blowing it all over the place. And this was from the height of our lift, which is less than 10 meters up. We weren’t even able to stay inside two parking spaces, which told us that something was broken with our approach. The aircraft can sense the wind, so we thought we’d be able to find the right angle for the delivery and things like that. But the wind where the aircraft is may be different from the wind nearer the ground. We realized that unless we’re delivering to an open field, a package that does not have active wind compensation is going to be very hard to control. We’re targeting high-90th percentile in terms of availability due to weather—even if it’s a pretty blustery day, we still want to be able to deliver. Wyrobek: This was a wild insight when we really understood that unless it’s a perfect day, using a winch actually takes almost as much space as we use for Platform 1 floating a package down on a parachute. Engineering test footage of Zipline’s Platform 2 docking system at their test site in Half Moon Bay in California. How did you arrive at this particular delivery solution for Platform 2? Laszlo: I don’t remember whose idea it was, but we were playing with a bunch of different options. Putting thrusters on the tether wasn’t even the craziest idea. We had our Platform 1 aircraft, which was reliable, so we started with looking at ways to just make that aircraft deliver more precisely. There was only so much more we could do with passive parachutes, but what does an active, steerable parachute look like? There are remote-controlled paragliding toys out there that we tested, with mixed results—the challenge is to minimize the smarts in your parachute, because there’s a chance you won’t get it back. So then we started some crazy brainstorming about how to reliably retrieve the parachute. Wyrobek: One idea was that the parachute would come with a self-return envelope that you could stick in the mail. Another idea was that the parachute would be steered by a little drone, and when the package got dropped off, the drone would reel the parachute in and then fly back up into the Zip. Laszlo: But when we realized that the package has to be able to steer itself, that meant the Zip doesn’t need to be active. The Zip doesn’t need to drive the package, it doesn’t even need to see the package, it just needs to be a point up in the sky that’s holding the package. That let us move from having the Zip 50 feet up, to having it 300 feet up, which is important because it’s a big, heavy drone that we don’t want in our customer’s space. And the final step was adding enough smarts to the thing coming down into your space to figure out where exactly to deliver to, and of course to handle the wind. Once you knew what you needed to do, how did you get to the actual design of the droid? Gregoire Vandenbussche: Zipline showed me pretty early on that they were ready to try crazy ideas, and from my experience, that’s extremely rare. When the idea of having this controllable tether with a package attached to it came up, one of my first thoughts was that from a user standpoint, nothing like this exists. And the difficulty of designing something that doesn’t exist is that people will try to identify it according to what they know. So we had to find a way to drive that thinking towards something positive. Early Droid concept sketches by designer Gregoire Vandenbussche featured legs that would fold up after delivery.Zipline First we thought about putting words onto it, like “hello” or something, but the reality is that we’re an international company and we need to be able to work everywhere. But there’s one thing that’s common to everyone, and that’s emotions—people are able to recognize certain things as being approachable and adorable, so going in that direction felt like the right thing to do. However, being able to design a robot that gives you that kind of emotion but also flies was quite a challenge. We took inspiration from other things that move in 3D, like sea mammals—things that people will recognize even without thinking about it. Vandenbussche’s sketches show how the design of the Droid was partially inspired by dolphins.Zipline Now that you say it, I can definitely see the sea mammal inspiration in the drone. Vandenbussche: There are two aspects of sea mammals that work really well for our purpose. One of them is simplicity of shape; sea mammals don’t have all that many details. Also, they tend to be optimized for performance. Ultimately, we need that, because we need to be able to fly. And we need to be able to convey to people that the drone is under control. So having something you can tell is moving forward or turning or moving away was very helpful. Wyrobek: One other insight that we had is that Platform 2 needs to be small to fit into tight delivery spaces, and it needs to feel small when it comes into your personal space, but it also has to be big enough inside to be a useful delivery platform. We tried to leverage the chubby but cute look that baby seals have going on. The design journey was pretty fun. Gregoire would spend two or three days coming up with a hundred different concept sketches. We’d do a bunch of brainstorming, and then Gregoire would come up with a whole bunch of new directions, and we’d keep exploring. To be clear, no one would describe our functional prototypes from back then as “cute.” But through all this iteration eventually we ended up in an awesome place. And how do you find that place? When do you know that your robot is just cute enough? One iteration of the Droid, Vandenbussche determined, looked too technical and intimidating.Zipline Vandenbussche: It’s finding the balance around what’s realistic and functional. I like to think of industrial design as taking all of the constraints and kind of playing Tetris with them until you get a result that ideally satisfies everybody. I remember at one point looking at where we were, and feeling like we were focusing too much on performance and missing that emotional level. So, we went back a little bit to say, where can we bring this back from looking like a highly technical machine to something that can give you a feeling of approachability? Laszlo: We spent a fair bit of time on the controls and behaviors of the droid to make sure that it moves in a very approachable and predictable way, so that you know where it’s going ahead of time and it doesn’t behave in unexpected ways. That’s pretty important for how people perceive it. We did a lot of work on how the droid would descend and approach the delivery site. One concept had the droid start to lower down well before the Zip was hovering directly overhead. We had simulations and renderings, and it looked great. We could do the whole delivery in barely over 20 seconds. But even if the package is far away from you, it still looks scary because [the Zip is] moving faster than you would expect, and you can’t tell exactly where it’s going to deliver. So we deleted all that code, and now it just comes straight down, and people don’t back away from the Droid anymore. They’re just like, “Oh, okay, cool.” How did you design the thrusters to enable these pinpoint deliveries? Early tests of the Droid centered around a two-fan version.Zipline Laszlo: With the thrusters, we knew we wanted to maximize the size of at least one of the fans, because we were almost always going to have to deal with wind. We’re trying to be as quiet as we can, so the key there is to maximize the area of the propeller. Our leading early design was just a box with two fans on it: Two fans with unobstructed flow meant that it moved great, but the challenge of fitting it inside another aircraft was going to be painful. And it looked big, even though it wasn’t actually that big. Vandenbussche: It was also pretty intimidating when you had those two fans facing you and the Droid coming toward you. A single steerable fan [left] that acted like a rudder was simpler in some ways, but as the fan got larger, the gyroscopic effects became hard to manage. Instead of one steerable fan, how about two steerable fans? [right] Omnidirectional motion was possible with this setup, but packaging it inside of a Zip didn’t work.Zipline Laszlo: We then started looking at configurations with a main fan and a second smaller fan, with the bigger fan at the back pushing forward and the smaller fan at the front providing thrust for turning. The third fan we added relatively late because we didn’t want to add it at all. But we found that [with two fans] the droid would have to spin relatively quickly to align to shifting winds, whereas with a third fan we can just push sideways in the direction that we need. What kind of intelligence does the Droid have? The current design of Zipline’s Platform 2 Droid is built around a large thruster in the rear and two smaller thrusters at the front and back.Zipline Wyrobek: The Droid has its own little autopilot, and there’s a very simple communications system between the two vehicles. You may think that it’s a really complex coordinated control problem, but it’s not: The Zip just kind of hangs out, and the Droid takes care of the delivery. The sensing challenge is for the Droid to find trees and powerlines and things like that, and then find a good delivery site. Was there ever a point at which you were concerned that the size and weight and complexity would not be worth it? Wyrobek: Our mindset was to fail fast, to try things and do what we needed to do to convince ourselves that it wasn’t a good path. What’s fun about this kind of iterative process is oftentimes, you try things and you realize that actually, this is better than we thought. Laszlo: We first thought about the Droid as a little bit of a tax, in that it’s costing us extra weight. But if your main drone can stay high enough up that it avoids trees and buildings, then it can just float around up there. If it gets pushed around by the wind, it doesn’t matter because the Droid can compensate. Wyrobek: Keeping the Zip at altitude is a big win in many ways. It doesn’t have to spend energy station-keeping, descending, and then ascending again. We just do that with the much smaller Droid, which also makes the hovering phase much shorter. It’s also much more efficient to control the small droid than the large Zip. And having all of the sensors on the Droid very close to the area that you’re delivering to makes that problem easier as well. It may look like a more complex system from the outside, but from the inside, it’s basically making all the hardest problems much easier. Over the past year, Zipline has set up a bunch of partnerships to make residential deliveries to consumers using Droid starting in 2024, including prescriptions from Cleveland Clinic in Ohio, medical products from WellSpan Health in Pennsylvania, tasty food from Mendocino Farms in California, and a little bit of everything from Walmart starting in Dallas. Zipline’s plan is to kick things off with Platform 2 later this year.
-
The Heart and the Chip: What Could Go Wrong?
Mar 15, 2024 12:30 PM PDTLegendary MIT roboticist Daniela Rus has published a new book called The Heart and the Chip: Our Bright Future with Robots. “There is a robotics revolution underway,” Rus says in the book’s introduction, “one that is already causing massive changes in our society and in our lives.” She’s quite right, of course, and although some of us have been feeling that this is true for decades, it’s arguably more true right now than it ever has been. But robots are difficult and complicated, and the way that their progress is intertwined with the humans that make them and work with them means that these changes won’t come quickly or easily. Rus’ experience gives her a deep and nuanced perspective on robotics’ past and future, and we’re able to share a little bit of that with you here. Daniela Rus: Should roboticists consider subscribing to their own Hippocratic oath? The following excerpt is from Chapter 14, entitled “What Could Go Wrong?” Which, let’s be honest, is the right question to ask (and then attempt to conclusively answer) whenever you’re thinking about sending a robot out into the real world. At several points in this book I’ve mentioned the fictional character Tony Stark, who uses technology to transform himself into the superhero Iron Man. To me this character is a tremendous inspiration, yet I often remind myself that in the story, he begins his career as an MIT-trained weapons manufacturer and munitions developer. In the 2008 film Iron Man, he changes his ways because he learns that his company’s specialized weapons are being used by terrorists. Remember, robots are tools. Inherently, they are neither good nor bad; it’s how we choose to use them that matters. In 2022, aerial drones were used as weapons on both sides of devastating wars. Anyone can purchase a drone, but there are regulations for using drones that vary between and within different countries. In the United States, the Federal Aviation Administration requires that all drones be registered, with a few exceptions, including toy models weighing less than 250 grams. The rules also depend on whether the drone is flown for fun or for business. Regardless of regulations, anyone could use a flying robot to inflict harm, just like anyone can swing a hammer to hurt someone instead of driving a nail into a board. Yet drones are also being used to deliver critical medical supplies in hard-to-reach areas, track the health of forests, and help scientists like Roger Payne monitor and advocate for at-risk species. My group collaborated with the modern dance company Pilobolus to stage the first theatrical performance featuring a mix of humans and drones back in 2012, with a robot called Seraph. So, drones can be dancers, too. In Kim Stanley Robinson’s prescient science fiction novel The Ministry for the Future, a swarm of unmanned aerial vehicles is deployed to crash an airliner. I can imagine a flock of these mechanical birds being used in many good ways, too. At the start of its war against Ukraine, Russia limited its citizens’ access to unbiased news and information in hopes of controlling and shaping the narrative around the conflict. The true story of the invasion was stifled, and I wondered whether we could have dispatched a swarm of flying video screens capable of arranging themselves into one giant aerial monitor in the middle of popular city squares across Russia, showing real footage of the war, not merely clips approved by the government. Or, even simpler: swarms of flying digital projectors could have broadcasted the footage on the sides of buildings and walls for all to see. If we had deployed enough, there would have been too many of them to shut down. There may be variations of Tony Stark passing through my university or the labs of my colleagues around the world, and we need to do whatever we can to ensure these talented young individuals endeavor to have a positive impact on humanity. The Tony Stark character is shaped by his experiences and steered toward having a positive impact on the world, but we cannot wait for all of our technologists to endure harrowing, life-changing experiences. Nor can we expect everyone to use these intelligent machines for good once they are developed and moved out into circulation. Yet that doesn’t mean we should stop working on these technologies—the potential benefits are too great. What we can do is think harder about the consequences and put in place the guardrails to ensure positive benefits. My contemporaries and I can’t necessarily control how these tools are used in the world, but we can do more to influence the people making them. There may be variations of Tony Stark passing through my university or the labs of my colleagues around the world, and we need to do whatever we can to ensure these talented young individuals endeavor to have a positive impact on humanity. We absolutely must have diversity in our university labs and research centers, but we may be able to do more to shape the young people who study with us. For example, we could require study of the Manhattan Project and the moral and ethical quandaries associated with the phenomenal effort to build and use the atomic bomb. At this point, ethics courses are not a widespread requirement for an advanced degree in robotics or AI, but perhaps they should be. Or why not require graduates to swear to a robotics- and AI-attuned variation on the Hippocratic oath? The oath comes from an early Greek medical text, which may or may not have been written by the philosopher Hippocrates, and it has evolved over the centuries. Fundamentally, it represents a standard of medical ethics to which doctors are expected to adhere. The most famous of these is the promise to do no harm, or to avoid intentional wrongdoing. I also applaud the oath’s focus on committing to the community of doctors and the necessity of maintaining the sacred bond between teacher and pupils. The more we remain linked as a robotics community, the more we foster and maintain our relationships as our students move out into the world, the more we can do to steer the technology toward a positive future. Today the Hippocratic oath is not a universal requirement for certification as a doctor, and I do not see it functioning that way for roboticists, either. Nor am I the first roboticist or AI leader to suggest this possibility. But we should seriously consider making it standard practice. In the aftermath of the development of the atomic bomb, when the potential of scientists to do harm was made suddenly and terribly evident, there was some discussion of a Hippocratic oath for scientific researchers. The idea has resurfaced from time to time and rarely gains traction. But science is fundamentally about the pursuit of knowledge; in that sense it is pure. In robotics and AI, we are building things that will have an impact on the world and its people and other forms of life. In this sense, our field is somewhat closer to medicine, as doctors are using their training to directly impact the lives of individuals. Asking technologists to formally recite a version of the Hippocratic oath could be a way to continue nudging our field in the right direction, and perhaps serve as a check on individuals who are later asked to develop robots or AI expressly for nefarious purposes. Of course, the very idea of what is good or bad, in terms of how a robot is used, depends on where you sit. I am steadfastly opposed to giving armed or weaponized robots autonomy. We cannot and should not trust machine intelligences to make decisions about whether to inflict harm on a person or group of people on their own. Personally, I would prefer that robots never be used to do harm to anyone, but this is now unrealistic. Robots are being used as tools of war, and it is our responsibility to do whatever we can to shape their ethical use. So, I do not separate or divorce myself from reality and operate solely in some utopian universe of happy, helpful robots. In fact, I teach courses on artificial intelligence to national security officials and advise them on the strengths, weaknesses, and capabilities of the technology. I see this as a patriotic duty, and I’m honored to be helping our leaders understand the limitations, strengths, and possibilities of robots and other AI-enhanced physical systems—what they can and cannot do, what they should and should not do, and what I believe they must do. Ultimately, no matter how much we teach and preach about the limitations of technology, the ethics of AI, or the potential dangers of developing such powerful tools, people will make their own choices, whether they are recently graduated students or senior national security leaders. What I hope and teach is that we should choose to do good. Despite the efforts of life extension companies, we all have a limited time on this planet, what the scientist Carl Sagan called our “pale blue dot,” and we should do whatever we can to make the most of that time and have a positive impact on our beautiful environment, and the many people and other species with which we share it. My decades-long quest to build more intelligent and capable robots has only strengthened my appreciation for—no, wonder at—the marvelous creatures that crawl, walk, swim, run, slither, and soar across and around our planet, and the fantastic plants, too. We should not busy ourselves with the work of developing robots that can eliminate these cosmically rare creations. We should focus instead on building technologies to preserve them, and even help them thrive. That applies to all living entities, including the one species that is especially concerned about the rise of intelligent machines. Excerpted from “The Heart and the Chip: Our Bright Future with Robots”. Copyright 2024 by Daniela Rus, Gregory Mone. Used with permission of the publisher, W.W. Norton & Company. All rights reserved.
-
Why Are Large AI Models Being Red Teamed?
Mar 15, 2024 11:36 AM PDTIn February, OpenAI announced the arrival of Sora, a stunning “text-to-video” tool. Simply enter a prompt, and Sora generates a realistic video within seconds. But it wasn’t immediately available to the public. Some of the delay is because OpenAI reportedly has a set of experts called a red team who, the company has said, will probe the model to understand its capacity for deepfake videos, misinformation, bias, and hateful content. Red teaming, while having proved useful for cybersecurity applications, is a military tool that was never intended for widespread adoption by the private sector. “Done well, red teaming can identify and help address vulnerabilities in AI,” says Brian Chen, director of policy from the New York–based think tank Data & Society. “What it does not do is address the structural gap in regulating the technology in the public interest.” What is red teaming? The practice of red teaming derives its early origins from Sun Tzu’s military stratagem from The Art of War: “If you know the enemy and know yourself, you need not fear the result of a hundred battles.” The purpose of red-teaming exercises is to play the role of the adversary (the red team) and find hidden vulnerabilities in the defenses of the blue team (the defenders) who then think creatively about how to fix the gaps. The practice originated in U.S. government and military circles during the 1960s as a way to anticipate threats from the Soviet Union. Today, it is mostly known as a trusted cybersecurity technique used to help protect computer networks, software, and proprietary data. That’s the idea, at least. And in cybersecurity, where the role of hackers and the defenders are clear-cut, red teaming has a substantial track record. But how blue and red teams might be apportioned for AI—and what motivates the players in this whole exercise to ultimately act toward, ideally, furthering the public good—is unclear. In a scenario where red teaming is being used to ostensibly help safeguard society from the potential harms of AI, who plays the blue and red teams? Is the blue team the developers and the red team hackers? Or is the red team the AI model? And who oversees the blue team? Micah Zenko, author of Red Team: How to Succeed by Thinking Like the Enemy, says the concept of red teaming is not always well-defined and can be varied in its applications. He says AI red teamers should “proceed with caution: Be clear on reasoning, scope, intent, and learning outcomes. Be sure to pressure-test thinking and challenge assumptions.” Zenko also reveals a glaring mismatch between red teaming and the pace of AI advancement. The whole point, he says, is to identify existing vulnerabilities and then fix them. “If the system being tested isn’t sufficiently static,” he says, “then we’re just chasing the past.” Why is red teaming now part of AI public policy? On 30 October last year, President Joe Biden issued Executive Order 14110 instructing the U.S. National Institute of Standards and Technology (NIST) to develop science-based guidelines to support the deployment of safe, secure, and trustworthy systems, including for AI red teaming. Three months later, NIST has concluded the first few steps toward implementing its new responsibilities—red teaming and otherwise. It has collected public comments on the federal register, announced the inaugural leadership of the U.S. Artificial Intelligence Safety Institute, and started a consortium to evaluate AI systems and improve their trustworthiness and safety. This, however, is not the Biden administration’s first instance of turning to AI red teaming. The technique’s popularity in Biden administration circles started earlier in the year. According to Politico, White House officials met with organizers of the hacker conference DEFCON in March and agreed at that time to support a public red-teaming exercise. By May, administration officials announced their support to attempt an AI red teaming exercise at the upcoming DEFCON 31 conference in Las Vegas. Then, as scheduled, in August, thousands descended upon Caesar’s Forum in Las Vegas to test the capacity of AI models to cause harm. As of press time, the results of this exercise have yet to be made public. What can AI red teaming do? Like any computer software, AI models share the same cybervulnerabilities: They can be hacked by nefarious actors to achieve a variety of objectives including data theft or sabotage. As such, red teaming can offer one approach for protecting AI models from external threats. For example, Google uses red teaming to protect its AI models from threats such as prompt attacks, data poisoning, and backdooring. Once such vulnerabilities are identified, they can close the gaps in the software. To address the potential risks of AI, tech developers have built networks of external experts to help them assess the safety and security of their models. However, they tend to hire contractors and require them to sign nondisclosure agreements . The exercises still take place behind closed doors, and results are reported to the public in broad terms. Especially for the case of AI, experts from Data & Society, a technology think tank, say that red teaming should not take place internally within a company. Zenko suggests that “not only is there a need for independent third-party validation, companies should build cross-functional and multidisciplinary teams—not just engineers and hackers.” Dan Hendrycks, executive and research director of the San Francisco–based Center for AI Safety, says red teaming shouldn’t be treated as a turnkey solution either. “The technique is certainly useful,” he says. “But it represents only one line of defense against the potential risks of AI, and a broader ecosystem of policies and methods is essential.” NIST’s new AI Safety Institute now has an opportunity to change the way red teaming is used in AI. The Institute’s consortium of more than 200 organizations has already reportedly begun developing standards for AI red teaming. Tech developers have also begun exploring best practices on their own. For example, Anthropic, Google, Microsoft, and OpenAI have established the Frontier Model Forum (FMF) to develop standards for AI safety and share best practices across the industry. Chris Meserole, FMF executive director, says that “red teaming can be a great starting point for assessing the potential risks a model might introduce.” However, he adds, it is far from “a panacea, which is why we’ve been keen to support the development of other evaluation, assessment, and mitigation techniques to assure the safety of frontier AI models.” In other words, AI models at the bleeding edge of technology development demand a range of strategies, not just a tool recycled from cybersecurity—and ultimately dating back to the Cold War.
-
Video Friday: Many Quadrupeds
Mar 15, 2024 08:45 AM PDTVideo Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion. HRI 2024: 11–15 March 2024, BOULDER, COLO. Eurobot Open 2024: 8–11 May 2024, LA ROCHE-SUR-YON, FRANCE ICRA 2024: 13–17 May 2024, YOKOHAMA, JAPAN Enjoy today’s videos! How many quadrupeds can you control with one single locomotion policy? Apparently, the answer is “all of the quadrupeds.” Look for this at ICRA 2024 in a couple of months! [ EPFL ] Thanks, Milad! Very impressive performance from Figure 01, I think, although as is frequently the case, it’s hard to tell exactly how impressive without more information about exactly what’s going on here. [ Figure ] That awesome ANYmal Parkour research is now published, which means that there’s a new video, well worth watching all the way to the end. [ Science ] via [ ETHZ RSL ] Robotic vision can be pretty tricky when you’re cooking, because things can significantly change how they look over time, like with melting butter or an egg being fried. Some new research is tackling this, using a (now ancient?) PR2. [ JSK Lab ] Thanks, Kento! Filmed in January of 2020, this video shows Atlas clearing debris and going through a doorway. Uses a combination of simple footstep planning, teleoperation, and autonomous behaviors through a single virtual reality operator interface. Robot built by Boston Dynamics for the DARPA Robotics Challenge in 2013. [ IHMC ] Sustainable fashion enabled by smart textiles shaped by a robot and a heat gun. Multiple styles, multiple sizes, all in one garment! [ MIT ] Video of Boston Dynamics’ Stretch from MODEX, with a little sneak peak at the end of what the robot’s next warehouse task might be. [ Boston Dynamics ] Pickle Robots autonomously unload trucks and import containers. The system is in production use at customer warehoues handling floor-loaded freight at human scale or better. [ Pickle Robot ] The ROBDEKON robotics competence center is dedicated to the development of robotic systems for hazardous environments that pose a potential risk to humans. As part of the consortium, the FZI Research Center for Information Technology developed robotic systems, technologies, and Artificial Intelligence (AI) methods that can be used to handle hazardous materials–for example, to sort potentially dangerous used batteries for recycling. [ FZI ] This research project with Ontario Power Generation involves adapting Boston Dynamics Spot’s localization system to longterm changes in the environment. During this testing, we mounted a GoPro camera on the back of Spot and took a video of each walk for a year from Spot’s point of view. We put the footage together as a moving time-lapse video where the day changes as Spot completes the Autowalk around the campus. [ MARS Lab ]
-
Who Will Free EV Motors from the Rare Earth Monopoly?
Mar 15, 2024 07:58 AM PDTAs the world builds more and more electric cars and trucks—and electrifies other modes of transit—a race is underway to build the ideal, mean-and-green motor. The goal is a traction motor that’s at least as powerful, reliable, and lightweight as today’s industry standard rare-earth permanent magnet synchronous motor. However, rare-earth elements like neodymium and dysprosium, which are required for the most powerful magnets for the latter, also represent a major choke point. Their mining and processing comes at an environmental cost, and China holds a near-monopoly stake in them. Which is why the race to build an EV motor without rare-earths is so important. These projects don’t get as much attention as the race for better batteries and ever-more gargantuan battery factories, but they will be no less vital to the future of electrified transportation. The motor R&D projects take a variety of forms, including longstanding work to improve induction motors and various exotic types of synchronous motors, as well as efforts to build powerful synchronous motors with permanent magnets that don’t use rare earths. “I am a believer that this type of material is a game changer.” —Ayman El-Refaie, Marquette University Now a sleeper option, synchronous reluctance motors, is getting a surge of interest, thanks to materials-science breakthroughs at GE Aerospace. GE is one of a couple of companies developing materials with a remarkable property: when exposed to a strong magnetic field, different regions of the material become magnetized at radically different levels of intensity—either not magnetized at all or very highly magnetized. In a stunning paper last year, GE researchers reported that they had used such a material, called a dual-phase magnetic material, to produce a rare-earth-free rotor for a synchronous reluctance motor that had impressive characteristics. The 23-kilowatt experimental GE motor was tested on a dynamometer using a torque meter—the gray, cylindrical item with the fins at left. The translucent tubes contain red-colored oil circulated to cool the motor.GE Aerospace Research “I am a believer that this type of material is a game changer,” says Ayman El-Refaie, an IEEE Fellow and professor of electrical and computer engineering at Marquette University in Milwaukee, Wisc. El-Refaie originated the GE program in dual-phase materials in 2005. GE’s breakthrough material In tests, the GE motor handily outperformed synchronous reluctance motors that were more or less identical except for having rotors fabricated with conventional magnetic materials. For example, in one trial, the motor with the dual-phase rotor had power output of 23 kilowatts at 14,000 rpm; the comparable conventional-rotor machine could manage only 3.7 kW. That dual-phase-equipped motor had a respectable mass power density of 1.4 kW per kilogram. (It fell short of the predicted value of 1.87 kW/kg because the prediction had been based on attributes of lab-scale or simulated parts rather than production ones.) Electric vehicles on the market today typically have motors with power densities between 1.1 and 3.0 kW/kg. The peak efficiency of the GE motor was 94 percent, on a par with the best motors used today in commercial EVs. To understand the promise of dual-phase materials, start with some basics about synchronous reluctance motors. As with other motors, they have a stator and a rotor. A rotating magnetic field is created in the stator. This spinning field magnetizes and engages the rotor, which is usually made of a ferromagnetic alloy called electric steel. The rotor then spins because of a phenomenon called magnetic reluctance, which is the property that causes a ferromagnetic material to align itself with the lines of flux of a magnetic field. As the stator field rotates, the magnetized rotor continually tries to align itself with that rotating field, producing torque. A four-pole rotor in a synchronous reluctance motor has areas that are very highly magnetized—shown in red—and others that are not magnetized (blue). This image shows the magnetization when the stator and rotor are aligned.Oak Ridge National Laboratory One weakness of such a machine, however, involves the rotor. The magnetic interaction between the rotor and stator, which makes the rotor spin, is concentrated at evenly spaced positions, called poles, on the rotor and stator. The magnetic field lines from a pole on the rotor must be strongly linked to a corresponding pole on the stator. However, these field lines from the rotor poles have a tendency to interfere with each other, which reduces the lines, or flux, available to connect to the corresponding stator poles. “That will lower the overall torque that you can produce with the motor because the torque is primarily due to that flux linkage between the motor poles in the rotor and the stator,” says Frank Johnson, another veteran of the GE research team on dual-phase materials, and currently chief technical officer of Niron Magnetics in Minneapolis. So to magnetically isolate the poles from each other, one option would be to minimize structures, called bridges and posts, around the poles of the rotor. With less magnetic material, these structures would produce less flux and therefore less interference. That’s actually not a good option, though, because minimizing those structures would leave those areas narrow and therefore mechanically weak. That weakness would greatly limit the speed at which the rotor could spin, which would in turn limit the motor’s power. But with a dual-phase material, it is possible to make the bridges and posts non-magnetizable (the technical term is “non-permeable”), and also wide and strong. That’s what GE did with its experimental motor. No company is yet offering dual-phase magnetic materials suitable for the rotors of high-power traction motors. No one outside of GE Aerospace knows whether, or when, the company might license or manufacture its material. (GE Aerospace declined to make a researcher available to be interviewed for this article.) A survey article published in December in the Journal of Magnetism and Magnetic Materials concluded that GE’s material, which is produced in a process called high-temperature nitriding, “is the most developed method for producing dual-phase magnetic steel. Due to the high manufacturability of production, the final cost of the product is competitive compared to traditional electrical steel.” Besides GE, the only other company known to be working on dual-phase magnetic materials is Proterial (formerly Hitachi Metals, which has been developing dual-phase magnetic materials since the 1990s). University research programs investigating similar technologies include ones at Ufa University of Science and Technology in Russia, Yeungnam University in Republic of Korea, Harbin Institute of Technology in China, and the University of Sheffield in the U.K. El-Refaie, at Marquette, says that GE’s dual-phase material could be improved with further development. For example, the material’s maximum saturation flux density, a measure of how strongly magnetized the material can become, is 1.5 teslas—well below the 2 T limit of ordinary electric steel. But technical advances are probably not the highest hurdle to commercialization, he adds. “Whether someone can bring it to the finish line, and establish a supply-chain for it, it’s not clear how this is going to happen,” says El-Refaie. “It’s not only about GE. They have to work with vendors if they want to make it available for the broader technical market.” “A significant barrier will be finding a steel producer willing and able to produce the rolled metal sheet used to make the dual-phase rotor” material, adds Johnson. “The alloy that we developed has very low cost elements, which paradoxically makes it a difficult business case to justify except at very large manufacturing volumes requiring large amounts of capital equipment.” But if the material does go into mass-scale production, the benefits would go well beyond synchronous reluctance motors. Projects at Marquette University, Yeungnam University, and Ufa University of Science and Technology have demonstrated advantages of dual-phase materials in permanent magnet synchronous motors and electrical generators. “It’s not only IPM (interior permanent magnet motor) machines that can benefit,” says El-Refaie. “It can have advantages in other types of machines as well, for different reasons.”
-
FCC Denies Starlink Low-Orbit Bid for Lower Latency
Mar 13, 2024 04:20 PM PDTThe FCC has once again rejected a Starlink plan to deploy thousands of internet satellites in very low earth orbits (VLEO) ranging from 340 to 360 kilometers. In an order published last week, the FCC wrote: “SpaceX may not deploy any satellites designed for operational altitudes below the International Space Station,” whose orbit can range as low as 370 kilometers. Starlink currently has nearly 6000 satellites orbiting at around 550 kilometers that provide internet access to over 2.5 million customers around the world. But its service is currently slower than most terrestrial fiber networks, with average latencies (the time for data to travel between origin and destination) over 30 milliseconds at best, and double that at peak times. “If you fill that region with tens of thousands of satellites, it would put an even bigger squeeze on them and really compromise your ability to service the space station.” —Hugh Lewis, University of Southampton, U.K. “The biggest single goal for Starlink from a technical standpoint is to get the mean latency below 20 milliseconds,” said Elon Musk at a SpaceX event in January. “For the quality of internet experience, this is actually a really big deal. If you play video games like I sometimes do, this is also important, otherwise you lose.” The easiest way to reduce latency is to simply shorten the distance the data have to travel. So in a February letter, SpaceX pleaded with the FCC to allow its VLEO constellation: “Operating at these lower altitudes will enable SpaceX to provide higher-quality, lower-latency satellite service for consumers, keeping pace with growing demand for real-time applications.” These now include the military use of Starlink for communications in warzones such as Ukraine. Starlink also argued that its VLEO satellites would have collision probabilities ten times lower than those in higher orbits, and be easier to deorbit at the end of their functional lives. But the FCC was having none of it. The agency had already deferred VLEO operations when it licensed Starlink operations in December 2022, and used very similar languages in its order last week: “SpaceX must communicate and collaborate with NASA to ensure that deployment and operation of its satellites does not unduly constrain deployment and operation of NASA assets and missions, supports safety of both SpaceX and NASA assets and missions, and preserves long-term sustainable space-based communications services.” Neither the FCC nor SpaceX replied to requests for comment, but the agency’s reasoning is probably quite simple, according to Hugh Lewis, professor of astronautics at the University of Southampton in the U.K. “We don’t understand enough about what the risks actually are, especially because the number of satellites that SpaceX is proposing is greater than the number they’ve already launched,” he says. “I think the FCC might be overreacting. We will know where all the satellites are, we can watch them and avoid them. It is the stuff we can’t see that’s the problem.” —John Crassidis, University at Buffalo Although it might seem that having satellites orbiting below the International Space Station (ISS) would be safer than orbiting above, the fast-moving, SUV-sized Starlink craft might restrict when astronauts could reach the ISS—or leave in an emergency. “We are already seeing interruptions in launch windows thanks to Starlink,” says Lewis. “If you fill that region with tens of thousands of satellites, it would put an even bigger squeeze on them and really compromise your ability to service the space station.” In February 2022, NASA recommended that SpaceX prepare an analysis of launch window availability for the space station and interplanetary missions to ensure that Starlink would not significantly reduce access to space. No such analysis has been made public. John Crassidis, professor of mechanical and aerospace engineering the University at Buffalo, isn’t convinced the VLEO satellites would be that disruptive. “I think the FCC might be overreacting. We will know where all the satellites are, we can watch them and avoid them,” he says. “It is the stuff we can’t see that’s the problem.” While VLEO is almost empty compared to higher orbits, satellites there still risk collisions from satellites transiting up to their operational altitudes—and particularly from objects making uncontrolled descents to Earth. “There’s a persistent stream of things that are coming down, old cubesats and debris,” says Lewis. “It’s like a constant rain coming down.” New guidelines that are meant to leave fewer dead satellites in space for decades could also mean more transits through lower orbits, according to a paper Lewis wrote last year. He thinks that impacts in VLEO could easily eject high speed fragments up to higher orbits: “So even though you’re below the ISS, the ISS would still be within range of a debris cloud for a collision at 350 kilometers.” Crassidis disagrees. “You’d have to have a very violent collision to make that happen,” he says. “That’s something I’m not worried about.” Aside from safety considerations, other internet satellite operators also seem skeptical of SpaceX’s VLEO plans. Amazon asked the FCC for more opportunity to comment, while the Betzdorf, Luxembourg-based satellite telecom company SES sent a letter citing concerns about VLEO Starlinks interfering with its own satellites. Although SpaceX will have to keep deploying its satellites well above 500 kilometers, the battle for a low-latency VLEO constellation isn’t over. SpaceX also demonstrated a direct-to-cellular service in January with Starlink satellites at 360 kilometers, that it will likely want to operate commercially at similar altitudes rather than hundreds of kilometers higher. The FCC only deferred its decision on the low-flying satellites, along with 22,488 other satellites from SpaceX’s original application, leaving the door open for future changes. But for now at least, the astronauts of the ISS have won, and Musk and other online gamers will need to just keep on losing. UPDATE 14 March 2023: The story was updated to include reference to a January demonstration of direct-to-cell service.
-
Cerebras Unveils Its Next Waferscale AI Chip
Mar 13, 2024 06:00 AM PDTSunnyvale, Calif., AI supercomputer firm Cerebras says its next generation of waferscale AI chips can do double the performance of the previous generation while consuming the same amount of power. The Wafer Scale Engine 3 (WSE-3) contains 4 trillion transistors, a more than 50 percent increase over the previous generation thanks to the use of newer chipmaking technology. The company says it will use the WSE-3 in a new generation of AI computers, which are now being installed in a datacenter in Dallas to form a supercomputer capable of 8 exaflops (8 billion billion floating point operations per second). Separately, Cerebras has entered into a joint development agreement with Qualcomm that aims to boost a metric of price and performance for AI inference 10-fold. The company says the CS-3 can train neural network models up to 24-trillion parameters in size, more than 10 times the size of today’s largest LLMs. With WSE-3, Cerebras can keep its claim to producing the largest single chip in the world. Square-shaped with 21.5 centimeters to a side, it uses nearly an entire 300-millimeter wafer of silicon to make one chip. Chipmaking equipment is typically limited to producing silicon dies of no more than about 800 square millimeters. Chipmakers have begun to escape that limit by using 3D integration and other advanced packaging technology3D integration and other advanced packaging technology to combine multiple dies. But even in these systems, the transistor count is in the tens of billions. As usual, such a large chip comes with some mind-blowing superlatives. Transistors 4 trillion Square millimeters of silicon 46,225 AI cores 900,000 AI compute 125 petaflops On chip memory 44 gigabytes Memory bandwidth 21 petabytes Network fabric bandwidth 214 petabits You can see the effect of Moore’s Law in the succession of WSE chips. The first, debuting in 2019, was made using TSMC’s 16-nanometer tech. For WSE-2, which arrived in 2021, Cerebras moved on to TSMC’s 7-nm process. WSE-3 is built with the foundry giant’s 5-nm tech. The number of transistors has more than tripled since that first megachip. Meanwhile, what they’re being used for has also changed. For example, the number of AI cores on the chip has significantly leveled off, as has the amount of memory and the internal bandwidth. Nevertheless, the improvement in performance in terms of floating-point operations per second (flops) has outpaced all other measures. CS-3 and the Condor Galaxy 3 The computer built around the new AI chip, the CS-3, is designed to train new generations of giant large language models, 10 times larger than OpenAI’s GPT-4 and Google’s Gemini. The company says the CS-3 can train neural network models up to 24-trillion parameters in size, more than 10 times the size of today’s largest LLMs, without resorting to a set of software tricks needed by other computers. According to Cerebras, that means the software needed to train a one-trillion parameter model on the CS-3 is as straightforward as training a one billion parameter model on GPUs. As many as 2,048 systems can be combined, a configuration that would chew through training the popular LLM Llama 70B from scratch in just one day. Nothing quite that big is in the works, though, the company says. The first CS-3-based supercomputer, Condor Galaxy 3 in Dallas, will be made up of 64 CS-3s. As with its CS-2-based sibling systems, Abu Dhabi’s G42 owns the system. Together with Condor Galaxy 1 and 2, that makes a network of 16 exaflops. “The existing Condor Galaxy network has trained some of the leading open-source models in the industry, with tens of thousands of downloads,” said Kiril Evtimov, group CTO of G42 in a press release. “By doubling the capacity to 16 exaflops, we look forward to seeing the next wave of innovation Condor Galaxy supercomputers can enable.” A Deal With Qualcomm While Cerebras computers are built for training, Cerebras CEO Andrew Feldman says it’s inference, the execution of neural network models, that is the real limit to AI’s adoption. According to Cerebras estimates, if every person on the planet used ChatGPT, it would cost US $1 trillion annually—not to mention an overwhelming amount of fossil-fueled energy. (Operating costs are proportional to the size of neural network model and the number of users.) So Cerebras and Qualcomm have formed a partnership with the goal of bringing the cost of inference down by a factor of 10. Cerebras says their solution will involve applying neural networks techniques such as weight data compression and sparsity—the pruning of unneeded connections. The Cerebras-trained networks would then run efficiently on Qualcomm’s new inference chip, the AI 100 Ultra, the company says.
-
The Messy Reality Behind a Silicon Valley Unicorn
Mar 13, 2024 05:00 AM PDTFor 19 months, the sociologist Benjamin Shestakofsky embedded himself in an early-stage tech startup to study its organization and culture. The company went on to become one of Silicon Valley’s “unicorns,” valued at over US $1 billion. This article is adapted from an excerpt of the author’s new book, Behind the Startup: How Venture Capital Shapes Work, Innovation, and Inequality (University of California Press, 2024). The names of staff members and the company have been changed to preserve privacy. When I began my research, AllDone had just secured its first round of venture capital funding to fuel its quest to build an “Amazon for local services.” The company had built a digital platform connecting buyers and sellers of local services—housecleaners, plumbers, math tutors, and everything in between—across the United States. Although the influx of $4.5 million was cause for celebration, it also incited a sense of urgency among employees in the San Francisco office. As Carter, AllDone’s president, intoned in an all-staff email: We know what the future of local services is. But we’re not the only people that know this is the future. And, more importantly, there’s lots of people—smart, scrappy, and well-funded people—building our vision. Someone is going to do it. And it looks like it’s going to happen soon. We just have to finish building faster than anyone else and we will win. Demonstrating AllDone’s potential for explosive growth was the founders’ highest priority—and that priority shaped the company’s strategy and structure. AllDone faced extraordinary pressure from venture capital investors to grow as quickly as possible, which required finding new ways to attract users and increase their activity on the platform. At the same time, AllDone’s leaders knew the firm would be worthless if it couldn’t keep its product functioning properly and provide services to its ever-expanding user base. So the engineers in San Francisco set out to meet investors’ expectations by finding new ways to grow the company. Meanwhile, AllDone’s managers hired contractors in the Philippines to perform routine information-processing tasks. Some of the contractor work involved operations that software alone was unable to accomplish. But engineers also offloaded processes that software was technically capable of handling so that employees in San Francisco could remain focused on their strategic goals. Managers viewed AllDone’s Filipino workforce as a crucial contributor to the company’s rapid growth. It was, in the words of two executives, “the magic behind AllDone.” The Voorhes Startup Life After the First Funding Round In the period immediately following the first round of funding, AllDone’s founders prioritized two kinds of expansion: growing the user base and hiring more staff for the San Francisco team. First, to have any hope of success, AllDone would have to bring a critical mass of users on board. While the company had enrolled 250,000 “sellers” of services, “buyers” were submitting only about 7,000 requests for services per month. The team aimed to boost buyer requests by nearly 50 percent over the next quarter, demonstrating the kind of explosive growth that would make AllDone an attractive target for future VC funding rounds. AllDone’s software developers would thus be mobilized to overhaul the platform and make users’ experiences more intuitive and engaging. Executives planned to use most of the new money to hire more engineers and designers. Recruiting them soon became an all-consuming task that engaged AllDoners both inside and outside of the office, leaving little time for the staff to run the business. The recruitment effort was led by Peter, AllDone’s CEO. First, an external headhunter reviewed résumé submissions and scheduled introductory phone calls between promising applicants and Peter. Next came a coding challenge devised by the company’s four software engineers, followed by a phone interview with one of the engineers to further evaluate each applicant’s technical prowess. Those who passed that test moved on to a daylong interview in the office, which consisted of 90-minute one-on-one sessions with each of the four current engineers. Candidates would also spend an hour with Josh, the product manager, and finally another hour with Peter before being sent off in the evening with a beer stein emblazoned with the AllDone logo. Each member of the hiring committee would write an evaluation that everyone involved would read before conferring in person to discuss the candidate’s fate. For weeks at a time, the hiring team interviewed one or two candidates per day. The engineers’ heavy involvement in the laborious and time-consuming hiring process reduced their productivity, which threatened to slow the company’s progress at a time when investors expected precipitous growth. Although I had come to AllDone because of my interest in studying work and life inside a startup, my field notes reflected my surprise: “Since I began at AllDone, there doesn’t appear to be much work going on at all, at least as far as software production is concerned.” My observations were later confirmed by Josh, AllDone’s product manager, when he reported that during the first quarter of the year, AllDone’s four software engineers had “accomplished very little” in terms of their production goals because they had been “very, very focused on recruiting,” which he said had consumed at least half of their work hours. How, then, did AllDone run and even grow its platform when its software developers were frequently too busy with recruiting to do their jobs? The Voorhes The Human Machine Behind the Software AllDone’s managers increasingly turned to the company’s digital assembly line in the Philippines, where contractors performed computational work that stood in for or supported software algorithms. AllDone had hired its first work-from-home Filipino contractor a few months after the company’s launch. Within a year, the team had grown to 125, and during my research it expanded to 200. Most contractors were college educated and between the ages of 20 and 40; about 70 percent were women. Executives often called these workers AllDone’s “human machine.” Contractors logged in to AllDone’s administrative portals to complete various sets of tasks. Most notably, a division that eventually numbered nearly 100 people handled the company’s primary function of manually matching buyer requests with sellers from AllDone’s database of service providers—a process that users likely assumed was automated. Another division onboarded new sellers by classifying the services they provided, running an array of checks to verify their trustworthiness, and proofreading their profiles. A third division was responsible for generating brief descriptions of AllDone sellers; these blurbs were then compiled on Web pages designed to boost AllDone’s position in search-engine rankings. In total, Filipino contractors executed over 10,000 routine tasks per day. Filipino contractors’ wages and work hours were determined by their jobs: On average, contractors earned about $2.00 per hour and worked about 30 hours per week. While AllDone paid its Filipino workers only a tiny fraction of what San Francisco–based employees earned, their compensation substantially exceeded the Philippines’ legal minimum wage. As independent contractors, these workers didn’t receive paid vacation, sick leave, health insurance, or retirement benefits, nor did they enjoy the perks (like free food) available to workers in the San Francisco office. Contractors were also responsible for providing their own computer equipment and Internet connections. Contractors effectively functioned as artificial artificial intelligence, simulating the output of software algorithms that had yet to be completed. Companies seeking workers to do routine information processing often post tasks to on-demand “crowdwork” platforms like Amazon Mechanical Turk. In AllDone’s case, the importance of its contractors’ tasks to the company’s success meant that an open call fulfilled by anonymous workers simply wouldn’t do. AllDone’s staff in San Francisco considered AllDone Philippines an integral part of the organization and built enduring relationships with contractors, who typically performed the same assigned task for a period of months or even years. Newly hired contractors watched training videos to learn how to perform operations using AllDone’s proprietary administrative software. Managers of the Filipino divisions distributed weekly quizzes and offered coaching to ensure that workers understood AllDone’s rules and procedures. Yet at times, even high-ranking managers in the Philippines were excluded from important decisions that would affect their teams. In one meeting I had with Carter, AllDone’s president, he explained that AllDone’s engineers had recently made a change that suddenly increased some contractors’ workload by 60 percent. “We should have told them ahead of time so they would know it’s coming,” Carter said, wincing a little and shrugging sheepishly, “but it just didn’t occur to us.” For most staffers at AllDone San Francisco, their Filipino colleagues were effectively invisible human infrastructure that they could take for granted. The efforts of AllDone’s Filipino workforce had the desired effect. During the first quarter of the year, AllDone met its user-growth goal, receiving almost 50 percent more buyer requests than in the prior three-month period. During the second quarter, that metric would increase again by 75 percent. AllDone’s Filipino contractors made these substantial gains possible by laboring alongside computer code. In some instances, their efforts complemented software systems because the workers’ skills allowed them to perform tasks that algorithms couldn’t yet reliably manage, like writing original blurbs about specific sellers. In other cases, AllDone relied on workers to imitate software algorithms, taking on functions that computers were technically capable of performing but that developers in San Francisco believed would have been too costly or time-consuming to code themselves. Relying on Artificial Artificial Intelligence Because AllDone’s search-engine optimization strategy was yielding an ever-increasing volume of buyer requests, the company had to connect far more buyers with sellers than ever before. Indeed, this matching process was AllDone’s core function. But instead of expending scarce engineering resources on matching buyers with sellers, AllDone relied on staff in the Philippines to manually construct every introduction. This arrangement allowed software engineers to devote their energies to experimenting with new projects that could “move the needle,” or significantly increase key metrics (such as the number of buyer requests) that VC investors watched to assess the startup’s success. Members of the Filipino matching team used a Web portal that displayed the details of each new buyer request. They began their work by vetting requests and deleting those that appeared to be fraudulent (for example, a request placed by “Mickey Mouse”). The portal then provided team members with a rough, algorithmically generated list of local AllDone sellers who might be eligible to fulfill the request because they worked in relevant service categories. Workers would select all the sellers whom they judged to be appropriate matches, and the sellers would then be automatically notified so they could provide quotes for the service. The Filipino contractors effectively functioned as artificial artificial intelligence, simulating the output of software algorithms that had yet to be completed. It’s too soon to forecast a future of full automation or a world without work. AllDone’s users never knew that human workers, rather than a computer algorithm, had handcrafted each introduction. To keep up with the rapid rise in request volume, the matching team more than doubled in size during the first phase of my research, increasing from 30 to 68 people. Additionally, local managers cross-trained members of another division on the matching function so that when user activity peaked, more workers could be immediately mobilized to assist. There were many other processes that AllDone’s engineers agreed could have been automated yet were instead handled by contractors. These included screening out sellers whose names appeared on the U.S. Department of Justice’s national sex-offender registry, adding badges to seller profiles that passed a series of verifications, checking sellers’ professional license numbers against relevant state databases, running voluntary criminal-background checks on sellers, and sending customized emails apologizing to buyers whose requests received zero quotes from sellers. Quick and Dirty Tests The San Francisco team further reduced the engineering burden that came with developing new product features by having contractors support what AllDone’s software engineers called “quick and dirty” tests. That is, Filipino workers would manually execute algorithmic tasks that were under consideration for automation, providing a rough approximation of a project’s potential before developers invested time and resources in coding the software. In one such case, the product team wanted to determine whether they should add information from sellers’ profiles on the consumer-review website Yelp to their AllDone profile pages. They theorized that this additional information would enhance the perceived trustworthiness of AllDone sellers and increase buyer requests. Yelp offers free tools that allow software developers to embed Yelp users’ business information directly into their own websites. However, Bill, the AllDone engineer in charge of the project, preferred not to spend his time learning how to use Yelp’s tools without first knowing whether the new feature was likely to succeed. So he devised a test whereby contractors in the Philippines manually searched for 9,000 AllDone sellers on Yelp and gathered information from their Yelp user profiles. Bill then put this information on relevant AllDone pages. Upon finding that it did not have a statistically significant effect on buyer behavior, he abandoned the test. Throughout my research, AllDone had between four and eight software engineers on staff. Without the Filipino team, the startup would have been forced to abandon some functions of its website and to reallocate some of its engineering resources toward building software infrastructure. The Filipinos’ reliable performance of important tasks helped the company achieve the precipitous growth demanded by venture capital investors to rapidly increase the company’s valuation. While the team in San Francisco threw parties for new recruits, enjoyed catered meals, and created the impression of technological wizardry, Filipino contractors were toiling behind the scenes. AllDone’s story highlights the unseen but ongoing role of human workers on the frontiers of automation, and it demonstrates why it’s too soon to forecast a future of full automation or a world without work. The interdependence between generously compensated software engineers in San Francisco and low-cost contractors in the Philippines suggests that advances in software automation still rely not only on human labor, but also on global inequalities.
-
Using Manga to Spark Interest in STEM
Mar 12, 2024 01:00 PM PDTManga has grown in popularity in recent years among young adults. The Japanese comics and graphic novels dominated last year’s Circana BookScan graphic novels sales charts. The IEEE Women in Engineering group decided to use manga’s popularity with young people as a way to encourage girls to consider a career in STEM fields. WIE held its first competition last year to find the best-written manga that centered around a character WIE created: Riko-chan, a preuniversity student who uses STEM tools to solve everyday problems. The competition, which was supported by the IEEE Japan Council and the IEEE New Initiatives Committee, was open to all IEEE members and student members. They could submit multiple original stories individually, in teams, or on behalf of a preuniversity student. Out of 43 submissions from around the world, six winners were chosen. The winning manga stories are available to read online. Explaining how blockchain and aerodynamics work One of the winners was IEEE Member Carolyn Sher-DeCusatis, who teaches software engineering at Western Governors University, in Salt Lake City. Her areas of interest include physics, semiconductors, and computer programming. WIE has published her two winning comics on its website. Sher-DeCusatis says she entered the contest because she enjoys encouraging youngsters—especially girls and young women—to pursue a STEM career and to show them how great the field is. She has a lot of experience writing fiction because it’s her hobby. This is the first time her work has been published. “When I saw [IEEE WIE] was looking for stories about a young woman who was solving problems with engineering, I thought that it was right up my alley and that it would be fun,” she says. Hoping to connect with young readers through a Pokémon-inspired card game, Sher-DeCusatis’s first comic, Riko-chan: Cybersecurity Engineer, centers around two of the title character’s classmates who are concerned whether a rare trading card used in their favorite game is authentic. Riko-chan uses blockchain technology to help verify the card’s authenticity. People use blockchains to keep records of transactions and exchanges of data without relying on a central authority. The system is designed to use cryptography to protect information from being altered or stolen. The idea for Sher-DeCusatis’s second comic, she says, came from a personal experience she had while volunteering for IEEE. “Through the organization, I’ve helped [preuniversity] teachers conduct hands-on activities in local schools,” she says. “One of the classrooms I went to didn’t have materials to teach students about STEM subjects, so the teacher taught a lesson on aerodynamics using paper airplanes.” At the end of the lesson, students had to build a paper plane and throw it at a target. The experience inspired Riko-chan: Aeronautical Engineer. While walking through a forest, a paper plane suddenly lands in front of Riko-chan, nearly hitting her. Her friend, who also was in the forest, had built the plane and explained to Riko-chan that she hadn’t intended to hit her; the plane was supposed to land farther away. Riko-chan uses her knowledge of aerodynamics to show her friend how to improve the plane’s design. Sher-DeCusatis says she hopes readers can identify with Riko-chan. “I think having diverse representation in media, like books, comics, and graphic novels, will help bring more kids into STEM,” she says. “Fiction reaches our hearts, and when young people read stories about someone who looks like them, they can say: ‘That could be me.’ “It’s really important to excite students from a variety of backgrounds about engineering,” she adds, “because it’s vital to the future of STEM to have a lot of different voices contributing. “So much of our world is based on technology, so we need to have a lot of voices to shape its future. “I also think it’s fun and exciting,” she adds. “That’s why I’m still doing it after all these years.” Confronting climate change and cybersecurity threats The other four winning comics cover artificial intelligence, climate change, and cybersecurity. Riko-chan and the Perfect Circuit, written by IEEE Member Lais Lara Baptista, follows the character as she and a friend try to repair the old lights in her grandmother’s house. The two ask an online AI circuit-design program for directions on how to repair the lights’ outdated circuitry. “I see Riko-chan as a powerful motivator,” says Baptista, a full-stack developer based in Brazil. “I hope my comic inspires girls to see themselves as problem-solvers, capable of mastering the world of STEM.” High school student Julia Griffin wanted her comic to encourage youngsters to protect the environment. Her mother, IEEE Member Denise Griffin, submitted her story, Motion Detected, on her behalf. In the story, Riko-chan and a friend are able to escape being hit by a car while crossing a busy road. They notice that the pedestrian crossing sign was obstructed by a branch from a 100-year-old tree. To improve the road’s safety measures without damaging the tree, the girls install a motion detector that sets off a flashing light to notify drivers when people are crossing the street. “A love for the world around us can motivate young minds to pursue STEM and solve some of the world’s biggest challenges: environmental issues,” Griffin says. Maira Ratnarajah, author of Life of Our Beautiful Earth, stressed STEM’s role in confronting the climate crisis in her comic. Ratnarajah is a high school student in the United Kingdom. IEEE Member Kit August submitted the story on her behalf. In Ratnarajah’s comic, Riko-chan sees a poisonous cloud in the sky and works to reduce the carbon footprint humans leave on Earth. “STEM skills and knowledge can help engineers design renewable energy systems and study how climate change affects our ecosystems,” Ratnarajah says. “Manga, like the one I wrote, can motivate girls to pursue STEM to develop green technologies.” The final manga entry, Riko-chan: The Science Solver, was written by Devidas Kulkarni, an IEEE graduate student member. It focuses on the importance of cybersecurity. In it, Riko-chan’s friend Reizei discovers that his cellphone has been hacked. Riko-chan is able to track down who did it. Submissions for this year’s manga story competition are now being accepted. Check out the rules and deadlines. If your organization would like to support the competition, contact the WIE staff.
-
Covariant Announces a Universal AI Platform for Robots
Mar 11, 2024 10:44 AM PDTWhen IEEE Spectrum first wrote about Covariant in 2020, it was a new-ish robotics startup looking to apply robotics to warehouse picking at scale through the magic of a single end-to-end neural network. At the time, Covariant was focused on this picking use case, because it represents an application that could provide immediate value—warehouse companies pay Covariant for its robots to pick items in their warehouses. But for Covariant, the exciting part was that picking items in warehouses has, over the last four years, yielded a massive amount of real-world manipulation data—and you can probably guess where this is going. Today, Covariant is announcing RFM-1, which the company describes as a robotics foundation model that gives robots the “human-like ability to reason.” That’s from the press release, and while I wouldn’t necessarily read too much into “human-like” or “reason,” what Covariant has going on here is pretty cool. “Foundation model” means that RFM-1 can be trained on more data to do more things—at the moment, it’s all about warehouse manipulation because that’s what it’s been trained on, but its capabilities can be expanded by feeding it more data. “Our existing system is already good enough to do very fast, very variable pick and place,” says Covariant co-founder Pieter Abbeel. “But we’re now taking it quite a bit further. Any task, any embodiment—that’s the long-term vision. Robotics foundation models powering billions of robots across the world.” From the sound of things, Covariant’s business of deploying a large fleet of warehouse automation robots was the fastest way for them to collect the tens of millions of trajectories (how a robot moves during a task) that they needed to train the 8 billion parameter RFM-1 model. Covariant “The only way you can do what we’re doing is by having robots deployed in the world collecting a ton of data,” says Abbeel. “Which is what allows us to train a robotics foundation model that’s uniquely capable.” There have been other attempts at this sort of thing: The RTX project is one recent example. But while RT-X depends on research labs sharing what data they have to create a dataset that’s large enough to be useful, Covariant is doing it alone, thanks to its fleet of warehouse robots. “RT-X is about a million trajectories of data,” Abbeel says, “but we’re able to surpass it because we’re getting a million trajectories every few weeks.” “By building a valuable picking robot that’s deployed across 15 countries with dozens of customers, we essentially have a data collection machine.” —Pieter Abbeel, Covariant You can think of the current execution of RFM-1 as a prediction engine for suction-based object manipulation in warehouse environments. The model incorporates still images, video, joint angles, force reading, suction cup strength—everything involved in the kind of robotic manipulation that Covariant does. All of these things are interconnected within RFM-1, which means that you can put any of those things into one end of RFM-1, and out of the other end of the model will come a prediction. That prediction can be in the form of an image, a video, or a series of commands for a robot. What’s important to understand about all of this is that RFM-1 isn’t restricted to picking only things it’s seen before, or only working on robots it has direct experience with. This is what’s nice about foundation models—they can generalize within the domain of their training data, and it’s how Covariant has been able to scale their business as successfully as they have, by not having to retrain for every new picking robot or every new item. What’s counter-intuitive about these large models is that they’re actually better at dealing with new situations than models that are trained specifically for those situations. For example, let’s say you want to train a model to drive a car on a highway. The question, Abbeel says, is whether it would be worth your time to train on other kinds of driving anyway. The answer is yes, because highway driving is sometimes not highway driving. There will be accidents or rush hour traffic that will require you to drive differently. If you’ve also trained on driving on city streets, you’re effectively training on highway edge cases, which will come in handy at some point and improve performance overall. With RFM-1, it’s the same idea: Training on lots of different kinds of manipulation—different robots, different objects, and so on—means that any single kind of manipulation will be that much more capable. In the context of generalization, Covariant talks about RFM-1’s ability to “understand” its environment. This can be a tricky word with AI, but what’s relevant is to ground the meaning of “understand” in what RFM-1 is capable of. For example, you don’t need to understand physics to be able to catch a baseball, you just need to have a lot of experience catching baseballs, and that’s where RFM-1 is at. You could also reason out how to catch a baseball with no experience but an understanding of physics, and RFM-1 is not doing this, which is why I hesitate to use the word “understand” in this context. But this brings us to another interesting capability of RFM-1: it operates as a very effective, if constrained, simulation tool. As a prediction engine that outputs video, you can ask it to generate what the next couple seconds of an action sequence will look like, and it’ll give you a result that’s both realistic and accurate, being grounded in all of its data. The key here is that RFM-1 can effectively simulate objects that are challenging to simulate traditionally, like floppy things. Covariant’s Abbeel explains that the “world model” that RFM-1 bases its predictions on is effectively a learned physics engine. “Building physics engines turns out to be a very daunting task to really cover every possible thing that can happen in the world,” Abbeel says. “Once you get complicated scenarios, it becomes very inaccurate, very quickly, because people have to make all kinds of approximations to make the physics engine run on a computer. We’re just doing the large-scale data version of this with a world model, and it’s showing really good results.” Abbeel gives an example of asking a robot to simulate (or predict) what would happen if a cylinder is placed vertically on a conveyor belt. The prediction accurately shows the cylinder falling over and rolling when the belt starts to move—not because the cylinder is being simulated, but because RFM-1 has seen a lot of things being placed on a lot of conveyor belts. “Five years from now, it’s not unlikely that what we are building here will be the only type of simulator anyone will ever use.” —Pieter Abbeel, Covariant This only works if there’s the right kind of data for RFM-1 to train on, so unlike most simulation environments, it can’t currently generalize to completely new objects or situations. But Abbeel believes that with enough data, useful world simulation will be possible. “Five years from now, it’s not unlikely that what we are building here will be the only type of simulator anyone will ever use. It’s a more capable simulator than one built from the ground up with collision checking and finite elements and all that stuff. All those things are so hard to build into your physics engine in any kind of way, not to mention the renderer to make things look like they look in the real world—in some sense, we’re taking a shortcut.” RFM-1 also incorporates language data to be able to communicate more effectively with humans. Covariant For Covariant to expand the capabilities of RFM-1 towards that long-term vision of foundation models powering “billions of robots across the world,” the next step is to feed it more data from a wider variety of robots doing a wider variety of tasks. “We’ve built essentially a data ingestion engine,” Abbeel says. “If you’re willing to give us data of a different type, we’ll ingest that too.” “We have a lot of confidence that this kind of model could power all kinds of robots—maybe with more data for the types of robots and types of situations it could be used in.” —Pieter Abbeel, Covariant One way or another, that path is going to involve a heck of a lot of data, and it’s going to be data that Covariant is not currently collecting with its own fleet of warehouse manipulation robots. So if you’re, say, a humanoid robotics company, what’s your incentive to share all the data you’ve been collecting with Covariant? “The pitch is that we’ll help them get to the real world,” Covariant co-founder Peter Chen says. “I don’t think there are really that many companies that have AI to make their robots truly autonomous in a production environment. If they want AI that’s robust and powerful and can actually help them enter the real world, we are really their best bet.” Covariant’s core argument here is that while it’s certainly possible for every robotics company to train up their own models individually, the performance—for anybody trying to do manipulation, at least—would be not nearly as good as using a model that incorporates all of the manipulation data that Covariant already has within RFM-1. “It has always been our long term plan to be a robotics foundation model company,” says Chen. “There was just not sufficient data and compute and algorithms to get to this point—but building a universal AI platform for robots, that’s what Covariant has been about from the very beginning.”
-
Countdown to the 2024 IEEE Annual Election
Mar 10, 2024 11:00 AM PDTOn 1 May the IEEE Board of Directors is scheduled to announce the candidates to be placed on this year’s ballot for the annual election of officers—which begins on 15 August. The ballot includes IEEE president-elect candidates and other officer positions up for election. The Board of Directors has nominated IEEE Fellows S. K. Ramesh, Mary Ellen Randall, and John P. Verboncoeur as candidates for 2025 IEEE president-elect. Visit the IEEE elections page to learn about the candidates. The ballot includes nominees for delegate-elect/director-elect openings submitted by division and region nominating committees, IEEE Technical Activities vice president-elect, IEEE-USA president-elect, and IEEE Standards Association board of governors members-at-large. IEEE members who want to run for an office but who have not been nominated need to submit their petition intention to the IEEE Board of Directors by 15 April. Petitions should be sent to the IEEE Corporate Governance staff: elections@ieee.org. Those elected take office on 1 January 2025. To ensure voting eligibility, members are encouraged to review and update their contact information and communication preferences by 30 June. In support of IEEE’s global sustainability initiatives, electronic voting is encouraged. For more information about the offices up for election, the process of getting on the ballot, and deadlines, visit the IEEE elections page or write to elections@ieee.org.
-
Video Friday: Human to Humanoid
Mar 08, 2024 09:41 AM PSTVideo Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion. HRI 2024: 11–15 March 2024, BOULDER, COLO. Eurobot Open 2024: 8–11 May 2024, LA ROCHE-SUR-YON, FRANCE ICRA 2024: 13–17 May 2024, YOKOHAMA, JAPAN RoboCup 2024: 17–22 July 2024, EINDHOVEN, NETHERLANDS Enjoy today’s videos! We present Human to Humanoid (H2O), a reinforcement learning (RL) based framework that enables real-time, whole-body teleoperation of a full-sized humanoid robot with only an RGB camera. We successfully achieve teleoperation of dynamic, whole-body motions in real-world scenarios, including walking, back jumping, kicking, turning, waving, pushing, boxing, etc. To the best of our knowledge, this is the first demonstration to achieve learning-based, real-time, whole-body humanoid teleoperation. [ CMU ] Legged robots have the potential to traverse complex terrain and access confined spaces beyond the reach of traditional platforms thanks to their ability to carefully select footholds and flexibly adapt their body posture while walking. However, robust deployment in real-world applications is still an open challenge. In this paper, we present a method for legged locomotion control using reinforcement learning and 3D volumetric representations to enable robust and versatile locomotion in confined and unstructured environments. [ Takahiro Miki ] Sure, 3.3 meters per second is fast for a humanoid, but I’m more impressed by the spinning around while walking downstairs. [ Unitree ] Improving the safety of collaborative manipulators necessitates the reduction of inertia in the moving part. We introduce a novel approach in the form of a passive, 3D wire aligner, serving as a lightweight and low-friction power transmission mechanism, thus achieving the desired low inertia in the manipulator’s operation. [ SAQIEL ] Thanks, Temma! Robot Era just launched Humanoid-Gym, an open-source reinforcement learning framework for bipedal humanoids. As you can see from the video, RL algorithms have given the robot, called Xiao Xing, or XBot, the ability to climb up and down haphazardly stacked boxes with relative stability and ease. [ Robot Era ] “Impact-Aware Bimanual Catching of Large-Momentum Objects.” Need I say more? [ SLMC ] More than 80% of stroke survivors experience walking difficulty, significantly impacting their daily lives, independence, and overall quality of life. Now, new research from the University of Massachusetts Amherst pushes forward the bounds of stroke recovery with a unique robotic hip exoskeleton, designed as a training tool to improve walking function. This invites the possibility of new therapies that are more accessible and easier to translate from practice to daily life, compared to current rehabilitation methods. [ UMass Amherst ] Thanks, Julia! The manipulation here is pretty impressive, but it’s hard to know how impressive without also knowing how much the video was sped up. [ Somatic ] DJI drones work to make the world a better place and one of the ways that we do this is through conservation work. We partnered with Halo Robotics and the OFI Orangutan Foundation International to showcase just how these drones can make an impact. [ DJI ] The aim of the test is to demonstrate the removal and replacement of satellite modules into a 27U CubeSat format using augmented reality control of a robot. In this use case, the “client” satellite is being upgraded and refueled using modular componentry. The robot will then remove the failed computer module and place it in a fixture. It will then do the same with the propellant tank. The robot will then place these correctly back into the satellite. [ Extend Robotics ] This video features some of the highlights and favorite moments from the CYBATHLON Challenges 2024 that took place on 2 February, showing so many diverse types of assistive technology taking on discipline tasks and displaying pilots’ tenacity and determination. The Challenges saw new teams, new tasks, and new formats for many of the CYBATHLON disciplines. [ Cybathlon ] It’s been a long road to electrically powered robots. [ ABB ] Small drones for catastrophic wildfires (ones covering more than [40,470 hectares]) are like bringing a flashlight to light up a football field. This short video describes the major uses for drones of all sizes and why and when they are used, or why not. [ CRASAR ] It probably will not surprise you that there are a lot of robots involved in building Rivian trucks and vans. [ Kawasaki Robotics ] DARPA’s Learning Introspective Control (LINC) program is developing machine learning methods that show promise in making that scenario closer to reality. LINC aims to fundamentally improve the safety of mechanical systems—specifically in ground vehicles, ships, drone swarms, and robotics—using various methods that require minimal computing power. The result is an AI-powered controller the size of a cell phone. [ DARPA ]
-
What U.S. Members Think About Regulating AI
Mar 07, 2024 11:00 AM PSTWith the rapid proliferation of AI systems, public policymakers and industry leaders are calling for clearer guidance on governing the technology. The majority of U.S. IEEE members express that the current regulatory approach to managing artificial intelligence (AI) systems is inadequate. They also say that prioritizing AI governance should be a matter of public policy, equal to issues such as health care, education, immigration, and the environment. That’s according to the results of a survey conducted by IEEE for the IEEE-USA AI Policy Committee. We serve as chairs of the AI Policy Committee, and know that IEEE’s members are a crucial, invaluable resource for informed insights into the technology. To guide our public policy advocacy work in Washington, D.C., and to better understand opinions about the governance of AI systems in the U.S., IEEE surveyed a random sampling of 9,000 active IEEE-USA members plus 888 active members working on AI and neural networks. The survey intentionally did not define the term AI. Instead, it asked respondents to use their own interpretation of the technology when answering. The results demonstrated that, even among IEEE’s membership, there is no clear consensus on a definition of AI. Significant variances exist in how members think of AI systems, and this lack of convergence has public policy repercussions. Overall, members were asked their opinion on how to govern the use of algorithms in consequential decision-making and on data privacy, and whether the U.S. government should increase its workforce capacity and expertise in AI. The state of AI governance For years, IEEE-USA has been advocating for strong governance to control AI’s impact on society. It is apparent that U.S. public policy makers struggle with regulation of the data that drives AI systems. Existing federal laws protect certain types of health and financial data, but Congress has yet to pass legislation that would implement a national data privacy standard, despite numerous attempts to do so. Data protections for Americans are piecemeal, and compliance with the complex federal and state data privacy laws can be costly for industry. Numerous U.S. policymakers have espoused that governance of AI cannot happen without a national data privacy law that provides standards and technical guardrails around data collection and use, particularly in the commercially available information market. The data is a critical resource for third-party large-language models, which use it to train AI tools and generate content. As the U.S. government has acknowledged, the commercially available information market allows any buyer to obtain hordes of data about individuals and groups, including details otherwise protected under the law. The issue raises significant privacy and civil liberties concerns. Regulating data privacy, it turns out, is an area where IEEE members have strong and clear consensus views. Survey takeaways The majority of respondents—about 70 percent—said the current regulatory approach is inadequate. Individual responses tell us more. To provide context, we have broken down the results into four areas of discussion: governance of AI-related public policies; risk and responsibility; trust; and comparative perspectives. Governance of AI as public policy Although there are divergent opinions around aspects of AI governance, what stands out is the consensus around regulation of AI in specific cases. More than 93 percent of respondents support protecting individual data privacy and favor regulation to address AI-generated misinformation. About 84 percent support requiring risk assessments for medium- and high-risk AI products. Eighty percent called for placing transparency or explainability requirements on AI systems, and 78 percent called for restrictions on autonomous weapon systems. More than 72 percent of members support policies that restrict or govern the use of facial recognition in certain contexts, and nearly 68 percent support policies that regulate the use of algorithms in consequential decisions. There was strong agreement among respondents around prioritizing AI governance as a matter of public policy. Two-thirds said the technology should be given at least equal priority as other areas within the government’s purview, such as health care, education, immigration, and the environment. Eighty percent support the development and use of AI, and more than 85 percent say it needs to be carefully managed, but respondents disagreed as to how and by whom such management should be undertaken. While only a little more than half of the respondents said the government should regulate AI, this data point should be juxtaposed with the majority’s clear support of government regulation in specific areas or use case scenarios. Only a very small percentage of non-AI focused computer scientists and software engineers thought private companies should self-regulate AI with minimal government oversight. In contrast, almost half of AI professionals prefer government monitoring. More than three quarters of IEEE members support the idea that governing bodies of all types should be doing more to govern AI’s impacts. Risk and responsibility A number of the survey questions asked about the perception of AI risk. Nearly 83 percent of members said the public is inadequately informed about AI. Over half agree that AI’s benefits outweigh its risks. In terms of responsibility and liability for AI systems, a little more than half said the developers should bear the primary responsibility for ensuring that the systems are safe and effective. About a third said the government should bear the responsibility. Trusted organizations Respondents ranked academic institutions, nonprofits and small and midsize technology companies as the most trusted entities for responsible design, development, and deployment. The three least trusted factions are large technology companies, international organizations, and governments. The entities most trusted to manage or govern AI responsibly are academic institutions and independent third-party institutions. The least trusted are large technology companies and international organizations. Comparative perspectives Members demonstrated a strong preference for regulating AI to mitigate social and ethical risks, with 80 percent of non-AI science and engineering professionals and 72 percent of AI workers supporting the view. Almost 30 percent of professionals working in AI express that regulation might stifle innovation, compared with about 19 percent of their non-AI counterparts. A majority across all groups agree that it’s crucial to start regulating AI, rather than waiting, with 70 percent of non-AI professionals and 62 percent of AI workers supporting immediate regulation. A significant majority of the respondents acknowledged the social and ethical risks of AI, emphasizing the need for responsible innovation. Over half of AI professionals are inclined toward nonbinding regulatory tools such as standards. About half of non-AI professionals favor specific government rules. A mixed governance approach The survey establishes that a majority of U.S.-based IEEE members support AI development and strongly advocate for its careful management. The results will guide IEEE-USA in working with Congress and the White House. Respondents acknowledge the benefits of AI, but they expressed concerns about its societal impacts, such as inequality and misinformation. Trust in entities responsible for AI’s creation and management varies greatly; academic institutions are considered the most trustworthy entities. A notable minority oppose government involvement, preferring non regulatory guidelines and standards, but the numbers should not be viewed in isolation. Although conceptually there are mixed attitudes toward government regulation, there is an overwhelming consensus for prompt regulation in specific scenarios such as data privacy, the use of algorithms in consequential decision-making, facial recognition, and autonomous weapons systems. Overall, there is a preference for a mixed governance approach, using laws, regulations, and technical and industry standards.
-
Wireless Channel Modeling for Dynamic Terrestrial Environments
Mar 07, 2024 03:00 AM PSTAs wireless systems become complex and reach for more spectrum, RF engineers must rely on high-fidelity simulation solutions to model and test their proposed new networks effectively. We offer tools to address these challenges and enable network architects and mission planners to digitally model and simulate dynamic wireless networks within an accurate systems simulation environment. Leveraging solutions for electromagnetic wave propagation, electronically steered antenna design tools, and a digital mission simulation, engineers can rapidly deploy models and execute them within a high-fidelity, physics-accurate digital testing environment. Engineers will understand the impacts of terrain, urban landscapes, and the dynamic kinematic motions across any number of simulated scenarios needed to test anticipated wireless network performance against design requirements thoroughly. Register now to attend this free webinar! Ansys is combining these industry-leading modeling and simulation tools to provide a workflow-driven solution to these unique needs and challenges. Join us to learn more about the new Ansys RF Channel Modeler and how it can help you and your organization leverage digital modeling and simulation tools like never before. What you will learn: How high-fidelity simulation solutions effectively model and test proposed new networks How to enable network architects and mission planners to digitally model and simulate dynamic wireless networks within an accurate systems simulation environment How the impacts of terrain, urban landscapes, and the dynamic kinematic motions across any number of simulated scenarios are needed to test anticipated wireless network performance against design requirements
-
Multiphysics Modeling of Electrical Motors
Mar 06, 2024 11:02 AM PSTTo reduce global warming and the associated effects, the transportation and energy sectors are adopting measures to make different applications potentially fossil free. This has led to a surge in demand for electric machines and the related design and development efforts. The designs of these electrical machines need to meet various specifications, including efficiency and power density requirements. A multiphysics-based simulation and modeling approach plays a critical role in accomplishing the design needs and significantly reducing the lead time to market. The COMSOL Multiphysics software and its add-on modules provide the capability needed to model the multiphysics phenomena involved in electrical motors, including electromagnetics, thermal, structural mechanics, and fluid flow. The most common motor types, synchronous permanent magnet and asynchronous motors, as well as more recently researched alternatives such as synchronous reluctance or axial flux motors, can be modeled and optimized in COMSOL Multiphysics®. In this webinar, we will demonstrate the capabilities of the software for electrical motor modeling and the optimization techniques available for accelerating product development time. Register now to attend this free webinar!
-
India Injects $15 Billion Into Semiconductors
Mar 06, 2024 08:53 AM PSTThe government of India has approved a major investment in semiconductor and electronics production that will include the country’s first state-of-the-art semiconductor fab. It announced that three plants—one semiconductor fab and two packaging and test facilities—will break ground within 100 days. The government has approved 1.26 trillion Indian rupees (US $15.2 billion) for the projects. India’s is the latest in a string of efforts to boost domestic chip manufacturing in the hope of making nations and regions more independent in what’s seen as a strategically critical industry. “On one end India has a large and growing domestic demand and on the other end global customers are looking at India for supply-chain resilience,” Frank Hong, chairman of Taiwan-based foundry Powerchip Semiconductor (PSMC), a partner in the new fab, said in a press release. “There could not have been a better time for India to make its entry into the semiconductor manufacturing industry.” The country’s first fab will be an $11 billion joint venture between PSMC and Tata Electronics, a branch of the $370 billion Indian conglomerate. Through the partnership, it will be capable of 28-, 40-, 55-, and 110-nanometer chip production, with a capacity of 50,000 wafers per month. Far from the cutting edge, these technology nodes nevertheless are used in the bulk of chipmaking, with 28 nm being the most advanced node using planar CMOS transistors instead of the more advanced FinFET devices. “The announcement is clear progress toward creating a semiconductor manufacturing presence in India,” says Rakesh Kumar, a professor of electrical and computer engineering at University of Illinois Urbana-Champaign and author of Reluctant Technophiles: India’s Complicated Relationship with Technology. “The choice of 28-nm, 40-nm, 55-nm, 90-nm, and 110-nm also seems sensible, since it limits the cost to the government and the players, who are taking a clear risk.” According to Tata, the fab will make chips for applications such as power management, display drivers, microcontrollers, as well as and high-performance computing logic. Both the fab’s technological capability and target applications point toward products that were at the heart of the pandemic-era chip shortage. The fab is in a new industrial zone in Dholera, in Gujarat, Prime Minister Narendra Modhi’s home state. Tata projects it will directly or indirectly lead to more than 20,000 skilled jobs in the region. Chip Packaging Push In addition to the chip fab, the government approved investments in two assembly, test, and packaging facilities, a sector of the semiconductor industry currently concentrated in Southeast Asia. Tata Electronics will build a $3.25 billion plant at Jagiroad, in the eastern state of Assam. The company says it will offer a range of packaging technologies: wire bond and flip-chip, as well as system-in-package. It plans to expand into advanced packaging tech “in the future.” Advanced packaging, such as 3D integration, has emerged as a critical technology as the traditional transistor scaling of Moore’s Law has slowed and become increasingly expensive. Tata plans to start production at Jagiroad in 2025, and it predicts the plant will add 27,000 direct and indirect jobs to the local economy. A joint venture between Japanese microcontroller giant Renesas, Thai chip packaging company Stars Microelectronics, and India’s CG Power and Industrial Solutions will build a $900 million packaging plant in Sanand, Gujarat. The plant will offer wire-bond and flip-chip technologies. CG, which will own 92 percent of the venture, is a Mumbai-based appliances and industrial motors and electronics firm. There’s already a chip-packaging plant in the works in Sanand from a previous agreement. U.S.-based memory and storage maker Micron agreed last June to build a packaging and test facility there. Micron plans to spend $825 million in two phases on the plant. Gujarat and the Indian federal government is set to cover a further $1.925 billion. Micron expects the first phase to be operational by the end of 2024. Generous Incentives After an initial overture failed to attract chip companies, the government upped its ante. According to Stephen Ezzell at the Washington, D.C.–based policy-research organization the Information Technology and Innovation Foundation (IT&IF), India’s semiconductor incentives are now among the most attractive in the world. In a report issued two weeks before the India fab announcement, Ezzell explained that for an approved silicon fab worth at least $2.5 billion and making 40,000 wafer starts per month the federal government will reimburse 50 percent of the fab cost with a state partner expected to add 20 percent. For a chip fab making smaller-volume products, such as sensors, silicon photonics, or compound semiconductors, the same formula holds, except that the minimum investment is $13 million. For a test and packaging facility, it’s just $6.5 million. India is a rapidly growing consumer of semiconductors. Its market was worth $22 billion in 2019 and is expected to nearly triple to $64 billion by 2026, according to Counterpoint Technology Market Research. The country’s minister of state for IT and electronics, Rajeev Chandrasekhar projects further growth to $110 billion by 2030. At that point, it would account for 10 percent of global consumption, according to the IT&IF report. About 20 percent of the world’s semiconductor design engineers are in India, according to the IT&IF report. And between March 2019 and 2023 semiconductor job openings in the country increased 7 percent. The hope is that the investment will be a draw for new engineering students. “I think it is a big boost for the Indian semiconductor industry and will benefit not just students but the entire academic system in India,” says Saurabh N. Mehta, a professor and chief academic officer at Vidyalankar Institute of Technology, in Mumbai. “It will boost many startups, jobs, and product-development initiatives, especially in the defense and power sectors. Many talented students will join the electronics and allied courses, making India the next semiconductor hub.”
-
AI Prompt Engineering Is Dead
Mar 06, 2024 07:07 AM PSTSince ChatGPT dropped in the fall of 2022, everyone and their donkey has tried their hand at prompt engineering—finding a clever way to phrase your query to a large language model (LLM) or AI art or video generator to get the best results or sidestep protections. The Internet is replete with prompt-engineering guides, cheat sheets, and advice threads to help you get the most out of an LLM. In the commercial sector, companies are now wrangling LLMs to build product copilots, automate tedious work, create personal assistants, and more, says Austin Henley, a former Microsoft employee who conducted a series of interviews with people developing LLM-powered copilots. “Every business is trying to use it for virtually every use case that they can imagine,” Henley says. “The only real trend may be no trend. What’s best for any given model, dataset, and prompting strategy is likely to be specific to the particular combination at hand.” —Rick Battle & Teja Gollapudi, VMware To do so, they’ve enlisted the help of prompt engineers professionally. However, new research suggests that prompt engineering is best done by the model itself, and not by a human engineer. This has cast doubt on prompt engineering’s future—and increased suspicions that a fair portion of prompt-engineering jobs may be a passing fad, at least as the field is currently imagined. Autotuned prompts are successful and strange Rick Battle and Teja Gollapudi at California-based cloud computing company VMware were perplexed by how finicky and unpredictable LLM performance was in response to weird prompting techniques. For example, people have found that asking models to explain its reasoning step-by-step—a technique called chain-of-thought—improved their performance on a range of math and logic questions. Even weirder, Battle found that giving a model positive prompts, such as “this will be fun” or “you are as smart as chatGPT,” sometimes improved performance. Battle and Gollapudi decided to systematically test how different prompt-engineering strategies impact an LLM’s ability to solve grade-school math questions. They tested three different open-source language models with 60 different prompt combinations each. What they found was a surprising lack of consistency. Even chain-of-thought prompting sometimes helped and other times hurt performance. “The only real trend may be no trend,” they write. “What’s best for any given model, dataset, and prompting strategy is likely to be specific to the particular combination at hand.” According to one research team, no human should manually optimize prompts ever again. There is an alternative to the trial-and-error-style prompt engineering that yielded such inconsistent results: Ask the language model to devise its own optimal prompt. Recently, new tools have been developed to automate this process. Given a few examples and a quantitative success metric, these tools will iteratively find the optimal phrase to feed into the LLM. Battle and his collaborators found that in almost every case, this automatically generated prompt did better than the best prompt found through trial-and-error. And, the process was much faster, a couple of hours rather than several days of searching. The optimal prompts the algorithm spit out were so bizarre, no human is likely to have ever come up with them. “I literally could not believe some of the stuff that it generated,” Battle says. In one instance, the prompt was just an extended Star Trek reference: “Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation.” Apparently, thinking it was Captain Kirk helped this particular LLM do better on grade-school math questions. Battle says that optimizing the prompts algorithmically fundamentally makes sense given what language models really are—models. “A lot of people anthropomorphize these things because they ‘speak English.’ No, they don’t,” Battle says. “It doesn’t speak English. It does a lot of math.” In fact, in light of his team’s results, Battle says no human should manually optimize prompts ever again. “You’re just sitting there trying to figure out what special magic combination of words will give you the best possible performance for your task,” Battle says, “But that’s where hopefully this research will come in and say ‘don’t bother.’ Just develop a scoring metric so that the system itself can tell whether one prompt is better than another, and then just let the model optimize itself.” Autotuned prompts make pictures prettier, too Image-generation algorithms can benefit from automatically generated prompts as well. Recently, a team at Intel labs, led by Vasudev Lal, set out on a similar quest to optimize prompts for the image-generation model Stable Diffusion. “It seems more like a bug of LLMs and diffusion models, not a feature, that you have to do this expert prompt engineering,” Lal says. “So, we wanted to see if we can automate this kind of prompt engineering.” “Now we have this full machinery, the full loop that’s completed with this reinforcement learning.… This is why we are able to outperform human prompt engineering.” —Vasudev Lal, Intel Labs Lal’s team created a tool called NeuroPrompts that takes a simple input prompt, such as “boy on a horse,” and automatically enhances it to produce a better picture. To do this, they started with a range of prompts generated by human prompt-engineering experts. They then trained a language model to transform simple prompts into these expert-level prompts. On top of that, they used reinforcement learning to optimize these prompts to create more aesthetically pleasing images, as rated by yet another machine-learning model, PickScore, a recently developed image-evaluation tool. NeuroPrompts is a generative AI auto prompt tuner that transforms simple prompts into more detailed and visually stunning StableDiffusion results—as in this case, an image generated by a generic prompt [left] versus its equivalent NeuroPrompt-generated image.Intel Labs/Stable Diffusion Here too, the automatically generated prompts did better than the expert-human prompts they used as a starting point, at least according to the PickScore metric. Lal found this unsurprising. “Humans will only do it with trial and error,” Lal says. “But now we have this full machinery, the full loop that’s completed with this reinforcement learning.… This is why we are able to outperform human prompt engineering.” Since aesthetic quality is infamously subjective, Lal and his team wanted to give the user some control over how the prompt was optimized. In their tool, the user can specify the original prompt (say, “boy on a horse”) as well as an artist to emulate, a style, a format, and other modifiers. Lal believes that as generative AI models evolve, be it image generators or large language models, the weird quirks of prompt dependence should go away. “I think it’s important that these kinds of optimizations are investigated and then ultimately, they’re really incorporated into the base model itself so that you don’t really need a complicated prompt-engineering step.” Prompt engineering will live on, by some name Even if autotuning prompts becomes the industry norm, prompt-engineering jobs in some form are not going away, says Tim Cramer, senior vice president of software engineering at Red Hat. Adapting generative AI for industry needs is a complicated, multistage endeavor that will continue requiring humans in the loop for the foreseeable future. “Maybe we’re calling them prompt engineers today. But I think the nature of that interaction will just keep on changing as AI models also keep changing.” —Vasudev Lal, Intel Labs “I think there are going to be prompt engineers for quite some time, and data scientists,” Cramer says. “It’s not just asking questions of the LLM and making sure that the answer looks good. But there’s a raft of things that prompt engineers really need to be able to do.” “It’s very easy to make a prototype,” Henley says. “It’s very hard to production-ize it.” Prompt engineering seems like a big piece of the puzzle when you’re building a prototype, Henley says, but many other considerations come into play when you’re making a commercial-grade product. Challenges of making a commercial product include ensuring reliability—for example, failing gracefully when the model goes offline; adapting the model’s output to the appropriate format, since many use cases require outputs other than text; testing to make sure the AI assistant won’t do something harmful in even a small number of cases; and ensuring safety, privacy, and compliance. Testing and compliance are particularly difficult, Henley says, as traditional software-development testing strategies are maladapted for nondeterministic LLMs. To fulfill these myriad tasks, many large companies are heralding a new job title: Large Language Model Operations, or LLMOps, which includes prompt engineering in its life cycle but also entails all the other tasks needed to deploy the product. Henley says LLMOps’ predecessors, machine learning operations (MLOps) engineers, are best positioned to take on these jobs. Whether the job titles will be “prompt engineer,” “LLMOps engineer,” or something new entirely, the nature of the job will continue evolving quickly. “Maybe we’re calling them prompt engineers today,” Lal says, “But I think the nature of that interaction will just keep on changing as AI models also keep changing.” “I don’t know if we’re going to combine it with another sort of job category or job role,” Cramer says, “But I don’t think that these things are going to be going away anytime soon. And the landscape is just too crazy right now. Everything’s changing so much. We’re not going to figure it all out in a few months.” Henley says that, to some extent in this early phase of the field, the only overriding rule seems to be the absence of rules. “It’s kind of the Wild, Wild West for this right now.” he says.
-
Lean Software, Power Electronics, and the Return of Optical Storage
Mar 06, 2024 02:00 AM PSTStephen Cass: Hi. I’m Stephen Cass, a senior editor at IEEE Spectrum. And welcome to Fixing The Future, our bi-weekly podcast that focuses on concrete solutions to hard problems. Before we start, I want to tell you that you can get the latest coverage from some of Spectrum‘s most important beats, including AI, climate change, and robotics, by signing up for one of our free newsletters. Just go to spectrum.ieee.org/newsletters to subscribe. Today on Fixing The Future, we’re doing something a little different. Normally, we deep dive into exploring one topic, but that does mean that some really interesting things get left out for the podcast simply because they wouldn’t take up a whole episode. So here today to talk about some of those interesting things, I have Spectrum‘s Editor in Chief Harry Goldstein. Hi, boss. Welcome to the show. Harry Goldstein: Hi there, Stephen. Happy to be here. Cass: You look thrilled. Goldstein: I mean, I am thrilled. I’m always excited to talk about Spectrum stories. Cass: No, we’ve tied you down and made you agree to this, but I think it’ll be fun. So first up, I’d like to talk about this guest post we had from Bert Hubert which seemed to really strike a chord with readers. It was called Why Bloat Is Still Software’s Biggest Vulnerability: A 2024 plea for lean software. Why do you think this one resonated with readers, and why is it so important? Goldstein: I think it resonated with readers because software is everywhere. It’s ubiquitous. The entire world is essentially run on software. A few days ago, even, there was a good example of the AT&T network going down likely because of some kind of software misconfiguration. This happens constantly. In fact, it’s kind of like bad weather, the software systems going down. You just come to expect it, and we all live with it. But why we live with it and why we’re forced to live with it is something that people are interested in finding out more, I guess. Cass: So I think, in the past, when we associated giant bloated software, we had associated with large projects, these big government projects, these big airlines, big, big, big projects. And we’ve written about that a lot at Spectrum before, haven’t we? Goldstein: We certainly have. And Bob Charette, our longtime contributing editor, who is actually the father of lean software, back in the early ‘90s took the Toyota Total Quality Management program and applied it to software development. And so it was pretty interesting to see Hubert’s piece on this more than 30 years later where the problems have just proliferated. And think about your average car these days. It’s approaching a couple hundred million lines of code. A glitch in any of those could cause some kind of safety problem. Recalls are pretty common. I think Toyota had one a few months ago. So the problem is everywhere, and it’s just going to get worse. Cass: Yeah. One of the things that struck me was that Bert’s making the argument that you don’t actually need now an army of programmers to create bloated software— to get all those millions of lines of code. You could be just writing a code to open a garage door. This is a trivial program. Because of the way you’re writing it on frameworks, and those are pulling in dependencies and so on, you’re pulling in just millions of lines of other people’s code. You might not even know you’re doing it. And you kind of don’t notice unless, at the end of the day, you look at your final program file and you’re like, “Oh, why is that megabytes upon megabytes?” which represents endless lines of source code. Why is that so big? Because this is how you do software. You just pull these things together. You glue stuff. You focus on the business logic because that’s your value add, but you’re not paying attention to this enormous sort of—I don’t know; what would you call it?—invisible dark matter that surrounds your software. Goldstein: Right. It’s kind of like dark matter. Yeah, that’s kind of true. I mean, it actually started making me think. All of these large language models that are being applied to software development. Co-piloting, I guess they call it, right, where the coder is sitting with an AI, trying to write better code. Do you think that might solve the problem or get us closer? Cass: No, because I think those systems, if you look at them, they reflect modern programming usage. And modern programming usage is often to use the frameworks that are available. It’s not about really getting in and writing something that’s a little bit leaner. Actually, I think the Ais—it’s not their fault—they just do what we do. And we write bloaty softwares. So I think that’s not going to get any better necessarily with this AI stuff because the point of lean software is it does take extra time to make, and there are no incentives to make lean software. And Bert talks about, “Maybe we’re going to have to impose some of this legis— l e g i s l a tively.”—I speak good. I editor. You hire wise.—But some of these things are going to have to be mandated through standards and regulations, and specifically through the lens of these cybersecurity requirements and knowing what’s going into your software. And that may help with all just getting a little bit leaner. But I did actually want to— another news story that came up this week was Apple closing down its EV division. And you mentioned Bob Charette there. And he wrote this great thing for us recently about why EV cars are one thing and EV infrastructure is an even bigger problem and why EVs are proving to be really quite tough. And maybe the problem— again, it’s a dark matter problem, not so much the car at the center, but this sort of infrastructure— just talk a little bit about Bob’s book, which is, by the way, free to download, and we’ll have the link in the show notes. Goldstein: Everything you need to know about the EV transition can be yours for the low, low price of free. But, yeah. And I think we’re starting to see-- I mean, even if you mandate things, you’re going to-- you were talking about legislation to regulate software bloat. Cass: Well, it’s kind of indirect. If you want to have good security, then you’re going to have to do certain things. The White House just came out with this paper, I think yesterday or the day before, saying, “Okay, you need to start using memory-safe languages.” And it’s not quite saying, “You are forbidden from using C, and you must use Rust,” but it’s kind of close to that for certain applications. They exempted certain areas. But you can see, that is the government really coming in and, actually, what has often been a very personal decision of programmers, like, “What language do I use?” and, “I know how to use C. I know how to do garbage collection,” the government kind of saying, “Yeah, we don’t care how great a programmer you think you are. These programs lead to this class of bugs, and we’d really prefer if you used one of these memory-safe languages.” And that’s, I guess, a push into sort of the private lives of programmers that I think we’re going to see more of as time goes by. Goldstein: Oh, that’s interesting because the—I mean, where I was going with that connection to legislation is that—I think what Bob found in the EV transition is that the knowledge base of the people who are charged with making decisions about regulations is pretty small. They don’t really understand the technology. They certainly don’t understand the interdependencies, which are very similar to the software development processes you were just referring to. It’s very similar to the infrastructure for electric cars because the idea, ultimately, for electric cars is that you also are revamping your grid to facilitate, whatchamacallit, intermittent renewable energy sources, like wind and solar, because having an electric car that runs off a coal-fired power plant is defeating the purpose, essentially. In fact, Ozzie Zehner wrote an article for us way back in the mid-Teens about the— the dirty secret behind your electric car is the coal that fuels it. And— Cass: Oh, that was quite controversial. Yeah. I think maybe because the cover was a car perched at the top of a giant mountain of coal. I think that— Goldstein: But it’s true. I mean, in China, they have one of the biggest electric car industries in the world, if not the biggest, and one of the biggest markets that has not been totally saturated by personal vehicles, and all their cars are going to be running on coal. And they’re the world’s second-largest emitter behind the US. But just circling back to the legislative angle and the state of the electric vehicle industry-- well, actually, are we just getting way off topic with the electric vehicles? Cass: No, it is this idea of interdependence and these very systems that are all coupled in all kinds of ways we don’t expect. And with that EV story— so last time I was home in Ireland, one of the stories was— so they had bought this fleet of buses to put in Dublin to replace these double-decker buses, electric double-deck, to help Ireland hit its carbon targets. So this was an official government goal. We bought the buses, great expense purchasing the buses, and then they can’t charge the buses because they haven’t already done the planning permission to get the charging stations added into the bus depot, which just was this staggering level of interconnect whereas, one hand, the national government is very— “Yes, meeting our target goals. We’re getting these green buses in. Fantastic advance. Very proud of it,” la la la la, and you can’t plug the things in because just the basic work on the ground and dealing with the local government has not been there to put in the charging stations. All of these little disconnects add up. And the bigger, the more complex system you have, the more these things add up, which I think does come back to lean software. Because it’s not so much, “Okay. Yeah, your software is bloaty.” Okay, you don’t win the Turing Prize. Boo-hoo. Okay. But the problem is that because you are pulling all of these dependencies that you just do not know and all these places where things break— or the problem of libraries getting hijacked. So we have to retain the capacity on some level— and this actually is a personal thing with me, is that I believe in the concept of personal computing. And this was the thing back in the 1970s when personal computers first came out, which the idea was it would— it was very explicitly part of the culture that you would free yourself from the utilities and the centralized systems and you could have a computer on your desk that will let you do stuff, that you didn’t have to go through, at that stage, university administrators and paperwork and you could— it was a personal computer revolution. It was very much front and center. And nowadays it’s kind of come back full circle because now we’re increasingly finding things don’t work if they’re not network connected. So I believe it should be possible to have machines that operate independently, truly personal machines. I believe it should be possible to write software to do even complicated things without relying on network servers or vast downloads or, again, the situation where you want it to run independently, okay, but you’ve got to download these Docker images that are 350 megabytes or something because an entire operating system has to be bundled into them because it is impossible to otherwise replicate the correct environment in which software is running, which also undercuts the whole point of open source software. The point of open source is, if I don’t like something, I can change it. But if it’s so hard for me to change something because I have to replicate the exact environment and toolchains that people on a particular project are using, it really limits the ability of me to come in and maybe— maybe I just want to make some small changes, or I just want to modify something, or I want to pull it into my project. That I have to bring this whole trail of dependencies with me is really tough. Sorry, that’s my rant. Goldstein: Right. Yeah. Yeah. Actually, one of the things I learned the most about from the Hubert piece was Docker and the idea that you have to put your program in a container that carries with it an entire operating system or whatever. Can you tell me more about containers? Cass: Yeah. Yeah. Yeah. I mean, you can put whatever you want into a container, and some containers are very small. It distributes its own thing. You can get very lean containers that is just basically the program and the install. But it basically replaces the old idea of installing software, where you’d— and that was a problem, because every time you installed a bit of software, it scarred your system in some way. There was always scar tissue because it made changes. It nestled in. If nothing else, it put files onto your disk. And so over time, one of the problems was that this then meant that your computer would accumulate random files. It was very hard to really uninstall something completely because it’d always put little hooks and would register itself in a different place in the operating system, again, because now it’s interoperating with a whole bunch of stuff. Programs are not completely standalone. At the very least, they’re talking to an operating system. You want it to talk nicely to other programs in the operating system. And this led to all these kind of direct install problems. And so the idea was, “Oh, we will sandbox this out. We’ll have these little Docker images, basically, to do it,” but that does give you the freedom whereby you can build these huge images, which are essentially virtual machines running away. So, again, it relieves the process of having to figure out your install and your configuration, which is one thing he was talking about. When you had to do these installers, it did really make you clarify your thinking very sharply on configuration and so on. So again, containers are great. All these cloud technologies, being able to use libraries, being able to automatically pull in dependencies, they’re all terrific in moderation. They all solve very real problems. I don’t want to be a Luddite and go, “We should go back to writing assembler code as God intended.” That’s not what I’m saying, but we do sometimes have to look at— it does sometimes enable bad habits. It can incentivize bad habits. And you have to really then think very deliberately about how to combat those problems as they pop up. Goldstein: But from the beginning, right? I mean, it seems to me like you have to commit to a lean methodology at the start of any project. It’s not something that the AI is going to come in and magically solve and slim down at the end. Cass: No, I agree. Yeah, you have to commit to it, or you have to commit to frameworks where— I’m not going to necessarily use these frameworks. I’m going to go and try and do some of this myself, or I’m going to be very careful in how I look at my frameworks, like what libraries I’m going to use. I’m going to use maybe a library that doesn’t pull in other dependencies. This guy maybe wrote this library which has got 80 percent of what I need it to do, but it doesn’t pull in libraries, unlike the bells and whistles thing which actually does 400 percent of what I need it to do. And maybe I might write that extra 20 percent. And again, it requires skill and it requires time. And it’s like anything else. There are just incentives in the world that really tend to sort of militate against having the time to do that, which, again, is where we start coming back into some of these regulatory regimes where it becomes a compliance requirement. And I think a lot of people listening will know that time when things get done is when things become compliance requirements, and then it’s mandatory. And that has its own set of issues with it in terms of losing a certain amount of flexibility and so on, but that sometimes seems to be the only way to get things done in commercial environments certainly. Not in terms of personal projects, but certainly for commercial environments. Goldstein: So what are the consequences, in a commercial environment, of bloat, besides— are there things beyond security? Here’s why I’m asking, because the idea that you’re going to legislate lean software into the world as opposed to having it come from the bottom up where people are recognizing the need because it’s costing them something—so what are the commercial costs to bloated software? Cass: Well, apparently, absolutely none. That really is the issue. Really, none, because software often isn’t maintained. People just really want to get their products out. They want to move very quickly. We see this when it comes to— they like to abandon old software very quickly. Some companies like to abandon old products as soon as the new one comes out. There really is no commercial downside to using this big software because you can always say, “Well, it’s industry standard. Everybody is doing it.” Because everybody’s doing it. You’re not necessarily losing out to your competitor. We see these massive security breaches. And again, the legislating for lean software is through demanding better security. Because currently, we see these huge security breaches, and there’s very minimal consequences. Occasionally, yes, a company screws up so badly that it goes down. But even so, sometimes they’ll reemerge in a different form, or they’ll get gobbled up in someone. There really does not, at the moment, seem to be any commercial downside for this big software, in the same way that— there are a lot of weird incentives in the system, and this certainly is one of them where, actually, the incentive is, “Just use all the frameworks. Bolt everything together. Use JS Electron. Use all the libraries. Doesn’t matter because the end user is not really going to notice very much if their program is 10 megabytes versus 350 megabytes,” especially now when people are completely immune to the size of their software. Back in the days when software came on floppy disk, if you had a piece of software that came on 100 floppy disks, that would be considered impractical. But nowadays, people are downloading gigabytes of data just to watch a movie or something like this. If a program is 1 gigabyte versus 100 megabytes, they don’t really notice. I mean, the only people who notice is if, say, video games— a really big video game. And then you see people going, “Well, it took me three hours to download the 70 gigabytes for this AAA game that I wanted to play.” That’s about the only time you see people complaining about the actual storage size of software anymore, but everybody else, they just don’t care. Yeah, it’s just invisible to them now. Goldstein: And that’s a good thing. I think Charles Choi had a piece for us on-- we’ll have endless storage, right, on disks, apparently. Cass: Oh, I love this story because it’s another story of a technology that looks like it’s headed off into the sunset, “We’ll see you in the museum.” And this is optical disk technology. I love this story and the idea that you can— we had laser disks. We had CDs. We had CD-ROMs. We had DVD. We had Blu-ray. And Blu-ray really seemed to be in many ways the end of the line for optical disks, that after that, we’re just going to use solid-state storage devices, and we’ll store all our data in those tiny little memory cells. And now we have these researchers coming back. And now my brain has frozen for a second on where they’re from. I think they’re from Shanghai. Is it Shanghai Institute? Goldstein: Yes, I think so. Cass: Yes, Shanghai. There we go. There we go. Very nice subtle check of the website there. And it might let us squeeze this data center into something the size of a room. And this is this optical disk technology where you can make a disk that’s about the size of just a regular DVD. And you can squeeze just enormous amount of data. I think he’s talking about petabits in a— Goldstein: Yeah, like 1.6 petabits on-- Cass: Petabits on this optical surface. And the magic key is, as always, a new material. I mean, we do love new materials because they’re always the wellspring from which so much springs. And we have at Spectrum many times chased down materials that have not fulfilled necessarily their promise. We have a long history— and sometimes materials go away and they come back, like— Goldstein: They come back, like graphene. It’s gone away. It’s come back. Cass: —graphene and stuff like this. We’re always looking for the new magic material. But this new magic material, which has this— Goldstein: Oh, yeah. Oh, I looked this one up, Stephen. Cass: What is it? What is it? What is it? It is called-- Goldstein: Actually, our story did not even bother to include the translation because it’s so botched. But it is A-I-E, dash, D-D-P-R, AIE-DDPR or aggregation-induced emission dye-doped photoresist. Cass: Okay. Well, let’s just call it magic new dye-doped photoresist. And the point about this is that this material works at basically four wavelengths. And why you want a material that responds at four different wavelengths? Because the limit on optical technologies— and I’m also stretching here into the boundaries on either side of optical. The standard rule is you can’t really do anything that’s smaller than the wavelength of the light you’re using to read or write. So the length of your laser sets the density of data on your disk. And what these clever clogs have done is they’ve worked out that by using basically two lasers at once, you can, in a very clever way, write a blob that is smaller than the wavelength of light, and you can do it in multiple layers. So usually, your standard Blu-ray disk, they’re very limited in the number of layers they have on them, like CDs originally, one layer. So you have multiple layers on this disk that you can write to, and you can write at resolutions that you wouldn’t think you could do if you were just doing— from your high school physics or whatever. So you write it using these two lasers of two wavelengths, and then you read it back using another two lasers at two different wavelengths. And this all localizes and makes it work. And suddenly, as I say, you can squeeze racks and racks and racks of solid-state storage down to hopefully something that is very small. And what’s also interesting is that they’re actually closer to commercialization than you normally see with these early material stories. And they also think you could write one of these disks in six minutes, which is pretty impressive. As someone who stood and has sat watching the progress bars on a lot of DVD-ROMs burn over the years back in the day, six minutes to burn these—that’s probably for commercial mass production—is still pretty impressive. And so you could solve this problem of some of these large data transfers we get where currently you do have to ship servers from one side of the world to the other because it actually is too slow to copy things over the internet. And so this would increase the bandwidth of sort of the global sneakernet or station wagon net quite dramatically as well. Goldstein: Yeah. They are super interested in seeing them deployed in big data centers. And in order for them to do that, they still have to get the writing speed up and the energy consumption down. So the real engineering is just beginning for this. Well, speaking of new materials, there’s a new use for aluminum nitride according to our colleague Glenn Zorpette who wrote about the use of the material in power transistors. And apparently, if you properly dope this material, it’ll have a much wider band gap and be able to handle higher voltages. So what does this mean for the grid, Stephen? Cass: Yeah. So I actually find power electronics really fascinating because most of the history of transistors, right, is about making them use ever smaller amounts of electricity—5-volt logic used to be pretty common; now 3.3 is pretty common, and even 1.1 volts is pretty common—and really sipping microamps of power through these circuits. And power electronics kind of gets you back to actually the origins of being an electronics engineer, electrical engineers, which is when you’re really talking about power and energy, and you are humping around thousands of volts, and you’re humping around huge currents. And power electronics is an attempt to bring some of that smartness that transistors gives you into these much higher voltages. And we’ve seen some of this with, say, gallium nitride, which is a material we had talked about in Spectrum for years, speaking of materials that had been for years floating around, and then really, though, in the last like five years, you’ve seen it be a real commercial success. So all those wall warts we have have gotten dramatically smaller and better, which is why you can have a USB-C charger system where you can drive your laptop and bunch of ancillary peripherals all off one little wall wart without worrying about it bringing down the house because it’s just so efficient and so small. And most of those now are these new gallium-nitride-based devices, which is one example where a material really is making some progress. And so aluminum nitride is kind of another step along that, to be able to handle even higher voltages, being able to handle bigger currents. So we’re not up yet to the level where you could have these massive high-voltage transmission lines directly, but the more and more you— the rising tide of where you can put these kind of electronics into your systems. First off, it means more efficient. As I say, these power adapters that convert AC to DC, they get more efficient. Your power supplies in your computer get more efficient, and your power supplies in your grid center. We’ve talked about how much power grid centers today get more efficient. And it bundles up. And the whole point of this is that you do want a grid that is as smart as possible. You need something that will be able to handle very intermittent power sources, fluctuating power sources. The current grid is really built around very, very stable power supplies, very constant power supplies, very stable frequency timings. So the frequency of the grid is the key to stability. Everything’s got to be on that 60 hertz in the US, 50 hertz in other places. Every power station has got to be synchronized very precisely with the other. So stability is a problem, and being able to handle fluctuations quickly is the key to both grid stability and to be able to handle some of these intermittent sources where the power varies as the wind blows stronger or weaker, as the day turns, as clouds move in front of your farm. So it’s very exciting from that point of view to see these very esoteric technologies. We’re talking about things like band gaps and how do you stick the right doping molecule in the matrix, but it does bubble up into these very-large-scale impacts when we’re talking about the future of electrical engineering and that old-school power and energy keeping the lights on and the motors churning kind of a way. Goldstein: Right. And the electrification of everything is just going to put bigger demands on the grid, like you were saying, for alternative energy sources. “Alternative.” They’re all price competitive now, the solar and wind. But-- Cass: Yeah, not just at the generate— this idea that you have distributed power and power can be generated locally, and also being able to switch power. So you have these smart transformers so that if you are generating surplus power on your solar panels, you can send that to maybe your neighbor next door who’s charging their electric vehicle without at all having to be mediated by going up to the power company. Maybe your local transformer is making some of these local grid scale balancing decisions that are much closer to where the power is being used. Goldstein: Oh, yeah. Stephen, that reminds me of this other piece we had this week, actually, on utilities and profit motive on their part hampering US grid expansion. It’s by a Harvard scholar named Ari Peskoe, and his first line is, “The United States is not building enough transmission lines to connect regional power networks. The deficit is driving up electricity prices, reducing grid reliability, and hobbling renewable-energy deployment.” And basically, they’re just saying that it’s not—what he does a good job explaining is not only how these new projects might impact their bottom lines but also all of the industry alliances that they’ve established over the years that become these embedded interests that need to be disrupted. Cass: Yeah, the truth is there is a list of things we could do. Not magic things. There are pretty obvious things we could do that would make the US grid— even if you don’t care much about renewables, you probably do care about your grid resilience and reliability and being able to move power around. The US grid is not great. It is creaky. We know there are things that could be done. As a byproduct of doing those things, you also would actually make it much more renewable friendly. So it is this issue of— there are political problems. Depending on which administration is in power, there is more or less an appetite to deal with some of these interests. And then, yeah, these utilities often have incentives to kind of keep things the way they are. They don’t necessarily want a grid where it’s easier to get cheaper electricity or more green electricity from one place to a different market. Everybody loves a captive monopoly market they can sell. I mean, that’s wonderful if you could do that. And then there are many places with anti-competition rules. But grids are a real— it’s really difficult to break down those barriers. Goldstein: It is. And if you’re in Texas in a bad winter and the grid goes down and you need power from outside but you’re an island unto yourself and you can’t import that power, it becomes something that is disruptive to people’s lives, right? And people pay attention to it during a disaster, but we have a slow-rolling disaster called climate change that if we don’t start overturning some of the barriers to electrification and alternative energy sources, we’re kind of digging our own grave. Cass: It is very tricky because we do then get into these issues where you build these transmission lines, and there are questions about who ends up paying for those transmission lines and whether they get built over their lands, the local impacts of those. And it’s hard sometimes to tell. Is this a group that is really genuinely feeling that there is a sort of justice gap here— that they’re being asked to pay for the sins of higher carbon producers, or is this astroturfing? And sometimes it’s very difficult to tell that these organizations are being underwritten by people who are invested in the status quo, and it does become a knotty problem. And we are going to, I think, as things get more and more difficult, be really faced into making some difficult choices. And I am not quite sure how that’s going to play out, but I do know that we will keep tracking it as best we can. And I think maybe, yeah, you just have to come back and see how we keep covering the grid in pages of Spectrum. Goldstein: Excellent. Well— Cass: And so that’s probably a good point where— I think we’re going to have to wrap this round up here. But thank you so much for coming on the show. Goldstein: Excellent. Thank you, Stephen. Much fun. Cass: So today on Fixing The Future, I was talking with Spectrum‘s Editor in Chief Harry Goldstein, and we talked about electric vehicles, we talked about software bloat, and we talked about new materials. I’m Stephen Cass, and I hope you join us next time.
-
Anyware Robotics’ Pixmo Takes Unique Approach to Trailer Unloading
Mar 05, 2024 10:00 AM PSTYou’ve seen this before: a truck-unloading robot that’s made up of a mobile base with an arm on it that drives up into the back of a trailer and then uses suction to grab stacked boxes and put them onto a conveyor belt. We’ve written about a couple of the companies doing this, and there are even more out there. It’s easy to understand why—trailer unloading involves a fairly structured and controlled environment with a very repetitive task, it’s a hard job that sucks for humans, and there’s an enormous amount of demand. While it’s likely true that there’s enough room for a whole bunch of different robotics companies in the trailer-unloading space, a given customer is probably going to only pick one, and they’re going to pick the one that offers the right combination of safety, capability, and cost. Anyware Robotics thinks they have that mix, aided by a box-handling solution that is both very clever and so obvious that I’m wondering why I didn’t think of it myself. The overall design of Pixmo itself is fairly standard as far as trailer-unloading robots go, but some of the details are interesting. We’re told that Pixmo is the only trailer-unloading system that integrates a heavy-payload collaborative arm, actually a fairly new commercial arm from Fanuc. This means that Anyware Robotics doesn’t have to faff about with their own hardware, and also that their robot is arguably safer, being ISO-certified safe to work directly with people. The base is custom, but Anyware is contracting it out to a big robotics original equipment manufacturer. “We’ve put a lot of effort into making sure that most of the components of our robot are off-the-shelf,” cofounder and CEO Thomas Tang tells us. “There are already so many mature and cost-efficient suppliers that we want to offload the supply chain, the certification, the reliability testing onto someone else’s shoulders.” And while there are a selection of automated mobile robots (AMRs) out there that seem like they could get the job done, the problem is that they’re all designed for flat surfaces, and getting into and out of the back of the trailer often involves a short, steep ramp, hence the need for a design just for them. Even with the custom base, Tang says that Pixmo is very cost efficient, and the company predicts that it will be approximately one-third the cost of other solutions with a payback of about 24 months. But here’s the really clever bit: Anyware Robotics Pixmo Trailer Unloading That conveyor system in front of the boxes is an add-on that’s used in support of Pixmo. There are two benefits here: First, having the conveyor add-on aligned with the base of a box minimizes the amount of lifting that Pixmo has to do. This allows Pixmo to handle boxes of up to 65 pounds with a lift-and-slide technique, putting it at the top end of a trailer-unloading robot payload. And the second benefit is that the add-on system decreases the distance that Pixmo has to move the box to just about as small as it can possibly be, eliminating the need for the arm to rotate around to place a box on a conveyor next to or behind itself. Lowering this cycle time means that Pixmo can achieve a throughput of up to 1,000 boxes per hour—about one box every 4 seconds, which the Internet suggests is quite fast, even for a professional human. Anyware Robotics is introducing this add-on system at the MODEX manufacturing and supply-chain show next week, and the company has a patent pending on the idea. This seems like such a simple, useful idea that I asked Tang why they were the first ones to come up with it. “In robotics startups, there tends to be a legacy mind-set issue,” Tang told me. “When people have been working on robot arms for so many years, we just think about how to use robot arms to solve everything. That’s maybe the reason why other companies didn’t come up with this solution.” Tang says that Anyware started with much more complicated add-on designs before finding this solution. “Usually it’s the most simple solution that has the most trial and error behind it.” Anyware Robotics is focused on trailer unloading for now, but Pixmo could easily be adapted for palletizing and depalletizing or somewhat less easily for other warehouse tasks like order picking or machine tending. But why stop there? A mobile manipulator can (theoretically) do it all (almost), and that’s exactly what Tang wants: In our long-term vision, we believe that the future will have two different types of general-purpose robots. In one direction is the humanoid form, which is a really flexible solution for jobs where you want to replace a human. But there are so many jobs that are just not reasonable for a human body to do. So we believe there should be another form of general-purpose robot, which is designed for industrial tasks. Our design philosophy is in that direction—it’s also general purpose, but for industrial applications. At just over one year old, Anyware has already managed to complete a pilot program (and convert it to a purchase order). They’re currently in the middle of several other pilot programs with leading third-party logistics providers, and they expect to spend the next several months focusing on productization with the goal of releasing the first commercial version of Pixmo by July of this year.
-
Sci-fi and Hi-fi
Mar 04, 2024 11:46 AM PSTMany a technologist has been inspired by science fiction. Some have even built, or rebuilt, entire companies around an idea introduced in a story they read, as the founders of Second Life and Meta did, working from the metaverse as imagined by Neal Stephenson in his seminal 1992 novel Snow Crash. IEEE Spectrum has a history of running amazing sci-fi stories. Twenty years ago, I worked with computer scientist and novelist Vernor Vinge on his “Synthetic Serendipity,” a short story he adapted from his novel Rainbows End just for publication in Spectrum. Vinge’s work is informed by his research and relationships with some of the world’s leading technologists, which in turn gave me plenty of background for the accompanying 2004 Spectrum article “Mike Villas’s World.” Vinge’s tale of the near future explored then-nascent technologies, such as 3D printing, augmented reality, and advanced search-engines, all of which Vinge depicts with stunning clarity and foresight. So when our News Manager Margo Anderson and Contributing Editor Charles Q. Choi hatched the idea for the science fiction/fact package featured in this issue, our local sci-fi maven, Special Projects Editor Stephen Cass, eagerly volunteered to shepherd the project. Stephen is coauthor of Hollyweird Science: From Quantum Quirks to the Multiverse (on the science shown in movies and TV shows) and the editor of several sci-fi anthologies, including Coming Soon Enough, published by Spectrum 10 years ago. Choi suggested we hire the futurist Karl Schroeder, author of 10 sci-fi novels, to write the sci-fi story. Cass, Choi, and Schroeder then had a brainstorming session. Cass recalls, “I knew by the end of it that Karl had the chops to nail the real science concepts we wanted to explore, and come up with a compelling narrative.” The idea they hit upon—turning a planet into a computer—is not new in science fiction, Cass notes. But “we wanted Karl to explore the idea in a way that would shed light on what purpose you’d put one to,” he says, “and also think about what some of the unintended consequences might be. And he had to do it in 2,500 words, which is a very tight fit for a story.” As for the accompanying nonfiction annotations, Choi’s brief was to work with Cass and Schroeder to make sure that the story, although fantastical and set in the far future, was sufficiently grounded in ideas that scientists and futurists are taking seriously today. And of course, any good sci-fi story needs some cool art. For that, Deputy Art Director Brandon Palacio chose Andrew Archer, whose work has a terrific balance of realism and stylistic flair. Historically, many science-fiction stories and books have had accompanying art that’s only barely related to what happens in the text, but Archer worked with us to make sure his work really fit “Hijack”. Deft storytelling is something Cass himself delivers in this month’s Hands On: “Vintage Hi-Fi Enters the 21st Century”. Not only is he our in-house sci-fi expert, he’s also our staff do-it-yourselfer. This month, he resurrects a vintage hi-fi that came from his wife’s family. Inspired by the recent passing of his father, who helped his own father in their radio and television rental shop in Dublin before spending decades working as a broadcast engineer, Cass wires up a tale of family and connection through technology that you’ll read only in these pages.
-
Meta’s AI Watermarking Plan Is Flimsy, at Best
Mar 04, 2024 09:35 AM PSTIn the past few months, we’ve seen a deepfake robocall of Joe Biden encouraging New Hampshire voters to “save your vote for the November election” and a fake endorsement of Donald Trump from Taylor Swift. It’s clear that 2024 will mark the first “AI election” in United States history. With many advocates calling for safeguards against AI’s potential harms to our democracy, Meta (the parent company of Facebook and Instagram) proudly announced last month that it will label AI-generated content that was created using the most popular generative AI tools. The company said it’s “building industry-leading tools that can identify invisible markers at scale—specifically, the ‘AI generated’ information in the C2PA and IPTC technical standards.” Unfortunately, social media companies will not solve the problem of deepfakes on social media this year with this approach. Indeed, this new effort will do very little to tackle the problem of AI-generated material polluting the election environment. The most obvious weakness is that Meta’s system will work only if the bad actors creating deepfakes use tools that already put watermarks—that is, hidden or visible information about the origin of digital content—into their images. Most unsecured “open-source” generative AI tools don’t produce watermarks at all. (We use the term unsecured and put “open-source” in quotes to denote that many such tools don’t meet traditional definitions of open-source software, but still pose a threat because their underlying code or model weights have been made publicly available.) If new versions of these unsecured tools are released that do contain watermarks, the old tools will still be available and able to produce watermark-free content, including personalized and highly persuasive disinformation and nonconsensual deepfake pornography. We are also concerned that bad actors can easily circumvent Meta’s labeling regimen even if they are using the AI tools that Meta says will be covered, which include products from Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock. Given that it takes about 2 seconds to remove a watermark from an image produced using the current C2PA watermarking standard that these companies have implemented, Meta’s promise to label AI-generated images falls flat. When the authors uploaded an image they’d generated to a website that checks for watermarks, the site correctly stated that it was a synthetic image generated by an OpenAI tool. IEEE Spectrum We know this because we were able to easily remove the watermarks Meta claims it will detect—and neither of us is an engineer. Nor did we have to write a single line of code or install any software. First, we generated an image with OpenAI’s DALL-E 3. Then, to see if the watermark worked, we uploaded the image to the C2PA content credentials verification website. A simple and elegant interface showed us that this image was indeed made with OpenAI’s DALL-E 3. How did we then remove the watermark? By taking a screenshot. When we uploaded the screenshot to the same verification website, the verification site found no evidence that the image had been generated by AI. The same process worked when we made an image with Meta’s AI image generator and took a screenshot of it—and uploaded it to a website that detects the IPTC metadata that contains Meta’s AI “watermark.” However, when the authors took a screenshot of the image and uploaded that screenshot to the same verification site, the site found no watermark and therefore no evidence that the image was AI generated. IEEE Spectrum Is there a better way to identify AI-generated content? Meta’s announcement states that it’s “working hard to develop classifiers that can help...to automatically detect AI-generated content, even if the content lacks invisible markers.” It’s nice that the company is working on it, but until it succeeds and shares this technology with the entire industry, we will be stuck wondering whether anything we see or hear online is real. For a more immediate solution, the industry could adopt maximally indelible watermarks—meaning watermarks that are as difficult to remove as possible. Today’s imperfect watermarks typically attach information to a file in the form of metadata. For maximally indelible watermarks to offer an improvement, they need to hide information imperceptibly in the actual pixels of images, the waveforms of audio (Google Deepmind claims to have done this with its proprietary SynthID watermark) or through slightly modified word frequency patterns in AI-generated text. We use the term “maximally” to acknowledge that there may never be a perfectly indelible watermark. This is not a problem just with watermarks though. The celebrated security expert Bruce Schneier notes that “computer security is not a solvable problem…. Security has always been an arms race, and always will be.” In metaphorical terms, it’s instructive to consider automobile safety. No car manufacturer has ever produced a car that cannot crash. Yet that hasn’t stopped regulators from implementing comprehensive safety standards that require seatbelts, airbags, and backup cameras on cars. If we waited for safety technologies to be perfected before requiring implementation of the best available options, we would be much worse off in many domains. There’s increasing political momentum to tackle deepfakes. Fifteen of the biggest AI companies—including almost every one mentioned in this article—signed on to the White House Voluntary AI Commitments last year, which included pledges to “develop robust mechanisms, including provenance and/or watermarking systems for audio or visual content” and to “develop tools or APIs to determine if a particular piece of content was created with their system.” Unfortunately, the White House did not set any timeline for the voluntary commitments. Then, in October, the White House, in its AI Executive Order, defined AI watermarking as “the act of embedding information, which is typically difficult to remove, into outputs created by AI—including into outputs such as photos, videos, audio clips, or text—for the purposes of verifying the authenticity of the output or the identity or characteristics of its provenance, modifications, or conveyance.” Next, at the Munich Security Conference on 16 February, a group of 20 tech companies (half of which had previously signed the voluntary commitments) signed onto a new “Tech Accord to Combat Deceptive Use of AI in 2024 Elections.” Without making any concrete commitments or providing any timelines, the accord offers a vague intention to implement some form of watermarking or content-provenance efforts. Although a standard is not specified, the accord lists both C2PA and SynthID as examples of technologies that could be adopted. Could regulations help? We’ve seen examples of robust pushback against deepfakes. Following the AI-generated Biden robocalls, the New Hampshire Department of Justice launched an investigation in coordination with state and federal partners, including a bipartisan task force made up of all 50 state attorneys general and the Federal Communications Commission. Meanwhile, in early February the FCC clarified that calls using voice-generation AI will be considered artificial and subject to restrictions under existing laws regulating robocalls. Unfortunately, we don’t have laws to force action by either AI developers or social media companies. Congress and the states should mandate that all generative AI products embed maximally indelible watermarks in their image, audio, video, and text content using state-of-the-art technology. They should also address risks from unsecured “open-source” systems that can either have their watermarking functionality disabled or be used to remove watermarks from other content. Furthermore, any company that makes a generative AI tool should be encouraged to release a detector that can identify, with the highest accuracy possible, any content it produces. This proposal shouldn’t be controversial, as its rough outlines have already been agreed to by the signers of the voluntary commitments and the recent elections accord. Standards organizations like C2PA, the National Institute of Standards and Technology, and the International Organization for Standardization should also move faster to build consensus and release standards for maximally indelible watermarks and content labeling in preparation for laws requiring these technologies. Google, as C2PA’s newest steering committee member, should also quickly move to open up its seemingly best-in-class SynthID watermarking technology to all members for testing. Misinformation and voter deception are nothing new in elections. But AI is accelerating existing threats to our already fragile democracy. Congress must also consider what steps it can take to protect our elections more generally from those who are seeking to undermine them. That should include some basic steps, such as passing the Deceptive Practices and Voter Intimidation Act, which would make it illegal to knowingly lie to voters about the time, place, and manner of elections with the intent of preventing them from voting in the period before a federal election. Congress has been woefully slow to take up comprehensive democracy reform in the face of recent shocks. The potential amplification of these shocks through abuse of AI ought to be enough to finally get lawmakers to act.
-
The Engineer Behind Samsung’s Speech Recognition Software
Mar 03, 2024 11:00 AM PSTEvery time you use your voice to generate a message on a Samsung Galaxy mobile phone or activate a Google Home device, you’re using tools Chanwoo Kim helped develop. The former executive vice president of Samsung Research’s Global AI Centers specializes in end-to-end speech recognition, end-to-end text-to-speech tools, and language modeling. “The most rewarding part of my career is helping to develop technologies that my friends and family members use and enjoy,” Kim says. He recently left Samsung to continue his work in the field at Korea University, in Seoul, leading the school’s speech and language processing laboratory. A professor of artificial intelligence, he says he is passionate about teaching the next generation of tech leaders. “I’m excited to have my own lab at the school and to guide students in research,” he says. Bringing Google Home to market When Amazon announced in 2014 it was developing smart speakers with AI assistive technology, a gadget now known as Echo, Google decided to develop its own version. Kim saw a role for his expertise in the endeavor—he has a Ph.D. in language and information technology from Carnegie Mellon, and he specialized in robust speech recognition. Friends of his who were working on such projects at Google in Mountain View, Calif., encouraged him to apply for a software engineering job there. He left Microsoft in Seattle where he had worked for three years as a software development engineer and speech scientist. After joining Google’s acoustic modeling team in 2013, he worked to ensure the company’s AI assistive technology, used in Google Home products, could perform in the presence of background noise. Chanwoo Kim Employer Korea University in Seoul Title Director of the the speech and language processing lab and professor of artificial intelligence Member grade Member Alma maters Seoul National University; Carnegie Mellon He led an effort to improve Google Home’s speech-recognition algorithms, including the use of acoustic modeling, which allows a device to interpret the relationship between speech and phonemes (phonetic units in languages). “When people used the speech-recognition function on their mobile phones, they were only standing about 1 meter away from the device at most,” he says. “For the speaker, my team and I had to make sure it understood the user when they were talking farther away.” Kim proposed using large-scale data augmentation that simulates far-field speech data to enhance the device’s speech-recognition capabilities. Data augmentation analyzes training data received and artificially generates additional training data to improve recognition accuracy. His contributions enabled the company to release its first Google Home product, a smart speaker, in 2016. “That was a really rewarding experience,” he says. That same year, Kim moved up to senior software engineer and continued improving the algorithms used by Google Home for large-scale data augmentation. He also further developed technologies to reduce the time and computing power used by the neural network and to improve multi-microphone beamforming for far-field speech recognition. Kim, who grew up in South Korea, missed his family, and in 2018 he moved back, joining Samsung as vice president of its AI Center in Seoul. When he joined Samsung, he aimed to develop end-to-end speech recognition and text-to-speech recognition engines for the company’s products, focusing on on-device processing. To help him reach his goals, he founded a speech processing lab and led a team of researchers developing neural networks to replace the conventional speech-recognition systems then used by Samsung’s AI devices. “The most rewarding part of my work is helping to develop technologies that my friends and family members use and enjoy.” Those systems included an acoustic model, a language model, a pronunciation model, a weighted finite state transducer, and an inverse text normalizer. The language model looks at the relationship between the words being spoken by the user, while the pronunciation model acts as a dictionary. The inverse text normalizer, most often used by text-to-speech tools on phones, converts speech into written expressions. Because the components were bulky, it was not possible to develop an accurate, on-device speech-recognition system using conventional technology, Kim says. An end-to-end neural network would complete all the tasks and “greatly simplify speech-recognition systems,” he says. Chanwoo Kim [top row, seventh from the right] with some of the members of his speech processing lab at Samsung Research.Chanwoo Kim He and his team used a streaming attention-based approach to develop their model. An input sequence—the spoken words—are encoded, then decoded into a target sequence with the help of a context vector, a numeric representation of words generated by a pretrained deep learning model for machine translation. The model was commercialized in 2019 and is now part of Samsung’s Galaxy phone. That same year, a cloud version of the system was commercialized and is used by the phone’s virtual assistant, Bixby. Kim’s team continued to improve speech recognition and text-to-speech systems in other products, and every year they commercialized a new engine. They include the power-normalized cepstral coefficients, which improve the accuracy of speech recognition in environments with disturbances such as additive noise, changes in the signal, multiple speakers, and reverberation. It suppresses the effects of background noise by using statistics to estimate characteristics. It is now used in a variety of Samsung products including air conditioners, cellphones, and robotic vacuum cleaners. Samsung promoted Kim in 2021 to executive vice president of its six Global AI Centers, located in Cambridge, England; Montreal; Seoul; Silicon Valley; New York; and Toronto. In that role he oversaw research on incorporating artificial intelligence and machine learning into Samsung products. He is the youngest person to be an executive vice president at the company. He also led the development of Samsung’s generative large language models, which evolved in Samsung Gauss. The suite of generative AI models can generate code, images, and text. In March he left the company to join Korea University as a professor of artificial intelligence—which is a dream come true, he says. “When I first started my doctoral work, my dream was to pursue a career in academia,” Kim says. “But after earning my Ph.D., I found myself drawn to the impact my research could have on real products, so I decided to go into industry.” He says he was excited to join Korea University, as “it has a strong presence in artificial intelligence” and is one of the top universities in the country. Kim says his research will focus on generative speech models, multimodal processing, and integrating generative speech with language models. Chasing his dream at Carnegie Mellon Kim’s father was an electrical engineer, and from a young age, Kim wanted to follow in his footsteps, he says. He attended a science-focused high school in Seoul to get a head start in learning engineering topics and programming. He earned his bachelor’s and master’s degrees in electrical engineering from Seoul National University in 1998 and 2001, respectively. Kim long had hoped to earn a doctoral degree from a U.S. university because he felt it would give him more opportunities. And that’s exactly what he did. He left for Pittsburgh in 2005 to pursue a Ph.D. in language and information technology at Carnegie Mellon. “I decided to major in speech recognition because I was interested in raising the standard of quality,” he says. “I also liked that the field is multifaceted, and I could work on hardware or software and easily shift focus from real-time signal processing to image signal processing or another sector of the field.” Kim did his doctoral work under the guidance of IEEE Life Fellow Richard Stern, who probably is best known for his theoretical work in how the human brain compares sound coming from each ear to judge where the sound is coming from. “At that time, I wanted to improve the accuracy of automatic speech recognition systems in noisy environments or when there were multiple speakers,” he says. He developed several signal processing algorithms that used mathematical representations created from information about how humans process auditory information. Kim earned his Ph.D. in 2010 and joined Microsoft in Seattle as a software development engineer and speech scientist. He worked at Microsoft for three years before joining Google. Access to trustworthy information Kim joined IEEE when he was a doctoral student so he could present his research papers at IEEE conferences. In 2016 a paper he wrote with Stern was published in the IEEE/ACM Transactions on Audio, Speech, and Language Processing. It won them the 2019 IEEE Signal Processing Society’s Best Paper Award. Kim felt honored, he says, to receive this “prestigious award.” Kim maintains his IEEE membership partly because, he says, IEEE is a trustworthy source of information, and he can access the latest technical information. Another benefit of membership is IEEE’s global network, Kim says. “By being a member, I have the opportunity to meet other engineers in my field,” he says. He is a regular attendee at the annual IEEE Conference for Acoustics, Speech, and Signal Processing. This year he is the technical program committee’s vice chair for the meeting, which is scheduled for next month in Seoul.
-
Video Friday: $2.6 Billion
Mar 01, 2024 01:02 PM PSTVideo Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion. HRI 2024: 11–15 March 2024, BOULDER, COLORADO, USA Eurobot Open 2024: 8–11 May 2024, LA ROCHE-SUR-YON, FRANCE ICRA 2024: 13–17 May 2024, YOKOHAMA, JAPAN RoboCup 2024: 17–22 July 2024, EINDHOVEN, NETHERLANDS Enjoy today’s videos! Figure has raised a US $675 million Series B, valuing the company at $2.6 billion. [ Figure ] Meanwhile, here’s how things are going at Agility Robotics, whose last raise was a $150 million Series B in April of 2022. [ Agility Robotics ] Also meanwhile, here’s how things are going at Sanctuary AI, whose last raise was a $58.5 million Series A in March of 2022. [ Sanctuary AI ] The time has come for humanoid robots to enter industrial production lines and learn how to assist humans by undertaking repetitive, tedious, and potentially dangerous tasks for them. Recently, UBTECH’s humanoid robot Walker S was introduced into the assembly line of NIO’s advanced vehicle-manufacturing center, as an “intern” assisting in the car production. Walker S is the first bipedal humanoid robot to complete a specific workstation’s tasks on a mobile EV production line. [ UBTECH ] Henry Evans keeps working hard to make robots better, this time with the assistance of researchers from Carnegie Mellon University. Henry said he preferred using head-worn assistive teleoperation (HAT) with a robot for certain tasks rather than depending on a caregiver. “Definitely scratching itches,” he said. “I would be happy to have it stand next to me all day, ready to do that or hold a towel to my mouth. Also, feeding me soft foods, operating the blinds, and doing odd jobs around the room.” One innovation in particular, software called Driver Assistance that helps align the robot’s gripper with an object the user wants to pick up, was “awesome,” Henry said. Driver Assistance leaves the user in control while it makes the fine adjustments and corrections that can make controlling a robot both tedious and demanding. “That’s better than anything I have tried for grasping,” Henry said, adding that he would like to see Driver Assistance used for every interface that controls Stretch robots. [ HAT2 ] via [ CMU ] Watch this video for the three glorious seconds at the end. [ Tech United ] Get ready to rip, shear, mow, and tear, as DOOM is back! This April, we’re making the legendary game playable on our robotic mowers as a tribute to 30 years of mowing down demons. Oh, it’s HOOSKvarna, not HUSKvarna. [ Husqvarna ] via [ Engadget ] Latest developments demonstrated on the Ameca Desktop platform. Having fun with vision- and voice-cloning capabilities. [ Engineered Arts ] Could an artificial-intelligence system learn language from a child? New York University researchers supported by the National Science Foundation, using first-person video from a head-mounted camera, trained AI models to learn language through the eyes and ears of a child. [ NYU ] The world’s leaders in manufacturing, natural resources, power, and utilities are using our autonomous robots to gather data of higher quality and higher quantities of data than ever before. Thousands of Spots have been deployed around the world—more than any other walking robot—to tackle this challenge. This release helps maintenance teams tap into the power of AI with new software capabilities and Spot enhancements. [ Boston Dynamics ] Modular self-reconfigurable robotic systems are more adaptive than conventional systems. This article proposes a novel free-form and truss-structured modular self-reconfigurable robot called FreeSN, containing node and strut modules. This article presents a novel configuration identification system for FreeSN, including connection point magnetic localization, module identification, module orientation fusion, and system-configuration fusion. [ Freeform Robotics ] The OOS-SIM (On-Orbit Servicing Simulator) is a simulator for on-orbit servicing tasks such as repair, maintenance and assembly that have to be carried out on satellites orbiting the earth. It simulates the operational conditions in orbit, such as the felt weightlessness and the harsh illumination. [ DLR ] The next CYBATHLON competition, which will take place again in 2024, breaks down barriers between the public, people with disabilities, researchers and technology developers. From 25 to 27 October 2024, the CYBATHLON will take place in a global format in the Arena Schluefweg in Kloten near Zurich and in local hubs all around the world. [ CYBATHLON ] George’s story is a testament to the incredible journey that unfolds when passion, opportunity and community converge. His journey from a drone enthusiast to someone actively contributing to making a difference not only to his local community but also globally; serves as a beacon of hope for all who dare to dream and pursue their passions. [ WeRobotics ] In case you’d forgotten, Amazon has a lot of robots. [ Amazon Robotics ] ABB’s fifty-year story of robotic innovation that began in 1974 with the sale of the world’s first commercial all-electric robot, the IRB 6. Björn Weichbrodt was a key figure in the development of the IRB 6. [ ABB ] Robotics Debate of the Ingenuity Labs Robotics and AI Symposium (RAIS2023) from October 12, 2023: Is robotics helping or hindering our progress on UN Sustainable Development Goals? [ Ingenuity Labs ]
-
Andrew Ng: Unbiggen AI
Feb 09, 2022 07:31 AM PSTAndrew Ng has serious street cred in artificial intelligence. He pioneered the use of graphics processing units (GPUs) to train deep learning models in the late 2000s with his students at Stanford University, cofounded Google Brain in 2011, and then served for three years as chief scientist for Baidu, where he helped build the Chinese tech giant’s AI group. So when he says he has identified the next big shift in artificial intelligence, people listen. And that’s what he told IEEE Spectrum in an exclusive Q&A. Ng’s current efforts are focused on his company Landing AI, which built a platform called LandingLens to help manufacturers improve visual inspection with computer vision. He has also become something of an evangelist for what he calls the data-centric AI movement, which he says can yield “small data” solutions to big issues in AI, including model efficiency, accuracy, and bias. Andrew Ng on... What’s next for really big models The career advice he didn’t listen to Defining the data-centric AI movement Synthetic data Why Landing AI asks its customers to do the work The great advances in deep learning over the past decade or so have been powered by ever-bigger models crunching ever-bigger amounts of data. Some people argue that that’s an unsustainable trajectory. Do you agree that it can’t go on that way? Andrew Ng: This is a big question. We’ve seen foundation models in NLP [natural language processing]. I’m excited about NLP models getting even bigger, and also about the potential of building foundation models in computer vision. I think there’s lots of signal to still be exploited in video: We have not been able to build foundation models yet for video because of compute bandwidth and the cost of processing video, as opposed to tokenized text. So I think that this engine of scaling up deep learning algorithms, which has been running for something like 15 years now, still has steam in it. Having said that, it only applies to certain problems, and there’s a set of other problems that need small data solutions. When you say you want a foundation model for computer vision, what do you mean by that? Ng: This is a term coined by Percy Liang and some of my friends at Stanford to refer to very large models, trained on very large data sets, that can be tuned for specific applications. For example, GPT-3 is an example of a foundation model [for NLP]. Foundation models offer a lot of promise as a new paradigm in developing machine learning applications, but also challenges in terms of making sure that they’re reasonably fair and free from bias, especially if many of us will be building on top of them. What needs to happen for someone to build a foundation model for video? Ng: I think there is a scalability problem. The compute power needed to process the large volume of images for video is significant, and I think that’s why foundation models have arisen first in NLP. Many researchers are working on this, and I think we’re seeing early signs of such models being developed in computer vision. But I’m confident that if a semiconductor maker gave us 10 times more processor power, we could easily find 10 times more video to build such models for vision. Having said that, a lot of what’s happened over the past decade is that deep learning has happened in consumer-facing companies that have large user bases, sometimes billions of users, and therefore very large data sets. While that paradigm of machine learning has driven a lot of economic value in consumer software, I find that that recipe of scale doesn’t work for other industries. Back to top It’s funny to hear you say that, because your early work was at a consumer-facing company with millions of users. Ng: Over a decade ago, when I proposed starting the Google Brain project to use Google’s compute infrastructure to build very large neural networks, it was a controversial step. One very senior person pulled me aside and warned me that starting Google Brain would be bad for my career. I think he felt that the action couldn’t just be in scaling up, and that I should instead focus on architecture innovation. “In many industries where giant data sets simply don’t exist, I think the focus has to shift from big data to good data. Having 50 thoughtfully engineered examples can be sufficient to explain to the neural network what you want it to learn.” —Andrew Ng, CEO & Founder, Landing AI I remember when my students and I published the first NeurIPS workshop paper advocating using CUDA, a platform for processing on GPUs, for deep learning—a different senior person in AI sat me down and said, “CUDA is really complicated to program. As a programming paradigm, this seems like too much work.” I did manage to convince him; the other person I did not convince. I expect they’re both convinced now. Ng: I think so, yes. Over the past year as I’ve been speaking to people about the data-centric AI movement, I’ve been getting flashbacks to when I was speaking to people about deep learning and scalability 10 or 15 years ago. In the past year, I’ve been getting the same mix of “there’s nothing new here” and “this seems like the wrong direction.” Back to top How do you define data-centric AI, and why do you consider it a movement? Ng: Data-centric AI is the discipline of systematically engineering the data needed to successfully build an AI system. For an AI system, you have to implement some algorithm, say a neural network, in code and then train it on your data set. The dominant paradigm over the last decade was to download the data set while you focus on improving the code. Thanks to that paradigm, over the last decade deep learning networks have improved significantly, to the point where for a lot of applications the code—the neural network architecture—is basically a solved problem. So for many practical applications, it’s now more productive to hold the neural network architecture fixed, and instead find ways to improve the data. When I started speaking about this, there were many practitioners who, completely appropriately, raised their hands and said, “Yes, we’ve been doing this for 20 years.” This is the time to take the things that some individuals have been doing intuitively and make it a systematic engineering discipline. The data-centric AI movement is much bigger than one company or group of researchers. My collaborators and I organized a data-centric AI workshop at NeurIPS, and I was really delighted at the number of authors and presenters that showed up. You often talk about companies or institutions that have only a small amount of data to work with. How can data-centric AI help them? Ng: You hear a lot about vision systems built with millions of images—I once built a face recognition system using 350 million images. Architectures built for hundreds of millions of images don’t work with only 50 images. But it turns out, if you have 50 really good examples, you can build something valuable, like a defect-inspection system. In many industries where giant data sets simply don’t exist, I think the focus has to shift from big data to good data. Having 50 thoughtfully engineered examples can be sufficient to explain to the neural network what you want it to learn. When you talk about training a model with just 50 images, does that really mean you’re taking an existing model that was trained on a very large data set and fine-tuning it? Or do you mean a brand new model that’s designed to learn only from that small data set? Ng: Let me describe what Landing AI does. When doing visual inspection for manufacturers, we often use our own flavor of RetinaNet. It is a pretrained model. Having said that, the pretraining is a small piece of the puzzle. What’s a bigger piece of the puzzle is providing tools that enable the manufacturer to pick the right set of images [to use for fine-tuning] and label them in a consistent way. There’s a very practical problem we’ve seen spanning vision, NLP, and speech, where even human annotators don’t agree on the appropriate label. For big data applications, the common response has been: If the data is noisy, let’s just get a lot of data and the algorithm will average over it. But if you can develop tools that flag where the data’s inconsistent and give you a very targeted way to improve the consistency of the data, that turns out to be a more efficient way to get a high-performing system. “Collecting more data often helps, but if you try to collect more data for everything, that can be a very expensive activity.” —Andrew Ng For example, if you have 10,000 images where 30 images are of one class, and those 30 images are labeled inconsistently, one of the things we do is build tools to draw your attention to the subset of data that’s inconsistent. So you can very quickly relabel those images to be more consistent, and this leads to improvement in performance. Could this focus on high-quality data help with bias in data sets? If you’re able to curate the data more before training? Ng: Very much so. Many researchers have pointed out that biased data is one factor among many leading to biased systems. There have been many thoughtful efforts to engineer the data. At the NeurIPS workshop, Olga Russakovsky gave a really nice talk on this. At the main NeurIPS conference, I also really enjoyed Mary Gray’s presentation, which touched on how data-centric AI is one piece of the solution, but not the entire solution. New tools like Datasheets for Datasets also seem like an important piece of the puzzle. One of the powerful tools that data-centric AI gives us is the ability to engineer a subset of the data. Imagine training a machine-learning system and finding that its performance is okay for most of the data set, but its performance is biased for just a subset of the data. If you try to change the whole neural network architecture to improve the performance on just that subset, it’s quite difficult. But if you can engineer a subset of the data you can address the problem in a much more targeted way. When you talk about engineering the data, what do you mean exactly? Ng: In AI, data cleaning is important, but the way the data has been cleaned has often been in very manual ways. In computer vision, someone may visualize images through a Jupyter notebook and maybe spot the problem, and maybe fix it. But I’m excited about tools that allow you to have a very large data set, tools that draw your attention quickly and efficiently to the subset of data where, say, the labels are noisy. Or to quickly bring your attention to the one class among 100 classes where it would benefit you to collect more data. Collecting more data often helps, but if you try to collect more data for everything, that can be a very expensive activity. For example, I once figured out that a speech-recognition system was performing poorly when there was car noise in the background. Knowing that allowed me to collect more data with car noise in the background, rather than trying to collect more data for everything, which would have been expensive and slow. Back to top What about using synthetic data, is that often a good solution? Ng: I think synthetic data is an important tool in the tool chest of data-centric AI. At the NeurIPS workshop, Anima Anandkumar gave a great talk that touched on synthetic data. I think there are important uses of synthetic data that go beyond just being a preprocessing step for increasing the data set for a learning algorithm. I’d love to see more tools to let developers use synthetic data generation as part of the closed loop of iterative machine learning development. Do you mean that synthetic data would allow you to try the model on more data sets? Ng: Not really. Here’s an example. Let’s say you’re trying to detect defects in a smartphone casing. There are many different types of defects on smartphones. It could be a scratch, a dent, pit marks, discoloration of the material, other types of blemishes. If you train the model and then find through error analysis that it’s doing well overall but it’s performing poorly on pit marks, then synthetic data generation allows you to address the problem in a more targeted way. You could generate more data just for the pit-mark category. “In the consumer software Internet, we could train a handful of machine-learning models to serve a billion users. In manufacturing, you might have 10,000 manufacturers building 10,000 custom AI models.” —Andrew Ng Synthetic data generation is a very powerful tool, but there are many simpler tools that I will often try first. Such as data augmentation, improving labeling consistency, or just asking a factory to collect more data. Back to top To make these issues more concrete, can you walk me through an example? When a company approaches Landing AI and says it has a problem with visual inspection, how do you onboard them and work toward deployment? Ng: When a customer approaches us we usually have a conversation about their inspection problem and look at a few images to verify that the problem is feasible with computer vision. Assuming it is, we ask them to upload the data to the LandingLens platform. We often advise them on the methodology of data-centric AI and help them label the data. One of the foci of Landing AI is to empower manufacturing companies to do the machine learning work themselves. A lot of our work is making sure the software is fast and easy to use. Through the iterative process of machine learning development, we advise customers on things like how to train models on the platform, when and how to improve the labeling of data so the performance of the model improves. Our training and software supports them all the way through deploying the trained model to an edge device in the factory. How do you deal with changing needs? If products change or lighting conditions change in the factory, can the model keep up? Ng: It varies by manufacturer. There is data drift in many contexts. But there are some manufacturers that have been running the same manufacturing line for 20 years now with few changes, so they don’t expect changes in the next five years. Those stable environments make things easier. For other manufacturers, we provide tools to flag when there’s a significant data-drift issue. I find it really important to empower manufacturing customers to correct data, retrain, and update the model. Because if something changes and it’s 3 a.m. in the United States, I want them to be able to adapt their learning algorithm right away to maintain operations. In the consumer software Internet, we could train a handful of machine-learning models to serve a billion users. In manufacturing, you might have 10,000 manufacturers building 10,000 custom AI models. The challenge is, how do you do that without Landing AI having to hire 10,000 machine learning specialists? So you’re saying that to make it scale, you have to empower customers to do a lot of the training and other work. Ng: Yes, exactly! This is an industry-wide problem in AI, not just in manufacturing. Look at health care. Every hospital has its own slightly different format for electronic health records. How can every hospital train its own custom AI model? Expecting every hospital’s IT personnel to invent new neural-network architectures is unrealistic. The only way out of this dilemma is to build tools that empower the customers to build their own models by giving them tools to engineer the data and express their domain knowledge. That’s what Landing AI is executing in computer vision, and the field of AI needs other teams to execute this in other domains. Is there anything else you think it’s important for people to understand about the work you’re doing or the data-centric AI movement? Ng: In the last decade, the biggest shift in AI was a shift to deep learning. I think it’s quite possible that in this decade the biggest shift will be to data-centric AI. With the maturity of today’s neural network architectures, I think for a lot of the practical applications the bottleneck will be whether we can efficiently get the data we need to develop systems that work well. The data-centric AI movement has tremendous energy and momentum across the whole community. I hope more researchers and developers will jump in and work on it. Back to top This article appears in the April 2022 print issue as “Andrew Ng, AI Minimalist.”
-
How AI Will Change Chip Design
Feb 08, 2022 06:00 AM PSTThe end of Moore’s Law is looming. Engineers and designers can do only so much to miniaturize transistors and pack as many of them as possible into chips. So they’re turning to other approaches to chip design, incorporating technologies like AI into the process. Samsung, for instance, is adding AI to its memory chips to enable processing in memory, thereby saving energy and speeding up machine learning. Speaking of speed, Google’s TPU V4 AI chip has doubled its processing power compared with that of its previous version. But AI holds still more promise and potential for the semiconductor industry. To better understand how AI is set to revolutionize chip design, we spoke with Heather Gorr, senior product manager for MathWorks’ MATLAB platform. How is AI currently being used to design the next generation of chips? Heather Gorr: AI is such an important technology because it’s involved in most parts of the cycle, including the design and manufacturing process. There’s a lot of important applications here, even in the general process engineering where we want to optimize things. I think defect detection is a big one at all phases of the process, especially in manufacturing. But even thinking ahead in the design process, [AI now plays a significant role] when you’re designing the light and the sensors and all the different components. There’s a lot of anomaly detection and fault mitigation that you really want to consider. Heather GorrMathWorks Then, thinking about the logistical modeling that you see in any industry, there is always planned downtime that you want to mitigate; but you also end up having unplanned downtime. So, looking back at that historical data of when you’ve had those moments where maybe it took a bit longer than expected to manufacture something, you can take a look at all of that data and use AI to try to identify the proximate cause or to see something that might jump out even in the processing and design phases. We think of AI oftentimes as a predictive tool, or as a robot doing something, but a lot of times you get a lot of insight from the data through AI. What are the benefits of using AI for chip design? Gorr: Historically, we’ve seen a lot of physics-based modeling, which is a very intensive process. We want to do a reduced order model, where instead of solving such a computationally expensive and extensive model, we can do something a little cheaper. You could create a surrogate model, so to speak, of that physics-based model, use the data, and then do your parameter sweeps, your optimizations, your Monte Carlo simulations using the surrogate model. That takes a lot less time computationally than solving the physics-based equations directly. So, we’re seeing that benefit in many ways, including the efficiency and economy that are the results of iterating quickly on the experiments and the simulations that will really help in the design. So it’s like having a digital twin in a sense? Gorr: Exactly. That’s pretty much what people are doing, where you have the physical system model and the experimental data. Then, in conjunction, you have this other model that you could tweak and tune and try different parameters and experiments that let sweep through all of those different situations and come up with a better design in the end. So, it’s going to be more efficient and, as you said, cheaper? Gorr: Yeah, definitely. Especially in the experimentation and design phases, where you’re trying different things. That’s obviously going to yield dramatic cost savings if you’re actually manufacturing and producing [the chips]. You want to simulate, test, experiment as much as possible without making something using the actual process engineering. We’ve talked about the benefits. How about the drawbacks? Gorr: The [AI-based experimental models] tend to not be as accurate as physics-based models. Of course, that’s why you do many simulations and parameter sweeps. But that’s also the benefit of having that digital twin, where you can keep that in mind—it’s not going to be as accurate as that precise model that we’ve developed over the years. Both chip design and manufacturing are system intensive; you have to consider every little part. And that can be really challenging. It’s a case where you might have models to predict something and different parts of it, but you still need to bring it all together. One of the other things to think about too is that you need the data to build the models. You have to incorporate data from all sorts of different sensors and different sorts of teams, and so that heightens the challenge. How can engineers use AI to better prepare and extract insights from hardware or sensor data? Gorr: We always think about using AI to predict something or do some robot task, but you can use AI to come up with patterns and pick out things you might not have noticed before on your own. People will use AI when they have high-frequency data coming from many different sensors, and a lot of times it’s useful to explore the frequency domain and things like data synchronization or resampling. Those can be really challenging if you’re not sure where to start. One of the things I would say is, use the tools that are available. There’s a vast community of people working on these things, and you can find lots of examples [of applications and techniques] on GitHub or MATLAB Central, where people have shared nice examples, even little apps they’ve created. I think many of us are buried in data and just not sure what to do with it, so definitely take advantage of what’s already out there in the community. You can explore and see what makes sense to you, and bring in that balance of domain knowledge and the insight you get from the tools and AI. What should engineers and designers consider when using AI for chip design? Gorr: Think through what problems you’re trying to solve or what insights you might hope to find, and try to be clear about that. Consider all of the different components, and document and test each of those different parts. Consider all of the people involved, and explain and hand off in a way that is sensible for the whole team. How do you think AI will affect chip designers’ jobs? Gorr: It’s going to free up a lot of human capital for more advanced tasks. We can use AI to reduce waste, to optimize the materials, to optimize the design, but then you still have that human involved whenever it comes to decision-making. I think it’s a great example of people and technology working hand in hand. It’s also an industry where all people involved—even on the manufacturing floor—need to have some level of understanding of what’s happening, so this is a great industry for advancing AI because of how we test things and how we think about them before we put them on the chip. How do you envision the future of AI and chip design? Gorr: It’s very much dependent on that human element—involving people in the process and having that interpretable model. We can do many things with the mathematical minutiae of modeling, but it comes down to how people are using it, how everybody in the process is understanding and applying it. Communication and involvement of people of all skill levels in the process are going to be really important. We’re going to see less of those superprecise predictions and more transparency of information, sharing, and that digital twin—not only using AI but also using our human knowledge and all of the work that many people have done over the years.
-
Atomically Thin Materials Significantly Shrink Qubits
Feb 07, 2022 08:12 AM PSTQuantum computing is a devilishly complex technology, with many technical hurdles impacting its development. Of these challenges two critical issues stand out: miniaturization and qubit quality. IBM has adopted the superconducting qubit road map of reaching a 1,121-qubit processor by 2023, leading to the expectation that 1,000 qubits with today’s qubit form factor is feasible. However, current approaches will require very large chips (50 millimeters on a side, or larger) at the scale of small wafers, or the use of chiplets on multichip modules. While this approach will work, the aim is to attain a better path toward scalability. Now researchers at MIT have been able to both reduce the size of the qubits and done so in a way that reduces the interference that occurs between neighboring qubits. The MIT researchers have increased the number of superconducting qubits that can be added onto a device by a factor of 100. “We are addressing both qubit miniaturization and quality,” said William Oliver, the director for the Center for Quantum Engineering at MIT. “Unlike conventional transistor scaling, where only the number really matters, for qubits, large numbers are not sufficient, they must also be high-performance. Sacrificing performance for qubit number is not a useful trade in quantum computing. They must go hand in hand.” The key to this big increase in qubit density and reduction of interference comes down to the use of two-dimensional materials, in particular the 2D insulator hexagonal boron nitride (hBN). The MIT researchers demonstrated that a few atomic monolayers of hBN can be stacked to form the insulator in the capacitors of a superconducting qubit. Just like other capacitors, the capacitors in these superconducting circuits take the form of a sandwich in which an insulator material is sandwiched between two metal plates. The big difference for these capacitors is that the superconducting circuits can operate only at extremely low temperatures—less than 0.02 degrees above absolute zero (-273.15 °C). Superconducting qubits are measured at temperatures as low as 20 millikelvin in a dilution refrigerator.Nathan Fiske/MIT In that environment, insulating materials that are available for the job, such as PE-CVD silicon oxide or silicon nitride, have quite a few defects that are too lossy for quantum computing applications. To get around these material shortcomings, most superconducting circuits use what are called coplanar capacitors. In these capacitors, the plates are positioned laterally to one another, rather than on top of one another. As a result, the intrinsic silicon substrate below the plates and to a smaller degree the vacuum above the plates serve as the capacitor dielectric. Intrinsic silicon is chemically pure and therefore has few defects, and the large size dilutes the electric field at the plate interfaces, all of which leads to a low-loss capacitor. The lateral size of each plate in this open-face design ends up being quite large (typically 100 by 100 micrometers) in order to achieve the required capacitance. In an effort to move away from the large lateral configuration, the MIT researchers embarked on a search for an insulator that has very few defects and is compatible with superconducting capacitor plates. “We chose to study hBN because it is the most widely used insulator in 2D material research due to its cleanliness and chemical inertness,” said colead author Joel Wang, a research scientist in the Engineering Quantum Systems group of the MIT Research Laboratory for Electronics. On either side of the hBN, the MIT researchers used the 2D superconducting material, niobium diselenide. One of the trickiest aspects of fabricating the capacitors was working with the niobium diselenide, which oxidizes in seconds when exposed to air, according to Wang. This necessitates that the assembly of the capacitor occur in a glove box filled with argon gas. While this would seemingly complicate the scaling up of the production of these capacitors, Wang doesn’t regard this as a limiting factor. “What determines the quality factor of the capacitor are the two interfaces between the two materials,” said Wang. “Once the sandwich is made, the two interfaces are “sealed” and we don’t see any noticeable degradation over time when exposed to the atmosphere.” This lack of degradation is because around 90 percent of the electric field is contained within the sandwich structure, so the oxidation of the outer surface of the niobium diselenide does not play a significant role anymore. This ultimately makes the capacitor footprint much smaller, and it accounts for the reduction in cross talk between the neighboring qubits. “The main challenge for scaling up the fabrication will be the wafer-scale growth of hBN and 2D superconductors like [niobium diselenide], and how one can do wafer-scale stacking of these films,” added Wang. Wang believes that this research has shown 2D hBN to be a good insulator candidate for superconducting qubits. He says that the groundwork the MIT team has done will serve as a road map for using other hybrid 2D materials to build superconducting circuits.