Starting at the 52:10 mark of this video, Ali Rahimi explains how a noisy evaluation of the gradient can be used in a gradient descent algorithm to minimize a convex function (most relaxations featured in current Advanced Matrix Factorization techniques, as well as some instances of Compressive Sensing, fall into that category). The interesting part is that the noise comes from the errors made by the now-stochastic Floating Point Unit. From the paper below:
Unlike the traditional setting for stochastic gradient descent, where stochasticity arises because the gradient direction is computed from a random subset of a dataset, here the processor itself is the source of stochasticity. We call our approach application robustification.
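To make the idea concrete, here is a minimal sketch of gradient descent on a convex least-squares problem where every gradient evaluation is corrupted by additive noise, standing in for errors from an unreliable FPU. The problem instance, the Gaussian noise model, and the decaying step-size schedule are my own illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Minimize the convex function f(x) = 0.5 * ||A x - b||^2 with gradient
# descent, where each gradient is corrupted by additive Gaussian noise
# (a stand-in for timing errors in the floating point unit).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]  # exact minimizer, for reference

L = np.linalg.norm(A.T @ A, 2)  # gradient Lipschitz constant
x = np.zeros(10)
for t in range(1, 5001):
    grad = A.T @ (A @ x - b)                           # true gradient of f
    noisy_grad = grad + 0.1 * rng.standard_normal(10)  # hardware-error proxy
    x -= noisy_grad / (L + 0.1 * t)                    # decaying step size

print(np.linalg.norm(x - x_star))  # residual stays small despite the noise
```

The decaying (Robbins-Monro-style) step size is what lets the iterates average out the unbiased gradient noise; with a constant step size the iterates would only hover in a noise ball around the minimizer.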
The attendant paper mentioned in the video (the one on the Intel chip) is here: A numerical optimization-based methodology for application robustification: Transforming applications for error tolerance, by Joseph Sloan, David Kesler, Rakesh Kumar, and Ali Rahimi.
There have been several attempts at correcting process variation induced errors by identifying and masking these errors at the circuit and architecture level. These approaches take up valuable die area and power on the chip. As an alternative, we explore the feasibility of an approach that allows these errors to occur freely, and handle them in software, at the algorithmic level. In this paper, we present a general approach to converting applications into an error tolerant form by recasting these applications as numerical optimization problems, which can then be solved reliably via stochastic optimization. We evaluate the potential robustness and energy benefits of the proposed approach using an FPGA-based framework that emulates timing errors in the floating point unit (FPU) of a Leon3 processor. We show that stochastic versions of applications have the potential to produce good quality outputs in the face of timing errors under certain assumptions. We also show that good quality results are possible for both intrinsically robust algorithms as well as fragile applications under these assumptions.
Of note, this thesis: A Hardware Acceleration Technique for Gradient Descent and Conjugate Gradient by David Robert Kesler.