
Tuesday, July 27, 2010

A Unified Algorithmic Framework for Multi-Dimensional Scaling, Philosophy and the practice of Bayesian statistics

Here is some of the reading I took with me to a place away from the interwebs.

As we all know, many issues in machine learning depend crucially on a good initial distance measure. From Suresh's tweet, here is A Unified Algorithmic Framework for Multi-Dimensional Scaling by Arvind Agarwal, Jeff M. Phillips, and Suresh Venkatasubramanian. The abstract reads:
In this paper, we propose a unified algorithmic framework for solving many known variants of MDS. Our algorithm is a simple iterative scheme with guaranteed convergence, and is modular; by changing the internals of a single subroutine in the algorithm, we can switch cost functions and target spaces easily. In addition to the formal guarantees of convergence, our algorithms are accurate; in most cases, they converge to better quality solutions than existing methods, in comparable time. We expect that this framework will be useful for a number of MDS variants that have not yet been studied.
Our framework extends to embedding high-dimensional points lying on a sphere to points on a lower dimensional sphere, preserving geodesic distances. As a complement to this result, we also extend the Johnson-Lindenstrauss Lemma to this spherical setting, where projecting to a random O((1/ε²) log n)-dimensional sphere causes ε-distortion.

A press release can be found here. The Matlab code implementing the Euclidean MDS is here.
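Since the algorithm in the paper is described as a simple iterative scheme with a swappable inner subroutine, here is a minimal NumPy sketch of the classical SMACOF iteration for Euclidean MDS, for readers who want a feel for this family of methods. This is my own illustration, not the authors' algorithm or their Matlab code; swapping the inner update step is roughly where their modularity lives.

    import numpy as np

    def mds_smacof(D, dim=2, iters=300, seed=0):
        # Minimize the raw stress sum_{i<j} (||x_i - x_j|| - D_ij)^2
        # by iterative majorization (SMACOF). D is a symmetric n x n
        # matrix of target distances.
        rng = np.random.default_rng(seed)
        n = D.shape[0]
        X = rng.standard_normal((n, dim))
        for _ in range(iters):
            d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
            np.fill_diagonal(d, 1.0)            # avoid dividing by zero
            B = -D / d
            np.fill_diagonal(B, 0.0)
            np.fill_diagonal(B, -B.sum(axis=1))
            X = B @ X / n                       # Guttman transform
        return X

Feeding it the pairwise distances of points sampled on a circle recovers the circle up to rotation and reflection.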


The second item is Philosophy and the practice of Bayesian statistics by Andrew Gelman and Cosma Shalizi. The abstract reads:

A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science.
Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework.

Cosma wrote a small blog entry on his paper here.

Other items to read include:


* Paul Graham's The top idea in your mind
* From Suresh's blog, here are some interesting links to read and ponder:

Sunday, August 24, 2008

Competing with the Human Element

When the U.S. Corona satellites produced their first imagery, the imagery products had cloud cover over most of their targets of interest, rendering the images useless. This simple fact, which I keep re-discovering in other applications like finding a lost boat in the ocean or simply taking photos over New Mexico from a high altitude balloon, started a space race of its own between the U.S. and the U.S.S.R. in the early 1960's. I only recently learned about it when I was watching a PBS/NOVA show called Astrospies (also aired as Des espions dans l'espace in France and as Die Weltraumspione in Germany). This story is fascinating in many respects. I had heard about the U.S. Manned Orbiting Laboratory (MOL) but had never heard about its Russian equivalent, the Almaz program. While I knew the U.S. MOL did not go anywhere, I did not realize it was internally in competition with the U.S. NRO spysats: a situation that featured, for the first time, a clear fight between human and robotic spaceflight. The robotic projects eventually won in the U.S. with the KH spacecraft. In Russia, however, humans won, and the program actually flew three successful manned missions. The Almaz spacecraft in which these flights took place looked like Salyut stations:


The inside of these Almaz stations looks like any military airplane dedicated to observation.

But the most amazing part is the technical specifications of the camera, called Agat, weighing more than 2 tons with a lens housing more than 6 meters wide. It had a magnification capability of about 80 times. The final resolution was said to be 8 cm.


After 3 billion dollars had been spent on it, the U.S. MOL program was canceled on June 10th, 1969, while the first Russian Almaz flight took off, ironically, on July 4th, 1974.

Why am I telling this story? Well, I am struck by the fact that cloud cover was such a pain to weed out that having a human in the loop became a mission requirement, thereby yielding very large funding expenditures. It also shows that progress over a ten-year time frame (from the early sixties to the early seventies) in communication and computing power essentially removed the human-in-the-loop requirement.

Second, the requirements of these observation programs continuously balance resolution against field of view. If your sensor has high resolution, then on-board power and bandwidth put a hard constraint on your footprint on the ground (such is the case of EO-1 and the Hyperion hyperspectral camera, for instance). This is why the DARPA LACOSTE program, using coded aperture in the IR, seems far reaching: high resolution and a large field of view are both requirements of the program. Yet the program does not remove the human element: even after having pushed tons of data to the ground, none of these programs can figure out intent, and they are therefore likely to yield operator overload. It is one thing to see everything, it is another to make sense of it (TM).
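To make the resolution-versus-footprint tension concrete, here is a back-of-the-envelope sketch; the numbers are mine and purely illustrative, not EO-1 or LACOSTE specifications. Halving the ground sample distance quadruples the pixel rate for the same swath, which is why on-board power and downlink bandwidth force a choice.

    # Illustrative only: a low-Earth-orbit imager at ~7 km/s ground speed.
    ground_speed = 7000.0                     # m/s
    for gsd, swath in [(1.0, 100e3), (0.5, 100e3), (0.5, 10e3)]:
        # pixels across the swath times lines scanned per second
        pixels_per_s = (swath / gsd) * (ground_speed / gsd)
        print(f"GSD {gsd} m, swath {swath / 1e3:.0f} km -> "
              f"{pixels_per_s:.2e} pixels/s")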

The Astrospies DVD can be bought here or from Amazon.

Photos: Astrospies documentary.

Tuesday, July 08, 2008

CS: MMDS papers, a CS page, A2I Converters, CS in Space.

I just stumbled upon MMDS 2008, the Workshop on Algorithms for Modern Massive Data Sets, which took place at Stanford University on June 25–28, 2008. All the abstracts are located here. All the presentations are here. Among the ones that caught my interest:

Piotr Indyk has a presentation entitled Sparse Recovery Using Sparse Random Matrices, or: Fast and Effective Linear Compression, where he presents joint work with Radu Berinde, Anna Gilbert, Howard Karloff, Milan Ruzic and Martin Strauss (part of which was shown in Anna Gilbert's presentation at the L1 meeting at Texas A&M). The interesting new part of this presentation is the recent result with Milan Ruzic showing that if a measurement matrix satisfies RIP-1, then OMP converges; the previous result showed this only for LP decoding. The abstract reads:

Over the recent years, a new *linear* method for compressing high-dimensional data (e.g., images) has been discovered. For any high-dimensional vector x, its *sketch* is equal to Ax, where A is an m x n matrix (possibly chosen at random). Although typically the sketch length m is much smaller than the number of dimensions n, the sketch contains enough information to recover an *approximation* to x. At the same time, the linearity of the sketching method is very convenient for many applications, such as data stream computing and compressed sensing.

The major sketching approaches can be classified as either combinatorial (using sparse sketching matrices) or geometric (using dense sketching matrices). They achieve different trade-offs, notably between the compression rate and the running time. Thus, it is desirable to understand the connections between them, with the ultimate goal of obtaining the "best of both worlds" solution.

In this talk we show that, in a sense, the combinatorial and geometric approaches are based on different manifestations of the same phenomenon. This connection will enable us to obtain several novel algorithms and constructions, which inherit advantages of sparse matrices, such as lower sketching and recovery times.
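As a toy illustration of the combinatorial (sparse matrix) side of the talk, here is a sketch that measures a sparse vector with a sparse binary matrix and recovers it with plain OMP. This is my own sketch under simplified assumptions, not the authors' code or their exact construction.

    import numpy as np

    rng = np.random.default_rng(1)
    n, m, d, k = 400, 100, 8, 5    # signal dim, sketch dim, ones per column, sparsity

    # Sparse binary sketching matrix: d ones at random positions per column.
    A = np.zeros((m, n))
    for j in range(n):
        A[rng.choice(m, size=d, replace=False), j] = 1.0

    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    y = A @ x                      # the linear sketch Ax

    # Orthogonal matching pursuit as the recovery routine.
    support, r = [], y.copy()
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ r))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef

    x_hat = np.zeros(n)
    x_hat[support] = coef
    print("recovery error:", np.linalg.norm(x - x_hat))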


Also, Anna Gilbert had a presentation entitled Combinatorial Group Testing in Signal Recovery. The abstract reads:

Traditionally, group testing is a design problem. The goal is to construct an optimally efficient set of tests of items such that the test results contain enough information to determine a small subset of items of interest. It has its roots in the statistics community and was originally designed for the Selective Service to find and to remove men with syphilis from the draft. It appears in many forms, including coin-weighing problems, experimental designs, and public health. We are interested in both the design of tests and the design of an efficient algorithm that works with the tests to determine the group of interest. Many of the same techniques that are useful for designing tests are also used to solve algorithmic problems in analyzing and in recovering statistical quantities from streaming data. I will discuss some of these methods and briefly discuss several recent applications in high throughput drug screening.
Joint work with Radu Berinde, Piotr Indyk, Howard Karloff, Martin Strauss, Raghu Kainkaryam and Peter Woolf.




Tong Zhang, An Adaptive Forward/Backward Greedy Algorithm for Learning Sparse Representations (the technical report is here: Forward-Backward Greedy Algorithm for Learning Sparse Representations).

Consider linear least squares regression where the target function is a sparse linear combination of a set of basis functions. We are interested in the problem of identifying those basis functions with non-zero coefficients and reconstructing the target function from noisy observations. This problem is NP-hard. Two heuristics that are widely used in practice are forward and backward greedy algorithms. First, we show that neither idea is adequate. Second, we propose a novel combination that is based on the forward greedy algorithm but takes backward steps adaptively whenever necessary. We prove strong theoretical results showing that this procedure is effective in learning sparse representations. Experimental results support our theory.
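Here is a minimal sketch of the forward-greedy-with-adaptive-backward-steps idea from the abstract. It is a simplified reading on my part (the thresholds and stopping rules are not Zhang's), meant only to show the mechanics: add the feature that most reduces the error, then drop features again whenever the loss increase is small compared to the last forward gain.

    import numpy as np

    def foba(X, y, eps=1e-3, nu=0.5, max_steps=50):
        n, p = X.shape
        S, gains = [], []

        def fit(S):
            # Least-squares fit on the selected columns; returns (coef, error).
            if not S:
                return np.zeros(0), float(y @ y)
            w, *_ = np.linalg.lstsq(X[:, S], y, rcond=None)
            r = y - X[:, S] @ w
            return w, float(r @ r)

        _, err = fit(S)
        for _ in range(max_steps):
            # Forward step: add the feature giving the largest error reduction.
            best_j, best_err = None, err
            for j in range(p):
                if j not in S:
                    _, e = fit(S + [j])
                    if e < best_err:
                        best_j, best_err = j, e
            if best_j is None or err - best_err < eps:
                break
            S.append(best_j); gains.append(err - best_err); err = best_err
            # Backward steps: drop features while the damage stays below
            # nu times the most recent forward gain.
            while len(S) > 1:
                cand = min(S, key=lambda j: fit([i for i in S if i != j])[1])
                e = fit([i for i in S if i != cand])[1]
                if e - err > nu * gains[-1]:
                    break
                S.remove(cand); gains.pop(); err = e
        return S, fit(S)[0]

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 30))
    w_true = np.zeros(30); w_true[[3, 7]] = [2.0, -1.5]
    y = X @ w_true + 0.01 * rng.standard_normal(100)
    print(foba(X, y)[0])           # typically recovers features 3 and 7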

Nir Ailon, Efficient Dimension Reduction. The abstract reads:
The Johnson-Lindenstrauss dimension reduction idea using random projections was discovered in the early 80's. Since then many "computer science friendly" versions were published, offering constant factor but no big-Oh improvements in the runtime. Two years ago Ailon and Chazelle showed a nontrivial algorithm with the first asymptotic improvement, and suggested the question: What is the exact complexity of J-L computation from d dimensions to k dimensions? An O(d log d) upper bound is implied by A-C for k up to d^{1/3} (in the L2 to L2 case). In this talk I will show how to achieve this bound for k up to d^{1/2}, combining techniques from the theories of error correcting codes and probability in Banach spaces. This is based on joint work with Edo Liberty.
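The baseline dense construction that these works improve on is a few lines of NumPy, shown below for reference (my sketch; the fast J-L transform in the talk replaces the dense multiply with an FFT-based one, which this does not attempt):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, eps = 200, 10000, 0.2
    k = int(8 * np.log(n) / eps**2)    # O(log(n) / eps^2); the constant is illustrative

    X = rng.standard_normal((n, d))               # n points in d dimensions
    P = rng.standard_normal((d, k)) / np.sqrt(k)  # dense Gaussian J-L map
    Y = X @ P

    # Pairwise distances are preserved up to a factor (1 +/- eps) w.h.p.
    print(np.linalg.norm(X[0] - X[1]), np.linalg.norm(Y[0] - Y[1]))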

Yoram Singer, Efficient Projection Algorithms for Learning Sparse Representations from High Dimensional Data.
Many machine learning tasks can be cast as constrained optimization problems. The talk focuses on efficient algorithms for learning tasks which are cast as optimization problems subject to L1 and hyper-box constraints. The end results are typically sparse and accurate models. We start with an overview of existing projection algorithms onto the simplex. We then describe a linear time projection for dense input spaces. Last, we describe a new efficient projection algorithm for very high dimensional spaces. We demonstrate the merits of the algorithm in experiments with large scale image and text classification.

Kenneth Clarkson, Tighter Bounds for Random Projections of Manifolds (this is the report). The abstract reads:
The Johnson-Lindenstrauss random projection lemma gives a simple way to reduce the dimensionality of a set of points while approximately preserving their pairwise Euclidean distances. The most direct application of the lemma applies to a finite set of points, but recent work has extended the technique to affine subspaces, smooth manifolds, and sets of bounded doubling dimension; in each case, a projection to a sufficiently large dimension k implies that all pairwise distances are approximately preserved with high probability. Here the case of random projection of a smooth manifold (submanifold of R^m) is considered, and a previous analysis is sharpened, giving an upper bound for k that depends on the surface area, total absolute curvature, and a few other quantities associated with the manifold, and in particular does not depend on the ambient dimension m or the reach of the manifold.

Sanjoy Dasgupta, Random Projection Trees and Low Dimensional Manifolds. The abstract reads:
The curse of dimensionality has traditionally been the bane of nonparametric statistics (histograms, kernel density estimation, nearest neighbor search, and so on), as reflected in running times and convergence rates that are exponentially bad in the dimension. This problem is all the more pressing as data sets get increasingly high dimensional. Recently the field has been rejuvenated substantially, in part by the realization that a lot of real-world data which appears high-dimensional in fact has low "intrinsic" dimension in the sense of lying close to a low-dimensional manifold. In the past few years, there has been a huge interest in learning such manifolds from data, and then using the learned structure to transform the data into a lower dimensional space where standard statistical methods generically work better.

I'll exhibit a way to benefit from intrinsic low dimensionality without having to go to the trouble of explicitly learning its fine structure. Specifically, I'll present a simple variant of the ubiquitous k-d tree (a spatial data structure widely used in machine learning and statistics) that is provably adaptive to low dimensional structure. We call this a "random projection tree" (RP tree).

Along the way, I'll discuss different notions of intrinsic dimension -- motivated by manifolds, by local statistics, and by analysis on metric spaces -- and relate them. I'll then prove that RP trees require resources that depend only on these dimensions rather than the dimension of the space in which the data happens to be situated. This is work with Yoav Freund (UC San Diego).

Lars Kai Hansen, Generalization in High-Dimensional Matrix Factorization. The abstract reads:
While the generalization performance of high-dimensional principal component analysis is quite well understood, matrix factorizations like independent component analysis, non-negative matrix factorization, and clustering are less investigated for generalizability. I will review theoretical results for PCA and heuristics used to improve PCA test performance, and discuss extensions to high-dimensional ICA, NMF, and clustering.

Holly Jin (at LinkedIn!), Exploring Sparse NonNegative Matrix Factorization. The abstract reads:
We explore the use of basis pursuit denoising (BPDN) for sparse nonnegative matrix factorization (sparse NMF). A matrix A is approximated by low-rank factors UDV', where U and V are sparse with unit-norm columns, and D is diagonal. We use an active-set BPDN solver with warm starts to compute the rows of U and V in turn. (Friedlander and Hatz have developed a similar formulation for both matrices and tensors.) We present computational results and discuss the benefits of sparse NMF for some real matrix applications. This is joint work with Michael Saunders.
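For readers who want to experiment, here is a toy sparse NMF baseline using multiplicative updates with an l1 penalty on one factor. To be clear, this is a standard Hoyer-style baseline of my own choosing, not the active-set BPDN scheme of the talk or the Friedlander-Hatz formulation.

    import numpy as np

    def sparse_nmf(V, r=5, lam=0.1, iters=500, seed=0):
        # V must be entrywise nonnegative. lam controls the sparsity of H;
        # W is rescaled to unit-norm columns as in the abstract above.
        rng = np.random.default_rng(seed)
        m, n = V.shape
        W, H = rng.random((m, r)), rng.random((r, n))
        for _ in range(iters):
            H *= (W.T @ V) / (W.T @ W @ H + lam + 1e-9)
            W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
            W /= np.linalg.norm(W, axis=0, keepdims=True)
        return W, H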

Compressed Counting and Stable Random Projections by Ping Li. The abstract reads:
The method of stable random projections has become a popular tool for dimension reduction, in particular, for efficiently computing pairwise distances in massive high-dimensional data (including dynamic streaming data) matrices, with many applications in data mining and machine learning such as clustering, nearest neighbors, kernel methods, etc. Closely related to stable random projections, Compressed Counting (CC) was recently developed to efficiently compute Lp frequency moments of a single dynamic data stream. CC exhibits a dramatic improvement over stable random projections when p is about 1. Applications of CC include estimating entropy moments of data streams and statistical parameter estimations in dynamic data using low memory.

Ronald Coifman, Diffusion Geometries and Harmonic Analysis on Data Sets. The abstract reads:
We discuss the emergence of self organization of data, either through eigenvectors of affinity operators, or equivalently through a multiscale ontology. We illustrate these ideas on images and audio files as well as molecular dynamics.

Thomas Blumensath and Mike Davies have set up a larger web page on Compressed Sensing and their attendant research in that field.

This one is a little old but it just showed up on my radar screen and it is not on the Rice page yet. It features an A2I paper by Sami Kirolos, Tamer Ragheb, Jason Laska, Marco F. Duarte, Yehia Massoud and Richard Baraniuk entitled Practical Issues in Implementing Analog-to-Information Converters. The abstract reads:

The stability and programmability of digital signal processing systems has motivated engineers to move the analog-to-digital conversion (ADC) process closer and closer to the front end of many signal processing systems in order to perform as much processing as possible in the digital domain. Unfortunately, many important applications, including radar and communication systems, involve wideband signals that seriously stress modern ADCs; sampling these signals above the Nyquist rate is in some cases challenging and in others impossible. While wideband signals by definition have a large bandwidth, often the amount of information they carry per second is much lower; that is, they are compressible in some sense. The first contribution of this paper is a new framework for wideband signal acquisition purpose-built for compressible signals that enables sub-Nyquist data acquisition via an analog-to-information converter (AIC). The framework is based on the recently developed theory of compressive sensing in which a small number of non-adaptive, randomized measurements are sufficient to reconstruct compressible signals. The second contribution of this paper is an AIC implementation design and study of the tradeoffs and nonidealities introduced by real hardware. The goal is to identify and optimize the parameters that dominate the overall system performance.
Finally, CS in Space

The eighth annual NASA Earth Science Technology Conference (ESTC2008) was held June 24–26, 2008, at the University of Maryland University College and showcased a wide array of technology research and development related to NASA's Earth science endeavors. The papers are here. Of particular interest is this presentation:

Novel Distributed Wavelet Transforms and Routing Algorithms for Efficient Data Gathering in Sensor Webs, by Antonio Ortega, G. Shen, S. Lee, S.W. Lee, S. Pattem, A. Tu, B. Krishnamachari, M. Cheng, S. Dolinar, A. Kiely, M. Klimesh, and H. Xie.
On page 13, there is: Compressed Sensing for Sensor Networks.

The GRETSI conference occurred a year ago. This is one of the papers (in French with an English abstract), entitled Quelques Applications du Compressed Sensing en Astronomie, by David Mary and Olivier Michel. The abstract reads:
We investigate in this communication how recently introduced “ Compressed Sensing” methods can be applied to some important problems in observational Astronomy. The mathematical background is first outlined. Some examples are then described in stellar variability and in image reconstruction for space-based observations. We finally illustrate the interest of such techniques for direct imaging through stellar interferometry.

[Translation of the French abstract:] We propose in this communication to evaluate the potential of the recently introduced "Compressed Sensing" methods through their application to a few important problems in observational astrophysics. After a quick description of the mathematical foundations of these approaches, examples are developed in the context of stellar variability, satellite image reconstruction, and finally, more prospectively, the future possibilities offered by the large multiple-baseline interferometer projects for direct imaging in the Fourier plane.

Saturday, June 14, 2008

CS: CS Ground Penetrating Radar, History of Matching Pursuit, Reinforcement Learning Blog, Neuroscience and Dynamic Systems

Andriyan Suksmono at the Institut Teknologi Bandung blogs on Compressed Sensing. The blog's name is Chaotic Pearls and most of it is in Indonesian. His latest entry is on the use of Compressed Sensing in the context of a Ground Penetrating Radar application. This is new. The presentation (in English) is entitled A Compressive SFCW-GPR System (also here) by Andriyan Suksmono, Endon Bharata, A. Andaya Lestari, A. Yarovoy, and L.P. Ligthart. The abstract of the paper reads:

Data acquisition speed is an inherent problem of stepped-frequency continuous wave (SFCW) radars, which discourages further usage and development of this technology. We propose an emerging paradigm called compressed sensing (CS) to manage this problem. In CS, a signal can be reconstructed exactly based on only a few samples below the Nyquist rate. Accordingly, the data acquisition speed can be increased significantly. A novel design of an SFCW ground penetrating radar (GPR) with a high acquisition speed capability is proposed and evaluated. Simulation with a mono-cycle waveform and actual measurement by a Vector Network Analyzer in a GPR test range confirm the implementability of the proposed system.

The architecture looks like this:

and some photos of the experiment are also shown below. The rest of the presentation shows some of the reconstruction results using L1 magic.



Here is another blogger: Laurent Jacques, a contributor to this blog, has decided to start his own blog entitled Le Petit Chercheur Illustré, Yet another signal processing (and applied math) blog. His first technical entry is an inspiring historical perspective on the Matching Pursuit technique.

Some of you know of my interest in Robotics and Artificial Intelligence. In particular, learning in low dimensional spaces. Two items appeared on my radar this week:

  • A blog: The Reinforcement Learning Blog

    and a paper entitled:

  • Where neuroscience and dynamic system theory meet autonomous robotics: A contracting basal ganglia model for action selection. by B. Girard, Nicolas Tabareau, Quang Cuong Pham, Alain Berthoz, Jean-Jacques Slotine. The abstract reads:
    Action selection, the problem of choosing what to do next, is central to any autonomous agent architecture. We use here a multi-disciplinary approach at the convergence of neuroscience, dynamical system theory and autonomous robotics, in order to propose an efficient action selection mechanism based on a new model of the basal ganglia. We first describe new developments of contraction theory regarding locally projected dynamical systems. We exploit these results to design a stable computational model of the cortico-baso-thalamo-cortical loops. Based on recent anatomical data, we include usually neglected neural projections, which participate in performing accurate selection. Finally, the efficiency of this model as an autonomous robot action selection mechanism is assessed in a standard survival task. The model exhibits valuable dithering avoidance and energy-saving properties, when compared with a simple if-then-else decision rule.
    Wednesday, February 20, 2008

    Compressed Sensing: More on Deterministic Subsampling.


    The current crop of deterministic schemes devised to perform compressed sensing measurements or to subsample below the Nyquist rate is listed here. To that we have to add the upcoming paper of Yves Meyer and Basarab Matei. Improving on a previous algorithm, Mark Iwen and Craig Spencer have just released a new preprint entitled Improved Bounds for a Deterministic Sublinear-Time Sparse Fourier Algorithm. The abstract reads:
    This paper improves on the best-known runtime and measurement bounds for a recently proposed Deterministic sublinear-time Sparse Fourier Transform algorithm (hereafter called DSFT). In [1], [2], it is shown that DSFT can exactly reconstruct the Fourier transform (FT) of an N-bandwidth signal f, consisting of B ≪ N non-zero frequencies, using O(B² · polylog(N)) time and O(B² · polylog(N)) f-samples. DSFT works by taking advantage of natural aliasing phenomena to hash a frequency-sparse signal's FT information modulo O(B · polylog(N)) pairwise coprime numbers via O(B · polylog(N)) small Discrete Fourier Transforms. Number theoretic arguments then guarantee the original DFT frequencies/coefficients can be recovered via the Chinese Remainder Theorem. DSFT's usage of primes makes its runtime and signal sample requirements highly dependent on the sizes of sums and products of small primes. Our new bounds utilize analytic number theoretic techniques to generate improved (asymptotic) bounds for DSFT. As a result, we provide better bounds for the sampling complexity/number of low-rate analog-to-digital converters (ADCs) required to deterministically recover frequency-sparse wideband signals via DSFT in signal processing applications [3], [4].

    This is an improvement over the previous algorithm presented here by Mark Iwen in A Deterministic Sub-linear Time Sparse Fourier Algorithm via Non-adaptive Compressed Sensing Methods
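    The aliasing-plus-CRT mechanism is easy to demonstrate for a single frequency. Below is a toy sketch (mine, not the paper's algorithm, which handles B frequencies, coefficients and noise): subsampling by N/p aliases the frequency to its residue mod p, a small FFT reads that residue off, and the Chinese Remainder Theorem recombines the residues.

        import numpy as np
        from math import prod

        def crt(residues, moduli):
            # Chinese Remainder Theorem for pairwise coprime moduli.
            N = prod(moduli)
            return sum(r * (N // p) * pow(N // p, -1, p)
                       for r, p in zip(residues, moduli)) % N

        moduli = [7, 11, 13]          # pairwise coprime, N = 1001
        N = prod(moduli)
        omega = 123                   # the unknown frequency to recover
        x = np.exp(2j * np.pi * omega * np.arange(N) / N)

        residues = []
        for p in moduli:
            sub = x[::N // p]         # p equispaced samples: omega aliases to omega mod p
            residues.append(int(np.argmax(np.abs(np.fft.fft(sub)))))

        print("recovered:", crt(residues, moduli), "true:", omega)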

    Also in line with probabilistic schemes to help your average LAPACK library (see Random Projection Lapack?), Mark Iwen and Craig Spencer are also making available A Note on Compressed Sensing and Rapid Matrix Multiplication, for the case when you know the result of a matrix multiplication will yield a sparse matrix. That is not often the case, but it is a step in the right direction. The abstract reads:
    This paper discusses how compressed sensing methods can be used to (approximately) multiply two square matrices quickly if the product is known to be sparse. Let A and B denote square N × N matrices, ε > 0, and c > 0. If A · B is compressible in each column, we can obtain a near-optimal best cN^0.29462 element-per-column approximation to A · B using O(N^(2+ε)) operations. More specifically, if each column of the product A · B has at most cN^0.29462 non-zero elements, then we can calculate the product A · B exactly by using O(N^(2+ε)) operations.



    Credit photo: NASA/JPL-Caltech/Cornell University, View from Spirit's Overwintering Position, photo taken Jan. 29, 2008.

    Thursday, December 27, 2007

    Building a Roaming Dinosaur: Why is Policy Learning Not Used ?


    Dubai is going to host a Jurassic Park. It is about time: the current displays of animatronics are very underwhelming and certainly do not yield the type of magic moment shown on Laura Dern's face in Jurassic Park. Yet, all over the world, kids and families line up and pay from $10 to $100 for the privilege of being wowed. The most interesting feature of the Dubai 'Restless Planet' undertaking will be the claimed ability of the dinosaurs to roam. This is no small feat, as none of the current animatronics can do that. I have a keen interest in this, as you have probably noticed from the different entries on muscles, scaling and various autonomous robotics undertakings.

    So over the years, I have kept an eye on the current understanding of dinosaur gait. Examples on the web can be found here and here. One sentence seems to be a pretty good summary:

    When the American Museum of Natural History wanted to create a digital walking Tyrannosaurus rex for a new dinosaur exhibit, it turned to dinosaur locomotion experts John Hutchinson and Stephen Gatesy for guidance.

    The pair found the process humbling.

    With powerful computers and sophisticated modeling software, animators can take a pile of digital bones and move them in any way they want. This part is easy; but choosing the most likely motion from the myriad of possibilities proved difficult...

    The researchers think that one way to narrow down the possibilities of dinosaur movement is to use more rigorous physical constraints in computer models. These constraints fall into two broad categories: kinematic (motion-based) and kinetic (force-based). One simple kinematic constraint, for example, is that the ankle and knee cannot bend backwards.
    So in short, it is one thing to know the skeleton; it is another to devise how the movement goes. Yet the current methods devised to figure out gait rely on not-so-sophisticated techniques. As it turns out, we have a similar problem in robotics and machine learning. Because robots are becoming increasingly complex, there need to be new methods of collecting data and summarizing them in what are called 'policies'. New methods are able to learn behaviors for robots with many degrees of freedom through some type of supervised learning. Some of the techniques include Non-Negative Matrix Factorization (NMF), diffusion processes and some of the techniques we tried in our unsuccessful attempt in DARPA's race.

    [ Update: a dinosaur finding showing preserved organic parts shows us that basing our intuition on bones alone is not enough. It looks as though dinosaurs may have been much larger. On a different note, it is one thing to model human behavior (and by extension dinosaur behavior) using differential equations, but the problem you are really trying to solve is: given a certain behavior, how can it fit the model set forth by the differential equations? This is what is called an inverse problem, and while a set of differential equations may give you the sentiment that you are modeling everything right, they are generally simplifications of the real joint behavior and its interaction with the environment (soil, ...). In short, to give a sense of realness, you have to go beyond a description with differential equations alone. For this reason, building a real roaming dinosaur needs the type of undertaking mentioned above in this entry. ]

    Saturday, November 03, 2007

    DARPA Urban Challenge: It's today

    We did not qualify for the semi-finals but others did. DARPA has finally selected the teams that will compete in the Urban Challenge Final Event on Saturday, November 3 at the former George Air Force Base in Victorville, Calif. The teams will attempt to complete a complex 60-mile urban course with live traffic in less than six hours. The finalists will operate on the course roads with approximately 50 human-driven traffic vehicles. Speed is not the only factor in determining the winners, as vehicles must also meet the same standards required to pass the California DMV road test. The Urban Challenge Final Event is to take place today (November 3) and will be webcast live at http://www.grandchallenge.org starting at 7:30 am PT. Grounds will open at 6:00 AM PT for spectators, but the opening ceremony will begin at 7:30 AM, and vehicles will begin to launch at 8:00 AM.

    Watch the Race here, Status Board is here, Live Video is here, Tracking Map is here.

    A brief highlight video of National Qualifying Event (semi-final) operations is available for download here: High (27MB) Med (7.5MB) Low (3MB)

    Vehicles are being tested in three test areas to evaluate their ability to operate with live traffic, make safe left turns across moving traffic, and pull out at T-intersections with cars arriving from both directions. Vehicles also have to follow narrow winding roads, avoiding parked cars and other obstructions, maneuver into a designated parking spot and negotiate 4-way intersections and road-blocks.

    List of teams that did qualify for the Final Event can be found here.
    List of teams that did not qualify for the Final Event can be found here.

    I personally will not watch; I am still sore from this.

    Tuesday, September 18, 2007

    Imaging from the sky: When You Become The Map

    There is a new call from the HASP folks for new payloads to be flown next year on a NASA high altitude balloon. The deadline is December 18, 2007, and it is directed toward undergraduate projects.
    From the HASP website:

    September 17, 2007: HASP CALL FOR PAYLOADS 2007-2008 RELEASED: The HASP Call for Payloads 2007-2008 (CFP) has been released and application materials are now available on the HASP website “Participant Info” page. Student groups interested in applying for a seat on the September 2008 flight of HASP should download these materials and prepare an application. New for this year is an increase in the allowed weight of the student payloads. Small class payloads can now mass up to 3 kilograms and large class payloads can weigh as much as 20 kilograms. Applications are due December 18, 2007 and selections will be announced by mid-January 2008.


    The photos below and sideways are a 10 percent composite of several photos taken at 30,000 feet, with a 3x optical zoom, at 500 mph. The speed makes it very unlikely to get any good details without some type of processing. And so, for the time being, imaging the ground with some precision with a point-and-shoot camera seems feasible only for payloads on balloons.
    Compared to satellite imagery, one of the interesting capabilities is the ability to remove the effect of clouds when possible. In satellite imagery, cameras use pushbroom technology, where the imager is a line of pixels (not a square die). One consequence is the inability to photograph the same object twice in one sweep. Using off-the-shelf cameras on much slower balloons allows one to obtain multiple images of the same object at different angles. This is important when one wants to evaluate whether an object is noise or not.

    Chris Anderson, author of The Long Tail, mentioned a different approach by Pict'Earth: taking images from the sky using UAVs and patching them into Google Earth. This is interesting, but as I have mentioned before, when you take enough images, you don't need Google Earth; you don't need the headache of re-projecting these images onto some map (even though it looks easier with Yahoo Map Mixer for small images), because you are the map. No need for IMUs or GPS instrumentation. This is clearly an instance of advances in stitching algorithms removing hardware requirements on the sensors. As for the current results Chris is getting from PTGui, I am pretty sure the Autopano folks will enable orthographic projection soon in order to cater to that market. With balloons, the view is from very far, so the patching algorithm has no problem stitching images together. In the case of UAVs, you need the orthographic projection.

    Eventually, two other issues become tremendously important (especially in the context of Search And Rescue). Cameras and memory are getting cheaper, and one is faced with gigabytes of data to store, map and share. Our experience is that sharing becomes challenging when you go over 2 GB of data, mostly because of file format limits (2 GB). Zoomify is interesting, but they need to figure out a way to deal with larger images. While Autopano allows images taken at different times to be overlaid on each other (a very nice feature), the viewer might be interested in this time information. Right now I know of no tool that allows one to switch back and forth between different times for the same map.

    References:

    1. Comparing Satellite Imagery and GeoCam Data
    2. A 150-km panoramic image of New Mexico

    Sunday, August 26, 2007

    Hard Problems: Walking Dinosaurs wearing make-up while sleeping.

    I am always a little astonished by things that I do not see implemented because they are too hard. Yet I don't see them even being attempted in the first place, even though there is a very large market for each of them. Here they are:
    • How come we don't have a Jurassic Park with walking dinosaurs? Every time an animatronics show comes to town, you have lines of kids waiting to see those things, even if the time spent waiting is longer than the time spent watching, and yet we still don't have walking dinosaurs (except when a human is inside). How come? (My interest here lies in muscles, autonomy.) It looks as though we have only recently been able to devise their gait. Some people are already making a business case that building them will get people to come.
    • Knowing that some women spend as much as two hours every day doing their make-up, how come there is no make-up robot for women? (autonomous) This is all the more interesting given that much technology goes into changing the face/shape of women in magazines. How come there isn't a similar technology to evaluate whether the make-up is good enough? Think Snow White's mirror.
    • People spend an average of 8 hours sleeping, yet there is no really good technology to improve sleep. How come there isn't an autonomous pillow that shapes itself around one's head over the course of the night? And since a 32 GB SD card allows people to record entire sleeping patterns for over 8 hours, what is the software that will check whether a pattern is a good one or a detrimental one?

    Saturday, August 11, 2007

    A bridge too far. It's over.

    As hinted at previously and on John Wiseman's blog, we are not semi-finalists in the Urban Challenge. We did not make that bridge. Thank you to all the well-wishers. At some point, we'll try to write and talk more about the system of omnidirectional cameras using compressed sensing and dimensionality reduction in policies that we intended to demonstrate during the site visit but did not have time to.



    In the meantime, our payload on HASP was integrated. We are looking forward to a second flight of GeoCam and a new flight of Hyper-GeoCam, a Hyperspectral Random Lens Imager (following in the footsteps of the MIT Random Lens Imaging system by Rob Fergus, Antonio Torralba, and William Freeman).

    Monday, July 09, 2007

    Adding Search and Rescue Capabilities (part I): Using Hyperspectral and Multispectral Cameras to Search for Non-Evading Targets

    In the search for the Tenacious, I mentioned the possibility of using hyperspectral or multispectral cameras on board current satellites to see if there was a way to locate it. The big challenge resides in the resolution. Most of these cameras have a resolution coarser than, say, either the Tenacious or any medium sailing boat (i.e., one pixel is larger than the size of the boat). Hence the generic expectation is that one cannot locate a boat using them. These cameras are also mostly used for other purposes, such as environmental studies, and access is rightfully restricted to a small circle of experts. Because of that, there is a large amount of convincing to do in order to gain access to that imagery. The underlying reasoning as to why we could, in effect, discriminate between something that is interesting and something that is not can be put into two categories:
    • The boat and its wake span a large part of the pixel, and using a few bands one can see a large difference between a man-made object and the sea. In other words, the underlying scene is very sparse, and one can in effect very rapidly detect interesting artifacts. This is a little bit like superresolution.
    • In some cameras, like Hyperion on EO-1 or MERIS (Envisat), there are 250 spectral channels. Even if the spatial resolution is coarse, we are bound to see something different when using 250 bands (as opposed to the traditional three color bands), especially against a very uniform background (the sea). Techniques such as the ones developed by Mauro Maggioni and Ronald Coifman should be evaluated for that purpose.

    Recall that the idea is to produce a map of what is potentially interesting, not an exact match of where the target is. A second step dealing with data fusion is responsible for eliminating the false positives given information from other sensors. With the help of Lawrence Ong, Sorin Popescu and I showed that you could see boats with Landsat 7. This is new, but it falls into the first category highlighted above. The second category has not been investigated as far as I know; maybe it should be. There are three categories of targets/signatures that should be investigated:

    • During the Tenacious search, a false positive was detected in the form of a sighting of a green dye. These dyes are generally part of "distress kits" and are used whenever a boat wants to make it clear it has a problem. While it was a false positive for other reasons, I had a discussion with the EO-1 folks (at JPL and Goddard) who mentioned that producing ground truth data with green dye and Hyperion could probably lead to a capability similar to the one we currently have for detecting volcanoes. In other words, produce a calibration formula to be fed to EO-1 so that in the future its autonomous detection capability can notify the ground that green dye has been detected over a specific area. Since one can schedule imagery on EO-1 and figure out Envisat's data gathering cycle, this could be done as a small private endeavor.
    • Another signature of interest is that of the boat as produced on the camera. If it is a boat or a plane, it is very likely that it has been imaged before by the same camera over the same harbour or airport at some other time. But a signature is not really that important per se: a large signal over background noise on some channels should be enough to find that boat/plane. In the case of the plane, the signature may be more interesting, as the background is generally cluttered.
    • In order to verify the ability to find current boats at sea, one could try to locate boats currently involved in much-advertised journeys or races. One could find out from the current stock of Envisat and EO-1 photos whether boats like the Schooner Anne can be located. That boat is part of a 1000 days at sea journey, and they have a map of their location day after day. The boat is a schooner (about 120 feet long).


    Another item that would have sped up the search is the ability to query simultaneously different databases on the availability of hyperspectral or multispectral images from different platforms. The USGS and ESA platforms are very nice, but merging them into one search would have been a nice time saver. I am also pretty sure there are other Earth Observation platforms from Asia (India in particular) that could have been used, had I known about them. Yet I cannot find anywhere on the web a catalog of civilian hyperspectral or multispectral imagers on current satellites.


    Finally, let us recall that doing this can help us locate hard cases like the Tenacious, but it may also help us in a totally different endeavor. As one can see from the extraordinary effort of the Coast Guard for the Tenacious, one boat can consume a large amount of manpower. Let us imagine a case where you have to track 8,000 targets lost at sea.

    In the search for Cessna N2700Q, the Civil Air Patrol tried the new ARCHER system, without success. And it looks like this is happening on other searches as well, as some people are doubting its capability for Search and Rescue operations.
    As indicated by Bradley,

    CAP forum posts indicate ARCHER requires a very narrow search area to be of much use. Our problem is that we're not sure where this Cessna pilot went after he dropped below radar (N34° 48' 7", W111° 56' 52").
    This is the same problem that arises for EO-1: the swath of interest is generally very narrow compared to the size of the problem. We should probably think of a way of integrating compressed sensing into current hyperspectral imagery to increase the field of view. Let us recall that one of the reasons this would be interesting is that these systems are there to point out major differences from the background; they are not there to produce very nice imagery.


    If any of these items are of interest to you, please contact me. I am particularly interested in people (from Asia, Europe or the U.S.) who have direct access to this type of imagery, so we can test some of what is said in this entry.

    [ If you think this subject is important and should be studied, I would be very happy to help. Do not hesitate to contact me. ]

    Thursday, June 21, 2007

    On-going concern


    It is not a fair picture of Pegasus Bridge 1, as it is obviously missing the light bulbs. DARPA will visit us next week. As usual, we are going through some intense last-minute implementation and preparation. One of the challenges is finding enough people to show up at the visit (all of us have day jobs or cannot make it to Texas in one day), as we are required to have five people on site.

    Friday, May 11, 2007

    DARPA Urban Challenge: Obstacle avoidance - Isolating important cues

    I have had requests from several parties to give an overview of the types of obstacle avoidance schemes that might be most promising. Right now, we (Pegasus) are still evaluating some of these, so this entry should not be construed as part of our algorithm-unveiling entries but rather as a general overview we did a while back. It is important to realize that the main contribution of this entry is really about defining a hardware + software solution for localizing cues that will later be learned/categorized. One large component of an obstacle avoidance solution is the machine learning/statistical device used to rapidly identify these cues as problematic for the autonomous vehicle or not. This is not unlike the human cortex (see the reference section: [1] [2] [3]).

    In the case of the Urban Challenge, we are facing not only stopped obstacles but moving ones as well. The moving obstacles have behaviors that one needs to learn from too. In other words, in the case of vision, a lot of the work boils down to producing a small number of cues/features from a very large set of data (the pixels of an image). In some areas of computer science this is called dimensionality reduction.

    1. Stereo-imaging:
      1. The fly algorithm: a robust stereo algorithm using a real-time genetic algorithm (yes, there is such a thing as a real-time genetic algorithm!) that has been tried on cars, and specifically has been used to avoid people and other objects. The initial thesis describing the algorithm is in French. Improvements since the thesis have focused on the car driving experience.
      2. There are also numerous commercial solutions, as listed by the folks at V2_lab's, where they discuss each of them. I found this entry pretty revealing about the state of affairs with regard to stereovision; you have to look at the comment section:
        For most stereo matching algorithms the Firewire cameras produce higher quality uncompressed images that do not wreak havoc on sensitive feature detectors. I use the Unibrain Fire-I board camera http://www.unibrain.com/index.html with the CMU 1394 Digital Camera API (Windows), which gives you very complete control and works with just about any Firewire camera, because they all use the same standard interface. http://www.cs.cmu.edu/~iwan/1394/ . When I read the technical reports for the 2005 DARPA Grand Challenge almost every report showed pictures of vehicles equipped with stereo pairs of cameras, but at the race just about all of them had been removed, presumably because of basic issues such as camera synchronization.


    2. Monocular evaluation of the distance field. There are two approaches that caught our attention:
      1. Vision-based Motion Planning for an Autonomous Motorcycle on Ill-Structured Road, Dezhen Song, Hyun Nam Lee, Jingang Yi and Anthony Levandowski from the bike entry at the last Grand Challenge, and
      2. Depth Estimation Using Monocular and Stereo Cues / 2197, Ashutosh Saxena, Jamie Schulte, Andrew Y.Ng and Learning Depth from Single Monocular Images, Ashutosh Saxena, Sung Chung, and Andrew Y. Ng. In NIPS 18, 2006. [ps, pdf]
      3. (With regard to monocular information, we should not forget the excellent MonoSLAM: this website has a Matlab implementation of SLAM using monocular vision. There is also a timely thesis on the subject here, which appears to use two cameras, each implementing the MonoSLAM algorithm.)
    3. A Random Lens Imager: a hardware implementation of a totally new concept in data processing known as compressed sensing (don't ask anybody around you about it because it is too new). It needs only one camera, but much of the work goes into the calibration.

    References:

    [1] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio. Object recognition with cortex-like mechanisms. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 29 (3), pp. 411-426 , 2007

    [2] T. Serre. Learning a dictionary of shape-components in visual cortex: Comparison with neurons, humans and machines, PhD Thesis, CBCL Paper #260/MIT-CSAIL-TR #2006-028, Massachusetts Institute of Technology, Cambridge, MA, April, 2006
    (Page 154-163 have the model parameters).

    [3] T. Serre, L. Wolf and T. Poggio. Object recognition with features inspired by visual cortex. In: Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), IEEE Computer Society Press, San Diego, June 2005

    Extended paper: [CBCL Paper #243/AI Memo #2004-026] and the code is here.

    Monday, May 07, 2007

    DARPA Urban Challenge: Unveiling our algorithm, Part Three, Building the Motor/Perceptual models



    As a small team, we had to make choices that are reflected in our choice of hardware and algorithms. One of the most time-intensive aspects of devising a robot is the development of an accurate motion model and then an adequate sensor/perceptual model [1].

    We could not devote an entire team and the time to devise a motor model as well as a sensor model, as other teams have done. Some of this stems from the fact that we are not interested in getting M.S. theses written for the moment, as is the case here. Furthermore, our vehicle is far from protected from the elements, and we considered that it would become difficult to rely on a model that could become less and less accurate as the car got older. Another aspect of our decision came from the actuators we used. In the case of the steering wheel, there are moments when the steering motor can slip (especially in large deviations). This is owed to two factors: the laptop, busy doing other tasks, can sometimes send the steering command with a delay, or there is too much torque to be applied too fast. This results in command control that would be difficult to model. Finally, we were interested in developing a robot that would learn its own behavior while being supervised with very little labeled data. In other words, we want the learning to be based on data coming from actual driving behavior; not much data understanding should come from any subsequent data processing.

    Our motor model relies on our ability to drive the car around through the keyboard of the command laptop. This supervised learning of the motor model was built by producing a probability distribution of the next state of the car as a function of the previous state and the commands sent by the "supervisor". We used a variety of techniques to build this probability distribution, including the simple but robust Experimental Probabilistic Hypersurface devised at SCM [2], as well as subjective mapping representation techniques (by Michael Bowling, Ali Ghodsi and Dana Wilkinson: Subjective Localization with Action Respecting Embedding, and [3], [4], [5], [6]).
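    To make this concrete, here is a toy version of such an empirical motor model; it is my own simplification for illustration, not the Experimental Probabilistic Hypersurface or the action-respecting-embedding machinery cited above. Triples (state, command, next state) are logged while a human drives, and the counts become a conditional distribution that can be sampled during planning.

        import random
        from collections import Counter, defaultdict

        class EmpiricalMotorModel:
            # P(next_state | state, command) estimated from logged driving.
            def __init__(self):
                self.counts = defaultdict(Counter)

            def observe(self, state, command, next_state):
                self.counts[(state, command)][next_state] += 1

            def sample(self, state, command):
                c = self.counts[(state, command)]
                if not c:
                    return state              # unseen pair: assume no change
                states, weights = zip(*c.items())
                return random.choices(states, weights=weights)[0]

        model = EmpiricalMotorModel()
        # States and commands would be coarse discretizations of the car's pose.
        model.observe(("lane", 0), "steer_left", ("lane", -1))
        model.observe(("lane", 0), "steer_left", ("lane", -1))
        model.observe(("lane", 0), "steer_left", ("lane", 0))   # occasional slip
        print(model.sample(("lane", 0), "steer_left"))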



    In the latter case, we decided against the SDE approach and instead used Compressed Sensing for the dimensionality reduction aspect of building the sensor model. Some aspects of the reduction include some of the ideas in [7].

    References:
    [1] D. Fox, J. Hightower, L. Liao, D. Schulz, and G. Borriello., Bayesian Filtering for Location Estimation, IEEE Pervasive Computing, 2003.

    [2] Robust Mathematical Modeling, SCM, Experimental Probabilistic Hypersurfaces

    [3] [pdf] [ps.gz] Subjective mapping, Michael Bowling, Dana Wilkinson, and Ali Ghodsi.
    In New Scientific and Technical Advances in Research (NECTAR) of the Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI) , pages 1569--1572, 2006.

    [4] [pdf] [ps.gz] Subjective localization with action respecting embedding, Michael Bowling, Dana Wilkinson, Ali Ghodsi, and Adam Milstein. In Proceedings of the International Symposium of Robotics Research (ISRR) , 2005.

    [5] [pdf] Action respecting embedding, Michael Bowling, Ali Ghodsi, and Dana Wilkinson.
    In Proceedings of the Twenty-Second International Conference on Machine Learning (ICML) , pages 65--72, 2005.

    [6] [pdf] [ps.gz] Learning subjective representations for planning, Dana Wilkinson, Michael Bowling, and Ali Ghodsi. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI) , pages 889--894, 2005.

    [7] INRIA :: [inria-00117116, version 1] Real-time Vehicle Motion Estimation Using Texture Learning and Monocular Vision

    Tuesday, May 01, 2007

    Small summary of what DARPA's Urban Challenge entails

    The Technical Evaluation criteria list the following items as part of what an autonomous vehicle involved in the DARPA Urban Challenge race should be able to handle, and how to handle them, during the race and in some of the pre-qualifying stages:

    A. Basic Navigation
    • A.1. Preparation for run
    • A.2. Mission start
    • A.3. Checkpoints
    • A.4. Stay in lane
    • A.5. Speed limits
    • A.6. Excess delay
    • A.7. Collisions
    • A.8. Stop line
    • A.9. Vehicle Separation
    • A.10. Leaving lane to pass
    • A.11. Returning to lane after pass
    • A.12. U-turn
    B. Basic Traffic
    • B.1. Basic Navigation
    • B.2. Intersection precedence
    • B.3. Minimum following distance
    • B.4. Queueing
    C. Advanced Navigation
    • C.1. Basic Traffic
    • C.2. Obstacle field
    • C.3. Parking lot
    • C.4. Dynamic re-planning
    • C.5. Road following
    • C.6. GPS outage
    D. Advanced Traffic
    • D.1. Advanced Navigation
    • D.2. Merge
    • D.3. Vehicle separation during merge
    • D.4. Left turn
    • D.5. Vehicle separation during left turn
    • D.6. Zones
    • D.7. Emergency braking
    • D.8. Defensive driving
    • D.9. Traffic jam


    The current rules are here. The site visit is scheduled to occur in San Antonio. The RNDF file describes the road network to the vehicle. The MDF is the file that describes what is expected of the vehicle for its mission. There is no mention in the MDF of potential obstacles, traffic or GPS outages (as generally seen in cities with large buildings like Paris or New York, a phenomenon called the canyon effect). The expectation is for the autonomous vehicle to follow the rules to get out of these unexpected problems and accomplish its mission within the allotted time. DARPA decides on May 10th which teams to invite to the site visits.


    Technical Note: A parser for the RNDF and MDF can be found here.
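    For the curious, the waypoint portion of such a file is simple enough to read by hand. Here is a hypothetical, minimal reader covering only lines of the form 'segment.lane.point latitude longitude'; the field layout is my assumption of a small subset, and the parser linked above handles the full format.

        import re

        WAYPOINT = re.compile(r"^(\d+)\.(\d+)\.(\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)")

        def read_waypoints(path):
            # Map (segment, lane, point) -> (lat, lon), skipping other records.
            waypoints = {}
            with open(path) as f:
                for line in f:
                    m = WAYPOINT.match(line.strip())
                    if m:
                        seg, lane, pt = (int(g) for g in m.group(1, 2, 3))
                        waypoints[(seg, lane, pt)] = (float(m.group(4)),
                                                      float(m.group(5)))
            return waypoints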

    Tuesday, April 24, 2007

    DARPA Urban Challenge: The basics

    One of the earliest milestones that an autonomous car needs to reach for the competition is the ability to follow GPS waypoints. Since I don't have much time to detail Kalman filtering using GPS and an IMU, I will point to the explanation and implementation of software that does just that for the Texas A&M University autonomous car, which was never entered in the competition. Two theses were written on the subject:

    One was written by James Massey on Control and way point navigation of an autonomous ground vehicle
    Abstract: This thesis describes the initial development of the Texas A&M Autonomous Ground Vehicle test platform and waypoint following software, including the associated controller design. The original goal of the team responsible for the development of the vehicle was to enter the DARPA Grand Challenge in October 2005. A 2004 Ford F150 4x4 pickup was chosen as the vehicle platform and was modified with a 6” suspension lift and 35” tires, as well as a commercial drive-by-wire system. The waypoint following software, the design of which is described in this thesis, is written in C and successfully drives the vehicle on a course defined by GPS waypoints at speeds up to 50 mph. It uses various heuristics to determine desired speeds and headings and uses control feedback to guide the vehicle towards these desired states. A vehicle dynamics simulator was also developed for software testing. Ultimately, this software will accept commands from advanced obstacle avoidance software so that the vehicle can navigate in true off-road terrain.
    It is available at: http://handle.tamu.edu/1969.1/3862

    The other one was written by Graig Odom and is entitled Navigation solution for the Texas A&M autonomous ground vehicle

    Abstract: The need addressed in this thesis is to provide an Autonomous Ground Vehicle (AGV) with accurate information regarding its position, velocity, and orientation. The system chosen to meet these needs incorporates (1) a differential Global Positioning System, (2) an Inertial Measurement Unit consisting of accelerometers and angular-rate sensors, and (3) a Kalman Filter (KF) to fuse the sensor data. The obstacle avoidance software requires position and orientation to build a global map of obstacles based on the returns of a scanning laser rangefinder. The path control software requires position and velocity. The development of the KF is the major contribution of this thesis. This technology can either be purchased or developed, and, for educational and financial reasons, it was decided to develop instead of purchasing the KF software. This thesis analyzes three different cases of navigation: one-dimensional, two dimensional and three-dimensional (general). Each becomes more complex, and separating them allows a three step progression to reach the general motion solution. Three tests were conducted at the Texas A&M University Riverside campus that demonstrated the accuracy of the solution. Starting from a designated origin, the AGV traveled along the runway and then returned to the same origin within 11 cm along the North axis, 19 cm along the East axis and 8 cm along the Down axis. Also, the vehicle traveled along runway 35R which runs North-South within 0.1°, with the yaw solution consistently within 1° of North or South. The final test was mapping a box onto the origin of the global map, which requires accurate linear and angular position estimates and a correct mapping transformation.
    http://handle.tamu.edu/1969.1/4244
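    The heart of both theses is the Kalman filter's predict/update cycle. Here is a minimal one-dimensional position/velocity sketch that fuses an IMU acceleration (predict step) with a GPS fix (update step); the noise values are illustrative guesses of mine, not the vehicle's actual parameters.

        import numpy as np

        dt = 0.1
        F = np.array([[1, dt], [0, 1]])       # constant-velocity dynamics
        B = np.array([[0.5 * dt**2], [dt]])   # how acceleration enters the state
        H = np.array([[1.0, 0.0]])            # GPS measures position only
        Q = 0.05 * np.eye(2)                  # process noise
        R = np.array([[4.0]])                 # GPS variance (~2 m sigma)

        x, P = np.zeros((2, 1)), np.eye(2)
        rng = np.random.default_rng(0)
        true_pos, true_vel = 0.0, 0.0
        for _ in range(100):
            accel = 0.2                       # commanded acceleration
            true_vel += accel * dt
            true_pos += true_vel * dt
            # Predict with the (noisy) IMU reading.
            x = F @ x + B * (accel + rng.normal(0, 0.1))
            P = F @ P @ F.T + Q
            # Update with the (noisy) GPS fix.
            z = np.array([[true_pos + rng.normal(0, 2.0)]])
            K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
            x = x + K @ (z - H @ x)
            P = (np.eye(2) - K @ H) @ P
        print("estimate:", x.ravel(), "truth:", true_pos, true_vel)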

    I'll mention later how our vehicle differs from this approach.

    Sunday, April 08, 2007

    DARPA Urban Challenge: Unveiling our algorithm

    In a previous entry, I mentioned that we were unveiling our strategy for our entry in the DARPA Urban Challenge (the DARPA Urban Challenge is about driving at a record pace in an urban environment past many urban difficulties, including loss of GPS capability). That was not accurate: in fact, we really are going to unveil our algorithm. You'll be privy to the development quirks and everything that goes into implementing an algorithm that has to respond on-line to a challenging situation. I'll talk about the history of why we chose specific algorithms over others. I will specifically talk more about manifold-based models for decision making in the race, and about the use of techniques devised to store previous actions in order to produce some sort of supervised learning capability. In the previous race, we were eliminated early, mostly because we were plagued with mechanical problems that most of us had never faced before (none of us has a robotics background); we hope to go farther this time, as the vehicle is OK. For reference, we have already shown some of our drive-by-wire program before. We made some of our data available before, and I expect, as time permits, to do the same as we go along. Because our entry is truly innovative, we are trying to balance not getting eliminated (by passing every step of the application) with innovation in the algorithm. However, since none of us is interested in just an autonomous car, our emphasis will always be on doing something that most other entries are not attempting, such as using compressed sensing and robust mathematical techniques.
