Jort Gemmeke just sent me the following:
Hi Igor,

I finally found time to update my website (www.amadana.nl). There are two new papers which are probably of interest for this community. The first is "Compressive Sensing for Missing Data Imputation in Noise Robust Speech Recognition", which is a more detailed description of my work on compressive sensing for imputing missing data in speech.

In "Noise robust exemplar-based connected digit recognition" we use the same concept of representing speech as a sparse linear combination of exemplars, but in this work we model the noisy speech as a linear combination of both speech and noise exemplars. Rather than use the activations to do source separation, we do classification directly on the clean speech exemplar activations. Interestingly, although not discussed in detail in the paper, we obtained much better results using a sparse solver which minimizes the Kullback-Leibler divergence than with the customary sparse solvers which minimize the Euclidean distance.

Cheers,
Jort

The emphasis was mine; I am sure that at some point we will see more of these KL-based solvers. The two papers are: Compressive Sensing for Missing Data Imputation in Noise Robust Speech Recognition by Jort Florent Gemmeke, Hugo Van hamme, Bert Cranen, Lou Boves. The abstract reads:

An effective way to increase the noise robustness of automatic speech recognition is to label noisy speech features as either reliable or unreliable (missing), and to replace (impute) the missing ones by clean speech estimates. Conventional imputation techniques employ parametric models and impute the missing features on a frame-by-frame basis. At low SNRs these techniques fail, because too many time frames may contain few, if any, reliable features. In this paper we introduce a novel non-parametric, exemplar-based method for reconstructing clean speech from noisy observations, based on techniques from the field of Compressive Sensing. The method, dubbed sparse imputation, can impute missing features using larger time windows such as entire words. Using an overcomplete dictionary of clean speech exemplars, the method finds the sparsest combination of exemplars that jointly approximate the reliable features of a noisy utterance. That linear combination of clean speech exemplars is used to replace the missing features. Recognition experiments on noisy isolated digits show that sparse imputation outperforms conventional imputation techniques at SNR = −5 dB when using an ideal 'oracle' mask. With error-prone estimated masks, sparse imputation performs slightly worse than the best conventional technique.

and Noise robust exemplar-based connected digit recognition by Jort Gemmeke and Tuomas Virtanen. The abstract reads:

This paper proposes a noise robust exemplar-based speech recognition system where noisy speech is modeled as a linear combination of a set of speech and noise exemplars. The method works by finding a small number of labeled exemplars in a very large collection of speech and noise exemplars that jointly approximate the observed speech signal. We represent the exemplars using mel energies, which allows modeling the summation of speech and noise, and estimate the activations of the exemplars by minimizing the generalized Kullback-Leibler divergence between the observations and the model. The activations of the speech exemplars are used directly for recognition. This approach proves to be promising, achieving up to 55.8% accuracy at a signal-to-noise ratio of −5 dB on the AURORA-2 connected digit recognition task.
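Since the KL-based solver is the interesting bit here, below is a minimal sketch of what such an activation estimate can look like: the generalized Kullback-Leibler divergence D(y || Ax) is minimized with the standard multiplicative updates familiar from KL-based NMF. The dictionary, sparsity weight and iteration count are illustrative assumptions on my part, not the authors' actual setup.

```python
import numpy as np

# A minimal sketch, not the authors' code: estimate nonnegative exemplar
# activations x by minimizing the generalized Kullback-Leibler divergence
# D(y || A x) with the standard multiplicative updates used in KL-based NMF.
# The sparsity weight 'lam', the iteration count and the toy dictionary are
# illustrative assumptions, not values from the paper.

def kl_activations(A, y, n_iter=200, lam=0.1, eps=1e-12):
    """A: (n_features, n_exemplars) nonnegative dictionary of exemplars.
       y: (n_features,) nonnegative observation (e.g. mel energies).
       Returns a nonnegative activation vector x."""
    n_exemplars = A.shape[1]
    x = np.full(n_exemplars, 1.0 / n_exemplars)
    denom = A.sum(axis=0) + lam + eps        # lam acts as an L1-style sparsity cost
    for _ in range(n_iter):
        ratio = y / (A @ x + eps)            # element-wise y / (A x)
        x *= (A.T @ ratio) / denom           # multiplicative KL update
    return x

# Toy usage: 40 mel bands, 500 speech and noise exemplars.
rng = np.random.default_rng(0)
A = rng.random((40, 500))
y = A[:, 3] + 0.5 * A[:, 250]                # mixture of two exemplars
x = kl_activations(A, y)
print(np.argsort(x)[-2:])                    # indices of the largest activations
```

The appeal of this kind of update is that it keeps the activations nonnegative by construction, which matches the additive mel-energy model of speech plus noise.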
You probably recall the paper entitled Sparse Event Detection in Wireless Sensor Networks using Compressive Sensing; there is now a video presentation by one of the authors, and it is in Chinese.
As I mentioned a while back, I am very much interested in a compressive sensing system that does EEG so I was a little taken aback when I noticed a paper I had not seen before: Quantifying the Feasibility of Compressive Sensing in Portable Electroencephalography Systems by Amir M. Abdulghani, Alexander J. Casson, Esther Rodriguez-Villegas. The abstract reads:
The EEG for use in augmented cognition produces large amounts of compressible data from multiple electrodes mounted on the scalp. This huge amount of data needs to be processed, stored and transmitted and consumes large amounts of power. In turn this leads to physically large EEG units with limited lifetimes which limit the ease of use, robustness and reliability of the recording. This work investigates the suitability of compressive sensing, a recent development in compression theory, for providing online data reduction to decrease the amount of system power required. System modeling which incorporates a review of state-of-the-art EEG-suitable integrated circuits shows that compressive sensing offers no benefits when using an EEG system with only a few channels. It can, however, lead to significant power savings in situations where more than approximately 20 channels are required. This result shows that the further investigation and optimization of compressive sensing algorithms for EEG data is justified.
The result for 20 channels springs from applying a compressive sensing design such as the one shown above. The question I have is really how we can change this design entirely: for instance, instead of applying a simple Φ measurement to the signals after they have been amplified and digitized, what about providing some randomness in the reference voltage? What about removing both op-amps entirely?
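For concreteness, here is a minimal sketch of the baseline stage being questioned above: a fixed random ±1 matrix Φ applied on-node to each digitized EEG block, so that only m ≪ n numbers per block need to be stored or transmitted. Block length, channel count and compression ratio are made-up values, not figures from the paper.

```python
import numpy as np

# A minimal sketch, under illustrative assumptions, of the "simple Phi
# measurement" stage: each digitized EEG block x is multiplied on-node by a
# fixed random +/-1 matrix Phi so that only m << n numbers per block leave
# the sensor. Dimensions below are made up, not taken from the paper.

rng = np.random.default_rng(1)
n_channels, n = 32, 512                      # samples per block, per channel
m = 128                                      # measurements per block (4x reduction)

Phi = rng.choice([-1.0, 1.0], size=(m, n))   # fixed Bernoulli sensing matrix

eeg_block = rng.standard_normal((n_channels, n))   # stand-in for digitized EEG
measurements = eeg_block @ Phi.T                   # shape (n_channels, m)

print(measurements.shape)                    # (32, 128): what is stored/transmitted
```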
While we are on the subject of RTE, I found the following intriguing presentation: Sparse nonlinear approximation of functions in two dimensions by sums of exponential functions by Marcus Carlsson, which led me to Axel Osses's publication page and the following: Finite Rank Approximation based method for solving the RTE in stellar atmospheres and application to an inverse problem by Olivier Titaud, Axel Osses.
In other news, the googlebot found the following paper:
Sampling Rate Reduction for 60 GHz UWB Communication Using Compressive Sensing by Jia (Jasmine) Meng, Javad Ahmadi-Shokouh, Husheng Li, E. Joe Charlson, Zhu Han, Sima Noghanian, and Ekram Hossain. The abstract reads:
60 GHz ultra wide-band (UWB) communication is an emerging technology for high speed short range communications. However, the requirement of high-speed sampling increases the cost of receiver circuitry such as analog-to-digital converter (ADC). In this paper, we propose to use a compressive sensing framework to achieve a significant reduction of sampling rate. The basic idea is based on the observation that the received signals are sparse in the time domain due to the limited multipath effects at 60 GHz wireless transmission. According to the theory of compressive sensing, by carefully designing the sensing scheme, sub-Nyquist rate sampling of the sparse signal still enables exact recovery with very high probability. We discuss an implementation for a low-speed A/D converter for 60 GHz UWB received signal. Moreover, we analyze the bit error rate (BER) performance for BPSK modulation under RAKE reception. Simulation results show that in the single antenna pair system model, sampling rate can be reduced to 2.2% with 0.3 dB loss of BER performance if the input sparsity is less than 1%. Consequently, the implementation cost of ADC is significantly reduced.
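To make the sub-Nyquist idea concrete, here is a small sketch under made-up dimensions: a signal that is sparse in the time domain is measured with a random matrix and recovered with a greedy solver. Orthogonal Matching Pursuit is used purely as a generic stand-in; the paper's actual sensing and recovery schemes may well differ.

```python
import numpy as np

# A minimal sketch (made-up dimensions, not the paper's design): a time-sparse
# received signal x is observed through m << n random measurements y = Phi x
# and recovered greedily with Orthogonal Matching Pursuit.

def omp(Phi, y, k):
    """Recover a k-sparse x from y = Phi x by Orthogonal Matching Pursuit."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(Phi.T @ residual)))   # most correlated column
        support.append(j)
        coeffs, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coeffs        # re-fit on current support
    x = np.zeros(Phi.shape[1])
    x[support] = coeffs
    return x

rng = np.random.default_rng(2)
n, m, k = 1000, 100, 5                       # Nyquist-rate length, measurements, multipaths
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x_true
x_hat = omp(Phi, y, k)
print(np.linalg.norm(x_hat - x_true))        # near zero when recovery succeeds
```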
I also found the following preprints on arXiv: Training Over Sparse Multipath Channels in the Low SNR Regime by Elchanan Zwecher, Dana Porrat. The abstract reads:
Training over sparse multipath channels is explored. The energy allocation and the optimal shape of training signals that enable error-free communications over unknown channels are characterized as a function of the channels' statistics. The performance of training is evaluated by the reduction of the mean square error of the channel estimate and by the decrease in the uncertainty of the channel. A connection between the entropy of the wideband channel and the required energy for training is shown. In addition, there is a linkage between the sparsity and the entropy of the channel and the number of required channel measurements when the training is based on compressed sensing. The ability to learn the channel from few measurements is connected to the low entropy of sparse channels, which enables training in the low SNR regime.
Convex Feasibility Programming for Compressed Sensing by Avishy Carmi, Pini Gurfil. The abstract reads:
We present a computationally-efficient method for recovering sparse signals from a series of noisy observations, known as the problem of compressed sensing (CS). CS theory requires solving a convex constrained minimization problem. We propose to transform this optimization problem into a convex feasibility problem (CFP), and solve it using subgradient projection methods, which are iterative, fast, robust and convergent schemes for solving CFPs. As opposed to some of the recently-introduced CS algorithms, such as Bayesian CS and gradient projections for sparse reconstruction, which become prohibitively inefficient as the problem dimension and sparseness degree increase, the newly-proposed methods exhibit a marked robustness with respect to these factors. This renders the subgradient projection methods highly viable for large-scale compressible scenarios.
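As a rough illustration of the convex feasibility viewpoint (not the authors' implementation), one can alternate subgradient projections onto a data-fidelity set and an ℓ1-ball; the dimensions and parameters below are arbitrary assumptions.

```python
import numpy as np

# A minimal sketch, under assumptions not taken from the paper: treat sparse
# recovery as a convex feasibility problem and alternate subgradient
# projections onto the fidelity set {x : ||Ax - y||^2 <= eps} and the
# sparsity set {x : ||x||_1 <= tau}.

def subgrad_project(x, f_val, grad):
    """One subgradient projection step onto {x : f(x) <= 0}."""
    if f_val <= 0:
        return x                                    # already inside the set
    return x - (f_val / (np.dot(grad, grad) + 1e-12)) * grad

def cfp_recover(A, y, tau, eps=1e-6, n_iter=2000):
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        r = A @ x - y
        x = subgrad_project(x, np.dot(r, r) - eps, 2 * A.T @ r)    # fidelity set
        x = subgrad_project(x, np.abs(x).sum() - tau, np.sign(x))  # l1-ball set
    return x

rng = np.random.default_rng(3)
n, m, k = 200, 80, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = 1.0
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true
x_hat = cfp_recover(A, y, tau=np.abs(x_true).sum())
print(np.linalg.norm(x_hat - x_true))        # toy check; more iterations may be needed
```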
Credit: NASA, Spirit, Navigation Camera, Sol 2156. As you all know, the big news Tuesday was that Spirit is likely not moving any time soon.