Nuit Blanche

Saturday, January 07, 2012

You think peer review as currently implemented sucks ? Wait till it is the only way to publish

Aside from the egregious case of a monopoly grab that is currently underway through the proposed new "Research Works Act" bill (see the Scientific American blog here and Jonathan Eisen's post about a highly disturbing letter from the Ecological Society of America, or Michael Eisen responding to Elsevier's comments), I find it interesting to read the following (emphasis added):

(3) PRIVATE-SECTOR RESEARCH WORK- The term `private-sector research work' means an article intended to be published in a scholarly or scientific publication, or any version of such an article, that is not a work of the United States Government (as defined in section 101 of title 17, United States Code), describing or interpreting research funded in whole or in part by a Federal agency and to which a commercial or nonprofit publisher has made or has entered into an arrangement to make a value-added contribution, including peer review or editing. Such term does not include progress reports or raw data outputs routinely required to be created for and submitted directly to a funding agency in the course of research.

So the value added by a commercial or non-profit publisher is peer review ? It certainly is not, unless you are on K Street. Nobody seems to say it aloud, so let me be clear on that: the only thing a commercial or non-profit publisher does is brand an article based on free external advice.

Do you really think that a post peer-review publishing model (see The wrath of our discontent and Tim Gowers' Model of Mathematical Publishing) could ever take off and provide a different way of evaluating Science if the current monopolies and gatekeepers can pass a bill like this ? In the affirmative, wait till you see the next one.

[Update: if you think publishers are not in the branding business, here is another proof: Tee-shirts ]


Friday, January 06, 2012

Sudoku and the Donoho-Tanner Phase Transition

Dan let us know that some folks back in Ireland (There is no 16-Clue Sudoku: Solving the Sudoku Minimum Number of Clues Problem by Gary McGuire, Bastian Tugemann, Gilles Civario) have found that no Sudoku puzzle with only 16 clues has a unique solution: every such puzzle admits more than one. With 17 clues, puzzles with exactly one solution do exist. They went through an exhaustive computer search to establish that result.

The reason I mention this today is that I wonder what the generic Donoho-Tanner phase transition can bring to the table. Let us remember that the Sudoku problem was set up as the recovery of a solution to an underdetermined linear system of equations (Linear Systems, Sparse Solutions, and Sudoku by Prabhu Babu, Kristiaan Pelckmans, Petre Stoica, and Jian Li, implementation in Python available thanks to Ben Moran), i.e. a typical instance of compressive sensing. The problem was then solved using a reweighted L1 approach, but it looks like there are problems that cannot be solved with it. Since then, Gabriel Peyre and Yue Lu made another version of that solver and have a numerical tour discussing a MATLAB implementation.
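To make the "underdetermined linear system" concrete, here is a minimal sketch of one common way to write the Sudoku rules as equations on 729 binary indicators (one per cell and digit). The function names and the row ordering are my own choices for illustration, not taken from the paper or from Ben Moran's implementation, and the test grid is just a simple valid pattern.

```python
import numpy as np

N = 9  # grid size; boxes are 3 x 3


def var(r, c, d):
    """Index of the 0/1 indicator 'cell (r, c) holds digit d + 1'."""
    return (r * N + c) * N + d


def sudoku_rules_matrix():
    """Stack the cell, row, column and box rules as rows of a 0/1 matrix."""
    A = np.zeros((4 * N * N, N ** 3))
    eq = 0
    for a in range(N):
        for b in range(N):
            br, bc = 3 * (a // 3), 3 * (a % 3)  # top-left corner of box a
            for k in range(N):
                A[eq, var(a, b, k)] = 1      # cell (a, b) holds exactly one digit
                A[eq + 1, var(a, k, b)] = 1  # row a holds digit b + 1 exactly once
                A[eq + 2, var(k, a, b)] = 1  # column a holds digit b + 1 exactly once
                A[eq + 3, var(br + k // 3, bc + k % 3, b)] = 1  # box a, digit b + 1
            eq += 4
    return A


def clue_rows(clues):
    """One extra equation x[r, c, digit] = 1 per clue (r, c, digit)."""
    C = np.zeros((len(clues), N ** 3))
    for i, (r, c, digit) in enumerate(clues):
        C[i, var(r, c, digit - 1)] = 1
    return C


def encode(grid):
    """Turn a filled 9 x 9 grid into its 729-long indicator vector."""
    x = np.zeros(N ** 3)
    for r in range(N):
        for c in range(N):
            x[var(r, c, grid[r][c] - 1)] = 1
    return x


# A simple valid grid, used only to check the equations.
grid = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(N)] for r in range(N)]
A = sudoku_rules_matrix()
x = encode(grid)
clues = [(r, c, grid[r][c]) for r, c in [(0, 0), (4, 4), (8, 8)]]
M = np.vstack([A, clue_rows(clues)])

# Every equation has right-hand side 1, and x has 81 nonzeros out of 729.
print(M.shape, int(x.sum()), np.allclose(M @ x, 1))
```

The unknown vector is 81-sparse in a 729-dimensional space, and there are far fewer equations than unknowns, which is what makes this look like a compressive sensing problem.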

If I recall correctly, the Donoho-Tanner phase transition shows (see Jared Tanner's presentation at Texas A&M in 2008) that below a specific curve, there is only one sparse solution to these underdetermined systems of linear equations (see Precise Undersampling Theorems for a more in-depth look).

I haven't had time to look back at the setup of the Sudoku problem as an underdetermined system of linear equations, or at whether the solution has to be positive, but I wonder where this 16/17 result recently found through brute force fits in this diagram. Do you know ?
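As a rough first pass at that question, one can compute the usual phase-diagram coordinates, delta (equations over unknowns) and rho (nonzeros over equations), under loud assumptions: 729 binary unknowns, 324 rule equations, one extra equation per clue, and an 81-sparse solution. The function name is hypothetical, and the rule equations are known to be linearly dependent, so the raw row count overstates the number of independent measurements.

```python
def undersampling_coordinates(n_clues, n_unknowns=729, n_rules=324, n_nonzeros=81):
    """Return (delta, rho) for a Sudoku with n_clues clues.

    Assumes one equation per rule and per clue; redundant rows are
    not removed, so delta is an optimistic estimate.
    """
    n_equations = n_rules + n_clues
    delta = n_equations / n_unknowns  # undersampling ratio
    rho = n_nonzeros / n_equations    # sparsity relative to the number of equations
    return delta, rho


for n_clues in (16, 17):
    delta, rho = undersampling_coordinates(n_clues)
    print(f"{n_clues} clues: delta = {delta:.4f}, rho = {rho:.4f}")
```

With these counts the 16-clue and 17-clue points are almost on top of each other, so a single clue moves the point very little. Because the solution is also nonnegative and binary, the curve for nonnegative or bounded signals may be the more relevant one to compare against.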

CODAC: Compressive Depth Acquisition Camera

Here is another piece of compressive sensing hardware. Vivek Goyal does an excellent job at presenting the whole process in the following video. The attendant site is here.

The attendant papers are: CODAC: A Compressive Depth Acquisition Camera Framework by Ahmed Kirmani, Andrea Colaco, Franco N. C. Wong, and Vivek K Goyal. The abstract reads:

Light detection and ranging (LIDAR) systems use time of flight (TOF) in combination with raster scanning of the scene to form depth maps, and TOF cameras instead make TOF measurements in parallel by using an array of sensors. Here we present a framework for depth map acquisition using neither raster scanning by the illumination source nor an array of sensors. Our architecture uses a spatial light modulator (SLM) to spatially pattern a temporally-modulated light source. Then, measurements from a single omnidirectional sensor provide adequate information for depth map estimation at a resolution equal that of the SLM. Proof-of-concept experiments have verified the validity of our modeling and algorithms.

Also

Exploiting sparsity in time-of-flight range acquisition using a single time-resolved sensor by Ahmed Kirmani, Andrea Colaco, Franco N. C. Wong, and Vivek K Goyal. The abstract reads:

Range acquisition systems such as light detection and ranging (LIDAR) and time-of-flight (TOF) cameras operate by measuring the time difference of arrival between a transmitted pulse and the scene reflection. We introduce the design of a range acquisition system for acquiring depth maps of piecewise-planar scenes with high spatial resolution using a single, omnidirectional, time-resolved photodetector and no scanning components. In our experiment, we reconstructed 64 × 64-pixel depth maps of scenes comprising two to four planar shapes using only 205 spatially-patterned, femtosecond illuminations of the scene. The reconstruction uses parametric signal modeling to recover a set of depths present in the scene. Then, a convex optimization that exploits sparsity of the Laplacian of the depth map of a typical scene determines correspondences between spatial positions and depths. In contrast with 2D laser scanning used in LIDAR systems and low-resolution 2D sensor arrays used in TOF cameras, our experiment demonstrates that it is possible to build a non-scanning range acquisition system with high spatial resolution using only a standard, low-cost photodetector and a spatial light modulator.
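Two back-of-the-envelope numbers help put this abstract in perspective: the depth that corresponds to a given round-trip delay, and how few measurements 205 illuminations are compared to 64 × 64 pixels. The helper names below are hypothetical and the 10 ns delay is only an illustrative value, not a figure from the paper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in vacuum


def depth_from_round_trip(delay_s):
    """Depth of a reflector given the delay between emission and echo.

    The pulse travels to the scene and back, hence the factor 1/2.
    """
    return SPEED_OF_LIGHT * delay_s / 2.0


def undersampling_factor(n_pixels, n_measurements):
    """How many unknown pixels there are per measurement."""
    return n_pixels / n_measurements


depth = depth_from_round_trip(10e-9)          # illustrative 10 ns delay
factor = undersampling_factor(64 * 64, 205)   # numbers quoted in the abstract

print(f"10 ns round trip -> {depth:.3f} m")
print(f"4096 pixels from 205 illuminations -> about {factor:.1f} pixels per measurement")
```

In other words, the experiment uses roughly 5% as many illuminations as there are pixels in the reconstructed depth map.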

Compressive Fluorescence Microscopy for Biological and Hyperspectral Imaging

Here is a new hardware instance that will be listed in the compressive sensing hardware page:

Compressive Fluorescence Microscopy for Biological and Hyperspectral Imaging by Vincent Studer, Jerome Bobin, Makhlad Chahid, Hamed Moussavi, Emmanuel Candes, Maxime Dahan. The abstract reads:

The mathematical theory of compressed sensing (CS) asserts that one can acquire signals from measurements whose rate is much lower than the total bandwidth. Whereas the CS theory is now well developed, challenges concerning hardware implementations of CS-based acquisition devices---especially in optics---have only started being addressed. This paper presents an implementation of compressive sensing in fluorescence microscopy and its applications to biomedical imaging. Our CS microscope combines a dynamic structured wide-field illumination and a fast and sensitive single-point fluorescence detection to enable reconstructions of images of fluorescent beads, cells and tissues with undersampling ratios (between the number of pixels and number of measurements) up to 32. We further demonstrate a hyperspectral mode and record images with 128 spectral channels and undersampling ratios up to 64, illustrating the potential benefits of CS acquisition for higher dimensional signals which typically exhibits extreme redundancy. Altogether, our results emphasize the interest of CS schemes for acquisition at a significantly reduced rate and point out to some remaining challenges for CS fluorescence microscopy.

It may look like a crisis but it is only the end of an illusion

I took the title of this entry from Jerry Weinberg's The Secrets of Consulting. It's a great read and worth the $25 investment I made some years ago, as I keep reopening the book every once in a while. Also, it's not just about consulting. All this leads to the substance of this blog. Over time, you may have noticed a change in the frequency of blog entries. There may be more of them, but I have switched to a different order. I keep on refining the model and, while I never commit to a specific frequency, I have come to the conclusion that there are entries I enjoy and others that I want to enjoy. Entries I enjoy writing are about:

surprises

reality checks

somebody going through the risky business of making their code/implementation available with examples

instances of compressive sensing hardware

comparisons between different solvers or pieces of hardware

instances of issues that are deeply connected to making the world a better place (the calibration issue for compressive sensing hardware is one...)

how the world is inherently compressive

how compressive sensing and attendant techniques are changing the world as we know it

how the blog started conversations among y'all.

Entries I yearn to enjoy are those featuring papers:

with no illuminating figures or few usable insights.

that show scholarship but not the underlying "how to do it" bit

that will take me a long while to absorb

etc ....

I end up writing single entries for subjects I enjoy writing about, while subjects I want to enjoy may get stacked together in one long entry. The illusion is to think that all subjects are equal.

Thursday, January 05, 2012

SubMF changes to DFC, something new in ReProCS and Martin Jaggi's thesis and code.

In September it used to be called SubMF; it is now DFC (from Divide-and-Conquer Matrix Factorization by Lester Mackey, Ameet Talwalkar, Michael I. Jordan). It's been changed in the Matrix Factorization Jungle page.

The Recursive Projected Compressive Sensing implementation by Chenlu Qiu and Namrata Vaswani has new examples and a new attendant paper:

Automated Recursive Projected CS (ReProCS) for Real-time Video Layering by Chenlu Qiu and Namrata Vaswani. The abstract reads:

A large class of video sequences are composed of at least two layers – the foreground, which is a sparse image that often consists of one or more moving objects, and the background, which is a dense image, that is either constant or changes gradually over time and the changes are usually global. Thus the background sequence is well modeled as lying in a low dimensional subspace that can slowly change with time; while the foreground is well modeled as a sparse “outlier” that changes in a correlated fashion over time (e.g., due to objects’ motion). Video layering can thus be posed as a robust principal components’ analysis (PCA) problem, with the difference that the “outlier” for PCA is also a signal-of-interest and needs to be recovered too. Real-time video layering then becomes a recursive robust PCA problem. Most existing recursive, or even batch, robust PCA algorithms fail when (a) there are too many nonzero foreground pixels; or when (b) the foreground pixels are significantly correlated spatially or temporally (in almost all cases foreground objects will not jump around randomly); or when (c) the foreground intensity is quite similar to that of the background. We propose a novel solution framework called Recursive Projected Compressive Sensing (ReProCS) that ensures robustness to all three issues.

Finally, Martin Jaggi let me know his thesis is out:

hi igor: Just wanted to let you know that finally my thesis came online, just in case:

http://m8j.net/math/thesis.pdf

or alternatively here is the page (also linking to the content you already mentioned on the Matrix Factorization Jungle page. )

http://www.m8j.net/(All)My%20Thesis

Here is the thesis: Sparse Convex Optimization Methods for Machine Learning by Martin Jaggi. The abstract reads:

Convex optimization is at the core of many of today's analysis tools for large datasets, and in particular machine learning methods. In this thesis we will study the general setting of optimizing (minimizing) a convex function over a compact convex domain. In the first part of this thesis, we study a simple iterative approximation algorithm for that class of optimization problems, based on the classical method by Frank & Wolfe. The algorithm only relies on supporting hyperplanes to the function that we need to optimize. In each iteration, we move slightly towards a point which (approximately) minimizes the linear function given by the supporting hyperplane at the current point, where the minimum is taken over the original optimization domain. In contrast to gradient-descent-type methods, this algorithm does not need any projection steps in order to stay inside the optimization domain. Our framework generalizes the sparse greedy algorithm of Frank & Wolfe and its recent primal-dual analysis by Clarkson (and the low-rank SDP approach by Hazan) to arbitrary compact convex domains. Analogously, we give a convergence proof guaranteeing ε-small error (which in our context is the duality gap) after O(1/ε) iterations. This method allows us to understand the sparsity of approximate solutions for any ℓ1-regularized convex optimization problem (and for optimization over the simplex), expressed as a function of the approximation quality. Here we obtain matching upper and lower bounds of Θ(1/ε) for the sparsity. The same bounds apply to low-rank semidefinite optimization with bounded trace, showing that rank O(1/ε) is best possible here as well. For some classes of geometric optimization problems, our algorithm has a simple geometric interpretation, which is also known as the coreset concept. Here we will study linear classifiers such as support vector machines (SVM) and perceptrons, as well as general distance computations between convex hulls (or polytopes). Here the framework will allow us to understand the sparsity of SVM solutions, here being the number of support vectors, in terms of the required approximation quality.
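The iteration described in the abstract is short enough to sketch. What follows is a minimal, textbook-style Frank-Wolfe loop over the unit simplex with the standard 2/(k+2) step size. It is not Martin's implementation: the function names and the quadratic test objective are assumptions chosen for illustration.

```python
import numpy as np


def frank_wolfe_simplex(grad, n, n_iters=200):
    """Minimize a smooth convex function over the unit simplex in R^n.

    grad: callable returning the gradient at a point.
    Each step moves towards the vertex that minimizes the linearized
    objective, so no projection is ever needed.
    """
    x = np.zeros(n)
    x[0] = 1.0  # start at a vertex of the simplex
    for k in range(n_iters):
        g = grad(x)
        i = int(np.argmin(g))      # linear minimization over the simplex picks vertex e_i
        step = 2.0 / (k + 2.0)
        x *= 1.0 - step            # convex combination keeps x inside the simplex
        x[i] += step
    return x


def duality_gap(x, g):
    """Gap <x - s, g>, where s is the vertex minimizing <s, g> over the simplex."""
    return float(g @ x - g.min())


# Illustrative objective: f(x) = 0.5 * ||x - b||^2 with b inside the simplex.
n = 50
b = np.zeros(n)
b[:5] = 0.2


def grad_f(x):
    return x - b


x = frank_wolfe_simplex(grad_f, n, n_iters=200)
objective = 0.5 * float(np.sum((x - b) ** 2))
gap = duality_gap(x, grad_f(x))
print(f"objective = {objective:.2e}, duality gap = {gap:.2e}, "
      f"nonzeros = {np.count_nonzero(x)}")
```

Since each step adds at most one new vertex, the iterate after k steps has at most k + 1 nonzero entries, which is the link between the number of iterations and the sparsity of the approximate solution that the abstract refers to.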

If you recall, Martin made his implementation available on his page. Thanks Martin and Congratulations Dr. Jaggi !

Wednesday, January 04, 2012

The Self Driving RC Car

More on the project and how to build one can be found on David Singleton's blog.

Tuesday, January 03, 2012

Several ways of showing your interest

Monday, January 02, 2012

Could SOPA shut down Nuit Blanche ? and responding to OSTP's RFI

I wanted to write about this, but Dave Bacon wrote a good summary of it. The short of it is that "an accusation of alleged infringement" is good enough to shut down a service like this blog since, at some point in time, it won't matter whether publishing a copyrighted abstract or using figures from journals is really fair use. SOPA as currently drafted is badly written and will eventually get abused because of its uncertain language.

On a different level, Dave Bacon also mentions a Request for Information (RFI) from the Office of Science and Technology Policy (OSTP), which ends tonight (January 2), on Public Access to Peer-Reviewed Scholarly Publications Resulting From Federally Funded Research. The RFI is here.

Who should respond ?

".. The Task Force is now seeking additional insight from ``non-Federal stakeholders, including the public, universities, nonprofit and for- profit publishers, libraries, federally funded and non-federally funded research scientists, and other organizations and institutions with a stake in long-term preservation and access to the results of federally funded research,'' as described in Section 103(b)(6) of the ACRA. Specifically, OSTP seeks further public comment on the questions listed below, on behalf of the Task Force:.."

Essentially most of the readership of this blog. Here are the questions:

"...Specifically, OSTP seeks further public comment on the questions listed below, on behalf of the Task Force:

Are there steps that agencies could take to grow existing and new markets related to the access and analysis of peer-reviewed publications that result from federally funded scientific research? How can policies for archiving publications and making them publically accessible be used to grow the economy and improve the productivity of the scientific enterprise? What are the relative costs and benefits of such policies? What type of access to these publications is required to maximize U.S. economic growth and improve the productivity of the American scientific enterprise?
What specific steps can be taken to protect the intellectual property interests of publishers, scientists, Federal agencies, and other stakeholders involved with the publication and dissemination of peer-reviewed scholarly publications resulting from federally funded scientific research? Conversely, are there policies that should not be adopted with respect to public access to peer-reviewed scholarly publications so as not to undermine any intellectual property rights of publishers, scientists, Federal agencies, and other stakeholders?
What are the pros and cons of centralized and decentralized approaches to managing public access to peer-reviewed scholarly publications that result from federally funded research in terms of interoperability, search, development of analytic tools, and other scientific and commercial opportunities? Are there reasons why a Federal agency (or agencies) should maintain custody of all published content, and are there ways that the government can ensure long-term stewardship if content is distributed across multiple private sources?
Are there models or new ideas for public-private partnerships that take advantage of existing publisher archives and encourage innovation in accessibility and interoperability, while ensuring long-term stewardship of the results of federally funded research?
What steps can be taken by Federal agencies, publishers, and/or scholarly and professional societies to encourage interoperable search, discovery, and analysis capacity across disciplines and archives? What are the minimum core metadata for scholarly publications that must be made available to the public to allow such capabilities? How should Federal agencies make certain that such minimum core metadata associated with peer-reviewed publications resulting from federally funded scientific research are publicly available to ensure that these publications can be easily found and linked to Federal science funding?
How can Federal agencies that fund science maximize the benefit of public access policies to U.S. taxpayers, and their investment in the peer-reviewed literature, while minimizing burden and costs for stakeholders, including awardee institutions, scientists, publishers, Federal agencies, and libraries?
Besides scholarly journal articles, should other types of peer-reviewed publications resulting from federally funded research, such as book chapters and conference proceedings, be covered by these public access policies?
What is the appropriate embargo period after publication before the public is granted free access to the full content of peer-reviewed scholarly publications resulting from federally funded research? Please describe the empirical basis for the recommended embargo period. Analyses that weigh public and private benefits and account for external market factors, such as competition, price changes, library budgets, and other factors, will be particularly useful. Are there evidence-based arguments that can be made that the delay period should be different for specific disciplines or types of publications? Please identify any other items the Task Force might consider for Federal policies related to public access to peer-reviewed scholarly publications resulting from federally supported research.

Response to this RFI is voluntary. Responders are free to address any or all the above items, as well as provide additional information that they think is relevant to developing policies consistent with increased public access to peer-reviewed scholarly publications resulting from federally funded research. Please note that the U.S. Government will not pay for response preparation or for the use of any information contained in the response..."

Also

"...How To Submit a Response All comments must be submitted electronically to: publicaccess@ostp.gov. Responses to this RFI will be accepted through January 2, 2012. You will receive an electronic confirmation acknowledging receipt of your response,..."

If you submit something it will eventually be made public. ~~Today is the last day,~~ the response date has been extended till January 12, 2012.