I just found out about a draft version of "Foundations of Data Science" a new book by John Hopcroft and Ravindran Kannan and very much like chapter 7 through 10 that touches upon themes we talk about often here on Nuit Blanche: Compressive Sensing ( see also The Big Picture in Compressive Sensing) , Advanced Matrix Factorization (see also the Advanced Matrix Factorization Jungle Page), complexity, streaming/sketching issues and Randomized Numerical Linear Algebra, Machine Learning and more. Let us note that compressive sensing is introduced at the very end, much like the other foundation book on signal processing this week (Books: "Foundations of Signal Processing" and "Fourier and Wavelet Signal Processing"). One wonders if a certain clarity would come earlier if the subject was introduced first so that titles such as Compressive Sensing Demystified ( by Frank Bucholtz and Jonathan M. Nichols) would not be needed. Here is the table of content for the last chapters.

7 Algorithms for Massive Data Problems238

7.1 Frequency Moments of Data Streams . . . . . . . . . . . . . . . . . . . . . 238

7.1.1 Number of Distinct Elements in a Data Stream . . . . . . . . . . . 239

7.1.2 Counting the Number of Occurrences of a Given Element. . . . . . 243

7.1.3 Counting Frequent Elements . . . . . . . . . . . . . . . . . . . . . . 243

7.1.4 The Second Moment . . . . . . . . . . . . . . . . . . . . . . . . . . 245

7.2 Matrix Algorithms Using Sampling . . . . . . . . . . . . . . . . . . . . . . 248

7.2.1 Matrix Multiplication Using Sampling . . . . . . . . . . . . . . . . 248

7.2.2 Sketch of a Large Matrix . . . . . . . . . . . . . . . . . . . . . . . . 250

7.3 Sketches of Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

7.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256

8 Clustering260

8.1 Some Clustering Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 260

8.2 A k-means Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . 263

8.3 A Greedy Algorithm for k-Center Criterion Clustering . . . . . . . . . . . 265

8.4 Spectral Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266

8.5 Recursive Clustering Based on Sparse Cuts . . . . . . . . . . . . . . . . . . 273

8.6 Kernel Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274

8.7 Agglomerative Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276

8.8 Dense Submatrices and Communities . . . . . . . . . . . . . . . . . . . . . 278

8.9 Flow Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281

8.10 Finding a Local Cluster Without Examining the Whole Graph . . . . . . . 284

8.11 Axioms for Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

8.11.1 An Impossibility Result . . . . . . . . . . . . . . . . . . . . . . . . 289

8.11.2 A Satis able Set of Axioms . . . . . . . . . . . . . . . . . . . . . . 295

8.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

9 Topic Models, Hidden Markov Process, Graphical Models, and Belief Propagation301

9.1 Topic Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301

9.2 Hidden Markov Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305

9.3 Graphical Models, and Belief Propagation . . . . . . . . . . . . . . . . . . 310

9.4 Bayesian or Belief Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 311

9.5 Markov Random Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

9.6 Factor Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

9.7 Tree Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314

9.8 Message Passing in general Graphs . . . . . . . . . . . . . . . . . . . . . . 315

9.9 Graphs with a Single Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . 317

9.10 Belief Update in Networks with a Single Loop . . . . . . . . . . . . . . . . 319

9.11 Maximum Weight Matching . . . . . . . . . . . . . . . . . . . . . . . . . . 320

9.12 Warning Propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324

9.13 Correlation Between Variables . . . . . . . . . . . . . . . . . . . . . . . . . 325

9.14 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330

10 Other Topics332

10.1 Rankings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .332

10.2 Hare System for Voting . . . . . . . . . . . . . . . . . . . . . . . . . . . . .334

10.3 Compressed Sensing and Sparse Vectors . . . . . . . . . . . . . . . . . . .335

10.3.1 Unique Reconstruction of a Sparse Vector . . . . . . . . . . . . . .336

10.3.2 The Exact Reconstruction Property . . . . . . . . . . . . . . . . . .339

10.3.3 Restricted Isometry Property . . . . . . . . . . . . . . . . . . . . .340

10.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .342

10.4.1 Sparse Vector in Some Coordinate Basis . . . . . . . . . . . . . . .342

10.4.2 A Representation Cannot be Sparse in Both Time and Frequency Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .342

10.4.3 Biological . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .345

10.4.4 Finding Overlapping Cliques or Communities . . . . . . . . . . . .345

10.4.5 Low Rank Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . .346

10.5 Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .347

10.6 Linear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .348

10.6.1 The Ellipsoid Algorithm . . . . . . . . . . . . . . . . . . . . . . . .350

10.7 Integer Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .351

10.8 Semi-De nite Programming . . . . . . . . . . . . . . . . . . . . . . . . . .352

10.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356

Related entry: A
Bird's eye view of Compressive Sensing, Advanced matrix Factorization,
Randomized Numerical Linear Algebra, Big Data and more ... )

**Join the CompressiveSensing subreddit or the Google+ Community and post there !**

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## 2 comments:

Should the link to the 'draft version' work?

Yes, it should.

Post a Comment