( Personal message: I will be at ICLR next week, let's grab some coffee if you are there. )
F2F: A Library For Fast Kernel Expansions by Joachim Curto, Irene Zarza, Feng Yang, Alexander J. Smola, Luc Van Gool
F2F is a C++ library for large-scale machine learning. It contains a CPU-optimized implementation of the Fastfood algorithm, which allows the computation of approximate kernel expansions in log-linear time. The algorithm requires computing the product of Walsh-Hadamard Transform (WHT) matrices. A cache-friendly SIMD Fast Walsh-Hadamard Transform (FWHT) that achieves compelling speed and outperforms current state-of-the-art methods has been developed. F2F allows non-linear classification to be obtained by combining Fastfood with a linear classifier.
I am told by one of the authors that the library should be out at some point.
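In the meantime, here is a rough C++ sketch of what one Fastfood block computes, to make the abstract concrete. This is not the F2F API (the names FastfoodBlock, fwht and all parameters below are mine); it assumes a power-of-two input dimension and, for brevity, drops the chi-distributed row rescaling S of the full Fastfood construction V = (1/(σ√d)) S H G Π H B.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

// In-place, unnormalized Walsh-Hadamard transform; x.size() must be a power of two.
void fwht(std::vector<float>& x) {
    const std::size_t n = x.size();
    for (std::size_t h = 1; h < n; h <<= 1)
        for (std::size_t i = 0; i < n; i += 2 * h)
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = x[j], b = x[j + h];
                x[j]     = a + b;                 // butterfly: sum
                x[j + h] = a - b;                 // butterfly: difference
            }
}

// One Fastfood block, roughly V = (1/(sigma*sqrt(d))) * H * G * Pi * H * B,
// with the chi-distributed row rescaling S left out for brevity.
struct FastfoodBlock {
    std::vector<float> B;          // random +/-1 signs (diagonal B)
    std::vector<std::size_t> Pi;   // random permutation
    std::vector<float> G;          // i.i.d. Gaussian scales (diagonal G)
    std::vector<float> phase;      // random phases for the cosine feature map
    float sigma;                   // kernel bandwidth

    FastfoodBlock(std::size_t d, float sigma_, unsigned seed)
        : B(d), Pi(d), G(d), phase(d), sigma(sigma_) {
        std::mt19937 rng(seed);
        std::bernoulli_distribution coin(0.5);
        std::normal_distribution<float> gauss(0.0f, 1.0f);
        std::uniform_real_distribution<float> unif(0.0f, 6.2831853f);
        std::iota(Pi.begin(), Pi.end(), std::size_t{0});
        std::shuffle(Pi.begin(), Pi.end(), rng);
        for (std::size_t i = 0; i < d; ++i) {
            B[i] = coin(rng) ? 1.0f : -1.0f;
            G[i] = gauss(rng);
            phase[i] = unif(rng);
        }
    }

    // phi(x)_j = sqrt(2/d) * cos((V x)_j + phase_j), the usual random-feature map.
    std::vector<float> features(std::vector<float> x) const {
        const std::size_t d = B.size();
        for (std::size_t i = 0; i < d; ++i) x[i] *= B[i];           // B: sign flips
        fwht(x);                                                    // first H
        std::vector<float> y(d);
        for (std::size_t i = 0; i < d; ++i) y[i] = x[Pi[i]] * G[i]; // Pi, then G
        fwht(y);                                                    // second H
        const float scale = 1.0f / (sigma * std::sqrt(float(d)));
        std::vector<float> phi(d);
        for (std::size_t i = 0; i < d; ++i)
            phi[i] = std::sqrt(2.0f / float(d)) * std::cos(scale * y[i] + phase[i]);
        return phi;
    }
};

int main() {
    const std::size_t d = 8;                  // must be a power of two
    FastfoodBlock ff(d, /*sigma=*/1.0f, /*seed=*/42);
    std::vector<float> x = {1, 0, -1, 2, 0.5f, 0, 0, 1};
    for (float v : ff.features(x)) std::cout << v << ' ';
    std::cout << '\n';
}

Stacking several such blocks with independent random draws gives more features, and a linear classifier on top of the feature map then yields the non-linear classification mentioned in the abstract.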
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
2 comments:
You can get 5000 65536-point 32-bit floating-point Fast Walsh-Hadamard Transforms per second on a single Intel Celeron core. They must be doing something terribly wrong.
https://drive.google.com/open?id=0BwsgMLjV0BnhMG5leTFkYWJMLU0
You should be able to get something like 75 million to 100 million 65536-point FWHTs per second on a top-of-the-range GPU.
The basic algorithm is:
sub wht(x() as single)
    'In-place, unnormalized Walsh-Hadamard transform.
    'The array length must be a power of two.
    dim as ulongint i, k
    dim as ulongint hs = 1, m = ubound(x)   'm = n-1
    dim as single a, b
    while hs <= m                           'double the butterfly span each pass
        i = 0
        while i <= m
            k = i + hs
            while i < k                     'butterfly: (a,b) -> (a+b, a-b)
                a = x(i)
                b = x(i + hs)
                x(i) = a + b
                x(i + hs) = a - b
                i += 1
            wend
            i = k + hs
        wend
        hs += hs
    wend
end sub
That should have been 75 to 100 million 256-point FWHTs per second on a single GPU.
https://estudogeral.sib.uc.pt/bitstream/10316/27842/1/Optimized%20Fast%20Walsh%E2%80%93Hadamard%20Transform.pdf
Probably about 1 to 3 million 65536-point FWHTs/sec on a single GPU. One 65536-point FWHT takes n·log2(n) = 65536 × 16 ≈ 1 million additions/subtractions, so 1 FWHT/sec needs about 1 MegaFlop/sec.
The only thing you then need for random projections (RP) is a random sign flip of the data before the FWHT. For better quality, repeat. Including random permutations could give you more entropy per step but is expensive time-wise.
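A minimal C++ sketch of that recipe, under the stated assumptions (the names rp_round and fwht are mine, the input length is a power of two, and the ±1 pattern is fixed per round and reused for every vector being projected):

#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

// In-place, unnormalized Walsh-Hadamard transform; x.size() must be a power of two.
void fwht(std::vector<float>& x) {
    const std::size_t n = x.size();
    for (std::size_t h = 1; h < n; h <<= 1)
        for (std::size_t i = 0; i < n; i += 2 * h)
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = x[j], b = x[j + h];
                x[j]     = a + b;
                x[j + h] = a - b;
            }
}

// One round of the sign-flip + WHT projection: x <- H * D * x,
// where D is a fixed random +/-1 diagonal.
void rp_round(std::vector<float>& x, const std::vector<float>& sign) {
    for (std::size_t i = 0; i < x.size(); ++i) x[i] *= sign[i];  // random sign flip (D)
    fwht(x);                                                     // fast mixing (H)
}

int main() {
    const std::size_t n = 8;
    std::mt19937 rng(123);
    std::bernoulli_distribution coin(0.5);

    // One fixed +/-1 pattern per round, reused for every vector that gets projected.
    std::vector<std::vector<float>> signs(2, std::vector<float>(n));
    for (auto& s : signs)
        for (auto& v : s) v = coin(rng) ? 1.0f : -1.0f;

    std::vector<float> x = {1, 2, 3, 4, 5, 6, 7, 8};
    for (const auto& s : signs) rp_round(x, s);   // "for better quality repeat"
    // Divide by sqrt(n) after each round if an orthonormal transform is wanted.
    for (float v : x) std::cout << v << ' ';
    std::cout << '\n';
}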
One thing you could do is make a "soft" hash table as a possible discriminator for GAN neural nets, or as soft memory for deep networks.
https://drive.google.com/open?id=0BwsgMLjV0BnhellXODdLZWlrOWc