Nuit Blanche

Hash & Select Extreme Learning Machines: https...

2025-08-19T09:48:28.732-05:00

Hash & Select Extreme Learning Machines: https://sciencelimelight.blogspot.com/2025/08/hash-select-extreme-learning-machine.html
It makes a lot of use of random projections.

it was not written by the Museum :-)

2020-07-23T07:56:12.188-05:00

it was not written by the Museum :-)

because it took jobs away from humans Was this pa...

2020-07-23T03:00:04.815-05:00

because it took jobs away from humans

Was this part written in the museum? Or, how do you know this?

E P Thompson stated his goal in writing https://archive.org/stream/makingofenglishw01thom/makingofenglishw01thom_djvu.txt was “to rescue … the Luddites … from the enormous condescension of history.”

Great stuff, thanks for posting this, Igor!

2019-11-01T16:40:15.949-05:00

Great stuff, thanks for posting this, Igor!

Super!

2019-04-11T15:25:37.543-05:00

Super!

Congrats to the LightOn team!

2018-12-21T10:10:41.660-06:00

Congrats to the LightOn team!

Just to say where the variable precision comes fro...

2018-05-01T09:56:17.069-05:00

Just to say where the variable precision comes from: the Walsh-Hadamard transform (WHT) maps points to certain patterns and, since it is self-inverse, it can map those patterns back to the original points.
With random projections based on the WHT you get similar properties, except that they are not generally self-inverse; you have to construct an inverse if you need one.
Anyway, certain patterns can focus to a point in the output, allowing you to set whatever value you like there. With incomplete patterns (doing a dimension increase) you can only focus specific values in some places in the output; in other places you get only a low-precision approximation of the wanted value.
Anyway, it is all just the linear algebra of under-determined systems.
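
A minimal sketch of the self-inverse property and of "focusing to a point", in plain Python. It assumes the orthonormal scaling of the WHT (division by sqrt(n)) and models the random projection as random sign flips followed by the WHT; both are illustrative assumptions, as are all the function names. The dimension-increase case is not covered here.

```python
import math
import random

def wht(values):
    """Orthonormal fast Walsh-Hadamard transform; length must be a power of 2."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]

def random_projection(values, signs):
    """Assumed form of the projection: random sign flips, then the WHT."""
    return wht([s * v for s, v in zip(signs, values)])

def inverse_random_projection(values, signs):
    """Explicit inverse: undo the WHT first, then undo the sign flips."""
    return [s * v for s, v in zip(signs, wht(values))]

rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(8)]
signs = [rng.choice((-1.0, 1.0)) for _ in range(8)]

# Applying the WHT twice returns the original point.
wht_round_trip = wht(wht(x))

# The sign-flipped projection needs its own constructed inverse.
rp_round_trip = inverse_random_projection(random_projection(x, signs), signs)

# A complete pattern focuses onto a single output position with a chosen value.
target = [0.0] * 8
target[3] = 5.0
pattern = wht(target)
focused = wht(pattern)
```

Here `focused` should match `target` up to rounding error: the pattern built from a single point maps back onto that point and is (numerically) zero everywhere else.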

You could put a non-linearity at the output of the random projection after the dimension increase. Using, say, a signed square function (y = x*x for x > 0, y = -x*x for x <= 0) you would get somewhat sparse synthesized weights with a spiky type of distribution. Or the signed square root (a similar idea) has attractor states of +1 and -1, if you think the synthesized weights should have a soft binary type of distribution.
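
The two functions can be sketched as follows. The sample values and the helper `iterate` are illustrative assumptions; repeated application of the signed square root is used only to show where the attractors are, not as a suggestion that it would be applied many times in practice.

```python
import math

def signed_square(x):
    """y = x*x for x > 0, y = -x*x for x <= 0."""
    return x * x if x > 0 else -x * x

def signed_sqrt(x):
    """Square root of the magnitude, keeping the sign of x."""
    return math.copysign(math.sqrt(abs(x)), x)

def iterate(f, x, steps):
    """Apply f to x repeatedly."""
    for _ in range(steps):
        x = f(x)
    return x

samples = [-2.0, -0.3, 0.05, 0.7, 3.0]

# Magnitudes below 1 shrink toward 0 and magnitudes above 1 grow,
# which gives the sparse, spiky shape.
squared = [signed_square(v) for v in samples]

# Repeated signed square roots pull every non-zero value toward +1 or -1.
settled = [iterate(signed_sqrt, v, 60) for v in samples]
```

Zero is a fixed point of both functions, so exactly-zero inputs stay at zero.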

If what they are doing is variable precision weigh...

2018-05-01T00:48:48.592-05:00

If what they are doing is variable precision weight sharing:

https://randomprojectionai.blogspot.com/2018/02/neural-network-weight-sharing-using.html

Then it would be very interesting to figure out an algorithm to know which weights were being set very exactly to high precision and which ones to low precision. You can think of schemes where you alternate the backpropagation through 2 different random projections then look later where they agree to high precision on some weights and disagree on others.
I presume other people think it is a very significant paper?

Well, this is mainly the work of my PhD student, V...

2018-04-27T13:47:40.389-05:00

Well, this is mainly the work of my PhD student, Vincent Schellekens. But thank you for the post !!

Thanks, Nuit!

2018-04-18T15:15:38.155-05:00

Thanks, Nuit!

All the information for signing up is here: htt...

2018-02-14T04:34:38.450-06:00

All the information for signing up is here: https://www.meetup.com/Paris-Machine-learning-applications-group/events/241149416/

Where will this event take place?

2018-02-14T04:30:54.047-06:00

Where will this event take place?

It seems that was an early/incomplete draft which ...

2018-02-07T16:03:40.834-06:00

It seems that was an early/incomplete draft, which was also updated quickly, after several days.
It is now much more thorough than it was initially, suggesting either that the author didn't know how arXiv works in the first place or that it somehow got published before he (or whoever was responsible for publishing it) realized it!
Anyway, the latest version looks promising, IMHO.

One other thing I should mention is information lo...

2018-01-30T17:43:20.758-06:00

One other thing I should mention is information loss by non-linear functions. A linear transformation can always be organized to leave the information content almost completely intact. With standard (IEEE) floating-point operations, any non-linear (activation) function will always cause information loss (i.e. the first derivative will be very coarse):

https://www.spsc.tugraz.at/system/files/Geiger_InfoLossStatic.pdf

Layer after layer the non-linearity in a deep neural network compounds. And at each layer a significant amount of information about the input is lost. After a number of layers the network ends up on a set trajectory, the input information is no longer available (has been washed out) to cause further switching between decision regions.

In the system described, maybe the random pathways through the network allow information about the input to pass through less hindered than if all the weights had been organized.
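
A toy illustration of the information-loss point, under the assumption that tanh stands in for the activation function and that counting distinct floating-point outputs is a fair proxy for what survives. The input grid and the scaling factor are arbitrary choices for the sketch.

```python
import math

# 100 distinct inputs, 0.0 .. 49.5
inputs = [i * 0.5 for i in range(100)]

# An invertible linear map keeps every input distinguishable.
linear = [3.0 * v for v in inputs]

# A saturating non-linearity maps many large inputs to the same float.
squashed = [math.tanh(v) for v in inputs]

distinct_inputs = len(set(inputs))
distinct_linear = len(set(linear))
distinct_squashed = len(set(squashed))
```

In double precision the larger inputs all round to the same tanh output, so `distinct_squashed` comes out smaller than `distinct_inputs`, while the linear map preserves all 100 values.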

I think the idea with generalization is to make th...

2018-01-29T17:43:53.271-06:00

I think the idea with generalization is to make the decision regions as large as possible consistent with the training data. Tishby noted a diffusion process occurring in neural networks trained by backpropagation leading to compression.
https://youtu.be/bLqJHjXihK8
I presume that diffusion process increases the size of the decision regions maximally.

I suppose just having a lot of random weights around automatically causes some broadening of the decision regions. Anyway it is interesting information for me. It provides some justification for evolving subsets of weights in a neural network rather than trying to evolve them all at once.
There is an interesting question whether or under what circumstances evolution can provide a similar diffusion process to backpropagation.

Off topic, somewhat: https://github.com/S6Regen/Da...

2018-01-12T00:32:40.660-06:00

Off topic, somewhat:
https://github.com/S6Regen/Data-Reservoir-AI
It does make heavy use of random projections.

22/12 : first day of vacations. I hesitate between...

2017-12-18T14:43:14.426-06:00

22/12 : first day of vacations. I hesitate between crowded malls and this workshop... No more

Will this be streamed or a video taken for later d...

2017-12-17T22:07:11.990-06:00

Will this be streamed or a video taken for later dissemination?

Offtopic: Chaos cancelling neural network ensemble...

2017-11-03T19:24:51.492-05:00

Offtopic: Chaos cancelling neural network ensembles.
The idea is that the non-linearities in deep neural networks compound (exponentially) layer after layer. A recent paper shows single-pixel attacks on deep networks, which would support the idea of bifurcations along one or several dimensions, implying that chaos theory applies.
By using ensembles of diverse neural networks you should be able to cancel out chaotic responses to low-level Gaussian noise, according to the central limit theorem.

It shouldn't add too much extra computational burden because if you train the ensemble collectively you still get a chaos cancelling effect using individual networks with fewer weight parameters each.
https://groups.google.com/forum/#!topic/artificial-general-intelligence/itUghRNZWN8
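
A rough sketch of the averaging argument, assuming each network's erratic response can be modeled as the true signal plus an independent zero-mean error. That independence is a strong assumption (collectively trained networks may have correlated errors), and `member_output` is a hypothetical stand-in, not a real network.

```python
import random
import statistics

rng = random.Random(42)

def member_output(signal):
    """Hypothetical stand-in for one network: signal plus an independent error."""
    return signal + rng.uniform(-1.0, 1.0)

def ensemble_output(signal, size):
    """Average the outputs of `size` independent members."""
    return sum(member_output(signal) for _ in range(size)) / size

trials = 2000
single = [member_output(0.5) for _ in range(trials)]
ensemble = [ensemble_output(0.5, 16) for _ in range(trials)]

var_single = statistics.pvariance(single)
var_ensemble = statistics.pvariance(ensemble)
```

Under the independence assumption the variance of the averaged output falls roughly as 1/N with ensemble size N, so `var_ensemble` should be about one sixteenth of `var_single` here.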

Kevin, The videos of either Mark Davenport or Emm...

2017-09-30T14:18:19.880-05:00

Kevin,

The videos of either Mark Davenport or Emmanuel Candes do a pretty good job I think.

Cheers,

Igor.

Thanks Igor, I actually did take a look at that ...

2017-09-30T11:57:54.689-05:00

Thanks Igor,

I actually did take a look at that page. Your page here (https://nuit-blanche.blogspot.com/p/teaching-compressed-sensing.html) is an excellent resource for approaching this topic, providing further reading for people of all levels of mathematical sophistication. Will poke around on that page to see if there is something that I can sink my teeth into. Although I remember a bit of linear algebra, probability, and signal processing from my electrical engineering college days, that was a long time ago (Rice '78).

Best,
Kevin

Hi Kevin, Maybe this explanation will help ( http...

2017-09-29T18:05:10.794-05:00

Hi Kevin,

Maybe this explanation will help ( https://www.quora.com/What-is-compressed-sensing-compressive-sampling-in-laymans-terms/answer/Igor-Carron ) . The number of defective items (1 ball) is small (sparse) compared to the total number of balls (12).

Hope this helps,

Igor.
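
For readers who want to see the link to compressed sensing concretely, here is a sketch that assumes the puzzle is the classic twelve-ball balance puzzle (one ball is heavier or lighter, three weighings allowed). The `PLACEMENTS` table is one possible fixed, non-adaptive weighing plan constructed for this illustration; it plays the role of a measurement matrix, and the three balance readings recover a 1-sparse signed unknown among 12 entries.

```python
# Each row says where one ball goes in the three weighings:
# +1 = left pan, -1 = right pan, 0 = set aside.
# Every weighing has four balls on each pan.
PLACEMENTS = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (1, -1, 0), (0, 1, -1), (-1, 0, 1),
    (-1, -1, 0), (0, -1, -1), (-1, 0, -1),
    (1, 1, -1), (1, -1, 1), (-1, 1, 1),
]

def weigh(odd_ball, deviation):
    """Simulate the three weighings; deviation is +1 (heavy) or -1 (light).

    Each result is +1 (left pan heavier), -1 (right pan heavier) or 0 (balanced).
    """
    return tuple(deviation * p for p in PLACEMENTS[odd_ball])

def decode(outcome):
    """Recover (ball index, deviation) from the three results."""
    flipped = tuple(-o for o in outcome)
    for ball, placement in enumerate(PLACEMENTS):
        if placement == outcome:
            return ball, +1
        if placement == flipped:
            return ball, -1
    raise ValueError("outcome not consistent with exactly one odd ball")

# Check every one of the 24 possible (ball, heavy/light) scenarios.
cases = [(ball, dev) for ball in range(12) for dev in (+1, -1)]
all_recovered = all(decode(weigh(ball, dev)) == (ball, dev) for ball, dev in cases)
```

The plan works because no row is the negative of another row, so each of the 24 scenarios produces a different triple of readings.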

This took me the better part of a day to figure ou...

2017-09-29T16:34:27.637-05:00

This took me the better part of a day to figure out as well. Was about to give up when I stumbled on the trick. Better late than never.

Thanks for the link to the article, since I was wondering the same thing: How does a puzzle like this relate to Compressed Sensing? Given that my BSEE was a long time ago I doubt that I will be able to understand it, but will do my best. (A shoutout to Steve Hsu's Information Processing blog for stimulating my interest in this topic.)

See you soon.

I mentioned over at the Numenta forum that one pers...

2017-09-29T00:44:45.508-05:00

I mentioned over at the Numenta forum that one perspective on deep neural networks views them as pattern-based fuzzy logic.
https://discourse.numenta.org/t/artificial-life-concept/2308/11
In particular, in higher dimensions the dot-product weighting function used in neural nets acts as a selective filter. Or you can say the dot-product weighting produces a low-magnitude output for almost any random input, and only a small number of select input vectors will produce a high-magnitude output.
https://www.cs.princeton.edu/courses/archive/fall14/cos521/lecnotes/lec11.pdf
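
The selective-filter effect is easy to check numerically. The sketch below assumes unit-length weight and input vectors drawn from a Gaussian and a dimension of 256; these choices and the helper names are only for illustration.

```python
import math
import random

rng = random.Random(1)

def unit_vector(dim):
    """Random direction: Gaussian components, normalized to length 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

dim = 256
weights = unit_vector(dim)
random_inputs = [unit_vector(dim) for _ in range(500)]

# Typical response magnitude to an unrelated random input.
mean_random_response = sum(abs(dot(weights, v)) for v in random_inputs) / len(random_inputs)

# Response to an input aligned with the weights.
matched_response = dot(weights, weights)
```

With unit vectors the matched response is 1, while the typical random response shrinks roughly like 1/sqrt(dim), so the gap between the two widens as the dimension grows.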

https://github.com/S6Regen/EvoNet

2017-08-31T19:16:04.190-05:00

https://github.com/S6Regen/EvoNet