tag:blogger.com,1999:blog-6141980.comments2018-12-18T13:58:47.832-06:00Nuit BlancheIgorhttp://www.blogger.com/profile/17474880327699002140noreply@blogger.comBlogger1517125tag:blogger.com,1999:blog-6141980.post-11114875786006156992018-05-01T09:56:17.069-05:002018-05-01T09:56:17.069-05:00Just to say where the variable precision comes fro...Just to say where the variable precision comes from. The Walsh Hadamard transform maps points to certain patterns, and can map those patterns back to the original points again, since it is self inverse.<br />With random projections based on the WHT you get similar properties except they are not generally self inverse, so you have to construct an inverse if you need it.<br />Anyway certain patterns can focus to a point in the output, allowing you to set whatever value you like there. With incomplete patterns (doing a dimension increase) you can only focus specific values in some places in the output, and in other places you get only a low precision approximation of the wanted value.<br />Anyway it is all just the linear algebra of under-determined systems.<br /><br />You could put a non-linearity at the output of the random projection after the dimension increase. Using say a signed square function y=x*x x>0, y=-x*x x<=0 you would get somewhat sparse synthesized weights with a spiky type of distribution. 
Or the signed square root (similar idea) has attractor states of +1,-1 if you think the synthesized weights should have a soft binary type distribution.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6141980.post-54042577408286788732018-05-01T00:48:48.592-05:002018-05-01T00:48:48.592-05:00If what they are doing is variable precision weigh...If what they are doing is variable precision weight sharing:<br /><br />https://randomprojectionai.blogspot.com/2018/02/neural-network-weight-sharing-using.html<br /><br />Then it would be very interesting to figure out an algorithm to know which weights were being set very exactly to high precision and which ones to low precision. You can think of schemes where you alternate the backpropagation through 2 different random projections then look later where they agree to high precision on some weights and disagree on others.<br />I presume other people think it is a very significant paper? Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6141980.post-52937100768058875102018-04-27T13:47:40.389-05:002018-04-27T13:47:40.389-05:00Well, this is mainly the work of my PhD student, V...Well, this is mainly the work of my PhD student, Vincent Schellekens. 
But thank you for the post!!JackDhttps://www.blogger.com/profile/14728376685283207988noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-57481468020649726342018-04-18T15:15:38.155-05:002018-04-18T15:15:38.155-05:00Thanks, Nuit!Thanks, Nuit!Cun Muhttps://www.blogger.com/profile/04865641672431083968noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-80789143665042749382018-02-14T04:34:38.450-06:002018-02-14T04:34:38.450-06:00All the registration info is here: htt...All the registration info is here: https://www.meetup.com/Paris-Machine-learning-applications-group/events/241149416/Igorhttps://www.blogger.com/profile/17474880327699002140noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-44175179632755434212018-02-14T04:30:54.047-06:002018-02-14T04:30:54.047-06:00Where will this event take place?Where will this event take place?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6141980.post-87135115121772221652018-02-07T16:03:40.834-06:002018-02-07T16:03:40.834-06:00It seems that was an early/incomplete draft which ...It seems that was an early/incomplete draft which was also updated quickly after several days.<br />It is now much more thorough than it was initially, suggesting either the guy didn't know how Arxiv works in the first place or somehow it got published before he (or whoever was responsible for publishing it) realized it!<br />Anyway, the latest version looks promising IMHO though. Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6141980.post-6782081726706041972018-01-30T17:43:20.758-06:002018-01-30T17:43:20.758-06:00One other thing I should mention is information lo...One other thing I should mention is information loss by non-linear functions. A linear transformation can always be organized to leave the information content almost completely intact. 
With standard floating point operations (IEEE) any non-linear (activation) function will always cause information loss (i.e. the first derivative will be very coarse):<br /><br />https://www.spsc.tugraz.at/system/files/Geiger_InfoLossStatic.pdf<br /><br />Layer after layer the non-linearity in a deep neural network compounds. And at each layer a significant amount of information about the input is lost. After a number of layers the network ends up on a set trajectory: the input information is no longer available (has been washed out) to cause further switching between decision regions. <br /><br />In the system described maybe the random pathways through the network allow information about the input to pass through less hindered than if all the weights had been organized.<br /><br />SeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-36166645609672676372018-01-29T17:43:53.271-06:002018-01-29T17:43:53.271-06:00I think the idea with generalization is to make th...I think the idea with generalization is to make the decision regions as large as possible consistent with the training data. Tishby noted a diffusion process occurring in neural networks trained by backpropagation leading to compression. <br />https://youtu.be/bLqJHjXihK8<br />I presume that diffusion process increases the size of the decision regions maximally.<br /><br />I suppose just having a lot of random weights around automatically causes some broadening of the decision regions. Anyway it is interesting information for me. It provides some justification for evolving subsets of weights in a neural network rather than trying to evolve them all at once. <br />There is an interesting question whether or under what circumstances evolution can provide a similar diffusion process to backpropagation. 
<br /> SeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-52504343890977950092018-01-12T00:32:40.660-06:002018-01-12T00:32:40.660-06:00Off topic, somewhat:
https://github.com/S6Regen/Da...Off topic, somewhat:<br />https://github.com/S6Regen/Data-Reservoir-AI<br />It does make heavy use of random projections. SeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-56968162608666517542017-12-18T14:43:14.426-06:002017-12-18T14:43:14.426-06:0022/12 : first day of vacations. I hesitate between...22/12 : first day of vacations. I hesitate between crowded malls and this workshop... No moreLaurent Duvalhttps://www.blogger.com/profile/05343286920474971263noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-30865629846668953292017-12-17T22:07:11.990-06:002017-12-17T22:07:11.990-06:00Will this be streamed or a video taken for later d...Will this be streamed or a video taken for later dissemination?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6141980.post-63276383419939607192017-11-03T19:24:51.492-05:002017-11-03T19:24:51.492-05:00Offtopic: Chaos cancelling neural network ensemble...Offtopic: Chaos cancelling neural network ensembles.<br />The idea is that the non-linearities in deep neural networks compound (exponentially) layer after layer. A recent paper shows single pixel attacks of deep networks that would support the idea of bifurcations along 1 or several dimensions. Implying chaos theory applies.<br />By using ensembles of diverse neural networks you should be able to cancel out chaotic responses to low level Gaussian noise according to the central limit theorem.<br /><br />It shouldn't add too much extra computational burden because if you train the ensemble collectively you still get a chaos cancelling effect using individual networks with fewer weight parameters each. <br />https://groups.google.com/forum/#!topic/artificial-general-intelligence/itUghRNZWN8Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6141980.post-4587269790536423902017-09-30T14:18:19.880-05:002017-09-30T14:18:19.880-05:00Kevin,
The videos of either Mark Davenport or Emm...Kevin,<br /><br />The videos of either Mark Davenport or Emmanuel Candes do a pretty good job I think. <br /><br />Cheers,<br /><br />Igor.Igorhttps://www.blogger.com/profile/17474880327699002140noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-85223273754953120602017-09-30T11:57:54.689-05:002017-09-30T11:57:54.689-05:00Thanks Igor,
I actually did take a look at that ...Thanks Igor, <br /><br />I actually did take a look at that page. Your page here (https://nuit-blanche.blogspot.com/p/teaching-compressed-sensing.html) is an excellent resource for approaching this topic, providing further reading for people of all levels of mathematical sophistication. Will poke around on that page to see if there is something that I can sink my teeth into. Although I remember a bit of linear algebra, probability, and signal processing from my electrical engineering college days, that was a long time ago (Rice '78).<br /><br />Best,<br />KevinKevin Finchhttps://www.blogger.com/profile/12292237902904082859noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-65992644295006482062017-09-29T18:05:10.794-05:002017-09-29T18:05:10.794-05:00Hi Kevin,
Maybe this explanation will help ( http...Hi Kevin,<br /><br />Maybe this explanation will help ( https://www.quora.com/What-is-compressed-sensing-compressive-sampling-in-laymans-terms/answer/Igor-Carron ) . The number of defective items (1 ball) is small (sparse) compared to the total number of balls (12).<br /><br />Hope this helps,<br /><br />Igor. Igorhttps://www.blogger.com/profile/17474880327699002140noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-89712336441039018122017-09-29T16:34:27.637-05:002017-09-29T16:34:27.637-05:00This took me the better part of a day to figure ou...This took me the better part of a day to figure out as well. Was about to give up when I stumbled on the trick. Better late than never. <br /><br />Thanks for the link to the article, since I was wondering the same thing: How does a puzzle like this relate to Compressed Sensing? Given that my BSEE was a long time ago I doubt that I will be able to understand it, but will do my best. (A shoutout to Steve Hsu's Information Processing blog for stimulating my interest in this topic.)<br /><br />See you soon.Kevin Finchhttps://www.blogger.com/profile/12292237902904082859noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-89308302685217488792017-09-29T00:44:45.508-05:002017-09-29T00:44:45.508-05:00I mentioned over at the Numenta forum that one per...I mentioned over at the Numenta forum that one perspective on deep neural networks views them as pattern based fuzzy logic.<br />https://discourse.numenta.org/t/artificial-life-concept/2308/11<br />In particular in higher dimension the dot product weighting function used in neural nets acts as a selective filter. 
Or you can say the dot product weighting produces a low magnitude output for most any random input and only a small number of select input vectors will produce a high magnitude output.<br />https://www.cs.princeton.edu/courses/archive/fall14/cos521/lecnotes/lec11.pdf<br />Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6141980.post-5061841268433299022017-08-31T19:16:04.190-05:002017-08-31T19:16:04.190-05:00https://github.com/S6Regen/EvoNethttps://github.com/S6Regen/EvoNetSeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-16692102985063282022017-06-29T09:34:06.866-05:002017-06-29T09:34:06.866-05:00Maybe subrandom sampling can help with compressive...Maybe subrandom sampling can help with compressive sensing, perhaps being better than purely random sampling:<br />https://en.wikipedia.org/wiki/Low-discrepancy_sequence<br />I suppose there is a good chance it has already been investigated. <br />SeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-35430551569191075822017-06-24T18:08:58.622-05:002017-06-24T18:08:58.622-05:00Not at the moment, it looks like.
Igor.Not at the moment, it looks like.<br /><br />Igor.Igorhttps://www.blogger.com/profile/17474880327699002140noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-51100357864574434072017-06-24T14:08:07.014-05:002017-06-24T14:08:07.014-05:00Are slides available for this talk?Are slides available for this talk?Gokulhttps://www.blogger.com/profile/14250642559129421797noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-23978504194151106062017-06-04T07:52:01.408-05:002017-06-04T07:52:01.408-05:00There has been a lack of discussion about binariza...There has been a lack of discussion about binarization in neural networks. Multiplying those +1/-1 values by weights and summing allows you to store values with a high degree of independence. For a given binary input and target value you get an error. You divide the error by the number of binary values and then you simply correct each of the weights by the reduced error, taking account of the binary sign. That gives a full correction to get the correct target output. In higher dimensional space most vectors are nearly orthogonal. For a different binary input the adjustments you made to the weights will not align at all. In fact they will sum to Gaussian noise by the central limit theorem. The value you previously stored for the second binary input will now be contaminated by a slight amount of Gaussian noise, which you can correct for. This will now introduce an even smaller amount of Gaussian noise on the value for the first binary input. Iterating back and forth will get rid of the noise entirely for both binary inputs. <br />This has high use in random projection, reservoir and extreme learning machine computing.SeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-73081378628405822532017-05-30T19:15:38.529-05:002017-05-30T19:15:38.529-05:00Tested in code:
http://www.freebasic.net/forum/vie...Tested in code:<br />http://www.freebasic.net/forum/viewtopic.php?f=7&t=25710<br />Conclusion: Very good<br /><br />The idea is very applicable to locality sensitive hashing as well.SeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.comtag:blogger.com,1999:blog-6141980.post-74043354810802289852017-05-30T01:29:57.391-05:002017-05-30T01:29:57.391-05:00It would seem you could fit about 100 million inte...It would seem you could fit about 100 million integer add/subtract logic units on a current semiconductor die. Clock them at 1 billion operations per second and you have 100 Peta operations per second available for "no multiply" nets. <br />https://discourse.numenta.org/t/no-multiply-neural-networks/2361<br />SeanVNhttps://www.blogger.com/profile/05967727000105480078noreply@blogger.com
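
The store-and-correct scheme from the 2017-06-04 binarization comment above can be sketched roughly as follows. This is only a minimal illustration under assumptions, not the commenter's own code: the function names `recall` and `store`, the vector length of 1024, the random seed and the two target values are all made up for the example. Since the inputs are +1/-1, the weighted sum needs only adds and subtracts. Alternating the correction between two inputs looks like an instance of the Kaczmarz method for solving linear systems.

```python
import random

def recall(weights, bits):
    """Weighted sum of +1/-1 inputs; only adds and subtracts are needed."""
    return sum(w if b > 0 else -w for w, b in zip(weights, bits))

def store(weights, bits, target):
    """Divide the error by the number of inputs and apply it to every
    weight with the sign of the matching input (hypothetical helper)."""
    step = (target - recall(weights, bits)) / len(bits)
    for i, b in enumerate(bits):
        weights[i] += step if b > 0 else -step

# Illustrative setup (assumed values, not from the comment).
rng = random.Random(0)
n = 1024
a = [rng.choice((-1, 1)) for _ in range(n)]
b = [rng.choice((-1, 1)) for _ in range(n)]
weights = [0.0] * n

# One pass: storing b slightly disturbs the value stored for a.
store(weights, a, 3.0)
store(weights, b, -2.0)
err_after_one_pass = abs(recall(weights, a) - 3.0)

# Iterating back and forth shrinks the cross-talk on both values.
for _ in range(20):
    store(weights, a, 3.0)
    store(weights, b, -2.0)
err_final = max(abs(recall(weights, a) - 3.0),
                abs(recall(weights, b) + 2.0))
print(err_after_one_pass, err_final)
```

A single call to `store` makes the recalled value match its target, because the sum of `step` over all `n` inputs is the whole error. The leftover error on the other input depends on how far the two random sign vectors are from orthogonal, which is small on average in high dimension.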