In Machine Learning, one is often faced with using the Euclidian distance as the de-facto way of comparing objects or features. This is like comparing Apples and Oranges, little contextual side information is provided since it really depends on the application. Comparing Apples and Oranges might be Ok if you are caring about eating one fruit a day, but it might be extremely non-productive if you care about how comparing Vitamin C content between fruits without making a difference between the two. One way to do this is to construct a specific metric associated with the end goal you have in mind and create a Mahalanobis distance function. The main issue is how do you impart any type of similarity and dissimilarity between features with any type of prior knowledge of the problem at hand. Ali Ghodsi, Dana Wilkinson and Finnegan Southey have begun answering this question by building a metric based on side information in Improving Embeddings by Flexible Exploitation of Side Information. We are going to feature this algorithm today with Cable Kurwitz (as usual all the good stuff is his and the mistakes are mine). The conclusion of the article reads
Many machine learning techniques handle complex data by mapping it into new spaces and then applying standard techniques, often utilizing distances in the new space. Learning a distance metric directly is an elegant means to this end. This research shows how side information of two forms, class equivalence information and partial distance information, can be used to learn a new distance as a preprocessing step for embedding. The results demonstrate that side information allows us to capture attributes of the data within the embedding in a controllable manner. Even a small amount of side information can improve the embedding’s structure. Furthermore, the method can be kernelized and also used to capture multiple attributes in the same embedding. Its advantages over existing metric learning methods are a simpler optimization formulation that can be readily solved with off-the-shelf software and greater flexibility in the nature of the side information one can use. In all, this technique represents a useful addition to our toolbox for informed embeddings.
In order to implement this algorithm you have to have Michael Grant, Stephen Boyd, and Yinyu Ye's CVX program installed.
clear
% simple example where x1 and x2 are very
% close and x3 and x4 are very far apart
% the dimension space is 5
n =5;
x1=[1 2 3 4 5]';
x2=[0.1 2.1 12 56 100; 7 1.9 98 200 1]';
x3=[1 4 7 8 19]';
x4=[123 43 7.1 29 133]';
% computing elements in the loss function
for i=1:2
xs1(:,i)=x1-x2(:,i);
end
xt1 = zeros(n*n,1);
for i=1:2
xt1= xt1 + vec(xs1(:,i)*xs1(:,i)') ;
end
for i=1:1
xs2=x3-x4
end
for i=1:1
xt2= vec(xs2*xs2') ;
end
cvx_begin
variable A(n,n) symmetric;
minimize(vec( A )' *(xt1-xt2))
subject to
A == semidefinite(n);
trace(A) == 1;
cvx_end
% checking results
% difference between vector supposedly close to each other
for i=1:2
xr1(i)=(x1-x2(:,i))'*A*(x1-x2(:,i))
end
% results is close to zero for both distance(x1 and the
% two vectors x2)
%
% now difference between vectors that are supposedly far
% from each other
for i=1:1
(x3-x4)'*A*(x3-x4)
end
% results are very far for the two vectors x3 and x4 as
% expected.
% Actual results:
% xr1 =
%
% 2.7221e+003
%
%
%xr1 =
%
% 1.0e+003 *
%
% 2.7221 0.0067
%
%
%ans =
%
% 2.8875e+004
It's a nifty little tool. From there you can use you favorite dimensionality reduction technique.
Some other algorithms I will be featuring in the future Monday Morning Algorithm Series include: The Fast Johnson Lindenstrauss Transform, the Experimental Probabilistic Hypersurface, Diffusion methods for dimensionality reduction and others....
Credit Photo: Chinese Space Agency, Xinhua, First photo of the Moon from the Chinese made Chang'e lunar orbiter yesterday.
No comments:
Post a Comment