Thursday, March 31, 2016

Implementation of pdist in Theano

These days I came across a problem that requires computing the pairwise distances between the rows of two given matrices in Theano. SciPy provides this kind of functionality through its pdist() function (and cdist() for the two-matrix case). As far as I know, Theano has no equivalent function that covers all of the distance metrics at once. However, each specific distance is a closed-form mathematical expression, so it can be written down as a Theano expression and then compiled.

Take the Minkowski p-norm distance (wiki) as an example.
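For reference, this is the standard definition for two vectors x and y with d components and order p >= 1. The symbols x, y, d and p are my notation here, not names from the code:

```latex
d_p(x, y) = \left( \sum_{k=1}^{d} \lvert x_k - y_k \rvert^{p} \right)^{1/p}
```

With p = 1 this is the Manhattan distance, and with p = 2 the Euclidean distance.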


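import theano
import theano.tensor as T

# Symbolic inputs: two float32 matrices (one sample per row) and the norm order P
X = T.fmatrix('X')
Y = T.fmatrix('Y')
P = T.scalar('P')
# Broadcast (n, 1, d) against (1, m, d) to get every pairwise difference, shape (n, m, d)
translation_vectors = X.reshape((X.shape[0], 1, -1)) - Y.reshape((1, Y.shape[0], -1))
# Sum |difference| ** P over the feature axis, then take the P-th root, shape (n, m)
minkowski_distances = (abs(translation_vectors) ** P).sum(2) ** (1. / P)
# Compile the symbolic graph into a callable function
f_minkowski = theano.function([X, Y, P], minkowski_distances)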
Note that Python's built-in abs dispatches to the tensor's __abs__ method, so abs(translation_vectors) builds a symbolic Theano expression rather than computing a number. We can now compare this to pdist:
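As a quick sanity check that needs neither Theano nor SciPy, here is a minimal NumPy sketch of the same broadcasting trick, compared against a plain double loop. The names minkowski_numpy and minkowski_loops are mine, for illustration only. With SciPy at hand, I would expect scipy.spatial.distance.cdist(X, Y, 'minkowski', p=p) to be the closer reference, since the result here is between two matrices rather than within one.

```python
import numpy as np


def minkowski_numpy(X, Y, p):
    """Pairwise Minkowski distances between rows of X (n, d) and Y (m, d)."""
    # (n, 1, d) - (1, m, d) broadcasts to (n, m, d): every pairwise difference
    diff = X[:, None, :] - Y[None, :, :]
    # Sum |diff| ** p over the feature axis, then take the p-th root
    return (np.abs(diff) ** p).sum(axis=2) ** (1.0 / p)


def minkowski_loops(X, Y, p):
    """Slow reference version: one pair of rows at a time."""
    out = np.empty((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            out[i, j] = sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)
    return out


# Hypothetical test data: 4 and 5 random points in 3 dimensions
rng = np.random.RandomState(0)
X = rng.rand(4, 3)
Y = rng.rand(5, 3)

D = minkowski_numpy(X, Y, 3.0)
assert D.shape == (4, 5)
assert np.allclose(D, minkowski_loops(X, Y, 3.0))
```

The broadcast version builds an intermediate array of shape (n, m, d), so memory use grows with the product of both row counts and the feature dimension. The Theano expression above has the same trade-off.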
