DeepWalk ([login to view URL]) is a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We hypothesize that DeepWalks performance could be significantly improved by implementing the solution on a GPU.
We are seeking an individual willing to implement a C/C++/Cuda implementation of DeepWalk on a Nvidia GPU that supports CUDA 3.5 or higher.
Requirements:
Implement a class to perform the deep walk algorithm that implements the following methods.
void beforeRun(); // The method that will be called before any iterations start.
void afterRun(); // The method that will be called after all iterations are done.
void beforeIteration(); // The method that will be called just before an iteration starts.
void afterIteration(); // The method that will be called just after an iteration completes.
void update(); // The method in which vertex updates need to happen. Will be called on every iteration for every interval.
bool hasConverged(); // Should return true if the algorithm converged during execution.
void assembleAndSaveFullResultSet(); // Will be called at the end of the execution to assemble results to files that can be sent back for visualization.
Input File Format:
All files should be in binary format.
The graph is divided into sets of sequentially numbered vertices, called intervals. All of the edges with a destination in a given interval are collected into a shard associated with that interval. The edges are then sorted by their source and within that by their destination. The shards are then broken down by the interval of the source vertices to make windows. This is the Parallel Sliding Windows scheme from [login to view URL]~pavlo/courses/fall2013/static/papers/[login to view URL]
Each window file contains the adjacency lists for all of the vertices from one interval into another interval. The lists are stored sequentially in the following format. The first value is L, the length of the adjacency list. The next L values are the destination vertices of the associated edges. If a vertex has no adjacencies in the given window, then L is zero.
There are two additional meta-data files. The interval file contains the first and last vertex for each of the intervals. The degree file contains the in and out degrees of each vertex.
Output File Format:
The output file will contain d floating point values for each vertex. The first d values will represent the first vertex in d-dimensional space, and so on.
If you are interested in responding please answer the following with you quote.
- success of the development effort and testing depends on an understanding of the algorithm, please explain the deepwalk algorithm and provide examples
- What CUDA libraries have you used?
- Please describe your development environment: which Nvidia GPU do you plan to use for development and unit test of your solution?
- Give examples of your experience with CUDA coding?
I have a bachelor in CS from the American University in Cairo and a minor in Mathematics. I have worked for the past year in Microsoft's Advanced Technology Lab in Cairo (ATLC) as a research assistant in the Computer Vision team. I have worked on various machine learning problems such as object detection and scene recognition and I am an IEEE published author.
I have worked extensively on deep learning before and contributed to caffe package on github.
$2.222 USD en 30 días
4,8 (27 comentarios)
5,7
5,7
5 freelancers están ofertando un promedio de $2.044 USD por este trabajo
Hi,
I had around 8 years of industrial experience in the field of FPGAs and computer programming. I had worked on several projects related to GPUs, Network programming, and Parallel programming etc. I can do this project for you.
Kind Regards,