GIRAPH:

To submit Giraph jobs, do the following.

Compile your Java code using:

    bin/hadoop

Create a jar as follows:

    jar cf sspc.jar SimpleShortestPathsComputation*.class

Now add this jar to your Hadoop classpath:

    export HADOOP_CLASSPATH="/home//sspc.jar"

Then run the following command:

    hadoop jar giraph.jar org.apache.giraph.GiraphRunner [ -D option ]* your.package.ComputationClass [ GiraphRunner option, e.g. -vip, -vif, etc. ]*

The parts of this command are:

jar jarname (giraph.jar) - jarname is the path to the jar that has your compute code.

org.apache.giraph.GiraphRunner - Helper class that runs Giraph applications by taking the actual class names to use (i.e. vertex, vertex input/output format, combiner, etc.). It is provided in giraph-core.jar; you don't need to write this class.

-D options - Command-line parameters.

your.package.ComputationClass - The fully qualified name of your compute class, e.g. org.apache.giraph.examples.SimpleShortestPathsComputation.

GiraphRunner options -

• -vif : Vertex input format (supported formats:
• -vip : Path in HDFS where the graph is stored, in the format specified by -vif
• -vof : Vertex output format (most commonly used is which is vertex followed by value; e.g. in the case of SSSP it would be the vertex ID followed by its distance from the source)
• -op : Output directory where the output is stored in the -vof format
• -w : Number of Giraph workers
• -ca giraph.userPartitionCount : Number of partitions to split the graph into. By default this number is the square of the number of workers. (Note: The default partitioner is HashPartitioner; you can find classes related to partitioning at
• -ca giraph.logLevel : Log level (INFO, WARN, ERROR, DEBUG, etc.)
• -ca giraph.checkpointFrequency : Checkpoint after the specified number of supersteps
• -ca giraph.zkList : Always set this to
• -yh : Heap memory for a single Giraph worker
• -yj : Location of the jar file you created, so that it gets distributed onto the YARN containers
• All other available options can be viewed at

For example, to run the SSSP code
provided in the Giraph examples jar (the examples jar and the Giraph core jar are at /home/hadoop27/hadoop-2.7.3/share/hadoop/yarn/lib/giraph-examples-1.2.0-for-hadoop-2.7.3-jar-with-dependencies.jar and /home/hadoop27/hadoop-2.7.3/share/hadoop/yarn/lib/giraph-1.2.0-for-hadoop-2.7.3-jar-with-dependencies.jar, respectively):

    hadoop jar giraph-examples-1.2.0-for-hadoop-2.7.3-jar-with-dependencies.jar org.apache.giraph.GiraphRunner -Dgiraph.metrics.enable=true org.apache.giraph.examples.SimpleShortestPathsComputation -vif -vip /user/jayanth/tiny_graph.txt -vof -op /user/jayanth/output -w 8 -ca giraph.zkList=orion-00:2181 -ca giraph.userPartitionCount=16 -ca SimpleShortestPathsVertex.sourceId=1 -ca giraph.logLevel=debug,giraph.checkpointFrequency=0 -yj giraph-examples-1.2.0-for-hadoop-2.7.3-jar-with-dependencies.jar

GOFFISH:

The instructions to download and install GoFFish are

As with Giraph, once you have the jar file to be executed, you can run it on the cluster using the following commands:

    export HAMA_CLASSPATH=/home/jayanth/goffish-sample-3.1.jar
    hama in.dream_lab.goffish.job.DefaultJob /home/jayanth/ /user/jayanth/fb4/Job2/ /user/jayanth/output

The required GoFFish Hama jars have already been added to the turing cluster at /home/hadoop27/hama/lib/

For any questions regarding GoFFish, please subscribe to
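
Returning to the GIRAPH submission command: the long GiraphRunner command line is easy to get wrong by hand, so one option is to assemble it from named pieces and print it for inspection before running it. The sketch below is only an illustration under stated assumptions: the function name giraph_cmd and its argument order are invented here, it reuses the same jar for "hadoop jar" and for -yj (as the SSSP example does), and the two format class names in the sample call are placeholders rather than real Giraph classes. It uses only the options listed in the GIRAPH section and never invokes hadoop itself.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints a GiraphRunner command line.
# It does not run hadoop; copy the printed line once it looks right.

# Usage: giraph_cmd JAR CLASS VIF VIP VOF OP WORKERS ZKLIST [PARTITIONS]
giraph_cmd() {
    if [ "$#" -lt 8 ] || [ "$#" -gt 9 ]; then
        echo "usage: giraph_cmd JAR CLASS VIF VIP VOF OP WORKERS ZKLIST [PARTITIONS]" >&2
        return 1
    fi
    gc_jar=$1; gc_class=$2; gc_vif=$3; gc_vip=$4
    gc_vof=$5; gc_op=$6; gc_workers=$7; gc_zk=$8

    gc_cmd="hadoop jar $gc_jar org.apache.giraph.GiraphRunner $gc_class"
    gc_cmd="$gc_cmd -vif $gc_vif -vip $gc_vip -vof $gc_vof -op $gc_op"
    gc_cmd="$gc_cmd -w $gc_workers -ca giraph.zkList=$gc_zk"

    # Only pass a partition count when one is given; otherwise the
    # default described in the text applies.
    if [ "$#" -eq 9 ]; then
        gc_cmd="$gc_cmd -ca giraph.userPartitionCount=$9"
    fi

    # Assumption: the jar passed to "hadoop jar" is also the one shipped
    # to the YARN containers with -yj, as in the SSSP example.
    gc_cmd="$gc_cmd -yj $gc_jar"
    printf '%s\n' "$gc_cmd"
}

# Sample call. The two *Format class names are placeholders, not real classes.
giraph_cmd giraph-examples-1.2.0-for-hadoop-2.7.3-jar-with-dependencies.jar \
    org.apache.giraph.examples.SimpleShortestPathsComputation \
    your.package.YourVertexInputFormat /user/jayanth/tiny_graph.txt \
    your.package.YourVertexOutputFormat /user/jayanth/output \
    8 orion-00:2181 16
```

Printing rather than executing keeps the helper safe to try on a machine without Hadoop; additional -D or -ca options from the list in the GIRAPH section would need to be appended by hand.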