Spark GraphX例子

假定我们想从一些文本文件中构建一个图,限制这个图包含重要的关系和用户,并且在子图上运行page-rank,最后返回与top用户相关的属性。可以通过如下方式实现。

// Connect to the Spark cluster
val sc = new SparkContext("spark://master.amplab.org", "research")

// Load my user data and parse into tuples of user id and attribute list
val users = (sc.textFile("graphx/data/users.txt")
  .map(line => line.split(",")).map( parts => (parts.head.toLong, parts.tail) ))

// Parse the edge data which is already in userId -> userId format
val followerGraph = GraphLoader.edgeListFile(sc, "graphx/data/followers.txt")

// Attach the user attributes
val graph = followerGraph.outerJoinVertices(users) {
  case (uid, deg, Some(attrList)) => attrList
  // Some users may not have attributes so we set them as empty
  case (uid, deg, None) => Array.empty[String]
}

// Restrict the graph to users with usernames and names
val subgraph = graph.subgraph(vpred = (vid, attr) => attr.size == 2)

// Compute the PageRank
val pagerankGraph = subgraph.pageRank(0.001)

// Get the attributes of the top pagerank users
val userInfoWithPageRank = subgraph.outerJoinVertices(pagerankGraph.vertices) {
  case (uid, attrList, Some(pr)) => (pr, attrList.toList)
  case (uid, attrList, None) => (0.0, attrList.toList)
}

println(userInfoWithPageRank.vertices.top(5)(Ordering.by(_._2._1)).mkString("
"))
andy

前端小白,在Web176教程网这个平台跟大家一起学习,加油!

Share
Published by
andy

Recent Posts

聊聊vue3中的defineProps

在Vue 3中,defineP…

1 周 ago

在 Chrome 中删除、允许和管理 Cookie

您可以选择删除现有 Cooki…

2 周 ago

自定义指令:聊聊vue中的自定义指令应用法则

今天我们来聊聊vue中的自定义…

3 周 ago

聊聊Vue中@click.stop和@click.prevent

一起来学下聊聊Vue中@cli…

4 周 ago

Nginx 基本操作:启动、停止、重启命令。

我们来学习Nginx基础操作:…

4 周 ago

Vue3:手动清理keep-alive组件缓存的方法

Vue3中手动清理keep-a…

1 月 ago