Social Network Analysis (SNA) is not new. Sociologists have long used it to study human relationships (sociometry), to find communities, and to simulate how information or a disease spreads through a population.
With the rise of social networking sites such as Facebook, Twitter, and LinkedIn, acquiring large amounts of social network data has become easier. We can use SNA to gain insight into customer behavior or to discover unknown communities. This is not a trivial task: we will come across sparse data and a lot of noise (meaningless data), and we need to distinguish spurious correlation from causation. A good start is to get to know our graph through visualization and statistical analysis.
Social networking sites give us the opportunity to ask questions that would otherwise be too hard to approach, because polling enough people is time-consuming and expensive.
In this article, we will obtain our social network's graph from the Facebook (FB) website in order to visualize the relationships between our friends. Finally, we will create an interactive visualization of our graph using D3.js.
The easiest way to get our friends list is to use a third-party application. Netvizz is a Facebook app developed by Bernhard Rieder that exports social graph data to gdf and tab formats. Netvizz may export information about our friends such as gender, age, locale, posts, and likes.
In order to get our social graph from Netvizz, we need to open the link below and give the app access to our Facebook profile.
https://apps.facebook.com/netvizz/
As shown in the following screenshot, we will create a gdf file from our personal friend network by clicking on the link named here in Step 2.
Then we will download the GDF file (GDF is the graph format used by GUESS). Netvizz will show us the number of nodes and edges (links); finally, we will click on the gdf file link, as we can see in the following screenshot:
The output file myFacebookNet.gdf will look like this:
nodedef>name VARCHAR,label VARCHAR,gender VARCHAR,locale VARCHAR,agerank INT
23917067,Jorge,male,en_US,106
23931909,Haruna,female,en_US,105
35702006,Joseph,male,en_US,104
503839109,Damian,male,en_US,103
532735006,Isaac,male,es_LA,102
. . .
edgedef>node1 VARCHAR,node2 VARCHAR
23917067,35702006
23917067,629395837
23917067,747343482
23917067,755605075
23917067,1186286815
. . .
In the following screenshot, we can see the visualization of the graph (106 nodes and 279 links). The nodes represent my friends and the links represent how my friends are connected to each other.
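Alongside the picture, a few numbers help us get to know the graph. The following is a minimal sketch, not part of the original code, that counts nodes, edges, and degrees from an undirected edge list. The function name degree_stats is hypothetical, and the sample edges are the five shown in the gdf listing above.

```python
from collections import Counter


def degree_stats(edges):
    """Return (node count, edge count, density, degree Counter) for an undirected edge list.

    Only nodes that appear in at least one edge are counted.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n, m = len(degree), len(edges)
    # Density: share of the n*(n-1)/2 possible links that actually exist.
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return n, m, density, degree


# The five edges shown in the gdf listing (hypothetical usage).
edges = [
    ("23917067", "35702006"),
    ("23917067", "629395837"),
    ("23917067", "747343482"),
    ("23917067", "755605075"),
    ("23917067", "1186286815"),
]
n, m, density, degree = degree_stats(edges)
print(n, m, round(density, 2))   # 6 5 0.33
print(degree.most_common(1))     # [('23917067', 5)]
```

For the full graph of 106 nodes and 279 links, the same density formula gives 2 × 279 / (106 × 105) ≈ 0.05, so only about 5% of the possible friendships are present.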
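In order to work with the graph on the web with D3.js, we need to transform our gdf file into JSON format. The script below reads the node and edge sections of the gdf file from two separate files, nodes.csv and links.csv, each starting with its header row.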
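One way to produce nodes.csv and links.csv, the two files read by the script that follows, is to split myFacebookNet.gdf at its nodedef> and edgedef> header lines. This is a minimal sketch under that assumption; the function name split_gdf is hypothetical and is not part of the original code.

```python
import csv


def split_gdf(gdf_path, nodes_path="nodes.csv", links_path="links.csv"):
    """Split a GDF file into a node CSV and an edge CSV, keeping each header row.

    Returns the number of node rows and edge rows written (headers excluded).
    """
    nodes, links = [], []
    current = None
    with open(gdf_path, encoding="utf-8", newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            if row[0].startswith("nodedef>"):
                row[0] = row[0][len("nodedef>"):]
                current = nodes
            elif row[0].startswith("edgedef>"):
                row[0] = row[0][len("edgedef>"):]
                current = links
            elif current is None:
                continue  # ignore anything before the first section header
            current.append(row)
    for path, rows in ((nodes_path, nodes), (links_path, links)):
        with open(path, "w", encoding="utf-8", newline="") as out:
            csv.writer(out).writerows(rows)
    return max(len(nodes) - 1, 0), max(len(links) - 1, 0)


# Hypothetical usage:
# split_gdf("myFacebookNet.gdf")
```

The header rows are kept on purpose, because the script below skips the first line of each file with skip_header=1.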
import numpy as np
import json

# Read the first two columns of each file, skipping the header row.
nodes = np.genfromtxt("nodes.csv",
                      dtype='object',
                      delimiter=',',
                      skip_header=1,
                      usecols=(0,1))
links = np.genfromtxt("links.csv",
                      dtype='object',
                      delimiter=',',
                      skip_header=1,
                      usecols=(0,1))
The JSON format used in the D3.js Force Layout graph implemented in this article requires transforming the ID (for example, 100001448673085) into a numerical position in the list of nodes.
for n in range(len(nodes)):
    for ls in range(len(links)):
        if nodes[n][0] == links[ls][0]:
            links[ls][0] = n
        if nodes[n][0] == links[ls][1]:
            links[ls][1] = n
data = {}
Next, we create the list of nodes in the form "nodes": [{"name": "X"},{"name": "Y"},. . .] and add it to the data dictionary.
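lst = []
for x in nodes:
    # genfromtxt may return bytes; decode them rather than stripping "b'" from the repr,
    # which would also remove apostrophes that belong to the name.
    name = x[1].decode("utf-8") if isinstance(x[1], bytes) else str(x[1])
    lst.append({"name": name})
data["nodes"] = lst
Then we create the list of links in the form "links": [{"source": 0, "target": 2},{"source": 1, "target": 2},. . .] and add it to the data dictionary.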
lnks = []
for ls in links:
    d = {}
    d["source"] = ls[0]
    d["target"] = ls[1]
    lnks.append(d)
data["links"] = lnks

with open("newJson.json","w") as f:
    f.write(json.dumps(data))
The file newJson.json will look as follows:
{"nodes": [{"name": "Jorge"},
{"name": "Haruna"},
{"name": "Joseph"},
{"name": "Damian"},
{"name": "Isaac"},
. . .],
"links": [{"source": 0, "target": 2},
{"source": 0, "target": 12},
{"source": 0, "target": 20},
{"source": 0, "target": 23},
{"source": 0, "target": 31},
. . .]}
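Before loading the file in the browser, it can be worth checking that every link points at an existing node, because the force layout looks up source and target as positions in the nodes list. The following is a minimal sketch of such a check; check_graph is a hypothetical helper, not part of the original code.

```python
import json


def check_graph(data):
    """Return a list of problems found in a {"nodes": [...], "links": [...]} dictionary."""
    problems = []
    n = len(data.get("nodes", []))
    for i, link in enumerate(data.get("links", [])):
        for key in ("source", "target"):
            value = link.get(key)
            # A valid endpoint is a plain integer index into the nodes list.
            is_index = isinstance(value, int) and not isinstance(value, bool)
            if not is_index or not 0 <= value < n:
                problems.append(f"link {i}: {key}={value!r} is not a valid node index")
    return problems


# Hypothetical usage:
# with open("newJson.json") as f:
#     print(check_graph(json.load(f)))
```

An empty list means that all links refer to valid positions.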
D3.js provides us with the d3.layout.force() function, which implements a force-directed layout algorithm and helps us visualize our graph.
<style>
.link {
  fill: none;
  stroke: #666;
  stroke-width: 1.5px;
}
.node circle {
  fill: steelblue;
  stroke: #fff;
  stroke-width: 1.5px;
}
.node text {
  pointer-events: none;
  font: 10px sans-serif;
}
</style>
<script src = "http://d3js.org/d3.v3.min.js"></script>
<body>
<script>
var width = 1100,
    height = 800;
var svg = d3.select("body").append("svg")
    .attr("width", width)
    .attr("height", height);

var force = d3.layout.force()
    .gravity(.05)
    .distance(150)
    .charge(-100)
    .size([width, height]);

d3.json("newJson.json", function(error, json) {
  force
      .nodes(json.nodes)
      .links(json.links)
      .start();
For a complete reference on the D3.js Force Layout implementation, visit https://github.com/mbostock/d3/wiki/Force-Layout.
  var link = svg.selectAll(".link")
      .data(json.links)
    .enter().append("line")
      .attr("class", "link");

  var node = svg.selectAll(".node")
      .data(json.nodes)
    .enter().append("g")
      .attr("class", "node")
      .call(force.drag);

  node.append("circle")
      .attr("r", 6);

  node.append("text")
      .attr("dx", 12)
      .attr("dy", ".35em")
      .text(function(d) { return d.name; });

  force.on("tick", function() {
    link.attr("x1", function(d) { return d.source.x; })
        .attr("y1", function(d) { return d.source.y; })
        .attr("x2", function(d) { return d.target.x; })
        .attr("y2", function(d) { return d.target.y; });

    node.attr("transform", function(d) {
      return "translate(" + d.x + "," + d.y + ")";
    });
  });
});
</script>
In the image below, we can see the result of the visualization. To run the visualization, we just need to open a terminal in the folder that contains the files and start a web server, for example with the following Python command:
python -m http.server 8000
Then we just need to open a web browser and go to http://localhost:8000/ForceGraph.html. On the HTML page, we can see our Facebook graph with a gravity effect, and we can interactively drag and drop the nodes.
All the code and datasets for this article can be found in the author's GitHub repository: https://github.com/hmcuesta/PDA_Book/tree/master/Chapter10
In this article, we developed our own social graph visualization tool with D3.js, transforming the data obtained from Netvizz in GDF format into JSON.