Social Network Analysis

Social Network Analysis (SNA) is not new; sociologists have long used it to study human relationships (sociometry), to find communities, and to simulate how information or a disease spreads through a population.

With the rise of social networking sites such as Facebook, Twitter, and LinkedIn, acquiring large amounts of social network data has become easier. We can use SNA to gain insight into customer behavior or to discover unknown communities. This is not a trivial task: we will come across sparse data and a lot of noise (meaningless data), and we need to distinguish spurious correlation from causation. A good way to start is to get to know our graph through visualization and statistical analysis.

Social networking sites give us the opportunity to ask questions that would otherwise be too hard to approach, because polling enough people is time-consuming and expensive.

In this article, we will obtain our social network graph from the Facebook (FB) website in order to visualize the relationships between our friends. Finally, we will create an interactive visualization of our graph using D3.js.

Getting ready

The easiest method to get our friends list is by using a third-party application. Netvizz is a Facebook app developed by Bernhard Rieder, which allows exporting social graph data to gdf and tab formats. Netvizz may export information about our friends such as gender, age, locale, posts, and likes.

To get our social graph from Netvizz, we need to open the link below and give the app access to our Facebook profile.

https://apps.facebook.com/netvizz/

As shown in the following screenshot, we will create a gdf file from our personal friend network by clicking on the link named here in Step 2.

Then we will download the GDF file (GDF is a CSV-like text format for graphs that originated with the GUESS visualization tool). Netvizz will give us the number of nodes and edges (links); finally, we will click on the gdf file link, as we can see in the following screenshot:

The output file myFacebookNet.gdf will look like this:

nodedef>name VARCHAR,label VARCHAR,gender VARCHAR,locale VARCHAR,agerank INT
23917067,Jorge,male,en_US,106
23931909,Haruna,female,en_US,105
35702006,Joseph,male,en_US,104
503839109,Damian,male,en_US,103
532735006,Isaac,male,es_LA,102
. . .
edgedef>node1 VARCHAR,node2 VARCHAR
23917067,35702006
23917067,629395837
23917067,747343482
23917067,755605075
23917067,1186286815
. . .

The following screenshot shows a visualization of the graph (106 nodes and 279 links). The nodes represent my friends, and the links show how my friends are connected to each other.
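Alongside the visualization, a couple of basic statistics help in getting to know the graph. The following is a minimal sketch, assuming the edges are available as pairs of node IDs; the function names and the toy edge list are illustrative and not part of the original code.

```python
from collections import Counter


def degree_counts(edges):
    """Count how many links touch each node in an undirected edge list."""
    degrees = Counter()
    for source, target in edges:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


def density(num_nodes, num_edges):
    """Fraction of all possible undirected links that are actually present."""
    if num_nodes < 2:
        return 0.0
    return 2.0 * num_edges / (num_nodes * (num_nodes - 1))


# Toy edge list with made-up IDs, only to illustrate the functions.
sample_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = degree_counts(sample_edges)
print(degrees.most_common(1))        # best-connected node: [('c', 3)]

# Density of a graph with 106 nodes and 279 links, as in the article.
print(round(density(106, 279), 4))   # 0.0501
```

A density of about 0.05 means that only around 5% of all possible friend-to-friend links exist, which is typical of the sparse data mentioned earlier.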

Transforming GDF to JSON

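To work with the graph on the web with D3.js, we need to transform our gdf file into JSON format. The steps below read two files, nodes.csv and links.csv, which hold the node section and the edge section of the gdf file, each with its header row.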
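The article does not show how nodes.csv and links.csv are produced from the gdf file. One possible way is sketched below, assuming the file has the layout shown earlier (a nodedef> header line followed by node rows, then an edgedef> header line followed by edge rows). The function name split_gdf is hypothetical.

```python
def split_gdf(gdf_path, nodes_path="nodes.csv", links_path="links.csv"):
    """Copy the node and edge sections of a GDF file into two CSV files.

    Each output keeps its header row (without the nodedef>/edgedef> prefix),
    so it can later be read with skip_header=1.
    """
    with open(gdf_path, encoding="utf-8") as src, \
            open(nodes_path, "w", encoding="utf-8") as nodes_out, \
            open(links_path, "w", encoding="utf-8") as links_out:
        target = None  # output file for the section currently being read
        for line in src:
            if line.startswith("nodedef>"):
                target = nodes_out
                line = line[len("nodedef>"):]
            elif line.startswith("edgedef>"):
                target = links_out
                line = line[len("edgedef>"):]
            # Ignore blank lines and anything before the first section header.
            if target is not None and line.strip():
                target.write(line)
```

With the file from the previous section, the call would be split_gdf("myFacebookNet.gdf"). This simple line-based split assumes that names do not contain commas or line breaks.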

  1. First, we need to import the numpy and json libraries.

    import numpy as np
    import json

  2. The numpy function genfromtxt reads only the ID and name columns from the nodes.csv file, selected with the usecols argument, and stores them with the 'object' data type.

    nodes = np.genfromtxt("nodes.csv",
                          dtype='object',
                          delimiter=',',
                          skip_header=1,
                          usecols=(0,1))

  3. Then, genfromtxt reads the links, each with a source node and a target node, from the links.csv file in the same way.

    links = np.genfromtxt("links.csv",
                          dtype='object',
                          delimiter=',',
                          skip_header=1,
                          usecols=(0,1))

    The JSON format used in the D3.js Force Layout graph implemented in this article requires transforming the ID (for example, 100001448673085) into a numerical position in the list of nodes.

  4. Then, we need to look for each appearance of an ID in the links and replace it with its position in the list of nodes.

    for n in range(len(nodes)):
        for ls in range(len(links)):
            if nodes[n][0] == links[ls][0]:
                links[ls][0] = n
            if nodes[n][0] == links[ls][1]:
                links[ls][1] = n

  5. Now, we need to create a dictionary, data, to hold the content of the JSON file.

    data = {}

  6. Next, we need to create a list of nodes with the names of the friends in the following format and add it to the data dictionary:

    "nodes": [{"name": "X"},{"name": "Y"},. . .]

    lst = []
    for x in nodes:
        d = {}
        d["name"] = str(x[1]).replace("b'","").replace("'","")
        lst.append(d)
    data["nodes"] = lst

  7. Now, we need to create a list of links with the source and target in the following format and add it to the data dictionary:

    "links": [{"source": 0, "target": 2},{"source": 1, "target": 2},. . .]

    lnks = []
    for ls in links:
        d = {}
        d["source"] = ls[0]
        d["target"] = ls[1]
        lnks.append(d)
    data["links"] = lnks

  8. Finally, we need to create the file newJson.json and write the data dictionary to it with the dumps function of the json library.

    with open("newJson.json","w") as f:
        f.write(json.dumps(data))

The file newJson.json will look as follows:

{"nodes": [{"name": "Jorge"},
{"name": "Haruna"},
{"name": "Joseph"},
{"name": "Damian"},
{"name": "Isaac"},
. . .],
"links": [{"source": 0, "target": 2},
{"source": 0, "target": 12},
{"source": 0, "target": 20},
{"source": 0, "target": 23},
{"source": 0, "target": 31},
. . .]}
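The nested loops in step 4 compare every node with every link, which can become slow for larger graphs. As an alternative sketch (not the article's code), the ID-to-position lookup can be built once with a dictionary. The function name build_graph is hypothetical, the rows are assumed to be plain string pairs, and links that point to an ID missing from the node list are skipped, which is a design choice made here for illustration.

```python
import json


def build_graph(node_rows, link_rows):
    """Build the {"nodes": [...], "links": [...]} structure from ID-based rows.

    node_rows: sequence of (node_id, name) pairs
    link_rows: sequence of (source_id, target_id) pairs
    """
    # Map each node ID to its position in the node list in a single pass.
    index_of = {node_id: i for i, (node_id, _name) in enumerate(node_rows)}
    nodes = [{"name": name} for _node_id, name in node_rows]
    links = []
    for source_id, target_id in link_rows:
        # Assumption: drop links whose endpoints are not in the node list.
        if source_id in index_of and target_id in index_of:
            links.append({"source": index_of[source_id],
                          "target": index_of[target_id]})
    return {"nodes": nodes, "links": links}


# A few rows taken from the sample gdf output shown earlier.
node_rows = [("23917067", "Jorge"), ("23931909", "Haruna"), ("35702006", "Joseph")]
link_rows = [("23917067", "35702006"), ("23917067", "629395837")]

graph = build_graph(node_rows, link_rows)
print(json.dumps(graph))
```

In this small sample the second link is dropped because node 629395837 is not among the three node rows; with the full node list it would be kept.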

Graph visualization with D3.js

D3.js provides the d3.layout.force() function, which implements a force-directed layout algorithm and helps us visualize our graph.

  1. First, we need to define the CSS style for the nodes, links, and node labels.

    <style>
    .link {
        fill: none;
        stroke: #666;
        stroke-width: 1.5px;
    }
    .node circle {
        fill: steelblue;
        stroke: #fff;
        stroke-width: 1.5px;
    }
    .node text {
        pointer-events: none;
        font: 10px sans-serif;
    }
    </style>

  2. Then, we need to reference the D3.js library.

    <script src = "http://d3js.org/d3.v3.min.js"></script>

  3. Then, we need to open a script tag, define the width and height parameters for the svg container, and append the container to the body tag.

    <script>
    var width = 1100,
        height = 800;
    var svg = d3.select("body").append("svg")
        .attr("width", width)
        .attr("height", height);

  4. Now, we define the properties of the force layout, such as gravity, link distance, charge, and size.

    var force = d3.layout.force()
        .gravity(.05)
        .distance(150)
        .charge(-100)
        .size([width, height]);

  5. Then, we need to load the graph data from the JSON file and pass the nodes and links to the force layout. The callback opened here is closed at the end of the last step.

    d3.json("newJson.json", function(error, json) {
        force
            .nodes(json.nodes)
            .links(json.links)
            .start();

    For a complete reference on the D3.js Force Layout implementation, see https://github.com/mbostock/d3/wiki/Force-Layout.

  6. Then, we define the links as lines and the nodes as draggable groups, both bound to the JSON data.

        var link = svg.selectAll(".link")
            .data(json.links)
            .enter().append("line")
            .attr("class", "link");
        var node = svg.selectAll(".node")
            .data(json.nodes)
            .enter().append("g")
            .attr("class", "node")
            .call(force.drag);

  7. Now, we draw each node as a circle of radius 6 and add a label with the name of the node.

        node.append("circle")
            .attr("r", 6);
        node.append("text")
            .attr("dx", 12)
            .attr("dy", ".35em")
            .text(function(d) { return d.name });

  8. Finally, the tick function updates the positions of the links and nodes at each step of the force layout simulation.

        force.on("tick", function() {
            link.attr("x1", function(d) { return d.source.x; })
                .attr("y1", function(d) { return d.source.y; })
                .attr("x2", function(d) { return d.target.x; })
                .attr("y2", function(d) { return d.target.y; });
            node.attr("transform", function(d) {
                return "translate(" + d.x + "," + d.y + ")";
            });
        });
    });
    </script>

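The image below shows the result of the visualization. The page loads newJson.json with d3.json, and browsers typically block such requests when the page is opened directly from the file system, so we need to serve the files through a web server. To do so, we open a command terminal in the folder that contains the files and run the following Python command (any other web server will also work).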

>> python -m http.server 8000

Then we just need to open a web browser and go to the address http://localhost:8000/ForceGraph.html. On the HTML page we can see our Facebook graph with a gravity effect, and we can interactively drag and drop the nodes.
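The same static server can also be started from a Python script, which is handy when the files live in a different folder. This is a minimal sketch using only the standard library (Python 3.7 or later); the function name make_server is hypothetical.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory=".", port=8000):
    """Create (but do not start) a static file server bound to localhost."""
    # The directory argument tells the handler which folder to serve.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#   make_server(".", 8000).serve_forever()
```

Binding to 127.0.0.1 keeps the server reachable only from the local machine, which is usually enough for previewing the graph.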

All the code and datasets for this article can be found in the author's GitHub repository: https://github.com/hmcuesta/PDA_Book/tree/master/Chapter10

Summary

In this article, we developed our own social graph visualization tool with D3.js by transforming the data obtained from Netvizz in GDF format into JSON.
