Home
TODO: write intro:
- Goals
- Example: Globalmigrationplot
- Interaction
- Technologies used: HTML, CSS, JavaScript, D3, SVG, Grunt
D3 is a JavaScript library for creating interactive visualisations in HTML. Although not tied to the Web per se, it is predominantly used for data-driven manipulation of Web content, especially SVG documents embedded in HTML. D3 is the fourth iteration of a line of visualisation libraries. Its precursors are Prefuse (Java, 2005), Flare (ActionScript, 2007) and Protovis (JavaScript, 2009), all of which involved the people behind D3 in leading roles.
D3 was created in 2010 by Mike Bostock and sponsored by his employer, The New York Times. It has since received great attention and is used in various scenarios, especially for data visualisations. It has gained considerable traction in the relatively new discipline of data journalism.
The strength of D3 lies in the way it joins data to DOM elements (the basic building blocks of HTML and SVG documents), which enables very efficient dynamic manipulation of page contents. It all starts with selecting DOM nodes with CSS selectors. A typical selection in D3 would be:
d3.selectAll('p');
This selects all paragraphs (tag name 'p') in the document. The selector strings are the same as in the W3C Selectors API, but with a twist on the returned value. This is the same selection (all paragraphs in the document) made via the standard JavaScript DOM API:
document.getElementsByTagName('p');
That DOM API call returns an array-like collection of DOM nodes, as expected. D3, in contrast, returns a D3 selection object, on which a plethora of useful methods can be called. Most of these methods are chainable, meaning that they again return a selection. This leads to the typical D3 coding style:
d3.selectAll(...)
  .data(...)
  .attr(...)
  .text(...)
  .style(...)
  ...
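Chaining works because each setter returns the object it was called on. The following is a minimal sketch of that pattern in plain JavaScript. MiniSelection and its methods are hypothetical stand-ins that only imitate the shape of a D3 selection; they are not D3's implementation.

```javascript
// Hypothetical stand-in for a selection: wraps plain objects
// instead of DOM nodes. Not part of D3.
function MiniSelection(nodes) {
  this.nodes = nodes;
}

// Each setter changes every wrapped node and returns `this`,
// which is what makes the calls chainable.
MiniSelection.prototype.attr = function (name, value) {
  this.nodes.forEach(function (node) { node.attrs[name] = value; });
  return this;
};

MiniSelection.prototype.text = function (value) {
  this.nodes.forEach(function (node) { node.text = value; });
  return this;
};

var nodes = [{ attrs: {}, text: '' }, { attrs: {}, text: '' }];
var selection = new MiniSelection(nodes)
  .attr('class', 'intro')
  .text('Hello');
```

After the chain has run, both wrapped objects carry the class attribute and the text, and selection is still the same MiniSelection instance.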
The most important feature of D3 is data binding:
d3.selectAll("p")
  .data(['Visualizing', 'Migration', 'Flow', 'Data']);
Here an array containing the four elements 'Visualizing', 'Migration', 'Flow' and 'Data' is bound to the previously selected paragraphs. Now it is possible to manipulate the selection depending on that data. For example, we can assign each element of the data array to the text content of the paragraphs:
d3.selectAll("p")
  .data(['Visualizing', 'Migration', 'Flow', 'Data'])
  .text(function(d) { return d; });
This code snippet assumes that there are already enough elements in the DOM. It updates the first four paragraphs found in the page with our four strings and leaves any other paragraph untouched.
D3 provides methods to insert, update and remove nodes as needed, so it is possible to generate the DOM structure based on data:
// Select
var p = d3.selectAll("p")
  // Join
  .data(['Visualizing', 'Migration', 'Flow', 'Data']);
// Enter
p.enter().append('p');
// Exit
p.exit().remove();
// Update
p.text(String);
Here we create any missing paragraphs before we set the text content. This works because D3 creates a virtual selection of placeholder nodes that stand in for the missing elements. The enter and exit methods are just filters on the selection that return the nodes which are missing from, or superfluous in, the DOM. In the example, enter adds a paragraph node for each missing one (appending it to the document), and exit simply removes the paragraphs we don't need for the given dataset.
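The idea behind enter, update and exit can be sketched without D3. The helper below pairs elements with data by index, which is how D3 matches them when no key function is given. The function name join and the returned object shape are hypothetical and chosen for illustration only; real D3 selections work on DOM nodes and are considerably more involved.

```javascript
// Simplified index-based join: element i is paired with datum i.
// `join` is a hypothetical helper, not a D3 function.
function join(elements, data) {
  var shared = Math.min(elements.length, data.length);
  return {
    // Elements that already exist and receive a datum.
    update: elements.slice(0, shared).map(function (element, i) {
      return { element: element, datum: data[i] };
    }),
    // Data without an element: placeholders for nodes still to be created.
    enter: data.slice(shared),
    // Elements without a datum: candidates for removal.
    exit: elements.slice(shared)
  };
}

// Two existing paragraphs (represented by strings here), four data items.
var result = join(['p1', 'p2'], ['Visualizing', 'Migration', 'Flow', 'Data']);
// result.update pairs 'p1' with 'Visualizing' and 'p2' with 'Migration'
// result.enter is ['Flow', 'Data'], result.exit is []
```

With more elements than data the roles flip: enter is empty and exit holds the surplus elements.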
These are the core concepts of D3. Explaining D3 in depth is beyond the scope of this document. For more information about tying the addition and removal of nodes to data, read about data joins in Mike Bostock's blog.
Scalable Vector Graphics (SVG) is an XML-based vector image format that has support for interactivity and animation.
SVG can be embedded in HTML pages and styled with CSS:
<!DOCTYPE html>
<h1>Welcome</h1>
<svg>
<rect height="100" width="100" style="fill: #ff00ff" />
</svg>
D3 combined with inline SVG makes it possible to create interactive graphics in the browser.
TODO: describe git, git checkout (or download zip) as well as npm install
The source code of the globalmigration project is hosted on GitHub: https://github.com/null2/globalmigration.
The project uses Grunt as a build tool, which performs the following tasks:
- Lint JavaScript files
- Run unit tests
- Filter data
- Compile data
The website consists of a single HTML file, which references several JavaScript files.
The chart is then initialized and configured from within a <script> tag:
d3.json('json/migrations.json', function(data) {
  var chart = Globalmigration.chart(data, {
    element: '#diagram',
    animationDuration: 500,
    margin: 125,
    arcPadding: 0.04,
    layout: {
      threshold: 50000,
      labelThreshold: 5000
    }
  });
  chart.draw(2005);
});
First, migrations.json is requested and parsed via d3.json. Once loaded, it gets passed to Globalmigration.chart (defined in lib/chart.js) together with some configuration options.
The migrations.json file is preprocessed to minimize the computation needed on the client. Several optimisations are applied, as described below.
The input CSV looks like this:
originregion_id,originregion_name,destinationregion_id,destinationregion_name,regionflow_1990,regionflow_1995,regionflow_2000,regionflow_2005,xxx,origin_iso,origin_name,destination_iso,destination_name,countryflow_1990,countryflow_1995,countryflow_2000,countryflow_2005
10,"Latin America",10,"Latin America",638385,628290,680458,671267,,"ABW","Aruba","ABW","Aruba",0,0,0,0
10,"Latin America",6,"South Asia",6881,4620,833,192,,"ABW","Aruba","AFG","Afghanistan",0,0,0,0
The filter task takes a countries.csv file which specifies whether a country should be displayed:
iso,show (1 = yes; 0 = no)
USA,1
FIN,0
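A filter step of this kind could look roughly like the sketch below. It is not the project's Grunt task. The function names parseCountries and filterRows are hypothetical, the rule that a row is kept only when both its origin and its destination are marked as shown is an assumption, and the CAN entry is made-up sample data.

```javascript
// Parse the simple two-column countries file into a lookup table
// of the form { ISO code: true/false }.
function parseCountries(csvText) {
  var show = {};
  csvText.trim().split('\n').slice(1).forEach(function (line) {
    var columns = line.split(',');
    show[columns[0].trim()] = columns[1].trim() === '1';
  });
  return show;
}

// Assumption: a row is kept only if both its origin and its
// destination country are marked as shown.
function filterRows(rows, show) {
  return rows.filter(function (row) {
    return show[row.origin_iso] === true && show[row.destination_iso] === true;
  });
}

var show = parseCountries('iso,show\nUSA,1\nFIN,0\nCAN,1\n');
var rows = [
  { origin_iso: 'USA', destination_iso: 'CAN' },
  { origin_iso: 'USA', destination_iso: 'FIN' },
  { origin_iso: 'FIN', destination_iso: 'CAN' }
];
var kept = filterRows(rows, show);
// kept contains only the USA -> CAN row
```

Countries that do not appear in the lookup table are dropped as well, because their entry is undefined rather than true.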
The output from the filter step above is now taken as input for the compilation task.
The resulting migrations.json looks like this:
{
  "regions": [0, 3, 36, 61, 74, 88, 96, 101, 110, 113],
  "names": [
    "North America",
    "Canada",
    "United States",
    "Africa",
    "Angola",
    ...
    "Venezuela"
  ],
  "matrix": {
    "2005": [
      [ 139950, ... 8621 ],
      [ 51564, ... 458 ],
      ...
    ],
    "1990": [
      ...
    ]
  }
}
To reduce the number of chords displayed at any time, the data is aggregated into region flows.
The graph starts collapsed, and the user can expand a region to see its individual country flows.
At most two regions are expanded at any time: when the user expands a new region while two
are already open, the one that was expanded first gets closed. To achieve this, the region flows are
stored in the flow matrix, each followed by the corresponding country flows. A regions index
keeps track of where the region flows are located. Expanding a region is then done by displaying all flows
in the matrix between the current region index and the next region index. To display
labels, region and country names are listed in the same order.
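The index arithmetic described above can be sketched as follows. The function names are hypothetical and the toy data is much smaller than the real file. The sketch also assumes that an expanded region shows its country rows instead of its aggregate row, which the text does not state explicitly.

```javascript
// Indices of the countries belonging to the region at position `k`
// of the regions index: everything after the region's own row and
// before the next region's row (or the end of the list).
function countryIndices(regions, total, k) {
  var start = regions[k] + 1;
  var end = k + 1 < regions.length ? regions[k + 1] : total; // exclusive
  var indices = [];
  for (var i = start; i < end; i++) {
    indices.push(i);
  }
  return indices;
}

// Indices to display. Assumption: a collapsed region shows its
// aggregate row, an expanded region shows its country rows instead.
function visibleIndices(regions, total, expanded) {
  var visible = [];
  regions.forEach(function (regionIndex, k) {
    if (expanded.indexOf(k) === -1) {
      visible.push(regionIndex);
    } else {
      visible = visible.concat(countryIndices(regions, total, k));
    }
  });
  return visible;
}

// Toy data: three regions with two, two and one countries.
var regions = [0, 3, 6];
var names = ['North America', 'Canada', 'United States',
             'Africa', 'Angola', 'Benin',
             'Europe', 'Austria'];
var collapsed = visibleIndices(regions, names.length, []);    // [0, 3, 6]
var firstOpen = visibleIndices(regions, names.length, [0]);   // [1, 2, 3, 6]
```

Mapping the visible indices through names gives the labels to draw, for example 'Canada', 'United States', 'Africa' and 'Europe' for firstOpen.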
While D3 provides helpful layouts for generating chords,
they had to be extended to fit the requirements of migration flow charts.
A chord is a shape which shows a single flow. Its geometry consists of two arcs
connected by two Bézier curves.
One major difference between the chord diagram provided by D3 and the needs of migration
flow charts is that migration flow charts display two chords for every connection,
one for emigration and one for immigration.
The other difference is that the chords end at a slightly smaller radius,
to distinguish direction.
In order to display tooltips and numbers, the underlying data was added to the data generated
by the layout.
The modified chord layout can be found together with the extended chord shape
in the project's lib/ folder.
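To illustrate the idea of one chord per direction, the sketch below turns a flow matrix into a list of directed chord descriptors. It is not the project's modified layout. The function directedChords is hypothetical, the reading of matrix[i][j] as the flow from index i to index j is an assumption, and so is the use of the threshold to drop small flows.

```javascript
// Hypothetical helper: emits one chord descriptor per matrix entry,
// so a pair of regions with flows in both directions yields two chords.
// Assumption: matrix[i][j] holds the flow from index i to index j.
// Assumption: flows below `threshold` are not drawn.
function directedChords(matrix, threshold) {
  var chords = [];
  matrix.forEach(function (row, i) {
    row.forEach(function (value, j) {
      if (value >= threshold) {
        chords.push({ source: i, target: j, value: value });
      }
    });
  });
  return chords;
}

var matrix = [
  [0, 70000, 10],
  [60000, 0, 0],
  [0, 0, 0]
];
var chords = directedChords(matrix, 50000);
// Two chords connect index 0 and index 1, one for each direction.
// The flow of 10 from index 0 to index 2 falls below the threshold.
```

Keeping source, target and value on each descriptor is also one way to make the numbers available for tooltips later.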
TODO: conclusion / summary
- OpenWeb technologies rendered interactive possible / made them widely used
- Interactive visualisations made data explorable at all