The first half of the workshop consisted of invited speakers introducing the ways in which they have used visualisation in research, and considering how these could be useful to the Digital Panopticon and to researchers attending the event. I’ve included as many links to relevant resources as I could find. (See also the Storify of the event.)
Professor Min Chen of the Oxford e-Research Centre got the day off to a great start. He treated us to a dizzying array of examples of different kinds of visualisations, emphasising the importance of who visualisations are being created for. He surveyed the long history of data visualisation and outlined four levels of visualisation:
- disseminative (‘this is’) – presentational aids for dissemination
- operational (‘what?’) – enable intuitive and speedy observation of captured data
- analytical (‘why?’) – investigative, can be used to examine complex relationships
- inventive (‘how?’) – aid the improvement of existing models, methods, etc.
He also got us to think about ‘modes’ of visualisation – the different perspectives and needs of analysts, presenters and viewers – and asked: ‘what would be a visual language for the Digital Panopticon?’, taking into account the different kinds of data we’re working with.
These were just some of the examples!
- Poem Viewer from the Imagery Lenses for Visualizing Text Corpora project (Oxford and Utah collaboration) – designed to support close reading by visualising the sounds of poetry.
- Temporal Visualization of Boundary-based Geo-information Using Radial Projection – visualising movement of 200 glaciers over 10 years (recorded in satellite images). This was highly challenging: line graphs were too messy, maps not very helpful; a solution was found in radial visualizations.
- Visualizing facial dynamics – humans are very good at expression recognition, but computers are terrible at it; this project is investigating methods to improve automatic recognition.
- Use of glyphs (simple stylised icons) rather than text labels in complex workflow diagrams, and to enable display of multiple measurements simultaneously.
- Idea of parallel coordinates for visualising multi-dimensional data. (Lots of interest in this!)
- How to visualise time without animation? – summarising into a single picture can help to see patterns.
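Parallel coordinates place each data dimension on its own vertical axis and draw every record as a polyline across them, so clusters and outliers in many dimensions become visible at once. A minimal sketch of the underlying transform, using made-up convict-style records purely for illustration (the field names are my own assumptions, not Digital Panopticon data):

```python
# Parallel coordinates: each record becomes a polyline whose y-value on
# each axis is that record's value rescaled to [0, 1] for that dimension.

records = [  # hypothetical records, for illustration only
    {"age": 22, "sentence_years": 7, "prior_convictions": 1},
    {"age": 35, "sentence_years": 14, "prior_convictions": 4},
    {"age": 29, "sentence_years": 7, "prior_convictions": 0},
]
axes = ["age", "sentence_years", "prior_convictions"]

# Per-dimension min/max, used to rescale every value onto a shared 0-1 axis.
lo = {a: min(r[a] for r in records) for a in axes}
hi = {a: max(r[a] for r in records) for a in axes}

def polyline(record):
    """One vertex per axis: (axis index, normalised value in [0, 1])."""
    return [(i, (record[a] - lo[a]) / (hi[a] - lo[a])) for i, a in enumerate(axes)]

for r in records:
    print(polyline(r))
```

Each printed list is the vertex sequence for one record’s line; a plotting library would simply connect the vertices across the axes.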
Next, William Allen of the Oxford Migration Observatory talked about ‘Doing the Best with Data: critical realism and visualisation’. The Observatory’s goals are to communicate social science research beyond academia; migration is complex and doing this accessibly is challenging, so they make extensive use of visual techniques.
Visualisations are appealing because they appear to offer comprehensive and independent windows onto data, but actually achieving this requires approaching visualisation as an iterative and critical process. Allen uses critical realism as a lens for evaluation and for critically testing given categories. Rather than asking ‘what works?’, it is better to ask ‘what about this visualisation works, for whom, in which contexts, and for what purposes?’
The media monitoring project was set up to monitor and analyse systematically what the press actually says about migration over a period of time. The analysis of how the press portrays migrant groups uses corpus-linguistic methods (43 million words for 2010–12!). Allen showed us a number of visualisations built with the tool Tableau Public (which some members of the DP team have also been using).
Allen spoke of the ‘frontiers of visualisation’:
- political: how data/research are used by range of actors, decisions made through research
- technical: the software and built-in assumptions/settings
- virtual: interactivity, challenges of opening analysis up to public stakeholders
Questions and problems arising from the Observatory’s work: how do we visualise large datasets and the patterns in them? Every decision comes with assumptions about what works. Allen also emphasised the danger that visualisation software can be a black box – eg, misleading on scale.
Additional resource: the Observatory website has a terrific page of data and resources with ‘ready-made charts and maps on migration in the UK as well as a description of key data sources and their limitations’, and a ‘create your own chart’ facility. Go and play!
Our third speaker, Arthur Downing (Oxford), gave a presentation on Network Analysis and Visualisation for historians.
A network is a particular set of connections between agents: network analysis is the analysis of the patterns of these connections (‘nodes’ and ‘links’). It differs from standard social science methodology (which tends to chop up objects by categories like race and gender and then looks at averages) in that network analysis starts with the connections between objects/actors and only then looks at their attributes. This is important because there can be different patterns of connections within superficially similar scenarios.
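The connections-first idea can be sketched in a few lines: build the structure of ties before attaching any attributes, and let the shape of the connections speak first. The names and ties below are hypothetical, chosen to show how two groups with the same size can have very different structures:

```python
from collections import defaultdict

# Hypothetical ties: two three-person groups of identical size but
# different shape - a closed triangle versus a star around one person.
ties = [("Ann", "Bob"), ("Bob", "Cara"), ("Cara", "Ann"),   # triangle
        ("Dan", "Eve"), ("Dan", "Fay"), ("Dan", "Gil")]     # star

# Build an adjacency list: the nodes and links come first.
adj = defaultdict(set)
for a, b in ties:
    adj[a].add(b)
    adj[b].add(a)

# Degree = number of links per node. Only after the structure exists
# would an analyst attach attributes (occupation, gender, etc.).
degree = {n: len(nbrs) for n, nbrs in adj.items()}
print(degree)  # the star shape surfaces as one high-degree node (Dan)
```

Counting by category alone would treat both groups as interchangeable triples; the degree distribution immediately distinguishes them.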
Some fascinating case studies he introduced:
- RV Gould on networks and mobilization in the Paris Commune 1871
- Hillman on elites before the English Civil War – showed the importance of merchants even though their numbers were small, as they linked many other groups together.
- Adamic and Glance on the US political blogosphere in 2004
Downing’s own work on 19th-century Friendly Societies – a network analysis of proposers and seconders showed that the top 20% of recruiters were responsible for 80% of members. But using ‘eigenvector centrality’ (which takes into account both the degree of a node and the degrees of the nodes connected to it) also showed that some people were important even though they weren’t large recruiters.
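Eigenvector centrality can be computed by power iteration: repeatedly replace each node’s score with the sum of its neighbours’ scores and renormalise, which converges to the leading eigenvector of the adjacency matrix. A minimal sketch with hypothetical proposer/seconder ties (not Downing’s data) showing how nodes with the same number of ties can rank differently depending on who their ties reach:

```python
# Hypothetical ties: A, B and C each have three links, but B's and C's
# neighbours are themselves well connected, while A recruits peripherals.
ties = [("A", "B"), ("A", "C"), ("A", "D"),
        ("E", "B"), ("E", "C"),
        ("B", "C")]

nodes = sorted({n for t in ties for n in t})
adj = {n: set() for n in nodes}
for a, b in ties:
    adj[a].add(b)
    adj[b].add(a)

# Power iteration: score(node) <- sum of neighbours' scores, renormalised.
score = {n: 1.0 for n in nodes}
for _ in range(100):
    new = {n: sum(score[m] for m in adj[n]) for n in nodes}
    norm = max(new.values())
    score = {n: v / norm for n, v in new.items()}

print({n: round(v, 3) for n, v in score.items()})
```

Here A, B and C all have three ties apiece, yet B and C come out with the highest centrality because their neighbours are themselves well connected – the same effect that let small recruiters rank as important in the Friendly Societies analysis.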
Network analysis for maps can show more complex patterns than standard maps:
- Spread of Freemasons in the US – on a conventional map this just looks like a ‘frontier’ movement, but when mapped as a network, a different picture emerges with more complex directions of flows
- Social networks between Australian lodges – most migration is short distance and internal, though migration from England and Wales is very important
Pitfalls and problems:
- identifying the boundaries of networks can be difficult
- sampling is hard to justify as any missing ties can skew interpretation
- longitudinal analysis is difficult – a network analysis is by definition a snapshot in time, but one may want to know how long a tie persists. One answer is to break the data down into phases and look at different periods.
Conceptually this is very different to standard statistics: ‘analysis of an endogenous system where endogeneity is what is interesting’, but potentially a great method for social history since it’s all about exploring complexity.
In the subsequent discussion, concerns were raised about the ideological assumptions that go into visualisations and how to communicate them to the user – though it was noted that this is a problem with traditional charts and tables too, and there is no simple answer.
We were deeply grateful to all three speakers for providing us with so much food for thought, and so many ideas to follow up!

[Part 2 of the report to follow shortly…]