Data analysis is a crucial part of any research project, yet it is notoriously difficult to describe. In many research papers it's glossed with a phrase like 'we analysed the data thematically using the software package NVivo'. In this post, Heather unpacks some of the processes and dilemmas that lie behind this phrase.
We're still collecting materials for our case studies and going through individual interview transcripts, but we have one complete dataset: our 24 group interviews. We fed all of this into NVivo, using heading styles to identify individual speakers, automatically collect their data, and cross-reference it to their attributes via an imported classification sheet containing the participants' anonymised data (school, year group, gender, ethnicity, etc.). This sounds very technical, but luckily it's not, though it did involve some tedious transcript checking.
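The cross-referencing idea can be illustrated outside NVivo too. Here is a minimal Python sketch of the same join: speaker-tagged transcript turns linked to an anonymised classification sheet. All participant IDs, attribute fields, and example utterances below are invented for illustration, not taken from the project data.

```python
# Invented classification sheet: participant ID -> anonymised attributes
classification = {
    "P01": {"school": "School A", "year": 8, "gender": "F"},
    "P02": {"school": "School A", "year": 8, "gender": "M"},
}

# Transcript turns, as a heading-style parse might yield them
transcript = [
    ("P01", "I think she's a good role model."),
    ("P02", "Only because she gives to charity."),
]

def tag_turns(turns, sheet):
    """Attach each speaker's attributes to their transcript turns."""
    return [
        {"speaker": pid, "text": text, **sheet.get(pid, {})}
        for pid, text in turns
    ]

tagged = tag_turns(transcript, classification)
print(tagged[0]["school"])  # School A
```

Once every turn carries its speaker's attributes, queries like 'all talk by Year 8 girls' become simple filters, which is essentially what NVivo's attribute-based queries provide.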
After this we used a team meeting to come up with some codes that we felt covered the material. We arranged these into six groups:
- Behaviours: Arrogance and Vanity; ‘Bad’ behaviour; Being real and fake; Body mods; Charity; Fan interactions; Inspiration; Money and consumption; Passion; Privacy; Role model; Weird and OTT
- Routes to fame: Back story and survival; Hard work; Nepotism; Notoriety and other ‘bad’ routes; Talent
- Social dimensions: Age; Class; Gender; Nation; Race and ethnicity; Sexuality
- Celebrities: Barack Obama; Bear Grylls; Bill Gates; Boris Johnson; David Cameron; Emma Watson; Justin Bieber; Katie Price; Keith Lemon; Kim Kardashian; Miley Cyrus; Nicki Minaj; One Direction; Russell Howard; Tom Daley; Will Smith
- Genres: Business; Comedy; Definitions (of celebrity); Film; Music; Politics; Royalty; RTV; Socialites; Sport; TV; YouTube
- Ways of engaging: Autobiographies; Desire; Encounters; Family; Fandom; Friends; Internet; Learning; Magazines; Newspapers; Not engaging; Television
We presented these to our wonderful advisory group, asking for their thoughts on the codes and a couple of related issues:
- How do we capture the patterns and themes across the group interviews without losing the specifics of interactions in each group?
- How do we combine a thematic analysis with a more fine-grained look at the language and practices used to talk about celebrity?
We didn't get any direct answers from them, but we hadn't expected to. What the advisory group provided was a forum in which to discuss, and gain advice on, our ideas and concerns.
Following on from this we made three key changes. First, we added a new group of codes on Response to capture the emotions – from love to disgust – provoked by celebrity talk. Second, we reduced the number of celebrities to those twelve who make up our case studies, figuring it would be quite easy to use the search facilities in NVivo at a later date to track down all the data we have on One Direction, Miley Cyrus, or anyone else we got interested in. Third, we agreed that we would keep notes on the coding process and add annotations to the transcripts as needed.
Now for the science bit… We each took a copy of the NVivo file containing all the group interviews and took responsibility for working through all of the data in relation to about a third of the codes. So long as none of us edited any of the original transcripts, we would then be able to recombine our three separate files into one holding all our coding (this worked, but that's getting ahead of the process).
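The recombination works because coding is additive: provided the underlying transcripts are identical, each coder's file is just a set of code-to-passage links, and three such sets can simply be unioned. A hedged Python sketch of that merge logic (the code names and passage IDs are invented; NVivo performs the real merge through its project-merge facility):

```python
# Each coder's file modelled as: code name -> set of passage references
coder_a = {"Role model": {"g01:12", "g03:04"}, "Charity": {"g01:15"}}
coder_b = {"Gender": {"g02:07"}, "Charity": {"g05:02"}}
coder_c = {"Talent": {"g04:09"}}

def merge_coding(*files):
    """Union code-to-passage links from several coders into one map."""
    merged = {}
    for coding in files:
        for code, passages in coding.items():
            merged.setdefault(code, set()).update(passages)
    return merged

combined = merge_coding(coder_a, coder_b, coder_c)
print(sorted(combined["Charity"]))  # ['g01:15', 'g05:02']
```

The crucial assumption mirrors the one in the text: no one edits the source transcripts, so every passage reference stays valid across all three copies.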
Having each of us go through all of the group interviews, on top of listening to every soundfile and checking its transcript, meant we really got to know our data. As Maggie MacLure says, while coding inevitably reduces the complexity of your data, it also 'demands immersion in, and entanglement with, the minutiae of "the data"'. And since we weren't sharing codes, each of us was free to mess around with the areas we owned, adding new subcodes and changing old ones. We didn't worry about whether we'd all code in the same way – we knew we wouldn't! Our aim was to make a first pass through the data, striving for within-coder rather than between-coder consistency – though even being self-consistent across 24 wide-ranging group interviews is more an aspiration than something attainable in practice.
Now that we have combined all of our work and have an enormous collection of codes, we're focusing on them one at a time. We're producing coding summaries – aiming at ten pages or fewer – that run through the key themes within a given code, begin to unpick the meanings and patterns of language they contain, and delve into the broader discourses of which these are part. I've written about one small section of the gender code to give a flavour of what these summaries look like. Our hope is that they will provide both a starting point for papers and a way of checking emerging ideas.