What have corpora ever done for us?

Following a conversation over on the Facebook page I use for talking about teaching and language, I’ve decided to post a talk I did at IATEFL many moons ago. I do remember, with a faint smile, that Dave Willis himself came along to watch this one, but at some point was overcome with either rage or tedium and flounced out, thus allowing me to make the cheap jibe about Elvis having left the building before carrying on. Were this post to generate even a tenth of that heady level of excitement, I’d be delighted!

Written maybe ten years ago, at the height of the corpora promo boom, it was intended as a partially tongue-in-cheek critical overview of corpora linguistics. And yes, for those of you that were wondering, the title WAS inspired by this rather splendid Monty Python sketch:

With that in place, here goes nothing . . .

The use of computers to store and help analyse language has obviously revolutionised many aspects of language teaching, and corpora linguists have become an ever-present feature at IATEFL and other similar conferences. Obviously, much good has come from this. We have had a whole new generation of much-improved dictionaries, all of which contain better information about usage, collocation and frequency; superb new reference books such as the Longman Grammar of Spoken and Written English have been made possible; and, perhaps inadvertently, corpora linguistics helped to launch the Lexical Approach and thus to move language at least some way back towards the centre of language teaching. Nevertheless, it seems to me that despite all these advances, corpora linguistics has also had several negative side-effects on the way teachers perceive their roles, and that corpora have actually enslaved us in ways which are not entirely healthy. I would like to move on to consider the ways in which I feel this has occurred.

The fallacy of frequency

Corpora linguists repeatedly promote their products with often highly-detailed reference to frequency counts, and the idea that frequency is central has become a common one. However, should a Pre-Intermediate learner wish to be passed the salt over dinner, simply knowing the infrequent item ‘salt’ will facilitate this in a way that knowing the far more frequent ‘could’, ‘you’, ‘pass’, ‘the’ and ‘please’ would not. Generally, it’s not the most common words which carry core meanings; rather, it’s the far rarer items that do. Simply knowing the 800 most common words in the language makes you only able to say a lot about not very much. In the same way, failure to learn words which may well be low-frequency generally, but which are possibly much higher frequency within specific types of conversations, condemns you to not being able to say very much about a lot!! Frequency tells us nothing more than what is frequent. It cannot tell us what’s useful, what’s necessary or even what’s teachable.

There are deeper problems here to do with the way in which frequency is actually calculated. Corpora remain word-obsessed and the process of lemmatisation compounds this. Hence, an idiom like ‘You’re a dark horse’ is entered not as a two-word idiom, but rather as one example of ‘dark’ and another of ‘horse’, thus defaulting on two fronts. Similarly, plural nouns are currently counted as other examples of singular ones, which is a rather major oversight. Is, for instance, the singular of ‘Many Happy Returns’ ‘A Happy Return’? ‘Meetings’ is not simply the plural of ‘meeting’, and it collocates with different words. Finally, knowing that, say, ‘get’ is a very common word does little to help teachers know whether ‘get on with it’ is more frequent than ‘Let’s get down to business’. Sadly, until corpora start sorting by chunk they will remain of limited relevance.
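For anyone who likes to see this kind of thing spelled out, here is a minimal sketch of the difference between counting by word and counting by chunk. It is not how any published corpus tool works; the five utterances are invented for illustration, and the names `utterances`, `ngrams`, `word_counts` and `chunk_counts` are my own assumptions.

```python
from collections import Counter

# Toy "corpus": invented utterances, purely for illustration.
utterances = [
    "you're a dark horse",
    "let's get down to business",
    "get on with it",
    "come on get on with it",
    "it was dark outside",
]

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a single string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

word_counts = Counter()   # frequency of single words
chunk_counts = Counter()  # frequency of two-, three- and four-word chunks

for utterance in utterances:
    tokens = utterance.split()
    word_counts.update(tokens)
    for n in (2, 3, 4):
        chunk_counts.update(ngrams(tokens, n))

# The word count says 'get' is common, but not which chunk it lives in.
print("get:", word_counts["get"])
print("get on with it:", chunk_counts["get on with it"])
print("get down to business:", chunk_counts["get down to business"])

# 'dark' is counted twice as a word, yet only one use is the idiom.
print("dark:", word_counts["dark"])
print("dark horse:", chunk_counts["dark horse"])
```

Even on this tiny sample, the word list alone cannot tell you that ‘get on with it’ outnumbers ‘get down to business’, or that only one of the two instances of ‘dark’ belongs to ‘dark horse’. The chunk counts can.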

The fallibility of human endeavour

That corpora need to be approached cautiously and with one’s intuition fully tuned is made apparent by a cursory glance at the word ‘thaw’ on several published CDs. Should one access the word, wishing to know whether snow melts or thaws, one would be surprised to learn that a far more frequent example of the word, and thus – if we follow the logic of corpora linguists – a more useful collocate for our students is actually John, as in John Thaw, the late, great British actor.

Similarly, I once saw a Jane Willis talk wherein she suggested that one of the most common three-word lexical items in the English language was ‘Princess of Wales’. It was only when pushed during questioning that she actually admitted that the corpus she had taken this data from was based almost exclusively on a couple of radio phone-in programmes. In the same way, the actual construction of corpora-based materials – dictionaries and the like – also inevitably involves a degree of hammering out by researchers, often by means of a vote or a fudge. Corpora are by necessity human constructs based on limited samples of data, are easily skewed by input and thus are best viewed sceptically.

The limitations of what corpora can offer

While spoken language, conversation, may well form the basis – even the majority – of many corpora, what corpora can’t show us is what typical conversations look like. It’s not possible, for instance, to access ten typical conversations had by people talking about what they did last night or to look at the 20 most common ways of answering the question “So what do you do for a living, then?”. As such, if we want to present our students with models of the kinds of conversations they themselves might actually want to have, we are forced to fall back on our (actually ample) experience of such conversations in order to script them. However, I would argue that it is precisely because we have got such broad experience of such conversations that we do tend to know how they work and sound and look.

For teaching purposes, we need to be able to script conversations that aren’t so culturally and spatially bound as to exclude students; we need to ensure the conversations students are exposed to still somehow facilitate intra-class bonding. Input needs to be proto-typical and to include items which are easy for us to systematise and for learners to appropriate and assimilate. Corpora cannot do this for us.

Corpora and the non-native speaker teacher

It is often claimed – mainly by those who are employed to make, package and sell corpora – that corpora are an invaluable aid for the non-native speaker teacher. I would personally argue that the opposite is far too often true and that as they stand, corpora massively favour native speakers.

One understandable reaction many teachers, both native and non-native, have to the notion that they should teach more spoken English is the “But I’d never say this or that bit of language” response when faced with a spoken text. Ironically, written texts never elicit a similar “But I’d never write that myself” response, and there are several reasons for this, I feel. There is possibly an assumption that writing is a more creative realm where anything goes; there’s also the fact that the grammar and the lexis of the written language have already been codified and disseminated and are thus more familiar to teachers; thirdly, I think, there’s the fact that we pin our identities on our speech – our idiolect, our regional, class-based, age-oriented, in-group, gender-based grasp of lexis and grammar – far more profoundly than we do on what we write. We are so aware of differences in the way we speak that we usually fail to notice the massive similarities. A good example of this is the fact that every EFL book which focuses on the UK / US divide fails to note that the vast majority of the language used in both countries is remarkably similar, and instead frets over the present perfect, sidewalks versus pavements and the correct pronunciation of aluminium. Yet for every “It came out of the blue” / “It came out of left field” divergence, there must surely be ten other idioms we all have in common.

Given this, I personally feel it doesn’t take much to persuade non-native speaker teachers to stick to the already familiar, tried-and-tested formula of written texts and comprehension questions and structural grammar. By spending so much time pointing out relatively obscure quirks and neologisms, such as the fact that ‘like’ is being increasingly used to report speech (as in “He was like ‘Hi’ so I was like ‘Bye’”), corpora linguists are inadvertently making spoken English more of a foreign language for non-native speaker teachers than is perhaps wise for people who claim to believe – as I do – that spoken English should become much more a part of General English than is currently the case. Too relentless a focus on the new, the odd, the interesting, the different obscures the wealth of English that unites us all.

I also feel that it is not only many non-native speaker teachers who would never use ‘like’ in this way, but also many native speakers too. The vast majority of language teachers do NOT need corpora to tell us that this is a relatively unuseful piece of lexis, so long as it remains still relatively unused. Indeed, my own rule of thumb would be that if YOU don’t say it, don’t TEACH it. English as a foreign language is NOT English as corpora know it. If you believe, as I do, that the kind of model conversations coursebooks provide for teaching purposes should be better modelled on the information provided by corpora than is currently the case, then I find it hard to see how you couldn’t also support the idea that corpora specialists should concentrate more on insights which will be of direct use to coursebook writers and teachers alike. Indeed, given the problematic status of spoken language within the classroom at present, I’d go so far as to assert that doing anything less serves to sabotage attempts to spread a methodology based on spoken language (and here, of course, I’m compelled to acknowledge my own interest in this area as a coursebook writer).

I find it particularly interesting to note that the constructors of corpora – or at least their backers – seem as yet very reluctant to work on a corpus of English as used by non-native speakers. Obviously, this would be in essence the same corpus, but with much left out. This is precisely the point: that which is left out by competent non-native speakers has no real place in most – and especially most pre-Advanced – teaching materials.

Animal Farm (or Beware of the oppressive tendencies of those who come claiming to liberate us!!)

It would be churlish to deny that corpora have provided us with some useful insights into such features of language as the fact that ‘would’ is three times more common when talking about past habits than ‘used to’ is, but at the same time it must also be added that the way in which corpora have been presented has all too often intimidated us into pretending that we didn’t already know much – if not most – of what they confirm. For example, Mike McCarthy, at IATEFL Brighton 2001, spent half an hour blinding us with the statistics that showed – entirely unsurprisingly – that ‘take the mickey’ is far more common than ‘mickey-taker’ or ‘mickey-taking’. Surely any fluent speaker of the language could have guessed this (dubiously relevant) fact themselves, based on their own intuitions about the language.

The relentless emphasis on the finality of corporal truth not only denies the reality of the classroom practitioner who has to get in there each and every day and try to give their students information about the language being studied, but also refuses to acknowledge the fact that we all have heard and read millions and millions more words than any corpus will ever hold and thus have good hunches about words as a result. Sure, hunches about language can be wrong, but more often than not, they aren’t. I personally really resent the notion that corpora are useful not only for showing us the errors of our ways, but also for confirming when we’re right. The implication is that we are not right UNTIL we’ve checked! This way lies madness – and the deskilling of us all!!


Obviously, it is important that teachers do keep themselves up-to-date with corpora findings and adapt their understanding of the way language works accordingly. Here I totally agree with Ron Carter that one thing corpora have helped us become more aware of is the fact that grammar is much broader than sentence-based / tense-based grammar would seem to suggest. Words have their own micro-grammar and so lexis needs to continuously be grammaticalised in typical ways. Nevertheless, it is also vital that teachers are encouraged to believe that they can tap into and trust their own inner corpora.

If Carter and McCarthy can proclaim that the more students are encouraged and trained to notice, the more they actually will notice, then the same must surely be true for us as teachers. Indeed, the true sign of corpora-work well done is its own eventual redundancy. This really brings me to my final point – one of the great ironies of corpora is that they have actually unwittingly made teachers more intuitive, not less. What corpora have done is to place language back at the centre of classrooms and, as such, we all now have to think much more about how we actually use language.

To a degree, corpora and teachers exist in a parent-child relationship, and many teachers are now ready to leave home. Thanks Mum and Dad – you’ve done a great job, we may be back to visit every now and then, but we’ve basically already got the message!

However, lest we forget, corpora are bank-rolled by major publishing houses and have endless spin-off publications derived from them in an effort to recoup much of this investment. As such, maybe I’m expecting too much by asking those in receipt of the publisher’s pound to loose the reins on much of their power and place it back where it rightly belongs – back in the hands of the humble classroom practitioners!!!

Twenty things in twenty years – Part Two: troubling trouble when trouble troubles you!

There are plenty of things that you generally don’t learn on a four-week CELTA course: how bizarre many of the staff rooms you’ll later find yourself in will be; how rife the illegal photocopying of published material is around the world; how you’ll probably end up inventing Dogme by accident one morning as you stumble into class having not slept a wink and quite possibly with either an illegal or at least a severely impaired bloodstream; how sooner or later you’ll find yourself subjected to threats / bribes / tears / offers of sexual favours as students desperately try and blag attendance certificates or better test results or placement in a level they absolutely don’t deserve to enter. I could obviously go on and on here! However, the one thing that perhaps more attention should be paid to on initial training courses is the subject of today’s reflective post wherein I look back over what’s now twenty years of teaching and try to work out what the hell I’ve learned about the trade: the kind of trouble that can erupt – or fester – in EFL classes and how we as teachers might best tackle it. In other words, how to trouble trouble before trouble troubles you – and the class you’re teaching!

The moment that I came to realise the importance of developing strategies for doing this came unpleasantly early in my teaching career. I’d somehow managed to blag my first real paid teaching job at St. Giles Central in London and had a lovely Intermediate-level class that I was doing every morning. They were predominantly Asian, with students from a wide range of different countries. The first week or so went really well and then the evil effects of continuous enrolment reared their heads the following Monday when the door opened fifteen minutes into class and in walked a medallion-wearing living breathing stereotypical Italian male, complete with unbuttoned shirt and such a copious amount of hair on display that I’m prepared to believe it may well have been a chest wig. “Francesco Celotto from Milano,” he announced, as though this in itself merited a round of applause. “Come in,” I smiled, before adding “You’re late!” He then surveyed the room a couple of times with a look of increasing unease before uttering the immortal lines “Ma dai! But it is all the Japanese in here.” It was at this point I realised we had what could only be termed a situation. It was one of those moments where you suddenly sense just how much is riding on what you decide to do next. Say nothing, and you’re essentially colluding with this ignorance. Come down heavy and you’ve got one very pissed-off new student who’s lost face and who now hates you. What to do? What to do?

In the end, I smiled and said “Not quite, Francesco. This is Dilokpol. He’s from Thailand. And this is Henu, from Indonesia. This is Lily from Vietnam, and this is Chen Chen from Hong Kong. This is Agnes from The Philippines, this is Nan-Joo from South Korea and oh, this is Kenzo, who actually IS from Japan, so one out of seven. Not bad, not bad. And which part of Spain were you from again?” – a question which caused Francesco to look incredulous and to insist on his Italian origins. “Exactly”, I pointed out. “Where you’re from is important to you, right? And it’s the same for everyone else in the class, OK?” Firm but friendly smile tinged with just a tiny touch of menace. Move on.

I’m not sure how I knew to do this or what led me to make the choices I made in this instance. As I’ve already said, it certainly wasn’t anything my initial – and let’s face it, most CELTAs are VERY initial – training had prepared me for. There’d been no suggestion there that TEFL was going to be anything other than a constant holiday camp roller-coaster ride of great big neon FUN. I suppose I’d just developed – unconsciously up until this juncture – conflict resolution or deflation skills the way that most of us do – by living! Life, whether we like it or not, comes with conflict in-built and whether it had been avoiding school bullies, recognising who not to stare at too long at football matches, working as a bouncer in dodgy London pubs whilst at uni or going through relationship break-ups, I’d somehow gotten to the stage where I was able to defuse this potential bomb in such a way as to show the Asian students in my class that I’d noticed the affront and wasn’t prepared to accept it, whilst also somehow keeping Francesco onside with a kind of firm humour.

This was one of the most crucial lessons I learned early on in my teaching career and, having survived this baptism by fire, I was set to be able to survive similarly testing encounters over the years to come. Now, I’m not suggesting that this was the only way of dealing with this situation, but it worked for me and the combination of stern / serious and kind / inclusive has stood me in good stead. Obviously, failure to develop ways of ensuring parity and equality in class; of ensuring students are not allowed to offend or abuse each other – or at least do not get to do so without being aware of the fact that this is what they’ve done; of ensuring that you as a teacher are in charge of the class and are able to meld its disparate elements into something resembling a cohesive whole can all lead to disaster . . . to lessons slipping out of your control; to factions developing; to outright mutiny; to upset and anger; to complaints and possibly even dismissal. All of which ought really to suggest that we start taking our innate conflict-handling abilities a bit more seriously on initial training courses and at least allowing space for some discussion of how and when they might best be implemented.

As the years have gone by, I’d like to think I’ve honed the way I deal with conflict into an even more effective technique, which is essentially two-fold and involves (a) defusing tension by turning arguments inwards towards new linguistic input and (b) if I think something is particularly wrong or offensive, politely saying that I disagree and explaining why. To wrap up this post, one quick example. A year or so ago, I was teaching a multilingual Upper-Intermediate group here in London. The word DISCRIMINATION came up in an exercise we were doing and one student asked if it was like racism. I explained it was kind of similar, though mainly limited to unfair treatment – rather than abuse or violence – and also mainly limited to the ability to get jobs, promotion, housing, and so on. I then said that in some ways it was also sort of bigger than racism as you could face discrimination if you were black or Asian, but that you could also FACE DISCRIMINATION or BE DISCRIMINATED AGAINST ON THE GROUNDS OF gender, so it’s harder for women to get some jobs; on the grounds of sexual orientation, so it’s harder to get work or housing if you’re openly gay and so on. At this point, a student said “Gay is like homosexual?” to which I replied “yes, but homosexual is quite old-fashioned and most homosexuals usually prefer to be called GAY”. The student then said something along the lines of “I hate the gays. They must die” – to generally fairly stunned / bemused / upset silence in class.

“Well, you’re entitled to think what you think, and I’m not here to change your mind”, I began, “but personally I think you’re wrong. I have plenty of gay friends and it’s not nice to think you want them dead. There may even be gay people in this class, for all you know. Anyway, you can think what you think. It’s up to you. In the university, though, saying things like that can get you kicked out. You can be thrown off courses if you make HOMOPHOBIC COMMENTS.” I then explained the concept and wrote up on the board the following:

You can be kicked out of the university for making racist / sexist / homophobic / anti-Semitic / Islamophobic comments.

There followed a brief discussion of each of the concepts and a discussion about whether or not similar rules applied in higher education institutions in their countries. Interestingly, and I’m certainly not claiming that this kind of thing happens all the time, at the end of the class this particular student came up and apologised and said he’d never had a discussion about any of these issues before and had never met anybody ‘who knows the gays’. We then had a further talk which took in things like ‘why the gays like men’ and the like – and no further comments of this nature were ever heard in my class again.

This defusing of potential heat by turning it inwards towards the teaching of new language has worked for me thus far.

Long may it continue to do so.

Twenty Things In Twenty Years Part One: Falling Into A Me-Shaped Hole

In much the same way as I once found it inconceivable that I’d ever suffer the indignity of reaching the terrifying age of 30, so it seems preposterous that this year marks the twentieth anniversary of my career in English Language Teaching! In acknowledgement and commemoration of this rather momentous life event, I’ve decided that over the course of the next twelve months I shall attempt to blog twenty pearls of wisdom I’ve gleaned during my years at the chalk face . . . and in publishing and on the conference circuit.

In April 1993, I stumbled onto my one-month CTEFLA course at Westminster College, having spent the previous two years (since graduating in 1991) doing everything from building site labouring to making sandwiches in a factory canteen, from demonstrating ‘the ancient Chinese game of Jenga’ (TM) in Hamley’s the Toy Shop to buying and selling old records in the legendary and indeed infamous Music and Video Exchange empire, all the while trying my darndest to enjoy the many and varied delights, shall we say, that London’s nightlife had to offer. I was 24 and reaching some kind of burnout point. A change I was most definitely ready for!

As with many native-speaker teachers, a career in education was certainly never something I’d planned on. In fact, it was a fateful conversation in a pub in Soho with an old friend, the splendidly named Julian Savage, that pushed me on down the road I’ve been exploring ever since. Julian was a few years older than me; I’d first encountered him in Our Price Hastings and our initial bond was to do with the fact we both sported bowl cuts and loved The Byrds and The 13th Floor Elevators. Julian had himself wandered into TEFL a few years earlier as a way to facilitate his wanderlust and peripatetic lifestyle. Anyway, he was briefly back from a sojourn in Iran. Or was it Ethiopia? Or Indonesia? Anyway, we retired to a watering hole to catch up and shoot the breeze. At some point, I mentioned I was in need of a change of scene and was contemplating heading off round the world in search of thrills and pastures new – at which juncture a CTEFLA was suggested. “Why would I want to be a teacher?” I asked incredulously. “I hated most of my teachers at school!” “Well,” Julian countered, “that’s as good a reason as any for becoming a teacher! Look on it as a form of revenge.” And thus my fate was sealed!

With a full set of negative role models to kick against, I stashed two grand away during a gruelling six-month stint working bars seven nights a week and embarked on a whole new adventure. Now, here’s the thing: almost as soon as I’d finished my first twenty-minute teaching practice, I had a strange and most singular feeling – here was some kind of work for which being me was not only no longer a profound disadvantage, but where it may actually be an advantage! In every other form of paid employment I’ve ever had, with the possible exception of second-hand record store work, at some point or other being me caused problems. I struggled to confine myself to the (often stark) parameters of the work; I struggled to keep my big mouth shut when confronted with idiotic rules and jobsworths; I struggled not to give in to the overwhelming desire to gouge my own mind out in frustration at the sheer tedium of so much of it!

In many ways, teaching didn’t feel – and to some extent never really has felt – like real work at all, certainly not when compared to trying to prevent the local apes from ripping each other’s faces off on a Friday night’s pub crawl down the Old Kent Road! As such, it’s probably worth considering why that might be the case.

Obviously, much of the early appeal, apart from (and let’s be honest here) the thrill of being in close proximity to so many beautiful and interesting young people from all over the world, was down to the space teaching allowed for whatever kind of demented (albeit well-intentioned) attempts to create my own lessons I could muster. It took me probably far too long to realise that not only were my students not massively interested in lessons based around David Bowie’s God Knows I’m Good or A Clockwork Orange, but also – more crucially – that they weren’t teaching much of real utility.


I was also slow to grasp that stumbling into class pretending to be drunk really wasn’t the best way of teaching the present perfect continuous, but I was still intoxicated by the freedom allowed me and by the plaudits of being ‘dynamic’ that students rained on me.

In retrospect I can see that a lot of poor teaching is excused – or possibly even validated – by a kind of pedagogical relativity, where we persuade ourselves that we teach as we wish to be taught, as though this justifies all, or where rampant experimentation is not only tolerated but actively encouraged. The point is, though, that teaching is a broad church and one that allows you to explore and work through all of this and more. Which is why becoming an English language teacher felt to me – and I’m sure to many many others – like falling into a me-shaped hole.

I later learned, of course, that the Subud quote on the back of one of the early Funkadelic LPs about freedom being free of the need to be free is profoundly true when it comes to teaching, and that it’s perfectly possible to still be both completely yourself in class and yet operate within clearly thought-out and even fairly narrow parameters.

But that, perhaps, is an area best left for another day!