
Twenty things in twenty years part eight: there’s nothing as practical as a good theory

In the early years of my career, I was at one with many in my profession in that I suffered from an insatiable hunger for recipes. I devoured the resource books that were available in the staff rooms of the schools I was teaching in, and spent much of my hard-earned cash on investing in further similar tomes. I rushed through all manner of tricks, techniques, activities and games like a demented fusion food fanatic. The words “And here’s something you might want to try in your class on Monday morning” were music to my ears – and I prided myself on being an innovative, progressive teacher. The only problem was, of course, that I had little – or no – idea as to what all this endless innovation was actually FOR, apart from to pave a road to who knew where, to facilitate what I saw back then as ‘development’, and to ensure my classes were filled with ‘fun stuff’ for my students to do, ideally – as previously stated – stuff that kept students on a potentially endless riff of speaking.

Now, it may seem odd – willfully perverse even – for someone who’s co-authored a series called Innovations to question the value of innovation. After all, there I was just a few weeks back, gratefully quaffing the British Council’s free booze and hobnobbing with the great and the good at the annual ELTONs awards night, wherein the BC “recognises and celebrates innovation in the field of English language teaching”. Wasn’t griping then, was I, eh! Well, it’s not that innovation per se is necessarily a bad thing. It’s just that it’s also not necessarily a GOOD thing, despite the way the notion of innovation is almost invariably used to describe positive developments in English – and despite the fact that its dictionary definition is simply ‘a new idea, definition or piece of equipment’. Nevertheless, the fact remains that for many of us the very idea of innovation suggests the thrill of the new and conjures up images such as these:


In classroom reality, though (and of course this is only something that has become clear with the benefit of hindsight), most of my early innovations had far more in common with the kinds of madness depicted below – familiar and yet twisted, entertaining and yet utterly pointless, transitory, fleeting, once tried and soon forgotten.


And I’d dare to venture that the vast majority of recipe-driven teaching out there falls into the same trap, sadly. Method ends up being valued over knowledge of the very thing we’re supposed to be teaching – language! The harsh fact of the matter is that unless it’s rooted in a theoretical view of both language and learning, innovation is simply change for the sake of change and is destined to result in teaching that’s of (often severely) limited practical utility to learners ninety-nine times out of a hundred. There’s an inverse correlation here that’s maybe less discussed too, though, and it’s that once you do have a theory of language and of learning that informs and feeds into your teaching, you will almost inevitably become LESS experimental, less driven by the need to find new things to do in class, and perhaps more static, more fixed. Yet out of this solidity can emerge the real wonder of the craft. It’s almost as if the disciplines you impose on your practice create something semi-routinised and thus then allow the mind to pick up on and notice what’s happening on the peripheries: the students’ interlanguage, the content of their output, the problems they encounter with the material they’re using – and the reasons for these problems, etc.

For me as a teacher and – later – as a writer and trainer, the thing that really allowed me to forge forwards and focus my classroom practice clearly and with precision was getting my head round the findings emerging from corpora research that suggested that language was often more fixed than we’d perhaps previously realised, that collocation was a key factor in fluent usage, that grammar and vocabulary existed in a complex intertwining, that co-text was at least as important as situation or context. Later, my ideas of what was important to be doing in the classroom were consolidated and further clarified by grasping the idea that competent usage emerges not – or at least only rarely – from a study of grammar rules and forms and of single words, but rather from having one’s knowledge, whether that be implicit or explicit, expanded via encounters with language in use, each and every one of which primes us to expect language to operate in certain ways again.

Which brings me more or less to where I am today: in a place where I believe that the main job of the language teacher is NOT to search out The Five Main Reasons To Use YouTube In Class or to feel somehow inadequate if you’re unable to recite in order The 12 Ways That Technology Can Enhance Your Teaching, but instead to continue first and foremost to learn and to think about language and the way it works and is used – in order to then be better able to teach students at least some of these insights. Our role in class is primarily to ensure students meet, whether through reading or listening, language that may be of use to them (and we do need to have thought about why – and, indeed, whether – what we’re teaching may be useful), to make sure it’s intelligible to them (explaining and exemplifying where necessary), to help them notice salient features of whatever language it is that comes up and to then ensure they use it in some way – and get to revise as much of it as possible at a later date.

Of course, you can do all of these things and still try out new techniques and technologies.

But at the same time, you really don’t have to.

And if you don’t, you may well still be an excellent teacher who gets good results from their students.

Maybe this seems obvious to you. If so, it may simply be because the very fact that you’re reading yet another post on my blog means you are by definition one of the converted. I’m preaching to the choir, as our American cousins would have it.

However, it may also be the case that by now you’re actually feeling guilty about the irrepressible desire you still harbour yourself for recipes. You may be starting to question where that thirst leads you and what function it serves. You may even be asking if the uses you’re making of your precious and limited free time are actually the best if you’re seeking to really facilitate advancement.

My suspicion remains that many teachers – though, of course, perhaps not those that find their way here – will fall into the latter camp quite simply because so little emphasis is placed on language development in TD circles. When was the last time you saw a conference talk or a journal paper that focused primarily on language, and in particular on language as seen from the point of view of a language teacher having to deal with the kinds of questions language students ask as they process and digest what they’re given? Never could well be a safe wager!

Why bother with such deeply unfashionable notions when there are new gimmicks to flog, new hoops to get teachers to jump through, and new recipes to fill yet more ELT cookbooks up with?

Jumping Through Hoops

Working exercises hard

Last weekend at the University of Westminster, we held our first one-day Lexical Conference. This will hopefully now become an annual event, and we were greatly encouraged by the fact that it sold out and also by the wonderful speakers we had. Alongside myself and Andrew Walkley, we had Leo Selivan, Philip Kerr, Nick Bilbrough, Luke Fletcher, Richard Paterson and Katie Mansfield, Muralee Navaratnam and as special guests of honour Michael Hoey and Michael Lewis.

I did two sessions – a plenary entitled Teaching Grammar Lexically – and a workshop called Working Exercises Hard. I had a couple of folk email me to ask if I had an online version of the sessions, which I didn’t, but due to popular request (well, ONE request at least!), I’ve taught myself how to use a great site that allows you to upload Powerpoints and narrate them and below are the fruits of my labour.

Thought it’d make a change as a blog post and if it is well received, it may be something I try and do again.

Hope you enjoy watching this and look forward to your comments and questions.

Possibility, probability and (Raymond) Murphy’s Law: dodging stray grammar bullets

If Murphy’s Law didn’t already exist, it’d be the perfect name to describe the correlation between how much a teacher knows about language, how confident they are of their own grasp of grammar, and the likelihood that at some point in the lesson they’ll go off on one and start lecturing at great – and confusing – length about an obscure point they have only the most tenuous grip on. The fact is that at the first whiff of grammar, many students suddenly spark into life and start scattering the unwitting teacher with stray grammar bullets that only years of painful experience really help you dodge. Of course, the axiom that states that whatever can go wrong, will go wrong is not irrelevant here, but it’s actually more Raymond Murphy’s Law that teachers so often inadvertently bring into play in the classroom.

I know because I’ve been there! And lived to tell the tale. I was reminded of my former selves just yesterday by a brief piece of reformulation of something a student had been trying to say in response to a question in the coursebook asking what advice people would give to a guy they’d heard moaning about his new job. As students were talking, I wrote up on the board what they were trying to say and during my round-up elicited words like SHUT from HE SHOULD JUST S…….. UP AND PUT UP WITH IT, STICK from HE SHOULD JUST STICK WITH IT and WAY from HE MIGHT BE ABLE TO WORK HIS WAY UP IN THE COMPANY. The board ended up looking like this:

Grammar Blog Post 1

As students were writing down what had ended up on the board, one student said she wasn’t sure about MIGHT BE ABLE TO. I explained that it meant maybe he can – and that we often use it after modal verbs like MIGHT and SHOULD, so we say things like I CAN’T DO IT TODAY, BUT I SHOULD BE ABLE TO DO IT SOMETIME NEXT WEEK. This seemed to satisfy her, but then Raymond Murphy’s Law kicked in and the questions came pouring forth:

“But be able to is also for the present, yes? That’s what my last teacher told me”

“And for the past. I wasn’t able to. I was able to.”

“Yes, And I am able to, like I am able to read.”

At which point I stopped the frenzy and said something along the lines of BE ABLE TO being possible in the present, but not really used much as CAN is much more common. You’d never tell anyone you can read, though, let alone that you were able to. The only thing you might say about reading is that someone CAN’T read – or that you couldn’t read the whole of a particular book – in the past – because it was too long or too boring. It’s much much more common to use CAN and CAN also refers to the future sometimes as well. I then wrote up on the board: I CAN’T MEET YOU TODAY OR TOMORROW, BUT I CAN DO SATURDAY. One student asked if COULD was also possible here, at which point other students shouted out “No! No! COULD is past”. I set them straight on this and said COULD was perfectly possible too, and was basically the same as CAN in this context – maybe a little less certain. One student asked if I’M ABLE TO or I WILL BE ABLE TO DO SATURDAY was OK. I said it was possible, but sounded weird and CAN / COULD were much more likely. I then wrote up an example using SHOULD BE ABLE TO as well, and we ended up with a board like this:

Grammar blog post 2

Students noted down what had gone up and we moved on.

The brief little episode did provide food for thought, though, and prompted a reflection on how earlier versions of myself might’ve handled this.

Both CELTA and DELTA instilled in me the belief that it was meanings and forms that were the most important things a teacher could make clear to students when tackling grammar. The whole trinity of meaning, form and pronunciation – or MFP for short (an acronym that for someone like me, who’s spent far too much of his life trawling second-hand record stores and charity shops, always recalled . . . with a chuckle . . . the Music For Pleasure label logo!!) – was pretty much all I considered when it came to handling anything grammatical for maybe the first six or seven years of my teaching career.


This, coupled with the obsession with the Present-Practise-Produce approach to grammar that these courses instilled in me, meant that any incident such as the one I describe above would once have sparked major anxiety. “They still don’t get be able to”, I would’ve fretted. “I’d better build in a whole hour-long slot on it tomorrow – and give them a page on it from Murphy’s as homework.” Or else I may well have simply told them that yes, it can be used in the present. And the past. And then have written a few bizarre examples up, or perhaps simply have written up WAS / WERE ABLE TO + VERB, AM / IS / ARE ABLE TO + VERB, WILL BE ABLE TO + VERB and left it at that.

The single biggest thing that has improved my grammar teaching – and quite possibly my teaching in general (certainly the vocabulary part of what I do, for sure) – is getting my head round something I first read in The Lexical Approach by Michael Lewis: teach the probable, not the possible. Sure, tons of things MIGHT be said, but are they USUALLY? Yes, of course, be able to CAN be used in the present, but certainly not in the context the students presented it to me in . . . and generally only in fairly specific kinds of genres / contexts, none of which had particular pertinence here. Narrow things down to particulars. Focus on what’s typical. Give clear, concise explanations and examples. Move on. You’ll pass this way again sooner or later anyway, and accuracy comes in dribs and drabs. It seems fairly clear, also, that it depends more on the accretional impact of examples – or on priming, if you prefer – than on any particularly sophisticated grasp of the subtleties of rules.

Knowing these things and teaching with them ever present in the mind has allowed me not only to enjoy my teaching far more, and to feel less bogged down by pointless rambling meta-linguistic waffle, but also to feel I’m actually helping more – both by giving simple, easy-to-digest examples and by warning students off random friendly fire, by encouraging them to lay down arms and reduce the paranoia. And by doing this Murphy’s Law can finally be thwarted.

What have corpora ever done for us?

Following a conversation over on the Facebook page I use for talking about teaching and language, I’ve decided to post a talk I did at IATEFL many moons ago. I do remember, with a faint smile, that Dave Willis himself came along to watch this one, but at some point became overcome with either rage or tedium and flounced out, thus allowing me to make the cheap jibe about Elvis having left the building before carrying on. Were this post to generate even a tenth of that heady level of excitement, I’d be delighted!

Written maybe ten years ago, at the height of the corpora promo boom, it was intended as a partially tongue-in-cheek critical overview of corpora linguistics. And yes, for those of you that were wondering, the title WAS inspired by this rather splendid Monty Python sketch:

With that in place, here goes nothing . . .

The use of computers to store and help analyse language has obviously revolutionised many aspects of language teaching, and corpora linguists have become an ever-present feature at IATEFL and other similar conferences. Obviously, much good has come from this. We have had a whole new generation of much-improved dictionaries, all of which contain better information about usage, collocation and frequency; superb new reference books such as the Longman Grammar of Spoken and Written English have been made possible, and, perhaps inadvertently, corpora linguistics helped to launch the Lexical Approach and to thus help to move language at least some way back towards the centre of language teaching. Nevertheless, it seems to me that despite all these advances, corpora linguistics has also had several negative side-effects on the way teachers perceive their roles, and that they have actually enslaved us in ways which are not entirely healthy. I would like to move on to consider the ways in which I feel this has occurred.

The fallacy of frequency

Corpora linguists repeatedly promote their products with often highly-detailed reference to frequency counts, and the idea that frequency is central has become a common one. However, should a Pre-Intermediate learner wish to be passed the salt over dinner, simply knowing the infrequent item ‘Salt’ will facilitate this in a way that knowing the far more frequent ‘Could’, ‘you’, ‘pass’, ‘the’ and ‘please’ would not. Generally, it’s not the most common words which carry core meanings; rather, it’s the far rarer items that do. Simply knowing the 800 most common words in the language makes you only able to say a lot about not very much. In the same way, failure to learn words which may well be low-frequency generally, but which are possibly much higher frequency within specific types of conversations, condemns you to not being able to say very much about a lot!! Frequency tells us nothing more than what is frequent. It cannot tell us what’s useful, what’s necessary or even what’s teachable.

There are deeper problems here to do with the way in which frequency is actually calculated. Corpora remain word-obsessed and the process of lemmatisation compounds this. Hence, an idiom like ‘You’re a dark horse’ is entered not as a two-word idiom, but rather as one example of ‘dark’ and another of ‘horse’, thus defaulting on two fronts. Similarly, plural nouns are currently counted as other examples of singular ones, which is a rather major oversight. Is, for instance, the singular of ‘Many Happy Returns’ ‘A Happy Return’? ‘Meetings’ is not simply the plural of ‘meeting’, and it collocates with different words. Finally, knowing that, say, ‘get’ is a very common word does little to help teachers know whether ‘get on with it’ is more frequent than ‘Let’s get down to business’. Sadly, until corpora start sorting by chunk they will remain of limited relevance.
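For anyone curious what the difference between counting by word and counting by chunk looks like in practice, here is a rough sketch in Python. The sample text is a handful of sentences invented purely for illustration, the helper names tokenise and ngram_counts are my own, and this is not how any published corpus is actually built. It simply shows that a single-word count for ‘get’ cannot tell you how its chunks compare, whereas a count of four-word sequences can.

```python
from collections import Counter
import re

# Toy sample, invented purely for illustration (not real corpus data).
SAMPLE = (
    "Come on, get on with it. Let's get down to business. "
    "I didn't get the job. Just get on with it, will you? "
    "Did you get my email? Right, let's get on with it."
)


def tokenise(text):
    """Lowercase the text and split it into word tokens, keeping apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens (a crude stand-in for 'chunks')."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


tokens = tokenise(SAMPLE)

# Word-level view: every use of 'get' lands in the same bucket.
word_counts = Counter(tokens)
get_total = word_counts["get"]

# Chunk-level view: the four-word sequences are counted separately.
four_grams = ngram_counts(tokens, 4)
get_on_with_it = four_grams[("get", "on", "with", "it")]
get_down_to_business = four_grams[("get", "down", "to", "business")]

print("get:", get_total)
print("get on with it:", get_on_with_it)
print("get down to business:", get_down_to_business)
```

On this made-up sample the word count lumps every use of ‘get’ together, while the chunk counts separate ‘get on with it’ from ‘get down to business’. A real tool would also need to respect sentence boundaries and cope with chunks of varying length, which this sketch ignores.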

The fallibility of human endeavour

That corpora need to be approached cautiously and with one’s intuition fully tuned is made apparent by a cursory glance at the word ‘thaw’ on several published CDs. Should one access the word, wishing to know whether snow melts or thaws, one would be surprised to learn that a far more frequent example of the word, and thus – if we follow the logic of corpora linguists – a more useful collocate for our students is actually John, as in John Thaw, the late, great British actor.

Similarly, I once saw a Jane Willis talk wherein she suggested that one of the most common three-word lexical items in the English language was ‘Princess of Wales’. It was only when pushed during questioning that she actually admitted that the corpus she had taken this data from was based almost exclusively on a couple of radio phone-in programmes. In the same way, the actual construction of corpora-based materials – dictionaries and the like – also inevitably involves a degree of hammering out by researchers, often by means of a vote or a fudge. Corpora are by necessity human constructs based on limited samples of data, are easily skewed by input and thus are best viewed sceptically.

The limitations of what corpora can offer

While spoken language, conversation, may well form the basis – even the majority – of many corpora, what corpora can’t show us is what typical conversations look like. It’s not possible, for instance, to access ten typical conversations had by people talking about what they did last night or to look at the 20 most common ways of answering the question “So what do you do for a living, then?”. As such, if we want to present our students with models of the kinds of conversations they themselves might actually want to have, we are forced to fall back on our (actually ample) experience of such conversations in order to script them. However, I would argue that it is precisely because we have got such broad experience of such conversations that we do tend to know how they work and sound and look.

For teaching purposes, we need to be able to script conversations that aren’t so culturally and spatially bound as to exclude students; we need to ensure the conversations students are exposed to still somehow facilitate intra-class bonding. Input needs to be proto-typical and to include items which are easy for us to systematise and for learners to appropriate and assimilate. Corpora cannot do this for us.

Corpora and the non-native speaker teacher

It is often claimed – mainly by those who are employed to make, package and sell corpora – that corpora are an invaluable aid for the non-native speaker teacher. I would personally argue that the opposite is far too often true and that as they stand, corpora massively favour native speakers.

One understandable reaction many teachers, both native and non-native, have to the notion that they should teach more spoken English is the “But I’d never say this or that bit of language” response when faced with a spoken text. Ironically, written texts never elicit a similar “But I’d never write that myself” response, and there are several reasons for this, I feel. There is possibly an assumption that writing is a more creative realm where anything goes; there’s also the fact that the grammar and the lexis of the written language have already been codified and disseminated and are thus more familiar to teachers; thirdly, I think, there’s the fact that we pin our identities on our speech – our idiolect, our regional, class-based, age-oriented, in-group, gender-based grasp of lexis and grammar – far more profoundly than we do on what we write. We are so aware of differences in the way we speak that we usually fail to notice the massive similarities. A good example of this is the fact that every EFL book which focuses on the UK / US divide fails to note that the vast majority of the language used in both countries is remarkably similar, and instead frets over the present perfect, sidewalks versus pavements and the correct pronunciation of aluminium. Yet for every “It came out of the blue” / “It came out of left field” divergence, there must surely be ten other idioms we all have in common.

Given this, I personally feel it doesn’t take much to persuade non-native speaker teachers to stick to the already familiar, tried-and-tested formula of written texts and comprehension questions and structural grammar. By spending so much time pointing out relatively obscure quirks and neologisms, such as the fact that ‘like’ is being increasingly used to report speech (as in “He was like ‘Hi’ so I was like ‘Bye’”), corpora linguists are inadvertently making spoken English more of a foreign language for non-native speaker teachers than is perhaps wise for people who claim to believe – as I do – that spoken English should become much more a part of General English than is currently the case. Too relentless a focus on the new, the odd, the interesting, the different obscures the wealth of English that unites us all.

I also feel that it is not only many non-native speaker teachers who would never use ‘like’ in this way, but also many native speakers too. The vast majority of language teachers do NOT need corpora to tell us that this is a relatively unuseful piece of lexis, so long as it remains still relatively unused. Indeed, my own rule of thumb would be that if YOU don’t say it, don’t TEACH it. English as a foreign language is NOT English as the corpora know it. If you believe, as I do, that the kind of model conversations coursebooks provide for teaching purposes should be better modelled on the information provided by corpora than is currently the case, then I find it hard to see how you couldn’t also support the idea that corpora specialists should concentrate more on insights which will be of direct use to coursebook writers and teachers alike. Indeed, given the problematic status of spoken language within the classroom at present, I’d go so far as to assert that doing anything less serves to sabotage attempts to spread a methodology based on spoken language (and here, of course, I’m compelled to acknowledge my own interest in this area as a coursebook writer).

I find it particularly interesting to note that the constructors of corpora – or at least their backers – seem as yet very reluctant to work on a corpus of English as used by non-native speakers. Obviously, this would be in essence the same corpus, but with much left out. This is precisely the point: that which is left out by competent non-native speakers has no real place in most – and especially most pre-Advanced – teaching materials.

Animal Farm (or Beware of the oppressive tendencies of those who come claiming to liberate us!!)

It would be churlish to deny that corpora have provided us with some useful insights into such features of language as the fact that would is three times more common when talking about past habits than used to is, but at the same time it must also be added that the way in which corpora have been presented has all too often intimidated us into pretending that we didn’t already know much – if not most – of what they confirm. For example, Mike McCarthy, at IATEFL Brighton 2001, spent half an hour blinding us with the statistics that showed – entirely unsurprisingly – that ‘take the mickey’ is far more common than ‘mickey-taker’ or ‘mickey-taking’. Surely any fluent speaker of the language could have guessed this (dubiously relevant) fact themselves, based on their own intuitions about the language.

The relentless emphasis on the finality of corporal truth not only denies the reality of the classroom practitioner who has to get in there each and every day and try to give their students information about the language being studied, but also refuses to acknowledge the fact that we all have heard and read millions and millions more words than any corpus will ever hold and thus have good hunches about words as a result. Sure, hunches about language can be wrong, but more often than not, they aren’t. I personally really resent the notion that not only are corpora useful for showing us the errors of our ways, but also for confirming when we’re right. The implication is that we are not right UNTIL we’ve checked! This way lies madness – and the deskilling of us all!!


Obviously, it is important that teachers do keep themselves up-to-date with corpora findings and adapt their understanding of the way language works accordingly. Here I totally agree with Ron Carter that one thing corpora have helped us become more aware of is the fact that grammar is much broader than sentence-based / tense-based grammar would seem to suggest. Words have their own micro-grammar and so lexis needs to continuously be grammaticalised in typical ways. Nevertheless, it is also vital that teachers are encouraged to believe that they can tap into and trust their own inner corpora.

If Carter and McCarthy can proclaim that the more students are encouraged and trained to notice, the more they actually will notice, then the same must surely be true for us as teachers. Indeed, the true sign of corpora-work well done is its own eventual redundancy. This really brings me to my final point – one of the great ironies of corpora is that they have actually unwittingly made teachers more intuitive, not less. What corpora have done is to place language back at the centre of classrooms and, as such, we all now have to think much more about how we actually use language.

To a degree, corpora and teachers exist in a parent-child relationship, and many teachers are now ready to leave home. Thanks Mum and Dad – you’ve done a great job, we may be back to visit every now and then, but we’ve basically already got the message!

However, lest we forget, corpora are bank-rolled by major publishing houses and have endless spin-off publications derived from them in an effort to recoup much of this investment. As such, maybe I’m expecting too much by asking those in receipt of the publisher’s pound to loose the reins on much of their power and place it back where it rightly belongs – back in the hands of the humble classroom practitioners!!!