Professor Paul Baker
Research Interests
My research interests include corpus linguistics, language and identities, and (critical) discourse analysis. Books include: Using Corpora to Analyse Gender (2014), Discourse Analysis and Media Attitudes (2013), Sociolinguistics and Corpus Linguistics (2010), Sexed Texts: Language, Gender and Sexuality (2008), Using Corpora in Discourse Analysis (2006), Public Discourses of Gay Men (2005) and Polari: The Lost Language of Gay Men (2002). I am the commissioning editor for the journal .
Collins, L. and Baker, P. (2023) Language, Discourse and Anxiety. Cambridge: CUP.
Baker, P. (2023) Using Corpora for Discourse Analysis. Second Edition. London: Bloomsbury.
Baker, P. (2023) . London: Footnote Press
Gillings, M. Mautner, G. and Baker, P. (2023) . Cambridge Elements: CUP.
Baker, P. (2022) Outrageous! The Story of Section 28 and Britain's Battle for LGBT Education. London: Reaktion.
Brookes, G. and Baker, P. (2021) Obesity in the News: Language and Representation in the Press. Cambridge: Cambridge University Press.
Baker, P. Vessey, R. and McEnery, T. (2021) The Language of Violent Jihad. Cambridge: Cambridge University Press.
Egbert, J. and Baker, P. (eds) (2019) Using Corpus Methods to Triangulate Linguistic Analysis. London: Routledge.
Baker, P. (2019) Fabulosa! The Story of Polari, Britain's Secret Gay Language. London: Reaktion.
Baker, P., Brookes, G. and Evans, C. (2019) The Language of Patient Feedback: A corpus linguistic study of online health communication. London: Routledge.
Baker, P. (2017) American and British English. Divided by a Common Language? Cambridge: Cambridge University Press.
Baker, P. and Balirano, G. (eds) (2017) Queering Masculinities in Language and Culture. London: Palgrave.
Baker, P. and Egbert, J. (eds) (2016) Triangulating Methodological Approaches in Corpus-Linguistic Research. London: Routledge.
Baker, P. and McEnery, T. (eds) (2015) Corpora and Discourse Studies: Integrating Discourse and Corpora. London: Palgrave.
Baker, P. (2014) Using Corpora to Analyse Gender. London: Bloomsbury.
Baker, P., Gabrielatos, C. and McEnery, T. (2013) Discourse Analysis and Media Attitudes: The Representation of Islam in the British Press. Cambridge: Cambridge University Press.
Baker, P. and Ellece, S. (2011) Key Terms in Discourse Analysis. London: Continuum.
Baker, P. (2010) Sociolinguistics and Corpus Linguistics. Edinburgh: Edinburgh University Press.
Baker, P. (ed.) (2009) Contemporary Corpus Linguistics. London: Continuum.
Baker, P. (2008) Sexed Texts: Language, Gender and Sexuality. London: Equinox.
Baker, P. (2006) Using Corpora in Discourse Analysis. London: Continuum.
Baker, P., Hardie, A. & McEnery, A. (2006) A Glossary of Corpus Linguistics. Edinburgh: Edinburgh University Press.
Baker, P. (2005) Public Discourses of Gay Men. London: Routledge.
Baker, P. & Stanley, J. (2003) Hello Sailor! Seafaring life for gay men: 1945-1990. London: Pearson.
Baker, P. (2002) Fantabulosa: A Dictionary of Polari and Gay Slang. London: Continuum.
Baker, P. (2002). Polari: The Lost Language of Gay Men. London: Routledge.
I am commissioning editor of the journal published by Edinburgh University Press.
I am on the editorial board for the Journal of English Linguistics, the Journal of Language and Sexuality, Gender and Language, Applied Linguistics, Journalism and Discourse Studies, Text and Talk and Discourse Coherence, Cognition and Creativity.
Journal Articles
Baker, P. (2023) A year to remember? Introducing the BE21 corpus and exploring recent part of speech tag change in British English. International Journal of Corpus Linguistics.
Baker, P. and Collins, L. (2023) . Applied Corpus Linguistics 3(1).
Egbert, J., Wizner, S., Keller, D., Biber, D., McEnery, T. and Baker, P. (2021) . Text and Talk 41(5-6): 715-737.
Brookes, G. and Baker, P. (2021) . Applied Corpus Linguistics 1(3).
Heritage, F. and Baker, P. (2021) . Critical Discourse Studies. Published online 12 April 2021.
Brookes, G. and Baker, P. (2021) . Journal of Risk Research. Published online 5 January 2021.
Baker, P. Brookes, G., Atanasova, D. and Flint, S. (2020) . Social Science and Medicine 264.
Baker, P. and Vessey, R. (2018) International Journal of Corpus Linguistics 23(3): 255-278.
Brookes, G. and Baker, P. (2017). BMJ Open 7(4).
Paknahad Jaborooty, M. and Baker, P. (2017) 'Resisting silence: moments of empowerment in Iranian women's blogs'. Gender and Language 11(1): 77-99.
Baker, P. (2016) International Journal of Corpus Linguistics 21(2): 139-164.
10(1): 106-139.
Anthony, L. and Baker, P. (2015) International Journal of Corpus Linguistics 20(3): 273-292.
Baker, P. (2015) Discourse and Communication 9(2): 143-147.
Chen, Y-H., and Baker, P. (2015) 'Investigating criterial discourse features across second language development: lexical bundles in rated learner essays, CEFR B1, B2 and C1.' Applied Linguistics
Baker, P. and Levon, E. (2015) Discourse and Communication 9(2): 221-336.
Baker, P. and Love, R. (2015) Journal of Language, Aggression and Conflict. 3(1): 57-86.
Baker, P., Gabrielatos, C. and McEnery, T. (2013) 34:3 255-78.
Baker, P. and Potts, A. (2013) '"Why do white people have thin lips?": Google and the perpetuation of stereotypes via auto-complete search forms.' Critical Discourse Studies 10:2 187-204.
Baker, P. (2012) ‘From gay language to normative discourse: a diachronic corpus analysis of Lavender Linguistics conference abstracts 1994-201.’ Journal of Language and Sexuality 2:2 179-205.
Potts, A. and Baker, P. (2012) 'Does semantic tagging identify cultural change in British and American English?' International Journal of Corpus Linguistics 17:3 295-324.
Baker, P. (2012) 'Acceptable bias?: Using corpus linguistics methods with critical discourse analysis.' Critical Discourse Studies 9:3 247-256.
Gabrielatos, C., McEnery, T., Diggle, P. and Baker, P. (2012) 'The peaks and troughs of corpus-based contextual analysis.' International Journal of Corpus Linguistics. 17:2 151-175.
Baker, P. (2011) 'Times may change but we'll always have money: a corpus driven examination of vocabulary change in four diachronic corpora.' Journal of English Linguistics 39: 65-88.
Baker, P. (2010) 'Will Ms ever be as frequent as Mr? A corpus-based comparison of gendered terms across four diachronic corpora of British English.' Gender and Language 4.1: 125-129.
Chen, Y. and Baker, P. (2010) Language Learning and Technology. 14:2 30-49.
Baker, P. (2010) 'Representations of Islam in British broadsheet and tabloid newspapers 1999-2005.' Language and Politics. 9:2 310-338.
Baker, P. (2009) 'The BE06 Corpus of British English and recent language change.' International Journal of Corpus Linguistics. 14:3 312-337.
Baker, P., Gabrielatos, C., Khosravinik, M., Krzyzanowski, M., McEnery, T. and Wodak, R. (2008) Discourse and Society 19(3): 273-306.
Gabrielatos, C. and Baker, P. (2008) Journal of English Linguistics 36:1 pp. 5-38.
Baker, P. and McEnery, A. (2005) Language and Politics 4:2 pp. 197-226.
Baker, P., Hardie, A., McEnery, A., Xiao, R., Bontcheva, K., Cunningham, H., Gaizauskas, R., Hamza, O., Maynard, D., Tablan, V., Ursu, C., Jayaram, B.D., Leisher, M. (2004) Literary and Linguistic Computing, Volume 19, Issue 4, pp 509-524.
Baker, P. (2004) Journal of English Linguistics. 32:4 pp 346-359.
Baker, P. (2004) Sociolinguistics 8:1 88-106.
Baker, P. (2002) Lesbian and Gay Review, 3:3: pp 75-84.
Baker, P. (2001) Journal of Computer Mediated Communication 7:1.
Baker, P. Lie, M., McEnery, A. and Sebba, M. (2000) 'Building a Corpus of Spoken Sylheti', Literary and Linguistic Computing, Volume 15, Issue 4, pp 419-431.
McEnery, A., Wilson, A. and Baker, P. (2000) 'Language teaching: corpus based help for teaching grammar', Journada de Corpus Linguistics, Volume 6, pp 65-77.
McEnery, A. Baker, P. Gaizauskas, R. & Cunningham, H. (2000) 'EMILLE: towards a corpus of South Asian languages', British Computing Society Machine Translation Specialist Group, London, pp 11-1 - 11-9.
McEnery, A., Wilson, A. and Baker, P. (1997) 'Teaching Grammar Again after Twenty Years: Corpus based help for grammar teaching.' New Approaches to Grammar Teaching, RECALL Journal, Volume 9, Number 2, pp 8-17.
Baker, P., McEnery, A. and Wilson, A. (1995) 'A brief report on a statistical analysis of corpus-based versus traditional human-teaching methods of part-of-speech analysis', Language Testing Update, Issue 18, pp 59-62.
McEnery, A., Baker, P. and Wilson, A. (1995) 'A Statistical Analysis of Corpus Based Computer vs Traditional Human Teaching Methods of Part of Speech Analysis', Computer Assisted Language Learning, Volume 8, Number 2-3, pp 259-274.
Baker, P. (1994) 'Lithium Discontinuation - A meta-analysis.' Lithium.
Book Chapters
Baker, P. and McGlashan, M. (2020) . In: Adolphs, S. & Knight, D. (Eds.) The Routledge Handbook of English Language and the Digital Humanities. London: Routledge. pp. 220-241.
Baker, P. (2020) Corpus-assisted discourse analysis. In C. Hart and V. Koller (eds) Researching Discourse: A Student Guide. London: Routledge, pp. 124-142.
Baker, P. and Baker, H. (2019) Conceptualising masculinity and femininity in the British press. In C. Carter and L. Steiner and S. Allan (eds) Journalism, Gender and Power. London: Routledge pp. 363-382.
Baker, P. and McEnery, T. (2018) 'The value of revisiting and extending previous studies: The Case of Islam in the UK Press.' In R. Scholz (Ed) Quantifying Approaches to Spoken Discourse. Cham: Palgrave Macmillan. pp. 215-249.
Subtirelu, N. C. and Baker, P. (2017) Corpus-based approaches. In Richardson, J. and Flowerdew, J. (eds) The Routledge Handbook of Critical Discourse Studies, pp. 107-120.
Baker, P. (2017) Sexuality. In E. Friginal (ed) Studies in Corpus-Based Sociolinguistics. London: Routledge, pp. 159-177.
Baker, P. (2016) 'Gendered Discourses' in Baker, P. and Egbert, J. (eds) Triangulating Methodological Approaches in Corpus-Linguistic Research. London: Routledge, pp. 138-151.
Baker, P. and McEnery, T. (2015) 'Who benefits when discourse gets democratised? Analysing a Twitter corpus around the British Benefits Street debate.' In Baker, P. and McEnery T. (eds) (2015) Corpora and Discourse Studies: Integrating Discourse and Corpora. London: Palgrave, pp 244-265.
Baker, P. and McEnery, T. (2015) 'Introduction' In Baker, P. and McEnery, T. (eds) (2015) Corpora and Discourse Studies: Integrating Discourse and Corpora. London: Palgrave, pp 1-20.
Baker, P. (2015) 'Two hundred years of the American man.' In T. Milani (ed) Language and Masculinities: performances, intersections, dislocations. London: Routledge.
Baker, P. and McEnery, A. (2014) '"FIND THE DOCTORS OF DEATH": The UK Press and the Issue of Foreign Doctors Working in the NHS, a Corpus-Based Approach.' In A. Jaworski and N. Coupland (eds) The Discourse Reader. London: Routledge.
Baker, P. (2014) '"Bad wigs and screaming mimis": Using corpus-assisted techniques to carry out critical discourse analysis of the representation of trans people in the British press.' In C. Hart and P. Cap (eds) Contemporary Critical Discourse Studies. London: Bloomsbury, pp. 211-236.
Baker, P. (2013) ‘Discourse and Gender’. In K. Hyland and B. Paltridge (eds) Continuum Companion to Discourse Analysis. London: Continuum.
Baker, P. (2013) ‘Corpus Linguistics and Sociolinguistics’. In J. Holmes (ed) Research Methods in Sociolinguistics: A Practical Guide. Wiley Blackwell.
Baker, P. (2012) 'Corpora and Gender studies' In K. Hyland, C. M. Huat and M. Handford (eds) Corpus Applications in Applied Linguistics. London: Continuum, pp. 100-116.
Baker, P. (2012) ‘Diachronic lexical change in American English (1961-2006).’ In J. Zhang (ed). A Morphologically-based Study of the Lexical Collocation Heterogeneity in EST Texts. Shanghai Jiaotong University.
Baker, P. (2011) 'Social involvement in Corpus Studies.' In V. Viana, S. Zyngier, and G. Barnbrook (eds) Perspectives on Corpus Linguistics. Amsterdam: John Benjamins, pp. 17-28.
Baker, P. (2010) 'Corpus Linguistics'. In L. Litosseliti (ed) Research Methods in Linguistics. London: Continuum, pp. 93-113.
Baker, P. (2009) 'Issues in teaching corpus-based discourse analysis' In L. Lombardo (ed). Using Corpora to Learn about Language and Discourse. Peter Lang, pp. 73-98.
Baker, P. (2009) 'Introduction' In P. Baker (ed) Contemporary Corpus Linguistics. London: Continuum, pp. 1-8.
Baker, P. (2009) 'Language and Sexuality'. In J. Culpeper, F. Katamba, P. Kerswill, R. Wodak and T. McEnery (eds) English Language and Linguistics. London: Palgrave, pp. 550-563.
Baker, P. (2008) 'Eligible' bachelors and 'frustrated' spinsters: corpus linguistics, gender and language. In J. Sunderland, K. Harrington and H. Sauntson (eds) Gender and Language Research Methodologies. London: Palgrave.
McEnery, T. and Baker, P. (2003) 'Corpora, translation and multilingual computing' in F. Zanettin (ed.) Corpora in Translator Education, St. Jerome Press, Manchester.
Baker, P. (2002) 'No Fats, Femmes or Flamers: Changing Constructions of Identity and the Object of Desire in Gay Men's Magazines.' B. Benwell (ed.) Masculinity and Men's Lifestyle Magazines. Sociological Review.
McEnery, A., Baker, P. and Cheepen, C. (2001) 'Lexis, Indirectness and Politeness in Operator Calls.' In C. Meyer & P. Leistyna. (eds.) Corpus Analysis: Language Structure and Language Use. Rodopi: Amsterdam.
Singh, S., McEnery, A. and Baker, P. (2000) 'Building a Parallel Corpus of English/Punjabi', in J. Veronis (ed) Parallel Text Processing. Kluwer: Dordrecht, pp 335-347.
McEnery, A.M., Baker, P. and Hardie, A. (2000) 'Swearing and Abuse in Modern British English', in B. Lewandowska-Tomaszczyk and P.J. Melia (eds.) Practical Applications of Language Corpora, Peter Lang: Hamburg, pp 37-48.
McEnery, A. and Baker, P. (2000) 'Minority Language Engineering', in B. Lewandowska-Tomaszczyk and P.J. Melia (eds.) Practical Applications of Language Corpora, Peter Lang: Hamburg, pp 411-428.
McEnery, A.M., Baker, P. and Hardie, A. (2000) 'Assessing Claims about Language Use with Corpus Data - Swearing and Abuse', in J. Kirk (ed) Corpora Galore, Rodopi: Amsterdam, pp 45-55.
Baker, P. (1997) 'Consistency and Accuracy in Correcting Automatically Tagged Data.' In Corpus Annotation. R. Garside, G. Leech & A. McEnery (eds.) Longman Addison-Wesley, pp 243-250.
McEnery, A.M., Baker, P. & Hutchinson, J.E. (1997) 'A Corpus Based Grammar Tutor'. In R.G. Garside, G.N. Leech & A.M. McEnery (eds.) Corpus Annotation, Longman Addison-Wesley, pp 209-219.
Conference Proceedings
Xiao, Z, McEnery, A, Baker, P and Hardie, A (2004) In: Proceedings of the 4th Workshop on Asian Language Resources, Sanya, China.
Baker, P, Hardie, A, McEnery, T and Jayaram, BD (2003) In: Archer, D, Rayson, P, Wilson, A, and McEnery, T (eds.) (2003) Proceedings of the Corpus Linguistics 2003 conference. UCREL Technical Papers Volume 16. Department of Linguistics, Lancaster University.
Baker, P, Hardie, A, McEnery, AM and Jayaram, BD (2003) In: Proceedings of the EACL Workshop on South Asian Languages, Budapest.
Tablan, V., Ursu, C., Bontcheva, K., Cunningham, H., Maynard, D., Hamza, O., McEnery, T., Baker, P. & Leisher, M. (2002) In: LREC 2002 Proceedings, pp 66-71.
Baker, P, Hardie, A, McEnery, A, Cunningham, H and Gaizauskas, R (2002) In: Proceedings of LREC 2002.
Baker, P, Hardie, A, McEnery, A and Siewierska, A (eds.) (2000) Proceedings of the Third Discourse Anaphora and Reference Resolution Colloquium (2000). UCREL Technical Papers Volume 12 Special Issue. Department of Linguistics, Lancaster University.
McEnery, T., Baker, P., and Burnard, L. (2000) In M. Gavrilidou, G. Carayannis, S. Markantontou, S. Piperidis and G. Stainhauoer (eds) Proceedings of the Second International Conference on Language Resources and Evaluation, Athens, Greece, pp. 801-806.
McEnery, A. and Baker, P. (1998) 'Integrating the Intranet into the teaching of linguistics.' The Future of the Humanities in the Digital Age. International Conference, Bergen, Norway, pp. 138-140.
Current Teaching
I currently teach various modules in Corpus Linguistics at MA level (on three different degree schemes), have several PhD students and supervise third year UG dissertations.
I have supervised the following PhD students (dates show completion):
Saiqa Asif (2006), Stephanie Suhr (2007), Sibonile Ellece (2008), Yufang Qian (2008), Andrew Brindle (2009), Yuhua Chen (2009), Sheena Kaur (2009), Maryam Paknahad (2011), Amir Salama (2011), Rob Bianchi (2011), Hiroko Usami (2012), Bandar Al-Hejin (2012), Rajab Zahrani (2013), Amanda Potts (2014), Anna Marchi (2014), David Brown (2014), Mark McGlashan (2016), Bilal Kadiri (2017), Karen Kinloch (2018), James Balfour (2020), Craig Evans (2021) and Frazer Heritage (2021).
My current PhD students are Mark Wilkinson (a diachronic analysis of LGBT identity in The Times), Sijia Li (reporting of domestic violence in China) and Rakan Alibri (risks to life news reporting).
I am regularly in London so can supervise PhD students there or at Lancaster.
The BE06 Corpus
The BE06 Corpus is a one million word corpus of published general written British English. It has the same sampling frame as the LOB and FLOB corpora. This consists of 500 files of 2000 word samples taken from 15 genres of writing.
Eighty-two per cent of the texts were published between 2005 and 2007, while the remainder were published in 2003-4 and early 2008. The median sampling point is 2006, hence the title BE06 (British English 2006). The corpus is described in this paper:
Baker, P. (2009) 'The BE06 Corpus of British English and recent language change.' International Journal of Corpus Linguistics. 14:3 312-337.
Using the corpus
Due to copyright issues, there are no plans to make the corpus files fully available. However, the corpus has been placed on CQPweb at Lancaster University, where users can carry out concordances and get distribution information (and eventually have access to collocation information). Contact in order to obtain a username and password.
Additionally, the following links give frequency lists of the BE06 in various formats (right click on the link and then save it - you may have to initially save as html and then manually change to the .lst file extension using File Explorer).
BE06 Wordlist in WordSmith 7 format
The AmE06 Corpus
The AmE06 Corpus is a one million word corpus of published general written American English, also using the same sampling frame as the LOB and FLOB corpora. This consists of 500 files of 2000 word samples taken from 15 genres of writing. The vast majority of the texts were published in 2006. The corpus is also available via CQPweb, and the wordlist is available below.
Research Overview
Corpus linguistics, particularly in relation to discourse analysis or critical discourse analysis, or recent diachronic change. Representation of identity, especially gender and sexuality. Analysis of news or online corpora.
PhD Supervision Interests
I have supervised PhD students on the following topics:
Construction of Islam in the BBC sitcom Citizen Khan
Metrosexuality in Malaysia
Discourses of infertility in blogs, news and clinic websites
Representation of dialect in fiction
Children's books containing same-sex parent families
Language around schizophrenia in the British press
Previous PhDs I have supervised include:
A corpus-based examination of the concept of political correctness in British broadsheet newspapers
The language of marriage rituals in Botswana
Combining corpus approaches and CDA to examine discourses of terrorism in the British and Chinese popular press
Combining corpus approaches and CDA to examine discourses of homophobia in a right-wing political organisation
A corpus study to compare lexical bundle use of Chinese learners of English with native speakers of English
A corpus study of keywords to examine gender identity in British and Malaysian children's writing
The construction of gender identity in Iranian bloggers
A corpus-based comparison of two academic books about Wahhabi Islam
- DisTex - Discourse and Text Research Group
- ESRC Centre for Corpus Approaches to Social Science
- UCREL - University Centre for Computer Corpus Research on Language