{"id":92,"date":"2022-11-16T10:08:10","date_gmt":"2022-11-16T10:08:10","guid":{"rendered":"https:\/\/sites.edgehill.ac.uk\/corpusresearchgroup\/?page_id=92"},"modified":"2026-03-06T14:51:58","modified_gmt":"2026-03-06T14:51:58","slug":"past","status":"publish","type":"page","link":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/","title":{"rendered":"PAST EVENTS"},"content":{"rendered":"\n<div class=\"wp-block-group is-layout-constrained wp-block-group-is-layout-constrained\">\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2026<\/strong><\/h2>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<div class=\"wp-block-group is-layout-constrained wp-block-group-is-layout-constrained\">\n<h6 class=\"wp-block-heading\"><span style=\"text-decoration: underline\">MEETING #20: Friday 6 March 2026, 10:00-11:30 am<\/span><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em>Topic: LLMs, Corpus Linguistics, and Language Learning<\/em><\/h6>\n\n\n\n<h5 class=\"wp-block-heading\"><a href=\"https:\/\/languages-cultures.uq.edu.au\/profile\/2845\/peter-crosthwaite\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Peter Crosthwaite<\/strong><\/a> (University of Queensland, Australia)<\/h5>\n\n\n\n<h5 class=\"wp-block-heading\"><em><strong>Corpora, Prompts, and Pedagogy: Human-AI Text Comparison in Applied Linguistics<\/strong><\/em><\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2026\/03\/EHU-CRG.20.Crosthwaite.Slides.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">SLIDES<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Generative AI has fundamentally disrupted corpus linguistics by making it possible to create large, readily generable corpora of machine-produced text, challenging long-standing assumptions about what 
constitutes a \u201cnatural\u201d language dataset. In response, a growing body of corpus-based research has begun to examine the linguistic characteristics of AI-generated texts, often comparing them with human writing to identify similarities, differences, and potential risks for language education and assessment.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Drawing on my own corpus-based analyses, this talk first revisits findings showing that AI-generated academic texts do not reliably approximate human-like stance, particularly in how evaluation, commitment, and authorial presence are expressed. These results complicate claims that AI writing is simply more \u201cexpert-like\u201d or rhetorically mature than student writing. However, more recent work extends this line of inquiry by demonstrating that many of the features attributed to AI texts are strongly conditioned by the prompts used to generate them, including task descriptions, genre cues, and implicit assessment criteria.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Taken together, these studies suggest that differences between human and AI writing are neither fixed nor intrinsic, but emerge from the interaction between prompts, tasks, and evaluative expectations. 
The talk concludes by considering whether this insight opens productive possibilities for language teaching, asking whether corpus-driven data-driven learning approaches can be meaningfully combined with AI-mediated tools to support learner noticing, agency, and writing development, rather than positioning corpora and AI as competing paradigms.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\">MEETING #19: Friday 6 February 2026, 2:00-3:30 pm<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em>Topic: Discourse Oriented Corpus Studies<\/em><\/h6>\n\n\n\n<h5 class=\"wp-block-heading\"><a href=\"https:\/\/www.researchgate.net\/profile\/Daniel-Malone\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Dan Malone<\/strong><\/a> (Edge Hill University, UK)<\/h5>\n\n\n\n<h5 class=\"wp-block-heading\"><em><strong>From Global Uncertainty to Domestic Danger: The lone wolf terrorist as a topos of threat in (poly)crisis discourses<\/strong><\/em><\/h5>\n\n\n\n<p class=\"wp-block-paragraph\">Crises arise amidst uncertainty and are characterised, alongside urgency, by a sense of threat (Lipscy 2020). Few figures embody uncertainty more vividly than the lone-wolf terrorist. Acting in isolation and without formal ties to organised groups, the lone-wolf terrorist has become increasingly prominent in public discourse over the past 15 years (Malone, 2025).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this talk, I explore how the lone-wolf terrorist emerges in the construction of polycrises in the UK press. Polycrises are understood here, following Krzy\u017canowski et al. (2023: 423), as the \u201ccombination of many, more or less simultaneous and overlapping, crises whose repercussions unfold in a cumulative manner\u201d. Their discursive construction thus relies on how crises are presented as interacting, exacerbating, and reshaping one another (cf. Janzwood &amp; Homer-Dixon, 2022: 4). 
Specifically, I focus on how the lone-wolf terrorist is rhetorically employed as a&nbsp;<em>topos<\/em>&nbsp;of threat, operating under the premise that if there is a risk or danger, action must be taken to prevent it (Wodak 2001; Boukala, 2016).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I draw on analyses from the Lone Wolf Corpus (Malone, 2026), a topic-specific corpus of UK newspaper articles featuring the lone-wolf terrorist, focusing on the years 2020 to 2024. The analytical approach employs three interrelated stages to identify collocations and semantic preferences of the lemma&nbsp;<em>lone wolf<\/em>, as well as discourse prosodies, that is, implicit and explicit attitudes (Stubbs, 2001) towards the lone-wolf terrorist. In the discussion, I pay particular attention to the role of metaphor clusters, viewing metaphor as a device for expressing evaluation implicitly (Martin &amp; White, 2005).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Findings show that the lone-wolf terrorist is a recurring evaluative resource through which crisis events are connected and presented as threatening. From Iran\u2019s nuclear ambitions, Hamas\u2019s activities, and Russia\u2019s aggression in Ukraine, to mass migration, the climate crisis, and fears of radicalisation fuelled by AI and intensified by COVID-19 lockdowns, the lone-wolf terrorist emerges to recontextualise international events as matters of domestic security, transforming global uncertainty into a sense of danger for UK audiences. In this way, otherwise discrete events are discursively construed and rendered as potential polycrises through imagined terroristic violence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>References<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Boukala, S. (2016). 
Rethinking topos in the discourse-historical approach: Endoxon seeking and argumentation in Greek media discourses on \u201cIslamist terrorism\u201d.&nbsp;<em>Discourse Studies, 18<\/em>(3), 249\u2013268.&nbsp;<a href=\"https:\/\/doi.org\/10.1177\/1461445616634550\">https:\/\/doi.org\/10.1177\/1461445616634550<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Fairclough, N. (1989).&nbsp;<em>Language and power<\/em>. Longman.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Hjermann, A. R., &amp; Wilhelmsen, J. (2025). Topos of threat and metapolitics in Russia\u2019s securitisation of NATO post-Crimea.&nbsp;<em>Review of International Studies<\/em>, Advance online publication, 1\u201320.&nbsp;<a href=\"https:\/\/doi.org\/10.1017\/S0260210524000937\">https:\/\/doi.org\/10.1017\/S0260210524000937<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Hoey, M. (2005).&nbsp;<em>Lexical priming: A new theory of words and language<\/em>. Routledge.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Krzy\u017canowski, M., Wodak, R., Bradby, H., Gardell, M., Kallis, A., Krzy\u017canowska, N., Mudde, C., &amp; Rydgren, J. (2023). Discourses and practices of the \u201cnew normal\u201d: Towards an interdisciplinary research agenda on crisis and the normalization of anti- and post-democratic action.&nbsp;<em>Journal of Language and Politics, 22<\/em>(4), 415\u2013437.&nbsp;<a href=\"https:\/\/doi.org\/10.1075\/jlp.23024.krz\">https:\/\/doi.org\/10.1075\/jlp.23024.krz<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Janzwood, S., &amp; Homer-Dixon, T. (2022).&nbsp;<em>What is a global polycrisis? And how is it different from systemic risk?<\/em>&nbsp;Discussion paper. Cascade Institute.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Lipscy, P. Y. (2020). 
COVID-19 and the politics of crisis.&nbsp;<em>International Organization, 74<\/em>(S1), E98\u2013E127.&nbsp;<a href=\"https:\/\/doi.org\/10.1017\/S0020818320000375\">https:\/\/doi.org\/10.1017\/S0020818320000375<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Malone, D. (2025).&nbsp;<em>The discourse presentation of the lone-wolf terrorist in the British press, 2000\u20132019: A corpus-based study<\/em>&nbsp;(Doctoral dissertation, Edge Hill University).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Malone, D. (2026). Topic-specific corpus compilation: A componential approach to query formulation.&nbsp;<em>Applied Corpus Linguistics, 6<\/em>(1), 100180.&nbsp;<a href=\"https:\/\/doi.org\/10.1016\/j.acorp.2025.100180\">https:\/\/doi.org\/10.1016\/j.acorp.2025.100180<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Martin, J. R., &amp; White, P. R. R. (2005).&nbsp;<em>The language of evaluation: Appraisal in English<\/em>. Palgrave Macmillan.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Stubbs, M. (2001).&nbsp;<em>Words and phrases: Corpus studies of lexical semantics<\/em>. Blackwell.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Wodak, R. (2001). The discourse-historical approach. In R. Wodak &amp; M. Meyer (Eds.),&nbsp;<em>Methods of critical discourse analysis<\/em>&nbsp;(pp. 63\u201395). 
Sage.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2025<\/strong><\/h2>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\">MEETING #18: Friday 19 December 2025, 2:00-4:00 pm<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em>Topic: Philosophies of Language and Corpus<\/em>&nbsp;<em>Linguistics<\/em><\/h6>\n\n\n\n<h5 class=\"wp-block-heading\"><strong>Alan Partington<\/strong>&nbsp;(<a href=\"https:\/\/site.unibo.it\/sibol-project\/en\" target=\"_blank\" rel=\"noreferrer noopener\">SiBol Group<\/a>\/<a href=\"https:\/\/centri.unibo.it\/colitec\/en\" target=\"_blank\" rel=\"noreferrer noopener\">CoLiTec<\/a>, Italy)<\/h5>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><em>Language Distrusted, Language Ignored, Language Recovered: From Plato to Corpus Linguistics and Beyond<\/em><\/strong><\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2025\/12\/EHU-CRG-18-Alan-Partington.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">SLIDES<\/a><\/strong> <strong>|| <a href=\"https:\/\/edgehill.cloud.panopto.eu\/Panopto\/Pages\/Viewer.aspx?id=9663446a-7184-4934-923f-b3b9009ff798\" target=\"_blank\" rel=\"noreferrer noopener\">VIDEO<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I wish to discuss the Long Essay,&nbsp;<em>A Short History of the Philosophies Underpinning Corpus Linguistics: From Aristotle to AI<\/em>, which traces the intertwined histories of linguistic and philosophical thought that shaped\u2014and sometimes resisted\u2014the emergence of corpus linguistics. 
The heavily revised, updated and now illustrated edition is free to download:&nbsp;<strong><a href=\"https:\/\/zenodo.org\/records\/17966162\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/doi.org\/10.5281\/zenodo.17998000<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">New themes include:<br>\u2022&nbsp;Metaphorophobia in classical philosophy.<br>\u2022&nbsp;How and why CL can fruitfully cohabit and collaborate with AI\/LLMs.<br>\u2022&nbsp;Ontological and epistemological differences between CL, CaDS and LLMs. LLMs: just artefacts or self-organising organisms (Kant)?<br>\u2022&nbsp;How AI learns metaphorical usage and evaluation. Are there patterns of creative language (including humour) that AI can acquire then use?<br>\u2022&nbsp;CL as a physical and a human science: causality (Bacon) versus teleology (Aristotle).<br>\u2022&nbsp;CL\/CaDS and the revenge of evaluation, from Hunston to evaluative cohesion.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Alan Partington has taught linguistics at the Universities of Bologna and Camerino, Italy. He is the author of&nbsp;<em>Patterns and Meanings<\/em>, co-author of&nbsp;<em>Patterns and Meanings in Discourse&nbsp;<\/em>(both John Benjamins) and author of&nbsp;<em>The Linguistics of Political Argument<\/em>,&nbsp;<em>The Linguistics of Laughter&nbsp;<\/em>and&nbsp;<em>The Language of Persuasion in Politics and the Media&nbsp;<\/em>(all Routledge). He was the co-founding Editor-in-Chief of the&nbsp;<em>Journal of Corpora and Discourse Studies<\/em>. 
Contact:&nbsp;<a href=\"mailto:partington.alan@gmail.com\">partington.alan@gmail.com<\/a><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\">MEETING #17 Thursday 13 November 2025, 3-5 pm&nbsp;(<a href=\"https:\/\/time.is\/United_Kingdom\" target=\"_blank\" rel=\"noreferrer noopener\">GMT<\/a>)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em>Topic: LLMs and Corpus Tools<\/em><\/h6>\n\n\n\n<h5 class=\"wp-block-heading\"><a href=\"https:\/\/www.mark-davies.org\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Mark Davies<\/strong><\/a>&nbsp;(<a href=\"https:\/\/www.english-corpora.org\/\">English-Corpora.org<\/a>, USA)<\/h5>\n\n\n\n<h5 class=\"wp-block-heading\"><em><strong>Integrating information from AI \/ LLMs into English-Corpora.org<\/strong><\/em><\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><a href=\"https:\/\/docs.google.com\/presentation\/d\/1U9T8mFmom2-ZUPG-5zq288rzi0chONOc\/edit?slide=id.p1#slide=id.p1\" target=\"_blank\" rel=\"noreferrer noopener\">SLIDES<\/a> <\/strong> ||  <a href=\"https:\/\/edgehill.cloud.panopto.eu\/Panopto\/Pages\/Viewer.aspx?id=0db6ae11-c365-4eee-be51-b39500dc4fb2\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>VIDEO<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In March 2025 I released&nbsp;<a href=\"https:\/\/www.english-corpora.org\/ai-llms\/corpora-vs-llms.html\">seven detailed studies<\/a>&nbsp;that discuss how well the predictions from LLMs (Large Language Models) match the actual data from large, well-known, publicly-accessible corpora (like those from English-Corpora.org). 
The seven detailed studies dealt with word frequency, phrase frequency, collocates, comparing words (via collocates), genre-based variation, historical variation, and dialectal variation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But it\u2019s probably not a question of \u201ceither\/or\u201d with corpora and AI; rather it is probably an issue of \u201cand\/with\u201d. Why not take the strengths of AI \/ LLMs and integrate them right into the corpus interface? As the comparisons between corpora and AI\/LLMs indicate, what LLMs are really good at is classifying and explaining linguistic data.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So as of September 2025, English-Corpora.org allows users to combine the depth and reliability of corpus data with the analytic power of LLMs like GPT, Gemini, Claude, Perplexity, Llama, Mistral, and DeepSeek. With just one click, the corpus can send collocates, frequency patterns, phrase lists, or concordance lines to an LLM via an \u201cAPI call\u201d, and then the LLM instantly groups, explains, and interprets the data, and returns that to the corpus. 
These AI-powered insights appear directly in the interface, alongside the original corpus results (while still keeping it very clear which is the corpus data, and which are the AI categorizations or analyses).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The following are the types of analyses \/ categorizations that are now available to end users:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Classifying and categorizing collocates, such as collocates of&nbsp;<em>cap<\/em>&nbsp;or&nbsp;<em>identity<\/em><\/li>\n\n\n\n<li>Classifying and categorizing phrases, such as&nbsp;<em>soft NOUN<\/em><\/li>\n\n\n\n<li>Comparing two words (via collocates), such as&nbsp;<em>quandary<\/em>&nbsp;vs&nbsp;<em>predicament<\/em><\/li>\n\n\n\n<li>Comparing genres, time periods, and dialects (two sections), such as&nbsp;<em>chain + NOUN<\/em>&nbsp;(fic \/ acad),&nbsp;<em>ADJ women<\/em>&nbsp;(1800s \/ now), or&nbsp;<em>ADJ scheme<\/em>&nbsp;(US \/ UK)<\/li>\n\n\n\n<li>Comparing genres, time periods, and dialects (all sections), such as&nbsp;<em>soft NOUN<\/em>&nbsp;(genres),&nbsp;<em>ADJ food<\/em>&nbsp;(historical), or&nbsp;<em>*ism<\/em>&nbsp;words (dialects)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/li>\n\n\n\n<li>Comparing genres, time periods, and dialects (charts), such as the \u201clike construction\u201d (genres),&nbsp;<em>need NEG<\/em>&nbsp;(historical), or&nbsp;<em>soft day<\/em>&nbsp;(dialects)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/li>\n\n\n\n<li>Analyzing KWIC\/concordance lines, such as the patterns for&nbsp;<em>fathom<\/em>&nbsp;or&nbsp;<em>naked eye<\/em>&nbsp;(including collocations, semantic prosody, syntactic patterns, and pragmatic functions)<\/li>\n\n\n\n<li>Generating words and phrases for topics and concepts, such as: climate change, famous actresses, or female jobs in 1800s<\/li>\n\n\n\n<li>Generating words and phrases via translations, such as 
German&nbsp;<em>sowohl alt als jung<\/em>, Russian&nbsp;\u0444\u0438\u043d\u0430\u043d\u0441\u043e\u0432\u043e\u0435 \u0441\u043e\u0441\u0442\u043e\u044f\u043d\u0438\u0435, or Korean&nbsp;\uc911\uc694\ud55c \uc0ac\uc548<\/li>\n\n\n\n<li>Generating words and phrases to find \u201cmore natural\u201d phrases, such as&nbsp;<em>make a photo<\/em>&nbsp;(perhaps from Japanese \u5199\u771f\u3092\u64ae\u308b),&nbsp;<em>pleasing scenery<\/em>, or&nbsp;<em>tough idea<\/em><\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Users can also seamlessly move from one LLM to another, see the results in any one of&nbsp;<em>30 different languages<\/em>, and create a simple \u201cAI profile\u201d (e.g. learner, teacher, translator, or linguist), which helps the AI to provide even more customized and helpful results.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">English-Corpora.org already has the most widely used online corpora. But with these new AI-powered features, the corpora should be even more useful for teachers, learners, and researchers.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\">MEETING #16 Friday 2 May 2025, 2:00-3:30 pm<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em>Topic: LLMs and Lexical Priming<\/em> <em>Theory<\/em><\/h6>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><a href=\"https:\/\/uefconnect.uef.fi\/en\/person\/michael.pace-sigge\/\">Michael Pace-Sigge<\/a> <\/strong>(University of Eastern Finland)<\/h5>\n\n\n\n<h5 class=\"wp-block-heading\"><em><strong>Large-Language-Model Tools and the Theory of Lexical Priming: Where technology and human cognition meet and diverge<\/strong><\/em><\/h5>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2025\/05\/EHU-CRG.2025.05.02.Pace-Sigge.pdf\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p 
class=\"wp-block-paragraph\">This paper revisits Michael Hoey\u2019s <em>Lexical Priming Theory <\/em>(2005) in the light of recent discussions of <em>Large Language Models<\/em> as forms of machine learning (commonly referred to as AI), which have been the centre of a lot of publicity in the wake of tools like OpenAI\u2019s <em>ChatGPT<\/em> or Google\u2019s <em>BARD\/Gemini<\/em>. Historically, theories of language have faced inherent difficulties, given language&#8217;s exclusive use by humans and the complexities involved in studying language acquisition and processing. The intersection between Hoey&#8217;s theory and Machine Learning tools, particularly those employing Large Language Models (LLMs), has been highlighted by several researchers. Hoey&#8217;s theory relies on the psychological concept of priming, aligning with approaches dating back to Ross M. Quillian&#8217;s 1960s proposal for a &#8220;Teachable Language Comprehender.&#8221; The theory posits that every word is primed for discourse based on cumulative effects, a concept mirrored in how LLMs are trained on vast corpora of text data.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This paper tests LLM-produced samples against naturally (human-)produced material in the light of a number of language usage situations, investigates results from A.I. research and compares the results with how Hoey describes his theory. While LLMs can display a high degree of structural integrity and coherence, they still appear to fall short of meeting human-language criteria which include grounding and the objective to meet a communicative need.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Hoey, M. (2005). <em>Lexical Priming<\/em>. London: Routledge.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Hoey, M. (2009). Corpus-driven approaches to grammar. In: R\u00f6mer, U. &amp; Schulze, R: <em>Exploring the lexis-grammar interface<\/em>. 
Amsterdam\/Philadelphia: John Benjamins, pp. 33\u201347.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Pace-Sigge, M. &amp; Sumakul, T. (2022). What Teaching an Algorithm Teaches When Teaching Students How to Write Academic Texts. In Jantunen, Jarmo Harri, et al. <em>Diversity of Methods and Materials in Digital Human Sciences.<\/em> Proceedings of the Digital Research Data and Human Sciences DRDHum Conference 2022.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Quillian, R. M. (1967). Word concepts: A theory and simulation of some basic semantic capabilities. <em>Behavioural Science, <\/em>12(5), 410\u2013430. <a href=\"https:\/\/doi.org\/10.1002\/bs.3830120511\">https:\/\/doi.org\/10.1002\/bs.3830120511<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Tools<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Brezina, V. &amp; Platt, W. (2023) <em>#LancsBox X<\/em>, Lancaster University, <a href=\"http:\/\/lancsbox.lancs.ac.uk\">http:\/\/lancsbox.lancs.ac.uk<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Google [2023] (2024). <em>BARD\/Gemini. <\/em><a href=\"https:\/\/bard.google.com\/chat\">https:\/\/bard.google.com\/chat<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">OpenAI. [2022] (2024) <em>ChatGPT (GPT 3.5). <\/em><a href=\"https:\/\/chat.openai.com\/\">https:\/\/chat.openai.com\/<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Scott, M. (2023). 
WordSmith Tools version 8, Stroud: Lexical Analysis Software.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\">MEETING #15 Friday 7 March 2025, 3:15-4:30 pm<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em>Topic: LLMs<\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/research.edgehill.ac.uk\/en\/persons\/yannis-korkontzelos\" target=\"_blank\" rel=\"noreferrer noopener\">Yannis Korkontzelos<\/a>&nbsp;and Amir Amini <\/strong>(Edge Hill University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>Detecting Text Generated by Large Language Models: A novel statistical technique to address paraphrasing<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\">Controlling illegitimate usage of AI in a multitude of educational and professional contexts requires automated systems able to detect text generated by Large Language Models (LLMs) and to distinguish it from human writing samples. Current techniques perform well, unless the text has been automatically paraphrased. In this talk, we will discuss research jointly conducted with Mr Amir Amini. We will start by exploring paraphrasing: how much it can diminish the accuracy of detectors of AI-generated text, and why.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We will identify a property of the probability functions in large language models (LLMs) that can be useful for detecting LLM-generated text, even after paraphrasing. Then, we embed it in a state-of-the-art detector, DetectGPT (Mitchell et al., 2023), to form a new technique for detecting text generated by a particular LLM. We will discuss experiments and results that demonstrate that this technique is more robust against paraphrasing attacks compared to recently introduced techniques, including DetectGPT and LogRank.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">E. Mitchell, K. 
Yoon-Ho Alex Lee, A. Khazatsky, C.D. Manning, and C. Finn. DetectGPT: Zero-shot machine-generated text detection using probability curvature. arXiv (Cornell University), 2023.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Sebastian Gehrmann, Hendrik Strobelt, and Alexander Rush. GLTR: Statistical detection and visualization of generated text. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 111\u2013116. Association for Computational Linguistics, 2019.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">X. Hu, P.-Y. Chen, and T.-Y. Ho. Radar: Robust ai-text detection via adversarial learning. Advances in Neural Information Processing Systems, 2023.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Y. Li, Q. Li, L. Cui, W. Bi, L. Wang, L. Yang, S. Shi, and Y. Zhang. Deepfake text detection in the wild. arXiv (Cornell University), 2023.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">C. Opara. Styloai: Distinguishing ai-generated content with stylometric analysis. In International Conference on Artificial Intelligence in Education. Springer Nature Switzerland, 2024.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">J. Su, T. Zhuo, D. Wang, P. Nakov, and CSIRO\u2019s Data61. DetectLLM: Leveraging log rank information for zero-shot detection of machine-generated text. arXiv preprint, arXiv:2306.05540, 2023.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Y. Zhou and J. Wang. Detecting AI-generated texts in cross-domains. 
In Proceedings of the ACM Symposium on Document Engineering 2024, pages 1\u20134,&nbsp;2024.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup>MEETING #14: Friday 24 January 2025, 2:00-3:30 pm (GMT), online (MS Teams)<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><sup>Topic: Corpus Methodology, Multi-Dimensional Analysis<\/sup><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/elenlefoll.eu\" target=\"_blank\" rel=\"noreferrer noopener\">Elen Le Foll <\/a><\/strong>(University of Cologne, Germany)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>Modelling Textbook English using a Modified Multi-Feature\/Dimensional Analysis (MDA) Framework<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/zenodo.org\/records\/14738511\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">English as it is represented in secondary school English as Foreign Language (EFL) textbooks is often perceived as somehow different from \u2018real-life\u2019, \u2018authentic\u2019 English. Indeed, previous studies have shown that individual lexico-grammatical features are often misrepresented (see Le Foll 2024 for a synthesis of the literature). This is problematic given that textbooks are an important and highly influential vector of foreign language input in secondary education. It is therefore worth asking: Does Textbook English constitute a special variety of English? And, if so, in what ways does it differ from \u2018real-life\u2019, extra-curricular English?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This talk focuses on the modified version of the multi-feature\/multi-dimensional analysis (see Biber 1988; Berber Sardinha &amp; Veirano Pinto 2014; 2019: 19) framework used to answer these questions in Le Foll (2024). 
MDA is used to compare the language of nine series of EFL textbooks used in lower secondary education in Germany, France and Spain with three target language reference corpora. Inspired by Diwersy et al. (2014) and Neumann &amp; Evert (2021), this modified MDA framework is based on principal component analysis (PCA) and extensive multi-dimensional visualisations. The framework further incorporates additional steps designed to increase both the reproducibility and replicability of the results.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Following a theoretical introduction to both the research questions at hand and the MDA framework, the open-source tools used to conduct MDAs in this study are presented from a practical point of view. Together, we examine the functionalities of the Multi-Feature Tagger of English (MFTE; Le Foll 2021; see also Le Foll &amp; Shakir 2023) and a number of useful R libraries. To this end, we draw on the RMarkdown scripts that are part of the Online Supplements of Le Foll (2024; https:\/\/elenlefoll.github.io\/TextbookMDA). Finally, we discuss the steps taken to improve the reproducibility and replicability of the results, in line with the principles of Open Science.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Berber Sardinha, Tony &amp; Marcia Veirano Pinto (eds.). 2014. <em>Multi-Dimensional Analysis, 25 Years on: A Tribute to Douglas Biber<\/em> (Studies in Corpus Linguistics 60). Amsterdam: John Benjamins.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Berber Sardinha, Tony, Marcia Veirano Pinto, Cristina Mayer, Maria Carolina Zuppardi &amp; Carlos Henrique Kauffmann. 2019. Adding Registers to a Previous Multi-Dimensional Analysis. In Tony Berber Sardinha &amp; Marcia Veirano Pinto (eds.), <em>Multi-Dimensional Analysis: Research Methods and Current Issues<\/em>, 165\u2013188. New York, NY: Bloomsbury.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Biber, Douglas. 1988. 
<em>Variation across speech and writing<\/em>. Cambridge: Cambridge University Press. https:\/\/doi.org\/10.1017\/CBO9780511621024.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Diwersy, Sascha, Stephanie Evert &amp; Stella Neumann. 2014. A weakly supervised multivariate approach to the study of language variation. In Benedikt Szmrecsanyi &amp; Bernhard W\u00e4lchli (eds.), <em>Aggregating dialectology, typology, and register analysis: Linguistic variation in text and speech<\/em>, 174\u2013204. Berlin: De Gruyter.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Le Foll, Elen. 2021. <em>Introducing the Multi-Feature Tagger of English (MFTE)<\/em>. Perl. Osnabr\u00fcck University. https:\/\/github.com\/elenlefoll\/MultiFeatureTaggerEnglish. (5 January, 2022).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Le Foll, Elen. 2024. <em>Textbook English: A Multi-Dimensional Approach<\/em> (Studies in Corpus Linguistics 116). Amsterdam: John Benjamins.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Le Foll, Elen &amp; Muhammad Shakir. 2023. Introducing a New Open-Source Corpus-Linguistic Tool: The Multi-Feature Tagger of English (MFTE). Presented at the ICAME44, NWU Vanderbijlpark (South Africa).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Neumann, Stella &amp; Stephanie Evert. 2021. A register variation perspective on varieties of English. In Elena Seoane &amp; Douglas Biber (eds.), <em>Corpus-based approaches to register variation<\/em> (Studies in Corpus Linguistics 103), 144\u2013178. 
Amsterdam: John Benjamins.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2024<\/strong><\/h2>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup>MEETING #13: Friday 15 November 2024 (Two presentations)<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><sup>Topic: Discourse-Oriented Corpus Studies<\/sup><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Katia Adimora<\/strong> (Edge Hill University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em>Mexican immigration\/immigrants in American and Mexican newspapers<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2024\/11\/EHU-CRG.2024.11.15.Adimora.pdf\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The study employed a Corpus-Assisted Discourse Studies (CADS) methodology, using discourse prosody analysis to reveal hidden attitudes towards Mexican immigration\/immigrants in the American and Mexican press. Two corpora were compiled: the American immigration corpus (AIC) and the Mexican immigration corpus (MIC). The AIC includes 12,595 articles (16,619,925 words) from <em>The New York Times, The Washington Post, USA Today, Los Angeles Times, The Arizona Republic, Chicago Tribune<\/em>. The MIC includes 20,865 articles (12,258,123 words) from <em>El Universal, Elimparcial.com, Reforma,<\/em> <em>El Norte, Lacronica.com,<\/em> and <em>Mural.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The results suggest that positive attitudes towards Mexican immigration\/immigrants surpass negative attitudes in both corpora, with MIC newspapers being more positive than AIC newspapers, which does not always coincide with public opinion. 
The attitudes fluctuated during the study period and seemed to correlate with socio-political events and the political leaning of newspapers. In addition, while AIC newspapers were more prone to use impersonal <em>thematic frames<\/em> to describe immigration issues, MIC newspapers were more likely to use personal <em>episodic frames<\/em>, which might contribute to more empathy towards Mexican immigrants among the Mexican readership.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The most frequent positive attitudes in both corpora were opposition to Trump\u2019s and Republican anti-immigration policies, and favourable comments towards the rights of immigrants and pro-immigration laws and regulations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">On the other hand, portraying immigrants as criminals was the most frequent negative attitude in both corpora. Also, common negative attitudes in both corpora expressed support for Trump\u2019s anti-immigrant policies, opposed pro-immigrant rules and regulations, and criticised the (perceived) high number of immigrants.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In AIC newspapers, \u2018illegal immigrant(s)\u2019 was used with negative discourse prosody, whereas in MIC newspapers, the term \u2018inmigrant(es) illegal(es)\u2019 expressed neutral (positive and negative) attitudes.<\/p>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Dan Malone<\/strong> (Edge Hill University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em>When is the extreme also typical? 
Using prototypicality to investigate representations of the lone-wolf terrorist<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.researchgate.net\/publication\/385936679\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The term <em>lone wolf<\/em> figuratively conjures an image of an individual acting in isolation, perhaps motivated by a desire to break from societal norms. When applied to terrorism, <em>lone wolf<\/em> draws attention to the perceived aloneness of the perpetrator. However, evidence from the Lone Wolf Corpus (Malone, 2020) reveals that representations in the British press showed notable diachronic trends in how the lone-wolf terrorist\u2019s (LWT) aloneness was (re)presented, which in turn indexed broader discursive shifts.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this presentation, I report on my approach to investigating representations of the LWT by adopting a prototypical categorisation framework to analyse discourse prosodies of connection (i.e., implicit and explicit attitudes; Stubbs, 2001: 66). This categorisation hinges on four key attributes identified during manual corpus annotation: (1) perpetration, (2) ideological motivation, (3) logistical support, and (4) resource provision. 
These attributes address whether the LWT was represented as operating in complete isolation or receiving some form of assistance, either direct or indirect, from individuals or organisations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Five distinct connection types emerged from the data, reflecting different combinations of these attributes: the <em>Prototypical Lone Wolf Terrorist<\/em>, depicted as ideologically self-driven and operationally independent; <em>Assisted by Non-Affiliated Individual(s)<\/em>; <em>Inspired by Organisation<\/em>; <em>Informed by Organisation<\/em>; <em>Directed by Organisation<\/em>; and <em>Member of Organisation<\/em>. Each connection type was quantified, and its frequency was statistically analysed to trace diachronic discursive shifts.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The findings reveal a discursive reconstruction of the LWT over time. In the early period (2010-2014), the LWT was more frequently presented as a solitary actor, but later portrayals (particularly during 2015-2017) increasingly associated the lone wolf with broader, often Islamist, networks. This shift resulted in the LWT being depicted not as a fully independent individual, but rather as institutionalised and depersonalised\u2014a faceless agent acting on behalf of extremist organisations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Malone, D. (2020) Developing a complex query to build a specialised corpus: Reducing the issue of polysemous query terms. Paper presented at Corpora and Discourse International Conference 2020, University of Sussex, UK.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Stubbs, M. (2001). <em>Words and phrases: Corpus studies of lexical semantics<\/em>. 
Blackwell.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup>MEETING #12: Thursday 25 April 2024<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><sup>Topics: Corpus Methodology, Large Language Models<\/sup><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><a href=\"https:\/\/www.reading.ac.uk\/elal\/staff\/dr-sylvia-jaworska\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Sylvia Jaworska<\/strong><\/a> (University of Reading, UK) &amp; <a href=\"https:\/\/www.wu.ac.at\/ebc\/about-us\/team\/mathew-gillings\/\"><strong>Mathew Gillings<\/strong><\/a> (Vienna University of Economics and Business, Austria)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em>How humans vs. machines identify discourse topics: an exploratory triangulation<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2024\/04\/EHU-CRG.2024.04.25.JaworskaGillings.Slides.pdf\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Identifying discourses and discursive topics in a set of texts has been of interest not only to linguists but also to researchers working across the social sciences. Traditionally, these analyses have been conducted based on small-scale interpretive analyses of discourse which involve some form of close reading. Naturally, however, that close reading is only possible when the dataset is small, and it leaves the analyst open to accusations of bias and\/or cherry-picking.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Designed to avoid these issues, other methods have emerged which involve larger datasets and have some form of quantitative component. Within linguistics, this has typically been through the use of corpus-assisted methods, whilst outside of linguistics, topic modelling is one of the most widely used approaches. 
Increasingly, researchers are also exploring the utility of LLMs (such as ChatGPT) to assist in the analysis and identification of topics. This talk reports on a study assessing the effect that analytical method has on the interpretation of texts, specifically in relation to the identification of the main topics. Using a corpus of corporate sustainability reports, totalling 98,277 words, we asked six different researchers, along with ChatGPT, to interrogate the corpus and decide on its main \u2018topics\u2019 via four different methods. The methods progressively increase the amount of context available.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Method A: ChatGPT was used to categorise the topic model output and assign topic labels;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Method B: Two researchers were asked to view a topic model output and assign topic labels based purely on eyeballing the co-occurring words;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Method C: Two researchers were asked to assign topic labels based on a concordance analysis of 100 randomised lines of each co-occurring word;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Method D: Two researchers were asked to reverse-engineer a topic model output by creating topic labels based on a close reading.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The talk explores how the identified topics differed both between researchers in the same condition and between researchers in different conditions, shedding light on some of the mechanisms underlying topic identification by machines vs humans or machines assisted by humans. We conclude with a series of tentative observations regarding the benefits and limitations of each method, along with suggestions for researchers in selecting an analytical approach for discourse topic identification. 
While this study is exploratory and limited in scope, it opens up a way for further methodological and larger scale triangulations of corpus-based analyses with other computational methods including AI.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup>MEETING #11: Thursday 29 February 2024<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong><sup>Topic: Corpus Methodology<\/sup><\/strong><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><a href=\"https:\/\/infogrep.it\/site\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Matteo Di Cristofaro<\/strong><\/a> (University of Modena and Reggio Emilia, Italy)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>One dataset, many corpora: Problems of scientific validity in corpora and corpus-derived results<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2024\/03\/EHU-CRG.2024.02.29.DiCristofaro.Slides.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">SLIDES<\/a>  ||  <a href=\"https:\/\/youtu.be\/03lgFt7yszY\" target=\"_blank\" rel=\"noreferrer noopener\">VIDEO<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Corpus linguistics has, since its inception, recognised the relevance of digital technologies as a major driving force behind corpus techniques and their (r)evolution in the study of language (cf. Tognini-Bonelli 2012). And yet, while both corpus linguistics and digital technologies have frequently benefited from each other (the case of NLP\/NLU is one such macro example), their pathways have often diverged. The result is a disconnect between corpus linguistics and digital data processing whose effects directly impinge on the ability to analyse language through software tools. 
This disconnect is becoming more and more relevant as corpus linguistics is applied to vast amounts of data obtained from manifold sources &#8211; including a wide array of social media platforms, each one with its unique linguistic and technical peculiarities.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As the ground truth of an ever-increasing number of language studies, corpora must be able to correctly treat and represent such peculiarities: e.g. the dialogic dimension of comments or forum posts; the presence (and potential subsequent normalisation) of spelling variations; the use of hashtags and emojis. If they fail to do so, the corpus-derived results will likely present researchers with a falsified view of the language under scrutiny.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What is at stake is not the ability to \u201ccount\u201d what is in a corpus, but rather whether what is being counted <em>is<\/em> or <em>is not<\/em> a feature present in the original data \u2013 of which the corpus should be a faithful representation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The presentation is consequently devoted to tackling <em>digital technicalities<\/em>, i.e. \u201cthose notions and mechanisms that &#8211; while not classically associated with natural language &#8211; are i) foundational of the digital environments in which language production and exchanges occur and ii) at the core of the techniques that are used to produce, collect, and process the focus of investigation, that is, digital textual data.\u201d (Di Cristofaro 2023:5). One such example is represented by character encodings: although at the \u201ccore\u201d of the whole corpus linguistics enterprise (cf. 
McEnery and Xiao 2005; Gries 2016:39,111) \u2013 since they allow written language to be processed by a computer and understood by humans \u2013 these are often overlooked at all stages of corpus compilation and analysis, potentially leading linguists to involuntarily tamper with the data and its linguistic contents.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Starting from practical examples, the presentation discusses the implications that digital technicalities have for corpora and their analyses \u2013 or rather, what happens when they are not properly treated \u2013 while outlining (also in the form of Python scripts and practical tools) potential new pathways that a \u201cdigital-aware\u201d perspective of corpus linguistics can open up.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Di Cristofaro, Matteo. Corpus Approaches to Language in Social Media. Routledge Advances in Corpus Linguistics. New York: Routledge, 2023. <a href=\"https:\/\/doi.org\/10.4324\/9781003225218\">https:\/\/doi.org\/10.4324\/9781003225218<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Gries, Stefan Th. Quantitative Corpus Linguistics with R: A Practical Introduction. 2nd ed. New York: Routledge, 2016. <a href=\"https:\/\/doi.org\/10.4324\/9781315746210\">https:\/\/doi.org\/10.4324\/9781315746210<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">McEnery, Tony, and Richard Xiao. \u2018Character Encoding in Corpus Construction\u2019. In Developing Linguistic Corpora: A Guide to Good Practice, edited by Martin Wynne, 47&#8211;58. Oxford: Oxbow Books, 2005. <a href=\"https:\/\/users.ox.ac.uk\/~martinw\/dlc\/index.htm\">https:\/\/users.ox.ac.uk\/~martinw\/dlc\/index.htm<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Tognini Bonelli, Elena. \u2018Theoretical Overview of the Evolution of Corpus Linguistics\u2019. In The Routledge Handbook of Corpus Linguistics, edited by Anne O\u2019Keeffe and Michael McCarthy, 14&#8211;27. 
Routledge Handbooks in Applied Linguistics. Milton Park, Abingdon, Oxon; New York: Routledge, 2012.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup>MEETING #10: Thursday 11 January 2024<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong><sup>Topics: Corpus Methodology, Phraseology<\/sup><\/strong><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><a href=\"https:\/\/www.coventry.ac.uk\/life-on-campus\/staff-directory\/arts-and-humanities\/benet-vincent\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Benet Vincent<\/strong><\/a> (Coventry University, UK)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em>Methodological issues and challenges in the use of phrase-frames to investigate phraseology<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\">[This talk is based on a project in which my collaborators are <a href=\"https:\/\/leemccallum.net\/\">Lee McCallum<\/a> &amp; <a href=\"http:\/\/unis.bakircay.edu.tr\/akademisyen\/aysel.sahinkizil\/lang=en\">Aysel \u015eahin K\u0131z\u0131l<\/a>]<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2024\/01\/EHU-CRG.2024.01.11.Vincent.Slides.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">SLIDES<\/a> | For the video recording, contact <a href=\"https:\/\/pureportal.coventry.ac.uk\/en\/persons\/benet-vincent\" target=\"_blank\" rel=\"noreferrer noopener\">Benet Vincent<\/a> (<a href=\"mailto:ab6667@coventry.ac.uk\">ab6667@coventry.ac.uk<\/a>)<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The importance of gaining a better understanding of phraseology has been recognised for some time now in the area of English for Academic Purposes (EAP). 
A widespread approach is to extract from a corpus frequently-occurring fixed strings (lexical bundles, or clusters) of potentially useful phrases\/multi-word units (see e.g. Gilmore and Millar, 2018). A limitation of this sort of study is the focus on fixed continuous sequences when phrases are well known to allow a degree of variation (see e.g. Gries, 2008). One proposal to address this limitation is the \u2018phrase frame\u2019 (p-frame), a fixed sequence of items occurring frequently in a corpus with one or two empty slots (Lu, Yoon, &amp; Kisselev, 2021). This approach allows researchers to retrieve the most frequent p-frames in a particular corpus, then identify which items typically fill these slots and what meanings\/functions might be associated with them. The idea is that the results of such research can help us better understand how members of a specific discourse community typically express themselves, which in turn may inform EAP pedagogy (Lu, Yoon, &amp; Kisselev, 2018). Our project aimed to use a p-frame approach to create a list of pedagogically useful phrases to help novice writers of RA introductions in Health Sciences. A number of studies have used a p-frame approach with similar aims though for different discipline areas, including Fuster-M\u00e1rquez and Pennock-Speck (2015), Cunningham (2017) and Lu et al. (2018, 2021). However, analysis of these studies indicates that they lack consensus on a number of issues central to p-frame methodology, presenting a challenge for new work in this area. This presentation will provide an overview of the key issues in p-frame research which we have identified and show how we have addressed them. The main aim will be to underline the importance of ensuring that the methods applied by a p-frame study align with the aims of the project.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Cunningham, K. J. (2017). 
A phraseological exploration of recent mathematics research articles through key phrase frames. <em>Journal of English for Academic Purposes<\/em>, <em>25<\/em>, 71. https:\/\/doi.org\/10.1016\/j.jeap.2016.11.005<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Fuster-M\u00e1rquez, M., &amp; Pennock-Speck, B. (2015). Target frames in British hotel websites. <em>International Journal of English Studies<\/em>, <em>15<\/em>(1), 51\u201369. https:\/\/doi.org\/10.6018\/ijes\/2015\/1\/213231<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Gilmore, A., &amp; Millar, N. (2018). The language of civil engineering research articles: A corpus-based approach. <em>English for Specific Purposes<\/em>, <em>51<\/em>, 1\u201317. https:\/\/doi.org\/10.1016\/j.esp.2018.02.002<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Gries, S. (2008). Phraseology and linguistic theory. In <em>Phraseology: An interdisciplinary perspective<\/em>, S. Granger &amp; F. Meunier (eds.), 3-26.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Lu, X., Yoon, J., &amp; Kisselev, O. (2018). A phrase-frame list for social science research article introductions. <em>Journal of English for Academic Purposes<\/em>, <em>36<\/em>, 76\u201385. https:\/\/doi.org\/10.1016\/j.jeap.2018.09.004<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Lu, X., Yoon, J., &amp; Kisselev, O. (2021). Matching phrase-frames to rhetorical moves in social science research article introductions. <em>English for Specific Purposes<\/em>, <em>61<\/em>, 63\u201383. https:\/\/doi.org\/10.1016\/j.esp.2020.10.001<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Bio-note<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Benet Vincent is Assistant Professor in Applied Linguistics at Coventry University in the UK. His research covers applications of corpus linguistics in a range of areas including English for Academic Purposes, Translation, Pragmatics and more generally for the analysis of discourse. 
He is currently guest editing two special issues for peer-reviewed journals: \u2018Corpus Linguistics and the language of Covid-19\u2019 in <a href=\"https:\/\/www.sciencedirect.com\/journal\/applied-corpus-linguistics\/special-issue\/10J1HH2BPG5\"><em>Applied Corpus Linguistics<\/em><\/a> and \u2018Decision-Making in Selecting, Compiling, Analysing and Reporting on the Use of Corpora in Applied Linguistics Research\u2019 in <a href=\"https:\/\/www.sciencedirect.com\/journal\/research-methods-in-applied-linguistics\/about\/call-for-papers#decision-making-in-selecting-compiling-analysing-and-reporting-on-the-use-of-corpora-in-applied-linguistics-research\"><em>Research Methods in Applied Linguistics<\/em><\/a><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2023<\/strong><\/h2>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup>MEETING #9: Thursday 14 December 2023<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em><sup>Topics: Discourse-Oriented Corpus Studies, Collocation Networks<\/sup><\/em><\/strong><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><a href=\"https:\/\/independent.academia.edu\/DanielMalone14\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Dan Malone<\/strong><\/a> (Edge Hill University, UK) &amp; <a href=\"https:\/\/hannaschmueck.github.io\/\"><strong>Hanna Schm\u00fcck<\/strong><\/a> (Lancaster University, UK)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em>A pack of lone wolves? 
Exploring the nexus between the lone-wolf terrorist, Al-Qaeda, and ISIS in the British Press<\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/12\/EHU_CRG.2023.12.MaloneSchmuck.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">SLIDES<\/a>  |  <a href=\"https:\/\/osf.io\/mw4jt\/\" target=\"_blank\" rel=\"noreferrer noopener\">LINK TO OPEN SCIENCE FRAMEWORK PAGE<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Following recent events in Belgium and Israel, the lone-wolf terrorist re-emerged in media reportage, with <a href=\"https:\/\/edition.cnn.com\/2011\/09\/11\/tv\/biden-does-not-rule-out-possibility-of-lone-wolf-attack\/index.html\">President Joe Biden<\/a> and former <a href=\"https:\/\/inews.co.uk\/news\/uk-facing-heightened-threat-from-lone-wolf-terror-linked-to-israel-conflict-former-intelligence-chiefs-say-2693705\">GCHQ Director Sir David Omand<\/a> expressing concerns over potential attacks in the USA and UK. Days later, <a href=\"https:\/\/www.theguardian.com\/world\/2023\/oct\/17\/killing-of-two-swedes-in-brussels-probably-lone-wolf-attack\">Belgian Prime Minister Alexander De Croo described the neutralised Brussels shooter as \u201cprobably a lone wolf,\u201d<\/a> thus aiming to downplay the risk of subsequent incidents. Together, these instances exemplify that by shaping a \u201creality\u201d (Entman, 2004), (in)security discourses can amplify or downplay a terrorist threat, in turn reflecting and\/or influencing public perception and potentially guiding policy responses.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Historically, the lone wolf has been associated with different movements, ranging from the propaganda of the deed in the 19th Century to the leaderless resistance of white-supremacist groups in the 1980s and 90s. 
More recently, the lone wolf has become increasingly associated with Islamist terrorism, a domain often dominated by Al-Qaeda and ISIS, especially in the British press.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this joint presentation, we discuss the analytical approaches and results from our analysis of discourses surrounding the lone-wolf terrorist, Al-Qaeda, and ISIS in three diachronic sub-corpora of the Lone Wolf Corpus (Malone, 2020), a compilation of British press articles from 2000 to 2019. In a unique methodological combination, we employed large-scale collocation networks and topical clustering to examine shifting discourses through collocational clusters, and applied a corpus-based critical discourse analysis to examine representations of the Al-Qaeda-ISIS nexus.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Hanna introduces the methodology employed to generate topical clusters and discusses collocational changes and constants in emerging discourses surrounding the lone-wolf terrorist. The resulting patterns present a discursive shift from clusters related to causative factors (e.g., a mental health subcluster), towards the internationalisation and institutionalisation of lone-wolf terrorism, and finally to response management in the form of sentencing and punitive actions (e.g., a court proceedings\/prison subcluster).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Reporting on his corpus-based critical discourse analysis, Daniel presents the emergent representations surrounding co-occurrences of the node AL QAEDA with ISIS. These discourses were categorised into four modes of representation of presented relationship-types: Convergence, Association, Dissociation, and Divergence. 
These modes contributed to surrounding (in)security discourses that at times equate, promote and\/or relegate different entities in a continual reshuffling of the threat hierarchy; a process termed here <em>enmity reimagining<\/em>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Entman, R. (2004). <em>Projections of Power: Framing News, Public Opinion, and U.S. Foreign Policy<\/em>. The University of Chicago Press: London.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Malone, D. (2020). Developing a complex query to build a specialised corpus: Reducing the issue of polysemous query terms. <em>Corpora and Discourse International Conference 2020<\/em>.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup>MEETING #8: Thursday 9 November 2023<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em><sup>Topic: Discourse-Oriented Corpus Studies, Immigration<\/sup><\/em><\/strong><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><a href=\"https:\/\/research.edgehill.ac.uk\/en\/persons\/katia-adimora\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Katia Adimora<\/strong><\/a> (Edge Hill University, UK)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>Towards more positive portrayals of Mexican immigration\/immigrants in the American and Mexican press<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/11\/EHU_CRG.2023.11.09.Adimora.pdf\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Various studies (e.g., Galindo G\u00f3mez, 2019; Taylor, 2009; Gabrielatos and Baker, 2008) have explored press attitudes towards immigration\/ immigrants in different countries. 
To analyse the attitudes towards Mexican immigration\/immigrants in the American and Mexican press, two specialised corpora totalling approximately 30 million words were created. The American corpus includes more than 12,000 articles from six American newspapers: <em>The New York Times, The Washington Post, USA Today, Los Angeles Times, The Arizona Republic <\/em>and<em> Chicago Tribune<\/em>. The corpus articles were published between 16 June 2015, which marked the start of Trump\u2019s presidential campaign, and 20 January 2021, the date of Biden\u2019s presidential inauguration. The Mexican corpus includes more than 20,000 articles from six Mexican newspapers, published during Trump\u2019s era: <em>El Universal, Elimparcial.com, Reforma, El Norte, Lacronica.com <\/em>and<em> Mural<\/em>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Even though negative discourse prosodies seem to dominate newspaper discourses, this study argues that the attitudes towards Mexican immigration\/immigrants in American and, especially, in Mexican newspapers are not as negative as expected. The results show that two thirds (66%) of the instances in the American corpus newspapers and more than three quarters (78%) of the instances in the Mexican corpus newspapers express a positive perspective. However, among the most frequent negative attitudes in the American and Mexican corpus newspapers is the description of immigrants as criminals (20% and 18%, respectively). The diachronic frequency analysis of the attitudes towards \u2018immigration\u2019 and \u2018immigrant(s)\u2019 shows correlations between socio-political events and press discourses, which might contribute to public opinion about Mexican immigration\/immigrants. 
For instance, Trump\u2019s family separation policy might have ignited empathy towards immigrants in the corpus newspapers.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><span><sup>MEETING #7: Thursday 30 March 2023<\/sup><\/span><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em><sup>Topic: Corpus Tools &amp; Corpus Processing<\/sup><\/em><\/strong><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/www.lexically.net\" target=\"_blank\" rel=\"noreferrer noopener\">Mike Scott<\/a> <\/strong>(Lexical Analysis Software &amp; Aston University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>News Downloads and Text Coverage: Case Studies in Relevance<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/lexically.net\/downloads\/workshops\/EdgeHill2023\/index.html\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>VIDEO &amp; SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Thank goodness it is now quite easy for anyone with the relevant permissions (and patience) to download thousands of text files from online databases such as those of LexisNexis and Factiva. 
After downloading there are numerous issues to be handled in checking the format of the text, cleaning out remnants of HTML, handling references to images, formulae, reader comments, sorting them by date or source etc.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The problem to be addressed here, however, chiefly concerns a) duplicate contents and b) relevance to the user\u2019s research aims.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">News texts in particular suffer increasingly from duplication as minor changes are made, or as a news story grows hour by hour.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Also, anyone who has looked at such searches will have noticed that some texts are really centrally concerned with the issue being studied, for example the brewing of beer, or the characters in Middlemarch, but others merely make a passing mention, such as \u201cPeter always liked his beer\u201d in an obituary or \u201cmost informants reportedly had never read Middlemarch\u201d in a survey of hobbies and interests.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this presentation, using WordSmith Tools 8.0, I shall attempt to quantify both the degree of content duplication in three sets of text, and the degree of central relevance to a theme.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Bio-note<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After teaching English for many years in Brazil and Mexico, Mike Scott moved to Liverpool University in 1990, teaching initially Applied Linguistics generally but then specialising in Corpus Linguistics. In 2009 he moved to Aston University. In the early 1980s he learned computer programming and began to develop corpus software: MicroConcord (1993, with Tim Johns) and WordSmith Tools (1996), which is now in version 8. 
His current field is corpus linguistics, sub-field corpus linguistics programming.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><sup><span>MEETING #6: Thursday 2 March 2023<\/span> (Two presentations)<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><span class=\"stk-highlight\" style=\"color: #1a4548\"><sup><strong>Topic: Manual Annotation in Discourse-Oriented Corpus Studies<\/strong><\/sup><\/span><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/research.edgehill.ac.uk\/en\/persons\/katia-adimora\">Katia Adimora<\/a> <\/strong><span class=\"stk-highlight\" style=\"color: #000000\">(Edge Hill University)<\/span><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em>Annotating Mexican immigration discourses<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/03\/EHU-CRG.2023.03.Adimora.pdf\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Discourses of Mexican and American newspapers about Mexican immigration to the US during Trump\u2019s presidency were pragmatically annotated according to the attitudes they express about different semantic aspects of immigrants and immigration, such as the border, families, and crime.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">First, using the American immigration corpus (in English), the Mexican immigration corpus (in Spanish) and the corpus tool Sketch Engine (SE), the researcher retrieved concordances for the search terms \u2018immigration\u2019 and \u2018immigrant\u2019, and \u2018inmigraci\u00f3n\u2019 and \u2018inmigrante\u2019, respectively. Second, a random sample of fifty concordances was extracted for each term. 
Concordances were transferred to a word table and manually annotated according to the pragmatic perspective they express towards immigrants. After all 200 instances were annotated and no additional attitudes were identified, the annotation scheme, with codes for the expressed attitudes and their definitions, was created. Attitudes towards immigration were classified into three levels:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"586\" height=\"308\" src=\"https:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/02\/Picture1.png\" alt=\"\" class=\"wp-image-1322\" srcset=\"https:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/02\/Picture1.png 586w, https:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/02\/Picture1-300x158.png 300w\" sizes=\"auto, (max-width: 586px) 100vw, 586px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">For example, the second-level category \u2018attitude positive-anti-Trump\u2019, with the code (Att_ pos-aT), includes instances where the main sentiment is positive towards immigration, which is shown via opposition towards Trump\u2019s anti-immigration policies:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In Fountain Hills, Trump blamed immigrants in the country illegally for &#8220;so many killings, so much crime.&#8221;&nbsp;He then went after rival presidential candidates Ted Cruz and John Kasich, saying his approach to illegal&nbsp;<strong>immigration&nbsp;<\/strong>was tougher than theirs.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/independent.academia.edu\/DanielMalone14\" target=\"_blank\" rel=\"noreferrer noopener\">Dan Malone<\/a> <\/strong>(Edge Hill University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em>A lone wolf from the ISIS pack: Hunting discourses through manual annotation<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"http:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/03\/Daniel_Malone.CRG_.Mar2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The process of manual corpus annotation, where researchers add interpretive information to corpus data, is a valuable tool for systematically analysing linguistic or semantic features in a corpus. A core aspect of manual annotation is the <em>annotation scheme<\/em> \u2013 a set of guidelines for labelling corpus content, including annotation categories, definitions, and examples (McEnery &amp; Hardie 2012: 90).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this talk, I will introduce the manual annotation approach and annotation scheme I developed for analysing the representations of the nexus between lone-wolf terrorists and the extremist groups ISIS and al Qaeda in the British press. The underpinning goal of this annotation scheme is to systematically reveal discourse prosodies, in other words, the implicit and explicit attitudes (Stubbs 2001: 66) towards the lone-wolf terrorist.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The context for this talk is my doctoral research project &#8220;Constructing the Lone Wolf Terrorist: A corpus-based critical discourse analysis.&#8221; The dataset used in this study is The Lone Wolf Corpus, a purpose-built corpus consisting of approximately 8.5 million words and 8,600 texts from UK national newspapers, published between 2000 and 2019 (Malone 2020).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I will describe the iterative process used to develop the annotation scheme, which involved cycles of annotation. I will also provide a detailed explanation of the scheme&#8217;s four distinct categories and illustrate each category with examples. 
The first two categories identify the type of entity represented by the node LONE WOLF, as well as the collocates ISIS and AL QAEDA, and determine whether each entity is portrayed as an active and dynamic force (Van Leeuwen, 2008: 33). The third category denotes the connection between the node and the collocate, while the fourth category outlines the discursive link.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Additionally, I will address the practical challenges that arose during the development of the annotation scheme, namely: (1) the need to avoid top-down categorisation, (2) the difficulty of balancing scheme richness with the intensity of labour required for its application, and (3) ensuring the reliability of the coding process.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Malone, D. (2020).&nbsp;Developing a complex query to build a specialised corpus: Reducing the issue of polysemous query terms.&nbsp;<em>Corpora and Discourse International Conference 2020<\/em>. <a href=\"https:\/\/doi.org\/10.13140\/RG.2.2.31214.43846\">https:\/\/doi.org\/10.13140\/RG.2.2.31214.43846<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">McEnery, T., &amp; Hardie, A. (2012). <em>Corpus linguistics: Method, theory and practice<\/em>. Cambridge University Press.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Stubbs, M. (2001). <em>Words and phrases: Corpus studies of lexical semantics<\/em>. Oxford: Blackwell.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Van Leeuwen, T. (2008). <em>Discourse and Practice: New Tools for Critical Discourse Analysis<\/em>. 
Oxford University Press.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2022<\/strong><\/h2>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><span><sup>MEETING #5: Thursday 15 December 2022<\/sup><\/span><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><sup>Topic: Corpus Tools &amp; Semi-Automated Annotation<\/sup><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/martinweisser.org\/mw.html\" target=\"_blank\" rel=\"noreferrer noopener\">Martin Weisser<\/a><\/strong> (University of Salzburg)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>Doing Corpus Pragmatics in DART 3<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/edgehill.cloud.panopto.eu\/Panopto\/Pages\/Viewer.aspx?id=285b90df-8aa2-460b-a4b2-af6d00c67cbd\"><strong>VIDEO<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Dialogue Annotation and Research Tool (DART) is a freeware tool that makes it possible to annotate large amounts of dialogue data semi-automatically on a number of linguistic levels, as well as post-process and analyse the resulting corpora efficiently using various corpus analysis methods. Perhaps arguably, the most interesting and important levels of analysis produced by the tool from the point of view of corpus pragmatics are the syntactic and the speech-act level, but DART annotations also comprise information about the semantics (topics), semantico-pragmatic (modes; Searle\u2019s IFIDs), surface polarity, (completion) status, and disfluency of units. In this talk, I want to begin by providing a brief overview of the background, genesis and development of the tool. 
Next, we\u2019ll discuss the different levels of annotation, their potential significance in pragmatics, and how they may work together in determining the pragmatic meaning potential in the form of speech acts. I\u2019ll then briefly illustrate how it\u2019s possible to adapt the resources used for the annotation process to new domains, as well as the steps involved in the annotation process itself. And last, but not least, we\u2019ll explore the different analysis options related to speech acts and other patterns the tool has to offer.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>DART3:&nbsp;<a href=\"https:\/\/martinweisser.org\/ling_soft.html#DART\">https:\/\/martinweisser.org\/ling_soft.html#DART<\/a><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><span><sup>MEETING #4: Friday 8 April 2022<\/sup><\/span><sup>  (Two presentations)<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong><sup>Topics: Corpus Tools &amp; Manual Annotation<\/sup><\/strong><\/em><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/www.ugr.es\/personal\/encarnacion-hidalgo-tenorio\" target=\"_blank\" rel=\"noreferrer noopener\">Encarnaci\u00f3n Hidalgo-Tenorio<\/a> <\/strong>(University of Granada) and <strong><a href=\"https:\/\/filologiainglesa.unizar.es\/personal\/miguel-angel-benitez-castro\" target=\"_blank\" rel=\"noreferrer noopener\">Miguel-\u00c1ngel Ben\u00edtez-Castro<\/a><\/strong> (Universidad de Zaragoza)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Workshop: <em>Manual Annotation with <a href=\"http:\/\/www.corpustool.com\/\">UAM Corpus Tool<\/a>.<\/em><\/strong><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Presentation: <em>Analysing Extremism under the Lens of Appraisal Theory.<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/03\/EHU.CRG_.2022.04.08.Slides.UAM_.pdf\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>SLIDES<\/strong><\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Appraisal Theory aims to explain how social relations are negotiated through alignment, as linguistically realised by the axes of ENGAGEMENT, GRADUATION and ATTITUDE (Martin &amp; White 2005). Of the three subsystems, the last has attracted the most attention so far. ATTITUDE helps classify instances of emotion\/al talk through the meaning domains of AFFECT, JUDGEMENT and APPRECIATION. As argued by White (2004) and Bednarek (2009), emotional talk may entail the more indirect expression of emotion by attending to ethical and aesthetic values. Given the omnipresence of affect in language (including Ochs &amp; Schieffelin 1989; Barrett 2017), there is growing consensus about treating AFFECT as a superordinate category, now taken to include the expression of EMOTION (emotional evaluation) and OPINION (ethical and aesthetic evaluation). As emotion permeates all levels of linguistic description (including Alba-Juez &amp; Thompson 2014; Alba-Juez &amp; Mackenzie 2019), and all utterances are produced and interpreted through emotions (Klann-Delius 2015), AFFECT may be enriched through a more explicit focus on affective psychology, thereby proposing more sharply defined categories that may better describe any instance of emotive language (Thompson 2014). 
This paper shows how Ben\u00edtez-Castro &amp; Hidalgo-Tenorio\u2019s (2019) more psychologically-driven Appraisal EMOTION sub-system can lead to a user-generated Appraisal scheme allowing a more fine-grained analysis of the complex interplay between (explicit and implicit) EMOTION and OPINION in discourse. To do so, we draw on examples and findings from two research strands we have covered so far: American right-wing populist discourse (Hidalgo-Tenorio &amp; Ben\u00edtez-Castro 2021b) and Jihadist propaganda (Ben\u00edtez-Castro &amp; Hidalgo-Tenorio Forthcoming).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>Encarnaci\u00f3n Hidalgo-Tenorio&nbsp;<\/em>is Professor in English Linguistics at the University of Granada, Spain. Her main research area is corpus-based CDA, where she focuses on the notions of representation and power enactment in public discourse. She has published on language and gender, Irish studies, political communication, and has also paid attention to the analysis of the way identity is discursively constructed. She has tried to develop, or reconsider, some interesting aspects taken from SFL such as Transitivity, Modality, or Appraisal. Currently, she is working on the lexicogrammar of radicalization.&nbsp;<em>Address for correspondence<\/em>: Departamento de Filolog\u00edas Inglesa y Alemana, Facultad de Filosof\u00eda y Letras, Campus de Cartuja s\/n, 18071, University of Granada, Spain. &lt;ehidalgo@ugr.es&gt;&nbsp;<strong>&nbsp;<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>Miguel-\u00c1ngel Ben\u00edtez-Castro&nbsp;<\/em>is lecturer in English Language at the University of Zaragoza, Spain. His main research interest lies in SFL-inspired discourse analysis, based on corpus-driven methodologies, which he has managed to apply to his general focus on the interface between lexical choice, discourse structure, and evaluation. 
This is reflected in his previous and ongoing research on shell-noun phrases, on the evaluation of social minorities in public discourse and on the refinement of SFL\u2019s linguistic theory of evaluation.&nbsp;<em>Address for correspondence<\/em>: Department of English and German Studies, Facultad de Ciencias Sociales y Humanas, Universidad de Zaragoza, Ciudad Escolar, s\/n, 44003 Teruel.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><span><sup>MEETING #3: Wednesday 8 February 2022<\/sup><\/span><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em><sup>Topic: Corpus Tools<\/sup><\/em><\/strong><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/www.lancaster.ac.uk\/linguistics\/about\/people\/andrew-hardie\" target=\"_blank\" rel=\"noreferrer noopener\"><u>Andrew Hardie<\/u><\/a> <\/strong>(Lancaster University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>What\u2019s new in CQPweb \u2013 2022 edition<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\">In this informal workshop \/ presentation, Andrew Hardie will give an overview of the latest new features in CQPweb version 3.3. This includes, most notably, the option for users to upload their own corpora to the system, tagging the data using either CLAWS or TreeTagger \u2013 plus the new system that enables other users on the same server to share access to these uploaded corpora. 
Participants are welcome to try it out \u201cin real time\u201d during the session.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2021<\/strong><\/h2>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><span><sup> MEETING #2: Wednesday 15 December 2021<\/sup><\/span><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em><sup>Topic: Corpus Tools &amp; Automated Annotation<\/sup><\/em><\/strong><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><a href=\"https:\/\/www.lancaster.ac.uk\/scc\/about-us\/people\/paul-rayson\" target=\"_blank\" rel=\"noreferrer noopener\">Paul Rayson<\/a> <\/strong>(Lancaster University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>Counting words or wording counts?<\/strong><\/em><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\">A wide variety of tools and methods are available across a number of disciplines (e.g. Education, History, Linguistics, Literature, Psychology) for the analysis of text, and many of the techniques (e.g. content analysis, topic modelling, sentiment analysis) rely on counting words. However, words can take different meanings in different contexts, and around 16% of running text counts as semantically meaningful multiword expressions (where the meaning of the whole expression is different from the collection of individual words). In this talk, I will describe what can be achieved by combining methods from computer science (natural language processing) with linguistics (corpus linguistics) to address these issues. The talk will cover the basics of semantic annotation where words and multiword expressions are automatically labelled with coarse-grained word senses using the UCREL Semantic Analysis System (USAS). 
Then, via a demonstration of the web-based Wmatrix tool, I will show how counting USAS categories and comparing the frequency profiles with those from other texts can be used to quickly gist a text or corpus. Along the way, I will provide some pointers to example case studies in psychology, political discourse analysis, and beyond, describe current research and development on open source USAS multilingual taggers, and provide attendees with pointers for Wmatrix access and further tutorials to follow up later using your own corpora.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Bio note<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I am a Professor in Computer Science at Lancaster University, UK and Director of the UCREL interdisciplinary research centre which carries out research in corpus linguistics and natural language processing (NLP). A long term focus of my work is semantic multilingual NLP in extreme circumstances where language is noisy e.g. in historical, learner, speech, email, txt and other CMC varieties. Along with domain experts, I have applied my research in the areas of dementia detection, mental health, online child protection, cyber security, learner dictionaries, and text mining of biomedical literature, historical corpora, and financial narratives. I was a co-investigator of the five-year ESRC Centre for Corpus Approaches to Social Science (CASS) which is designed to bring the corpus approach to bear on a range of social sciences. 
I\u2019m also a member of the multidisciplinary Institute Security Lancaster, the Lancaster Digital Humanities Hub, and the Data Science Institute.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity is-style-wide\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><span><sup>MEETING #1: Wednesday 10 November 2021<\/sup><\/span><sup>  (Two presentations)<\/sup><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em><sup>Topic: Constructing Topic-Specific Corpora<\/sup><\/em><\/strong><\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Dan Malone <\/strong>(Edge Hill University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><em><strong>Constructing the Lone Wolf Corpus:&nbsp;Using polysemous query terms to compile a topic-specific corpus<\/strong><\/em><\/h6>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<p class=\"wp-block-paragraph\">This paper is concerned with the process of developing a query to compile a topic-specific corpus from a text database. For a corpus to be topic-specific, its texts must be relevant to the topic(s) it was compiled to investigate. However, polysemous query terms are more likely than monosemous query terms to retrieve nonrelevant texts and, therefore, reduce query precision, that is, the proportion of retrieved texts that are relevant.&nbsp;More specifically, then, this paper suggests that the issue of polysemous query terms can be addressed through the implementation of a dual-group complex query (hereafter, DGQ).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The motivation for this paper arose while compiling the corpus for my PhD project \u2018Constructing the Lone Wolf Terrorist: A corpus-driven study of the British press\u2019. In its actor-based approach, corpus compilation was underpinned by an onomasiological perspective of the connection between lexical items and the concept of \u2018the lone wolf terrorist\u2019. 
According to Geeraerts (2010: 23), \u201conomasiology takes its starting point in a concept and investigates the different expressions via which the concept can be designated or named\u201d. This is opposed to semasiology, which \u201ctakes its starting point in the word as a form and charts the meanings that the word can occur with\u201d (ibid). Indeed, the concept \u2018lone wolf terrorist\u2019 can be expressed via a number of polysemous lexical items, such as&nbsp;<em>lone actor<\/em>,&nbsp;<em>lone attacker<\/em>, and&nbsp;<em>solo actor<\/em>, with the specificity of their meaning being derived from context.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To compile the Lone Wolf Corpus (LWC), rather than employing a simple query-string, the DGQ was devised to mitigate the polysemy of its query terms. It comprises two distinct groups of terms, with each based around a core semantic component of \u2018lone-wolf terrorist\u2019; Group A terms represented lone-wolf actors or actions, whereas Group B represented terrorism. By linking terms within each group with the Boolean operator&nbsp;<em>OR<\/em>&nbsp;and by then linking the two groups using&nbsp;<em>AND<\/em>, the query retrieved texts containing at least one term from each group. By drawing on textual context in the form of collocation, the potential for multiplicity of meaning of the polysemous query terms is restricted, leading to a reduction in the number of nonrelevant texts being retrieved by the query.<br>This paper develops the query formulation technique outlined by Gabrielatos (2007). Central to Gabrielatos\u2019s technique is the metric of relative query term relevance (RQTR), which establishes the degree of relevance of candidate query terms to the topic being investigated. 
The RQTR technique has been adopted in a number of studies, such as Prentice (2010), Dimmroth, Steiger &amp; Sch\u00fcnemann (2017), and Kreischer (2019), as a means to both expanding queries and establishing the relevancy of candidate terms. This paper expands the applicability of the RQTR method by illustrating how it can be applied to the DGQ and, therefore, cater for polysemous query terms.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">From the initial core query terms&nbsp;<em>lone wolf<\/em>&nbsp;and&nbsp;<em>terrorism<\/em>, the LWC query was expanded to seventy query terms. When applied to the Lone Wolf Corpus (LWC) query, the DGQ improved query precision at minimal expense to recall, relative to a simple query. Based on a systematic sampling, the results show that the DGQ improved precision from 0.46 to 0.89, which was gained at the minimal expense of a 0.08 decrease in retrieved relevant texts.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Gabrielatos, C. (2007). Selecting query terms to build a specialised corpus from a restricted access database.&nbsp;<em>ICAME Journal<\/em>, 31, 5-44.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Geeraerts, D. (2010).&nbsp;<em>Theories of Lexical Semantics<\/em>. Oxford University Press.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Kreischer, K. S. (2019). The relation and function of discourses: a corpus-cognitive analysis of the Irish abortion debate.&nbsp;<em>Corpora<\/em>, 14(1), 105-130.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Prentice, S. (2010). Using automated semantic tagging in Critical Discourse Analysis: A case study on Scottish independence from a Scottish nationalist perspective.&nbsp;<em>Discourse &amp; Society<\/em>, 21(4), 405\u2013437.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Steiger, S., Sch\u00fcnemann, W. J., &amp; Dimmroth, K. (2017). Outrage without consequences? 
Post-Snowden discourses and governmental practice in Germany.&nbsp;<em>Media and Communication<\/em>, 5(1), 7-16.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Katia Adimora <\/strong>(Edge Hill University)<\/h6>\n\n\n\n<h6 class=\"wp-block-heading\"><strong><em>Building bilingual corpora for Critical Discourse Analysis: Mexican immigration to the US<\/em><\/strong><\/h6>\n\n\n\n<p class=\"wp-block-paragraph\">This talk will address the building of topic-specific corpora about Mexican immigration to the US during the Donald Trump era. The corpora contain American and Mexican newspaper articles that cover Mexican immigration (44,779 articles, 30 million words). The aim is to analyse how immigrants are represented in them.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The newspapers included in the corpora are:<br>US newspapers:&nbsp;<em>New York Times, Washington Post, USA Today, Los Angeles Times, The Arizona Republic&nbsp;<\/em>and<em>&nbsp;Chicago Tribune.&nbsp;<\/em><br>Mexican newspapers:&nbsp;<em>El Universal, Elimparcial.com,&nbsp;Reforma, El Norte, Lacronica.com and Mural.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To gather the relevant articles, a three-part query was formed by reading through various American and Mexican articles and identifying the words used to talk about immigrants or immigration. Bilingual queries, in English and Spanish, needed to be constructed. 
The Spanish query terms are equivalents of the English ones, but not necessarily literal translations, because Mexican newspapers do not always use the expressions found in English, or use different expressions to talk about immigration and immigrants.<br>Articles were transferred from the online database&nbsp;<em>ProQuest<\/em>&nbsp;(<em>Global&nbsp;Newsstream<\/em>) to the software tool Sketch Engine (<a href=\"https:\/\/www.sketchengine.eu\/\">https:\/\/www.sketchengine.eu\/<\/a>).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The American and Mexican corpora were divided into&nbsp;subcorpora&nbsp;in order to compare how newspapers in the American states with the highest numbers of Mexican immigrants represent them, relative to national newspapers. Similarly, Mexican&nbsp;subcorpora&nbsp;were formed to compare how newspapers from the regions&nbsp;of Mexico&nbsp;with high numbers of migrants moving to the US represent them, relative to the national newspapers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This&nbsp;division into subcorpora&nbsp;differs from those commonly applied to the British press, namely broadsheet&nbsp;<em>vs.<\/em>&nbsp;tabloid, and left-leaning, right-leaning and centrist by political orientation. 
This is due to the difficulty to draw the line between these types of grouping&nbsp;of&nbsp;American, and especially, Mexican newspapers.&nbsp;&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">References<br><strong><br><\/strong>Kilgarriff, Adam, V\u00edt Baisa, Jan Bu\u0161ta, Milo\u0161 Jakub\u00ed\u010dek, Vojt\u011bch Kov\u00e1\u0159, Jan Michelfeit, Pavel Rychl\u00fd, V\u00edt Suchomel (2014) The Sketch Engine: ten years on.&nbsp;<em>Lexicography<\/em>, 1: 7-36.&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>2026 MEETING #20: Friday 6 March 2026, 10:00-11:30 am Topic: LLMs, Corpus Linguistics, and Language Learning Peter Crosthwaite (University of Queensland, Australia) Corpora, Prompts, and Pedagogy: Human-AI Text Comparison in Applied Linguistics SLIDES Generative AI has fundamentally disrupted corpus linguistics by making it possible to create large, readily generable corpora of machine-produced text, challenging long-standing [&hellip;]<\/p>\n","protected":false},"author":2300,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-92","page","type-page","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Past events | Edge Hill Corpus Research Group | Edge Hill University<\/title>\n<meta name=\"description\" content=\"Find out more about the past events held by the Edge Hill Corpus Research Group (EHU CRG) and read the speakers&#039; abstracts.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/sites.edgehill.ac.uk\/crg\/past\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Past events | Edge Hill Corpus Research 
Group | Edge Hill University\" \/>\n<meta property=\"og:description\" content=\"Find out more about the past events held by the Edge Hill Corpus Research Group (EHU CRG) and read the speakers&#039; abstracts.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/sites.edgehill.ac.uk\/crg\/past\/\" \/>\n<meta property=\"og:site_name\" content=\"Edge Hill Corpus Research Group\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-06T14:51:58+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/sites.edgehill.ac.uk\/crg\/files\/2023\/02\/Picture1.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"Past events | Edge Hill Corpus Research Group | Edge Hill University\" \/>\n<meta name=\"twitter:description\" content=\"Find out more about the past events held by the Edge Hill Corpus Research Group (EHU CRG) and read the speakers&#039; abstracts.\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"41 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/\",\"url\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/\",\"name\":\"Past events | Edge Hill Corpus Research Group | Edge Hill University\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/wp-content\\\/uploads\\\/sites\\\/377\\\/2023\\\/02\\\/Picture1.png\",\"datePublished\":\"2022-11-16T10:08:10+00:00\",\"dateModified\":\"2026-03-06T14:51:58+00:00\",\"description\":\"Find out more about the past events held by the Edge 
Hill Corpus Research Group (EHU CRG) and read the speakers' abstracts.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/#primaryimage\",\"url\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/wp-content\\\/uploads\\\/sites\\\/377\\\/2023\\\/02\\\/Picture1.png\",\"contentUrl\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/wp-content\\\/uploads\\\/sites\\\/377\\\/2023\\\/02\\\/Picture1.png\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/past\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"PAST EVENTS\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/#website\",\"url\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/\",\"name\":\"Edge Hill Corpus Research Group\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/sites.edgehill.ac.uk\\\/crg\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Past events | Edge Hill Corpus Research Group | Edge Hill University","description":"Find out more about the past events held by the Edge Hill Corpus Research Group (EHU CRG) and read the speakers' abstracts.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/","og_locale":"en_GB","og_type":"article","og_title":"Past events | Edge Hill Corpus Research Group | Edge Hill University","og_description":"Find out more about the past events held by the Edge Hill Corpus Research Group (EHU CRG) and read the speakers' abstracts.","og_url":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/","og_site_name":"Edge Hill Corpus Research Group","article_modified_time":"2026-03-06T14:51:58+00:00","og_image":[{"url":"https:\/\/sites.edgehill.ac.uk\/crg\/files\/2023\/02\/Picture1.png","type":"","width":"","height":""}],"twitter_card":"summary_large_image","twitter_title":"Past events | Edge Hill Corpus Research Group | Edge Hill University","twitter_description":"Find out more about the past events held by the Edge Hill Corpus Research Group (EHU CRG) and read the speakers' abstracts.","twitter_misc":{"Est. 
reading time":"41 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/","url":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/","name":"Past events | Edge Hill Corpus Research Group | Edge Hill University","isPartOf":{"@id":"https:\/\/sites.edgehill.ac.uk\/crg\/#website"},"primaryImageOfPage":{"@id":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/#primaryimage"},"image":{"@id":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/#primaryimage"},"thumbnailUrl":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/02\/Picture1.png","datePublished":"2022-11-16T10:08:10+00:00","dateModified":"2026-03-06T14:51:58+00:00","description":"Find out more about the past events held by the Edge Hill Corpus Research Group (EHU CRG) and read the speakers' abstracts.","breadcrumb":{"@id":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/sites.edgehill.ac.uk\/crg\/past\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/#primaryimage","url":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/02\/Picture1.png","contentUrl":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-content\/uploads\/sites\/377\/2023\/02\/Picture1.png"},{"@type":"BreadcrumbList","@id":"https:\/\/sites.edgehill.ac.uk\/crg\/past\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/sites.edgehill.ac.uk\/crg\/"},{"@type":"ListItem","position":2,"name":"PAST EVENTS"}]},{"@type":"WebSite","@id":"https:\/\/sites.edgehill.ac.uk\/crg\/#website","url":"https:\/\/sites.edgehill.ac.uk\/crg\/","name":"Edge Hill Corpus Research 
Group","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/sites.edgehill.ac.uk\/crg\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"}]}},"_links":{"self":[{"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/pages\/92","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/users\/2300"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/comments?post=92"}],"version-history":[{"count":7,"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/pages\/92\/revisions"}],"predecessor-version":[{"id":1893,"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/pages\/92\/revisions\/1893"}],"wp:attachment":[{"href":"https:\/\/sites.edgehill.ac.uk\/crg\/wp-json\/wp\/v2\/media?parent=92"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}