The Nonverbal Element in Persian Verbal Multiword Expressions: A Corpus Annotation Approach
DOI:
https://doi.org/10.46991/jil/2025.02.04
Keywords:
Compound Verb, Nonverbal Element, Persian, Preverb, Text Corpus, Verbal Multiword Expression
Abstract
This article presents a linguistic framework for identifying and annotating Persian (Farsi) Verbal Multiword Expressions (VMWEs), developed in alignment with the standards and methodology of the PARSEME Corpus, the product of an international research network focused on the systematic analysis of multiword expressions across languages. The study aims to bridge the gap between universal annotation guidelines and language-specific grammatical features by tailoring the PARSEME framework to the structural and semantic properties of Persian. By identifying the characteristics of Persian VMWEs, particularly their nonverbal elements (preverbs) and their diverse syntactic and morphological patterns, this work contributes to a more refined understanding of Persian verbal idiomaticity and to the advancement of natural language processing tasks. The article details the development of annotation guidelines that reflect both cross-linguistic categories and Persian-specific grammatical phenomena, and it describes the annotation of a corpus of 5,617 sentences covering a wide range of Persian VMWEs, including light verb constructions, verbal idioms, and prefix verbs. The practical applications of these guidelines in natural language processing are also discussed, highlighting their potential to enhance machine understanding of complex verbal constructions, improve syntactic parsing accuracy, and support downstream tasks such as machine translation, information extraction, and semantic role labeling.
License
Copyright (c) 2026 Vahide Tajalli, Mehrnoush Shamsfard, Yalda Yarandi, Mahtab Sarlak, Arezoo Haghbin

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.