AUC Philologica (Acta Universitatis Carolinae Philologica) is an academic journal published by Charles University. It publishes scholarly articles across a wide range of disciplines (English, German, Greek and Latin, Oriental, Romance and Slavonic studies, as well as phonetics and translation studies), on linguistic as well as literary and cultural topics. Apart from articles, it publishes reviews of new academic books and of special issues of academic journals.

The journal is indexed in CEEOL, DOAJ, EBSCO, and ERIH PLUS.

AUC PHILOLOGICA, Vol 2013 No 2 (2013), 91–108

Analisi e classificazione automatica dei verbi italiani: uno studio sul corpus “La Repubblica”

Diana Peppoloni

published online: 29. 12. 2014


An Analysis and Automatic Classification of Italian Verbs: A Study Based on the “La Repubblica” Corpus

This article is concerned with experiments on the automatic induction of Italian semantic verb classes using k-Means, a standard clustering technique, for the purpose of verifying the plausibility of finding a direct connection between the meaning-bearing components of a verb and its syntactic behaviour. A theoretical foundation has been established in extensive works on semantic verb classes such as Levin (1993) for English and Schulte im Walde (2002, 2003, 2004, 2006) for German: each verb class contains verbs which are similar in their meaning and in their syntactic properties. Basing our work on this hypothesis, we conducted a study of the “La Repubblica” corpus, one of the leading corpora freely available for the Italian language, in order to obtain an automatic classification of a sample of Italian verbs. Using probability distributions over verb subcategorisation frames, we obtained an intuitively plausible clustering of 200 verbs into 40, 24, and 10 classes. The automatic clustering was evaluated against independently motivated, hand-constructed semantic verb classes. A series of post-hoc cluster analyses explored the influence of specific frames and frame groups on the coherence of the verb classes, and supported the validity of the syntactic-semantic hypothesis.
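The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: the six verbs, the three frame labels, and all probabilities below are invented for illustration, and the squared Euclidean distance and the deterministic farthest-first initialisation are assumptions, since the abstract names only k-Means and probability distributions over subcategorisation frames.

```python
# Hypothetical toy data: each verb maps to a probability distribution over
# three made-up subcategorisation frames (intransitive, transitive, clausal).
# The numbers are illustrative only and are NOT taken from the corpus.
VERB_FRAMES = {
    "correre":   [0.80, 0.15, 0.05],
    "camminare": [0.85, 0.10, 0.05],
    "mangiare":  [0.20, 0.75, 0.05],
    "bere":      [0.25, 0.70, 0.05],
    "dire":      [0.05, 0.25, 0.70],
    "pensare":   [0.10, 0.20, 0.70],
}


def squared_distance(a, b):
    """Squared Euclidean distance between two frame distributions (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def farthest_first(data, verbs, k):
    """Deterministic seeding (an assumption): start from the first verb, then
    repeatedly add the verb farthest from all centroids chosen so far."""
    centroids = [list(data[verbs[0]])]
    while len(centroids) < k:
        farthest = max(
            verbs,
            key=lambda v: min(squared_distance(data[v], c) for c in centroids),
        )
        centroids.append(list(data[farthest]))
    return centroids


def k_means(data, k, max_iter=100):
    """Plain k-Means: alternate assignment and centroid update until stable."""
    verbs = sorted(data)
    centroids = farthest_first(data, verbs, k)
    assignment = {}
    for _ in range(max_iter):
        new_assignment = {
            v: min(range(k), key=lambda c: squared_distance(data[v], centroids[c]))
            for v in verbs
        }
        if new_assignment == assignment:
            break  # no verb changed cluster, so the algorithm has converged
        assignment = new_assignment
        for c in range(k):
            members = [data[v] for v in verbs if assignment[v] == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean(members)
    return assignment


clusters = k_means(VERB_FRAMES, k=3)

# Group the verbs by cluster index for display.
grouped = {}
for verb, index in clusters.items():
    grouped.setdefault(index, []).append(verb)
for index in sorted(grouped):
    print(index, sorted(grouped[index]))
```

On this toy input, verbs with similar frame distributions end up in the same cluster, which is the behaviour the syntactic-semantic hypothesis predicts. A real experiment would need frame distributions estimated from the corpus and an evaluation against hand-constructed classes, as the abstract describes.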

keywords: syntactic-semantic hypothesis; clustering; automatic classification; subcategorization frames; written Italian corpus
keywords (Italian): ipotesi sintattico-semantica; clustering; classificazione automatica; frames di sottocategorizzazione; corpus dell’italiano scritto

ISSN: 0567-8269
E-ISSN: 2464-6830
