Incorporating sparse overcomplete word representations into transition-based dependency parsing

Egas López, José Vicente (2018) Incorporating sparse overcomplete word representations into transition-based dependency parsing. Masters, Szegedi Tudományegyetem.

PDF
2018_Egas_López_José_Vicente_T62SAR_SZ.pdf
Access rights: SZTE designated computers only

Download (1MB)

Archive (ZIP)
2018_egas_lopez_jose_vincente.zip
Access rights: SZTE designated computers only

Download (5kB)

PDF
2018_egas_lopez_jose_vicente_biralati_lap.pdf
Access rights: Repository staff only

Download (646kB)

Abstract

This work uses a transition-based dependency parser. Although the parser was originally designed for cross-lingual use, it can also be applied to monolingual dependency parsing, and only the monolingual setting is used here to compare the parser's performance with different kinds of pre-trained word embeddings: sparse overcomplete embeddings versus dense embeddings. To keep the comparison clean, distributed representations other than word embeddings are omitted, so word embeddings are the only type of distributed representation used in the experiments. Training and testing were carried out on Universal Dependencies version 2: CoNLL-U formatted training, development and test sets were fed to the parser together with the pre-trained word embeddings. The results show that the performance obtained with sparse embeddings is very close to that of dense embeddings: sparse embeddings reached 85.76% Unlabeled Attachment Score and 83.5% Labeled Attachment Score, compared with 86.18% and 84.08% for dense embeddings. Using sparse embeddings thus yields performance close to state-of-the-art methods, while the embeddings themselves are more interpretable for humans: their dimensions are more coherent than those of dense embeddings, as corroborated by the word intrusion experiments of Murphy et al. (2012) and Faruqui (2015).
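The reported figures are Unlabeled and Labeled Attachment Scores (UAS/LAS) computed over CoNLL-U test data. As a rough illustration of how these metrics are obtained (not the thesis' own evaluation code), here is a minimal Python sketch that compares the HEAD and DEPREL columns of a gold and a predicted CoNLL-U file; the file names are placeholders and both files are assumed to share the same sentence and token segmentation.

```python
# Minimal UAS/LAS computation from gold and predicted CoNLL-U files (sketch).

def read_conllu(path):
    """Yield each sentence as a list of (head, deprel) pairs."""
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                if sentence:
                    yield sentence
                    sentence = []
                continue
            if line.startswith("#"):
                continue
            cols = line.split("\t")
            # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
            if "-" in cols[0] or "." in cols[0]:
                continue
            # HEAD and DEPREL are the 7th and 8th CoNLL-U columns (indices 6 and 7).
            sentence.append((cols[6], cols[7]))
    if sentence:
        yield sentence


def attachment_scores(gold_path, pred_path):
    correct_heads = correct_labeled = total = 0
    for gold_sent, pred_sent in zip(read_conllu(gold_path), read_conllu(pred_path)):
        for (g_head, g_rel), (p_head, p_rel) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                correct_heads += 1          # correct attachment -> counts for UAS
                if g_rel == p_rel:
                    correct_labeled += 1    # correct attachment and label -> counts for LAS
    return 100.0 * correct_heads / total, 100.0 * correct_labeled / total


if __name__ == "__main__":
    # Placeholder file names for illustration only.
    uas, las = attachment_scores("en-ud-test.conllu", "parser-output.conllu")
    print(f"UAS: {uas:.2f}%  LAS: {las:.2f}%")
```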

English title

Incorporating sparse overcomplete word representations into transition-based dependency parsing

Institution

Szegedi Tudományegyetem

Faculty

Faculty of Science and Informatics

Department

Számítógépes Algoritmusok és Mesterséges Intelligencia Tanszék

Discipline

Natural Sciences

Institute

Informatikai Intézet

Specialization

programtervező informatikus (Computer Science)

Supervisor(s)

Berend, Gábor, assistant professor

Item Type: Thesis (Masters)
Subjects: 01. Natural sciences
01. Natural sciences > 01.02. Computer and information sciences
Depositing User: TTIK editor
Date Deposited: 2019. Sep. 24. 08:29
Last Modified: 2023. Oct. 28. 16:06
URI: https://diploma.bibl.u-szeged.hu/id/eprint/73539
