Grammatical Structure Oriented Automated Approach for Surface Knowledge Extraction from Open Domain Unstructured Text

Tissera, M.; Weerasinghe, R.

Please use this identifier to cite or link to this item: http://repository.kln.ac.lk/handle/123456789/25455

Full metadata record

DC Field	Value	Language
dc.contributor.author	Tissera, M.	-
dc.contributor.author	Weerasinghe, R.	-
dc.date.accessioned	2022-11-01T06:29:27Z	-
dc.date.available	2022-11-01T06:29:27Z	-
dc.date.issued	2022	-
dc.identifier.citation	Tissera, M., & Weerasinghe, R. (2022). Grammatical Structure Oriented Automated Approach for Surface Knowledge Extraction from Open Domain Unstructured Text. Journal of Information and Communication Convergence Engineering, 20(2), 113–124. https://doi.org/10.6109/JICCE.2022.20.2.113	en_US
dc.identifier.uri	http://repository.kln.ac.lk/handle/123456789/25455	-
dc.description.abstract	News published on the web generates increasingly large volumes of unstructured text. Because only humans can currently understand the meaning of news, this volume leads to information overload and hinders the effective use of the knowledge embedded in such texts. Automatic Knowledge Extraction (AKE) has therefore become an integral part of the Semantic Web and Natural Language Processing (NLP). Although recent literature shows that AKE has progressed, results still fall short of expectations. This study proposes a method to automatically extract surface knowledge from English news into a machine-interpretable semantic format (triples). The proposed technique was designed around the grammatical structure of the sentence, and 11 original rules were discovered. In the initial experiment, 83.5% of the triples extracted from a Sri Lankan news corpus were meaningful. The experiment was then extended to the British Broadcasting Corporation (BBC) news dataset to demonstrate the generic nature of the approach, which yielded a higher meaningful-triple extraction rate of 92.6%. These results were validated using the inter-rater agreement method, which indicated high reliability.	en_US
dc.publisher	Journal of Information and Communication Convergence Engineering	en_US
dc.subject	Automatic Knowledge Extraction, Relation extraction, Natural Language Processing, Semantic Web, Triples Extraction	en_US
dc.title	Grammatical Structure Oriented Automated Approach for Surface Knowledge Extraction from Open Domain Unstructured Text	en_US
Appears in Collections:	Articles

Files in This Item:

File	Description	Size	Format
Grammatical Structure Oriented Automated Approach for Surface Knowledge Extraction from Open Domain Unstructured Text.pdf		148.39 kB	Adobe PDF
