A Comparative Evaluation of PDF-to-HTML Conversion Tools

dc.contributor.author: Pathirana, Pramodya
dc.contributor.author: Silva, Asini
dc.contributor.author: Lawrence, Thenuka
dc.contributor.author: Weerasinghe, Thushani
dc.contributor.author: Abeyweera, Roshan
dc.date.accessioned: 2024-01-16T04:51:59Z
dc.date.available: 2024-01-16T04:51:59Z
dc.date.issued: 2023
dc.description.abstract: PDF (Portable Document Format) is a popular file format used for sharing and storing documents across different platforms. However, there are occasions when the content of a PDF document needs to be re-purposed for online use. PDF-to-HTML conversion is a common method used to achieve this goal. This research paper presents a comparative evaluation of existing PDF-to-HTML conversion tools for their suitability in extracting text and images. These tools were tested using school textbooks in Sri Lanka, which contain complex text formatting and non-textual elements. The evaluation was based on various criteria, such as the accuracy of the output, handling of complex text formatting, and non-textual elements. Comparisons were drawn based on the performance of each of these tools with respect to the criteria. The study provides useful insights for individuals and organizations looking to re-purpose PDF content for online use in the HTML format, particularly in the education sector. [en_US]
dc.identifier.citation: Pathirana Pramodya; Silva Asini; Lawrence Thenuka; Weerasinghe Thushani; Abeyweera Roshan (2023), A Comparative Evaluation of PDF-to-HTML Conversion Tools, International Research Conference on Smart Computing and Systems Engineering (SCSE 2023), Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka. Page 24 [en_US]
dc.identifier.uri: http://repository.kln.ac.lk/handle/123456789/27362
dc.publisher: Department of Industrial Management, Faculty of Science, University of Kelaniya Sri Lanka [en_US]
dc.subject: e-learning, educational design research, text extraction, PDF to HTML conversion [en_US]
dc.title: A Comparative Evaluation of PDF-to-HTML Conversion Tools [en_US]

Files

Original bundle

Name: Proceeding SCSE 2023 (3) 24.pdf
Size: 11.19 KB
Format: Adobe Portable Document Format
