
Browsing by Author "Jayakody, J.R.K.C."

Now showing 1 - 8 of 8
  • E-marker: Moodle plugin tool to grade essay type questions
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Jayakody, J.R.K.C.
    Moodle is one of the best-known Learning Management Systems (LMS) and helps academics create varied assessment types such as multiple choice questions (MCQs), tutorials, short questions and assignments. Typically, MCQs and short essay questions are used as formative assessment techniques to evaluate students' performance. MCQ marking is automated and straightforward in Moodle, whereas short essay questions are marked manually by academics. Meanwhile, class sizes and the diversity of courses and assessments are increasing day by day, so evaluating and grading short essay questions on time is a challenging practice. Hence the present research was conducted to build a Moodle plug-in that marks essay-type questions automatically. Two hundred short essay questions from the Software Engineering course of the Department of Computing and Information Systems at Wayamba University were used as the initial dataset. The research was conducted in a few steps. Statistical features were derived with Natural Language Processing (NLP) techniques, such as the number of words used in the answer, the number of named entities, the number of distinct words, and the numbers of correct and incorrect words. In addition, several chunking rules were developed to identify correct usage of the language. Next, a semantic mapper module was developed to extract semantic features based on the provided answers. Finally, several experiments were run to identify the most appropriate feature set for a logistic regression model built with the scikit-learn machine learning package. The final model showed an accuracy of 82%.
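The abstract summarizes the pipeline but not its code. Below is a minimal sketch, assuming NLTK and scikit-learn, of how such statistical features could feed a logistic regression grader; the feature list, example answers and labels are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of the statistical-feature +
# logistic-regression pipeline described above, using NLTK and scikit-learn.
import nltk
from sklearn.linear_model import LogisticRegression

nltk.download("words", quiet=True)  # English word list used as a dictionary
ENGLISH = set(w.lower() for w in nltk.corpus.words.words())

def extract_features(answer):
    """Simple statistical features of a short essay answer."""
    words = [w.strip(".,;:!?").lower() for w in answer.split()]
    words = [w for w in words if w.isalpha()]
    correct = sum(1 for w in words if w in ENGLISH)
    return [
        len(words),            # number of words used in the answer
        len(set(words)),       # number of distinct words
        correct,               # words found in the dictionary
        len(words) - correct,  # words not found (proxy for incorrect words)
    ]

# Hypothetical graded answers: 1 = acceptable, 0 = not acceptable.
answers = [
    "Encapsulation hides the internal state of an object behind methods.",
    "dont no realy",
]
grades = [1, 0]
model = LogisticRegression(max_iter=1000).fit(
    [extract_features(a) for a in answers], grades)
```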
  • Identification of varying standard of student based on Moodle Pattern Identification Business Intelligence Tool
    (Department of Zoology and Environmental Management, University of Kelaniya, Kelaniya, Sri Lanka., 2016) Jayakody, J.R.K.C.; Allagalla, W.H.P.
    Learning Management System (LMS) interaction takes place in an internet environment, with different methods for retrieving class content, materials, subject-related information and resources, and for student-teacher interaction. Since the interaction details of an LMS such as Moodle are stored in its database as log files, those logs can be used to analyze and understand weak and good students. Discussions, forums, assignments, wikis and the course itself are the main categories that reside in the logs. This research was done to develop a Business Intelligence (BI) tool that identifies the log patterns of able and less able students in Moodle, which is immensely helpful for identifying less able students very early and finding remedies to improve their educational standard. The Moodle dataset of MSc Business Management students of the University of Moratuwa was used for the research. Stored procedures were written in Java to extract the XML-format data and store the log details in a MySQL server. BI capabilities such as organizational memory, information integration, insight creation and visualization were covered. SQL Server 2012 was used as the main database to develop the data warehouse, and dimensions were created to generate the necessary cubes. In addition, SQL Server Integration Services was used for the Extract-Transform-Load (ETL) process, and the data cubes were analyzed with Multidimensional Expressions (MDX) queries. Finally, dashboards were built using Power BI, with Power Pivot graphs and power tables used to present interactive details to end users. A number of patterns were identified for recognizing less able students: based on assignment submissions, the number of times a user used the system, and the number of times pages and resources were accessed, new patterns were identified and presented to users to support decisions, which is immensely helpful to academics and students.
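The study's ETL stack is Java, SQL Server and Power BI; as a language-neutral illustration of the underlying pattern idea, the sketch below aggregates per-student activity counts from Moodle-style log events and flags low-activity students. The event tuples and the 0.6 threshold are assumptions.

```python
# Sketch of the log-pattern idea described above: aggregate Moodle-style
# log events per student and flag unusually low activity. The events and
# the 0.6 threshold are illustrative assumptions.
from collections import Counter

# (student, action) pairs as they might appear in a Moodle log extract.
log_events = [
    ("stu01", "assignment_submit"), ("stu01", "resource_view"),
    ("stu01", "forum_post"), ("stu02", "resource_view"),
    ("stu03", "resource_view"), ("stu03", "assignment_submit"),
]

activity = Counter(student for student, _ in log_events)
mean_activity = sum(activity.values()) / len(activity)

# Students well below the mean activity level are flagged for follow-up.
less_able = [s for s, n in activity.items() if n < 0.6 * mean_activity]
print(less_able)  # ['stu02']
```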
  • Impact of Feature Selection Towards Short Text Classification
    (Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka, 2023) Jayakody, J.R.K.C.; Vidanagama, V.G.T.N.; Perera, Indika; Herath, H.M.L.K.
    Feature selection techniques are used in the text classification pipeline to reduce the number of redundant or irrelevant features. Feature selection algorithms help to decrease overfitting, reduce training time, and improve the accuracy of the built models; similarly, frequency-based feature reduction techniques support eliminating unwanted features. Most of the existing work on feature selection is based on general text, and the behavior of feature selection has not been evaluated properly on short-text datasets. Therefore, this research was conducted to investigate how performance varies with the features selected by feature selection algorithms on short-text datasets. Three publicly available datasets were selected for the experiment. Chi-square, information gain and F-measure were examined, as those algorithms have been identified as the best for selecting features for text classification. Moreover, we examined the impact of those algorithms when selecting different types of features, such as 1-grams and 2-grams. Finally, we looked at the impact of frequency-based feature reduction techniques on the selected datasets. Our results showed that the information gain algorithm outperformed the other two. Moreover, selecting the best 20% of features with the information gain algorithm provided the same performance level as the entire feature set. Further, we observed that the high number of dimensions was due to bigrams, and noted the impact of n-grams on the feature selection algorithms. Finally, it is worth noting that removing the features which occur twice in a document would be ideal before applying feature selection techniques with different algorithms.
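As an illustration of the comparison described above, the sketch below uses scikit-learn to select the top 20% of 1-gram/2-gram features with chi-square and with mutual information (a common stand-in for information gain); the toy texts and labels are placeholders.

```python
# Sketch of the feature-selection comparison described above, using
# scikit-learn: chi-square vs. mutual information (an information-gain
# analogue) over 1-gram/2-gram features; texts and labels are placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

texts = ["great phone battery", "terrible battery life",
         "great bright screen", "bad dim screen"]
labels = [1, 0, 1, 0]  # placeholder short-text sentiment labels

# 1-gram and 2-gram features, the feature types examined in the study.
X = CountVectorizer(ngram_range=(1, 2)).fit_transform(texts)

# Keep the best 20% of features, the level the study found sufficient.
k = max(1, int(0.2 * X.shape[1]))
X_chi = SelectKBest(chi2, k=k).fit_transform(X, labels)
X_ig = SelectKBest(mutual_info_classif, k=k).fit_transform(X, labels)
print(f"{X.shape[1]} features reduced to {k}")
```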
  • Moodle system performance analysis of Wayamba University
    (Department of Zoology and Environmental Management, University of Kelaniya, Kelaniya, Sri Lanka., 2016) Jayakody, J.R.K.C.; Ekanayaka, S.M.E.W.P.M.; Ubesinghe, U.W.A.C.C.
    Moodle is an online management system built for online learning; the name has already become a term of its own, synonymous with software designed to help educators create quality online education. Most higher-education courses are conducted on a Moodle system, but various factors degrade Moodle system performance, which makes it difficult for academics to conduct Moodle-based courses. This research checked the response time of pages for user requests such as the login page, video resource pages, forum pages and discussion pages, working mostly with the Moodle resources of Wayamba University. The times at which the system starts responding and finishes responding were used as the performance metrics for Moodle resources. Several factors, such as computer parameters (CPU speed, number of cores, disk capacity, available main memory), network parameters (network structure, types of switches and routers) and Moodle services (login, view image, view video files), were considered in evaluating Moodle system performance. Apache JMeter was used as the testing tool: it simulates a heavy load on a server to test its strength or to analyze overall performance under different load types. Thread groups, config elements, timers, samplers and listeners were used extensively to check performance. Testing data were collected over 20 working days. According to the analysis, the number of users, the size of the resources, and the speed of the CPU showed a significant negative linear relationship with response time.
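The study's load tests were built in Apache JMeter, whose test plans are XML; the Python sketch below illustrates the same idea on a smaller scale by timing requests from increasing numbers of concurrent simulated users. The URL and user counts are placeholders.

```python
# Sketch of the load-testing idea described above (the study itself used
# Apache JMeter thread groups); the URL and user counts are placeholders.
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URL = "https://moodle.example.edu/login/index.php"  # placeholder Moodle page

def timed_request(_):
    """Fetch the page once and return the elapsed response time."""
    start = time.perf_counter()
    urlopen(URL, timeout=30).read()
    return time.perf_counter() - start

for users in (1, 10, 50):  # increasing simulated concurrent load
    with ThreadPoolExecutor(max_workers=users) as pool:
        times = list(pool.map(timed_request, range(users)))
    print(f"{users:3d} users: mean response {sum(times) / len(times):.3f} s")
```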
  • Natural Language Processing (NLP) Techniques to Measure the Syllabus Coverage with the Final Exam Paper
    (3rd International Conference on Advances in Computing and Technology (ICACT ‒ 2018), Faculty of Computing and Technology, University of Kelaniya, Sri Lanka., 2018) Dharmapriya, A.G.K.S.; Jayakody, J.R.K.C.
    Exam questions usually play a pivotal role in education and are the main assessment technique for evaluating Intended Learning Outcomes (ILOs). The main intention of a subject syllabus is to cover the ILOs. If the syllabus is not covered effectively by the exam questions, it is challenging to identify whether students have acquired and enhanced the necessary skills and knowledge specified in the ILOs. Therefore, the preparation of exam papers and their evaluation play an important role in improving students' performance. Setting exam questions of an appropriate level that cover the entire syllabus is time-consuming, tedious and challenging work for instructors. This research was therefore done with a view to setting up effective question papers by measuring the depth of syllabus coverage; identification of students' knowledge level was a further output. Natural Language Processing (NLP) techniques such as tokenization, stop-word removal, non-alphanumeric word removal and tagging were used to process the contents and questions. Using NLP with NLTK, cosine similarity with TF-IDF (term frequency-inverse document frequency), TF-IDF variations and semantic similarity algorithms were developed to generate a unique set of rules that identify the syllabus content best covered by the exam questions. A rule-based set of logic was developed to classify exam questions under the different syllabus topics. Based on the experimental output, evaluators and instructors can redesign their exam papers. Seventy-two exam questions were used as the initial dataset, and the final evaluation was based on the total value generated from TF-IDF, TF-IDF variations, TF-IDF with cosine similarity and semantic similarity.
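A minimal sketch of the TF-IDF cosine-similarity matching described above, assuming scikit-learn: each exam question is scored against each syllabus topic and assigned to the best-matching one. The topics and question are placeholders.

```python
# Sketch of the TF-IDF + cosine-similarity matching described above, using
# scikit-learn; the syllabus topics and the question are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

syllabus_topics = [
    "normalisation and relational database design",
    "SQL queries joins and aggregation",
    "transaction management and concurrency control",
]
question = "Explain how two-phase locking enforces concurrency control."

vec = TfidfVectorizer(stop_words="english")
topic_matrix = vec.fit_transform(syllabus_topics)
scores = cosine_similarity(vec.transform([question]), topic_matrix)[0]

# The highest-scoring topic is taken as the one the question covers.
print(syllabus_topics[scores.argmax()], round(float(scores.max()), 3))
```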
  • Natural language processing framework: WordNet based sentimental analyzer
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Jayakody, J.R.K.C.
    Sentiment analysis is a technique used to classify documents as positive, negative or neutral. Handwritten forms, mail, telephone surveys and online feedback forms are used to collect customer feedback about products and services; sentiment analysis is the technique used to mine such online and offline customer feedback data and gain insight into products and services automatically. Since business types differ, it is quite challenging to develop a generic sentiment analyzer. Therefore, this ongoing research focused on developing a generic framework that can be extended further in future to build the best generic sentiment analyzer. Several online customer feedback forms were used as the dataset. A web-page scraping module was developed to extract reviews from web pages, and chunk and chink rules were developed to extract comparative and superlative adverbs to build the knowledge base. The website Thesaurus.com was used to build the test data with synonyms of good, bad and neutral. Next, the WordNet database was used with different semantic similarity algorithms, such as path similarity, Leacock-Chodorow similarity, Wu-Palmer similarity and Jiang-Conrath similarity, to test the sentiments. The accuracy of the framework was improved further with a vector model built with natural language processing techniques. A labeled dataset of Amazon product reviews provided by the University of Pennsylvania was used to test accuracy. The framework was built so that the multiplier value can be changed based on the domain, and the final sentiment value is given as a percentage of the positive or negative type. The framework gave fairly accurate results, which are useful for generating good insights from user reviews.
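A rough NLTK sketch of the WordNet similarity step described above, scoring words against 'good'/'bad' seed senses with path similarity. Noun senses are used because WordNet's similarity measures need a hypernym hierarchy; the seed senses and test words are assumptions.

```python
# Rough sketch of WordNet-based polarity scoring with NLTK path similarity;
# the 'good'/'bad' seed senses and the test words are assumptions.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

GOOD = wn.synset("good.n.02")  # moral excellence or admirableness
BAD = wn.synset("bad.n.01")    # that which is below standard

def polarity(word):
    """> 0 means nearer the 'good' sense, < 0 nearer the 'bad' sense."""
    synsets = wn.synsets(word, pos="n")  # noun senses have a hypernym tree
    if not synsets:
        return 0.0
    s = synsets[0]
    return (s.path_similarity(GOOD) or 0.0) - (s.path_similarity(BAD) or 0.0)

for w in ("goodness", "evil", "table"):
    print(w, round(polarity(w), 3))
```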
  • Plagiarism detection educational tool: A student’s assessments similarity checker
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Jayakody, J.R.K.C.
    Plagiarism is very common among students in higher-education institutes for many reasons, such as lack of knowledge about the subject, poor academic writing skills or difficulty in meeting a given deadline. The most popular method of plagiarism is to use online web pages or e-books, as it takes little effort to get content from the internet, change it and submit it as original work. Hence, plenty of online and offline software tools exist to detect plagiarism; however, fewer tools identify copied work among students themselves. Therefore, in this research I developed a plagiarism detection tool to identify plagiarized assignment or tutorial submissions. Individual assignments and tutorials given in the software engineering courses of the Department of Computing and Information Systems of Wayamba University were used as the dataset. Natural language processing algorithms were developed to derive statistical features from the assignments, such as bag of words, most frequent words, number of words, named entities and paragraphs. Moreover, a Term Frequency-Inverse Document Frequency (TF-IDF) module was developed to generate a similarity index value among assignments, and a latent semantic analysis module was developed with a word dictionary and vector corpus. The features generated and extracted by every module were used to identify clusters of similar assignments, using the k-means clustering algorithm in RapidMiner. Most of the submitted assignments were assigned to one of a number of clusters, and once the clustering results were verified with the students, it was evident that the automatic cluster classification gave fairly good results.
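The paper runs k-means in RapidMiner; the sketch below shows a scikit-learn equivalent of the TF-IDF-plus-clustering step, with placeholder assignment texts and cluster count.

```python
# Sketch of the TF-IDF + k-means clustering step described above, in
# scikit-learn rather than RapidMiner; texts and k are placeholders.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

assignments = [
    "A stack is a LIFO structure supporting push and pop.",
    "Stacks are last-in first-out structures with push and pop operations.",
    "A queue processes elements in first-in first-out order.",
    "Queues serve elements FIFO: enqueue at the back, dequeue at the front.",
]

X = TfidfVectorizer(stop_words="english").fit_transform(assignments)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Assignments that share a cluster label are candidates for manual review.
print(labels)
```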
  • Question paper analysis with Natural Language Processing
    (Department of Zoology and Environmental Management, University of Kelaniya, Kelaniya, Sri Lanka., 2016) Jayakody, J.R.K.C.; Perera, P.L.M.
    The “art of paper setting” is a very popular term in the educational examination process. As it is an “art”, teachers should be passionate enough to prepare a good question paper that reflects the educational objectives. A few steps are involved in the process of paper setting, and analysis of the paper is the most important among them, as it is the only indicator of the alignment of the questions with the intended objectives. In the analysis process, human intelligence can analyze questions quite easily, but implementing similar intelligence in a computer system is a real challenge. Therefore, the purpose of this research was to build an intelligent computer system that can analyze and classify questions. Bloom’s Taxonomy is a world-recognized standard for classifying cognitive skills, so it was used as the guide for categorizing the questions in question papers. In the analysis phase, natural language processing techniques were used: the raw text was processed and then meaningful features of the questions, such as verb similarity, stem pattern similarity and stem meaning similarity, were extracted. Next, with machine learning techniques, a model (the brain of the system) was trained by feeding it the extracted question features. For model training, several classification algorithms were used: Multinomial Naive Bayes, Bernoulli Naive Bayes, Logistic Regression, Stochastic Gradient Descent, C-Support Vector and Linear Support Vector classifiers. The accuracy of each classification algorithm was measured while the size of the training dataset was varied, and the optimum algorithm was selected for model training. Finally, the model was trained with the optimum algorithm and used to classify unseen questions. The final model was fine-tuned to reach 80% classification accuracy.
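As a compact illustration of the classifier comparison described above, the sketch below cross-validates two of the named algorithms on toy Bloom's-level questions with scikit-learn; the questions, labels and fold count are placeholders.

```python
# Sketch of the classifier comparison described above, using scikit-learn;
# the questions, Bloom's-level labels and fold count are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

questions = [
    "Define the term operating system.",                # remember
    "List the layers of the OSI model.",                # remember
    "Explain how paging differs from segmentation.",    # understand
    "Explain why deadlock requires circular wait.",     # understand
    "Design a schema for a library database.",          # create
    "Design an algorithm to merge two sorted lists.",   # create
]
levels = ["remember", "remember", "understand", "understand", "create", "create"]

for clf in (MultinomialNB(), LogisticRegression(max_iter=1000)):
    pipe = make_pipeline(TfidfVectorizer(), clf)
    scores = cross_val_score(pipe, questions, levels, cv=2)
    print(type(clf).__name__, scores.mean())
```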
