International Research Symposium on Pure and Applied Sciences (IRSPAS)

Permanent URI for this community: http://repository.kln.ac.lk/handle/123456789/15650

Search Results

Now showing 1 - 4 of 4
  • Item
    A study on Dengue spread in Western province: using Spatial and Cluster analysis.
    (4th International Research Symposium on Pure and Applied Sciences, Faculty of Science, University of Kelaniya, Sri Lanka, 2019) Kethmi, G. A. P.; Premarathna, L. P. N. D.
    Dengue virus is transmitted to humans through the bite of infected Aedes mosquitoes, mostly Aedes aegypti. According to the National Dengue Control Unit of Sri Lanka, the recent outbreak of dengue fever in the country was reported in July 2017. Because the life cycle of a mosquito is short, it is highly influenced by variations in the environment. Sri Lanka's weather also changes over time, so the spread of the dengue mosquito is time dependent. Considering these facts, the objectives of the study were to identify the correlation between the number of dengue incidences and environmental factors such as temperature, rainfall and humidity, to recognize homogeneous areas of dengue, and to discover the dengue-dense and non-dense areas in Western province. An agglomerative hierarchical clustering method was used to observe homogeneous areas in the study area. In this method, each observation is initially considered a cluster, and the procedure continues by connecting the most similar observations. Several linkage methods can be used to join observations into clusters. The literature proposes Ward’s method as the best linkage method for clustering: the total within-cluster variance is calculated, and at every step the two clusters whose merger gives the smallest increase in that variance are connected. After constructing the cluster dendrogram by connecting the appropriate clusters, the optimum number of clusters was identified using the Elbow method. Spatial analysis explains the behaviour or pattern of a variable geographically. Geographical maps were used to find the dengue-dense and dengue non-dense areas. The number of dengue incidences and environmental factors for the period 2013 to 2017 of three districts in Western province were used for this study. R statistical software was used to conduct the analysis. A descriptive analysis was carried out and outliers were treated using the Winsorizing method. The normality of each variable was examined. Pearson’s correlation coefficient was calculated when variables were normally distributed; otherwise, Spearman’s correlation coefficient was calculated. According to the results obtained, rainfall and humidity have a negative correlation with the number of dengue incidences, while temperature has a positive correlation. Three clusters were identified as follows: the first two months and the last three months fell into one cluster, March and April formed the next, and the remaining months formed another. Cluster analysis showed that, during the first period of the monsoon season of the year, there is an increase in the spread of dengue virus in the Western province. Spatial analysis identified Colombo as the dengue-dense area and Kalutara as the dengue non-dense area in the Western province.
  • Item
    Categorizing T20 batsmen based on their performances
    (Research Symposium on Pure and Applied Sciences, 2018 Faculty of Science, University of Kelaniya, Sri Lanka, 2018) Sumithrapala, D. M. S. S.; Mallawa Arachchi, D. K.
    T20 cricket is the most popular short form of cricket and is played all over the world. Some T20 batsmen are clearly power hitters, and some have higher averages and higher strike rates. The aim is to categorize batsmen who show similar performances. In this research work, we perform a statistical analysis to categorize batsmen around the world based on their performances in the T20 matches they have played. Several factors are considered in our analysis, namely the highest score made by a batsman, average, strike rate, number of 4s scored, number of 6s scored and number of half centuries scored. Cluster analysis was used to determine the number of clusters into which the batsmen should be grouped. This study helps identify the batsmen who show the highest performances in T20 cricket. It enables one to predict the range of runs scored in a batting innings if a team is set with these batsmen, and it is also useful for ranking the batsmen. Data were collected through the Cricinfo website from 58 T20 batsmen throughout the world. The analysis identified relatively homogeneous clusters using Ward’s method of hierarchical cluster analysis in SPSS statistical software and RStudio. When considering the performances shown by the batsmen, there is not enough evidence to conclude that the batsmen who show the highest performances belong to one country or continent. When all the variables are considered together, Mohammad Shahzad, MJ Guptill, BB McCullum, TM Dilshan, and DPMD Jayawardene can be categorized as the batsmen showing the best performances in T20, while CH Gayle, MJ Guptill, BB McCullum, MEK Hussey, Najibullah Zadran, KP Pietersen, and F du Plessis have both the highest batting averages and strike rates. It can be concluded that MJ Guptill and BB McCullum are the two batsmen showing the best performances in all forms of T20 cricket. The research helps to identify how the clusters change with different factors.
  • Item
    Applicability of unsupervised learning algorithms for setting profiles for consumer buying behavior
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Paranavithana, I.R.; Rupasinghe, T.D.
    Consumer buying behaviour consists of the sum of the attitudes, preferences, intentions, and decisions of consumers. The process by which a customer buys a product or service varies for each individual and for each category of product they may purchase. With the development of information technology, products and the behaviour of purchasing them have changed drastically and become more unique to individuals. With these changes, data collection and analysis have become more dynamic, and customer data has become larger and noisier in terms of volume and complexity. As a result, handling, analysing, and interpreting customer Point of Sale (POS) data has become a challenge for Retail Supply Chains (RSC) that wish to segregate customers into specific niche markets. Furthermore, it is increasingly difficult for the retailer to find out when a person comes and buys products from their outlets and to predict his/her behaviour in subsequent purchases. As a solution to these problems faced by retailers, a novel consumer buying behaviour profile mechanism is proposed. The profiles are created with respect to frequency, time-stamp, and product category using a large POS dataset. Unsupervised learning techniques, namely K-means, Expectation Maximization, and Hierarchical Agglomerative Clustering (HAC), were used to group consumers with similar purchasing behaviour. Along with these clustering techniques, text mining techniques were used to categorize the product descriptions and create the desired product categories. The study used data from the UCI machine learning repository with 541,909 POS-type records and applied the aforementioned unsupervised learning techniques to set up the profiles. It has unveiled product-related and non-product-related characteristics for the given POS data and has laid a novel foundation for constructing profiles to determine buying behaviour. Furthermore, these profiles can be used in segmentation of consumers, in RSC-specific promotions, and to predict future possibilities to minimize inventory-related problems.
  • Item
    Categorizing T20 cricket grounds
    (Faculty of Science, University of Kelaniya, Sri Lanka, 2016) Pathirana, O.D.R.; Mallawa Arachchi, D.K.
    T20 cricket matches are played by all cricket-playing countries. There are more than 80 grounds in various countries on which these games are played. It is hypothesized that some of these grounds favor batsmen while others favor bowlers, or that some grounds are high-scoring while others are low-scoring. In this research work, we perform a statistical analysis to determine whether those grounds can be categorized based on past data. Numerous factors can be considered for the analysis. The main factors considered are the total runs scored in both innings, humidity level, gust, wind, air pressure and the temperature at the grounds when the matches are played. Cluster analysis was used to investigate and determine the number of categories. This study helps identify the behavior of T20 cricket grounds all over the world and thus enables one to predict the winning possibilities. Data were collected through the Cricinfo website from 84 cricket grounds throughout the world. Ward’s method of hierarchical cluster analysis, a major statistical method for determining relatively homogeneous clusters, was used. We found that the grounds can be grouped into 3 clusters according to the coefficients of the Ward’s linkage table. When we consider the countries in which these grounds are located, there is no evidence to conclude that grounds in specific countries belong to a particular category. For example, there are grounds in India belonging to all three categories. SPSS statistical software was used in this analysis to categorize the grounds. The research work is being carried out to identify how the clusters change with different factors.
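Several of the abstracts listed above rely on agglomerative hierarchical clustering with Ward's method, followed by an elbow-style choice of the number of clusters. The studies themselves used R and SPSS; the following is only a minimal pure-Python sketch of the general technique, not the authors' code. The function names (`ward_merge_cost`, `ward_clustering`, `elbow`) and the toy `points` are hypothetical and chosen for illustration, and the elbow rule used here (largest second difference of the within-cluster sum of squares) is one common heuristic assumed for the sketch.

```python
from itertools import combinations


def ward_merge_cost(a, b):
    """Increase in total within-cluster sum of squares if a and b merge."""
    dist2 = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
    return a["n"] * b["n"] / (a["n"] + b["n"]) * dist2


def ward_clustering(points):
    """Agglomerative clustering with Ward's criterion (naive search).

    Returns {k: (list of member-index lists, total within-cluster SS)}
    for every cluster count k from len(points) down to 1.
    """
    # Start with every observation as its own cluster.
    clusters = [{"members": [i], "n": 1, "centroid": list(p)}
                for i, p in enumerate(points)]
    total_wss = 0.0
    history = {len(clusters): ([c["members"] for c in clusters], total_wss)}
    while len(clusters) > 1:
        # Find the pair whose merger increases the within-cluster SS least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        total_wss += ward_merge_cost(a, b)
        n = a["n"] + b["n"]
        merged = {
            "members": sorted(a["members"] + b["members"]),
            "n": n,
            # Size-weighted mean of the two centroids.
            "centroid": [(a["n"] * x + b["n"] * y) / n
                         for x, y in zip(a["centroid"], b["centroid"])],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        history[len(clusters)] = ([c["members"] for c in clusters], total_wss)
    return history


def elbow(history, max_k):
    """Return the k in 2..max_k with the largest second difference of WSS."""
    wss = {k: history[k][1] for k in range(1, max_k + 2)}
    return max(
        range(2, max_k + 1),
        key=lambda k: (wss[k - 1] - wss[k]) - (wss[k] - wss[k + 1]),
    )


if __name__ == "__main__":
    # Synthetic toy data (not taken from any of the studies).
    points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
              (5.0, 5.0), (5.1, 4.8), (4.9, 5.2),
              (9.0, 1.0), (9.2, 1.1), (8.9, 0.9)]
    history = ward_clustering(points)
    k = elbow(history, max_k=5)
    print("chosen k:", k)
    print("clusters:", sorted(history[k][0]))
```

In practice the variables would normally be standardized before clustering so that no single factor dominates the distances, and a library routine such as R's `hclust` or SciPy's `linkage` would replace the naive pair search, which scales poorly with the number of observations.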