The Academic Events Group, 9th World Conference on Educational Sciences

Semantic Analysis of Implied Meaning of Thai Words by Using Co-occurrence Analysis technique
Chalermpol Tapsai

Last modified: 2017-03-31

Abstract


Enabling computers to understand human language, so that non-technical users can operate and command a computer in their own language without extra training, is one of the most interesting research topics and has been studied extensively up to the present. Many techniques have been implemented to this end, and the most widely used are Natural Language Processing and Text Analysis. The Thai language, like many other languages, has many non-common words (NCWs) that carry two types of meaning: direct meaning and implied meaning. For implied meaning, semantic processing needs to analyze the context surrounding these NCWs, such as the presence of co-occurrence keywords (specific neighboring keywords) or the structure of the sentence, to determine the relevant meaning. This research aims to determine how to analyze the meaning of the 18 NCWs most frequently used in the Thai language, using a Text Analysis technique to identify the co-occurrence keywords that appear near each NCW and the window size suitable for co-occurrence analysis, such that the relevant meaning can be inferred with more than 90% accuracy. The data used for this research were text files collected from various types of online media, including news, articles, and chat rooms of popular websites. Of a total of 1800 files, each NCW was analyzed using 100 files, which contained the NCW with both direct and implied meanings. The results showed that the number of co-occurrence keywords that can be used to determine the relevant meaning for each NCW varies from 1 to 5 words, and that the appropriate window size for analysis is between 1 and 8, while 16 of the 18 NCWs have an appropriate window size of less than five.
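As a rough illustration of the window-based co-occurrence check described in the abstract, the following minimal Python sketch tests whether any keyword appears within a fixed number of tokens on either side of a target word. It is not the author's implementation: the function name `has_cooccurring_keyword`, the symmetric window, and the English placeholder tokens are all assumptions made for illustration. Thai text would first need word segmentation, which is assumed to have been done already.

```python
from typing import Iterable, List, Sequence


def has_cooccurring_keyword(
    tokens: Sequence[str],
    target: str,
    keywords: Iterable[str],
    window: int,
) -> List[bool]:
    """For each occurrence of `target` in `tokens`, report whether any
    keyword lies within `window` tokens to its left or right.

    Hypothetical helper: the symmetric window is an assumption, not a
    detail confirmed by the abstract.
    """
    token_list = list(tokens)
    keyword_set = set(keywords)
    hits = []
    for i, tok in enumerate(token_list):
        if tok != target:
            continue
        lo = max(0, i - window)                    # left edge of the window
        hi = min(len(token_list), i + window + 1)  # right edge (exclusive)
        # Neighbours are the window contents without the target itself.
        neighbours = token_list[lo:i] + token_list[i + 1:hi]
        hits.append(any(n in keyword_set for n in neighbours))
    return hits


# Placeholder English tokens standing in for a segmented Thai sentence.
sentence = ["he", "is", "a", "snake", "in", "the", "grass"]
wide = has_cooccurring_keyword(sentence, "snake", {"grass"}, window=3)
narrow = has_cooccurring_keyword(sentence, "snake", {"grass"}, window=2)
```

In this toy sentence the keyword sits three tokens from the target, so it is found with a window of 3 but missed with a window of 2, which is the kind of sensitivity to window size that the study measures per NCW. How a keyword match maps to direct or implied meaning depends on the specific NCW and is not modeled here.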
