About SWRC:Research Topics

Revision as of 11:29, 26 April 2011 by Hudoni


Research at the center falls into three main areas: natural language processing, ontology, and the Semantic Web.

The ultimate objective of natural language processing research is to develop a computer program that can speak, listen, read, and write like a human being. Its areas of application include machine translation, the automatic translation of written text into a different language, and information retrieval, the automatic retrieval of content sought by a user.

The Semantic Web is a technology that enables machines to process the interrelationships between various resources (web documents, services, etc.) within a decentralized environment such as the internet. Currently, only humans can use the information scattered across the World Wide Web, by collecting and analyzing it personally; the ultimate objective of Semantic Web research is to automate this entire process.

Ontology refers to the expression of information about various resources and their interrelationships in a form that machines can recognize, in an environment such as the Semantic Web described above. Expressing information in this manner enables different computers to share the same information.

Natural Language Processing

Example of a parse tree

Natural language processing is research that aims to automate everyday language activities in all their forms. The ultimate objective of this research is to develop a computer program that can read, write, listen, and speak like a human being.

Natural language processing requires a series of analytic steps applied to the language we use. Morphological analysis divides the words in a sentence into morphemes, the smallest linguistic units that carry meaning. Part-of-speech tagging assigns each word to one of a set of predefined categories (parts of speech) such as noun, verb, and adjective. Because a word can act as different parts of speech depending on how it is used in a sentence, simply pairing each word with a single fixed part of speech does not solve the problem; the surrounding context must also be taken into account.
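The context dependence of part-of-speech tagging can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny `LEXICON`, the tag names, and the two hand-written rules are hypothetical and are not the method used at the center. Practical taggers are usually statistical and trained on annotated corpora.

```python
# Hypothetical toy lexicon: each word maps to every tag it may take.
LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "i": ["PRON"],
    "we": ["PRON"],
    "book": ["NOUN", "VERB"],  # ambiguous: "the book" vs. "book a flight"
    "flight": ["NOUN"],
    "read": ["VERB"],
}


def tag(sentence):
    """Tag each word, using the previous tag to resolve ambiguous words."""
    tagged = []
    prev = None  # tag chosen for the previous word (None at sentence start)
    for word in sentence.lower().split():
        # Assumption: words missing from the lexicon are treated as nouns.
        candidates = LEXICON.get(word, ["NOUN"])
        if len(candidates) == 1:
            choice = candidates[0]
        elif prev == "DET" and "NOUN" in candidates:
            # After a determiner, prefer the noun reading.
            choice = "NOUN"
        elif prev in (None, "PRON") and "VERB" in candidates:
            # Sentence-initially or after a pronoun, prefer the verb reading.
            choice = "VERB"
        else:
            choice = candidates[0]
        tagged.append((word, choice))
        prev = choice
    return tagged


print(tag("I read the book"))
print(tag("Book a flight"))
```

The word "book" receives a different tag in each sentence, which a plain dictionary lookup could not achieve.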

Parsing finds the grammatical structure of a given sentence by analyzing it. Parsing yields a large amount of information about the sentence, including its grammatical relations, and serves as the basis for most programs that apply natural language processing.

Natural language processing is applied in many different areas. Machine translation is the automatic translation of written text into a different language. Information retrieval automatically retrieves the information a user wishes to find from a variety of texts. Text categorization finds and classifies the subject of a given text, and clustering groups together texts or data with similar contents. These applications can be of great help in everyday life.
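As a rough illustration of parsing, the sketch below implements a recursive-descent parser for a deliberately tiny, hypothetical grammar (S → NP VP, NP → DET NOUN | PRON, VP → VERB NP | VERB). The grammar, the tag names, and the function names are assumptions chosen for this example; the input is assumed to be already tagged with parts of speech.

```python
def parse_np(tokens, i):
    """NP -> PRON | DET NOUN. Returns (tree, next_index) or None."""
    if i < len(tokens) and tokens[i][1] == "PRON":
        return ("NP", tokens[i]), i + 1
    if (i + 1 < len(tokens)
            and tokens[i][1] == "DET"
            and tokens[i + 1][1] == "NOUN"):
        return ("NP", tokens[i], tokens[i + 1]), i + 2
    return None


def parse_vp(tokens, i):
    """VP -> VERB NP | VERB. Returns (tree, next_index) or None."""
    if i < len(tokens) and tokens[i][1] == "VERB":
        result = parse_np(tokens, i + 1)
        if result is not None:
            np_tree, j = result
            return ("VP", tokens[i], np_tree), j
        return ("VP", tokens[i]), i + 1
    return None


def parse_sentence(tokens):
    """S -> NP VP. Returns a nested-tuple parse tree, or None on failure."""
    result = parse_np(tokens, 0)
    if result is None:
        return None
    np_tree, i = result
    result = parse_vp(tokens, i)
    if result is None:
        return None
    vp_tree, j = result
    if j != len(tokens):  # leftover words mean the sentence did not fit
        return None
    return ("S", np_tree, vp_tree)


tokens = [("the", "DET"), ("man", "NOUN"), ("reads", "VERB"),
          ("a", "DET"), ("book", "NOUN")]
print(parse_sentence(tokens))
```

The nested tuples correspond to a parse tree: the root `S` has a noun-phrase child and a verb-phrase child, and the verb phrase in turn contains the object noun phrase.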

Ontology

Example visualization of an ontology

In its original, philosophical meaning, ontology refers to the study of the essence of everything that exists. In computer science, however, an ontology is defined as “an explicit and formal specification of a shared conceptualization.” In this definition, the word “shared” indicates that the concepts within the ontology are based on knowledge that all participating computers agree upon. In simpler terms, an ontology is the explicit expression of some knowledge or information in a predefined format, so that several computers can share it.

An ontology is composed of concepts and the relationships between them. A concept is a fundamental unit of meaning; for example, ‘a man’ and ‘a building’ are concepts. A relationship is a link formed between concepts; for example, ‘a man lives in a building’ can be expressed as the relationship ‘to live’ between the concepts ‘a man’ and ‘a building’.
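The man-and-building example can be written down as data. The following is a minimal sketch in plain Python, not an implementation of any ontology language; the `Ontology` class, the `Triple` type, and the relation name `livesIn` are hypothetical names introduced for illustration.

```python
from collections import namedtuple

# One relationship between two concepts: (subject, relation, object).
Triple = namedtuple("Triple", ["subject", "relation", "object"])


class Ontology:
    """A toy ontology: a set of concepts plus relationships between them."""

    def __init__(self):
        self.concepts = set()
        self.triples = set()

    def add_relation(self, subject, relation, obj):
        # Both ends of a relationship are registered as concepts.
        self.concepts.update([subject, obj])
        self.triples.add(Triple(subject, relation, obj))

    def related(self, subject, relation):
        """Return the concepts linked to `subject` by `relation`, sorted."""
        return sorted(
            t.object
            for t in self.triples
            if t.subject == subject and t.relation == relation
        )


onto = Ontology()
onto.add_relation("Man", "livesIn", "Building")  # 'a man lives in a building'
print(onto.related("Man", "livesIn"))
```

Because the relationship is stored explicitly rather than buried in a sentence, any program that agrees on the names `Man`, `livesIn`, and `Building` can read and reuse it.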

Ontology is becoming increasingly important in areas that require knowledge sharing, such as the Semantic Web, and is being researched from many angles. The main research topics include automatic ontology expansion, ontology quality evaluation, and ontology merging. Its prominence is expected to keep growing.

Semantic Web

Semantic Web Architecture

First proposed by Tim Berners-Lee in 1989, the World Wide Web greatly improved the average person’s access to diverse types of information. However, problems arose as the amount of accessible information grew: web users had to filter out huge amounts of irrelevant information to find what they needed. The cause is that most of today’s web pages are formatted so that human users can grasp them easily. Content laid out for human readers is hard for a program to interpret, so automatically filtering out useless information and extracting only the crucial information has become increasingly difficult. The Semantic Web was introduced to solve this problem.

Tim Berners-Lee suggests that the Semantic Web is not an entirely new concept, but a technology that extends the existing web so that machines can also understand its contents to a certain degree. The World Wide Web Consortium (W3C) defines the Semantic Web as “data in the World Wide Web written abstractly in a standard form.”


The ultimate objective of Semantic Web research is to develop standards and technologies that help computers understand the information on the web, in order to automate operations such as data collection, semantic search, and exploration. More specifically, the goals are to:

  • Provide more precise search results
  • Combine and compare information from various sources
  • Unify diverse resources using information about their meanings
  • Publish accurate information on the web for automated web services
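The goal of combining information from various sources can be sketched in a few lines. This is a minimal illustration in plain Python under stated assumptions: it does not use RDF or any Semantic Web library, and the identifiers (`ex:alice`, `ex:lab1`, `ex:cityX`), the relation names, and the helper functions are all hypothetical.

```python
# Two hypothetical sources that describe the same resource, ex:lab1.
source_a = {("ex:alice", "ex:worksAt", "ex:lab1")}
source_b = {("ex:lab1", "ex:locatedIn", "ex:cityX")}


def merge_sources(*sources):
    """Combine statements from several sources into one set."""
    return set().union(*sources)


def query(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def workplace_locations(triples, person):
    """Follow worksAt, then locatedIn, to find where a person works."""
    places = [o for (_, _, o) in query(triples, s=person, p="ex:worksAt")]
    return sorted(
        o
        for place in places
        for (_, _, o) in query(triples, s=place, p="ex:locatedIn")
    )


merged = merge_sources(source_a, source_b)
print(workplace_locations(merged, "ex:alice"))
```

Neither source alone can answer the question; it becomes answerable only because both sources use the same identifier for the shared resource, which is the kind of agreement a standard form is meant to provide.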