Combining neural networks and pattern matching for ontology mining - a meta learning inspired approach

Dmitri Roussinov, Nadezhda Puchnina

Research output: Contribution to conference › Paper

Abstract

Several applications dealing with natural language text involve automated validation of membership in a given category (e.g. France is a country; Gladiator is a movie, but not a country). Meta-learning is a recent and powerful machine learning approach whose goal is to train a model (or a family of models) on a variety of learning tasks so that it can solve new learning tasks more efficiently, e.g. with a smaller number of training samples or in less time. We present an original approach, inspired by meta-learning, that consists of two tiers of models: for any arbitrary category, our general model supplies high-confidence training instances (seeds) for our category-specific models. Our general model is based on pattern matching and is optimized for precision at top N, while its recall is not important. Our category-specific models are based on recurrent neural networks (RNNs), which have recently proved extremely effective in several natural language applications, such as machine translation, sentiment analysis, parsing, and chatbots. Following the meta-learning principles, we train our highest-level (general) model in such a way that our second-tier category-specific models (which depend on it) are optimized for the best possible performance in a specific application. This work is important because our approach can verify membership in an arbitrary category defined by a sequence of words, including longer and more complex categories such as Ridley Scott movie or city in southern Germany, which are currently not supported by existing manually created ontologies (such as Freebase, WordNet or Wikidata). Moreover, our approach uses only raw text and can therefore be useful when no such ontologies are available, which is a common situation for languages other than English.
Even the largest English ontologies are known to have low coverage, insufficient for many practical applications such as automated question answering, which we use here to illustrate the advantages of our approach. We rigorously test it on a larger set of questions than previous studies and demonstrate that, when coupled with a simple answer-scoring mechanism, our meta-learning-inspired approach 1) provides up to 50% improvement over prior approaches that do not use any manually curated knowledge bases and 2) achieves state-of-the-art performance among all current approaches, including those that take advantage of such knowledge bases.
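The first-tier (general) model described in the abstract can be illustrated with a minimal sketch. The pattern templates, the single-capitalized-token candidate regex, and the counting-based ranking below are illustrative assumptions, not the authors' actual implementation; the sketch only shows how high-precision "is-a" patterns over raw text can harvest top-N seed instances for an arbitrary category:

```python
import re

# Illustrative, precision-oriented "is-a" pattern templates (hypothetical,
# not the paper's actual patterns). {cand} marks the candidate-instance slot
# and {cat} the category slot.
SEED_PATTERNS = [
    "{cand} is a {cat}",
    "{cat}s such as {cand}",
]

# Simplifying assumption: a candidate is a single capitalized token.
CAND_REGEX = r"([A-Z]\w*)"

def harvest_seeds(corpus, category, top_n=5):
    """Rank candidates by how many pattern occurrences they match in the
    raw-text corpus and keep only the top N (the first tier is tuned for
    precision at top N; recall does not matter)."""
    counts = {}
    for sentence in corpus:
        for template in SEED_PATTERNS:
            # Turn the template into a regex with a capture group for the
            # candidate and the literal category string filled in.
            regex = (re.escape(template)
                     .replace(re.escape("{cand}"), CAND_REGEX)
                     .replace(re.escape("{cat}"), re.escape(category)))
            for match in re.finditer(regex, sentence, re.IGNORECASE):
                cand = match.group(1)
                counts[cand] = counts.get(cand, 0) + 1
    return [c for c, _ in sorted(counts.items(), key=lambda kv: -kv[1])[:top_n]]

corpus = [
    "France is a country in Europe.",
    "Gladiator is a movie directed by Ridley Scott.",
    "Movies such as Gladiator and Alien were box-office hits.",
]
print(harvest_seeds(corpus, "country"))  # ['France']
print(harvest_seeds(corpus, "movie"))    # ['Gladiator']
```

In the paper's pipeline, seeds harvested this way would then serve as high-confidence positive training examples for the second-tier, category-specific RNN classifiers.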

Conference

Conference: The 13th IEEE International Conference On Semantic Computing
Country: United States
City: Newport Beach, California
Period: 30/01/19 → 1/02/19
Internet address: https://www.ieee-icsc.org/


Keywords

  • natural language text
  • meta learning
  • information retrieval
  • semantic computing

Cite this

Roussinov, D., & Puchnina, N. (2019). Combining neural networks and pattern matching for ontology mining - a meta learning inspired approach. Paper presented at The 13th IEEE International Conference On Semantic Computing, Newport Beach, California, United States.