A Data Driven Semantic Network for Text Understanding

Probase is a data-driven semantic network that consists of millions of fine-grained concepts and their relationships. One of the goals of Probase is to enable generalization in natural language processing. One important application we have built using Probase is short text analysis (a.k.a. deep query understanding). Using the knowledge in Probase, we segment a short text, build its dependency tree, and annotate its terms. This enables us to understand the intent of keyword-based queries.
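As a rough illustration of the segmentation and annotation steps mentioned above, here is a minimal sketch. It is not the Probase implementation: the tiny isA table, its counts, and the function names segment and annotate are all invented for this example, and the dependency-tree step is left out entirely.

```python
# Toy isA table: term -> {concept: co-occurrence count}.
# Every entry and count here is made up for illustration only.
ISA = {
    "apple": {"fruit": 60, "company": 40},
    "ipad": {"device": 90, "product": 10},
    "new york": {"city": 95, "state": 5},
    "hotel": {"accommodation": 80, "building": 20},
}


def segment(text, lexicon, max_len=3):
    """Greedy longest-match segmentation of a short text into terms."""
    words = text.lower().split()
    terms, i = [], 0
    while i < len(words):
        # Try the longest candidate first; fall back to a single word.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in lexicon or n == 1:
                terms.append(candidate)
                i += n
                break
    return terms


def annotate(terms, isa):
    """Attach P(concept | term) to each term; unknown terms get {}."""
    result = []
    for term in terms:
        counts = isa.get(term, {})
        total = sum(counts.values())
        probs = {c: n / total for c, n in counts.items()} if total else {}
        result.append((term, probs))
    return result


if __name__ == "__main__":
    for term, concepts in annotate(segment("cheap hotel new york", ISA), ISA):
        print(term, concepts)
```

Greedy longest match is only one simple way to segment. It is used here because it keeps a multi-word term such as "new york" together without any statistical model.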

Below is a comprehensive list of Probase-related publications. More (though slightly outdated) information can be found here.

Talks

  1. Inferencing in Information Extraction: Techniques and Applications, ICDE 2015 Tutorial.
  2. Knowledge Base for Text Understanding, Haixun Wang, Dec 2014.
  3. Learning Knowledge Bases for Text and Multimedia, Lexing Xie and Haixun Wang, Tutorial at ACM Multimedia, Nov 2014.
  4. Probase: A Review, Haixun Wang, Feb 2014.
  5. Short Text Understanding (invited talk), by Haixun Wang, in AKBC (Automated Knowledge Base Construction), 2013, San Francisco, USA.
  6. Understanding Short Texts (keynote), by Haixun Wang, in APWeb, 2013, Sydney, Australia.

Under Submission

  1. An Inference Approach to Basic Level of Categorization, by Zhongyuan Wang and Haixun Wang, Under Submission, 2015.
  2. On the Transitivity of isA Relations in Data-Driven Semantic Networks, by Jiaqing Liang, Haixun Wang, and Yanghua Xiao, Under Submission, 2015.
  3. Fine-grained Semantic Typing of FrameNet, by Seung-won Hwang and Haixun Wang, Under Submission, 2015.
  4. Probase+: A Comprehensive Conceptual Taxonomy, by Jiaqing Liang, Yanghua Xiao, and Haixun Wang, Under Submission, 2015.

2015

  1. Learning Term Embeddings for Hypernymy Identification, by Yu Zheng, Haixun Wang, Xuemin Lin, and Min Wang, IJCAI 2015.
  2. Query Understanding through Knowledge-Based Conceptualization, by Zhongyuan Wang and Haixun Wang, IJCAI 2015.
  3. On Conceptual Labeling of a Bag of Words, by Xiangyan Sun, Haixun Wang, and Yanghua Xiao, IJCAI 2015.
  4. Open Domain Short Text Conceptualization: A Generative + Descriptive Modeling Approach, by Yangqiu Song, Shusen Wang, and Haixun Wang, IJCAI 2015.
  5. Short Text Understanding Through Lexical-Semantic Analysis (Best Paper Award), by Wen Hua, Zhongyuan Wang, Haixun Wang, and Xiaofang Zhou, ICDE 2015.
  6. Automatic Taxonomy Construction from Keywords via Scalable Bayesian Rose Trees, by Xueqing Liu, Yangqiu Song, Shixia Liu, and Haixun Wang, TKDE, 2015.

2014

  1. Transfer Understanding from Head Queries to Tail Queries, by Yangqiu Song, Haixun Wang, Weizhu Chen, Shusen Wang, in CIKM, 2014, Shanghai, China.
  2. Concept-based Short Text Classification and Ranking, by Zhongyuan Wang, Fang Wang, Ji-Rong Wen, and Zhoujun Li, in CIKM, 2014, Shanghai, China.
  3. Overcoming Semantic Drift in Information Extraction, by Zhixu Li, Hongsong Li, Haixun Wang, Yi Yang, Xiangliang Zhang, and Xiaofang Zhou, in EDBT, 2014, Athens, Greece.
  4. Data Driven Metaphor Recognition and Explanation, by Hongsong Li, Kenny Zhu, and Haixun Wang, in TACL, 2014.
  5. Head, Modifier, and Constraint Detection in Short Texts, by Zhongyuan Wang, Haixun Wang, and Zhirui Hu, in ICDE, 2014, Chicago, USA.
  6. Semantic Multidimensional Scaling for Open-Domain Sentiment Analysis, by Erik Cambria, Yangqiu Song, Haixun Wang, and Newton Howard, in IEEE Intelligent Systems, 2014.

2013

  1. Computing term similarity by large probabilistic isA knowledge, by Pei-Pei Li, Haixun Wang, Kenny Zhu, Zhongyuan Wang, and Xindong Wu, in CIKM, 2013, San Francisco, USA.
  2. Assessing sparse information extraction using semantic contexts, by Pei-Pei Li, Haixun Wang, Hongsong Li, and Xindong Wu, in CIKM, 2013, San Francisco, USA.
  3. Attribute extraction and scoring: A probabilistic approach, by Taesung Lee, Zhongyuan Wang, Haixun Wang, and Seung-won Hwang, in ICDE, 2013, Brisbane, Australia.
  4. Automatic extraction of top-k lists from the web, by Zhixian Zhang, Kenny Zhu, Haixun Wang, and Hongsong Li, in ICDE, 2013, Brisbane, Australia.
  5. Shallow Information Extraction for the Knowledge Web (Tutorial), by Denilson Barbosa, Haixun Wang, and Cong Yu, in ICDE, 2013, Brisbane, Australia.
  6. Context-Dependent Conceptualization, by Dongwoo Kim, Haixun Wang, and Alice H. Oh, in IJCAI, 2013, Beijing, China.
  7. Identifying Users' Topical Tasks in Web Search, by Wen Hua, Yangqiu Song, Haixun Wang, and Xiaofang Zhou, in WSDM, 2013, Rome, Italy.
  8. Semantic multi-dimensional scaling for open-domain sentiment analysis, by Erik Cambria, Yangqiu Song, Haixun Wang, and Newton Howard, in IEEE Intelligent Systems, 2013.


2012

  1. A System for Extracting Top-K Lists from the Web (demo), by Zhixian Zhang, Kenny Zhu, and Haixun Wang, in SIGKDD, 2012, Beijing, China.
  2. Automatic Taxonomy Construction from Keywords, by Xueqing Liu, Yangqiu Song, Shixia Liu, and Haixun Wang, in SIGKDD, 2012, Beijing, China.
  3. Probase: A Probabilistic Taxonomy for Text Understanding, by Wentao Wu, Hongsong Li, Haixun Wang, and Kenny Zhu, in ACM International Conference on Management of Data (SIGMOD), 2012, Arizona, USA.
  4. Optimizing Index for Taxonomy Keyword Search, by Bolin Ding, Haixun Wang, Ruomin Jin, Jiawei Han, and Zhongyuan Wang, in ACM International Conference on Management of Data (SIGMOD), 2012, Arizona, USA.

2011

  1. Web Scale Taxonomy Cleansing, by Taesung Lee, Zhongyuan Wang, Haixun Wang, and Seung-won Hwang, in 37th International Conference on Very Large Data Bases (VLDB), 2011.
  2. Isanette: A common and common sense knowledge base for opinion mining, by Erik Cambria, Yangqiu Song, Haixun Wang, and A. Hussain, in ICDM, 2011, Vancouver, Canada.
  3. Short Text Conceptualization using a Probabilistic Knowledgebase, by Yangqiu Song, Haixun Wang, Zhongyuan Wang, and Hongsong Li, in the 22nd International Joint Conference on Artificial Intelligence (IJCAI), 2011, Barcelona, Spain.