[Adapt] FW: [Academic Talk] 12.20 2PM Jiawei Han: Multi-Dimensional Analysis of Massive Text Corpora
Kenny Zhu
kzhu at cs.sjtu.edu.cn
Wed Dec 20 12:06:08 CST 2017
Pls go to this one if you are interested.
Kenny
From: Weinan Zhang [mailto:wnzhang at sjtu.edu.cn]
Sent: Wednesday, December 13, 2017 1:23 PM
To: all at cs.sjtu.edu.cn
Cc: 俞勇
Subject: [Academic Talk] 12.20 2PM Jiawei Han: Multi-Dimensional Analysis of Massive Text Corpora
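Dear faculty members,
Prof. Jiawei Han of UIUC will give a talk at SJTU next Wednesday. You and your students are welcome to attend!
Time: Wednesday, December 20, 2017, 2:00 PM to 4:00 PM
Venue: Multi-Function Hall, 光彪楼 (Guangbiao Building)
Host: Prof. 俞勇 (Yong Yu)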
Title: Multi-Dimensional Analysis of Massive Text Corpora
Speaker: Jiawei Han, Abel Bliss Professor, Department of Computer Science, University of Illinois at Urbana-Champaign
ABSTRACT
Real-world big data are largely unstructured and interconnected, in the form of natural language text. It is highly desirable to conduct multi-dimensional analysis on massive text data. However, this poses a major challenge: how to transform unstructured text data into structured text and analyze such data in multidimensional space. To facilitate such analytical functionality, we propose a textcube model and discuss how to construct such cubes from massive text corpora and how to conduct multidimensional OLAP analysis using such textcubes. In the past several years, we have developed a text mining approach that requires only distant or minimal supervision but relies on massive data. We show that (i) quality phrases can be mined from such massive text data, (ii) types can be extracted from massive text data with distant supervision, (iii) entities, attributes and values can be discovered by meta-path directed pattern discovery, (iv) faceted taxonomy can be constructed from massive corpora, (v) textcubes can be constructed from massive text, and (vi) multi-dimensional analysis can be conducted on such cubes. We show that such a paradigm represents a promising direction for turning massive text data into structured and useful knowledge.
Short bio:
Jiawei Han is Abel Bliss Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. He has been researching data mining, information network analysis, database systems, and data warehousing, with over 900 journal and conference publications. He has chaired or served on the program committees of most international data mining and database conferences. He also served as the founding Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data and as the Director of the Information Network Academic Research Center supported by the U.S. Army Research Lab (2009-2016), and has been the co-Director of KnowEnG, an NIH-funded Center of Excellence in Big Data Computing, since 2014. He is a Fellow of ACM and a Fellow of IEEE, and received the 2004 ACM SIGKDD Innovations Award, the 2005 IEEE Computer Society Technical Achievement Award, and the 2009 W. Wallace McDowell Award from the IEEE Computer Society. His co-authored book "Data Mining: Concepts and Techniques" has been widely adopted as a textbook worldwide.
Best regards,
Weinan Zhang