The Database Society of Japan

dbjapan Mailing List Archive (2011)

[dbjapan] Lecture by Prof. Sharma Chakravarthy (March 17)


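Dear members of the Database Society of Japan,

This is Yokota from Tokyo Institute of Technology.

I would like to announce the following ACM SIGMOD Japan Chapter lecture.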
http://www.sigmodj.org/Events/kouenkai20110317.html

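Speaker: Prof. Sharma Chakravarthy (The University of Texas at Arlington)
Title: InfoSift: Adapting Graph Mining Techniques for Document Classification
Date and time: Thursday, March 17, 3:00 p.m. to 4:00 p.m.
Venue: Tokyo Institute of Technology, Ookayama Campus, West Building 8E, 10th floor, Conference Room 1004
       (2-12-1 Ookayama, Meguro-ku, Tokyo)
       The building numbered 26 on the map below (Ookayama East, West, and South areas)
       http://www.titech.ac.jp/about/campus/o_map.html?id=03
Admission: Free
Registration: Please register via the web.
       A registration form is available at
       http://www.sigmodj.org/Event-reg/form-20110317.html

       Because the venue has limited capacity, registration will close
       once that capacity is reached. Thank you for your understanding.

       A confirmation email will be sent for each registration. If you do not
       receive one, please contact helpdesk [at] sigmodj.org.

                            Haruo Yokota, Chair, ACM SIGMOD Japan Chapter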

-----------------------------------------------------------------

Title:
 InfoSift: Adapting Graph Mining Techniques for Document Classification

Abstract: 
I will briefly describe ongoing projects at the IT Lab before the main
presentation.

Text classification is the problem of assigning pre-defined class labels to
incoming, unclassified documents. The class labels are defined based on a
sample of pre-classified documents, which are used as a training corpus. A
number of machine learning, probabilistic, and information retrieval based
approaches have been proposed for text classification.

This talk proposes a novel graph-based mining approach for document
classification. Our approach is based on the premise that representative --
common and recurring -- structures or patterns can be extracted from a
pre-classified document class and the same can be used effectively for
classifying incoming documents. To the best of our knowledge, there is no
existing work in the area of text, email or web page classification based on
pattern inference and the utilization of the learned patterns for
classification. A number of factors that influence representative structure
extraction and classification are analyzed conceptually and validated
experimentally. In our approach, the notion of inexact graph match is
leveraged for deriving structures that provide coverage for characterizing the
contents of a document class. The results of our approach are compared with
the Naïve Bayes approach. We discuss both single and multi-folder classification
for emails.
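
As a rough illustration of the premise above (representative structures
extracted from a pre-classified class can be used to classify incoming
documents), here is a minimal Python sketch. It is not the InfoSift algorithm:
it assumes each document is reduced to a graph whose edges are adjacent word
pairs, treats an edge as "representative" if it occurs in enough documents of
a class, and scores a new document by exact edge overlap rather than inexact
graph match. The function names, the min_support parameter, and the sample
folders are all hypothetical.

```python
from collections import Counter


def doc_edges(text):
    """Return the set of directed edges (word, next_word) for one document."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def representative_edges(docs, min_support=0.5):
    """Edges that occur in at least a min_support fraction of a class's documents."""
    counts = Counter(edge for doc in docs for edge in doc_edges(doc))
    threshold = min_support * len(docs)
    return {edge for edge, n in counts.items() if n >= threshold}


def classify(text, class_patterns):
    """Pick the class whose representative edges are best covered by the text."""
    edges = doc_edges(text)

    def score(label):
        patterns = class_patterns[label]
        return len(edges & patterns) / len(patterns) if patterns else 0.0

    return max(class_patterns, key=score)


# Hypothetical pre-classified folders (assumption, for illustration only).
training = {
    "meetings": ["project meeting at noon", "project meeting moved to friday"],
    "travel": ["flight booking confirmed", "your flight booking changed"],
}
patterns = {
    label: representative_edges(docs, min_support=1.0)
    for label, docs in training.items()
}
label = classify("reminder: project meeting tomorrow", patterns)
```

Here the only edge shared by every "meetings" document is
("project", "meeting"), so the incoming message is assigned to that folder.
A real system would need larger substructures and a tolerant match, which
this sketch leaves out.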

This is joint work with my students Manu Aery and Aravind Venkatachalam.


Short Biography:
Sharma Chakravarthy is a Professor in the Computer Science and Engineering
Department at The University of Texas at Arlington, Texas. He established the
Information Technology Laboratory at UT Arlington in Jan 2000 and currently
heads it. He also established the NSF-funded Distributed and Parallel
Computing Cluster at UT Arlington in 2003. He is the recipient of the
university-level “Creative Outstanding Researcher” award for 2003 and the
department-level senior outstanding researcher award in 2002. His book,
Stream Data Processing: A Quality of Service Perspective, was published by
Springer in 2009.

He is well known for his work on stream and event processing, semantic query
optimization, multiple query optimization, active databases (HiPAC project at
CCA and Sentinel project at the University of Florida, Gainesville), and more
recently web database ranking, graph mining, and identification of experts.

His current research includes web technologies, stream data processing,
complex event processing, mining & knowledge discovery, and information
integration. He has published over 150 papers in refereed international
journals and conference proceedings. He has given tutorials on a number of
database topics, such as stream processing, graph mining, database mining,
active, real-time, distributed, object-oriented, and heterogeneous databases
in North America, Europe, and Asia. He is an associate editor of TKDE. He is
listed in Who's Who Among South Asian Americans and Who's Who Among America's
Teachers.

Prior to joining UTA, he was with the University of Florida, Gainesville.
Prior to that, he worked as a Computer Scientist at the Computer Corporation
of America (CCA) and as a Member, Technical Staff at Xerox Advanced
Information Technology, Cambridge, MA.

Sharma Chakravarthy received the B.E. degree in Electrical Engineering from the
Indian Institute of Science, Bangalore and M.Tech from IIT Bombay, India. He
worked at TIFR (Tata Institute of Fundamental Research), Bombay, India for a
few years. He received M.S. and Ph.D. degrees from the University of Maryland,
College Park, in 1981 and 1985, respectively.


−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
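                     Haruo Yokota    2-12-1 Ookayama, Meguro-ku, Tokyo 152-8552
                     Department of Computer Science, Graduate School of
                     Information Science and Engineering, Tokyo Institute of Technology
                     TEL: (03) 5734-3505 (direct), FAX: (03) 5734-3504
                                         email: yokota [at] cs.titech.ac.jp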
−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−