Topic Modeling the Hàn diăn Ancient Classics (汉典古籍)

Colin Allen; Hongliang Luo; Jaimie Murdock; Jianghuai Pu; Xiaohong Wang; Yanjie Zhai; Kun Zhao

doi:10.22148/001c.11882

Article

Abstract

There is a small but growing literature on large-scale statistical modeling of Chinese-language texts. Ouyang analyzed a corpus of over 40,000 ancient documents downloaded from multiple sources. This was used to plot the temporal distributions of word frequencies and geographic distributions of authors. Huang and Yu modeled the SongCi poetry corpus, first converting it to tonally marked pinyin to conserve poetically important pronunciation information. Nichols and colleagues reported initial modeling of the Chinese Text Project corpus in a conference paper. (Further below, we describe differences between this corpus and the Handian.) With additional collaborators, this group has now conducted two studies that are currently unpublished but under review. In the first, they apply topic models to address scholarly questions about the relationships among important texts of Ancient Chinese philosophy. In the second, they use topic modeling to investigate the concepts of mind and body in ancient Chinese philosophy. Although we share similar scholarly objectives with these researchers, our approach in this paper is unique in that for the first time anywhere we bring the benefits of computational modeling of ancient Chinese texts to a robust public platform that is mirrored on both sides of the Pacific. Besides being just a useful portal to the texts, our approach foregrounds the interpretive issues surrounding topic models, and makes more sophisticated exploration and analysis of interpretive questions possible for experts and novices alike.

How to Cite:

Allen, C., Luo, H., Murdock, J., Pu, J., Wang, X., Zhai, Y. & Zhao, K., (2017) “Topic Modeling the Hàn diăn Ancient Classics (汉典古籍)”, Journal of Cultural Analytics 2(1). https://doi.org/10.22148/001c.11882

Issue

Volume 2 • Issue 1 • 2021 • Articles in 2017

Information

Published on 12 October 2017
Peer Reviewed
Licence Creative Commons Attribution 4.0

File Checksums

(MD5)

XML: 308b3bbf2dbe10c91ab30982fa474415
PDF: d0d34e0a81528ce04e2efd19027a32bc
