Talk:Chinese characters

From Wikipedia, the free encyclopedia

IT section

@Ctxz2323, hey, thank you for writing in this area! One of the last things I was planning on doing for this article was properly writing a section about characters in computing, input methods etc. and putting it into the "methods and styles" section, alongside discussions of calligraphy, ordinary handwriting etc. If you don't mind, I'm probably going to reintegrate your passage as such?
One thing to keep in mind is the article is already very long, so it will likely have to be a bit shorter, but it is a subtopic that deserves a treatment in this article. If you're fine with that, lmk so I don't get in your way while you're editing! cheers Remsense 05:21, 28 November 2023 (UTC)[reply]

Yes, the article is already very long. But Chinese character IT is an important subtopic and will help people see that we are closely following the latest development of language computing at a glance of the table of contents.
So, if you don't mind, I prefer it to remain a first-level section while trying to cut it shorter, much shorter? Ctxz2323 (talk) 09:36, 28 November 2023 (UTC)[reply]
How do you feel about its present placement? The higher level is meant to be about the various methods of writing and transmitting text written in Chinese characters, so I feel it fits. Remsense 09:40, 28 November 2023 (UTC)[reply]
Well done!
and thanks. Ctxz2323 (talk) 12:32, 28 November 2023 (UTC)[reply]
Doesn't "Computer encoding and interchange" belong in "Use with computers" rather than "Lexicography"? Kanguole 08:58, 13 February 2024 (UTC)[reply]
Yes, "Use with computers" is a more appropriate location. Ctxz2323 (talk) 13:50, 13 February 2024 (UTC)[reply]
@Kanguole @Ctxz2323 I see it presently like this, please tell me if I'm missing something:
  • Methods and styles covers the ways people directly create characters (i.e. printing, handwriting, typing), not necessarily which characters they may use or the underlying inventory. Thus, the relevant material covers how people input characters on computers.
  • Lexicography covers the ways characters are collated, organized, sorted, etc. An encoding seems more akin to a dictionary or other large character set, to me. I've also put this section next to the related section Standardization, which has additional overlap with history.
All of these categories and themes overlap with each other, and I'm open as always if people think I'm wrong here. Remsense 22:28, 13 February 2024 (UTC)[reply]
Then perhaps "Use with computers" belongs outside "Methods and styles", which could then just be "Styles". Kanguole 22:40, 13 February 2024 (UTC)[reply]
But then I worry about the over-conflation with the overlapping notion of "scripts", which is the anchor of the History section. This is difficult. Remsense 22:46, 13 February 2024 (UTC)[reply]
OK, next suggestion is to put "Printing" and "Use with computers" in the "History" section and "Calligraphy" in the "Structure" section, and drop "Methods and styles". Kanguole 22:53, 13 February 2024 (UTC)[reply]
I think that may be a good idea. I'll see what I can do. Remsense 00:16, 14 February 2024 (UTC)[reply]
I still prefer Kanguole's original idea: simply move "Computer encoding and interchange" into "Use with computers". It is more a matter of computer encoding than of lexicography, just as we would more likely put an article on Unicode in the domain of IT or computers than in dictionaries. Ctxz2323 (talk) 03:44, 14 February 2024 (UTC)[reply]
I'll do this for now, then. Remsense 06:17, 14 February 2024 (UTC)[reply]

Statistical "simplification" analysis due?

I suppose the question isn't just for this article, and I know single papers that say all sorts of things come out, but this paper[1] made me a bit self-conscious about how I've currently written the relevant part of this article and others. Any more seasoned analysis from others about its conclusions?

Basically, it rejects the notion that the writing system has more or less steadily simplified visually since the oracle bone form, and instead views seal script as the historical peak of visual complexity. A weak incentive trade-off between visual complexity and distinctiveness is postulated, but traditional regular script is overall both more visually complex and more distinctive according to their model. Of course, when Qiu et al say "simple" they don't just mean the pure graphical sense.

References

  1. ^ Han, Simon J.; Kelly, Piers; Winters, James; Kemp, Charles (2022-12-02). "Simplification Is Not Dominant in the Evolution of Chinese Characters". Open Mind. 6: 264–279. doi:10.1162/opmi_a_00064. ISSN 2470-2986. PMC 9987343. PMID 36891037.

Remsense 09:22, 10 March 2024 (UTC)[reply]

Change the age of the oldest oracle bone?

Hi @Remsense, I think we can change the date for the oldest oracle bone. The 2021 radiocarbon dating of 26 Wu Ding bone samples gives the age of the oldest measured bone to c. 1254-1221 BCE. That's quite a distance from the 1200 BCE given by David Keightley. So I'm asking your opinion on this. Strongman13072007 (talk) 13:43, 17 March 2024 (UTC)[reply]

I did see that, and I put it on my mental checklist for this article. I just didn't want to forget about it technically being unsourced in the article while I was working on other stuff. I agree, I think the radiocarbon cite can be attached to a 1250 BCE start date. Remsense 13:46, 17 March 2024 (UTC)[reply]
We should not rely on a single research paper (WP:PRIMARY). Note also that date range is stated to be a 68% range for that bone, and they have an even earlier one in Table 4: 1307–1228 for Heji 34240 from the Li type-1 diviner group (usually assigned to the later part of Wu Ding's reign).
Ken-ichi Takashima gives a date of 1230. The Xia–Shang–Zhou Chronology Project dated Wu Ding's reign at 1250–1192, though the first oracle bones may have been produced some time after the start of his reign. Kanguole 14:34, 17 March 2024 (UTC)[reply]
Agreed. Also, I do not think the citation should've been swapped out, but appended to Keightley, perhaps with an explanation in the footnote. Remsense 14:36, 17 March 2024 (UTC)[reply]

The last stumbling block: 方言字

The final unsourced statements (and the final major point contravening the GA criteria, I think) regard dialectal characters, and I've really been at a loss tracking down adequate sources for what's there, or sources to rewrite the section adequately from. I would very much appreciate people's help if they happen to know anything about how characters are used and created in the particular context of representing fangyan. Obviously it's variations on a theme, but I want it to be well-attested. I think I'm otherwise just about ready to submit for peer review or GAN, for real, this time! Remsense 17:24, 17 March 2024 (UTC)[reply]

@Alsosaid1987 this may be a bit of a stretch given you added many of these statements back in 2019, but do you by any chance have any leads as to how I could source them? Remsense 14:15, 26 March 2024 (UTC)[reply]
Are you referring to the examples of dialectal characters? For most of them, Wiktionary gives the definitions and gives the dialectal usages. For the Shanghainese phrase whose etymological form is 触祭, the online authoritative Wu dictionary published by the Wu Language Association https://wu-chinese.com/minidict cites the Suzhou pingtan script 白蛇传, while the "billboard" form 戳鸡, I was able to find at http://languagelog.ldc.upenn.edu/myl/ShanghaiMP3/SVocab.html. Unfortunately, I have not been able to find print sources for these examples. Alsosaid1987 (talk) 00:27, 30 March 2024 (UTC)[reply]
Specifically, I really would like sources for the characters being used in legal proceedings, etc. I've taken those out of the article for now, but I would like to put them back in if I can source them. Remsense 00:30, 30 March 2024 (UTC)[reply]

GA Review

This review is transcluded from Talk:Chinese characters/GA1.

Nominator: Remsense (talk · contribs) 18:14, 28 March 2024 (UTC)[reply]

@Kanguole, while Kusma is elsewhere, do you happen to have any critique or comments about the article in its present state? I know I've asked this question of you a lot, but you always have something insightful to say, so! Remsense 10:19, 6 April 2024 (UTC)[reply]
I'm looking ahead, and I think the Old Chinese section that explains character evolution vis a vis Old Chinese morphology (i.e. the qusheng and transitivity prefix) is a dramatic jump in technicality from the rest of the article, and I'm not sure how I could simplify it for an article with this broad a scope. I want to keep something that's roughly this informative though, any thoughts @Kanguole? Remsense 21:02, 9 April 2024 (UTC)[reply]
I think this is a special case of phonetic loans: the same character was used to write another word with a similar pronunciation, and subsequent sound change separated them. This can be seen with phono-semantic compounds too, e.g. 張 'stretch'/長 'long' (Baxter p324).
Some nits:
  • Is that a stray quote next to drjwen?
  • 去聲 (qùshēng) is the name of the MC departing tone; *-s is a proposed source of it.
  • The residual morphology of OC is usually said to be derivational rather than inflectional.
  • The usual notation for sound change (e.g. in the cited Baxter 1992) is > rather than →.
  • The last two examples of voiced initials don't show clear distinctions. If you had something in the level tone (like 張/長), you would end up with different initials and different tones in the modern forms.
Kanguole 23:03, 9 April 2024 (UTC)[reply]

Reviewer: Kusma (talk · contribs) 07:13, 1 April 2024 (UTC)[reply]


Will take this on. I do know a few things about Chinese characters, but am looking forward to reading this in detail and learning more. —Kusma (talk) 07:13, 1 April 2024 (UTC)[reply]

Oh, thank you so much! I thought I may be waiting a bit for a review, it's a big vital article with a broad scope. Very excited to hear your feedback. Remsense 11:28, 1 April 2024 (UTC)[reply]
It is probably going to take me some time, so you may need to wait a little for a complete review. —Kusma (talk) 13:11, 1 April 2024 (UTC)[reply]
Oh, please take your time! Remsense 13:21, 1 April 2024 (UTC)[reply]
I will, and I will comment on everything I come across that I find worth mentioning. Not everything will need addressing in order to satisfy the GA criteria, and I may also be wrong about things, so please do let me know when you disagree with me. —Kusma (talk) 14:29, 1 April 2024 (UTC)[reply]
Yeah! Really, doing a few reviews first was a really good choice—I'm much more comfortable on the nominator side than I may have otherwise been. Do you mind if I reply to your bullet points below inline, or is that too much of a headache? Remsense 14:39, 1 April 2024 (UTC)[reply]
Inline is fine unless we are editing at the same time. —Kusma (talk) 16:07, 1 April 2024 (UTC)[reply]
I don't know whether I'll have much editing time until Thursday (and I had another stalled GA review coming back). So I will indeed take my time. If you are trying to give names in characters for everyone (a goal I support, but general consensus might be slightly different): Chan Sin-Wai is 陳善偉. —Kusma (talk) 07:04, 2 April 2024 (UTC)[reply]
Thank you! I think it is generally necessary to give the characters for people who write their names with characters—my litmus test has generally been to include characters when they are printed in the book or immediately visible upon a google search for the person, but extra inclusion certainly can't hurt.
And no problem at all! As you may have seen, I keep finding things to polish and tweak—I hope that's alright with you, I can gladly refrain if it complicates the review process. I'm honestly just stoked that the review has been taken up so quickly by someone I'm going to learn a lot from—so please take all the time you need, I mean it. I'll be here. Remsense 08:19, 2 April 2024 (UTC)[reply]
I am going to take some time, especially as I want to engage with the sources. My own personal library has two that you don't use, the classic Chinese Characters by Léon Wieger (most of the book is a dictionary sorted by phonetic component, something I have not seen anywhere else) and the modern Chinese Characters by Han Jiantang. Reading all this will be fun, but won't be quick. Thank you for your patience. —Kusma (talk) 10:13, 7 April 2024 (UTC)[reply]
I am over the moon that you're affording me so much effort. I hope in kind you don't mind that I tinker with the article some (I have a graphical addition in mind but theoretically it shouldn't be disruptive to the flow of analysis)! I will also find those books and add them to my collection, thank you! Remsense 10:25, 7 April 2024 (UTC)[reply]

Section by section content and prose review

Lead

  • Lead: will discuss later

Development

  • Development: "four independent inventions of writing in human history": probably better to say "at least", as it is not always known what else was independent? [1]
  • It is probably possible to explain the timing of the four different inventions without using BP for the Maya, which looks out of place with the BCE notation for the other three.
  • Not sure about some of the italics here (compare MOS:ITALIC). proto-writing doesn't need emphasis imo, and is not really WP:WAW.
  • What is the difference between "development" and "early history"?
  • 大鹿 could be linked to Wiktionary like character examples later are
  • 假借: better to incorporate the pinyin (with tones) into the template like you do for Chinese words later in the text

Classification

  • Classification: again, over-italicised in my view. "writing systems" and "morphosyllabic" have little reason for italics.
  • You do not mention the traditional zh:六書 system, but then cite the 漢語大字典 which often goes back to the 說文 for its character information.
  • Structural analysis: this may of course be justified, but it seems a lot of space is given to Qiu's view, even if it differs only little from the "straightforward structural classification scheme" presented earlier.
  • Pictographs: "While relatively few in number, most of the earliest characters originated as pictographs": relatively few in number in the last 2000 years? Early on, they were apparently the majority, so relatively many in number?
  • Example lexeme A: 日 links to the 漢語大字典 page for 月.
  • Would it make sense to mention that these are 象形 in
  • Is there some deeper reason why some characters are linked and others are not?
  • "Over time, this process sometimes creates excess ambiguity between different senses of a character, which is then usually resolved by adding additional components to create new characters used for specific senses. This can result in new pictographs, but usually results in other character types." This makes me itch to say that modern Chinese resolves this ambiguity not by new characters, but by multi-syllable words; not sure if that is an appropriate comment here.

More later! —Kusma (talk) 14:28, 1 April 2024 (UTC)[reply]

  • four independent inventions; BP: these are taken directly from Handel, but I can happily adjust the date format to favor internal consistency.
  • What is the difference between "development" and "early history"? I think I attempted to organize the "development" section to explain how proto-writing became a writing system while explicating what that means, while the "history" section is meant to cover the evolution of a fully mature writing system.
  • "do not mention the traditional zh:六書 system [yet]": This is a bit of a pickle for me, I'll have to think about it a bit more. In short, I did not think 六書 is incompatible with modern analysis, or requires explication vis-à-vis modern analysis, before it is explicitly discussed. In such cases, the 說文 analysis is considered accurate to the best of my knowledge.
  • Qiu's view: I attribute a lot of statements to Qiu in the first half that I probably don't need to, though in this specific case I do. That said, Qiu's analyses in the Development section etc. are basically the industry golden standard and reflect broad scholarly consensus; Qiu 1988 is basically the definitive monograph for Chinese palaeography (which makes it infuriating that they haven't reprinted the 2000 translation.) Will think about this a bit more.
  • Early on, they were apparently the majority? Great catch! Simply, there weren't very many characters in the proto-writing system—it's impossible to hold thousands of ideographs in your head without starting to encode spoken language. I will explicate this.
  • Is there some deeper reason why some characters are linked and others are not? Inertia! I am not sure what should and should not be linked, but I definitely lean on "don't link without a specific reason", and removed a ton of wiktionary linking from this article as I was working on it.
  • excess ambiguity: The ambiguity that creates new character forms is graphical (or lexicographical) in nature—I do see how this can be conflated with the also-mentioned ambiguity that creates compound words, which is phonetic in nature, and therefore went largely unreflected in the written language until the advent of the vernacular movement to supplant Literary Chinese. Good catch, will explicate this also.
  • ✓ All other bullet points I agree with and have applied the most straightforward changes for.
Remsense 15:13, 1 April 2024 (UTC)[reply]
  • Compound ideographs: you use ('Sun') but 'MAN'; is there some obvious reason to use two different formats? Also, {{kxr}} seems to create a huge amount of whitespace behind it when viewed in Safari, making it look awkward (not a problem in Chrome).
  • There seems to be no citation for 休.
  • I am not certain that the 漢語大字典 is the best possible source for character etymologies, given that it often just reprints the 說文解字. I will revisit this point when talking about sources.
  • I see no information in the 漢語大字典 that 砼 is a compound ideograph, or even how it is composed; all the dictionary says is that it means 混凝土, concrete.

More later! —Kusma (talk) 10:07, 7 April 2024 (UTC)[reply]

  • Oh wow, I'm curious why that is! Will investigate. For now, I've tried to use smallcaps with single quotes specifically for radicals, and single quotes for the glosses of lexemes per usual.
  • And yes, I've been thinking about how to cite lexemes also. Of course, there are other dictionaries that I'm prepared to swap in if we don't think the 漢語大字典 is fit for purpose here, but I would consider that it is, given that concerns about the latter are not strictly about individual classifications per se. It's odd!
  • Oh, dangit! I accidentally did an OR here: specifically, I went to a chat room where a bunch of Taiwanese lexicographers were hanging out and begged them in my sloppy Chinese to give me a better example of a modern syssemantograph than 'platinum', which had been the previous example and simply was obviously not one. 砼 tóng is at the very least also arguably a phono-semantic compound with phonetic 工 gōng. Perhaps these further examples can simply be cut if they are too nettlesome.
Will fix these shortly, wanted to organize my thoughts. Remsense 10:18, 7 April 2024 (UTC)[reply]
鉑 is an interesting case, I wonder if there are sources stating whether it is supposed to be syssemantographic gold+white or phonosemantic 金+bo (while 白 is only rarely bo, it is often bo in compounds). The choice of 白 for the sound component may have been inspired by the meaning... —Kusma (talk) 20:04, 8 April 2024 (UTC)[reply]
As mentioned in the article, the line between phono-semantic and ideographic compounds can be fuzzy. I recalled after looking that 砼 was specifically invented by a Chinese architect and professor named Cai Fangyin in 1953 (which explains why it's not in any Taiwanese dictionaries) but it is also easily described as a phono-semantic compound of 石 and 仝 tóng. Remsense 20:32, 8 April 2024 (UTC)[reply]
I'm now also very curious about platinum. When did knowledge of that metal reach China? The usual informal word is 白金, but Wiktionary tells me that that was coined in Japanese by the chemist Udagawa Yōan in 1834. If knowledge of Pt reached China that late, then it seems plausible that the use of 鉑 to mean Pt (instead of its old meaning "thin sheet of gold") was a new invention from those who coined the Chinese names of the elements, with a character being chosen to make it meaningfully appropriate. That would make it something like 氫 qīng for hydrogen, where the character and reading were both coinages, and the chosen phonetic was meant to give meaning instead of to match the pronunciation in the major European languages; the difference would be that 鉑 was repurposing an old character with its old reading (not unheard of, some of the characters like 鈰 for cerium are actually in names of the House of Zhu) and 氫 was an entirely new coinage. (I'm particularly fond of 氕 piē, 氘 dāo and 氚 chuān for the hydrogen isotopes, which follow the same sort of thinking behind H itself.) Needless to say, if this speculation is correct, it's not a good example. :)
P.S. on reading Wright's Translating Science, I now feel it was a gloriously wasted opportunity not to translate osmium as ⾦惡. Double sharp (talk) 17:14, 14 April 2024 (UTC)[reply]
  • Phono-semantic compounds: "For a modern reader, the resulting drift is such that the phonetic component no longer provides any hint as to each character's pronunciation." this is specific to 也; maybe add some comment that clarifies that modern readers get lots of hints from most other characters?
  • Table: is the OC column in IPA? What are C and A? Can you link to some IPA help page?
  • Loangraphs: Some loangraphs may represent words that have never been written another way—this is often the case with abstract grammatical particles such as 之 and 其—but this is not always so this is a bit redundantly written. Also, what does "words that have never been written another way" really mean? 之 is often written の in Taiwanese Mandarin handwriting ...
  • 可口可樂: well, both 可口 and 可樂 are actual words on their own, so this is a bit questionable as an example in my view. Neither Coca-Cola nor Romania are mentioned in the source.
  • (random anecdote): my favourite transcription of a Western company into Chinese is actually 大众 for VW, with the 众 looking like the VW logo :)
  • Traditional Shuowen Jiezi classification: liùshū (六書; 六书; 'six writings') here you have pinyin-traditional-simplified; Romania and Coca-Cola were simplified-traditional-pinyin. Worth making consistent?
  • Are there examples for characters mis-classified by the 說文解字?
  • "while this number would be reduced substantially in future dictionaries, the underlying concept would remain ubiquitous.": try to avoid going too far WP:INTOTHEWOULDS.

More later, as usual :) —Kusma (talk) 21:11, 8 April 2024 (UTC)[reply]

History

  • History: the broad trend towards simplification is probably scholarly consensus and can probably be stated in wikivoice instead of attributing to Qiu? You could still mention Qiu and cite his words.
  • Neolithic writing: aren't there some people (including Qiu) who claim that the Dawenkou culture's signs may be connected to Jiaguwen, compare e.g. zh:大汶口陶文?
  • Oracle bone script: I just fell into a slight rabbit hole of trying to read up on earliest jiaguwen. Now I'm wondering whether an eclipse in 1173 BCE (see p. 4) ought to be added to Historically significant lunar eclipses... but that's probably nothing we need to discuss here.
  • However, it might be interesting to mention Wu Ding?
  • "village near Anyang in Henan, which was excavated": why not link to / mention Yinxu?
  • "which was excavated by the Academia Sinica": more precisely, by Li Ji from their (freshly founded just like the Academia) zh:中央研究院歷史語言研究所 (Institute of History and Philology).
  • I think it would be worth mentioning the Chinese names of the various scripts.
  • Zhou scripts: "The traditional notion of an orderly procession of scripts, with each suddenly invented and displacing the one previous, has been conclusively superseded by modern archaeological finds and scholarly research. More often, two or more scripts coexisted in a given area, and scripts evolved gradually." This seems to be rather general, not limited to the Zhou dynasty?
  • "Examples of these styles were preserved as variants in the Shuowen Jiezi." Are these other scripts only in the Shuowen Jiezi, or are there extant samples from the Zhou dynasty?
  • Large seal script is only in the See Also on top; shouldn't it be mentioned and discussed in this section?
  • Qin unification and small seal script: Isn't this the place to talk about Li Si?
  • "The coexistence of proto-clerical and seal script runs counter to the traditional belief that only the latter was used by the Qin, with clerical script being suddenly invented during the early Han." Does everybody agree with Qiu on this? Even if yes, might be better to either drop the traditional belief or state both what the traditional theory was and then what the modern theory is and why it superseded the traditional one.
  • I know that we get some samples later in the article, but I think it would be great to see pictures of oracle bone script/large seal script/small seal script/clerical script closer to where these are discussed.

More later... —Kusma (talk) 22:48, 9 April 2024 (UTC)[reply]

This list is especially worthwhile, many points for refinement toward maximum encyclopedic value here, thank you so much! Remsense 17:07, 10 April 2024 (UTC)[reply]
Do you think for the history section broadly, it might be worthwhile to have a pseudo-infobox/table where the names are listed in order? I've spent a lot of time cutting a lot of non-English text (imperfectly as you've seen) from this article but yes the native names for scripts do seem like necessary inclusions. Remsense 18:01, 10 April 2024 (UTC)[reply]
I think that could work, definitely worth trying to create a table of scripts with their Chinese names. —Kusma (talk) 12:20, 11 April 2024 (UTC)[reply]
We'll cross that bridge when we get to it, but I'll keep this concept in mind for other terminology: MOS:ZH states (rightfully in the realm of style guideline) that linked terms shouldn't also include characters/romanization inline, but there do seem to be some common-sense exceptions here, since this is such an omnibus article that should reliably stand on its own as such. Remsense 12:22, 11 April 2024 (UTC)[reply]
We should not write "Mao Zedong (毛泽东)" but we may wish to explain terminology, and the Chinese terminology is potentially important here. For GA, we are free to ignore MOS:ZH but of course at FA level you will need to justify any deviation from the MoS. —Kusma (talk) 14:10, 11 April 2024 (UTC)[reply]
it's definitely domain-specific MOS that hasn't really been worked through at the FA level specifically, but yes I agree for script styles, and possibly other terms, let me know if there are others you notice that should have characters attached. Remsense 14:23, 11 April 2024 (UTC)[reply]
By the way, I've been working through your lists, fixing the obvious or easy ones, and copying the ones I'm less sure about to either work through or possibly discuss further once we've done a complete pass through the article, does this seem like a plan? Remsense 15:17, 12 April 2024 (UTC)[reply]
I can work with that. I am not actually sure what the best method is, so I am happy to use yours. —Kusma (talk) 16:01, 12 April 2024 (UTC)[reply]
  • Han clerical script: I don't understand why Libian and Liding are linked as See Also. As their titles are in Chinese, it is unclear what the reader should expect. Shouldn't the libian processes be touched upon here a little?
  • "its sophistication was comparable to small seal script." What is meant by this "sophistication"?
  • I find the treatment of ancient cursive versus cursive (and later semi-cursive) a bit confusing, and considerations of the Jin dynasty make up near half of your Han dynasty section. Maybe treat all cursives in a separate section?
  • Neo-clerical: is this a name used only by Qiu?
    Coined by Qiu. Remsense 16:00, 17 April 2024 (UTC)[reply]
  • Regular script: this is missing a statement that regular script is still used today. I also find the section a bit difficult to follow. We have a "simplified, convenient" form that is changed in a way that is "neat, formal". It is unclear how the predecessors were inconvenient or informal or lacking in neatness. I generally don't like this section much.
  • Cao Wei calligrapher Zhong Yao: I didn't know that Cao Wei is a state/dynasty without clicking the link; can this be made clearer?
  • How did the Wangs develop the script?
  • "calligraphers such as Wang": Wang Xizhi or his son or both?
  • In the History section, is it worth mentioning what tools people used to write the different script styles and how the tool choice influenced the script? (I am pretty sure writing on bones or bronze is very different from using a brush and ink).

Structure next! —Kusma (talk) 16:38, 12 April 2024 (UTC)[reply]

Structure

  • "rectilinear units of uniform width.": nowadays they are square, and they have been nearly square (even when writing top to bottom) for a long time, although the image of the "printed Song publication" has 義 much taller than 之. In the next sentence, you say "within the square".
  • Not sure whether it is worth mentioning how standardised stroke counts are (so they are used for all kinds of sorting, from dictionaries to bookshops).
  • I don't understand how the Eight Principles of Yong exemplify the categorisation of strokes.
  • "Both the order in which strokes are drawn within a given component, as well as the order components are written in a character is largely fixed." Well, but there are several competing standards for stroke order, so this isn't fixed across all of the Sinosphere.
  • Variants and allographs: this might benefit from some more examples (usually we don't see as much variety as for 龜). 棋/䃆 (you play Go with stone pieces on a wooden board, so both make sense) and the top-to-bottom versions 棊/碁 belong to my personal favourites, as does 裡/裏. No sources, sorry.

Methods of writing

  • Printing and typefaces: I would either not mention the 字号 or say what it is. Missing traditional 字號.
  • "Kai" typefaces: at this point it turns out we need to know that regular script is called kaishu.
  • Input methods: "Predominantly, Chinese characters are input as strings of Latin characters": with some notable exceptions like Taiwan, where people usually have bopomofo keyboards and "ji32k7au4a83" is a famously terrible password (我的密碼 typed on a bopomofo keyboard in Latin mode).
  • "a more unique shorthand": this hurts my mathematician brain. Either it is unique or not. "less ambiguous"?
  • 大学 is missing its traditional friend 大學.
  • Encoding and interchange: "Text is represented digitally by a series of binary numbers called code points": why "series" and not "string" or "sequence"? I don't see what is "binary" about the numbers; Unicode code points are usually written in hexadecimal. Better drop "binary"?
  • I'm not sure you need all of this technical detail, but if you have so much detail on different encoding systems, why do you omit Shift JIS?
  • "As of 2024, Big5 encodes 3053 characters.[92]" Citation 92 is from 2008 and it seems unlikely that the author knows what Big5 is like in 2024.
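
For anyone following the Input methods and Encoding and interchange points above, here is a minimal Python sketch of both. It is not taken from the article or its sources. The key table is a hand-typed subset of the standard bopomofo keyboard layout covering only the keys in the password example, and the names `ZHUYIN_KEYS`, `decode_zhuyin_keys` and `code_point` are made up for illustration.

```python
# Partial, hand-typed subset of the standard bopomofo keyboard layout:
# only the keys needed for the password example are included (assumption:
# this is an illustration, not a complete input-method table).
ZHUYIN_KEYS = {
    "j": "ㄨ", "i": "ㄛ", "2": "ㄉ", "k": "ㄜ",
    "a": "ㄇ", "u": "ㄧ", "8": "ㄚ",
    "3": "ˇ", "4": "ˋ", "7": "˙",  # tone marks: third, fourth, neutral
}


def decode_zhuyin_keys(keys: str) -> str:
    """Map Latin-mode keystrokes to the bopomofo symbols on the same keys."""
    return "".join(ZHUYIN_KEYS[k] for k in keys)


def code_point(char: str) -> str:
    """Format a character's Unicode code point in the usual U+XXXX hex form."""
    return f"U+{ord(char):04X}"


if __name__ == "__main__":
    # Keystroke order is kept as typed, so tone marks follow each syllable.
    print(decode_zhuyin_keys("ji32k7au4a83"))
    # The code point is one number; UTF-8 is one way to serialise it as bytes.
    print(code_point("漢"), "漢".encode("utf-8").hex())
```

The first line of output spells the syllables wǒ de mì mǎ (我的密碼) in bopomofo. The second shows a single code point written in hexadecimal next to its UTF-8 byte sequence, which is the distinction the "binary numbers" wording blurs.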

Enough for today. —Kusma (talk) 22:07, 12 April 2024 (UTC)[reply]

Thank you so much as always. Remsense 23:11, 12 April 2024 (UTC)[reply]

Vocabulary and adaptation

  • Vocabulary and adaptation: this section combines talking about 文言文 with the adoption of Chinese characters for other languages. There is probably a good reason to do so, but it is not very clear from the current writing. It is odd that Vietnam, Korea etc. are discussed in a paragraph that begins with a mention of Literary and colloquial readings.
  • "Outside China, the notion of special reading systems was extended": what is a "special reading system"?
  • maybe explain quickly what brushtalk is? what are face-to-face conversions?
  • "In Korea, hanja were usually used to write Sino-Korean vocabulary": I suppose the point is that they were used only for Sino-Korean vocabulary? Was there a different writing system for other Korean words?
  • It isn't really clear how this section is ordered. It is neither chronological nor by area, and some points are revisited more than once.
  • If the bit between the "Vocabulary and adaptation" header and the "Old Chinese" subheader is supposed to be an overview/introduction/"lead section" for what follows, it is exceedingly long.
  • Old Chinese: I don't quite understand why you have so much content about the "departing tone"; how is that relevant to the characters?
  • 説: why not the standard form 說 but this intermediate form closer to 说? (I have no idea why these strokes are different between traditional and simplified, but they are).
    説 was the form used in the source (Baxter 1994). Remsense 16:00, 17 April 2024 (UTC)
  • Vernacular Chinese varieties: here we have two paragraphs, one talking about 2000 years of history leading to written vernacular Chinese, another about much more recent (I think, no dates are given) adaptations for non-Mandarin varieties.
  • Say that 孬 is Cantonese and when it was introduced?
    This seems to be a variant used across different varieties (originally attributed to Northeast Mandarin), so I've opted not to be specific. Remsense 16:00, 17 April 2024 (UTC)
  • My favourite Cantonese character is 冇 (meaning 沒有). It isn't really a "compound" ideograph, but more a "reduction" one. Do you know how people classify this one?
  • semi-official characters I think the MoE is trying to have an official list (which has some choices that I don't like, in particular 袂 instead of , as I learned during the GA review of Taiwanese Mandarin, but it is what it is). What do you mean by "semi-official"?
  • Why not give examples of Taiwanese Southern Min Recommended Characters? 睏 is a nice phonosemantic that has 目 as in 睡 (same meaning) and the different sound component 困. (My 台语 is terrible).
    Actually, many of these seem old, including 睏 and 袂. Not sure which are fully new designs.

Next stop: Japanese. —Kusma (talk) 13:25, 14 April 2024 (UTC)

Remsense 20:17, 14 April 2024 (UTC)
  • Why See also to Kanbun?
  • "the word meaning 'Chinese characters' is rendered as kanji": isn't it 漢字 just like in Chinese?
  • The 刀 story comes in the middle of the katakana/ateji story.
  • Not sure we need so much about kana.
  • Korean: Define Hanja.
  • "As early as the Gojoseon period": when was that?
    I wish I knew! I was shocked at how trying to include a parenthetical dating for this opened every single can of worms in the world, hence its conspicuous absence. Not sure what better to do. Remsense 16:00, 17 April 2024 (UTC)
  • "This ambiguity had historically been resolved by also including the associated hanja. While still sometimes used for Sino-Korean vocabulary, it is much rarer for native Korean words to be written using hanja.": tense? When is "historically", and when is the "now" with the native Korean words?
  • "Examples of listings": why "listings"?
  • Are "native translation" and "Sino-Korean" like the kun and on reading in Japanese or does it just look like that?

More later. —Kusma (talk) 20:27, 14 April 2024 (UTC)

  • South Korea: I am a bit confused about the extent of Hanja use and how it changed in the past 70 years.
  • "Others support returning to a level of ordinary hanja use previously seen during the 1970s and 1980s": so they want the use to increase a bit again?
  • Not sure we need two paragraphs on Hanja in South Korea.
  • Maybe some examples of Korean use of Hanja would be good (of 15 random pages on kowiki, only one, ko:남병길 had any Hanja content; all just names). But maybe you have too much about Korean already.
  • North Korea: "the use of all so-called foreign languages, which has been interpreted as including hanja": is this a North Korean interpretation or a foreign interpretation?
  • Vietnamese: Link chữ Nôm. Why are there two sentences about use of Chinese in Vietnam between the first mention of chữ Nôm and the explanation what it is?
  • "Similar to Zhuang sawndip": hasn't been introduced yet.
  • "Before 1945, the library of the French School of the Far East (EFEO) in Hanoi collected a total of around 20,000 Chinese and Vietnamese epigraphy rubbings from throughout Indochina.[133] The oldest surviving extant manuscript in Vietnamese is a late 15th-century bilingual copy of the Buddhist Sutra of Filial Piety, currently kept by the EFEO. It features Chinese text in larger characters, and an Old Vietnamese translation in smaller characters glossing the text.[134] Every Hán Nôm book in Vietnam after the Phật thuyết is dated between the 17th and the 20th centuries, with most being hand-copied works, and few printed texts. By 1987, the library of the Institute of Hán-Nôm Studies in Hanoi had collected a total of 4,808 Hán Nôm manuscripts": is any of this necessary in this article?
  • Other languages: "scripts based on Chinese characters, but also included many locally created characters": why "but"?
  • "According to surveys, traditional sawndip script has twice as many users as the official Latin script": this basically repeats the earlier point "still in use despite efforts to encourage the writing of Zhuang with a Latin-based alphabet"; better to merge these.
  • Nüshu and bopomofo seem to be in the wrong section, they are both phonetic scripts for Chinese.
  • "Other scripts within China that have adapted some characters but are otherwise distinct": what languages are these for?
  • Transcription: why is this section only about Mongolian?

Special cases

A somewhat odd name for the section, which I will look at next. —Kusma (talk) 20:41, 15 April 2024 (UTC)

  • The entire "Contractions and abbreviations" section appears to be sourced to a single blog post. Even if it is by Victor H. Mair, this seems less than ideal. If this one blog post is the only mention of these characters in the literature, it is undue weight to spend two paragraphs on this.
  • What does the concept of "ligature" even mean in Chinese? It is not in the source (only in a comment, and Mair seems to be opposed to its use).
  • 囍 can also be pronounced xǐ
  • There are a few more sources and examples at zh:合文, for example common Lunar New Year things like zh:招財進寶 (成語). Modern readers might even like the Grass Mud Horse :)
  • Multi-syllable morphemes: what are the sources for this section?
  • "琵琶, incorporating 珡": just the upper half I'd say...
  • 儿 for erhua: traditional is 兒.
  • Rare and complex characters: "Rare or antiquated character variants appear more often in personal or place names": more often than where? Any examples? (Biangbiang noodles isn't a personal or place name).
  • Being unable to remember how to write 憂鬱症 by hand did make me slightly melancholic at some point.
  • 鬱 does have a simplified form, 郁.
  • "quadruplicated": quadrupled? quadruple?

Not convinced this is the best name or place for this content, nor that all of this needs to be here. —Kusma (talk) 21:16, 16 April 2024 (UTC)

Indeed. Mind that the article as submitted followed me already having cut several thousand words out of it. The general perspective that the work wasn't done is well appreciated; as you can see, I've trimmed the article still further per your suggestions/thoughts. Remsense 21:20, 16 April 2024 (UTC)

Standardisation

  • Is the table supposed to be complete? (Macao? Singapore? North Korea??)
  • Going from this table into "Received forms" is quite a break in the flow
  • I am wondering whether it wouldn't make more sense to put content on "Received" woodblock forms and the history of simplification into the "History" section and to make the section here just about (current) "standards" instead of including the process of standardisation. (There is nothing about standardisation in the TW and HK sections).
  • Simplified characters: Fascinating as it is that the KMT and CCP had competing romanisation projects, I don't think we need this much detail about Gwoyeu Romatzyh and Latinxua Sinwenz. Awesome as Jaw Yuanrenn is, he might not be on topic here.
    You may be right; I had trouble writing about the simplification without mentioning its tight interwovenness with the alphabetization movements, but surely I can do a bit better. Remsense 11:11, 19 April 2024 (UTC)
  • "Cursive script served as a source for many simplified character forms; others had already been used in print, though usually not in formal works": this is mostly redundant with the content in the following section; drop here and mention cursive later.
  • "The Republic of China, as well as Hong Kong and Macau—still under colonial rule at the time—were not affected by the reforms.": and why should they be?
  • "The PRC initiated the first round of character simplifications with two documents published in 1956 and 1965": so did the use of simplified characters start in the 1950s or in the 1960s?
  • "a total of 8,105 characters, with 3,500 categorised as primary, 3,000 as secondary, and 1,605 as tertiary.": what do these categories mean? We seem to move from standardisation of the characters (what are their standard forms and how were they changed?) to the different question of which characters a student is supposed to learn.
  • "The Chinese Proficiency Test (HSK) covers 2,663 characters and 5,000 words at its highest level, while the Chinese Proficiency Grading Standards for International Chinese Language Education would cover 3,000 characters and 11,092 words at its highest level": that is the pre-2021 汉语水平考试; I think the new one is a bit different. Also, this isn't about standardisation of characters, but standardisation of the curriculum taught to learners of Chinese as a foreign language.
  • Singapore: "The first round was published in 1969, and consisted of 498 simplified and 502 traditional characters": the first round of simplification consisted of simplified and traditional characters? I don't understand this.
  • Generally, is it worth discussing the writing directions and their standardisation (or lack thereof) somewhere?
  • Philippines: why do we have a special section for this?
  • "bopomofo, owing to Taiwanese influence due to a shared Hokkien heritage": not in the source. The source doesn't look particularly reliable either.
  • North America: why does it get its own section? (Where is Europe? Australia?)

Hope I can get through the last section tomorrow! —Kusma (talk) 22:02, 18 April 2024 (UTC)

I was thinking about education having its own section—what do you think about that? Remsense 12:08, 19 April 2024 (UTC)
That might work? "Educational standards", maybe? You could have all the information about educational standards in various countries (like the fact that it is usually standardised when a child should learn a specific character; I own a Taiwanese student dictionary from ca. 1999 and it has stroke order and information like 二上 for "first half of second grade" given with each character) in such a section, and also information about how many characters are required for things like HSK/TOCFL or JLPT that foreigners care about. —Kusma (talk) 12:43, 19 April 2024 (UTC)

Lexicography

  • What is a "meaning-based order"? Is this about the time before modern encyclopaedias when the world was small enough that people tried to sort the encyclopaedia in some logical way instead of the meaningless alphabetical order we have nowadays? (Western encyclopaedias made this shift around the 16th century).
  • "Most modern Chinese dictionaries arrange the main character entries alphabetically according to pinyin spelling, while also providing a traditional radical-based index": I can't find that in the cited source. From my own experience, the last time I went dictionary shopping (I have to admit that was over 20 years ago), it was difficult to buy pinyin-based dictionaries in Taipei. I think there is also a difference between monolingual and bilingual dictionaries, with monolingual dictionaries more favouring radical/stroke and bilingual ones more favouring pinyin.
  • Most dictionaries choose one method to sort characters and actually have indices for other methods.

The following are more personal rants than suggestions to improve the article, so no need to engage too much with the small print.

  • The text doesn't emphasise enough what a terrible user-unfriendly system radical+stroke was in practice, back when we had to use paper dictionaries. I remember spending entire days looking up characters. Drawing on my phone in Pleco is about a hundred times faster.
  • (That and both have seven strokes in Kangxi does not help).
  • In fact, my best paper dictionary (a Far East Chinese English Dictionary from ca. 2000) starts with radical index, table of contents, then the coolest thing of all: a table of hard to find characters sorted by stroke count, telling me things like where to find 恭 even if I don't know that the radical is 心.
  • Active vocabulary/total number of characters: might make sense to combine this with the "education" content about number of characters. Maybe you just want a "number of characters" section somewhere?
  • Is it worth mentioning the Table of Indexing Chinese Character Components as a modern Simplified alternative to Kangxi that may or may not be a widely used standard?

See also

First read through done (finally). Images, sources, lead still need to be done, and I need to discuss your responses. —Kusma (talk) 16:12, 19 April 2024 (UTC)

I am so grateful for your keen eye and consideration, I can't say it enough—as regards Modern Chinese characters, I actually raised concerns as such when the page was first created, but I realized later it's likely a page I can get around to expanding at some point. Remsense 16:17, 19 April 2024 (UTC)
I've also started tinkering with a visual presentation of historical character styles, and I think it's going to look really nice, but I still need to assemble representative examples for some (e.g. clerical script). Remsense 19:15, 21 April 2024 (UTC)
I'm looking forward to seeing it! I'll go and review the current image selection for now, and will happily re-review anything you add. —Kusma (talk) 20:41, 21 April 2024 (UTC)
Essentially, instead of baking new SVGs, I'm planning on using most of what individual glyphs we already have on Commons to track the evolution of eight of our friends, inspired by File:Shang Bronze and Oracle Script.png:
[Image mock-up, "visual idea": glyph columns captioned Oracle bone script, Bronze script, Small seal script]
I think I'd want to do all 8 in a 4×2 grid, but none of the image templates seem to do that as well as a header/footer caption, so I'm still puzzling but think reducing to 5 representative glyphs might be the move for now. Remsense 20:56, 21 April 2024 (UTC)
Looks promising! —Kusma (talk) 20:12, 22 April 2024 (UTC)

General comments and GA criteria

Good Article review progress box
Criteria: 1a. prose () 1b. MoS () 2a. ref layout () 2b. cites WP:RS () 2c. no WP:OR () 2d. no WP:CV ()
3a. broadness () 3b. focus () 4. neutral () 5. stable () 6a. free or tagged images () 6b. pics relevant ()
Note: this represents where the article stands relative to the Good Article criteria. Criteria marked are unassessed

Images

  • File:Comparative evolution of Cuneiform, Egyptian and Chinese characters.svg - this is cute, but I think it should be attributed in the caption. I assume the Chinese characters were drawn by Maspero or some other Egyptologist? They look rather like beginner's handwriting.
  • "Structural templates used in compounds, with red marking possible positions for components": is this supposed to be complete? Shouldn't there be some source for these? Is something like the 或 in 國 included, or just the 囗?
  • File:噹噹茶餐廳2021年7月初的午餐餐牌-tweaked.jpg – not sure if it is worth pointing out, but there are some typical handwriting abbreviations on the board, like the way they write 炒 and in particular writing 反 as shorthand for 飯 (technically wrong, but anyone literate enough to understand this when written correctly should understand the abbreviation).
  • File:Chineseprimer3.png – a higher quality scan would be nice, but certainly optional
  • File:ROC24 SC1.jpg – source link is dead, but this seems fine. I wonder whether it is worth linking (in caption or in the body) to zh:第一批简体字表

No major complaints here, just minor suggestions for captions. —Kusma (talk) 21:19, 21 April 2024 (UTC)

Infoboxes

  • Direction: Top-to-bottom is still widely used in Japan and Taiwan, so I do not understand why this is marked "historical".
  • Do the Ryukyuan languages need their own mention?
  • I don't quite see the point in the massive list of transcriptions. Maybe not all of them are needed?
  • At least in Hanyu pinyin, I think it is usually Hànzì with a capital H.
  • the "other Mandarin" heading is odd
  • "Romanization" should be "Romanisation" in the variety of English used in the body
  • Why do we care about the name in Thai or Khmer, and perhaps the one in Zhuang?

Source reviewing

Numbering is from Special:PermanentLink/1220112331. Not looking at everything, but at quite a few sources.

  • Notes: e,f,g,k are unsourced (even if k is WP:BLUESKY to anyone who reads Chinese). h,i seem unfinished.
  • 2: ok
  • 4: ok, but is it worth discussing the "Naxi script" that Qiu mentions?
  • 6: can't see a discussion of jiajie on p. 11; you probably need to cite a few earlier pages
  • 8: can't see "logography" defined on these pages
  • 10: ok

More later! —Kusma (talk) 20:54, 22 April 2024 (UTC)

  • 11: Handel just mentions 六書 and does not say 說文解字. Are you sure about the page numbers for Yong & Peng?
  • 12: ok
  • 13: does Qiu call them forms? Maybe I overlooked it but didn't see it
  • 14/15: could not find full text, but maybe I didn't try hard enough
  • 16: this makes it sound as if this is Qiu's theory, but he clearly says this is Tang Lan's theory (and then says why it is not enough).
  • 17: not sure where you see that "most" early characters are 象形

First impression of sources: generally excellent sources, but sometimes not everything you say is in these sources (and sometimes you cite rather long sections of the books). More later (sorry for the slow progress). —Kusma (talk) 21:06, 23 April 2024 (UTC)

  • 21: ok
  • C: I don't see how the MoE dictionary page for 明白 says that it is "touching on the derived association of 明 with 'illumination'". Could you point me to where the 漢語大字典 explains how 萌 alludes to heliotropic plants?
  • 23: "made obscure by subsequent changes in form": I read the source more as referring to subsequent changes in pronunciation.
  • D: As discussed earlier, 砼 is not described as a compound ideograph in the source.
  • 26: ok
  • 27: Qiu p. 154 seems to discuss the difficulty of deciding between 象形 and 指事 and not talk about 形聲 versus 會意
  • Footnote d: no sources for this paragraph?
  • 33: can't see anything about chemistry on p. 211, and in particular nothing about plutonium.
  • 35, 36: both seem to be about syllables and not about transcriptions / connotations; nothing about 可口可樂.

More later! (Sorry, busy days...) —Kusma (talk) 20:52, 25 April 2024 (UTC)

I have been pretty under the weather myself the past week or so and as such haven't been able to be as up-to-date with your suggestions as I'd have liked, so I appreciate your patience and continued diligence so much. I should be on the upswing now... Remsense 19:13, 26 April 2024 (UTC)