- Timestamp:
- Jan 7, 2014, 6:02:25 PM (9 years ago)
- Children:
- 611236e
- Parents:
- 4b9c3b9
- git-author:
- Jason Gross <jgross@mit.edu> (01/01/14 20:59:51)
- git-committer:
- Jason Gross <jgross@mit.edu> (01/07/14 18:02:25)
- File:
- 1 edited
util.c
--- util.c (r7b89e8c)
+++ util.c (rcc27237)
@@ -640,4 +640,35 @@
 }
 
+CALLER_OWN char *owl_util_compat_casefold(const char *str)
+{
+  /*
+   * Quoting Anders Kaseorg at https://github.com/barnowl/barnowl/pull/54#issuecomment-31452543:
+   *
+   * The Unicode specification calls this compatibility caseless matching, and
+   * the correct transformation actually has five calls:
+   * NFKC(toCasefold(NFKD(toCasefold(NFD(string))))) Zephyr’s current
+   * implementation incorrectly omits the innermost NFD, but that difference
+   * only matters for characters including U+0345 ◌ͅ COMBINING GREEK
+   * YPOGEGRAMMENI. I think we should just write the correct version and get
+   * Zephyr fixed.
+   *
+   * Neither of these operations should be called toNFKC_Casefold, because that
+   * has slightly different behavior regarding Default_Ignorable_Code_Point. I
+   * propose compat_casefold. And I guess if Jabber wants it too, we should
+   * move it to util.c.
+   */
+  char *tmp0 = g_utf8_normalize(str, -1, G_NORMALIZE_NFD);
+  char *tmp1 = g_utf8_casefold(tmp0, -1);
+  char *tmp2 = g_utf8_normalize(tmp1, -1, G_NORMALIZE_NFKD);
+  char *tmp3 = g_utf8_casefold(tmp2, -1);
+  char *out = g_utf8_normalize(tmp3, -1, G_NORMALIZE_NFKC);
+  g_free(tmp0);
+  g_free(tmp1);
+  g_free(tmp2);
+  g_free(tmp3);
+
+  return out;
+}
+
 /* This is based on _extract() and _isCJ() from perl's Text::WrapI18N */
 int owl_util_can_break_after(gunichar c)