@node Character Handling, String and Array Utilities, Memory, Top
@c %MENU% Character testing and conversion functions
@chapter Character Handling

Programs that work with characters and strings often need to classify a
character---is it alphabetic, is it a digit, is it whitespace, and so
on---and perform case conversion operations on characters.  The
functions in the header file @file{ctype.h} are provided for this
purpose.
@pindex ctype.h

Since the choice of locale and character set can alter the
classifications of particular character codes, all of these functions
are affected by the current locale.  (More precisely, they are affected
by the locale currently selected for character classification---the
@code{LC_CTYPE} category; see @ref{Locale Categories}.)

The @w{ISO C} standard specifies two different sets of functions.  One
set works on @code{char} type characters, the other on
@code{wchar_t} wide characters (@pxref{Extended Char Intro}).

@menu
* Classification of Characters::       Testing whether characters are
                                        letters, digits, punctuation, etc.

* Case Conversion::                    Case mapping, and the like.
* Classification of Wide Characters::  Character class determination for
                                        wide characters.
* Using Wide Char Classes::            Notes on using the wide character
                                        classes.
* Wide Character Case Conversion::     Mapping of wide characters.
@end menu

@node Classification of Characters, Case Conversion, , Character Handling
@section Classification of Characters
@cindex character testing
@cindex classification of characters
@cindex predicates on characters
@cindex character predicates

This section explains the library functions for classifying characters.
For example, @code{isalpha} is the function to test for an alphabetic
character.  It takes one argument, the character to test, and returns a
nonzero integer if the character is alphabetic, and zero otherwise.
You would use it like this:

@smallexample
if (isalpha (c))
  printf ("The character `%c' is alphabetic.\n", c);
@end smallexample

Each of the functions in this section tests for membership in a
particular class of characters; each has a name starting with @samp{is}.
Each of them takes one argument, which is a character to test, and
returns an @code{int} which is treated as a boolean value.  The
character argument is passed as an @code{int}, and it may be the
constant value @code{EOF} instead of a real character.  Any other
argument must be representable as an @code{unsigned char}; in
particular, a negative @code{char} value should be converted to
@code{unsigned char} before it is passed to one of these functions.

The attributes of any given character can vary between locales.
@xref{Locales}, for more information on locales.@refill

These functions are declared in the header file @file{ctype.h}.
@pindex ctype.h

@cindex lower-case character
@deftypefun int islower (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c The is* macros call __ctype_b_loc to get the ctype array from the
@c current locale, and then index it by c.  __ctype_b_loc reads from
@c thread-local memory the (indirect) pointer to the ctype array, which
@c may involve one word access to the global locale object, if that's
@c the active locale for the thread, and the array, being part of the
@c locale data, is undeletable, so there's no thread-safety issue.  We
@c might want to mark these with @mtslocale to flag to callers that
@c changing locales might affect them, even if not these simpler
@c functions.
Returns true if @var{c} is a lower-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.
@end deftypefun

@cindex upper-case character
@deftypefun int isupper (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an upper-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.
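
As a minimal sketch of how @code{islower} and @code{isupper} might be
applied to a whole string, the following hypothetical helper
@code{count_case} (it is not part of the library) counts the letters of
each case.  It assumes that plain @code{char} may be signed and
therefore converts each element to @code{unsigned char} before the
call.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of the C library: count the
   upper-case and lower-case letters in the string S.  */
void
count_case (const char *s, size_t *upper, size_t *lower)
{
  *upper = 0;
  *lower = 0;
  for (; *s != '\0'; s++)
    {
      /* Convert to unsigned char first: a negative char value
         other than EOF is not a valid argument.  */
      unsigned char c = (unsigned char) *s;
      if (isupper (c))
        ++*upper;
      else if (islower (c))
        ++*lower;
    }
}
```

The result depends on the locale selected for @code{LC_CTYPE}; a
program that never calls @code{setlocale} runs in the @code{"C"}
locale.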
@end deftypefun

@cindex alphabetic character
@deftypefun int isalpha (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an alphabetic character (a letter).  If
@code{islower} or @code{isupper} is true of a character, then
@code{isalpha} is also true.

In some locales, there may be additional characters for which
@code{isalpha} is true---letters which are neither upper case nor lower
case.  But in the standard @code{"C"} locale, there are no such
additional characters.
@end deftypefun

@cindex digit character
@cindex decimal digit character
@deftypefun int isdigit (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}).
@end deftypefun

@cindex alphanumeric character
@deftypefun int isalnum (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an alphanumeric character (a letter or
digit); in other words, if either @code{isalpha} or @code{isdigit} is
true of a character, then @code{isalnum} is also true.
@end deftypefun

@cindex hexadecimal digit character
@deftypefun int isxdigit (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a hexadecimal digit.
Hexadecimal digits include the normal decimal digits @samp{0} through
@samp{9} and the letters @samp{A} through @samp{F} and
@samp{a} through @samp{f}.
@end deftypefun

@cindex punctuation character
@deftypefun int ispunct (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a punctuation character.
This means any printing character that is not alphanumeric or a space
character.
@end deftypefun

@cindex whitespace character
@deftypefun int isspace (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a @dfn{whitespace} character.  In the standard
@code{"C"} locale, @code{isspace} returns true for only the standard
whitespace characters:

@table @code
@item ' '
space

@item '\f'
formfeed

@item '\n'
newline

@item '\r'
carriage return

@item '\t'
horizontal tab

@item '\v'
vertical tab
@end table
@end deftypefun

@cindex blank character
@deftypefun int isblank (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a blank character; that is, a space or a tab.
This function was originally a GNU extension, but was added in @w{ISO C99}.
@end deftypefun

@cindex graphic character
@deftypefun int isgraph (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a graphic character; that is, a character
that has a glyph associated with it.  The whitespace characters are not
considered graphic.
@end deftypefun

@cindex printing character
@deftypefun int isprint (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a printing character.  Printing characters
include all the graphic characters, plus the space (@samp{ }) character.
@end deftypefun

@cindex control character
@deftypefun int iscntrl (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a control character (that is, a character that
is not a printing character).
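
One plausible use of the printing/non-printing distinction is to make
arbitrary input safe to display.  The sketch below is only an
illustration: @code{escape_nonprint} is a hypothetical helper, and the
@samp{\xHH} escape format is an arbitrary choice.  It copies the
characters for which @code{isprint} is true and escapes all others.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy SRC into DST (of size DSTSIZE), replacing
   every non-printing character by a \xHH escape.  Returns the number
   of characters written, not counting the terminating null byte.
   Copying stops early if DST is too small.  */
size_t
escape_nonprint (const char *src, char *dst, size_t dstsize)
{
  size_t used = 0;

  if (dstsize == 0)
    return 0;
  for (; *src != '\0'; src++)
    {
      unsigned char c = (unsigned char) *src;
      /* A printing character needs 1 byte, an escape needs 4.  */
      size_t need = isprint (c) ? 1 : 4;

      if (used + need >= dstsize)
        break;                  /* Keep room for the terminator.  */
      if (isprint (c))
        dst[used++] = (char) c;
      else
        used += (size_t) snprintf (dst + used, dstsize - used,
                                   "\\x%02X", (unsigned int) c);
    }
  dst[used] = '\0';
  return used;
}
```

In the @code{"C"} locale this escapes tabs and newlines, since they are
control characters, but leaves the space character alone.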
@end deftypefun

@cindex ASCII character
@deftypefun int isascii (int @var{c})
@standards{SVID, ctype.h}
@standards{BSD, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is a 7-bit @code{unsigned char} value that fits
into the US/UK ASCII character set.  This function is a BSD extension
and is also an SVID extension.
@end deftypefun

@node Case Conversion, Classification of Wide Characters, Classification of Characters, Character Handling
@section Case Conversion
@cindex character case conversion
@cindex case conversion of characters
@cindex converting case of characters

This section explains the library functions for performing conversions
such as case mappings on characters.  For example, @code{toupper}
converts any character to upper case if possible.  If the character
can't be converted, @code{toupper} returns it unchanged.

These functions take one argument of type @code{int}, which is the
character to convert, and return the converted character as an
@code{int}.  If the conversion is not applicable to the argument given,
the argument is returned unchanged.

@strong{Compatibility Note:} In pre-@w{ISO C} dialects, instead of
returning the argument unchanged, these functions may fail when the
argument is not suitable for the conversion.  Thus for portability, you
may need to write @code{islower(c) ? toupper(c) : c} rather than just
@code{toupper(c)}.

These functions are declared in the header file @file{ctype.h}.
@pindex ctype.h

@deftypefun int tolower (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c The to* macros/functions call different functions that use different
@c arrays than those of __ctype_b_loc, but the access patterns and
@c thus safety guarantees are the same.
If @var{c} is an upper-case letter, @code{tolower} returns the corresponding
lower-case letter.  If @var{c} is not an upper-case letter,
@var{c} is returned unchanged.
@end deftypefun

@deftypefun int toupper (int @var{c})
@standards{ISO, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
If @var{c} is a lower-case letter, @code{toupper} returns the corresponding
upper-case letter.  Otherwise @var{c} is returned unchanged.
@end deftypefun

@deftypefun int toascii (int @var{c})
@standards{SVID, ctype.h}
@standards{BSD, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function converts @var{c} to a 7-bit @code{unsigned char} value
that fits into the US/UK ASCII character set, by clearing the high-order
bits.  This function is a BSD extension and is also an SVID extension.
@end deftypefun

@deftypefun int _tolower (int @var{c})
@standards{SVID, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This is identical to @code{tolower}, and is provided for compatibility
with the SVID.  @xref{SVID}.@refill
@end deftypefun

@deftypefun int _toupper (int @var{c})
@standards{SVID, ctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This is identical to @code{toupper}, and is provided for compatibility
with the SVID.
@end deftypefun


@node Classification of Wide Characters, Using Wide Char Classes, Case Conversion, Character Handling
@section Character class determination for wide characters

@w{Amendment 1} to @w{ISO C90} defines functions to classify wide
characters.  Although the original @w{ISO C90} standard already defined
the type @code{wchar_t}, no functions operating on it were defined.

The design of the classification functions for wide characters
is more general.  It allows extensions to the set of available
classifications, beyond those which are always available.
The POSIX
standard specifies how extensions can be made, and this is already
implemented in the @glibcadj{} implementation of the @code{localedef}
program.

The character class functions are normally implemented with bitsets,
with a bitset per character.  For a given character, the appropriate
bitset is read from a table and a test is performed as to whether a
certain bit is set.  Which bit is tested for is determined by the
class.

For the wide character classification functions this mechanism is made
visible.  A classification type is defined, along with a function to
retrieve a value of this type for a given class and a function to test
whether a given character is in this class, using the classification
value.  On top of this the normal character classification functions as
used for @code{char} objects can be defined.

@deftp {Data type} wctype_t
@standards{ISO, wctype.h}
The @code{wctype_t} can hold a value which represents a character class.
The only defined way to generate such a value is by using the
@code{wctype} function.

@pindex wctype.h
This type is defined in @file{wctype.h}.
@end deftp

@deftypefun wctype_t wctype (const char *@var{property})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c Although the source code of wctype contains multiple references to
@c the locale, that could each reference different locale_data objects
@c should the global locale object change while active, the compiler can
@c and does combine them all into a single dereference that resolves
@c once to the LCTYPE locale object used throughout the function, so it
@c is safe in (optimized) practice, if not in theory, even when the
@c locale changes.
@c Ideally we'd explicitly save the resolved
@c locale_data object to make it visibly safe instead of safe only under
@c compiler optimizations, but given the decision that setlocale is
@c MT-Unsafe, all this would afford us would be the ability to not mark
@c this function with @mtslocale.
@code{wctype} returns a value representing a class of wide
characters which is identified by the string @var{property}.  Besides
the standard properties, each locale can define its own.  If
no property with the given name is known for the current locale
selected for the @code{LC_CTYPE} category, the function returns zero.

@noindent
The properties known in every locale are:

@multitable @columnfractions .25 .25 .25 .25
@item
@code{"alnum"} @tab @code{"alpha"} @tab @code{"cntrl"} @tab @code{"digit"}
@item
@code{"graph"} @tab @code{"lower"} @tab @code{"print"} @tab @code{"punct"}
@item
@code{"space"} @tab @code{"upper"} @tab @code{"xdigit"}
@end multitable

@pindex wctype.h
This function is declared in @file{wctype.h}.
@end deftypefun

To test whether a character belongs to one of the non-standard classes,
the @w{ISO C} standard defines a completely new function.

@deftypefun int iswctype (wint_t @var{wc}, wctype_t @var{desc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c The compressed lookup table returned by wctype is read-only.
This function returns a nonzero value if @var{wc} is in the character
class specified by @var{desc}.  @var{desc} must have been returned
by a previous successful call to @code{wctype}.

@pindex wctype.h
This function is declared in @file{wctype.h}.
@end deftypefun

To make it easier to use the commonly-used classification functions,
they are defined in the C library.  There is no need to use
@code{wctype} if the property string is one of the known character
classes.
In some situations it is desirable to construct the property
strings, and then it is important that @code{wctype} can also handle the
standard classes.

@cindex alphanumeric character
@deftypefun int iswalnum (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c The implicit wctype call in the isw* functions is actually an
@c optimized version because the category has a known offset, but the
@c wctype is equally safe when optimized, unsafe with changing locales
@c if not optimized (thus @mtslocale).  Since it's not a macro, we
@c always optimize, and the locale can't change in any MT-Safe way, it's
@c fine.  The test whether wc is ASCII to use the non-wide is*
@c macro/function doesn't bring any other safety issues: the test does
@c not depend on the locale, and each path after the decision resolves
@c the locale object only once.
This function returns a nonzero value if @var{wc} is an alphanumeric
character (a letter or digit); in other words, if either @code{iswalpha}
or @code{iswdigit} is true of a character, then @code{iswalnum} is also
true.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("alnum"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex alphabetic character
@deftypefun int iswalpha (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is an alphabetic character (a letter).  If
@code{iswlower} or @code{iswupper} is true of a character, then
@code{iswalpha} is also true.

In some locales, there may be additional characters for which
@code{iswalpha} is true---letters which are neither upper case nor lower
case.  But in the standard @code{"C"} locale, there are no such
additional characters.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("alpha"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex control character
@deftypefun int iswcntrl (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a control character (that is, a character that
is not a printing character).

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("cntrl"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex digit character
@deftypefun int iswdigit (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a digit (e.g., @samp{0} through @samp{9}).
Please note that this function returns a nonzero value not only for
@emph{decimal} digits, but for all kinds of digits.  A consequence is
that code like the following will @strong{not} work unconditionally for
wide characters:

@smallexample
n = 0;
while (iswdigit (*wc))
  @{
    n *= 10;
    n += *wc++ - L'0';
  @}
@end smallexample

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("digit"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex graphic character
@deftypefun int iswgraph (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a graphic character; that is, a character
that has a glyph associated with it.  The whitespace characters are not
considered graphic.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("graph"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex lower-case character
@deftypefun int iswlower (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a lower-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("lower"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex printing character
@deftypefun int iswprint (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a printing character.  Printing characters
include all the graphic characters, plus the space (@samp{ }) character.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("print"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex punctuation character
@deftypefun int iswpunct (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a punctuation character.
This means any printing character that is not alphanumeric or a space
character.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("punct"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex whitespace character
@deftypefun int iswspace (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a @dfn{whitespace} character.  In the standard
@code{"C"} locale, @code{iswspace} returns true for only the standard
whitespace characters:

@table @code
@item L' '
space

@item L'\f'
formfeed

@item L'\n'
newline

@item L'\r'
carriage return

@item L'\t'
horizontal tab

@item L'\v'
vertical tab
@end table

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("space"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex upper-case character
@deftypefun int iswupper (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is an upper-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("upper"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
@end deftypefun

@cindex hexadecimal digit character
@deftypefun int iswxdigit (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a hexadecimal digit.
Hexadecimal digits include the normal decimal digits @samp{0} through
@samp{9} and the letters @samp{A} through @samp{F} and
@samp{a} through @samp{f}.

@noindent
This function can be implemented using

@smallexample
iswctype (wc, wctype ("xdigit"))
@end smallexample

@pindex wctype.h
It is declared in @file{wctype.h}.
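
As an illustration only, a program might combine @code{iswxdigit} with
@code{towlower} to obtain the numeric value of a hexadecimal digit.
The helper name @code{wide_hex_value} below is hypothetical.  The
sketch looks the character up in a digit string instead of assuming any
particular numeric layout of the letters.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: return the value (0-15) of the hexadecimal
   digit WC, or -1 if WC is not a hexadecimal digit.  */
int
wide_hex_value (wint_t wc)
{
  static const wchar_t digits[] = L"0123456789abcdef";
  const wchar_t *p;

  if (wc == WEOF || !iswxdigit (wc))
    return -1;
  /* Look the lower-case form up in the digit string rather than
     relying on the numeric layout of the letters.  */
  p = wcschr (digits, (wchar_t) towlower (wc));
  return p != NULL ? (int) (p - digits) : -1;
}
```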
@end deftypefun

@Theglibc{} also provides a function which was not defined in the
@w{ISO C} standard before @w{ISO C99} but which is available as a
version for single-byte characters as well.

@cindex blank character
@deftypefun int iswblank (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a blank character; that is, a space or a tab.
This function was originally a GNU extension, but was added in @w{ISO C99}.
It is declared in @file{wctype.h}.
@end deftypefun

@node Using Wide Char Classes, Wide Character Case Conversion, Classification of Wide Characters, Character Handling
@section Notes on using the wide character classes

The first note is probably not astonishing but still occasionally a
cause of problems.  The @code{isw@var{XXX}} functions can be implemented
using macros and in fact, @theglibc{} does this.  They are still
available as real functions but when the @file{wctype.h} header is
included the macros will be used.  This is the same as the
@code{char} type versions of these functions.

The second note covers something new.  It can be best illustrated by a
(real-world) example.  The first piece of code is an excerpt from the
original code.  It is truncated a bit but the intention should be clear.

@smallexample
int
is_in_class (int c, const char *class)
@{
  if (strcmp (class, "alnum") == 0)
    return isalnum (c);
  if (strcmp (class, "alpha") == 0)
    return isalpha (c);
  if (strcmp (class, "cntrl") == 0)
    return iscntrl (c);
  @dots{}
  return 0;
@}
@end smallexample

Now, with the @code{wctype} and @code{iswctype} functions you can avoid
the @code{if} cascades, but rewriting the code as follows is wrong:

@smallexample
int
is_in_class (int c, const char *class)
@{
  wctype_t desc = wctype (class);
  return desc ? iswctype ((wint_t) c, desc) : 0;
@}
@end smallexample

The problem is that it is not guaranteed that the wide character
representation of a single-byte character can be found using casting.
In fact, usually this fails miserably.  The correct solution to this
problem is to write the code as follows:

@smallexample
int
is_in_class (int c, const char *class)
@{
  wctype_t desc = wctype (class);
  return desc ? iswctype (btowc (c), desc) : 0;
@}
@end smallexample

@xref{Converting a Character}, for more information on @code{btowc}.
Note that this change probably does not improve the performance
of the program a lot since the @code{wctype} function still has to make
the string comparisons.  It gets really interesting if the
@code{is_in_class} function is called more than once for the
same class name.  In this case the variable @code{desc} could be computed
once and reused for all the calls.  Therefore the above form of the
function is probably not the final one.


@node Wide Character Case Conversion, , Using Wide Char Classes, Character Handling
@section Mapping of wide characters

The character mapping functions are also generalized by the @w{ISO C}
standard.  Instead of just allowing the two standard mappings, a
locale can contain others.  Again, the @code{localedef} program
already supports generating such locale data files.

@deftp {Data Type} wctrans_t
@standards{ISO, wctype.h}
This data type is defined as a scalar type which can hold a value
representing the locale-dependent character mapping.  There is no way to
construct such a value apart from using the return value of the
@code{wctrans} function.

@pindex wctype.h
@noindent
This type is defined in @file{wctype.h}.
@end deftp

@deftypefun wctrans_t wctrans (const char *@var{property})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c Similar implementation, same caveats as wctype.
The @code{wctrans} function has to be used to find out whether a named
mapping is defined in the current locale selected for the
@code{LC_CTYPE} category.  If the returned value is non-zero, you can use
it afterwards in calls to @code{towctrans}.  If the return value is
zero, no such mapping is known in the current locale.

Besides locale-specific mappings there are two mappings which are
guaranteed to be available in every locale:

@multitable @columnfractions .5 .5
@item
@code{"tolower"} @tab @code{"toupper"}
@end multitable

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

@deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c Same caveats as iswctype.
@code{towctrans} maps the input character @var{wc}
according to the rules of the mapping for which @var{desc} is a
descriptor, and returns the value it finds.  @var{desc} must be
obtained by a successful call to @code{wctrans}.

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

For the generally available mappings, the @w{ISO C} standard defines
convenient shortcuts so that it is not necessary to call @code{wctrans}
for them.

@deftypefun wint_t towlower (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
@c Same caveats as iswalnum, just using a wctrans rather than a wctype
@c table.
If @var{wc} is an upper-case letter, @code{towlower} returns the corresponding
lower-case letter.
If @var{wc} is not an upper-case letter,
@var{wc} is returned unchanged.

@noindent
@code{towlower} can be implemented using

@smallexample
towctrans (wc, wctrans ("tolower"))
@end smallexample

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

@deftypefun wint_t towupper (wint_t @var{wc})
@standards{ISO, wctype.h}
@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
If @var{wc} is a lower-case letter, @code{towupper} returns the corresponding
upper-case letter.  Otherwise @var{wc} is returned unchanged.

@noindent
@code{towupper} can be implemented using

@smallexample
towctrans (wc, wctrans ("toupper"))
@end smallexample

@pindex wctype.h
@noindent
This function is declared in @file{wctype.h}.
@end deftypefun

The same warnings given in the last section for the use of the wide
character classification functions apply here.  It is not possible to
simply cast a @code{char} type value to a @code{wint_t} and use it as an
argument to @code{towctrans} calls.
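
The following sketch shows one way to follow this advice.  The helper
@code{upcase_to_wide} is hypothetical and only meant as an illustration:
it obtains the @code{"toupper"} descriptor once with @code{wctrans},
converts each byte with @code{btowc} instead of a cast, and then applies
@code{towctrans}.  It assumes that the input consists of single-byte
characters only.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: store in DST the upper-case wide character
   form of the single-byte string SRC.  DST must have room for N wide
   characters, terminator included.  Returns 0 on success and -1 on
   failure.  */
int
upcase_to_wide (const char *src, wchar_t *dst, size_t n)
{
  /* Look the mapping up once and reuse the descriptor.  */
  wctrans_t to_upper = wctrans ("toupper");
  size_t i;

  if (n == 0 || !to_upper)
    return -1;
  for (i = 0; src[i] != '\0'; i++)
    {
      wint_t wc;

      if (i + 1 >= n)
        return -1;              /* DST is too small.  */
      /* Convert with btowc instead of casting the char value.  */
      wc = btowc ((unsigned char) src[i]);
      if (wc == WEOF)
        return -1;              /* Not a valid single-byte character.  */
      dst[i] = (wchar_t) towctrans (wc, to_upper);
    }
  dst[i] = L'\0';
  return 0;
}
```

Strings that contain multibyte sequences need one of the multibyte
conversion functions instead of @code{btowc}.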