1@node Character Handling, String and Array Utilities, Memory, Top
2@c %MENU% Character testing and conversion functions
3@chapter Character Handling
4
5Programs that work with characters and strings often need to classify a
6character---is it alphabetic, is it a digit, is it whitespace, and so
7on---and perform case conversion operations on characters.  The
8functions in the header file @file{ctype.h} are provided for this
9purpose.
10@pindex ctype.h
11
12Since the choice of locale and character set can alter the
13classifications of particular character codes, all of these functions
14are affected by the current locale.  (More precisely, they are affected
15by the locale currently selected for character classification---the
16@code{LC_CTYPE} category; see @ref{Locale Categories}.)
17
The @w{ISO C} standard specifies two different sets of functions.  One
set works on @code{char} type characters, the other on
@code{wchar_t} wide characters (@pxref{Extended Char Intro}).
21
22@menu
23* Classification of Characters::       Testing whether characters are
24			                letters, digits, punctuation, etc.
25
26* Case Conversion::                    Case mapping, and the like.
27* Classification of Wide Characters::  Character class determination for
28                                        wide characters.
29* Using Wide Char Classes::            Notes on using the wide character
30                                        classes.
31* Wide Character Case Conversion::     Mapping of wide characters.
32@end menu
33
34@node Classification of Characters, Case Conversion,  , Character Handling
35@section Classification of Characters
36@cindex character testing
37@cindex classification of characters
38@cindex predicates on characters
39@cindex character predicates
40
41This section explains the library functions for classifying characters.
42For example, @code{isalpha} is the function to test for an alphabetic
43character.  It takes one argument, the character to test, and returns a
44nonzero integer if the character is alphabetic, and zero otherwise.  You
45would use it like this:
46
47@smallexample
48if (isalpha (c))
49  printf ("The character `%c' is alphabetic.\n", c);
50@end smallexample
51
52Each of the functions in this section tests for membership in a
53particular class of characters; each has a name starting with @samp{is}.
54Each of them takes one argument, which is a character to test, and
55returns an @code{int} which is treated as a boolean value.  The
56character argument is passed as an @code{int}, and it may be the
57constant value @code{EOF} instead of a real character.
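
Because the argument is an @code{int}, @w{ISO C} requires its value to
be either @code{EOF} or representable as an @code{unsigned char}.  On
systems where plain @code{char} is signed, passing a @code{char}
directly can therefore hand the function a negative value.  The
following is a minimal sketch of the usual precaution; the helper name
@code{count_alpha} is hypothetical and not part of the library.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the alphabetic characters in a null-terminated string.
   Each char is converted to unsigned char first, so a negative
   char value is never passed to isalpha.  */
size_t
count_alpha (const char *s)
{
  size_t n = 0;
  for (; *s != '\0'; s++)
    if (isalpha ((unsigned char) *s))
      n++;
  return n;
}
```

The cast is only needed when the value comes from a @code{char}; a value
obtained from @code{getc} is already in the accepted range.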
58
59The attributes of any given character can vary between locales.
60@xref{Locales}, for more information on locales.@refill
61
62These functions are declared in the header file @file{ctype.h}.
63@pindex ctype.h
64
65@cindex lower-case character
66@deftypefun int islower (int @var{c})
67@standards{ISO, ctype.h}
68@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
69@c The is* macros call __ctype_b_loc to get the ctype array from the
70@c current locale, and then index it by c.  __ctype_b_loc reads from
71@c thread-local memory the (indirect) pointer to the ctype array, which
72@c may involve one word access to the global locale object, if that's
73@c the active locale for the thread, and the array, being part of the
74@c locale data, is undeletable, so there's no thread-safety issue.  We
75@c might want to mark these with @mtslocale to flag to callers that
76@c changing locales might affect them, even if not these simpler
77@c functions.
Returns true if @var{c} is a lower-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.
80@end deftypefun
81
82@cindex upper-case character
83@deftypefun int isupper (int @var{c})
84@standards{ISO, ctype.h}
85@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an upper-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.
88@end deftypefun
89
90@cindex alphabetic character
91@deftypefun int isalpha (int @var{c})
92@standards{ISO, ctype.h}
93@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
94Returns true if @var{c} is an alphabetic character (a letter).  If
95@code{islower} or @code{isupper} is true of a character, then
96@code{isalpha} is also true.
97
98In some locales, there may be additional characters for which
99@code{isalpha} is true---letters which are neither upper case nor lower
100case.  But in the standard @code{"C"} locale, there are no such
101additional characters.
102@end deftypefun
103
104@cindex digit character
105@cindex decimal digit character
106@deftypefun int isdigit (int @var{c})
107@standards{ISO, ctype.h}
108@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
109Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}).
110@end deftypefun
111
112@cindex alphanumeric character
113@deftypefun int isalnum (int @var{c})
114@standards{ISO, ctype.h}
115@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Returns true if @var{c} is an alphanumeric character (a letter or
digit); in other words, if either @code{isalpha} or @code{isdigit} is
true of a character, then @code{isalnum} is also true.
119@end deftypefun
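
As an illustration of combining @code{isalpha} and @code{isalnum}, the
sketch below checks whether a string looks like a C-style identifier.
The function name @code{is_identifier} and the rule it applies (a letter
or underscore followed by letters, digits, or underscores) are
assumptions made for this example only.

```c
#include <ctype.h>

/* Return 1 if S is a nonempty string consisting of a letter or
   underscore followed by letters, digits, or underscores, and 0
   otherwise.  The result depends on the current LC_CTYPE locale.  */
int
is_identifier (const char *s)
{
  unsigned char first = (unsigned char) *s;

  /* An empty string fails here because '\0' is neither a letter
     nor an underscore.  */
  if (!(isalpha (first) || first == '_'))
    return 0;

  for (s++; *s != '\0'; s++)
    {
      unsigned char c = (unsigned char) *s;
      if (!(isalnum (c) || c == '_'))
        return 0;
    }
  return 1;
}
```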
120
121@cindex hexadecimal digit character
122@deftypefun int isxdigit (int @var{c})
123@standards{ISO, ctype.h}
124@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
125Returns true if @var{c} is a hexadecimal digit.
126Hexadecimal digits include the normal decimal digits @samp{0} through
127@samp{9} and the letters @samp{A} through @samp{F} and
128@samp{a} through @samp{f}.
129@end deftypefun
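
A common use of @code{isxdigit} is converting a hexadecimal digit to its
numeric value.  The following sketch uses a hypothetical helper named
@code{hex_value}, and it assumes that the letters @samp{a} through
@samp{f} have consecutive character codes, which holds for ASCII but is
not guaranteed by @w{ISO C}.

```c
#include <ctype.h>

/* Return the numeric value (0 to 15) of the hexadecimal digit C,
   or -1 if C is not a hexadecimal digit.  C must be EOF or a value
   representable as unsigned char.  */
int
hex_value (int c)
{
  if (!isxdigit (c))
    return -1;
  if (isdigit (c))
    return c - '0';
  /* Assumes 'a' through 'f' are consecutive, as in ASCII.  */
  return tolower (c) - 'a' + 10;
}
```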
130
131@cindex punctuation character
132@deftypefun int ispunct (int @var{c})
133@standards{ISO, ctype.h}
134@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
135Returns true if @var{c} is a punctuation character.
136This means any printing character that is not alphanumeric or a space
137character.
138@end deftypefun
139
140@cindex whitespace character
141@deftypefun int isspace (int @var{c})
142@standards{ISO, ctype.h}
143@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
144Returns true if @var{c} is a @dfn{whitespace} character.  In the standard
145@code{"C"} locale, @code{isspace} returns true for only the standard
146whitespace characters:
147
148@table @code
149@item ' '
150space
151
152@item '\f'
153formfeed
154
155@item '\n'
156newline
157
158@item '\r'
159carriage return
160
161@item '\t'
162horizontal tab
163
164@item '\v'
165vertical tab
166@end table
167@end deftypefun
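
For instance, @code{isspace} can be used to skip leading whitespace in a
string.  This is only a sketch; the helper name @code{skip_space} is
hypothetical.

```c
#include <ctype.h>

/* Return a pointer to the first character of S that is not
   whitespace.  The loop stops at the terminating null character,
   for which isspace returns zero.  */
const char *
skip_space (const char *s)
{
  while (isspace ((unsigned char) *s))
    s++;
  return s;
}
```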
168
169@cindex blank character
170@deftypefun int isblank (int @var{c})
171@standards{ISO, ctype.h}
172@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
173Returns true if @var{c} is a blank character; that is, a space or a tab.
174This function was originally a GNU extension, but was added in @w{ISO C99}.
175@end deftypefun
176
177@cindex graphic character
178@deftypefun int isgraph (int @var{c})
179@standards{ISO, ctype.h}
180@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
181Returns true if @var{c} is a graphic character; that is, a character
182that has a glyph associated with it.  The whitespace characters are not
183considered graphic.
184@end deftypefun
185
186@cindex printing character
187@deftypefun int isprint (int @var{c})
188@standards{ISO, ctype.h}
189@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
190Returns true if @var{c} is a printing character.  Printing characters
191include all the graphic characters, plus the space (@samp{ }) character.
192@end deftypefun
193
194@cindex control character
195@deftypefun int iscntrl (int @var{c})
196@standards{ISO, ctype.h}
197@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
198Returns true if @var{c} is a control character (that is, a character that
199is not a printing character).
200@end deftypefun
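
The printing and control classes are often used to make untrusted text
safe to display.  The sketch below replaces every character that is not
a printing character with @samp{?}; the function name
@code{replace_nonprinting} and the choice of replacement character are
assumptions made for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Replace each non-printing character in S with '?' and return the
   number of characters replaced.  Which characters count as
   printing depends on the current LC_CTYPE locale.  */
size_t
replace_nonprinting (char *s)
{
  size_t replaced = 0;
  for (; *s != '\0'; s++)
    if (!isprint ((unsigned char) *s))
      {
        *s = '?';
        replaced++;
      }
  return replaced;
}
```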
201
202@cindex ASCII character
203@deftypefun int isascii (int @var{c})
204@standards{SVID, ctype.h}
205@standards{BSD, ctype.h}
206@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
207Returns true if @var{c} is a 7-bit @code{unsigned char} value that fits
208into the US/UK ASCII character set.  This function is a BSD extension
209and is also an SVID extension.
210@end deftypefun
211
212@node Case Conversion, Classification of Wide Characters, Classification of Characters, Character Handling
213@section Case Conversion
214@cindex character case conversion
215@cindex case conversion of characters
216@cindex converting case of characters
217
218This section explains the library functions for performing conversions
219such as case mappings on characters.  For example, @code{toupper}
220converts any character to upper case if possible.  If the character
221can't be converted, @code{toupper} returns it unchanged.
222
223These functions take one argument of type @code{int}, which is the
224character to convert, and return the converted character as an
225@code{int}.  If the conversion is not applicable to the argument given,
226the argument is returned unchanged.
227
228@strong{Compatibility Note:} In pre-@w{ISO C} dialects, instead of
229returning the argument unchanged, these functions may fail when the
230argument is not suitable for the conversion.  Thus for portability, you
231may need to write @code{islower(c) ? toupper(c) : c} rather than just
232@code{toupper(c)}.
233
234These functions are declared in the header file @file{ctype.h}.
235@pindex ctype.h
236
237@deftypefun int tolower (int @var{c})
238@standards{ISO, ctype.h}
239@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@c The to* macros/functions call different functions that use different
@c arrays than those of __ctype_b_loc, but the access patterns and
@c thus safety guarantees are the same.
243If @var{c} is an upper-case letter, @code{tolower} returns the corresponding
244lower-case letter.  If @var{c} is not an upper-case letter,
245@var{c} is returned unchanged.
246@end deftypefun
247
248@deftypefun int toupper (int @var{c})
249@standards{ISO, ctype.h}
250@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
251If @var{c} is a lower-case letter, @code{toupper} returns the corresponding
252upper-case letter.  Otherwise @var{c} is returned unchanged.
253@end deftypefun
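
As a short sketch of how these functions are typically applied, the
following converts a string to upper case in place.  The helper name
@code{upcase_string} is hypothetical.  As with the classification
functions, each @code{char} is converted to @code{unsigned char} before
the call.

```c
#include <ctype.h>

/* Convert every lower-case letter in S to upper case, in place.
   Characters without an upper-case equivalent are left unchanged
   because toupper returns them as they are.  */
void
upcase_string (char *s)
{
  for (; *s != '\0'; s++)
    *s = (char) toupper ((unsigned char) *s);
}
```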
254
255@deftypefun int toascii (int @var{c})
256@standards{SVID, ctype.h}
257@standards{BSD, ctype.h}
258@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
259This function converts @var{c} to a 7-bit @code{unsigned char} value
260that fits into the US/UK ASCII character set, by clearing the high-order
261bits.  This function is a BSD extension and is also an SVID extension.
262@end deftypefun
263
264@deftypefun int _tolower (int @var{c})
265@standards{SVID, ctype.h}
266@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
267This is identical to @code{tolower}, and is provided for compatibility
268with the SVID.  @xref{SVID}.@refill
269@end deftypefun
270
271@deftypefun int _toupper (int @var{c})
272@standards{SVID, ctype.h}
273@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
274This is identical to @code{toupper}, and is provided for compatibility
275with the SVID.
276@end deftypefun
277
278
279@node Classification of Wide Characters, Using Wide Char Classes, Case Conversion, Character Handling
280@section Character class determination for wide characters
281
@w{Amendment 1} to @w{ISO C90} defines functions to classify wide
characters.  Although the original @w{ISO C90} standard already defined
the type @code{wchar_t}, it defined no functions operating on it.
285
The design of the classification functions for wide characters is more
general than that of the @code{char} functions.  It allows extensions
to the set of available classifications, beyond those which are always
available.  The POSIX standard specifies how extensions can be made,
and this is already implemented in the @glibcadj{} implementation of
the @code{localedef} program.
292
293The character class functions are normally implemented with bitsets,
294with a bitset per character.  For a given character, the appropriate
295bitset is read from a table and a test is performed as to whether a
296certain bit is set.  Which bit is tested for is determined by the
297class.
298
For the wide character classification functions this is made visible.
There is a type for classification values, a function to retrieve such
a value for a given class, and a function to test whether a given
character is in this class, using the classification value.  The normal
character classification functions, as used for @code{char} objects,
can be defined on top of this.
305
306@deftp {Data type} wctype_t
307@standards{ISO, wctype.h}
The @code{wctype_t} type can hold a value which represents a character
class.  The only defined way to generate such a value is by using the
@code{wctype} function.
311
312@pindex wctype.h
313This type is defined in @file{wctype.h}.
314@end deftp
315
316@deftypefun wctype_t wctype (const char *@var{property})
317@standards{ISO, wctype.h}
318@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
319@c Although the source code of wctype contains multiple references to
320@c the locale, that could each reference different locale_data objects
321@c should the global locale object change while active, the compiler can
322@c and does combine them all into a single dereference that resolves
323@c once to the LCTYPE locale object used throughout the function, so it
324@c is safe in (optimized) practice, if not in theory, even when the
325@c locale changes.  Ideally we'd explicitly save the resolved
326@c locale_data object to make it visibly safe instead of safe only under
327@c compiler optimizations, but given the decision that setlocale is
328@c MT-Unsafe, all this would afford us would be the ability to not mark
329@c this function with @mtslocale.
@code{wctype} returns a value representing a class of wide
characters which is identified by the string @var{property}.  Besides
the standard properties, each locale can define its own.  If
no property with the given name is known for the current locale
selected for the @code{LC_CTYPE} category, the function returns zero.
335
336@noindent
337The properties known in every locale are:
338
339@multitable @columnfractions .25 .25 .25 .25
340@item
341@code{"alnum"} @tab @code{"alpha"} @tab @code{"cntrl"} @tab @code{"digit"}
342@item
343@code{"graph"} @tab @code{"lower"} @tab @code{"print"} @tab @code{"punct"}
344@item
345@code{"space"} @tab @code{"upper"} @tab @code{"xdigit"}
346@end multitable
347
348@pindex wctype.h
349This function is declared in @file{wctype.h}.
350@end deftypefun
351
To test whether a character belongs to one of the non-standard classes,
the @w{ISO C} standard defines a completely new function.
354
355@deftypefun int iswctype (wint_t @var{wc}, wctype_t @var{desc})
356@standards{ISO, wctype.h}
357@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
358@c The compressed lookup table returned by wctype is read-only.
This function returns a nonzero value if @var{wc} is in the character
class specified by @var{desc}.  @var{desc} must have been returned by a
previous successful call to @code{wctype}.
362
363@pindex wctype.h
364This function is declared in @file{wctype.h}.
365@end deftypefun
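
The two functions are meant to be used together.  The following minimal
sketch looks up a class by name and tests one wide character against it;
the wrapper name @code{wc_in_class} is hypothetical.  It checks the
result of @code{wctype} before using it, since a zero descriptor means
the class is unknown in the current locale.

```c
#include <wctype.h>

/* Return 1 if WC belongs to the character class called NAME in the
   current LC_CTYPE locale, and 0 if it does not or if no class with
   that name is known.  */
int
wc_in_class (wint_t wc, const char *name)
{
  wctype_t desc = wctype (name);
  if (desc == 0)
    return 0;                   /* Unknown class name.  */
  return iswctype (wc, desc) != 0;
}
```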
366
For convenience, the commonly used classification functions are defined
in the C library, so there is no need to use @code{wctype} if the
property string is one of the known character classes.  In some
situations it is desirable to construct the property strings at run
time, and then it is important that @code{wctype} can also handle the
standard classes.
373
374@cindex alphanumeric character
375@deftypefun int iswalnum (wint_t @var{wc})
376@standards{ISO, wctype.h}
377@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
378@c The implicit wctype call in the isw* functions is actually an
379@c optimized version because the category has a known offset, but the
380@c wctype is equally safe when optimized, unsafe with changing locales
381@c if not optimized (thus @mtslocale).  Since it's not a macro, we
382@c always optimize, and the locale can't change in any MT-Safe way, it's
383@c fine.  The test whether wc is ASCII to use the non-wide is*
384@c macro/function doesn't bring any other safety issues: the test does
385@c not depend on the locale, and each path after the decision resolves
386@c the locale object only once.
This function returns a nonzero value if @var{wc} is an alphanumeric
character (a letter or digit); in other words, if either @code{iswalpha}
or @code{iswdigit} is true of a character, then @code{iswalnum} is also
true.
391
392@noindent
393This function can be implemented using
394
395@smallexample
396iswctype (wc, wctype ("alnum"))
397@end smallexample
398
399@pindex wctype.h
400It is declared in @file{wctype.h}.
401@end deftypefun
402
403@cindex alphabetic character
404@deftypefun int iswalpha (wint_t @var{wc})
405@standards{ISO, wctype.h}
406@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
407Returns true if @var{wc} is an alphabetic character (a letter).  If
408@code{iswlower} or @code{iswupper} is true of a character, then
409@code{iswalpha} is also true.
410
411In some locales, there may be additional characters for which
412@code{iswalpha} is true---letters which are neither upper case nor lower
413case.  But in the standard @code{"C"} locale, there are no such
414additional characters.
415
416@noindent
417This function can be implemented using
418
419@smallexample
420iswctype (wc, wctype ("alpha"))
421@end smallexample
422
423@pindex wctype.h
424It is declared in @file{wctype.h}.
425@end deftypefun
426
427@cindex control character
428@deftypefun int iswcntrl (wint_t @var{wc})
429@standards{ISO, wctype.h}
430@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
431Returns true if @var{wc} is a control character (that is, a character that
432is not a printing character).
433
434@noindent
435This function can be implemented using
436
437@smallexample
438iswctype (wc, wctype ("cntrl"))
439@end smallexample
440
441@pindex wctype.h
442It is declared in @file{wctype.h}.
443@end deftypefun
444
445@cindex digit character
446@deftypefun int iswdigit (wint_t @var{wc})
447@standards{ISO, wctype.h}
448@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a digit (e.g., @samp{0} through @samp{9}).
Please note that this function returns a nonzero value not only for
@emph{decimal} digits, but for all kinds of digits.  A consequence is
that code like the following will @strong{not} work unconditionally for
wide characters:
454
455@smallexample
456n = 0;
457while (iswdigit (*wc))
458  @{
459    n *= 10;
460    n += *wc++ - L'0';
461  @}
462@end smallexample
463
464@noindent
465This function can be implemented using
466
467@smallexample
468iswctype (wc, wctype ("digit"))
469@end smallexample
470
471@pindex wctype.h
472It is declared in @file{wctype.h}.
473@end deftypefun
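
One way to sidestep the problem described for @code{iswdigit} is to
compare against the range of decimal digits explicitly.  The sketch
below does this; the function name @code{parse_wide_decimal} is
hypothetical, the code assumes that @code{L'0'} through @code{L'9'} have
consecutive values, and it makes no attempt to detect overflow.

```c
#include <wchar.h>

/* Convert the leading run of decimal digits in WS to a number.
   Only L'0' through L'9' are accepted, so other kinds of digits
   end the conversion instead of producing a wrong value.
   Overflow is not detected.  */
unsigned long
parse_wide_decimal (const wchar_t *ws)
{
  unsigned long n = 0;
  while (*ws >= L'0' && *ws <= L'9')
    {
      n = n * 10 + (unsigned long) (*ws - L'0');
      ws++;
    }
  return n;
}
```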
474
475@cindex graphic character
476@deftypefun int iswgraph (wint_t @var{wc})
477@standards{ISO, wctype.h}
478@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
479Returns true if @var{wc} is a graphic character; that is, a character
480that has a glyph associated with it.  The whitespace characters are not
481considered graphic.
482
483@noindent
484This function can be implemented using
485
486@smallexample
487iswctype (wc, wctype ("graph"))
488@end smallexample
489
490@pindex wctype.h
491It is declared in @file{wctype.h}.
492@end deftypefun
493
494@cindex lower-case character
495@deftypefun int iswlower (wint_t @var{wc})
@standards{ISO, wctype.h}
497@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a lower-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.
500
501@noindent
502This function can be implemented using
503
504@smallexample
505iswctype (wc, wctype ("lower"))
506@end smallexample
507
508@pindex wctype.h
509It is declared in @file{wctype.h}.
510@end deftypefun
511
512@cindex printing character
513@deftypefun int iswprint (wint_t @var{wc})
514@standards{ISO, wctype.h}
515@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
516Returns true if @var{wc} is a printing character.  Printing characters
517include all the graphic characters, plus the space (@samp{ }) character.
518
519@noindent
520This function can be implemented using
521
522@smallexample
523iswctype (wc, wctype ("print"))
524@end smallexample
525
526@pindex wctype.h
527It is declared in @file{wctype.h}.
528@end deftypefun
529
530@cindex punctuation character
531@deftypefun int iswpunct (wint_t @var{wc})
532@standards{ISO, wctype.h}
533@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
534Returns true if @var{wc} is a punctuation character.
535This means any printing character that is not alphanumeric or a space
536character.
537
538@noindent
539This function can be implemented using
540
541@smallexample
542iswctype (wc, wctype ("punct"))
543@end smallexample
544
545@pindex wctype.h
546It is declared in @file{wctype.h}.
547@end deftypefun
548
549@cindex whitespace character
550@deftypefun int iswspace (wint_t @var{wc})
551@standards{ISO, wctype.h}
552@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
553Returns true if @var{wc} is a @dfn{whitespace} character.  In the standard
554@code{"C"} locale, @code{iswspace} returns true for only the standard
555whitespace characters:
556
557@table @code
558@item L' '
559space
560
561@item L'\f'
562formfeed
563
564@item L'\n'
565newline
566
567@item L'\r'
568carriage return
569
570@item L'\t'
571horizontal tab
572
573@item L'\v'
574vertical tab
575@end table
576
577@noindent
578This function can be implemented using
579
580@smallexample
581iswctype (wc, wctype ("space"))
582@end smallexample
583
584@pindex wctype.h
585It is declared in @file{wctype.h}.
586@end deftypefun
587
588@cindex upper-case character
589@deftypefun int iswupper (wint_t @var{wc})
590@standards{ISO, wctype.h}
591@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is an upper-case letter.  The letter need not be
from the Latin alphabet; any representable alphabet is valid.
594
595@noindent
596This function can be implemented using
597
598@smallexample
599iswctype (wc, wctype ("upper"))
600@end smallexample
601
602@pindex wctype.h
603It is declared in @file{wctype.h}.
604@end deftypefun
605
606@cindex hexadecimal digit character
607@deftypefun int iswxdigit (wint_t @var{wc})
608@standards{ISO, wctype.h}
609@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
610Returns true if @var{wc} is a hexadecimal digit.
611Hexadecimal digits include the normal decimal digits @samp{0} through
612@samp{9} and the letters @samp{A} through @samp{F} and
613@samp{a} through @samp{f}.
614
615@noindent
616This function can be implemented using
617
618@smallexample
619iswctype (wc, wctype ("xdigit"))
620@end smallexample
621
622@pindex wctype.h
623It is declared in @file{wctype.h}.
624@end deftypefun
625
@Theglibc{} also provides a function which was not part of
@w{Amendment 1} to @w{ISO C90} but which has a counterpart for
single-byte characters as well.
629
630@cindex blank character
631@deftypefun int iswblank (wint_t @var{wc})
632@standards{ISO, wctype.h}
633@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
Returns true if @var{wc} is a blank character; that is, a space or a tab.
This function was originally a GNU extension, but was added in @w{ISO C99}.
It is declared in @file{wctype.h}.
637@end deftypefun
638
639@node Using Wide Char Classes, Wide Character Case Conversion, Classification of Wide Characters, Character Handling
640@section Notes on using the wide character classes
641
The first note is probably not astonishing but is still occasionally a
cause of problems.  The @code{isw@var{XXX}} functions can be implemented
using macros and in fact, @theglibc{} does this.  They are still
available as real functions, but when the @file{wctype.h} header is
included the macros will be used.  This is the same as for the
@code{char} type versions of these functions.
648
649The second note covers something new.  It can be best illustrated by a
650(real-world) example.  The first piece of code is an excerpt from the
651original code.  It is truncated a bit but the intention should be clear.
652
653@smallexample
654int
655is_in_class (int c, const char *class)
656@{
657  if (strcmp (class, "alnum") == 0)
658    return isalnum (c);
659  if (strcmp (class, "alpha") == 0)
660    return isalpha (c);
661  if (strcmp (class, "cntrl") == 0)
662    return iscntrl (c);
663  @dots{}
664  return 0;
665@}
666@end smallexample
667
Now, with the @code{wctype} and @code{iswctype} functions you can avoid
the @code{if} cascades, but rewriting the code as follows is wrong:
670
671@smallexample
672int
673is_in_class (int c, const char *class)
674@{
675  wctype_t desc = wctype (class);
676  return desc ? iswctype ((wint_t) c, desc) : 0;
677@}
678@end smallexample
679
680The problem is that it is not guaranteed that the wide character
681representation of a single-byte character can be found using casting.
682In fact, usually this fails miserably.  The correct solution to this
683problem is to write the code as follows:
684
685@smallexample
686int
687is_in_class (int c, const char *class)
688@{
689  wctype_t desc = wctype (class);
690  return desc ? iswctype (btowc (c), desc) : 0;
691@}
692@end smallexample
693
694@xref{Converting a Character}, for more information on @code{btowc}.
Note that this change probably does not improve the performance
of the program much, since the @code{wctype} function still has to make
the string comparisons.  The real benefit appears if the
@code{is_in_class} function is called more than once for the
same class name.  In this case the variable @var{desc} could be computed
once and reused for all the calls.  Therefore the above form of the
701function is probably not the final one.
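
One possible way to reuse the descriptor, sketched here under the
assumption that a whole string is tested against a single class, is to
call @code{wctype} once before the loop.  The function name
@code{count_in_class} is hypothetical and is not taken from the original
program.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Count the single-byte characters of S that belong to the class
   called CLASS_NAME.  The descriptor is looked up once and reused
   for every character.  Returns 0 if the class is unknown.  */
size_t
count_in_class (const char *s, const char *class_name)
{
  wctype_t desc = wctype (class_name);
  size_t n = 0;

  if (desc == 0)
    return 0;

  for (; *s != '\0'; s++)
    /* btowc converts the byte to its wide character form; it
       returns WEOF for bytes that are not valid characters, and
       iswctype returns zero for WEOF.  */
    if (iswctype (btowc ((unsigned char) *s), desc))
      n++;
  return n;
}
```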
702
703
704@node Wide Character Case Conversion, , Using Wide Char Classes, Character Handling
@section Mapping of wide characters
706
The case mapping functions are also generalized by the @w{ISO C}
standard.  Instead of just allowing the two standard mappings, a
locale can contain others.  Again, the @code{localedef} program
already supports generating such locale data files.
711
712@deftp {Data Type} wctrans_t
713@standards{ISO, wctype.h}
714This data type is defined as a scalar type which can hold a value
715representing the locale-dependent character mapping.  There is no way to
716construct such a value apart from using the return value of the
717@code{wctrans} function.
718
719@pindex wctype.h
720@noindent
721This type is defined in @file{wctype.h}.
722@end deftp
723
724@deftypefun wctrans_t wctrans (const char *@var{property})
725@standards{ISO, wctype.h}
726@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
727@c Similar implementation, same caveats as wctype.
The @code{wctrans} function has to be used to find out whether a named
mapping is defined in the current locale selected for the
@code{LC_CTYPE} category.  If the returned value is nonzero, you can use
it afterwards in calls to @code{towctrans}.  If the return value is
zero, no such mapping is known in the current locale.
733
Besides locale-specific mappings, there are two mappings which are
guaranteed to be available in every locale:
736
737@multitable @columnfractions .5 .5
738@item
739@code{"tolower"} @tab @code{"toupper"}
740@end multitable
741
742@pindex wctype.h
743@noindent
This function is declared in @file{wctype.h}.
745@end deftypefun
746
747@deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc})
748@standards{ISO, wctype.h}
749@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
750@c Same caveats as iswctype.
751@code{towctrans} maps the input character @var{wc}
752according to the rules of the mapping for which @var{desc} is a
753descriptor, and returns the value it finds.  @var{desc} must be
754obtained by a successful call to @code{wctrans}.
755
756@pindex wctype.h
757@noindent
758This function is declared in @file{wctype.h}.
759@end deftypefun
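
As a sketch of how @code{wctrans} and @code{towctrans} fit together, the
following applies a named mapping to every character of a wide string.
The function name @code{map_wide_string} and its return convention are
assumptions made for this example.

```c
#include <wchar.h>
#include <wctype.h>

/* Apply the mapping called MAPPING to each character of WS, in
   place.  Return 0 on success, or -1 if the mapping is not known
   in the current LC_CTYPE locale (WS is then left untouched).  */
int
map_wide_string (wchar_t *ws, const char *mapping)
{
  wctrans_t desc = wctrans (mapping);
  if (desc == 0)
    return -1;

  for (; *ws != L'\0'; ws++)
    *ws = (wchar_t) towctrans ((wint_t) *ws, desc);
  return 0;
}
```

Looking up the descriptor once and reusing it for the whole string
avoids repeating the name lookup for each character.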
760
761For the generally available mappings, the @w{ISO C} standard defines
762convenient shortcuts so that it is not necessary to call @code{wctrans}
763for them.
764
765@deftypefun wint_t towlower (wint_t @var{wc})
766@standards{ISO, wctype.h}
767@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
768@c Same caveats as iswalnum, just using a wctrans rather than a wctype
769@c table.
770If @var{wc} is an upper-case letter, @code{towlower} returns the corresponding
771lower-case letter.  If @var{wc} is not an upper-case letter,
772@var{wc} is returned unchanged.
773
774@noindent
775@code{towlower} can be implemented using
776
777@smallexample
778towctrans (wc, wctrans ("tolower"))
779@end smallexample
780
781@pindex wctype.h
782@noindent
783This function is declared in @file{wctype.h}.
784@end deftypefun
785
786@deftypefun wint_t towupper (wint_t @var{wc})
787@standards{ISO, wctype.h}
788@safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
789If @var{wc} is a lower-case letter, @code{towupper} returns the corresponding
790upper-case letter.  Otherwise @var{wc} is returned unchanged.
791
792@noindent
793@code{towupper} can be implemented using
794
795@smallexample
796towctrans (wc, wctrans ("toupper"))
797@end smallexample
798
799@pindex wctype.h
800@noindent
801This function is declared in @file{wctype.h}.
802@end deftypefun
803
804The same warnings given in the last section for the use of the wide
805character classification functions apply here.  It is not possible to
806simply cast a @code{char} type value to a @code{wint_t} and use it as an
807argument to @code{towctrans} calls.
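
A minimal sketch of the safe approach follows: the single-byte character
is first converted with @code{btowc} and only then mapped.  The helper
name @code{byte_to_upper_wide} is hypothetical.

```c
#include <wchar.h>
#include <wctype.h>

/* Convert the single-byte character C to its wide character form
   and map it to upper case.  C must be EOF or a value
   representable as unsigned char.  Returns WEOF if C has no wide
   character equivalent.  */
wint_t
byte_to_upper_wide (int c)
{
  wint_t wc = btowc (c);
  if (wc == WEOF)
    return WEOF;
  return towupper (wc);
}
```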
808