mbrtowc — convert a multibyte sequence to a wide character
#include <wchar.h>
size_t mbrtowc(wchar_t *pwc, const char *s, size_t n, mbstate_t *ps);
The main case for this function is when s is not NULL and pwc is not
NULL. In this case, the mbrtowc() function inspects at most n bytes of
the multibyte string starting at s, extracts the next complete
multibyte character, converts it to a wide character and stores it at
*pwc. It updates the shift state *ps. If the converted wide character
is not L'\0' (the null wide character), it returns the number of bytes
that were consumed from s. If the converted wide character is L'\0', it
resets the shift state *ps to the initial state and returns 0.
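The main case can be illustrated with a minimal sketch that walks a
null-terminated multibyte string one character at a time. The helper
name count_chars() is hypothetical and not part of any library. The
sketch assumes the string is encoded for the current locale and treats
both error returns as failure.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in the null-terminated
   multibyte string s, or return (size_t) -1 if the string contains an
   invalid or truncated sequence. */
size_t count_chars(const char *s)
{
    mbstate_t st;
    size_t remaining, count = 0;

    assert(s != NULL);
    memset(&st, 0, sizeof(st));     /* start in the initial state */
    remaining = strlen(s) + 1;      /* include the terminating null byte */

    for (;;) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, s, remaining, &st);

        if (r == (size_t) -1 || r == (size_t) -2)
            return (size_t) -1;     /* invalid or incomplete sequence */
        if (r == 0)
            return count;           /* converted L'\0': end of string */
        s += r;                     /* skip the bytes just consumed */
        remaining -= r;
        count++;
    }
}
```

Passing strlen(s) + 1 as n lets mbrtowc() see the terminating null
byte, so the loop ends on the 0 return instead of needing a separate
length check.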
If the n bytes starting at s do not contain a complete multibyte
character, mbrtowc() returns (size_t) -2. This can happen even if
n >= MB_CUR_MAX, if the multibyte string contains redundant shift
sequences.
If the multibyte string starting at s contains an invalid multibyte
sequence before the next complete character, mbrtowc() returns
(size_t) -1 and sets errno to EILSEQ. In this case, the effects on *ps
are undefined.
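The two special return values matter most when input arrives in pieces,
because a character may be split across two buffers. The following is a
sketch only. The helpers feed_chunk() and count_two_chunks() are
hypothetical names. The sketch assumes that a null character occupies
exactly one byte, and it relies on the ISO C rule that a (size_t) -2
return means all n bytes were absorbed into *ps.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode buf[0..len) and add the number of
   completed characters to *count.  A character split across two calls
   is carried in *st.  Returns 0 on success, -1 on an invalid sequence. */
int feed_chunk(mbstate_t *st, const char *buf, size_t len, size_t *count)
{
    assert(st != NULL && buf != NULL && count != NULL);

    while (len > 0) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, buf, len, st);

        if (r == (size_t) -1)
            return -1;      /* EILSEQ: *st is now undefined, stop here */
        if (r == (size_t) -2)
            return 0;       /* all len bytes absorbed into *st; the
                               character continues in the next chunk */
        if (r == 0)
            r = 1;          /* assumption: a null character is one byte */
        buf += r;
        len -= r;
        (*count)++;
    }
    return 0;
}

/* Usage sketch: count the characters in two consecutive chunks a and b,
   or return (size_t) -1 if the input is invalid or ends mid-character. */
size_t count_two_chunks(const char *a, const char *b)
{
    mbstate_t st;
    size_t count = 0;

    memset(&st, 0, sizeof(st));     /* start in the initial state */
    if (feed_chunk(&st, a, strlen(a), &count) != 0 ||
        feed_chunk(&st, b, strlen(b), &count) != 0)
        return (size_t) -1;
    if (!mbsinit(&st))
        return (size_t) -1;         /* input ended inside a character */
    return count;
}
```

After a (size_t) -1 return the sketch does not touch *st again, since
its contents are undefined. A caller that wants to continue would have
to reinitialize the state first.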
A different case is when s is not NULL but pwc is NULL. In this case
the mbrtowc() function behaves as above, except that it does not store
the converted wide character in memory.
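Passing NULL for pwc is useful when only the byte length of the next
character is needed, much as with mbrlen(3). A minimal sketch follows.
The helper name first_char_len() is hypothetical, and the string is
assumed to be null-terminated and to start in the initial shift state.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return the number of bytes occupied by the
   first character of s, 0 if s begins with the null character, or
   (size_t) -1 / (size_t) -2 exactly as mbrtowc() reports them. */
size_t first_char_len(const char *s)
{
    mbstate_t st;

    assert(s != NULL);
    memset(&st, 0, sizeof(st));     /* start in the initial state */

    /* pwc is NULL: the character is parsed but not stored anywhere. */
    return mbrtowc(NULL, s, strlen(s) + 1, &st);
}
```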
A third case is when s is NULL. In this case, pwc and n are ignored. If
the conversion state represented by *ps denotes an incomplete multibyte
character conversion, the mbrtowc() function returns (size_t) -1, sets
errno to EILSEQ, and leaves *ps in an undefined state. Otherwise, the
mbrtowc() function puts *ps in the initial state and returns 0.
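The s == NULL form can serve as an end-of-input check: it reports
whether decoding stopped in the middle of a character. The sketch below
shows one way to use it. The helper name finish_decoding() is
hypothetical, and zeroing the state after an error is this sketch's own
choice for making the object reusable.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: call at end of input.  Returns 1 if *ps was at
   a character boundary (it is now in the initial state), or 0 if the
   input ended inside a multibyte character. */
int finish_decoding(mbstate_t *ps)
{
    assert(ps != NULL);

    /* With s == NULL, pwc and n are ignored. */
    if (mbrtowc(NULL, NULL, 0, ps) == (size_t) -1) {
        /* *ps is undefined after the error; zero it before reuse. */
        memset(ps, 0, sizeof(*ps));
        return 0;
    }
    return 1;
}
```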
In all of the above cases, if ps is a NULL pointer, a static anonymous
state known only to the mbrtowc() function is used instead. Otherwise,
*ps must be a valid mbstate_t object. An mbstate_t object a can be
initialized to the initial state by zeroing it, for example using

    memset(&a, 0, sizeof(a));
The mbrtowc() function returns the number of bytes parsed from the
multibyte sequence starting at s if a non-L'\0' wide character was
recognized. It returns 0 if a L'\0' wide character was recognized. It
returns (size_t) -1 and sets errno to EILSEQ if an invalid multibyte
sequence was encountered. It returns (size_t) -2 if it couldn't parse a
complete multibyte character, meaning that n should be increased.
This page is part of release 3.52 of the Linux man-pages project. A
description of the project, and information about reporting bugs, can
be found at http://www.kernel.org/doc/man-pages/.
Copyright (c) Bruno Haible <haible@clisp.cons.org>

This is free documentation; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

References consulted:
    GNU glibc-2 source code and manual
    Dinkumware C library reference http://www.dinkumware.com/
    OpenGroup's Single UNIX specification http://www.UNIX-systems.org/online.html
    ISO/IEC 9899:1999