qemu with hax to log dma reads & writes jcs.org/2018/11/12/vfio

Merge remote-tracking branch 'remotes/armbru/tags/pull-qobject-2018-08-24' into staging

QObject patches for 2018-08-24

# gpg: Signature made Fri 24 Aug 2018 20:28:53 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qobject-2018-08-24: (58 commits)
json: Update references to RFC 7159 to RFC 8259
json: Support %% in JSON strings when interpolating
json: Improve safety of qobject_from_jsonf_nofail() & friends
json: Keep interpolation state in JSONParserContext
tests/drive_del-test: Fix harmless JSON interpolation bug
json: Clean up headers
qobject: Drop superfluous includes of qemu-common.h
json: Make JSONToken opaque outside json-parser.c
json: Unbox tokens queue in JSONMessageParser
json: Streamline json_message_process_token()
json: Enforce token count and size limits more tightly
qjson: Have qobject_from_json() & friends reject empty and blank
json: Assert json_parser_parse() consumes all tokens on success
json: Fix streamer not to ignore trailing unterminated structures
json: Fix latent parser aborts at end of input
qjson: Fix qobject_from_json() & friends for multiple values
json: Improve names of lexer states related to numbers
json: Replace %I64d, %I64u by %PRId64, %PRIu64
json: Leave rejecting invalid interpolation to parser
json: Pass lexical errors and limit violations to callback
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

+1469 -1326
+1
MAINTAINERS
@@ -1715,6 +1715,7 @@
 F: docs/devel/*qmp-*
 F: scripts/qmp/
 F: tests/qmp-test.c
+F: tests/qmp-cmd-test.c
 T: git git://repo.or.cz/qemu/armbru.git qapi-next
 
 qtest
-5
block.c
@@ -1478,11 +1478,6 @@
 
     options_obj = qobject_from_json(filename, errp);
     if (!options_obj) {
-        /* Work around qobject_from_json() lossage TODO fix that */
-        if (errp && !*errp) {
-            error_setg(errp, "Could not parse the JSON options");
-            return NULL;
-        }
         error_prepend(errp, "Could not parse the JSON options: ");
         return NULL;
     }
+28 -14
docs/interop/qmp-spec.txt
@@ -20,9 +20,9 @@
 2. Protocol Specification
 =========================
 
-This section details the protocol format. For the purpose of this document
-"Client" is any application which is using QMP to communicate with QEMU and
-"Server" is QEMU itself.
+This section details the protocol format. For the purpose of this
+document, "Server" is either QEMU or the QEMU Guest Agent, and
+"Client" is any application communicating with it via QMP.
 
 JSON data structures, when mentioned in this document, are always in the
 following format:
@@ -34,9 +34,8 @@
 
     http://www.ietf.org/rfc/rfc7159.txt
 
-The protocol is always encoded in UTF-8 except for synchronization
-bytes (documented below); although thanks to json-string escape
-sequences, the server will reply using only the strict ASCII subset.
+The server expects its input to be encoded in UTF-8, and sends its
+output encoded in ASCII.
 
 For convenience, json-object members mentioned in this document will
 be in a certain order. However, in real protocol usage they can be in
@@ -215,16 +214,31 @@
 dropped, and the last one is delayed. "Similar" normally means same
 event type. See qmp-events.txt for details.
 
-2.6 QGA Synchronization
+2.6 Forcing the JSON parser into known-good state
+-------------------------------------------------
+
+Incomplete or invalid input can leave the server's JSON parser in a
+state where it can't parse additional commands. To get it back into
+known-good state, the client should provoke a lexical error.
+
+The cleanest way to do that is sending an ASCII control character
+other than '\t' (horizontal tab), '\r' (carriage return), or '\n' (new
+line).
+
+Sadly, older versions of QEMU can fail to flag this as an error. If a
+client needs to deal with them, it should send a 0xFF byte.
+
+2.7 QGA Synchronization
 -----------------------
 
-When using QGA, an additional synchronization feature is built into
-the protocol. If the Client sends a raw 0xFF sentinel byte (not valid
-JSON), then the Server will reset its state and discard all pending
-data prior to the sentinel. Conversely, if the Client makes use of
-the 'guest-sync-delimited' command, the Server will send a raw 0xFF
-sentinel byte prior to its response, to aid the Client in discarding
-any data prior to the sentinel.
+When a client connects to QGA over a transport lacking proper
+connection semantics such as virtio-serial, QGA may have read partial
+input from a previous client. The client needs to force QGA's parser
+into known-good state using the previous section's technique.
+Moreover, the client may receive output a previous client didn't read.
+To help with skipping that output, QGA provides the
+'guest-sync-delimited' command. Refer to its documentation for
+details.
 
 
 3. QMP Examples
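On the client side, the resynchronization described in sections 2.6 and 2.7 of the spec text above could look roughly like the following. This is a minimal sketch, not code from this series: `qga_build_resync()` and `qga_skip_stale()` are hypothetical helper names, and the framing assumes that `guest-sync-delimited` takes an integer `id` argument and that QGA prefixes its reply with a raw 0xFF byte, as the removed spec paragraph describes.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: fill @buf with a 0xFF byte (provokes a lexical
 * error, so the server's parser returns to a known-good state),
 * followed by a guest-sync-delimited request.
 * Returns the number of bytes written, or 0 if @cap is too small.
 */
static size_t qga_build_resync(unsigned char *buf, size_t cap, long id)
{
    int n;

    if (cap < 2) {
        return 0;
    }
    buf[0] = 0xFF;
    n = snprintf((char *)buf + 1, cap - 1,
                 "{\"execute\":\"guest-sync-delimited\","
                 "\"arguments\":{\"id\":%ld}}\n", id);
    if (n < 0 || (size_t)n >= cap - 1) {
        return 0;               /* request did not fit */
    }
    return 1 + (size_t)n;
}

/*
 * Hypothetical helper: skip output a previous client left unread.
 * Returns a pointer just past the last 0xFF sentinel in @in, or NULL
 * if no sentinel has arrived yet.
 */
static const unsigned char *qga_skip_stale(const unsigned char *in,
                                           size_t len)
{
    size_t i;

    for (i = len; i > 0; i--) {
        if (in[i - 1] == 0xFF) {
            return in + i;
        }
    }
    return NULL;
}
```

A client would still need to check that the `id` echoed in the reply matches the one it sent, since an earlier client may have issued its own sync request.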
-56
include/qapi/qmp/json-lexer.h
@@ -1,56 +0,0 @@
-/*
- * JSON lexer
- *
- * Copyright IBM, Corp. 2009
- *
- * Authors:
- *  Anthony Liguori   <aliguori@us.ibm.com>
- *
- * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
- * See the COPYING.LIB file in the top-level directory.
- *
- */
-
-#ifndef QEMU_JSON_LEXER_H
-#define QEMU_JSON_LEXER_H
-
-
-typedef enum json_token_type {
-    JSON_MIN = 100,
-    JSON_LCURLY = JSON_MIN,
-    JSON_RCURLY,
-    JSON_LSQUARE,
-    JSON_RSQUARE,
-    JSON_COLON,
-    JSON_COMMA,
-    JSON_INTEGER,
-    JSON_FLOAT,
-    JSON_KEYWORD,
-    JSON_STRING,
-    JSON_ESCAPE,
-    JSON_SKIP,
-    JSON_ERROR,
-} JSONTokenType;
-
-typedef struct JSONLexer JSONLexer;
-
-typedef void (JSONLexerEmitter)(JSONLexer *, GString *,
-                                JSONTokenType, int x, int y);
-
-struct JSONLexer
-{
-    JSONLexerEmitter *emit;
-    int state;
-    GString *token;
-    int x, y;
-};
-
-void json_lexer_init(JSONLexer *lexer, JSONLexerEmitter func);
-
-int json_lexer_feed(JSONLexer *lexer, const char *buffer, size_t size);
-
-int json_lexer_flush(JSONLexer *lexer);
-
-void json_lexer_destroy(JSONLexer *lexer);
-
-#endif
+30 -6
include/qapi/qmp/json-parser.h
@@ -1,5 +1,5 @@
 /*
- * JSON Parser
+ * JSON Parser
  *
  * Copyright IBM, Corp. 2009
  *
@@ -11,12 +11,36 @@
  *
  */
 
-#ifndef QEMU_JSON_PARSER_H
-#define QEMU_JSON_PARSER_H
+#ifndef QAPI_QMP_JSON_PARSER_H
+#define QAPI_QMP_JSON_PARSER_H
 
-#include "qemu-common.h"
+typedef struct JSONLexer {
+    int start_state, state;
+    GString *token;
+    int x, y;
+} JSONLexer;
 
-QObject *json_parser_parse(GQueue *tokens, va_list *ap);
-QObject *json_parser_parse_err(GQueue *tokens, va_list *ap, Error **errp);
+typedef struct JSONMessageParser {
+    void (*emit)(void *opaque, QObject *json, Error *err);
+    void *opaque;
+    va_list *ap;
+    JSONLexer lexer;
+    int brace_count;
+    int bracket_count;
+    GQueue tokens;
+    uint64_t token_size;
+} JSONMessageParser;
+
+void json_message_parser_init(JSONMessageParser *parser,
+                              void (*emit)(void *opaque, QObject *json,
+                                           Error *err),
+                              void *opaque, va_list *ap);
+
+void json_message_parser_feed(JSONMessageParser *parser,
+                              const char *buffer, size_t size);
+
+void json_message_parser_flush(JSONMessageParser *parser);
+
+void json_message_parser_destroy(JSONMessageParser *parser);
 
 #endif
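The new `emit` signature in the header above hands each parsed value, or the parse error, to the caller together with an opaque pointer, so callers no longer need `container_of()` to find their own state. The contract can be sketched with stand-in types as follows. Everything here is illustrative: `QObject` and `Error` are placeholder structs rather than QEMU's real types, and `Client`, `on_message()` and `deliver()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder types standing in for QEMU's QObject and Error */
typedef struct QObject { int dummy; } QObject;
typedef struct Error { const char *msg; } Error;

typedef void (*EmitFn)(void *opaque, QObject *json, Error *err);

/* Hypothetical per-connection state, handed back through @opaque */
typedef struct Client {
    int values_seen;
    int errors_seen;
} Client;

/* Callback in the style of handle_qmp_command() and process_event() */
static void on_message(void *opaque, QObject *json, Error *err)
{
    Client *c = opaque;

    /* Exactly one of @json and @err is non-NULL */
    assert(!json != !err);
    if (err) {
        c->errors_seen++;
    } else {
        c->values_seen++;
    }
}

/* Hypothetical stand-in for the message parser delivering a result */
static void deliver(EmitFn emit, void *opaque, QObject *json, Error *err)
{
    emit(opaque, json, err);
}
```

Passing state through `opaque` keeps the parser independent of the struct layout of its users, which is what lets `monitor.c` and `qga/main.c` drop their `container_of()` lookups in this series.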
-46
include/qapi/qmp/json-streamer.h
@@ -1,46 +0,0 @@
-/*
- * JSON streaming support
- *
- * Copyright IBM, Corp. 2009
- *
- * Authors:
- *  Anthony Liguori   <aliguori@us.ibm.com>
- *
- * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
- * See the COPYING.LIB file in the top-level directory.
- *
- */
-
-#ifndef QEMU_JSON_STREAMER_H
-#define QEMU_JSON_STREAMER_H
-
-#include "qapi/qmp/json-lexer.h"
-
-typedef struct JSONToken {
-    int type;
-    int x;
-    int y;
-    char str[];
-} JSONToken;
-
-typedef struct JSONMessageParser
-{
-    void (*emit)(struct JSONMessageParser *parser, GQueue *tokens);
-    JSONLexer lexer;
-    int brace_count;
-    int bracket_count;
-    GQueue *tokens;
-    uint64_t token_size;
-} JSONMessageParser;
-
-void json_message_parser_init(JSONMessageParser *parser,
-                              void (*func)(JSONMessageParser *, GQueue *));
-
-int json_message_parser_feed(JSONMessageParser *parser,
-                             const char *buffer, size_t size);
-
-int json_message_parser_flush(JSONMessageParser *parser);
-
-void json_message_parser_destroy(JSONMessageParser *parser);
-
-#endif
-3
include/qapi/qmp/qerror.h
@@ -61,9 +61,6 @@
 #define QERR_IO_ERROR \
     "An IO error has occurred"
 
-#define QERR_JSON_PARSING \
-    "Invalid JSON syntax"
-
 #define QERR_MIGRATION_ACTIVE \
     "There's a migration process in progress"
 
+1 -1
include/qapi/qmp/qnum.h
@@ -25,7 +25,7 @@
 
 /*
  * QNum encapsulates how our dialect of JSON fills in the blanks left
- * by the JSON specification (RFC 7159) regarding numbers.
+ * by the JSON specification (RFC 8259) regarding numbers.
  *
  * Conceptually, we treat number as an abstract type with three
  * concrete subtypes: floating-point, signed integer, unsigned
+1
include/qemu/unicode.h
@@ -2,5 +2,6 @@
 #define QEMU_UNICODE_H
 
 int mod_utf8_codepoint(const char *s, size_t n, char **end);
+ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint);
 
 #endif
+8 -13
monitor.c
@@ -58,7 +58,6 @@
 #include "qapi/qmp/qnum.h"
 #include "qapi/qmp/qstring.h"
 #include "qapi/qmp/qjson.h"
-#include "qapi/qmp/json-streamer.h"
 #include "qapi/qmp/json-parser.h"
 #include "qapi/qmp/qlist.h"
 #include "qom/object_interfaces.h"
@@ -4256,20 +4255,14 @@
 
 #define QMP_REQ_QUEUE_LEN_MAX (8)
 
-static void handle_qmp_command(JSONMessageParser *parser, GQueue *tokens)
+static void handle_qmp_command(void *opaque, QObject *req, Error *err)
 {
-    QObject *req, *id = NULL;
+    Monitor *mon = opaque;
+    QObject *id = NULL;
     QDict *qdict;
-    MonitorQMP *mon_qmp = container_of(parser, MonitorQMP, parser);
-    Monitor *mon = container_of(mon_qmp, Monitor, qmp);
-    Error *err = NULL;
     QMPRequest *req_obj;
 
-    req = json_parser_parse_err(tokens, NULL, &err);
-    if (!req && !err) {
-        /* json_parser_parse_err() sucks: can fail without setting @err */
-        error_setg(&err, QERR_JSON_PARSING);
-    }
+    assert(!req != !err);
 
     qdict = qobject_to(QDict, req);
     if (qdict) {
@@ -4465,7 +4458,8 @@
         monitor_qmp_response_flush(mon);
         monitor_qmp_cleanup_queues(mon);
         json_message_parser_destroy(&mon->qmp.parser);
-        json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
+        json_message_parser_init(&mon->qmp.parser, handle_qmp_command,
+                                 mon, NULL);
         mon_refcount--;
         monitor_fdsets_cleanup();
         break;
@@ -4683,7 +4677,8 @@
 
     if (monitor_is_qmp(mon)) {
         qemu_chr_fe_set_echo(&mon->chr, true);
-        json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
+        json_message_parser_init(&mon->qmp.parser, handle_qmp_command,
+                                 mon, NULL);
         if (mon->use_io_thread) {
             /*
              * Make sure the old iowatch is gone. It's possible when
+1 -1
qapi/introspect.json
@@ -120,7 +120,7 @@
 ##
 # @JSONType:
 #
-# The four primitive and two structured types according to RFC 7159
+# The four primitive and two structured types according to RFC 8259
 # section 1, plus 'int' (split off 'number'), plus the obvious top
 # type 'value'.
 #
-1
qapi/qmp-dispatch.c
@@ -14,7 +14,6 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qapi/qmp/dispatch.h"
-#include "qapi/qmp/json-parser.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qjson.h"
 #include "qapi/qmp/qbool.h"
-5
qapi/qobject-input-visitor.c
@@ -725,11 +725,6 @@
     if (is_json) {
         obj = qobject_from_json(str, errp);
         if (!obj) {
-            /* Work around qobject_from_json() lossage TODO fix that */
-            if (errp && !*errp) {
-                error_setg(errp, "JSON parse error");
-                return NULL;
-            }
             return NULL;
         }
         args = qobject_to(QDict, obj);
+5 -10
qga/main.c
@@ -18,7 +18,6 @@
 #include <syslog.h>
 #include <sys/wait.h>
 #endif
-#include "qapi/qmp/json-streamer.h"
 #include "qapi/qmp/json-parser.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qjson.h"
@@ -597,24 +596,20 @@
 }
 
 /* handle requests/control events coming in over the channel */
-static void process_event(JSONMessageParser *parser, GQueue *tokens)
+static void process_event(void *opaque, QObject *obj, Error *err)
 {
-    GAState *s = container_of(parser, GAState, parser);
-    QObject *obj;
+    GAState *s = opaque;
     QDict *req, *rsp;
-    Error *err = NULL;
     int ret;
 
-    g_assert(s && parser);
-
     g_debug("process_event: called");
-    obj = json_parser_parse_err(tokens, NULL, &err);
+    assert(!obj != !err);
     if (err) {
         goto err;
     }
     req = qobject_to(QDict, obj);
     if (!req) {
-        error_setg(&err, QERR_JSON_PARSING);
+        error_setg(&err, "Input must be a JSON object");
         goto err;
     }
     if (!qdict_haskey(req, "execute")) {
@@ -1320,7 +1315,7 @@
     s->command_state = ga_command_state_new();
     ga_command_state_init(s, s->command_state);
     ga_command_state_init_all(s->command_state);
-    json_message_parser_init(&s->parser, process_event);
+    json_message_parser_init(&s->parser, process_event, s, NULL);
 
 #ifndef _WIN32
     if (!register_signal_handlers()) {
+139 -174
qobject/json-lexer.c
@@ -12,63 +12,116 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu-common.h"
-#include "qapi/qmp/json-lexer.h"
+#include "json-parser-int.h"
 
 #define MAX_TOKEN_SIZE (64ULL << 20)
 
 /*
- * Required by JSON (RFC 7159):
+ * From RFC 8259 "The JavaScript Object Notation (JSON) Data
+ * Interchange Format", with [comments in brackets]:
  *
- * \"([^\\\"]|\\[\"'\\/bfnrt]|\\u[0-9a-fA-F]{4})*\"
- * -?(0|[1-9][0-9]*)(.[0-9]+)?([eE][-+]?[0-9]+)?
- * [{}\[\],:]
- * [a-z]+   # covers null, true, false
+ * The set of tokens includes six structural characters, strings,
+ * numbers, and three literal names.
+ *
+ * These are the six structural characters:
+ *
+ *    begin-array     = ws %x5B ws  ; [ left square bracket
+ *    begin-object    = ws %x7B ws  ; { left curly bracket
+ *    end-array       = ws %x5D ws  ; ] right square bracket
+ *    end-object      = ws %x7D ws  ; } right curly bracket
+ *    name-separator  = ws %x3A ws  ; : colon
+ *    value-separator = ws %x2C ws  ; , comma
+ *
+ * Insignificant whitespace is allowed before or after any of the six
+ * structural characters.
+ * [This lexer accepts it before or after any token, which is actually
+ * the same, as the grammar always has structural characters between
+ * other tokens.]
+ *
+ *    ws = *(
+ *           %x20 /              ; Space
+ *           %x09 /              ; Horizontal tab
+ *           %x0A /              ; Line feed or New line
+ *           %x0D )              ; Carriage return
+ *
+ * [...] three literal names:
+ *    false null true
+ * [This lexer accepts [a-z]+, and leaves rejecting unknown literal
+ * names to the parser.]
+ *
+ * [Numbers:]
+ *
+ *    number = [ minus ] int [ frac ] [ exp ]
+ *    decimal-point = %x2E       ; .
+ *    digit1-9 = %x31-39         ; 1-9
+ *    e = %x65 / %x45            ; e E
+ *    exp = e [ minus / plus ] 1*DIGIT
+ *    frac = decimal-point 1*DIGIT
+ *    int = zero / ( digit1-9 *DIGIT )
+ *    minus = %x2D               ; -
+ *    plus = %x2B                ; +
+ *    zero = %x30                ; 0
  *
- * Extension of '' strings:
+ * [Strings:]
+ *    string = quotation-mark *char quotation-mark
  *
- * '([^\\']|\\[\"'\\/bfnrt]|\\u[0-9a-fA-F]{4})*'
+ *    char = unescaped /
+ *        escape (
+ *            %x22 /          ; "    quotation mark  U+0022
+ *            %x5C /          ; \    reverse solidus U+005C
+ *            %x2F /          ; /    solidus         U+002F
+ *            %x62 /          ; b    backspace       U+0008
+ *            %x66 /          ; f    form feed       U+000C
+ *            %x6E /          ; n    line feed       U+000A
+ *            %x72 /          ; r    carriage return U+000D
+ *            %x74 /          ; t    tab             U+0009
+ *            %x75 4HEXDIG )  ; uXXXX                U+XXXX
+ *    escape = %x5C              ; \
+ *    quotation-mark = %x22      ; "
+ *    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
+ * [This lexer accepts any non-control character after escape, and
+ * leaves rejecting invalid ones to the parser.]
  *
- * Extension for vararg handling in JSON construction:
  *
- * %((l|ll|I64)?d|[ipsf])
+ * Extensions over RFC 8259:
+ * - Extra escape sequence in strings:
+ *   0x27 (apostrophe) is recognized after escape, too
+ * - Single-quoted strings:
+ *   Like double-quoted strings, except they're delimited by %x27
+ *   (apostrophe) instead of %x22 (quotation mark), and can't contain
+ *   unescaped apostrophe, but can contain unescaped quotation mark.
+ * - Interpolation, if enabled:
+ *   The lexer accepts %[A-Za-z0-9]*, and leaves rejecting invalid
+ *   ones to the parser.
  *
+ * Note:
+ * - Input must be encoded in modified UTF-8.
+ * - Decoding and validating is left to the parser.
  */
 
 enum json_lexer_state {
     IN_ERROR = 0,               /* must really be 0, see json_lexer[] */
-    IN_DQ_UCODE3,
-    IN_DQ_UCODE2,
-    IN_DQ_UCODE1,
-    IN_DQ_UCODE0,
     IN_DQ_STRING_ESCAPE,
     IN_DQ_STRING,
-    IN_SQ_UCODE3,
-    IN_SQ_UCODE2,
-    IN_SQ_UCODE1,
-    IN_SQ_UCODE0,
     IN_SQ_STRING_ESCAPE,
     IN_SQ_STRING,
     IN_ZERO,
-    IN_DIGITS,
-    IN_DIGIT,
+    IN_EXP_DIGITS,
+    IN_EXP_SIGN,
     IN_EXP_E,
     IN_MANTISSA,
     IN_MANTISSA_DIGITS,
-    IN_NONZERO_NUMBER,
-    IN_NEG_NONZERO_NUMBER,
+    IN_DIGITS,
+    IN_SIGN,
     IN_KEYWORD,
-    IN_ESCAPE,
-    IN_ESCAPE_L,
-    IN_ESCAPE_LL,
-    IN_ESCAPE_I,
-    IN_ESCAPE_I6,
-    IN_ESCAPE_I64,
+    IN_INTERP,
     IN_WHITESPACE,
     IN_START,
+    IN_START_INTERP,            /* must be IN_START + 1 */
 };
 
-QEMU_BUILD_BUG_ON((int)JSON_MIN <= (int)IN_START);
+QEMU_BUILD_BUG_ON((int)JSON_MIN <= (int)IN_START_INTERP);
+QEMU_BUILD_BUG_ON(IN_START_INTERP != IN_START + 1);
 
 #define TERMINAL(state) [0 ... 0x7F] = (state)
 
@@ -76,87 +129,27 @@
    from OLD_STATE required lookahead. This happens whenever the table
    below uses the TERMINAL macro. */
 #define TERMINAL_NEEDED_LOOKAHEAD(old_state, terminal) \
-    (json_lexer[(old_state)][0] == (terminal))
+    (terminal != IN_ERROR && json_lexer[(old_state)][0] == (terminal))
 
 static const uint8_t json_lexer[][256] = {
     /* Relies on default initialization to IN_ERROR! */
 
     /* double quote string */
-    [IN_DQ_UCODE3] = {
-        ['0' ... '9'] = IN_DQ_STRING,
-        ['a' ... 'f'] = IN_DQ_STRING,
-        ['A' ... 'F'] = IN_DQ_STRING,
-    },
-    [IN_DQ_UCODE2] = {
-        ['0' ... '9'] = IN_DQ_UCODE3,
-        ['a' ... 'f'] = IN_DQ_UCODE3,
-        ['A' ... 'F'] = IN_DQ_UCODE3,
-    },
-    [IN_DQ_UCODE1] = {
-        ['0' ... '9'] = IN_DQ_UCODE2,
-        ['a' ... 'f'] = IN_DQ_UCODE2,
-        ['A' ... 'F'] = IN_DQ_UCODE2,
-    },
-    [IN_DQ_UCODE0] = {
-        ['0' ... '9'] = IN_DQ_UCODE1,
-        ['a' ... 'f'] = IN_DQ_UCODE1,
-        ['A' ... 'F'] = IN_DQ_UCODE1,
-    },
     [IN_DQ_STRING_ESCAPE] = {
-        ['b'] = IN_DQ_STRING,
-        ['f'] = IN_DQ_STRING,
-        ['n'] = IN_DQ_STRING,
-        ['r'] = IN_DQ_STRING,
-        ['t'] = IN_DQ_STRING,
-        ['/'] = IN_DQ_STRING,
-        ['\\'] = IN_DQ_STRING,
-        ['\''] = IN_DQ_STRING,
-        ['\"'] = IN_DQ_STRING,
-        ['u'] = IN_DQ_UCODE0,
+        [0x20 ... 0xFD] = IN_DQ_STRING,
     },
     [IN_DQ_STRING] = {
-        [1 ... 0xBF] = IN_DQ_STRING,
-        [0xC2 ... 0xF4] = IN_DQ_STRING,
+        [0x20 ... 0xFD] = IN_DQ_STRING,
         ['\\'] = IN_DQ_STRING_ESCAPE,
         ['"'] = JSON_STRING,
     },
 
     /* single quote string */
-    [IN_SQ_UCODE3] = {
-        ['0' ... '9'] = IN_SQ_STRING,
-        ['a' ... 'f'] = IN_SQ_STRING,
-        ['A' ... 'F'] = IN_SQ_STRING,
-    },
-    [IN_SQ_UCODE2] = {
-        ['0' ... '9'] = IN_SQ_UCODE3,
-        ['a' ... 'f'] = IN_SQ_UCODE3,
-        ['A' ... 'F'] = IN_SQ_UCODE3,
-    },
-    [IN_SQ_UCODE1] = {
-        ['0' ... '9'] = IN_SQ_UCODE2,
-        ['a' ... 'f'] = IN_SQ_UCODE2,
-        ['A' ... 'F'] = IN_SQ_UCODE2,
-    },
-    [IN_SQ_UCODE0] = {
-        ['0' ... '9'] = IN_SQ_UCODE1,
-        ['a' ... 'f'] = IN_SQ_UCODE1,
-        ['A' ... 'F'] = IN_SQ_UCODE1,
-    },
     [IN_SQ_STRING_ESCAPE] = {
-        ['b'] = IN_SQ_STRING,
-        ['f'] = IN_SQ_STRING,
-        ['n'] = IN_SQ_STRING,
-        ['r'] = IN_SQ_STRING,
-        ['t'] = IN_SQ_STRING,
-        ['/'] = IN_SQ_STRING,
-        ['\\'] = IN_SQ_STRING,
-        ['\''] = IN_SQ_STRING,
-        ['\"'] = IN_SQ_STRING,
-        ['u'] = IN_SQ_UCODE0,
+        [0x20 ... 0xFD] = IN_SQ_STRING,
     },
     [IN_SQ_STRING] = {
-        [1 ... 0xBF] = IN_SQ_STRING,
-        [0xC2 ... 0xF4] = IN_SQ_STRING,
+        [0x20 ... 0xFD] = IN_SQ_STRING,
         ['\\'] = IN_SQ_STRING_ESCAPE,
         ['\''] = JSON_STRING,
     },
@@ -169,19 +162,19 @@
     },
 
     /* Float */
-    [IN_DIGITS] = {
+    [IN_EXP_DIGITS] = {
         TERMINAL(JSON_FLOAT),
-        ['0' ... '9'] = IN_DIGITS,
+        ['0' ... '9'] = IN_EXP_DIGITS,
     },
 
-    [IN_DIGIT] = {
-        ['0' ... '9'] = IN_DIGITS,
+    [IN_EXP_SIGN] = {
+        ['0' ... '9'] = IN_EXP_DIGITS,
     },
 
     [IN_EXP_E] = {
-        ['-'] = IN_DIGIT,
-        ['+'] = IN_DIGIT,
-        ['0' ... '9'] = IN_DIGITS,
+        ['-'] = IN_EXP_SIGN,
+        ['+'] = IN_EXP_SIGN,
+        ['0' ... '9'] = IN_EXP_DIGITS,
     },
 
     [IN_MANTISSA_DIGITS] = {
@@ -196,17 +189,17 @@
     },
 
     /* Number */
-    [IN_NONZERO_NUMBER] = {
+    [IN_DIGITS] = {
         TERMINAL(JSON_INTEGER),
-        ['0' ... '9'] = IN_NONZERO_NUMBER,
+        ['0' ... '9'] = IN_DIGITS,
         ['e'] = IN_EXP_E,
         ['E'] = IN_EXP_E,
         ['.'] = IN_MANTISSA,
     },
 
-    [IN_NEG_NONZERO_NUMBER] = {
+    [IN_SIGN] = {
         ['0'] = IN_ZERO,
-        ['1' ... '9'] = IN_NONZERO_NUMBER,
+        ['1' ... '9'] = IN_DIGITS,
     },
 
     /* keywords */
@@ -224,49 +217,25 @@
         ['\n'] = IN_WHITESPACE,
     },
 
-    /* escape */
-    [IN_ESCAPE_LL] = {
-        ['d'] = JSON_ESCAPE,
-        ['u'] = JSON_ESCAPE,
-    },
-
-    [IN_ESCAPE_L] = {
-        ['d'] = JSON_ESCAPE,
-        ['l'] = IN_ESCAPE_LL,
-        ['u'] = JSON_ESCAPE,
-    },
-
-    [IN_ESCAPE_I64] = {
-        ['d'] = JSON_ESCAPE,
-        ['u'] = JSON_ESCAPE,
-    },
-
-    [IN_ESCAPE_I6] = {
-        ['4'] = IN_ESCAPE_I64,
-    },
-
-    [IN_ESCAPE_I] = {
-        ['6'] = IN_ESCAPE_I6,
-    },
-
-    [IN_ESCAPE] = {
-        ['d'] = JSON_ESCAPE,
-        ['i'] = JSON_ESCAPE,
-        ['p'] = JSON_ESCAPE,
-        ['s'] = JSON_ESCAPE,
-        ['u'] = JSON_ESCAPE,
-        ['f'] = JSON_ESCAPE,
-        ['l'] = IN_ESCAPE_L,
-        ['I'] = IN_ESCAPE_I,
+    /* interpolation */
+    [IN_INTERP] = {
+        TERMINAL(JSON_INTERP),
+        ['A' ... 'Z'] = IN_INTERP,
+        ['a' ... 'z'] = IN_INTERP,
+        ['0' ... '9'] = IN_INTERP,
     },
 
-    /* top level rule */
-    [IN_START] = {
+    /*
+     * Two start states:
+     * - IN_START recognizes JSON tokens with our string extensions
+     * - IN_START_INTERP additionally recognizes interpolation.
+     */
+    [IN_START ... IN_START_INTERP] = {
         ['"'] = IN_DQ_STRING,
         ['\''] = IN_SQ_STRING,
         ['0'] = IN_ZERO,
-        ['1' ... '9'] = IN_NONZERO_NUMBER,
-        ['-'] = IN_NEG_NONZERO_NUMBER,
+        ['1' ... '9'] = IN_DIGITS,
+        ['-'] = IN_SIGN,
         ['{'] = JSON_LCURLY,
         ['}'] = JSON_RCURLY,
         ['['] = JSON_LSQUARE,
@@ -274,23 +243,23 @@
         [','] = JSON_COMMA,
         [':'] = JSON_COLON,
         ['a' ... 'z'] = IN_KEYWORD,
-        ['%'] = IN_ESCAPE,
         [' '] = IN_WHITESPACE,
         ['\t'] = IN_WHITESPACE,
         ['\r'] = IN_WHITESPACE,
         ['\n'] = IN_WHITESPACE,
     },
+    [IN_START_INTERP]['%'] = IN_INTERP,
 };
 
-void json_lexer_init(JSONLexer *lexer, JSONLexerEmitter func)
+void json_lexer_init(JSONLexer *lexer, bool enable_interpolation)
 {
-    lexer->emit = func;
-    lexer->state = IN_START;
+    lexer->start_state = lexer->state = enable_interpolation
+        ? IN_START_INTERP : IN_START;
     lexer->token = g_string_sized_new(3);
     lexer->x = lexer->y = 0;
 }
 
-static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
+static void json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
 {
     int char_consumed, new_state;
 
@@ -304,7 +273,7 @@
         assert(lexer->state <= ARRAY_SIZE(json_lexer));
         new_state = json_lexer[lexer->state][(uint8_t)ch];
         char_consumed = !TERMINAL_NEEDED_LOOKAHEAD(lexer->state, new_state);
-        if (char_consumed) {
+        if (char_consumed && !flush) {
             g_string_append_c(lexer->token, ch);
         }
 
@@ -315,23 +284,23 @@
         case JSON_RSQUARE:
         case JSON_COLON:
         case JSON_COMMA:
-        case JSON_ESCAPE:
+        case JSON_INTERP:
         case JSON_INTEGER:
         case JSON_FLOAT:
         case JSON_KEYWORD:
         case JSON_STRING:
-            lexer->emit(lexer, lexer->token, new_state, lexer->x, lexer->y);
+            json_message_process_token(lexer, lexer->token, new_state,
+                                       lexer->x, lexer->y);
             /* fall through */
         case JSON_SKIP:
             g_string_truncate(lexer->token, 0);
-            new_state = IN_START;
+            new_state = lexer->start_state;
             break;
         case IN_ERROR:
             /* XXX: To avoid having previous bad input leaving the parser in an
              * unresponsive state where we consume unpredictable amounts of
              * subsequent "good" input, percolate this error state up to the
-             * tokenizer/parser by forcing a NULL object to be emitted, then
-             * reset state.
+             * parser by emitting a JSON_ERROR token, then reset lexer state.
              *
              * Also note that this handling is required for reliable channel
              * negotiation between QMP and the guest agent, since chr(0xFF)
@@ -340,11 +309,11 @@
              * never a valid ASCII/UTF-8 sequence, so this should reliably
              * induce an error/flush state.
              */
-            lexer->emit(lexer, lexer->token, JSON_ERROR, lexer->x, lexer->y);
+            json_message_process_token(lexer, lexer->token, JSON_ERROR,
+                                       lexer->x, lexer->y);
             g_string_truncate(lexer->token, 0);
-            new_state = IN_START;
-            lexer->state = new_state;
-            return 0;
+            lexer->state = lexer->start_state;
+            return;
         default:
             break;
         }
@@ -355,33 +324,29 @@
      * this is a security consideration.
      */
     if (lexer->token->len > MAX_TOKEN_SIZE) {
-        lexer->emit(lexer, lexer->token, lexer->state, lexer->x, lexer->y);
+        json_message_process_token(lexer, lexer->token, lexer->state,
+                                   lexer->x, lexer->y);
         g_string_truncate(lexer->token, 0);
-        lexer->state = IN_START;
+        lexer->state = lexer->start_state;
     }
-
-    return 0;
 }
 
-int json_lexer_feed(JSONLexer *lexer, const char *buffer, size_t size)
+void json_lexer_feed(JSONLexer *lexer, const char *buffer, size_t size)
 {
     size_t i;
 
     for (i = 0; i < size; i++) {
-        int err;
-
-        err = json_lexer_feed_char(lexer, buffer[i], false);
-        if (err < 0) {
-            return err;
-        }
+        json_lexer_feed_char(lexer, buffer[i], false);
     }
-
-    return 0;
 }
 
-int json_lexer_flush(JSONLexer *lexer)
+void json_lexer_flush(JSONLexer *lexer)
 {
-    return lexer->state == IN_START ? 0 : json_lexer_feed_char(lexer, 0, true);
+    if (lexer->state != lexer->start_state) {
+        json_lexer_feed_char(lexer, 0, true);
+    }
+    json_message_process_token(lexer, lexer->token, JSON_END_OF_INPUT,
+                               lexer->x, lexer->y);
 }
 
 void json_lexer_destroy(JSONLexer *lexer)
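The `TERMINAL` / `TERMINAL_NEEDED_LOOKAHEAD` idea in the lexer above (a token such as an integer only ends when the character after it is seen, so that character must not be consumed) can be illustrated with a much smaller table-driven lexer. This is a simplified sketch, not QEMU code: the state names, `table_init()` and `count_integers()` are hypothetical, it recognizes only unsigned integers separated by spaces, and it fills the table at run time instead of using range designators.

```c
#include <stddef.h>
#include <string.h>

/* States below 100; token types from 100 up, as in the lexer above */
enum { S_ERROR = 0, S_DIGITS, S_START, T_INTEGER = 100 };

static unsigned char table[3][256];

static void table_init(void)
{
    int c;

    memset(table, S_ERROR, sizeof(table));
    /* In S_DIGITS, any ASCII character ends the token (lookahead)... */
    for (c = 0; c < 0x80; c++) {
        table[S_DIGITS][c] = T_INTEGER;
    }
    /* ...except another digit, which extends it */
    for (c = '0'; c <= '9'; c++) {
        table[S_DIGITS][c] = S_DIGITS;
        table[S_START][c] = S_DIGITS;
    }
    table[S_START][' '] = S_START;
}

/* Count integer tokens in @s; returns -1 on a lexical error */
static int count_integers(const char *s)
{
    int state = S_START, count = 0;
    size_t i = 0, n = strlen(s);

    table_init();
    while (i <= n) {            /* the terminating NUL acts as a flush */
        int next = table[state][(unsigned char)s[i]];

        if (next == T_INTEGER) {
            count++;
            state = S_START;    /* do not advance: re-lex the lookahead */
            continue;
        }
        if (next == S_ERROR) {
            if (i == n) {
                break;          /* NUL in start state: end of input */
            }
            return -1;
        }
        state = next;
        i++;
    }
    return count;
}
```

Feeding the terminating NUL plays the role of `json_lexer_flush()`: it forces out a token that is still pending when the input ends.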
+54
qobject/json-parser-int.h
@@ -0,0 +1,54 @@
+/*
+ * JSON Parser
+ *
+ * Copyright IBM, Corp. 2009
+ *
+ * Authors:
+ *  Anthony Liguori   <aliguori@us.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef JSON_PARSER_INT_H
+#define JSON_PARSER_INT_H
+
+#include "qapi/qmp/json-parser.h"
+
+
+typedef enum json_token_type {
+    JSON_MIN = 100,
+    JSON_LCURLY = JSON_MIN,
+    JSON_RCURLY,
+    JSON_LSQUARE,
+    JSON_RSQUARE,
+    JSON_COLON,
+    JSON_COMMA,
+    JSON_INTEGER,
+    JSON_FLOAT,
+    JSON_KEYWORD,
+    JSON_STRING,
+    JSON_INTERP,
+    JSON_SKIP,
+    JSON_ERROR,
+    JSON_END_OF_INPUT,
+} JSONTokenType;
+
+typedef struct JSONToken JSONToken;
+
+/* json-lexer.c */
+void json_lexer_init(JSONLexer *lexer, bool enable_interpolation);
+void json_lexer_feed(JSONLexer *lexer, const char *buffer, size_t size);
+void json_lexer_flush(JSONLexer *lexer);
+void json_lexer_destroy(JSONLexer *lexer);
+
+/* json-streamer.c */
+void json_message_process_token(JSONLexer *lexer, GString *input,
+                                JSONTokenType type, int x, int y);
+
+/* json-parser.c */
+JSONToken *json_token(JSONTokenType type, int x, int y, GString *tokstr);
+QObject *json_parser_parse(GQueue *tokens, va_list *ap, Error **errp);
+
+#endif
+181 -196
qobject/json-parser.c
@@ -13,6 +13,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/cutils.h"
+#include "qemu/unicode.h"
 #include "qapi/error.h"
 #include "qemu-common.h"
 #include "qapi/qmp/qbool.h"
@@ -21,15 +22,21 @@
 #include "qapi/qmp/qnull.h"
 #include "qapi/qmp/qnum.h"
 #include "qapi/qmp/qstring.h"
-#include "qapi/qmp/json-parser.h"
-#include "qapi/qmp/json-lexer.h"
-#include "qapi/qmp/json-streamer.h"
+#include "json-parser-int.h"
+
+struct JSONToken {
+    JSONTokenType type;
+    int x;
+    int y;
+    char str[];
+};
 
 typedef struct JSONParserContext
 {
     Error *err;
     JSONToken *current;
     GQueue *buf;
+    va_list *ap;
 } JSONParserContext;
 
 #define BUG_ON(cond) assert(!(cond))
@@ -43,7 +50,7 @@
  * 4) deal with premature EOI
  */
 
-static QObject *parse_value(JSONParserContext *ctxt, va_list *ap);
+static QObject *parse_value(JSONParserContext *ctxt);
 
 /**
  * Error handler
@@ -53,155 +60,139 @@
 {
     va_list ap;
     char message[1024];
+
+    if (ctxt->err) {
+        return;
+    }
     va_start(ap, msg);
     vsnprintf(message, sizeof(message), msg, ap);
     va_end(ap);
-    if (ctxt->err) {
-        error_free(ctxt->err);
-        ctxt->err = NULL;
-    }
     error_setg(&ctxt->err, "JSON parse error, %s", message);
 }
 
-/**
- * String helpers
- *
- * These helpers are used to unescape strings.
- */
-static void wchar_to_utf8(uint16_t wchar, char *buffer, size_t buffer_length)
+static int cvt4hex(const char *s)
 {
-    if (wchar <= 0x007F) {
-        BUG_ON(buffer_length < 2);
+    int cp, i;
 
-        buffer[0] = wchar & 0x7F;
-        buffer[1] = 0;
-    } else if (wchar <= 0x07FF) {
-        BUG_ON(buffer_length < 3);
-
-        buffer[0] = 0xC0 | ((wchar >> 6) & 0x1F);
-        buffer[1] = 0x80 | (wchar & 0x3F);
-        buffer[2] = 0;
-    } else {
-        BUG_ON(buffer_length < 4);
-
-        buffer[0] = 0xE0 | ((wchar >> 12) & 0x0F);
-        buffer[1] = 0x80 | ((wchar >> 6) & 0x3F);
-        buffer[2] = 0x80 | (wchar & 0x3F);
-        buffer[3] = 0;
+    cp = 0;
+    for (i = 0; i < 4; i++) {
+        if (!qemu_isxdigit(s[i])) {
+            return -1;
+        }
+        cp <<= 4;
+        if (s[i] >= '0' && s[i] <= '9') {
+            cp |= s[i] - '0';
+        } else if (s[i] >= 'a' && s[i] <= 'f') {
+            cp |= 10 + s[i] - 'a';
+        } else if (s[i] >= 'A' && s[i] <= 'F') {
+            cp |= 10 + s[i] - 'A';
+        } else {
+            return -1;
+        }
     }
-}
-
-static int hex2decimal(char ch)
-{
-    if (ch >= '0' && ch <= '9') {
-        return (ch - '0');
-    } else if (ch >= 'a' && ch <= 'f') {
-        return 10 + (ch - 'a');
-    } else if (ch >= 'A' && ch <= 'F') {
-        return 10 + (ch - 'A');
-    }
-
-    return -1;
+    return cp;
 }
 
 /**
- * parse_string(): Parse a json string and return a QObject
+ * parse_string(): Parse a JSON string
+ *
+ * From RFC 8259 "The JavaScript Object Notation (JSON) Data
+ * Interchange Format":
+ *
+ *    char = unescaped /
+ *        escape (
+ *            %x22 /          ; "    quotation mark  U+0022
+ *            %x5C /          ; \    reverse solidus U+005C
+ *            %x2F /          ; /    solidus         U+002F
+ *            %x62 /          ; b    backspace       U+0008
+ *            %x66 /          ; f    form feed       U+000C
+ *            %x6E /          ; n    line feed       U+000A
+ *            %x72 /          ; r    carriage return U+000D
+ *            %x74 /          ; t    tab             U+0009
+ *            %x75 4HEXDIG )  ; uXXXX                U+XXXX
+ *    escape = %x5C              ; \
+ *    quotation-mark = %x22      ; "
+ *    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
  *
- *  string
- *      ""
- *      " chars "
- *  chars
- *      char
- *      char chars
- *  char
- *      any-Unicode-character-
- *          except-"-or-\-or-
- *          control-character
- *      \"
- *      \\
- *      \/
- *      \b
- *      \f
- *      \n
- *      \r
- *      \t
- *      \u four-hex-digits
+ * Extensions over RFC 8259:
+ * - Extra escape sequence in strings:
+ *   0x27 (apostrophe) is recognized after escape, too
+ * - Single-quoted strings:
+ *   Like double-quoted strings, except they're delimited by %x27
+ *   (apostrophe) instead of %x22 (quotation mark), and can't contain
+ *   unescaped apostrophe, but can contain unescaped quotation mark.
+ *
+ * Note:
+ * - Encoding is modified UTF-8.
+ * - Invalid Unicode characters are rejected.
+ * - Control characters \x00..\x1F are rejected by the lexer.
  */
-static QString *qstring_from_escaped_str(JSONParserContext *ctxt,
-                                         JSONToken *token)
+static QString *parse_string(JSONParserContext *ctxt, JSONToken *token)
 {
     const char *ptr = token->str;
     QString *str;
-    int double_quote = 1;
-
-    if (*ptr == '"') {
-        double_quote = 1;
-    } else {
-        double_quote = 0;
-    }
-    ptr++;
+    char quote;
+    const char *beg;
+    int cp, trailing;
+    char *end;
+    ssize_t len;
+    char utf8_buf[5];
 
+    assert(*ptr == '"' || *ptr == '\'');
+    quote = *ptr++;
     str = qstring_new();
-    while (*ptr &&
-           ((double_quote && *ptr != '"') || (!double_quote && *ptr != '\''))) {
-        if (*ptr == '\\') {
-            ptr++;
 
-            switch (*ptr) {
+    while (*ptr != quote) {
+        assert(*ptr);
+        switch (*ptr) {
+        case '\\':
+            beg = ptr++;
+            switch (*ptr++) {
             case '"':
-                qstring_append(str, "\"");
-                ptr++;
+                qstring_append_chr(str, '"');
                 break;
             case '\'':
-                qstring_append(str, "'");
-                ptr++;
+                qstring_append_chr(str, '\'');
                 break;
             case '\\':
-                qstring_append(str, "\\");
-                ptr++;
+                qstring_append_chr(str, '\\');
                 break;
             case '/':
-                qstring_append(str, "/");
-                ptr++;
+                qstring_append_chr(str, '/');
                 break;
             case 'b':
-                qstring_append(str, "\b");
-                ptr++;
+                qstring_append_chr(str, '\b');
                 break;
             case 'f':
-                qstring_append(str, "\f");
-                ptr++;
+                qstring_append_chr(str, '\f');
                 break;
             case 'n':
-                qstring_append(str, "\n");
-                ptr++;
+                qstring_append_chr(str, '\n');
                 break;
             case 'r':
-                qstring_append(str, "\r");
-                ptr++;
+                qstring_append_chr(str, '\r');
                 break;
             case 't':
-                qstring_append(str, "\t");
-                ptr++;
+                qstring_append_chr(str, '\t');
                 break;
-            case 'u': {
-                uint16_t unicode_char = 0;
-                char utf8_char[4];
-                int i = 0;
+            case 'u':
+                cp = cvt4hex(ptr);
+                ptr += 4;
 
-                ptr++;
-
-                for (i = 0; i < 4; i++) {
-                    if (qemu_isxdigit(*ptr)) {
-                        unicode_char |= hex2decimal(*ptr) << ((3 - i) * 4);
+                /* handle surrogate pairs */
+                if (cp >= 0xD800 && cp <= 0xDBFF
+                    && ptr[0] == '\\' && ptr[1] == 'u') {
+                    /* leading surrogate followed by \u */
+                    cp = 0x10000 + ((cp & 0x3FF) << 10);
+                    trailing = cvt4hex(ptr + 2);
+                    if (trailing >= 0xDC00 && trailing <= 0xDFFF) {
+                        /* followed by trailing surrogate */
+                        cp |= trailing & 0x3FF;
+                        ptr += 6;
                     } else {
-                        parse_error(ctxt, token,
-                                    "invalid hex escape sequence in string");
-                        goto out;
+                        cp = -1; /* invalid */
                     }
-                    ptr++;
                 }
 
-                wchar_to_utf8(unicode_char, utf8_char, sizeof(utf8_char));
-                qstring_append(str, utf8_char);
-            } break;
+                if (mod_utf8_encode(utf8_buf, sizeof(utf8_buf), cp) < 0) {
+                    parse_error(ctxt, token,
199 + "%.*s is not a valid Unicode character", 200 + (int)(ptr - beg), beg); 201 + goto out; 202 + } 203 + qstring_append(str, utf8_buf); 204 + break; 208 205 default: 209 206 parse_error(ctxt, token, "invalid escape sequence in string"); 210 207 goto out; 211 208 } 212 - } else { 213 - char dummy[2]; 214 - 215 - dummy[0] = *ptr++; 216 - dummy[1] = 0; 217 - 218 - qstring_append(str, dummy); 209 + break; 210 + case '%': 211 + if (ctxt->ap && ptr[1] != '%') { 212 + parse_error(ctxt, token, "can't interpolate into string"); 213 + goto out; 214 + } 215 + ptr++; 216 + /* fall through */ 217 + default: 218 + cp = mod_utf8_codepoint(ptr, 6, &end); 219 + if (cp < 0) { 220 + parse_error(ctxt, token, "invalid UTF-8 sequence in string"); 221 + goto out; 222 + } 223 + ptr = end; 224 + len = mod_utf8_encode(utf8_buf, sizeof(utf8_buf), cp); 225 + assert(len >= 0); 226 + qstring_append(str, utf8_buf); 219 227 } 220 228 } 221 229 ··· 233 241 static JSONToken *parser_context_pop_token(JSONParserContext *ctxt) 234 242 { 235 243 g_free(ctxt->current); 236 - assert(!g_queue_is_empty(ctxt->buf)); 237 244 ctxt->current = g_queue_pop_head(ctxt->buf); 238 245 return ctxt->current; 239 246 } 240 247 241 248 static JSONToken *parser_context_peek_token(JSONParserContext *ctxt) 242 249 { 243 - assert(!g_queue_is_empty(ctxt->buf)); 244 250 return g_queue_peek_head(ctxt->buf); 245 251 } 246 252 247 - static JSONParserContext *parser_context_new(GQueue *tokens) 248 - { 249 - JSONParserContext *ctxt; 250 - 251 - if (!tokens) { 252 - return NULL; 253 - } 254 - 255 - ctxt = g_malloc0(sizeof(JSONParserContext)); 256 - ctxt->buf = tokens; 257 - 258 - return ctxt; 259 - } 260 - 261 - /* to support error propagation, ctxt->err must be freed separately */ 262 - static void parser_context_free(JSONParserContext *ctxt) 263 - { 264 - if (ctxt) { 265 - while (!g_queue_is_empty(ctxt->buf)) { 266 - parser_context_pop_token(ctxt); 267 - } 268 - g_free(ctxt->current); 269 - g_queue_free(ctxt->buf); 270 - 
g_free(ctxt); 271 - } 272 - } 273 - 274 253 /** 275 254 * Parsing rules 276 255 */ 277 - static int parse_pair(JSONParserContext *ctxt, QDict *dict, va_list *ap) 256 + static int parse_pair(JSONParserContext *ctxt, QDict *dict) 278 257 { 279 258 QObject *value; 280 259 QString *key = NULL; ··· 286 265 goto out; 287 266 } 288 267 289 - key = qobject_to(QString, parse_value(ctxt, ap)); 268 + key = qobject_to(QString, parse_value(ctxt)); 290 269 if (!key) { 291 270 parse_error(ctxt, peek, "key is not a string in object"); 292 271 goto out; ··· 303 282 goto out; 304 283 } 305 284 306 - value = parse_value(ctxt, ap); 285 + value = parse_value(ctxt); 307 286 if (value == NULL) { 308 287 parse_error(ctxt, token, "Missing value in dict"); 309 288 goto out; ··· 321 300 return -1; 322 301 } 323 302 324 - static QObject *parse_object(JSONParserContext *ctxt, va_list *ap) 303 + static QObject *parse_object(JSONParserContext *ctxt) 325 304 { 326 305 QDict *dict = NULL; 327 306 JSONToken *token, *peek; ··· 338 317 } 339 318 340 319 if (peek->type != JSON_RCURLY) { 341 - if (parse_pair(ctxt, dict, ap) == -1) { 320 + if (parse_pair(ctxt, dict) == -1) { 342 321 goto out; 343 322 } 344 323 ··· 354 333 goto out; 355 334 } 356 335 357 - if (parse_pair(ctxt, dict, ap) == -1) { 336 + if (parse_pair(ctxt, dict) == -1) { 358 337 goto out; 359 338 } 360 339 ··· 375 354 return NULL; 376 355 } 377 356 378 - static QObject *parse_array(JSONParserContext *ctxt, va_list *ap) 357 + static QObject *parse_array(JSONParserContext *ctxt) 379 358 { 380 359 QList *list = NULL; 381 360 JSONToken *token, *peek; ··· 394 373 if (peek->type != JSON_RSQUARE) { 395 374 QObject *obj; 396 375 397 - obj = parse_value(ctxt, ap); 376 + obj = parse_value(ctxt); 398 377 if (obj == NULL) { 399 378 parse_error(ctxt, token, "expecting value"); 400 379 goto out; ··· 414 393 goto out; 415 394 } 416 395 417 - obj = parse_value(ctxt, ap); 396 + obj = parse_value(ctxt); 418 397 if (obj == NULL) { 419 398 parse_error(ctxt, 
token, "expecting value"); 420 399 goto out; ··· 457 436 return NULL; 458 437 } 459 438 460 - static QObject *parse_escape(JSONParserContext *ctxt, va_list *ap) 439 + static QObject *parse_interpolation(JSONParserContext *ctxt) 461 440 { 462 441 JSONToken *token; 463 - 464 - if (ap == NULL) { 465 - return NULL; 466 - } 467 442 468 443 token = parser_context_pop_token(ctxt); 469 - assert(token && token->type == JSON_ESCAPE); 444 + assert(token && token->type == JSON_INTERP); 470 445 471 446 if (!strcmp(token->str, "%p")) { 472 - return va_arg(*ap, QObject *); 447 + return va_arg(*ctxt->ap, QObject *); 473 448 } else if (!strcmp(token->str, "%i")) { 474 - return QOBJECT(qbool_from_bool(va_arg(*ap, int))); 449 + return QOBJECT(qbool_from_bool(va_arg(*ctxt->ap, int))); 475 450 } else if (!strcmp(token->str, "%d")) { 476 - return QOBJECT(qnum_from_int(va_arg(*ap, int))); 451 + return QOBJECT(qnum_from_int(va_arg(*ctxt->ap, int))); 477 452 } else if (!strcmp(token->str, "%ld")) { 478 - return QOBJECT(qnum_from_int(va_arg(*ap, long))); 479 - } else if (!strcmp(token->str, "%lld") || 480 - !strcmp(token->str, "%I64d")) { 481 - return QOBJECT(qnum_from_int(va_arg(*ap, long long))); 453 + return QOBJECT(qnum_from_int(va_arg(*ctxt->ap, long))); 454 + } else if (!strcmp(token->str, "%lld")) { 455 + return QOBJECT(qnum_from_int(va_arg(*ctxt->ap, long long))); 456 + } else if (!strcmp(token->str, "%" PRId64)) { 457 + return QOBJECT(qnum_from_int(va_arg(*ctxt->ap, int64_t))); 482 458 } else if (!strcmp(token->str, "%u")) { 483 - return QOBJECT(qnum_from_uint(va_arg(*ap, unsigned int))); 459 + return QOBJECT(qnum_from_uint(va_arg(*ctxt->ap, unsigned int))); 484 460 } else if (!strcmp(token->str, "%lu")) { 485 - return QOBJECT(qnum_from_uint(va_arg(*ap, unsigned long))); 486 - } else if (!strcmp(token->str, "%llu") || 487 - !strcmp(token->str, "%I64u")) { 488 - return QOBJECT(qnum_from_uint(va_arg(*ap, unsigned long long))); 461 + return QOBJECT(qnum_from_uint(va_arg(*ctxt->ap, 
unsigned long))); 462 + } else if (!strcmp(token->str, "%llu")) { 463 + return QOBJECT(qnum_from_uint(va_arg(*ctxt->ap, unsigned long long))); 464 + } else if (!strcmp(token->str, "%" PRIu64)) { 465 + return QOBJECT(qnum_from_uint(va_arg(*ctxt->ap, uint64_t))); 489 466 } else if (!strcmp(token->str, "%s")) { 490 - return QOBJECT(qstring_from_str(va_arg(*ap, const char *))); 467 + return QOBJECT(qstring_from_str(va_arg(*ctxt->ap, const char *))); 491 468 } else if (!strcmp(token->str, "%f")) { 492 - return QOBJECT(qnum_from_double(va_arg(*ap, double))); 469 + return QOBJECT(qnum_from_double(va_arg(*ctxt->ap, double))); 493 470 } 471 + parse_error(ctxt, token, "invalid interpolation '%s'", token->str); 494 472 return NULL; 495 473 } 496 474 ··· 503 481 504 482 switch (token->type) { 505 483 case JSON_STRING: 506 - return QOBJECT(qstring_from_escaped_str(ctxt, token)); 484 + return QOBJECT(parse_string(ctxt, token)); 507 485 case JSON_INTEGER: { 508 486 /* 509 487 * Represent JSON_INTEGER as QNUM_I64 if possible, else as ··· 538 516 } 539 517 case JSON_FLOAT: 540 518 /* FIXME dependent on locale; a pervasive issue in QEMU */ 541 - /* FIXME our lexer matches RFC 7159 in forbidding Inf or NaN, 519 + /* FIXME our lexer matches RFC 8259 in forbidding Inf or NaN, 542 520 * but those might be useful extensions beyond JSON */ 543 521 return QOBJECT(qnum_from_double(strtod(token->str, NULL))); 544 522 default: ··· 546 524 } 547 525 } 548 526 549 - static QObject *parse_value(JSONParserContext *ctxt, va_list *ap) 527 + static QObject *parse_value(JSONParserContext *ctxt) 550 528 { 551 529 JSONToken *token; 552 530 ··· 558 536 559 537 switch (token->type) { 560 538 case JSON_LCURLY: 561 - return parse_object(ctxt, ap); 539 + return parse_object(ctxt); 562 540 case JSON_LSQUARE: 563 - return parse_array(ctxt, ap); 564 - case JSON_ESCAPE: 565 - return parse_escape(ctxt, ap); 541 + return parse_array(ctxt); 542 + case JSON_INTERP: 543 + return parse_interpolation(ctxt); 566 544 
case JSON_INTEGER: 567 545 case JSON_FLOAT: 568 546 case JSON_STRING: ··· 575 553 } 576 554 } 577 555 578 - QObject *json_parser_parse(GQueue *tokens, va_list *ap) 556 + JSONToken *json_token(JSONTokenType type, int x, int y, GString *tokstr) 579 557 { 580 - return json_parser_parse_err(tokens, ap, NULL); 558 + JSONToken *token = g_malloc(sizeof(JSONToken) + tokstr->len + 1); 559 + 560 + token->type = type; 561 + memcpy(token->str, tokstr->str, tokstr->len); 562 + token->str[tokstr->len] = 0; 563 + token->x = x; 564 + token->y = y; 565 + return token; 581 566 } 582 567 583 - QObject *json_parser_parse_err(GQueue *tokens, va_list *ap, Error **errp) 568 + QObject *json_parser_parse(GQueue *tokens, va_list *ap, Error **errp) 584 569 { 585 - JSONParserContext *ctxt = parser_context_new(tokens); 570 + JSONParserContext ctxt = { .buf = tokens, .ap = ap }; 586 571 QObject *result; 587 572 588 - if (!ctxt) { 589 - return NULL; 590 - } 573 + result = parse_value(&ctxt); 574 + assert(ctxt.err || g_queue_is_empty(ctxt.buf)); 591 575 592 - result = parse_value(ctxt, ap); 576 + error_propagate(errp, ctxt.err); 593 577 594 - error_propagate(errp, ctxt->err); 595 - 596 - parser_context_free(ctxt); 578 + while (!g_queue_is_empty(ctxt.buf)) { 579 + parser_context_pop_token(&ctxt); 580 + } 581 + g_free(ctxt.current); 597 582 598 583 return result; 599 584 }
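Note on the `\u` handling in parse_string() above: the surrogate-pair arithmetic can be tried in isolation. The following is a minimal sketch, not the QEMU code. The names `hex4()` and `decode_u_escape()` are hypothetical, and unlike the patch (which leaves it to mod_utf8_encode() to reject bad code points) the sketch rejects unpaired surrogates itself.

```c
#include <ctype.h>

/*
 * Hypothetical helper: parse exactly four hex digits.
 * Returns the value, or -1 if any character is not a hex digit
 * (this also stops safely at a premature NUL).
 */
static int hex4(const char *s)
{
    int cp = 0;

    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)s[i];

        if (!isxdigit(c)) {
            return -1;
        }
        cp = (cp << 4) | (isdigit(c) ? c - '0' : 10 + tolower(c) - 'a');
    }
    return cp;
}

/*
 * Hypothetical helper: decode the escape at @s, which must start
 * with "\uXXXX".  A leading surrogate must be followed directly by
 * an escaped trailing surrogate; the two are combined into one code
 * point.  Returns the code point and stores the position after the
 * escape(s) in *end, or returns -1 on error (*end is then untouched).
 */
static int decode_u_escape(const char *s, const char **end)
{
    int cp, trailing;

    if (s[0] != '\\' || s[1] != 'u') {
        return -1;
    }
    cp = hex4(s + 2);
    if (cp < 0) {
        return -1;
    }
    s += 6;

    if (cp >= 0xD800 && cp <= 0xDBFF) {
        /* leading surrogate: a trailing one must follow */
        if (s[0] != '\\' || s[1] != 'u') {
            return -1;
        }
        trailing = hex4(s + 2);
        if (trailing < 0xDC00 || trailing > 0xDFFF) {
            return -1;
        }
        cp = 0x10000 + ((cp & 0x3FF) << 10) + (trailing & 0x3FF);
        s += 6;
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
        return -1;              /* unpaired trailing surrogate */
    }
    *end = s;
    return cp;
}
```

With these definitions, `decode_u_escape("\\uD834\\uDD1E", &end)` yields 0x1D11E, the code point used by the "quadruple byte utf-8" test case in tests/check-qjson.c, while a lone `\uD800` is rejected.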
+57 -59
qobject/json-streamer.c
··· 12 12 */ 13 13 14 14 #include "qemu/osdep.h" 15 - #include "qemu-common.h" 16 - #include "qapi/qmp/json-lexer.h" 17 - #include "qapi/qmp/json-streamer.h" 15 + #include "qapi/error.h" 16 + #include "json-parser-int.h" 18 17 19 18 #define MAX_TOKEN_SIZE (64ULL << 20) 20 19 #define MAX_TOKEN_COUNT (2ULL << 20) 21 - #define MAX_NESTING (1ULL << 10) 20 + #define MAX_NESTING (1 << 10) 22 21 23 - static void json_message_free_token(void *token, void *opaque) 22 + static void json_message_free_tokens(JSONMessageParser *parser) 24 23 { 25 - g_free(token); 26 - } 24 + JSONToken *token; 27 25 28 - static void json_message_free_tokens(JSONMessageParser *parser) 29 - { 30 - if (parser->tokens) { 31 - g_queue_foreach(parser->tokens, json_message_free_token, NULL); 32 - g_queue_free(parser->tokens); 33 - parser->tokens = NULL; 26 + while ((token = g_queue_pop_head(&parser->tokens))) { 27 + g_free(token); 34 28 } 35 29 } 36 30 37 - static void json_message_process_token(JSONLexer *lexer, GString *input, 38 - JSONTokenType type, int x, int y) 31 + void json_message_process_token(JSONLexer *lexer, GString *input, 32 + JSONTokenType type, int x, int y) 39 33 { 40 34 JSONMessageParser *parser = container_of(lexer, JSONMessageParser, lexer); 35 + QObject *json = NULL; 36 + Error *err = NULL; 41 37 JSONToken *token; 42 - GQueue *tokens; 43 38 44 39 switch (type) { 45 40 case JSON_LCURLY: ··· 54 49 case JSON_RSQUARE: 55 50 parser->bracket_count--; 56 51 break; 52 + case JSON_ERROR: 53 + error_setg(&err, "JSON parse error, stray '%s'", input->str); 54 + goto out_emit; 55 + case JSON_END_OF_INPUT: 56 + if (g_queue_is_empty(&parser->tokens)) { 57 + return; 58 + } 59 + json = json_parser_parse(&parser->tokens, parser->ap, &err); 60 + goto out_emit; 57 61 default: 58 62 break; 59 63 } 60 64 61 - token = g_malloc(sizeof(JSONToken) + input->len + 1); 62 - token->type = type; 63 - memcpy(token->str, input->str, input->len); 64 - token->str[input->len] = 0; 65 - token->x = x; 66 - token->y = 
y; 65 + /* 66 + * Security consideration, we limit total memory allocated per object 67 + * and the maximum recursion depth that a message can force. 68 + */ 69 + if (parser->token_size + input->len + 1 > MAX_TOKEN_SIZE) { 70 + error_setg(&err, "JSON token size limit exceeded"); 71 + goto out_emit; 72 + } 73 + if (g_queue_get_length(&parser->tokens) + 1 > MAX_TOKEN_COUNT) { 74 + error_setg(&err, "JSON token count limit exceeded"); 75 + goto out_emit; 76 + } 77 + if (parser->bracket_count + parser->brace_count > MAX_NESTING) { 78 + error_setg(&err, "JSON nesting depth limit exceeded"); 79 + goto out_emit; 80 + } 67 81 82 + token = json_token(type, x, y, input); 68 83 parser->token_size += input->len; 69 84 70 - g_queue_push_tail(parser->tokens, token); 85 + g_queue_push_tail(&parser->tokens, token); 71 86 72 - if (type == JSON_ERROR) { 73 - goto out_emit_bad; 74 - } else if (parser->brace_count < 0 || 75 - parser->bracket_count < 0 || 76 - (parser->brace_count == 0 && 77 - parser->bracket_count == 0)) { 78 - goto out_emit; 79 - } else if (parser->token_size > MAX_TOKEN_SIZE || 80 - g_queue_get_length(parser->tokens) > MAX_TOKEN_COUNT || 81 - parser->bracket_count + parser->brace_count > MAX_NESTING) { 82 - /* Security consideration, we limit total memory allocated per object 83 - * and the maximum recursion depth that a message can force. 
84 - */ 85 - goto out_emit_bad; 87 + if ((parser->brace_count > 0 || parser->bracket_count > 0) 88 + && parser->brace_count >= 0 && parser->bracket_count >= 0) { 89 + return; 86 90 } 87 91 88 - return; 92 + json = json_parser_parse(&parser->tokens, parser->ap, &err); 89 93 90 - out_emit_bad: 91 - /* 92 - * Clear out token list and tell the parser to emit an error 93 - * indication by passing it a NULL list 94 - */ 95 - json_message_free_tokens(parser); 96 94 out_emit: 97 - /* send current list of tokens to parser and reset tokenizer */ 98 95 parser->brace_count = 0; 99 96 parser->bracket_count = 0; 100 - /* parser->emit takes ownership of parser->tokens. Remove our own 101 - * reference to parser->tokens before handing it out to parser->emit. 102 - */ 103 - tokens = parser->tokens; 104 - parser->tokens = g_queue_new(); 105 - parser->emit(parser, tokens); 97 + json_message_free_tokens(parser); 106 98 parser->token_size = 0; 99 + parser->emit(parser->opaque, json, err); 107 100 } 108 101 109 102 void json_message_parser_init(JSONMessageParser *parser, 110 - void (*func)(JSONMessageParser *, GQueue *)) 103 + void (*emit)(void *opaque, QObject *json, 104 + Error *err), 105 + void *opaque, va_list *ap) 111 106 { 112 - parser->emit = func; 107 + parser->emit = emit; 108 + parser->opaque = opaque; 109 + parser->ap = ap; 113 110 parser->brace_count = 0; 114 111 parser->bracket_count = 0; 115 - parser->tokens = g_queue_new(); 112 + g_queue_init(&parser->tokens); 116 113 parser->token_size = 0; 117 114 118 - json_lexer_init(&parser->lexer, json_message_process_token); 115 + json_lexer_init(&parser->lexer, !!ap); 119 116 } 120 117 121 - int json_message_parser_feed(JSONMessageParser *parser, 118 + void json_message_parser_feed(JSONMessageParser *parser, 122 119 const char *buffer, size_t size) 123 120 { 124 - return json_lexer_feed(&parser->lexer, buffer, size); 121 + json_lexer_feed(&parser->lexer, buffer, size); 125 122 } 126 123 127 - int
json_message_parser_flush(JSONMessageParser *parser) 124 + void json_message_parser_flush(JSONMessageParser *parser) 128 125 { 129 - return json_lexer_flush(&parser->lexer); 126 + json_lexer_flush(&parser->lexer); 127 + assert(g_queue_is_empty(&parser->tokens)); 130 128 } 131 129 132 130 void json_message_parser_destroy(JSONMessageParser *parser)
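Note on json_message_process_token() above: the streamer decides when a batch of tokens is ready for the parser purely from its brace and bracket counters. Below is a minimal sketch of that decision, assuming each token is represented by its first character. `NestingTracker` and `nesting_track()` are hypothetical names, and the size, count, and depth limits of the real function are left out. The sketch checks both counters for negative values.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the streamer's nesting counters. */
typedef struct NestingTracker {
    int brace_count;    /* '{' seen minus '}' seen */
    int bracket_count;  /* '[' seen minus ']' seen */
} NestingTracker;

/*
 * Account for one token, identified by its first character.
 * Returns true when the tokens collected so far should go to the
 * parser: either nesting is back at zero (a complete value), or a
 * closing token arrived without a matching opener (the parser will
 * then report the error).  The counters are reset in that case.
 */
static bool nesting_track(NestingTracker *t, char tok)
{
    switch (tok) {
    case '{':
        t->brace_count++;
        break;
    case '}':
        t->brace_count--;
        break;
    case '[':
        t->bracket_count++;
        break;
    case ']':
        t->bracket_count--;
        break;
    default:
        break;
    }

    if ((t->brace_count > 0 || t->bracket_count > 0)
        && t->brace_count >= 0 && t->bracket_count >= 0) {
        return false;   /* still inside a structure, keep collecting */
    }

    t->brace_count = 0;
    t->bracket_count = 0;
    return true;
}
```

Feeding `{`, `[`, `]`, `}` returns true only on the final `}`. A scalar token at top level or a stray closer returns true immediately.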
-1
qobject/qbool.c
··· 13 13 14 14 #include "qemu/osdep.h" 15 15 #include "qapi/qmp/qbool.h" 16 - #include "qemu-common.h" 17 16 18 17 /** 19 18 * qbool_from_bool(): Create a new QBool from a bool
+22 -9
qobject/qjson.c
··· 13 13 14 14 #include "qemu/osdep.h" 15 15 #include "qapi/error.h" 16 - #include "qapi/qmp/json-lexer.h" 17 16 #include "qapi/qmp/json-parser.h" 18 - #include "qapi/qmp/json-streamer.h" 19 17 #include "qapi/qmp/qjson.h" 20 18 #include "qapi/qmp/qbool.h" 21 19 #include "qapi/qmp/qdict.h" ··· 27 25 typedef struct JSONParsingState 28 26 { 29 27 JSONMessageParser parser; 30 - va_list *ap; 31 28 QObject *result; 32 29 Error *err; 33 30 } JSONParsingState; 34 31 35 - static void parse_json(JSONMessageParser *parser, GQueue *tokens) 32 + static void consume_json(void *opaque, QObject *json, Error *err) 36 33 { 37 - JSONParsingState *s = container_of(parser, JSONParsingState, parser); 34 + JSONParsingState *s = opaque; 38 35 39 - s->result = json_parser_parse_err(tokens, s->ap, &s->err); 36 + assert(!json != !err); 37 + assert(!s->result || !s->err); 38 + 39 + if (s->result) { 40 + qobject_unref(s->result); 41 + s->result = NULL; 42 + error_setg(&s->err, "Expecting at most one JSON value"); 43 + } 44 + if (s->err) { 45 + qobject_unref(json); 46 + error_free(err); 47 + return; 48 + } 49 + s->result = json; 50 + s->err = err; 40 51 } 41 52 42 53 /* ··· 54 65 { 55 66 JSONParsingState state = {}; 56 67 57 - state.ap = ap; 58 - 59 - json_message_parser_init(&state.parser, parse_json); 68 + json_message_parser_init(&state.parser, consume_json, &state, ap); 60 69 json_message_parser_feed(&state.parser, string, strlen(string)); 61 70 json_message_parser_flush(&state.parser); 62 71 json_message_parser_destroy(&state.parser); 72 + 73 + if (!state.result && !state.err) { 74 + error_setg(&state.err, "Expecting a JSON value"); 75 + } 63 76 64 77 error_propagate(errp, state.err); 65 78 return state.result;
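Note on consume_json() above: the callback enforces "exactly one JSON value per string", and the first error wins. The policy can be sketched without any QEMU types. In this sketch plain C strings stand in for QObject and Error, and `OneValueState`, `one_value_consume()` and `one_value_finish()` are hypothetical names. Reference counting and error freeing are omitted.

```c
#include <stddef.h>

/* Hypothetical miniature of JSONParsingState. */
typedef struct OneValueState {
    const char *result; /* the single accepted value, if any */
    const char *err;    /* the first error, if any */
} OneValueState;

/*
 * Called once per emitted value or error; exactly one of @value and
 * @err is expected to be non-NULL.
 */
static void one_value_consume(OneValueState *s, const char *value,
                              const char *err)
{
    if (s->result) {
        /* something follows a good value: withdraw the value */
        s->result = NULL;
        s->err = "Expecting at most one JSON value";
    }
    if (s->err) {
        return;         /* keep the first error, drop later input */
    }
    s->result = value;
    s->err = err;
}

/* Called after all input was fed; empty input is an error, too. */
static const char *one_value_finish(OneValueState *s)
{
    if (!s->result && !s->err) {
        s->err = "Expecting a JSON value";
    }
    return s->result;
}
```

One emitted value yields that value with no error. Two values yield NULL with "Expecting at most one JSON value", and no values at all yield NULL with "Expecting a JSON value", matching the messages in the hunk above.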
-1
qobject/qlist.c
··· 17 17 #include "qapi/qmp/qnum.h" 18 18 #include "qapi/qmp/qstring.h" 19 19 #include "qemu/queue.h" 20 - #include "qemu-common.h" 21 20 22 21 /** 23 22 * qlist_new(): Create a new QList
-1
qobject/qnull.c
··· 11 11 */ 12 12 13 13 #include "qemu/osdep.h" 14 - #include "qemu-common.h" 15 14 #include "qapi/qmp/qnull.h" 16 15 17 16 QNull qnull_ = {
-1
qobject/qnum.c
··· 14 14 15 15 #include "qemu/osdep.h" 16 16 #include "qapi/qmp/qnum.h" 17 - #include "qemu-common.h" 18 17 19 18 /** 20 19 * qnum_from_int(): Create a new QNum from an int64_t
-1
qobject/qobject.c
··· 8 8 */ 9 9 10 10 #include "qemu/osdep.h" 11 - #include "qemu-common.h" 12 11 #include "qapi/qmp/qbool.h" 13 12 #include "qapi/qmp/qnull.h" 14 13 #include "qapi/qmp/qnum.h"
-1
qobject/qstring.c
··· 12 12 13 13 #include "qemu/osdep.h" 14 14 #include "qapi/qmp/qstring.h" 15 - #include "qemu-common.h" 16 15 17 16 /** 18 17 * qstring_new(): Create a new empty QString
+3
tests/Makefile.include
··· 183 183 184 184 check-qtest-generic-y = tests/qmp-test$(EXESUF) 185 185 gcov-files-generic-y = monitor.c qapi/qmp-dispatch.c 186 + check-qtest-generic-y += tests/qmp-cmd-test$(EXESUF) 187 + 186 188 check-qtest-generic-y += tests/device-introspect-test$(EXESUF) 187 189 gcov-files-generic-y = qdev-monitor.c qmp.c 188 190 check-qtest-generic-y += tests/cdrom-test$(EXESUF) ··· 779 781 libqos-virtio-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) tests/libqos/virtio.o tests/libqos/virtio-pci.o tests/libqos/virtio-mmio.o tests/libqos/malloc-generic.o 780 782 781 783 tests/qmp-test$(EXESUF): tests/qmp-test.o 784 + tests/qmp-cmd-test$(EXESUF): tests/qmp-cmd-test.o 782 785 tests/device-introspect-test$(EXESUF): tests/device-introspect-test.o 783 786 tests/rtc-test$(EXESUF): tests/rtc-test.o 784 787 tests/m48t59-test$(EXESUF): tests/m48t59-test.o
+543 -501
tests/check-qjson.c
··· 20 20 #include "qapi/qmp/qnull.h" 21 21 #include "qapi/qmp/qnum.h" 22 22 #include "qapi/qmp/qstring.h" 23 + #include "qemu/unicode.h" 23 24 #include "qemu-common.h" 24 25 25 - static void escaped_string(void) 26 + static QString *from_json_str(const char *jstr, bool single, Error **errp) 26 27 { 27 - int i; 28 - struct { 29 - const char *encoded; 30 - const char *decoded; 31 - int skip; 32 - } test_cases[] = { 33 - { "\"\\b\"", "\b" }, 34 - { "\"\\f\"", "\f" }, 35 - { "\"\\n\"", "\n" }, 36 - { "\"\\r\"", "\r" }, 37 - { "\"\\t\"", "\t" }, 38 - { "\"/\"", "/" }, 39 - { "\"\\/\"", "/", .skip = 1 }, 40 - { "\"\\\\\"", "\\" }, 41 - { "\"\\\"\"", "\"" }, 42 - { "\"hello world \\\"embedded string\\\"\"", 43 - "hello world \"embedded string\"" }, 44 - { "\"hello world\\nwith new line\"", "hello world\nwith new line" }, 45 - { "\"single byte utf-8 \\u0020\"", "single byte utf-8 ", .skip = 1 }, 46 - { "\"double byte utf-8 \\u00A2\"", "double byte utf-8 \xc2\xa2" }, 47 - { "\"triple byte utf-8 \\u20AC\"", "triple byte utf-8 \xe2\x82\xac" }, 48 - { "'\\b'", "\b", .skip = 1 }, 49 - { "'\\f'", "\f", .skip = 1 }, 50 - { "'\\n'", "\n", .skip = 1 }, 51 - { "'\\r'", "\r", .skip = 1 }, 52 - { "'\\t'", "\t", .skip = 1 }, 53 - { "'\\/'", "/", .skip = 1 }, 54 - { "'\\\\'", "\\", .skip = 1 }, 55 - {} 56 - }; 28 + char quote = single ? 
'\'' : '"'; 29 + char *qjstr = g_strdup_printf("%c%s%c", quote, jstr, quote); 30 + QString *ret = qobject_to(QString, qobject_from_json(qjstr, errp)); 57 31 58 - for (i = 0; test_cases[i].encoded; i++) { 59 - QObject *obj; 60 - QString *str; 32 + g_free(qjstr); 33 + return ret; 34 + } 61 35 62 - obj = qobject_from_json(test_cases[i].encoded, &error_abort); 63 - str = qobject_to(QString, obj); 64 - g_assert(str); 65 - g_assert_cmpstr(qstring_get_str(str), ==, test_cases[i].decoded); 36 + static char *to_json_str(QString *str) 37 + { 38 + QString *json = qobject_to_json(QOBJECT(str)); 39 + char *jstr; 66 40 67 - if (test_cases[i].skip == 0) { 68 - str = qobject_to_json(obj); 69 - g_assert_cmpstr(qstring_get_str(str), ==, test_cases[i].encoded); 70 - qobject_unref(obj); 71 - } 72 - 73 - qobject_unref(str); 41 + if (!json) { 42 + return NULL; 74 43 } 44 + /* peel off double quotes */ 45 + jstr = g_strndup(qstring_get_str(json) + 1, 46 + qstring_get_length(json) - 2); 47 + qobject_unref(json); 48 + return jstr; 75 49 } 76 50 77 - static void simple_string(void) 51 + static void escaped_string(void) 78 52 { 79 - int i; 80 53 struct { 81 - const char *encoded; 82 - const char *decoded; 54 + /* Content of JSON string to parse with qobject_from_json() */ 55 + const char *json_in; 56 + /* Expected parse output; to unparse with qobject_to_json() */ 57 + const char *utf8_out; 58 + int skip; 83 59 } test_cases[] = { 84 - { "\"hello world\"", "hello world" }, 85 - { "\"the quick brown fox jumped over the fence\"", 86 - "the quick brown fox jumped over the fence" }, 60 + { "\\b\\f\\n\\r\\t\\\\\\\"", "\b\f\n\r\t\\\"" }, 61 + { "\\/\\'", "/'", .skip = 1 }, 62 + { "single byte utf-8 \\u0020", "single byte utf-8 ", .skip = 1 }, 63 + { "double byte utf-8 \\u00A2", "double byte utf-8 \xc2\xa2" }, 64 + { "triple byte utf-8 \\u20AC", "triple byte utf-8 \xe2\x82\xac" }, 65 + { "quadruple byte utf-8 \\uD834\\uDD1E", /* U+1D11E */ 66 + "quadruple byte utf-8 \xF0\x9D\x84\x9E" }, 67 + { "\\", 
NULL }, 68 + { "\\z", NULL }, 69 + { "\\ux", NULL }, 70 + { "\\u1x", NULL }, 71 + { "\\u12x", NULL }, 72 + { "\\u123x", NULL }, 73 + { "\\u12345", "\341\210\2645" }, 74 + { "\\u0000x", "\xC0\x80x" }, 75 + { "unpaired leading surrogate \\uD800", NULL }, 76 + { "unpaired leading surrogate \\uD800\\uCAFE", NULL }, 77 + { "unpaired leading surrogate \\uD800\\uD801\\uDC02", NULL }, 78 + { "unpaired trailing surrogate \\uDC00", NULL }, 79 + { "backward surrogate pair \\uDC00\\uD800", NULL }, 80 + { "noncharacter U+FDD0 \\uFDD0", NULL }, 81 + { "noncharacter U+FDEF \\uFDEF", NULL }, 82 + { "noncharacter U+1FFFE \\uD87F\\uDFFE", NULL }, 83 + { "noncharacter U+10FFFF \\uDC3F\\uDFFF", NULL }, 87 84 {} 88 85 }; 89 - 90 - for (i = 0; test_cases[i].encoded; i++) { 91 - QObject *obj; 92 - QString *str; 93 - 94 - obj = qobject_from_json(test_cases[i].encoded, &error_abort); 95 - str = qobject_to(QString, obj); 96 - g_assert(str); 97 - g_assert(strcmp(qstring_get_str(str), test_cases[i].decoded) == 0); 86 + int i, j; 87 + QString *cstr; 88 + char *jstr; 98 89 99 - str = qobject_to_json(obj); 100 - g_assert(strcmp(qstring_get_str(str), test_cases[i].encoded) == 0); 101 - 102 - qobject_unref(obj); 103 - 104 - qobject_unref(str); 90 + for (i = 0; test_cases[i].json_in; i++) { 91 + for (j = 0; j < 2; j++) { 92 + if (test_cases[i].utf8_out) { 93 + cstr = from_json_str(test_cases[i].json_in, j, &error_abort); 94 + g_assert_cmpstr(qstring_get_try_str(cstr), 95 + ==, test_cases[i].utf8_out); 96 + if (!test_cases[i].skip) { 97 + jstr = to_json_str(cstr); 98 + g_assert_cmpstr(jstr, ==, test_cases[i].json_in); 99 + g_free(jstr); 100 + } 101 + qobject_unref(cstr); 102 + } else { 103 + cstr = from_json_str(test_cases[i].json_in, j, NULL); 104 + g_assert(!cstr); 105 + } 106 + } 105 107 } 106 108 } 107 109 108 - static void single_quote_string(void) 110 + static void string_with_quotes(void) 109 111 { 112 + const char *test_cases[] = { 113 + "\"the bee's knees\"", 114 + "'double quote \"'", 115 
+ NULL 116 + }; 110 117 int i; 111 - struct { 112 - const char *encoded; 113 - const char *decoded; 114 - } test_cases[] = { 115 - { "'hello world'", "hello world" }, 116 - { "'the quick brown fox \\' jumped over the fence'", 117 - "the quick brown fox ' jumped over the fence" }, 118 - {} 119 - }; 118 + QString *str; 119 + char *cstr; 120 120 121 - for (i = 0; test_cases[i].encoded; i++) { 122 - QObject *obj; 123 - QString *str; 124 - 125 - obj = qobject_from_json(test_cases[i].encoded, &error_abort); 126 - str = qobject_to(QString, obj); 121 + for (i = 0; test_cases[i]; i++) { 122 + str = qobject_to(QString, 123 + qobject_from_json(test_cases[i], &error_abort)); 127 124 g_assert(str); 128 - g_assert(strcmp(qstring_get_str(str), test_cases[i].decoded) == 0); 129 - 125 + cstr = g_strndup(test_cases[i] + 1, strlen(test_cases[i]) - 2); 126 + g_assert_cmpstr(qstring_get_str(str), ==, cstr); 127 + g_free(cstr); 130 128 qobject_unref(str); 131 129 } 132 130 } ··· 134 132 static void utf8_string(void) 135 133 { 136 134 /* 137 - * FIXME Current behavior for invalid UTF-8 sequences is 138 - * incorrect. This test expects current, incorrect results. 139 - * They're all marked "bug:" below, and are to be replaced by 140 - * correct ones as the bugs get fixed. 141 - * 142 - * The JSON parser rejects some invalid sequences, but accepts 143 - * others without correcting the problem. 144 - * 145 - * We should either reject all invalid sequences, or minimize 146 - * overlong sequences and replace all other invalid sequences by a 147 - * suitable replacement character. A common choice for 148 - * replacement is U+FFFD. 149 - * 150 - * Problem: we can't easily deal with embedded U+0000. Parsing 151 - * the JSON string "this \\u0000" is fun" yields "this \0 is fun", 152 - * which gets misinterpreted as NUL-terminated "this ". We should 153 - * consider using overlong encoding \xC0\x80 for U+0000 ("modified 154 - * UTF-8"). 
155 - * 156 135 * Most test cases are scraped from Markus Kuhn's UTF-8 decoder 157 136 * capability and stress test at 158 137 * http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt 159 138 */ 160 139 static const struct { 140 + /* Content of JSON string to parse with qobject_from_json() */ 161 141 const char *json_in; 142 + /* Expected parse output */ 162 143 const char *utf8_out; 163 - const char *json_out; /* defaults to @json_in */ 164 - const char *utf8_in; /* defaults to @utf8_out */ 144 + /* Expected unparse output, defaults to @json_in */ 145 + const char *json_out; 165 146 } test_cases[] = { 166 - /* 167 - * Bug markers used here: 168 - * - bug: not corrected 169 - * JSON parser fails to correct invalid sequence(s) 170 - * - bug: rejected 171 - * JSON parser rejects invalid sequence(s) 172 - * We may choose to define this as feature 173 - * - bug: want "..." 174 - * JSON parser produces incorrect result, this is the 175 - * correct one, assuming replacement character U+FFFF 176 - * We may choose to reject instead of replace 177 - */ 178 - 147 + /* 0 Control characters */ 148 + { 149 + /* 150 + * Note: \x00 is impossible, other representations of 151 + * U+0000 are covered under 4.3 152 + */ 153 + "\x01\x02\x03\x04\x05\x06\x07" 154 + "\x08\x09\x0A\x0B\x0C\x0D\x0E\x0F" 155 + "\x10\x11\x12\x13\x14\x15\x16\x17" 156 + "\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F", 157 + NULL, 158 + "\\u0001\\u0002\\u0003\\u0004\\u0005\\u0006\\u0007" 159 + "\\b\\t\\n\\u000B\\f\\r\\u000E\\u000F" 160 + "\\u0010\\u0011\\u0012\\u0013\\u0014\\u0015\\u0016\\u0017" 161 + "\\u0018\\u0019\\u001A\\u001B\\u001C\\u001D\\u001E\\u001F", 162 + }, 179 163 /* 1 Some correct UTF-8 text */ 180 164 { 181 165 /* a bit of German */ 182 - "\"Falsches \xC3\x9C" "ben von Xylophonmusik qu\xC3\xA4lt" 183 - " jeden gr\xC3\xB6\xC3\x9F" "eren Zwerg.\"", 184 166 "Falsches \xC3\x9C" "ben von Xylophonmusik qu\xC3\xA4lt" 185 167 " jeden gr\xC3\xB6\xC3\x9F" "eren Zwerg.", 186 - "\"Falsches \\u00DCben von Xylophonmusik 
qu\\u00E4lt" 187 - " jeden gr\\u00F6\\u00DFeren Zwerg.\"", 168 + "Falsches \xC3\x9C" "ben von Xylophonmusik qu\xC3\xA4lt" 169 + " jeden gr\xC3\xB6\xC3\x9F" "eren Zwerg.", 170 + "Falsches \\u00DCben von Xylophonmusik qu\\u00E4lt" 171 + " jeden gr\\u00F6\\u00DFeren Zwerg.", 188 172 }, 189 173 { 190 174 /* a bit of Greek */ 191 - "\"\xCE\xBA\xE1\xBD\xB9\xCF\x83\xCE\xBC\xCE\xB5\"", 192 175 "\xCE\xBA\xE1\xBD\xB9\xCF\x83\xCE\xBC\xCE\xB5", 193 - "\"\\u03BA\\u1F79\\u03C3\\u03BC\\u03B5\"", 176 + "\xCE\xBA\xE1\xBD\xB9\xCF\x83\xCE\xBC\xCE\xB5", 177 + "\\u03BA\\u1F79\\u03C3\\u03BC\\u03B5", 194 178 }, 195 179 /* 2 Boundary condition test cases */ 196 180 /* 2.1 First possible sequence of a certain length */ 197 - /* 2.1.1 1 byte U+0000 */ 181 + /* 182 + * 2.1.1 1 byte U+0020 183 + * Control characters are already covered by their own test 184 + * case under 0. Test the first 1 byte non-control character 185 + * here. 186 + */ 198 187 { 199 - "\"\\u0000\"", 200 - "", /* bug: want overlong "\xC0\x80" */ 201 - "\"\\u0000\"", 202 - "\xC0\x80", 188 + " ", 189 + " ", 203 190 }, 204 191 /* 2.1.2 2 bytes U+0080 */ 205 192 { 206 - "\"\xC2\x80\"", 207 193 "\xC2\x80", 208 - "\"\\u0080\"", 194 + "\xC2\x80", 195 + "\\u0080", 209 196 }, 210 197 /* 2.1.3 3 bytes U+0800 */ 211 198 { 212 - "\"\xE0\xA0\x80\"", 213 199 "\xE0\xA0\x80", 214 - "\"\\u0800\"", 200 + "\xE0\xA0\x80", 201 + "\\u0800", 215 202 }, 216 203 /* 2.1.4 4 bytes U+10000 */ 217 204 { 218 - "\"\xF0\x90\x80\x80\"", 219 205 "\xF0\x90\x80\x80", 220 - "\"\\uD800\\uDC00\"", 206 + "\xF0\x90\x80\x80", 207 + "\\uD800\\uDC00", 221 208 }, 222 209 /* 2.1.5 5 bytes U+200000 */ 223 210 { 224 - "\"\xF8\x88\x80\x80\x80\"", 225 - NULL, /* bug: rejected */ 226 - "\"\\uFFFD\"", 227 211 "\xF8\x88\x80\x80\x80", 212 + NULL, 213 + "\\uFFFD", 228 214 }, 229 215 /* 2.1.6 6 bytes U+4000000 */ 230 216 { 231 - "\"\xFC\x84\x80\x80\x80\x80\"", 232 - NULL, /* bug: rejected */ 233 - "\"\\uFFFD\"", 234 217 "\xFC\x84\x80\x80\x80\x80", 218 + NULL, 219 + "\\uFFFD", 
235 220 }, 236 221 /* 2.2 Last possible sequence of a certain length */ 237 222 /* 2.2.1 1 byte U+007F */ 238 223 { 239 - "\"\x7F\"", 224 + "\x7F", 240 225 "\x7F", 241 - "\"\\u007F\"", 226 + "\\u007F", 242 227 }, 243 228 /* 2.2.2 2 bytes U+07FF */ 244 229 { 245 - "\"\xDF\xBF\"", 230 + "\xDF\xBF", 246 231 "\xDF\xBF", 247 - "\"\\u07FF\"", 232 + "\\u07FF", 248 233 }, 249 234 /* 250 235 * 2.2.3 3 bytes U+FFFC ··· 256 241 * U+FFFC here. 257 242 */ 258 243 { 259 - "\"\xEF\xBF\xBC\"", 244 + "\xEF\xBF\xBC", 260 245 "\xEF\xBF\xBC", 261 - "\"\\uFFFC\"", 246 + "\\uFFFC", 262 247 }, 263 248 /* 2.2.4 4 bytes U+1FFFFF */ 264 249 { 265 - "\"\xF7\xBF\xBF\xBF\"", 266 - NULL, /* bug: rejected */ 267 - "\"\\uFFFD\"", 268 250 "\xF7\xBF\xBF\xBF", 251 + NULL, 252 + "\\uFFFD", 269 253 }, 270 254 /* 2.2.5 5 bytes U+3FFFFFF */ 271 255 { 272 - "\"\xFB\xBF\xBF\xBF\xBF\"", 273 - NULL, /* bug: rejected */ 274 - "\"\\uFFFD\"", 275 256 "\xFB\xBF\xBF\xBF\xBF", 257 + NULL, 258 + "\\uFFFD", 276 259 }, 277 260 /* 2.2.6 6 bytes U+7FFFFFFF */ 278 261 { 279 - "\"\xFD\xBF\xBF\xBF\xBF\xBF\"", 280 - NULL, /* bug: rejected */ 281 - "\"\\uFFFD\"", 282 262 "\xFD\xBF\xBF\xBF\xBF\xBF", 263 + NULL, 264 + "\\uFFFD", 283 265 }, 284 266 /* 2.3 Other boundary conditions */ 285 267 { 286 268 /* last one before surrogate range: U+D7FF */ 287 - "\"\xED\x9F\xBF\"", 269 + "\xED\x9F\xBF", 288 270 "\xED\x9F\xBF", 289 - "\"\\uD7FF\"", 271 + "\\uD7FF", 290 272 }, 291 273 { 292 274 /* first one after surrogate range: U+E000 */ 293 - "\"\xEE\x80\x80\"", 275 + "\xEE\x80\x80", 294 276 "\xEE\x80\x80", 295 - "\"\\uE000\"", 277 + "\\uE000", 296 278 }, 297 279 { 298 280 /* last one in BMP: U+FFFD */ 299 - "\"\xEF\xBF\xBD\"", 300 281 "\xEF\xBF\xBD", 301 - "\"\\uFFFD\"", 282 + "\xEF\xBF\xBD", 283 + "\\uFFFD", 302 284 }, 303 285 { 304 286 /* last one in last plane: U+10FFFD */ 305 - "\"\xF4\x8F\xBF\xBD\"", 306 287 "\xF4\x8F\xBF\xBD", 307 - "\"\\uDBFF\\uDFFD\"" 288 + "\xF4\x8F\xBF\xBD", 289 + "\\uDBFF\\uDFFD" 308 290 }, 309 291 { 310 
292 /* first one beyond Unicode range: U+110000 */ 311 - "\"\xF4\x90\x80\x80\"", 312 293 "\xF4\x90\x80\x80", 313 - "\"\\uFFFD\"", 294 + NULL, 295 + "\\uFFFD", 314 296 }, 315 297 /* 3 Malformed sequences */ 316 298 /* 3.1 Unexpected continuation bytes */ 317 299 /* 3.1.1 First continuation byte */ 318 300 { 319 - "\"\x80\"", 320 - "\x80", /* bug: not corrected */ 321 - "\"\\uFFFD\"", 301 + "\x80", 302 + NULL, 303 + "\\uFFFD", 322 304 }, 323 305 /* 3.1.2 Last continuation byte */ 324 306 { 325 - "\"\xBF\"", 326 - "\xBF", /* bug: not corrected */ 327 - "\"\\uFFFD\"", 307 + "\xBF", 308 + NULL, 309 + "\\uFFFD", 328 310 }, 329 311 /* 3.1.3 2 continuation bytes */ 330 312 { 331 - "\"\x80\xBF\"", 332 - "\x80\xBF", /* bug: not corrected */ 333 - "\"\\uFFFD\\uFFFD\"", 313 + "\x80\xBF", 314 + NULL, 315 + "\\uFFFD\\uFFFD", 334 316 }, 335 317 /* 3.1.4 3 continuation bytes */ 336 318 { 337 - "\"\x80\xBF\x80\"", 338 - "\x80\xBF\x80", /* bug: not corrected */ 339 - "\"\\uFFFD\\uFFFD\\uFFFD\"", 319 + "\x80\xBF\x80", 320 + NULL, 321 + "\\uFFFD\\uFFFD\\uFFFD", 340 322 }, 341 323 /* 3.1.5 4 continuation bytes */ 342 324 { 343 - "\"\x80\xBF\x80\xBF\"", 344 - "\x80\xBF\x80\xBF", /* bug: not corrected */ 345 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"", 325 + "\x80\xBF\x80\xBF", 326 + NULL, 327 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 346 328 }, 347 329 /* 3.1.6 5 continuation bytes */ 348 330 { 349 - "\"\x80\xBF\x80\xBF\x80\"", 350 - "\x80\xBF\x80\xBF\x80", /* bug: not corrected */ 351 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"", 331 + "\x80\xBF\x80\xBF\x80", 332 + NULL, 333 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 352 334 }, 353 335 /* 3.1.7 6 continuation bytes */ 354 336 { 355 - "\"\x80\xBF\x80\xBF\x80\xBF\"", 356 - "\x80\xBF\x80\xBF\x80\xBF", /* bug: not corrected */ 357 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"", 337 + "\x80\xBF\x80\xBF\x80\xBF", 338 + NULL, 339 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 358 340 }, 359 341 /* 3.1.8 7 continuation bytes */ 360 342 { 361 - 
"\"\x80\xBF\x80\xBF\x80\xBF\x80\"", 362 - "\x80\xBF\x80\xBF\x80\xBF\x80", /* bug: not corrected */ 363 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"", 343 + "\x80\xBF\x80\xBF\x80\xBF\x80", 344 + NULL, 345 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 364 346 }, 365 347 /* 3.1.9 Sequence of all 64 possible continuation bytes */ 366 348 { 367 - "\"\x80\x81\x82\x83\x84\x85\x86\x87" 368 - "\x88\x89\x8A\x8B\x8C\x8D\x8E\x8F" 369 - "\x90\x91\x92\x93\x94\x95\x96\x97" 370 - "\x98\x99\x9A\x9B\x9C\x9D\x9E\x9F" 371 - "\xA0\xA1\xA2\xA3\xA4\xA5\xA6\xA7" 372 - "\xA8\xA9\xAA\xAB\xAC\xAD\xAE\xAF" 373 - "\xB0\xB1\xB2\xB3\xB4\xB5\xB6\xB7" 374 - "\xB8\xB9\xBA\xBB\xBC\xBD\xBE\xBF\"", 375 - /* bug: not corrected */ 376 349 "\x80\x81\x82\x83\x84\x85\x86\x87" 377 350 "\x88\x89\x8A\x8B\x8C\x8D\x8E\x8F" 378 351 "\x90\x91\x92\x93\x94\x95\x96\x97" ··· 381 354 "\xA8\xA9\xAA\xAB\xAC\xAD\xAE\xAF" 382 355 "\xB0\xB1\xB2\xB3\xB4\xB5\xB6\xB7" 383 356 "\xB8\xB9\xBA\xBB\xBC\xBD\xBE\xBF", 384 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 357 + NULL, 385 358 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 386 359 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 387 360 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 388 361 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 389 362 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 390 363 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 391 - "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"" 364 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 365 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 392 366 }, 393 367 /* 3.2 Lonely start characters */ 394 368 /* 3.2.1 All 32 first bytes of 2-byte sequences, followed by space */ 395 369 { 396 - "\"\xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 " 397 - "\xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF " 398 - "\xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD7 " 399 - "\xD8 \xD9 \xDA \xDB \xDC \xDD \xDE 
\xDF \"", 400 - NULL, /* bug: rejected */ 401 - "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 402 - "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 403 - "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 404 - "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \"", 405 370 "\xC0 \xC1 \xC2 \xC3 \xC4 \xC5 \xC6 \xC7 " 406 371 "\xC8 \xC9 \xCA \xCB \xCC \xCD \xCE \xCF " 407 372 "\xD0 \xD1 \xD2 \xD3 \xD4 \xD5 \xD6 \xD7 " 408 373 "\xD8 \xD9 \xDA \xDB \xDC \xDD \xDE \xDF ", 374 + NULL, 375 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 376 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 377 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 378 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD ", 409 379 }, 410 380 /* 3.2.2 All 16 first bytes of 3-byte sequences, followed by space */ 411 381 { 412 - "\"\xE0 \xE1 \xE2 \xE3 \xE4 \xE5 \xE6 \xE7 " 413 - "\xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF \"", 414 - /* bug: not corrected */ 415 382 "\xE0 \xE1 \xE2 \xE3 \xE4 \xE5 \xE6 \xE7 " 416 383 "\xE8 \xE9 \xEA \xEB \xEC \xED \xEE \xEF ", 417 - "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 418 - "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \"", 384 + NULL, 385 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD " 386 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD ", 419 387 }, 420 388 /* 3.2.3 All 8 first bytes of 4-byte sequences, followed by space */ 421 389 { 422 - "\"\xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF7 \"", 423 - NULL, /* bug: rejected */ 424 - "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \"", 425 390 "\xF0 \xF1 \xF2 \xF3 \xF4 \xF5 \xF6 \xF7 ", 391 + NULL, 392 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD \\uFFFD ", 426 393 }, 427 394 /* 3.2.4 All 4 first bytes of 5-byte sequences, followed by space */ 428 395 { 429 - 
"\"\xF8 \xF9 \xFA \xFB \"", 430 - NULL, /* bug: rejected */ 431 - "\"\\uFFFD \\uFFFD \\uFFFD \\uFFFD \"", 432 396 "\xF8 \xF9 \xFA \xFB ", 397 + NULL, 398 + "\\uFFFD \\uFFFD \\uFFFD \\uFFFD ", 433 399 }, 434 400 /* 3.2.5 All 2 first bytes of 6-byte sequences, followed by space */ 435 401 { 436 - "\"\xFC \xFD \"", 437 - NULL, /* bug: rejected */ 438 - "\"\\uFFFD \\uFFFD \"", 439 402 "\xFC \xFD ", 403 + NULL, 404 + "\\uFFFD \\uFFFD ", 440 405 }, 441 406 /* 3.3 Sequences with last continuation byte missing */ 442 407 /* 3.3.1 2-byte sequence with last byte missing (U+0000) */ 443 408 { 444 - "\"\xC0\"", 445 - NULL, /* bug: rejected */ 446 - "\"\\uFFFD\"", 447 409 "\xC0", 410 + NULL, 411 + "\\uFFFD", 448 412 }, 449 413 /* 3.3.2 3-byte sequence with last byte missing (U+0000) */ 450 414 { 451 - "\"\xE0\x80\"", 452 - "\xE0\x80", /* bug: not corrected */ 453 - "\"\\uFFFD\"", 415 + "\xE0\x80", 416 + NULL, 417 + "\\uFFFD", 454 418 }, 455 419 /* 3.3.3 4-byte sequence with last byte missing (U+0000) */ 456 420 { 457 - "\"\xF0\x80\x80\"", 458 - "\xF0\x80\x80", /* bug: not corrected */ 459 - "\"\\uFFFD\"", 421 + "\xF0\x80\x80", 422 + NULL, 423 + "\\uFFFD", 460 424 }, 461 425 /* 3.3.4 5-byte sequence with last byte missing (U+0000) */ 462 426 { 463 - "\"\xF8\x80\x80\x80\"", 464 - NULL, /* bug: rejected */ 465 - "\"\\uFFFD\"", 466 427 "\xF8\x80\x80\x80", 428 + NULL, 429 + "\\uFFFD", 467 430 }, 468 431 /* 3.3.5 6-byte sequence with last byte missing (U+0000) */ 469 432 { 470 - "\"\xFC\x80\x80\x80\x80\"", 471 - NULL, /* bug: rejected */ 472 - "\"\\uFFFD\"", 473 433 "\xFC\x80\x80\x80\x80", 434 + NULL, 435 + "\\uFFFD", 474 436 }, 475 437 /* 3.3.6 2-byte sequence with last byte missing (U+07FF) */ 476 438 { 477 - "\"\xDF\"", 478 - "\xDF", /* bug: not corrected */ 479 - "\"\\uFFFD\"", 439 + "\xDF", 440 + NULL, 441 + "\\uFFFD", 480 442 }, 481 443 /* 3.3.7 3-byte sequence with last byte missing (U+FFFF) */ 482 444 { 483 - "\"\xEF\xBF\"", 484 - "\xEF\xBF", /* bug: not corrected */ 485 - 
"\"\\uFFFD\"", 445 + "\xEF\xBF", 446 + NULL, 447 + "\\uFFFD", 486 448 }, 487 449 /* 3.3.8 4-byte sequence with last byte missing (U+1FFFFF) */ 488 450 { 489 - "\"\xF7\xBF\xBF\"", 490 - NULL, /* bug: rejected */ 491 - "\"\\uFFFD\"", 492 451 "\xF7\xBF\xBF", 452 + NULL, 453 + "\\uFFFD", 493 454 }, 494 455 /* 3.3.9 5-byte sequence with last byte missing (U+3FFFFFF) */ 495 456 { 496 - "\"\xFB\xBF\xBF\xBF\"", 497 - NULL, /* bug: rejected */ 498 - "\"\\uFFFD\"", 499 457 "\xFB\xBF\xBF\xBF", 458 + NULL, 459 + "\\uFFFD", 500 460 }, 501 461 /* 3.3.10 6-byte sequence with last byte missing (U+7FFFFFFF) */ 502 462 { 503 - "\"\xFD\xBF\xBF\xBF\xBF\"", 504 - NULL, /* bug: rejected */ 505 - "\"\\uFFFD\"", 506 463 "\xFD\xBF\xBF\xBF\xBF", 464 + NULL, 465 + "\\uFFFD", 507 466 }, 508 467 /* 3.4 Concatenation of incomplete sequences */ 509 468 { 510 - "\"\xC0\xE0\x80\xF0\x80\x80\xF8\x80\x80\x80\xFC\x80\x80\x80\x80" 511 - "\xDF\xEF\xBF\xF7\xBF\xBF\xFB\xBF\xBF\xBF\xFD\xBF\xBF\xBF\xBF\"", 512 - NULL, /* bug: rejected */ 513 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 514 - "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"", 515 469 "\xC0\xE0\x80\xF0\x80\x80\xF8\x80\x80\x80\xFC\x80\x80\x80\x80" 516 470 "\xDF\xEF\xBF\xF7\xBF\xBF\xFB\xBF\xBF\xBF\xFD\xBF\xBF\xBF\xBF", 471 + NULL, 472 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 473 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 517 474 }, 518 475 /* 3.5 Impossible bytes */ 519 476 { 520 - "\"\xFE\"", 521 - NULL, /* bug: rejected */ 522 - "\"\\uFFFD\"", 523 477 "\xFE", 478 + NULL, 479 + "\\uFFFD", 524 480 }, 525 481 { 526 - "\"\xFF\"", 527 - NULL, /* bug: rejected */ 528 - "\"\\uFFFD\"", 529 482 "\xFF", 483 + NULL, 484 + "\\uFFFD", 530 485 }, 531 486 { 532 - "\"\xFE\xFE\xFF\xFF\"", 533 - NULL, /* bug: rejected */ 534 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"", 535 487 "\xFE\xFE\xFF\xFF", 488 + NULL, 489 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 536 490 }, 537 491 /* 4 Overlong sequences */ 538 492 /* 4.1 Overlong '/' */ 539 493 { 540 - "\"\xC0\xAF\"", 541 - NULL, /* bug: rejected 
*/ 542 - "\"\\uFFFD\"", 543 494 "\xC0\xAF", 495 + NULL, 496 + "\\uFFFD", 544 497 }, 545 498 { 546 - "\"\xE0\x80\xAF\"", 547 - "\xE0\x80\xAF", /* bug: not corrected */ 548 - "\"\\uFFFD\"", 499 + "\xE0\x80\xAF", 500 + NULL, 501 + "\\uFFFD", 549 502 }, 550 503 { 551 - "\"\xF0\x80\x80\xAF\"", 552 - "\xF0\x80\x80\xAF", /* bug: not corrected */ 553 - "\"\\uFFFD\"", 504 + "\xF0\x80\x80\xAF", 505 + NULL, 506 + "\\uFFFD", 554 507 }, 555 508 { 556 - "\"\xF8\x80\x80\x80\xAF\"", 557 - NULL, /* bug: rejected */ 558 - "\"\\uFFFD\"", 559 509 "\xF8\x80\x80\x80\xAF", 510 + NULL, 511 + "\\uFFFD", 560 512 }, 561 513 { 562 - "\"\xFC\x80\x80\x80\x80\xAF\"", 563 - NULL, /* bug: rejected */ 564 - "\"\\uFFFD\"", 565 514 "\xFC\x80\x80\x80\x80\xAF", 515 + NULL, 516 + "\\uFFFD", 566 517 }, 567 518 /* 568 519 * 4.2 Maximum overlong sequences ··· 572 523 */ 573 524 { 574 525 /* \U+007F */ 575 - "\"\xC1\xBF\"", 576 - NULL, /* bug: rejected */ 577 - "\"\\uFFFD\"", 578 526 "\xC1\xBF", 527 + NULL, 528 + "\\uFFFD", 579 529 }, 580 530 { 581 531 /* \U+07FF */ 582 - "\"\xE0\x9F\xBF\"", 583 - "\xE0\x9F\xBF", /* bug: not corrected */ 584 - "\"\\uFFFD\"", 532 + "\xE0\x9F\xBF", 533 + NULL, 534 + "\\uFFFD", 585 535 }, 586 536 { 587 537 /* ··· 590 540 * noncharacter. Testing U+FFFC seems more useful. 
See 591 541 * also 2.2.3 592 542 */ 593 - "\"\xF0\x8F\xBF\xBC\"", 594 - "\xF0\x8F\xBF\xBC", /* bug: not corrected */ 595 - "\"\\uFFFD\"", 543 + "\xF0\x8F\xBF\xBC", 544 + NULL, 545 + "\\uFFFD", 596 546 }, 597 547 { 598 548 /* \U+1FFFFF */ 599 - "\"\xF8\x87\xBF\xBF\xBF\"", 600 - NULL, /* bug: rejected */ 601 - "\"\\uFFFD\"", 602 549 "\xF8\x87\xBF\xBF\xBF", 550 + NULL, 551 + "\\uFFFD", 603 552 }, 604 553 { 605 554 /* \U+3FFFFFF */ 606 - "\"\xFC\x83\xBF\xBF\xBF\xBF\"", 607 - NULL, /* bug: rejected */ 608 - "\"\\uFFFD\"", 609 555 "\xFC\x83\xBF\xBF\xBF\xBF", 556 + NULL, 557 + "\\uFFFD", 610 558 }, 611 559 /* 4.3 Overlong representation of the NUL character */ 612 560 { 613 561 /* \U+0000 */ 614 - "\"\xC0\x80\"", 615 - NULL, /* bug: rejected */ 616 - "\"\\u0000\"", 617 562 "\xC0\x80", 563 + "\xC0\x80", 564 + "\\u0000", 618 565 }, 619 566 { 620 567 /* \U+0000 */ 621 - "\"\xE0\x80\x80\"", 622 - "\xE0\x80\x80", /* bug: not corrected */ 623 - "\"\\uFFFD\"", 568 + "\xE0\x80\x80", 569 + NULL, 570 + "\\uFFFD", 624 571 }, 625 572 { 626 573 /* \U+0000 */ 627 - "\"\xF0\x80\x80\x80\"", 628 - "\xF0\x80\x80\x80", /* bug: not corrected */ 629 - "\"\\uFFFD\"", 574 + "\xF0\x80\x80\x80", 575 + NULL, 576 + "\\uFFFD", 630 577 }, 631 578 { 632 579 /* \U+0000 */ 633 - "\"\xF8\x80\x80\x80\x80\"", 634 - NULL, /* bug: rejected */ 635 - "\"\\uFFFD\"", 636 580 "\xF8\x80\x80\x80\x80", 581 + NULL, 582 + "\\uFFFD", 637 583 }, 638 584 { 639 585 /* \U+0000 */ 640 - "\"\xFC\x80\x80\x80\x80\x80\"", 641 - NULL, /* bug: rejected */ 642 - "\"\\uFFFD\"", 643 586 "\xFC\x80\x80\x80\x80\x80", 587 + NULL, 588 + "\\uFFFD", 644 589 }, 645 590 /* 5 Illegal code positions */ 646 591 /* 5.1 Single UTF-16 surrogates */ 647 592 { 648 593 /* \U+D800 */ 649 - "\"\xED\xA0\x80\"", 650 - "\xED\xA0\x80", /* bug: not corrected */ 651 - "\"\\uFFFD\"", 594 + "\xED\xA0\x80", 595 + NULL, 596 + "\\uFFFD", 652 597 }, 653 598 { 654 599 /* \U+DB7F */ 655 - "\"\xED\xAD\xBF\"", 656 - "\xED\xAD\xBF", /* bug: not corrected */ 657 - 
"\"\\uFFFD\"", 600 + "\xED\xAD\xBF", 601 + NULL, 602 + "\\uFFFD", 658 603 }, 659 604 { 660 605 /* \U+DB80 */ 661 - "\"\xED\xAE\x80\"", 662 - "\xED\xAE\x80", /* bug: not corrected */ 663 - "\"\\uFFFD\"", 606 + "\xED\xAE\x80", 607 + NULL, 608 + "\\uFFFD", 664 609 }, 665 610 { 666 611 /* \U+DBFF */ 667 - "\"\xED\xAF\xBF\"", 668 - "\xED\xAF\xBF", /* bug: not corrected */ 669 - "\"\\uFFFD\"", 612 + "\xED\xAF\xBF", 613 + NULL, 614 + "\\uFFFD", 670 615 }, 671 616 { 672 617 /* \U+DC00 */ 673 - "\"\xED\xB0\x80\"", 674 - "\xED\xB0\x80", /* bug: not corrected */ 675 - "\"\\uFFFD\"", 618 + "\xED\xB0\x80", 619 + NULL, 620 + "\\uFFFD", 676 621 }, 677 622 { 678 623 /* \U+DF80 */ 679 - "\"\xED\xBE\x80\"", 680 - "\xED\xBE\x80", /* bug: not corrected */ 681 - "\"\\uFFFD\"", 624 + "\xED\xBE\x80", 625 + NULL, 626 + "\\uFFFD", 682 627 }, 683 628 { 684 629 /* \U+DFFF */ 685 - "\"\xED\xBF\xBF\"", 686 - "\xED\xBF\xBF", /* bug: not corrected */ 687 - "\"\\uFFFD\"", 630 + "\xED\xBF\xBF", 631 + NULL, 632 + "\\uFFFD", 688 633 }, 689 634 /* 5.2 Paired UTF-16 surrogates */ 690 635 { 691 636 /* \U+D800\U+DC00 */ 692 - "\"\xED\xA0\x80\xED\xB0\x80\"", 693 - "\xED\xA0\x80\xED\xB0\x80", /* bug: not corrected */ 694 - "\"\\uFFFD\\uFFFD\"", 637 + "\xED\xA0\x80\xED\xB0\x80", 638 + NULL, 639 + "\\uFFFD\\uFFFD", 695 640 }, 696 641 { 697 642 /* \U+D800\U+DFFF */ 698 - "\"\xED\xA0\x80\xED\xBF\xBF\"", 699 - "\xED\xA0\x80\xED\xBF\xBF", /* bug: not corrected */ 700 - "\"\\uFFFD\\uFFFD\"", 643 + "\xED\xA0\x80\xED\xBF\xBF", 644 + NULL, 645 + "\\uFFFD\\uFFFD", 701 646 }, 702 647 { 703 648 /* \U+DB7F\U+DC00 */ 704 - "\"\xED\xAD\xBF\xED\xB0\x80\"", 705 - "\xED\xAD\xBF\xED\xB0\x80", /* bug: not corrected */ 706 - "\"\\uFFFD\\uFFFD\"", 649 + "\xED\xAD\xBF\xED\xB0\x80", 650 + NULL, 651 + "\\uFFFD\\uFFFD", 707 652 }, 708 653 { 709 654 /* \U+DB7F\U+DFFF */ 710 - "\"\xED\xAD\xBF\xED\xBF\xBF\"", 711 - "\xED\xAD\xBF\xED\xBF\xBF", /* bug: not corrected */ 712 - "\"\\uFFFD\\uFFFD\"", 655 + "\xED\xAD\xBF\xED\xBF\xBF", 656 + 
NULL, 657 + "\\uFFFD\\uFFFD", 713 658 }, 714 659 { 715 660 /* \U+DB80\U+DC00 */ 716 - "\"\xED\xAE\x80\xED\xB0\x80\"", 717 - "\xED\xAE\x80\xED\xB0\x80", /* bug: not corrected */ 718 - "\"\\uFFFD\\uFFFD\"", 661 + "\xED\xAE\x80\xED\xB0\x80", 662 + NULL, 663 + "\\uFFFD\\uFFFD", 719 664 }, 720 665 { 721 666 /* \U+DB80\U+DFFF */ 722 - "\"\xED\xAE\x80\xED\xBF\xBF\"", 723 - "\xED\xAE\x80\xED\xBF\xBF", /* bug: not corrected */ 724 - "\"\\uFFFD\\uFFFD\"", 667 + "\xED\xAE\x80\xED\xBF\xBF", 668 + NULL, 669 + "\\uFFFD\\uFFFD", 725 670 }, 726 671 { 727 672 /* \U+DBFF\U+DC00 */ 728 - "\"\xED\xAF\xBF\xED\xB0\x80\"", 729 - "\xED\xAF\xBF\xED\xB0\x80", /* bug: not corrected */ 730 - "\"\\uFFFD\\uFFFD\"", 673 + "\xED\xAF\xBF\xED\xB0\x80", 674 + NULL, 675 + "\\uFFFD\\uFFFD", 731 676 }, 732 677 { 733 678 /* \U+DBFF\U+DFFF */ 734 - "\"\xED\xAF\xBF\xED\xBF\xBF\"", 735 - "\xED\xAF\xBF\xED\xBF\xBF", /* bug: not corrected */ 736 - "\"\\uFFFD\\uFFFD\"", 679 + "\xED\xAF\xBF\xED\xBF\xBF", 680 + NULL, 681 + "\\uFFFD\\uFFFD", 737 682 }, 738 683 /* 5.3 Other illegal code positions */ 739 684 /* BMP noncharacters */ 740 685 { 741 686 /* \U+FFFE */ 742 - "\"\xEF\xBF\xBE\"", 743 - "\xEF\xBF\xBE", /* bug: not corrected */ 744 - "\"\\uFFFD\"", 687 + "\xEF\xBF\xBE", 688 + NULL, 689 + "\\uFFFD", 745 690 }, 746 691 { 747 692 /* \U+FFFF */ 748 - "\"\xEF\xBF\xBF\"", 749 - "\xEF\xBF\xBF", /* bug: not corrected */ 750 - "\"\\uFFFD\"", 693 + "\xEF\xBF\xBF", 694 + NULL, 695 + "\\uFFFD", 751 696 }, 752 697 { 753 698 /* U+FDD0 */ 754 - "\"\xEF\xB7\x90\"", 755 - "\xEF\xB7\x90", /* bug: not corrected */ 756 - "\"\\uFFFD\"", 699 + "\xEF\xB7\x90", 700 + NULL, 701 + "\\uFFFD", 757 702 }, 758 703 { 759 704 /* U+FDEF */ 760 - "\"\xEF\xB7\xAF\"", 761 - "\xEF\xB7\xAF", /* bug: not corrected */ 762 - "\"\\uFFFD\"", 705 + "\xEF\xB7\xAF", 706 + NULL, 707 + "\\uFFFD", 763 708 }, 764 709 /* Plane 1 .. 16 noncharacters */ 765 710 { 766 711 /* U+1FFFE U+1FFFF U+2FFFE U+2FFFF ... 
U+10FFFE U+10FFFF */ 767 - "\"\xF0\x9F\xBF\xBE\xF0\x9F\xBF\xBF" 768 - "\xF0\xAF\xBF\xBE\xF0\xAF\xBF\xBF" 769 - "\xF0\xBF\xBF\xBE\xF0\xBF\xBF\xBF" 770 - "\xF1\x8F\xBF\xBE\xF1\x8F\xBF\xBF" 771 - "\xF1\x9F\xBF\xBE\xF1\x9F\xBF\xBF" 772 - "\xF1\xAF\xBF\xBE\xF1\xAF\xBF\xBF" 773 - "\xF1\xBF\xBF\xBE\xF1\xBF\xBF\xBF" 774 - "\xF2\x8F\xBF\xBE\xF2\x8F\xBF\xBF" 775 - "\xF2\x9F\xBF\xBE\xF2\x9F\xBF\xBF" 776 - "\xF2\xAF\xBF\xBE\xF2\xAF\xBF\xBF" 777 - "\xF2\xBF\xBF\xBE\xF2\xBF\xBF\xBF" 778 - "\xF3\x8F\xBF\xBE\xF3\x8F\xBF\xBF" 779 - "\xF3\x9F\xBF\xBE\xF3\x9F\xBF\xBF" 780 - "\xF3\xAF\xBF\xBE\xF3\xAF\xBF\xBF" 781 - "\xF3\xBF\xBF\xBE\xF3\xBF\xBF\xBF" 782 - "\xF4\x8F\xBF\xBE\xF4\x8F\xBF\xBF\"", 783 - /* bug: not corrected */ 784 712 "\xF0\x9F\xBF\xBE\xF0\x9F\xBF\xBF" 785 713 "\xF0\xAF\xBF\xBE\xF0\xAF\xBF\xBF" 786 714 "\xF0\xBF\xBF\xBE\xF0\xBF\xBF\xBF" ··· 797 725 "\xF3\xAF\xBF\xBE\xF3\xAF\xBF\xBF" 798 726 "\xF3\xBF\xBF\xBE\xF3\xBF\xBF\xBF" 799 727 "\xF4\x8F\xBF\xBE\xF4\x8F\xBF\xBF", 800 - "\"\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 728 + NULL, 801 729 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 802 730 "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 803 - "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\"", 731 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD" 732 + "\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD\\uFFFD", 804 733 }, 805 734 {} 806 735 }; 807 - int i; 808 - QObject *obj; 736 + int i, j; 809 737 QString *str; 810 - const char *json_in, *utf8_out, *utf8_in, *json_out; 738 + const char *json_in, *utf8_out, *utf8_in, *json_out, *tail; 739 + char *end, *in, *jstr; 811 740 812 741 for (i = 0; test_cases[i].json_in; i++) { 813 - json_in = test_cases[i].json_in; 814 - utf8_out = test_cases[i].utf8_out; 815 - utf8_in = test_cases[i].utf8_in ?: test_cases[i].utf8_out; 816 - json_out = test_cases[i].json_out ?: test_cases[i].json_in; 742 + for (j = 0; j < 2; j++) { 743 + json_in = test_cases[i].json_in; 744 + 
utf8_out = test_cases[i].utf8_out; 745 + utf8_in = test_cases[i].utf8_out ?: test_cases[i].json_in; 746 + json_out = test_cases[i].json_out ?: test_cases[i].json_in; 817 747 818 - obj = qobject_from_json(json_in, utf8_out ? &error_abort : NULL); 819 - if (utf8_out) { 820 - str = qobject_to(QString, obj); 821 - g_assert(str); 822 - g_assert_cmpstr(qstring_get_str(str), ==, utf8_out); 823 - } else { 824 - g_assert(!obj); 825 - } 826 - qobject_unref(obj); 748 + /* Parse @json_in, expect @utf8_out */ 749 + if (utf8_out) { 750 + str = from_json_str(json_in, j, &error_abort); 751 + g_assert_cmpstr(qstring_get_try_str(str), ==, utf8_out); 752 + qobject_unref(str); 753 + } else { 754 + str = from_json_str(json_in, j, NULL); 755 + g_assert(!str); 756 + /* 757 + * Failure may be due to any sequence, but *all* sequences 758 + * are expected to fail. Test each one in isolation. 759 + */ 760 + for (tail = json_in; *tail; tail = end) { 761 + mod_utf8_codepoint(tail, 6, &end); 762 + if (*end == ' ') { 763 + end++; 764 + } 765 + in = strndup(tail, end - tail); 766 + str = from_json_str(in, j, NULL); 767 + g_assert(!str); 768 + g_free(in); 769 + } 770 + } 827 771 828 - obj = QOBJECT(qstring_from_str(utf8_in)); 829 - str = qobject_to_json(obj); 830 - if (json_out) { 831 - g_assert(str); 832 - g_assert_cmpstr(qstring_get_str(str), ==, json_out); 833 - } else { 834 - g_assert(!str); 835 - } 836 - qobject_unref(str); 837 - qobject_unref(obj); 772 + /* Unparse @utf8_in, expect @json_out */ 773 + str = qstring_from_str(utf8_in); 774 + jstr = to_json_str(str); 775 + g_assert_cmpstr(jstr, ==, json_out); 776 + qobject_unref(str); 777 + g_free(jstr); 838 778 839 - /* 840 - * Disabled, because qobject_from_json() is buggy, and I can't 841 - * be bothered to add the expected incorrect results. 842 - * FIXME Enable once these bugs have been fixed. 
843 - */ 844 - if (0 && json_out != json_in) { 845 - obj = qobject_from_json(json_out, &error_abort); 846 - str = qobject_to(QString, obj); 847 - g_assert(str); 848 - g_assert_cmpstr(qstring_get_str(str), ==, utf8_out); 779 + /* Parse @json_out right back, unless it has replacements */ 780 + if (!strstr(json_out, "\\uFFFD")) { 781 + str = from_json_str(json_out, j, &error_abort); 782 + g_assert_cmpstr(qstring_get_try_str(str), ==, utf8_in); 783 + } 849 784 } 850 - } 851 - } 852 - 853 - static void vararg_string(void) 854 - { 855 - int i; 856 - struct { 857 - const char *decoded; 858 - } test_cases[] = { 859 - { "hello world" }, 860 - { "the quick brown fox jumped over the fence" }, 861 - {} 862 - }; 863 - 864 - for (i = 0; test_cases[i].decoded; i++) { 865 - QString *str; 866 - 867 - str = qobject_to(QString, 868 - qobject_from_jsonf_nofail("%s", 869 - test_cases[i].decoded)); 870 - g_assert(str); 871 - g_assert(strcmp(qstring_get_str(str), test_cases[i].decoded) == 0); 872 - 873 - qobject_unref(str); 874 785 } 875 786 } 876 787 ··· 991 902 } 992 903 } 993 904 994 - static void vararg_number(void) 995 - { 996 - QNum *qnum; 997 - int value = 0x2342; 998 - long long value_ll = 0x2342342343LL; 999 - double valuef = 2.323423423; 1000 - int64_t val; 1001 - 1002 - qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%d", value)); 1003 - g_assert(qnum_get_try_int(qnum, &val)); 1004 - g_assert_cmpint(val, ==, value); 1005 - qobject_unref(qnum); 1006 - 1007 - qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%lld", value_ll)); 1008 - g_assert(qnum_get_try_int(qnum, &val)); 1009 - g_assert_cmpint(val, ==, value_ll); 1010 - qobject_unref(qnum); 1011 - 1012 - qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%f", valuef)); 1013 - g_assert(qnum_get_double(qnum) == valuef); 1014 - qobject_unref(qnum); 1015 - } 1016 - 1017 905 static void keyword_literal(void) 1018 906 { 1019 907 QObject *obj; ··· 1043 931 1044 932 qobject_unref(qbool); 1045 933 934 + obj = 
qobject_from_json("null", &error_abort); 935 + g_assert(obj != NULL); 936 + g_assert(qobject_type(obj) == QTYPE_QNULL); 937 + 938 + null = qnull(); 939 + g_assert(QOBJECT(null) == obj); 940 + 941 + qobject_unref(obj); 942 + qobject_unref(null); 943 + } 944 + 945 + static void interpolation_valid(void) 946 + { 947 + long long value_lld = 0x123456789abcdefLL; 948 + int64_t value_d64 = value_lld; 949 + long value_ld = (long)value_lld; 950 + int value_d = (int)value_lld; 951 + unsigned long long value_llu = 0xfedcba9876543210ULL; 952 + uint64_t value_u64 = value_llu; 953 + unsigned long value_lu = (unsigned long)value_llu; 954 + unsigned value_u = (unsigned)value_llu; 955 + double value_f = 2.323423423; 956 + const char *value_s = "hello world"; 957 + QObject *value_p = QOBJECT(qnull()); 958 + QBool *qbool; 959 + QNum *qnum; 960 + QString *qstr; 961 + QObject *qobj; 962 + 963 + /* bool */ 964 + 1046 965 qbool = qobject_to(QBool, qobject_from_jsonf_nofail("%i", false)); 1047 966 g_assert(qbool); 1048 967 g_assert(qbool_get_bool(qbool) == false); ··· 1054 973 g_assert(qbool_get_bool(qbool) == true); 1055 974 qobject_unref(qbool); 1056 975 1057 - obj = qobject_from_json("null", &error_abort); 1058 - g_assert(obj != NULL); 1059 - g_assert(qobject_type(obj) == QTYPE_QNULL); 976 + /* number */ 977 + 978 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%d", value_d)); 979 + g_assert_cmpint(qnum_get_int(qnum), ==, value_d); 980 + qobject_unref(qnum); 981 + 982 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%ld", value_ld)); 983 + g_assert_cmpint(qnum_get_int(qnum), ==, value_ld); 984 + qobject_unref(qnum); 985 + 986 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%lld", value_lld)); 987 + g_assert_cmpint(qnum_get_int(qnum), ==, value_lld); 988 + qobject_unref(qnum); 989 + 990 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%" PRId64, value_d64)); 991 + g_assert_cmpint(qnum_get_int(qnum), ==, value_lld); 992 + qobject_unref(qnum); 993 + 994 + qnum = 
qobject_to(QNum, qobject_from_jsonf_nofail("%u", value_u)); 995 + g_assert_cmpuint(qnum_get_uint(qnum), ==, value_u); 996 + qobject_unref(qnum); 997 + 998 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%lu", value_lu)); 999 + g_assert_cmpuint(qnum_get_uint(qnum), ==, value_lu); 1000 + qobject_unref(qnum); 1001 + 1002 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%llu", value_llu)); 1003 + g_assert_cmpuint(qnum_get_uint(qnum), ==, value_llu); 1004 + qobject_unref(qnum); 1005 + 1006 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%" PRIu64, value_u64)); 1007 + g_assert_cmpuint(qnum_get_uint(qnum), ==, value_llu); 1008 + qobject_unref(qnum); 1009 + 1010 + qnum = qobject_to(QNum, qobject_from_jsonf_nofail("%f", value_f)); 1011 + g_assert(qnum_get_double(qnum) == value_f); 1012 + qobject_unref(qnum); 1013 + 1014 + /* string */ 1015 + 1016 + qstr = qobject_to(QString, 1017 + qobject_from_jsonf_nofail("%s", value_s)); 1018 + g_assert_cmpstr(qstring_get_try_str(qstr), ==, value_s); 1019 + qobject_unref(qstr); 1020 + 1021 + /* object */ 1022 + 1023 + qobj = qobject_from_jsonf_nofail("%p", value_p); 1024 + g_assert(qobj == value_p); 1025 + } 1060 1026 1061 - null = qnull(); 1062 - g_assert(QOBJECT(null) == obj); 1027 + static void interpolation_unknown(void) 1028 + { 1029 + if (g_test_subprocess()) { 1030 + qobject_from_jsonf_nofail("%x", 666); 1031 + } 1032 + g_test_trap_subprocess(NULL, 0, 0); 1033 + g_test_trap_assert_failed(); 1034 + g_test_trap_assert_stderr("*Unexpected error*" 1035 + "invalid interpolation '%x'*"); 1036 + } 1063 1037 1064 - qobject_unref(obj); 1065 - qobject_unref(null); 1038 + static void interpolation_string(void) 1039 + { 1040 + if (g_test_subprocess()) { 1041 + qobject_from_jsonf_nofail("['%s', %s]", "eins", "zwei"); 1042 + } 1043 + g_test_trap_subprocess(NULL, 0, 0); 1044 + g_test_trap_assert_failed(); 1045 + g_test_trap_assert_stderr("*Unexpected error*" 1046 + "can't interpolate into string*"); 1066 1047 } 1067 1048 1068 1049 
static void simple_dict(void) ··· 1236 1217 })), 1237 1218 }, 1238 1219 { 1239 - .encoded = " [ 43 , { 'h' : 'b' }, [ ], 42 ]", 1220 + .encoded = "\t[ 43 , { 'h' : 'b' },\r\n\t[ ], 42 ]\n", 1240 1221 .decoded = QLIT_QLIST(((QLitObject[]){ 1241 1222 QLIT_QNUM(43), 1242 1223 QLIT_QDICT(((QLitDictEntry[]){ ··· 1283 1264 } 1284 1265 } 1285 1266 1286 - static void simple_varargs(void) 1267 + static void simple_interpolation(void) 1287 1268 { 1288 1269 QObject *embedded_obj; 1289 1270 QObject *obj; 1290 1271 QLitObject decoded = QLIT_QLIST(((QLitObject[]){ 1291 1272 QLIT_QNUM(1), 1292 - QLIT_QNUM(2), 1273 + QLIT_QSTR("100%"), 1293 1274 QLIT_QLIST(((QLitObject[]){ 1294 1275 QLIT_QNUM(32), 1295 1276 QLIT_QNUM(42), ··· 1299 1280 embedded_obj = qobject_from_json("[32, 42]", &error_abort); 1300 1281 g_assert(embedded_obj != NULL); 1301 1282 1302 - obj = qobject_from_jsonf_nofail("[%d, 2, %p]", 1, embedded_obj); 1283 + obj = qobject_from_jsonf_nofail("[%d, '100%%', %p]", 1, embedded_obj); 1303 1284 g_assert(qlit_equal_qobject(&decoded, obj)); 1304 1285 1305 1286 qobject_unref(obj); ··· 1307 1288 1308 1289 static void empty_input(void) 1309 1290 { 1310 - const char *empty = ""; 1311 - QObject *obj = qobject_from_json(empty, &error_abort); 1291 + Error *err = NULL; 1292 + QObject *obj; 1293 + 1294 + obj = qobject_from_json("", &err); 1295 + error_free_or_abort(&err); 1296 + g_assert(obj == NULL); 1297 + } 1298 + 1299 + static void blank_input(void) 1300 + { 1301 + Error *err = NULL; 1302 + QObject *obj; 1303 + 1304 + obj = qobject_from_json("\n ", &err); 1305 + error_free_or_abort(&err); 1306 + g_assert(obj == NULL); 1307 + } 1308 + 1309 + static void junk_input(void) 1310 + { 1311 + /* Note: junk within strings is covered elsewhere */ 1312 + Error *err = NULL; 1313 + QObject *obj; 1314 + 1315 + obj = qobject_from_json("@", &err); 1316 + error_free_or_abort(&err); 1317 + g_assert(obj == NULL); 1318 + 1319 + obj = qobject_from_json("{\x01", &err); 1320 + 
error_free_or_abort(&err); 1321 + g_assert(obj == NULL); 1322 + 1323 + obj = qobject_from_json("[0\xFF]", &err); 1324 + error_free_or_abort(&err); 1325 + g_assert(obj == NULL); 1326 + 1327 + obj = qobject_from_json("00", &err); 1328 + error_free_or_abort(&err); 1329 + g_assert(obj == NULL); 1330 + 1331 + obj = qobject_from_json("[1e", &err); 1332 + error_free_or_abort(&err); 1333 + g_assert(obj == NULL); 1334 + 1335 + obj = qobject_from_json("truer", &err); 1336 + error_free_or_abort(&err); 1312 1337 g_assert(obj == NULL); 1313 1338 } 1314 1339 ··· 1316 1341 { 1317 1342 Error *err = NULL; 1318 1343 QObject *obj = qobject_from_json("\"abc", &err); 1319 - g_assert(!err); /* BUG */ 1344 + error_free_or_abort(&err); 1320 1345 g_assert(obj == NULL); 1321 1346 } 1322 1347 ··· 1324 1349 { 1325 1350 Error *err = NULL; 1326 1351 QObject *obj = qobject_from_json("'abc", &err); 1327 - g_assert(!err); /* BUG */ 1352 + error_free_or_abort(&err); 1328 1353 g_assert(obj == NULL); 1329 1354 } 1330 1355 ··· 1332 1357 { 1333 1358 Error *err = NULL; 1334 1359 QObject *obj = qobject_from_json("\"abc\\\"", &err); 1335 - g_assert(!err); /* BUG */ 1360 + error_free_or_abort(&err); 1336 1361 g_assert(obj == NULL); 1337 1362 } 1338 1363 ··· 1340 1365 { 1341 1366 Error *err = NULL; 1342 1367 QObject *obj = qobject_from_json("[32", &err); 1343 - g_assert(!err); /* BUG */ 1368 + error_free_or_abort(&err); 1344 1369 g_assert(obj == NULL); 1345 1370 } 1346 1371 ··· 1348 1373 { 1349 1374 Error *err = NULL; 1350 1375 QObject *obj = qobject_from_json("[32,", &err); 1351 - g_assert(!err); /* BUG */ 1376 + error_free_or_abort(&err); 1352 1377 g_assert(obj == NULL); 1353 1378 } 1354 1379 ··· 1364 1389 { 1365 1390 Error *err = NULL; 1366 1391 QObject *obj = qobject_from_json("{'abc':32", &err); 1367 - g_assert(!err); /* BUG */ 1392 + error_free_or_abort(&err); 1368 1393 g_assert(obj == NULL); 1369 1394 } 1370 1395 ··· 1372 1397 { 1373 1398 Error *err = NULL; 1374 1399 QObject *obj = 
qobject_from_json("{'abc':32,", &err); 1375 - g_assert(!err); /* BUG */ 1400 + error_free_or_abort(&err); 1376 1401 g_assert(obj == NULL); 1377 1402 } 1378 1403 ··· 1418 1443 g_assert(obj == NULL); 1419 1444 } 1420 1445 1446 + static void multiple_values(void) 1447 + { 1448 + Error *err = NULL; 1449 + QObject *obj; 1450 + 1451 + obj = qobject_from_json("false true", &err); 1452 + error_free_or_abort(&err); 1453 + g_assert(obj == NULL); 1454 + 1455 + obj = qobject_from_json("} true", &err); 1456 + error_free_or_abort(&err); 1457 + g_assert(obj == NULL); 1458 + } 1459 + 1421 1460 int main(int argc, char **argv) 1422 1461 { 1423 1462 g_test_init(&argc, &argv, NULL); 1424 1463 1425 - g_test_add_func("/literals/string/simple", simple_string); 1426 1464 g_test_add_func("/literals/string/escaped", escaped_string); 1465 + g_test_add_func("/literals/string/quotes", string_with_quotes); 1427 1466 g_test_add_func("/literals/string/utf8", utf8_string); 1428 - g_test_add_func("/literals/string/single_quote", single_quote_string); 1429 - g_test_add_func("/literals/string/vararg", vararg_string); 1430 1467 1431 1468 g_test_add_func("/literals/number/simple", simple_number); 1432 1469 g_test_add_func("/literals/number/large", large_number); 1433 1470 g_test_add_func("/literals/number/float", float_number); 1434 - g_test_add_func("/literals/number/vararg", vararg_number); 1435 1471 1436 1472 g_test_add_func("/literals/keyword", keyword_literal); 1437 1473 1474 + g_test_add_func("/literals/interpolation/valid", interpolation_valid); 1475 + g_test_add_func("/literals/interpolation/unkown", interpolation_unknown); 1476 + g_test_add_func("/literals/interpolation/string", interpolation_string); 1477 + 1438 1478 g_test_add_func("/dicts/simple_dict", simple_dict); 1439 1479 g_test_add_func("/dicts/large_dict", large_dict); 1440 1480 g_test_add_func("/lists/simple_list", simple_list); 1441 1481 1442 - g_test_add_func("/whitespace/simple_whitespace", simple_whitespace); 1443 - 1444 - 
g_test_add_func("/varargs/simple_varargs", simple_varargs); 1482 + g_test_add_func("/mixed/simple_whitespace", simple_whitespace); 1483 + g_test_add_func("/mixed/interpolation", simple_interpolation); 1445 1484 1446 - g_test_add_func("/errors/empty_input", empty_input); 1485 + g_test_add_func("/errors/empty", empty_input); 1486 + g_test_add_func("/errors/blank", blank_input); 1487 + g_test_add_func("/errors/junk", junk_input); 1447 1488 g_test_add_func("/errors/unterminated/string", unterminated_string); 1448 1489 g_test_add_func("/errors/unterminated/escape", unterminated_escape); 1449 1490 g_test_add_func("/errors/unterminated/sq_string", unterminated_sq_string); ··· 1455 1496 g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma); 1456 1497 g_test_add_func("/errors/unterminated/literal", unterminated_literal); 1457 1498 g_test_add_func("/errors/limits/nesting", limits_nesting); 1499 + g_test_add_func("/errors/multiple_values", multiple_values); 1458 1500 1459 1501 return g_test_run(); 1460 1502 }
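Reviewer note: the interpolation tests above pin down three rules: a known conversion outside a string is accepted, `%%` inside a string stands for a literal percent sign, and any other `%` inside a string (or an unknown conversion such as `%x`) is rejected. The sketch below restates those rules as a standalone checker. It is not QEMU's lexer or parser; `check_template` is a hypothetical name, and the conversion list covers only the specifiers visible in this hunk.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT QEMU code: validate '%' usage in a JSON
 * template.  Returns 0 when every '%' is acceptable, -1 otherwise.
 */
int check_template(const char *fmt)
{
    /* Longest specifiers first so "%llu" is not mistaken for "%lu". */
    static const char *const specs[] = { "llu", "lu", "d", "u", "f", "s", "p" };
    const size_t nspecs = sizeof(specs) / sizeof(specs[0]);
    char quote = 0;              /* 0 outside strings, else the open quote */
    const char *p;
    size_t i;

    for (p = fmt; *p; p++) {
        if (quote) {
            if (*p == '\\' && p[1]) {
                p++;             /* skip the escaped character */
            } else if (*p == quote) {
                quote = 0;       /* string ends */
            } else if (*p == '%') {
                if (p[1] != '%') {
                    return -1;   /* can't interpolate into a string */
                }
                p++;             /* "%%" is a literal percent sign */
            }
            continue;
        }
        if (*p == '\'' || *p == '"') {
            quote = *p;
            continue;
        }
        if (*p != '%') {
            continue;
        }
        for (i = 0; i < nspecs; i++) {
            size_t len = strlen(specs[i]);
            if (!strncmp(p + 1, specs[i], len)) {
                p += len;        /* consume the conversion */
                break;
            }
        }
        if (i == nspecs) {
            return -1;           /* unknown interpolation */
        }
    }
    return 0;
}
```

With this sketch, the template used by `simple_interpolation()` passes, while the templates used by `interpolation_string()` and `interpolation_unknown()` are rejected.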
+6 -2
tests/drive_del-test.c
···

 static void test_after_failed_device_add(void)
 {
+    char driver[32];
     QDict *response;
     QDict *error;
+
+    snprintf(driver, sizeof(driver), "virtio-blk-%s",
+             qvirtio_get_dev_type());

     qtest_start("-drive if=none,id=drive0");

···
      */
     response = qmp("{'execute': 'device_add',"
                    " 'arguments': {"
-                   " 'driver': 'virtio-blk-%s',"
+                   " 'driver': %s,"
                    " 'drive': 'drive0'"
-                   "}}", qvirtio_get_dev_type());
+                   "}}", driver);
     g_assert(response);
     error = qdict_get_qdict(response, "error");
     g_assert_cmpstr(qdict_get_try_str(error, "class"), ==, "GenericError");
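Reviewer note: since interpolating into the middle of a JSON string is now rejected, the test builds the complete driver name first and passes it as a whole `%s` value. A minimal standalone sketch of that first step follows; `build_driver_name` is a hypothetical helper, and the device type strings in the usage are example values only.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the snprintf() call in the hunk above.
 * Returns 0 on success, -1 if the name was truncated or formatting failed.
 */
int build_driver_name(char *buf, size_t size, const char *dev_type)
{
    int n = snprintf(buf, size, "virtio-blk-%s", dev_type);

    /* snprintf() returns the length it wanted; >= size means truncation */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Unlike the test, which ignores the `snprintf()` result, this sketch reports truncation so a too-small buffer is not silently sent as a wrong driver name.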
+39 -18
tests/libqtest.c
··· 21 21 #include <sys/un.h> 22 22 23 23 #include "libqtest.h" 24 + #include "qemu-common.h" 24 25 #include "qemu/cutils.h" 25 26 #include "qapi/error.h" 26 27 #include "qapi/qmp/json-parser.h" 27 - #include "qapi/qmp/json-streamer.h" 28 28 #include "qapi/qmp/qdict.h" 29 29 #include "qapi/qmp/qjson.h" 30 30 #include "qapi/qmp/qlist.h" ··· 446 446 QDict *response; 447 447 } QMPResponseParser; 448 448 449 - static void qmp_response(JSONMessageParser *parser, GQueue *tokens) 449 + static void qmp_response(void *opaque, QObject *obj, Error *err) 450 450 { 451 - QMPResponseParser *qmp = container_of(parser, QMPResponseParser, parser); 452 - QObject *obj; 451 + QMPResponseParser *qmp = opaque; 452 + 453 + assert(!obj != !err); 453 454 454 - obj = json_parser_parse(tokens, NULL); 455 - if (!obj) { 456 - fprintf(stderr, "QMP JSON response parsing failed\n"); 455 + if (err) { 456 + error_prepend(&err, "QMP JSON response parsing failed: "); 457 + error_report_err(err); 457 458 abort(); 458 459 } 459 460 ··· 468 469 bool log = getenv("QTEST_LOG") != NULL; 469 470 470 471 qmp.response = NULL; 471 - json_message_parser_init(&qmp.parser, qmp_response); 472 + json_message_parser_init(&qmp.parser, qmp_response, &qmp, NULL); 472 473 while (!qmp.response) { 473 474 ssize_t len; 474 475 char c; ··· 506 507 void qmp_fd_vsend(int fd, const char *fmt, va_list ap) 507 508 { 508 509 QObject *qobj; 509 - 510 - /* 511 - * qobject_from_vjsonf_nofail() chokes on leading 0xff as invalid 512 - * JSON, but tests/test-qga.c needs to send that to test QGA 513 - * synchronization 514 - */ 515 - if (*fmt == '\377') { 516 - socket_send(fd, fmt, 1); 517 - fmt++; 518 - } 519 510 520 511 /* Going through qobject ensures we escape strings properly */ 521 512 qobj = qobject_from_vjsonf_nofail(fmt, ap); ··· 601 592 602 593 va_start(ap, fmt); 603 594 qtest_qmp_vsend(s, fmt, ap); 595 + va_end(ap); 596 + } 597 + 598 + void qmp_fd_vsend_raw(int fd, const char *fmt, va_list ap) 599 + { 600 + bool log = 
getenv("QTEST_LOG") != NULL; 601 + char *str = g_strdup_vprintf(fmt, ap); 602 + 603 + if (log) { 604 + fprintf(stderr, "%s", str); 605 + } 606 + socket_send(fd, str, strlen(str)); 607 + g_free(str); 608 + } 609 + 610 + void qmp_fd_send_raw(int fd, const char *fmt, ...) 611 + { 612 + va_list ap; 613 + 614 + va_start(ap, fmt); 615 + qmp_fd_vsend_raw(fd, fmt, ap); 616 + va_end(ap); 617 + } 618 + 619 + void qtest_qmp_send_raw(QTestState *s, const char *fmt, ...) 620 + { 621 + va_list ap; 622 + 623 + va_start(ap, fmt); 624 + qmp_fd_vsend_raw(s->qmp_fd, fmt, ap); 604 625 va_end(ap); 605 626 } 606 627
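Reviewer note: `qmp_fd_vsend_raw()` formats with plain printf semantics and writes the bytes unchanged, bypassing the JSON layer entirely, which is what allows negative tests to send invalid input. The sketch below shows the same idea with only libc and POSIX calls. `fd_send_raw` is a hypothetical stand-in, not the libqtest function; it omits the `QTEST_LOG` logging and does not retry on `EINTR`.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a "send raw" helper: format like printf(),
 * then write the result to fd verbatim.  Returns the number of bytes
 * sent, or -1 on error.
 */
int fd_send_raw(int fd, const char *fmt, ...)
{
    va_list ap, ap2;
    char *str;
    int len;
    ssize_t done = 0;

    va_start(ap, fmt);
    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap);      /* first pass: measure */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return -1;
    }

    str = malloc((size_t)len + 1);
    if (!str) {
        va_end(ap2);
        return -1;
    }
    vsnprintf(str, (size_t)len + 1, fmt, ap2);  /* second pass: format */
    va_end(ap2);

    while (done < len) {                    /* handle short writes */
        ssize_t n = write(fd, str + done, (size_t)(len - done));
        if (n < 0) {
            free(str);
            return -1;
        }
        done += n;
    }
    free(str);
    return len;
}
```

Because the text goes through printf formatting, a literal percent sign still has to be written as `%%` by the caller.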
+13
tests/libqtest.h
···
     GCC_FMT_ATTR(2, 3);

 /**
+ * qtest_qmp_send_raw:
+ * @s: #QTestState instance to operate on.
+ * @fmt...: text to send, formatted like sprintf()
+ *
+ * Sends text to the QMP monitor verbatim.  Need not be valid JSON;
+ * this is useful for negative tests.
+ */
+void qtest_qmp_send_raw(QTestState *s, const char *fmt, ...)
+    GCC_FMT_ATTR(2, 3);
+
+/**
  * qtest_qmpv:
  * @s: #QTestState instance to operate on.
  * @fmt: QMP message to send to QEMU, formatted like
···
 QDict *qmp_fd_receive(int fd);
 void qmp_fd_vsend(int fd, const char *fmt, va_list ap) GCC_FMT_ATTR(2, 0);
 void qmp_fd_send(int fd, const char *fmt, ...) GCC_FMT_ATTR(2, 3);
+void qmp_fd_send_raw(int fd, const char *fmt, ...) GCC_FMT_ATTR(2, 3);
+void qmp_fd_vsend_raw(int fd, const char *fmt, va_list ap) GCC_FMT_ATTR(2, 0);
 QDict *qmp_fdv(int fd, const char *fmt, va_list ap) GCC_FMT_ATTR(2, 0);
 QDict *qmp_fd(int fd, const char *fmt, ...) GCC_FMT_ATTR(2, 3);

+213
tests/qmp-cmd-test.c
··· 1 + /* 2 + * QMP command test cases 3 + * 4 + * Copyright (c) 2017 Red Hat Inc. 5 + * 6 + * Authors: 7 + * Markus Armbruster <armbru@redhat.com> 8 + * 9 + * This work is licensed under the terms of the GNU GPL, version 2 or later. 10 + * See the COPYING file in the top-level directory. 11 + */ 12 + 13 + #include "qemu/osdep.h" 14 + #include "libqtest.h" 15 + #include "qapi/error.h" 16 + #include "qapi/qapi-visit-introspect.h" 17 + #include "qapi/qmp/qdict.h" 18 + #include "qapi/qobject-input-visitor.h" 19 + 20 + const char common_args[] = "-nodefaults -machine none"; 21 + 22 + /* Query smoke tests */ 23 + 24 + static int query_error_class(const char *cmd) 25 + { 26 + static struct { 27 + const char *cmd; 28 + int err_class; 29 + } fails[] = { 30 + /* Success depends on build configuration: */ 31 + #ifndef CONFIG_SPICE 32 + { "query-spice", ERROR_CLASS_COMMAND_NOT_FOUND }, 33 + #endif 34 + #ifndef CONFIG_VNC 35 + { "query-vnc", ERROR_CLASS_GENERIC_ERROR }, 36 + { "query-vnc-servers", ERROR_CLASS_GENERIC_ERROR }, 37 + #endif 38 + #ifndef CONFIG_REPLICATION 39 + { "query-xen-replication-status", ERROR_CLASS_COMMAND_NOT_FOUND }, 40 + #endif 41 + /* Likewise, and require special QEMU command-line arguments: */ 42 + { "query-acpi-ospm-status", ERROR_CLASS_GENERIC_ERROR }, 43 + { "query-balloon", ERROR_CLASS_DEVICE_NOT_ACTIVE }, 44 + { "query-hotpluggable-cpus", ERROR_CLASS_GENERIC_ERROR }, 45 + { "query-vm-generation-id", ERROR_CLASS_GENERIC_ERROR }, 46 + { NULL, -1 } 47 + }; 48 + int i; 49 + 50 + for (i = 0; fails[i].cmd; i++) { 51 + if (!strcmp(cmd, fails[i].cmd)) { 52 + return fails[i].err_class; 53 + } 54 + } 55 + return -1; 56 + } 57 + 58 + static void test_query(const void *data) 59 + { 60 + const char *cmd = data; 61 + int expected_error_class = query_error_class(cmd); 62 + QDict *resp, *error; 63 + const char *error_class; 64 + 65 + qtest_start(common_args); 66 + 67 + resp = qmp("{ 'execute': %s }", cmd); 68 + error = qdict_get_qdict(resp, "error"); 69 + 
error_class = error ? qdict_get_str(error, "class") : NULL; 70 + 71 + if (expected_error_class < 0) { 72 + g_assert(qdict_haskey(resp, "return")); 73 + } else { 74 + g_assert(error); 75 + g_assert_cmpint(qapi_enum_parse(&QapiErrorClass_lookup, error_class, 76 + -1, &error_abort), 77 + ==, expected_error_class); 78 + } 79 + qobject_unref(resp); 80 + 81 + qtest_end(); 82 + } 83 + 84 + static bool query_is_blacklisted(const char *cmd) 85 + { 86 + const char *blacklist[] = { 87 + /* Not actually queries: */ 88 + "add-fd", 89 + /* Success depends on target arch: */ 90 + "query-cpu-definitions", /* arm, i386, ppc, s390x */ 91 + "query-gic-capabilities", /* arm */ 92 + /* Success depends on target-specific build configuration: */ 93 + "query-pci", /* CONFIG_PCI */ 94 + /* Success depends on launching SEV guest */ 95 + "query-sev-launch-measure", 96 + /* Success depends on Host or Hypervisor SEV support */ 97 + "query-sev", 98 + "query-sev-capabilities", 99 + NULL 100 + }; 101 + int i; 102 + 103 + for (i = 0; blacklist[i]; i++) { 104 + if (!strcmp(cmd, blacklist[i])) { 105 + return true; 106 + } 107 + } 108 + return false; 109 + } 110 + 111 + typedef struct { 112 + SchemaInfoList *list; 113 + GHashTable *hash; 114 + } QmpSchema; 115 + 116 + static void qmp_schema_init(QmpSchema *schema) 117 + { 118 + QDict *resp; 119 + Visitor *qiv; 120 + SchemaInfoList *tail; 121 + 122 + qtest_start(common_args); 123 + resp = qmp("{ 'execute': 'query-qmp-schema' }"); 124 + 125 + qiv = qobject_input_visitor_new(qdict_get(resp, "return")); 126 + visit_type_SchemaInfoList(qiv, NULL, &schema->list, &error_abort); 127 + visit_free(qiv); 128 + 129 + qobject_unref(resp); 130 + qtest_end(); 131 + 132 + schema->hash = g_hash_table_new(g_str_hash, g_str_equal); 133 + 134 + /* Build @schema: hash table mapping entity name to SchemaInfo */ 135 + for (tail = schema->list; tail; tail = tail->next) { 136 + g_hash_table_insert(schema->hash, tail->value->name, tail->value); 137 + } 138 + } 139 + 140 + 
static SchemaInfo *qmp_schema_lookup(QmpSchema *schema, const char *name) 141 + { 142 + return g_hash_table_lookup(schema->hash, name); 143 + } 144 + 145 + static void qmp_schema_cleanup(QmpSchema *schema) 146 + { 147 + qapi_free_SchemaInfoList(schema->list); 148 + g_hash_table_destroy(schema->hash); 149 + } 150 + 151 + static bool object_type_has_mandatory_members(SchemaInfo *type) 152 + { 153 + SchemaInfoObjectMemberList *tail; 154 + 155 + g_assert(type->meta_type == SCHEMA_META_TYPE_OBJECT); 156 + 157 + for (tail = type->u.object.members; tail; tail = tail->next) { 158 + if (!tail->value->has_q_default) { 159 + return true; 160 + } 161 + } 162 + 163 + return false; 164 + } 165 + 166 + static void add_query_tests(QmpSchema *schema) 167 + { 168 + SchemaInfoList *tail; 169 + SchemaInfo *si, *arg_type, *ret_type; 170 + char *test_name; 171 + 172 + /* Test the query-like commands */ 173 + for (tail = schema->list; tail; tail = tail->next) { 174 + si = tail->value; 175 + if (si->meta_type != SCHEMA_META_TYPE_COMMAND) { 176 + continue; 177 + } 178 + 179 + if (query_is_blacklisted(si->name)) { 180 + continue; 181 + } 182 + 183 + arg_type = qmp_schema_lookup(schema, si->u.command.arg_type); 184 + if (object_type_has_mandatory_members(arg_type)) { 185 + continue; 186 + } 187 + 188 + ret_type = qmp_schema_lookup(schema, si->u.command.ret_type); 189 + if (ret_type->meta_type == SCHEMA_META_TYPE_OBJECT 190 + && !ret_type->u.object.members) { 191 + continue; 192 + } 193 + 194 + test_name = g_strdup_printf("qmp/%s", si->name); 195 + qtest_add_data_func(test_name, si->name, test_query); 196 + g_free(test_name); 197 + } 198 + } 199 + 200 + int main(int argc, char *argv[]) 201 + { 202 + QmpSchema schema; 203 + int ret; 204 + 205 + g_test_init(&argc, &argv, NULL); 206 + 207 + qmp_schema_init(&schema); 208 + add_query_tests(&schema); 209 + ret = g_test_run(); 210 + 211 + qmp_schema_cleanup(&schema); 212 + return ret; 213 + }
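Reviewer note: `add_query_tests()` registers a smoke test for a command only if it passes three filters: it is not on the skip list, its argument type has no mandatory members, and its return type is not an empty object. The sketch below condenses that selection into one predicate. The `struct cmd_info` type and both function names are hypothetical simplifications of the schema walk, and the skip list is a two-entry excerpt of the one in the file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, flattened stand-in for what the schema walk derives. */
struct cmd_info {
    const char *name;
    bool has_mandatory_args;   /* some argument lacks a default */
    bool returns_nothing;      /* return type is an object with no members */
};

/* Excerpt of the skip list; NULL-terminated like the original array. */
static bool is_skipped(const char *name)
{
    static const char *const skip[] = { "add-fd", "query-pci", NULL };
    size_t i;

    for (i = 0; skip[i]; i++) {
        if (!strcmp(name, skip[i])) {
            return true;
        }
    }
    return false;
}

/* True if the command can be run with no arguments and yields a result. */
bool is_query_like(const struct cmd_info *cmd)
{
    return !is_skipped(cmd->name)
        && !cmd->has_mandatory_args
        && !cmd->returns_nothing;
}
```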
+60 -192
tests/qmp-test.c
··· 1 1 /* 2 2 * QMP protocol test cases 3 3 * 4 - * Copyright (c) 2017 Red Hat Inc. 4 + * Copyright (c) 2017-2018 Red Hat Inc. 5 5 * 6 6 * Authors: 7 - * Markus Armbruster <armbru@redhat.com>, 7 + * Markus Armbruster <armbru@redhat.com> 8 8 * 9 9 * This work is licensed under the terms of the GNU GPL, version 2 or later. 10 10 * See the COPYING file in the top-level directory. ··· 13 13 #include "qemu/osdep.h" 14 14 #include "libqtest.h" 15 15 #include "qapi/error.h" 16 - #include "qapi/qapi-visit-introspect.h" 17 16 #include "qapi/qapi-visit-misc.h" 18 17 #include "qapi/qmp/qdict.h" 19 18 #include "qapi/qmp/qlist.h" 20 19 #include "qapi/qobject-input-visitor.h" 21 - #include "qapi/util.h" 22 - #include "qapi/visitor.h" 23 20 #include "qapi/qmp/qstring.h" 24 21 25 22 const char common_args[] = "-nodefaults -machine none"; ··· 45 42 visit_free(v); 46 43 } 47 44 45 + static bool recovered(QTestState *qts) 46 + { 47 + QDict *resp; 48 + bool ret; 49 + 50 + resp = qtest_qmp(qts, "{ 'execute': 'no-such-cmd' }"); 51 + ret = !strcmp(get_error_class(resp), "CommandNotFound"); 52 + qobject_unref(resp); 53 + return ret; 54 + } 55 + 48 56 static void test_malformed(QTestState *qts) 49 57 { 50 58 QDict *resp; 59 + 60 + /* syntax error */ 61 + qtest_qmp_send_raw(qts, "{]\n"); 62 + resp = qtest_qmp_receive(qts); 63 + g_assert_cmpstr(get_error_class(resp), ==, "GenericError"); 64 + qobject_unref(resp); 65 + g_assert(recovered(qts)); 66 + 67 + /* lexical error: impossible byte outside string */ 68 + qtest_qmp_send_raw(qts, "{\xFF"); 69 + resp = qtest_qmp_receive(qts); 70 + g_assert_cmpstr(get_error_class(resp), ==, "GenericError"); 71 + qobject_unref(resp); 72 + g_assert(recovered(qts)); 73 + 74 + /* lexical error: funny control character outside string */ 75 + qtest_qmp_send_raw(qts, "{\x01"); 76 + resp = qtest_qmp_receive(qts); 77 + g_assert_cmpstr(get_error_class(resp), ==, "GenericError"); 78 + qobject_unref(resp); 79 + g_assert(recovered(qts)); 80 + 81 + /* lexical error: 
impossible byte in string */ 82 + qtest_qmp_send_raw(qts, "{'bad \xFF"); 83 + resp = qtest_qmp_receive(qts); 84 + g_assert_cmpstr(get_error_class(resp), ==, "GenericError"); 85 + qobject_unref(resp); 86 + g_assert(recovered(qts)); 87 + 88 + /* lexical error: control character in string */ 89 + qtest_qmp_send_raw(qts, "{'execute': 'nonexistent', 'id':'\n"); 90 + resp = qtest_qmp_receive(qts); 91 + g_assert_cmpstr(get_error_class(resp), ==, "GenericError"); 92 + qobject_unref(resp); 93 + g_assert(recovered(qts)); 94 + 95 + /* lexical error: interpolation */ 96 + qtest_qmp_send_raw(qts, "%%p\n"); 97 + /* two errors, one for "%", one for "p" */ 98 + resp = qtest_qmp_receive(qts); 99 + g_assert_cmpstr(get_error_class(resp), ==, "GenericError"); 100 + qobject_unref(resp); 101 + resp = qtest_qmp_receive(qts); 102 + g_assert_cmpstr(get_error_class(resp), ==, "GenericError"); 103 + qobject_unref(resp); 104 + g_assert(recovered(qts)); 51 105 52 106 /* Not even a dictionary */ 53 107 resp = qtest_qmp(qts, "null"); ··· 253 307 qtest_quit(qts); 254 308 } 255 309 256 - /* Query smoke tests */ 257 - 258 - static int query_error_class(const char *cmd) 259 - { 260 - static struct { 261 - const char *cmd; 262 - int err_class; 263 - } fails[] = { 264 - /* Success depends on build configuration: */ 265 - #ifndef CONFIG_SPICE 266 - { "query-spice", ERROR_CLASS_COMMAND_NOT_FOUND }, 267 - #endif 268 - #ifndef CONFIG_VNC 269 - { "query-vnc", ERROR_CLASS_GENERIC_ERROR }, 270 - { "query-vnc-servers", ERROR_CLASS_GENERIC_ERROR }, 271 - #endif 272 - #ifndef CONFIG_REPLICATION 273 - { "query-xen-replication-status", ERROR_CLASS_COMMAND_NOT_FOUND }, 274 - #endif 275 - /* Likewise, and require special QEMU command-line arguments: */ 276 - { "query-acpi-ospm-status", ERROR_CLASS_GENERIC_ERROR }, 277 - { "query-balloon", ERROR_CLASS_DEVICE_NOT_ACTIVE }, 278 - { "query-hotpluggable-cpus", ERROR_CLASS_GENERIC_ERROR }, 279 - { "query-vm-generation-id", ERROR_CLASS_GENERIC_ERROR }, 280 - { NULL, -1 } 
281 - }; 282 - int i; 283 - 284 - for (i = 0; fails[i].cmd; i++) { 285 - if (!strcmp(cmd, fails[i].cmd)) { 286 - return fails[i].err_class; 287 - } 288 - } 289 - return -1; 290 - } 291 - 292 - static void test_query(const void *data) 293 - { 294 - const char *cmd = data; 295 - int expected_error_class = query_error_class(cmd); 296 - QDict *resp, *error; 297 - const char *error_class; 298 - 299 - qtest_start(common_args); 300 - 301 - resp = qmp("{ 'execute': %s }", cmd); 302 - error = qdict_get_qdict(resp, "error"); 303 - error_class = error ? qdict_get_str(error, "class") : NULL; 304 - 305 - if (expected_error_class < 0) { 306 - g_assert(qdict_haskey(resp, "return")); 307 - } else { 308 - g_assert(error); 309 - g_assert_cmpint(qapi_enum_parse(&QapiErrorClass_lookup, error_class, 310 - -1, &error_abort), 311 - ==, expected_error_class); 312 - } 313 - qobject_unref(resp); 314 - 315 - qtest_end(); 316 - } 317 - 318 - static bool query_is_blacklisted(const char *cmd) 319 - { 320 - const char *blacklist[] = { 321 - /* Not actually queries: */ 322 - "add-fd", 323 - /* Success depends on target arch: */ 324 - "query-cpu-definitions", /* arm, i386, ppc, s390x */ 325 - "query-gic-capabilities", /* arm */ 326 - /* Success depends on target-specific build configuration: */ 327 - "query-pci", /* CONFIG_PCI */ 328 - /* Success depends on launching SEV guest */ 329 - "query-sev-launch-measure", 330 - /* Success depends on Host or Hypervisor SEV support */ 331 - "query-sev", 332 - "query-sev-capabilities", 333 - NULL 334 - }; 335 - int i; 336 - 337 - for (i = 0; blacklist[i]; i++) { 338 - if (!strcmp(cmd, blacklist[i])) { 339 - return true; 340 - } 341 - } 342 - return false; 343 - } 344 - 345 - typedef struct { 346 - SchemaInfoList *list; 347 - GHashTable *hash; 348 - } QmpSchema; 349 - 350 - static void qmp_schema_init(QmpSchema *schema) 351 - { 352 - QDict *resp; 353 - Visitor *qiv; 354 - SchemaInfoList *tail; 355 - 356 - qtest_start(common_args); 357 - resp = qmp("{ 
'execute': 'query-qmp-schema' }"); 358 - 359 - qiv = qobject_input_visitor_new(qdict_get(resp, "return")); 360 - visit_type_SchemaInfoList(qiv, NULL, &schema->list, &error_abort); 361 - visit_free(qiv); 362 - 363 - qobject_unref(resp); 364 - qtest_end(); 365 - 366 - schema->hash = g_hash_table_new(g_str_hash, g_str_equal); 367 - 368 - /* Build @schema: hash table mapping entity name to SchemaInfo */ 369 - for (tail = schema->list; tail; tail = tail->next) { 370 - g_hash_table_insert(schema->hash, tail->value->name, tail->value); 371 - } 372 - } 373 - 374 - static SchemaInfo *qmp_schema_lookup(QmpSchema *schema, const char *name) 375 - { 376 - return g_hash_table_lookup(schema->hash, name); 377 - } 378 - 379 - static void qmp_schema_cleanup(QmpSchema *schema) 380 - { 381 - qapi_free_SchemaInfoList(schema->list); 382 - g_hash_table_destroy(schema->hash); 383 - } 384 - 385 - static bool object_type_has_mandatory_members(SchemaInfo *type) 386 - { 387 - SchemaInfoObjectMemberList *tail; 388 - 389 - g_assert(type->meta_type == SCHEMA_META_TYPE_OBJECT); 390 - 391 - for (tail = type->u.object.members; tail; tail = tail->next) { 392 - if (!tail->value->has_q_default) { 393 - return true; 394 - } 395 - } 396 - 397 - return false; 398 - } 399 - 400 - static void add_query_tests(QmpSchema *schema) 401 - { 402 - SchemaInfoList *tail; 403 - SchemaInfo *si, *arg_type, *ret_type; 404 - char *test_name; 405 - 406 - /* Test the query-like commands */ 407 - for (tail = schema->list; tail; tail = tail->next) { 408 - si = tail->value; 409 - if (si->meta_type != SCHEMA_META_TYPE_COMMAND) { 410 - continue; 411 - } 412 - 413 - if (query_is_blacklisted(si->name)) { 414 - continue; 415 - } 416 - 417 - arg_type = qmp_schema_lookup(schema, si->u.command.arg_type); 418 - if (object_type_has_mandatory_members(arg_type)) { 419 - continue; 420 - } 421 - 422 - ret_type = qmp_schema_lookup(schema, si->u.command.ret_type); 423 - if (ret_type->meta_type == SCHEMA_META_TYPE_OBJECT 424 - && 
!ret_type->u.object.members) { 425 - continue; 426 - } 427 - 428 - test_name = g_strdup_printf("qmp/%s", si->name); 429 - qtest_add_data_func(test_name, si->name, test_query); 430 - g_free(test_name); 431 - } 432 - } 433 - 434 310 /* Preconfig tests */ 435 311 436 312 static void test_qmp_preconfig(void) ··· 474 350 475 351 int main(int argc, char *argv[]) 476 352 { 477 - QmpSchema schema; 478 - int ret; 479 - 480 353 g_test_init(&argc, &argv, NULL); 481 354 482 355 qtest_add_func("qmp/protocol", test_qmp_protocol); 483 356 qtest_add_func("qmp/oob", test_qmp_oob); 484 - qmp_schema_init(&schema); 485 - add_query_tests(&schema); 486 357 qtest_add_func("qmp/preconfig", test_qmp_preconfig); 487 358 488 - ret = g_test_run(); 489 - 490 - qmp_schema_cleanup(&schema); 491 - return ret; 359 + return g_test_run(); 492 360 }
+2 -1
tests/test-qga.c
···
     unsigned char c;
     QDict *ret;

+    qmp_fd_send_raw(fixture->fd, "\xff");
     qmp_fd_send(fixture->fd,
-                "\xff{'execute': 'guest-sync-delimited',"
+                "{'execute': 'guest-sync-delimited',"
                 " 'arguments': {'id': %u } }",
                 r);

+62 -7
util/unicode.c
··· 13 13 #include "qemu/osdep.h" 14 14 #include "qemu/unicode.h" 15 15 16 + static bool is_valid_codepoint(int codepoint) 17 + { 18 + if (codepoint > 0x10FFFFu) { 19 + return false; /* beyond Unicode range */ 20 + } 21 + if ((codepoint >= 0xFDD0 && codepoint <= 0xFDEF) 22 + || (codepoint & 0xFFFE) == 0xFFFE) { 23 + return false; /* noncharacter */ 24 + } 25 + if (codepoint >= 0xD800 && codepoint <= 0xDFFF) { 26 + return false; /* surrogate code point */ 27 + } 28 + return true; 29 + } 30 + 16 31 /** 17 32 * mod_utf8_codepoint: 18 33 * @s: string encoded in modified UTF-8 ··· 83 98 cp <<= 6; 84 99 cp |= byte & 0x3F; 85 100 } 86 - if (cp > 0x10FFFF) { 87 - cp = -1; /* beyond Unicode range */ 88 - } else if ((cp >= 0xFDD0 && cp <= 0xFDEF) 89 - || (cp & 0xFFFE) == 0xFFFE) { 90 - cp = -1; /* noncharacter */ 91 - } else if (cp >= 0xD800 && cp <= 0xDFFF) { 92 - cp = -1; /* surrogate code point */ 101 + if (!is_valid_codepoint(cp)) { 102 + cp = -1; 93 103 } else if (cp < min_cp[len - 2] && !(cp == 0 && len == 2)) { 94 104 cp = -1; /* overlong, not \xC0\x80 */ 95 105 } ··· 99 109 *end = (char *)p; 100 110 return cp; 101 111 } 112 + 113 + /** 114 + * mod_utf8_encode: 115 + * @buf: Destination buffer 116 + * @bufsz: size of @buf, at least 5. 117 + * @codepoint: Unicode codepoint to encode 118 + * 119 + * Convert Unicode codepoint @codepoint to modified UTF-8. 120 + * 121 + * Returns: the length of the UTF-8 sequence on success, -1 when 122 + * @codepoint is invalid. 
123 + */ 124 + ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint) 125 + { 126 + assert(bufsz >= 5); 127 + 128 + if (!is_valid_codepoint(codepoint)) { 129 + return -1; 130 + } 131 + 132 + if (codepoint > 0 && codepoint <= 0x7F) { 133 + buf[0] = codepoint & 0x7F; 134 + buf[1] = 0; 135 + return 1; 136 + } 137 + if (codepoint <= 0x7FF) { 138 + buf[0] = 0xC0 | ((codepoint >> 6) & 0x1F); 139 + buf[1] = 0x80 | (codepoint & 0x3F); 140 + buf[2] = 0; 141 + return 2; 142 + } 143 + if (codepoint <= 0xFFFF) { 144 + buf[0] = 0xE0 | ((codepoint >> 12) & 0x0F); 145 + buf[1] = 0x80 | ((codepoint >> 6) & 0x3F); 146 + buf[2] = 0x80 | (codepoint & 0x3F); 147 + buf[3] = 0; 148 + return 3; 149 + } 150 + buf[0] = 0xF0 | ((codepoint >> 18) & 0x07); 151 + buf[1] = 0x80 | ((codepoint >> 12) & 0x3F); 152 + buf[2] = 0x80 | ((codepoint >> 6) & 0x3F); 153 + buf[3] = 0x80 | (codepoint & 0x3F); 154 + buf[4] = 0; 155 + return 4; 156 + }
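Reviewer note: to make the encoder's edge cases easy to check outside the QEMU tree, here is a standalone restatement of the logic added above. It is a sketch, not the committed code: the names `valid_codepoint` and `encode_mod_utf8` are hypothetical, the buffer size parameter is replaced by a fixed 5-byte array, and the return type is `int` instead of `ssize_t`.

```c
#include <stdbool.h>

/* Same three rejections as is_valid_codepoint() in the hunk above. */
static bool valid_codepoint(int cp)
{
    if (cp < 0 || cp > 0x10FFFF) {
        return false;               /* beyond Unicode range */
    }
    if ((cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE) {
        return false;               /* noncharacter */
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return false;               /* surrogate code point */
    }
    return true;
}

/*
 * Encode cp as modified UTF-8 into buf (NUL-terminated).
 * Returns the sequence length, or -1 if cp is invalid.
 */
int encode_mod_utf8(char buf[5], int cp)
{
    if (!valid_codepoint(cp)) {
        return -1;
    }
    if (cp > 0 && cp <= 0x7F) {     /* cp == 0 falls through on purpose */
        buf[0] = (char)cp;
        buf[1] = 0;
        return 1;
    }
    if (cp <= 0x7FF) {              /* includes U+0000 as \xC0\x80 */
        buf[0] = (char)(0xC0 | ((cp >> 6) & 0x1F));
        buf[1] = (char)(0x80 | (cp & 0x3F));
        buf[2] = 0;
        return 2;
    }
    if (cp <= 0xFFFF) {
        buf[0] = (char)(0xE0 | ((cp >> 12) & 0x0F));
        buf[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (char)(0x80 | (cp & 0x3F));
        buf[3] = 0;
        return 3;
    }
    buf[0] = (char)(0xF0 | ((cp >> 18) & 0x07));
    buf[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    buf[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    buf[3] = (char)(0x80 | (cp & 0x3F));
    buf[4] = 0;
    return 4;
}
```

The point worth noticing is the first length test: because it requires `cp > 0`, code point zero is encoded as the two-byte sequence `\xC0\x80`, so the output never contains an embedded NUL before its terminator.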