(**************************************************************************)
(*                                                                        *)
(*                                 OCaml                                  *)
(*                                                                        *)
(*             Xavier Leroy, projet Cristal, INRIA Rocquencourt           *)
(*                                                                        *)
(*   Copyright 1996 Institut National de Recherche en Informatique et     *)
(*     en Automatique.                                                    *)
(*                                                                        *)
(*   All rights reserved.  This file is distributed under the terms of    *)
(*   the GNU Lesser General Public License version 2.1, with the          *)
(*   special exception on linking described in the file LICENSE.          *)
(*                                                                        *)
(**************************************************************************)

(* NOTE:
   If this file is bytesLabels.mli, run tools/sync_stdlib_docs after editing it
   to generate bytes.mli.

   If this file is bytes.mli, do not edit it directly -- edit
   bytesLabels.mli instead.
 *)

(** Byte sequence operations.

    A byte sequence is a mutable data structure that contains a
    fixed-length sequence of bytes. Each byte can be indexed in
    constant time for reading or writing.

    Given a byte sequence [s] of length [l], we can access each of the
    [l] bytes of [s] via its index in the sequence. Indexes start at
    [0], and we will call an index valid in [s] if it falls within the
    range [[0...l-1]] (inclusive). A position is the point between two
    bytes or at the beginning or end of the sequence.  We call a
    position valid in [s] if it falls within the range [[0...l]]
    (inclusive). Note that the byte at index [n] is between positions
    [n] and [n+1].

    Two parameters [start] and [len] are said to designate a valid
    range of [s] if [len >= 0] and [start] and [start+len] are valid
    positions in [s].

    Byte sequences can be modified in place, for instance via the [set]
    and [blit] functions described below.  See also strings (module
    {!String}), which are almost the same data structure, but cannot be
    modified in place.

    Bytes are represented by the OCaml type [char].
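
    A minimal sketch of the vocabulary above, for illustration only: the
    helper [is_valid_range] is hypothetical and not part of this module.
    It restates the valid-range definition and checks it against [sub].

```ocaml
(* Hypothetical helper (not part of Bytes): [start] and [len] designate a
   valid range of [s] when [len >= 0] and both [start] and [start + len]
   are valid positions, that is, lie in 0..length s. *)
let is_valid_range (s : bytes) (start : int) (len : int) : bool =
  len >= 0 && start >= 0 && start <= Bytes.length s - len

let () =
  let s = Bytes.of_string "abc" in
  (* Valid indices are 0..2; valid positions are 0..3. *)
  assert (Bytes.get s 2 = 'c');
  (* The empty range at the end position 3 is valid, so [sub] accepts it. *)
  assert (is_valid_range s 3 0);
  assert (Bytes.length (Bytes.sub s 3 0) = 0);
  (* A range that runs past the end is not valid. *)
  assert (not (is_valid_range s 2 2))
```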

    The labeled version of this module can be used as described in the
    {!StdLabels} module.

    @since 4.02

*)

external length : bytes -> int = "%bytes_length"
(** Return the length (number of bytes) of the argument. *)

external get : bytes -> int -> char = "%bytes_safe_get"
(** [get s n] returns the byte at index [n] in argument [s].
    @raise Invalid_argument if [n] is not a valid index in [s]. *)


external set : bytes -> int -> char -> unit = "%bytes_safe_set"
(** [set s n c] modifies [s] in place, replacing the byte at index [n]
    with [c].
    @raise Invalid_argument if [n] is not a valid index in [s]. *)

external create : int -> bytes = "caml_create_bytes"
(** [create n] returns a new byte sequence of length [n]. The
    sequence is uninitialized and contains arbitrary bytes.
    @raise Invalid_argument if [n < 0] or [n > ]{!Sys.max_string_length}. *)

val make : int -> char -> bytes
(** [make n c] returns a new byte sequence of length [n], filled with
    the byte [c].
    @raise Invalid_argument if [n < 0] or [n > ]{!Sys.max_string_length}. *)

val init : int -> (int -> char) -> bytes
(** [init n f] returns a fresh byte sequence of length [n],
    with character [i] initialized to the result of [f i] (in increasing
    index order).
    @raise Invalid_argument if [n < 0] or [n > ]{!Sys.max_string_length}. *)

val empty : bytes
(** A byte sequence of size 0. *)

val copy : bytes -> bytes
(** Return a new byte sequence that contains the same bytes as the
    argument. *)

val of_string : string -> bytes
(** Return a new byte sequence that contains the same bytes as the
    given string. *)

val to_string : bytes -> string
(** Return a new string that contains the same bytes as the given byte
    sequence.
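
    A short sketch, for illustration only, showing that [of_string] and
    [to_string] both copy: a later mutation of the byte sequence is not
    visible through either string. The names [src], [b] and [snapshot]
    are arbitrary.

```ocaml
let src = "hello"

(* [b] is a fresh copy of the bytes of [src]. *)
let b = Bytes.of_string src

(* [snapshot] is a fresh copy of the bytes of [b] at this point. *)
let snapshot = Bytes.to_string b

(* Mutating [b] affects neither [src] nor [snapshot]. *)
let () = Bytes.set b 0 'j'
```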
*)

val sub : bytes -> int -> int -> bytes
(** [sub s pos len] returns a new byte sequence of length [len],
    containing the subsequence of [s] that starts at position [pos]
    and has length [len].
    @raise Invalid_argument if [pos] and [len] do not designate a
    valid range of [s]. *)

val sub_string : bytes -> int -> int -> string
(** Same as {!sub} but return a string instead of a byte sequence. *)

val extend : bytes -> int -> int -> bytes
(** [extend s left right] returns a new byte sequence that contains
    the bytes of [s], with [left] uninitialized bytes prepended and
    [right] uninitialized bytes appended to it. If [left] or [right]
    is negative, then bytes are removed (instead of appended) from
    the corresponding side of [s].
    @raise Invalid_argument if the result length is negative or
    longer than {!Sys.max_string_length} bytes.
    @since 4.05 in BytesLabels *)

val fill : bytes -> int -> int -> char -> unit
(** [fill s pos len c] modifies [s] in place, replacing [len]
    characters with [c], starting at [pos].
    @raise Invalid_argument if [pos] and [len] do not designate a
    valid range of [s]. *)

val blit :
  bytes -> int -> bytes -> int -> int
  -> unit
(** [blit src src_pos dst dst_pos len] copies [len] bytes from byte
    sequence [src], starting at index [src_pos], to byte sequence [dst],
    starting at index [dst_pos]. It works correctly even if [src] and [dst] are
    the same byte sequence, and the source and destination intervals
    overlap.
    @raise Invalid_argument if [src_pos] and [len] do not
    designate a valid range of [src], or if [dst_pos] and [len]
    do not designate a valid range of [dst]. *)

val blit_string :
  string -> int -> bytes -> int -> int
  -> unit
(** [blit_string src src_pos dst dst_pos len] copies [len] bytes from
    string [src], starting at index [src_pos], to byte sequence [dst],
    starting at index [dst_pos].
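
    For instance, as an illustrative sketch (the names [dst],
    [after_blit_string] and [after_blit] are arbitrary), copying from a
    string and then performing an overlapping [blit] within the same
    sequence:

```ocaml
let dst = Bytes.make 8 '.'

(* Copy the 3 bytes "abc" of the string into [dst], starting at index 1. *)
let () = Bytes.blit_string "abcdef" 0 dst 1 3
let after_blit_string = Bytes.to_string dst (* ".abc...." *)

(* Overlapping copy inside [dst]: the bytes at indices 1..3 are copied to
   indices 3..5, as if the source had been read in full first. *)
let () = Bytes.blit dst 1 dst 3 3
let after_blit = Bytes.to_string dst (* ".ababc.." *)
```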
    @raise Invalid_argument if [src_pos] and [len] do not
    designate a valid range of [src], or if [dst_pos] and [len]
    do not designate a valid range of [dst].
    @since 4.05 in BytesLabels *)

val concat : bytes -> bytes list -> bytes
(** [concat sep sl] concatenates the list of byte sequences [sl],
    inserting the separator byte sequence [sep] between each, and
    returns the result as a new byte sequence.
    @raise Invalid_argument if the result is longer than
    {!Sys.max_string_length} bytes.
    *)

val cat : bytes -> bytes -> bytes
(** [cat s1 s2] concatenates [s1] and [s2] and returns the result
    as a new byte sequence.
    @raise Invalid_argument if the result is longer than
    {!Sys.max_string_length} bytes.
    @since 4.05 in BytesLabels *)

val iter : (char -> unit) -> bytes -> unit
(** [iter f s] applies function [f] in turn to all the bytes of [s].
    It is equivalent to [f (get s 0); f (get s 1); ...; f (get s
    (length s - 1)); ()]. *)

val iteri : (int -> char -> unit) -> bytes -> unit
(** Same as {!iter}, but the function is applied to the index of
    the byte as first argument and the byte itself as second
    argument. *)

val map : (char -> char) -> bytes -> bytes
(** [map f s] applies function [f] in turn to all the bytes of [s] (in
    increasing index order) and stores the resulting bytes in a new sequence
    that is returned as the result. *)

val mapi : (int -> char -> char) -> bytes -> bytes
(** [mapi f s] calls [f] with each character of [s] and its
    index (in increasing index order) and stores the resulting bytes
    in a new sequence that is returned as the result. *)

val fold_left : ('acc -> char -> 'acc) -> 'acc -> bytes -> 'acc
(** [fold_left f x s] computes
    [f (... (f (f x (get s 0)) (get s 1)) ...) (get s (n-1))],
    where [n] is the length of [s].
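
    For example, a sketch of two accumulations written with [fold_left];
    the helper names [sum_codes] and [count_byte] are hypothetical and
    not part of this module:

```ocaml
(* Sum of the byte codes, accumulated from index 0 upwards. *)
let sum_codes (s : bytes) : int =
  Bytes.fold_left (fun acc c -> acc + Char.code c) 0 s

(* Number of occurrences of the byte [c] in [s]. *)
let count_byte (c : char) (s : bytes) : int =
  Bytes.fold_left (fun n x -> if x = c then n + 1 else n) 0 s
```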
    @since 4.13 *)

val fold_right : (char -> 'acc -> 'acc) -> bytes -> 'acc -> 'acc
(** [fold_right f s x] computes
    [f (get s 0) (f (get s 1) ( ... (f (get s (n-1)) x) ...))],
    where [n] is the length of [s].
    @since 4.13 *)

val for_all : (char -> bool) -> bytes -> bool
(** [for_all p s] checks if all characters in [s] satisfy the predicate [p].
    @since 4.13 *)

val exists : (char -> bool) -> bytes -> bool
(** [exists p s] checks if at least one character of [s] satisfies the predicate
    [p].
    @since 4.13 *)

val trim : bytes -> bytes
(** Return a copy of the argument, without leading and trailing
    whitespace. The bytes regarded as whitespace are the ASCII
    characters [' '], ['\012'], ['\n'], ['\r'], and ['\t']. *)

val escaped : bytes -> bytes
(** Return a copy of the argument, with special characters represented
    by escape sequences, following the lexical conventions of OCaml.
    All characters outside the ASCII printable range (32..126) are
    escaped, as well as backslash and double-quote.
    @raise Invalid_argument if the result is longer than
    {!Sys.max_string_length} bytes. *)

val index : bytes -> char -> int
(** [index s c] returns the index of the first occurrence of byte [c]
    in [s].
    @raise Not_found if [c] does not occur in [s]. *)

val index_opt: bytes -> char -> int option
(** [index_opt s c] returns the index of the first occurrence of byte [c]
    in [s] or [None] if [c] does not occur in [s].
    @since 4.05 *)

val rindex : bytes -> char -> int
(** [rindex s c] returns the index of the last occurrence of byte [c]
    in [s].
    @raise Not_found if [c] does not occur in [s]. *)

val rindex_opt: bytes -> char -> int option
(** [rindex_opt s c] returns the index of the last occurrence of byte [c]
    in [s] or [None] if [c] does not occur in [s].
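
    An illustrative sketch of the option-returning searches; the names
    [path], [first_slash], [last_slash] and [no_colon] are arbitrary:

```ocaml
let path = Bytes.of_string "a/b/c"

(* First and last occurrence of the same byte. *)
let first_slash = Bytes.index_opt path '/' (* Some 1 *)
let last_slash = Bytes.rindex_opt path '/' (* Some 3 *)

(* A byte that does not occur yields [None] instead of raising. *)
let no_colon = Bytes.rindex_opt path ':' (* None *)
```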
    @since 4.05 *)

val index_from : bytes -> int -> char -> int
(** [index_from s i c] returns the index of the first occurrence of
    byte [c] in [s] after position [i].  [index s c] is
    equivalent to [index_from s 0 c].
    @raise Invalid_argument if [i] is not a valid position in [s].
    @raise Not_found if [c] does not occur in [s] after position [i]. *)

val index_from_opt: bytes -> int -> char -> int option
(** [index_from_opt s i c] returns the index of the first occurrence of
    byte [c] in [s] after position [i] or [None] if [c] does not occur in [s]
    after position [i].
    [index_opt s c] is equivalent to [index_from_opt s 0 c].
    @raise Invalid_argument if [i] is not a valid position in [s].
    @since 4.05 *)

val rindex_from : bytes -> int -> char -> int
(** [rindex_from s i c] returns the index of the last occurrence of
    byte [c] in [s] before position [i+1].  [rindex s c] is equivalent
    to [rindex_from s (length s - 1) c].
    @raise Invalid_argument if [i+1] is not a valid position in [s].
    @raise Not_found if [c] does not occur in [s] before position [i+1]. *)

val rindex_from_opt: bytes -> int -> char -> int option
(** [rindex_from_opt s i c] returns the index of the last occurrence
    of byte [c] in [s] before position [i+1] or [None] if [c] does not
    occur in [s] before position [i+1].  [rindex_opt s c] is equivalent to
    [rindex_from_opt s (length s - 1) c].
    @raise Invalid_argument if [i+1] is not a valid position in [s].
    @since 4.05 *)

val contains : bytes -> char -> bool
(** [contains s c] tests if byte [c] appears in [s]. *)

val contains_from : bytes -> int -> char -> bool
(** [contains_from s start c] tests if byte [c] appears in [s] after
    position [start].  [contains s c] is equivalent to [contains_from
    s 0 c].
    @raise Invalid_argument if [start] is not a valid position in [s].
*)

val rcontains_from : bytes -> int -> char -> bool
(** [rcontains_from s stop c] tests if byte [c] appears in [s] before
    position [stop+1].
    @raise Invalid_argument if [stop < 0] or [stop+1] is not a valid
    position in [s]. *)

val uppercase_ascii : bytes -> bytes
(** Return a copy of the argument, with all lowercase letters
    translated to uppercase, using the US-ASCII character set.
    @since 4.03 (4.05 in BytesLabels) *)

val lowercase_ascii : bytes -> bytes
(** Return a copy of the argument, with all uppercase letters
    translated to lowercase, using the US-ASCII character set.
    @since 4.03 (4.05 in BytesLabels) *)

val capitalize_ascii : bytes -> bytes
(** Return a copy of the argument, with the first character set to uppercase,
    using the US-ASCII character set.
    @since 4.03 (4.05 in BytesLabels) *)

val uncapitalize_ascii : bytes -> bytes
(** Return a copy of the argument, with the first character set to lowercase,
    using the US-ASCII character set.
    @since 4.03 (4.05 in BytesLabels) *)

type t = bytes
(** An alias for the type of byte sequences. *)

val compare: t -> t -> int
(** The comparison function for byte sequences, with the same
    specification as {!Stdlib.compare}.  Along with the type [t],
    this function [compare] allows the module [Bytes] to be passed as
    argument to the functors {!Set.Make} and {!Map.Make}. *)

val equal: t -> t -> bool
(** The equality function for byte sequences.
    @since 4.03 (4.05 in BytesLabels) *)

val starts_with :
  prefix (* comment thwarts tools/sync_stdlib_docs *) :bytes -> bytes -> bool
(** [starts_with ][~prefix s] is [true] if and only if [s] starts with
    [prefix].

    @since 4.13 *)

val ends_with :
  suffix (* comment thwarts tools/sync_stdlib_docs *) :bytes -> bytes -> bool
(** [ends_with ][~suffix s] is [true] if and only if [s] ends with [suffix].
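
    A brief usage sketch of the two labeled tests; the names [name],
    [has_prefix] and [has_suffix] are arbitrary:

```ocaml
let name = Bytes.of_string "report.txt"

(* The prefix and suffix are themselves byte sequences, not strings. *)
let has_prefix = Bytes.starts_with ~prefix:(Bytes.of_string "report") name
let has_suffix = Bytes.ends_with ~suffix:(Bytes.of_string ".csv") name
```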

    @since 4.13 *)

(** {1:unsafe Unsafe conversions (for advanced users)}

    This section describes unsafe, low-level conversion functions
    between [bytes] and [string]. They do not copy the internal data;
    used improperly, they can break the immutability invariant on
    strings. They are available for expert library authors, but for
    most purposes you should use the always-correct {!to_string} and
    {!of_string} instead.
*)

val unsafe_to_string : bytes -> string
(** Unsafely convert a byte sequence into a string.

    To reason about the use of [unsafe_to_string], it is convenient to
    consider an "ownership" discipline. A piece of code that
    manipulates some data "owns" it; there are several disjoint ownership
    modes, including:
    - Unique ownership: the data may be accessed and mutated
    - Shared ownership: the data has several owners, that may only
      access it, not mutate it.

    Unique ownership is linear: passing the data to another piece of
    code means giving up ownership (we cannot write the
    data again). A unique owner may decide to make the data shared
    (giving up mutation rights on it), but shared data may not become
    uniquely-owned again.

    [unsafe_to_string s] can only be used when the caller owns the byte
    sequence [s] -- either uniquely or as shared immutable data. The
    caller gives up ownership of [s], and gains ownership of the
    returned string.

    There are two valid use-cases that respect this ownership
    discipline:

    1. Creating a string by initializing and mutating a byte sequence
       that is never changed after initialization is performed.

    {[
let string_init len f : string =
  let s = Bytes.create len in
  for i = 0 to len - 1 do Bytes.set s i (f i) done;
  Bytes.unsafe_to_string s
    ]}

    This function is safe because the byte sequence [s] will never be
    accessed or mutated after [unsafe_to_string] is called.
    The [string_init] code gives up ownership of [s], and returns the
    ownership of the resulting string to its caller.

    Note that it would be unsafe if [s] was passed as an additional
    parameter to the function [f] as it could escape this way and be
    mutated in the future -- [string_init] would give up ownership of
    [s] to pass it to [f], and could not call [unsafe_to_string]
    safely.

    We have provided the {!String.init}, {!String.map} and
    {!String.mapi} functions to cover most cases of building
    new strings. You should prefer those over [to_string] or
    [unsafe_to_string] whenever applicable.

    2. Temporarily giving ownership of a byte sequence to a function
       that expects a uniquely owned string and returns ownership back, so
       that we can mutate the sequence again after the call ended.

    {[
let bytes_length (s : bytes) =
  String.length (Bytes.unsafe_to_string s)
    ]}

    In this use-case, we do not promise that [s] will never be mutated
    after the call to [bytes_length s]. The {!String.length} function
    temporarily borrows unique ownership of the byte sequence
    (and sees it as a [string]), but returns this ownership back to
    the caller, which may assume that [s] is still a valid byte
    sequence after the call. Note that this is only correct because we
    know that {!String.length} does not capture its argument -- it could
    escape by a side-channel such as a memoization combinator.

    The caller may not mutate [s] while the string is borrowed (it has
    temporarily given up ownership). This affects concurrent programs,
    but also higher-order functions: if {!String.length} returned
    a closure to be called later, [s] should not be mutated until this
    closure is fully applied and returns ownership.
*)

val unsafe_of_string : string -> bytes
(** Unsafely convert a shared string to a byte sequence that should
    not be mutated.

    The same ownership discipline that makes [unsafe_to_string]
    correct applies to [unsafe_of_string]: you may use it if you were
    the owner of the [string] value, and you will own the return
    [bytes] in the same mode.

    In practice, unique ownership of string values is extremely
    difficult to reason about correctly. You should always assume
    strings are shared, never uniquely owned.

    For example, string literals are implicitly shared by the
    compiler, so you never uniquely own them.

    {[
let incorrect = Bytes.unsafe_of_string "hello"
let s = Bytes.of_string "hello"
    ]}

    The first declaration is incorrect, because the string literal
    ["hello"] could be shared by the compiler with other parts of the
    program, and mutating [incorrect] is a bug. You must always use
    the second version, which performs a copy and is thus correct.

    Assuming unique ownership of strings that are not string
    literals, but are (partly) built from string literals, is also
    incorrect. For example, mutating [unsafe_of_string ("foo" ^ s)]
    could mutate the shared string ["foo"] -- assuming a rope-like
    representation of strings. More generally, functions operating on
    strings will assume shared ownership, they do not preserve unique
    ownership. It is thus incorrect to assume unique ownership of the
    result of [unsafe_of_string].

    The only case we have reasonable confidence is safe is if the
    produced [bytes] is shared -- used as an immutable byte
    sequence. This is possibly useful for incremental migration of
    low-level programs that manipulate immutable sequences of bytes
    (for example {!Marshal.from_bytes}) and previously used the
    [string] type for this purpose.
*)


val split_on_char: char -> bytes -> bytes list
(** [split_on_char sep s] returns the list of all (possibly empty)
    subsequences of [s] that are delimited by the [sep] character.
    If [s] is empty, the result is the singleton list [[empty]].

    The function's output is specified by the following invariants:

    - The list is not empty.
    - Concatenating its elements using [sep] as a separator returns a
      byte sequence equal to the input ([Bytes.concat (Bytes.make 1 sep)
      (Bytes.split_on_char sep s) = s]).
    - No byte sequence in the result contains the [sep] character.

    @since 4.13
*)

(** {1 Iterators} *)

val to_seq : t -> char Seq.t
(** Iterate on the byte sequence, in increasing index order. Modifications
    of the byte sequence during iteration will be reflected in the sequence.
    @since 4.07 *)

val to_seqi : t -> (int * char) Seq.t
(** Iterate on the byte sequence, in increasing index order, yielding
    indices along with bytes.
    @since 4.07 *)

val of_seq : char Seq.t -> t
(** Create a byte sequence from the generator.
    @since 4.07 *)

(** {1:utf UTF codecs and validations}

    @since 4.14 *)

(** {2:utf_8 UTF-8} *)

val get_utf_8_uchar : t -> int -> Uchar.utf_decode
(** [get_utf_8_uchar b i] decodes a UTF-8 character at index [i] in
    [b]. *)

val set_utf_8_uchar : t -> int -> Uchar.t -> int
(** [set_utf_8_uchar b i u] UTF-8 encodes [u] at index [i] in [b]
    and returns the number of bytes [n] that were written starting
    at [i]. If [n] is [0] there was not enough space to encode [u]
    at [i] and [b] was left untouched. Otherwise a new character can
    be encoded at [i + n]. *)

val is_valid_utf_8 : t -> bool
(** [is_valid_utf_8 b] is [true] if and only if [b] contains valid
    UTF-8 data. *)

(** {2:utf_16be UTF-16BE} *)

val get_utf_16be_uchar : t -> int -> Uchar.utf_decode
(** [get_utf_16be_uchar b i] decodes a UTF-16BE character at index
    [i] in [b].
*)

val set_utf_16be_uchar : t -> int -> Uchar.t -> int
(** [set_utf_16be_uchar b i u] UTF-16BE encodes [u] at index [i] in [b]
    and returns the number of bytes [n] that were written starting
    at [i]. If [n] is [0] there was not enough space to encode [u]
    at [i] and [b] was left untouched. Otherwise a new character can
    be encoded at [i + n]. *)

val is_valid_utf_16be : t -> bool
(** [is_valid_utf_16be b] is [true] if and only if [b] contains valid
    UTF-16BE data. *)

(** {2:utf_16le UTF-16LE} *)

val get_utf_16le_uchar : t -> int -> Uchar.utf_decode
(** [get_utf_16le_uchar b i] decodes a UTF-16LE character at index
    [i] in [b]. *)

val set_utf_16le_uchar : t -> int -> Uchar.t -> int
(** [set_utf_16le_uchar b i u] UTF-16LE encodes [u] at index [i] in [b]
    and returns the number of bytes [n] that were written starting
    at [i]. If [n] is [0] there was not enough space to encode [u]
    at [i] and [b] was left untouched. Otherwise a new character can
    be encoded at [i + n]. *)

val is_valid_utf_16le : t -> bool
(** [is_valid_utf_16le b] is [true] if and only if [b] contains valid
    UTF-16LE data. *)

(** {1 Binary encoding/decoding of integers} *)

(** The functions in this section binary encode and decode integers to
    and from byte sequences.

    All following functions raise [Invalid_argument] if the space
    needed at index [i] to decode or encode the integer is not
    available.

    Little-endian (resp. big-endian) encoding means that least
    (resp. most) significant bytes are stored first.  Big-endian is
    also known as network byte order.  Native-endian encoding is
    either little-endian or big-endian depending on {!Sys.big_endian}.

    32-bit and 64-bit integers are represented by the [int32] and
    [int64] types, which can be interpreted either as signed or
    unsigned numbers.
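
    As an illustrative sketch of the byte orders and of the two readings
    of an [int32] (all names below are arbitrary, and the functions used
    are the ones declared later in this section plus
    [Int32.unsigned_compare]):

```ocaml
let b = Bytes.make 4 '\000'

(* Big-endian: the most significant byte 0x01 is stored first. *)
let () = Bytes.set_int32_be b 0 0x01020304l
let first_byte_be = Bytes.get_uint8 b 0 (* 0x01 *)

(* Reading the same four bytes as little-endian reverses their weight. *)
let as_le = Bytes.get_int32_le b 0 (* 0x04030201l *)

(* The same 32 bits can be read as signed or unsigned: [-1l] is smaller
   than [1l] as a signed number, but larger as an unsigned one. *)
let signed_lt = Int32.compare (-1l) 1l < 0
let unsigned_gt = Int32.unsigned_compare (-1l) 1l > 0
```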

    8-bit and 16-bit integers are represented by the [int] type,
    which has more bits than the binary encoding.  These extra bits
    are handled as follows:
    {ul
    {- Functions that decode signed (resp. unsigned) 8-bit or 16-bit
    integers represented by [int] values sign-extend
    (resp. zero-extend) their result.}
    {- Functions that encode 8-bit or 16-bit integers represented by
    [int] values truncate their input to their least significant
    bytes.}}
*)

val get_uint8 : bytes -> int -> int
(** [get_uint8 b i] is [b]'s unsigned 8-bit integer starting at byte index [i].
    @since 4.08
*)

val get_int8 : bytes -> int -> int
(** [get_int8 b i] is [b]'s signed 8-bit integer starting at byte index [i].
    @since 4.08
*)

val get_uint16_ne : bytes -> int -> int
(** [get_uint16_ne b i] is [b]'s native-endian unsigned 16-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_uint16_be : bytes -> int -> int
(** [get_uint16_be b i] is [b]'s big-endian unsigned 16-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_uint16_le : bytes -> int -> int
(** [get_uint16_le b i] is [b]'s little-endian unsigned 16-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int16_ne : bytes -> int -> int
(** [get_int16_ne b i] is [b]'s native-endian signed 16-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int16_be : bytes -> int -> int
(** [get_int16_be b i] is [b]'s big-endian signed 16-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int16_le : bytes -> int -> int
(** [get_int16_le b i] is [b]'s little-endian signed 16-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int32_ne : bytes -> int -> int32
(** [get_int32_ne b i] is [b]'s native-endian 32-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int32_be : bytes -> int -> int32
(** [get_int32_be b i] is [b]'s big-endian 32-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int32_le : bytes -> int -> int32
(** [get_int32_le b i] is [b]'s little-endian 32-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int64_ne : bytes -> int -> int64
(** [get_int64_ne b i] is [b]'s native-endian 64-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int64_be : bytes -> int -> int64
(** [get_int64_be b i] is [b]'s big-endian 64-bit integer
    starting at byte index [i].
    @since 4.08
*)

val get_int64_le : bytes -> int -> int64
(** [get_int64_le b i] is [b]'s little-endian 64-bit integer
    starting at byte index [i].
    @since 4.08
*)

val set_uint8 : bytes -> int -> int -> unit
(** [set_uint8 b i v] sets [b]'s unsigned 8-bit integer starting at byte index
    [i] to [v].
    @since 4.08
*)

val set_int8 : bytes -> int -> int -> unit
(** [set_int8 b i v] sets [b]'s signed 8-bit integer starting at byte index
    [i] to [v].
    @since 4.08
*)

val set_uint16_ne : bytes -> int -> int -> unit
(** [set_uint16_ne b i v] sets [b]'s native-endian unsigned 16-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_uint16_be : bytes -> int -> int -> unit
(** [set_uint16_be b i v] sets [b]'s big-endian unsigned 16-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_uint16_le : bytes -> int -> int -> unit
(** [set_uint16_le b i v] sets [b]'s little-endian unsigned 16-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int16_ne : bytes -> int -> int -> unit
(** [set_int16_ne b i v] sets [b]'s native-endian signed 16-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int16_be : bytes -> int -> int -> unit
(** [set_int16_be b i v] sets [b]'s big-endian signed 16-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int16_le : bytes -> int -> int -> unit
(** [set_int16_le b i v] sets [b]'s little-endian signed 16-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int32_ne : bytes -> int -> int32 -> unit
(** [set_int32_ne b i v] sets [b]'s native-endian 32-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int32_be : bytes -> int -> int32 -> unit
(** [set_int32_be b i v] sets [b]'s big-endian 32-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int32_le : bytes -> int -> int32 -> unit
(** [set_int32_le b i v] sets [b]'s little-endian 32-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int64_ne : bytes -> int -> int64 -> unit
(** [set_int64_ne b i v] sets [b]'s native-endian 64-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int64_be : bytes -> int -> int64 -> unit
(** [set_int64_be b i v] sets [b]'s big-endian 64-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)

val set_int64_le : bytes -> int -> int64 -> unit
(** [set_int64_le b i v] sets [b]'s little-endian 64-bit integer
    starting at byte index [i] to [v].
    @since 4.08
*)


(** {1:bytes_concurrency Byte sequences and concurrency safety}

    Care must be taken when concurrently accessing byte sequences from
    multiple domains: accessing a byte sequence will never crash a program,
    but unsynchronized accesses might yield surprising
    (non-sequentially-consistent) results.

    {2:byte_atomicity Atomicity}

    Every byte sequence operation that accesses more than one byte is not
    atomic. This includes iteration and scanning.

    For example, consider the following program:
{[let size = 100_000_000
let b = Bytes.make size ' '
let update b f () =
  Bytes.iteri (fun i x -> Bytes.set b i (Char.chr (f (Char.code x)))) b
let d1 = Domain.spawn (update b (fun x -> x + 1))
let d2 = Domain.spawn (update b (fun x -> 2 * x + 1))
let () = Domain.join d1; Domain.join d2
]}

    After executing this code, the byte sequence [b] may contain a
    non-deterministic mixture of values: each byte of [b] is either ['!'],
    ['A'], ['B'], or ['C']. If atomicity is required, then the user must
    implement their own synchronization (for example, using {!Mutex.t}).

    {2:bytes_data_race Data races}

    If two domains only access disjoint parts of a byte sequence, then the
    observed behaviour is equivalent to some sequential interleaving of the
    operations from the two domains.

    A data race is said to occur when two domains access the same byte
    without synchronization and at least one of the accesses is a write.
    In the absence of data races, the observed behaviour is equivalent to some
    sequential interleaving of the operations from different domains.

    Whenever possible, data races should be avoided by using synchronization
    to mediate the accesses to the elements of the sequence.

    Indeed, in the presence of data races, programs will not crash but the
    observed behaviour may not be equivalent to any sequential interleaving of
    operations from different domains. Nevertheless, even in the presence of
    data races, a read operation will return the value of some prior write to
    that location.

    {2:bytes_mixed_access Mixed-size accesses }

    Another subtle point is that if a data race involves mixed-size writes and
    reads to the same location, the order in which those writes and reads
    are observed by domains is not specified.
    For instance, the following code writes sequentially a 32-bit integer and a
    [char] to the same index
{[
let b = Bytes.make 10 '\000'
let d1 = Domain.spawn (fun () -> Bytes.set_int32_ne b 0 100l; Bytes.set b 0 'd' )
]}

    In this situation, a domain that observes the write of ['d'] to index [0]
    is not guaranteed to also observe the write to indices [1], [2], or [3].

*)

(**/**)

(* The following is for system use only. Do not call directly. *)

external unsafe_get : bytes -> int -> char = "%bytes_unsafe_get"
external unsafe_set : bytes -> int -> char -> unit = "%bytes_unsafe_set"
external unsafe_blit :
  bytes -> int -> bytes -> int -> int ->
    unit = "caml_blit_bytes" [@@noalloc]
external unsafe_blit_string :
  string -> int -> bytes -> int -> int -> unit
  = "caml_blit_string" [@@noalloc]
external unsafe_fill :
  bytes -> int -> int -> char -> unit = "caml_fill_bytes" [@@noalloc]

val unsafe_escape : bytes -> bytes