Module Bigstringaf
Bigstrings, but fast.
The OCaml compiler has a bunch of intrinsics for Bigstrings, but they're not widely known and are sometimes misused, so programs that use Bigstrings are slower than they have to be. And even if a library got that part right and exposed the intrinsics properly, the compiler doesn't have any fast blits between Bigstrings and other string-like types.
So here they are. Go crazy.
type t = (char, Bigarray_compat.int8_unsigned_elt, Bigarray_compat.c_layout) Bigarray_compat.Array1.t
Constructors
val create : int -> t
create n returns a bigstring of length n.
val empty : t
empty is the empty bigstring. It has length 0 and you can't really do much with it, but it's a good placeholder that only needs to be allocated once.
val of_string : off:int -> len:int -> string -> t
of_string ~off ~len s returns a bigstring of length len that contains the bytes of s in the range [off, off + len).
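As a minimal sketch of the constructors above, assuming the bigstringaf library is installed and linked, the following builds a fresh buffer and copies a slice of a string into a bigstring. The names buf and hello are only for illustration, and the sketch does not rely on the initial contents of a freshly created buffer.

```ocaml
(* Assumes the bigstringaf library is available to the build. *)

(* A 16-byte buffer; this sketch never reads its initial contents. *)
let buf = Bigstringaf.create 16

(* Copies 5 bytes starting at offset 6 of the source string: "world". *)
let hello = Bigstringaf.of_string ~off:6 ~len:5 "hello world"

let () =
  assert (Bigstringaf.length buf = 16);
  assert (Bigstringaf.length Bigstringaf.empty = 0);
  assert (Bigstringaf.to_string hello = "world")
```

Note that ~len is a byte count, not an end offset, so the slice above covers offsets 6 through 10 of the source string.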
Memory-safe Operations
val length : t -> int
length t is the length of the bigstring, in bytes.
val substring : t -> off:int -> len:int -> string
substring t ~off ~len returns a string of length len containing the bytes of t starting at off.
val to_string : t -> string
to_string t is equivalent to substring t ~off:0 ~len:(length t).
val get : t -> int -> char
get t i returns the character at offset i in t.
val set : t -> int -> char -> unit
set t i c sets the character at offset i in t to be c.
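The bounds-checked accessors can be exercised like this. It is only a sketch using the signatures listed above, and the variable name t is arbitrary.

```ocaml
(* Assumes the bigstringaf library is available to the build. *)
let t = Bigstringaf.of_string ~off:0 ~len:5 "hello"

let () =
  (* Overwrite the first byte in place. *)
  Bigstringaf.set t 0 'j';
  assert (Bigstringaf.get t 0 = 'j');
  (* substring copies 3 bytes starting at offset 1 into a fresh string. *)
  assert (Bigstringaf.substring t ~off:1 ~len:3 = "ell");
  assert (Bigstringaf.to_string t = "jello")
```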
Little-endian Byte Order
The following operations assume a little-endian byte ordering of the bigstring. If the machine-native byte ordering differs, then the get operations will reorder the bytes so that they are in machine-native byte order before returning the result, and the set operations will reorder the bytes so that they are written out in the appropriate order.
Most modern processor architectures are little-endian, so more likely than not, these operations will not do any byte reordering.
val get_int16_le : t -> int -> int
get_int16_le t i returns the two bytes in t starting at offset i, interpreted as an unsigned integer.
val get_int16_sign_extended_le : t -> int -> int
get_int16_sign_extended_le t i returns the two bytes in t starting at offset i, interpreted as a signed integer and performing sign extension to the native word size before returning the result.
val set_int16_le : t -> int -> int -> unit
set_int16_le t i v sets the two bytes in t starting at offset i to the value v.
val get_int32_le : t -> int -> int32
get_int32_le t i returns the four bytes in t starting at offset i.
val set_int32_le : t -> int -> int32 -> unit
set_int32_le t i v sets the four bytes in t starting at offset i to the value v.
val get_int64_le : t -> int -> int64
get_int64_le t i returns the eight bytes in t starting at offset i.
val set_int64_le : t -> int -> int64 -> unit
set_int64_le t i v sets the eight bytes in t starting at offset i to the value v.
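To make the byte layout and the unsigned versus sign-extended distinction concrete, here is a small sketch. The values are arbitrary, and the code assumes only the functions listed above.

```ocaml
(* Assumes the bigstringaf library is available to the build. *)
let t = Bigstringaf.create 8

let () =
  Bigstringaf.set_int16_le t 0 0xFFFE;
  (* Little-endian: the least significant byte is stored first. *)
  assert (Bigstringaf.get t 0 = '\xFE');
  assert (Bigstringaf.get t 1 = '\xFF');
  (* The plain getter reads the two bytes as an unsigned value. *)
  assert (Bigstringaf.get_int16_le t 0 = 0xFFFE);
  (* The sign-extended getter reads the same two bytes as -2. *)
  assert (Bigstringaf.get_int16_sign_extended_le t 0 = -2);
  (* 32-bit values use the boxed int32 type, hence the 'l' suffix. *)
  Bigstringaf.set_int32_le t 4 0x01020304l;
  assert (Bigstringaf.get t 4 = '\x04');
  assert (Bigstringaf.get_int32_le t 4 = 0x01020304l)
```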
Big-endian Byte Order
The following operations assume a big-endian byte ordering of the bigstring. If the machine-native byte ordering differs, then the get operations will reorder the bytes so that they are in machine-native byte order before returning the result, and the set operations will reorder the bytes so that they are written out in the appropriate order.
Network byte order is big-endian, so you may need these operations when dealing with raw frames, for example, in a userland networking stack.
val get_int16_be : t -> int -> int
get_int16_be t i returns the two bytes in t starting at offset i, interpreted as an unsigned integer.
val get_int16_sign_extended_be : t -> int -> int
get_int16_sign_extended_be t i returns the two bytes in t starting at offset i, interpreted as a signed integer and performing sign extension to the native word size before returning the result.
val set_int16_be : t -> int -> int -> unit
set_int16_be t i v sets the two bytes in t starting at offset i to the value v.
val get_int32_be : t -> int -> int32
get_int32_be t i returns the four bytes in t starting at offset i.
val set_int32_be : t -> int -> int32 -> unit
set_int32_be t i v sets the four bytes in t starting at offset i to the value v.
val get_int64_be : t -> int -> int64
get_int64_be t i returns the eight bytes in t starting at offset i.
val set_int64_be : t -> int -> int64 -> unit
set_int64_be t i v sets the eight bytes in t starting at offset i to the value v.
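As an illustration of reading network-order fields, the sketch below decodes a made-up 6-byte header: a 16-bit port followed by a 32-bit sequence number. The header layout and the names raw, port and seq are hypothetical and not part of any real protocol or of this library.

```ocaml
(* Assumes the bigstringaf library is available to the build. *)

(* Hypothetical header bytes: port 0x1F90, sequence number 0x00000100. *)
let raw = "\x1F\x90\x00\x00\x01\x00"
let t = Bigstringaf.of_string ~off:0 ~len:(String.length raw) raw

(* Big-endian: the most significant byte comes first. *)
let port = Bigstringaf.get_int16_be t 0
let seq = Bigstringaf.get_int32_be t 2

let () =
  assert (port = 8080);
  assert (seq = 256l)
```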
Blits
All the following blit operations do the same thing. They copy a given number of bytes from a source starting at some offset to a destination starting at some other offset. Forgetting for a moment that OCaml is a memory-safe language, these are all equivalent to:
memcpy(dst + dst_off, src + src_off, len);
And in fact, that's how they're implemented, except that bounds checking is performed before the blit.
val blit : t -> src_off:int -> t -> dst_off:int -> len:int -> unit
val blit_from_string : string -> src_off:int -> t -> dst_off:int -> len:int -> unit
val blit_from_bytes : Stdlib.Bytes.t -> src_off:int -> t -> dst_off:int -> len:int -> unit
val blit_to_bytes : t -> src_off:int -> Stdlib.Bytes.t -> dst_off:int -> len:int -> unit
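A short sketch of the four checked blits, assembling a bigstring from a string and a Bytes.t and then copying parts of it back out. The buffer names are illustrative only.

```ocaml
(* Assumes the bigstringaf library is available to the build. *)
let t = Bigstringaf.create 11
let out = Bytes.create 5
let copy = Bigstringaf.create 5

let () =
  (* string -> bigstring *)
  Bigstringaf.blit_from_string "hello" ~src_off:0 t ~dst_off:0 ~len:5;
  (* bytes -> bigstring *)
  Bigstringaf.blit_from_bytes (Bytes.of_string " world") ~src_off:0 t ~dst_off:5 ~len:6;
  assert (Bigstringaf.to_string t = "hello world");
  (* bigstring -> bytes *)
  Bigstringaf.blit_to_bytes t ~src_off:6 out ~dst_off:0 ~len:5;
  assert (Bytes.to_string out = "world");
  (* bigstring -> bigstring *)
  Bigstringaf.blit t ~src_off:0 copy ~dst_off:0 ~len:5;
  assert (Bigstringaf.to_string copy = "hello")
```

In every variant the source comes first and the destination second, mirroring the labelled offsets rather than the argument order of memcpy.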
memcmp
Fast comparisons based on memcmp. Similar to the blits, these are implemented as C calls after performing bounds checks.
memcmp(buf1 + off1, buf2 + off2, len);
Memory-unsafe Operations
The following operations are not memory safe. However, they do compile down to just a couple instructions. Make sure when using them to perform your own bounds checking. Or don't. Just make sure you know what you're doing. You can do it, but only do it if you have to.
val unsafe_set : t -> int -> char -> unit
unsafe_set t i c is like set except no bounds checking is performed.
val unsafe_get_int16_le : t -> int -> int
unsafe_get_int16_le t i is like get_int16_le except no bounds checking is performed.
val unsafe_get_int16_be : t -> int -> int
unsafe_get_int16_be t i is like get_int16_be except no bounds checking is performed.
val unsafe_get_int16_sign_extended_le : t -> int -> int
unsafe_get_int16_sign_extended_le t i is like get_int16_sign_extended_le except no bounds checking is performed.
val unsafe_get_int16_sign_extended_be : t -> int -> int
unsafe_get_int16_sign_extended_be t i is like get_int16_sign_extended_be except no bounds checking is performed.
val unsafe_set_int16_le : t -> int -> int -> unit
unsafe_set_int16_le t i v is like set_int16_le except no bounds checking is performed.
val unsafe_set_int16_be : t -> int -> int -> unit
unsafe_set_int16_be t i v is like set_int16_be except no bounds checking is performed.
val unsafe_get_int32_le : t -> int -> int32
unsafe_get_int32_le t i is like get_int32_le except no bounds checking is performed.
val unsafe_get_int32_be : t -> int -> int32
unsafe_get_int32_be t i is like get_int32_be except no bounds checking is performed.
val unsafe_set_int32_le : t -> int -> int32 -> unit
unsafe_set_int32_le t i v is like set_int32_le except no bounds checking is performed.
val unsafe_set_int32_be : t -> int -> int32 -> unit
unsafe_set_int32_be t i v is like set_int32_be except no bounds checking is performed.
val unsafe_get_int64_le : t -> int -> int64
unsafe_get_int64_le t i is like get_int64_le except no bounds checking is performed.
val unsafe_get_int64_be : t -> int -> int64
unsafe_get_int64_be t i is like get_int64_be except no bounds checking is performed.
val unsafe_set_int64_le : t -> int -> int64 -> unit
unsafe_set_int64_le t i v is like set_int64_le except no bounds checking is performed.
val unsafe_set_int64_be : t -> int -> int64 -> unit
unsafe_set_int64_be t i v is like set_int64_be except no bounds checking is performed.
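One way to do your own bounds checking is to validate the whole range once and then use the unsafe accessors inside the loop. The helper sum_int16_le below is hypothetical, not part of the library, and is only a sketch of that pattern.

```ocaml
(* Assumes the bigstringaf library is available to the build. *)

(* Hypothetical helper: sums n consecutive unsigned 16-bit little-endian
   values stored in t, starting at byte offset off. *)
let sum_int16_le t ~off ~n =
  let len = Bigstringaf.length t in
  (* One up-front check covers every access made by the loop. The
     comparison is written as a division to avoid overflow in off + 2 * n. *)
  if off < 0 || n < 0 || off > len || n > (len - off) / 2 then
    invalid_arg "sum_int16_le: range out of bounds";
  let total = ref 0 in
  for k = 0 to n - 1 do
    total := !total + Bigstringaf.unsafe_get_int16_le t (off + 2 * k)
  done;
  !total

let () =
  let t = Bigstringaf.create 6 in
  Bigstringaf.set_int16_le t 0 1;
  Bigstringaf.set_int16_le t 2 2;
  Bigstringaf.set_int16_le t 4 3;
  assert (sum_int16_le t ~off:0 ~n:3 = 6)
```

The check runs once per call instead of once per element, which is the whole point of reaching for the unsafe variants.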
Blits
All the following blit operations do the same thing. They copy a given number of bytes from a source starting at some offset to a destination starting at some other offset. Forgetting for a moment that OCaml is a memory-safe language, these are all equivalent to:
memcpy(dst + dst_off, src + src_off, len);
And in fact, that's how they're implemented, except in the case of unsafe_blit, which uses memmove so that overlapping blits behave as expected. In both cases, there's no bounds checking.
val unsafe_blit : t -> src_off:int -> t -> dst_off:int -> len:int -> unit
val unsafe_blit_from_string : string -> src_off:int -> t -> dst_off:int -> len:int -> unit
val unsafe_blit_from_bytes : Stdlib.Bytes.t -> src_off:int -> t -> dst_off:int -> len:int -> unit
val unsafe_blit_to_bytes : t -> src_off:int -> Stdlib.Bytes.t -> dst_off:int -> len:int -> unit
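The memmove behaviour of unsafe_blit matters when source and destination are the same buffer. A minimal sketch, with offsets chosen by hand so that both ranges stay inside the 6-byte buffer:

```ocaml
(* Assumes the bigstringaf library is available to the build. *)
let t = Bigstringaf.of_string ~off:0 ~len:6 "abcdef"

let () =
  (* Shift the first four bytes right by two. The ranges [0, 4) and
     [2, 6) overlap, and both lie within the buffer, so skipping the
     bounds check is safe here. *)
  Bigstringaf.unsafe_blit t ~src_off:0 t ~dst_off:2 ~len:4;
  assert (Bigstringaf.to_string t = "ababcd")
```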
memcmp
Fast comparisons based on memcmp. Similar to the blits, these are not memory safe and are implemented by the same C call:
memcmp(buf1 + off1, buf2 + off2, len);