.TH "Stdlib.Hashtbl" 3 2024-05-31 OCamldoc "OCaml library" .SH NAME Stdlib.Hashtbl \- no description .SH Module Module Stdlib.Hashtbl .SH Documentation .sp Module .BI "Hashtbl" : .B (module Stdlib__Hashtbl) .sp .sp .sp .sp .PP Unsynchronized accesses .PP .PP Unsynchronized accesses to a hash table may lead to an invalid hash table state\&. Thus, concurrent accesses to a hash tables must be synchronized (for instance with a .ft B Mutex\&.t .ft R )\&. .PP .PP .SS Generic interface .PP .I type .B (!'a, !'b) .I t .sp The type of hash tables from type .ft B \&'a .ft R to type .ft B \&'b .ft R \&. .sp .I val create : .B ?random:bool -> int -> ('a, 'b) t .sp .ft B Hashtbl\&.create n .ft R creates a new, empty hash table, with initial size .ft B n .ft R \&. For best results, .ft B n .ft R should be on the order of the expected number of elements that will be in the table\&. The table grows as needed, so .ft B n .ft R is just an initial guess\&. .sp The optional .ft B ~random .ft R parameter (a boolean) controls whether the internal organization of the hash table is randomized at each execution of .ft B Hashtbl\&.create .ft R or deterministic over all executions\&. .sp A hash table that is created with .ft B ~random .ft R set to .ft B false .ft R uses a fixed hash function ( .ft B Hashtbl\&.hash .ft R ) to distribute keys among buckets\&. As a consequence, collisions between keys happen deterministically\&. In Web\-facing applications or other security\-sensitive applications, the deterministic collision patterns can be exploited by a malicious user to create a denial\-of\-service attack: the attacker sends input crafted to create many collisions in the table, slowing the application down\&. .sp A hash table that is created with .ft B ~random .ft R set to .ft B true .ft R uses the seeded hash function .ft B Hashtbl\&.seeded_hash .ft R with a seed that is randomly chosen at hash table creation time\&. In effect, the hash function used is randomly selected among .ft B 2^{30} .ft R different hash functions\&. All these hash functions have different collision patterns, rendering ineffective the denial\-of\-service attack described above\&. However, because of randomization, enumerating all elements of the hash table using .ft B Hashtbl\&.fold .ft R or .ft B Hashtbl\&.iter .ft R is no longer deterministic: elements are enumerated in different orders at different runs of the program\&. .sp If no .ft B ~random .ft R parameter is given, hash tables are created in non\-random mode by default\&. This default can be changed either programmatically by calling .ft B Hashtbl\&.randomize .ft R or by setting the .ft B R .ft R flag in the .ft B OCAMLRUNPARAM .ft R environment variable\&. .sp .B "Before4.00" the .ft B ~random .ft R parameter was not present and all hash tables were created in non\-randomized mode\&. .sp .I val clear : .B ('a, 'b) t -> unit .sp Empty a hash table\&. Use .ft B reset .ft R instead of .ft B clear .ft R to shrink the size of the bucket table to its initial size\&. .sp .I val reset : .B ('a, 'b) t -> unit .sp Empty a hash table and shrink the size of the bucket table to its initial size\&. .sp .B "Since" 4.00 .sp .I val copy : .B ('a, 'b) t -> ('a, 'b) t .sp Return a copy of the given hashtable\&. .sp .I val add : .B ('a, 'b) t -> 'a -> 'b -> unit .sp .ft B Hashtbl\&.add tbl key data .ft R adds a binding of .ft B key .ft R to .ft B data .ft R in table .ft B tbl .ft R \&. .sp Warning: Previous bindings for .ft B key .ft R are not removed, but simply hidden\&. That is, after performing .ft B Hashtbl\&.remove .ft R .ft B tbl key .ft R , the previous binding for .ft B key .ft R , if any, is restored\&. (Same behavior as with association lists\&.) .sp If you desire the classic behavior of replacing elements, see .ft B Hashtbl\&.replace .ft R \&. .sp .I val find : .B ('a, 'b) t -> 'a -> 'b .sp .ft B Hashtbl\&.find tbl x .ft R returns the current binding of .ft B x .ft R in .ft B tbl .ft R , or raises .ft B Not_found .ft R if no such binding exists\&. .sp .I val find_opt : .B ('a, 'b) t -> 'a -> 'b option .sp .ft B Hashtbl\&.find_opt tbl x .ft R returns the current binding of .ft B x .ft R in .ft B tbl .ft R , or .ft B None .ft R if no such binding exists\&. .sp .B "Since" 4.05 .sp .I val find_all : .B ('a, 'b) t -> 'a -> 'b list .sp .ft B Hashtbl\&.find_all tbl x .ft R returns the list of all data associated with .ft B x .ft R in .ft B tbl .ft R \&. The current binding is returned first, then the previous bindings, in reverse order of introduction in the table\&. .sp .I val mem : .B ('a, 'b) t -> 'a -> bool .sp .ft B Hashtbl\&.mem tbl x .ft R checks if .ft B x .ft R is bound in .ft B tbl .ft R \&. .sp .I val remove : .B ('a, 'b) t -> 'a -> unit .sp .ft B Hashtbl\&.remove tbl x .ft R removes the current binding of .ft B x .ft R in .ft B tbl .ft R , restoring the previous binding if it exists\&. It does nothing if .ft B x .ft R is not bound in .ft B tbl .ft R \&. .sp .I val replace : .B ('a, 'b) t -> 'a -> 'b -> unit .sp .ft B Hashtbl\&.replace tbl key data .ft R replaces the current binding of .ft B key .ft R in .ft B tbl .ft R by a binding of .ft B key .ft R to .ft B data .ft R \&. If .ft B key .ft R is unbound in .ft B tbl .ft R , a binding of .ft B key .ft R to .ft B data .ft R is added to .ft B tbl .ft R \&. This is functionally equivalent to .ft B Hashtbl\&.remove .ft R .ft B tbl key .ft R followed by .ft B Hashtbl\&.add .ft R .ft B tbl key data .ft R \&. .sp .I val iter : .B ('a -> 'b -> unit) -> ('a, 'b) t -> unit .sp .ft B Hashtbl\&.iter f tbl .ft R applies .ft B f .ft R to all bindings in table .ft B tbl .ft R \&. .ft B f .ft R receives the key as first argument, and the associated value as second argument\&. Each binding is presented exactly once to .ft B f .ft R \&. .sp The order in which the bindings are passed to .ft B f .ft R is unspecified\&. However, if the table contains several bindings for the same key, they are passed to .ft B f .ft R in reverse order of introduction, that is, the most recent binding is passed first\&. .sp If the hash table was created in non\-randomized mode, the order in which the bindings are enumerated is reproducible between successive runs of the program, and even between minor versions of OCaml\&. For randomized hash tables, the order of enumeration is entirely random\&. .sp The behavior is not specified if the hash table is modified by .ft B f .ft R during the iteration\&. .sp .I val filter_map_inplace : .B ('a -> 'b -> 'b option) -> ('a, 'b) t -> unit .sp .ft B Hashtbl\&.filter_map_inplace f tbl .ft R applies .ft B f .ft R to all bindings in table .ft B tbl .ft R and update each binding depending on the result of .ft B f .ft R \&. If .ft B f .ft R returns .ft B None .ft R , the binding is discarded\&. If it returns .ft B Some new_val .ft R , the binding is update to associate the key to .ft B new_val .ft R \&. .sp Other comments for .ft B Hashtbl\&.iter .ft R apply as well\&. .sp .B "Since" 4.03 .sp .I val fold : .B ('a -> 'b -> 'acc -> 'acc) -> ('a, 'b) t -> 'acc -> 'acc .sp .ft B Hashtbl\&.fold f tbl init .ft R computes .ft B (f kN dN \&.\&.\&. (f k1 d1 init)\&.\&.\&.) .ft R , where .ft B k1 \&.\&.\&. kN .ft R are the keys of all bindings in .ft B tbl .ft R , and .ft B d1 \&.\&.\&. dN .ft R are the associated values\&. Each binding is presented exactly once to .ft B f .ft R \&. .sp The order in which the bindings are passed to .ft B f .ft R is unspecified\&. However, if the table contains several bindings for the same key, they are passed to .ft B f .ft R in reverse order of introduction, that is, the most recent binding is passed first\&. .sp If the hash table was created in non\-randomized mode, the order in which the bindings are enumerated is reproducible between successive runs of the program, and even between minor versions of OCaml\&. For randomized hash tables, the order of enumeration is entirely random\&. .sp The behavior is not specified if the hash table is modified by .ft B f .ft R during the iteration\&. .sp .I val length : .B ('a, 'b) t -> int .sp .ft B Hashtbl\&.length tbl .ft R returns the number of bindings in .ft B tbl .ft R \&. It takes constant time\&. Multiple bindings are counted once each, so .ft B Hashtbl\&.length .ft R gives the number of times .ft B Hashtbl\&.iter .ft R calls its first argument\&. .sp .I val randomize : .B unit -> unit .sp After a call to .ft B Hashtbl\&.randomize() .ft R , hash tables are created in randomized mode by default: .ft B Hashtbl\&.create .ft R returns randomized hash tables, unless the .ft B ~random:false .ft R optional parameter is given\&. The same effect can be achieved by setting the .ft B R .ft R parameter in the .ft B OCAMLRUNPARAM .ft R environment variable\&. .sp It is recommended that applications or Web frameworks that need to protect themselves against the denial\-of\-service attack described in .ft B Hashtbl\&.create .ft R call .ft B Hashtbl\&.randomize() .ft R at initialization time before any domains are created\&. .sp Note that once .ft B Hashtbl\&.randomize() .ft R was called, there is no way to revert to the non\-randomized default behavior of .ft B Hashtbl\&.create .ft R \&. This is intentional\&. Non\-randomized hash tables can still be created using .ft B Hashtbl\&.create ~random:false .ft R \&. .sp .B "Since" 4.00 .sp .I val is_randomized : .B unit -> bool .sp Return .ft B true .ft R if the tables are currently created in randomized mode by default, .ft B false .ft R otherwise\&. .sp .B "Since" 4.03 .sp .I val rebuild : .B ?random:bool -> ('a, 'b) t -> ('a, 'b) t .sp Return a copy of the given hashtable\&. Unlike .ft B Hashtbl\&.copy .ft R , .ft B Hashtbl\&.rebuild .ft R .ft B h .ft R re\-hashes all the (key, value) entries of the original table .ft B h .ft R \&. The returned hash table is randomized if .ft B h .ft R was randomized, or the optional .ft B random .ft R parameter is true, or if the default is to create randomized hash tables; see .ft B Hashtbl\&.create .ft R for more information\&. .sp .ft B Hashtbl\&.rebuild .ft R can safely be used to import a hash table built by an old version of the .ft B Hashtbl .ft R module, then marshaled to persistent storage\&. After unmarshaling, apply .ft B Hashtbl\&.rebuild .ft R to produce a hash table for the current version of the .ft B Hashtbl .ft R module\&. .sp .B "Since" 4.12 .sp .I type statistics = { num_bindings : .B int ; (* Number of bindings present in the table\&. Same value as returned by .ft B Hashtbl\&.length .ft R \&. *) num_buckets : .B int ; (* Number of buckets in the table\&. *) max_bucket_length : .B int ; (* Maximal number of bindings per bucket\&. *) bucket_histogram : .B int array ; (* Histogram of bucket sizes\&. This array .ft B histo .ft R has length .ft B max_bucket_length + 1 .ft R \&. The value of .ft B histo\&.(i) .ft R is the number of buckets whose size is .ft B i .ft R \&. *) } .sp .B "Since" 4.00 .sp .I val stats : .B ('a, 'b) t -> statistics .sp .ft B Hashtbl\&.stats tbl .ft R returns statistics about the table .ft B tbl .ft R : number of buckets, size of the biggest bucket, distribution of buckets by size\&. .sp .B "Since" 4.00 .sp .PP .SS Hash tables and Sequences .PP .I val to_seq : .B ('a, 'b) t -> ('a * 'b) Seq.t .sp Iterate on the whole table\&. The order in which the bindings appear in the sequence is unspecified\&. However, if the table contains several bindings for the same key, they appear in reversed order of introduction, that is, the most recent binding appears first\&. .sp The behavior is not specified if the hash table is modified during the iteration\&. .sp .B "Since" 4.07 .sp .I val to_seq_keys : .B ('a, 'b) t -> 'a Seq.t .sp Same as .ft B Seq\&.map fst (to_seq m) .ft R .sp .B "Since" 4.07 .sp .I val to_seq_values : .B ('a, 'b) t -> 'b Seq.t .sp Same as .ft B Seq\&.map snd (to_seq m) .ft R .sp .B "Since" 4.07 .sp .I val add_seq : .B ('a, 'b) t -> ('a * 'b) Seq.t -> unit .sp Add the given bindings to the table, using .ft B Hashtbl\&.add .ft R .sp .B "Since" 4.07 .sp .I val replace_seq : .B ('a, 'b) t -> ('a * 'b) Seq.t -> unit .sp Add the given bindings to the table, using .ft B Hashtbl\&.replace .ft R .sp .B "Since" 4.07 .sp .I val of_seq : .B ('a * 'b) Seq.t -> ('a, 'b) t .sp Build a table from the given bindings\&. The bindings are added in the same order they appear in the sequence, using .ft B Hashtbl\&.replace_seq .ft R , which means that if two pairs have the same key, only the latest one will appear in the table\&. .sp .B "Since" 4.07 .sp .PP .SS Functorial interface .PP .PP The functorial interface allows the use of specific comparison and hash functions, either for performance/security concerns, or because keys are not hashable/comparable with the polymorphic builtins\&. .sp For instance, one might want to specialize a table for integer keys: .EX .ft B .br \& module IntHash = .br \& struct .br \& type t = int .br \& let equal i j = i=j .br \& let hash i = i land max_int .br \& end .br \& .br \& module IntHashtbl = Hashtbl\&.Make(IntHash) .br \& .br \& let h = IntHashtbl\&.create 17 in .br \& IntHashtbl\&.add h 12 "hello" .br \& .ft R .EE .sp This creates a new module .ft B IntHashtbl .ft R , with a new type .ft B \&'a .br \& IntHashtbl\&.t .ft R of tables from .ft B int .ft R to .ft B \&'a .ft R \&. In this example, .ft B h .ft R contains .ft B string .ft R values so its type is .ft B string IntHashtbl\&.t .ft R \&. .sp Note that the new type .ft B \&'a IntHashtbl\&.t .ft R is not compatible with the type .ft B (\&'a,\&'b) Hashtbl\&.t .ft R of the generic interface\&. For example, .ft B Hashtbl\&.length h .ft R would not type\-check, you must use .ft B IntHashtbl\&.length .ft R \&. .PP .I module type HashedType = .B sig end .sp The input signature of the functor .ft B Hashtbl\&.Make .ft R \&. .sp .I module type S = .B sig end .sp The output signature of the functor .ft B Hashtbl\&.Make .ft R \&. .sp .I module Make : .B functor (H : HashedType) -> sig end .sp Functor building an implementation of the hashtable structure\&. The functor .ft B Hashtbl\&.Make .ft R returns a structure containing a type .ft B key .ft R of keys and a type .ft B \&'a t .ft R of hash tables associating data of type .ft B \&'a .ft R to keys of type .ft B key .ft R \&. The operations perform similarly to those of the generic interface, but use the hashing and equality functions specified in the functor argument .ft B H .ft R instead of generic equality and hashing\&. Since the hash function is not seeded, the .ft B create .ft R operation of the result structure always returns non\-randomized hash tables\&. .sp .I module type SeededHashedType = .B sig end .sp The input signature of the functor .ft B Hashtbl\&.MakeSeeded .ft R \&. .sp .B "Since" 4.00 .sp .I module type SeededS = .B sig end .sp The output signature of the functor .ft B Hashtbl\&.MakeSeeded .ft R \&. .sp .B "Since" 4.00 .sp .I module MakeSeeded : .B functor (H : SeededHashedType) -> sig end .sp Functor building an implementation of the hashtable structure\&. The functor .ft B Hashtbl\&.MakeSeeded .ft R returns a structure containing a type .ft B key .ft R of keys and a type .ft B \&'a t .ft R of hash tables associating data of type .ft B \&'a .ft R to keys of type .ft B key .ft R \&. The operations perform similarly to those of the generic interface, but use the seeded hashing and equality functions specified in the functor argument .ft B H .ft R instead of generic equality and hashing\&. The .ft B create .ft R operation of the result structure supports the .ft B ~random .ft R optional parameter and returns randomized hash tables if .ft B ~random:true .ft R is passed or if randomization is globally on (see .ft B Hashtbl\&.randomize .ft R )\&. .sp .B "Since" 4.00 .sp .PP .SS The polymorphic hash functions .PP .I val hash : .B 'a -> int .sp .ft B Hashtbl\&.hash x .ft R associates a nonnegative integer to any value of any type\&. It is guaranteed that if .ft B x = y .ft R or .ft B Stdlib\&.compare x y = 0 .ft R , then .ft B hash x = hash y .ft R \&. Moreover, .ft B hash .ft R always terminates, even on cyclic structures\&. .sp .I val seeded_hash : .B int -> 'a -> int .sp A variant of .ft B Hashtbl\&.hash .ft R that is further parameterized by an integer seed\&. .sp .B "Since" 4.00 .sp .I val hash_param : .B int -> int -> 'a -> int .sp .ft B Hashtbl\&.hash_param meaningful total x .ft R computes a hash value for .ft B x .ft R , with the same properties as for .ft B hash .ft R \&. The two extra integer parameters .ft B meaningful .ft R and .ft B total .ft R give more precise control over hashing\&. Hashing performs a breadth\-first, left\-to\-right traversal of the structure .ft B x .ft R , stopping after .ft B meaningful .ft R meaningful nodes were encountered, or .ft B total .ft R nodes (meaningful or not) were encountered\&. If .ft B total .ft R as specified by the user exceeds a certain value, currently 256, then it is capped to that value\&. Meaningful nodes are: integers; floating\-point numbers; strings; characters; booleans; and constant constructors\&. Larger values of .ft B meaningful .ft R and .ft B total .ft R means that more nodes are taken into account to compute the final hash value, and therefore collisions are less likely to happen\&. However, hashing takes longer\&. The parameters .ft B meaningful .ft R and .ft B total .ft R govern the tradeoff between accuracy and speed\&. As default choices, .ft B Hashtbl\&.hash .ft R and .ft B Hashtbl\&.seeded_hash .ft R take .ft B meaningful = 10 .ft R and .ft B total = 100 .ft R \&. .sp .I val seeded_hash_param : .B int -> int -> int -> 'a -> int .sp A variant of .ft B Hashtbl\&.hash_param .ft R that is further parameterized by an integer seed\&. Usage: .ft B Hashtbl\&.seeded_hash_param meaningful total seed x .ft R \&. .sp .B "Since" 4.00 .sp .PP .SS Examples .sp .SS Basic Example .sp .EX .ft B .br \& (* 0\&.\&.\&.99 *) .br \& let seq = Seq\&.ints 0 |> Seq\&.take 100 .br \& .br \& (* build from Seq\&.t *) .br \& # let tbl = .br \& seq .br \& |> Seq\&.map (fun x \-> x, string_of_int x) .br \& |> Hashtbl\&.of_seq .br \& val tbl : (int, string) Hashtbl\&.t = .br \& .br \& # Hashtbl\&.length tbl .br \& \- : int = 100 .br \& .br \& # Hashtbl\&.find_opt tbl 32 .br \& \- : string option = Some "32" .br \& .br \& # Hashtbl\&.find_opt tbl 166 .br \& \- : string option = None .br \& .br \& # Hashtbl\&.replace tbl 166 "one six six" .br \& \- : unit = () .br \& .br \& # Hashtbl\&.find_opt tbl 166 .br \& \- : string option = Some "one six six" .br \& .br \& # Hashtbl\&.length tbl .br \& \- : int = 101 .br \& .ft R .EE .sp .SS Counting Elements .sp Given a sequence of elements (here, a .ft B Seq\&.t .ft R ), we want to count how many times each distinct element occurs in the sequence\&. A simple way to do this, assuming the elements are comparable and hashable, is to use a hash table that maps elements to their number of occurrences\&. .sp Here we illustrate that principle using a sequence of (ascii) characters (type .ft B char .ft R )\&. We use a custom .ft B Char_tbl .ft R specialized for .ft B char .ft R \&. .sp .EX .ft B .br \& # module Char_tbl = Hashtbl\&.Make(struct .br \& type t = char .br \& let equal = Char\&.equal .br \& let hash = Hashtbl\&.hash .br \& end) .br \& .br \& (* count distinct occurrences of chars in [seq] *) .br \& # let count_chars (seq : char Seq\&.t) : _ list = .br \& let counts = Char_tbl\&.create 16 in .br \& Seq\&.iter .br \& (fun c \-> .br \& let count_c = .br \& Char_tbl\&.find_opt counts c .br \& |> Option\&.value ~default:0 .br \& in .br \& Char_tbl\&.replace counts c (count_c + 1)) .br \& seq; .br \& (* turn into a list *) .br \& Char_tbl\&.fold (fun c n l \-> (c,n) :: l) counts [] .br \& |> List\&.sort (fun (c1,_)(c2,_) \-> Char\&.compare c1 c2) .br \& val count_chars : Char_tbl\&.key Seq\&.t \-> (Char\&.t * int) list = .br \& .br \& (* basic seq from a string *) .br \& # let seq = String\&.to_seq "hello world, and all the camels in it!" .br \& val seq : char Seq\&.t = .br \& .br \& # count_chars seq .br \& \- : (Char\&.t * int) list = .br \& [(\&' \&', 7); (\&'!\&', 1); (\&',\&', 1); (\&'a\&', 3); (\&'c\&', 1); (\&'d\&', 2); (\&'e\&', 3); .br \& (\&'h\&', 2); (\&'i\&', 2); (\&'l\&', 6); (\&'m\&', 1); (\&'n\&', 2); (\&'o\&', 2); (\&'r\&', 1); .br \& (\&'s\&', 1); (\&'t\&', 2); (\&'w\&', 1)] .br \& .br \& (* "abcabcabc\&.\&.\&." *) .br \& # let seq2 = .br \& Seq\&.cycle (String\&.to_seq "abc") |> Seq\&.take 31 .br \& val seq2 : char Seq\&.t = .br \& .br \& # String\&.of_seq seq2 .br \& \- : String\&.t = "abcabcabcabcabcabcabcabcabcabca" .br \& .br \& # count_chars seq2 .br \& \- : (Char\&.t * int) list = [(\&'a\&', 11); (\&'b\&', 10); (\&'c\&', 10)] .br \& .br \& .ft R .EE .PP