Plaice Series of Hash Functions
Plaice hash functions are specifically designed for web applications.
The main difference between classical hash functions (MD5, SHA-256, Tiger,
SHAvite-3, BLAKE2, etc.) and Plaice hash functions is that classical hash
functions work with bit-streams, whereas Plaice hash functions operate only on
characters and classical, CPU-supported, 4-byte integers.
Where classical hash functions tend to use XOR, Plaice hash functions
tend to use TXOR.
Motivation
Bit-streams depend on the byte endianness and bit endianness of the CPU.
That is to say, there are 4 types of CPUs:
big-endian by bytes and little-endian by bits, big-endian by bytes and big-endian by bits,
little-endian by bytes and big-endian by bits, and little-endian by bytes and little-endian by bits.
The optional Unicode byte order mark (BOM) illustrates a case where the same
text can have 2 different bit-streams, regardless of the endianness of the CPU.
For 2 different bit-streams, a bit-stream oriented hash function returns,
with overwhelming probability, 2 different hash values. Consequently, a
bit-stream oriented hash function can return at least 2 different values for
the very same text. (4 CPU types times BOM-or-no-BOM gives up to 8 different hashes.)
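The BOM effect can be reproduced with a short sketch. It is an illustration only: SHA-256 from Python's standard `hashlib` stands in for any bit-stream oriented hash function, the sample text and the helper name `digest` are assumptions made for this example, and the byte order shown is that of the UTF-16 serialization, not the bit endianness of a CPU.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string (stand-in for any bit-stream hash)."""
    return hashlib.sha256(data).hexdigest()


text = "plaice"  # hypothetical sample text

# Four serializations of the very same text.
codecs = ["utf-8", "utf-8-sig", "utf-16-le", "utf-16-be"]
streams = {codec: text.encode(codec) for codec in codecs}

# Every byte stream decodes back to the identical text.
assert all(streams[codec].decode(codec) == text for codec in codecs)

digests = {codec: digest(data) for codec, data in streams.items()}
for codec, value in digests.items():
    print(f"{codec:10} {value[:16]}...")

# Count how many different hash values one text produced.
print(len(set(digests.values())), "distinct digests for one text")
```

The `utf-8-sig` codec prepends the three BOM bytes, so its byte stream differs from plain UTF-8 even though both carry the same characters.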
Things to Consider, when Designing Solutions
If Unicode code points were used for calculations, then
the calculation result might equal a code point that
has not been assigned to a character. Text editors cannot
display code points that are not assigned to a character,
string processing routines are likely to fail, etc.
The code point of the same character might have different
integer values in different encoding standards.
For example, Unicode differs from TRON.
Because the integer values of code points differ between
encoding standards, some sort of unifying encoding
should be used. The unifying encoding must not contain unassigned code
points. To simplify calculations, the lowest code point
of the unifying encoding should be 0.
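The first consideration above, a calculation that lands on an unassigned code point, can be reproduced with a minimal sketch. It assumes plain XOR as the calculation, which the text does not prescribe. The helper `is_unassigned` is hypothetical and relies on Python's `unicodedata` module, which reports the general category `Cn` for code points that have no character assigned.

```python
import unicodedata


def is_unassigned(code_point: int) -> bool:
    """True if the code point has general category Cn (no character assigned)."""
    return unicodedata.category(chr(code_point)) == "Cn"


a = "\uff01"  # FULLWIDTH EXCLAMATION MARK, U+FF01
b = "\u00fe"  # LATIN SMALL LETTER THORN, U+00FE

# XOR of two assigned code points: 0xFF01 ^ 0x00FE == 0xFFFF
result = ord(a) ^ ord(b)

print(f"U+{ord(a):04X} xor U+{ord(b):04X} = U+{result:04X}")
print("inputs unassigned:", is_unassigned(ord(a)), is_unassigned(ord(b)))
print("result unassigned:", is_unassigned(result))
```

U+FFFF is a Unicode noncharacter, so the result of combining two ordinary characters is a value that no character occupies.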
A Possible Sub-solution:
The unifying encoding might be based on an artificial alphabet that
is dynamically constructed from the hash-able text. The
alphabet assembly algorithm can be part of the hashing algorithm.
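One way such an alphabet could be assembled is sketched below. This is an assumption-laden illustration, not the algorithm used by Plaice_t1: the function names `build_alphabet` and `to_indices` are hypothetical, and numbering characters in order of first appearance is only one of several possible conventions.

```python
def build_alphabet(text: str) -> dict:
    """Map each distinct character to a dense index, in order of first appearance."""
    alphabet = {}
    for ch in text:
        if ch not in alphabet:
            alphabet[ch] = len(alphabet)  # next free index, starting at 0
    return alphabet


def to_indices(text: str, alphabet: dict) -> list:
    """Re-express the text as a list of alphabet indices."""
    return [alphabet[ch] for ch in text]


text = "abracadabra"  # hypothetical sample text
alphabet = build_alphabet(text)
indices = to_indices(text, alphabet)

print(alphabet)  # {'a': 0, 'b': 1, 'r': 2, 'c': 3, 'd': 4}
print(indices)   # [0, 1, 2, 0, 3, 0, 4, 0, 1, 2, 0]
```

Because the indices are handed out consecutively, the resulting encoding starts at 0 and has no unassigned values below the alphabet size, and it does not depend on the integer values that any particular encoding standard gives to the characters.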
Plaice_t1
Plaice_t1 is the very first, experimental version of the Plaice series of hash functions.
Its reference implementation, which is part of the Kibuvits Ruby Library, is its specification.
That is to say, the Ruby code is the specification.
Timestamped first version of this document.