Popularity

8.6

Stable

Activity

4.8

Stars 46

Watchers 8

Forks 16

Last Commit 7 months ago

Monthly Downloads: 36

Programming language: Haskell

License: BSD 3-clause "New" or "Revised" License

Tags: Data Text Unicode

Latest version: v0.3.7

unicode-transforms alternatives and similar packages

Based on the "Data" category.

lens

10.0 6.8 unicode-transforms VS lens

Lenses, Folds, and Traversals - Join us on web.libera.chat #haskell-lens
semantic-source

10.0 8.9 unicode-transforms VS semantic-source

DISCONTINUED. Parsing, analyzing, and comparing source code across many languages


hnix

9.9 6.8 unicode-transforms VS hnix

A Haskell re-implementation of the Nix expression language
text

9.8 7.8 unicode-transforms VS text

Haskell library for space- and time-efficient operations over Unicode text.
code-builder

9.8 0.0 unicode-transforms VS code-builder

Packages for defining APIs, running them, generating client code and documentation.
Frames

9.7 6.1 unicode-transforms VS Frames

Data frames for tabular data.
unordered-containers

9.7 1.9 unicode-transforms VS unordered-containers

Efficient hashing-based container types
massiv

9.7 4.4 unicode-transforms VS massiv

Efficient Haskell Arrays featuring Parallel computation
compendium-client

9.7 0.0 unicode-transforms VS compendium-client

DISCONTINUED. Mu (μ) is a purely functional framework for building micro services.
cassava

9.7 4.7 unicode-transforms VS cassava

A CSV parsing and encoding library optimized for ease of use and high performance
holmes

9.6 0.0 unicode-transforms VS holmes

A reference library for constraint-solving with propagators and CDCL.
primitive

9.6 5.2 unicode-transforms VS primitive

This package provides various primitive memory-related operations.
binary

9.5 2.8 unicode-transforms VS binary

Efficient, pure binary serialisation using ByteStrings in Haskell.
alfred-margaret

9.5 6.0 unicode-transforms VS alfred-margaret

Fast Aho-Corasick string searching
resource-pool

9.5 0.0 unicode-transforms VS resource-pool

A high-performance striped resource pooling implementation for Haskell
hashable

9.5 3.0 unicode-transforms VS hashable

A class for types that can be converted to a hash value
critbit

9.5 0.0 unicode-transforms VS critbit

A Haskell implementation of crit-bit trees.
refined

9.5 4.8 unicode-transforms VS refined

Refinement types with static checking
network-msgpack-rpc

9.4 unicode-transforms VS network-msgpack-rpc

A MessagePack-RPC Implementation
diskhash

9.4 3.2 unicode-transforms VS diskhash

Diskbased (persistent) hashtable
higgledy

9.4 2.6 unicode-transforms VS higgledy

Higher-kinded data via generics
data-msgpack

9.4 unicode-transforms VS data-msgpack

A Haskell implementation of MessagePack
hashtables

9.4 6.1 unicode-transforms VS hashtables

Mutable hash tables for Haskell, in the ST monad
caledon

9.4 0.0 unicode-transforms VS caledon

Higher-order dependently typed logic programming
aeson-qq

9.4 3.6 unicode-transforms VS aeson-qq

JSON quasiquoter for Haskell
jump

9.4 0.0 unicode-transforms VS jump

Jump start your Haskell development
cereal

9.4 0.0 unicode-transforms VS cereal

A binary serialization library
audiovisual

9.3 4.8 unicode-transforms VS audiovisual

Extensible records, variants, structs, effects, tangles
json-autotype

9.3 0.0 unicode-transforms VS json-autotype

Automatic Haskell type inference from JSON input
IORefCAS

9.3 5.7 unicode-transforms VS IORefCAS

A collection of different packages for CAS based data structures.
dependent-map

9.3 4.7 unicode-transforms VS dependent-map

Dependently-typed finite maps (partial dependent products)
discrimination

9.3 5.0 unicode-transforms VS discrimination

Fast linear time sorting and discrimination for a large class of data types
dependent-sum

9.3 4.3 unicode-transforms VS dependent-sum

Dependent sums and supporting typeclasses for comparing and displaying them
certificate

9.3 0.0 unicode-transforms VS certificate

Certificate and Key Reader/Writer in haskell
orgmode-parse

9.2 0.0 unicode-transforms VS orgmode-parse

Attoparsec parser combinators for parsing org-mode structured text!
uuid-types

9.2 3.9 unicode-transforms VS uuid-types

A Haskell library for creating, printing and parsing UUIDs
reflection

9.2 4.7 unicode-transforms VS reflection

Reifies arbitrary Haskell terms into types that can be reflected back into terms
rei

9.2 0.0 unicode-transforms VS rei

Process lists easily
text-icu

9.2 3.5 unicode-transforms VS text-icu

This package provides the Haskell Data.Text.ICU library, for performing complex manipulation of Unicode text.
calamity

9.2 6.0 unicode-transforms VS calamity

A library for writing discord bots in haskell
protobuf

9.2 2.6 unicode-transforms VS protobuf

An implementation of Google's Protocol Buffers in Haskell.
safecopy

9.2 5.8 unicode-transforms VS safecopy

An extension to Data.Serialize with built-in version control
uuid

9.2 3.9 unicode-transforms VS uuid

A Haskell library for creating, printing and parsing UUIDs
bifunctors

9.2 4.4 unicode-transforms VS bifunctors

Haskell 98 bifunctors, bifoldables and bitraversables
scientific

9.2 0.0 unicode-transforms VS scientific

Arbitrary-precision floating-point numbers represented using scientific notation
avro

9.2 5.2 unicode-transforms VS avro

Haskell Avro Encoding and Decoding Native Support (no RPC)
b-tree

9.1 1.8 unicode-transforms VS b-tree

Haskell on-disk B* tree implementation
witherable

9.1 2.4 unicode-transforms VS witherable

Filter with effects
tables

9.1 0.0 unicode-transforms VS tables

Deprecated because of
streaming

9.1 0.0 unicode-transforms VS streaming

An optimized general monad transformer for streaming applications, with a simple prelude of functions


README

Unicode Transforms

Fast Unicode 13.0.0 normalization in Haskell (NFC, NFKC, NFD, NFKD).

What is normalization?

Unicode characters with adornments (e.g. Á) can be represented in two different forms: as a single composed character (U+00C1 = Á) or as a sequence of decomposed characters (U+0041 (A) U+0301 ( ́ ) = Á). The two forms are encoded as different byte sequences, but to a human reader they look exactly the same.

A plain byte comparison may report that two strings are different even though they are equivalent. To compare them for equivalence, we first need to convert both strings to the same normalized form using the Unicode Character Database. For example:

>> :set -XOverloadedStrings
>> import Data.Text.Normalize
>> normalize NFC "\193" == normalize NFC "\65\769"
True
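
The same check can be written as a small standalone program. This is only a sketch: it assumes the `text` and `unicode-transforms` packages are available, and `canonicallyEqual` is a hypothetical helper name introduced here for illustration, not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.Text as T
import Data.Text.Normalize (NormalizationMode (..), normalize)

-- | Hypothetical helper (not provided by unicode-transforms):
-- two texts are treated as equivalent if their NFC forms are equal.
canonicallyEqual :: T.Text -> T.Text -> Bool
canonicallyEqual a b = normalize NFC a == normalize NFC b

main :: IO ()
main = do
  let composed   = "\193"    :: T.Text  -- U+00C1, single composed character
      decomposed = "\65\769" :: T.Text  -- U+0041 followed by U+0301
  -- Raw comparison looks at the code points as stored.
  print (composed == decomposed)
  -- Comparison after normalizing both sides to NFC.
  print (canonicallyEqual composed decomposed)
  -- Number of code points after decomposing the composed form with NFD.
  print (T.length (normalize NFD composed))
```

Normalizing both sides to the same form before comparing is the important part; NFC is used here only because the example above uses it.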

Performance

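The table below compares the normalization performance of this package (v0.3.7) with that of the text-icu package, which uses the ICU C++ library (ICU4C 65.1), on macOS. The benchmarks measure the time in milliseconds taken to normalize files in different languages and normalization forms with each package. A positive % Diff means unicode-transforms is faster; a negative value means ICU is faster. unicode-transforms outperforms ICU in 19 of the 28 benchmarks; every case where ICU is ahead is an NFC or NFKC benchmark.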

Benchmark       unicode-transforms(ms) ICU(ms)    % Diff
--------------- ---------------------- -------   --------
NFKD/Korean                       7.78   37.10    +376.87
NFD/Korean                        7.86   37.06    +371.50
NFKD/Vietnamese                   6.85   12.48     +82.20
NFKD/Deutsch                      2.17    3.55     +63.30
NFKD/English                      1.71    2.78     +62.30
NFKC/Korean                       4.77    7.65     +60.28
NFD/Deutsch                       2.24    3.53     +57.41
NFD/English                       1.76    2.77     +57.32
NFC/Vietnamese                   10.66   16.63     +56.00
NFKC/Vietnamese                  10.95   16.58     +51.43
NFD/Devanagari                    6.48    8.68     +34.10
NFC/Devanagari                    6.77    8.49     +25.48
NFD/AllChars                      6.18    7.41     +19.91
NFD/Japanese                      7.80    9.20     +17.99
NFKC/Devanagari                   7.33    8.48     +15.74
NFKD/Japanese                     8.71   10.05     +15.39
NFD/Vietnamese                    5.94    6.83     +14.99
NFKD/Devanagari                   7.59    8.68     +14.27
NFKD/AllChars                     9.80   10.66      +8.82
NFKC/Deutsch                      3.21    3.18      -0.72
NFC/Korean                        4.62    4.38      -5.35
NFKC/English                      2.21    2.06      -6.88
NFC/English                       2.19    2.04      -7.21
NFKC/AllChars                    14.67    9.75     -50.51
NFC/Deutsch                       3.02    1.95     -54.39
NFKC/Japanese                    12.46    5.42    -129.93
NFC/AllChars                      9.72    3.58    -171.63
NFC/Japanese                     11.90    3.04    -292.04
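
The README does not say how the % Diff column is computed. The figures are consistent with a signed difference taken relative to the faster of the two timings, so the sketch below assumes that formula; `percentDiff` is a hypothetical name. Because the table shows rounded timings, recomputing from them reproduces the published values only approximately.

```haskell
module Main (main) where

import Text.Printf (printf)

-- | Assumed formula (not confirmed by the README): the difference
-- between the two timings as a percentage of the faster one,
-- positive when unicode-transforms is faster and negative when
-- ICU is faster.
percentDiff
  :: Double  -- ^ unicode-transforms time in ms
  -> Double  -- ^ ICU time in ms
  -> Double
percentDiff ut icu
  | ut <= icu = (icu - ut) / ut * 100
  | otherwise = negate ((ut - icu) / icu * 100)

main :: IO ()
main = do
  -- NFKD/Korean row: unicode-transforms is faster, so the result is positive.
  printf "%+.2f\n" (percentDiff 7.78 37.10)
  -- NFC/Japanese row: ICU is faster, so the result is negative.
  printf "%+.2f\n" (percentDiff 11.90 3.04)
```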