JustParse alternatives and similar packages
Based on the "Text" category.
pandoc-citeproc
Library and executable for using citeproc with pandoc
scholdoc
Fork of Pandoc for the implementation of a ScholarlyMarkdown parser
blaze-from-html
A blazingly fast HTML combinator library for Haskell.
skylighting
A Haskell syntax highlighting library with tokenizers derived from KDE syntax highlighting descriptions
prettyprinter
A modern, extensible and well-documented prettyprinter.
regex-genex
Given a list of regexes, generate all possible strings that matches all of them.
commonmark
Pure Haskell commonmark parsing library, designed to be flexible and extensible
regex-applicative
Regex-based parsing with an applicative interface
pandoc-csv2table
A Pandoc filter that renders CSV as Pandoc Markdown Tables.
servant-checked-exceptions
type-level errors for Servant APIs.
pretty-show
Tools for working with derived Show instances in Haskell.
text-format
A Haskell text formatting library optimized for ease of use and high performance.
diagrams-pandoc
A pandoc filter to express diagrams inline using the haskell EDSL diagrams.
double-conversion
A fast Haskell library for converting between double precision floating point numbers and text strings. It is implemented as a binding to the V8-derived C++ double-conversion library.
boxes
A pretty-printing library for laying out text in two dimensions, using a simple box model.
ghc-syntax-highlighter
Syntax highlighter for Haskell using the lexer of GHC
README
JustParse
A simple and comprehensive Haskell parsing library
Differences from and similarities to Parsec and Attoparsec
Similarities to Parsec
- Allows for parsing arbitrary Streams
- Makes extensive use of combinators
Similarities to Attoparsec
- Allows for returning partial results
- Is not a monad transformer
Differences from both
- Returns a list of all possible parses
- Allows for conversion of a regular expression to a parser
Non-greedy parsing
The ability to return all possible parses is the most important of these
differences. In both Parsec and Attoparsec, parsers such as "many" are
greedy. That is, they will consume as much input as possible. This makes
writing a parser equivalent to the regular expression a[ab]*a a bit tricky.
We would be tempted to write:
p = do
    a <- char 'a'
    b <- many (oneOf "ab")
    c <- char 'a'
    return (a,b,c)
The problem is that the many (oneOf "ab") parser is greedy, and will consume
the final 'a' that the last char 'a' term is meant to bind to c, resulting in
a failed parse. We could write this using a combination of the try,
notFollowedBy, and lookAhead parsers, but that doesn't capture the elegance
of "parse an 'a', then some 'a's or 'b's, then an 'a'".
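The failure can be reproduced without Parsec or Attoparsec. The sketch below is only an analogy: it uses ReadP from base rather than either library, and it uses munch, which is ReadP's greedy repetition, in place of a greedy many. The names greedyP and greedyResults are made up for this illustration.

```haskell
import Text.ParserCombinators.ReadP (ReadP, char, munch, readP_to_S)

-- Greedy analogue of a[ab]*a. munch takes every leading 'a' or 'b' it can
-- and never gives any of them back, so the final char 'a' has nothing left.
greedyP :: ReadP (Char, String, Char)
greedyP = do
    a <- char 'a'
    b <- munch (`elem` "ab")
    c <- char 'a'
    return (a, b, c)

-- Every successful parse paired with its unconsumed input;
-- the list is empty when the parser fails.
greedyResults :: String -> [((Char, String, Char), String)]
greedyResults = readP_to_S greedyP
```

Evaluating greedyResults "abaaba" in GHCi gives an empty list, because munch has already consumed the trailing 'a' by the time the last char 'a' runs.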
JustParse removes this problem with its ability to return all possible
parses. That same parser in JustParse (with many changed to many_), applied
to the input abaaba, would return:
('a', "b", 'a')
('a', "ba", 'a')
('a', "baab", 'a')
Partial
The Partial result represents the branch of the parse tree in which the
many_ term consumes all available input. Supplying it with more input, such
as a, would yield an additional result of ('a', "baaba", 'a') (and another
Partial), since it will resume parsing.
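The same set of complete parses can be observed with ReadP from base, whose many also explores every possible repetition count. This is a stand-in, not JustParse's API: allP and allResults are names invented for the sketch, and ReadP has no counterpart to Partial because readP_to_S is handed the whole input at once.

```haskell
import Text.ParserCombinators.ReadP (ReadP, char, many, satisfy, readP_to_S)

-- a[ab]*a with a non-greedy repetition: ReadP's many tries every
-- possible number of repetitions instead of committing to the longest.
allP :: ReadP (Char, String, Char)
allP = do
    a <- char 'a'
    b <- many (satisfy (`elem` "ab"))
    c <- char 'a'
    return (a, b, c)

-- All parses of the input, with the leftover input dropped.
allResults :: String -> [(Char, String, Char)]
allResults = map fst . readP_to_S allP
```

Applied to "abaaba", allResults yields the three complete parses listed above. Applied to "abaabaa", it also yields ('a', "baaba", 'a'), mirroring what happens when the Partial is supplied with one more a.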
For compatibility reasons, the parsers many, sepBy, etc. operate as they do
in Parsec and Attoparsec. To use the variants that return all possible
parses, merely append an underscore, as in many_ and sepBy_.
For general-purpose parse branching, one may use the branch function, or
its infix form, <||>.
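As a rough picture of what branching means, the sketch below uses the symmetric choice operator (+++) from ReadP in base. It assumes that branch behaves like a symmetric choice that keeps the results of both alternatives; the JustParse operator itself is not used here, and bothP and bothResults are hypothetical names.

```haskell
import Text.ParserCombinators.ReadP (ReadP, string, (+++), readP_to_S)

-- Symmetric choice: both alternatives are explored and neither
-- shadows the other, even though "ab" is a prefix of "abc".
bothP :: ReadP String
bothP = string "ab" +++ string "abc"

-- Each successful parse paired with its unconsumed input.
bothResults :: String -> [(String, String)]
bothResults = readP_to_S bothP
```

For the input "abcd" this produces two results, one where "ab" matched and "cd" remains, and one where "abc" matched and "d" remains.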
Regex convenience
JustParse provides the regex parser. This parser is of the type
Stream s Char => Parser s Match. A Match object contains all of the text
matched within it, and a list of Match objects which represent any
subgroups (which may themselves contain subgroups, etc). These regular
expressions are truly regular in that they do not have backreferences (for
now). If one only wants the entirety of the matched text, the regex' parser
will do that. Example:
p = regex' "ab+cd?"
is equivalent to the standard parser:
p = do
    a <- char 'a'
    b <- many1 (char 'b')
    c <- char 'c'
    d <- option "" (string "d")
    return (a:b++c:d)
So for small String parsers, or for use in larger parsers, the regex or
regex' parsers prove very convenient.