Releases: commonmark/cmark
cmark 0.24.0
- [API change] Added cmark_node_replace(oldnode, newnode).
- Updated spec.txt to 0.24.
- Fixed edge case with escaped parens in link destination (#97).
  This was also checked against the #82 case with asan.
- Removed unnecessary check for fenced in cmark_render_html.
  It's sufficient to check that the info string is empty.
  Indeed, those who use the API may well create a code block
  with an info string without explicitly setting fenced.
- Updated format of test/smart_punct.txt.
- Updated test/spec.txt, test/smart_punct.txt, and
  spec_tests.py to new format.
- Fixed get_containing_block logic in src/commonmark.c.
  This did not allow for the possibility that a node might have no
  containing block, causing the commonmark renderer to segfault if
  passed an inline node with no block parent.
- Fixed string representations of CUSTOM_BLOCK, CUSTOM_INLINE.
  The old versions raw_inline and raw_block were being used,
  and this led to incorrect xml output.
- Use default opts in python sample wrapper.
- Allow multiline setext header content, as per spec.
- Don't allow spaces in link destinations, even with pointy brackets.
  Conforms to latest change in spec.
- Updated scheme scanner according to spec change. We no longer use
  a whitelist of valid schemes.
- Allow any kind of nodes as children of CUSTOM_BLOCK (#96).
- cmark.h: moved typedefs for iterator into iterator section.
  This just moves some code around so it makes more sense
  to read, and in the man page.
- Fixed make_man_page.py so it includes typedefs again.
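
The info-string check mentioned for cmark_render_html can be illustrated with a minimal Python sketch. This is not cmark's code: the helper name render_code_block and the exact escaping are assumptions made for illustration. The point it shows is that the language class is chosen from the info string alone, with no separate fenced flag.

```python
import html


def render_code_block(literal, info=""):
    """Render a code block as HTML, keyed only on the info string.

    Hypothetical helper: no 'fenced' flag is consulted, so a block built
    through an API with just an info string still gets a language class.
    """
    words = info.split()
    if not words:
        open_tag = "<pre><code>"
    else:
        # Assumption: only the first word of the info string names the language.
        language = html.escape(words[0], quote=True)
        open_tag = '<pre><code class="language-%s">' % language
    return open_tag + html.escape(literal, quote=False) + "</code></pre>\n"
```

A caller that sets only the info string, for example render_code_block("puts 1\n", "ruby"), gets the class attribute without having to mark the block as fenced.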
cmark 0.23
- [API change] Added CUSTOM_BLOCK and CUSTOM_INLINE node types.
  They are never generated by the parser, and do not correspond
  to CommonMark elements. They are designed to be inserted by
  filters that postprocess the AST. For example, a filter might
  convert specially marked code blocks to svg diagrams in HTML
  and tikz diagrams in LaTeX, passing these through to the renderer
  as a CUSTOM_BLOCK. These nodes can have children, but they
  also have literal text to be printed by the renderer "on enter"
  and "on exit." Added cmark_node_get_on_enter,
  cmark_node_set_on_enter, cmark_node_get_on_exit,
  cmark_node_set_on_exit to API.
- [API change] Rename NODE_HTML -> NODE_HTML_BLOCK,
  NODE_INLINE_HTML -> NODE_HTML_INLINE. Define aliases
  so the old names still work, for backwards compatibility.
- [API change] Rename CMARK_NODE_HEADER -> CMARK_NODE_HEADING.
  Note that for backwards compatibility, we have defined aliases:
  CMARK_NODE_HEADER = CMARK_NODE_HEADING,
  cmark_node_get_header_level = cmark_node_get_heading_level, and
  cmark_node_set_header_level = cmark_node_set_heading_level.
- [API change] Rename CMARK_NODE_HRULE -> CMARK_NODE_THEMATIC_BREAK.
  Defined the former as the latter for backwards compatibility.
- Don't allow space between link text and link label in a reference link
  (spec change).
- Separate parsing and rendering opts in cmark.h (#88).
  This change also changes some of these constants' numerical values,
  but nothing should change in the API if you use the constants
  themselves. It should now be clear in the man page which
  options affect parsing and which affect rendering.
- xml renderer - Added xmlns attribute to document node (commonmark/commonmark-spec#87).
- Commonmark renderer: ensure html blocks surrounded by blanks.
  Otherwise we get failures of roundtrip tests.
- Commonmark renderer: ensure that literal characters get escaped
  when they're at the beginning of a block, e.g. > \- foo.
- LaTeX renderer - better handling of internal links.
  Now we render [foo](#bar) as \protect\hyperlink{bar}{foo}.
- Check for NULL pointer in _scan_at (#81).
- Makefile.nmake: be more robust when cmake is missing. Previously,
  when cmake was missing, the build dir would be created anyway, and
  subsequent attempts (even with cmake) would fail, because cmake would
  not be run. Depending on build/CMakeFiles is more robust -- this won't
  be created unless cmake is run. Partially addresses #85.
- Fixed DOCTYPE in xml output.
- commonmark.c: fix size_t to int. This fixes an MSVC warning
  "conversion from 'size_t' to 'int', possible loss of data" (Kevin Wojniak).
- Correct string length in cmark_parse_document example (Lee Jeffery).
- Fix non-ASCII end-of-line character check (andyuhnak).
- Fix "declaration shadows a local variable" (Kevin Wojniak).
- Install static library (commonmark/commonmark-spec#381).
- Fix warnings about dropping const qualifier (Kevin Wojniak).
- Use full (unabbreviated) versions of constants (CMARK_...).
- Removed outdated targets from Makefile.
- Removed need for sudo in make bench.
- Improved benchmark. Use longer test, since time has limited resolution.
- Removed bench.h and timing calls in main.c.
- Updated API docs; getters return empty strings if not set
  rather than NULL, as previously documented.
- Added api_tests for custom nodes.
- Made roundtrip test part of the test suite run by cmake.
- Regenerate scanners.c using re2c 0.15.3.
- Adjusted scanner for link url. This fixes a heap buffer overflow (#82).
- Added version number (1.0) to XML namespace. We don't guarantee
  stability in this until 1.0 is actually released, however.
- Removed obsolete TIMER macro.
- Make LIB_INSTALL_DIR configurable (Mathieu Bridon, #79).
- Removed out-of-date luajit wrapper.
- Use input, not parser->curline to determine last line length.
- Small optimizations in _scan_at.
- Replaced hard-coded 4 with TAB_STOP.
- Have make format reformat api tests as well.
- Added api tests for man, latex, commonmark, and xml renderers (#51).
- render.c: added begin_content field. This is like begin_line except
  that it doesn't trigger production of the prefix. So it can be set
  after an initial prefix (say >) is printed by the renderer, and
  consulted in determining whether to escape content that has a special
  meaning at the beginning of a line. Used in the commonmark renderer.
- Python 3.5 compatibility: don't require HTMLParseError (Zhiming Wang).
  HTMLParseError was removed in Python 3.5. Since it could never be thrown
  in Python 3.5+, we simply define a placeholder when HTMLParseError
  cannot be imported.
- Set convert_charrefs=False in normalize.py (#83). This defeats the
  new default as of python 3.5, and allows the script to work with python
  3.5.
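
The two Python 3.5 items above can be sketched as follows. This is a minimal illustration rather than the contents of normalize.py: the class EventCollector and the function collect are hypothetical names, and the sketch targets Python 3 only. It shows the import fallback for HTMLParseError and a parser constructed with convert_charrefs=False so that entity references arrive as their own events.

```python
from html.parser import HTMLParser

try:
    # Available only in older Python 3 releases; removed in Python 3.5.
    from html.parser import HTMLParseError
except ImportError:
    class HTMLParseError(Exception):
        """Placeholder so existing 'except HTMLParseError' clauses still work."""


class EventCollector(HTMLParser):
    """Hypothetical parser that records text and entity references."""

    def __init__(self):
        # convert_charrefs=False keeps entity references out of handle_data.
        HTMLParser.__init__(self, convert_charrefs=False)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))


def collect(text):
    """Parse text and return the recorded events, or None on a parse error."""
    parser = EventCollector()
    try:
        parser.feed(text)
        parser.close()
    except HTMLParseError:
        return None
    return parser.events
```

With the default convert_charrefs=True, the entity would instead be decoded and merged into the surrounding text, which is the behaviour the changelog entry says the script needs to avoid.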
cmark 0.22.0
- Removed pre from blocktags scanner. pre is handled separately
  in rule 1 and needn't be handled in rule 6.
- Added iframe to list of blocktags, as per spec change.
- Fixed bug with HRULE after blank line. This previously caused cmark
  to break out of a list, thinking it had two consecutive blanks.
- Check for empty string before trying to look at line ending.
- Make sure every line fed to S_process_line ends with \n (#72).
  So S_process_line sees only unix style line endings. Ultimately we
  probably want a better solution, allowing the line ending style of
  the input file to be preserved. This solution forces output with newlines.
- Improved cmark_strbuf_normalize_whitespace (#73). Now all characters
  that satisfy cmark_isspace are recognized as whitespace. Previously
  \r and \t (and others) weren't included.
- Treat line ending with EOF as ending with newline (#71).
- Fixed --hardbreaks with \r\n line breaks (#68).
- Disallow list item starting with multiple blank lines (commonmark/commonmark-spec#332).
- Allow tabs before closing #s in ATX header.
- Removed cmark_strbuf_printf and cmark_strbuf_vprintf.
  These are no longer needed, and cause complications for MSVC.
  Also removed HAVE_VA_COPY and HAVE_C99_SNPRINTF feature tests.
- Added option to disable tests (Kevin Wojniak).
- Added CMARK_INLINE macro.
- Removed need to disable MSVC warnings 4267, 4244, 4800
  (Kevin Wojniak).
- Fixed MSVC inline errors when cmark is included in sources that
  don't have the same set of disabled warnings (Kevin Wojniak).
- Fix FileNotFoundError errors on tests when cmark is built from
  another project via add_subdirectory() (Kevin Wojniak).
- Prefix utf8proc functions to avoid conflict with existing library
  (Kevin Wojniak).
- Avoid name clash between Windows .pdb files (Nick Wellnhofer).
- Improved smart_punct.txt (see commonmark/commonmark.js#61).
- Set POSITION_INDEPENDENT_CODE ON for static library (see #39).
- make bench: allow overriding BENCHFILE. Previously if you did
  this, it would clobber BENCHFILE with the default bench file.
- make bench: Use -10 priority with renice.
- Improved make_autolink. Ensures that title is a chunk with an empty
  string rather than NULL, as with other links.
- Added clang-check target.
- Travis: split roundtrip_test and leakcheck (OGINO Masanori).
- Use clang-format, llvm style, for formatting. Reformatted all source files.
  Added format target to Makefile. Removed astyle target.
  Updated .editorconfig.
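
The whitespace normalization change above (#73) can be sketched in Python. This is an illustration under stated assumptions, not the C implementation: the function name normalize_whitespace and the WHITESPACE set (the six characters a C-locale isspace check accepts) are assumptions, as is the choice to collapse each run to a single space without trimming.

```python
# Assumption: the characters accepted by a C-locale isspace-style check.
WHITESPACE = " \t\n\v\f\r"


def normalize_whitespace(text):
    """Collapse every run of whitespace characters into one space."""
    out = []
    last_was_space = False
    for ch in text:
        if ch in WHITESPACE:
            # Emit a single space for the first character of a run only.
            if not last_was_space:
                out.append(" ")
            last_was_space = True
        else:
            out.append(ch)
            last_was_space = False
    return "".join(out)
```

Treating \r and \t the same as a space matters for input with Windows line endings or tab-separated words, which would otherwise survive normalization unchanged.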
cmark 0.21.0
- Updated to version 0.21 of spec.
- Added latex renderer (#31). New exported function in API:
  cmark_render_latex. New source file: src/latex.c.
- Updates for new HTML block spec. Removed old html_block_tag scanner.
  Added new html_block_start and html_block_start_7, as well
  as html_block_end_n for n = 1-5. Rewrote block parser for new HTML
  block spec.
- We no longer preprocess tabs to spaces before parsing.
  Instead, we keep track of both the byte offset and
  the (virtual) column as we parse block starts.
  This allows us to handle tabs without converting
  to spaces first. Tabs are left as tabs in the output, as
  per the revised spec.
- Removed utf8 validation by default. We now replace null characters
  in the line splitting code.
- Added CMARK_OPT_VALIDATE_UTF8 option and command-line option
  --validate-utf8. This option causes cmark to check for valid
  UTF-8, replacing invalid sequences with the replacement
  character, U+FFFD. Previously this was done by default in
  connection with tab expansion, but we no longer do it by
  default with the new tab treatment. (Many applications will
  know that the input is valid UTF-8, so validation will not
  be necessary.)
- Added CMARK_OPT_SAFE option and --safe command-line flag.
  - Added CMARK_OPT_SAFE. This option disables rendering of raw HTML
    and potentially dangerous links.
  - Added --safe option in command-line program.
  - Updated cmark.3 man page.
  - Added scan_dangerous_url to scanners.
  - In HTML, suppress rendering of raw HTML and potentially dangerous
    links if CMARK_OPT_SAFE. Dangerous URLs are those that begin
    with javascript:, vbscript:, file:, or data: (except for
    image/png, image/gif, image/jpeg, or image/webp mime types).
  - Added api_test for CMARK_OPT_SAFE.
  - Rewrote README.md on security.
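
The dangerous-URL rule listed for CMARK_OPT_SAFE can be sketched in Python. This is not the re2c scanner that cmark uses: the function name is_dangerous_url is hypothetical, and case-insensitive matching is an assumption not stated in the entry above.

```python
# MIME types that the changelog entry exempts for data: URLs.
SAFE_DATA_TYPES = ("image/png", "image/gif", "image/jpeg", "image/webp")


def is_dangerous_url(url):
    """Return True if url starts with a scheme considered dangerous."""
    lowered = url.lower()  # assumption: schemes are matched case-insensitively
    if lowered.startswith(("javascript:", "vbscript:", "file:")):
        return True
    if lowered.startswith("data:"):
        rest = lowered[len("data:"):]
        # data: is dangerous unless it carries one of the exempt image types.
        return not rest.startswith(SAFE_DATA_TYPES)
    return False
```

A renderer in safe mode would consult such a check before emitting a link or image destination and drop the URL when it returns True.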
- Limit ordered list start to 9 digits, per spec.
- Added width parameter to render_man (API change).
- Extracted common renderer code from latex, man, and commonmark
  renderers into a separate module, renderer.[ch] (#63). To write a
  renderer now, you only need to write a character escaping function
  and a node rendering function. You pass these to cmark_render
  and it handles all the plumbing (including line wrapping) for you.
  So far this is an internal module, but we might consider adding
  it to the API in the future.
- commonmark writer: correctly handle email autolinks.
- commonmark writer: escape !.
- Fixed soft breaks in commonmark renderer.
- Fixed scanner for link url. re2c returns the longest match, so we
  were getting bad results with [link](foo\(and\(bar\)\))
  which it would parse as containing a bare \ followed by
  an in-parens chunk ending with the final paren.
- Allow non-initial hyphens in html tag names. This allows for
  custom tags, see commonmark/commonmark-spec#239.
- Updated test/smart_punct.txt.
- Implemented new treatment of hyphens with --smart, converting
  sequences of hyphens to sequences of em and en dashes that contain no
  hyphens.
- HTML renderer: properly split info on first space char (see
  commonmark/commonmark.js#54).
- Changed version variables to functions (#60, Andrius Bentkus).
  This is easier to access using ffi, since some languages, like C#,
  prefer to use only function interfaces for accessing library
  functionality.
- process_emphasis: Fixed setting lower bound to potential openers.
  Renamed potential_openers -> openers_bottom.
  Renamed start_delim -> stack_bottom.
- Added case for #59 to pathological_test.py.
- Fixed emphasis/link parsing bug (#59).
- Fixed off-by-one error in line splitting routine.
  This caused certain NULLs not to be replaced.
- Don't rtrim in subject_from_buffer. This gives bad results in
  parsing reference links, where we might have trailing blanks
  (finalize removes the bytes parsed as a reference definition;
  before this change, some blank bytes might remain on the line).
  - Added column and first_nonspace_column fields to parser.
  - Added utility function to advance the offset, computing
    the virtual column too. Note that we don't need to deal with
    UTF-8 here at all. Only ASCII occurs in block starts.
  - Significant performance improvement due to the fact that
    we're not doing UTF-8 validation.
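
The idea of tracking a byte offset and a virtual column side by side, instead of expanding tabs up front, can be sketched as follows. This is an illustrative Python version, not the C utility function the entry refers to: the name advance, its signature, and the tab stop of 4 are assumptions made for the example.

```python
TAB_STOP = 4  # assumption: tabs advance to the next multiple of 4 columns


def advance(line, offset, column, count):
    """Advance count characters from offset, returning (offset, column).

    The offset counts characters consumed; the column counts the width
    they occupy once each tab is treated as reaching the next tab stop.
    Only ASCII is expected here, so one character is one byte.
    """
    for _ in range(count):
        if offset >= len(line):
            break
        if line[offset] == "\t":
            column += TAB_STOP - (column % TAB_STOP)
        else:
            column += 1
        offset += 1
    return offset, column
```

Because the line itself is never rewritten, a tab that is not consumed as indentation can be left as a tab in the output.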
- Fixed entity lookup table. The old one had many errors.
  The new one is derived from the list in the npm entities package.
  Since the sequences can now be longer (multi-code-point), we
  have bumped the length limit from 4 to 8, which also affects
  houdini_html_u.c. An example of the kind of error that was fixed:
  &ngE; should be rendered as "≧̸" (U+02267 U+00338), but it was
  being rendered as "≧" (which is the same as &gE;).
- Replace gperf-based entity lookup with binary tree lookup.
  The primary advantage is a big reduction in the size of
  the compiled library and executable (> 100K).
  There should be no measurable performance difference in
  normal documents. I detected only a slight performance
  hit in a file containing 1,000,000 entities.
  - Removed src/html_unescape.gperf and src/html_unescape.h.
  - Added src/entities.h (generated by tools/make_entities_h.py).
  - Added binary tree lookup functions to houdini_html_u.c, and
    use the data in src/entities.h.
  - Renamed entities.h -> entities.inc, and
    tools/make_entities_h.py -> tools/make_entities_inc.py.
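
To illustrate the kind of lookup that replaces a generated perfect hash, here is a small Python sketch. It uses a binary search over a sorted table rather than an explicit tree, which is a substitution made for brevity; the five-entry table and the name lookup_entity are assumptions for illustration, not the contents of entities.inc.

```python
from bisect import bisect_left

# Tiny excerpt for illustration, sorted by entity name (code point order).
ENTITIES = [
    ("amp", "&"),
    ("gE", "\u2267"),
    ("gt", ">"),
    ("lt", "<"),
    ("ngE", "\u2267\u0338"),  # multi-code-point replacement
]
NAMES = [name for name, _ in ENTITIES]


def lookup_entity(name):
    """Return the replacement text for an entity name, or None if unknown."""
    i = bisect_left(NAMES, name)
    if i < len(NAMES) and NAMES[i] == name:
        return ENTITIES[i][1]
    return None
```

The data is a plain sorted table, so no hash function needs to be generated and compiled in; each lookup costs a logarithmic number of string comparisons.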
- Fixed cases like

      [ref]: url
      "title" ok

  Here we should parse the first line as a reference.
- inlines.c: Added utility functions to skip spaces and line endings.
- Fixed backslashes in link destinations that are not part of escapes
  (commonmark/commonmark-spec#45).
- process_line: Removed "add newline if line doesn't have one."
  This isn't actually needed.
- Small logic fixes and a simplification in process_emphasis.
- Added more pathological tests:
  - Many link closers with no openers.
  - Many link openers with no closers.
  - Many emph openers with no closers.
  - Many closers with no openers.
  - "*a_ " * 20000.
- Fixed process_emphasis to handle new pathological cases.
  Now we have an array of pointers (potential_openers),
  keyed to the delim char. When we've failed to match a potential opener
  prior to point X in the delimiter stack, we reset potential_openers
  for that opener type to X, and thus avoid having to look again through
  all the openers we've already rejected.
- process_inlines: remove closers from delim stack when possible.
  When they have no matching openers and cannot be openers themselves,
  we can safely remove them. This helps with a performance case:
  "a_ " * 20000 (commonmark/commonmark.js#43).
- Roll utf8proc_charlen into utf8proc_valid (Nick Wellnhofer).
  Speeds up "make bench" by another percent.
- spec_tests.py: allow → for tab in HTML examples.
- normalize.py: don't collapse whitespace in pre contexts.
- Use utf-8 aware re2c.
- Makefile afl target: removed -m none, added CMARK_OPTS.
- README: added make afl instructions.
- Limit generated cmark.3 to 72 character line width.
- Travis: switched to containerized build system.
- Removed debug.h. (It uses GNU extensions, and we don't need it anyway.)
- Removed sundown from benchmarks, because the reading was anomalous.
  sundown had an arbitrary 16MB limit on buffers, and the benchmark
  input exceeded that. So who knows what we were actually testing?
  Added hoedown, sundown's successor, which is a better comparison.
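
The --smart hyphen treatment in this release is described only by its result: runs of hyphens become em and en dashes with no hyphens left over. The Python sketch below shows one partitioning rule with that property. The rule itself, the function name smart_hyphens, and leaving a single hyphen untouched are assumptions; test/smart_punct.txt is the authority on what cmark actually produces.

```python
EM_DASH = "\u2014"
EN_DASH = "\u2013"


def smart_hyphens(count):
    """Convert a run of count hyphens to em/en dashes (illustrative rule)."""
    if count < 2:
        return "-" * count  # assumption: a lone hyphen is left as-is
    if count % 3 == 0:
        em, en = count // 3, 0          # all em dashes
    elif count % 2 == 0:
        em, en = 0, count // 2          # all en dashes
    elif count % 3 == 2:
        em, en = (count - 2) // 3, 1    # em dashes, then one en dash
    else:
        em, en = (count - 4) // 3, 2    # em dashes, then two en dashes
    return EM_DASH * em + EN_DASH * en
```

Each em dash accounts for three hyphens and each en dash for two, so every run of two or more hyphens is consumed exactly.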
cmark 0.20.0
- Fixed bug in list item parsing when items indented >= 4 spaces (#52).
- Don't allow link labels with no non-whitespace characters
  (commonmark/commonmark-spec#322).
- Fixed multiple issues with numeric entities (#33, Nick Wellnhofer).
- Support CR and CRLF line endings (Ben Trask).
- Added test for different line endings to api_test.
- Allow NULL value in string setters (Nick Wellnhofer). (NULL
  produces a 0-length string value.) Internally, URL and
  title are now stored as cmark_chunk rather than char *.
- Fixed memory leak in cmark_consolidate_text_nodes (#32).
- Fixed is_autolink in the CommonMark renderer (#50). Previously any
  link with an absolute URL was treated as an autolink.
- Cope with broken snprintf on Windows (Nick Wellnhofer). On Windows,
  snprintf returns -1 if the output was truncated. Fall back to
  Windows-specific _scprintf.
- Switched length parameter on cmark_markdown_to_html,
  cmark_parser_feed, and cmark_parse_document from int
  to size_t (#53, Nick Wellnhofer).
- Use a custom type bufsize_t for all string sizes and indices.
  This allows switching to 64-bit string buffers by changing a single
  typedef and a macro definition (Nick Wellnhofer).
- Hardened the strbuf code, checking for integer overflows and
  adding range checks (Nick Wellnhofer).
- Removed unused function cmark_strbuf_attach (Nick Wellnhofer).
- Fixed all implicit 64-bit to 32-bit conversions that
  -Wshorten-64-to-32 warns about (Nick Wellnhofer).
- Added helper function cmark_strbuf_safe_strlen that converts
  from size_t to bufsize_t and throws an error in case of
  an overflow (Nick Wellnhofer).
- Abort on strbuf out of memory errors (Nick Wellnhofer).
  Previously such errors were not being trapped. This involves
  some internal changes to the buffer library that do not affect
  the API.
- Factored out S_find_first_nonspace in S_process_line.
  Added fields offset, first_nonspace, indent, and blank
  to cmark_parser struct. This just removes some repetition.
- Added Racket (5.3+) wrapper (Eli Barzilay).
- Removed -pg from Debug build flags (#47).
- Added Ubsan build target, to check for undefined behavior.
- Improved make leakcheck. We now return an error status if anything
  in the loop fails. We now check --smart and --normalize options.
- Removed wrapper3.py, made wrapper.py work with python 2 and 3.
  Also improved the wrapper to work with Windows, and to use smart
  punctuation (as an example).
- In wrapper.rb, added argument for options.
- Revised luajit wrapper.
- Added build status badges to README.md.
- Added links to go, perl, ruby, R, and Haskell bindings to README.md.
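
Support for CR and CRLF line endings comes down to recognising three terminators when splitting input into lines. The Python sketch below shows one way to do that; the function name split_lines is hypothetical and the code is an illustration of the technique, not a transcription of the parser.

```python
def split_lines(text):
    """Split text on LF, CR, or CRLF, returning lines without terminators."""
    lines = []
    start = 0
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\r" or ch == "\n":
            end = i
            # Treat a CR immediately followed by LF as a single terminator.
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1
            lines.append(text[start:end])
            start = i + 1
        i += 1
    # A final line without a terminator still counts as a line.
    if start < n:
        lines.append(text[start:])
    return lines
```

Handling CRLF as one unit is what keeps a Windows-style blank line from being counted as two empty lines.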
cmark 0.19.0
- Fixed _ emphasis parsing to conform to spec (commonmark/commonmark-spec#317).
- Updated spec.txt.
- Compile static library with -DCMARK_STATIC_DEFINE (Nick Wellnhofer).
- Suppress warnings about Windows runtime library files (Nick Wellnhofer). Visual Studio Express editions do not include the redistributable files. Set CMAKE_INSTALL_SYSTEM_RUNTIME_LIBS_NO_WARNINGS to suppress warnings.
- Added appveyor: Windows continuous integration (appveyor.yml).
- Use os.path.join in test/cmark.py for proper cross-platform paths.
- Fixed Makefile.nmake.
- Improved make afl: added test/afl_dictionary, increased timeout for hangs.
- Improved README with a description of the library's strengths.
- Pass-through Unicode non-characters (Nick Wellnhofer). Despite their name, Unicode non-characters are valid code points. They should be passed through by a library like libcmark.
- Check return status of utf8proc_iterate (#27).
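
The point about non-characters can be made concrete with a short Python sketch. The function names are hypothetical; the ranges are those Unicode designates as noncharacters (U+FDD0 to U+FDEF, plus the last two code points of every plane). The sketch separates "is a noncharacter" from "is a valid code point to pass through", and is not taken from libcmark's validation code.

```python
def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF in any plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_passable(cp):
    """True if cp is a Unicode scalar value, noncharacters included."""
    if cp < 0 or cp > 0x10FFFF:
        return False
    # Surrogates are the only code points that cannot appear in UTF-8.
    return not (0xD800 <= cp <= 0xDFFF)
```

Under this distinction U+FFFE is both a noncharacter and passable, and it has an ordinary three-byte UTF-8 encoding.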
cmark 0.18.3
- Include patch level in soname (Nick Wellnhofer). Minor version is tied to spec version, so this allows breaking the ABI between spec releases.
- Install compiler-provided system runtime libraries (Changjiang Yang).
- Use strbuf_printf instead of snprintf. snprintf is not available on some platforms (Visual Studio 2013 and earlier).
- Fixed memory access bug: "invalid read of size 1" on input [link](<>).
cmark 0.18.2
- Added commonmark renderer: cmark_render_commonmark. In addition to options, this takes a width parameter. A value of 0 disables wrapping; a positive value wraps the document to the specified width. Note that width is automatically set to 0 if the CMARK_OPT_HARDBREAKS option is set.
- The cmark executable now allows -t commonmark for output as CommonMark. A --width option has been added to specify wrapping width.
- Added roundtrip_test Makefile target. This runs all the spec through the commonmark renderer, and then through the commonmark parser, and compares normalized HTML to the test. All tests pass with the current parser and renderer, giving us some confidence that the commonmark renderer is sufficiently robust. Eventually this should be pythonized and put in the cmake test routine.
- Removed an unnecessary check in blocks.c. By the time we check for a list start, we've already checked for a horizontal rule, so we don't need to repeat that check here. Thanks to Robin Stocker for pointing out a similar redundancy in commonmark.js.
- Fixed bug in cmark_strbuf_unescape (buffer.c). The old function gave incorrect results on input like \\*, since the next backslash would be treated as escaping the * instead of being escaped itself.
- scanners.re: added _scan_scheme, scan_scheme, used in the commonmark renderer.
- Check for CMAKE_C_COMPILER (not CC_COMPILER) when setting C flags.
- Update code examples in documentation, adding new parser option argument, and using CMARK_OPT_DEFAULT (Nick Wellnhofer).
- Added options parameter to cmark_markdown_to_html.
- Removed obsolete reference to CMARK_NODE_LINK_LABEL.
- make leakcheck now checks all output formats.
- test/cmark.py: set default options for markdown_to_html.
- Warn about buggy re2c versions (Nick Wellnhofer).
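
The cmark_strbuf_unescape fix concerns input such as \\*, where the first backslash escapes the second and the * is left alone. A left-to-right scan that skips the escaped character handles this correctly. The Python sketch below illustrates that scan; the function name unescape and the use of string.punctuation as the set of escapable characters are assumptions, and this is not the C code from buffer.c.

```python
import string

# Assumption: a backslash escapes any ASCII punctuation character.
ESCAPABLE = set(string.punctuation)


def unescape(text):
    """Remove backslashes that escape a following punctuation character."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in ESCAPABLE:
            # Drop the backslash and move to the escaped character, which is
            # copied below and never re-examined as an escape itself.
            i += 1
        out.append(text[i])
        i += 1
    return "".join(out)
```

Because the escaped character is consumed in the same step, an escaped backslash cannot go on to escape whatever follows it.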
cmark 0.18.1
cmark 0.18
- Switch to 2-clause BSD license, with agreement of contributors.
- Added Profile build type, make prof target.
- Fixed autolink scanner to conform to the spec. Backslash escapes
  not allowed in autolinks.
- Don't rely on strnlen being available (Nick Wellnhofer).
- Updated scanners for new whitespace definition.
- Added CMARK_OPT_SMART and --smart option, smart.c, smart.h.
- Added test for --smart option.
- Fixed segfault with --normalize (closes #7).
- Moved normalization step from XML renderer to cmark_parser_finish.
- Added options parameter to cmark_parse_document, cmark_parse_file.
- Fixed man renderer's escaping for unicode characters.
- Don't require python3 to make cmark.3 man page.
- Use ASCII escapes for punctuation characters for portability.
- Made options an int rather than a long, for consistency.
- Packed cmark_node struct to fit into 128 bytes.
  This gives a small performance boost and lowers memory usage.
- Repacked delimiter struct to avoid hole.
- Fixed use-after-free bug, which arose when a paragraph containing
  only reference links and blank space was finalized (#9).
  Avoid using parser->current in the loop that creates new
  blocks, since finalize in add_child may have removed
  the current parser (if it contains only reference definitions).
  This isn't a great solution; in the long run we need to rewrite
  to make the logic clearer and to make it harder to make
  mistakes like this one.
- Added 'Asan' build type. make asan will link against ASan; the
  resulting executable will do checks for memory access issues.
  Thanks @JordanMilne for the suggestion.
- Add Makefile target to fuzz with AFL (Nick Wellnhofer).
  The variable $AFL_PATH must point to the directory containing the AFL
  binaries. It can be set as an environment variable or passed to make on
  the command line.