- (mypy plugin) Supports ElementDefaultClassLookup
- Supports type checking and runtime testing under PyPy 3.11
- Stub is error-free for ty type checker
- HtmlElement.head and .body can be None
- [breaking] Convert ElementDefaultClassLookup into Generic class
- Extend the list of unusable content-only elem methods
- Resolver methods args are mostly positional-only
- [breaking] Remove _ResolverRegistry.copy()
- Retire some unused type aliases
- (mypy plugin) Determine class lookup names more systematically
- Existing old docstring layout converted to new one
- New docstrings for some etree classes and funcs
- Add CHANGELOG.md to help searching among past changes
- Remove some signature tests already covered by mypy.stubtest
- Retire type checker "if KEYWORD:" usage
- Compat fix for pyright 1.1.408+ and basedpyright 1.37.1+
- Compat fix for pyright 1.1.406 and basedpyright 1.31.6
- Split mypy.stubtest as standalone tests
- More static tests migrated to runtime: HtmlElement sequence tests, DocInfo, Resolver
- "backport" some HtmlElement sequence tests to _Element
- Partially migrate Element factory annotation test
- Introduce pre-commit usage to help running actionlint
- Drop pytest-mypy-plugin from [dev] extras
- Add config for git-cliff
- Enable Bandit security scanner in workflow
- Add ty to compat checks, and add more versions
- Use ubuntu-slim GitHub runner for lightweight workflows
- Stop using reviewdog for PR check
- Add ci-annotation-converter as submodule to support type checker reporting
- Supports Facebook's pyrefly type checker (#106, #107)
- Initial mypy plugin that mimics XMLParser.set_element_class_lookup() behavior
- Setting HtmlElement.label to None is disallowed
- Basic stub works with Python 3.9 again; TypeAlias usage had caused a requirement of Python 3.10
- Migrate HtmlMixin properties and .set() method tests to runtime
- types-lxml[dev] extras is installable again
- It is possible to verify, using GitHub CLI, that all release files indeed originate from GitHub and were not altered elsewhere
- Declare Python 3.14 support
- PEP 800 support (@disjoint_base)
  - As a result, remove Python 3.8 support and require newest type checkers / typing_extensions
- Additional libxml2 error constants from lxml 6.0.1+
- Use io.Reader and io.Writer from Python 3.14, replacing SupportsRead and SupportsWrite from typeshed.
- (#100, thanks to @BeatButton) Replace __init__() with __new__() for all XMLParser subclasses, overriding XMLParser.__new__(). Due to the CustomTargetParser change in fdf2a8117562c1b07309239554bb36d021f0b207, XMLParser uses __new__() instead of __init__(). That commit brought in an undesirable effect: pyright treats instances of all XMLParser / HTMLParser subclasses as base class instances.
- Add @type_check_only to some protocols and generics
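
The __new__() item above can be pictured with a small, self-contained sketch. It assumes nothing about the real stubs: BaseParser and HtmlLikeParser are hypothetical stand-ins, and the point is only that annotating cls and the return value with the same TypeVar lets a type checker infer the subclass rather than the base class.

```python
from typing import Type, TypeVar

# Hypothetical names for illustration only; these are not the types-lxml stubs.
_P = TypeVar("_P", bound="BaseParser")


class BaseParser:
    """Stand-in for a parser whose instances are configured in __new__()."""

    recover: bool

    # Using the same TypeVar for `cls` and the return value means a call on a
    # subclass is inferred as that subclass, not as BaseParser.
    def __new__(cls: Type[_P], *, recover: bool = False) -> _P:
        self = object.__new__(cls)
        self.recover = recover
        return self


class HtmlLikeParser(BaseParser):
    """Subclass that overrides __new__() to change a default value."""

    def __new__(cls: Type[_P], *, recover: bool = True) -> _P:
        return super().__new__(cls, recover=recover)


parser = HtmlLikeParser()
print(type(parser).__name__, parser.recover)
```

At runtime both classes behave the same way regardless of the annotations; the difference only shows up in what a type checker reports for the type of parser.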
- Drop deprecated collection-related typing aliases
- Replace some LiteralString with Literal constants when value is fixed
- Drop unused type ignore comments because we are not supporting a wide range of type checker versions now
- No longer globally ignore mypy assignment error code in tests
- Add HTMLParser.__init__() and XMLParser.__init__() to allowlist, due to #100
- Brings in full lxml 6.0.x support. Additional exported constants were already present in earlier types-lxml release, here are the remaining features:
  - xmlfile.write() supports writing CDATA object directly
  - (#94, thanks to @udifuchs) Element() and ElementTree() used to be factory functions to generate _Element and _ElementTree correspondingly, but now become virtual superclasses themselves
- No longer tested against lxml 4.9.x. This doesn't mean it will break immediately, but there is no guarantee that types-lxml completely matches 4.9.x API over time.
- Also test against lxml 5.4 and newest 5.3.x
- (#92, thanks to @macro1) Apply mypy.stubtest check to help guarantee stub implementation doesn't deviate too much from runtime signatures and types, except intentional ones. Helps finding many of the bug fixes below.
- Compatible with mypy 1.16+ and pyright 1.1.399+
- (#86) Revive custom target parser support (stub-only ParserTarget as target object, and CustomTargetParser as stub-only variant of XMLParser)
  - Functions involved: fromstring(), parse(), _ElementTree.parse(), ElementTree(), fromstringlist(), HTML(), XML()
  - Params of all target object methods are positional
  - attribute is a dict in target object .start() method
  - Leave the capability of creating custom target parser to only XMLParser and HTMLParser, and drop target= param from all parser subclasses (such as lxml.html ones)
  - C14NWriterTarget inherits from ParserTarget
- Sync or add __all__ in various submodules
- (#85, thanks to @BrandonStudio) cleanup_namespaces() shouldn't warn without keep_ns_prefixes arg
- Allow specifying default value of output_parent arg for XSLTExtension.apply_template() and .process_children()
- Mark _Attrib as final
- Add missing XMLSyntaxAssertionError.__init__()
- set_default_parser() arg missing default value
- strip_elements() with_tail arg should be keyword-only
- Use original param name in tag cleanup functions
- Strip unnecessary arguments in XSLTExtension overloads
- Give users a rough idea about XSLTExtension method arguments, such as using _Element to approximately represent _ReadOnlyElementProxy. Avoids creating even more stub-only classes and requiring user to poke into them
- FormElement._name is a method, not property
- Some Schematron variables are Literal constants
- enable_recursive_str() arg missing default value
- parse() file parameter name was wrong
- Trim down canonicalize(), etree.tostring() and Extension() overloads to avoid confusion
- Implement objectify.NumberElement after all, in rare case where somebody wants to implement new type of number related to DataElement
- Move NumberElement._setValueParser() to subclasses
- (#71) Remove last traces of _AnyStr
- Reorder _ElementTree.write() overloads, with the most generic overload presented first for UX
- Fix XMLParser and HTMLParser API doc links
- Better docstring and warning for C14NWriterTarget
- Drop unused _HtmlElemParser alias
- (#82) Add buffer type support for upcoming lxml 6.0.
- HtmlElement.text_content() result will become plain str since lxml 6.0. This change shouldn't break much compatibility for users of previous lxml versions.
- Warn user about str input and guess_charset combo bug in html.html5parser functions
- Warn user about incorrect usage of specifying single element as .extend() argument
- lxml 6.0 exports LIBXML_COMPILED_FEATURES constant
- (#84) Tag selector supports iterator but not bytearray
- A few combinations of QName construction argument were actually disallowed; second argument can't be QName or _Element if first argument is non-empty
- Multiple issues for Resolver class
  - Don't annotate opaque internal context object
  - Drop _ResolverRegistry.resolve() which can't possibly appear in user land code
  - Missing default value for Resolver.resolve_file() keyword arguments
  - Resolver.resolve() arguments can be None
- Drop unused keyword arguments from iterparse() html mode overload
- namespaces arg of .xpath() method accepts tuple form. Change for XPath classes already done earlier.
- Confine the type of public element (subclass of ElementBase) class attributes
- _Element.findtext() didn't allow default argument in certain overload form
- RelaxNG.from_rnc_string() base_url argument accepts bytes
- html.html5parser guess_charset bug revisited
  - parse() is not affected as it always opens files/URL in binary mode
  - For other functions, even guess_charset=False triggers the bug
- Some html5parser.HTMLParser initialisation arguments should be keyword only
- Corrected import of typing.Never in html module and html.html5parser submodule
- .extend() and __setitem__() of _Element and HtmlElement support iterator as value
- _Element.index() had wrong parameter name
- Continued verification of properties and arguments supporting bytearray:
  - _Element.text and .tail properties
  - Content-only elements
  - XPath input expression
  - _IDDict mixin arguments
  - xmlfile.write*() methods and encoding argument
- Drop _ElemClsLookupArg alias, which is almost unused
- Rename _StrictNSMap to more aptly named _StrOnlyNSMap
- Don't include superclass attributes in ParseError definition
- Continue getting rid of _AnyStr in most places
- Mark constants as Final
- Migrate following tests to property based runtime testing:
  - All basic validators: DTD, RelaxNG, ISO Schematron (XMLSchema done in earlier release)
  - All existing _Element method / property tests and content-only elements
  - html.html5parser submodule
  - XMLID() and friends
  - QName
- For all negative tests on properties or arguments bombarded with random objects, also add iterables of correct objects to the list, to make sure an iterable of correct arguments or values is itself rejected as an incorrect argument.
- Fill in docstring for all _Element properties and methods
- Depends on beautifulsoup4 itself because version 4.13 has bundled inline annotation. Dropping types-beautifulsoup4 dependency as result.
- Multi subclass patch includes change in CSSSelector result
- Implement ErrorTypes constants as enum
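
As a rough illustration of why an integer enum is a convenient way to present numeric error constants, here is a minimal sketch. The class name and members are hypothetical and chosen for the example; it does not reproduce how the stubs actually declare their constants.

```python
from enum import IntEnum


class DemoErrorLevels(IntEnum):
    """Hypothetical error levels; names and values are illustrative only."""

    NONE = 0
    WARNING = 1
    ERROR = 2
    FATAL = 3


def level_name(value: int) -> str:
    # An IntEnum member still compares equal to its plain integer value,
    # so code written against raw integers keeps working.
    return DemoErrorLevels(value).name


print(level_name(2), DemoErrorLevels.ERROR == 2)
```

Because members remain integers, they can still be compared or sorted like the original constants, while editors and type checkers can list the allowed names.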
- Additional type: ignore comments that improve compatibility with older versions of mypy and pyright
- For soupparser submodule input arguments, copy definition from beautifulsoup4 code directly
- html.fragment_fromstring create_parent argument can be string (#83, thanks to @sciyoshi)
- XPath namespaces argument can accept namespace tuples
- Fixes compatibility with mypy 1.14+
- bytes not allowed as html.diff.htmldiff() argument
- Parser encoding arguments do support bytearray
- _ListErrorLog.filter_from_level() supports real numbers
- Migrate beautifulsoup and ErrorLog tests to property based
- Migrate cssselect and XMLSchema tests to runtime ones
- Add mocked HTTP response to file input fixture; introduces urllib3 and pook as test dependency
- Add basedpyright type checker support
- Incorporate changes from lxml 5.3.1 and (pending) 6.0
  - More html.builder shorthands
  - libxml feature constants
  - etree.DTD(external_id=...) supports str now
  - Deprecate some Memdebug methods
- html.submit_form() always returns HTTPResponse for default handler
- Instance attributes are converted to properties because they are not deletable:
  - html.SelectElement.multiple
  - html.InputElement.type
- More function arguments support bytearray:
  - register_namespace()
  - inclusive_ns_prefixes parameter of etree.tostring()
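
The distinction between a plain instance attribute and a property without a deleter can be seen in a few lines of plain Python. This is only a sketch with a hypothetical DemoInput class, not the lxml implementation: a property that defines a getter and setter but no deleter rejects del with AttributeError.

```python
class DemoInput:
    """Hypothetical element-like class with a non-deletable `type` property."""

    def __init__(self) -> None:
        self._type = "text"

    @property
    def type(self) -> str:
        return self._type

    @type.setter
    def type(self, value: str) -> None:
        self._type = value

    # No @type.deleter is defined, so `del obj.type` is rejected.


def try_delete(obj: DemoInput) -> bool:
    """Return True if deleting `type` succeeded, False if it was refused."""
    try:
        del obj.type
    except AttributeError:
        return False
    return True


elem = DemoInput()
elem.type = "checkbox"
print(elem.type, try_delete(elem))
```

Declaring such attributes as properties in a stub lets a type checker flag the del statement before it fails at runtime.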
- Add docstring for some etree module function overloads
- Drop _AnyStr from etree module level functions
- bytearray accepted as tag names, attribute names and attribute values
  - Related change: create _TextArg type alias to slowly replace existing _AnyStr (#71)
- Warn IDE users via warnings.deprecated about exception upon certain argument combinations in HTML link functions
- Property deleter missing for HTML elements (#73)
- etree.strip_attributes() supports bytes and QName as input
- Completion of #64 for remaining known cases
- Corrected link replacement function return type in html.rewrite_links()
- etree.canonicalize() shouldn't accept bytes as input
- Use hypothesis for extensive tests on function arguments, currently used in _Attrib and HTML link function tests (#75)
- reveal_type() injector has been split into its own project and pulled via dependency
- Folder structure changes for the whole repository (#70)
- Remove _HANDLE_FAILURES type alias and show values directly to users
- Rename type-only protocol SupportsLaxedItems to SupportsLaxItems
- pyright users (and IDEs that can make use of pyright) will see a warning if a single string is supplied where a collection of strings is expected (tuple, set, list etc). In terms of typing, a single str itself is valid as a Sequence, so type checkers normally would not raise alarm when using str in such function parameters, but it can induce unexpected runtime behavior. (#64)
  - _ElementTree.write(), etree.fromstringlist(), etree.tostring(), html.soupparser.fromstring(), html.soupparser.parse()
- It is possible to verify release files indeed come from GitHub and are not maliciously altered. See Release file attestation for detail.
- Runtime tests support comparing with mypy results, therefore officially making static stub tests obsolete
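
The single-string pitfall described for #64 is easy to reproduce without lxml. The sketch below uses a hypothetical collect_prefixes() helper: because str satisfies Sequence[str], passing one string type-checks, yet iterating over it yields individual characters.

```python
from collections.abc import Sequence


def collect_prefixes(prefixes: Sequence[str]) -> list[str]:
    """Hypothetical helper that expects a collection of prefix strings."""
    return [prefix for prefix in prefixes]


# A str is itself a Sequence of one-character strings, so this call is
# accepted by the annotation but silently splits the prefix apart.
print(isinstance("xsd", Sequence))
print(collect_prefixes("xsd"))
print(collect_prefixes(["xsd"]))
```

The first call is almost never what the caller meant, which is why a dedicated warning for a bare str argument is useful.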
- Element tag names, attribute names and attribute values support bytearray. This is discovered via hypothesis testing, which is intended to be utilized in next release
- Compatibility with pyright ⩾ 1.1.378, which imposes additional overload warning for etree.iterparse()
- Use relative import in lxml.ElementInclude, otherwise mypy triggers --install-types behavior.
- ObjectifiedElement __getitem__() and __setitem__() should accept str as key, which behaves mostly like __getattr__() and __setattr__(). That means elem["foo"] is equivalent to elem.foo for non-repeating subelements.
- _Element.tag property is not just a str. It is str after initial document or string parsing, but can be set manually to any type supported by tag name and returns the same object.
- When QName is initialized with first argument set to None, _Element can be used as second argument (which is promoted to first argument in implementation)
- Relax single argument usage in _Element.iter*() method family, doesn't need tag= keyword when argument is None
- FunctionNamespace() should generate an _XPathFunctionNamespaceRegistry object, not its superclass
- For decorator usage of _XPathFunctionNamespaceRegistry and _ClassNamespaceRegistry, decorator signature included an extraneous argument, though it doesn't affect any existing correct usage.
- indent() first parameter has wrong name
- soupparser.parse() should accept pathlib.Path object as input
- .value property of SelectElement can't be set to bytes
- .action property of FormElement can have a value of None, and can be set to None. They have different meanings though.
- Declare Python 3.13 support and perform CI tests.
- Separation of pyright and mypy ignore comments: in previous releases # type: ignore[code] was enabled in pyright settings. Now it only uses # pyright: ignore[code] so mypy comment won't affect pyright behavior.
- Add ._name property to html.FormElement for form name
- Eliminate typing.TypeAlias usage (declared obsolete, and we can do without it)
- Stub tests migration to runtime:
  - Most of remaining etree._Element methods, now only .makeelement() and .xpath() left in stub test
- Runtime test additions:
  - ElementNamespaceClassLookup()
- tox config migrated to pyproject.toml, thus requiring tox ⩾ 4.22
- Runtime tests are now executed within test-rt folder due to python/mypy#8400
- Some tests need to be performed conditionally when multi-subclass patch is applied
- Some tests or syntaxes need to be turned off to cope with mypy deficiencies
- Usage of Rust-based uv as well as related tox plugin to speed up test environment recreation
- Don't force users to install tox-gh-actions when checking out repository, it is only useful for GitHub workflows
- etree submodule: parse(), fromstringlist(), tostring(), indent(), iselement(), adopt_external_document(), DocInfo properties, QName, CData, some exception classes
- html.soupparser submodule: fromstring(), parse(), convert_tree()
- Namespace argument in ElementPath methods should allow None (#60, thanks to @cukiernick)
- Perform runtime tests against lxml 5.3
- Multiple builds available, with the alternative build enhancing multiple XML subclassing scenario. See relevant README section for detail. Thanks to @scanny for the driving force behind #51.
- Mypy 1.11 required, which introduced backward incompatible @typing.overload changes.
- lxml.html.clean stub deprecated, lxml 5.2.0 completely removes the submodule due to multiple security issues. Corresponding code and type definitions are split into a new independent repo.
- (#56) Replace typing.TypeGuard with typing.TypeIs
- Use callback protocol for more precise element and ElementMaker factory function typing
- lxml.etree.ICONV_COMPILED_VERSION exported since 5.2.2
- Special handling for ObjectifiedElement and HtmlElement in lxml.cssselect.CSSSelector and various cssselect() methods
- html.builder shorthands return more precise element type for certain HTML elements. For example, html.builder.LABEL(), corresponding to <LABEL> tag, yields LabelElement.
- More precise etree.Extension() annotation depending on supplied namespace
- Stricter namespace argument type in _Element ElementPath methods
- For lxml.builder.ElementMaker class:
  - Provide better hint in __call__() argument
  - Accepts namespace tuple in nsmap argument
  - Export private properties
- For lxml.sax module:
  - Export private properties in various classes
  - Explicitly list all inherited methods in ElementTreeContentHandler class, as method argument names are different from superclass ones
- Alert etree.HTMLParser users to remove deprecated strip_cdata argument
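
For readers unfamiliar with the term, a callback protocol is a typing.Protocol class whose __call__() describes a callable's signature more precisely than Callable[...] can (for instance positional-only or keyword arguments). The following is a generic sketch with hypothetical Node, NodeFactory and build_pair names; it is not the protocol used in the stubs.

```python
from typing import Dict, List, Protocol


class Node:
    """Hypothetical minimal element: a tag name plus string attributes."""

    def __init__(self, tag: str, attrib: Dict[str, str]) -> None:
        self.tag = tag
        self.attrib = attrib


class NodeFactory(Protocol):
    """Callback protocol: any callable with this signature is accepted."""

    def __call__(self, tag: str, /, **attrib: str) -> Node: ...


def make_node(tag: str, /, **attrib: str) -> Node:
    return Node(tag, dict(attrib))


def build_pair(factory: NodeFactory) -> List[Node]:
    # The protocol lets a type checker validate the keyword arguments here.
    return [factory("a", href="#"), factory("b")]


nodes = build_pair(make_node)
print([(node.tag, node.attrib) for node in nodes])
```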
- Some _Element related input arguments fixed to use typing.Sequence instead of Iterable, as _Element is already an Iterable itself. Supplying _Element where a proper Iterable is expected would cause problems.
- Similar situation arises for str or bytes in tag selector argument; use typing.Collection to alert user more clearly.
- None can't be used as etree.strip_*() argument
- Some etree.DocInfo read-only properties can't be None
- Fix etree.Resolver method return types
- Avoid exception raising arg combinations in html.html5parser.HTMLParser
- The usual static stub to runtime test migration:
  - Part of basic _Element tests and its find*() methods
  - More extensive _Attrib tests
- Use ruff to replace black and isort as code formatter
- Migrate stub tests to support pytest-mypy-plugins ⩾ 2.0
- Use pdm-backend as build backend due to its more versatile versioning support
- Mypy 1.9 is required, dropping 1.5 support. 1.6 - 1.8 was never supported.
- lxml.ElementInclude completely reworked
- PEP 696 support, simplifying usage of some subscripted types (#42)
  - As a convenient side effect, lxml.html parser constructor signatures can be removed
- All annotations do provide default values in their signatures now instead of ...
- Type of _Comment.text property (and those of similar elements) is always str (#46, thanks to @eemeli)
- Tag selector argument in element iterator methods should support keyword with a single tag (#45, thanks to @eemeli)
- html.fragments_fromstring() should receive same fix as html.html5parser.fragments_fromstring() does (#43, thanks to @Wuestengecko)
- @overload for etree.SubElement() on handling of HtmlElement and ObjectifiedElement
- Some exported constants were missing from lxml.ElementInclude stub
- html.soupparser module functions return type depends on makeelement argument
- Keyword arguments in html.soupparser module functions are explicitly listed now (instead of generic **kwargs before)
- The 2 arguments in html.diff.html_annotate() should align their annotation types
- html.submit_form() return type depends on the result of open_http function argument
- Add missing exported variable for lxml.isoschematron
- Uppercase variants of output method arguments ("HTML", "TEXT", "XML") were dropped
- Usual runtime test additions: lxml.html.soupparser, lxml.ElementInclude, various exported constants
- Runtime tests also do test against lxml 5.2
- Requires cssselect ⩾ 1.2 for annotation in lxml.cssselect, since cssselect is now inline annotated.
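
Entries such as the makeelement and open_http ones above describe a return type that follows whatever a user-supplied callable returns. One common way to express that in annotations is a TypeVar shared between the callable and the function result; the sketch below uses a hypothetical submit_demo() function and is not the actual stub signature.

```python
from typing import Callable, TypeVar

_R = TypeVar("_R")


def submit_demo(open_http: Callable[[str, str], _R], method: str, url: str) -> _R:
    """Hypothetical form submitter: returns whatever the handler returns."""
    return open_http(method, url)


def text_handler(method: str, url: str) -> str:
    return f"{method} {url}"


def length_handler(method: str, url: str) -> int:
    return len(url)


# A type checker infers str for the first call and int for the second.
print(submit_demo(text_handler, "GET", "http://example.com/"))
print(submit_demo(length_handler, "GET", "http://example.com/"))
```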
- Compatibility with pyright ⩾ 1.1.353
- In etree.clean_* functions, first argument (the Element or ElementTree to be processed) must be strictly positional
- etree._LogEntry.filename property is never empty, as it uses the value <string> as fallback
- etree._BaseErrorLog.receive() argument name was wrong
- Self brewed SupportsReadClose protocol dropped, replacing with more standardized SupportsRead
- html.html5parser.parse() should support data stream as input
- html.html5parser.fragments_fromstring() return type is dependent on no_leading_text argument
- encoding arguments in various methods / functions used to only support ASCII and UTF-8 as byte encodings, now the restriction is lifted
- Place some typing usage under Python version check (if sys.version_info >= (3, x))
- etree.PyErrorLog constructor shouldn't accept 2 logger arguments simultaneously
- etree.PyErrorLog.level_map property reverted to vanilla type (int) instead of our fake enum
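
A return type that depends on a boolean flag, as in the no_leading_text entry above, is typically written with @overload and Literal. The sketch below is a toy: fragments_demo() is hypothetical, and integers stand in for parsed elements so that the example runs without lxml or html5lib.

```python
from typing import List, Literal, Union, overload


@overload
def fragments_demo(text: str, no_leading_text: Literal[True]) -> List[int]: ...
@overload
def fragments_demo(text: str, no_leading_text: bool = ...) -> List[Union[str, int]]: ...
def fragments_demo(text: str, no_leading_text: bool = False):
    """Toy splitter: first word is 'leading text', the rest are 'elements'."""
    head, *rest = text.split()
    numbers = [int(item) for item in rest]
    if no_leading_text:
        # Only the element stand-ins are returned.
        return numbers
    # Leading text is kept as the first item, so the list is mixed.
    return [head, *numbers]


print(fragments_demo("lead 1 2"))
print(fragments_demo("lead 1 2", True))
```

With the overloads in place, a type checker can report a homogeneous list when the flag is literally True and a mixed list otherwise.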
- Some runtime tests are lxml version dependent (#34, thanks to @fabaff)
- Adds stub check for _Element, _Comment and _ElementTree (#33, thanks to @udifuchs)
- Following stub tests migrated to runtime: _Attrib, _ErrorLog and friends, html5lib
- Add back HtmlProcessingInstruction element (#28, thanks to @eliotwrobson)
- Silence pyright ⩾ 1.1.345 warning on overriding read-write property with read-only one (ObjectifiedElement.text)
- mypy ⩾ 1.6 does not support PEP 702, thus shouldn't be used with types-lxml
- Stub test suite uses mypy 1.5.x now
- Types for emitted events and values in iterparse() were not optimal (issue #19, thanks to @Daverball)
- Most html link and clean functions should be unable to process ElementTree, except Cleaner.clean_html()
- Completed following modules, thus really having lxml fully covered (sans a few submodules that will never be implemented):
  - lxml.html.diff
  - lxml.ElementInclude
- Declares support for Python 3.12
- Update for upcoming lxml 5.0
  - Schematron constructor arguments
  - Some obsolete functions removed
- Start implementing runtime type checks and compare with static type checker results, utilizing typeguard and pyright
- Use setuptools_scm in place of pdm-backend as package build backend
The list of changes since last release is huge, be it visible by users or not.
- Class inheritance of html.HtmlComment and friends has changed to deviate from source code. Now they are 'thought' to inherit from html.HtmlElement within stubs, like the XML etree._Element counterpart. Refer to wiki document on how and why this change is done.
- Shelved custom parser target support (custom parser target is used when initiating XML / HTML parsers with target= argument), as current Python typing system is deemed insufficient to get it working without plugins.
- Stub package only depends on other stub packages, following behavior of typeshed distributed stubs. This means lxml is no longer pulled in when installing types-lxml.
- etree.SmartStr reverted back to its original class name
- etree._ErrorLog is now made a function that generates etree._ListErrorLog (despite the fact that it is a class in source code), according to actual created instance type
- Completed following submodules and parts, thus removing the partial status of types-lxml package:
  - lxml.etree proper:
    - XSLT related classes / functions
    - XML:ID support
    - External document and URI resolving
    - XInclude support
    - XPath and XSLT extension function registry
    - Error log and reporting, along with numerous bug fixes
    - etree.iterparse and etree.iterwalk
    - Various ElementClassLookup types
  - lxml.objectify
    - Includes all DataElement subtypes and type annotation support
  - lxml.isoschematron
- When subclassing XML elements, now most of its methods can be inherited without overriding output element type.
- More extensive usage of Python 3.9-3.11 typing features; this is possible since types-lxml is an external stub package and doesn't affect source code.
- Both mypy and pyright type checkers have strict mode turned on when verifying stub source
- _Element.sourceline property becomes read-only
- Re-added most deprecated methods in various places, with help from provisional PEP 702 support (@deprecated) in pyright
- Incorporate more docstring from official lxml classes, in case IDEs can display them in user interface.
- Force _XPathEvaluatorBase subclasses to make __call__ available, by explicitly declaring it as abstract method within _XPathEvaluatorBase
- Removal of html.open_http_urllib, which is only intended as a fallback callback function for html.submit_form() without user intervention
- libxml2 error constants become integer enum in stub
- Warn userland usage of dummy etree.PyErrorLog.copy(), because it is only intended for smoother internal lxml error handling.
- File reading source (used in file= argument in parse() and friends) requirement relaxed
- html.(X)HtmlParser __init__ was missing some arguments
- Convert iter* methods of Elements and some tag cleanup functions into @overload, to better reflect its original intended arguments usage
- etree.ElementBase and similar public base element classes lacked __init__
- Setting of etree.DocInfo text properties now accepts bytes
- name= argument of html.HtmlElementClassLookup() doesn't accept None
- Concerning _Comment, _Entity, _ProcessingInstruction, and their subclasses
  - .tag attribute now returns correct value (the basic etree element factory function)
  - Users will be warned if they use these elements like normal XML _Element, such as treating them as parent elements and inserting child elements into them
- Add types for XML canonicalization function/class and incremental generation context managers
- (#5, thanks to @f-ohler) Add etree.indent()
- (#6, thanks to @wRAR) Fix signature of _Attrib.pop()
- (#7, thanks to @wRAR) Fix signature of etree.fromstring()
- Fix signature of objectify.fromstring()
This is the second release of types-lxml. The following are enhancements on top of lxml-stubs 0.4.0:
- All previous contributions reviewed and made coherent (contributions came from so many people)
- Implemented stub for following submodules:
  - lxml.builder
  - lxml.sax
  - lxml.html.builder
  - lxml.html.clean
  - lxml.html.soupparser (adapter for BeautifulSoup 4)
  - lxml.html.html5parser (adapter for html5lib)
- Annotations for lots of classes and methods implemented too, please browse commit log for detail, or [project page] for future plans and progress
  - In particular, annotations for lxml.etree.DTD and lxml.etree.RelaxNG classes are complete in this release
- Pyright support (guarantees error-free under basic checking mode)
- Extensively expanded test cases
There are still some missing puzzle pieces before whole annotation package can be deemed complete and escape its partial status.
First release to stand in its own right. There are still some missing puzzle pieces before whole annotation package can be deemed complete and escape its partial status. The following are enhancements done so far:
- All previous contributions reviewed and made coherent (contributions came from so many people)
- Pyright support (not the test cases though, which are mypy checkable only)
- Extensively expanded test cases
- Implemented stub for following submodules:
  - lxml.builder
  - lxml.sax
  - lxml.html.builder
  - lxml.html.clean
  - lxml.html.soupparser (adapter for BeautifulSoup)
