This repository has been archived by the owner on Jan 8, 2018. It is now read-only.

Update lxml to 4.0.0 #72

Open: wants to merge 2 commits into base: master


pyup-bot
Collaborator

There's a new version of lxml available.
You are currently using 3.7.2. I have updated it to 4.0.0.

These links might come in handy: PyPI | Changelog | Homepage | Bugtracker

Changelog

4.0.0
==================

Features added

  • The ElementPath implementation is now compiled using Cython,
    which speeds up the .find*() methods quite significantly.
  • The modules lxml.builder, lxml.html.diff and lxml.html.clean
    are also compiled using Cython in order to speed them up.
  • xmlfile() supports async coroutines using async with and await.
  • iterwalk() has a new method skip_subtree() that prevents walking into
    the descendants of the current element.
  • RelaxNG.from_rnc_string() accepts a base_url argument to
    allow relative resource lookups.
  • The XSLT result object has a new method .write_output(file) that serialises
    output data into a file according to the <xsl:output> configuration.
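Two of the additions above can be tried out in a few lines. The following is a minimal sketch, assuming lxml 4.0.0 or later is installed; the sample documents and the stylesheet are made up for illustration and are not part of the changelog.

```python
import io

from lxml import etree  # assumes lxml >= 4.0.0 is installed

# --- iterwalk(): skip the descendants of selected elements ---
root = etree.XML("<root><a><b/></a><c/></root>")  # illustrative document
walker = etree.iterwalk(root, events=("start", "end"))
seen = []
for event, elem in walker:
    seen.append((event, elem.tag))
    if event == "start" and elem.tag == "a":
        walker.skip_subtree()  # do not walk into the children of <a>

# --- XSLT: serialise the result as configured by <xsl:output> ---
stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="utf-8"/>
  <xsl:template match="/">
    <xsl:value-of select="/a/b"/>
  </xsl:template>
</xsl:stylesheet>
""")
transform = etree.XSLT(stylesheet)
result = transform(etree.XML("<a><b>Text</b></a>"))

out = io.BytesIO()        # an in-memory stand-in for a real output file
result.write_output(out)  # honours method="text" from the stylesheet
```

After the loop, `seen` should contain no events for `<b>`, while `<c>` is still visited. The buffer `out` holds the text output of the transformation rather than an XML serialisation.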

Bugs fixed

  • GH251: HTML comments were handled incorrectly by the soupparser.
    Patch by mozbugbox.
  • LP1654544: The html5parser no longer passes the useChardet option
    if the input is a Unicode string, unless explicitly requested. When parsing
    files, the default is to enable it when a URL or file path is passed (because
    the file is then opened in binary mode), and to disable it when reading from
    a file(-like) object.

    Note: This is a backwards-incompatible change of the default configuration.
    If your code parses byte strings/streams and depends on character detection,
    please pass the option guess_charset=True explicitly, which already worked
    in older lxml versions.

  • LP1703810: etree.fromstring() failed to parse UTF-32 data with BOM.
  • LP1526522: Some RelaxNG errors were not reported in the error log.
  • LP1567526: Empty and plain text input raised a TypeError in soupparser.
  • LP1710429: Uninitialised variable usage in HTML diff.
  • LP1415643: The closing tags context manager in xmlfile() could continue
    to output end tags even after writing failed with an exception.
  • LP1465357: xmlfile.write() now accepts and ignores None as input argument.
  • Compilation under Py3.7-pre failed due to a modified function signature.
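For code affected by the html5parser default change, and for callers that rely on the xmlfile.write() fix, a minimal sketch might look like the following. It assumes lxml 4.0.0 or later; the html5parser part additionally needs the optional html5lib package. The helper names `parse_html5_bytes` and `write_chunks` are hypothetical and only serve the illustration.

```python
import io

from lxml import etree  # assumes lxml >= 4.0.0 is installed

try:
    # html5parser depends on the optional html5lib package
    from lxml.html import html5parser
except ImportError:
    html5parser = None


def parse_html5_bytes(data):
    """Parse a byte string, explicitly requesting character detection."""
    if html5parser is None:
        raise RuntimeError("html5lib is not installed")
    # Passing guess_charset=True makes the code independent of the
    # changed default described in the note above.
    return html5parser.fromstring(data, guess_charset=True)


def write_chunks(chunks):
    """Stream text chunks into a <root> element; None entries are allowed."""
    buf = io.BytesIO()
    with etree.xmlfile(buf) as xf:
        with xf.element("root"):
            for chunk in chunks:
                xf.write(chunk)  # None is accepted and ignored since 4.0.0
    return buf.getvalue()
```

A call such as `write_chunks(["a", None, "b"])` should serialise both text chunks inside `<root>` without raising on the `None` entry.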

Other changes

  • The main module source files were renamed from lxml.*.pyx to plain
    *.pyx (e.g. etree.pyx) to simplify their handling in the build
    process. Care was taken to keep the old header files as fallbacks for
    code that compiles against the public C-API of lxml, but it might still
    be worth validating that third-party code does not notice this change.

3.8.0
==================

Features added

  • ElementTree.write() has a new option doctype that writes out a
    doctype string before the serialisation, in the same way as tostring().
  • GH220: xmlfile allows switching output methods at an element level.
    Patch by Burak Arslan.
  • LP1595781, GH240: added a PyCapsule Python API and C-level API for
    passing externally generated libxml2 documents into lxml.
  • GH244: error log entries have a new property path with an XPath
    expression (if known, None otherwise) that points to the tree element
    responsible for the error. Patch by Bob Kline.
  • The namespace prefix mapping that can be used in ElementPath now injects
    a default namespace when passing a None prefix.

Bugs fixed

  • GH238: Character escapes were not hex-encoded in the xmlfile serialiser.
    Patch by matejcik.
  • GH229: fix for externally created XML documents. Patch by Theodore Dubois.
  • LP1665241, GH228: Form data handling in lxml.html no longer strips the
    option values specified in form attributes but only the text values.
    Patch by Ashish Kulkarni.
  • LP1551797: revert previous fix for XSLT error logging as it breaks
    multi-threaded XSLT processing.
  • LP1673355, GH233: fromstring() html5parser failed to parse byte strings.

Other changes

  • The previously undocumented docstring option in ElementTree.write()
    produces a deprecation warning and will eventually be removed.

3.7.4
==================

Bugs fixed

  • LP1551797: revert previous fix for XSLT error logging as it breaks
    multi-threaded XSLT processing.
  • LP1673355, GH233: fromstring() html5parser failed to parse byte strings.

3.7.3
==================

Bugs fixed

  • GH218 was ineffective in Python 3.
  • GH222: lxml.html.submit_form() failed in Python 3.
    Patch by Jakub Wilk.

Got merge conflicts? Close this PR and delete the branch. I'll create a new PR for you.

Happy merging! 🤖

pyup-bot mentioned this pull request on Sep 17, 2017