This article explains the new features in Python 3.4, compared to 3.3.
For full details, see the changelog.
Note
Prerelease users should be aware that this document is currently in draft form. It will be updated substantially as Python 3.4 moves towards release, so it’s worth checking back even after reading earlier versions.
See also
PEP 429 – Python 3.4 Release Schedule
New syntax features:
New expected features for Python implementations:
New library modules:
Significantly Improved Library Modules:
CPython implementation improvements:
Please read on for a comprehensive list of user-facing changes, including many other smaller improvements, CPython optimizations, deprecations, and potential porting issues.
The new ensurepip module (defined in PEP 453) provides a standard cross-platform mechanism to bootstrap the pip installer into Python installations and virtual environments.
The venv module and the pyvenv utility make use of this module to make pip readily available in virtual environments. When using the command line interface, pip is installed by default, while for the module API installation of pip must be requested explicitly.
For CPython source builds on POSIX systems, the make install and make altinstall commands bootstrap pip by default. This behaviour can be controlled through configure options, and overridden through Makefile options.
On Windows and Mac OS X, the CPython installers now offer the option to install pip along with CPython itself.
As discussed in the PEP, platform packagers may choose not to install pip by default, as long as the command pip, when invoked, provides clear and simple directions on how to install pip on the platform.
Note
The implementation of PEP 453 is still a work in progress. Refer to issue 19347 for the progress on additional steps:
PEP 446 makes newly created file descriptors non-inheritable. New functions and methods:
Since it was first introduced, the codecs module has always been intended to operate as a type-neutral dynamic encoding and decoding system. However, its close coupling with the Python text model, especially the type restricted convenience methods on the builtin str, bytes and bytearray types, has historically obscured that fact.
As a key step in clarifying the situation, the codecs.encode() and codecs.decode() convenience functions are now properly documented in Python 2.7, 3.3 and 3.4. These functions have existed in the codecs module (and have been covered by the regression test suite) since Python 2.4, but were previously only discoverable through runtime introspection.
Unlike the convenience methods on str, bytes and bytearray, these convenience functions support arbitrary codecs in both Python 2 and Python 3, rather than being limited to Unicode text encodings (in Python 3) or basestring <-> basestring conversions (in Python 2).
In Python 3.4, the interpreter is able to identify the known non-text encodings provided in the standard library and direct users towards these general purpose convenience functions when appropriate:
>>> b"abcdef".decode("hex")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
LookupError: 'hex' is not a text encoding; use codecs.decode() to handle arbitrary codecs
>>> "hello".encode("rot13")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
LookupError: 'rot13' is not a text encoding; use codecs.encode() to handle arbitrary codecs
In a related change, whenever it is feasible without breaking backwards compatibility, exceptions raised during encoding and decoding operations will be wrapped in a chained exception of the same type that mentions the name of the codec responsible for producing the error:
>>> import codecs
>>> codecs.decode(b"abcdefgh", "hex")
binascii.Error: Non-hexadecimal digit found
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
binascii.Error: decoding with 'hex' codec failed (Error: Non-hexadecimal digit found)
>>> codecs.encode("hello", "bz2")
TypeError: 'str' does not support the buffer interface
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: encoding with 'bz2' codec failed (TypeError: 'str' does not support the buffer interface)
Finally, as the examples above show, these improvements have permitted the restoration of the convenience aliases for the non-Unicode codecs that were themselves restored in Python 3.2. This means that encoding binary data to and from its hexadecimal representation (for example) can now be written as:
>>> from codecs import encode, decode
>>> encode(b"hello", "hex")
b'68656c6c6f'
>>> decode(b"68656c6c6f", "hex")
b'hello'
The binary and text transforms provided in the standard library are detailed in Binary Transforms and Text Transforms.
(Contributed by Nick Coghlan in issue 7475, issue 17827, issue 17828 and issue 19619)
PEP 451 provides an encapsulation of the information about a module that the import machinery will use to load it (that is, a module specification). This helps simplify both the import implementation and several import-related APIs. The change is also a stepping stone for several future import-related improvements.
The public-facing changes from the PEP are entirely backward-compatible. Furthermore, they should be transparent to everyone but importer authors. Key finder and loader methods have been deprecated, but they will continue working. New importers should use the new methods described in the PEP. Existing importers should be updated to implement the new methods.
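As a rough illustration of what a module specification carries, the sketch below asks the import machinery for the spec of a standard library module without importing it. It assumes importlib.util.find_spec(), which arrived alongside this PEP; the choice of json is arbitrary.

```python
import importlib.util

# Look up the module specification without importing the module itself.
spec = importlib.util.find_spec("json")

print(spec.name)     # fully qualified module name
print(spec.origin)   # where the module would be loaded from
print(spec.loader)   # loader the import machinery would use
```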
Some smaller changes made to the core Python language are:
The new asyncio module (defined in PEP 3156) provides a standard pluggable event loop model for Python, providing solid asynchronous IO support in the standard library, and making it easier for other event loop implementations to interoperate with the standard library and each other.
For Python 3.4, this module is considered a provisional API.
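A minimal sketch of the event loop model, assuming only the low-level loop and Future APIs (no coroutines): a callback scheduled on the loop supplies a result that the loop then waits for. The delay and the result value are arbitrary.

```python
import asyncio

# Use a private event loop rather than relying on a global one.
loop = asyncio.new_event_loop()
try:
    # A Future is a placeholder for a result that arrives later.
    future = asyncio.Future(loop=loop)
    # Ask the loop to fill in the result shortly after it starts running.
    loop.call_later(0.01, future.set_result, "done")
    # Run the loop until the future has a result.
    result = loop.run_until_complete(future)
finally:
    loop.close()

print(result)
```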
The new ensurepip module is the primary infrastructure for the PEP 453 implementation. In the normal course of events end users will not need to interact with this module, but it can be used to manually bootstrap pip if the automated bootstrapping into an installation or virtual environment was declined.
ensurepip includes a bundled copy of pip, up-to-date as of the first release candidate of the release of CPython with which it ships (this applies to both maintenance releases and feature releases). ensurepip does not access the internet. (If the installation has Internet access, it is of course possible to upgrade pip to a release more recent than the bundled pip by using the bundled pip command itself once it is installed.)
The module is named ensurepip because if called when pip is already installed, it does nothing. It also has an --upgrade option that will cause it to install the bundled copy of pip if the existing installed version of pip is older than the bundled copy.
The new enum module (defined in PEP 435) provides a standard implementation of enumeration types, allowing other modules (such as socket) to provide more informative error messages and better debugging support by replacing opaque integer constants with backwards compatible enumeration values.
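The following sketch shows the difference between a plain enumeration and an integer-compatible one. The Color and Status classes are hypothetical names chosen for illustration.

```python
from enum import Enum, IntEnum


class Color(Enum):          # hypothetical example enumeration
    RED = 1
    GREEN = 2


class Status(IntEnum):      # members also behave like integers
    OK = 0
    ERROR = 1


print(repr(Color.RED))            # <Color.RED: 1>
print(Color(2) is Color.GREEN)    # lookup by value returns the same member
print(Color.RED == 1)             # plain Enum members do not equal integers
print(Status.ERROR == 1)          # IntEnum members stay backwards compatible
```

IntEnum is the variant that lets a module replace integer constants without breaking code that compares against the raw numbers.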
The new pathlib module offers classes representing filesystem paths with semantics appropriate for different operating systems. Path classes are divided between pure paths, which provide purely computational operations without I/O, and concrete paths, which inherit from pure paths but also provide I/O operations.
For Python 3.4, this module is considered a provisional API.
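A short sketch of the pure/concrete split: the pure path below never touches the filesystem, while the concrete path does. The path components are made up for the example.

```python
from pathlib import Path, PurePosixPath

# Pure paths: computation only, no I/O. The path need not exist.
pure = PurePosixPath("/usr/lib") / "python3.4" / "os.py"
print(pure.name)      # os.py
print(pure.suffix)    # .py
print(pure.parent)    # /usr/lib/python3.4

# Concrete paths: the same operations plus filesystem access.
here = Path(".")
print(here.exists())
print(here.is_dir())
```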
The new selectors module (created as part of implementing PEP 3156) allows high-level and efficient I/O multiplexing, built upon the select module primitives.
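A minimal sketch of the selector API, assuming a connected socket pair stands in for real network traffic. The string attached as data is arbitrary.

```python
import selectors
import socket

sel = selectors.DefaultSelector()      # best implementation for this platform
reader, writer = socket.socketpair()   # stand-in for a real connection
received = None
try:
    # Watch the reading end for incoming data and attach arbitrary user data.
    sel.register(reader, selectors.EVENT_READ, data="reader side")
    writer.send(b"ping")
    # Block until a registered object is ready, or the timeout expires.
    for key, mask in sel.select(timeout=1):
        received = key.fileobj.recv(4)
        print(key.data, received)
finally:
    sel.close()
    reader.close()
    writer.close()
```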
The new statistics module (defined in PEP 450) offers some core statistics functionality directly in the standard library. This module supports calculation of the mean, median, mode, variance and standard deviation of a data series.
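For instance, with a small made-up data series (the numbers are chosen so that the population standard deviation comes out exact):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative sample

print(statistics.mean(data))       # 5
print(statistics.median(data))     # 4.5
print(statistics.mode(data))       # 4
print(statistics.pvariance(data))  # population variance: 4
print(statistics.pstdev(data))     # population standard deviation: 2.0
print(statistics.variance(data))   # sample variance: 32/7
print(statistics.stdev(data))      # sample standard deviation
```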
The new tracemalloc module (defined in PEP 454) is a debug tool to trace memory blocks allocated by Python. It provides the following information:
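A minimal usage sketch, assuming the goal is simply to see which source lines allocated the most memory. The allocation being traced is artificial.

```python
import tracemalloc

tracemalloc.start()

# Allocate something noticeable so that it shows up in the snapshot.
blocks = [bytes(1000) for _ in range(100)]

snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group the traced allocations by source line and show the largest ones.
top = snapshot.statistics("lineno")[:3]
for stat in top:
    print(stat)
```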
New function abc.get_cache_token() can be used to know when to invalidate caches that are affected by changes in the object graph. (Contributed by Łukasz Langa in issue 16832.)
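As a sketch of how the token behaves: registering a virtual subclass changes the object graph, so the token changes too. Shape is a hypothetical ABC used only for this example.

```python
import abc


class Shape(abc.ABC):   # hypothetical ABC for illustration
    pass


before = abc.get_cache_token()
Shape.register(tuple)            # changes the virtual subclass graph
after = abc.get_cache_token()

# A cache keyed on ABC relationships should be discarded when this is True.
print(before != after)
```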
The aifc getparams() method now returns a namedtuple rather than a plain tuple. (Contributed by Claudiu Popa in issue 17818.)
audioop now supports 24-bit samples (issue 12866).
Added the audioop.byteswap() function to convert big-endian samples to little-endian and vice versa (issue 19641).
The encoding and decoding functions in base64 now accept any bytes-like object in cases where it previously required a bytes or bytearray instance (issue 17839).
The number of digits in the coefficients for the RGB to YIQ conversions has been expanded so that they match the FCC NTSC versions. The change in results should be less than 1% and may better match results found elsewhere.
The new contextlib.suppress context manager helps to clarify the intent of code that deliberately suppresses exceptions from a single statement. (Contributed by Raymond Hettinger in issue 15806 and Zero Piraeus in issue 19266)
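A small sketch of the intended use: removing a file that may not exist. The file name is hypothetical and is assumed to be absent.

```python
import os
from contextlib import suppress

missing = "no-such-file-for-suppress-demo.tmp"   # hypothetical file name

# Equivalent to try/except FileNotFoundError: pass, but states the intent.
with suppress(FileNotFoundError):
    os.remove(missing)

still_running = True   # execution continues after the with block
print(still_running)
```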
The new contextlib.redirect_stdout() context manager makes it easier for utility scripts to handle inflexible APIs that don’t provide any options to retrieve their output as a string or direct it to somewhere other than sys.stdout. In conjunction with io.StringIO, this context manager is also useful for checking expected output from command line utilities. (Contributed by Raymond Hettinger in issue 15805)
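A minimal sketch of capturing output with io.StringIO. The noisy_api() function is a hypothetical stand-in for an API that can only print.

```python
import io
from contextlib import redirect_stdout


def noisy_api():
    # Hypothetical stand-in for an API that only writes to sys.stdout.
    print("result: 42")


buffer = io.StringIO()
with redirect_stdout(buffer):
    noisy_api()

captured = buffer.getvalue()
print(repr(captured))
```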
The contextlib documentation has also been updated to include a discussion of the differences between single use, reusable and reentrant context managers.
The dis module is now built around an Instruction class that provides details of individual bytecode operations and a get_instructions() iterator that emits the Instruction stream for a given piece of Python code. The various display tools in the dis module have been updated to be based on these new components.
The new dis.Bytecode class provides an object-oriented API for inspecting bytecode, both in human-readable form and for iterating over instructions.
(Contributed by Nick Coghlan, Ryan Kelly and Thomas Kluyver in issue 11816 and Claudiu Popa in issue 17916)
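The two entry points can be sketched as follows. The add() function is a hypothetical example, and the exact opcodes printed depend on the interpreter version, so no output is shown.

```python
import dis


def add(a, b):   # hypothetical function to inspect
    return a + b


# Iterate over Instruction objects one at a time.
instructions = list(dis.get_instructions(add))
for instr in instructions:
    print(instr.offset, instr.opname, instr.argrepr)

# The object-oriented wrapper: iterable, and able to format itself.
bytecode = dis.Bytecode(add)
listing = bytecode.dis()
print(listing)
```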
Added FAIL_FAST flag to halt test running as soon as the first failure is detected. (Contributed by R. David Murray and Daniel Urban in issue 16522.)
Updated the doctest command line interface to use argparse, and added -o and -f options to the interface. -o allows doctest options to be specified on the command line, and -f is a shorthand for -o FAIL_FAST (to parallel the similar option supported by the unittest CLI). (Contributed by R. David Murray in issue 11390.)
as_string() now accepts a policy argument to override the default policy of the message when generating a string representation of it. This means that as_string can now be used in more circumstances, instead of having to create and use a generator in order to pass formatting parameters to its flatten method.
New method as_bytes() added to produce a bytes representation of the message in a fashion similar to how as_string produces a string representation. It does not accept the maxheaderlen argument, but does accept the unixfrom and policy arguments. The Message __bytes__() method calls it, meaning that bytes(mymsg) will now produce the intuitive result: a bytes object containing the fully formatted message.
(Contributed by R. David Murray in issue 18600.)
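A brief sketch of both methods on a small hand-built message. The headers and body are made up, and the policy shown (the default compatibility policy cloned with CRLF line endings) is just one possible override.

```python
from email import policy
from email.message import Message

msg = Message()
msg["Subject"] = "Hello"
msg["To"] = "user@example.com"   # illustrative address
msg.set_payload("Body text\n")

# Override the policy for this call only, here to get CRLF line endings.
crlf_policy = policy.compat32.clone(linesep="\r\n")
as_text = msg.as_string(policy=crlf_policy)

# as_bytes() is the bytes counterpart, and bytes(msg) calls it.
raw = msg.as_bytes()
print(bytes(msg) == raw)
```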
A pair of new subclasses of Message have been added, along with a new sub-module, contentmanager. All documentation is currently in the new module, which is being added as part of the new provisional email API. These classes provide a number of new methods that make extracting content from and inserting content into email messages much easier. See the contentmanager documentation for details.
These API additions complete the bulk of the work that was planned as part of the email6 project. The currently provisional API is scheduled to become final in Python 3.5 (possibly with a few minor additions in the area of error handling).
(Contributed by R. David Murray in issue 18891.)
The new partialmethod() descriptor brings partial argument application to descriptors, just as partial() provides for normal callables. The new descriptor also makes it easier to get arbitrary callables (including partial() instances) to behave like normal instance methods when included in a class definition.
(Contributed by Alon Horev and Nick Coghlan in issue 4331)
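A small sketch of the descriptor in a class body. Cell is a hypothetical class used only for illustration.

```python
from functools import partialmethod


class Cell:   # hypothetical class for illustration
    def __init__(self):
        self.alive = False

    def set_state(self, state):
        self.alive = bool(state)

    # Each partialmethod pre-fills the 'state' argument but is still
    # bound to the instance like an ordinary method.
    set_alive = partialmethod(set_state, True)
    set_dead = partialmethod(set_state, False)


cell = Cell()
cell.set_alive()
print(cell.alive)   # True
```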
The new singledispatch() decorator brings support for single-dispatch generic functions to the Python standard library. Where object oriented programming focuses on grouping multiple operations on a common set of data into a class, a generic function focuses on grouping multiple implementations of an operation that allows it to work with different kinds of data.
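A minimal sketch of a generic function: one operation, with the implementation chosen by the type of the first argument. The describe() function is hypothetical.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    # Fallback used when no more specific implementation is registered.
    return "object: {!r}".format(value)


@describe.register(int)
def _(value):
    return "int: {}".format(value)


@describe.register(list)
def _(value):
    return "list of {} items".format(len(value))


print(describe(3))          # int: 3
print(describe([1, 2]))     # list of 2 items
print(describe("text"))     # object: 'text'
```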
New hashlib.pbkdf2_hmac() function. (Contributed by Christian Heimes in issue 18582)
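A usage sketch for key derivation. The password, salt size and iteration count are illustrative values, not recommendations.

```python
import hashlib
import os

password = b"correct horse battery staple"   # illustrative password
salt = os.urandom(16)                        # random per-password salt
iterations = 100000                          # illustrative work factor

key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
print(len(key))   # 32, the SHA-256 digest size
```

The same password, salt and iteration count always produce the same key, so the salt and count need to be stored alongside the derived key.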
New hash algorithms sha3_224(), sha3_256(), sha3_384(), and sha3_512(). (Contributed by Christian Heimes in issue 16113.)
Added a new html.unescape() function that converts HTML5 character references to the corresponding Unicode characters. (Contributed by Ezio Melotti in issue 2927)
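For example, with a made-up string that mixes named and numeric character references:

```python
import html

escaped = "&lt;p&gt;caf&eacute; &amp; bar&#33;&lt;/p&gt;"
text = html.unescape(escaped)
print(text)   # <p>café & bar!</p>
```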
Added a new convert_charrefs keyword argument to HTMLParser that, when True, automatically converts all character references. For backward-compatibility, its value defaults to False, but it will change to True in future versions, so you are invited to set it explicitly and update your code to use this new feature. (Contributed by Ezio Melotti in issue 13633)
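A small sketch of setting the argument explicitly. TextCollector is a hypothetical subclass that just gathers the text it is handed.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):   # hypothetical helper subclass
    def __init__(self):
        # Ask the parser to convert character references itself.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


parser = TextCollector()
parser.feed("<p>fish &amp; chips</p>")
parser.close()

text = "".join(parser.chunks)
print(text)   # fish & chips
```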
The strict argument of HTMLParser is now deprecated. (Contributed by Ezio Melotti in issue 15114)
The inspect module now offers a basic command line interface to quickly display source code and other information for modules, classes and functions. (Contributed by Claudiu Popa and Nick Coghlan in issue 18626)
unwrap() makes it easy to unravel wrapper function chains created by functools.wraps() (and any other API that sets the __wrapped__ attribute on a wrapper function). (Contributed by Daniel Urban, Aaron Iles and Nick Coghlan in issue 13266)
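A minimal sketch with two layers of a hypothetical decorator built on functools.wraps():

```python
import functools
import inspect


def logged(func):
    # Hypothetical decorator; wraps() sets wrapper.__wrapped__ = func.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@logged
@logged
def greet(name):
    return "hello " + name


# Follow the __wrapped__ chain down to the innermost function.
original = inspect.unwrap(greet)
print(original is greet)                 # False
print(hasattr(original, "__wrapped__"))  # False
```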
As part of the implementation of the new enum module, the inspect module now has substantially better support for custom __dir__ methods and dynamic class attributes provided through metaclasses (Contributed by Ethan Furman in issue 18929 and issue 19030)
The default marshal version has been bumped to 3. The code implementing the new version restores the Python2 behavior of recording only one copy of interned strings and preserving the interning on deserialization, and extends this “one copy” ability to any object type (including handling recursive references). This reduces both the size of .pyc files and the amount of memory a module occupies when it is loaded from a .pyc (or .pyo) file. (Contributed by Kristján Valur Jónsson in issue 16475.)
mmap objects can now be weakref’ed. (Contributed by Valerie Lambert in issue 4885.)
On Unix, two new start methods (spawn and forkserver) have been added for starting processes using multiprocessing. These make the mixing of processes with threads more robust, and the spawn method matches the semantics that multiprocessing has always used on Windows. (Contributed by Richard Oudkerk in issue 8713).
Also, except when using the old fork start method, child processes will no longer inherit unneeded handles/file descriptors from their parents (part of issue 8713).
multiprocessing now relies on runpy (which implements the -m switch) to initialise __main__ appropriately in child processes when using the spawn or forkserver start methods. This resolves some edge cases where combining multiprocessing, the -m command line switch and explicit relative imports could cause obscure failures in child processes. (Contributed by Nick Coghlan in issue 19946)
New functions to get and set the inheritable flag of a file descriptor or a Windows handle:
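A short sketch using os.get_inheritable() and os.set_inheritable(), the file descriptor variants, on a freshly created pipe:

```python
import os

read_fd, write_fd = os.pipe()
try:
    # Under PEP 446, new descriptors start out non-inheritable.
    default_flag = os.get_inheritable(read_fd)
    # Opt in explicitly when a child process really needs the descriptor.
    os.set_inheritable(read_fd, True)
    changed_flag = os.get_inheritable(read_fd)
finally:
    os.close(read_fd)
    os.close(write_fd)

print(default_flag, changed_flag)   # False True
```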
The print command has been removed from pdb, restoring access to the print function.
Rationale: Python2’s pdb did not have a print command; instead, entering print executed the print statement. In Python3 print was mistakenly made an alias for the pdb p command. p, however, prints the repr of its argument, not the str like the Python2 print command did. Worse, the Python3 pdb print command shadowed the Python3 print function, making it inaccessible at the pdb prompt.
(Contributed by Connor Osborn in issue 18764.)
protocol 4
pickle now supports (but does not use by default) a new pickle protocol, protocol 4. This new protocol addresses a number of issues that were present in previous protocols, such as the serialization of nested classes, very large strings and containers, or classes whose __new__() method takes keyword-only arguments. It also provides some efficiency improvements.
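Because the new protocol is not the default here, it has to be requested explicitly. A minimal sketch with made-up data:

```python
import pickle

data = {"name": "example", "values": list(range(5))}   # illustrative data

# Request protocol 4 explicitly.
payload = pickle.dumps(data, protocol=4)
restored = pickle.loads(payload)

print(payload[:2])        # b'\x80\x04': protocol marker and version
print(restored == data)   # True
```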
New stls() method to switch a clear-text POP3 session into an encrypted POP3 session.
New capa() method to query the capabilities advertised by the POP3 server.
(Contributed by Lorenzo Catucci in issue 4473.)
The pprint module now supports compact mode for formatting long sequences (issue 19132).
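A quick sketch of the difference, using an arbitrary list and width. Without compact mode, a sequence that does not fit on one line is printed one item per line.

```python
import pprint

numbers = list(range(30))

regular = pprint.pformat(numbers, width=40)
compact = pprint.pformat(numbers, width=40, compact=True)

print(len(regular.splitlines()))   # one line per item
print(compact)                     # as many items per line as fit in the width
```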
pty.spawn() now returns the status value from os.waitpid() on the child process, instead of None. (Contributed by Gregory P. Smith.)
While significant changes have not been made to pydoc directly, its handling of custom __dir__ methods and various descriptor behaviours has been improved substantially by the underlying changes in the inspect module.
Added re.fullmatch() function and regex.fullmatch() method, which anchor the pattern at both ends of the string to match. (Contributed by Matthew Barnett in issue 16203.)
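A brief comparison with match(), which only anchors at the start of the string:

```python
import re

pattern = re.compile(r"\d+")

prefix_match = pattern.match("123abc")       # matches the leading "123"
whole_match = pattern.fullmatch("123abc")    # None: "abc" is left over
exact = re.fullmatch(r"\d+", "123")          # the whole string matches

print(prefix_match.group())   # 123
print(whole_match)            # None
print(exact.group())          # 123
```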
The repr of regex objects now includes the pattern and the flags; the repr of match objects now includes the start, end, and the part of the string that matched. (Contributed by Serhiy Storchaka in issue 13592 and issue 17087.)
New resource.prlimit() function and Linux specific constants. (Contributed by Christian Heimes in issue 16595 and issue 19324.)
SMTPException is now a subclass of OSError, which allows both socket level errors and SMTP protocol level errors to be caught in one try/except statement by code that only cares whether or not an error occurred. (issue 2118).
Socket objects have new methods to get or set their inheritable flag:
The socket.AF_* and socket.SOCK_* constants are enumeration values, using the new enum module. This allows descriptive reporting during debugging, instead of seeing integer “magic numbers”.
PROTOCOL_TLSv1_1 and PROTOCOL_TLSv1_2 (TLSv1.1 and TLSv1.2 support) have been added; support for these protocols is only available if Python is linked with OpenSSL 1.0.1 or later. (Contributed by Michele Orrù and Antoine Pitrou in issue 16692)
New diagnostic functions get_default_verify_paths(), cert_store_stats() and get_ca_certs() (Contributed by Christian Heimes in issue 18143 and issue 18147)
Add ssl.enum_cert_store() to retrieve certificates and CRL from Windows’ cert store. (Contributed by Christian Heimes in issue 17134.)
Support for server-side SNI using the new ssl.SSLContext.set_servername_callback() method. (Contributed by Daniel Black in issue 8109.)
The stat module is now backed by a C implementation in _stat. A C implementation is required as most of the values aren’t standardized and are platform-dependent. (Contributed by Christian Heimes in issue 11016.)
The module supports new file types: door, event port and whiteout.
Streaming struct unpacking using struct.iter_unpack(). (Contributed by Antoine Pitrou in issue 17804.)
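A small sketch with a made-up record layout (a little-endian unsigned short followed by an unsigned byte):

```python
import struct

record_format = "<HB"   # illustrative layout: 3 bytes per record
data = struct.pack(record_format, 1, 2) + struct.pack(record_format, 3, 4)

# Lazily yields one tuple per fixed-size record in the buffer.
records = list(struct.iter_unpack(record_format, data))
print(records)   # [(1, 2), (3, 4)]
```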
The sunau getparams() method now returns a namedtuple rather than a plain tuple. (Contributed by Claudiu Popa in issue 18901.)
sunau.open() now supports the context manager protocol (issue 18878).
New function sys.getallocatedblocks() returns the current number of blocks allocated by the interpreter (in CPython with the default --with-pymalloc setting, this is allocations made through the PyObject_Malloc() API). This can be useful for tracking memory leaks, especially if automated via a test suite. (Contributed by Antoine Pitrou in issue 13390.)
A new traceback.clear_frames() function takes a traceback object and clears the local variables in all of the frames it references, reducing the amount of memory consumed (issue 1565525).
Add support for data: URLs in urllib.request. (Contributed by Mathias Panzenböck in issue 16423.)
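For example, a base64-encoded payload can be read back without any network access. The URL below is made up and encodes the text "hello".

```python
import urllib.request

url = "data:text/plain;base64,aGVsbG8="   # illustrative data: URL

with urllib.request.urlopen(url) as response:
    body = response.read()

print(body)   # b'hello'
```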
Support for easy dynamically-generated subtests using the subTest() context manager. (Contributed by Antoine Pitrou in issue 16997.)
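A minimal sketch of a parameterised check. EvenTest and its values are hypothetical, and the test is run programmatically only to keep the example self-contained.

```python
import unittest


class EvenTest(unittest.TestCase):   # hypothetical test case
    def test_even(self):
        for value in (0, 2, 4):
            # Each iteration is reported separately if it fails,
            # and the remaining iterations still run.
            with self.subTest(value=value):
                self.assertEqual(value % 2, 0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(EvenTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```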
The wave getparams() method now returns a namedtuple rather than a plain tuple. (Contributed by Claudiu Popa in issue 17487.)
wave.open() now supports the context manager protocol. (Contributed by Claudiu Popa in issue 17616.)
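Both wave changes can be seen in one short sketch, which writes a tiny silent file to an in-memory buffer and reads its parameters back. The audio settings are arbitrary.

```python
import io
import wave

buffer = io.BytesIO()

# The with statement finalises the WAV header on exit.
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)        # bytes per sample
    writer.setframerate(8000)
    writer.writeframes(b"\x00\x00" * 100)

buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    params = reader.getparams()

# Fields are now available by name as well as by position.
print(params.nchannels, params.sampwidth, params.framerate, params.nframes)
```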
New WeakMethod class simulates weak references to bound methods. (Contributed by Antoine Pitrou in issue 14631.)
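A sketch of why the class is needed: a bound method object is created afresh on each attribute access, so a plain weak reference to it would die immediately. Button is a hypothetical class, and the final result relies on CPython releasing the object as soon as its last reference goes away.

```python
import weakref


class Button:   # hypothetical class for illustration
    def click(self):
        return "clicked"


button = Button()
ref = weakref.WeakMethod(button.click)

# While the instance is alive, calling the reference rebuilds the bound method.
alive_call = ref()()
print(alive_call)   # clicked

del button
dead = ref()        # None once the instance is gone (immediate on CPython)
print(dead)
```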
New finalize class makes it possible to register a callback to be invoked when an object is garbage collected, without needing to carefully manage the lifecycle of the weak reference itself. (Contributed by Richard Oudkerk in issue 15528)
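A minimal sketch, assuming CPython's reference counting so that the callback runs as soon as the last reference is dropped. Resource is a hypothetical class.

```python
import weakref


class Resource:   # hypothetical class for illustration
    pass


messages = []
res = Resource()

# Register a callback (with its arguments) to run when 'res' is collected.
finalizer = weakref.finalize(res, messages.append, "resource released")
print(finalizer.alive)   # True

del res                  # last reference gone: the callback runs
print(messages)          # ['resource released']
print(finalizer.alive)   # False
```

The finalizer object keeps itself alive until it has run, so there is no need to store it anywhere just to keep the callback registered.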
Add an event-driven parser for non-blocking applications, XMLPullParser. (Contributed by Antoine Pitrou in issue 17741.)
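A short sketch of feeding a document in arbitrary chunks, as a non-blocking application would when data arrives piecemeal. The XML is made up.

```python
from xml.etree.ElementTree import XMLPullParser

parser = XMLPullParser(events=("start", "end"))
seen = []

# Chunks may split the document anywhere, even in the middle of an element.
for chunk in ("<root><item>one", "</item><item>two</item>", "</root>"):
    parser.feed(chunk)
    # Drain whatever events have become available so far, without blocking.
    for event, elem in parser.read_events():
        seen.append((event, elem.tag))

parser.close()
print(seen)
```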
PyZipFile.writepy() now accepts a filter function that can be used to ignore some packages (tests, for instance). (Contributed by Christian Tismer in issue 19274.)
PEP 445 adds new C level interfaces to customize memory allocation in the CPython interpreter.
PEP 442 removes the current limitations and quirks of object finalization in CPython. With it, objects with __del__() methods, as well as generators with finally clauses, can be finalized when they are part of a reference cycle.
As part of this change, module globals are no longer forcibly set to None during interpreter shutdown in most cases, instead relying on the normal operation of the cyclic garbage collector. This avoids a whole class of interpreter-shutdown-time errors, usually involving __del__ methods, that have plagued Python since the cyclic GC was first introduced.
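The effect on reference cycles can be sketched as follows. Node is a hypothetical class, and before this change the two objects would have been left uncollected.

```python
import gc

finalized = []


class Node:   # hypothetical class with a finalizer
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


a, b = Node("a"), Node("b")
a.other, b.other = b, a   # build a reference cycle
del a, b                  # only the cycle keeps the objects alive now

gc.collect()              # the cyclic collector finalizes both objects
print(sorted(finalized))  # ['a', 'b']
```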
PEP 456 follows up on earlier security fix work done on Python’s hash algorithm to address certain DOS attacks to which public facing APIs backed by dictionary lookups may be subject. (See issue 14621 for the start of the current round of improvements.) The PEP unifies CPython’s hash code to make it easier for a packager to substitute a different hash algorithm, and switches Python’s default implementation to a SipHash implementation on platforms that have a 64 bit data type. Any performance differences in comparison with the older FNV algorithm are trivial.
The PEP adds additional fields to the sys.hash_info struct sequence to describe the hash algorithm in use by the currently executing binary. Otherwise, the PEP does not alter any existing CPython APIs.
“Argument Clinic” (PEP 436) is now part of the CPython build process and can be used to simplify the process of defining and maintaining accurate signatures for builtins and standard library extension modules implemented in C.
Note
The Argument Clinic PEP is not fully up to date with the state of the implementation. This has been deemed acceptable by the release manager and core development team in this case, as Argument Clinic will not be made available as a public API for third party use in Python 3.4.
The UTF-32 decoder is now 3x to 4x faster.
The cost of hash collisions for sets is now reduced. Each hash table probe now checks a series of consecutive, adjacent key/hash pairs before continuing to make random probes through the hash table. This exploits cache locality to make collision resolution less expensive.
The collision resolution scheme can be described as a hybrid of linear probing and open addressing. The number of additional linear probes defaults to nine. This can be changed at compile-time by defining LINEAR_PROBES to be any value. Set LINEAR_PROBES=0 to turn off linear probing entirely.
(Contributed by Raymond Hettinger in issue 18771.)
The interpreter starts about 30% faster. A couple of measures lead to the speedup. The interpreter loads fewer modules on startup, e.g. the re, collections and locale modules and their dependencies are no longer imported by default. The marshal module has been improved to load compiled Python code faster.
(Contributed by Antoine Pitrou, Christian Heimes and Victor Stinner in issue 19219, issue 19218, issue 19209, issue 19205 and issue 9548)
bz2.BZ2File is now as fast as or faster than the Python2 version for most cases. lzma.LZMAFile has also been optimized. (Contributed by Serhiy Storchaka and Nadeem Vawda in issue 16034.)
This section covers various APIs and other features that have been deprecated in Python 3.4, and will be removed in Python 3.5 or later. In most (but not all) cases, using the deprecated APIs will produce a DeprecationWarning when the interpreter is run with deprecation warnings enabled (for example, by using -Wd).
XXX: None so far
The following previously deprecated APIs and features have been removed in Python 3.4:
Support for the following operating systems has been removed from the source and build tools:
- OS/2 (issue 16135).
- Windows 2000 (changeset e52df05b496a).
- VMS (issue 16136).
The unmaintained Misc/TextMate and Misc/vim directories have been removed (see the devguide for what to use instead).
The SO makefile macro is removed (it was replaced by the SHLIB_SUFFIX and EXT_SUFFIX macros) (issue 16754).
The PyThreadState.tick_counter field has been removed; its value has been meaningless since Python 3.2, when the “new GIL” was introduced.
This section lists previously described changes and other bugfixes that may require changes to your code.