Releases: dimitri/pgloader
pgloader v3.5.1
This release is mainly about lots of bug fixes thanks to user reports in GitHub issues, and also contains heavily optimised code for preparing the COPY buffers. This optimisation comes with the realisation that when using pgloader to migrate from a source database to PostgreSQL, it's best to fail early. As a result, we don't keep batches in memory at all (by default) in those cases, which clearly reduces pgloader's CPU and memory usage.
As usual, for a full list of changes you may have a look at the git history here: v3.4.1...v3.5.1.
Guessing CSV formats
In this version of pgloader, when you load a CSV file into a target table without specifying the CSV format, pgloader expects the file to match the target table definition. This means that pgloader can guess the separator and quoting rules used in your source file!
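To make the idea concrete, here is a minimal Python sketch of one way such guessing could work: try candidate separators and quote characters, and keep the first pair that splits every sampled row into as many fields as the target table has columns. This is only an illustration of the principle, not pgloader's actual implementation; the function name `guess_csv_format`, the candidate lists and the sample data are all assumptions.

```python
import csv
import io


def guess_csv_format(sample, expected_columns,
                     separators=(",", ";", "\t", "|"),
                     quotes=('"', "'")):
    """Return the first (separator, quote) pair for which every sampled
    row has exactly expected_columns fields, or None if nothing fits."""
    for sep in separators:
        for quote in quotes:
            reader = csv.reader(io.StringIO(sample),
                                delimiter=sep, quotechar=quote)
            rows = list(reader)
            if rows and all(len(row) == expected_columns for row in rows):
                return sep, quote
    return None


# Hypothetical sample: three columns, semicolon-separated, with quoted
# fields that themselves contain the separator.
sample = '1;"Doe; John";42\n2;"Smith; Ann";37\n'
guess = guess_csv_format(sample, expected_columns=3)
print(guess)
```

The number of columns in the target table is what makes the guess possible: a wrong separator or quote character yields rows with the wrong number of fields.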
Casting Rules
It's now possible in User Defined Casting Rules to specify new guards and actions, as seen in the documentation.
Other improvements
pgloader now supports loading data into PostgreSQL Foreign Tables and Partitioned Tables.
Lots of bugs were fixed in the PostgreSQL support, in the SQLite support, in the MySQL support and in the MS SQL support.
SQLite improvements
The multiple ways SQLite can represent primary keys and indexes can be confusing, and pgloader got smarter about that. pgloader now deals with more default values for SQLite too.
MySQL improvements
When loading a huge MySQL table it's now possible to have pgloader work with more than one reader in parallel, each reader querying a range of primary key values from the source database. This technique might improve reading times in some cases.
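As a rough illustration of the range technique, the Python sketch below splits a span of primary key values into contiguous ranges, one per reader. It assumes an integer primary key with known minimum and maximum values; the function name `primary_key_ranges` and the numbers are hypothetical, and how pgloader actually computes its ranges is not described here.

```python
def primary_key_ranges(min_id, max_id, readers):
    """Split [min_id, max_id] into at most `readers` contiguous,
    non-overlapping ranges of nearly equal size."""
    if readers < 1:
        raise ValueError("readers must be at least 1")
    if max_id < min_id:
        return []
    total = max_id - min_id + 1
    readers = min(readers, total)        # never more ranges than keys
    base, extra = divmod(total, readers)
    ranges = []
    start = min_id
    for i in range(readers):
        size = base + (1 if i < extra else 0)  # spread the remainder
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Each reader would then query only its own slice of the table,
# for example with a "WHERE id BETWEEN low AND high" condition.
for low, high in primary_key_ranges(1, 10, 3):
    print(f"reader range: {low}..{high}")
```

Ranges of equal width only balance the work when key values are evenly distributed, which is one reason the technique improves reading times in some cases only.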
MySQL connection strings can now use the useSSL parameter, or an sslmode parameter as in PostgreSQL URIs.
In the previous release, pgloader changed its default to target a PostgreSQL schema created with the name of the MySQL database. In this release, pgloader also automatically adds that schema to the database search_path.
Load file templates
In pgloader v3.5.1 it's now possible to use the https://mustache.github.io templating engine in load files. Values can be given via the new command line parameter --context or from the process environment. This allows using the same load file with different source files, for instance, and has been a long-requested feature for pgloader.
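The following Python sketch shows the general idea of rendering a template from a context and the process environment. It is a stand-in that only handles plain `{{NAME}}` variables, not the Mustache engine and not pgloader's implementation. The `render` function, the lookup order (context first, then environment), the variable names and the template text are assumptions made for illustration. Unlike Mustache, which renders a missing variable as an empty string, this sketch fails early on an unknown name.

```python
import os
import re

# Matches {{NAME}} with optional spaces inside the braces.
VARIABLE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render(template, context=None, environ=None):
    """Replace each {{NAME}} with its value, looking in `context` first
    and then in `environ` (the process environment by default)."""
    context = context or {}
    environ = os.environ if environ is None else environ

    def lookup(match):
        name = match.group(1)
        if name in context:
            return str(context[name])
        if name in environ:
            return str(environ[name])
        raise KeyError(f"no value for template variable {name!r}")

    return VARIABLE.sub(lookup, template)


# Hypothetical template text, loosely shaped like a load command.
template = "load database from '{{DBPATH}}' into {{PGURL}};"
print(render(template,
             context={"DBPATH": "/tmp/source.db"},
             environ={"PGURL": "postgresql:///target"}))
```

Keeping the values outside the template is what lets one load file serve several source files.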
pgloader 3.4.1 is now available!
This new release brings stability to the table. Both memory allocation optimization and error handling from the Command Line have been strong focus points in preparing this release of pgloader. It is intended to be a just works release... well otherwise you know where to open new issues.
Users of MS SQL will appreciate a lot of bug fixes and improvements to the coverage of their source database.
The parallelism features introduced in 3.3.1 have been overhauled and simplified internally, without changing the user facing knobs for them. Please report oddities if you find some.
You can read a full article about the release, "from MySQL to PostgreSQL", on the blog!
As usual, enjoy Free Software, enjoy pgloader and enjoy PostgreSQL!
MySQL to PostgreSQL and schema target
When converting from MySQL to PostgreSQL with this new release of pgloader, the default is now to target (and create) a PostgreSQL schema with the same name as the MySQL database. If you want to target the public schema instead, use a load file with the following command:
ALTER SCHEMA 'dbname' RENAME TO 'public' -- in pgloader command.load file
Given this command, pgloader then registers your source tables into the schema given in the load file, and the PostgreSQL commands all target this schema. This also applies to data only migrations where the target schema has been created by a tool for you.
If you want to use your new PostgreSQL database easily with the new schema, you might want to alter PostgreSQL's target database to include it in the search_path automatically:
alter database dbname set search_path to dbname, public; -- at PostgreSQL prompt
pgloader 3.3.2 is now available!
This is a maintenance release triggered by Debian users having to deal with a bug already fixed upstream: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843555
pgloader 3.3.1
pgloader 3.3.1 is now available!
This release contains about a year of changes to pgloader: code cleanup, new features, improvements and bug fixes. Thanks to all users who report use cases and bugs and open issues!
changelog
See the detailed changelog thanks to git log and the GitHub view of it here: v3.2.2...v3.3.1
contributors
A lot of you guys did contribute to pgloader v3.3.1: thanks for joining the fun!
sponsoring
A special thanks to our new sponsors! Thanks to them our support for MS SQL is now solid and usable for everyone, and it's possible to properly load data into a pre-created PostgreSQL schema (see the ORM case below). If you want to consider sponsoring a pgloader feature, consider buying a pgloader Moral License as detailed at http://pgloader.io/pgloader-moral-license.html.
Thanks to sponsors, pgloader 3.3.1 is now even able to rewrite some Partial Indexes WHERE clauses automatically!
Sponsors are listed at http://pgloader.io/sponsors.html when they want to be. It's easy to have your name here too!
release notes
Plenty of things are worth noting; let's focus on the main themes found in the 179 commits between pgloader 3.2.2 and 3.3.1.
online schema changes
It's now possible to use the new pgloader clauses ALTER TABLE and ALTER SCHEMA to edit the matching between source tables and their destinations.
parallelism
The way the parallelism is handled by pgloader now offers more options and the ability to use even more cores! To that effect, we first make it possible for the next table load to begin straight away rather than waiting for the all-parallel index creation to be done on the previous just-loaded table. Then more options are available, see the code comment for details:
We allow WORKER-COUNT simultaneous workers to be active at the same time
in the context of this COPY object. A single unit of work consist of
several kinds of workers:
- a reader getting raw data from the COPY source with `map-rows',
- N transformers preparing raw data for PostgreSQL COPY protocol,
- N writers sending the data down to PostgreSQL.
The N here is setup to the CONCURRENCY parameter: with a CONCURRENCY of
2, we start (+ 1 2 2) = 5 concurrent tasks, with a CONCURRENCY of 4 we
start (+ 1 4 4) = 9 concurrent tasks, of which only WORKER-COUNT may be
active simultaneously.
To that end, new parallelism control options are available; check the section A NOTE ABOUT PARALLELISM in the main documentation.
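The arithmetic in the code comment above can be checked with a few lines of Python. The function names are hypothetical; the formula is the one from the comment: one reader, N transformers and N writers, with N set to the CONCURRENCY parameter.

```python
def concurrent_tasks(concurrency):
    """Tasks started for one COPY object:
    1 reader + N transformers + N writers, where N = concurrency."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return 1 + concurrency + concurrency


def waiting_tasks(concurrency, worker_count):
    """Tasks that cannot be active right away, since only
    worker_count tasks may be active simultaneously."""
    return max(0, concurrent_tasks(concurrency) - worker_count)


for n in (2, 4):
    print(f"concurrency={n}: {concurrent_tasks(n)} concurrent tasks")
```

With a concurrency of 4 and a worker count of 6, for instance, 9 tasks are started but 3 of them have to wait for a free worker.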
migrate to a pre-existing schema
Also, the infamous ORM case is now handled correctly. It's possible for pgloader to migrate from a source database to a pre-installed target in PostgreSQL, where the schema has already been installed, either manually or by your favorite tooling. See the test file test/sakila-data.load which uses that option when migrating from MySQL, using the following options:
WITH concurrency = 1, workers = 6,
max parallel create index = 4,
create no tables, include drop, truncate
On the code cleaning front, some refactoring did take place. The main part of it is the introduction of our own internal catalog representation, which enables the previous feature by making it possible to load and compare metadata from MySQL and PostgreSQL.
docker
pgloader now includes Dockerfiles for both SBCL and CCL, and is using the DockerHub service so that a build is triggered at each commit pushed to the master branch. See the following URL for details, and please use those pre-made docker images if you need them!
https://hub.docker.com/r/dimitri/pgloader/
bundle distribution
This 3.3.1 release is also the first release to come with a bundle file. This distribution format makes it easy to build pgloader from a single source archive that vendors in all the build dependencies. Look, if you don't use any library then you're mocked for choosing a poor programming language ecosystem where you have to do it all yourself, and when you pick a language that offers plenty of libs then packagers don't want to have to do the legwork themselves until users have proven their interest. And the users are basing their choice on the availability of the libs in their favorite distribution. Can you spell chicken and egg?
So, I hope having the bundle distribution of pgloader will help fellow packagers to work on including pgloader in their favorite distribution.
pgloader 3.2.2
pgloader 3.2.2 is now available!
This is the first release done in source format only. The previously available binary images were not good enough and I am not in a position to offer a good service here, so please check with the packagers of your OS of choice to obtain a binary release there directly. That said, I am taking care of Debian.
Release Notes
This release is mostly about lots of bug fixes, answering many GitHub issues. Thanks everybody for reporting bugs! Some of you did send Pull Requests (aka patches or bug fixes) along with the bug report, let's hope that more of you are going to do that in the future ;-)
Release Early, release often
I believe that release early is something that has been done correctly in pgloader, but the release often part has been neglected up to now. My intention is to fix that by only doing the parts I know how to do: source code and Debian package. Given that organisation it's now quite easy for me to cut a release, and I intend on doing that often enough!
pgloader 3.2.1 preview
This is a preview release that contains some bugfixes for pgloader 3.2.0 found by early testers. Not all binary formats are covered yet. It's an interim release motivated by a hosting server going down right when pgloader made it to Hacker News, done in the middle of the night to serve curious visitors: enjoy ;-)
It's a pre-release but as only bug fixes made it on top of pgloader 3.2.0, it's as safe as 3.2.0 really, just try it and tell me!
Binary Files
You have a choice of .pkg for MacOSX systems, .deb for debian sid (no backport to testing or stable yet, but these days I would expect libc to be at the same version, so it's worth a try), and an RPM for CentOS 6.4 (Final).
Runtime dependencies
You might need to install the freetds-devel and openssl-devel packages depending on the features you're using from pgloader, freetds being the MS SQL driver.
CentOS binary file
The tar contains a single file pgloader that is an almost static binary:
[vagrant@localhost vagrant]$ cat /etc/redhat-release
CentOS release 6.4 (Final)
[vagrant@localhost vagrant]$ ldd ./build/bin/pgloader
linux-vdso.so.1 => (0x00007fff9d5ff000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fcc88183000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcc87f66000)
libz.so.1 => /lib64/libz.so.1 (0x00007fcc87d4f000)
libm.so.6 => /lib64/libm.so.6 (0x00007fcc87acb000)
libc.so.6 => /lib64/libc.so.6 (0x00007fcc87738000)
/lib64/ld-linux-x86-64.so.2 (0x00007fcc88390000)