Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide precompiled gem for Linux #575

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

andyundso
Copy link
Member

@andyundso andyundso commented Jan 8, 2025

Closes #569.

Background

If I read that correctly from the project's history, tiny_tds used to ship with an extensive mini_portile configuration prior to v2 to build FreeTDS. This has been removed with v2, see #345. Now you need to bring your own FreeTDS, which tiny_tds will find and compile against.

Compiling FreeTDS yourself is not too complicated, especially since you can follow along with the README. However, I noted the general sentiment of the Ruby community of shipping gems precompiled to you, not only for Windows, but also Linux and Mac OS. nokogiri and sqlite3-ruby have been doing this for a while, and I saw that the pg gem prepared for a Linux build. So I feel it would be good to follow along and also provide a precompiled gem for Linux.

Approach

There are two approaches on how you can ship precompiled gems: static or dynamic. On that note, we also have to know that for each supported Ruby version of the precompiled gem, a separate compiled file has to be shipped. Right now, our Windows gem ships with 6 different compiled tiny_tds files for each supported Ruby version.

If you would ship with a statically compiled tiny_tds file, all our dependencies (FreeTDS, iconv, OpenSSL) are squeezed into the resulting tiny_tds files. I tried that out, and it resulted in a massive gem file (60 MB compressed), since we had all the dependencies times 6, for each supported Ruby version. So this was not feasable.

Dynamic means: You compile a library with all its dependencies visible and tell it "you will find it at runtime" again, don't worry. It is a good way to safe space. This is also how the current tiny_tds build works. After the results of building the static version, I felt like this is the best way to go, considering our dependency size.

Changes

First commit: Partially static build

I heavily looked into what has been done in the pg, and decided to replicate the changes in order to have it easier to implement the Linux support. So the first commit is mostly aligning the build system.

The gist of the changes is:

  • Ports are now build as part of rake compile when cross-compiling instead of having a separate task.
  • OpenSSL and iconv are built as a static library and are linked into FreeTDS, which remains a dynamic library. This means we only have to ship a single library and not the entire ports directory.
  • The OpenSSL build should be much faster since only the required libraries are now built.

Next 4 commits: Support for Linux

The first commit contains the most changes.

Generally, I decided to ship four different Linux gems:

  • One for x86_64 and glibc (the usual case)
  • One for x86_64 and musl (when running an Alpine OS)
  • One for ARM64 and glibc (could be for AWS Graviton or folks with M1-4 CPUs using Docker)
  • One for ARM64 and musl (mostly for folks with M1-4 CPUs using Docker)

Then most of it was declaring Windows and Linux specific compiler flags.

There is one interesting thing, which is this line in extconf.rb:

 $LDFLAGS << " '-Wl,-rpath=$$ORIGIN/../../../ports/#{gem_platform}/lib'"

When the linker links our precompiled tiny_tds gem against FreeTDS, it passes it "runtime paths", basically information where it find different required libraries, like FreeTDS. Of course, our gem could be installed anywhere, so we cannot use absolute paths here. This little flag tells the linker to tell the resulting file that it should look in our ports directory from its current location for the FreeTDS library.

I also extended the test suite in this commit to test the resulting files and make sure they pass the tests. I do not run tests for anything else than the regular x86_64 GNU build, but I test the installation on the other platforms in Docker according to a template by flavorjones.

Last commit: Fix binstubs

tiny_tds ships with binstubs to call defncopy and tsql from FreeTDS. tsql is for connection diagnostics, defncopy is used by the SQL server adapter to dump schemas. I noted that these have been broken prior to my changes, so I thought it might be good to fix them.

  • The precompiled binaries were not included in the final gem since I touched the build system.
  • There is some logic to find tsql. On Windows, the tsql.exe by the msys2 system did not pass the binary? check. I removed it altogether since I did not see an advantage of having it.
  • On Linux, I had to tell the resulting tsql and defncopy where to find its sybdb library file with a runtime path, similar as with the tiny_tds files.
  • Execution of the wrappers is now tested across all platforms.

Consideration

While this PR will simply the installation for tiny_tds on Linux, and reduces the size for Windows users, it makes our build process more complicated and also larger because of many additional tests.

As I have written in the initial paragraph, I would like to follow the example of other gems and simplify the installation process. Preparing this PR cost me a lot of time since I was not familiar with the C build system at all (and FreeTDS makes things extra difficult for a beginner with cmake and libtool). It has been 8 years since the old build process was touched, and I expect a similar life span for these changes. I do not really expect more maintenance effort from our side, but it will require some more frequent updates to keep the dependencies inside the gem up-to-date.

@andyundso
Copy link
Member Author

the PR is almost ready.

  • I need to change the sysconfdir parameter for FreeTDS based on the platform.
  • Tests are also sometimes flaky, requiring two or three runs. Maybe I can debug this somehow.

You can use this parameter to specify a `freetds.conf` there. per default, it will use the build path, which will be some obscure thing on GitHub Actions. Although most people will customize this location at runtime using environment variables, it still makes sense to provide sensible defaults.
Copy link
Contributor

@aharpervc aharpervc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, amazing 😮

It has been 8 years since the old build process was touched,

Yeah ugh, incredible that you had the time to pick this apart...

I tried that out, and it resulted in a massive gem file (60 MB compressed), since we had all the dependencies times 6, for each supported Ruby version. So this was not feasable.

I'm not understanding this, if we're shipping binaries now why is it different static vs dynamic, wouldn't the binary be different per-build-target? Or are you saying that approach builds/ships all targets in 1 gem? Also what do you mean by times 6?

Ports are now build as part of rake compile when cross-compiling instead of having a separate task.

Can you share more of your rationale here? It seems like having a separate task is useful to debug compiling tiny tds C vs dependency C. Also if you're choosing to build dynamic libraries doesn't that mean we'd have separate compile tasks regardless? So I don't understand merging the rake tasks like this

Windows & Linux

Lastly for the moment, (maybe you said this), why not also macOS? It seems virtuous to treat all OS's equally for compilation/distribution if possible


To say it... incredible...!

Rakefile Outdated
Comment on lines 10 to 17
CrossLibraries = [
['x64-mingw-ucrt', 'mingw64'],
['x64-mingw32', 'mingw64'],
['x86_64-linux-gnu', 'linux-x86_64'],
['x86_64-linux-musl', 'linux-x86_64'],
['aarch64-linux-gnu', 'linux-aarch64'],
['aarch64-linux-musl', 'linux-aarch64'],
].map do |platform, openssl_config|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some weird indentation here, is it tabs vs spaces?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think introducing a formatter for my own sake would be good 😅

... #574 :)

Rakefile Outdated Show resolved Hide resolved
lib/tiny_tds.rb Outdated
Comment on lines 31 to 32
# libpq is found by a relative rpath in the cross compiled extension dll
# or by the system library loader
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's libpq?

lib/tiny_tds.rb Outdated
block.call
end
end

# Temporary add bin directories for DLL search, so that freetds DLLs can be found.
add_dll_paths.call( TinyTds::Gem.ports_bin_paths ) do
# Add a load path to the one retrieved from pg_config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah wait did this get copy/pasted from pg 😆

@@ -115,6 +115,7 @@ def configure_defaults

recipe.configure_options << "--with-openssl=#{openssl_recipe.path}"
recipe.configure_options << "--with-libiconv-prefix=#{libiconv_recipe.path}"
recipe.configure_options << "--sysconfdir=#{MiniPortile.windows? ? "C:/Sites" : "/usr/local/etc"}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/should this come from env var?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A user can overwrite it at runtime:
https://www.freetds.org/userguide/freetdsconf.htm

@andyundso
Copy link
Member Author

I tried that out, and it resulted in a massive gem file (60 MB compressed), since we had all the dependencies times 6, for each supported Ruby version. So this was not feasable.

I'm not understanding this, if we're shipping binaries now why is it different static vs dynamic, wouldn't the binary be different per-build-target? Or are you saying that approach builds/ships all targets in 1 gem? Also what do you mean by times 6?

It's maybe easier to show. This is the structure of a precompiled tiny_tds gem:

$ tree /tmp/tiny_tds/lib/
/tmp/tiny_tds/lib/
├── tiny_tds
│   ├── 2.7
│   │   └── tiny_tds.so
│   ├── 3.0
│   │   └── tiny_tds.so
│   ├── 3.1
│   │   └── tiny_tds.so
│   ├── 3.2
│   │   └── tiny_tds.so
│   ├── 3.3
│   │   └── tiny_tds.so
│   ├── 3.4
│   │   └── tiny_tds.so

Now if we look into the file size of one of these files when everything is statically compiled, it comes out as follows:

$ ls -l  /tmp/tiny_tds/lib/tiny_tds/2.7/tiny_tds.so 
-rwxr-xr-x 1 andy andy 9264808 Jan  7 23:00 /tmp/tiny_tds/lib/tiny_tds/2.7/tiny_tds.so

So it's 9.2 MB. For each supported Ruby version, each of these shared library is 9.2 MB since all dependencies are statically compiled into it. This is what I meant by times 6.

Ports are now build as part of rake compile when cross-compiling instead of having a separate task.

Can you share more of your rationale here? It seems like having a separate task is useful to debug compiling tiny tds C vs dependency C. Also if you're choosing to build dynamic libraries doesn't that mean we'd have separate compile tasks regardless? So I don't understand merging the rake tasks like this

It was originally just to align with pg. During the development of this PR, I also did not have the need to compile ports and the gem itself separately. Another nice advantage is that we no longer need to pass host around in the ports. host was to originally invoke the correct gcc and linker when cross compiling (e.g. x86_64-w64-mingw32 when compiling for x64-mingw-ucrt). but when we compile the ports through rake-compiler, it somehow sets everything correct from the start.

I just tried to separate the builds and I am not super happy about it.

When we compile FreeTDS statically with OpenSSL and iconv, when compiling tiny_tds, it needs to see these references again (-lssl -lcrypto and so). You can do this by saying have_library(ssl) in the extconf file. But then OpenSSL needs different OS libraries on Windows and Linux, so we also have to declare that (e.g. -lcrypt32 on Windows). So we partially copy our compile arguments, which I do not like.

When both ports and tiny_tds are compiled inside extconf, we do not have that problem because all the references have been set during the build. So I would suggest to keep everything inside extconf.

Windows & Linux

Lastly for the moment, (maybe you said this), why not also macOS? It seems virtuous to treat all OS's equally for compilation/distribution if possible

to be honest, installation of FreeTDS on Mac OS is really easy and I personally do not have a need for a precompiled version on Mac. If there is a need by the community, I am happy to ship it at a later point.

@andyundso
Copy link
Member Author

andyundso commented Jan 10, 2025

so:

  • sysconfdir is now only set on Windows because most common paths on Linux require root to write there.
  • I need to investigate the flaky tests on Windows in a different PR. I first need a Windows box up and running.

otherwise I think this PR is ready to merge.

@aharpervc
Copy link
Contributor

aharpervc commented Jan 10, 2025

For each supported Ruby version, each of these shared library is 9.2 MB since all dependencies are statically compiled into it. This is what I meant by times 6.

I'm still working through the implications here & running the branch locally. However I'm trying to remember what about this makes them different for different Ruby versions. I understand why tiny_tds itself needs to be compiled n times (per target platform/architecture?), since we're building against ruby.h. I'm putting it together here... you're saying now, static linking is better, therefore the dependencies are compiled together with tiny_tds, therefore we have n distinct copies. Did I work it out right?

And then if I'm getting it, does the equation change here if we switched to dynamic linking (still providing the compiled dependencies blob in the gem)? freetds/libiconv/openssl won't have the Ruby dependency and could be compiled once (per target platform/architecture), and then we have that dependency blob + 6 tiny_tds.so's. And that tiny_tds.so would be really small (...tiny... 😆) so 6x doesn't seem so bad.

I re-read your comments above but I'm still unclear on why you're preferring to avoid dynamic linking, can you expand on that more? If we did it that way, Windows & Linux would be more similar to macOS, right? Because in all cases the dependencies are dynamically linked on macOS (we are declining to provide a statically compiled fat gem for that OS)

@aharpervc
Copy link
Contributor

aharpervc commented Jan 10, 2025

Okay, I built the gem from this branch okay

but it failed to install. Does this seem like a "me problem" or something changed here?

$ ruby -v
ruby 3.1.6p260 (2024-05-29 revision a777087be6) [x64-mingw-ucrt]

$ gem install .\pkg\tiny_tds-3.2.0-x64-mingw-ucrt.gem
<internal:C:/ruby/316-1-x64/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require': 126: The specified module could not be found.   - C:/ruby/316-1-x64/lib/ruby/gems/3.1.0/gems/date-3.3.4/lib/date_core.so (LoadError)

(never mind, "me problem")

@andyundso
Copy link
Member Author

well this is somewhat complicated to explain over text 😅.

I understand why tiny_tds itself needs to be compiled n times (per target platform/architecture?), since we're building against ruby.h.

Correct.

I'm putting it together here... you're saying now, static linking is better, therefore the dependencies are compiled together with tiny_tds, therefore we have n distinct copies. Did I work it out right?

I think we have a misunderstanding here: I do not think linking everything statically in our case is better, because of the size of the dependencies.

And then if I'm getting it, does the equation change here if we switched to dynamic linking (still providing the compiled dependencies blob in the gem)? freetds/libiconv/openssl won't have the Ruby dependency and could be compiled once (per target platform/architecture), and then we have that dependency blob + 6 tiny_tds.so's. And that tiny_tds.so would be really small (...tiny... 😆) so 6x doesn't seem so bad.

This is now exactly how it works now with this PR. One large dependency blob, 6 tiny tiny_tds.

$ tree --du -h ports/x86_64-linux-gnu/lib lib/tiny_tds
[9.1M]  ports/x86_64-linux-gnu/lib
└── [9.1M]  libsybdb.so.5
[293K]  lib/tiny_tds
├── [ 47K]  2.7
│   └── [ 43K]  tiny_tds.so
├── [ 47K]  3.0
│   └── [ 43K]  tiny_tds.so
├── [ 47K]  3.1
│   └── [ 43K]  tiny_tds.so
├── [ 47K]  3.2
│   └── [ 43K]  tiny_tds.so
├── [ 47K]  3.3
│   └── [ 43K]  tiny_tds.so
├── [ 47K]  3.4
│   └── [ 43K]  tiny_tds.so

Just for completeness / comparison: This is how the ports look for the current version of tiny_tds (v3.1.0):

$ tree ports/
ports/
└── x64-mingw-ucrt
    ├── freetds
    │   └── 1.4.23
    │       ├── bin
    │       │   ├── bsqldb.exe
    │       │   ├── datacopy.exe
    │       │   ├── defncopy.exe
    │       │   ├── freebcp.exe
    │       │   ├── libct-4.dll
    │       │   ├── libsybdb-5.dll
    │       │   ├── osql
    │       │   ├── tdspool.exe
    │       │   └── tsql.exe
    │       └── lib
    │           ├── libct.dll.a
    │           ├── libct.la
    │           ├── libsybdb.dll.a
    │           └── libsybdb.la
    ├── libiconv
    │   └── 1.17
    │       ├── bin
    │       │   ├── iconv.exe
    │       │   ├── libcharset-1.dll
    │       │   └── libiconv-2.dll
    │       └── lib
    │           ├── libcharset.dll.a
    │           ├── libcharset.la
    │           ├── libiconv.dll.a
    │           └── libiconv.la
    └── openssl
        └── 3.4.0
            └── bin
                ├── c_rehash
                ├── libcrypto-3-x64.dll
                ├── libssl-3-x64.dll
                └── openssl.exe

basically, everything is a shared library and dynamically linked. When starting tiny_tds on Windows, we mess with the PATH to ensure our shared libraries are loaded first (happening here). I could have built a similar solution for Linux, instead of messing with PATH you would have to modify LD_LIBRARY_PATH or add a dependency on fiddle. But I like that we have only one dependency blob now, makes things much cleaner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a precompiled gem for Mac OS X and Linux
2 participants