[Initiative] Improve Cabal documentation structure to become more beginner-friendly #9214

Open
8 of 19 tasks
malteneuss opened this issue Aug 27, 2023 · 13 comments

Comments

@malteneuss
Collaborator

malteneuss commented Aug 27, 2023

Additional context
Haskell's development tooling has matured a lot in recent years. To me, one of the important areas for improvement is documentation.

What is wrong with the docs?
Having recently switched from Stack back to Cabal, I struggled to find examples and explanations for parts of a .cabal file, typical use cases and terminology. I'm convinced that this has to do with the overall documentation structure. I propose to introduce a clear(er) division between tutorials, guides, reference and explanations as described in https://documentation.divio.com/ and to follow a structure similar to the documentation for Rust's package manager Cargo: https://doc.rust-lang.org/cargo/index.html.

The issues I see

If you also see the need to improve the documentation, feedback is welcome, including on what else to do, what to do first, and how. I started with a small improvement in #9212.

@lsmor

lsmor commented Aug 27, 2023

Recently, as part of Summer of Haskell, I wrote a little cabal overview which I think (partially) covers many points of this issue. Consider the information below as part of an informal conversation, so many things aren't official wrt cabal documentation. We may use it as a reference and build from here.

A brief overview on cabal

A cabal project has many items ordered in a hierarchy.

Top level: Project

  • A cabal project is defined in a cabal.project file.
  • This file is essentially a configuration file for the cabal tool.
  • Any flag you pass to cabal can be written in this file
  • Also, in this file you specify all packages you want to build
  • A project may have one or more packages
  • This file is not mandatory, but it is if you have more than one package

Level Two: Package

  • A package is the minimum item buildable with cabal
  • A package is defined in a <name>.cabal file.
  • A package includes metadata (author name, email, etc...) and components
  • A package may have one or multiple components but only one public library (this isn't true anymore, I think)

Level Three: Component

  • A component is the actual code that should be built together (one or many files)
  • There are four kinds of components, divided into two groups:
    • Runnables: executables, tests, benchmarks
    • Not Runnables: libraries and internal libraries (although you can load them in ghci)
  • Libraries cannot be run. They only serve to share code between other components and packages
  • Internal libraries are used to share code between components in the same package but not with other packages
  • Since executables, tests and benchmarks can be run, they need an entrypoint (a.k.a. a main function within a Main module)
  • Runnable components can depend on libraries, but not vice versa.
  • A component may have one or more modules

Level Four: Module

  • A module is a Haskell file (ex: MyModule.hs)
  • A module has a header module ModuleName where
  • The module name must be the same as the filename, except for the entrypoint of a runnable component
  • The entrypoint module:
    • must have the header module Main where
    • it can have any filename, but commonly is Main.hs

Summary

This is a visual of the hierarchy

Project
|- package-one
   |- library
      |- ModuleOne
      |- ModuleTwo
   |- internal-library
      |- ...
   |- test
      |- ...
   |- ...
|- package-two
   |- executable-one
      |- Main
      |- ...
   |- executable-two
      |- ...
   |- benchmark
      |-...

Example

This is an example of a complex project. Imagine a file manager which can work remotely. In this project you'd have

  • A package with
    • a library exposing the core abstractions such as FileTree, data FileSystemAction = CreateFile | DeleteFile, etc.
    • a test suite for testing the library
    • an executable component which creates a fake filesystem, so you can safely run your library
  • A package with a server version
    • an executable which runs the software on a remote server
    • a benchmark which tests how good your server implementation is
  • A package with clients in different versions: cli, gui and tui
    • an internal library with common functionality among clients
    • an executable per version which can connect to the server and send messages via console

The file structure could be

hs-filesystem
 |- app
    |- FakeFS.hs
 |- src
    |- FileTree.hs
    |- ...
 |- test
    |- FileTreeSpec.hs
    |- Main.hs
 |- hs-filesystem.cabal

hs-filesystem-server
 |- bench
    |- Main.hs
 |- app
    |- server.hs
 |- server.cabal

clients
 |- src
    |- Common.hs
 |- cli
    |- Main.hs
    |- ...
 |- gui
    |- Main.hs
    |- ...
 |- tui
    |- Main.hs
    |- ...
 |- filesystem-client.cabal

cabal.project

Below is how the files look

The cabal.project file

packages: ./hs-filesystem
          ./hs-filesystem-server
          ./clients
.
.
.

The hs-filesystem.cabal

-- This is the name of the public library. It is a top-level value because a .cabal file has at most one public library
name: hs-filesystem

-- You must indicate the folder where the code is.
-- List all dependencies and the modules exposed
library
  hs-source-dirs: src/
  build-depends: ...
  exposed-modules: ...

-- Notice, you depend on the component hs-filesystem!!
-- The source files are in the folder test
-- Because it is a runnable component, you need to specify the entrypoint
-- Runnable components may have other modules aside from the entrypoint.
-- Modules are listed by module name (FileTreeSpec), not by file name (FileTreeSpec.hs)
test-suite hs-fs-tests
  type: exitcode-stdio-1.0
  build-depends: hs-filesystem
  hs-source-dirs: test/
  main-is: Main.hs
  other-modules:
    FileTreeSpec

-- Notice, you depend on the component hs-filesystem, but you do not depend on the tests.
-- Of course, you don't need to build the tests to build the executable
-- Also, the entrypoint file is named FakeFS.hs but inside the file the header must be `module Main where`
-- When building with cabal it will create a binary executable fake-fs you can run from the console

executable fake-fs
  build-depends: hs-filesystem
  hs-source-dirs: app/
  main-is: FakeFS.hs

The filesystem-client.cabal

-- Notice you are creating a library component which depends on a library from a different package (hs-filesystem), all within
-- the same project. This library is internal, hence it has a name tag.
-- Compare it with the public library in hs-filesystem.cabal, in which the name is top level.
-- Modules are listed by module name (Common), not by file name (Common.hs)
library filesystem-client
  hs-source-dirs: src/
  build-depends: hs-filesystem
  exposed-modules: Common

-- The executable component depends on both:
--    hs-filesystem (public library defined in another .cabal file)
--    filesystem-client (internal library defined in this very .cabal file)
-- Notice that transitive dependencies do not apply. If you want to use a function from hs-filesystem
-- you must make it an explicit dependency
executable filesystem-cli
  build-depends: hs-filesystem
               , filesystem-client
  hs-source-dirs: cli/
  main-is: Main.hs

executable filesystem-tui
  build-depends: hs-filesystem
               , filesystem-client
  hs-source-dirs: tui/
  main-is: Main.hs

executable filesystem-gui
  build-depends: hs-filesystem
               , filesystem-client
  hs-source-dirs: gui/
  main-is: Main.hs

When building filesystem-client.cabal, cabal will create three binary executables.
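
The server package is listed in the file structure but its .cabal file is not shown. A minimal sketch of what server.cabal might contain follows. The component names (hs-fs-server, hs-fs-bench), the version and the `base` dependency are assumptions made for illustration; only the folders and file names come from the tree above.

```cabal
cabal-version: 3.0
-- The package name matches the file name server.cabal
name:          server
version:       0.1.0.0

-- Hypothetical executable name; the entrypoint file is app/server.hs
executable hs-fs-server
  build-depends:    base, hs-filesystem
  hs-source-dirs:   app/
  main-is:          server.hs
  default-language: Haskell2010

-- Hypothetical benchmark name; benchmarks are runnable, so they need an entrypoint too
benchmark hs-fs-bench
  type:             exitcode-stdio-1.0
  build-depends:    base, hs-filesystem
  hs-source-dirs:   bench/
  main-is:          Main.hs
  default-language: Haskell2010
```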

@malteneuss
Collaborator Author

@lsmor Nice. This looks like a good structure for a chapter about disambiguating typical packaging terms.

@andreabedini
Collaborator

This is great! Thank you for taking the lead <3. A few hot comments while reading the posts above.
Some of them might not be relevant at the pedagogical level, but it is better to be on the same page about some technical details. Feel free to AMA if something is not clear and/or correct me if I have made mistakes.

Define an official name for multi-package setup

This is definitely called a cabal project.

beginners and average Haskellers probably don't know or care about NixOS

100% this. As much as I care deeply about making cabal and nix get along, cabal documentation should be about cabal, not nix. I understand nix integration is also being deprecated as it has not been working with v2 commands (introduced quite a while ago now).

Move Setup.hs to separate (legacy?) chapter.

Setup.hs should be relegated to a separate section for niche features (i.e. custom-setup) and not even mentioned otherwise. It's not even necessary to specify build-type: Simple these days. It's the default.

Any flag you pass to cabal can be written in this file

Not all of them but yeah. The reference specifies, for each cabal.project option, the corresponding cli flag (if there is one).

Also, in this file you specify all packages you want to build
A project may have one or more packages

Yes. Here are some excessively-detailed notes:

Dependencies

  1. To a first approximation, all dependencies have to be built as well.
  2. Some dependencies will be cached in the cabal store; meaning: we built the exact same package once so we don't need to build it again. "Exact same" here is determined by hashing all dependencies (through their own hash), flags, compiler version, and some build parameters. See cabal-hash.txt in the cabal store, and Distribution.Client.PackageHash. This requires having already decided all the dependencies and indeed happens after generating a build plan (see below).
  3. Some dependencies can be chosen among pre-installed packages. tl;dr: GHC and cabal communicate through package databases (packagedb). Two are "well-known" and others are custom. The global packagedb comes with GHC pre-populated with a set of "boot" packages. The user packagedb, if ever used, lives somewhere in your home. Other packagedbs can be listed in cabal.project. If you use nix-style builds (using v2 commands, which has been the default for a while) you don't need to think about this but cabal-install's solver does try to reuse pre-installed packages when it can. This is very different from the above mechanism, since it is part of generating an install plan and (currently) only the package name and version are taken into consideration.

Local packages

There's a distinction between local packages and non-local packages. See findProjectPackages. Local packages are all the packages directly mentioned in cabal.project: packages:, optional-packages:, extra-packages:, source-repository-package. I believe this is decently documented in the reference.

Targets

When you do cabal build xyz, cabal-install jumps through a bunch of hoops to figure out what exactly you mean by xyz. Target forms can point to any (component of) any package in the build plan, not only local packages. E.g. you can add options in cabal.project for a package abc in your dependency tree and rebuild just that package with cabal build abc. Also, you can list multiple targets.
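
As a rough sketch of that last example: assuming abc is some package in the dependency tree, a cabal.project could carry options for it in a package stanza. The particular option shown (ghc-options: -O0) is only an illustration, not something prescribed here.

```cabal
-- cabal.project
packages: .

-- Options in a `package` stanza apply to the named package,
-- even when it is a dependency rather than a local package.
package abc
  ghc-options: -O0
```

With such a stanza in place, cabal build abc targets just that package.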

This file is not mandatory, but it is if you have more than one package

Yes. It works like this: if cabal.project is not present, cabal uses the default cabal.project, which contains only the line packages: . (a single dot, meaning the current directory). This means that in the single-package scenario, you need to add project configuration options to cabal.project.local (it is totally fine to have a cabal.project.local without a cabal.project). This is indeed what cabal configure does: it turns CLI flags into cabal.project options. If you were to create a cabal.project with some options, you'd need to remember to put packages: . in it as well.
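
A minimal sketch of a hand-written cabal.project for the single-package case. The optimization setting is just an example of "some options" and is my assumption, not part of the discussion.

```cabal
-- cabal.project
-- Once this file exists, the implicit default no longer applies,
-- so the package in the current directory must be listed explicitly.
packages: .

-- Example of a project option; pick whatever your project needs
optimization: False
```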

FWIW some people wish they could pass packages: from the cli, which would make cabal.project not mandatory also with multiple packages. I am not opposed to the idea TBH.

A package is the minimum item buildable with cabal

You can actually build only selected components (see targets above). Cabal will take into account the component dependencies too (e.g. exe depends on some libs).

Nevertheless a "cabal package" is the unit of distribution.

A cabal package is described by a package description file, commonly known as a "cabal file".

A package may have one or multiple components but only one public library (this isn't true anymore, I think)

Correct, as of Cabal 3.0 (2019!) you can specify visibility: public in a sub-library. The default is always private for sublibs and implicitly public for the main library. Note that the solver doesn't understand public sublibraries very well and will never choose a pre-installed one (see no. 3 in "Dependencies" above).
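
A sketch of what a public sub-library might look like in a package description. The package name pkg-a, the sub-library name extras and the module names are hypothetical.

```cabal
cabal-version: 3.0
name:          pkg-a
version:       0.1.0.0

-- Main library: unnamed, implicitly public
library
  exposed-modules:  PkgA
  build-depends:    base
  default-language: Haskell2010

-- Sub-library: private by default, made public explicitly
library extras
  visibility:       public
  exposed-modules:  PkgA.Extras
  build-depends:    base
  default-language: Haskell2010
```

Another package could then depend on the sub-library by writing build-depends: pkg-a:extras (this qualified syntax also needs cabal-version 3.0 or later).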

Also, it's worth noting that the solver operates at the package level, in the sense that the version bounds on the dependencies of all components (sort of, see below re: tests and benchmarks) are grouped together and there cannot be cycles between components in separate packages (e.g. pkg-b:lib depends on pkg-a:lib, pkg-a:exe depends on pkg-b:lib). You can build this manually with Cabal, but cabal-install's solver will reject it as if it were "pkg-b depends on pkg-a, pkg-a depends on pkg-b".

There are four kinds of components, divided into two groups

There's a fifth, foreign-libraries. You can build a haskell library to be linked into non-haskell code.
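
For illustration, a foreign-library stanza might look roughly like this. The library name, module name and source directory are made up.

```cabal
-- Builds a shared library that non-Haskell code can link against
foreign-library myforeignlib
  type:             native-shared
  other-modules:    MyForeignLib.SomeModule
  build-depends:    base
  hs-source-dirs:   src
  default-language: Haskell2010
```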

Runnables / non-runnables

I never heard this terminology. Maybe executables and libraries could be a simpler option? tests and benchmarks are executables just like exes. You can cabal run them like an executable (in addition to cabal test/cabal bench).

Also, the user guide makes a bit of a mess with the terminology "internal/private/sub". There's even a reference to (quotes) "private internal sub-library" 😂 Someone is proposing the pov that they are all libraries all the same, just one has the same name as the package name and you don't need to write it. TBH I don't have strong opinions here, as long as the terminology is consistent.

internal libraries are used to share code between components in the same package but not with other packages

Unless they are made public.

Other notes (for what they are worth):

  • components can share some of the source files (you can write exposed-modules: A in two libraries).
  • libraries have exposed-modules (which other packages can see) and other-modules (which other packages won't be able to see). Executables only have other-modules.
  • There's also reexported-modules, virtual-modules and autogen-modules (see the docs).
  • Tests and benchmarks components can be enabled or disabled (with tests: True/False, benchmarks: True/False). If they are disabled their dependencies would not affect the plan.
  • Exes are always enabled and their dependencies will always be part of the plan. This includes exes in all packages in the dependency tree, e.g. pkg-a:lib depends on pkg-b:lib and pkg-c:lib, pkg-b:exe depends on pkg-c:lib<X.Y.Z.W, then pkg-a:lib won't be able to build with pkg-c:lib>=X.Y.Z.W
  • Tests and benchmarks are by default enabled for local packages and otherwise disabled.
  • IIRC the solver has some freedom in deciding whether or not to build test/benchmark components.
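
To make the tests/benchmarks switches concrete, here is a small cabal.project sketch. The package path ./pkg-a is hypothetical.

```cabal
-- cabal.project
packages: ./pkg-a

-- Disable test and benchmark components, so that
-- their dependencies do not affect the build plan
tests: False
benchmarks: False
```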

This is an example of a complex project.

I suggest we frame this as "something something ... cabal for projects". There are separate considerations to make if you want to publish a package. The current user guide tends to lean toward package development (roughly, writing libraries to publish that other people can build, rather than writing projects to build so other people can run them).

In this project you'd have

Other things that might be worth adding (perhaps one at a time with some narrative?)

  • One of the tests needs an executable from another package, so you can show-case build-tool-depends.
  • "Discover" style tests and doctests using code-generators: (a bit new, the ecosystem might not be ready (there's a great opportunity here :P)).
  • Fixing the index-state in cabal.project for reproducibility.
  • Adding an allow-newer in cabal.project for the common case where a package in your dependency tree has overly conservative bounds (with a suggestion to get in touch with either the maintainer or the hackage-trustees to resolve the situation).
  • Using source-repository-package for a dependency pre-release.
  • Package flags and conditionals.
  • Conditionals and imports in cabal.project (imports are handy if one is migrating from stack).
  • Using package candidates from Hackage.
  • Generating documentation with Haddock.
  • Non-haskell source files (c,cpp,js,asm,cmm,alex,happy,etc)
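
A few of these items could be sketched in a single cabal.project along the following lines. The timestamp, the package name some-dep, the repository URL and the commit hash are all placeholders I made up for illustration.

```cabal
-- cabal.project
packages: ./hs-filesystem

-- Pin the Hackage index to a point in time for reproducibility
index-state: 2023-09-01T00:00:00Z

-- Relax the (hypothetical) overly conservative upper bound
-- that some-dep puts on base
allow-newer: some-dep:base

-- Use a pre-release of a dependency straight from its repository
source-repository-package
  type:     git
  location: https://example.com/some-dep.git
  -- placeholder commit hash
  tag:      0123456789abcdef0123456789abcdef01234567
```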

The elephant in the room of course is backpack.

Ok, I accidentally a book. Happy to chat if you like.

@malteneuss
Collaborator Author

@andreabedini Thanks for your hints. A few things became a lot clearer to me, e.g. why the term "cabal project" makes sense and why there is a "Package description" chapter (I didn't see the 1-1 correspondence to a .cabal file before xD).
You mention a lot of important topics for specialized guides. For now, I would like to focus on the top-level structure, the intro and a few sections a lot of users will read or look up. As soon as #9212 is merged, I can start with the top-level structure and re-organizing the content that's already there.

@BinderDavid
Contributor

BinderDavid commented Sep 5, 2023

In the past week I have also been looking at ways to improve the cabal user guide. Here are some of my assorted thoughts:

I propose to introduce a clear(er) division between tutorials, guides, reference and explanations as described in https://documentation.divio.com/

I think this is a problem with the current state of the user guide, which can and should be fixed early. I think a clear first step would be to use the Sphinx feature of "parts", which allows splitting the table of contents into several separate parts. (I.e. this is just typography.) A simple suggestion would be to use the two parts "User Guides" and "Reference", and to triage the existing documentation into those two parts. (Edit: I see you suggested exactly that 👍 )

Move Setup.hs to separate (legacy?) chapter.

I think the Setup.hs chapter is only one example of information that is displayed too prominently.
There are currently 12 toplevel sections, and since these are always visible in the table of contents they should correspond to the 12 most important anchors for navigating the user guide. In my opinion this is currently not the case. For example, the last 5 toplevel sections are for niche use cases only, and the user guide shouldn't spend the most important toplevel sections on them.

  • 8. Setup.hs commands: I think barely anyone uses this anymore. There could be a section "Advanced and legacy features" on the toplevel which collects these kinds of sections.
  • 9. Package Description Format Specification History: This section contains very useful information, but should probably be grouped with the reference which describes the .cabal file format, instead of being a toplevel section.
  • 10. Field Syntax Reference: Similarly to section 9, I think that this should be grouped with the cabal file reference documentation. I opened #9186 (Sections "Field Syntax Reference" and "Package Description" are out of sync and inconsistent) about this.
  • 11. Reporting Bugs and Stability of Cabal Interfaces: I doubt very much that the stability information in that section is even remotely accurate. It only contains some blanket statements about the stability of the Cabal API, and the last time that information was changed was 7 years ago when the files were migrated from Markdown to rst files. I guess the text is much older than that. The Cabal API has changed in the meantime, for example via the split into the Cabal and Cabal-syntax packages, and the CLI has undergone various revisions with the v1- and v2- commands.
  • 12. Nix integration: This feature has recently been deprecated, so this section won't survive for long.

By contrast, some subsections contain the most relevant information and are difficult to find because they are hidden several subsections deep.

  • All cabal-install commands like cabal run etc. are hidden 4 levels deep. This is information that users are probably looking for all the time.
  • Section 6.2 is probably the most important reference section in the entire user-guide, since it contains all the information about what the fields in the cabal file actually mean. This information should be very easy to find. I think this has already been achieved for cabal.project files with section 7. In my opinion section 6.2 should be promoted to a similar toplevel section.

@liamzee
Collaborator

liamzee commented Sep 29, 2023

@malteneuss, if I can be your runner boy, I'd be happy. I've taken on the task of working on the Wiki documentation myself, and I'm also aiming (and I hope I can be successful here) to help document the existing codebase.

The "Complaints and Grievances community" brought up issues with tooling, and your initiative seems the lowest hanging fruit.

@malteneuss
Collaborator Author

@liamzee Great to have your support and thanks for improving the Wiki. I'll come back to you when the top-level structure is settled.

@BinderDavid
Contributor

I am currently attempting a rewrite of the parts of the introduction which introduce general packaging concepts. (My attempts are on this branch: https://github.com/BinderDavid/cabal/tree/rewrite-user-guide-introduction. I haven't opened a PR yet and am still trying to figure out how to structure things.)

Concretely, I am looking at ways to improve section 2 (Introduction) and section 3.2 (Package concepts and development -> Package concepts). I think the main issue is that they introduce the cabal packaging system by comparing it to distribution package managers (like rpm/debian) and to GNU-style building with autoconf/configure/make. This can be explained by the fact that Haskell, with cabal, was one of the first programming languages to introduce this style of packaging and dependency handling. But I think for a new programmer coming to Haskell today it would be more useful to compare it to other similar systems like Rust with cargo+crates.io or JavaScript with npm.

@malteneuss You are currently working on #9212 . After that is merged, I think it would be useful to look at section 2.1: Package concepts and Development -> Quickstart. As far as I can see that section has more or less the same content as the Getting Started section: How to initialize a new app, how to add a dependency, how to run the program. We could compare what information is contained in sec 2.1 that is not in the Getting started section, and move this information to the Getting started section (I don't think it is much). Afterwards section 2.1. could be removed as redundant.

@ulysses4ever
Collaborator

Hope some prior work could be used as a source of inspiration: https://github.com/haskell/cabal-userguide

@andreabedini
Collaborator

🤯 why is that in a separate repo!?

@ulysses4ever
Collaborator

@andreabedini it's abandoned now so I don't think it matters much, but the reason, I believe, is that the current manual was deemed unsalvageable by the authors of that initiative.

@BinderDavid
Contributor

BinderDavid commented Oct 2, 2023

@andreabedini it's abandoned now so I don't think it matters much, but the reason, I believe, is that the current manual was deemed unsalvageable by the authors of that initiative.

I have taken a look, and they put a lot of work into developing a nice global structure for the cabal documentation. But, as far as I can see, only one chapter of this new structure was finished (unless I am missing some work on branches that I haven't checked out).

My impression is that the cabal documentation is not unsalvageable, but what it does need is aggressive editing. It is always simpler to edit or add just a single paragraph or subsection of the docs than touching the overall organization, deleting material and merging and moving sections. Also, editing can be done piecemeal, and it is less likely to run out of steam than a complete rewrite. But I think there was a hesitancy to edit older material, and instead only new information was added. This led to the current state which is a bit lacking in focus and structure.

@ulysses4ever
Collaborator

If you feel like doing piecemeal, go for it: everyone (myself included) will thank you. But I personally think until the global structure is improved along the lines described in that repo, people will hardly notice your effort. The reason I think that is that it seems to me that it's completely impossible to navigate it without knowing a lot about Cabal already. For experts it kinda works fine already I'd say, but to make it novice-digestible, structural changes are necessary. Just my 2c.
