Pkg3: naming of project filenames #43

Open
dns2utf8 opened this issue Aug 16, 2017 · 20 comments
@dns2utf8

dns2utf8 commented Aug 16, 2017

Hi all

I saw the talk on Pkg3 and was a bit confused with the naming.
My personal expectation was something like this:

  • Manifest.toml for the actual configuration
  • Manifest.lock for the latest working set
  • Manifest.journal for the log. I would expect this one to be like Firebird or another single-file database with a lock.

Of course Config.* would be fine too. The mix of Config.toml and Manifest.toml confused me, since many development environments also use these names: Android with AndroidManifest.xml, npm with package.json and package-lock.json, or Cargo with Cargo.toml and Cargo.lock.

Regards

ararslan added the Pkg3 label Aug 16, 2017
@StefanKarpinski
Member

StefanKarpinski commented Aug 16, 2017

Naming these things is a bit challenging. I find the terminology used by other package managers kind of confusing and unfortunate. Let me try to explain some of the naming choices I've made.

The TOML spec says that all TOML files should end in .toml – Pkg3 follows that.

The Config.toml file contains a project's top-level information that is completely independent of any details of how the package manager (or anyone else) may have chosen to satisfy the dependencies of the project. It has entries like this:

authors = "Stefan Karpinski <[email protected]>"
desc = "The next-generation Julia package manager."
keywords = ["package", "management"]
license = "MIT"
name = "Pkg3"
repo = "https://github.com/StefanKarpinski/Pkg3.jl.git"

[deps]
SHA = "ea8e919c-243c-51af-8825-aaa63cd721ce"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"

These are objective facts about the project, which do not depend in any way on what particular versions of anything are chosen to make things work. The project needs the packages whose UUIDs are ea8e919c-243c-51af-8825-aaa63cd721ce and 2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91 and these packages are referred to in the project as SHA and StatsBase, respectively. Unless the project's needs and code change, these are simple facts about the project. This file says nothing about what versions are used to satisfy these dependencies. The Config.toml file should always be checked into a project since otherwise you have no idea what it depends on or how to get it working.

Another reasonable name for the Config.toml file might be Project.toml since it is metadata about a project. That would seem a bit weird when it was in a package repo, however, where you'd expect it to be called Package.toml or something. It would be even weirder when appearing in a global named environment directory like ~/.julia/environments/v1.2 where there's no specific project that it describes. So Config.toml seems like a good name since the word "config" implies top-level user-provided configuration and applies equally to projects (non-reusable units of code), packages (reusable units of code), and global environments (named sets of packages one uses together). Another decent name might be Metadata.toml but that seems a bit too abstract and overloaded. Other names might be Env.toml or Environment.toml but that's also not super.

The Manifest.toml file records specific versions used to satisfy the dependencies listed in Config.toml. It includes not only versions of top-level dependencies listed in Config.toml but also the versions of all of their dependencies. An example of its contents might be:

[[Compat]]
hash-sha1 = "6e9c90ac34a173c2a2c179735427078b989a3bdc"
uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
version = "0.26.0"

[[DataStructures]]
deps = ["Compat"]
hash-sha1 = "84bea819ff0c08e8f9fd55a637d25bdc685c6c5b"
uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
version = "0.6.0"

[[SHA]]
deps = ["Compat"]
hash-sha1 = "9ce386dcf6dde95a1e267e320332d192bc090fff"
uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
version = "0.3.3"

[[SpecialFunctions]]
deps = ["Compat"]
hash-sha1 = "03e6a824d4f33a6bc856a5fcfd9d14729a9f18d4"
uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
version = "0.1.1"

[[StatsBase]]
deps = ["Compat", "DataStructures", "SpecialFunctions"]
hash-sha1 = "4820d195cd378926a7a59e6e14727a394cc8f123"
uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
version = "0.17.0"

This corresponds to a particular way of satisfying the dependencies given in the above Config.toml file – i.e. it provides SHA and StatsBase. There are many different ways of satisfying these dependencies, and this is just one of them. This file may or may not be checked into a project since one particular way of satisfying requirements is not really always of interest. However, I'm thinking that one will generally want to commit this anyway, since even if one doesn't use the same exact versions oneself, at least that way there is a record of some working configuration, which presumably passed tests and whatnot. But it's not strictly necessary.
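As a rough illustration of how the two files relate, the following Python sketch (not Pkg3 code) starts from the two direct dependencies in Config.toml and follows the `deps` entries of the example manifest above. The dict literal and the function name `resolve_closure` are assumptions made for illustration; only the package names and their `deps` lists are taken from the examples.

```python
# Illustrative only: the example manifest above, reduced to name -> direct deps.
MANIFEST_DEPS = {
    "Compat": [],
    "DataStructures": ["Compat"],
    "SHA": ["Compat"],
    "SpecialFunctions": ["Compat"],
    "StatsBase": ["Compat", "DataStructures", "SpecialFunctions"],
}

# Direct dependencies named in the [deps] section of Config.toml.
CONFIG_DEPS = ["SHA", "StatsBase"]


def resolve_closure(direct, manifest_deps):
    """Return every package reachable from the direct dependencies."""
    seen = set()
    stack = list(direct)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(manifest_deps[name])
    return seen


closure = resolve_closure(CONFIG_DEPS, MANIFEST_DEPS)
# Packages that are in the manifest but never mentioned in Config.toml.
indirect = sorted(closure - set(CONFIG_DEPS))
print(indirect)
```

Here the closure covers all five manifest entries, while only two of them appear in Config.toml; the other three are pulled in indirectly.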

This file is called Manifest.toml because the dictionary defines a "manifest" as:

A document giving comprehensive details of a ship and its cargo and other contents, passengers, and crew for the use of customs officers.

or, if you prefer the Webster's 1913 dictionary definition:

A list or invoice of a ship's cargo, containing a description by marks, numbers, etc., of each package of goods, to be exhibited at the customhouse.

That's what this file does – it gives all of the identifying details of exactly what's "on the ship". The Config file, on the other hand, is not a manifest at all – it's a high-level description of what a package is and what it needs. There are no specifics of how those needs are met, only enough information that the needs are clear and unambiguous. Indirect dependencies are not listed in Config.toml at all, even though they are definitely on the ship.

The naming and format of environment logs hasn't really been settled on yet. However, I have a somewhat hard time seeing why that file should have the word "manifest" in it anywhere. In what sense is it a manifest? It's not a list of the contents of anything. It's a record of the locations of environments that have been used, thereby allowing the package manager to figure out what versions of packages are still potentially in use and (by process of elimination) which can be safely deleted. I guess if the file ends up being a log of paths to Manifest.toml files it might make sense to call it something like ~/.julia/Manifest.log. It still wouldn't itself be a manifest, but it would be a log of manifests, so the name would make sense. I suspect, however, that it will make more sense to track environment locations, since then you can log environments whether they have manifest files or not. As to the file format, having it be a full-on database seems like overkill. Some kind of file locking and/or log sharding seems like it should be sufficient, although I guess we'll see.
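The process-of-elimination idea can be sketched roughly as follows. This is a hypothetical Python illustration, not the Pkg3 design, which is not settled: the log is modeled as a plain list of environment locations, and `collectable`, `read_manifest`, the paths, and the extra `Compat` version `0.25.0` are all invented for the example.

```python
def collectable(installed, logged_envs, read_manifest):
    """Return installed (name, version) pairs that no logged environment uses.

    installed     -- set of (name, version) pairs present on disk
    logged_envs   -- iterable of environment locations taken from the log
    read_manifest -- callable mapping a location to the set of
                     (name, version) pairs it pins, or None if the
                     environment no longer exists
    """
    in_use = set()
    for env in logged_envs:
        pinned = read_manifest(env)
        if pinned is not None:
            in_use |= pinned
    return installed - in_use


# Hypothetical environments and their pinned versions.
manifests = {
    "/home/user/proj_a": {("Compat", "0.26.0"), ("SHA", "0.3.3")},
    "/home/user/proj_b": {("Compat", "0.26.0"), ("StatsBase", "0.17.0")},
}
installed = {
    ("Compat", "0.26.0"),
    ("Compat", "0.25.0"),  # hypothetical older version, no longer referenced
    ("SHA", "0.3.3"),
    ("StatsBase", "0.17.0"),
}
# The log may still list environments that have since been deleted.
log = ["/home/user/proj_a", "/home/user/proj_b", "/home/user/deleted_proj"]

unused = collectable(installed, log, manifests.get)
print(sorted(unused))
```

Logging environment locations rather than manifest paths fits this shape: the lookup step decides what a location currently pins, and a vanished environment simply contributes nothing.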

@vtjnash
Member

vtjnash commented Aug 16, 2017

Like Cargo.toml, perhaps it should just be Julia.toml? But I'm assuming the containing path would usually already contain indications that it is an associated artifact for a given project. But I'm not sure if the examples given are entirely in line with that assumption.

@StefanKarpinski
Member

StefanKarpinski commented Aug 16, 2017

The plan (spelled out in the Julep and implemented in Pkg3.jl) is to look for JuliaConfig.toml and JuliaManifest.toml first and use those if they exist (and completely ignore Config.toml and Manifest.toml if they do). That way a Julia-only project doesn't need to redundantly name things Julia-this or Julia-that but mixed-language projects can use longer names with Julia prefixes.
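A minimal Python sketch of this lookup order, for illustration only (the actual implementation lives in Pkg3.jl and is written in Julia). The helper name `find_first` is hypothetical, and the sketch assumes the config and manifest files are resolved independently of each other, which the comment above does not spell out.

```python
import tempfile
from pathlib import Path

# Candidate names in priority order: Julia-prefixed names win when present.
CONFIG_NAMES = ("JuliaConfig.toml", "Config.toml")
MANIFEST_NAMES = ("JuliaManifest.toml", "Manifest.toml")


def find_first(directory, names):
    """Return the first existing file among `names` in `directory`, else None."""
    for name in names:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    # A mixed-language project that carries both file names.
    (Path(tmp) / "Config.toml").write_text('name = "Other"\n')
    (Path(tmp) / "JuliaConfig.toml").write_text('name = "Pkg3"\n')
    chosen_name = find_first(tmp, CONFIG_NAMES).name
    # No manifest of either name exists in this directory.
    missing = find_first(tmp, MANIFEST_NAMES)

print(chosen_name, missing)
```

When both files exist, the unprefixed Config.toml is never opened, so it can belong to some other tool in the same repository.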

@tkelman

tkelman commented Aug 17, 2017

Config.toml sounds to me a bit like something the user might be expected to modify to tweak optional settings, which it sort of would be for projects but not really packages, right? Even for projects, wouldn't things like adding dependencies usually be done through Julia Pkg3 APIs instead of manually editing a TOML file?

Description.toml or Info.toml or Listing.toml might be overly generic possible names for it.

@StefanKarpinski
Member

You can edit the file by hand, and it will likely contain other kinds of configuration. But yes, looking up UUIDs and entering them is tedious so it would likely be done by Pkg3 automatically in response to a command (and interactive disambiguation when necessary) and update actual versions in the manifest file at the same time. Info.toml is ok, but at that point I might prefer Meta.toml.

Let's play the "what do you call it" game:

This file provides high-level metadata about a project (non-reusable unit of code), package (reusable unit of code), or a named global environment (set of packages often used together), including what packages it depends on, global configuration, and "project targets" – i.e. things you can do with a project.

This suggests Metadata.toml but something still doesn't feel right about that. I guess we could call it Project.toml and just deal with that being a bit off in packages or environments.

@staticfloat
Member

I like something similar to Package.toml, or Packages.toml. This gets across the idea that the configuration has something to do with packaging, which may not be immediately obvious to users coming to Pkg3 for the first time.

@quinnj
Member

quinnj commented Aug 18, 2017

I have to say I agree w/ @staticfloat; Package.toml seems the most obvious and natural to me. I know we're worried about the "wait this isn't a package, just a project I'm working on!" use-case, but I still feel like you can just conceptually call that a "package", even if it's not something you plan on publishing for the rest of the world (which we could maybe call Libraries or Public Packages as convention instead).

@staticfloat
Member

For the expliciphiles among us:

  • package_system_metadata.toml
  • PackageMetadata.toml

For the cutesy among us:

  • JPSI.toml, pronounced "Gypsy", stands for "Julia Packaging System Information". Bonus points if you symlink it to a file named JPSI.danger.

@StefanKarpinski
Member

StefanKarpinski commented Aug 18, 2017

There are many projects for which it would never make sense to turn them into packages – e.g. projects where the end artifact is a program, not a reusable piece of Julia code. And a project isn't just not-yet-a-package; e.g. only non-package projects will be able to do any global configuration of other packages. Otherwise when using multiple packages together, they could require conflicting configurations. Similarly, environments are reusable sets of packages and are also not packages.

Another option would be to call it Package.toml, Project.toml or Environment.toml depending on whether it's in a package, project or named environment. I don't really like squatting on so many names though.

@staticfloat
Member

I think Environment.toml could possibly describe all three in one fell swoop; Environment.toml describes how a package fits into its larger environment, how a project's environment should be set up, or how an environment proper should be constructed.

@StefanKarpinski
Member

It's just so looooong though. I have also considered Env.toml but that seems a bit too terse.

@staticfloat
Member

I think Environment.toml is an acceptable length.

@StefanKarpinski
Member

Says the man who proposed package_system_metadata.toml 😝

@ararslan
Member

I actually like Env.toml; "env" is a pretty standard abbreviation of "environment," so the name is still clear.

The ship has probably sailed, but I still think it'd be nice to always specify that the TOML file is related to Julia, not conditionally look for a Julia-named file. For example, it could be PkgEnv.toml, PkgConfig.toml, or what have you, akin to Rust's Cargo.toml. It's more immediate disambiguation. IMO anyway.

@StefanKarpinski
Member

Another option: call it Project.toml (which applies to both packages and projects) and represent named global environments with a single file with this format:

SHA = "ea8e919c-243c-51af-8825-aaa63cd721ce"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"

[[Compat]]
hash-sha1 = "6e9c90ac34a173c2a2c179735427078b989a3bdc"
uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
version = "0.26.0"

[[DataStructures]]
deps = ["Compat"]
hash-sha1 = "84bea819ff0c08e8f9fd55a637d25bdc685c6c5b"
uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
version = "0.6.0"

[[SHA]]
deps = ["Compat"]
hash-sha1 = "9ce386dcf6dde95a1e267e320332d192bc090fff"
uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
version = "0.3.3"

[[SpecialFunctions]]
deps = ["Compat"]
hash-sha1 = "03e6a824d4f33a6bc856a5fcfd9d14729a9f18d4"
uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
version = "0.1.1"

[[StatsBase]]
deps = ["Compat", "DataStructures", "SpecialFunctions"]
hash-sha1 = "4820d195cd378926a7a59e6e14727a394cc8f123"
uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
version = "0.17.0"

After all, the rest of the metadata in the artist currently known as Config.toml doesn't really make any sense for a named global environment, which serves only to provide a coherent set of packages.

@rofinn

rofinn commented Aug 19, 2017

FWIW, I like Package.toml (or Pkg.toml) as:

  1. Config seems generic enough to be confused with a settings file that contains constants for the package/application/project code.
  2. Pkg.toml leaves room to add Env.toml and Project.toml later on... if we need them.

I'm not sure it matters how long the name is as long as it's a single word that describes what's in the file; I don't imagine folks will be typing/editing these files by hand regularly.

@StefanKarpinski
Member

StefanKarpinski commented Aug 23, 2017

I did some experiments with different names. Actually changing the name of the file and the places where it's used in code and docs gives a tangible sense for how these names will feel to use – and I like Project.toml by far the best. It reads right in code and in documentation when you refer to a "(Julia) project file" and a "(Julia) manifest file". As in:

When doing rm A, if A is not in the project file, the operation does nothing and prints a message to that effect. When doing rm A=uuid, even if A is not in the project file but is in the manifest with UUID uuid, then ...

When this was written with Config.toml and "config file" it made a lot less natural sense. The name Package.toml and "package file" works pretty well as long as the project you're talking about happens to be a package. However, I think it's really important that we unify the expression of dependencies for all kinds of projects, not just packages, and using the name "package" when you're not talking about a package is just confusing. Even when talking about a package, the name "project" seems better to me since we're talking about the requirements of the package as a project – providing reusable code is only one of many roles that a package has as a project. A package can have targets besides those needed for loading it, e.g. for testing and running code. So I just think the term "project" gives the right sense of what's in the file. I think the terms "project file" and "manifest file" give a clear intuitive sense of what's in each of the files: the project file contains info about the project – name, description, authors, dependencies – while the manifest file contains info about a particular snapshot of everything "on board".

In short, I'm going ahead with this file naming scheme:

  • Project file: Project.toml or JuliaProject.toml for multi-lingual projects.
  • Manifest file: Manifest.toml or JuliaManifest.toml for multi-lingual projects.
  • Named environments: ~/.julia/environments/$name.toml – this is a single TOML file containing the [deps] section of a project file and the manifest merged together.
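To make the "merged together" idea for named environments concrete, here is a small Python sketch that renders a `[deps]` section followed by the manifest entries as one TOML document. It is an assumption-laden illustration, not the Pkg3 writer: `toml_value` and `merge_environment` are hypothetical names, the data is a trimmed copy of the earlier examples, and string escaping is deliberately not handled. The sketch keeps the `[deps]` header so that a dependency name such as `SHA` does not collide with the `[[SHA]]` table of the same name.

```python
def toml_value(value):
    """Format a string or a list of strings as a TOML value (no escaping)."""
    if isinstance(value, list):
        return "[" + ", ".join(toml_value(v) for v in value) + "]"
    return '"' + value + '"'


def merge_environment(deps, manifest):
    """Render a [deps] table followed by one [[Name]] table per manifest entry."""
    lines = ["[deps]"]
    lines += [f"{name} = {toml_value(uuid)}" for name, uuid in sorted(deps.items())]
    for name in sorted(manifest):
        lines.append("")
        lines.append(f"[[{name}]]")
        for key, value in sorted(manifest[name].items()):
            lines.append(f"{key} = {toml_value(value)}")
    return "\n".join(lines) + "\n"


# Trimmed copy of the example data from earlier in the thread.
deps = {"SHA": "ea8e919c-243c-51af-8825-aaa63cd721ce"}
manifest = {
    "Compat": {
        "hash-sha1": "6e9c90ac34a173c2a2c179735427078b989a3bdc",
        "uuid": "34da2185-b29b-5c13-b0c7-acf172513d20",
        "version": "0.26.0",
    },
    "SHA": {
        "deps": ["Compat"],
        "hash-sha1": "9ce386dcf6dde95a1e267e320332d192bc090fff",
        "uuid": "ea8e919c-243c-51af-8825-aaa63cd721ce",
        "version": "0.3.3",
    },
}

text = merge_environment(deps, manifest)
print(text)
```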

@jpfairbanks

Just to support the namestorming: I think saying "package file" when you are working on a project makes sense, because you are talking about the packages that the project depends on. And we know that the project only depends on packages, because projects aren't reusable pieces of code.

@StefanKarpinski
Member

True, but saying that "A is in the project" or "B is in the manifest" makes sense whereas "A is in the package" does not make sense. Only some of the information in the project file is about packages that a project depends on; all of the information is metadata about the project, however.

@jpfairbanks

That makes sense. I was thinking only about the dependency management aspects.
