Initial git-cvs-fast-import release.
LawnGnome committed Oct 4, 2021
0 parents commit 3c99335
Showing 89 changed files with 4,709 additions and 0 deletions.
19 changes: 19 additions & 0 deletions .dockerignore
@@ -0,0 +1,19 @@
**/target
/target

**/examples
examples

**/dist
dist

*.md

Dockerfile
.dockerignore
.gitignore
.git
.goreleaser
.goreleaser.yml
LICENSE
renovate.json
42 changes: 42 additions & 0 deletions .github/workflows/goreleaser.yaml
@@ -0,0 +1,42 @@
name: goreleaser

on:
  push:
    tags:
      - 'v[0-9]*'

permissions:
  contents: write

jobs:
  goreleaser:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
        with:
          fetch-depth: 0

      # Unsurprisingly, we need Go to run goreleaser.
      - name: Set up Go
        uses: actions/setup-go@v2
        with:
          go-version: 1.17

      # We need a Rust toolchain, but specifically only the
      # x86_64-unknown-linux-musl target.
      - name: Set up Rust
        uses: actions-rs/toolchain@v1
        with:
          toolchain: 1.55.0
          target: x86_64-unknown-linux-musl
          override: true

      - name: Run GoReleaser
        uses: goreleaser/goreleaser-action@v2
        with:
          distribution: goreleaser
          version: latest
          args: release --rm-dist
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
51 changes: 51 additions & 0 deletions .github/workflows/rust.yaml
@@ -0,0 +1,51 @@
on: [push, pull_request]

name: Rust CI

jobs:
  test:
    name: Test Suite
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true
      - uses: actions-rs/cargo@v1
        with:
          command: test
          args: --all

  fmt:
    name: Rustfmt
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true
      - run: rustup component add rustfmt
      - uses: actions-rs/cargo@v1
        with:
          command: fmt
          args: --all -- --check

  clippy:
    name: Clippy
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          override: true
      - run: rustup component add clippy
      - uses: actions-rs/cargo@v1
        with:
          command: clippy
          args: -- -D warnings
12 changes: 12 additions & 0 deletions .gitignore
@@ -0,0 +1,12 @@
**/target
**/.vscode
Cargo.lock
/dist


# Added by cargo

/target
.gdb_history

dist/
39 changes: 39 additions & 0 deletions .goreleaser.yml
@@ -0,0 +1,39 @@
# This approach is adapted from
# https://jondot.medium.com/shipping-rust-binaries-with-goreleaser-d5aa42a46be0
#
# Basically, we want to use goreleaser to handle packaging and releasing
# git-cvs-fast-import binaries, since it already has all the functionality
# required to build the packages and create the changelog. To do that, we
# provide a dummy main package that goreleaser can happily build into a no-op
# binary, and then overwrite the binary with one that we build normally using
# cargo.

project_name: git-cvs-fast-import
builds:
  - main: .goreleaser/dummy.go
    goarch:
      - amd64
    goos:
      - linux
    binary: git-cvs-fast-import
    hooks:
      post: sh .goreleaser/post.sh
nfpms:
  - package_name: git-cvs-fast-import
    vendor: Sourcegraph
    homepage: https://github.com/sourcegraph/git-cvs-fast-import
    maintainer: Batch Changes <[email protected]>
    description: A tool to import CVS repositories into Git for analysis
    license: Proprietary
    formats:
      - rpm
      - deb
    dependencies:
      - git
    bindir: /usr/bin
checksum:
  name_template: 'checksums.txt'
snapshot:
  name_template: "{{ incpatch .Version }}-next"
changelog:
  sort: asc
3 changes: 3 additions & 0 deletions .goreleaser/dummy.go
@@ -0,0 +1,3 @@
package main

func main() {}
14 changes: 14 additions & 0 deletions .goreleaser/post.sh
@@ -0,0 +1,14 @@
#!/bin/sh

# This script is used by goreleaser, and shouldn't be invoked otherwise.

set -e
set -x

# Perform a completely static build using musl to avoid glibc versioning issues
# on older distros. This requires the x86_64-unknown-linux-musl Rust target to
# be available.
RUSTFLAGS='-C link-arg=-s' cargo build --release --target x86_64-unknown-linux-musl

# Overwrite the dummy Go binary.
cp target/x86_64-unknown-linux-musl/release/git-cvs-fast-import dist/git-cvs-fast-import_linux_amd64/git-cvs-fast-import
31 changes: 31 additions & 0 deletions Cargo.toml
@@ -0,0 +1,31 @@
[package]
name = "git-cvs-fast-import"
version = "0.1.0"
edition = "2018"

[workspace]
members = [ "comma-v", "eq-macro", "git-fast-import", "internal/process", "internal/state", "patchset", "rcs-ed" ]

[dev-dependencies]
tokio-test = "0.4.2"

[dependencies]
anyhow = "1.0.44"
comma-v = { path = "comma-v" }
flexi_logger = { version = "0.19.4", features = ["async", "colors"] }
flume = "0.10.9"
git-cvs-fast-import-process = { path = "internal/process" }
git-cvs-fast-import-state = { path = "internal/state" }
git-fast-import = { path = "git-fast-import" }
log = "0.4.14"
num_cpus = "1.13.0"
parse_duration = "2.1.1"
patchset = { path = "patchset" }
rcs-ed = { path = "rcs-ed" }
structopt = "0.3.23"
tempfile = "3.2.0"
thiserror = "1.0.29"
tokio = { version = "1.12.0", features = ["io-util", "macros", "process", "rt-multi-thread", "signal", "sync", "time", "fs"] }
walkdir = "2.3.2"

[features]
27 changes: 27 additions & 0 deletions DEVELOPMENT.md
@@ -0,0 +1,27 @@
# Development

`git-cvs-fast-import` is a fairly standard Rust binary that uses several packages within its workspace to provide functionality. We intend to publish the packages that make sense to publish to crates.io in the future.

## Package structure

### Generic packages

* `comma-v`: a parser for the RCS `,v` format used by CVS.
* `git-fast-import`: a client for the [`git fast-import`](https://git-scm.com/docs/git-fast-import) streaming format.
* `rcs-ed`: an implementation of the subset of [`ed`](https://linux.die.net/man/1/ed) commands [used by RCS](https://www.gnu.org/software/diffutils/manual/html_node/RCS.html).
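
To give a feel for what "the subset of `ed` commands used by RCS" means, here is a minimal sketch of applying an RCS-style edit script. It assumes the two commands of the `diff -n` format (`a<line> <count>` to append and `d<line> <count>` to delete), with line numbers referring to the original text and commands in ascending order. The `Command`, `parse_script`, and `apply` names are hypothetical and are not the API of the `rcs-ed` package, which may differ (for example, by operating on bytes rather than strings).

```rust
/// One command from an RCS-style edit script (the `diff -n` format).
#[derive(Debug, PartialEq)]
enum Command {
    /// `a<line> <count>`: insert `count` new lines after original line `line`.
    Add { after: usize, lines: Vec<String> },
    /// `d<line> <count>`: delete `count` lines starting at original line `line`.
    Delete { start: usize, count: usize },
}

/// Parses a script such as "d2 1\na2 1\nTWO\n" into commands.
fn parse_script(script: &str) -> Result<Vec<Command>, String> {
    let mut commands = Vec::new();
    let mut input = script.lines();
    while let Some(header) = input.next() {
        let (is_add, rest) = if let Some(rest) = header.strip_prefix('a') {
            (true, rest)
        } else if let Some(rest) = header.strip_prefix('d') {
            (false, rest)
        } else {
            return Err(format!("unknown command: {}", header));
        };
        let mut numbers = rest.split_whitespace().map(|n| n.parse::<usize>());
        let (line, count) = match (numbers.next(), numbers.next()) {
            (Some(Ok(line)), Some(Ok(count))) => (line, count),
            _ => return Err(format!("malformed command: {}", header)),
        };
        if is_add {
            // The `count` lines following an `a` header are the text to insert.
            let lines = (0..count)
                .map(|_| input.next().map(str::to_owned))
                .collect::<Option<Vec<_>>>()
                .ok_or_else(|| format!("truncated text after: {}", header))?;
            commands.push(Command::Add { after: line, lines });
        } else {
            commands.push(Command::Delete { start: line, count });
        }
    }
    Ok(commands)
}

/// Applies commands to `original`, assuming they are sorted by line number.
fn apply(original: &[&str], commands: &[Command]) -> Result<Vec<String>, String> {
    let mut output = Vec::new();
    let mut cursor = 0; // number of original lines already copied or deleted
    for command in commands {
        // Both commands first copy the untouched lines before their position.
        let keep_until = match command {
            Command::Add { after, .. } => *after,
            Command::Delete { start, .. } => {
                start.checked_sub(1).ok_or("line numbers start at 1")?
            }
        };
        if keep_until < cursor || keep_until > original.len() {
            return Err(format!("out of order or out of range: {:?}", command));
        }
        output.extend(original[cursor..keep_until].iter().map(|l| l.to_string()));
        cursor = keep_until;
        match command {
            Command::Add { lines, .. } => output.extend(lines.iter().cloned()),
            Command::Delete { count, .. } => {
                // Skipping the deleted lines is all a delete has to do.
                cursor += *count;
                if cursor > original.len() {
                    return Err(format!("delete past end of input: {:?}", command));
                }
            }
        }
    }
    output.extend(original[cursor..].iter().map(|l| l.to_string()));
    Ok(output)
}
```

With this sketch, parsing `"d2 1\na2 1\nTWO\n"` and applying it to the lines `one`, `two`, `three` replaces the second line with `TWO`. Walking the original once with a cursor avoids having to adjust line numbers after each command.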

### Helper packages

* `eq-macro`: a proc macro to derive `PartialEq<[u8]>`, used internally by `comma-v`.
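
For illustration, the kind of implementation such a derive would presumably generate can be written by hand. The sketch below uses a hypothetical `Sym` newtype over raw bytes; the type name is an assumption made for this example and is not taken from `comma-v` or `eq-macro`.

```rust
/// A hypothetical newtype wrapping the raw bytes of a parsed token.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Sym(Vec<u8>);

/// Hand-written equivalent of what a `PartialEq<[u8]>` derive might expand to:
/// the wrapper compares equal to a byte slice when its contents match.
impl PartialEq<[u8]> for Sym {
    fn eq(&self, other: &[u8]) -> bool {
        self.0.as_slice() == other
    }
}
```

This lets parser code and tests compare a parsed value directly against a byte literal, as in `Sym(b"head".to_vec()) == b"head"[..]`, without unwrapping the inner field first.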

### `git-cvs-fast-import` specific packages

* `src`: contains the source for the `git-cvs-fast-import` binary itself, which intentionally doesn't do very much and mostly delegates to other packages.
* `internal/process`: process management for `git fast-import`.
* `internal/state`: state management and persistence.

## Releasing

[GoReleaser](https://github.com/goreleaser/goreleaser) is used for package building and GitHub changelog generation. A mild amount of hackery is required to make it work with a Rust program; see [`.goreleaser.yml`](.goreleaser.yml) for the gory details.

For maximum compatibility, binaries are built as static Linux binaries using the `x86_64-unknown-linux-musl` Rust target. Note that this means that non-Rust dependencies can only be added if they can easily be statically linked. (In practice, this hasn't been a problem thus far.)
12 changes: 12 additions & 0 deletions Dockerfile
@@ -0,0 +1,12 @@
FROM rust:1.55.0-alpine@sha256:1c01fb410179b21f809ef935fd66277e964b5a8ad20431ad49b1c52b5778fd34 AS builder

RUN apk add --update alpine-sdk
COPY --chown=nobody:nobody . /src/
WORKDIR /src
USER nobody:nobody
RUN cargo build --release

FROM alpine:3.14@sha256:e1c082e3d3c45cccac829840a25941e679c25d438cc8412c2fa221cf1a824e6a

COPY --from=builder /src/target/release/git-cvs-fast-import /usr/local/bin/git-cvs-fast-import
ENTRYPOINT ["/usr/local/bin/git-cvs-fast-import"]
39 changes: 39 additions & 0 deletions LICENSE
@@ -0,0 +1,39 @@
The Sourcegraph Enterprise license (the “Enterprise License”)
Copyright (c) 2021 Sourcegraph Inc.

With regard to the Sourcegraph Software:

This software and associated documentation files (the "Software") may only be
used in production, if you (and any entity that you represent) have agreed to,
and are in compliance with, the Sourcegraph Terms of Service, available
at https://about.sourcegraph.com/terms (the “Enterprise Terms”), or other
agreement governing the use of the Software, as agreed by you and Sourcegraph,
and otherwise have a valid Sourcegraph Enterprise subscription for the
correct number of user seats. Subject to the foregoing sentence, you are free to
modify this Software and publish patches to the Software. You agree that Sourcegraph
and/or its licensors (as applicable) retain all right, title and interest in and
to all such modifications and/or patches, and all such modifications and/or
patches may only be used, copied, modified, displayed, distributed, or otherwise
exploited with a valid Sourcegraph Enterprise subscription for the correct
number of user seats. Notwithstanding the foregoing, you may copy and modify
the Software for development and testing purposes, without requiring a
subscription. You agree that Sourcegraph and/or its licensors (as applicable) retain
all right, title and interest in and to all such modifications. You are not
granted any other rights beyond what is expressly stated herein. Subject to the
foregoing, it is forbidden to copy, merge, publish, distribute, sublicense,
and/or sell the Software.

The full text of this Enterprise License shall be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

For all third party components incorporated into the Sourcegraph Software, those
components are licensed under the original license provided by the owner of the
applicable component.
42 changes: 42 additions & 0 deletions README.md
@@ -0,0 +1,42 @@
# git-cvs-fast-import

`git-cvs-fast-import` provides a restartable, performant Git importer for CVS repositories, focused on providing continuously updated mirrors of CVS repositories so that they can be analysed using existing tools that can only handle Git natively, such as [Sourcegraph](https://sourcegraph.com).

To support continuous updates, `git-cvs-fast-import` makes some tradeoffs compared to other tools that convert CVS repositories into Git: most notably, it makes considerably less effort to preserve precise history, especially on branches and tags. It is not intended to be a general-purpose, one-time converter for migrating from CVS to Git: see [the comparison to other tools](#comparison-to-other-tools) for suggestions on what to use in that case.

## Installation

Linux binaries are provided on [the releases page](https://github.com/sourcegraph/git-cvs-fast-import/releases), including RPM and DEB packages for RHEL/CentOS and Debian/Ubuntu installs, respectively. These binaries have been tested back to CentOS 7 and Ubuntu 16.04.

You will also need `git` installed, as `git-cvs-fast-import` uses the [`git fast-import`](https://git-scm.com/docs/git-fast-import) command internally when operating. Any version released in the last decade should be sufficient.

## Usage

You will need access to the `CVSROOT` of the CVS repository you wish to import, as `git-cvs-fast-import` parses the RCS files in the root to import the history of each file. In practice, this means you should expect to see a tree of files ending in `,v`.
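
As a quick sanity check that a directory looks like a `CVSROOT`, the following sketch walks a tree and collects the files ending in `,v`. It uses only the standard library, and the `find_rcs_files` helper is a hypothetical name for this example, not part of `git-cvs-fast-import`; the tool's own file discovery may work differently.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every file under `dir` whose name ends in `,v`.
fn find_rcs_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        if entry.file_type()?.is_dir() {
            // Descend into subdirectories, which CVS uses for modules.
            found.extend(find_rcs_files(&path)?);
        } else if path
            .file_name()
            .map_or(false, |name| name.to_string_lossy().ends_with(",v"))
        {
            found.push(path);
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}
```

If `find_rcs_files(Path::new("/cvs"))` returns an empty list, the path is probably a checked-out working copy rather than the repository root.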

You will also need a valid Git repository. This means that you need to `git init` your target repository before running `git-cvs-fast-import` for the first time.

Full help is available through `git-cvs-fast-import --help`, but for most uses, you only need to provide the CVSROOT, Git repository, metadata store, and (optionally) the CVS directories to be imported. For example, to import the `project` and `src` directories from a CVS repository at `/cvs`, write to a Git repository at `/git`, and store the metadata at `/tmp/import.db`, you would run the following:

```sh
git-cvs-fast-import -c /cvs -g /git -s /tmp/import.db project src
```

## Comparison to other tools

We know of three other tools that allow for CVS-to-Git conversion:

* [`git-cvsimport`](https://git-scm.com/docs/git-cvsimport): this ships with Git, and has support for incremental updates.
* [`cvs-fast-export`](https://gitlab.com/esr/cvs-fast-export): this is a standalone tool that parses CVS repositories and exports data in the `git fast-import` stream format, but does not support incremental updates.
* `cvs2git` is referenced in the `git-cvsimport` manpage, but no longer appears to have a home page.

We would suggest trying `cvs-fast-export` first for one-time conversions where the CVS repository will not be used thereafter, and then falling back to `git-cvsimport` if `cvs-fast-export` fails (which can happen with complex CVS histories).

## Known issues

* Importing branches other than `HEAD` is currently unsupported, but is intended to be added in the medium term.
* Tag history can be misleading: CVS tags are applied on a per-file basis, whereas Git tags are per-repository. As a result, `git-cvs-fast-import` makes a fake commit for each tag: this ensures that the actual content of the tag is correct, but may be misleading in terms of the history of the tag if the same CVS tag was applied to different files at different times, as commits may appear in the Git log that weren't logically part of the CVS history for a specific file.

## Development

Please refer to [`DEVELOPMENT.md`](DEVELOPMENT.md) for more detail on how this tool is structured, why some choices were made, and how to contribute.
17 changes: 17 additions & 0 deletions comma-v/Cargo.toml
@@ -0,0 +1,17 @@
[package]
name = "comma-v"
version = "0.1.0"
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
chrono = "0.4.19"
derive_more = "0.99.16"
eq-macro = { path = "../eq-macro" }
nom = "7.0.0"
thiserror = "1.0.29"

[dev-dependencies]
anyhow = "1.0.44"
structopt = "0.3.23"