Martin Pool edited this page Jan 1, 2024 · 2 revisions

This page contains rough notes and thoughts; some are implemented, some are unimplemented, and some are obsolete.


  • Move text formatting into the UI, rather than keeping it within the CLI?

  • Add a 'high-level' module similar to the CLI, but not coupled to it?

  • Report warnings as failures/errors that are passed to the UI, rather than just printing text.

  • Change log/error statements to use report

  • Make a macro like try! that logs when it sees an error?

  • Errors to stderr rather than stdout? Hard to reconcile with use of terminal for colored errors.
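
The logging `try!`-like macro could look something like the sketch below. The names `log_err` and `try_log!` are hypothetical, and writing to stderr via `eprintln!` is only a stand-in for whatever reporting path is eventually chosen.

```rust
use std::fmt::Display;

/// Hypothetical helper: report an error (here, just to stderr) and pass the
/// result through unchanged.
fn log_err<T, E: Display>(result: Result<T, E>, context: &str) -> Result<T, E> {
    if let Err(e) = &result {
        eprintln!("error: {context}: {e}");
    }
    result
}

/// Hypothetical macro: behaves like `?`, but logs the error before returning it.
macro_rules! try_log {
    ($expr:expr, $context:expr) => {
        log_err($expr, $context)?
    };
}

/// Example caller: any parse failure is logged and then propagated.
fn parse_port(text: &str) -> Result<u16, std::num::ParseIntError> {
    let port = try_log!(text.trim().parse::<u16>(), "parsing port");
    Ok(port)
}
```

Keeping the logging in a plain function and the early return in the macro means the macro body stays tiny and the reporting destination can change in one place.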

Plain text progress

When running without a tty, logging every file to stdout will be excessive and slow. Perhaps we should have a progress mode where we render a single line, something like the progress bar, every 10 or 30 seconds, so some progress is visible.
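
A minimal sketch of such a rate-limited progress line, assuming a hypothetical `PlainProgress` type that the caller ticks with the current time and counters. The message format is invented for illustration.

```rust
use std::time::{Duration, Instant};

/// Hypothetical plain-text progress: produces at most one line per `interval`.
struct PlainProgress {
    interval: Duration,
    last_emit: Option<Instant>,
}

impl PlainProgress {
    fn new(interval: Duration) -> Self {
        PlainProgress { interval, last_emit: None }
    }

    /// Returns a line to print if nothing has been emitted yet, or if at
    /// least `interval` has passed since the last emitted line.
    fn tick(&mut self, now: Instant, files_done: u64, bytes_done: u64) -> Option<String> {
        let due = match self.last_emit {
            None => true,
            Some(last) => now.duration_since(last) >= self.interval,
        };
        if !due {
            return None;
        }
        self.last_emit = Some(now);
        Some(format!("{files_done} files, {bytes_done} bytes processed"))
    }
}
```

Passing `now` in, rather than calling `Instant::now()` inside, keeps the throttling testable without sleeping.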

Refactor of the UI

I'm not sure there's a really meaningful difference between the color and plain UIs: both are essentially text UIs. Perhaps the user visible options should be --color=auto and --progress=auto, and it's all the same text UI. It can contain a progress bar, or a progress sink that does nothing if it's turned off or the terminal doesn't support it. And similarly for drawing colors if possible and wanted, and not otherwise.
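
One way this might be sketched: a single tri-state option type resolved against whether the output is a terminal. `Toggle` and `TextUi` are hypothetical names; `std::io::IsTerminal` is the standard library trait (Rust 1.70 and later).

```rust
use std::io::IsTerminal;

/// Hypothetical value for `--color=` and `--progress=`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Toggle {
    Auto,
    Always,
    Never,
}

impl Toggle {
    fn parse(s: &str) -> Option<Toggle> {
        match s {
            "auto" => Some(Toggle::Auto),
            "always" => Some(Toggle::Always),
            "never" => Some(Toggle::Never),
            _ => None,
        }
    }

    /// `Auto` follows the terminal; the other two ignore it.
    fn resolve(self, is_tty: bool) -> bool {
        match self {
            Toggle::Always => true,
            Toggle::Never => false,
            Toggle::Auto => is_tty,
        }
    }
}

/// One text UI; color and progress are just switches on it.
struct TextUi {
    color: bool,
    progress: bool,
}

impl TextUi {
    fn new(color: Toggle, progress: Toggle, is_tty: bool) -> TextUi {
        TextUi {
            color: color.resolve(is_tty),
            progress: progress.resolve(is_tty),
        }
    }

    /// Convenience constructor that checks the real stdout.
    fn for_stdout(color: Toggle, progress: Toggle) -> TextUi {
        TextUi::new(color, progress, std::io::stdout().is_terminal())
    }
}
```

This sketch only checks for a tty; whether the terminal actually supports color or cursor movement would need a further check.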

Fancy UI seems to have some performance impact

(This might be obsolete?)

Backup with a color UI is slower than without. Maybe due to contention for locks? Should we have a separate thread just to show UI updates?

I want to avoid having threads that could be doing useful work stalling waiting to update the Report, or even worse drawing to the terminal.

It might help to use atomic ints for the counters (when they're stable) rather than locking and unlocking.
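
Atomic integers are now stable in the standard library. A rough sketch of lock-free counters, where the `Counters` struct and its fields are assumptions rather than the actual Report layout:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical counters that workers can bump without taking a lock.
#[derive(Default)]
struct Counters {
    files: AtomicU64,
    bytes: AtomicU64,
}

impl Counters {
    fn record_file(&self, size: u64) {
        // Relaxed is enough here: the counters are independent statistics
        // and nothing else is synchronized through them.
        self.files.fetch_add(1, Ordering::Relaxed);
        self.bytes.fetch_add(size, Ordering::Relaxed);
    }

    /// Reads both counters; the pair is not an atomic snapshot while
    /// workers are still running.
    fn snapshot(&self) -> (u64, u64) {
        (
            self.files.load(Ordering::Relaxed),
            self.bytes.load(Ordering::Relaxed),
        )
    }
}

/// Spawns `workers` threads that each record `files_per_worker` files.
fn count_in_parallel(workers: u64, files_per_worker: u64, file_size: u64) -> (u64, u64) {
    let counters = Arc::new(Counters::default());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counters = Arc::clone(&counters);
            thread::spawn(move || {
                for _ in 0..files_per_worker {
                    counters.record_file(file_size);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counters.snapshot()
}
```

A UI thread could poll `snapshot()` on a timer, so worker threads never wait on drawing.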

It's probably not the cause of any slowdown, but I'm not sure the Sizes struct really helps, because some things such as source files don't have an easy compressed size.

Better progress bar

  • Show a progress bar during initial measurement of the source tree.

  • Don't measure the source tree when the progress bar is turned off. (And maybe provide an option to skip this even when progress is on, or is that too much complication?)

For backup, and probably also for restore, it'd be better to calculate completion fraction based on percentage of bytes, rather than files. Walking the source tree, or even the stored tree, to get this, should be acceptably cheap. (For the source tree, not much more than listing the tree.) Obviously some files might go faster than others if their data is already stored, but that's fine.
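
As an illustration only, measuring a source tree with the standard library and turning byte counts into a fraction might look like this. `measure_tree` and `completion_fraction` are hypothetical names, and this is not how Conserve itself walks trees.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sums the sizes of regular files under `root`. Symlinks are not followed,
/// because `DirEntry::metadata` does not traverse them.
fn measure_tree(root: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_dir() {
            total += measure_tree(&entry.path())?;
        } else if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}

/// Fraction complete by bytes, clamped to 1.0 in case files grew after
/// they were measured. An empty tree counts as complete.
fn completion_fraction(bytes_done: u64, bytes_total: u64) -> f64 {
    if bytes_total == 0 {
        1.0
    } else {
        (bytes_done as f64 / bytes_total as f64).min(1.0)
    }
}
```

The walk only calls `metadata` per entry and never opens files, which is why it should cost little more than listing the tree.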

Ideally should show some of these:

  • Bytes read (uncompressed source) and written (compressed and after deduplication)
  • Current filename
  • Progress within the current file (if that's known, but this will be complicated with parallelism)

This could, at least on Unix, be even fancier by using terminfo to scroll up the region above the bar, leaving the bar there.

Progress bar modes?

In some cases we know the number of files or bytes to be processed; in others not.

Perhaps it's best to just always get the total number of bytes up front, including for restore and validate?

We do need a mode while we're still counting, when the total isn't known, and if pre-measuring is turned off we could stay in this mode.
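
The two modes could be modeled as an enum, roughly as below. The `Progress` type, its variant names, and the rendered text are all invented for this sketch.

```rust
/// Hypothetical progress state: either still counting, or working against
/// a known total.
enum Progress {
    /// The total is unknown: still measuring, or pre-measuring is turned off.
    Counting { bytes_seen: u64 },
    /// The total is known, so a percentage can be shown.
    Measured { bytes_done: u64, bytes_total: u64 },
}

impl Progress {
    fn render(&self) -> String {
        match *self {
            Progress::Counting { bytes_seen } => format!("{bytes_seen} bytes so far"),
            Progress::Measured { bytes_done, bytes_total } => {
                // Integer percentage, clamped to 100, widened to avoid overflow.
                let percent = if bytes_total == 0 {
                    100
                } else {
                    bytes_done.min(bytes_total) as u128 * 100 / bytes_total as u128
                };
                format!("{percent}% ({bytes_done}/{bytes_total} bytes)")
            }
        }
    }
}
```

With pre-measuring turned off, the UI would simply never leave `Counting`.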

UI

Goals:

  • accumulate all actions so they can easily be compared to expected results at the end of a test

  • show them in nicely formatted text output, eg with indenting, color or tabulation, not just log output

  • stream output rather than waiting for the whole command to finish

  • perhaps later support a gui

  • ui interactions can be externalized onto pipes

  • show progress bars, which implies knowing when an operation starts and ends and if possible how many items are to be processed

  • simple inside the main application code

  • not too many special cases in the ui code

Emit fairly abstracted events that can be mapped into a ui, or just emitted to stdout. Maybe emit them as (ascii?) protobufs?

Human strings are internationalized: this should be done strictly in the UI layer. Debug/log strings can be emitted anywhere and don't need i18n.

XXX: is it enough, perhaps, just to use logging? Perhaps that's the simplest thing that would work, for now, enough to do some testing? Open questions:

  • Transmit the actual text to be shown to the user, or some kind of symbol? Text is enough to test it, but not so good for reformatting things.
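
To make the "abstracted events" idea concrete, here is a sketch in which core code emits symbolic events through a trait, a recording implementation accumulates them for tests, and a text implementation formats them as they arrive. Every name here (`UiEvent`, `Ui`, `RecordingUi`, `TextUi`, `fake_backup`) is hypothetical.

```rust
use std::io::Write;

/// Symbolic events rather than preformatted text, so a UI can reformat them.
#[derive(Clone, Debug, PartialEq, Eq)]
enum UiEvent {
    Started { operation: String, total_files: Option<u64> },
    FileStored { path: String },
    Finished { operation: String },
}

trait Ui {
    fn emit(&mut self, event: UiEvent);
}

/// Accumulates events so a test can compare them to expectations at the end.
#[derive(Default)]
struct RecordingUi {
    events: Vec<UiEvent>,
}

impl Ui for RecordingUi {
    fn emit(&mut self, event: UiEvent) {
        self.events.push(event);
    }
}

/// Streams formatted text as events arrive; all human-readable strings live here.
struct TextUi<W: Write> {
    out: W,
}

impl<W: Write> Ui for TextUi<W> {
    fn emit(&mut self, event: UiEvent) {
        let line = match event {
            UiEvent::Started { operation, total_files: Some(n) } => {
                format!("{operation}: {n} files")
            }
            UiEvent::Started { operation, total_files: None } => {
                format!("{operation}: counting files")
            }
            UiEvent::FileStored { path } => format!("  stored {path}"),
            UiEvent::Finished { operation } => format!("{operation}: done"),
        };
        // A failed write to the UI is ignored in this sketch.
        let _ = writeln!(self.out, "{line}");
    }
}

/// Stand-in for core code: it only knows about the `Ui` trait.
fn fake_backup(ui: &mut dyn Ui, paths: &[&str]) {
    ui.emit(UiEvent::Started {
        operation: "backup".to_string(),
        total_files: Some(paths.len() as u64),
    });
    for path in paths {
        ui.emit(UiEvent::FileStored { path: path.to_string() });
    }
    ui.emit(UiEvent::Finished { operation: "backup".to_string() });
}
```

Sending symbols rather than text keeps i18n and layout in the UI layer, at the cost of one more enum to maintain.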

Alarms

When Conserve hits something unexpected in the environment, the core code will signal an alarm, and then attempt to continue. The alarms are structured and can be filtered. The default handlers will try to balance safety vs completion, but they can be customized. In particular, you can tell it to accept everything and try hard to continue, so you have the best chance of recovering something from a damaged backup.

This is somewhat similar to Python's warnings module, but needs a different implementation, because Python's module is so tied to warnings being about code.

Fields:

  • area: source, archive, band, block, restore
  • condition:
    • missing
    • ioerror
    • denied: permissions error from the OS
    • corrupt: protobuf deserialization failed, etc
    • mismatch: hash is not what a higher-level object says it should be
    • exists: a file to be written already exists
  • filename
  • message - only what can't be stored elsewhere

Handling options:

  • abort
  • continue (with a warning)
  • ask (interactively; perhaps not very useful in a long backup)
  • suppress (with only a debug message)

The default should probably (?) be to abort on almost everything, except perhaps not on source alarms.

It should be possible to get a summary, and machine-readable details of alarms fired.
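
A possible shape for the alarm fields and handling options, as a sketch. The type names, the first-matching-rule lookup, and the `accept_everything` constructor are assumptions; the built-in default follows the tentative "abort on almost everything except source alarms" idea above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Area { Source, Archive, Band, Block, Restore }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Condition { Missing, IoError, Denied, Corrupt, Mismatch, Exists }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Action { Abort, Continue, Ask, Suppress }

#[derive(Clone, Debug, PartialEq, Eq)]
struct Alarm {
    area: Area,
    condition: Condition,
    filename: Option<String>,
    /// Only what can't be stored in the other fields.
    message: Option<String>,
}

/// A filter rule: `None` matches any area or any condition.
struct Rule {
    area: Option<Area>,
    condition: Option<Condition>,
    action: Action,
}

impl Rule {
    fn matches(&self, alarm: &Alarm) -> bool {
        self.area.map_or(true, |a| a == alarm.area)
            && self.condition.map_or(true, |c| c == alarm.condition)
    }
}

/// Hypothetical handler: the first matching rule wins, else the default applies.
#[derive(Default)]
struct AlarmHandler {
    rules: Vec<Rule>,
    fired: Vec<(Alarm, Action)>,
}

impl AlarmHandler {
    /// Try hard to continue, e.g. when recovering from a damaged backup.
    fn accept_everything() -> Self {
        AlarmHandler {
            rules: vec![Rule { area: None, condition: None, action: Action::Continue }],
            fired: Vec::new(),
        }
    }

    fn signal(&mut self, alarm: Alarm) -> Action {
        let action = self
            .rules
            .iter()
            .find(|rule| rule.matches(&alarm))
            .map(|rule| rule.action)
            // Assumed default: continue on source alarms, abort otherwise.
            .unwrap_or(if alarm.area == Area::Source { Action::Continue } else { Action::Abort });
        self.fired.push((alarm, action));
        action
    }

    fn summary(&self) -> String {
        format!("{} alarms fired", self.fired.len())
    }
}
```

Because `fired` keeps the structured alarms, both a human summary and machine-readable details could be produced from the same list.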

Return codes and result reporting

It's bad if a backup aborts without storing anything because of a footling error: it may be some unimportant source file was unreadable and therefore nothing was stored. On the other hand, it's also bad if the backup apparently succeeds when there are errors, because the file that was skipped might have actually been the most important one.

Therefore there need to be concise and clear summary results, that can be read by humans and by scripts reading the output, and an overall one-byte summary in the return code.

Possible return codes:

  • everything was ok (0)

    • no alarms at all
    • backup completed
  • backup completed with warnings:

    • some source files couldn't be read?
    • every source file that could be read has been stored
  • backup completed but with major warnings:

    • some already stored data seems to be corrupt?
  • fatal error

    • bad arguments, etc
    • unexpected exception
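
The list above could be folded into one ordered enum, with the overall result being the worst outcome seen. Only "ok is 0" comes from these notes; the values 1 to 3 and the names below are placeholders, not decided codes.

```rust
use std::process::ExitCode;

/// Hypothetical outcomes, ordered from best to worst.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Outcome {
    /// No alarms at all; backup completed.
    Ok = 0,
    /// Completed, but some source files couldn't be read (placeholder value).
    Warnings = 1,
    /// Completed, but already-stored data seems corrupt (placeholder value).
    MajorWarnings = 2,
    /// Bad arguments, unexpected failure, etc. (placeholder value).
    Fatal = 3,
}

impl Outcome {
    /// The one-byte summary for the return code.
    fn code(self) -> u8 {
        self as u8
    }

    fn exit_code(self) -> ExitCode {
        ExitCode::from(self.code())
    }
}

/// The overall result is the worst individual outcome; none at all means Ok.
fn worst(outcomes: impl IntoIterator<Item = Outcome>) -> Outcome {
    outcomes.into_iter().max().unwrap_or(Outcome::Ok)
}
```

A `main` returning `ExitCode` could end with `worst(results).exit_code()`, so scripts get the summary byte while the detailed report goes to the output.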

We also need to consider the diff, verify, validate cases:

  • some data is wrong or missing, but it may still be possible to restore everything (eg a hash is wrong)
  • some data is wrong or missing so at least some files can't be restored
  • the source differs from the backup, in ways that might be accounted for by changes since the date of the backup
  • the source differs from the backup, with those changes apparently dating from before the backup was made