This repository has been archived by the owner on Dec 18, 2019. It is now read-only.
Jerome Flesch edited this page May 3, 2016 · 65 revisions

How can I access my documents from several places?

The easiest solution is usually to synchronize your documents across your machines. Tools such as the ownCloud Desktop Client or SparkleShare can do this (there are many others). Just tell them to sync Paperwork's work directory (~/papers by default; it can be changed in the settings).

When Paperwork starts, the first thing it does is check whether the documents have changed. If any of them has, it updates its index accordingly.

However, please note that Paperwork doesn't encrypt documents. It is therefore not advisable to store your documents in "the cloud" (Dropbox, Google Drive, etc.). Doing so would make it far too easy for a private company to access all the details of your private life.

How do I encrypt my documents?

GNU/Linux distributions include many tools for encrypting whole directories.

With Paperwork, there are two directories that should be encrypted to protect your privacy:

  • Your work directory (~/papers by default; it can be changed in the settings)
  • The cache directory (~/.local/share/paperwork; it cannot be changed). It contains index files from which the content of your documents could be partially recovered.

eCryptfs

On Debian and Ubuntu GNU/Linux, you can easily create a directory named Private in your home directory. This directory is encrypted using your login password. Just run ecryptfs-setup-private in a terminal to create it. You can then put Paperwork's work directory in it.

Once the directory has been created, you can also store Paperwork's cache in it:

$ mv ~/.local/share/paperwork ~/Private/paperwork_cache
$ ln -s ~/Private/paperwork_cache ~/.local/share/paperwork
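
The two commands above assume the cache exists and has not been moved already. As a rough sketch, the same move can be wrapped in a few sanity checks. The function name relocate_dir is made up for this example, and it is probably wise to close Paperwork before running it:

```shell
# Sketch only: move a directory somewhere else and leave a symlink behind.
# relocate_dir is a hypothetical helper, not something shipped with Paperwork.
relocate_dir() {
    src=$1   # directory to move, e.g. ~/.local/share/paperwork
    dst=$2   # new location, e.g. ~/Private/paperwork_cache

    # Check for a symlink first: [ -d ] would follow it and hide the problem.
    if [ -L "$src" ]; then
        echo "already a symlink: $src" >&2
        return 1
    fi
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "target already exists: $dst" >&2
        return 1
    fi

    # Only create the symlink if the move succeeded.
    mv "$src" "$dst" && ln -s "$dst" "$src"
}

# Usage:
# relocate_dir "$HOME/.local/share/paperwork" "$HOME/Private/paperwork_cache"
```

Running it a second time does nothing harmful: the function sees the symlink and stops.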

EncFS

EncFS can also be used to create encrypted directories easily. However, EncFS seems to have some security weaknesses.

How do I import many PDF documents in one go?

  1. Create a directory
  2. Put all your PDFs in it
  3. In Paperwork, open the sub-menu next to the "Print" button and choose "Import file(s)"
  4. Select the directory containing all the PDFs
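
If your PDFs are scattered over several sub-directories, steps 1 and 2 can be done from a terminal. The following is a minimal sketch: the function name collect_pdfs and both paths are only examples, and the destination should be outside the tree being searched.

```shell
# Sketch only: copy every PDF found under a source tree into one flat directory.
# collect_pdfs is a hypothetical helper name chosen for this example.
collect_pdfs() {
    src=$1    # where the PDFs currently live (searched recursively)
    dest=$2   # directory to select later in "Import file(s)"

    mkdir -p "$dest" || return 1
    # -iname matches both .pdf and .PDF.
    # Files sharing the same name overwrite each other in $dest;
    # the originals under $src are never modified.
    find "$src" -type f -iname '*.pdf' -exec cp {} "$dest"/ \;
}

# Usage:
# collect_pdfs "$HOME/Documents" "$HOME/to_import"
```

The originals are copied, not moved, so nothing is lost if the import goes wrong.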

Where are Paperwork's files located?

By default:

  • Configuration: ~/.config/paperwork.conf
  • Index: ~/.local/share/paperwork
  • Documents: ~/papers

The index is always updated based on the documents. When Paperwork starts, it uses the modification time of each file to detect changes to the documents.
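
To get a feel for what a modification-time check looks like, here is a small shell sketch. It is only an illustration, not Paperwork's actual code: the function name changed_since and the marker file are assumptions made for this example. Unlike a real index, it cannot notice deleted documents.

```shell
# Sketch only: list the files modified after a reference "stamp" file.
# changed_since and the stamp file are hypothetical, not part of Paperwork.
changed_since() {
    dir=$1     # document directory, e.g. ~/papers
    stamp=$2   # marker file touched at the end of the previous scan

    if [ -e "$stamp" ]; then
        # -newer compares each file's modification time with the stamp's.
        find "$dir" -type f -newer "$stamp"
    else
        # No previous scan recorded: every file counts as changed.
        find "$dir" -type f
    fi
}

# Usage:
# changed_since "$HOME/papers" "$HOME/.last_scan" && touch "$HOME/.last_scan"
```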

How are the documents stored?

See the page describing the organisation of the work directory.

How do I uninstall Paperwork?

If you installed Paperwork manually:

sudo pip uninstall paperwork
sudo pip uninstall pyocr
sudo pip uninstall pyinsane

(pip is called python-pip on some systems)

If you installed several versions of these packages, you may have to run these commands several times.

Note that other dependencies are installed along with Paperwork. However, python-pip cannot automatically detect and remove unused dependencies. This is why you should use your distribution's packages if possible.
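
Repeating the commands by hand can be replaced with a small loop. This is a sketch under two assumptions: that pip show stops reporting a package once every copy is gone, and that your pip is new enough to accept -y (answer yes to the confirmation prompt). The function name uninstall_all_copies is made up for this example.

```shell
# Sketch only: keep uninstalling a package until pip no longer reports it.
# uninstall_all_copies is a hypothetical helper name.
uninstall_all_copies() {
    pkg=$1
    # "pip show" exits with a non-zero status when the package is not installed.
    while pip show "$pkg" >/dev/null 2>&1; do
        # Stop at the first failure instead of looping forever.
        sudo pip uninstall -y "$pkg" || return 1
    done
}

# Usage:
# for pkg in paperwork pyocr pyinsane; do uninstall_all_copies "$pkg"; done
```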

How do I change the file browser used by Paperwork?

When you click "Open document directory", Paperwork uses your default file browser (the one launched by xdg-open). To change it, run one of the following as a normal user:

  • Nautilus (Gnome's file browser):
xdg-mime default nautilus.desktop inode/directory
  • Thunar (XFCE's file browser):
xdg-mime default Thunar.desktop inode/directory

Why did you do X instead of Y?

Variant: Why haven't you implemented X?

Variant: I need feature X. Can I have it?

Basically, because the SABDFL (me) said so or just hasn't had time to make the change yet :-)

If you want something changed or improved, your options are:

Let's be honest: I'm not going to do anything just because it looks better to you. I'm also not going to do anything just to satisfy a weird use case that only concerns you. So please do your best to be convincing without being annoying :-)

Also, please keep in mind I'm doing this in my free time. In other words, I have a very limited amount of time I can spend on Paperwork. So weird or crazy (but valid) features may be delayed from version to version until the end of time.

Why can't X be configured?

Because if we added all the options everyone wants, the settings dialog would look like the space shuttle's control panel. I'm not going to design a crazy GUI like Eclipse's.

However, in the future, there may be hidden settings in the configuration file to accommodate weird requirements.

How can I get statistics about my documents?

Statistics are fun. Unfortunately, they are not really helpful here, so there is nothing in the GUI to get some. However, there is a script:

$ git clone https://github.com/jflesch/paperwork.git
$ cd paperwork
$ scripts/stats.py
(...)
Statistics
==========
Total number of documents: 989
Total number of pages: 1846
Total number of words: 382490
Total words len: 2751489
Total number of unique words: 54399
===
Maximum number of pages in one document: 75
Maximum word length: 179
Average word length: 7.193623
Average number of words per page: 207.199350
Average number of words per document: 386.744186
Average number of pages per document: 1.866532
Average number of unique words per document: 202.777553
Average accuracy of label prediction (global): 98%
Average accuracy of label prediction (positive): 88%
Average accuracy of label prediction (negative): 99%

Why a Gtk GUI instead of a web interface?

  • Paperwork is designed to be as simple to install and use as possible. Web servers and web applications are not simple to install (yes, I know, Paperwork's dependency nightmare doesn't make it easy as a Gtk GUI either... but package managers are supposed to take care of that).
  • I (jflesch) have no use for a web frontend (I sync my files with SparkleShare, and I can access them everywhere I need them). I don't develop features I don't use. Doing otherwise would be the best way to get regressions all the time.
  • With a web application, scanning is not an easy problem. You could use a scanner connected to the server... but that wouldn't make sense. You need to scan on the client side and, afaik, that would require a browser plugin.

Feel free to make one if you want. We will gladly help you with any questions you might have about the Paperwork backend.
