Setting Up Paperless-ngx for Document Search

Saturday, Apr 4, 2026

Installing Paperless-ngx

This server started as a way to keep entertainment off my phone and out of the cloud. But once it was stable, it kind of turned into a tiny home “office” server too.

So I added Paperless-ngx for documents: ingest, OCR, and search. It’s the same idea as Immich, but for PDFs instead of photos — and it stays LAN-only.

For this install, I kept it simple and used the official installation script, which sets up a Docker Compose deployment for you.

The goal is not to scan every paper I have ever touched. I want a reliable place for the boring documents that are hard to find later: tax forms, receipts, lease paperwork, insurance PDFs, manuals, and anything else that normally disappears into a drawer. Paperless-ngx gives me OCR, tags, correspondents, document types, and full-text search without sending the archive to a cloud service.

The official Paperless-ngx documentation is the source of truth for installation and administration. I used the install script because this is a personal LAN service, but the generated Docker Compose files are still worth reading before trusting the setup.

What this solves

  • Turns loose PDFs and scans into a searchable archive.
  • Adds OCR so document text can be found later.
  • Keeps personal paperwork on the LAN instead of a cloud drive.
  • Creates a simple workflow for upload, tag, search, and restore.

1. Create the Paperless-ngx app folder

This is where I run the installer and keep the generated files.

sudo mkdir -p /srv/appdata/paperless-ngx
sudo chown -R "$(id -un)":"$(id -gn)" /srv/appdata/paperless-ngx
cd /srv/appdata/paperless-ngx

Like the other services in this series, I keep app data under /srv/appdata/. Paperless has a few important directories after setup: configuration, database/cache services, original documents, archived documents, and a consume folder. Keeping them together makes backups and future migration easier.
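Before running the installer, it can help to confirm the folder is really usable by the current user. This is a minimal sketch, not part of Paperless-ngx; check_app_dir is a hypothetical helper name.

```shell
# Sketch only: check_app_dir is a hypothetical helper, not a Paperless-ngx command.
# It confirms the folder exists, is owned by the current user, and is writable.
check_app_dir() {
  local dir="$1"
  [ -d "$dir" ] || { echo "missing: $dir" >&2; return 1; }
  [ -O "$dir" ] || { echo "not owned by $(id -un): $dir" >&2; return 1; }
  [ -w "$dir" ] || { echo "not writable: $dir" >&2; return 1; }
  echo "ok: $dir"
}

# Usage:
#   check_app_dir /srv/appdata/paperless-ngx
```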

2. Run the installation script

This is the one-liner from the docs:

bash -c "$(curl --location --silent --show-error https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"

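The script is interactive: it asks a few setup questions, generates the config and Compose files, pulls images, and starts Paperless-ngx. One of the questions is where to put those files. Pointing that target folder at the directory from step 1 keeps the later commands in this post working as written; the installer’s default may be a subfolder of the current directory.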
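Piping a remote script straight into bash skips the chance to read it. A more cautious variant is to save the installer to a file, look through it, and only then run it. The sketch below uses the same URL as the one-liner; fetch_installer and looks_like_script are hypothetical helper names, and the shebang check assumes the installer starts with a `#!` line.

```shell
# Sketch only: helper names are hypothetical. The URL is the one from the one-liner.
INSTALLER_URL="https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh"

# True when the first line is a shebang; false for something like an HTML error page.
looks_like_script() {
  head -n 1 "$1" | grep -q '^#!'
}

# Download the installer to a file instead of piping it into bash.
fetch_installer() {
  local out="$1"
  curl --location --silent --show-error --fail --output "$out" "$INSTALLER_URL" || return 1
  looks_like_script "$out" || { echo "unexpected content in $out" >&2; return 1; }
  echo "saved: $out"
}

# Usage:
#   fetch_installer ./install-paperless-ngx.sh
#   less ./install-paperless-ngx.sh      # read it first
#   bash ./install-paperless-ngx.sh      # then run it
```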

After the script finishes, I check what it created:

ls -lah
docker compose ps

Then I read the generated environment/config files before moving on. The installer saves time, but I still want to know which port the web UI uses, where documents are stored, and which paths need to be backed up.
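One way to skim the generated settings without putting secrets on screen is to print only the KEY=VALUE lines and mask anything that looks sensitive. This is a sketch: show_settings is a hypothetical helper, and the file name in the usage comment (docker-compose.env) is an assumption about what the installer generates, so check the `ls -lah` output for the real names.

```shell
# Sketch only: show_settings is a hypothetical helper.
# Prints KEY=VALUE lines from an env file, skipping comments and blank lines,
# and masks values whose key contains SECRET, PASSWORD, or KEY.
show_settings() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" |
    sed -E 's/^([A-Za-z0-9_]*(SECRET|PASSWORD|KEY)[A-Za-z0-9_]*)=.*/\1=****/'
}

# Usage (file name is an assumption):
#   show_settings docker-compose.env
#
# Docker Compose can also print the fully resolved configuration,
# including published ports and volume mappings:
#   docker compose config
#   docker compose config --services
#   docker compose config --volumes
```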

3. First login

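Once it’s up, I open http://<server-ip>:8000 from another device on my network and create an admin account. Port 8000 is the installer’s default; if a different port was chosen during setup, the generated compose file shows it.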

If the page does not load, I check the containers and logs first:

cd /srv/appdata/paperless-ngx
docker compose ps
docker compose logs --tail=100

Paperless depends on more than one container, so a blank page is not always a web-server problem. It can be the broker, database, or permissions on one of the document paths.
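To narrow down which piece is failing, a quick HTTP probe from the server itself separates "nothing is listening" from "the web server answers but something behind it is broken". The sketch below is illustrative: explain_status and probe_ui are hypothetical helpers, and the service names in the usage comments (webserver, broker, db) are assumptions about the generated compose file.

```shell
# Sketch only: explain_status and probe_ui are hypothetical helper names.
# Turn an HTTP status code into a hint about where to look next.
explain_status() {
  case "$1" in
    2??|3??) echo "web UI is answering" ;;
    000)     echo "no connection: check the webserver container and the published port" ;;
    5??)     echo "server error: check the database and broker logs" ;;
    *)       echo "unexpected HTTP status: $1" ;;
  esac
}

probe_ui() {
  local url="$1" code
  # --write-out prints only the status code; curl reports 000 when it cannot connect.
  code="$(curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$url")"
  explain_status "$code"
}

# Usage, run on the server itself:
#   probe_ui http://localhost:8000
#
# Logs for one service at a time (names are assumptions; list the real ones
# with `docker compose config --services`):
#   docker compose logs --tail=50 webserver
#   docker compose logs --tail=50 broker
#   docker compose logs --tail=50 db
```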

4. Workflow

Paperless-ngx is at its best when it’s boring:

  • scan → upload / import → OCR → search later

Once it’s set, it’s just a quiet archive that makes paper stop piling up.

My first workflow is intentionally manual. I upload one PDF through the web UI, wait for Paperless to process it, then search for a word that only appears inside the document. That tests the full path: upload, consume, OCR, index, and search.
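The same search test can be repeated from a terminal through the Paperless-ngx REST API, which accepts a full-text `query` parameter on the documents endpoint and token authentication. This is a sketch under assumptions: search_documents is a hypothetical helper, the base URL and token are placeholders, and the CURL override exists only so the command can be printed instead of run.

```shell
# Sketch only: search_documents is a hypothetical helper name.
# Arguments: base URL, API token, search term.
# Set CURL=echo to print the command instead of sending the request.
search_documents() {
  local base="$1" token="$2" term="$3"
  ${CURL:-curl} --silent --show-error --fail --get \
    --header "Authorization: Token $token" \
    --header "Accept: application/json" \
    --data-urlencode "query=$term" \
    "$base/api/documents/"
}

# Usage (placeholders, not real values):
#   search_documents "http://<server-ip>:8000" "<api-token>" "a word from the test PDF"
```

The response is JSON, so a result count above zero for a word that only appears inside the PDF confirms that OCR and indexing both finished.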

After that works, I can use the consume folder for a more automatic workflow. The Paperless consumption docs explain the details, but the idea is simple: files placed in the consume directory are imported and processed by Paperless.
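Paperless removes files from the consume folder once they have been imported, so copying rather than moving keeps the original around until the import has been verified. A minimal sketch, assuming the consume folder lives under the app folder; queue_for_consume is a hypothetical helper and the default path should be replaced with whatever the generated compose file maps.

```shell
# Sketch only: queue_for_consume is a hypothetical helper name.
# CONSUME_DIR is an assumption; use the consume path from the generated compose file.
CONSUME_DIR="${CONSUME_DIR:-/srv/appdata/paperless-ngx/consume}"

# Copy (not move) a file into the consume folder so the original stays in place.
queue_for_consume() {
  local src="$1" dest_dir="${2:-$CONSUME_DIR}"
  [ -f "$src" ] || { echo "not a file: $src" >&2; return 1; }
  [ -d "$dest_dir" ] || { echo "no consume folder: $dest_dir" >&2; return 1; }
  cp -- "$src" "$dest_dir/" && echo "queued: $(basename "$src")"
}

# Usage:
#   queue_for_consume ~/scans/receipt.pdf
```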

5. How I organize documents

Paperless-ngx can auto-tag and auto-match over time, but I started with a tiny taxonomy so I would actually use it:

  • Correspondents: the organization or person the document came from.
  • Document types: receipt, tax form, manual, statement, contract.
  • Tags: broad cross-cutting labels like home, car, medical, warranty, taxes.

I avoid making too many tags early. If every document needs a custom tagging decision, the system becomes a chore. Search is the main feature; metadata is there to make common filters faster.

6. What needs backing up

Paperless is another service where the data matters more than the app. Containers can be recreated. Documents cannot.

For this setup, I care about backing up the generated compose/config files and the Paperless data directories. The project’s backup documentation is the reference I would follow before doing a migration or reinstall.

At a practical level, I want to preserve:

/srv/appdata/paperless-ngx

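That single path only covers everything if the document, data, and database folders were pointed inside it during setup. If those installer prompts were left empty, Docker manages them as named volumes outside this folder, and they need their own backup step.

If I later move document storage to a separate drive, I’ll update this mental checklist the same way I did for Immich. The important thing is knowing where the originals and archive live before I need to restore them.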
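A dated archive of the app folder is the simplest form of that checklist. The sketch below is one possible approach, not the project’s official procedure: backup_appdata is a hypothetical helper, the destination path is an assumption, and it only captures data that actually lives under the app folder (anything in Docker-managed volumes is not included).

```shell
# Sketch only: backup_appdata is a hypothetical helper name.
# Writes <dest_dir>/paperless-ngx-YYYY-MM-DD.tar.gz and prints the archive path.
backup_appdata() {
  local src="$1" dest_dir="$2" archive
  [ -d "$src" ] || { echo "missing: $src" >&2; return 1; }
  mkdir -p "$dest_dir" || return 1
  archive="$dest_dir/paperless-ngx-$(date +%Y-%m-%d).tar.gz"
  # -C stores paths relative to the parent folder instead of absolute paths.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  echo "$archive"
}

# Usage (destination is an assumption; keep it outside the app folder).
# Stopping the containers first avoids copying a database mid-write, and sudo
# may be needed if container-owned files are not readable by your user:
#   cd /srv/appdata/paperless-ngx && docker compose stop
#   backup_appdata /srv/appdata/paperless-ngx /srv/backups
#   docker compose start
#
# The exporter described in the Paperless-ngx docs writes documents plus metadata
# to the export folder regardless of where the volumes live
# (assumes the web service is named webserver):
#   docker compose exec -T webserver document_exporter ../export
```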

7. My “done” check

I call Paperless done when four things work:

  1. The web UI loads from another device on the LAN.
  2. I can upload a PDF.
  3. OCR finishes and the document becomes searchable.
  4. The service comes back after a reboot.

That last one is easy to test:

sudo reboot

Then, after the server is back:

cd /srv/appdata/paperless-ngx
docker compose ps

If the containers are healthy and search still works, the system is ready for real documents.
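Reading the `docker compose ps` table by eye works, but the post-reboot check can also be reduced to one question: is every service defined in the compose file currently running? A minimal sketch, with missing_services and check_stack as hypothetical helper names:

```shell
# Sketch only: missing_services and check_stack are hypothetical helper names.
# Print every service from the first list that does not appear in the second.
missing_services() {
  local defined="$1" running="$2" svc
  for svc in $defined; do
    printf '%s\n' $running | grep -qxF -- "$svc" || echo "$svc"
  done
}

# Compare the services defined in the compose file with the ones running now.
check_stack() {
  local missing
  missing="$(missing_services \
    "$(docker compose config --services)" \
    "$(docker compose ps --services --status running)")"
  if [ -z "$missing" ]; then
    echo "all services running"
  else
    echo "not running:" $missing
    return 1
  fi
}

# Usage, after the reboot:
#   cd /srv/appdata/paperless-ngx
#   check_stack
```

If a service is missing after a reboot, the `restart:` policy in the generated compose file is the first thing to check.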