I Built a PDF Redaction Tool Because I'm Too Lazy (and Too Cheap for Two Credit Cards)

Project · April 02, 2026 · 3 min read

Every year, my company lets us claim travel expenses for getting to the office. Because, well, London. The cost of living crisis spares no one.

At first, I used an Oyster card for TFL. Then I kept losing them. So I switched to Apple Pay on my credit card. Problem solved — except now my reimbursement claims meant submitting the entire statement.

Redacting the other transactions manually? Too much work. And we all know programmers are lazy about working hard but will happily spend a weekend automating a 10-minute task.

So for a while, I just submitted the whole thing. Then it started to bother me. Some random approver in finance was seeing everything. You'd be surprised how much a credit card statement reveals — where you eat, what gym you go to, how many Deliveroo orders you placed at 1 AM, and exactly how much you've spent on fans and air coolers. (Don't ask.)

The sensible solution? Get a second card. Use one exclusively for travel. But I'm a minimalist. I refuse to get another credit card just for this silly reason.

The programmer solution? Build an app.

So I did. Redact — upload your statement, type the keywords you want to keep (like TFL, Uber), and get back a clean PDF with everything else blacked out. Properly blacked out — not "draw a rectangle and hope nobody copy-pastes" blacked out.

Why PDFs Are Annoying to Redact

If you've ever tried to edit a PDF, you know it's not like editing a Word doc. Text in a PDF isn't stored as lines — it's stored as spans: individual text fragments, each with their own coordinates.

A single transaction line like:

03 Mar   SAINSBURYS S/MKTS   £47.50

Is actually three separate objects:

  • "03 Mar" at position (50, 340)
  • "SAINSBURYS S/MKTS" at position (120, 340)
  • "£47.50" at position (450, 340)

This makes "find the line with Sainsbury's and delete it" surprisingly non-trivial.
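To make the span structure concrete, here is a minimal sketch of flattening the nested blocks, lines and spans dictionary that PyMuPDF's page.get_text("dict") returns into flat records. The helper name flatten_spans and the choice of the bounding box's top edge as y are assumptions made for illustration, not necessarily what Redact does, and the sample dictionary is hand-built with made-up coordinates rather than read from a real statement.

```python
from typing import Any, Dict, List


def flatten_spans(page_dict: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a PyMuPDF-style text dict into a list of span records."""
    records = []
    for block in page_dict.get("blocks", []):
        # Image blocks carry no "lines" key, so they are skipped here.
        for line in block.get("lines", []):
            for span in line.get("spans", []):
                x0, y0, x1, y1 = span["bbox"]
                # Assumption: use the top edge of the bbox as the span's y.
                records.append({"text": span["text"], "y": y0, "bbox": span["bbox"]})
    return records


# Hand-built sample in the shape of page.get_text("dict"); coordinates are made up.
sample_page = {
    "blocks": [
        {"lines": [
            {"spans": [{"text": "03 Mar", "bbox": (50, 340, 90, 352)}]},
            {"spans": [{"text": "SAINSBURYS S/MKTS", "bbox": (120, 340, 260, 352)}]},
            {"spans": [{"text": "£47.50", "bbox": (450, 340, 490, 352)}]},
        ]},
    ]
}

spans = flatten_spans(sample_page)
for s in spans:
    print(s["text"], s["y"])
```

Nothing in these records says the three fragments belong together. The only link between them is that they share roughly the same y value.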

The Approach: Keep What Matters, Nuke the Rest

Most people think of redaction as "hide the bad parts." That's a blacklist — and it's fragile. Miss one thing and your data leaks.

I flipped it. Whitelist the transactions you want to keep, redact everything else. The default is private.

The engine uses PyMuPDF to process every span in the transaction section:

import fitz  # PyMuPDF

# all_spans: span dicts (with 'text', 'y', 'bbox') from the transaction section of `page`
for span_data in all_spans:
    # Join the text of every span within 2 points of this span's y-coordinate
    same_line_text = ' '.join(
        s['text'] for s in all_spans
        if abs(s['y'] - span_data['y']) < 2
    ).lower()

    # Keep the span if any whitelisted keyword appears anywhere on its line
    should_keep = any(
        keyword.lower() in same_line_text for keyword in keep_keywords
    )

    if not should_keep:
        page.add_redact_annot(fitz.Rect(span_data['bbox']), fill=(0, 0, 0))

page.apply_redactions()  # Permanently removes the text

The key trick is same-line detection — when you search for "TFL", the keyword might be in one span while the date and amount are in separate spans. By grouping everything within 2 Y-coordinate points, we treat the whole line as a unit.

And apply_redactions() is crucial — it permanently destroys the underlying text. No copy-paste recovery. The data is gone.
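The same-line grouping can be tried without a PDF at all. The sketch below is a hedged, standalone version of the keep-list decision: the function name spans_to_redact, the tolerance parameter and the sample rows are hypothetical, and it only returns the spans that would be blacked out instead of calling PyMuPDF.

```python
from typing import Any, Dict, List


def spans_to_redact(
    all_spans: List[Dict[str, Any]],
    keep_keywords: List[str],
    tolerance: float = 2.0,
) -> List[Dict[str, Any]]:
    """Return the spans whose line contains none of the keep keywords."""
    keywords = [k.lower() for k in keep_keywords]
    doomed = []
    for span in all_spans:
        # Rebuild the visual line: every span within `tolerance` points on y.
        line_text = " ".join(
            s["text"] for s in all_spans if abs(s["y"] - span["y"]) < tolerance
        ).lower()
        if not any(k in line_text for k in keywords):
            doomed.append(span)
    return doomed


# Made-up rows; the slightly different y values mimic real-world jitter.
statement = [
    {"text": "03 Mar", "y": 340.0},
    {"text": "SAINSBURYS S/MKTS", "y": 340.4},
    {"text": "£47.50", "y": 340.0},
    {"text": "04 Mar", "y": 356.0},
    {"text": "TFL TRAVEL CHARGE", "y": 356.3},
    {"text": "£8.10", "y": 356.0},
]

redacted = spans_to_redact(statement, keep_keywords=["tfl", "uber"])
print([s["text"] for s in redacted])
```

With an empty keyword list every span comes back, which is the "default is private" behaviour. One caveat of plain substring matching: a short keyword can also match inside a longer merchant name, so an unrelated line may survive.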

Dealing with Different Providers

AMEX formats dates as Mar 03. Barclaycard uses 03 Mar. Transaction sections have different headers. So the tool has a provider config system:

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ProviderConfig:
    name: str
    date_patterns: List[Tuple[str, str]]
    currency_patterns: List[str]
    section_headers: List[str]

Upload any supported statement and it auto-detects the provider from the PDF content. Currently supports American Express and Barclaycard.
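As a rough illustration of how such a config might be filled in and used, here is a sketch under several assumptions: each date_patterns tuple is taken to be a regex plus a strptime-style format, the section header strings and currency regex are placeholders rather than the real statement wording, and detect_provider simply looks for the provider's name in the extracted text. The actual tool may detect providers quite differently.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ProviderConfig:
    name: str
    date_patterns: List[Tuple[str, str]]  # assumed: (regex, strptime-style format)
    currency_patterns: List[str]
    section_headers: List[str]


# All values below are illustrative placeholders, not the tool's real configuration.
PROVIDERS = [
    ProviderConfig(
        name="American Express",
        date_patterns=[(r"\b[A-Z][a-z]{2} \d{2}\b", "%b %d")],  # e.g. "Mar 03"
        currency_patterns=[r"£\d[\d,]*\.\d{2}"],
        section_headers=["New transactions"],
    ),
    ProviderConfig(
        name="Barclaycard",
        date_patterns=[(r"\b\d{2} [A-Z][a-z]{2}\b", "%d %b")],  # e.g. "03 Mar"
        currency_patterns=[r"£\d[\d,]*\.\d{2}"],
        section_headers=["Your transactions"],
    ),
]


def detect_provider(pdf_text: str) -> Optional[ProviderConfig]:
    """Return the first provider whose name appears in the extracted text."""
    lowered = pdf_text.lower()
    for provider in PROVIDERS:
        if provider.name.lower() in lowered:
            return provider
    return None


provider = detect_provider("Barclaycard statement\n03 Mar  TFL TRAVEL CHARGE  £8.10")
date_regex, _ = provider.date_patterns[0]
print(provider.name, re.search(date_regex, "03 Mar  TFL TRAVEL CHARGE").group())
```

Keeping the per-provider quirks in data means that supporting a new bank should mostly be a matter of adding another entry to the list, not another branch in the engine.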

There's also an Enhanced Privacy mode that goes further — redacting your name, address, account number, credit limit, and balance. For when you want the absolute minimum shared.

Security

No data is stored. Your PDF is processed in memory, served back, and deleted. No accounts, no database, no copies. The entire thing is stateless.

Try It

It's free at redact.javascriptbit.com.

Next time you need to submit a statement, don't give them the director's cut. Give them the highlights reel.