In yet-another-model-announcement, recently, Anthropic released their new version of Claude – 3.7 Sonnet.
It’s got a coding assistant, a much longer context window (128,000 tokens – equivalent to 120 pages of A4 paper with Arial 11 font. It’s a lot.) and it’s, apparently, much smarter than the last version of Claude, which I already think was the smartest LLM on the market.
For us, however, it gets us really close to solving a problem that was previously not-cost-effective-enough to solve: reading receipts.
We have lots of PDFs and scans and receipts to read. There are lots of use-cases:
– Read scanned receipts (and invoices) and categorize expenses (automating bookkeeping and expense sorting)
– Read long PDFs with annotations and handwriting (also usually with respect to bookkeeping but also lots of businesses have their contracts and other info in PDFs).
– Turn PDFs with highly specific data into structure data (such as health insurance documents)
The big one, though, is receipts and invoices. It’s a hard problem to solve because:
1. Usually receipts are crumpled.
2. Usually they’re not printed with good ink so the ink fades.
3. Usually they’re too long to fit on one page so they are on multiple pages which need collated.
4. Usually they’re poorly laid out so it’s difficult for a computer to read them accurately.
General solutions right now usually can get you to around 70% accuracy, which is fine for a demo but not fine for production.
In production, we want human-level or (ideally) better-than-human-level accuracy.
Claude 3.7-Sonnet, however, is solving the problem for us.
Obviously, this requires extensive testing before we put the solution into a production environment, but all indicators point toward a production release for our clients with those needs.