Your tax software validates math. It doesn't validate truth.

Leroy Kerry | CEO at Filed

Every tax software package has diagnostics. They run after you finish the return. They check math. They flag missing fields. And if you have ever sat through a review of a return that passed every diagnostic and still had a $13,500 error on line 1, you already know the problem.

The diagnostics passed because they were never designed to catch that kind of mistake.

This is not a knock on tax software. CCH, UltraTax, Lacerte, Drake, ProSeries, ProConnect. They all do what they are supposed to do. They validate structure. They make sure the numbers on the return are internally consistent. If Schedule C flows to line 8, and line 8 flows to page 2, and page 2 produces the right tax, the diagnostic says everything is fine.

What diagnostics actually check

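Tax software diagnostics are calculators. They are very good calculators. They verify that the math adds up, that required fields are filled in, and that amounts flow consistently from one form or schedule to the next.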

All of this is useful. None of it tells you whether the return reflects reality.

A preparer types $15,000 in wages instead of $1,500. The math still works. Schedule 1 flows to page 1. The tax calculates. The diagnostic runs clean. The return goes to the reviewer, who now has to catch a $13,500 data entry error by manually comparing the return against the W-2. If they are reviewing 30 returns that week, maybe they catch it. Maybe they don't.

Tax software checks whether the math is internally consistent. It does not check whether the numbers are true.
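To make the gap concrete, here is a minimal sketch in Python of a structure-only diagnostic. The field names (wages, other_income, total_income) and the run_diagnostic function are hypothetical and are not taken from any tax package. The only point is that a check of internal consistency comes back clean for either wage figure.

```python
from decimal import Decimal


def build_return(wages: Decimal, other_income: Decimal) -> dict:
    """Hypothetical, heavily simplified return: total income is the sum of two lines."""
    return {
        "wages": wages,
        "other_income": other_income,
        "total_income": wages + other_income,
    }


def run_diagnostic(ret: dict) -> list:
    """Structure-only checks: required fields are present and the math adds up."""
    issues = []
    for field in ("wages", "other_income", "total_income"):
        if ret.get(field) is None:
            issues.append(f"missing field: {field}")
    if not issues and ret["wages"] + ret["other_income"] != ret["total_income"]:
        issues.append("total_income does not equal the sum of its lines")
    return issues


correct = build_return(Decimal("1500"), Decimal("200"))
mistyped = build_return(Decimal("15000"), Decimal("200"))

# Both returns are internally consistent, so both diagnostics come back empty.
print(run_diagnostic(correct))
print(run_diagnostic(mistyped))
```

Nothing in run_diagnostic ever looks at the W-2, so it has no way to tell the two returns apart.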

This is the gap. It has always been the gap. And for most firms, the only thing standing between that gap and a signed return is the senior reviewer's attention span during busy season.

What "document-first" means

Document-first is a simple idea. Before you evaluate the return, read the source documents. Then compare them. The W-2 says $1,500 in box 1. The return says $15,000 on line 1. That is a discrepancy. Flag it, cite it, and tell the preparer exactly what needs to change before the reviewer ever opens the file.

This is what Filed does. Our AI reads the same source documents the team reads. W-2s, 1099s, K-1s, brokerage statements, prior-year returns. It reads the draft return from whatever tax software the firm uses. And then it compares them, line by line, field by field.

When it finds a mismatch, it generates a Review Action List. A specific, cited queue of issues with the source document reference, the return field, and what the expected value should be. The preparer resolves the list before the return reaches the reviewer.
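As a rough illustration of what such a list could look like as data, here is a sketch in Python. The ReviewAction class, the build_review_actions function, and the field labels are assumptions made for this example and do not describe Filed's actual schema. Each item carries the three things named above: the source document reference, the return field, and the expected value.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ReviewAction:
    source_reference: str             # where the value was read from
    return_field: str                 # where it was entered on the return
    return_value: Optional[Decimal]   # what the draft return shows (None if blank)
    expected_value: Decimal           # what the source document supports


def build_review_actions(source_values, return_values, field_map):
    """Compare each mapped source value against the return and list the mismatches.

    field_map maps a source reference to a return field (illustrative labels only).
    """
    actions = []
    for source_ref, return_field in field_map.items():
        expected = source_values[source_ref]
        entered = return_values.get(return_field)
        if entered != expected:
            actions.append(ReviewAction(source_ref, return_field, entered, expected))
    return actions


source_values = {"W-2 box 1": Decimal("1500"), "1099-INT box 1": Decimal("42")}
return_values = {"line 1": Decimal("15000"), "line 2b": Decimal("42")}
field_map = {"W-2 box 1": "line 1", "1099-INT box 1": "line 2b"}

actions = build_review_actions(source_values, return_values, field_map)
for a in actions:
    print(f"{a.return_field}: return shows {a.return_value}, "
          f"{a.source_reference} shows {a.expected_value}")
```

In this example only the wage line is listed, because the interest amount on the return already matches its source document.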

How the multi-agent system works

Filed is not a single model reading a document and guessing. It uses multiple specialized agents, each responsible for a different part of the review. This matters because tax documents are messy. A brokerage statement from Schwab looks nothing like one from Fidelity. A K-1 from a real estate partnership has different relevant fields than a K-1 from an S-corp. One model trying to do everything will make mistakes that a specialized system will not.

Here is how the agents work together:

Read: source documents come in (W-2, 1099-INT, K-1, 1099-NEC, prior-year return).
Extract: extraction agents pull structured values from each document type.
Parse: return parsing agents map every return value to its line item.
Compare: cross-check agents compare source values against return values.
Filter: the validation layer runs deterministic checks and filters false positives.
Output: the Review Action List gives the preparer cited discrepancies.

The key word in all of this is "deterministic." When Filed says the W-2 shows $1,500 and the return shows $15,000, that is not a guess. It is a comparison of two values. The multi-agent system handles the hard part (reading messy documents) and the validation layer handles the straightforward part (comparing numbers). The combination is what makes the system reliable enough to use in production.
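A deterministic validation step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Filed's implementation: the validate function and the candidate tuples are hypothetical, and the one false-positive rule shown (a source amount in cents that rounds to the whole-dollar amount on the return) is an example we chose, not a documented Filed rule.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounds_to_same_dollar(source: Decimal, entered: Decimal) -> bool:
    """True when the source amount, rounded to whole dollars, equals the entered amount."""
    return source.quantize(Decimal("1"), rounding=ROUND_HALF_UP) == entered


def validate(candidates):
    """Keep only candidates whose values genuinely differ.

    Each candidate is a (label, source_value, entered_value) tuple, assumed to
    come from upstream extraction and parsing steps.
    """
    confirmed = []
    for label, source, entered in candidates:
        if source == entered:
            continue  # exact match, nothing to flag
        if rounds_to_same_dollar(source, entered):
            continue  # assumed false positive: whole-dollar rounding only
        confirmed.append((label, source, entered))
    return confirmed


candidates = [
    ("W-2 box 1 vs line 1", Decimal("1500.00"), Decimal("15000")),
    ("1099-INT box 1 vs line 2b", Decimal("41.62"), Decimal("42")),
    ("1099-NEC box 1 vs Schedule C", Decimal("8000.00"), Decimal("8000")),
]

for label, source, entered in validate(candidates):
    print(f"{label}: source {source}, return {entered}")
```

Given the same inputs, this step always produces the same output. Decimal is used instead of float so that dollar amounts compare exactly.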

Why this matters for review cycles

Most firms structure review the same way. Preparer does the return. Reviewer checks it. If there are errors, it goes back to the preparer. The preparer fixes it. The reviewer checks again. Repeat until clean.

The problem is that the first pass usually catches data entry errors. Not tax judgment issues. Not complex planning questions. Data entry. The reviewer spends their time verifying that numbers were typed correctly instead of evaluating whether the return is optimized.

When those data entry errors are caught before the return reaches the reviewer, two things happen:

Review cycles get shorter. Firms using Filed Reviewer report 64% shorter review cycles. That is not because the reviewer works faster. It is because the return arrives cleaner. Fewer kickbacks. Fewer rounds. The reviewer opens a return and can focus on the things that actually require their expertise.

Reviewers do higher-value work. When reviewers are not spending their time on data verification, they can spend it on tax planning, client advisory, and training junior staff. The 42 minutes saved per return add up across a season. For a firm processing 5,000 returns, that is 3,500 hours back. Not hypothetical hours. Hours that were previously spent comparing W-2s against line 1. See how firms are using that time.
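The arithmetic behind the 3,500 hours is easy to rerun with your own volume. The snippet below uses the two figures quoted above; the variable names are ours.

```python
MINUTES_SAVED_PER_RETURN = 42   # figure quoted above
RETURNS_PER_SEASON = 5_000      # example firm size quoted above

total_minutes = MINUTES_SAVED_PER_RETURN * RETURNS_PER_SEASON
hours_saved = total_minutes / 60

print(total_minutes)  # 210000
print(hours_saved)    # 3500.0
```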

Reviewers are for tax judgment, not data verification. The question is whether your workflow reflects that.

What this is not

This is not about replacing reviewers. The reviewer's job is to apply tax judgment. To look at a return and ask whether the client is positioned well, whether there are elections or strategies being missed, whether the filing position is defensible. That work requires experience and it requires a human.

What it does not require is manually cross-referencing a brokerage statement against Schedule D for the 14th time today. That is the work Filed takes off the table.

This is also not about trusting AI blindly. Filed shows its work. Every discrepancy comes with a citation to the source document and the return field. The preparer can verify the finding before making any change. The reviewer can see what was flagged and what was resolved. There is a full audit trail.

CPAs do not want to trust AI. They want to verify it (see common questions about accuracy and security). That is the right instinct, and the system is built for it.

See the Review Action List in your returns

Filed Reviewer works with any tax software, including CCH, UltraTax, Lacerte, ProSeries, Drake, and ProConnect. No integration work. No workflow changes. Ready in a day.
