A guided workflow for preparing US federal income tax returns. This skill covers all filer types — US citizens, resident aliens (RA), and nonresident aliens (NRA) — by first determining the correct filer type, then routing to the appropriate forms and procedures. Both citizen/RA and NRA workflows are fully self-contained in this skill, including PDF form field mappings, cross-form validation, and the safe update_form.py script.
Before anything else, ask the user what documents they have. Common source docs:
| Document | What it tells you |
|---|---|
| W-2 | Wages, federal/state tax withheld, employer HSA contributions |
| 1099-NEC | Contractor / self-employment income |
| 1099-INT | Bank interest |
| 1099-DIV | Dividends (qualified and ordinary) |
| 1099-B | Stock/crypto sales (proceeds and cost basis) |
| 1099-MISC | Other income (royalties, rents, etc.) |
| 1099-SA / 5498-SA | HSA distributions and contributions |
| 1098 | Mortgage interest paid |
| 1098-T | Tuition paid (education credits) |
| I-94 | Travel history (needed for NRA determination) |
If the user has an I-94 or mentions a visa type, that's a strong signal they may be NRA — proceed to Step 2 with that in mind.
This is the critical routing decision. Read references/filing-status.md for the full decision tree. The short version:
Ask the user directly if unclear. Don't assume.
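As a rough illustration of one input to this routing decision, the substantial presence test counts every day in the current year, one third of the days in the prior year, and one sixth of the days in the year before that; it needs at least 31 days in the current year and a weighted total of at least 183. The sketch below is an assumption-laden helper, not the decision tree in references/filing-status.md: the function and argument names are hypothetical, and the inputs are assumed to be countable days only, with days excluded for exempt individuals (the days reported on Form 8843) already removed. It ignores the green card test, the closer-connection exception, and treaty tie-breakers.

```python
def meets_substantial_presence(days_current: int, days_prior_1: int, days_prior_2: int) -> bool:
    """Return True if the day counts satisfy the substantial presence test.

    Hypothetical helper: inputs must already exclude exempt-individual days.
    """
    # At least 31 countable days in the current year are required
    if days_current < 31:
        return False
    # Weighted total = current + prior_1 / 3 + prior_2 / 6 >= 183.
    # Multiply through by 6 so the comparison stays in exact integers.
    weighted_times_six = 6 * days_current + 2 * days_prior_1 + days_prior_2
    return weighted_times_six >= 6 * 183


print(meets_substantial_presence(122, 122, 122))  # True: 122 + 40.67 + 20.33 = 183
print(meets_substantial_presence(120, 120, 120))  # False: 120 + 40 + 20 = 180
```

A result of True here only says the day count points toward resident status; the I-94 travel history and visa type still need to be checked against the full decision tree before routing.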
Read references/form-routing.md to determine which schedules and forms are needed based on the user's income types. For field-level details on individual schedule lines and common pitfalls, read references/common-schedules.md when filling specific forms.
references/filing-status.md)
references/form-routing.md
scripts/update_form.py (bundled with this skill) or write equivalent code following the three critical rules:
auto_regenerate=False

This section covers the complete NRA filing workflow. For NRA-specific field-to-line mappings, see references/form-field-maps.md. For PDF recovery procedures, see references/pypdf-recovery.md.
These rules prevent data loss. Violating them will corrupt PDF files. The bundled scripts/update_form.py enforces all three automatically — use it instead of writing update logic from scratch.
auto_regenerate=False when calling update_page_form_field_values(). The default True removes /AP (appearance stream) entries from each field. Without appearance streams, some PDF viewers render the field as blank even though the /V value is correct: the data is there but invisible.
page.get("/Annots") → annot.get("/V")
/Fields array from page annotations
references/pypdf-recovery.md when you see this symptom; it has the full step-by-step repair procedure

A bundled script at scripts/update_form.py encodes all three critical rules above plus post-write verification. Use it for all form updates:
```bash
# CLI usage: fix a field
python scripts/update_form.py Form1040NR.pdf /tmp/Form1040NR_fixed.pdf --set "f1_53=5000"

# Fix multiple fields and clear one
python scripts/update_form.py Form8843.pdf /tmp/Form8843_fixed.pdf \
    --set "f1_14=338" "f1_17=338" --clear "f1_15"
```

```python
# Or import as a library in your own script
import shutil

from scripts.update_form import update_form

update_form("Form.pdf", "/tmp/Form_fixed.pdf", {"f1_53": "5000"}, clear_fields=["f1_65"])
shutil.copy("/tmp/Form_fixed.pdf", "Form.pdf")  # only then overwrite original
```
The script automatically verifies that fields survived the write and warns if the output looks corrupted. Ensure pypdf is available: pip install pypdf --break-system-packages.
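One way to picture the post-write verification is as a comparison of field snapshots taken before and after the write. The sketch below is not the bundled script's actual logic: `verify_write` and its arguments are hypothetical names, and the snapshots are assumed to be plain dicts keyed by the same short field names used in the updates (for example, the output of `get_form_text_fields()` after stripping the `[0]` suffix).

```python
def verify_write(before, after, updates, clear_fields=()):
    """Return a list of problems; an empty list means the write looks sane.

    Hypothetical helper: `before` and `after` map field name -> value.
    """
    problems = []
    # Expected state: original values, with updates applied and cleared fields emptied
    expected = dict(before)
    expected.update(updates)
    for name in clear_fields:
        expected[name] = ""
    for name, want in expected.items():
        if name not in after:
            # A field that vanished is the classic sign of a corrupted write
            problems.append(f"{name}: field missing after write")
        elif (after[name] or "") != (want or ""):
            # None and "" are both treated as an empty field
            problems.append(f"{name}: expected {want!r}, found {after[name]!r}")
    return problems


before = {"f1_53": "100", "f1_65": "7", "f1_01": "Ann"}
after = {"f1_53": "5000", "f1_65": "", "f1_01": "Ann"}
print(verify_write(before, after, {"f1_53": "5000"}, clear_fields=["f1_65"]))  # []
```

Checking untouched fields as well as updated ones matters here, because the failure mode described above is fields silently disappearing rather than a single value being wrong.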
Before modifying any form, always extract and map fields first.
Step 1: Extract all field names and values
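```python
from pypdf import PdfReader

reader = PdfReader("Form.pdf")
fields = reader.get_form_text_fields()  # text fields only: {name: value}
for name, value in sorted(fields.items()):
    # Keep the last path segment and drop the "[0]" index suffix
    short = name.split(".")[-1].replace("[0]", "")
    print(f"{short} = {value}")
```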
Step 2: Map fields to line numbers via Y-position
Before this step, read references/form-field-maps.md for the expected field-to-line table — it covers 1040-NR, 8843, Schedule NEC, Schedule OI, Form 8833, Form 8889, and Schedule 1. Use it as a reference while verifying the Y-position analysis below.
IRS PDFs use positional layout. Extract annotation rectangles to determine which line a field corresponds to:
```python
page = reader.pages[0]  # `reader` is the PdfReader opened in Step 1
annots = page.get("/Annots") or []  # a page without form fields has no /Annots
field_positions = []
for annot_ref in annots:
    annot = annot_ref.get_object()
    t = str(annot.get("/T", ""))
    v = annot.get("/V", "")
    rect = annot.get("/Rect", [])  # [x0, y0, x1, y1] in PDF points
    ft = str(annot.get("/FT", ""))
    if ft == "/Tx":  # text fields only
        y = float(rect[1]) if rect else 0
        x = float(rect[0]) if rect else 0
        field_positions.append((y, x, t, v))

# Sort by Y descending (top of page to bottom, matching line order), then X ascending
for y, x, t, v in sorted(field_positions, key=lambda p: (-p[0], p[1])):
    short = t.split(".")[-1].replace("[0]", "")
    print(f"Y={y:.0f} X={x:.0f} {short} = {v}")
```
Compare the Y-position ordering against the physical form layout to create a definitive field-to-line map.
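To make that comparison easier, fields that share roughly the same Y coordinate can be grouped into rows before matching them to form lines. This is a minimal sketch under stated assumptions: `group_into_rows` is a hypothetical helper, it takes `(y, x, name, value)` tuples in the shape of `field_positions`, and the 2-point tolerance is a guess that may need tuning per form.

```python
def group_into_rows(field_positions, tolerance=2.0):
    """Group (y, x, name, value) tuples into rows, top of page first.

    Hypothetical helper: `tolerance` is the maximum Y gap (in PDF points)
    between a field and the first field of its row.
    """
    rows = []
    # Higher Y is closer to the top of the page, so sort Y descending
    for item in sorted(field_positions, key=lambda p: (-p[0], p[1])):
        if rows and abs(rows[-1][0][0] - item[0]) <= tolerance:
            rows[-1].append(item)
        else:
            rows.append([item])
    # Within each row, order fields left to right
    return [sorted(row, key=lambda p: p[1]) for row in rows]


sample = [
    (700.0, 300.0, "f1_02", ""),
    (700.5, 100.0, "f1_01", ""),
    (680.0, 100.0, "f1_03", ""),
]
for row in group_into_rows(sample):
    print([name for _, _, name, _ in row])
# ['f1_01', 'f1_02']
# ['f1_03']
```

Each printed row can then be checked against one physical line of the form and against the table in references/form-field-maps.md.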
Step 3: Check checkboxes and radio buttons
```python
all_fields = reader.get_fields()
for name, field in sorted(all_fields.items()):
    v = field.get("/V", "")
    ft = field.get("/FT", "")
    if ft == "/Btn":  # checkboxes and radio buttons
        short = name.split(".")[-1].replace("[0]", "")
        print(f"{short} = {v} (button)")
```
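When reading the button values printed above, it can help to reduce each one to checked or unchecked. The sketch below assumes the common PDF convention that an unchecked button has the value `/Off` (or no value at all) and that any other name is an "on" state; the on-state name varies by form, so `is_checked` is a hypothetical helper rather than a rule taken from this skill.

```python
def is_checked(value) -> bool:
    """Treat a missing, empty, or "/Off" value as unchecked; anything else as checked."""
    return str(value or "") not in ("", "/Off")


for raw in ("/1", "/Off", "", None):
    print(repr(raw), "->", is_checked(raw))
```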
A typical NRA (F-1 OPT) filing includes these forms. See references/form-field-maps.md for complete field-to-line mappings.
| Form | Purpose | Key Fields |
|---|---|---|
| 1040-NR | Main return | Income lines, AGI, tax, withholding, refund |
| Schedule 1 | Additional income/adjustments | Contractor income (Line 8h), HSA deduction |
| Schedule NEC | Tax on non-effectively-connected income | Dividends, capital gains, NEC tax |
| Schedule OI | Other information | Visa type, country, treaty claims, days present |
| Form 8843 | Statement for exempt individuals | Days of presence, visa status, exclusion days |
| Form 8833 | Treaty-based return position | Treaty article, exemption amount |
| Form 8889 | HSA | Contributions, employer contributions, deduction |
references/form-field-maps.md
scripts/update_form.py (different output path!)

After filling, validate these consistency checks:
Watch for these — they are the most frequent mistakes when auto-filling:
This section covers the US-China treaty as a concrete example. Similar treaties exist for other countries (e.g., India Article 21(2), South Korea Article 21(1)) — verify article numbers and rates against the specific treaty if your country differs.
For Chinese nationals on F-1 visa:
auto_regenerate=False, iterate all pages)