---
name: citation-audit
description: >
Systematic audit of academic manuscript references: authenticity verification,
bibliographic accuracy, citation appropriateness, and software/data version
consistency. Triggers on: citation audit, reference check, bibliography
verification, fabricated/fake/hallucinated reference detection, DOI verification,
pre-submission check, manuscript review, R/Python package version consistency,
data source citation, checking if a paper is real, 文献审查, 引用审查, 参考文献检查,
伪造文献, 投稿前检查, DOI核对, 软件版本核对, 数据源引用.
Applicable file types: .docx, .tex, .bib, .ris, .enl, .nbib, manuscript files.
---
# Citation Audit
Systematic audit of all references in an academic manuscript before submission.
## Audit Dimensions
| Level | Scope | Severity |
| ----- | ----- | -------- |
| **L1** | Authenticity – does the paper exist? Is the DOI correct? | 🔴 Fatal |
| **L2** | Bibliographic accuracy – authors, year, volume, pages, journal | 🔴 Critical |
| **L3** | Text–list consistency – every in-text citation has a matching entry and vice versa | 🟡 Important |
| **L4** | Citation appropriateness – each citation supports the claim it is attached to | 🟡 Improvement |
| **L5** | Formatting & version consistency – style uniformity, software/data versions match actual usage | ⚪ Housekeeping |
## Workflow
### Phase 1: Extract manuscript text
Extract all text with paragraph indices for cross-referencing. See `scripts/extract_docx.py`.
For `.tex` files, parse directly. For `.docx`, use the python-docx library. Separate the **reference list** from the **body text** and index each entry.
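As a sketch, the body/reference-list split can look like this once paragraph text has been pulled out (e.g. from `Document(path).paragraphs` in python-docx); the heading names that mark the start of the reference section are assumptions and should be adapted to the manuscript:

```python
# Split extracted paragraphs into body text and reference entries.
# Assumes the reference section begins at a heading such as "References".
REF_HEADINGS = {"references", "bibliography", "literature cited"}

def split_manuscript(paragraphs):
    """Return (body, references), each a list of (paragraph_index, text) pairs."""
    body, refs, in_refs = [], [], False
    for i, text in enumerate(paragraphs):
        if text.strip().lower() in REF_HEADINGS:
            in_refs = True          # everything after this heading is a reference entry
            continue
        (refs if in_refs else body).append((i, text.strip()))
    return body, refs

paras = [
    "Introduction",
    "Climate shapes species ranges (Smith et al., 2020).",
    "References",
    "Smith, J., Lee, K., & Park, H. (2020). Range shifts. J. Ecol., 108, 1-10.",
]
body, refs = split_manuscript(paras)
```

Keeping the paragraph index with each entry makes later phases (e.g. the text–list cross-check) easy to report against the original document.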
### Phase 2: L1 – Authenticity verification
> [!CAUTION]
> AI-assisted writing frequently introduces "hallucinated" references – DOIs that resolve to unrelated papers, or entirely fabricated entries. This is the most severe error class.
**Method: CrossRef API + web search dual verification**
1. Run `scripts/crossref_batch_check.py` to batch-query CrossRef API metadata.
2. **Mandatory web-search re-verification** for:
- Entries where API results mismatch the manuscript
- Connection errors or timeouts
- Papers published within the last 1–2 years (CrossRef indexing lag)
- Any citation that "looks too perfect" but cannot be independently found
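A minimal sketch of the per-entry comparison step, assuming each record has already been fetched from the CrossRef REST API (`https://api.crossref.org/works/{doi}`). The record fields mirror CrossRef's `message` schema; the 0.85 title-similarity threshold and the 2-year tolerance are assumptions, not fixed rules:

```python
# Compare a manuscript entry against a fetched CrossRef record and
# return mismatch flags; an empty list means the entry looks clean.
from difflib import SequenceMatcher

def title_similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def check_entry(entry, record, threshold=0.85):
    flags = []
    if title_similarity(entry["title"], record["title"][0]) < threshold:
        flags.append("TITLE_MISMATCH")      # DOI may resolve to an unrelated paper
    rec_year = record["issued"]["date-parts"][0][0]
    if abs(entry["year"] - rec_year) > 2:   # >2 years: likely a real error, not Early Online lag
        flags.append("YEAR_MISMATCH")
    rec_surnames = {a["family"].lower() for a in record["author"]}
    if entry["first_author"].lower() not in rec_surnames:
        flags.append("AUTHOR_MISMATCH")
    return flags

entry = {"title": "Range shifts under climate change", "year": 2020, "first_author": "Smith"}
record = {
    "title": ["Range shifts under climate change"],
    "issued": {"date-parts": [[2020]]},
    "author": [{"family": "Smith"}, {"family": "Lee"}],
}
flags = check_entry(entry, record)
```

Any non-empty flag list should route the entry into the mandatory web-search re-verification above rather than being auto-rejected.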
**Red flags for fabricated references:**
- DOI resolves to an unrelated paper
- Author + year + journal combination yields zero Google Scholar results
- Claims to cite a "preprint" but provides a formal journal DOI
**Verification chain for suspicious entries:**
1. Resolve the DOI directly and check that the title and authors match
2. Google Scholar: search author + keywords
3. Author's personal page / ORCID publication list
4. Journal website: browse the table of contents for the cited volume/issue
### Phase 3: L2 – Bibliographic accuracy
Check every entry against its verified source for:
| Field | Common errors |
| ----- | ------------- |
| Authors | Missing co-authors (especially the 4th and later), wrong initials (G.H. vs C.H.) |
| Year | Early Online vs. official publication date confusion |
| Journal | Abbreviated vs. full name inconsistency |
| Volume/Pages | Mismatch with the DOI record |
| DOI | Placeholder not replaced (e.g. `zenodo.XXXXXXX`), points to the wrong article |
### Phase 4: L3 – Text–list cross-check
1. Extract all `(Author, Year)` and `(Author et al., Year)` citations from the body text.
2. Match bidirectionally:
   - **In text but not in list** = missing reference (must add)
   - **In list but not in text** = orphan reference (delete or cite)
3. Pay special attention to data sources, software packages, and datasets that are mentioned in the text but absent from the reference list.
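The bidirectional match can be sketched with a regular expression; this pattern covers only the `(Author, Year)` and `(Author et al., Year)` forms and is an assumption about the manuscript's citation style (numeric and multi-author styles need different patterns):

```python
import re

# Matches "(Smith, 2020)" and "(Smith et al., 2020)" style in-text citations.
CITE_RE = re.compile(r"\(([A-Z][A-Za-z\-]+)(?: et al\.)?,\s*(\d{4})\)")

def cross_check(body_text, ref_keys):
    """ref_keys: set of (surname, year) pairs built from the reference list."""
    cited = set(CITE_RE.findall(body_text))
    missing = cited - ref_keys   # in text but not in list: must add
    orphans = ref_keys - cited   # in list but not in text: delete or cite
    return missing, orphans

body = "Ranges shift poleward (Smith et al., 2020) and upslope (Chen, 2011)."
refs = {("Smith", "2020"), ("Parmesan", "2006")}
missing, orphans = cross_check(body, refs)
```

Each set should then be reviewed by hand, since narrative citations ("as Smith (2020) showed") escape this pattern.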
### Phase 5: L4 – Citation appropriateness
Evaluate each citation:
- Does it directly support the claim it is attached to?
- Is there a more canonical or more recent alternative?
- Excessive self-citation or citation stacking?
### Phase 6: L5 – Formatting & version consistency
#### Style uniformity
- "et al." usage, punctuation, spacing
- Author name ordering for multi-work citations
#### Software & package version verification
> [!IMPORTANT]
> The manuscript MUST report the actual software versions used for the analysis, not the latest CRAN/PyPI versions.
**R environment:**
```r
pkgs <- c('ecospat', 'biomod2', 'terra', 'sf')
for (p in pkgs) {
  # tryCatch keeps the loop running if a listed package is not installed
  v <- tryCatch(as.character(packageVersion(p)),
                error = function(e) "NOT INSTALLED")
  cat(sprintf("%-12s %s\n", p, v))
}
cat(sprintf("%-12s %s\n", "R", R.version.string))
```
**Python environment:**
```python
# importlib.metadata (Python >= 3.8) replaces the deprecated pkg_resources
import sys
from importlib.metadata import version, PackageNotFoundError

for p in ['numpy', 'pandas', 'scikit-learn', 'tensorflow']:
    try:
        print(f"{p:20s} {version(p)}")
    except PackageNotFoundError:
        print(f"{p:20s} NOT INSTALLED")
print(f"{'Python':20s} {sys.version.split()[0]}")
```
**Other environments** (Julia, MATLAB, etc.): adapt the pattern to query installed package versions.
**Cross-check steps:**
1. Search project scripts for all `library()` / `import` / `using` calls.
2. Query actual installed versions in the runtime environment.
3. Compare with versions stated in the manuscript and reference list.
4. Flag packages mentioned in the manuscript but never called in any script (may indicate a methodâcode mismatch).
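Step 1 can be sketched as a regex scan over project scripts; these patterns cover common R `library()`/`require()` and Python `import`/`from` forms, and will miss dynamic loading (e.g. R's `do.call` or Python's `importlib`):

```python
import re

# Package-loading patterns for R and Python source files.
PATTERNS = [
    re.compile(r"""(?:library|require)\(["']?([A-Za-z][\w.]*)["']?\)"""),  # R
    re.compile(r"^\s*import\s+(\w+)", re.MULTILINE),                       # Python: import x
    re.compile(r"^\s*from\s+(\w+)", re.MULTILINE),                         # Python: from x import y
]

def packages_called(source):
    """Return the set of package names loaded in one script's source text."""
    found = set()
    for pat in PATTERNS:
        found.update(pat.findall(source))
    return found

r_script = "library(terra)\nrequire('sf')\n"
py_script = "import numpy\nfrom pandas import DataFrame\n"
pkgs = packages_called(r_script) | packages_called(py_script)
```

The resulting set is what gets diffed against the versions stated in the manuscript (steps 3–4).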
#### Data source & dataset citation
Verify that **every external data source** used in the analysis is properly cited:
| Data type | Examples | What to check |
| --------- | -------- | ------------- |
| Remote sensing | MODIS, Landsat, Sentinel | Product name, version, DOI or data center URL |
| Climate data | WorldClim, CHELSA, ERA5 | Version number, resolution, temporal coverage |
| Biodiversity records | GBIF, iNaturalist, VertNet | Download DOI, access date, query parameters |
| Geospatial layers | Natural Earth, GADM, OpenStreetMap | Version, access date |
| Genomic data | GenBank, SRA, ENA | Accession numbers |
| Statistical databases | World Bank, UN, national bureaus | Dataset name, access date, URL |
Common issues:
- Dataset is used in methods but has no reference entry
- DOI or accession number is a placeholder
- Version mismatch between what was downloaded and what is cited
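One way to sketch the completeness check, with required-field lists distilled from the table above (the field names are illustrative assumptions, not a formal standard):

```python
# Required citation fields per data type; placeholders like "XXXX" count as missing.
REQUIRED = {
    "remote_sensing": {"product", "version", "doi_or_url"},
    "climate":        {"version", "resolution", "coverage"},
    "biodiversity":   {"download_doi", "access_date", "query"},
}

def missing_fields(data_type, citation):
    """Return the required fields that are absent or left as placeholders."""
    needed = REQUIRED.get(data_type, set())
    return {f for f in needed
            if not citation.get(f) or "XXXX" in str(citation[f])}

cite = {"product": "MODIS MOD13Q1", "version": "6.1", "doi_or_url": "zenodo.XXXXXXX"}
gaps = missing_fields("remote_sensing", cite)
```

Here the unreplaced `zenodo.XXXXXXX` placeholder is flagged even though the field is technically filled in.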
## Output format
Generate a `citation_audit.md` report structured as:
```markdown
# Citation Audit Report
## 🔴 Must-fix errors
(Ordered: fabricated > missing > bibliographic)
## 🟡 Recommended improvements
(Appropriateness, formatting)
## ✅ Verified entries
(Full checklist with per-entry status)
```
## Key lessons
1. **Never trust CrossRef alone** – its "best match" is frequently wrong for books, chapters, datasets, non-English literature, and same-surname authors. Always re-verify with a web search.
2. **Year discrepancies need judgment** – "Early Online" and print dates can differ by 1–2 years; both are acceptable. Differences of more than 2 years likely indicate a real error.
3. **Methods must match code** – if the manuscript claims package X was used but the scripts call package Y, this is a reviewable error. Cross-check the Methods section against the actual scripts line by line.
4. **Data sources need citations too** – remote sensing products, climate databases, and biodiversity data portals all require proper citation with DOI, version, and access date.
## Anti-patterns
| Don't | Do instead |
| ----- | ---------- |
| Trust CrossRef blindly | CrossRef + web search dual verification |
| Ignore recent publications | Extra scrutiny for papers < 2 years old |
| Assume all DOIs are correct | Resolve every DOI and verify the target |
| Only check the reference list | Also cross-check body citations and code |
| Report everything at once | Triage by severity: fatal â critical â improvement |
| Skip data source citations | Verify every dataset, layer, and product is cited |