AI-Enhanced Scientific Writing

Transforming Quarto Analyses into Publication-Ready Narratives with Claude Code

Claude
AI
Writing
Quarto
Author

JM

Published

December 20, 2025

What is Quarto?

Quarto is an open-source scientific publishing system that enables researchers to weave together code, results, and narrative text into reproducible documents. Think of it as the evolution of R Markdown and Jupyter notebooks: a polyglot platform that works with R, Python, Julia, and other languages.

A Quarto project combines:

  • Source files (.qmd or .ipynb): Plain-text documents containing code chunks, statistical analyses, and markdown text
  • Configuration (_quarto.yml): Defines the project structure, navigation, and visual theme
  • Rendered output (docs/ directory): Auto-generated HTML, PDF, or website versions of your analyses

For scientists, Quarto solves a critical problem: maintaining a single source of truth where code and interpretation live together, preventing the drift between analysis and write-up that plagues traditional workflows.

The Challenge: From Code to Coherent Narrative

While Quarto excels at rendering code and results, transforming computational analysis into publication-ready scientific prose remains cognitively demanding. Normally, I must:

  • Synthesize complex operations into clear methodology descriptions
  • Extract quantitative results from rendered outputs
  • Maintain consistent scientific voice and style
  • Ensure logical flow between analysis sections
  • Balance technical accuracy with accessibility

This is where AI integration becomes transformative.

The AI-Enhanced Workflow Architecture

I have implemented a Claude-assisted scientific writing system in all my Quarto projects, working in the Positron IDE. It is built on three pillars:

1. Project Instructions (CLAUDE.md)

The CLAUDE.md file serves as the command center, providing Claude with:

  • System architecture overview: Explains the Positron IDE environment, Quarto structure, and file organization
  • Coding standards: Enforces tidyverse patterns, vectorization over loops, use of native R pipe |>, and mandatory code annotations
  • Workflow protocols: Defines how to interpret results, write/edit code, and handle git operations
  • Quick reference guide: Maps common requests to specific actions

This file essentially “programs” Claude to function as a domain-specific research assistant who understands both the technical environment and scientific goals.

2. Context Profiles (.context/ directory)

Three critical files shape Claude’s behavior:

quarto-narrative-skill/SKILL.md

This is the heart of the enhancer system: a comprehensive instruction manual for transforming Quarto analyses into scientific narratives. It defines:

  • Input requirements: Reads three files per analysis:

    1. The .qmd source (code logic and structure)
    2. The rendered .html (numerical results and outputs)
    3. The index.qmd (experimental context and research questions)
  • Core workflow:

    1. Parse all input files to extract code operations, results, and context
    2. Maintain exact section hierarchy from the .qmd structure
    3. For each section: describe code operations → report HTML results → interpret biologically
    4. Apply strict writing style constraints
    5. Synthesize into cohesive narrative prose
  • HTML parsing strategy: Extracts content from rendered HTML by identifying text between tags (<p>, <td>, <pre>), ignoring JavaScript/CSS, and pulling numerical values from tables and code outputs

  • Special handling: Covers multiple comparisons, negative results, technical issues, and complex visualizations
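The HTML parsing strategy above can be sketched in a few lines of Python. This is only an illustration of the idea, not the contents of SKILL.md: the class name, the number pattern, and the sample HTML are all hypothetical, and the sample reuses the values from the worked example later in this post.

```python
import re
from html.parser import HTMLParser


class RenderedTextExtractor(HTMLParser):
    """Collect text inside <p>, <td>, and <pre>; ignore <script> and <style>."""

    KEEP = {"p", "td", "pre"}
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.keep_depth = 0
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.KEEP:
            self.keep_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.KEEP and self.keep_depth:
            self.keep_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.keep_depth and not self.skip_depth:
            self.chunks.append(text)


# Hypothetical pattern: standalone numbers only, so the "2" in "log2FC" is skipped.
NUMBER = re.compile(r"(?<![\w.])-?\d+(?:\.\d+)?(?!\w)")


def extract_chunks(html):
    """Return the visible text chunks of a rendered page, in document order."""
    parser = RenderedTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def extract_numbers(chunks):
    """Pull numeric values out of the extracted text chunks."""
    return [float(n) for chunk in chunks for n in NUMBER.findall(chunk)]


# Illustrative stand-in for a rendered Quarto page.
SAMPLE_HTML = """
<html><head><style>p { color: red; }</style>
<script>var n = 999;</script></head>
<body>
<p>347 genes with padj &lt; 0.05 and |log2FC| &gt; 1</p>
<table><tr><td>Upregulated</td><td>198</td></tr>
<tr><td>Downregulated</td><td>149</td></tr></table>
</body></html>
"""

chunks = extract_chunks(SAMPLE_HTML)
print(chunks)
print(extract_numbers(chunks))
```

The script and style contents never reach the output, which is the point of the "ignoring JavaScript/CSS" rule: only text a reader would see in the rendered report is treated as a result.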

writingStyle_OSmithies.json

A structured specification of scientific voice inspired by Nobel laureate Oliver Smithies. Key constraints:

  • Voice: Active, first-person agency (“We performed” not “was performed”)
  • Tone: Decisive, objective, authoritative—avoid hedging
  • Sentence structure: Under 25 words, one idea per sentence, linear flow
  • Data reporting: Specific values, explicit statistical significance
  • Flow pattern: Context → Action → Result → Meaning

Example transformation:

❌ "Analysis was performed using DESeq2"
✅ "We performed analysis using DESeq2"

❌ "Many genes were significant"
✅ "We detected 347 significant genes"
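To make the idea of a machine-readable style concrete, here is a minimal sketch of how a few of these constraints could be checked in Python. The field names in STYLE_SPEC are hypothetical and do not reproduce writingStyle_OSmithies.json, and the passive-voice pattern is a crude heuristic that will miss irregular participles.

```python
import json
import re

# Hypothetical fragment of a style specification; the real JSON may look different.
STYLE_SPEC = json.loads("""
{
  "voice": "active, first person",
  "max_words_per_sentence": 25,
  "vague_quantifiers": ["many", "several", "some", "a number of"]
}
""")

# Heuristic only: a form of "to be" followed by a word ending in -ed or -en.
PASSIVE = re.compile(r"\b(?:was|were|is|are|been|being)\s+\w+(?:ed|en)\b", re.IGNORECASE)


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check_sentence(sentence, spec=STYLE_SPEC):
    """Return a list of style issues found in one sentence."""
    issues = []
    # "Under 25 words" means 25 or more is a violation.
    if len(sentence.split()) >= spec["max_words_per_sentence"]:
        issues.append("too long")
    if PASSIVE.search(sentence):
        issues.append("possible passive voice")
    lowered = sentence.lower()
    for word in spec["vague_quantifiers"]:
        if re.search(rf"\b{re.escape(word)}\b", lowered):
            issues.append(f"vague quantifier: {word}")
    return issues


for example in [
    "Analysis was performed using DESeq2",
    "We performed analysis using DESeq2",
    "Many genes were significant",
    "We detected 347 significant genes",
]:
    print(example, "->", check_sentence(example))
```

In my workflow Claude applies the style directly while writing; a checker like this would only be a way to spot-check the output afterwards.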

git_workflow.md

Enforces version control standards:

  • Commit format: Title line + bullet points explaining specific changes
  • Distinguishes between detailed explanations (for analysis code) and simple messages (for documentation)
  • Explicitly excludes AI attribution messages
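As an illustration of the commit format, the sketch below assembles a message from a title and a list of changes, and rejects attribution lines. The function name and the attribution pattern are assumptions made for this example; git_workflow.md itself is a set of written instructions, not code.

```python
import re

# Hypothetical pattern for attribution trailers; adjust to whatever should be excluded.
AI_ATTRIBUTION = re.compile(
    r"(?:co-authored-by:|generated (?:with|by))[^\n]*\b(?:claude|anthropic|ai)\b",
    re.IGNORECASE,
)


def build_commit_message(title, changes=()):
    """Return 'title + blank line + bullets'; a bare title when no changes are listed."""
    title = title.strip()
    if not title:
        raise ValueError("commit title must not be empty")
    lines = [title]
    bullets = [f"- {change.strip()}" for change in changes if change.strip()]
    if bullets:
        lines.append("")
        lines.extend(bullets)
    message = "\n".join(lines)
    if AI_ATTRIBUTION.search(message):
        raise ValueError("commit message contains an AI attribution line")
    return message


# Detailed message for analysis code, simple message for documentation.
print(build_commit_message(
    "Update DE thresholds",
    ["Raise |log2FC| cutoff to 1", "Re-render docs"],
))
print(build_commit_message("Fix typo in index.qmd"))
```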

IMPORTANT: These files live inside the Quarto project, under the .context/ directory.

3. The Invocation Pattern

When I need to document an analysis, I start Claude Code in a terminal window and use the following prompt to trigger the workflow:

"Write the analysis for xxxxxx.qmd" 

Claude then invokes quarto-narrative-skill, which:

  1. Reads the source .qmd file (understanding methodology)
  2. Reads the rendered .html file (extracting results)
  3. Reads index.qmd (gathering experimental context)
  4. Applies the Oliver Smithies writing style
  5. Generates publication-ready prose that:
    • Maintains the exact section structure from the .qmd
    • Describes what each code chunk does and why
    • Reports specific numerical results from the HTML
    • Provides biological interpretation
    • Uses active voice, concise sentences, and precise terminology
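The first three steps amount to locating three files for each analysis. A minimal sketch of that lookup follows, assuming the rendered HTML sits under docs/ with the same relative path as the source and that index.qmd sits at the project root. The function names are hypothetical, and the layout should be adjusted to match the output-dir in your _quarto.yml.

```python
from pathlib import Path


def narrative_inputs(qmd_name, project_root=".", output_dir="docs"):
    """Map one analysis file to the three inputs the skill reads."""
    root = Path(project_root)
    return {
        "source": root / qmd_name,                                           # code logic and structure
        "rendered": root / output_dir / Path(qmd_name).with_suffix(".html"),  # numerical results
        "context": root / "index.qmd",                                       # experimental context
    }


def missing_inputs(inputs):
    """List the roles whose file does not exist, e.g. a page that was never rendered."""
    return [role for role, path in inputs.items() if not path.is_file()]


inputs = narrative_inputs("analysis/de.qmd", project_root="/proj")
for role, path in inputs.items():
    print(role, path)
```

A missing "rendered" entry is the common failure: the .qmd was edited but not re-rendered, so there are no current results to report.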

Example Output Pattern

Given a .qmd section:

## Differential Expression Analysis
dds <- DESeq(dds)
res <- results(dds, contrast=c("genotype", "APOE2", "WT"))
res_sig <- subset(res, padj < 0.05 & abs(log2FoldChange) > 1)

And HTML showing:

347 genes with padj < 0.05 and |log2FC| > 1
Upregulated: 198 genes
Downregulated: 149 genes

Claude generates:

“We performed differential expression analysis using DESeq2 to compare APOE2 and wild-type mice. We identified 347 genes with adjusted p-values below 0.05 and absolute log2 fold changes exceeding 1. Of these, 198 genes showed increased expression in APOE2 mice, while 149 showed decreased expression.”
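One way to spot-check an output like this is to compare the numbers in the rendered results with the numbers in the generated prose. The sketch below does that for the example above. The helper names are hypothetical, and the check is deliberately simple: it compares sets of standalone numbers and knows nothing about what each number means.

```python
import re

# Standalone numbers only: digits glued to letters (DESeq2, APOE2, log2FC) are ignored.
NUMBER = re.compile(r"(?<![\w.])\d+(?:\.\d+)?(?!\w)")


def numbers_in(text):
    """Return the set of standalone numeric values found in a text."""
    return {float(n) for n in NUMBER.findall(text)}


def unreported(results_text, narrative):
    """Values present in the results but absent from the prose."""
    return sorted(numbers_in(results_text) - numbers_in(narrative))


def unsupported(results_text, narrative):
    """Values present in the prose but absent from the results."""
    return sorted(numbers_in(narrative) - numbers_in(results_text))


RESULTS = (
    "347 genes with padj < 0.05 and |log2FC| > 1\n"
    "Upregulated: 198 genes\n"
    "Downregulated: 149 genes"
)
NARRATIVE = (
    "We performed differential expression analysis using DESeq2 to compare "
    "APOE2 and wild-type mice. We identified 347 genes with adjusted p-values "
    "below 0.05 and absolute log2 fold changes exceeding 1. Of these, 198 genes "
    "showed increased expression in APOE2 mice, while 149 showed decreased expression."
)

print(unreported(RESULTS, NARRATIVE))
print(unsupported(RESULTS, NARRATIVE))
```

Both lists are empty when the prose and the results agree. A transposed digit in the narrative shows up in both lists at once.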

Why This Approach Works

This system succeeds because it:

  1. Separates concerns: Code lives in .qmd, results in .html, style rules in CLAUDE.md and .context/. Style rules can be reused in other projects.
  2. Enforces consistency: The writing style JSON ensures every narrative follows the same rigorous standard
  3. Preserves scientific accuracy: By reading both source code and rendered output, Claude reports exact values and methods
  4. Maintains structure: The skill explicitly preserves the .qmd section hierarchy, preventing AI “creativity” from reorganizing logical flow
  5. Is reproducible: The entire workflow is version-controlled and can be audited
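The structure guarantee in point 4 is easy to verify mechanically. Here is a minimal sketch that compares the heading outline of a .qmd with the outline of the generated narrative. The function names and sample texts are hypothetical (the "Significant genes" subsection is invented for the example), and lines inside fenced code chunks are skipped so that R comments are not mistaken for headings.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
FENCE_LINE = re.compile(r"^\s*(`{3,}|~{3,})")
ATTRIBUTES = re.compile(r"\s*\{[^}]*\}$")  # trailing Quarto attributes such as {#sec-x}


def headings(markdown_text):
    """Return (level, title) pairs for headings outside fenced code chunks."""
    found = []
    in_fence = False
    for line in markdown_text.splitlines():
        if FENCE_LINE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            title = ATTRIBUTES.sub("", match.group(2))
            found.append((len(match.group(1)), title))
    return found


def same_structure(qmd_text, narrative_text):
    """True when both documents have identical heading levels, titles, and order."""
    return headings(qmd_text) == headings(narrative_text)


FENCE_MARK = "`" * 3  # built indirectly so this example can sit inside a fenced block

SAMPLE_QMD = "\n".join([
    "## Differential Expression Analysis",
    "",
    FENCE_MARK + "{r}",
    "# fit the model (a comment, not a heading)",
    "dds <- DESeq(dds)",
    FENCE_MARK,
    "",
    "### Significant genes {#sec-sig}",
])
SAMPLE_NARRATIVE = "\n".join([
    "## Differential Expression Analysis",
    "",
    "We performed differential expression analysis using DESeq2.",
    "",
    "### Significant genes",
])

print(headings(SAMPLE_QMD))
print(same_structure(SAMPLE_QMD, SAMPLE_NARRATIVE))
```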

Key Innovations

  • HTML-as-database: Treating rendered HTML as a structured data source for result extraction
  • Style-as-code: Formalizing scientific voice into a machine-readable specification
  • Context injection: Using .context/ files to provide persistent behavioral constraints
  • Skill-based invocation: Creating specialized “modes” for Claude through detailed skill definitions

Practical Benefits

For researchers, this system:

  • Reduces writing time from hours to minutes
  • Ensures consistent voice across multi-year projects
  • Catches methodology/results mismatches (a description that does not match the code stands out immediately)
  • Enables rapid iteration (change analysis → re-render → regenerate narrative)
  • Creates an audit trail (git history shows when code changed vs. when prose changed)

Files That Make It Work

Essential components:

  • CLAUDE.md: System instructions and quick reference
  • .context/quarto-narrative-skill/SKILL.md: Narrative generation protocol
  • .context/writingStyle_OSmithies.json: Voice specification
  • .context/git_workflow.md: Version control standards
  • _quarto.yml: Project configuration

Together, these files transform Claude from a general-purpose AI into a specialized scientific writing assistant that understands computational biology, R/Python code, statistical analysis, and publication standards.

The Future of AI-Assisted Science

This method demonstrates that AI’s value in research isn’t just in running analyses; it also lies in maintaining the connective tissue between computation and communication. By codifying scientific writing standards and creating structured workflows, we can leverage AI to handle the mechanical aspects of prose generation while researchers focus on interpretation and discovery.

The system is transparent, auditable, and version-controlled, all essential properties for scientific reproducibility. It doesn’t replace scientific thinking; it amplifies it by removing friction between “I analyzed this” and “I can clearly explain what I did.”