Contributing

Day-to-day workflow for contributors. Recipes for adding sources, agents, analyzers, and landscapes; the Make-target catalogue; testing conventions; and debugging.

For the system tour (data model, engine internals, dedup invariants), see docs/architecture.md. For per-endpoint API contracts, docs/api-reference.md. For the test-fixture catalogue, docs/testing.md.

Setup

Follow docs/installation.md first. The rest of this doc assumes make test passes.


The make menu

All common commands are wrapped in the Makefile. Run make help for a printable list.

Setup

| Target | What it does |
| --- | --- |
| `make install` | `uv sync --extra dev` — runtime + dev deps |
| `make sync` | `uv sync` — runtime only (no pytest/ruff) |

Data pipeline

| Target | Notes |
| --- | --- |
| `make seed` | Seed nsclc-001 (~5 min, rate-limited by source) |
| `make seed-immunology` | Seed immunology-001 |
| `make seed-immunology-full` | Wipe DB + reproducible immunology setup: seed → profiles → KIQs → briefing → analyses |
| `make analyze-immunology` | Regenerate per-drug analyses for dupilumab (no reseed) |
| `make profiles` | Build DrugProfile rows from signal data (~5 s) |
| `make targets` | Build Target nodes + DrugTarget edges |
| `make briefing` | Generate briefing for nsclc-001 |
| `make briefing-immunology` | Generate briefing for immunology-001 |
| `make briefing-html` | Generate briefing → JSON + HTML render |
| `make briefing-since SINCE=2026-03-01` | Custom lookback window |
| `make drug-briefing DRUG=pembrolizumab [LANDSCAPE=nsclc-001]` | Drug-focused deep-dive briefing. Pass LANDSCAPE for non-NSCLC drugs (e.g. `make drug-briefing DRUG=dupilumab LANDSCAPE=immunology-001`); the briefing loads drug-specific KIQs from `{landscape}-{drug}` only |
| `make to-html F=path/to/briefing.json` | Convert any briefing JSON to HTML |
| `make pipeline` | Full chain: seed → profiles → targets → briefing |
| `make pipeline-html` | Full chain ending in JSON + HTML render |
| `make query Q="question"` | One-shot QueryEngine ask against nsclc-001 |
| `make inspect` | Print signal counts by source/type/severity |

Evidence pipeline (separate from briefing)

The evidence pipeline builds the durable structured-trial-outcomes layer (EvidenceRecord + ProtocolProfile). It is CLI-only — never invoked by a normal briefing run. See architecture.md §4.7.

| Target | Notes |
| --- | --- |
| `make evidence-pilot` | Run pipeline on immunology-001 — writes DB + evals/pilot_immunology/report.md |
| `make evidence-pilot-dry` | Dry run: calls extractors, no DB writes — for cost estimation |

The underlying script (scripts/run_evidence_pipeline.py) accepts --landscape, --top-n, --limit-trials, --dry-run, --force-replace, --budget-usd. --force-replace overwrites existing rows for prompt-iteration runs; everything is content-hashed, so reruns without --force-replace are idempotent.

KIQ seeding

Briefings synthesize answers against Key Intelligence Questions. KIQs scope to either a landscape (class-level) or a drug-composite ID (drug-specific) — see architecture.md §2 — Path C. Seed them once per scope:

```shell
uv run python scripts/seed_kiqs.py                       # all landscapes
uv run python scripts/seed_kiqs.py --landscape immunology-001
```

The seeder uses stable IDs, so re-runs idempotently merge — safe to run after editing the canonical KIQ list inline. make seed-immunology-full wires this in automatically.
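
The stable-ID merge behaviour can be sketched as follows. This is an illustrative stand-in, not the seeder's actual schema: `CANONICAL_KIQS`, `seed_kiqs`, and the dict store are hypothetical, and the real code writes to the DB.

```python
# Canonical list: each KIQ carries a hand-assigned stable ID.
CANONICAL_KIQS = [
    {"id": "imm-kiq-01", "scope": "immunology-001",
     "question": "Which competing assets threaten dupilumab share?"},
]

def seed_kiqs(store: dict, kiqs: list[dict]) -> None:
    """Upsert by stable ID: re-runs overwrite in place, never duplicate."""
    for kiq in kiqs:
        store[kiq["id"]] = dict(kiq)

store: dict = {}
seed_kiqs(store, CANONICAL_KIQS)
seed_kiqs(store, CANONICAL_KIQS)   # rerun: still one row
edited = [dict(CANONICAL_KIQS[0], question="Edited question text")]
seed_kiqs(store, edited)           # an inline edit merges under the same ID
```

Because the ID is fixed by hand rather than derived from the question text, editing the wording updates the existing row instead of creating a duplicate.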

One-time migrations. scripts/migrate_immunology_kiqs.py is the migration that moved legacy parent-only KIQs into the Path C scoping (drug-composite for drug-specific questions, parent landscape for class-level). Already applied to local DBs — left in tree for reference.

Smoke scripts

Two end-to-end smoke scripts live under scripts/. Both assume the API is already running — start it with make dev in another terminal before invoking:

```shell
bash scripts/smoke_drug_briefing.sh        # asserts drug briefing has kiq_answers populated
bash scripts/smoke_briefing_frontend.sh    # asserts briefing has structured entity ids on signal_analyses
```

Both honour API_BASE for non-default hosts (e.g. API_BASE=http://staging:8000 bash scripts/smoke_*).

These are not part of make test. Run before merging anything that touches the synthesizer or the briefing API surface.

Servers

| Target | URL |
| --- | --- |
| `make dev` | http://localhost:8000 (FastAPI + reload) |
| `make frontend-install` | npm install (first time only) |
| `make frontend` | http://localhost:5173 (Vite dev server) |

Quality

| Target | What it does |
| --- | --- |
| `make test` | Full suite (pytest tests/) |
| `make test-v` | Verbose |
| `make test-fast` | Stop on first failure |
| `make test-file F=tests/unit/engine/test_detector.py` | One file |
| `make lint` | ruff check |
| `make fmt` | ruff format (auto-fix) |
| `make check` | lint + full test suite |

Cleanup

| Target | What it does |
| --- | --- |
| `make clean-db` | Delete ogur.db (irreversible) |
| `make clean-cache` | Wipe `__pycache__`, `.pytest_cache`, `.ruff_cache` |
| `make clean` | clean-cache only |

Code style

  • ruff is the only formatter + linter. Config in pyproject.toml — line length 100, Python 3.11 target.
  • No separate black, isort, or flake8.
  • Type hints everywhere. SQLModel does runtime enforcement, but type hints document intent.

Adding a new data source

  1. Implement ogur/sources/<name>.py:
```python
from ogur.sources.base import Source
from ogur.models.signal import Signal, SignalType, SignalSeverity

class MyNewSource(Source):
    name = "mynewsource"

    async def fetch(self, landscape) -> list[Signal]:
        data = await self._get("https://api.example.com/…")
        return [self._to_signal(row, landscape) for row in data["rows"]]
```

Use self._get / self._post for HTTP — they get automatic retry + exponential backoff.
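
The retry behaviour can be approximated like this. A standalone sketch only: `fetch_with_retry` is a hypothetical name, and the real base class may differ in retry count, delay, and which exceptions it retries.

```python
import asyncio

async def fetch_with_retry(call, retries: int = 3, base_delay: float = 0.5):
    """Retry an async call, sleeping base_delay * 2**attempt between failures."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == retries:
                raise          # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
```

The practical consequence for source authors: transient upstream failures are absorbed, so `fetch` should raise only on genuinely unrecoverable responses.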

  2. Compute content_hash correctly. Use Source.compute_hash(self.name, source_id, signal_type.value) — do not invent a new hashing scheme. The dedup invariant depends on this.
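
A plausible shape for that helper, assuming a SHA-256 digest over the three fields (the actual implementation may differ in separator and algorithm — the point is that one canonical digest keys dedup):

```python
import hashlib

def compute_hash(source_name: str, source_id: str, signal_type: str) -> str:
    """One canonical digest over (source, id, type): same inputs, same hash."""
    raw = f"{source_name}:{source_id}:{signal_type}"
    return hashlib.sha256(raw.encode()).hexdigest()
```

Any field change produces a different hash, which is exactly why sources must not roll their own scheme: two hashing conventions would let the same signal slip past dedup twice.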

  3. Test it at tests/unit/sources/test_<name>.py:

```python
import pytest
from pytest_httpx import HTTPXMock
from ogur.sources.mynewsource import MyNewSource
from tests.conftest import make_landscape

@pytest.mark.asyncio
async def test_basic_fetch(httpx_mock: HTTPXMock):
    httpx_mock.add_response(
        url="https://api.example.com/…",
        json={"rows": [{}, {}, {}]},  # three rows → three signals
    )
    src = MyNewSource()
    signals = await src.fetch(make_landscape())
    assert len(signals) == 3
```
  4. Register in the seed script (scripts/seed_nsclc.py) — wrap in try/except so a failure doesn't kill the whole run.

  5. If the source has a new domain, add a DomainAgent:

```python
# ogur/engine/agents/my_domain.py
from ogur.engine.agents.base import DomainAgent

class MyDomainAgent(DomainAgent):
    domain = "mydomain"
    signal_sources = frozenset({"mynewsource"})
    classifier_prompt_suffix = "Score 10 for X, 1 for Y…"
```

Then add it to AgentOrchestrator.__init__ in ogur/engine/agents/orchestrator.py.

  6. Document it in data-sources.md — auth, rate limit, signal types produced, upstream link.

Adding a new landscape

A landscape defines what to track (indication, targets, conditions, companies).

  1. Create a seed script scripts/seed_<name>.py modeled on seed_immunology.py:
```python
from ogur.models.landscape import Landscape

landscape = Landscape(
    id="mylandscape-001",
    name="…",
    indication="…",
    therapeutic_area="…",
    conditions='["…", "…"]',
    targets='["…"]',
    companies='[]',
)
# insert, then run sources against it
```
  2. Add matching make targets and smoke-test with one source before running all of them.

  3. If the landscape uses a different Open Targets disease ID, override OPENTARGETS_DISEASE_ID in .env before seeding (or parameterize — the current code uses the global setting).


Adding a new engine agent

If you want to split an existing domain or add one:

  1. Subclass DomainAgent in ogur/engine/agents/.
  2. Set domain, signal_sources, and a classifier_prompt_suffix — the suffix is appended to the base Haiku scoring rubric and tunes 1–10 relevance for that domain.
  3. Register in AgentOrchestrator.__init__ (orchestrator.py).
  4. Add a test in tests/unit/engine/test_orchestrator.py asserting your agent owns the expected Signal.source values.
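
The ownership assertion in step 4 can be as small as the following sketch. `StubAgent` and `route` are stand-ins for illustration; the real routing lives inside AgentOrchestrator.

```python
from dataclasses import dataclass

@dataclass
class StubAgent:
    domain: str
    signal_sources: frozenset

def route(agents, source: str):
    """Return the agent whose signal_sources claims this Signal.source."""
    return next((a for a in agents if source in a.signal_sources), None)

agents = [
    StubAgent("mydomain", frozenset({"mynewsource"})),
    StubAgent("clinical", frozenset({"clinicaltrials"})),
]
```

The test's job is simply to pin the source→agent mapping so a refactor can't silently orphan a source.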

Running entity / outcomes evals

Two harnesses live under ogur/engine/extractor/ and have CLI runners under scripts/:

```shell
uv run python scripts/eval_entity_extractor.py        # NER over the BIOPSY-derived gold set
uv run python scripts/eval_outcomes_extractor.py      # outcomes-tuple extraction over a 25-sample gold
```

Both write metrics + per-sample diffs into evals/ so prompt iterations are auditable. See evals.md for the harness design and what good looks like.
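
At bottom, "what good looks like" is precision/recall against the gold set. A generic sketch of the core computation (not the harness's actual code):

```python
def prf(gold: set[str], predicted: set[str]) -> tuple[float, float, float]:
    """Precision, recall, F1 for one sample of extracted entities."""
    tp = len(gold & predicted)                     # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# One correct entity, one miss, one spurious extraction:
p, r, f = prf({"dupilumab", "IL-4"}, {"dupilumab", "IL-13"})
```

Per-sample diffs in evals/ let you see which entities moved between the tp/miss/spurious buckets across prompt iterations.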


Writing a new per-tab analyzer

Per-tab analyzers live at ogur/engine/analyzers/ and feed the AssetDetail tabs.

Contract:

```python
class MyAnalyzer:
    def analyze(self, drug_name: str, landscape_id: str) -> dict:
        # 1. Pull relevant signals from DB
        # 2. Stuff into a Haiku prompt with structured JSON schema
        # 3. Parse, return dict matching a Pydantic schema in ogur/api/schemas.py
        ...
```
  1. Define the output shape in ogur/api/schemas.py (MyAnalyzerOut Pydantic model).
  2. Implement the analyzer — follow the pattern in overview.py: _SYSTEM prompt + _build_user_prompt + _parse_response.
  3. Add a route pair in ogur/api/routes/briefings.py: GET .../{tab} and POST .../{tab}/generate.
  4. Cache via upsert_briefing with composite landscape_id = f"{landscape_id}-{drug_name}-{tab}".
  5. Test in tests/unit/engine/test_analyzers.py.
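
The caching convention in step 4 reduces to a composite key. Sketched below with a dict standing in for upsert_briefing; `get_or_generate` is an illustrative name, not the real API.

```python
cache: dict[str, dict] = {}

def cache_key(landscape_id: str, drug_name: str, tab: str) -> str:
    """Composite key giving each (landscape, drug, tab) its own cache slot."""
    return f"{landscape_id}-{drug_name}-{tab}"

def get_or_generate(key: str, generate) -> dict:
    """GET serves the cached row; regeneration overwrites cache[key]."""
    if key not in cache:
        cache[key] = generate()
    return cache[key]
```

This is why prompt iteration requires hitting the POST .../generate endpoint: the GET path will happily serve a stale row forever.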

Testing conventions

See testing.md for fixture details. High-level rules:

  • Patch where used, not where defined. patch("ogur.engine.detector.get_session", …), never ogur.store.database.get_session.
  • Use make_signal / make_drug_profile / etc. from tests.conftest — don't hand-build models in tests.
  • Integration tests (live network calls) use @pytest.mark.integration and are not run in CI — make test excludes them by default. Run with uv run --extra dev python -m pytest -m integration.
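
Why "patch where used" matters, in miniature. Two throwaway in-memory modules below stand in for ogur.store.database and ogur.engine.detector; the names `db` and `detector` are purely illustrative.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for ogur.store.database: defines get_session.
db = types.ModuleType("db")
db.get_session = lambda: "real"
sys.modules["db"] = db

# Stand-in for ogur.engine.detector: it did `from db import get_session`,
# so it holds its own reference to the function.
detector = types.ModuleType("detector")
detector.get_session = db.get_session
detector.run = lambda: detector.get_session()
sys.modules["detector"] = detector

with patch("db.get_session", return_value="fake"):
    patched_at_definition = detector.run()   # still "real" — wrong place
with patch("detector.get_session", return_value="fake"):
    patched_at_use = detector.run()          # "fake" — patched where used
```

Patching the defining module rebinds a name the consumer never looks at again; patching the consumer's own reference is what actually intercepts the call.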

Debugging a briefing

If make briefing produces a bad briefing:

  1. Inspect the DB: make inspect — confirm you have signals in the window.
  2. Check the synthesizer input: pipeline logs print the detected changes before classification.
  3. Run the classifier in isolation: add --log-level DEBUG to see Haiku scoring.
  4. Re-run with --since set to the start of the window to confirm time bounds.
  5. The raw synthesizer output is saved to latest_briefing_<landscape_id>.json — diff against a previous known-good one.

For the web UI, the per-tab analyzers cache their output — if you're iterating on a prompt, call the POST .../{tab}/generate endpoint to overwrite the cache.


Committing

  • No hooks are required. make check before pushing is recommended.
  • Commit messages follow type(scope): summary (see git log --oneline).
  • ruff format fixes most style issues — run make fmt before pushing.