Contributing

Day-to-day workflow for contributors. Recipes for adding sources, agents, analyzers, and landscapes; the Make-target catalogue; testing conventions; and debugging.

For the system tour (data model, engine internals, dedup invariants), see docs/architecture.md. For per-endpoint API contracts, docs/api-reference.md. For the test-fixture catalogue, docs/testing.md.

Setup

Follow docs/installation.md first. The rest of this doc assumes make test passes.


The make menu

All common commands are wrapped in the Makefile. Run make help for a printable list.

Setup

| Target | What it does |
| --- | --- |
| `make install` | `uv sync --extra dev` — runtime + dev deps |
| `make sync` | `uv sync` — runtime only (no pytest/ruff) |

Data pipeline

| Target | Notes |
| --- | --- |
| `make seed` | Seed nsclc-001 (~5 min, rate-limited by source) |
| `make seed-immunology` | Seed immunology-001 |
| `make seed-immunology-full` | Wipe DB + reproducible immunology setup: seed → profiles → KIQs → briefing → analyses |
| `make analyze-immunology` | Regenerate per-drug analyses for dupilumab (no reseed) |
| `make profiles` | Build DrugProfile rows from signal data (~5 s) |
| `make targets` | Build Target nodes + DrugTarget edges |
| `make briefing` | Generate briefing for nsclc-001 |
| `make briefing-immunology` | Generate briefing for immunology-001 |
| `make briefing-html` | Generate briefing → JSON + HTML render |
| `make briefing-since SINCE=2026-03-01` | Custom lookback window |
| `make drug-briefing DRUG=pembrolizumab [LANDSCAPE=nsclc-001]` | Drug-focused deep-dive briefing. Pass LANDSCAPE for non-NSCLC drugs (e.g. `make drug-briefing DRUG=dupilumab LANDSCAPE=immunology-001`); the briefing loads drug-specific KIQs from `{landscape}-{drug}` only |
| `make to-html F=path/to/briefing.json` | Convert any briefing JSON to HTML |
| `make pipeline` | Full chain: seed → profiles → targets → briefing |
| `make pipeline-html` | Full chain ending in JSON + HTML render |
| `make query Q="question"` | One-shot QueryEngine ask against nsclc-001 |
| `make inspect` | Print signal counts by source/type/severity |

Evidence pipeline (separate from briefing)

The evidence pipeline builds the durable structured-trial-outcomes layer (EvidenceRecord + ProtocolProfile). It is CLI-only — never invoked by a normal briefing run. See architecture.md §4.7.

| Target | Notes |
| --- | --- |
| `make evidence-pilot` | Run pipeline on immunology-001 — writes DB + evals/pilot_immunology/report.md |
| `make evidence-pilot-dry` | Dry run: calls extractors, no DB writes — for cost estimation |

The underlying script (scripts/run_evidence_pipeline.py) accepts --landscape, --top-n, --limit-trials, --dry-run, --force-replace, --budget-usd. --force-replace overwrites existing rows for prompt-iteration runs; everything is content-hashed, so reruns without --force-replace are idempotent.

KIQ seeding

Briefings synthesize answers against Key Intelligence Questions. KIQs scope to either a landscape (class-level) or a drug-composite ID (drug-specific) — see architecture.md §2 — Path C. Seed them once per scope:

```shell
uv run python scripts/seed_kiqs.py                       # all landscapes
uv run python scripts/seed_kiqs.py --landscape immunology-001
```

The seeder uses stable IDs, so re-runs idempotently merge — safe to run after editing the canonical KIQ list inline. make seed-immunology-full wires this in automatically.
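
The stable-ID merge behaviour can be sketched as follows. This is an illustrative stand-in, not the seeder's actual schema: `CANONICAL_KIQS`, `seed_kiqs`, and the dict store are hypothetical, and the real code writes to the DB.

```python
# Canonical list: each KIQ carries a hand-assigned stable ID.
CANONICAL_KIQS = [
    {"id": "imm-kiq-01", "scope": "immunology-001",
     "question": "Which competing assets threaten dupilumab share?"},
]

def seed_kiqs(store: dict, kiqs: list[dict]) -> None:
    """Upsert by stable ID: re-runs overwrite in place, never duplicate."""
    for kiq in kiqs:
        store[kiq["id"]] = dict(kiq)

store: dict = {}
seed_kiqs(store, CANONICAL_KIQS)
seed_kiqs(store, CANONICAL_KIQS)   # rerun: still one row
edited = [dict(CANONICAL_KIQS[0], question="Edited question text")]
seed_kiqs(store, edited)           # an inline edit merges under the same ID
```

Because the ID is fixed by hand rather than derived from the question text, editing the wording updates the existing row instead of creating a duplicate.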

One-time migrations. scripts/migrate_immunology_kiqs.py is the migration that moved legacy parent-only KIQs into the Path C scoping (drug-composite for drug-specific questions, parent landscape for class-level). Already applied to local DBs — left in tree for reference.

Smoke scripts

Two end-to-end smoke scripts live under scripts/. Both assume the API is already running — start it with make dev in another terminal before invoking:

```shell
bash scripts/smoke_drug_briefing.sh        # asserts drug briefing has kiq_answers populated
bash scripts/smoke_briefing_frontend.sh    # asserts briefing has structured entity ids on signal_analyses
```

Both honour API_BASE for non-default hosts (e.g. API_BASE=http://staging:8000 bash scripts/smoke_*).

These are not part of make test. Run before merging anything that touches the synthesizer or the briefing API surface.

Servers

| Target | URL |
| --- | --- |
| `make dev` | http://localhost:8000 (FastAPI + reload) |
| `make frontend-install` | npm install (first time only) |
| `make frontend` | http://localhost:5173 (Vite dev server) |

Quality

| Target | What it does |
| --- | --- |
| `make test` | Full suite (pytest tests/) |
| `make test-v` | Verbose |
| `make test-fast` | Stop on first failure |
| `make test-file F=tests/unit/engine/test_detector.py` | One file |
| `make lint` | ruff check |
| `make fmt` | ruff format (auto-fix) |
| `make check` | lint + full test suite |

Cleanup

| Target | What it does |
| --- | --- |
| `make clean-db` | Delete ogur.db (irreversible) |
| `make clean-cache` | Wipe `__pycache__`, `.pytest_cache`, `.ruff_cache` |
| `make clean` | clean-cache only |

Code style

  • ruff is the only formatter + linter. Config in pyproject.toml — line length 100, Python 3.11 target.
  • No separate black, isort, or flake8.
  • Type hints everywhere. SQLModel does runtime enforcement, but type hints document intent.

Adding a new data source

  1. Implement ogur/sources/<name>.py:
```python
from ogur.sources.base import Source
from ogur.models.signal import Signal, SignalType, SignalSeverity

class MyNewSource(Source):
    name = "mynewsource"

    async def fetch(self, landscape) -> list[Signal]:
        data = await self._get("https://api.example.com/…")
        return [self._to_signal(row, landscape) for row in data["rows"]]
```

Use self._get / self._post for HTTP — they get automatic retry + exponential backoff.
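
The retry behaviour can be approximated like this. A standalone sketch only: `fetch_with_retry` is a hypothetical name, and the real base class may differ in retry count, delay, and which exceptions it retries.

```python
import asyncio

async def fetch_with_retry(call, retries: int = 3, base_delay: float = 0.5):
    """Retry an async call, sleeping base_delay * 2**attempt between failures."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == retries:
                raise          # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
```

The practical consequence for source authors: transient upstream failures are absorbed, so `fetch` should raise only on genuinely unrecoverable responses.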

  2. Compute content_hash correctly. Use Source.compute_hash(self.name, source_id, signal_type.value) — do not invent a new hashing scheme. The dedup invariant depends on this.
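
A plausible shape for that helper, assuming a SHA-256 digest over the three fields (the actual implementation may differ in separator and algorithm — the point is that one canonical digest keys dedup):

```python
import hashlib

def compute_hash(source_name: str, source_id: str, signal_type: str) -> str:
    """One canonical digest over (source, id, type): same inputs, same hash."""
    raw = f"{source_name}:{source_id}:{signal_type}"
    return hashlib.sha256(raw.encode()).hexdigest()
```

Any field change produces a different hash, which is exactly why sources must not roll their own scheme: two hashing conventions would let the same signal slip past dedup twice.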

  3. Test it at tests/unit/sources/test_<name>.py:

```python
import pytest
from pytest_httpx import HTTPXMock
from ogur.sources.mynewsource import MyNewSource
from tests.conftest import make_landscape

@pytest.mark.asyncio
async def test_basic_fetch(httpx_mock: HTTPXMock):
    httpx_mock.add_response(
        url="https://api.example.com/…",
        json={"rows": [{}, {}, {}]},  # three rows → three signals
    )
    src = MyNewSource()
    signals = await src.fetch(make_landscape())
    assert len(signals) == 3
```
  4. Register in the seed script (scripts/seed_nsclc.py) — wrap in try/except so a failure doesn't kill the whole run.

  5. If the source has a new domain, add a DomainAgent:

```python
# ogur/engine/agents/my_domain.py
from ogur.engine.agents.base import DomainAgent

class MyDomainAgent(DomainAgent):
    domain = "mydomain"
    signal_sources = frozenset({"mynewsource"})
    classifier_prompt_suffix = "Score 10 for X, 1 for Y…"
```

Then add it to AgentOrchestrator.__init__ in ogur/engine/agents/orchestrator.py.

  6. Document it in data-sources.md — auth, rate limit, signal types produced, upstream link.

Adding a new landscape

A landscape defines what to track (indication, targets, conditions, companies).

  1. Create a seed script scripts/seed_<name>.py modeled on seed_immunology.py:
```python
from ogur.models.landscape import Landscape

landscape = Landscape(
    id="mylandscape-001",
    name="…",
    indication="…",
    therapeutic_area="…",
    conditions='["…", "…"]',
    targets='["…"]',
    companies='[]',
)
# insert, then run sources against it
```
  2. Add matching make targets and smoke-test with one source before running all of them.

  3. If the landscape uses a different Open Targets disease ID, override OPENTARGETS_DISEASE_ID in .env before seeding (or parameterize — the current code uses the global setting).


Adding a new engine agent

If you want to split an existing domain or add one:

  1. Subclass DomainAgent in ogur/engine/agents/.
  2. Set domain, signal_sources, and a classifier_prompt_suffix — the suffix is appended to the base Haiku scoring rubric and tunes 1–10 relevance for that domain.
  3. Register in AgentOrchestrator.__init__ (orchestrator.py).
  4. Add a test in tests/unit/engine/test_orchestrator.py asserting your agent owns the expected Signal.source values.
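
The ownership assertion in step 4 can be as small as the following sketch. `StubAgent` and `route` are stand-ins for illustration; the real routing lives inside AgentOrchestrator.

```python
from dataclasses import dataclass

@dataclass
class StubAgent:
    domain: str
    signal_sources: frozenset

def route(agents, source: str):
    """Return the agent whose signal_sources claims this Signal.source."""
    return next((a for a in agents if source in a.signal_sources), None)

agents = [
    StubAgent("mydomain", frozenset({"mynewsource"})),
    StubAgent("clinical", frozenset({"clinicaltrials"})),
]
```

The test's job is simply to pin the source→agent mapping so a refactor can't silently orphan a source.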

Running entity / outcomes evals

Two harnesses live under ogur/engine/extractor/ and have CLI runners under scripts/:

```shell
uv run python scripts/eval_entity_extractor.py        # NER over the BIOPSY-derived gold set
uv run python scripts/eval_outcomes_extractor.py      # outcomes-tuple extraction over a 25-sample gold
```

Both write metrics + per-sample diffs into evals/ so prompt iterations are auditable. See evals.md for the harness design and what good looks like.
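
At bottom, "what good looks like" is precision/recall against the gold set. A generic sketch of the core computation (not the harness's actual code):

```python
def prf(gold: set[str], predicted: set[str]) -> tuple[float, float, float]:
    """Precision, recall, F1 for one sample of extracted entities."""
    tp = len(gold & predicted)                     # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# One correct entity, one miss, one spurious extraction:
p, r, f = prf({"dupilumab", "IL-4"}, {"dupilumab", "IL-13"})
```

Per-sample diffs in evals/ let you see which entities moved between the tp/miss/spurious buckets across prompt iterations.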


Writing a new per-tab analyzer

Per-tab analyzers live at ogur/engine/analyzers/ and feed the AssetDetail tabs.

Contract:

```python
class MyAnalyzer:
    def analyze(self, drug_name: str, landscape_id: str) -> dict:
        # 1. Pull relevant signals from DB
        # 2. Stuff into a Haiku prompt with structured JSON schema
        # 3. Parse, return dict matching a Pydantic schema in ogur/api/schemas.py
        ...
```
  1. Define the output shape in ogur/api/schemas.py (MyAnalyzerOut Pydantic model).
  2. Implement the analyzer — follow the pattern in overview.py: _SYSTEM prompt + _build_user_prompt + _parse_response.
  3. Add a route pair in ogur/api/routes/briefings.py: GET .../{tab} and POST .../{tab}/generate.
  4. Cache via upsert_briefing with composite landscape_id = f"{landscape_id}-{drug_name}-{tab}".
  5. Test in tests/unit/engine/test_analyzers.py.
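
The caching convention in step 4 reduces to a composite key. Sketched below with a dict standing in for upsert_briefing; `get_or_generate` is an illustrative name, not the real API.

```python
cache: dict[str, dict] = {}

def cache_key(landscape_id: str, drug_name: str, tab: str) -> str:
    """Composite key giving each (landscape, drug, tab) its own cache slot."""
    return f"{landscape_id}-{drug_name}-{tab}"

def get_or_generate(key: str, generate) -> dict:
    """GET serves the cached row; regeneration overwrites cache[key]."""
    if key not in cache:
        cache[key] = generate()
    return cache[key]
```

This is why prompt iteration requires hitting the POST .../generate endpoint: the GET path will happily serve a stale row forever.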

Testing conventions

See testing.md for fixture details. High-level rules:

  • Patch where used, not where defined. patch("ogur.engine.detector.get_session", …), never ogur.store.database.get_session.
  • Use make_signal / make_drug_profile / etc. from tests.conftest — don't hand-build models in tests.
  • Integration tests (live network calls) use @pytest.mark.integration and are not run in CI — make test excludes them by default. Run with uv run --extra dev python -m pytest -m integration.
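
Why "patch where used" matters, in miniature. Two throwaway in-memory modules below stand in for ogur.store.database and ogur.engine.detector; the names `db` and `detector` are purely illustrative.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for ogur.store.database: defines get_session.
db = types.ModuleType("db")
db.get_session = lambda: "real"
sys.modules["db"] = db

# Stand-in for ogur.engine.detector: it did `from db import get_session`,
# so it holds its own reference to the function.
detector = types.ModuleType("detector")
detector.get_session = db.get_session
detector.run = lambda: detector.get_session()
sys.modules["detector"] = detector

with patch("db.get_session", return_value="fake"):
    patched_at_definition = detector.run()   # still "real" — wrong place
with patch("detector.get_session", return_value="fake"):
    patched_at_use = detector.run()          # "fake" — patched where used
```

Patching the defining module rebinds a name the consumer never looks at again; patching the consumer's own reference is what actually intercepts the call.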

Debugging a briefing

If make briefing produces a bad briefing:

  1. Inspect the DB: make inspect — confirm you have signals in the window.
  2. Check the synthesizer input: pipeline logs print the detected changes before classification.
  3. Run the classifier in isolation: add --log-level DEBUG to see Haiku scoring.
  4. Re-run with --since set to the start of the window to confirm time bounds.
  5. The raw synthesizer output is saved to latest_briefing_<landscape_id>.json — diff against a previous known-good one.

For the web UI, the per-tab analyzers cache their output — if you're iterating on a prompt, call the POST .../{tab}/generate endpoint to overwrite the cache.


Committing

  • No hooks are required. make check before pushing is recommended.
  • Commit messages follow type(scope): summary (see git log --oneline).
  • ruff format fixes most style issues — run make fmt before pushing.