
Phase 6: Insight Analysis

Module: src/insight_analyzer.py
Estimated Time: ~1 day

Objective

Analyze the knowledge graph with graph algorithms to identify cross-disciplinary patterns, hub concepts, and field influence dynamics, then generate a structured Markdown report.

Analysis Components

1. Hub Concept Analysis

Identifies bridge concepts that connect the largest number of distinct scientific fields:

  • Counts the number of distinct fields each concept connects to
  • Ranks concepts by cross-field connectivity
  • Lists the top-ranked concepts as "interdisciplinary hubs"

Example result:

| Concept | Connected Fields |
|---|---|
| Protein Degradation Mechanisms | Biology, Chemistry, Physics |
| Ubiquitin-Proteasome Pathway | Biology, Chemistry |
| Fixed Cost Analysis | Economics, Business |
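
The counting and ranking steps above can be sketched as follows. This is a minimal illustration, not the code in src/insight_analyzer.py: the function name `rank_hub_concepts`, the `(concept, field)` pair input, and the alphabetical tie-break are all assumptions made for the example. The sample links reuse the example result shown above.

```python
from collections import defaultdict


def rank_hub_concepts(concept_field_links, top_n=10):
    """Rank concepts by the number of distinct fields they connect to.

    `concept_field_links` is a hypothetical input: an iterable of
    (concept, field) pairs extracted from the knowledge graph.
    """
    fields_by_concept = defaultdict(set)
    for concept, field in concept_field_links:
        fields_by_concept[concept].add(field)

    # Most fields first; ties are broken alphabetically by concept name
    # (an assumption, chosen only to make the ordering deterministic).
    ranked = sorted(
        fields_by_concept.items(),
        key=lambda item: (-len(item[1]), item[0]),
    )
    return [(concept, sorted(fields)) for concept, fields in ranked[:top_n]]


links = [
    ("Protein Degradation Mechanisms", "Biology"),
    ("Protein Degradation Mechanisms", "Chemistry"),
    ("Protein Degradation Mechanisms", "Physics"),
    ("Ubiquitin-Proteasome Pathway", "Biology"),
    ("Ubiquitin-Proteasome Pathway", "Chemistry"),
    ("Fixed Cost Analysis", "Economics"),
    ("Fixed Cost Analysis", "Business"),
]
hubs = rank_hub_concepts(links)
for concept, fields in hubs:
    print(f"{concept}: {', '.join(fields)}")
```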

2. Field Influence Analysis

Measures which fields export concepts to other fields and which import them:

  • Counts outgoing CROSS_INSPIRED edges per source field
  • Counts incoming CROSS_INSPIRED edges per target field
  • Identifies net exporters and importers of concepts
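
One way to picture the export/import tally is the sketch below. It assumes each CROSS_INSPIRED edge has already been reduced to a `(source_field, target_field)` pair. The function name `field_influence`, the `net` score, and the sample edges are hypothetical and do not come from the project.

```python
from collections import Counter


def field_influence(cross_inspired_edges):
    """Tally outgoing and incoming CROSS_INSPIRED edges per field.

    `cross_inspired_edges` is a hypothetical input: an iterable of
    (source_field, target_field) pairs.
    """
    edges = list(cross_inspired_edges)
    exports = Counter(source for source, _ in edges)
    imports = Counter(target for _, target in edges)

    influence = {}
    for field in sorted(set(exports) | set(imports)):
        influence[field] = {
            "exports": exports[field],
            "imports": imports[field],
            # Positive net = net exporter, negative net = net importer.
            "net": exports[field] - imports[field],
        }
    return influence


# Made-up edges, for illustration only.
edges = [
    ("Physics", "Chemistry"),
    ("Physics", "Biology"),
    ("Chemistry", "Biology"),
    ("Economics", "Biology"),
]
influence = field_influence(edges)
net_exporters = [f for f, stats in influence.items() if stats["net"] > 0]
net_importers = [f for f, stats in influence.items() if stats["net"] < 0]
```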

3. Temporal Pattern Analysis

Extracts the timeline of cross-disciplinary migration events:

  • Lists all CROSS_INSPIRED edges with timestamps
  • Identifies acceleration trends in cross-field inspiration
  • Tracks which decades had the most cross-pollination
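
The timeline and per-decade counts could be derived roughly as shown here. The sketch assumes each CROSS_INSPIRED edge carries an integer `year`; the actual timestamp format is not specified in this document. The event dictionaries, the function name `summarize_timeline`, and the decade-over-decade difference used as a crude acceleration signal are all illustrative assumptions.

```python
from collections import Counter


def summarize_timeline(events):
    """Sort migration events by year and count them per decade.

    `events` is a hypothetical input: dicts with at least a "year" key.
    Returns (timeline, per_decade, change), where `change` is each
    decade's count minus the previous decade's count.
    """
    timeline = sorted(events, key=lambda event: event["year"])
    if not timeline:
        return [], {}, {}

    counts = Counter((event["year"] // 10) * 10 for event in timeline)
    first, last = min(counts), max(counts)
    # Include decades with zero events so the trend has no gaps.
    per_decade = {decade: counts[decade] for decade in range(first, last + 10, 10)}
    change = {
        decade: per_decade[decade] - per_decade[decade - 10]
        for decade in per_decade
        if decade != first
    }
    return timeline, per_decade, change


# Made-up events, for illustration only.
events = [
    {"source_field": "Physics", "target_field": "Biology", "year": 1994},
    {"source_field": "Chemistry", "target_field": "Biology", "year": 1953},
    {"source_field": "Physics", "target_field": "Chemistry", "year": 1998},
    {"source_field": "Economics", "target_field": "Biology", "year": 2004},
    {"source_field": "Physics", "target_field": "Economics", "year": 1987},
]
timeline, per_decade, change = summarize_timeline(events)
busiest_decade = max(per_decade, key=per_decade.get)
```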

4. Key Pathway Analysis

Finds shortest paths between concepts that cross field boundaries:

  • Uses NetworkX shortest path algorithms
  • Filters for paths of length 3–6 hops
  • Requires at least 2 different fields along the path
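
A minimal sketch of the pathway filter using NetworkX is given below. It assumes an undirected graph whose nodes carry a `type` attribute and, where applicable, a `field` attribute. Those attribute names, the function name `key_pathways`, and the toy graph are assumptions; the real graph schema may differ.

```python
from itertools import combinations

import networkx as nx


def key_pathways(graph, min_hops=3, max_hops=6, min_fields=2):
    """Return shortest paths between concept pairs that cross fields."""
    concepts = [
        node for node, data in graph.nodes(data=True)
        if data.get("type") == "Concept"
    ]
    pathways = []
    for source, target in combinations(concepts, 2):
        try:
            path = nx.shortest_path(graph, source, target)
        except nx.NetworkXNoPath:
            continue  # the two concepts are not connected

        hops = len(path) - 1
        # Collect the fields of every node on the path that has one.
        fields = {
            graph.nodes[node]["field"]
            for node in path
            if "field" in graph.nodes[node]
        }
        if min_hops <= hops <= max_hops and len(fields) >= min_fields:
            pathways.append({"path": path, "hops": hops, "fields": sorted(fields)})
    return pathways


# Toy graph, for illustration only: concepts linked through works.
g = nx.Graph()
g.add_node("c1", type="Concept", field="Physics")
g.add_node("c2", type="Concept", field="Chemistry")
g.add_node("c3", type="Concept", field="Biology")
g.add_node("w1", type="Work")
g.add_node("w2", type="Work")
g.add_edges_from([("c1", "w1"), ("w1", "c2"), ("c2", "w2"), ("w2", "c3")])

pathways = key_pathways(g)
```

In this toy graph only the c1 to c3 path qualifies: the other concept pairs are two hops apart, which is below the three-hop minimum.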

Report Structure

The generated report (output/reports/insight_report.md) contains 6 sections:

  1. Graph Overview — Node/edge counts, node type breakdown
  2. Top Hub Concepts — Ranked list of cross-disciplinary bridge concepts
  3. Field Influence — Import/export analysis by field
  4. Cross-Discipline Timeline — Temporal listing of migration events
  5. Key Pathways — Notable cross-field concept paths
  6. Methodology — How the analysis was performed

Output Files

| File | Format | Description |
|---|---|---|
| insight_report.md | Markdown | Human-readable analysis report |
| insight_report.json | JSON | Machine-readable analysis data |

JSON Structure

{
  "graph_overview": {
    "total_nodes": 97,
    "total_edges": 181,
    "node_types": {"Laureate": 5, "Work": 25, "Concept": 51, ...},
    "edge_types": {"WON_AWARD": 5, "AUTHORED": 25, ...}
  },
  "hub_concepts": [...],
  "field_influence": {...},
  "temporal_patterns": [...],
  "key_pathways": [...]
}
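
A consumer of the JSON file might first check that the five top-level keys shown above are present. The sketch below is a hedged example: the helper `missing_report_keys` is hypothetical, and the inline sample stands in for the real file, with empty placeholders where the structure above is elided.

```python
import json

# Top-level keys taken from the structure shown above.
REQUIRED_KEYS = (
    "graph_overview",
    "hub_concepts",
    "field_influence",
    "temporal_patterns",
    "key_pathways",
)


def missing_report_keys(report):
    """Return the required top-level keys absent from a parsed report."""
    return [key for key in REQUIRED_KEYS if key not in report]


# Inline stand-in for insight_report.json; "key_pathways" is left out
# on purpose to show what the check reports.
sample = json.loads(
    """
    {
      "graph_overview": {"total_nodes": 97, "total_edges": 181},
      "hub_concepts": [],
      "field_influence": {},
      "temporal_patterns": []
    }
    """
)
missing = missing_report_keys(sample)
print(missing)
```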

Running

# Via pipeline
uv run python main.py --phase 6

# Standalone
uv run python -m src.insight_analyzer