LM-Dawn · EXTRAP · class card

LLM Public-Good Cooperative (LM-Dawn class, 2022–present)

infrastructure pace layer · 2022–ongoing

lifespan: 250 yrs

Class card for LM-Dawn civic-cooperative LLM substrates: large language models that combine open weights + cooperative training/compute + nonprofit governance + civic-aligned access (universities, libraries, municipalities, public-interest organizations). This is the OWNERSHIP and ACCESS refinement above capture-resistant-ai-infrastructure-class's open-weights focus.

Distinguished by three structural commitments absent in corporate-open-weight releases:
(1) multi-stakeholder OWNERSHIP: no single corporate entity controls training, weights, or deployment; governance is distributed across academic institutions, nonprofit foundations, government bodies, and civil-society actors;
(2) CIVIC-MANDATE ACCESS: deployment to universities, public libraries, municipal governments, and public-interest organizations is a stated governance obligation (not a market byproduct);
(3) COOPERATIVE TRAINING INFRASTRUCTURE: compute is contributed by public supercomputing facilities, academic consortia, and federated volunteer networks rather than corporate hyperscaler goodwill.

Three sub-families constitute the class:
(1) Open-weight academic cooperatives: BLOOM 176B (BigScience 2022: ~1,000 researchers from 70+ countries, GENCI/IDRIS Jean Zay compute; 176B params; BigScience Open RAIL-M license; full ROOTS corpus disclosure); EleutherAI / GPT-NeoX / Pythia (volunteer collective, 2020+; Apache 2.0; no corporate founding sponsor); OpenAssistant 2023 (LAION + community; open instruction-tuning dataset + model). These instances exhibit the purest civic-cooperative governance form: steering by community consensus, compute from public HPC, outputs under permissive civic licenses.
(2) Decentralized/federated inference cooperatives: Petals (2022+; community-run distributed inference network; contributors donate GPU time to serve large models without requiring server-farm capital); Llamafile (2023+; Mozilla Builders; portable open-weight LLM bundles for one-file civic deployment); Bittensor protocol (2021+; incentivized federated training/inference; ambiguous capture-resistance: tokenomics introduce market dynamics). These instances substitute federated volunteer compute for hyperscaler dependency.
(3) Civic/public-interest AI infrastructures: Public AI Lab (US academic proposal 2024+); Public AI Network (multi-institution 2024+); EU GAIA-X compatible models; Singapore SEA-LION (2024+; regional public model; government-funded); India BharatGPT (2024+; public-funded national model); Trillium initiative (2024 EU sovereign AI framing); municipal AI provisions (city-as-civic-AI-provider proposals: Barcelona, Amsterdam, NYC AI Action Plan); California AB-2013 (2024; Public AI provisions). These instances embed LLMs into public-sector governance architectures.

Adjacency-lift mechanic (Wave-6 Hidalgo generator §3): The civic-cooperative LLM form is NOT a direct derivative of any single DM substrate. It is an adjacency-lift: the capability set required to produce it (open-weight ML engineering + distributed volunteer coordination + nonprofit governance + public-interest mandate) is spread across four on-disk DM substrates:
  • AWS-cloud-infrastructure-2006 (elastic compute template)
  • github-code-collaboration-2008 (distributed-contribution coordination)
  • openai-foundation-model-lab-2015 (LLM training methodology baseline)
  • llm-inference-platform-class (inference delivery pattern)
PLUS pre-existing LM-adjacent machinery from wikipedia-2001 (nonprofit knowledge-commons governance template) and linux-open-source-ecosystem-1991 (volunteer-coordination-without-capital model).
  • φ(BLOOM-bigscience, AWS-2015) ≈ 0.45 (shared: compute abstraction, API standards, ML engineering talent)
  • φ(BLOOM-bigscience, wikipedia-2001) ≈ 0.60 (shared: volunteer coordination, nonprofit governance, knowledge-commons mandate, peer-contribution norm)
  • φ(BLOOM-bigscience, linux-ecosystem) ≈ 0.55 (shared: open licensing, volunteer labor, meritocratic contribution governance, copyleft-adjacent legal structure)
The Hidalgo density ω(civic-cooperative-LLM, DM-corpus) ≈ 0.50 > θ = 0.30 → ADJACENT. The civic-cooperative form is predictable from existing DM capability sets [EXTRAP].

LM mechanism signatures:
  • capture_resistance_index HIGH (0.72 [EXTRAP]): multi-stakeholder governance makes single-actor capture structurally difficult; no shareholder board that can flip to closed weights; nonprofit residence (Wikimedia Foundation template) creates legal-structural barriers to proprietarization.
  • proletarianization_risk HIGH (0.75 [EXTRAP]): deep technical-competence dependency; training requires expert ML teams + funded compute; if community fails to reproduce-train, civic-LLM-substrate degrades to inherited-weights-only (artifact persists, competence thins; the classic Stiegler proletarianization_terminus trajectory).
  • rewilding_fraction LOW (0.05 [EXTRAP]): no biological substrate involvement; this is cognitive-infrastructure machinery.
  • liveness_temporal_coupling HIGH (0.72 [EXTRAP]): model updates per data refresh cycle; community deliberation per governance-mandated release cadence; public-interest bodies require temporal alignment to serve civic decisions.
  • new_nature_density LOW (0.08 [EXTRAP]): civic LLMs operate on existing digital substrate; no novel substrate emergence.

emergence_subtype: crowdsourced [Field missing from Machine model; Phase-0 oversight; placed in description per embedded constraints §C.]
The multi-stakeholder coordination is constituted by volunteer contribution, public-institution contribution, and community governance without a central directing authority.

v0.2 gap: commodity field cannot encode model-weight-bytes-released, fine-tuned-variants-published, training-tokens-processed, civic-deployments, public-interest-applications, federated-inference-nodes; all set null with [STUB] per commodity-enum-gap workaround (Smil ~30-value enum does not cover these LM-native flow types).

pace_layer scalar; primary = infrastructure (civic substrate provision is the binding structural role). notes field rejected on Throughput submodels; absorbed above.
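
The adjacency test in the description (three φ values, a density ω, and a threshold θ) can be sketched in a few lines. The card reports ω ≈ 0.50 but does not say how the φ values are aggregated, so the unweighted mean used here is an assumption made for illustration; the Wave-6 generator may weight substrates differently. The names `PHI`, `THETA`, `density`, and `is_adjacent` are hypothetical.

```python
# Proximity values quoted in the card for BLOOM-bigscience.
PHI = {
    "AWS-2015": 0.45,
    "wikipedia-2001": 0.60,
    "linux-ecosystem": 0.55,
}
THETA = 0.30  # adjacency threshold quoted in the card


def density(phi: dict[str, float]) -> float:
    """Unweighted mean of proximities (assumed aggregation, not the card's)."""
    if not phi:
        raise ValueError("need at least one proximity value")
    return sum(phi.values()) / len(phi)


def is_adjacent(omega: float, theta: float = THETA) -> bool:
    """Strict comparison, matching the card's 'omega > theta' test."""
    return omega > theta


omega = density(PHI)
print(f"omega = {omega:.2f}, adjacent = {is_adjacent(omega)}")
```

Under this assumption the mean comes out near 0.53 rather than the card's 0.50, which still sits well above θ = 0.30, so the ADJACENT classification does not depend on the exact aggregation.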

Machine type

incorporeal

Plasticity

plastic

Substrate

inanimate · cognitive · social · semiotic

Wave source

wave-6-framework-native-generators-hidalgo-adjacency-lift

Inputs

  • public_hpc_compute_allocation
  • multi_stakeholder_researcher_contribution
  • open_training_corpus_and_data_curation
  • cooperative_governance_charters_and_civic_mandate_documents

Outputs

  • open_weight_civic_model_releases
  • civic_cooperative_cognitive_substrate
  • governance_templates_and_civic_ai_standards

Landscape pressures

  • compute_concentration_training_dependency (88% intensity)
  • governance_coordination_fragmentation (72% intensity)
  • corporate_open_weight_narrative_capture (65% intensity)

Cross-era couplings

State variables

capture_resistance_index: 0.72 [EXTRAP]
proletarianization_risk: 0.75 [EXTRAP]
liveness_temporal_coupling: 0.72 [EXTRAP]
rewilding_fraction: 0.05 [EXTRAP]
argument_of_progress_adoption: 0.65 [EXTRAP]
self_organized_criticality_proximity: 0.42 [EXTRAP]
gravitational_weight: 0.58 [EXTRAP]
machine_lifespan: 250
regime: chaotic [EXTRAP]
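
Several of these scalars are also quoted inline in the card's description, so the two copies can drift apart. A minimal consistency check, assuming the values are transcribed by hand into two dictionaries (the names `STATE_TABLE`, `DESCRIPTION`, and `compare` are hypothetical and not part of any Machine model schema), might look like this:

```python
# Scalar state variables as listed in this section.
STATE_TABLE = {
    "capture_resistance_index": 0.72,
    "proletarianization_risk": 0.75,
    "liveness_temporal_coupling": 0.72,
    "rewilding_fraction": 0.05,
    "argument_of_progress_adoption": 0.65,
    "self_organized_criticality_proximity": 0.42,
    "gravitational_weight": 0.58,
}

# Values quoted under "LM mechanism signatures" in the description.
DESCRIPTION = {
    "capture_resistance_index": 0.72,
    "proletarianization_risk": 0.75,
    "rewilding_fraction": 0.05,
    "liveness_temporal_coupling": 0.72,
    "new_nature_density": 0.08,
}


def compare(described: dict[str, float], table: dict[str, float]):
    """Return (values that disagree, names described but absent from the table)."""
    mismatched = {
        name: (value, table[name])
        for name, value in described.items()
        if name in table and table[name] != value
    }
    missing = sorted(name for name in described if name not in table)
    return mismatched, missing


mismatched, missing = compare(DESCRIPTION, STATE_TABLE)
print("mismatched:", mismatched)
print("missing from state table:", missing)
```

On the values transcribed here the shared entries agree, and the only discrepancy is that new_nature_density appears in the description but has no row in this section.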

Phase snapshots

LM-Dawn · 2020–2023 · chaotic
LM-Dawn · 2023–2026 · chaotic

Notable instances

  • BLOOM 176B (BigScience, 2022) (2022) — The canonical civic-cooperative instance: 1,000+ researchers from 70+ countries; GENCI/IDRIS Jean Zay compute (French go…
  • EleutherAI (GPT-Neo / GPT-NeoX / Pythia, 2020–present) (2020) — EleutherAI founding volunteer collective (2020); GPT-Neo (2021, Apache 2.0); GPT-NeoX 20B (2022, Apache 2.0); Pythia sca…
  • Petals Distributed Inference Network (2022–present) (2022) — Community-run distributed inference network (Borzunov et al. 2022, arXiv:2209.01188). Contributors donate GPU time to se…
  • Llamafile (Mozilla Builders, 2023–present) (2023) — Mozilla Builders / Justine Tunney (2023): portable open-weight LLM bundles as single executable files. Any civic actor (…
  • SEA-LION (AI Singapore, 2024–present) (2024) — South East Asian Languages In One Network: government-funded regional public model (AI Singapore); covers Southeast Asia…
  • CommonCrawl Foundation (2007–present; precursor data-coop) (2007) — Open web-crawl data cooperative: nonprofit providing open dataset of web crawls used in virtually every LLM pretraining …

Sources

  • Rao (2024). World Machines — civilizational-era framing
  • Wave-6 (2026). framework-native-generators/findings.md §3 Hidalgo adjacency-lift
  • BigScience Workshop (2023). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model · 88%
  • Stiegler, Bernard (2016). Automatic Society vol.1
  • Wave-0 (2026). world-machines-eras findings.md (LM definitions)
  • EleutherAI (2020). The Pile: An 800GB Dataset of Diverse Text for Language Modeling · 82%