AI's Impact on Open Source at UN Open Source Week
Moderator: Vipul Siddharth, Open Source Lead, UNICEF Office of Innovation
Panelists:
- Amreen Taneja: Standards Lead, Digital Public Goods Alliance
- Daniel Alvarez: Innovation Manager & Data Science Lead, UNICEF Office of Innovation
- Frederik Blachetta: Chief Technology Officer, Bundesdruckerei Group
- Taylor Downs: Founder & CEO, Open Function Group
Framing (Vipul)
Vipul opened on the central contradiction: AI is built on the shoulders of open source (libraries, frameworks, ML tooling) and is simultaneously expanding open source while potentially eroding its principles. He set three fault lines for the discussion: licensing and trust (what license governs a model's derivative output, and can you trust code whose provenance you don't know), maintainer capacity (whether AI-generated contributions shift more review and accountability burden onto already burnt-out maintainers), and funding and sustainability (how open source survives if AI can produce any software cheaply). He noted the panel deliberately spanned a funder/builder, a standards-setter, a UN implementer, and a government-tech leader, and said he hoped they'd disagree.
Introductions: where AI quality shows up
Daniel Alvarez prototypes and develops open source, AI-enabled technologies for countries. His take: whether AI helps or harms open source "broadly just depends." It clearly speeds development and prototyping, but raises multifaceted issues of confidence, accountability, and trust, especially since these prototypes may be scaled nationally.
Taylor Downs leads OpenFN, the leading digital public good (DPG) for public service automation and AI orchestration, with deployments in ~50 countries. He framed OpenFN as open source middleware: it lets governments leverage cutting-edge proprietary AI without plugging it directly into their systems of record, preserving a degree of sovereignty and agency.
Amreen Taneja is Standards Lead at the DPGA. She sees AI helping with coding, documentation, and testing, lowering the barrier to contribution, but raising hard questions about data provenance, data security, and contribution quality, plus a governance question about how OSI-style open source is being used in the AI ecosystem and what responsible use should mean. That makes the standards and safeguards work more important than ever.
Frederik Blachetta is CTO and MD of a German government-owned tech entity with a large digital division, and a four-year veteran of this conference. His org works in AI innovation and contributes to open source and open source AI.
Q1 — Is openness alone enough to create trust? (Frederik)
Frederik: open source is necessary but not sufficient to guarantee trust, and openness in AI is fundamentally different from openness in software. The old equation (open source = readable code = portable systems) breaks down: you can release weights and still not know the training data or what's happening inside, and you can't audit a neural network the way you audit code. Trust therefore needs more than access: documentation of training data and methodology, real-world performance evaluation, and clear accountability for failure. Open source AI is a starting point, not a quality signal.
His key point for governments: a model that scores well on global benchmarks can fail completely in a specific administrative language, jurisdiction, or use case, so every agency must build local evaluation capacity. His framing: open source gives you the right to look; evaluation gives you the ability to judge. He pointed to a German public-sector LLM evaluation framework his team released under MIT license, built on the idea of evaluating models for your own context rather than by provider reputation or size, and threw the question back to the room: who builds that evaluation capacity, and how do we build it together?
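Frederik's distinction between global benchmarks and local evaluation can be illustrated with a minimal sketch. This is not the German framework he mentioned; the test cases, the keyword-match scoring, and both stand-in "models" are invented for illustration, and a real evaluation would call actual models and use far richer scoring.

```python
from typing import Callable, Dict, List, Tuple

ModelFn = Callable[[str], str]

# Hypothetical local test set: (prompt, phrase a correct answer must contain).
# The content is made up; an agency would write cases from its own domain.
LOCAL_CASES: List[Tuple[str, str]] = [
    ("Which office issues a residence registration certificate?", "registration office"),
    ("What is the deadline to object to an administrative decision?", "one month"),
]


def evaluate(model: ModelFn, cases: List[Tuple[str, str]]) -> float:
    """Return the fraction of cases whose expected phrase appears in the answer."""
    if not cases:
        raise ValueError("at least one test case is required")
    hits = sum(
        1 for prompt, expected in cases
        if expected.lower() in model(prompt).lower()
    )
    return hits / len(cases)


def rank_models(models: Dict[str, ModelFn],
                cases: List[Tuple[str, str]]) -> List[Tuple[str, float]]:
    """Rank models by local score, ignoring provider reputation or size."""
    scores = {name: evaluate(fn, cases) for name, fn in models.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Stand-in models (assumptions): plain functions instead of real model calls.
def big_generic_model(prompt: str) -> str:
    return "Please contact your local authority."


def small_local_model(prompt: str) -> str:
    if "residence" in prompt:
        return "The local registration office issues it."
    return "You have one month to file an objection."


ranking = rank_models(
    {"big_generic_model": big_generic_model, "small_local_model": small_local_model},
    LOCAL_CASES,
)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The design choice worth noting is that the ranking depends only on the local cases, so a model that looks strong elsewhere gets no credit it has not earned in this context.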
Q2 — What does responsible openness look like for DPGs? (Amreen)
Amreen: it starts with openness plus safeguards by design. The DPG Standard assesses solutions across four categories (AI systems, open software, open content, open data) against nine indicators, including documentation, platform independence, and privacy/data security. Core stance: privacy and security can't be an afterthought; they must be built in from the outset.
Eighteen months of work split into two areas. Mandatory privacy-by-design principles now cover data minimization, user consent mechanisms, transparency on collection/use/access, retention and deletion policies, access controls, and an overseeing governance mechanism with documentation. A second set of best practices is encouraged but not required (data protection officer, ethical review board, compliance documentation). Important scope note: the standard assesses design and development, not implementation, since implementation is often beyond the developer's control. The higher bar buys enhanced trust, which is what matters in this ecosystem.
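The split between mandatory principles and encouraged best practices can be pictured as a simple checklist. The sketch below is only an illustration of that two-tier structure: the key names and the `review` function are hypothetical and are not taken from the DPG Standard or any DPGA tooling.

```python
from dataclasses import dataclass
from typing import Dict, List

# Mandatory privacy-by-design principles, as listed in the session notes.
MANDATORY: List[str] = [
    "data_minimization",
    "user_consent",
    "transparency",
    "retention_and_deletion",
    "access_controls",
    "governance_mechanism",
]

# Encouraged but not required best practices.
RECOMMENDED: List[str] = [
    "data_protection_officer",
    "ethical_review_board",
    "compliance_documentation",
]


@dataclass
class ReviewResult:
    passes: bool                    # True only if every mandatory item is evidenced
    missing_mandatory: List[str]
    missing_recommended: List[str]  # reported, but never blocks a pass


def review(evidence: Dict[str, bool]) -> ReviewResult:
    """Check a submission's evidence map against both tiers."""
    missing_mandatory = [k for k in MANDATORY if not evidence.get(k, False)]
    missing_recommended = [k for k in RECOMMENDED if not evidence.get(k, False)]
    return ReviewResult(
        passes=not missing_mandatory,
        missing_mandatory=missing_mandatory,
        missing_recommended=missing_recommended,
    )


# Hypothetical submission: everything mandatory except access controls.
submission = {name: True for name in MANDATORY}
submission["access_controls"] = False
result = review(submission)
print(result.passes, result.missing_mandatory)
```

Consistent with the scope note above, a check like this only covers what a developer can document about design and development, not how a deployment is actually run.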
Q3 — Standards vs. practice; and more code, or more good code? (Daniel)
Vipul asked Daniel where standards and policy miss the gaps that show up in real implementation, and whether AI just produces more code or more good code.
On the gap: policy circles fixate on bias, explainability, and data sovereignty, all important but abstract. The real struggle is operational and institutional, specifically provenance: who can reconstruct what training data touched which output? In a UN context those outputs land in donor reports, interagency briefings, and country strategy documents, and "we fine-tuned a transformer we found on a hub" isn't a governance-grade answer when a member state asks how a figure was produced. General-purpose models handle open humanitarian data well until the edge cases that define this work (conflict contexts, non-Latin scripts, administrative geographies that don't match ISO standards), where the failure mode is plausible-sounding nonsense. That's structural, not an error, so the fix is context-appropriate models and tools.
On code: the honest answer is AI produces more code, and lets inexperienced people build easily. But the established DPG success stories (Primero in child protection across 60+ countries, DHIS2 in health systems across 80+ countries) are pre-ChatGPT, built on real maintainer communities and years of institutional adoption. The current AI-assisted cohort is judged on different benchmarks: faster and more technically polished, enough to clear seed stage, but not necessarily mature on reproducibility, community governance, documentation, or sustainability in low-resource settings. His worry: shallow implementations dressed up as more complete than they are, and no clear way to sustain the institutional investment AI tools don't provide.
On AI's variance in impact, not just output (Taylor)
Taylor: beyond the well-discussed variance in output quality (slop vs. well-architected), there's under-examined variance in AI's impact on its users. He's watched some people around him take less responsibility and grow less curious, almost atrophying, farming out whole parts of their work. His analogy: cheap digital storage led education to drop rote memorization for "synthesis," and now synthesis itself is being offloaded. Others use AI deliberately and at real cost, slowing adoption and bringing engagement and responsibility, which both raises quality and builds institutional capacity. On balance, AI has accelerated Open Function Group's delivery of high-quality software, but only through absolute discipline, focus, and repetition.
Cybersecurity: offense or defense? (Taylor, with floor exchange)
Taylor framed the moment via a powerful new model that was tested, flagged on bio/cyber risks, briefly released with guardrails, then shut down by the US government after 48 hours. The morning's optimistic framing was that capable models let public-interest providers defend; Taylor pushed back that it tips toward offense, since offense only has to win once and defense has to win every time. His scary scenario: a winnowing of open source projects deemed secure enough for national-scale infrastructure, if defending against AI-designed attacks comes to require something like $10M/year in cyber budget that even widely deployed DPGs don't have. The "many eyes" / herd-grazing protection helps but could erode.
A floor participant offered the opposite read: AI can drive a more secure ecosystem, citing an AI agent that found a real defect in strongSwan (a VPN stack) and supplied the patch while finding nothing on the macOS/Windows native stacks. Frederik agreed only partly, stressing the geopolitical stakes and that this is a genuine tipping point requiring visibility for policy leaders.
Proprietary lock-in and sovereignty (Frederik)
Frederik: proprietary AI is dangerously easy to adopt because it comes "in a box," while government maturity to assess alternatives is often low. He cited the XZ Utils backdoor (caught only by accident) as a reminder that supply-chain risk cuts both ways and will grow. The path forward: maturity, raising knowledge levels, and modular, multi-model approaches as things go agentic. He pointed to the Linux Foundation's agentic AI work and to open source AI companies (e.g., Mistral and Black Forest Labs, the latter a German open source company) appearing at the G7 table as signs of a political shift driven by dependency risk becoming undeniable. His summary: open source AI can be a sovereignty asset, but only if genuinely governed and evaluated; otherwise it's just a new form of lock-in.
Open standards vs. standards (Amreen)
Amreen drew a nuance: ISO-type standards differ from open standards. DPG changes affect a whole ecosystem applied in sensitive contexts, so they go out for public comment rather than being decided by 20 people in a room. Top-down direction is necessary but must be balanced by community input that genuinely shapes the changes.
Updating the DPG standard (Amreen)
A governance procedure routes grievances through a standard council, with inputs via GitHub issues, direct outreach, or countries flagging infrastructural, privacy, or data-security gaps where local legislation isn't mature. An 18-month privacy expert group (10+ neutral experts from the highest-application countries) informs this. The persistent tension: keep the process rigorous without making it too heavy for applicants from countries lacking infrastructure or policy support.
Funding and sustainability (Taylor)
Taylor: reducing big-tech lock-in must not create new lock-in to philanthropy, big consulting, or anything else. Open Function Group now runs roughly equal earned and grant revenue, and the fix is genuinely diverse funding. His deeper point from summit room-polls: people broadly agree the price of producing software will approach zero within five years, yet only ~10% believe we'll have solid open source critical national infrastructure everywhere within ten years, and these are the people who build it for a living. Why? Software has always been cheaper than trustworthy, publicly accountable institutions. As software cost falls toward zero, the cost of the trust layer (maintenance, reliability, accountability guarantees) only rises. That's the real funding challenge.
Floor provocation: should the community build its own LLM?
A floor participant asked whether the DPG/DPI community should build its own model rather than trust frontier American or open-weight Chinese models whose network calls they haven't traced, and later asked how you validate data provenance when contributors use AI models you can't access.
- Vipul noted the DPG standard for AI models already goes beyond what OSI accepts as open source AI.
- Amreen reinforced that DPGs are "open source plus plus," requiring SDG contribution, privacy and data security, and platform independence, considerations absent elsewhere, which is itself proof that trust can be built on top of open software.
- Frederik argued maybe the frontier model isn't what's needed: use-case-specific models, narrow AI, or small language models, evaluated against standardized protocols for local context. Compute and memory costs make competing on the largest models near-impossible, and chasing it is a "red race" worth countering.
- Taylor offered a third way: preserve sovereignty by driving switching costs toward zero. Real agency requires both the technical ability to switch models and the substantive freedom to do so (good local evals, and knowing the actual impact of a switch on a specific government service). Commoditizing the big models removes their power even if a few remain dominant. A floor view held the UN still needs a model of its own, noting that delegating tasks across models by capability level is already industry standard.
- Daniel grounded it in infrastructure: frontier-scale models consume megawatt-hours many target-country grids can't support, so inference, hosting, and large-model development are impractical there. The economics differ entirely for maintainers in places like Kampala or Kathmandu, against ~40% budget cuts and agencies that can't hire or retain top talent.
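Taylor's two conditions for low switching costs, the technical ability to switch and the evidence to switch safely, can be sketched as a thin routing layer. Everything here is a hypothetical illustration (the `ModelRouter` class, the provider functions, and the scoring rule are invented), not a description of how OpenFN or any other product works.

```python
from typing import Callable, Dict

ModelFn = Callable[[str], str]


class ModelRouter:
    """Application code calls complete(); the model behind it can be swapped."""

    def __init__(self, models: Dict[str, ModelFn], active: str) -> None:
        if active not in models:
            raise KeyError(active)
        self._models = dict(models)
        self.active = active

    def complete(self, prompt: str) -> str:
        return self._models[self.active](prompt)

    def switch(self, name: str, score: Callable[[ModelFn], float],
               min_score: float) -> bool:
        """Switch only if the candidate clears a local evaluation threshold."""
        candidate = self._models[name]  # raises KeyError for unknown names
        if score(candidate) < min_score:
            return False
        self.active = name
        return True


# Stand-in providers (assumptions): real ones would wrap hosted or local models.
def provider_a(prompt: str) -> str:
    return "A: " + prompt


def provider_b(prompt: str) -> str:
    return "B: " + prompt


def provider_c(prompt: str) -> str:
    return ""  # simulates a model that fails in the local context


def local_score(model: ModelFn) -> float:
    """Toy local eval: does the model return a non-empty answer to a probe?"""
    return 1.0 if model("probe").strip() else 0.0


router = ModelRouter(
    {"provider_a": provider_a, "provider_b": provider_b, "provider_c": provider_c},
    active="provider_a",
)
switched = router.switch("provider_b", local_score, min_score=0.9)
rejected = router.switch("provider_c", local_score, min_score=0.9)
print(router.active, switched, rejected)
```

Because callers depend only on `complete`, changing provider is a one-line operation, and the evaluation gate stands in for knowing the impact of a switch before making it.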
Closing one-liners
Vipul asked everyone to finish with a one-liner: "One standard, policy, or funding change you want in place by the next Open Source Week." Anyone who went beyond one line would owe him a beer.
- Frederik: adoption is running far ahead of accountability; push maturity into every sector now and get decision-makers into genuine multi-stakeholder dialogue.
- Amreen: a DPG maturity index already exists, so engage with the open standard and contribute via GitHub; the hope for next year is stronger AI governance standards that enhance responsible use and give back to the open source ecosystem AI depends on.
- Taylor: reject the false dichotomy between embedding proprietary AI into government and being left behind. His example: a Dominican Republic proactive-services initiative (hosted this week by Minister Freund and Vice Minister Manzueta) uses OpenFN plus powerful proprietary models at build time to cut build costs ~90%, but the resulting systems run offline rather than calling out at runtime. There's a smart middle ground.
- Daniel: go back to first principles: who does this benefit? AI tooling was built for developers, but DPG solutions serve low-resource, complex deployment settings with maintainers already at or past capacity. AI doesn't fix the underlying governance and capacity problems and may flood the dynamic; don't let "winning the AI race" contaminate the work.
Vipul is now owed four beers; he doesn't drink. Fin.