Are Users Compliant? Enterprise AI, Scholarly Retrieval, and the Obligations You Cannot Outsource

Frictionless AI retrieval creates a dangerous illusion: if Unpaywall found a legal open-access copy and Claude summarized it, the workflow must be compliant. It may not be. Organizations using Claude or similar tools for scholarly workflows remain responsible for lawful access, license compliance, data protection, and platform terms—even when OA discovery tools and AI vendors make retrieval feel automatic.

Are Creators Actually Paid? Open Access, APCs, and the AI Economics Gap

Open access solved a reader-access problem: for openly published work, paywalls no longer block researchers from reading scholarship. It did not automatically solve a creator-payment problem, and AI retrieval at scale may shift economic value toward intermediaries unless licensing and funding models explicitly account for new uses. When AI systems ingest OA literature, do creators and publishers actually get paid?

Where Is the Compliance Boundary? Open Access, Circumvention, Scraping, and Terms of Service

The compliance line in AI-assisted scholarly retrieval is not drawn at “open versus closed.” It runs between locating publisher-authorized open-access copies and any practice that bypasses technical or contractual access controls, exceeds license scope, or violates site or API terms—even when an AI system could technically retrieve the bytes. Here we map where lawful OA ends and circumvention, scraping, and policy violations begin.

Can Claude Use Unpaywall Legally? Access, Discovery, and the First Compliance Question

When teams wire Claude or another AI assistant into scholarly research workflows, the first technical question is usually practical: can we use Unpaywall to find full-text papers? The first compliance question is harder: does discovery through an open-access index grant permission to copy, store, summarize, or commercialize what we retrieve? Unpaywall can lawfully help locate publisher-authorized or repository-hosted open-access copies of scholarly articles, but using that discovery in an AI retrieval workflow still leaves separate, and often stricter, questions about copying, terms of service, licensing, and downstream reuse.

From Tokenmaxxing to Economic Governance: The 2026 AI Roadmap for CTOs Who Want a 2027 Budget

Over the past eight posts, this series has examined the 2026 AI token economy from seven distinct angles: the Uber budget collapse, the physics of context scaling, recursive agent loops, per-minute pricing disruption, KV cache optimisation, infrastructure power constraints, and the local inference break-even. Each post was a close-up on a specific failure mode or opportunity. This final post is the wide-angle view: a synthesis of the full series into a governance framework that translates individual insights into organisational practice. The thesis is simple: the era of “tokenmaxxing” (deploying AI at maximum capability without cost discipline) is over. The organisations that will thrive in 2027 are those that implement economic governance over their AI stacks before their next budget cycle, not after.

The Local Inference ROI: 4-Bit Quantization, SLMs, and the Case for Bypassing the API

Every post in this series has examined a different dimension of API cost: token pricing, context scaling, agentic multiplication, per-minute billing, cache optimisation, and infrastructure constraints. The implicit assumption throughout has been that API inference is the only option — that your choice is which provider, which model, and which optimisation technique to apply within the API billing model. In 2026, that assumption requires re-examination. Small Language Models (SLMs) combined with 4-bit quantisation have reached a capability and cost profile that makes local inference economically rational for a well-defined and growing class of enterprise workloads. Understanding when to bypass the API entirely is now a first-order strategic decision, not a research-stage exploration.

Power as the New Token: Gartner's $1.37 Trillion Infrastructure Bet and the Physics of AI at Scale

Every discussion of AI cost in 2026 eventually arrives at the same upstream constraint: electricity. The token prices on every API pricing page, the per-minute rates, the per-seat subscriptions — they are all downstream of a physical fact that no software optimisation can dissolve. Training and running large language models requires power at a scale that is straining the capacity of data centres, national grids, and the global supply chains for the hardware that converts electricity into inference. Gartner’s forecast of $1.37 trillion in AI infrastructure spending by 2026 is not a number about software or services — it is primarily a number about construction, cooling, and electrical generation. Understanding this layer is essential for any CTO who wants to reason accurately about the medium-term trajectory of AI costs.
