Markus Anderljung

How technical safety standards could promote TAI safety

8/9/2022

Cullen O’Keefe, Jade Leung, Markus Anderljung[1] [2]

Summary
Standard-setting is often an important component of technology safety regulation. However, we suspect that existing standard-setting infrastructure won’t by default adequately address transformative AI (TAI) safety issues. We are therefore concerned that, on our default trajectory, good TAI safety best practices will be overlooked by policymakers due to the lack or insignificance of efforts which identify, refine, recommend, and legitimate TAI safety best practices in time for their incorporation into regulation.

Given this, we suspect the TAI safety and governance communities should invest in capacity to influence technical standard setting for advanced AI systems. There is some urgency to these investments, as they move on institutional timescales. Concrete suggestions include deepening engagement with relevant standard setting organizations (SSOs) and AI regulation, translating emerging TAI safety best practices into technical safety standards, and investigating what an ideal SSO for TAI safety would look like.

Standards Help Turn Technical Safety Discoveries Into Legal Safety RequirementsA plausible high-level plan for achieving TAI safety is to (a) identify state-of-the-art technical safety and security measures that reduce the probability of catastrophic AI failures, then (b) ensure (such as by legal mandate) that actors at the frontier of AI development and deployment adopt those measures.
This general structure of first identifying and then mandating safety measures is obviously not unique to AI. How do lawmakers choose which substantive safety measures to legally mandate for other technologies? Several options are possible and used in practice, including encoding such requirements directly into legislation, or delegating such decisions to regulatory agencies. One common strategy is to have the law incorporate by reference (i.e., “point” to) existing technical safety standards[3] previously developed by private standard-setting organizations (“SSOs”). Another strategy, common in the EU, is to first pass generally-phrased regulation, and later have the regulation operationalized via standards developed by SSOs.[4]

Standardization accomplishes several important things. First, it provides a structured process for a consensus of technical safety experts to identify and recommend the best, well-tested technical safety ideas. As a result, policymakers have to spend less time developing governmental standards and exercise less non-expert judgment about which safety requirements should be adopted. Notably, standards can also be updated more rapidly than regulation, due to lower bureaucratic and legal overhead, therefore making it possible to keep more apace with technical developments. Second, standardization takes emerging safety practices that are under-specified or heterogeneous and restates them in a precise, consistent, and systematized form that is more readily adoptable by new actors and appropriately clear for a legal requirement. Supranational SSOs provide a routinized and reliable infrastructure for facilitating international harmonization and regulation via standards. Finally, well-structured standard-setting organizations (“SSOs”) operate on the basis of multistakeholder consensus, and therefore both aim to generate and provide evidence of politically viable standards.

In the US, the path from standardization often roughly follows a pattern of:

Informal, loose networks of industry safety experts identify, develop, and converge on safety-promoting best practices.
Private[5] SSOs elevate some of these best practices into standards, through a well-defined, multistakeholder, consensus-driven process with procedural safeguards (such as open and equitable participation, a balance of represented parties, and opportunities for appeal).[6]
Assuming the government passes regulation for which some of these standards are appropriate, this then provides a route via which these standards are incorporated into domestic law.[7] [8]
International bodies like the ISO attempt to harmonize standards across countries, as well as between SSOs; via these mechanisms, standards developed in e.g. the US could eventually have international impact.

To be clear, we do not necessarily think this is the best way to approach technology regulation. Our claim is primarily empirical: that privately developed standards are one of the main (and in the US, legally preferred) sources of mandated safety measures, and are likely to remain as such. There are substantial downsides with this approach, such as:

Increased risk of industry capture, since industry employees are heavily represented in SSOs.
A built-in preference for uniformity over experimentation and competition in regulatory approaches.
Slow and bureaucratic processes for setting standards (though less slow and bureaucratic than many governmental processes).
Reduced democratic accountability and participation, since SSOs are private organizations.
Lack of access to the incorporated standards, since the standards often cost hundreds of dollars each to access.[9] Importantly, we also think that standardization can be a useful lever for safety even if those standards are not incorporated into hard law. Established safety standards can establish a natural normative “floor” against which AI developers (especially those represented in the standard-setting process) can be evaluated. Special antitrust protections for bona fide standard-setting activities makes standard-setting a less risky way for labs to jointly work on safety.[10] Standardization of informal and heterogeneous safety best practices can lower the cost of adopting such practices, leading to broader coverage.[11] Standards can also form the substantive primitives for private certification and auditing schemes.

Emergence of Consensus AI Safety Best PracticesPart of what excites us about standardization as a tractable approach to TAI governance is the increasing emergence of best practices in AI safety with increasingly broad buy-in. For example, a number of industry, academic, and civil society actors appear to endorse and/or are willing to discuss some fairly concrete measures to improve alignment, safety, and social impact throughout the AI lifecycle, including (but not limited to):

A variety of social and technical measures to limit harms from commercialized large language models
Use of reinforcement learning from human feedback to align model behavior with human preferences[12]
Responsible publication of advances in AI, including discussion of ethical impacts and disclosure of ethically relevant model construction and behavior.

We think these measures may be good candidates for formalization into standards in the near future. As AI safety and policy research matures, currently theoretical, vague, or nascent ideas may mature into consensus best practices, adding to the list of candidates for standardization. Of course, the goal of existential-risk focused AI safety research is to eventually produce training and testing methods that can, when applied to an AI system, reliably improve that system’s alignment with human values. We hope that such methods will be (or could be made) sufficiently clear and universalizable to make into legally appropriate standards.

AI Standardization Today
Standardization may be an appropriate next step for some (but by no means all)[13] consensus best practices.

A number of SSOs currently develop standards relevant to AI safety. For example, the International Organization for Standardization (“ISO”) and International Electrotechnical Commission (“IEC”) run a joint subcommittee on AI, which has promulgated standards on AI trustworthiness, robustness, bias, and governance. The Institute of Electrical and Electronics Engineers (“IEEE”) has also promulgated a number of AI standards. The U.S. National Institute of Standards and Technology (“NIST”) is developing an AI Risk Management Framework.

Best practices, standardization, and the complementary process of conformity assessment are beginning to play an important role in the regulation of AI. The Federal Trade Commission has repeatedly implied that compliance with best practices and “independent standards” in ethical AI may be required by—or at least help evidence conformity with—various laws they enforce. In its Inaugural Joint Statement, the U.S.–EU Trade and Technology Council announced an intent to prioritize collaboration on AI standard-setting. Standardization and conformity assessments for certain high-risk AI systems play an important role in the proposed EU Artificial Intelligence Act. In short, governments appear poised to rely heavily on standardization for AI regulation.

Actionable Implications for the TAI Safety and Governance Communities
Our core thesis is that technical AI safety standards can and will be the building blocks for many forms of future AI regulation. We’ve laid out the case briefly above; additional analysis and refinement of this thesis could be valuable. If this thesis is true of the most existentially important forms of AI regulation, this has important and actionable implications for the TAI safety and governance communities, many of which were presciently identified by Cihon (2019). Thus, this post serves as a renewed call to take AI safety standardization seriously. Concretely, we have several ideas on how to do this in the near- and medium-term.

First, safety-conscious AI practitioners should consider advancing standardization of TAI-relevant safety best practices. Although we are aware and appreciative of several TAI-concerned individuals who have participated in AI safety standard-setting, we suspect that TAI-focused perspectives are still underrepresented in the processes of the various SSOs already developing AI safety, security, and governance standards. While this might not be a problem today, if those standards are increasingly relied upon by policymakers for substantive AI regulation, TAI perspectives and priorities might not be adequately represented or considered legitimate, and we won’t have routes to promote TAI safety best practices once they are discovered. We therefore renew Cihon (2019)’s call for strategic engagement between the TAI safety communities and AI SSOs. For example, (more) AI safety researchers may consider joining the membership of such SSOs, and serving on relevant committees.[14]

For similar reasons, TAI safety researchers and practitioners should consider engaging seriously with regulatory efforts in jurisdictions where regulation typically precedes standards. This notably includes forthcoming EU AI regulation and accompanying standard-setting processes, especially if we should expect such regulation to diffuse globally.

As the TAI safety community converges on best practices for frontier systems, we should proactively push for them to be refined into technical standards. An intermediate step here might look like creating fora where safety practitioners from across organizations can easily share and refine safety best practices and other lessons learned,[15] then sharing these publicly in concrete form.
We’d also encourage proper analysis of the adequacy of current AI-relevant SSOs. If it seems they might be inadequate at dealing with TAI safety issues, we should get to work investigating what new SSOs tailored to TAI safety issues might look like. Ideal features of such an SSO would likely include:

Disciplined focus on TAI safety issues, with supporting institutional rules (e.g., bylaws) and aligned leadership to maintain that focus.
Exceptionally high transparency and accessibility (e.g., technical standards are freely available, including in multiple relevant languages, for easy use, reference, and critique).[16]
Ability to very rapidly initiate or update standards.
Calibrated communication of the safety value of standards (i.e., communicating by how much does this standard, when properly applied, reduce worst-case risks).
Design of multiple layers of standards, including organization- and process-level standards (e.g., organizations are required to make extensive good-faith efforts to identify and remedy safety issues, and are not permitted to infer that their systems are safe merely because they’ve checked off a list of object-level system requirements).
Low implementation costs for safety standards., such as through provision of how-to guides for implementation.
ANSI accreditation, for credibility and legitimacy reasons.

Like many things in governance, learning to influence and implement standardization well will require iteration and experience. We shouldn’t assume that we can simply “tack on” standardization after discovering AI safety solutions. We suspect that such solutions will be more consistently, quickly, and smoothly adopted and eventually legally codified if there is already a nimble, well-functioning, respected, legitimate, TAI-oriented standardization infrastructure to translate our best collective safety measures into standards. Creating such an infrastructure will take time, but seems tractable if we invest our efforts efficiently and strategically in this space.

Conclusion
To summarize:

Technical standards form the fundamental building blocks of many technology regulation regimes, and could plausibly form the fundamental building blocks of TAI-relevant regulation.
Given (1), the TAI safety and governance communities should ensure that there exists SSOs that can efficiently elevate AI safety best practices into technical standards that are, in substance and form, appropriate for legal and regulatory use.
It’s not clear that long-term safety priorities are currently well represented in existing AI standard-setting efforts, or that the current structure and procedures of AI-relevant SSOs are appropriate to the challenges that TAI may pose.
If existing SSOs are inadequate, we should have a plan for improving existing SSOs or creating new ones, particularly to ensure they are focused on and nimble to the evolving challenges of advanced AI safety. This would take several years.
We may therefore wish to start investing in answering (3)—and then possibly working on (4)—soon.

Concrete steps we propose include:

TAI safety researchers and practitioners should consider joining and influencing existing TAI-relevant SSOs, both for the object-level reason of improving AI safety standard-setting and for the purpose of learning more about AI safety standard-setting.
For similar reasons, TAI safety researchers and practitioners should consider engaging with regulatory efforts in jurisdictions where regulation typically precedes standards.
TAI safety researchers should actively drive for convergence on what best safety practices are for frontier systems, and refine those best practices into technical standards that would be suitable for integration into law.
TAI safety and governance researchers and practitioners should analyze whether existing AI SSOs are adequate for the needs of TAI standard-setting, including analyzing which standardization processes are going to be most important to influence today[17]. If existing efforts seem likely to be inadequate, we should design and possibly build new standard-setting infrastructure. If you are interested in working on this and think that we could help, or have valuable insight regarding AI safety standard-setting, please reach out to us at tai-standards[at]googlegroups.com

Notes

Thanks to Jonas Schuett, Joslyn Barnhart, Miles Brundage, and Will Hunt for comments on earlier drafts of this post. All views and errors our own.
This post is written in our individual capacities, rather than in our capacities of employment or affiliation with particular organizations.
“Standardization” is defined as “[c]ommon and repeated use of rules, conditions, guidelines or characteristics for products or related processes and production methods, and related management systems practices.” Off. of Mgmt. & Budget, Exec. Off. of the President, OMB Circ. No. A-119, Federal Participation in the Development and Use of Voluntary Consensus Standards and in Conformity Assessment Activities § 3(a) (1998) (hereinafter “1998 Circular A-119”), https://perma.cc/Y32D-R2JQ. Standards can include “[t]he definition of terms; classification of components; delineation of procedures; specification of dimensions, materials, performance, designs, or operations; measurement of quality and quantity in describing materials, processes, products, systems, services, or practices; test methods and sampling procedures; or descriptions of fit and measurements of size or strength.” Id. We are here primarily focused on standards that attempt to improve safety. Other standards (perhaps most) are focused on promoting interoperability or reducing information costs.
In the EU, this responsibility typically falls on the “European Standards Organizations” (ESOs), some of which work on requests from the EU Commission, e.g. in preparation of forthcoming regulation such as the AI Act. The most important ones are the European Committee for Standardisation (CEN), European Committee for Electrotechnical Standardisation CENELEC) and European Telecommunications Standards Institute (ETSI). The EU’s recent Strategy on Standardisation is a good place to get an overview of EU standard-setting and approach to engagement with international SSOs.
In some other countries, governments take a much more active role in standard-setting. ↩︎
Some might worry that these due process requirements pose a possible risk as a source of distraction, obfuscation, or delay in setting TAI safety standards. We share this concern, which is why we propose investigating the creation of a new SSO that could retain a strong focus on TAI, with corresponding application of other due process requirements (such as faster turnaround times than achieved by most SSOs).
See also Off. of Mgmt. & Budget, Exec. Off. of the President, OMB Circ. No. A-119, Federal Participation in the Development and Use of Voluntary Consensus Standards and in Conformity Assessment Activities § 2(e) (2016) (hereinafter “2016 Circular A-119”), https://perma.cc/KUV8-VWN8; National Technology Transfer And Advancement Act Of 1995, Pub. L. No. 104–113, 110 Stat. 775 (1996).
To be clear, the US federal government retains the option to develop their own standards outside of this framework. See 2016 Circular A-119 § 5(c).
SSOs defend the costs to access as necessary to recoup the costs of standards development and maintenance. Standards incorporated by reference into US regulations can be freely viewed. One important goal for the TAI safety and governance communities is ensuring that existentially important AI safety standards are freely available, unlike most safety standards. ↩︎
See 15 U.S.C. § 4302.
For example, companies can reduce the amount of discovery and tinkering required to achieve some goal by referencing an appropriate standard. Consistent standards can also foster an ecosystem of actors specialized in the relevant standards, who can transfer those skills to other appropriate contexts.
E.g., at Anthropic, DeepMind, OpenAI, and elsewhere.
In particular, standard-setting is a time-consuming and expensive process. These costs may not always be worth the benefits of standardization.
To be clear, we do not consider such involvement to be an obvious unalloyed good. The time of AI safety researchers and engineers is very valuable, and they should not reallocate it lightly.
In so doing, they will have to take care not to run afoul of antitrust laws.
NB: This is not the case with most existing SSOs!
Relevant questions include: Which standards are likely to be most relevant for future frontier systems? Which order are standards from influential standard setting bodies going to come out? Which standards are most likely to see global diffusion? For example, will the EU AI Act, and its accompanying standards, diffuse globally? Should we expect the NIST AI Risk Management Framework to affect relevant ISO, IEC, or ESO standards?

0 Comments