How government can help build a better OpenAI, ChatGPT and AI in general

Beyond policy, proactive engagement and better data management will make government a good steward and partner in responsible artificial intelligence efforts.

By Luke Fretwell · November 15, 2023

The U.S. federal government is working to ensure those building artificial intelligence do so with good intentions. It’s also proactively providing guidance on how agencies should document and responsibly use AI in their service to the public.

The Biden Administration is actively working with leading AI companies “to uphold the highest standards to ensure that innovation doesn’t come at the expense of Americans’ rights and safety.”

As part of these efforts, the National Institute of Standards and Technology is charged with facilitating the conversation and supporting the National AI Advisory Committee, whose members include “experts with a broad and interdisciplinary range of AI-relevant experience from across the private sector, academia, non-profits, and civil society.”

The White House has published a Blueprint for an AI Bill of Rights focused on five key principles. And it has issued an executive order aimed at “harnessing AI for good and realizing its myriad benefits.”

“This endeavor demands a society-wide effort that includes government, the private sector, academia, and civil society,” says the executive order.

The federal government has also started to publicly catalog its AI use cases, albeit in varied, unstructured formats.

There is even an official government website, AI.gov, to centralize the Biden Administration’s efforts around artificial intelligence.

Rightfully so, most of the government-led AI energy is focused on policy. However, as perhaps the largest creator and maintainer of data, government also has a responsibility to proactively contribute to AI’s success.

Initiatives such as OpenAI’s data partnership are examples of industry’s efforts to open the lines of data to government so that it can do its part in fulfilling the objectives the Biden Administration is championing.

The effort, says OpenAI, is “intended to enable more organizations to help steer the future of AI and benefit from models that are more useful to them, by including content they care about.”

From OpenAI:

We’re interested in large-scale datasets that reflect human society and that are not already easily accessible online to the public today. We can work with any modality, including text, images, audio, or video. We’re particularly looking for data that expresses human intention (e.g. long-form writing or conversations rather than disconnected snippets), across any language, topic, and format.

We can work with data in almost any form and can use our next-generation in-house AI technology to help you digitize and structure your data. For example, we have world-class optical character recognition (OCR) technology to digitize files like PDFs, and automatic speech recognition (ASR) to transcribe spoken words. If the data needs cleaning (e.g. has lots of auto-generated artifacts or transcription errors), we can work with your team to process it into the most useful form. We are not seeking datasets with sensitive or personal information, or information that belongs to a third party; we can work with you to remove this information if you need help.

How government can partner with OpenAI:

Open-Source Archive: We’re seeking partners to help us create an open-source dataset for training language models. This dataset would be public for anyone to use in AI model training. We would also explore using it to safely train additional open-source models ourselves. We believe open-source plays an important role in the ecosystem.

OpenAI is already working with Iceland, but this is just the tip of the iceberg for how government can collaborate with OpenAI (and overall AI progress).

Engagements like this should be considered both a hands-on learning experience and an incentive to be more forward-thinking and proactive about publishing data in a more intelligent way.

Whether it’s government serving up the information or, more and more likely, a third party, the onus is on government to publish with a data mindset.

Some progress has been made with the opening of taxpayer-funded research, Data.gov and general (albeit clunky and disparate) publishing of public data, but government still has a long way to go before it is a proactive steward of AI.

Much like public sector data governance, this will entail leadership from the Chief Data Officers Council, agency chief data officers, the General Services Administration (maintainer of Data.gov) and executive action that mandates agencies do their part to proactively publish machine-readable data.

In doing so, government can move beyond just policy and become a practitioner, steward and true partner of AI for good.

Luke Fretwell

Luke Fretwell is the founder and maintainer of GovFresh.
