The Stanford Holistic Evaluation of Language Models and its AI research explained

The latest and greatest trend in technology is AI, and it’s everywhere. AI is becoming increasingly relevant in everyday life, from new versions of existing technology like the best Samsung phones to new services like ChatGPT. It isn’t all good news, though. There are several questions about how ethical, fair, and accurate AI models are. Stanford researchers have devised a method to evaluate AI models to provide better transparency around each model and how it performs in various situations.


Stanford Holistic Evaluation of Language Models: a brief introduction

The Stanford Holistic Evaluation of Language Models (HELM) was developed by a team of 50 researchers at Stanford University’s Center for Research on Foundation Models. HELM consists of three main elements:

A primary value of the team behind HELM is transparency. The HELM website makes the scenarios, predictions, prompts, and code available for anyone to see and review. HELM provides a standard way to evaluate language models and compare them against each other, giving the industry a common benchmark. The researchers aim to keep running and refining HELM as AI models progress. The research is funded by Google, among others.
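To make the idea of a shared benchmark concrete, here is a minimal sketch of running every model against every scenario and scoring the results the same way. It is not HELM's actual code or API: the names `Scenario`, `exact_match`, and `evaluate`, the toy models, and the exact-match metric are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; not HELM's real data model.
Model = Callable[[str], str]  # a model maps a prompt to a prediction


@dataclass
class Scenario:
    name: str
    examples: List[Tuple[str, str]]  # (prompt, reference answer) pairs


def exact_match(prediction: str, reference: str) -> float:
    """Score 1.0 when the prediction equals the reference, ignoring case and outer spaces."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0


def evaluate(models: Dict[str, Model], scenarios: List[Scenario]) -> Dict[Tuple[str, str], float]:
    """Run every model on every scenario and return the mean score per (model, scenario)."""
    results: Dict[Tuple[str, str], float] = {}
    for model_name, model in models.items():
        for scenario in scenarios:
            scores = [exact_match(model(prompt), ref) for prompt, ref in scenario.examples]
            results[(model_name, scenario.name)] = sum(scores) / len(scores) if scores else 0.0
    return results


# Toy stand-ins for real language models (assumed, for the demo only).
def echo_model(prompt: str) -> str:
    return prompt


def lookup_model(prompt: str) -> str:
    return {"2 + 2 =": "4", "Capital of France?": "Paris"}.get(prompt, "")


scenarios = [
    Scenario("arithmetic", [("2 + 2 =", "4"), ("3 + 5 =", "8")]),
    Scenario("geography", [("Capital of France?", "Paris")]),
]
models = {"echo": echo_model, "lookup": lookup_model}

results = evaluate(models, scenarios)
for (model_name, scenario_name), score in sorted(results.items()):
    print(f"{model_name:8s} {scenario_name:12s} accuracy={score:.2f}")
```

Because every model sees the same prompts and is scored by the same metric, the resulting numbers can be compared directly, which is the point of a standardized benchmark.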


What were the initial findings of HELM?

In 2022, HELM researchers published a paper after the first iteration of the benchmark ran more than 4,900 evaluations of 30 models, totaling more than 12 billion token requests for these models. The paper had five main findings:

What’s next for HELM?


Work on HELM has continued since that initial paper, which outlined several conclusions and next steps:

A graphic showing how scenarios and models are put into HELM to get results

Transparency in AI is important

AI is making its way into more facets of our lives. To make sure it’s used safely, we need to constantly evaluate it and take the necessary steps to make it better for everyone. One way to do this is to create open-source AI models, so that users have better insight into the models they’re using.

