Hugging Face is a platform for viewing, sharing, and showcasing machine learning models, datasets, and related work. It aims to make Neural Language Models (NLMs) accessible to anyone building applications powered by machine learning. Many popular AI and machine-learning models are accessible through Hugging Face, including LLaMA 2, an open source language model that Meta developed in partnership with Microsoft.
Hugging Face is a valuable resource for beginners to get started with machine-learning models. You don’t need to pay for any special apps or programs to get started. You only need a web browser to browse and test models and datasets on any device, even on budget Chromebooks.

What is Hugging Face?
Hugging Face provides machine-learning tools for building applications. Notable tools include the Transformers model library, pipelines for performing machine-learning tasks, and collaborative resources, alongside libraries for datasets, model evaluation, simulation, and general machine learning.
Hugging Face receives funding from companies including Google, Amazon, Nvidia, Intel, and IBM. Some of these companies have created open source models accessible through Hugging Face, like the LLaMA 2 model mentioned at the beginning of this article.

The number of models available through Hugging Face can be overwhelming, but it’s easy to get started. We walk you through everything you need to know about what you can do with Hugging Face and how to create your own tools and applications.
What can you do with Hugging Face?
The core of Hugging Face is the Transformers model library, dataset library, and pipelines. Understanding these services and technologies gives you everything you need to use Hugging Face’s resources.
The Transformers model library
The Transformers model library is a library of open source transformer models. Hugging Face has a library of over 495,000 models grouped into data types called modalities. You can use these models to perform tasks with pipelines, which we explain later in this article.
Some of the tasks you can perform through the Transformers model library include text generation, translation, summarization, image classification, and speech recognition.

A complete list of these tasks can be seen on the Hugging Face website, categorized for easy searching.
Within these categories are numerous user-created models to choose from. For example, Hugging Face currently hosts over 51,000 models for Text Generation.

If you aren’t sure how to get started with a task, Hugging Face provides in-depth documentation on every task. These docs include use cases, explanations of model and task variants, relevant tools, courses, and demos. For example, the demo on the Text Generation task page uses the Zephyr language model to complete prompts. Refer to a model’s page for instructions on how to use it for the task.
These tools make experimenting with models easy. While some are pre-trained with data, you’ll need datasets for others, which is where the datasets library comes into play.
Using the datasets library
The Hugging Face datasets library is suitable for all machine-learning tasks offered within the Hugging Face model library. Each dataset page includes a dataset viewer, a summary of what’s included in the dataset, the data size, suggested tasks, data structure, data fields, and other relevant information.
For example, the Wikipedia dataset contains cleaned Wikipedia articles in all available languages. It has all the necessary documentation for understanding and using the dataset, including helpful tools like a data visualization map of the sample data. Depending on what dataset you access, you may see different examples.
Using pipelines to perform tasks
Models and datasets are the power behind performing tasks from Hugging Face, but pipelines make it easy to use these models to complete tasks.
Hugging Face’s pipelines simplify using models through a simple API that hides the underlying model code. You can provide a pipeline with multiple models by specifying which one you want to use for specific actions. For example, you can use one model for generating results from an input and another for analyzing them. Refer to the page of the model that produced the results to interpret its output format correctly.
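As a minimal sketch (assuming the transformers library is installed; the default sentiment-analysis model is downloaded automatically on first use), a pipeline reduces a task to a couple of lines:

```python
from transformers import pipeline

# pipeline() picks a sensible default model for the named task and
# handles tokenization, inference, and decoding behind a single call.
classifier = pipeline("sentiment-analysis")

result = classifier("Hugging Face makes machine learning approachable.")
print(result)  # a list of {'label': ..., 'score': ...} dicts
```

Passing a `model="..."` argument to `pipeline()` swaps in any compatible model from the Hub, which is how you mix different models for different actions.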
Hugging Face has a full breakdown of the tasks you can use pipelines for.
How to get started with Hugging Face
Now that you understand the models, datasets, and pipelines Hugging Face provides, you’re ready to use these assets to perform tasks.
You only need a browser to get started. We recommend using Google Colab, which lets you write and execute Python code in your browser. It provides free access to computing resources, including GPUs and TPUs, making it ideal for basic machine-learning tasks. Google Colab is easy to use and requires zero setup.
After you’ve familiarized yourself with Colab, you’re ready to install the transformer libraries using the following command:
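The standard install command from the library’s documentation is shown below (in a Colab cell, prefix it with `!` so Colab runs it as a shell command):

```shell
pip install transformers
```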
Then check it was installed correctly using this command:
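One simple check is importing the library and printing its version; if the import succeeds, the installation worked:

```shell
python -c "import transformers; print(transformers.__version__)"
```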
You’re now ready to dive into Hugging Face’s libraries. There are a lot of places to start, but we recommend Hugging Face’s introductory course, which explains the concepts we outlined earlier in detail with examples and quizzes to test your knowledge.
Collaborating with Hugging Face
Collaboration is a huge part of Hugging Face, allowing you to discuss models and datasets with other users. Hugging Face encourages collaboration through a discussion forum, a community blog, Discord, and classrooms.
Models and datasets on Hugging Face also have their own forums where you can discuss errors, ask questions, or suggest use cases.
Hugging Face is a powerful platform for collaborating with machine-learning tools
Machine learning and AI are daunting for beginners, but platforms like Hugging Face provide a great way to introduce these concepts. Many of the popular models on Hugging Face are large language models (LLMs), so familiarize yourself with LLMs if you plan to use machine-learning tools for text generation or analysis.