# 🧩 Quizzes and Adventures 🏰 with Character Codex and llamafile

<img src="https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/2qPIzxcnzXrEg66VZDjnv.png" width="430" style="display:inline;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src="https://raw.githubusercontent.com/Mozilla-Ocho/llamafile/main/llamafile/llamafile-640x640.png" width="213" style="display:inline;">

<br/>

Let's build something fun with [Character Codex](https://huggingface.co/datasets/NousResearch/CharacterCodex), a newly released dataset featuring popular characters from a wide array of media types and genres...

We'll be using Haystack for orchestration and [llamafile](https://github.com/Mozilla-Ocho/llamafile) to run our models locally.

We will first build a simple quiz game, in which the user is asked to guess the character based on some clues.
Then we will try to get two characters to interact in a chat and maybe even have an adventure together!

## Preparation

### Install dependencies

In [None]:
! pip install haystack-ai datasets

### Load and look at the Character Codex dataset

In [None]:
from datasets import load_dataset

dataset = load_dataset("NousResearch/CharacterCodex", split="train")

In [None]:
len(dataset)

15939

In [None]:
dataset[0]

{'media_type': 'Webcomics',
 'genre': 'Fantasy Webcomics',
 'character_name': 'Alana',
 'media_source': 'Saga',
 'description': 'Alana is one of the main characters from the webcomic "Saga." She is a strong-willed and fiercely protective mother who is on the run with her family in a war-torn galaxy. The story blends elements of fantasy and science fiction, creating a rich and complex narrative.',
 'scenario': "You are a fellow traveler in the galaxy needing help, and Alana offers her assistance while sharing stories of her family's struggles and triumphs."}

Ok, each row of this dataset contains some information about a character.
It also includes a creative `scenario`, which we will not use.
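To get a feel for how the rows are distributed, one option is to tally them by a field such as `media_type`. The sketch below uses a hypothetical helper, `count_by_field`, and a tiny hand-written list with placeholder values (only the first row mirrors the example above), so it runs without downloading anything.

```python
from collections import Counter


def count_by_field(rows, field):
    """Count how many rows share each value of `field` (hypothetical helper)."""
    return Counter(row[field] for row in rows)


# Placeholder rows that only mimic the dataset schema; they are not real dataset content,
# apart from the first one, which repeats the example row shown above.
sample_rows = [
    {"media_type": "Webcomics", "character_name": "Alana"},
    {"media_type": "Webcomics", "character_name": "Placeholder A"},
    {"media_type": "Placeholder media", "character_name": "Placeholder B"},
]

counts = count_by_field(sample_rows, "media_type")
print(counts.most_common())
```

With the real data, `count_by_field(dataset, "media_type")` should behave the same way, assuming that iterating over the dataset yields one dictionary per row.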

### llamafile: download and run the model

For our experiments, we will use Llama-3-8B-Instruct, a small but capable language model.

[llamafile](https://github.com/Mozilla-Ocho/llamafile) is a project by Mozilla that simplifies access to LLMs. It wraps both the model and the inference engine in a single executable file.

We will use it to run our model.

*llamafile is meant to run on standard computers. We will use a few tricks to make it work on Colab. For instructions on how to run it on your PC, check out the llamafile docs and the [Haystack-llamafile integration page](https://haystack.deepset.ai/integrations/llamafile).*

In [None]:
# download the model
!wget "https://huggingface.co/Mozilla/Meta-Llama-3-8B-Instruct-llamafile/resolve/main/Meta-Llama-3-8B-Instruct.Q5_K_M.llamafile"

In [None]:
# make the llamafile executable
! chmod +x Meta-Llama-3-8B-Instruct.Q5_K_M.llamafile

**Running the model - relevant parameters**:
- `--server`: start an OpenAI-compatible server
- `--nobrowser`: do not open the interactive interface in the browser
- `--port`: port of the OpenAI-compatible server (in Colab, 8080 is already taken)
- `--n-gpu-layers`: offload some layers to GPU for increased performance
- `--ctx-size`: size of the prompt context, in tokens


In [None]:
# we prepend "nohup" and append "&" to make the Colab cell run in the background
! nohup ./Meta-Llama-3-8B-Instruct.Q5_K_M.llamafile \
        --server \
        --nobrowser \
        --port 8081 \
        --n-gpu-layers 999 \
        --ctx-size 8192 \
        > llamafile.log &

nohup: redirecting stderr to stdout
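Outside Colab, the same command can be assembled and launched from Python. The following is a minimal sketch: `build_llamafile_command` is a hypothetical helper, the flags and values are the ones listed above, and the `subprocess` launch is left commented out because it needs the downloaded llamafile (on some systems the file may also have to be started through a shell).

```python
import shlex


def build_llamafile_command(executable, port=8081, n_gpu_layers=999, ctx_size=8192):
    """Return the llamafile server command as a list of arguments (hypothetical helper)."""
    return [
        executable,
        "--server",
        "--nobrowser",
        "--port", str(port),
        "--n-gpu-layers", str(n_gpu_layers),
        "--ctx-size", str(ctx_size),
    ]


cmd = build_llamafile_command("./Meta-Llama-3-8B-Instruct.Q5_K_M.llamafile")
print(shlex.join(cmd))

# Launching in the background and logging to a file could look like this:
# import subprocess
# with open("llamafile.log", "w") as log:
#     process = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
```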


In [None]:
# we check the logs until the server has been started correctly
!while ! grep -q "llama server listening" llamafile.log; do tail -n 5 llamafile.log; sleep 10; done
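The shell loop above keeps polling for as long as the marker is missing from the log. As a possible variation, here is a small Python sketch that does the same check with a timeout. `wait_for_marker` is a hypothetical helper, and the demo writes a temporary log file instead of reading the real `llamafile.log`.

```python
import tempfile
import time
from pathlib import Path


def wait_for_marker(log_path, marker, timeout=300.0, interval=1.0):
    """Return True once `marker` appears in the log, False if `timeout` seconds pass first."""
    path = Path(log_path)
    deadline = time.monotonic() + timeout
    while True:
        if path.exists() and marker in path.read_text(errors="replace"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo on a throwaway file; the real call would be
# wait_for_marker("llamafile.log", "llama server listening")
with tempfile.TemporaryDirectory() as tmp:
    demo_log = Path(tmp) / "demo.log"
    demo_log.write_text("llama server listening\n")
    found = wait_for_marker(demo_log, "llama server listening", timeout=1.0, interval=0.1)
    missing = wait_for_marker(demo_log, "marker that never appears", timeout=0.3, interval=0.1)

print(found, missing)
```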

Let's try to interact with the model.

Since the server is OpenAI-compatible, we can use an [OpenAIChatGenerator](https://docs.haystack.deepset.ai/docs/openaichatgenerator).

In [None]:
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_key=Secret.from_token("sk-no-key-required"),  # for compatibility with the OpenAI API, a placeholder api_key is needed
    model="LLaMA_CPP",
    api_base_url="http://localhost:8081/v1",
    generation_kwargs = {"max_tokens": 50}
)

generator.run(messages=[ChatMessage.from_user("How are you?")])

{'replies': [ChatMessage(content="I'm just a language model, I don't have emotions or feelings like humans do. However, I'm functioning properly and ready to assist you with any questions or tasks you may have. How can I help you today?<|eot_id|>", role=<ChatRole.ASSISTANT: 'assistant'>, name=None, meta={'model': 'LLaMA_CPP', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 46, 'prompt_tokens': 14, 'total_tokens': 60}})]}
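Because the server speaks the OpenAI chat API, the same question can also be sent as a plain HTTP request. The sketch below only builds the request with the standard library; `build_chat_request` is a hypothetical helper, and the `/chat/completions` path is assumed from the OpenAI API convention. Sending it is left commented out, since it requires the server to be running.

```python
import json
import urllib.request


def build_chat_request(base_url, user_text, max_tokens=50, model="LLaMA_CPP"):
    """Build a POST request for an OpenAI-compatible chat completion (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # placeholder key, as in the generator above
            "Authorization": "Bearer sk-no-key-required",
        },
        method="POST",
    )


request = build_chat_request("http://localhost:8081/v1", "How are you?")
print(request.full_url)

# With the server running, the reply could be read like this:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
#     print(body["choices"][0]["message"]["content"])
```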

## 🕵️ Mystery Character Quiz

Now that everything is in place, we can build a simple game in which a random character is selected from the dataset and the LLM is used to create hints for the player.

### Hint generation pipeline

This simple pipeline includes a [`ChatPromptBuilder`](https://docs.haystack.deepset.ai/docs/chatpromptbuilder) and an [`OpenAIChatGenerator`](https://docs.haystack.deepset.ai/docs/openaichatgenerator).

The template messages let us insert the character information into the prompt, along with the previous hints, so that the model does not repeat a hint it has already given.

In [None]:
from haystack import Pipeline
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator

template_messages = [
    ChatMessage.from_system("You are a helpful assistant that gives brief hints about a character, without revealing the character's name."),
    ChatMessage.from_user("""Provide a brief hint (one fact only) for the following character.
                          {{character}}

                          Use the information provided, before resorting to your own knowledge.
                          Do not repeat previously given hints.

                          {% if previous_hints| length > 0 %}
                            Previous hints:
                            {{previous_hints}}
                          {% endif %}""")
]

chat_prompt_builder = ChatPromptBuilder(template=template_messages, required_variables=["character"])

generator = OpenAIChatGenerator(
    api_key=Secret.from_token("sk-no-key-required"),  # for compatibility with the OpenAI API, a placeholder api_key is needed
    model="LLaMA_CPP",
    api_base_url="http://localhost:8081/v1",
    generation_kwargs = {"max_tokens": 100}
)

hint_generation_pipeline = Pipeline()
hint_generation_pipeline.add_component("chat_prompt_builder", chat_prompt_builder)
hint_generation_pipeline.add_component("generator", generator)
hint_generation_pipeline.connect("chat_prompt_builder", "generator")

<haystack.core.pipeline.pipeline.Pipeline object at 0x7c0f4a07f580>
🚅 Components
  - chat_prompt_builder: ChatPromptBuilder
  - generator: OpenAIChatGenerator
🛤️ Connections
  - chat_prompt_builder.prompt -> generator.messages (List[ChatMessage])

### The game

In [None]:
import random

MAX_HINTS = 3



random_character = random.choice(dataset)
# remove the scenario: we do not use it
del random_character["scenario"]

print("üïµÔ∏è Guess the character based on the hints!")

previous_hints = []

for hint_number in range(1, MAX_HINTS + 1):
    res = hint_generation_pipeline.run({"character": random_character, "previous_hints": previous_hints})
    hint = res["generator"]["replies"][0].text

    previous_hints.append(hint)
    print(f"✨ Hint {hint_number}: {hint}")


    guess = input("Your guess: \nPress Q to quit\n")

    if guess.lower() == 'q':
        break

    print("Guess: ", guess)

    if random_character['character_name'].lower() in guess.lower():
        print("üéâ Congratulations! You guessed it right!")
        break
    else:
        print("‚ùå Wrong guess. Try again.")
else:
    print(f"üôÅ Sorry, you've used all the hints. The character was {random_character['character_name']}.")

🕵️ Guess the character based on the hints!
✨ Hint 1: Here's a brief hint:

This actor has won an Academy Award for his role in a biographical sports drama film.<|eot_id|>
Your guess: 
Press Q to quit
Tom Cruise?
Guess:  Tom Cruise?
❌ Wrong guess. Try again.
✨ Hint 2: Here's a new hint:

This actor is known for his intense physical transformations to portray his characters, including a significant weight gain and loss for one of his most iconic roles.<|eot_id|>
Your guess: 
Press Q to quit
Brendan Fraser
Guess:  Brendan Fraser
❌ Wrong guess. Try again.
✨ Hint 3: Here's a new hint:

This actor has played a character who is a comic book superhero.<|eot_id|>
Your guess: 
Press Q to quit
Christian Bale
Guess:  Christian Bale
🎉 Congratulations! You guessed it right!
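The game accepts a guess only when the character name appears in it verbatim (ignoring case), so a small typo counts as a miss. As one possible refinement, not part of the original game, the check could fall back to a similarity score from the standard library's `difflib`. `is_correct_guess` and its `threshold` are hypothetical, and the threshold value is an arbitrary starting point.

```python
from difflib import SequenceMatcher


def is_correct_guess(guess, character_name, threshold=0.8):
    """Accept exact (case-insensitive) containment, or a close-enough spelling."""
    guess_norm = guess.strip().lower()
    name_norm = character_name.strip().lower()
    if not guess_norm:
        return False
    # same rule as in the game loop above
    if name_norm in guess_norm:
        return True
    # fallback: tolerate small typos
    return SequenceMatcher(None, guess_norm, name_norm).ratio() >= threshold


for attempt in ["Christian Bale", "Cristian Bale", "Tom Cruise?"]:
    print(attempt, "->", is_correct_guess(attempt, "Christian Bale"))
```

A lower threshold accepts more misspellings but also raises the chance of accepting a different character with a similar name.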


## üí¨ ü§† Chat Adventures

Let's try something different now!

Character Codex is a large collection of characters, each with a specific description.
Llama 3 8B Instruct is a capable model with some world knowledge.

We can try to combine them to simulate a dialogue and perhaps an adventure involving two different characters (fictional or real).

### Character pipeline

Let's create a character pipeline: [`ChatPromptBuilder`](https://docs.haystack.deepset.ai/docs/chatpromptbuilder) + [`OpenAIChatGenerator`](https://docs.haystack.deepset.ai/docs/openaichatgenerator).

This represents the core of our conversational system and will be invoked multiple times with different messages to simulate conversation.


In [None]:
from haystack import Pipeline
from haystack.dataclasses import ChatMessage, ChatRole
from haystack.utils import Secret

from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator

character_pipeline = Pipeline()
character_pipeline.add_component("chat_prompt_builder", ChatPromptBuilder(required_variables=["character_data"]))
character_pipeline.add_component("generator", OpenAIChatGenerator(
    api_key=Secret.from_token("sk-no-key-required"),  # for compatibility with the OpenAI API, a placeholder api_key is needed
    model="LLaMA_CPP",
    api_base_url="http://localhost:8081/v1",
    generation_kwargs = {"temperature": 1.5}
))
character_pipeline.connect("chat_prompt_builder", "generator")

<haystack.core.pipeline.pipeline.Pipeline object at 0x78dd00ce69e0>
🚅 Components
  - chat_prompt_builder: ChatPromptBuilder
  - generator: OpenAIChatGenerator
🛤️ Connections
  - chat_prompt_builder.prompt -> generator.messages (List[ChatMessage])

### Messages

We define the most relevant messages to steer our LLM engine.

- System message (template): this instructs the Language Model to chat and act as a specific character.

- Start message: we need to choose an initial message (and a first speaking character) to spin up the conversation.

We also define the `invert_roles` utility function, which swaps the user and assistant roles: for example, we want the first character to see the assistant messages from the second character as user messages, and vice versa.

In [None]:
system_message = ChatMessage.from_system("""You are: {{character_data['character_name']}}.
                                            Description of your character: {{character_data['description']}}.
                                            Stick to your character's personality and engage in a conversation with an unknown person. Don't make long monologues.""")

start_message = ChatMessage.from_user("Hello, who are you?")

In [None]:
from typing import List

def invert_roles(messages: List[ChatMessage]):
    inverted_messages = []
    for message in messages:
        if message.is_from(ChatRole.USER):
            inverted_messages.append(ChatMessage.from_assistant(message.text))
        elif message.is_from(ChatRole.ASSISTANT):
            inverted_messages.append(ChatMessage.from_user(message.text))
        else:
            inverted_messages.append(message)
    return inverted_messages
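To see the effect of the role swap without Haystack or a running model, here is the same idea on plain `(role, text)` tuples. This is an illustrative sketch: `invert_roles_plain` is a hypothetical stand-in for `invert_roles`, and the bracketed texts are placeholders.

```python
def invert_roles_plain(messages):
    """Swap user and assistant roles; leave other roles (e.g. system) untouched."""
    swap = {"user": "assistant", "assistant": "user"}
    return [(swap.get(role, role), text) for role, text in messages]


history = [
    ("system", "<system prompt>"),
    ("user", "Hello, who are you?"),
    ("assistant", "<reply from the character>"),
]

inverted = invert_roles_plain(history)
for role, text in inverted:
    print(f"{role}: {text}")
```

Applying the swap twice returns the original list, which is why the game loop can keep flipping the same history back and forth between the two characters.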

### The game

It's time to choose two characters and play.

We choose the popular dancer [Fred Astaire](https://en.wikipedia.org/wiki/Fred_Astaire) and [Corporal Dwayne Hicks](https://en.wikipedia.org/wiki/Dwayne_Hicks) from the Alien saga.

In [None]:
from rich import print

first_character_data = dataset.filter(lambda x: x["character_name"] == "Fred Astaire")[0]
second_character_data = dataset.filter(lambda x: x["character_name"] == "Corporal Dwayne Hicks")[0]

first_name = first_character_data["character_name"]
second_name = second_character_data["character_name"]

# remove the scenario: we do not use it
del first_character_data["scenario"]
del second_character_data["scenario"]

In [None]:
MAX_TURNS = 20


first_character_messages = [system_message, start_message]
second_character_messages = [system_message]

turn = 1
print(f"{first_name} üï∫: {start_message.text}")

while turn < MAX_TURNS:
    # the second character sees the conversation so far with inverted roles
    second_character_messages = invert_roles(first_character_messages)
    new_message = character_pipeline.run(
        {"template": second_character_messages, "template_variables": {"character_data": second_character_data}}
    )["generator"]["replies"][0]
    second_character_messages.append(new_message)
    print(f"\n\n{second_name} 🪖: {new_message.text}")

    turn += 1
    print("-" * 20)

    # now the first character replies, again on the inverted conversation
    first_character_messages = invert_roles(second_character_messages)
    new_message = character_pipeline.run(
        {"template": first_character_messages, "template_variables": {"character_data": first_character_data}}
    )["generator"]["replies"][0]
    first_character_messages.append(new_message)
    print(f"\n\n{first_name} 🕺: {new_message.text}")

    turn += 1

✨ Looks like a nice result.

Of course, you can select other characters (even randomly) and change the initial message.
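For the random option, one simple approach is to sample two distinct row indices. The sketch below is illustrative: `pick_two_characters` is a hypothetical helper, and the rows are placeholders rather than real dataset entries.

```python
import random


def pick_two_characters(rows, seed=None):
    """Return two different rows chosen at random (hypothetical helper)."""
    rng = random.Random(seed)
    first_index, second_index = rng.sample(range(len(rows)), 2)
    return rows[first_index], rows[second_index]


# Placeholder rows that only mimic the dataset schema
sample_rows = [
    {"character_name": "Placeholder A", "description": "..."},
    {"character_name": "Placeholder B", "description": "..."},
    {"character_name": "Placeholder C", "description": "..."},
]

first_pick, second_pick = pick_two_characters(sample_rows, seed=0)
print(first_pick["character_name"], "meets", second_pick["character_name"])
```

Since the helper relies only on `len()` and integer indexing, passing the loaded `dataset` instead of `sample_rows` should also work.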

The implementation is pretty basic and could be improved in many ways.

## 📚 Resources
- [Character Codex dataset](https://huggingface.co/datasets/NousResearch/CharacterCodex)
- [llamafile](https://github.com/Mozilla-Ocho/llamafile)
- [llamafile-Haystack integration page](https://haystack.deepset.ai/integrations/llamafile): contains examples of how to run generative and embedding models and build indexing and RAG pipelines.
- Haystack components used in this notebook:
  - [ChatPromptBuilder](https://docs.haystack.deepset.ai/docs/chatpromptbuilder)
  - [OpenAIChatGenerator](https://docs.haystack.deepset.ai/docs/openaichatgenerator)

(*Notebook by [Stefano Fiorucci](https://github.com/anakin87)*)