🌟Setting Up and Using Open Source LLMs
Article by @Rohan Handore, 22nd April 2024
In the contest between closed-source LLMs like OpenAI’s GPT models and Anthropic’s Claude, and open-source LLMs such as LLAMA and Falcon, we are observing firsthand the dynamic evolution of language technologies as they redefine boundaries and set new standards for natural language processing. You can find our guide on getting started with the OpenAI API Here! In this post we’ll describe different ways to start using open-source LLMs.
Where to find the Best Open Source LLMs?
The most well-known open-source platform for transformer-based models is HuggingFace.
On HuggingFace you can find datasets as well as all kinds of open-source models: NLP, computer vision, multimodal, audio, tabular, and reinforcement learning. You can filter the models by the relevant task:
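You can also browse the Hub programmatically with the huggingface_hub client library. Here is a minimal sketch (the exact parameter names may differ slightly between library versions):

```python
from huggingface_hub import list_models

# The five most-downloaded text-generation models matching "llama"
for model in list_models(search="llama", task="text-generation",
                         sort="downloads", limit=5):
    print(model.id)
```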
Let’s have a look at Meta’s LLM, LLAMA-2.
The latest Llama models have made significant advancements in the realm of large language models. There are three variations of Llama-2, boasting 7, 13, and 70 billion parameters. Notably, while these models are open-sourced for both commercial and research use, the Llama 2 34B version remains unreleased to the public. Complementing them are the refined conversational models, Llama-2-Chat, available in 7B, 13B, and 70B configurations.
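For reference, the released checkpoints are hosted under the meta-llama organization on HuggingFace (the repositories are gated, as described later in this post):

```python
# Released Llama-2 checkpoints on the HuggingFace Hub (gated repos)
BASE_MODELS = [
    "meta-llama/Llama-2-7b-hf",
    "meta-llama/Llama-2-13b-hf",
    "meta-llama/Llama-2-70b-hf",
]
CHAT_MODELS = [
    "meta-llama/Llama-2-7b-chat-hf",
    "meta-llama/Llama-2-13b-chat-hf",
    "meta-llama/Llama-2-70b-chat-hf",
]
```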
The number of parameters is a rough indicator of:
Complexity: Larger models can, in general, capture more intricate patterns in data. A model with billions of parameters, like the 7B variant, is already considered large and complex.
Computational Requirements: Larger models need more memory and processing power, both for training and inference. A 7B model therefore requires substantial hardware resources, especially when compared to smaller models; see the back-of-envelope estimate below.
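To make the hardware point concrete, here is a rough back-of-envelope estimate of the GPU memory needed just to hold the weights at inference time (activations and the KV cache add more on top):

```python
# Weights-only memory estimate for Llama-2-7B in 16-bit precision
params = 7e9           # 7 billion parameters
bytes_per_param = 2    # float16 / bfloat16
print(f"~{params * bytes_per_param / 1e9:.0f} GB")  # prints: ~14 GB
```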
2. Using an Open Source Model in Your Application
Let’s explore how to run the open-source LLAMA-2 model on Google Colab for inference. (Note: we won’t be fine-tuning models in this guide.) As previously mentioned, an LLM demands significant computational resources: a GPU, ample RAM, and disk space. To run it effectively on Google Colab, a ‘Pro’ account might be necessary.
Every month, Google Colab Pro allocates 90 compute credits, and your balance can grow to at most 200 credits. For running the ‘Llama-2-7b-chat’ model, selecting a T4 GPU is advisable. Its approximate usage rate is 2.05 credits per hour, so the monthly allowance covers roughly 44 hours of runtime.
Let’s start by installing some dependencies:
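A minimal set of packages for this walkthrough (in a Colab cell, the leading ! runs a shell command; exact versions are up to you):

```python
# Install the libraries used in the rest of this guide
!pip install -q transformers accelerate torch huggingface_hub
```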
To use the LLAMA models, you’ll need access permission from Meta. You can apply for it Here! Once you’ve been granted access, you can authenticate with your HuggingFace access token.
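In a Colab notebook, one way to authenticate is with the notebook_login helper from huggingface_hub:

```python
from huggingface_hub import notebook_login

# Opens an input widget where you paste your HuggingFace access token
notebook_login()
```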
After executing the code, you’ll see a prompt asking for your HuggingFace access token. Copy and paste it into the indicated field.
Then you can run the pipeline code, configuring the arguments:
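Here is a sketch of that setup, assuming the gated ‘meta-llama/Llama-2-7b-chat-hf’ checkpoint and a single T4 GPU; the configuration values are illustrative, not the only valid choices:

```python
import torch
from transformers import AutoTokenizer, pipeline

model_id = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
llama_pipeline = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,  # half precision to fit the T4's 16 GB
    device_map="auto",          # let accelerate place the weights
)
```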
Now you can ask any question you want, for example:
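A possible prompt (both the question and the decoding parameters here are just an illustration):

```python
sequences = llama_pipeline(
    "What is the difference between open-source and closed-source LLMs?\n",
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=256,
)
for seq in sequences:
    print(seq["generated_text"])
```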
This is the response we get:
How cool is that?!
I hope you enjoyed this article and are now inspired to run your own Open Source model on your device :)