September 14, 2024 6 min to read

Hottest LLMs

LLMs are hot and still evolving

After reading Databricks’s State of Data + AI report from this past month, I wanted to write about LLMs, since the report gives some key data about how they are currently used across the companies on Databricks, mine included.

Large Language Models (LLMs) are advanced machine learning models designed to understand and generate human-like text. They are trained on vast amounts of text data and can be used in a variety of applications, including translation, question answering, and content creation.

As of 2024, several large language models (LLMs) are widely used and recognized for their capabilities in natural language processing. Here is a list of some of the most prominent ones:

GPT-4 by OpenAI:
- The latest in the GPT series, known for its advanced language understanding and generation capabilities. Used in a variety of applications, from chatbots to content generation.
LLaMA 3 by Meta:
- The latest in the LLaMA series, with significant advancements in performance and efficiency. Used in research, industry applications, and as part of Meta’s commitment to open science.
BERT by Google:
- Bidirectional Encoder Representations from Transformers, primarily used for natural language understanding tasks. Employed in search engines and question-answering systems.
Megatron by NVIDIA:
- A large, scalable transformer model optimized for distributed training on GPUs. Used in research and for creating high-performance NLP (Natural Language Processing) models.
GPT-Neo and GPT-J by EleutherAI:
- Open-source models aimed at replicating GPT-3 performance. Popular in the open-source community for various NLP tasks.
BLOOM by BigScience:
- An open, collaborative model designed to be transparent and accessible for research purposes. Used in academic and research settings to study the behavior and capabilities of large language models.

These models are chosen based on their performance, flexibility, and the specific requirements of different NLP tasks. They are used in applications ranging from virtual assistants and chatbots to complex data analysis and content generation. Some of them are open source and others are not.

Open Source vs Private LLMs

One thing in Databricks’s report that caught my attention was this: “Companies are quick to adopt new open source models”. A good example comes from one of the big tech companies: Meta’s LLaMA 3 was launched on April 18, 2024. Within its first week, organizations began leveraging it over other models and providers. Just four weeks after its launch, LLaMA 3 accounted for 39% of all open-source LLM usage on Databricks.

Looking at the key differences between open-source and private LLMs, we can describe the specific factors that lead a company to choose one or the other:

Accessibility:

Open-Source LLMs: The code and underlying architecture of the LLM are publicly available. Anyone can access, study, modify, and experiment with the model. This fosters collaboration, innovation, and wider adoption. Examples include BLOOM, Jurassic-1 Mini, and WuDao 2.0 lite.
Private LLMs: The code and architecture are proprietary and not publicly accessible. Access is typically granted through licensing agreements or by special invitation from the developers. Examples include GPT-4, Jurassic-1 Jumbo, Megatron-Turing NLG, and WuDao 2.0.

Development and Control:

Open-Source LLMs: Development is often driven by a community of researchers and developers. This collaborative approach can lead to faster innovation and adaptation but may lack the focused direction of a single entity.
Private LLMs: Development is controlled by the companies or organizations that created them. They have more control over the model’s direction and functionality but may prioritize internal needs over broader community interests.

Cost and Resources:

Open-Source LLMs: Generally free to use and experiment with, although significant computational resources might be needed to run the model effectively. Cloud platforms might offer paid services for easier access and scalability.
Private LLMs: May require licensing fees or specific hardware configurations to access and utilize the model. Companies might provide cloud-based access through paid subscriptions.
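
To make the "significant computational resources" point a bit more concrete, here is a rough back-of-the-envelope sketch in Python that estimates the memory needed just to hold a model's weights. The function name, the bytes-per-parameter table, and the 8-billion and 70-billion parameter sizes are illustrative assumptions, not figures from the Databricks report. Real deployments also need memory for activations and caches, so treat the result as a lower bound.

```python
# Rough estimate of the memory needed just to store model weights.
# Assumption: memory is approximately parameter count x bytes per parameter.
# Activations, caches, and framework overhead are NOT included.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Return approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only for illustration.
    for name, params in [("8B model", 8e9), ("70B model", 70e9)]:
        for precision in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, precision)
            print(f"{name} @ {precision}: ~{gb:.0f} GB")
```

Even this simple estimate shows why a "free" open-source model can still carry a real hardware bill once the parameter count grows.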

Focus and Performance:

Open-Source LLMs: Often focus on general-purpose tasks and may perform well across a variety of domains. However, they may not be as specialized or optimized for specific tasks compared to private models.
Private LLMs: Can be optimized for specific purposes, potentially leading to superior performance in targeted areas. Companies might prioritize tasks relevant to their business goals.

Transparency and Explainability:

Open-Source LLMs: Due to open access to the code, there’s more potential for transparency in how the model works. Researchers can analyze the inner workings and identify potential biases.
Private LLMs: Less transparency as the code is not publicly available. Understanding the model’s decision-making process and identifying biases can be more challenging.

Choosing the Right LLM:

The choice between using an open-source or private LLM depends on your specific needs and priorities. Here are some factors to consider:

Budget: Open-source options are typically more cost-effective, but computational resources might be a factor.
Purpose: Consider the tasks you need the LLM for. Open-source models offer broader capabilities, while private models might be more specialized.
Control and Customization: If you need more control over the model or require specific customizations, a private LLM might be preferable.
Transparency: Open-source models offer greater transparency in how they work.
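
One way to weigh the four factors above against each other is a simple weighted decision matrix. The Python sketch below is only an illustration: the function names, the weights, and the 1-to-5 scores are all hypothetical placeholders that you would replace with your own judgment, not data from the report.

```python
# Hypothetical weighted decision matrix for the four factors listed above.
# All weights and scores are illustrative placeholders, not measured data.

FACTORS = ("budget", "purpose", "control", "transparency")


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of the per-factor scores."""
    total_weight = sum(weights[f] for f in FACTORS)
    return sum(scores[f] * weights[f] for f in FACTORS) / total_weight


def pick_option(options: dict, weights: dict) -> str:
    """Return the name of the option with the highest weighted score."""
    return max(options, key=lambda name: weighted_score(options[name], weights))


# Example priorities: budget matters most, control matters least.
example_weights = {"budget": 3, "purpose": 2, "control": 1, "transparency": 2}

# Example 1-to-5 scores for each kind of model (made up for illustration).
example_options = {
    "open_source": {"budget": 4, "purpose": 3, "control": 3, "transparency": 5},
    "private": {"budget": 2, "purpose": 4, "control": 4, "transparency": 2},
}

if __name__ == "__main__":
    for name, scores in example_options.items():
        print(name, round(weighted_score(scores, example_weights), 3))
    print("Pick:", pick_option(example_options, example_weights))
```

Changing the weights is the interesting part: a team that values control far more than budget could end up with the opposite answer from the same scores.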

Both open-source and private LLMs play a crucial role in the advancement of artificial intelligence. Each approach fosters innovation and offers distinct advantages depending on the specific context.

Resources for further exploration:

The AI Today website maintains a curated list of LLMs: https://theaitoday.com/
Papers With Code keeps track of research papers related to LLMs: https://paperswithcode.com/task/large-language-model
Google’s BIG-bench benchmark repository offers insights into model performance: https://github.com/google/BIG-bench

The field of LLMs is evolving rapidly, with new players and products launching every week, no kidding! This post provides a high-level overview of some of the most prominent models currently in use, but it’s important to stay updated on the latest developments from the key players.

Build On!

Not a DevOps Engineer v3.1.0

Andrés Zepeda
