Open-Source Foundation Models
Key Trends
- Skyrocketing Capabilities: Rapid advancements in LLMs since 2018.
- Declining Access: Shift from openly released papers, code, and weights to API-only models, limiting experimentation and research.
Why Access Matters
- Access drives innovation:
- 1990s: Digital text enabled statistical NLP.
- 2010s: GPUs and crowdsourcing fueled deep learning and large datasets.
- Levels of access define research opportunities:
- API: Like a cognitive scientist, you can only measure behavior (prompts in, responses out).
- Open-Weight: Like a neuroscientist, you can probe internal activations for interpretability and fine-tune the model.
- Open-Source: Like a computer scientist, you can control and question every part of the system.
Levels of Access for Foundation Models
- API Access:
- Acts as a universal function (e.g., summarize, verify, generate).
- Enables problem-solving agents (e.g., cybersecurity tools, social simulations).
- Challenges: Deprecation and limited reproducibility.
- Open-Weight Access:
- Enables interpretability, distillation, fine-tuning, and reproducibility.
- Prominent models: Llama, Mistral.
- Challenges:
- Testing whether models are independent of one another, and characterizing the functional changes caused by weight modifications.
- Research stays constrained by the blueprint of the pre-existing model.
- Open-Source Access:
- Embodies creativity, transparency, and collaboration.
- Examples: GPT-J, GPT-NeoX, StarCoder.
- A performance gap relative to closed models persists because of compute and data limitations.
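The difference between API and open-weight access can be made concrete with a toy model. The sketch below is purely illustrative: `TinyModel`, its hand-picked weights, and the `api_call` wrapper are hypothetical stand-ins, not any real model or provider API. With API-style access only the final output is visible; with the weights in hand, the hidden activations can be read and the weights edited to observe the functional change.

```python
import math


class TinyModel:
    """Hypothetical two-layer network with hand-picked weights (illustration only)."""

    def __init__(self):
        # Hidden layer: 2 inputs -> 2 hidden units; output layer: 2 hidden units -> 1 output.
        self.w_hidden = [[1.0, -1.0], [0.5, 0.5]]
        self.w_out = [1.0, -2.0]

    def hidden(self, x):
        # tanh of each hidden unit's weighted input sum
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w_hidden]

    def forward(self, x):
        h = self.hidden(x)
        return sum(w * hi for w, hi in zip(self.w_out, h))


def api_call(model, x):
    """API-style access: the caller sees only the final output."""
    return model.forward(x)


model = TinyModel()
x = [1.0, 1.0]

# API access: behavior only.
y = api_call(model, x)

# Open-weight access: inspect the internal activations ...
activations = model.hidden(x)

# ... and intervene on the weights, then observe the functional change.
model.w_out[1] = 0.0
y_edited = model.forward(x)

print("output:", y)
print("hidden activations:", activations)
print("output after zeroing one output weight:", y_edited)
```

Open-source access would go one step further than this sketch: the training data and training code that produced `w_hidden` and `w_out` would also be available to inspect and change.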
Key Challenges and Opportunities
- Open-Source Barriers:
- Legal restrictions on releasing web-derived training data.
- Significant compute requirements for retraining.
- Scaling Compute:
- Pooling idle GPUs.
- Crowdsourced efforts like BigScience.
- Emergent Research Questions:
- How do architecture and data shape behavior?
- Can scaling laws predict performance at larger scales?
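One common way to approach the scaling-law question is to fit a power law of the form loss = a * compute^(-b) to results from small runs and extrapolate to a larger budget. The sketch below does this with ordinary least squares in log-log space. The data points are synthetic, generated from an assumed power law purely for illustration, and the function names and the choice of FLOPs as the compute unit are assumptions. Real loss curves often need an extra irreducible-loss term and may not extrapolate reliably.

```python
import math


def fit_power_law(compute, loss):
    """Fit loss = a * compute**(-b) by least squares on log-transformed data."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the best-fit line in log-log space.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def predict(a, b, compute):
    """Extrapolate the fitted power law to a new compute budget."""
    return a * compute ** (-b)


# Synthetic data (not measurements): generated from a = 10, b = 0.1.
small_runs = [1e15, 1e16, 1e17, 1e18]  # assumed unit: training FLOPs
losses = [10.0 * c ** (-0.1) for c in small_runs]

a, b = fit_power_law(small_runs, losses)
predicted = predict(a, b, 1e21)
print(f"a = {a:.3f}, b = {b:.3f}, predicted loss at 1e21 FLOPs = {predicted:.4f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the generating parameters. With real training runs, comparing the extrapolated value against a held-out larger run is what tests whether the scaling law is predictive.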
Reflections
- Most research occurs within the confines of API access or fixed weights, which limits exploration.
- Open-weight models offer immense value for interpretability and experimentation.
- Open-source efforts require collective funding and infrastructure support.
Final Takeaway
Access shapes the trajectory of innovation in foundation models. To unlock their full potential, researchers must question data, architectures, and algorithms while exploring new models of collaboration and resource pooling.