# Models Configuration

> Complete reference for AI model configurations, encodings, and company branding

## Overview

The models configuration module (`models-config.js`) holds the data the application needs about AI language models: token encodings, pricing, context limits, and company branding. This configuration powers the model selector and cost calculations throughout the application.

## Configuration Objects

### MODEL\_ENCODINGS

A mapping of model identifiers to their tokenization encoding schemes. Most models use the `cl100k_base` or `o200k_base` encoding.

* `MODEL_ENCODINGS`: object mapping model IDs to encoding identifiers
* Each value: the encoding identifier for the model (e.g., `cl100k_base`, `o200k_base`)

```javascript Example Structure theme={null}
const MODEL_ENCODINGS = {
  'gpt-4o': 'o200k_base',
  'gpt-4': 'cl100k_base',
  'claude-3.5-sonnet': 'cl100k_base',
  'gemini-1.5-pro': 'cl100k_base'
};
```

```javascript Accessing Encoding theme={null}
// Get encoding for a specific model
const encoding = MODEL_ENCODINGS['gpt-4o'];
console.log(encoding); // 'o200k_base'

// Check if model has encoding
const modelId = 'gpt-4';
if (MODEL_ENCODINGS[modelId]) {
  // Initialize tokenizer with correct encoding
}
```

Models without native encoding information use `cl100k_base` as an approximation, marked with comments in the source code.

#### Supported Encodings

**`o200k_base`**: OpenAI's latest encoding (2024+)

Used by:

* GPT-4o
* GPT-4o Mini

Typically produces fewer tokens than `cl100k_base` for the same text.

**`cl100k_base`**: Standard encoding for most models

Used by:

* GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
* Claude 3/3.5 series (approximation)
* Gemini 1.5 series (approximation)
* Llama, Mistral, and most other models (approximation)

Native to the OpenAI models listed above. For other vendors it only approximates their own tokenizers.

***

### COMPANIES

Company branding information, including colors and emoji logos, for visual representation in the UI.
* `COMPANIES`: object mapping company names to branding data
* Each value: branding information for a company
  * `color`: hex color code for the company brand (e.g., `#00a67e`)
  * `logo`: emoji character used as the company logo

```javascript Example Structure theme={null}
const COMPANIES = {
  'OpenAI': { color: '#00a67e', logo: '🤖' },
  'Anthropic': { color: '#d97757', logo: '🧠' },
  'Google': { color: '#4285f4', logo: '🔍' }
};
```

```javascript Using Company Data theme={null}
// Get company branding
const companyName = 'OpenAI';
const company = COMPANIES[companyName];

// Apply company color to UI element (`element` is an existing DOM node)
element.style.color = company.color;

// Display company logo (`logoElement` is an existing DOM node)
logoElement.textContent = company.logo;

// Generate company badge
const badge = `${company.logo} ${companyName}`;
```

#### Supported Companies

* **OpenAI** - `#00a67e` 🤖
* **Anthropic** - `#d97757` 🧠
* **Mistral AI** - `#ff6b35` 💨
* **Cohere** - `#39a0ed` 🔗
* **DeepSeek** - `#2c5aa0` 🔍
* **01.AI** - `#1a73e8` 🤖
* **AI21 Labs** - `#6c5ce7` 🧪
* **xAI** - `#000000` ❌
* **Google** - `#4285f4` 🔍
* **Meta** - `#1877f2` 📘
* **Microsoft** - `#00bcf2` 💻
* **Amazon** - `#ff9900` 📦
* **NVIDIA** - `#76b900` 💚
* **Alibaba** - `#ff6a00` 🛒
* **Reka** - `#ff4757` 🦄
* **Perplexity** - `#20bf6b` ❓
* **IBM** - `#054ada` 💼
* **Nous Research** - `#8e44ad` 🔬
* **Snowflake** - `#29b5e8` ❄️

***

### MODELS\_DATA

Complete configuration data for all supported AI models, including pricing, context limits, and technical specifications.
* `MODELS_DATA`: object mapping model IDs to complete model configuration
* Each value: complete configuration for a specific model
  * `name`: display name of the model (e.g., "GPT-4o", "Claude 3.5 Sonnet")
  * `company`: company name (must match a key in the `COMPANIES` object)
  * `encoding`: tokenization encoding scheme (e.g., `o200k_base`, `cl100k_base`)
  * `contextLimit`: maximum context window size in tokens
  * `inputCost`: cost per 1M input tokens in USD
  * `outputCost`: cost per 1M output tokens in USD
  * `url`: external link to model information (typically Artificial Analysis)
  * `tokenRatio`: token count adjustment ratio (1.0 = standard, greater than 1.0 = more tokens, less than 1.0 = fewer tokens)

## Model Data Examples

```javascript theme={null}
'gpt-4o': {
  name: 'GPT-4o',
  company: 'OpenAI',
  encoding: 'o200k_base',
  contextLimit: 128000,
  inputCost: 2.50,
  outputCost: 10.00,
  url: 'https://artificialanalysis.ai/models/gpt-4o',
  tokenRatio: 1.0
}
```

OpenAI's multimodal model with a 128K context window and the more efficient `o200k_base` encoding.

```javascript theme={null}
'claude-3.5-sonnet': {
  name: 'Claude 3.5 Sonnet',
  company: 'Anthropic',
  encoding: 'cl100k_base',
  contextLimit: 200000,
  inputCost: 3.00,
  outputCost: 15.00,
  url: 'https://artificialanalysis.ai/models/claude-35-sonnet',
  tokenRatio: 1.1
}
```

Anthropic's flagship model with a 200K context window and a slightly higher token count (1.1x ratio).

```javascript theme={null}
'gemini-1.5-pro': {
  name: 'Gemini 1.5 Pro',
  company: 'Google',
  encoding: 'cl100k_base',
  contextLimit: 2097152,
  inputCost: 1.25,
  outputCost: 5.00,
  url: 'https://artificialanalysis.ai/models/gemini-15-pro',
  tokenRatio: 1.05
}
```

Google's model with a 2M-token context window and competitive pricing.

```javascript theme={null}
'llama-3.1-70b': {
  name: 'Llama 3.1 70B',
  company: 'Meta',
  encoding: 'cl100k_base',
  contextLimit: 131072,
  inputCost: 0.35,
  outputCost: 0.40,
  url: 'https://artificialanalysis.ai/models/llama-31-70b',
  tokenRatio: 0.95
}
```

Meta's open-weight model with a lower token count (0.95x ratio) and affordable pricing.
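The field descriptions above require `company` to match a key in `COMPANIES`. A minimal sketch of a consistency check follows. The helper name `findConfigProblems`, the `REQUIRED_FIELDS` list, and the trimmed sample objects are illustrative and not part of `models-config.js`; the check that `MODEL_ENCODINGS` agrees with each model's `encoding` field is an assumption about how the two objects are meant to relate.

```javascript
// Hypothetical sample data shaped like the examples on this page;
// the real objects live in models-config.js.
const COMPANIES = {
  OpenAI: { color: '#00a67e', logo: '🤖' },
  Meta: { color: '#1877f2', logo: '📘' }
};

const MODEL_ENCODINGS = {
  'gpt-4o': 'o200k_base',
  'llama-3.1-70b': 'cl100k_base'
};

const MODELS_DATA = {
  'gpt-4o': {
    name: 'GPT-4o', company: 'OpenAI', encoding: 'o200k_base',
    contextLimit: 128000, inputCost: 2.50, outputCost: 10.00,
    url: 'https://artificialanalysis.ai/models/gpt-4o', tokenRatio: 1.0
  },
  'llama-3.1-70b': {
    name: 'Llama 3.1 70B', company: 'Meta', encoding: 'cl100k_base',
    contextLimit: 131072, inputCost: 0.35, outputCost: 0.40,
    url: 'https://artificialanalysis.ai/models/llama-31-70b', tokenRatio: 0.95
  }
};

// Fields every model entry is expected to define (taken from the list above).
const REQUIRED_FIELDS = [
  'name', 'company', 'encoding', 'contextLimit',
  'inputCost', 'outputCost', 'url', 'tokenRatio'
];

// Hypothetical helper: returns human-readable problems, or [] when consistent.
function findConfigProblems(modelsData, companies, encodings) {
  const found = [];
  for (const [id, model] of Object.entries(modelsData)) {
    for (const field of REQUIRED_FIELDS) {
      if (model[field] === undefined) {
        found.push(`${id}: missing field "${field}"`);
      }
    }
    // `company` must match a key in COMPANIES.
    if (!Object.prototype.hasOwnProperty.call(companies, model.company)) {
      found.push(`${id}: unknown company "${model.company}"`);
    }
    // Assumption: MODEL_ENCODINGS and the model's own `encoding` should agree.
    if (encodings[id] !== model.encoding) {
      found.push(`${id}: encoding differs from MODEL_ENCODINGS`);
    }
  }
  return found;
}

console.log(findConfigProblems(MODELS_DATA, COMPANIES, MODEL_ENCODINGS)); // []
```

Running a check like this at startup or in a test would surface a mistyped company name before the selector UI tries to read its branding.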
## Usage Examples

### Retrieving Model Configuration

```javascript theme={null}
// Get complete model data
const modelData = MODELS_DATA['gpt-4o'];

console.log(modelData.name);         // "GPT-4o"
console.log(modelData.contextLimit); // 128000
console.log(modelData.inputCost);    // 2.5
```

### Calculating Token Costs

```javascript theme={null}
function calculateCost(modelId, inputTokens, outputTokens) {
  const model = MODELS_DATA[modelId];

  if (!model) {
    throw new Error(`Model ${modelId} not found`);
  }

  // Apply token ratio adjustment
  const adjustedInput = inputTokens * model.tokenRatio;
  const adjustedOutput = outputTokens * model.tokenRatio;

  // Calculate costs (prices are per 1M tokens)
  const inputCost = (adjustedInput / 1_000_000) * model.inputCost;
  const outputCost = (adjustedOutput / 1_000_000) * model.outputCost;

  return {
    input: inputCost,
    output: outputCost,
    total: inputCost + outputCost
  };
}

// Example usage
const cost = calculateCost('gpt-4o', 50000, 10000);
console.log(`Total cost: $${cost.total.toFixed(4)}`); // Total cost: $0.2250
```

### Building Model Selector UI

```javascript theme={null}
function buildModelSelector() {
  const modelsByCompany = {};

  // Group models by company
  Object.entries(MODELS_DATA).forEach(([id, model]) => {
    if (!modelsByCompany[model.company]) {
      modelsByCompany[model.company] = [];
    }
    modelsByCompany[model.company].push({ id, ...model });
  });

  // Build UI with company branding
  const selectorHTML = Object.entries(modelsByCompany).map(([company, models]) => {
    const branding = COMPANIES[company];
    return `

${branding.logo} ${company}

${models.map(m => `

${m.name} ${(m.contextLimit / 1000).toFixed(0)}K $${m.inputCost}/$${m.outputCost}

`).join('')}

    `;
  }).join('');

  return selectorHTML;
}
```

### Validating Context Length

```javascript theme={null}
function validateContextLength(modelId, tokenCount) {
  const model = MODELS_DATA[modelId];

  if (!model) {
    return { valid: false, error: 'Model not found' };
  }

  // Apply token ratio
  const adjustedTokens = tokenCount * model.tokenRatio;

  if (adjustedTokens > model.contextLimit) {
    return {
      valid: false,
      error: `Token count (${adjustedTokens.toFixed(0)}) exceeds model's context limit (${model.contextLimit})`,
      limit: model.contextLimit,
      current: adjustedTokens
    };
  }

  return {
    valid: true,
    limit: model.contextLimit,
    current: adjustedTokens,
    remaining: model.contextLimit - adjustedTokens
  };
}

// Example usage
const validation = validateContextLength('claude-3.5-sonnet', 150000);
if (!validation.valid) {
  console.error(validation.error);
}
```

### Comparing Model Costs

```javascript theme={null}
function compareModelCosts(inputTokens, outputTokens, modelIds) {
  return modelIds.map(modelId => {
    const model = MODELS_DATA[modelId];
    // calculateCost throws if the model ID is unknown
    const cost = calculateCost(modelId, inputTokens, outputTokens);

    return {
      modelId,
      name: model.name,
      company: model.company,
      cost: cost.total,
      // Apply the token ratio here too, so the fit check matches the cost estimate
      contextFit: (inputTokens + outputTokens) * model.tokenRatio <= model.contextLimit
    };
  }).sort((a, b) => a.cost - b.cost);
}

// Example: Find cheapest model for a specific workload
const comparison = compareModelCosts(100000, 20000, [
  'gpt-4o',
  'claude-3.5-sonnet',
  'gemini-1.5-pro',
  'llama-3.1-70b'
]);

console.table(comparison);
```

## Model Categories

The configuration includes **48 models** across multiple categories:

* 5 OpenAI models, including GPT-4o, GPT-4 Turbo, and GPT-3.5 Turbo
* 4 Anthropic Claude models, from Haiku to Opus
* 2 Google Gemini 1.5 models with very large context windows
* 37 models from Meta, Mistral, Alibaba, and others

## Token Ratio Explained

The `tokenRatio` field adjusts for differences in how models count tokens.

OpenAI models and most approximations use 1.0 as the baseline.
**Models:** GPT-4o, GPT-4, GPT-4 Turbo, GPT-3.5 Turbo

Ratios above 1.0 mark models that typically count more tokens for the same text.

**Examples:**

* Claude models: 1.1 (10% more tokens)
* Gemini models: 1.05 (5% more tokens)
* Amazon Titan: 1.04 (4% more tokens)
* Snowflake Arctic: 1.06 (6% more tokens)

Ratios below 1.0 mark models that typically count fewer tokens for the same text.

**Examples:**

* Llama models: 0.95 (5% fewer tokens)
* Alibaba Qwen: 0.92 (8% fewer tokens)
* DeepSeek: 0.93 (7% fewer tokens)
* AI21 Jamba: 0.94 (6% fewer tokens)

Token ratios are approximations based on empirical testing. Actual token counts may vary depending on text characteristics.

## Best Practices

Confirm that the model exists before reading its configuration:

```javascript theme={null}
// Inside a function that receives modelId
if (!MODELS_DATA[modelId]) {
  console.error('Model not found');
  return;
}
```

Apply the token ratio before computing costs or checking limits:

```javascript theme={null}
const adjustedTokens = tokenCount * model.tokenRatio;
```

Check that your content fits within the model's context window before making API calls.

Always reference the `COMPANIES` object for visual consistency across the UI.
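The best practices on this page (validate the model, apply the token ratio, check the context window) can be combined into one pre-flight step. The sketch below is illustrative only: `estimateRequest` is a hypothetical helper, the `MODELS_DATA` excerpt copies the Claude 3.5 Sonnet values from the example earlier on this page, and rounding the adjusted counts to whole tokens is an assumption made here, not something the configuration prescribes.

```javascript
// Hypothetical excerpt; the real object lives in models-config.js.
const MODELS_DATA = {
  'claude-3.5-sonnet': {
    name: 'Claude 3.5 Sonnet',
    contextLimit: 200000,
    inputCost: 3.00,   // USD per 1M input tokens
    outputCost: 15.00, // USD per 1M output tokens
    tokenRatio: 1.1
  }
};

// Hypothetical helper: takes baseline token counts and returns adjusted
// counts, a context-fit flag, and an estimated cost in USD.
function estimateRequest(modelId, rawInputTokens, rawOutputTokens) {
  // 1. Validate that the model exists.
  const model = MODELS_DATA[modelId];
  if (!model) {
    return { ok: false, error: `Model ${modelId} not found` };
  }

  // 2. Apply the token ratio (rounded to whole tokens; an assumption).
  const inputTokens = Math.round(rawInputTokens * model.tokenRatio);
  const outputTokens = Math.round(rawOutputTokens * model.tokenRatio);

  // 3. Check the adjusted total against the context window.
  const fitsContext = inputTokens + outputTokens <= model.contextLimit;

  // 4. Estimate cost (prices are per 1M tokens).
  const cost =
    (inputTokens / 1_000_000) * model.inputCost +
    (outputTokens / 1_000_000) * model.outputCost;

  return { ok: true, inputTokens, outputTokens, fitsContext, cost };
}

const estimate = estimateRequest('claude-3.5-sonnet', 100000, 20000);
console.log(estimate.inputTokens);     // 110000
console.log(estimate.fitsContext);     // true
console.log(estimate.cost.toFixed(2)); // 0.66
```

Returning a result object instead of throwing keeps the helper usable in UI code, where an unknown model ID is usually a display problem and not a fatal error.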