Token (AI)
A token is the smallest unit of text that an AI language model processes, such as a word, part of a word, or punctuation mark.
What is a token?
A token is the basic unit in which AI language models process text: a complete word, part of a word, a number, or a punctuation mark. The Dutch word "volgsysteem" (tracking system), for example, is split into multiple tokens. The number of tokens determines how much text a model can process at once (the context window) and directly affects the processing speed and cost of AI applications.
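How a compound like "volgsysteem" ends up as multiple tokens can be sketched with a toy greedy longest-match tokenizer. The two-entry vocabulary below is invented for illustration; real models use learned BPE or WordPiece vocabularies with tens of thousands of entries.

```python
def tokenize(word, vocab):
    """Greedily split a word into the longest subwords found in vocab,
    falling back to single characters for unknown spans."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab or j == i + 1:  # single-char fallback
                tokens.append(piece)
                i = j
                break
    return tokens

# Toy vocabulary for illustration only.
print(tokenize("volgsysteem", {"volg", "systeem"}))  # ['volg', 'systeem']
```

A real tokenizer works on whole sentences and learns its vocabulary from data, but the principle is the same: frequent pieces become single tokens, rare words are split.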
How do tokens work?
When text is sent to a language model, it is first split into tokens by a tokenizer. Each token is mapped to a numerical ID that the model uses to process the text statistically. Both the input (the prompt) and the output (the answer) consume tokens. An efficient prompt not only produces better answers but also consumes fewer tokens. Wabber optimizes token processing within the RAG pipeline, ensuring that only the most relevant context is provided to the model.
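The token-to-ID step can be sketched with a toy encoder that assigns IDs on the fly; a real tokenizer uses the model's fixed, learned vocabulary, so the IDs below are purely illustrative.

```python
# Toy encoder: gives each distinct token a numeric ID, mimicking what a
# tokenizer does. A real model uses the vocabulary it was trained with.
def encode(tokens, vocab):
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]

vocab = {}
prompt_ids = encode(["How", "do", "I", "process", "a", "return", "shipment", "?"], vocab)
print(prompt_ids)       # [0, 1, 2, 3, 4, 5, 6, 7]
print(len(prompt_ids))  # 8 tokens consumed by the prompt
```

Repeated tokens reuse the same ID, which is why common words are cheap: they map to a single, already-known vocabulary entry.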
Example
A logistics company uses Wabber's AI chatbot to quickly answer employee questions about shipping instructions and procedures. When an employee asks "How do I process a return shipment?", this question is converted into approximately 8 tokens. The system then retrieves relevant passages from the vector database and sends them as context, costing perhaps 500 tokens in total. Because Wabber intelligently selects the context, no unnecessary tokens are consumed and the employee receives an accurate answer within seconds.
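The request above can be expressed as a simple token budget. The context-window size and the room reserved for the answer below are assumptions for illustration, not Wabber's actual configuration.

```python
# Illustrative token budget for one RAG request (all numbers hypothetical).
CONTEXT_WINDOW = 8192        # assumed model context window
question_tokens = 8          # "How do I process a return shipment?"
context_tokens = 492         # retrieved passages sent alongside the question
reserve_for_answer = 1024    # room left for the model's output

used = question_tokens + context_tokens
remaining = CONTEXT_WINDOW - used - reserve_for_answer
print(f"input uses {used} tokens; {remaining} tokens still free")
```

Keeping the retrieved context tight, as the example describes, is what leaves room in the window for long answers and follow-up questions.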
Why are tokens important?
Tokens determine the capacity, speed, and cost of AI applications. The more tokens a model can process simultaneously, the more extensive the questions and documents it can handle. On Wabber's private cluster, tokens are processed locally without data being sent to external servers, which is essential for organizations working with confidential information. With 128GB of VRAM, Wabber can run models with large context windows, processing more information simultaneously for more accurate answers.
Frequently asked questions
How many tokens can an AI model process at once?
The number of tokens a model can process at once is called the context window. Modern models support context windows ranging from 4,000 to over 200,000 tokens. On Wabber's cluster, with 128GB of VRAM, models with large context windows can be run, allowing extensive documents and conversation histories to be processed at once.
What does token usage cost in AI?
With commercial cloud providers, tokens are charged per thousand or per million, with output tokens being more expensive than input tokens. On Wabber's private cluster, there are no per-token costs because processing takes place locally on our own hardware. This makes usage predictable and cost-effective, especially with intensive use.
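The per-token pricing model can be made concrete with a back-of-the-envelope calculation. The rates and monthly volumes below are invented for illustration and are not any provider's actual prices.

```python
# Hypothetical cloud pricing: output tokens cost more than input tokens.
input_rate = 3.00 / 1_000_000    # assumed $3 per million input tokens
output_rate = 15.00 / 1_000_000  # assumed $15 per million output tokens

monthly_input_tokens = 50_000_000
monthly_output_tokens = 10_000_000

cost = (monthly_input_tokens * input_rate
        + monthly_output_tokens * output_rate)
print(f"${cost:.2f} per month")  # $300.00 per month
```

On a flat-cost private cluster the same usage carries no marginal per-token fee, which is why heavy usage shifts the comparison in its favor.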
What is the difference between tokens and words?
A word can consist of one or more tokens. Short, common words are often a single token, while longer or rarer words are split into multiple tokens. As a rule of thumb, 1 token is approximately 0.75 words in English; the ratio is slightly lower in Dutch because long compound words are split into more tokens.
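The 0.75-words-per-token rule of thumb gives a quick way to estimate token counts from plain text. The ratio is approximate and language-dependent, so treat the result as an estimate, not an exact count.

```python
# Estimate token count from word count using the ~0.75 words-per-token
# rule of thumb. The ratio is approximate and varies by language.
def estimate_tokens(text, words_per_token=0.75):
    return round(len(text.split()) / words_per_token)

print(estimate_tokens("How do I process a return shipment"))  # 7 words -> ~9 tokens
```

For Dutch text a slightly lower `words_per_token` value would be appropriate, reflecting the longer compound words mentioned above.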
Is my data processed securely during tokenization?
On Wabber's private cluster, all tokens are processed locally on our own hardware in the Netherlands. No data leaves the cluster, guaranteeing complete privacy and data sovereignty. This is an important difference from cloud-based AI services where data is sent to external servers.