FoNE: Precise Single-Token Number Embeddings Via Fourier Features
Source
- Raw Markdown: paper_fone-2025.md
- PDF: paper_fone-2025.pdf
- Preprint: arXiv 2502.09741
- Venue page: OpenReview ICLR 2026
- Official project page: fouriernumber.github.io
- Official code: KevinZhoutianyi/FoNE
Core Claim
FoNE maps each number into a single token embedding built from Fourier features, using sine/cosine components at digit-aligned periods so numeric values can be represented without fragmented subword or digit tokens.
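As a rough illustration of the claim above, the sketch below builds one cosine/sine pair per power-of-10 period for a non-negative integer. The function name `fourier_number_embedding` and the `num_digits` parameter are hypothetical, chosen for this note; fractional digits, sign, and how the vector is fitted to the model width are left out.

```python
import math

def fourier_number_embedding(x, num_digits=5):
    """Return [cos, sin] pairs for periods 10, 100, ..., 10**num_digits.

    Assumption for this sketch: x is a non-negative integer.
    """
    features = []
    for i in range(1, num_digits + 1):
        period = 10 ** i
        # Reduce modulo the period first so the angle stays in [0, 2*pi).
        angle = 2.0 * math.pi * (x % period) / period
        features.extend([math.cos(angle), math.sin(angle)])
    return features

emb = fourier_number_embedding(12345)
print(len(emb))  # 10 values: one (cos, sin) pair per period
```

Each pair lies on the unit circle, and the pair at period 10 depends only on the last digit, so 5 and 15 share the same first pair.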
Key Contributions
- Defines Fourier Number Embedding as a concatenation of circular embeddings over powers-of-10 periods.
- Uses each sine/cosine pair to recover a modular component of the number, giving a digit-aligned representation.
- Adds the Fourier number embedding to a learned `[NUM]` token, then decodes numbers by matching hidden-state pairs to digit embeddings.
- Reports stronger arithmetic performance and data efficiency than subword and digit-wise baselines in its controlled experiments.
- Builds directly on the observation that pretrained LLMs already contain Fourier-like number features.
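The modular-recovery idea in the list above can be sketched in a noiseless setting: each pair at period 10^i yields the residue x mod 10^i through its angle, and consecutive residues differ by exactly one digit. This is an idealized illustration, not the paper's decoding procedure, which matches hidden-state pairs against digit embeddings. The helper names `encode`, `decode_digits`, and `decode_number` are hypothetical, and the input is assumed to be a non-negative integer.

```python
import math

def encode(x, num_digits):
    """One (cos, sin) pair per period 10, 100, ..., 10**num_digits."""
    pairs = []
    for i in range(1, num_digits + 1):
        period = 10 ** i
        angle = 2.0 * math.pi * (x % period) / period
        pairs.append((math.cos(angle), math.sin(angle)))
    return pairs

def decode_digits(pairs):
    """Recover decimal digits, least significant first, from exact pairs."""
    digits = []
    prev_residue = 0
    for i, (c, s) in enumerate(pairs, start=1):
        period = 10 ** i
        # atan2 returns an angle in (-pi, pi]; shift it into [0, 2*pi).
        angle = math.atan2(s, c) % (2.0 * math.pi)
        # Map the angle back to the nearest integer residue x mod 10**i.
        residue = round(angle * period / (2.0 * math.pi)) % period
        # Consecutive residues differ by digit * 10**(i - 1).
        digits.append((residue - prev_residue) // 10 ** (i - 1))
        prev_residue = residue
    return digits

def decode_number(pairs):
    return sum(d * 10 ** k for k, d in enumerate(decode_digits(pairs)))

print(decode_digits(encode(4072, 4)))  # [2, 7, 0, 4]
print(decode_number(encode(4072, 4)))  # 4072
```

With noisy hidden states the rounding step would no longer be exact, which is one reason a matching-based decoder is the more natural choice in a trained model.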
Method Notes
FoNE is the cleanest source in this batch for a smooth, periodic basis view of number embeddings. Its closest time-series analogy is EIDOS-style point-wise scalar encoding: both use bounded periodic basis functions to map scalar numeric values into higher-dimensional representations.
The difference is semantic and operational. FoNE is designed for literal numbers in language-model text and arithmetic outputs. EIDOS maps observed time-series samples into latent tokens for passive forecasting. The two should not be collapsed into one method, but they support the same broader design question: scalar numeric values may deserve specialized embeddings rather than ordinary tokenization.
Slug note: this page uses the arXiv submission year 2025 in the slug, while the OpenReview venue page lists the paper as an ICLR 2026 poster.
Evidence And Results
The abstract and results report that FoNE reduces the number of tokens per number and improves arithmetic accuracy in controlled language-model experiments. The project page and paper present a concrete tokenization comparison for a decimal number, then show how modular Fourier components represent digits.
The source is also important historically: it cites Pre-trained Large Language Models Use Fourier Features To Compute Addition as the mechanistic motivation for explicitly building Fourier number embeddings.
Limitations
FoNE’s strongest claims are for controlled arithmetic tasks. BitTokens challenges its generality, arguing that sinusoidal/Fourier encodings are well suited to addition but force non-local decoding and re-encoding for multiplication and division. Convergent Evolution adds a diagnostic caveat: Fourier spectra can be present even when modular residue classes are not linearly usable, so FoNE-style claims should be checked with geometric probes or downstream task tests, not spectrum alone. Treat FoNE as an important representation proposal, not as a settled universal numeric encoding.
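The addition-versus-multiplication asymmetry can be illustrated with the angle-addition identity. Treating each (cos, sin) pair as a unit complex number, the pair for a + b is the pair for a rotated by the angle of b, so addition acts on each period independently. The sketch below is an illustration written for this note, not code from FoNE or BitTokens; `phasor` and `add_via_rotation` are hypothetical names.

```python
import cmath
import math

def phasor(x, period):
    """Unit complex number whose real/imag parts are the (cos, sin) pair."""
    return cmath.exp(2j * math.pi * (x % period) / period)

def add_via_rotation(pair_a, pair_b):
    """Angle addition: multiplying unit phasors adds their angles."""
    return pair_a * pair_b

a, b, period = 47, 85, 100
rotated = add_via_rotation(phasor(a, period), phasor(b, period))

# Addition: the rotated pair equals the pair for (a + b) at this period.
print(cmath.isclose(rotated, phasor(a + b, period), abs_tol=1e-9))  # True

# Multiplication: the same rotation does not give the pair for a * b.
print(cmath.isclose(rotated, phasor(a * b, period), abs_tol=1e-9))  # False

# One route to the product pair needs b as a decoded integer exponent.
print(cmath.isclose(phasor(a, period) ** b, phasor(a * b, period), abs_tol=1e-9))  # True
```

The last line shows only that the product pair is reachable once one operand has been decoded to an integer. It is offered as intuition for the decode-and-re-encode concern, not as a reproduction of the BitTokens argument.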
Foundation TSFM Relevance
| Agenda slot | Verdict | Evidence | Missing pieces |
|---|---|---|---|
| Point-wise numeric embeddings | partially closes | Encodes each number as one Fourier-feature token with digit-aligned periods and exact modular recovery properties. | Not tested on sensor values, units, missingness, uncertainty, or time-series forecasting. |
| Representation quality | adjacent | Preserves dense numeric detail better than fragmented subword or digit tokens in controlled arithmetic tasks. | No evidence that the representation preserves regimes, causal variables, or generative fidelity for time series. |
| Benchmarks: what level of modeling is tested? | warning | Strong evidence is arithmetic-focused, including addition, subtraction, and multiplication. | Arithmetic accuracy should not be treated as proof of TSFM numeric-token quality. |
Links Into The Wiki
- Foundation Time-Series Model Research Agenda
- FoNE
- Number Tokenization
- Latent Tokenization
- Tokenizer Transfer
- Pre-trained Large Language Models Use Fourier Features To Compute Addition
- Convergent Evolution
Open Questions
- Can FoNE-style periodic scalar embeddings improve point-wise time-series embeddings beyond arithmetic tasks?
- Should Fourier number embeddings be combined with bit-level or logarithmic encodings to cover both addition and multiplication-like operations?
- How should sign, uncertainty, missingness, and measurement units be represented when FoNE-style encodings are applied to auxiliary numeric values?