Distributional semantics

How words in a given language are related can be represented in a "semantic space", which mathematically corresponds to a vector space.

Distributional semantics[1] is a research area that develops and studies theories and methods for quantifying and categorizing semantic similarities between linguistic items based on their distributional properties in large samples of language data. The basic idea of distributional semantics can be summed up in the so-called distributional hypothesis: linguistic items with similar distributions have similar meanings.

Distributional hypothesis

The distributional hypothesis in linguistics is derived from the semantic theory of language usage, i.e. that words used and occurring in the same contexts tend to have similar meanings.[2]

The underlying idea that "a word is characterized by the company it keeps" was popularized by Firth in the 1950s.[3]

The distributional hypothesis is the basis for statistical semantics. Although the distributional hypothesis originated in linguistics,[4] it is now also receiving attention in cognitive science, especially regarding the context of word use.[5]

In recent years, the distributional hypothesis has provided the basis for the theory of similarity-based generalization in language learning: the idea that children can figure out how to use words they have rarely encountered before by generalizing from the distributions of similar words.[6][7]

The distributional hypothesis suggests that the more semantically similar two words are, the more distributionally similar they will be in turn, and thus the more they will tend to occur in similar linguistic contexts.

Whether or not this suggestion holds has significant implications for both the data-sparsity problem in computational modeling,[8] and for the question of how children are able to learn language so rapidly given relatively impoverished input (this is also known as the problem of the poverty of the stimulus).

Distributional semantic modeling in vector spaces

Distributional semantics favors the use of linear algebra as a computational tool and representational framework. The basic approach is to collect distributional information in high-dimensional vectors, and to define distributional/semantic similarity in terms of vector similarity.[9] Different kinds of similarity can be extracted depending on which type of distributional information is used to populate the vectors: topical similarities can be extracted by recording which text regions the linguistic items occur in; paradigmatic similarities can be extracted by recording which other linguistic items the items co-occur with. Note that the latter type of vector can also be used to extract syntagmatic similarities by looking at the individual vector components.
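The basic approach can be sketched in a few lines: count, for each word, the words that occur near it, and compare the resulting count vectors with cosine similarity. The toy corpus and window size below are arbitrary choices for illustration, not part of any particular published model.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """For each word, count how often every other word appears
    within `window` positions of it (a paradigmatic context)."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]
vecs = cooccurrence_vectors(corpus)
# "cat" and "dog" share contexts ("the", "drinks", "chases"), so their
# vectors are more similar than those of "cat" and "milk".
print(cosine(vecs["cat"], vecs["dog"]) > cosine(vecs["cat"], vecs["milk"]))  # True
```

Even on this tiny corpus the distributional hypothesis is visible: the two animal nouns, which occur in similar contexts, end up with similar vectors.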

The basic idea of a correlation between distributional and semantic similarity can be operationalized in many different ways. There is a rich variety of computational models implementing distributional semantics, including latent semantic analysis (LSA),[10][11] Hyperspace Analogue to Language (HAL), syntax- or dependency-based models,[12] random indexing, semantic folding[13] and various variants of the topic model.[14]
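As one concrete operationalization, latent semantic analysis factors a term-document count matrix with a truncated singular value decomposition, so that terms from the same topic land close together in a low-rank "latent" space. The matrix below is a hypothetical toy example, not real corpus data.

```python
import numpy as np

# Toy term-document count matrix: rows = terms, columns = documents.
# Documents 1-2 are about animals, document 3 about finance.
terms = ["cat", "dog", "milk", "stock", "market"]
X = np.array([
    [2, 1, 0],   # cat
    [1, 2, 0],   # dog
    [1, 1, 0],   # milk
    [0, 0, 3],   # stock
    [0, 0, 2],   # market
], dtype=float)

# LSA: truncated SVD projects terms into a k-dimensional latent space.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vectors = U[:, :k] * s[:k]

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Terms from the same topic are close; terms from different topics are not.
print(cos(term_vectors[0], term_vectors[1]) > cos(term_vectors[0], term_vectors[3]))  # True
```

Models such as HAL, random indexing, and topic models differ in how the matrix is built and reduced, but share this matrix-factorization view of distributional similarity.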

Distributional semantic models differ primarily with respect to the following parameters:

- context type (text regions vs. linguistic items)
- context window (size, extension, etc.)
- frequency weighting (e.g. entropy, pointwise mutual information,[15] etc.)
- dimension reduction (e.g. random indexing, singular value decomposition, etc.)
- similarity measure (e.g. cosine similarity, Minkowski distance, etc.)

Distributional semantic models that use linguistic items as context have also been referred to as word space, or vector space models.[16][17]

Beyond lexical semantics

Distributional semantics has typically been applied to lexical items (words and multi-word terms) with considerable success, not least because of its applicability as an input layer for neurally inspired deep learning models. However, lexical semantics, i.e. the meaning of words, carries only part of the semantics of an entire utterance. The meaning of a clause, e.g. "Tigers love rabbits.", can only partially be understood from the meanings of the three lexical items it consists of. Distributional semantics can straightforwardly be extended to cover larger linguistic items such as constructions, with and without non-instantiated items, but some of the base assumptions of the model need to be adjusted somewhat. Construction grammar, with its formulation of the lexical-syntactic continuum, offers one approach for including more elaborate constructions in a distributional semantic model, and some experiments have been implemented using the random indexing approach.[18]

Compositional distributional semantic models extend distributional semantic models with explicit semantic functions that use syntactically based rules to combine the semantics of participating lexical units into a compositional model characterizing the semantics of entire phrases or sentences. This work was originally proposed by Stephen Clark, Bob Coecke, and Mehrnoosh Sadrzadeh of Oxford University in their 2008 paper, "A Compositional Distributional Model of Meaning".[19] Different approaches to composition have been explored, including neural models, and are under discussion at established workshops such as SemEval.[20]
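The simplest composition functions studied in this literature are vector addition and element-wise multiplication; the categorical model of Clark, Coecke, and Sadrzadeh uses richer tensor-based operations that this sketch does not reproduce. The 4-dimensional word vectors below are hypothetical values chosen purely for illustration.

```python
def compose_add(u, v):
    """Additive composition: the phrase vector is the sum of its parts."""
    return [a + b for a, b in zip(u, v)]

def compose_mult(u, v):
    """Multiplicative composition: element-wise product, which keeps
    only dimensions on which both words have weight."""
    return [a * b for a, b in zip(u, v)]

# Hypothetical word vectors (illustrative values, not trained).
tigers  = [0.9, 0.1, 0.4, 0.0]
love    = [0.2, 0.8, 0.1, 0.3]
rabbits = [0.7, 0.2, 0.5, 0.1]

# Phrase vector for "Tigers love rabbits", composed left to right.
phrase = compose_add(compose_add(tigers, love), rabbits)
print([round(x, 2) for x in phrase])  # [1.8, 1.1, 1.0, 0.4]
```

Addition is order-insensitive, so it cannot distinguish "Tigers love rabbits" from "Rabbits love tigers"; this limitation is one motivation for the syntax-aware, tensor-based composition functions of the categorical model.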


Distributional semantic models have been applied successfully to tasks such as measuring semantic similarity between words and multi-word expressions, word clustering, automatic thesaurus construction, word sense disambiguation, and query expansion.


References


  1. ^ Lenci, Alessandro; Sahlgren, Magnus (2023). Distributional Semantics. Cambridge University Press. ISBN 9780511783692.
  2. ^ Harris 1954
  3. ^ Firth 1957
  4. ^ Sahlgren 2008
  5. ^ McDonald & Ramscar 2001
  6. ^ Gleitman 2002
  7. ^ Yarlett 2008
  8. ^ Wishart, Ryder; Prokopidis, Prokopis (2017). Topic Modelling Experiments on Hellenistic Corpora (PDF). Proceedings of the Workshop on Corpora in the Digital Humanities 17. S2CID 9191936.
  9. ^ Rieger 1991
  10. ^ Deerwester et al. 1990
  11. ^ Landauer, Thomas K.; Dumais, Susan T. (1997). "A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge". Psychological Review. 104 (2): 211–240. doi:10.1037/0033-295x.104.2.211.
  12. ^ Padó & Lapata 2007
  13. ^ De Sousa Webber, Francisco (2015). "Semantic Folding Theory And its Application in Semantic Fingerprinting". arXiv:1511.08855 [cs.AI].
  14. ^ Jordan, Michael I.; Ng, Andrew Y.; Blei, David M. (2003). "Latent Dirichlet Allocation". Journal of Machine Learning Research. 3 (Jan): 993–1022.
  15. ^ Church, Kenneth Ward; Hanks, Patrick (1989). "Word association norms, mutual information, and lexicography". Proceedings of the 27th Annual Meeting on Association for Computational Linguistics. Morristown, NJ, USA: Association for Computational Linguistics: 76–83. doi:10.3115/981623.981633.
  16. ^ Schütze 1993
  17. ^ Sahlgren 2006
  18. ^ Karlgren, Jussi; Kanerva, Pentti (July 2019). "High-dimensional distributed semantic spaces for utterances". Natural Language Engineering. 25 (4): 503–517. arXiv:2104.00424. doi:10.1017/S1351324919000226. S2CID 201141249.
  19. ^ Clark, Stephen; Coecke, Bob; Sadrzadeh, Mehrnoosh (2008). "A compositional distributional model of meaning" (PDF). Proceedings of the Second Quantum Interaction Symposium: 133–140.
  20. ^ "SemEval-2014, Task 1".

