About: Latent semantic analysis


An Entity of Type: yago:Whole100003553, within Data Space: dbpedia.demo.openlinksw.com, associated with source document(s)

http://dbpedia.demo.openlinksw.com/c/5yATPZEzYv

Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, for analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix of word counts per document (rows represent unique words, columns represent documents) is constructed from a large piece of text, and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by the cosine similarity between any two columns. Values close to 1 repre
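The pipeline described above (count matrix, SVD-based reduction of the rows, cosine similarity between columns) can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes NumPy is available, and the toy corpus, the helper names (`term_document_matrix`, `lsa_document_vectors`, `cosine_similarity`), and the choice of `k=2` concepts are all hypothetical. It also uses raw counts with no weighting.

```python
import numpy as np

# Hypothetical toy corpus: two documents about pets, two about finance.
docs = [
    "cat dog pet",
    "dog pet animal",
    "stock market trade",
    "market trade money",
]


def term_document_matrix(documents):
    """Return (vocabulary, counts) with one row per unique word, one column per document."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for word in doc.split():
            counts[index[word], j] += 1
    return vocab, counts


def lsa_document_vectors(counts, k):
    """Keep the k largest singular values; return a k x n_documents matrix.

    Each column is a document expressed in k "concept" coordinates
    instead of one coordinate per word, so the number of rows shrinks.
    """
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    return np.diag(s[:k]) @ vt[:k, :]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


vocab, counts = term_document_matrix(docs)
doc_vectors = lsa_document_vectors(counts, k=2)

raw_same_topic = cosine_similarity(counts[:, 0], counts[:, 1])
same_topic = cosine_similarity(doc_vectors[:, 0], doc_vectors[:, 1])
different_topic = cosine_similarity(doc_vectors[:, 0], doc_vectors[:, 2])
print(raw_same_topic, same_topic, different_topic)
```

On this toy corpus the raw count columns of the two pet documents have cosine similarity 2/3. After the rank-2 reduction they should come out at approximately 1, while a pet document and a finance document, which share no words, stay at approximately 0. Real corpora are far less clean, and the number of concepts kept is a tuning choice.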

Attributes	Values
rdf:type	topical concept yago:WikicatLatentVariableModels yago:Assistant109815790 yago:CausalAgent100007347 yago:LivingThing100004258 yago:Model110324560 yago:Object100002684 yago:Organism100004475 yago:Person100007846 yago:PhysicalEntity100001930 yago:Worker109632518 yago:YagoLegalActor yago:YagoLegalActorGeo yago:Whole100003553
rdfs:label	Latent Semantic Analysis (de) Ezkutuko semantikaren analisia (eu) Analyse sémantique latente (fr) Latent semantic analysis (en) 潜在意味解析 (ja) Латентно-семантический анализ (ru) Latent semantisk analys (sv) Латентно-семантичний аналіз (uk) 潜在语义学 (zh)
rdfs:comment	Latent semantic analysis (LSA), also called latent semantic indexing (LSI), is a natural language processing technique within vector semantics. LSA was patented in 1988 and published in 1990. It establishes relationships between a set of documents and the terms they contain by constructing "concepts" related to the documents and terms. (fr)
	Latent semantic analysis (LSA) is a natural language processing technique based on the vector space model. It analyzes the relationship between a collection of documents and the terms they contain by generating a set of concepts related to them. LSA was patented in the United States in 1988. In information retrieval it is also called latent semantic indexing (LSI). (ja)
	Latent semantic analysis (LSA), also called latent semantic indexing (LSI), is an indexing method in language technology that describes the relationship between terms (words) and documents in a corpus. The method places all documents in a high-dimensional vector space so that conceptually related documents are also close together in that space. One of its main goals is to retrieve all relevant documents for a search, including those that do not contain the exact terms used in the query. (sv)
	Latent semantic analysis is a new branch of semantics. Traditional semantics usually studies the meanings of characters and words and the relations between words, such as synonymy, near-synonymy, and antonymy. Latent semantic analysis explores a relationship hidden behind words, one based not on dictionary definitions but on the contexts in which words are used. The idea comes from psycholinguists, who hold that the world's hundreds of languages should share a simple common mechanism that lets anyone raised in a given language environment master that language. Guided by this idea, researchers found a simple mathematical model whose input is a corpus of documents written in any language and whose output is a mathematical representation (a vector) of that language's characters and words. Relations between words, and comparisons of meaning between arbitrary text fragments, are computed by operations on these vectors. The idea has also been applied to information retrieval, so latent semantic analysis is sometimes also called latent semantic indexing (LSI). (zh)
	Latent Semantic Indexing (LSI) is an information retrieval method (no longer under patent protection) that was first mentioned in 1990 by … et al. Methods such as LSI are of particular interest for searching large data collections such as the Internet. The goal of LSI is to find the principal components of documents. These principal components (concepts) can be thought of as general terms. For example, horse (Pferd) is a concept that covers German terms such as Mähre, Klepper, and Gaul. The method is therefore suited, for example, to picking out of a very large number of documents (such as those found on the Internet) the ones that deal with cars, even if the word car does not appear in them explicitly. Furthermore, LSI can … (de)
	Latent semantic analysis (LSA) is a language processing technique. To analyze the relationship between a set of documents and the terms that appear in them, a set of concepts is created based on the documents and terms. LSA assumes that words that are semantically very similar appear in texts with similar meaning. A term-document matrix is built by computing the occurrence frequencies of terms in the paragraphs of the texts (one row per term and one column per paragraph), and a mathematical technique called singular value decomposition (SVD) is used to reduce the dimension of the vector representations of terms and documents. To compute the semantic similarity of words (terms), the cosine of the angle between row vectors is … (eu)
	Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, for analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix of word counts per document (rows represent unique words, columns represent documents) is constructed from a large piece of text, and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by the cosine similarity between any two columns. Values close to 1 repre (en)
	Latent semantic analysis (LSA) is a method of processing natural-language information, in particular …, that analyzes the relationship between a set of documents and the terms occurring in them by creating a set of concepts. LSA assumes that words close in meaning will occur in similar fragments of text (the distributional hypothesis). A matrix of word counts per paragraph is created from a large body of text (rows contain unique words, columns the text of each paragraph). When analyzing a set of documents, LSA takes as input a term-document matrix whose elements indicate how frequently each term is used in the documents (TF-IDF). (uk)
	Latent semantic analysis (LSA) is a natural-language information processing method that analyzes the relationship between a library of documents and the terms occurring in them, and identifies characteristic factors (topics) inherent in all the documents and terms. (ru)
dct:subject	Semantic relations; Information retrieval techniques; Latent variable models; Natural language processing
Wikipage page ID	689427 (xsd:integer)
Wikipage revision ID	1123547216 (xsd:integer)
Link from a Wikipage to another Wikipage	Science Applications International Corporation; Multinomial distribution; N-gram; Natural language processing; Neural network; Probabilistic latent semantic analysis; Bellcore; Patents; Deep learning; Information retrieval; Electronic Discovery; Semantic relations; Computational Linguistics; Correlation; Cosine similarity; Information retrieval techniques; George Furnas; Low-rank approximation; Cognitive Science; Eigenvector; Gensim; Concept; Context (language use); Contingency table; Correspondence analysis; Cross-language information retrieval; Ergodic hypothesis; Singular value decomposition; Latent variable models; Frobenius norm; Bag of words model; Distributional semantics; Document-term matrix; Document classification; Latent Dirichlet allocation; Latent semantic mapping; Latent semantic structure indexing; Factor analysis; Normal distribution; Diagonal matrix; Graphical model; Principal component analysis; Text corpus; Susan Dumais; Richard Harshman; Jean-Paul Benzécri; Boolean search; Apache Lucene; Natural language processing; Synonymy; Co-occurrence; Coh-Metrix; Higher-order statistics; Terminology; Dot product; Automated essay scoring; Automatic summarization; Physicians; Poisson distribution; Sparse matrix; Free recall; Data clustering

Faceted Search & Find service v1.17_git147 as of Sep 06 2024

OpenLink Virtuoso version 08.03.3331 as of Sep 2 2024, on Linux (x86_64-generic-linux-glibc212), Single-Server Edition (378 GB total memory, 50 GB memory in use)
Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2024 OpenLink Software