About: Text segmentation

An Entity of Type : yago:WikicatTasksOfNaturalLanguageProcessing, within Data Space : dbpedia.demo.openlinksw.com associated with source document(s)
http://dbpedia.demo.openlinksw.com/describe/?url=http%3A%2F%2Fdbpedia.org%2Fresource%2FText_segmentation&invfp=IFP_OFF&sas=SAME_AS_OFF

Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to mental processes used by humans when reading text, and to artificial processes implemented in computers, which are the subject of natural language processing. The problem is non-trivial, because while some written languages have explicit word boundary markers, such as the word spaces of written English and the distinctive initial, medial and final letter shapes of Arabic, such signals are sometimes ambiguous and not present in all written languages.
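The contrast described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference method: the function names, the sample strings, and the tiny lexicon are all assumptions made for the example. Greedy longest-match is used here only as one simple, well-known dictionary technique for text without word spaces.

```python
import re


def split_words_on_spaces(text):
    """Word segmentation for scripts that mark word boundaries with spaces."""
    return text.split()


def split_sentences_naive(text):
    """Split after '.', '!' or '?' followed by whitespace.

    Deliberately naive: it treats every such mark as a sentence boundary,
    so abbreviations like 'Dr.' are split incorrectly.
    """
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def max_match(text, lexicon):
    """Greedy longest-match segmentation for text without word spaces.

    At each position, take the longest lexicon entry that matches;
    fall back to a single character when nothing matches.
    """
    max_len = max(map(len, lexicon), default=1)
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical sample data for illustration only.
toy_lexicon = {"文本", "分割", "很", "难"}

print(split_words_on_spaces("Text segmentation is hard"))
print(split_words_on_spaces("文本分割很难"))            # no spaces: one token
print(max_match("文本分割很难", toy_lexicon))
print(split_sentences_naive("Dr. Smith arrived. He sat down."))
```

Splitting on spaces works for the English sample but returns the whole Chinese string as a single token, which is why a lexicon-based or statistical method is needed there. The sentence splitter shows the ambiguity of boundary signals: the period after "Dr" is treated as a sentence end even though it is not one.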

Attributes   Values
rdf:type
rdfs:label
  • تجزئة النص (ar)
  • Segmentació de text (ca)
  • Morphologische Analyse (Computerlinguistik) (de)
  • Segmentasi teks (in)
  • Text segmentation (en)
  • Análise morfológica (pt)
  • 文本分割 (zh)
rdfs:comment
  • Text segmentation is the process of dividing written text into meaningful units such as words, sentences, or topics. The term applies both to the mental processes humans use when reading text and to the artificial processes carried out by computers, which are a subject of the field of natural language processing. The process is not easy because, while some written languages have explicit word boundaries, such as the spaces between words in written English and the letter shapes that differ by position within the word (initial, medial, or final) in Arabic, these boundaries are sometimes ambiguous and are absent from some written languages. (ar)
  • Text segmentation is the process of separating written text into units of meaning such as words, sentences, or topics. The term can be applied both to the mental processes carried out by humans while reading text and to the artificial processes carried out by computers, which are an object of study in natural language processing. Although some scripts have explicit markers (such as spaces) or distinct initial, medial, and final letter forms (as in the Arabic script), these markers are sometimes ambiguous and not all written languages have them. (in)
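  • Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to the mental processes of humans reading text and to artificial processes implemented in computers, the latter belonging to the field of natural language processing. Some written languages have explicit word boundary markers, for example the spaces between words in English and the distinctive initial, medial, and final letter shapes of Arabic, but not all written languages have such markers. (zh)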
  • In computational linguistics, morphological analysis is a procedure that determines the morphological, syntactic, and possibly semantic properties of words. In detail, morphological analysis procedures can solve the following subtasks: (de)
  • Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to mental processes used by humans when reading text, and to artificial processes implemented in computers, which are the subject of natural language processing. The problem is non-trivial, because while some written languages have explicit word boundary markers, such as the word spaces of written English and the distinctive initial, medial and final letter shapes of Arabic, such signals are sometimes ambiguous and not present in all written languages. (en)
  • Morphology, also called morphological or morphic analysis, is the act of studying each of the words in a sentence independently in order to determine its grammatical class. There are ten grammatical classes: noun, adjective, article, pronoun, numeral, verb, adverb, preposition, conjunction, and interjection. In the example "A Wikipédia é uma enciclopédia livre." (pt)
rdfs:seeAlso
dcterms:subject
Wikipage page ID
Wikipage revision ID
Link from a Wikipage to another Wikipage
sameAs
dbp:wikiPageUsesTemplate