Within the probability theory Markov model, Markovian discrimination in spam filtering is a method used in CRM114 and other spam filters to model the statistical behaviors of spam and nonspam more accurately than in simple Bayesian methods. A simple Bayesian model of written text contains only the dictionary of legal words and their relative probabilities. A Markovian model adds the relative transition probabilities that given one word, predict what the next word will be. It is based on the theory of Markov chains by Andrey Markov, hence the name. In essence, a Bayesian filter works on single words alone, while a Markovian filter works on phrases or entire sentences.
Attributes | Values |
---|
rdf:type
| |
rdfs:label
| - Markow-Spamfilter (de)
- Markovian discrimination (en)
|
rdfs:comment
| - Der Markow-Spamfilter (nach Andrei Andrejewitsch Markow) ist ein Spamfilter basierend auf einem Hidden Markov Model und stellt eine Weiterentwicklung des Bayes-Spamfilters dar. Der Spamfilter errechnet dabei die Wahrscheinlichkeit, mit der die Wortketten des überprüften Textes zu Wortketten typischer Spamtexte passen. Während bei einem Bayes-Spamfilter die Wahrscheinlichkeit einzelner Wörter errechnet wird, zieht der Markow-Spamfilter Wortketten zur Ermittlung der Wahrscheinlichkeit heran und gewichtet die einzelnen Kombinationsmöglichkeiten. Ähneln die Wortketten des überprüften Textes denen typischer Spamtexte, so gilt der überprüfte Text als Spam. (de)
- Within the probability theory Markov model, Markovian discrimination in spam filtering is a method used in CRM114 and other spam filters to model the statistical behaviors of spam and nonspam more accurately than in simple Bayesian methods. A simple Bayesian model of written text contains only the dictionary of legal words and their relative probabilities. A Markovian model adds the relative transition probabilities that given one word, predict what the next word will be. It is based on the theory of Markov chains by Andrey Markov, hence the name. In essence, a Bayesian filter works on single words alone, while a Markovian filter works on phrases or entire sentences. (en)
|
dcterms:subject
| |
Wikipage page ID
| |
Wikipage revision ID
| |
Link from a Wikipage to another Wikipage
| |
sameAs
| |
dbp:wikiPageUsesTemplate
| |
has abstract
| - Der Markow-Spamfilter (nach Andrei Andrejewitsch Markow) ist ein Spamfilter basierend auf einem Hidden Markov Model und stellt eine Weiterentwicklung des Bayes-Spamfilters dar. Der Spamfilter errechnet dabei die Wahrscheinlichkeit, mit der die Wortketten des überprüften Textes zu Wortketten typischer Spamtexte passen. Während bei einem Bayes-Spamfilter die Wahrscheinlichkeit einzelner Wörter errechnet wird, zieht der Markow-Spamfilter Wortketten zur Ermittlung der Wahrscheinlichkeit heran und gewichtet die einzelnen Kombinationsmöglichkeiten. Ähneln die Wortketten des überprüften Textes denen typischer Spamtexte, so gilt der überprüfte Text als Spam. (de)
- Within the probability theory Markov model, Markovian discrimination in spam filtering is a method used in CRM114 and other spam filters to model the statistical behaviors of spam and nonspam more accurately than in simple Bayesian methods. A simple Bayesian model of written text contains only the dictionary of legal words and their relative probabilities. A Markovian model adds the relative transition probabilities that given one word, predict what the next word will be. It is based on the theory of Markov chains by Andrey Markov, hence the name. In essence, a Bayesian filter works on single words alone, while a Markovian filter works on phrases or entire sentences. There are two types of Markov models; the visible Markov model, and the hidden Markov model or HMM.The difference is that with a visible Markov model, the current word is considered to contain the entire state of the language model, while a hidden Markov model hides the state and presumes only that the current word is probabilistically related to the actual internal state of the language. For example, in a visible Markov model the word "the" should predict with accuracy the following word, while ina hidden Markov model, the entire prior text implies the actual state and predicts the following words, but doesnot actually guarantee that state or prediction. Since the latter case is what's encountered in spam filtering,hidden Markov models are almost always used. In particular, because of storage limitations, the specific typeof hidden Markov model called a Markov random field is particularly applicable, usually with a clique size ofbetween four and six tokens. (en)
|
gold:hypernym
| |
prov:wasDerivedFrom
| |
page length (characters) of wiki page
| |
foaf:isPrimaryTopicOf
| |
is Link from a Wikipage to another Wikipage
of | |
is foaf:primaryTopic
of | |