About: Determining the number of clusters in a data set



An Entity of Type: dbo:Disease, within Data Space: dbpedia.demo.openlinksw.com, associated with source document(s)

http://dbpedia.demo.openlinksw.com/describe/?url=http%3A%2F%2Fdbpedia.org%2Fresource%2FDetermining_the_number_of_clusters_in_a_data_set&invfp=IFP_OFF&sas=SAME_AS_OFF

Determining the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem.

Attributes	Values
rdf:type	disease
rdfs:label	Determining the number of clusters in a data set (en)
rdfs:comment	Determining the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. (en)
dcterms:subject	Cluster analysis; Articles with example pseudocode; Clustering criteria
Wikipage page ID	22324566 (xsd:integer)
Wikipage revision ID	1120852726 (xsd:integer)
Link from a Wikipage to another Wikipage	Bayesian information criterion; Robert Tibshirani; Elbow method (clustering); Cluster analysis; Robert L. Thorndike; Deviance information criterion; Covariance; Matrix (mathematics); Empirical; Cross-validation (statistics); Likelihood function; Limit (mathematics); Silhouette (clustering); Clustering algorithm; Mahalanobis distance; Articles with example pseudocode; Clustering criteria; Trevor Hastie; Data mining; Data set; Feature space; K-means clustering; Least squares; Akaike information criterion; DBSCAN; F-test; Normal distribution; Random variable; OPTICS algorithm; Asymptotic analysis; Hierarchical clustering; Dimensionality; Dot product; Data clustering; Information theory; R (programming language); Radial basis function; Genetic algorithms; StackOverflow; Non-parametric statistics; Expectation–maximization algorithm; Ralf Wagner; Gaussian mixture model; R-Project; Rate distortion theory; Explained variance; K-means algorithm; K-medoid
Link from a Wikipage to an external page	https://stackoverflow.com/a/15376462/1036500 https://hal.archives-ouvertes.fr/hal-02124947/document http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/
sameAs	Determining the number of clusters in a data set
dbp:wikiPageUsesTemplate	dbt:Main dbt:Math dbt:Mvar dbt:Reflist
thumbnail	wiki-commons:Special:FilePath/DataClustering_ElbowCriterion.jpg?width=300
has abstract	Determining the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a certain class of clustering algorithms (in particular k-means, k-medoids and the expectation–maximization algorithm), there is a parameter commonly referred to as k that specifies the number of clusters to detect. Other algorithms, such as DBSCAN and the OPTICS algorithm, do not require this parameter to be specified; hierarchical clustering avoids the problem altogether. The correct choice of k is often ambiguous, with interpretations depending on the shape and scale of the distribution of points in a data set and on the clustering resolution the user wants. In addition, increasing k without penalty will always reduce the amount of error in the resulting clustering, down to the extreme case of zero error when each data point is considered its own cluster (i.e., when k equals the number of data points, n). Intuitively, then, the optimal choice of k strikes a balance between maximum compression of the data using a single cluster and maximum accuracy by assigning each data point to its own cluster. If an appropriate value of k is not apparent from prior knowledge of the properties of the data set, it must be chosen by some other means; there are several categories of methods for making this decision. (en)
gold:hypernym	Problem
prov:wasDerivedFrom	wikipedia-en:Determining_the_number_of_clusters_in_a_data_set?oldid=1120852726&ns=0
page length (characters) of wiki page	19712 (xsd:nonNegativeInteger)
foaf:isPrimaryTopicOf	wikipedia-en:Determining_the_number_of_clusters_in_a_data_set
is Link from a Wikipage to another Wikipage of	Elbow method (clustering); Nadia Ghazzali; Scree plot; Correlation clustering; Silhouette (clustering); Cluster analysis; Fuzzy clustering; K-means clustering; Hierarchical clustering; List of statistics articles; Outline of brain mapping; Outline of machine learning; How many clusters; X-means clustering
is Wikipage redirect of	How many clusters; X-means clustering
is foaf:primaryTopic of	wikipedia-en:Determining_the_number_of_clusters_in_a_data_set
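
The abstract's remark that increasing k without penalty always reduces the clustering error, reaching zero when k equals n, can be illustrated with a small sketch. The code below is not taken from the source: it assumes one-dimensional data, uses the within-cluster sum of squares as the error measure, and finds the best possible error for each k by dynamic programming over the sorted values (in one dimension the optimal clusters are contiguous runs). The function names and the sample points are hypothetical.

```python
from itertools import accumulate


def segment_sse(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for sorted points i..j-1."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return max(sq - s * s / n, 0.0)  # clamp tiny negative rounding error


def optimal_wcss(points, k_max):
    """Return [W(1), ..., W(k_max)], where W(k) is the minimum
    within-cluster sum of squares for k non-empty clusters of 1-D points."""
    xs = sorted(points)
    n = len(xs)
    prefix = [0.0] + list(accumulate(xs))
    prefix_sq = [0.0] + list(accumulate(x * x for x in xs))
    inf = float("inf")
    # best[j] = minimum error for the first j sorted points using the
    # number of clusters handled so far (zero clusters before the loop).
    best = [0.0] + [inf] * n
    results = []
    for k in range(1, k_max + 1):
        new = [inf] * (n + 1)
        for j in range(k, n + 1):
            # The last cluster covers sorted points i..j-1.
            new[j] = min(
                best[i] + segment_sse(prefix, prefix_sq, i, j)
                for i in range(k - 1, j)
            )
        best = new
        results.append(best[n])
    return results


# Hypothetical sample: three loose groups around 1, 5 and 9.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.1, 9.0, 8.8]
curve = optimal_wcss(points, len(points))
for k, w in enumerate(curve, start=1):
    print(f"k={k}  error={w:.4f}")
```

On this sample the error falls steeply up to k = 3, changes little afterwards, and is zero (up to floating-point rounding) at k = n. Because the error never increases with k, it cannot by itself pick a value of k; some penalty or other criterion is needed, which is the balance the abstract describes.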


