Tutorials for DtmVic
(The DtmVic software should be downloaded beforehand, since the tutorials
make use of its example data sets.)
Tutorial A : An introduction to DtmVic
- A1- Principal Components Analysis.
Active and supplementary variables. Supplementary categories. Bootstrap validation.
PCA is followed by a clustering of observations, and a description of the obtained clusters.
- A2- Correspondence Analysis.
Correspondence Analysis of a small contingency table. Bootstrap validation.
- A3- Multiple Correspondence Analysis.
Active and supplementary categories. Bootstrap validation.
MCA is followed by a clustering of observations, and a description of the obtained clusters.
- A4- Correspondence Analysis of a lexical table.
Processing of a simple series of texts (the first 20 Shakespearean sonnets). Numerical
coding. Correspondence Analysis of the lexical table words x poems. Bootstrap validation.
Characteristic words and verses. Kohonen maps. Seriation.
- A5- Open questions in a sample survey.
Using both numerical and textual data. Processing of the responses to an open-ended question using a specific
categorical variable. Numerical coding of the responses. Correspondence Analysis of the lexical table words x
categories. Bootstrap validation. Description of the categories through their characteristic words and responses.
Simultaneous Kohonen map for words and categories.
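As a point of reference for examples A2, A4 and A5, the Correspondence Analysis of a small contingency table can be sketched in a few lines of NumPy. This is only an illustrative sketch of the textbook computation (SVD of the standardized residuals), not DtmVic's own implementation; the function name `correspondence_analysis` and the 3 x 3 table are hypothetical and are not part of the DtmVic example data sets.

```python
import numpy as np

def correspondence_analysis(table):
    """Return (principal inertias, row coordinates, column coordinates)."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                 # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    # Standardized residuals with respect to the independence model r c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    inertias = sv ** 2              # principal inertias (eigenvalues)
    row_coords = (U * sv) / np.sqrt(r)[:, None]      # principal coordinates of rows
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]   # principal coordinates of columns
    return inertias, row_coords, col_coords

# Hypothetical contingency table, used only for illustration
table = np.array([[20, 5, 5],
                  [5, 20, 5],
                  [5, 5, 20]])
inertias, row_coords, col_coords = correspondence_analysis(table)
total_inertia = inertias.sum()      # equals the chi-square statistic divided by the grand total
```

The total inertia is the quantity that the principal axes decompose; the bootstrap validation mentioned in the examples would repeat such a computation on resampled tables, which is not shown here.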
Tutorial B : DtmVic and textual data
Unlike Tutorial A, the following examples use existing command files (or: parameter files). Each
example corresponds to a directory included in the directory "DtmVic-Examples_B_Texts" that
has been downloaded with DtmVic.
- B1- Open questions in a sample survey: First exploration
First processing of the responses to an open-ended question. Numerical coding of the responses. Correspondence
Analysis (CA) of the sparse lexical table words x respondents, clustering of the responses, and a description of the
obtained clusters through their characteristic words and responses. Kohonen map for words and for respondents.
- B2- Open questions in a sample survey: link with closed-ended questions
This example, involving 14 steps, includes example B1 but also takes into account the information about closed
questions. Numerical coding of the responses. Examples of modification of the frequency threshold for words.
Example of concordances (syntactic context) for some words. CA of the lexical table words x respondents,
clustering of the responses, and a description of the obtained clusters through their characteristic words, responses,
and also through their characteristic categories (closed questions). Kohonen maps for words and for respondents.
- B3- Open questions and MCA in a sample survey
Multiple Correspondence Analysis and Clustering of respondents using closed questions. Processing aggregated
[and lemmatised] responses to open questions. The example puts forward another technique for grouping and
processing responses to an open question in a sample survey. In a first phase, a multiple correspondence analysis is
performed on a set of selected categorical variables (i.e., responses to closed-ended questions). The principal axes
visualisation is complemented with a clustering, followed by an automatic description of the clusters. These clusters
are then used to aggregate the responses to an open question.
- B4- Visualization of the Semantic network of French verbs
Visualisation of the semantic links existing between 829 French verbs. Each verb is described by a list of
"synonyms".
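The "numerical coding of the responses" and the "frequency threshold for words" mentioned in examples B1 and B2 can be illustrated with a minimal sketch. It builds a lexical table words x respondents from raw responses and keeps only the words that reach a minimum total frequency. The function name `lexical_table`, the parameter `min_freq`, the tokenization rule and the three responses are all assumptions made for illustration; DtmVic's own coding step is driven by its command files and may differ.

```python
from collections import Counter
import re

def lexical_table(responses, min_freq=2):
    """Return (vocabulary, table) where table[i][j] counts word i in response j."""
    # Naive tokenization: lower-case runs of letters and apostrophes
    tokenized = [re.findall(r"[a-z']+", text.lower()) for text in responses]
    # Total frequency of each word over all responses
    totals = Counter(word for tokens in tokenized for word in tokens)
    # Frequency threshold: keep only words occurring at least min_freq times
    vocabulary = sorted(w for w, n in totals.items() if n >= min_freq)
    counts = [Counter(tokens) for tokens in tokenized]
    table = [[c[w] for c in counts] for w in vocabulary]
    return vocabulary, table

# Hypothetical responses to an open-ended question
responses = ["good health and family", "family and friends", "health, money"]
vocabulary, table = lexical_table(responses, min_freq=2)
```

Raising `min_freq` shrinks the vocabulary and makes the table less sparse, which is the trade-off explored when the threshold is modified.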
Tutorial C : DtmVic with numerical data: Semiometry, Fisher's Iris data, Graphs, Images.
- C1- Visualization in Principal Components Analysis (1)
Example C1 uses an excerpt of semiometric data. The principal axes visualisation is complemented by a clustering, with an automatic
description of the clusters. Bootstrap procedures and Kohonen maps are followed by the various visualisation tools
provided in the sub-menu "Visualization" of the phase "VIC" : visualisation of clusters (or categories) using symbols
or colours, convex hulls or density ellipses for clusters, Minimum spanning tree, drawing of various nearest
neighbours graphs.
- C2- Visualization in Principal Components Analysis (2)
Same set of numerical variables (semiometric data). The principal axes visualisation is complemented by a
clustering of variables. Bootstrap procedures and Kohonen maps are followed by the various visualisation tools
provided in the menu "Contiguity view": visualisation of clusters of variables using symbols or colours, convex hulls
or density ellipses for clusters, Minimum spanning tree drawn on variable-points, drawing of various nearest
neighbours graphs.
- C3- PCA and Contiguity Analysis on Fisher's Iris Data
Example C3 aims at analysing a classical set of numerical variables (the Iris data set of Anderson / Fisher) through
Principal Components Analysis, Classification, Contiguity Analysis, Discriminant Analysis. The principal axes
visualisation is complemented with a clustering including an automatic description of the clusters.
At the outset, example C3 is very similar to example C2: Principal Components Analysis and classification
(clustering) of a set of numerical data, with various visualisation tools, also involving a specific categorical variable. It
then presents the improvements provided by Contiguity Analysis and by a more classical particular case of
Contiguity Analysis: Linear Discriminant Analysis.
- C4- Description of graphs through Correspondence Analysis
Visualisation of a series of simple symmetrical planar graphs, mainly through correspondence
analysis. The three graphs are planar graphs: a chessboard-shaped graph, a cycle, and two empirical graphs supposed to
roughly represent maps of the regions of Japan and France.
These examples provide a bridge between distinct facets of DtmVic: the same graph can lead to different input data:
classical numerical data, textual data, and a specific "external format".
- C5- Structural Compression of Images through SVD and CA
Example C5 could be viewed as a pedagogical appendix. It does not make use of data in DtmVic format, since it
deals with digitized images. A simple rectangular array of integers suffices: there is no need for row or column
identifiers. A specialized interface is provided via the button "DtmVic Tools" of the main menu.
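The idea behind example C5 (compressing an image stored as a rectangular array of integers through SVD) can be sketched as a truncated singular value decomposition. This is a minimal NumPy illustration under stated assumptions, not the procedure run by "DtmVic Tools": the function name `compress_rank_k` and the small random array standing in for an image are hypothetical.

```python
import numpy as np

def compress_rank_k(image, k):
    """Return the best rank-k approximation (in the least-squares sense) of a 2-D array."""
    A = np.asarray(image, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Hypothetical 6 x 8 array of grey levels standing in for a digitized image
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(6, 8))

approx = compress_rank_k(image, 2)      # strongly compressed reconstruction
full = compress_rank_k(image, 6)        # all 6 components: exact reconstruction
error = np.linalg.norm(image - approx)  # Frobenius norm of the reconstruction error
```

Storing a rank-k approximation of an n x p array requires k(n + p + 1) numbers instead of n x p, which is where the compression comes from.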
Tutorial D : Data importation
- D1- Importation of numerical and textual data from an Excel® file
- D2- Importation of numerical data from a free format file
- D3- Importation of numerical data from a fixed format file
- D4- Importation of textual data from a free format file
- D5- Importation of both numerical and textual data from an XML format file