Research of multidimensional data visualization using feed-forward neural networks ; Tiesioginio sklidimo neuroninių tinklų taikymo daugiamačiams duomenims vizualizuoti tyrimai
VILNIUS GEDIMINAS TECHNICAL UNIVERSITY
INSTITUTE OF MATHEMATICS AND INFORMATICS

Viktor MEDVEDEV

RESEARCH OF MULTIDIMENSIONAL DATA VISUALIZATION USING FEED-FORWARD NEURAL NETWORKS

Summary of Doctoral Dissertation
Technological Sciences, Informatics Engineering (07T)

Vilnius, 2007

Doctoral dissertation was prepared at the Institute of Mathematics and Informatics in 2003–2007.

Scientific Supervisor
Prof Dr Habil Gintautas DZEMYDA (Institute of Mathematics and Informatics, Technological Sciences, Informatics Engineering – 07T).

The dissertation is being defended at the Council of Scientific Field of Informatics Engineering at Vilnius Gediminas Technical University:

Chairman:
Prof Dr Habil Romualdas BAUŠYS (Vilnius Gediminas Technical University, Technological Sciences, Informatics Engineering – 07T).

Members:
Prof Dr Habil Feliksas IVANAUSKAS (Vilnius University, Physical Sciences, Informatics – 09P),
Assoc Prof Dr Regina KULVIETIENĖ (Vilnius Gediminas Technical University, Technological Sciences, Informatics Engineering – 07T),
Prof Dr Habil Rimvydas SIMUTIS (Kaunas University of Technology, Technological Sciences, Informatics Engineering – 07T),
Prof Dr Habil Antanas ŽILINSKAS (Institute of Mathematics and Informatics, Technological Sciences, Informatics Engineering – 07T).

Opponents:
Prof Dr Habil Rimantas ŠEINAUSKAS (Kaunas University of Technology, Technological Sciences, Informatics Engineering – 07T),
Assoc Prof Dr Antanas Leonas LIPEIKA (Institute of Mathematics and Informatics, Technological Sciences, Informatics Engineering – 07T).

The dissertation will be defended at the public meeting of the Council of Scientific Field of Informatics Engineering in the Conference and Seminars Center of the Institute of Mathematics and Informatics at 11 a.m. on January 17, 2008.

Address: Goštauto str. 12, LT-01108 Vilnius, Lithuania.
Tel.: +370 5 274 4952, +370 5 274 4956; fax +370 5 270 0112; e-mail: doktor@adm.vgtu.lt

The summary of the doctoral dissertation was distributed on 17 December 2007. A copy of the doctoral dissertation is available for review at the Library of Vilnius Gediminas Technical University (Saulėtekio al. 14, LT-10223 Vilnius, Lithuania) and at the Library of the Institute of Mathematics and Informatics (Akademijos g. 4, LT-08663 Vilnius, Lithuania).

© Viktor Medvedev, 2007
The 1439th scientific book of VGTU Press „Technika“.
General Characteristic of the Dissertation

Topicality of the problem. The research area of this work is the analysis of multidimensional data and the ways of improving apprehension of the data. Data apprehension is rather a complicated problem, especially if the data refer to a complex object or phenomenon described by many parameters. A tendency has been recently observed that scientists who pursue investigations of MDS (multidimensional scaling) frequently dissociate from other methods of research or even ignore them. On the other hand, in other investigations of visualization methods there are no comparisons or connections with MDS-type methods. In this work, we try to extend the realizations of MDS-type methods by applying artificial neural networks, thus strengthening the relationship among different trends of visual data analysis. The work deals with artificial neural network algorithms for visualizing multidimensional data. A specific learning rule (SAMANN) has been proposed that allows a feed-forward neural network to realise Sammon's projection. Minimization of the projection error of multidimensional data by using artificial neural networks is the main problem addressed in this dissertation.

Aim and tasks of the work. The key aim of the work is to develop and improve methods how to efficiently minimize visualization errors of multidimensional data by using artificial neural networks. It was necessary to solve these tasks: 1) to analyse the methods of multidimensional data visualization; 2) to investigate the abilities of artificial neural networks to visualize multidimensional data; 3) to create parallel realizations of the SAMANN algorithm; 4) to improve and speed up the training and retraining process of the SAMANN algorithm; 5) to search for the optimal values of the algorithm learning rate; 6) to investigate the abilities of the artificial neural networks in projecting new data.

Research object. The research object of the dissertation is artificial neural networks for multidimensional data projection. General topics related with this object are: 1) multidimensional data visualization; 2) dimensionality reduction algorithms; 3) errors of projecting data; 4) projection of the new data; 5) strategies for retraining the neural network that visualizes multidimensional data; 6) optimization of control parameters of the neural network for multidimensional data projection; 7) parallel computing.

Scientific novelty. A parallel realization of the SAMANN algorithm for multidimensional data projection has been created. The strategies for retraining the neural network have been proposed. It has been established experimentally how to select the learning parameter value of the SAMANN neural network so that the algorithm would work efficiently.

Methodology of research. The research is based on the development of new strategies for SAMANN neural network training and their experimental investigations.

Practical value. The results of the research are applied in solving some problems in practice. Human physiological data that describe the human functional state have been investigated. The results, obtained by the method, can be of use to medics for a preliminary diagnosis: healthy, unclear, or sick persons. The base of research of psychological data is the project "Information technologies for human health – clinical decision support (e-Health). IT Health (No. C-03013)", supported by the Lithuanian State Science and Studies Foundation.

Approbation and publications of the research. The main results of this dissertation were published in 11 scientific papers: 2 articles in periodical scientific publications from the ISI Web of Science list; 2 articles in periodical scientific publications from the ISI Proceedings list; 1 article in the book published by Springer; 1 chapter of the book published by IOS Press; 3 articles in periodical scientific publications from the list approved by the Science Council of Lithuania; 2 articles in the proceedings of scientific conferences. The main results of the work have been presented and discussed at 4 international and 5 national conferences.

The scope of the scientific work. The work is written in Lithuanian. It consists of 9 chapters and the list of references. There are 144 pages of the text, 86 figures, 1 table and 159 bibliographical sources.

1. Introduction

The relevance of the problem, the scientific novelty of the results and their practical significance are described, and the objectives and tasks of the work are formulated in this chapter.

2. Analysis of the Methods of Multidimensional Data Visualization

The chapter is devoted to the review and analysis of the various methods of visualization. Multidimensional data, meaning the data that require more than two or three dimensions to represent, can be difficult to interpret. Direct visualization methods and projection (dimensionality reduction) methods are
investigated. Several approaches have been developed for visualizing high-
dimensional data. Many of the method s, such as parallel coordinates, star
glyphs, Chernoff faces, try to show all dim ensions of the data at the same time.
This approach is only suitable for relatively few dimensions. When the
dimensionality of the data increases, some other means have to be used. One of
the main strategies used to handle very high dimensional data is dimensionality
reduction where the task is to reduce t he dimensionality of the data to two or
three for visualization. A large number of different projection methods have
been developed for this task. There are linea r and nonlinear projection methods.
The principal component analysis, proj ection pursuit are linear projection
methods; multidimensional scaling, princi pal curves, triangulation, isomap are
nonlinear ones. A more precise data str ucture is preserved using nonlinear
projection methods. Nevertheless, the pr ojection errors are inevitable. It is
necessary to look for the ways of minimizing these projection errors.

One of the multidimensional scaling methods to map a high-dimensional space onto a space of lower dimensionality is Sammon's mapping. Suppose that we have m points X_i = (x_i1, x_i2, ..., x_in), i = 1, ..., m, in the n-dimensional space R^n and, respectively, we define m points Y_i = (y_i1, y_i2, ..., y_id), i = 1, ..., m, in a lower-dimensional space (d < n). The pending problem is to visualize these n-dimensional vectors onto the plane. Let d*_ij denote the distance between X_i and X_j in the input space, and d_ij denote the distance between the corresponding points Y_i and Y_j in the projected space. The Euclidean distance is frequently used. The projection error (so-called Sammon's stress) is as follows:

$$E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{(d^{*}_{ij} - d_{ij})^2}{d^{*}_{ij}}, \qquad i, j = 1, \ldots, m.$$

It is a measure of how well the distances are preserved when the patterns are projected to a lower-dimensional space.
Models of an artificial neuron, network architectures, some learning rules,
self-organizing maps, radial basis functi on networks are described. A
spired by biology. The idea is to build
7
ddd−K =*=∑XjY1i diX=points, ∑data -=dimjnEiK**jYn2=Rm=<X=XKK
e works of McCullogh and Pitts, followed by in this field began in the 1940s, with th
functioning of the brain neurons. Research systems that reproduce the structure and
Neural Networks (ANN) are algorithms in
r t i f i c i a l
Concepts of Artificial Neural Networks
ected to a lower-dimensional space. preserved when the patterns are proj
ollows: f as is (so-called measure error p rojection he T used.
frequently is distance uclidean E T he space. projected the n i points
corresponding t he between istance d t he denote a nd s pace, nput i he t in
nd a between distance the denote plane the onto 1 ..., , ,
vectors -dimensional these isualize v to is roblem p pending The ). ( space
, 1, points, m define e w respectively,
and, -space , 1, ave h we
hat t uppose S mapping. Sammon s i onality imensid l ower of space a onto space
igh-dimensional h a ap m to methods caling s ultidimensional m of One
necessary to look for the ways of minimizing these projection errors. 4.
4. Artificial Neural Networks for Multidimensional Data Visualization

This chapter deals with the capabilities of ANN to visualize the multidimensional data. Application of artificial neural networks to Sammon's projection is analysed: a feed-forward neural network is trained by a specific back-propagation learning algorithm. The self-organizing neural networks application areas, curvilinear component analysis, autoencoders, and NeuroScale methods are discussed in this chapter.

In this work, an unsupervised backpropagation algorithm for training a multilayer feed-forward neural network (SAMANN) to perform Sammon's nonlinear projection is investigated. This algorithm preserves all the interpattern distances as well as possible. Sammon's mapping has a drawback: it lacks generalization, which means that new points cannot be added to the map obtained without recalculating it. The SAMANN network offers the generalization ability of projecting new data, which are not present in the original Sammon projection algorithm. It is a feed-forward neural network where the number of input units is set to be the feature space dimension n, and the number of output units is specified as the extracted feature space dimension d (Fig 1). Mao and Jain (J. Mao, A. K. Jain, Artificial neural networks for feature extraction and multivariate data projection, IEEE Transactions on Neural Networks, vol. 6, no. 2, 1995, p. 296–317) have derived a weight updating rule for the multilayer perceptron neural network that minimizes Sammon's stress using the gradient descent method.

Fig 1. Feed-forward network for Sammon's projection (SAMANN)

The SAMANN unsupervised backpropagation algorithm is as follows:
1. Initialize the weights randomly in the SAMANN network.
2. Select a pair of patterns randomly, present them to the network one at a time, and evaluate the network in a feed-forward fashion.
3. Update the weights in the backpropagation fashion starting from the output layer.
4. Repeat steps 2–3 a number of times.
5. Present all the patterns and evaluate the outputs of the network; compute Sammon's stress; if the value of Sammon's stress is below a predefined threshold or the number of iterations (from steps 2–5) exceeds the predefined maximum number, then stop; otherwise, go to step 2.
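The five steps above can be sketched in pure Python. As a simplification, this sketch replaces the multilayer perceptron with a single-layer linear network Y = XW, so the update in step 3 is plain gradient descent on the selected pair's stress term rather than Mao and Jain's exact SAMANN rule; all names and parameter values are illustrative:

```python
import math
import random
from itertools import combinations

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_samann_sketch(X, d=2, eta=0.5, max_iters=3000, threshold=1e-3, seed=0):
    """Steps 1-5 of the SAMANN loop with a single-layer linear network
    standing in for the multilayer perceptron; returns the trained
    weights and the final Sammon's stress."""
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    pairs = list(combinations(range(m), 2))
    d_star = {p: dist(X[p[0]], X[p[1]]) for p in pairs}
    c = sum(d_star.values())
    # Step 1: initialize the weights randomly.
    W = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(n)]

    def forward(x):  # linear "network": one output per projected dimension
        return [sum(x[k] * W[k][l] for k in range(n)) for l in range(d)]

    def stress():
        Y = [forward(x) for x in X]
        return sum((d_star[p] - dist(Y[p[0]], Y[p[1]])) ** 2 / d_star[p]
                   for p in pairs) / c

    for _ in range(max_iters):
        # Step 2: select a random pair and evaluate feed-forward (for a
        # linear map, Y_i - Y_j equals the projection of X_i - X_j).
        i, j = rng.choice(pairs)
        delta = [X[i][k] - X[j][k] for k in range(n)]
        z = forward(delta)
        dij = dist(z, [0.0] * d) or 1e-12
        ds = d_star[(i, j)]
        # Step 3: gradient descent on this pair's stress term
        # (ds - dij)**2 / (ds * c) with respect to every weight.
        g = 2.0 * (ds - dij) / (ds * c * dij)
        for k in range(n):
            for l in range(d):
                W[k][l] += eta * g * delta[k] * z[l]
        # Steps 4-5: here the stopping test is simply run every iteration.
        if stress() < threshold:
            break
    return W, stress()
```

For data that already lie in a d-dimensional subspace, this sketch drives the stress close to zero, illustrating the structure of the loop rather than the full network's behaviour.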
5. Learning Problems of the SAMANN Neural Network

The rate at which artificial neural networks learn depends upon several controllable factors. When projecting data, it is of great importance to achieve good results in a short time interval. In the consideration of the SAMANN network, it has been observed that the projection error depends on different parameters. Investigations have revealed that, in order to achieve good results, one needs to correctly select the learning rate η. It has been stated so far that projection yields the best results if the value of η is taken from the interval (0;1). In that case, the network learns very slowly. One of the possible reasons is that, in the case of the SAMANN network, the interval (0;1) is not the best one. Thus, it is reasonable to look for the optimal value of the learning parameter that may not necessarily be within the interval. The experiments, done in this chapter, show in what way the SAMANN network training depends on the learning rate.

The experiments have been done with real and artificial datasets. At first, the dependence of the data projection accuracy on the learning rate has been defined for η ∈ (0;1). The results obtained are illustrated in Fig 2. This figure demonstrates that, with an increase in the learning rate value, a better projection error is obtained. That is why the experiments have been done with higher values of the learning rate beyond the limits of the interval (0;1). The results are presented in Fig 3. It has been noticed that the best results are at η > 1.

Fig 2. Dependence of the data projection accuracy on the learning rate η ∈ (0;1) (a – Salinity dataset, b – Iris dataset)

Fig 3. Dependence of the data projection accuracy on the learning rate η ≥ 1 (a – Salinity dataset, b – Iris dataset)

We can conclude from Figures 2 and 3 that the optimal value of the learning rate for the datasets considered is within the interval [10;30]. In the case of the Salinity dataset, the optimal value of the learning rate is η = 10; for the Iris dataset, it is η = 30. At these values of the learning rate we obtain the best projection results, i.e., the data are projected more rapidly and more exactly. For the fixed number of iterations, good projection results are obtained in a shorter time interval than that taking the η values from the interval (0;1).

Fig 4. The dependence of the projection error on the computation time for the Salinity dataset (a) and the Iris dataset (b)
