Lim and Teoh EURASIP Journal on Advances in Signal Processing 2011, 2011:107
RESEARCH Open Access
An effective biometric discretization approach to
extract highly discriminative, informative, and
privacy-protective binary representation
Meng-Hui Lim and Andrew Beng Jin Teoh*
Abstract
Biometric discretization derives a binary string for each user based on an ordered set of biometric features. This
representative string ought to be discriminative, informative, and privacy protective when it is employed as a
cryptographic key in various security applications upon error correction. However, it is commonly believed that
satisfying the first and the second criteria simultaneously is not feasible, and a tradeoff between them is always
definite. In this article, we propose an effective fixed bit allocation-based discretization approach which involves
discriminative feature extraction, discriminative feature selection, unsupervised quantization (quantization that does
not utilize class information), and linearly separable subcode (LSSC)-based encoding to fulfill all the ideal properties
of a binary representation extracted for cryptographic applications. In addition, we examine a number of
discriminative feature-selection measures for discretization and identify the proper way of setting an important
feature-selection parameter. Encouraging experimental results vindicate the feasibility of our approach.
Keywords: biometric discretization, quantization, feature selection, linearly separable subcode encoding
* Correspondence: bjteoh@yonsei.ac.kr
School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, South Korea

© 2011 Lim and Teoh; licensee Springer. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction
Binary representation of biometrics has been receiving an increasing amount of attention and demand in the last decade, ever since biometric security schemes were widely proposed. Security applications such as biometric-based cryptographic key generation schemes [1-7] and biometric template protection schemes [8-13] require biometric features to be present in binary form before they can be implemented in practice. However, since security is a concern, these applications require the binary biometric representation to be:

• Discriminative: The binary representation of each user ought to be highly representative and distinctive, so that it can be derived as reliably as possible upon every query request of a genuine user, and will neither be misrecognized as another user's nor be extractable by any non-genuine user.

• Informative: The information, or uncertainty, contained in the binary representation of each user should be made adequately high. In fact, the use of a huge number of equal-probable binary outputs creates a huge key space, which could render an attacker clueless in guessing the correct output during a brute-force attack. This is extremely important for security provision, as a malicious impersonation could take place in a straightforward manner if the correct key can be obtained by the adversary with an overwhelming probability. Entropy is a common measure of uncertainty, and it is usually a biometric system specification. Denoting the entropy of a binary representation as L, it can be related to the N possible outputs with probabilities p_i, i ∈ {1,...,N}, by L = −Σ_{i=1}^{N} p_i log2 p_i. If the outputs are equal-probable, then the resultant entropy is maximal, that is, L = log2 N. Note that the current encryption standard based on the advanced encryption standard (AES) is specified to be of 256-bit entropy, signifying that at least 2^256 possible outputs are required to withstand a brute-force attack at the current state of the art. With consistent technological advancement and the growing capability of computers, adversaries will become more and more powerful. Hence, it is of utmost importance to derive highly informative binary strings to cope with rising encryption standards in the future.
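The entropy relation above is easy to check numerically; the following minimal sketch (the function name `entropy_bits` is ours, not from the article) computes L for a given output distribution:

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy L = -sum_i p_i * log2(p_i), in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# N equal-probable outputs attain the maximal entropy L = log2(N):
N = 256
print(entropy_bits([1.0 / N] * N))           # 8.0, i.e., log2(256)

# Any non-uniform distribution over the same N outputs yields less:
skewed = [0.5] + [0.5 / (N - 1)] * (N - 1)
print(entropy_bits(skewed) < math.log2(N))   # True
```

Scaling this check to N = 2^256 equal-probable outputs recovers the 256-bit entropy figure quoted above.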
• Privacy-protective: To avoid devastating consequences upon compromise of the irreplaceable biometric features of every user, the auxiliary information used for bit-string regeneration must not be correlated with the raw or projected features. In the case of system compromise, such non-correlation of the auxiliary information should be guaranteed to impede any adversarial reverse-engineering attempt at obtaining the raw features. Otherwise, it is no different from storing the biometric features in the clear in the system database.

To date, only a handful of biometric modalities, such as iris [14] and palm print [15], have their features represented in binary form upon an initial feature-extraction process. Instead, many remain represented in the continuous domain after feature extraction. Therefore, an additional process in a biometric system is needed to transform these inherently continuous features into a binary string (per user), known as the biometric discretization process. Figure 1 depicts the general block diagram of a biometric discretization-based binary string generator that employs a biometric discretization scheme.

Figure 1 A biometric discretization-based binary string generator.

In general, most biometric discretization can be decomposed into two essential components, which can alternatively be described as a two-stage mapping process:

• Quantization: The first component can be seen as a continuous-to-discrete mapping process. Given a set of feature elements per user, every one-dimensional feature space is initially constructed and segmented into a number of non-overlapping intervals, each of which is associated with a decimal index.

• Encoding: The second component can be regarded as a discrete-to-binary mapping process, where the resultant index of each dimension is mapped to a unique n-bit binary codeword of an encoding scheme. Next, the codeword outputs of all feature dimensions are concatenated to form the final bit string of a user. The discretization performance is finally evaluated in the Hamming domain.

These two components are governed by a static or a dynamic bit-allocation algorithm, which determines whether the quantity of binary bits allocated to every dimension is fixed or varied, respectively. Besides, if the (genuine and/or imposter) class information is used in determining the cut points (interval boundaries) of the non-overlapping quantization intervals, the discretization is known as supervised discretization [1,3,16]; otherwise, it is referred to as unsupervised discretization [7,17-19].

On the other hand, information about the constructed intervals of each dimension is stored as helper data during enrolment so as to assist in reproducing the same binary string of each genuine user during the verification phase. However, similar to the security and privacy requirements of the binary representation, it is important that such helper data, upon compromise, should neither leak any helpful information about the output binary string (security concern) nor about the biometric feature itself (privacy concern).

1.1 Previous works
Over the last decade, numerous biometric discretization techniques have been reported.
These schemes produce a binary string from a given set of features of each user, basing upon either a fixed bit-allocation principle (assigning a fixed number of bits to each feature dimension) [4-7,10,13,16,20] or a dynamic bit-allocation principle (assigning a different number of bits to each feature dimension) [1,3,17-19,21].

Monrose et al. [4,5], Teoh et al. [6], and Verbitsky et al. [13] partition each feature space into two intervals (labeled by '0' and '1') based on a prefix threshold. Tuyls et al. [12] and Kevenaar et al. [9] have used a similar 1-bit discretization technique, but instead of fixing the threshold, the mean of the background probability density function (for modeling inter-class variation) is selected as the threshold in each dimension. Further, reliable components are identified based on either the training bit statistics [12] or a reliability (RL) function [9], so that unreliable dimensions can be eliminated from bit extraction.

Kelkboom et al. have analytically expressed the genuine and imposter bit-error probability [22] and subsequently modeled a discretization framework [23] to analytically estimate the genuine and imposter Hamming distance probability mass functions (pmf) of a biometric system. This model is based upon a static 1-bit equal-probable discretization under the assumption that both intra-class and inter-class variations are Gaussian distributed.

Han et al. [20] proposed a discretization technique to extract a 9-bit pin from each user's fingerprint impressions. The discretization derives the first 6 bits from six pre-identified reliable/stable minutiae: if a minutia belongs to a bifurcation, a bit "0" is assigned; otherwise, if it is a ridge ending, a bit "1" is assigned. The derivation of the last 3 bits is constituted by a single-bit discretization on each of three triangular features. If such a biometric password/pin is used directly as a cryptographic key in security applications, it will be too short to survive brute-force attacks, as an adversary would only require at most 2^9 = 512 attempts to crack the biometric password.

Hao and Chan [3] and Chang et al. [1] employed a multi-bit supervised user-specific biometric discretization scheme, each with a different interval-handling technique. Both schemes initially fix the position of the genuine interval of each dimension around the modeled pdf of the jth user, [μ_j − kσ_j, μ_j + kσ_j], and then construct the remaining intervals based on a constant width of 2kσ_j within every feature space. Here, μ_j and σ_j denote the mean and standard deviation (SD) of the user pdf, respectively, and k is a free parameter. As for the boundary portions at both ends of each feature space, Hao and Chan unfold every feature space arbitrarily to include all the remaining possible feature values in forming the leftmost and rightmost boundary intervals. Then, all the constructed intervals are labeled with direct binary representation (DBR) encoding elements (i.e., 3 → 011, 4 → 100, 5 → 101). On the other hand, Chang et al. extend each feature space to account for the extra equal-width intervals, forming 2^n intervals in accordance with the entire set of 2^n codeword labels of an n-bit DBR encoding scheme.

Although both these schemes are able to generate binary strings of arbitrary length, they turn out to be greatly inefficient, since the ad-hoc interval-handling strategies may result in considerable leakage of entropy, which jeopardizes the security of the users. In particular, the non-feasible labels of all extra intervals (including the boundary intervals) would allow an adversary to eliminate the corresponding codeword labels from her or his output-guessing range after observing the helper data, or after reliably identifying the "fake" intervals. Apart from this security issue, another critical problem with these two schemes is the potential exposure of the exact location of each genuine user pdf. Based on the knowledge that the user pdf is located at the center of the genuine interval, the constructed intervals serve the adversary as a clue to where the user pdf could be located. As a result, the possible locations of the user pdf could be reduced to the number of quantization intervals in that dimension, thus potentially facilitating malicious privacy-violation attempts.

Chen et al. [16] demonstrated a likelihood-ratio-based multi-bit biometric discretization scheme which is likewise supervised and user-specific. The quantization scheme first constructs the genuine interval to accommodate the likelihood ratio (LR) detected in that dimension and creates the remaining intervals in an equal-probable (EP) manner, so that the background probability mass is equally distributed within every interval. The leftmost and rightmost boundary intervals with insufficient background probability mass are wrapped into a single interval that is tagged with a common codeword label from the binary reflected gray code (BRGC) encoding scheme [24] (i.e., 3 → 010, 4 → 110, 5 → 111). This discretization scheme suffers from the same privacy problem as the previous supervised schemes, owing to the genuine interval being constructed based on user-specific information.

Yip et al. [7] presented an unsupervised, non-user-specific, multi-bit discretization scheme based on equal-width interval quantization and BRGC encoding. This scheme adopts the entire BRGC code for labeling, and therefore it is free from the entropy loss problem. Furthermore, since it does not make use of the user pdf to determine the cut points of the quantization intervals, this scheme does not suffer from the aforementioned privacy problem.

Teoh et al. [18,19] developed a bit-allocation approach based on an unsupervised equal-width quantization with a BRGC-encoding scheme to compose a long binary string per user by assigning a different number of bits to each feature dimension according to the SD of each estimated user pdf. Particularly, the intention is to assign a larger quantity of binary bits to discriminative dimensions and a smaller quantity otherwise. In other words, the larger the SD of a user pdf is detected to be, the fewer bits will be assigned to that dimension, and vice versa. Nevertheless, the length of the binary string is not decided based on the actual position of the pdf itself in the feature space. Although this scheme is invulnerable to the privacy weakness, such a deciding strategy gives a less accurate bit allocation: a user pdf falling across an interval boundary may result in an undesired intra-class variation in the Hamming domain and thus should not be prioritized for bit extraction. Another concern is that pure SD might not be a promising discriminative measure.

Chen et al. [17] introduced another dynamic bit-allocation approach by considering the detection rate (DR) (the user probability mass captured by the genuine interval) as their bit-allocation measure. The scheme, known as DR-optimized bit allocation (DROBA), employs an equal-probable quantization-interval construction with BRGC encoding. Similar to Teoh et al.'s dynamic bit-allocation scheme, this scheme assigns more bits to more discriminative feature dimensions and vice versa. Recently, Chen et al. [21] developed a similar dynamic bit-allocation algorithm based on optimizing a different bit-allocation measure: the area under the FRR curve. Given the bit-error probability, the scheme allocates bits dynamically to every feature component in a similar way as DROBA, except that the analytic area under the FRR curve for Hamming distance evaluation is minimized instead of the DR being maximized.

1.2 Motivation and contributions
It has recently been justified that DBR- and BRGC-encoding-based discretization cannot guarantee a discriminative performance when a large per-dimensional entropy requirement is imposed [25]. The reason lies in the underlying indefinite feature mapping of DBR and BRGC codes from a discrete space to a Hamming space, which prevents the actual distance dissimilarity from being maintained in the Hamming domain. As a result, feature points from multiple different intervals may be mapped to DBR or BRGC codewords which share a common Hamming distance away from a reference codeword, as illustrated by the 3-bit discretization instance in Figure 2. For this reason, regardless of how discriminative the extracted (real-valued) features could be, deriving discriminative and informative binary strings with DBR or BRGC encoding will not be practically feasible.

Figure 2 An indefinite discrete-to-binary mapping from each discrete-labelled quantization interval to a 3-bit BRGC codeword. The label g(b) in each interval on the continuous feature space can be understood as "index number (associated codeword)".

Linearly separable subcode (LSSC) [25] has been put forward to resolve such a performance-entropy tradeoff by introducing bit redundancy to maintain the performance accuracy when a high entropy requirement is imposed. Although the resultant LSSC-extracted binary strings require a larger bit length in addressing an 8-interval discretization problem, as exemplified in Figure 3, mapping discrete elements to the Hamming space becomes completely definite.

This article focuses on discretization based upon the fixed bit-allocation principle. We extend the study of [25] to tackle the open problem of generating desirable binary strings that are simultaneously highly discriminative, informative, and privacy-protective by means of discretization based on LSSC. Specifically, we adopt a discriminative feature extraction with a further feature selection to extract discriminative feature components; an unsupervised quantization approach to offer promising privacy protection; and an LSSC encoding to achieve large entropy without having to sacrifice the actual classification performance accuracy of the discriminative feature components. Note that the preliminary idea of this article has appeared in the context of global discretization [26] for achieving strong security and privacy protection with high training efficiency.

In general, the significance of our contribution is three-fold:
a) We propose a fixed bit-allocation-based discretization approach to extract a binary representation which is able to fulfill all the required criteria from each given set of user-specific features.
b) As required by our approach, we study empirically various discriminative measures that have been put forward for feature selection and identify the reliable ones among them.
c) We identify and analyze factors that influence improvements resulting from the discriminative selection based on the respective measures.

The structure of this article is organized as follows. In the next section, the efficiency of using LSSC over BRGC and DBR for encoding is highlighted. In Section 3, detailed descriptions of our approach to generating a desirable binary representation are given and elaborated. In Section 4, experimental results justifying the effectiveness of our approach are presented. Finally, concluding remarks are provided in Section 5.

2. The emergence of LSSC
2.1 The security-performance tradeoff of DBR and BRGC
Two common encoding schemes adopted for discretization before LSSC was introduced are DBR and BRGC. DBR has each of its decimal indices directly converted into its binary equivalent, while BRGC is a special code that restricts the Hamming distance between every consecutive pair of codewords to unity. Depending on the required size S of a code, the lengths of both DBR and BRGC are commonly selected to be n_DBR = n_BRGC = ⌈log2 S⌉. Instances of DBR and BRGC with different lengths (n_DBR and n_BRGC, respectively) and sizes S are shown in Table 1. Here, the length of a code refers to the number of bits in which the codewords are represented, while the size of a code refers to the number of elements in the code. The codewords are indexed from 0 to S − 1. Note that each codeword index corresponds to the quantization interval index as well.

Table 1 A collection of n_DBR-bit DBRs and n_BRGC-bit BRGCs for S = 8 and 16, with [τ] indicating the codeword index.

DBR, n_DBR = 3 (S = 8): [0] 000, [1] 001, [2] 010, [3] 011, [4] 100, [5] 101, [6] 110, [7] 111
DBR, n_DBR = 4 (S = 16): [0] 0000, [1] 0001, [2] 0010, [3] 0011, [4] 0100, [5] 0101, [6] 0110, [7] 0111, [8] 1000, [9] 1001, [10] 1010, [11] 1011, [12] 1100, [13] 1101, [14] 1110, [15] 1111
BRGC, n_BRGC = 3 (S = 8): [0] 000, [1] 001, [2] 011, [3] 010, [4] 110, [5] 111, [6] 101, [7] 100
BRGC, n_BRGC = 4 (S = 16): [0] 0000, [1] 0001, [2] 0011, [3] 0010, [4] 0110, [5] 0111, [6] 0101, [7] 0100, [8] 1100, [9] 1101, [10] 1111, [11] 1110, [12] 1010, [13] 1011, [14] 1001, [15] 1000

Conventionally, a tradeoff between discretization performance and entropy is inevitable when DBR or BRGC is adopted as the encoding scheme. The rationale behind this was identified to be the indefinite discrete-to-binary mapping behavior during the discretization process, since the employment of an encoding scheme in general affects only how each index of the quantization intervals is mapped to a unique binary codeword. More precisely, one may notice that multiple DBR as well as BRGC codewords share a common Hamming distance with respect to any reference codeword in the code for n_DBR, n_BRGC ≥ 2, possibly mapping initially well-separated imposter feature elements much nearer to a genuine feature element in the Hamming space than they are in the index space. Taking 4-bit DBR-based discretization as an example, the interval labelled "1000", located 8 intervals away from the reference interval "0000", is eventually mapped to one Hamming distance away in the Hamming space. Worse for BRGC, interval "1000" is located even further (15 intervals away) from interval "0000". As a result, imposter feature components might be misclassified as genuine in the Hamming domain, and eventually the discretization performance would be greatly impeded by such an imprecise discrete-to-binary map. In fact, this defective phenomenon becomes more critical as the required entropy, or equivalently S, increases [25].

2.2 LSSC
Linearly separable subcode (LSSC) [25] was put forward to tackle the aforementioned inabilities of DBR and BRGC in fully preserving the separation of feature points in the index domain when the eventual distance evaluation is performed in the Hamming domain. This code particularly utilizes redundancy to augment the separability in the Hamming space, enabling a one-to-one correspondence between every non-reference codeword and the Hamming distance incurred with respect to every possible reference codeword.

Let n_LSSC denote the code length of an LSSC. An LSSC contains S = (n_LSSC + 1) codewords, that is, a subset of the 2^{n_LSSC} codewords in total. The construction of an LSSC can be given as follows: beginning with an arbitrary n_LSSC-bit codeword, say an all-zero codeword, the next n_LSSC codewords are sequentially derived by complementing one bit at a time from the lowest-order (rightmost) to the highest-order (leftmost) bit position. The resultant n_LSSC-bit LSSCs fulfilling S = 4, 8, and 16 are shown in Table 2.

Table 2 A collection of n_LSSC-bit LSSCs for S = 4, 8 and 16, where [τ] denotes the codeword index.

n_LSSC = 3 (S = 4): [0] 000, [1] 001, [2] 011, [3] 111
n_LSSC = 7 (S = 8): [0] 0000000, [1] 0000001, [2] 0000011, [3] 0000111, [4] 0001111, [5] 0011111, [6] 0111111, [7] 1111111
n_LSSC = 15 (S = 16): [0] 000000000000000, [1] 000000000000001, [2] 000000000000011, [3] 000000000000111, [4] 000000000001111, [5] 000000000011111, [6] 000000000111111, [7] 000000001111111, [8] 000000011111111, [9] 000000111111111, [10] 000001111111111, [11] 000011111111111, [12] 000111111111111, [13] 001111111111111, [14] 011111111111111, [15] 111111111111111

Figure 3 A definite discrete-to-binary mapping from each discrete-labelled quantization interval to a 7-bit LSSC codeword. The label g(b) in each interval on the continuous feature space can be understood as "index number (associated codeword)".

The amount of bit disagreement, or equivalently the Hamming distance, between any pair of codewords happens to be the same as the corresponding positive index difference. For a 3-bit LSSC, as an example, the Hamming distance between codewords "111" and "001" is 2, which is equal to the difference between the codeword indices "3" and "1". It is in general not difficult to observe that neighbouring codewords tend to have a smaller Hamming distance compared with distant codewords. Thus, unlike DBR and BRGC, LSSC ensures that every distance in the index space is thoroughly preserved in the Hamming space, despite the large bit redundancy a system might need to afford. As reported in [25], increasing the entropy per dimension has a trivial effect on discretization performance through the employment of LSSC, on the condition that the quantity of quantization intervals constructed in each dimension is not too few. Instead, the entropy now becomes a function of the bit redundancy incurred.

3. Desirable bit string generation and the appropriate discriminative measures
In the literature review, we have seen that user-specific information (i.e., the user pdf) should not be utilized to define the cut points of the quantization intervals, so as to avoid reducing the possible locations of the user pdf to the quantity of intervals in each dimension. Therefore, strong privacy protection basically limits the choice of quantization to unsupervised techniques. Furthermore, the entropy-performance independence of LSSC encoding allows promising performance to be preserved regardless of how large the entropy is augmented per dimension, and correspondingly how fine the feature-space segmentation in each dimension would be. Therefore, if we are able to extract discriminative feature components for discretization, deriving discriminative, informative, and privacy-protective bit strings is absolutely possible. Our strategy can generally be outlined in the following four fundamental steps:

i. [Feature Extraction] Employ a discriminative feature extractor ℑ(·) (e.g., Fisher's linear discriminant analysis (FDA) [27], or Eigenfeature regularization and extraction (ERE) [28]) to ensure D quality features are extracted from R raw features;
ii. [Feature Selection] Select the D_fs (D_fs < D < R) most discriminative feature components from the total of D dimensions according to a discriminative measure χ(·);
iii. [Quantization] Adopt an unsupervised equal-probable quantization scheme Q(·) to achieve strong privacy protection; and
iv. [Encoding] Employ LSSC for encoding ℰ_LSSC(·) to maintain the discriminative performance, while satisfying an arbitrary entropy requirement imposed on the resultant binary string.

This approach initially obtains a set of discriminative feature components in steps (i) and (ii).
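Steps (iii) and (iv) can be sketched as follows, under simplifying assumptions: a standard-normal background pdf per dimension, S = 8 equal-probable intervals, and helper names (`lssc`, `equal_probable_cuts`) of our own choosing rather than the article's notation:

```python
from statistics import NormalDist

def lssc(n):
    """(n + 1)-codeword LSSC of length n: start from the all-zero codeword and
    complement one bit at a time from the rightmost position leftwards."""
    return ["0" * (n - i) + "1" * i for i in range(n + 1)]

def equal_probable_cuts(mu, sigma, s):
    """Unsupervised step (iii): cut points splitting a Gaussian background pdf
    N(mu, sigma^2) into s intervals of equal probability mass (no user pdf used)."""
    bg = NormalDist(mu, sigma)
    return [bg.inv_cdf(k / s) for k in range(1, s)]

def quantize(value, cuts):
    """Interval index of a feature value, given ordered cut points."""
    return sum(value >= c for c in cuts)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

s = 8
code = lssc(s - 1)                     # the 7-bit LSSC of Table 2 (S = 8)
cuts = equal_probable_cuts(0.0, 1.0, s)

# Step (iv): the Hamming distance between LSSC codewords equals the
# interval-index difference, so index-space separation is fully preserved.
i, j = quantize(-0.1, cuts), quantize(2.3, cuts)
print(code[i], code[j], hamming(code[i], code[j]) == abs(i - j))

# For contrast, 3-bit BRGC codewords at indices 1, 3, and 7 all lie at
# Hamming distance 1 from the codeword of index 0 (cf. Figure 2).
brgc3 = [format(k ^ (k >> 1), "03b") for k in range(8)]
print([hamming(brgc3[0], c) for c in brgc3])   # [0, 1, 2, 1, 2, 3, 2, 1]
```

Concatenating such per-dimension codewords over the selected D_fs dimensions then yields the user's final bit string.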
It then produces an informative user-specific binary string (with large entropy) while maintaining the prior discriminative performance in steps (iii) and (iv). The privacy protection is offered by the unsupervised quantization in step (iii), where the correlation of the helper data with the user-specific data is insignificant. This makes our four-step approach capable of producing discriminative, informative, and privacy-protective binary biometric representations.

Among the steps, the implementations of (i), (iii), and (iv) are fairly straightforward. The only uncertainty lies in the appropriate discriminative measure and the corresponding parameter D_fs in step (ii) for attaining absolute superiority. Note that step (ii) is embedded particularly to supplement the restrictive performance led by the employment of unsupervised quantization. Here, we introduce a number of discriminative measures that can be adopted for discretization and perform a study on the superiority of such measures in the next section.

3.1 Discriminative measures χ(·) for feature selection
The discriminativeness of each feature component is closely related to the well-known Fisher's linear discriminant criterion [27], where the discriminant criterion is defined to be the ratio of the between-class variance (inter-class variation) to the within-class variance (intra-class variation).

Suppose that we have J users enrolled in a biometric system, where each of them is represented by a total of D ordered feature elements v_ji^1, v_ji^2, ..., v_ji^D upon feature extraction from each measurement. In view of potential intra-class variation, the dth feature element of the jth user can be modeled from a set of measurements by a user pdf, denoted by f_j^d(v), where d ∈ {1, 2,...,D}, j ∈ {1, 2,...,J}, and v ∈ feature space V^d. On the other hand, owing to inter-class variation, the dth feature element of the measurements of the entire population can be modeled by a background pdf, denoted by f^d(v). Both distributions are assumed to be Gaussian according to the central limit theorem. That is, the dth-dimensional background pdf has mean μ^d and SD σ^d, while the jth user's dth-dimensional user pdf has mean μ_j^d and SD σ_j^d.

3.1.1. Likelihood ratio (χ = LR)
The idea of using the LR to achieve optimal FAR/FRR performance in static discretization was first exploited by Chen et al. [16]. The LR of the jth user in the dth-dimensional feature space is generally defined as

LR_j^d = f_j^d(v) / f^d(v)    (1)

with the assumption that the entire population is sufficiently large (excluding a single user should not have any significant effect in changing the background distribution). In their scheme, the cut points v_1^d, v_2^d ∈ V^d of the jth user's genuine interval int_j^d in the dth-dimensional feature space are chosen based on a prefix threshold t, such that

int_j^d = {[v_1^d, v_2^d] ∈ V^d | LR_j^d ≥ t}    (2)

The remaining intervals are then constructed equal-probably, that is, with reference to the portion of the background distribution captured by the genuine interval. Since different users will have different intervals constructed in each feature dimension, this discretization approach turns out to be user-specific.

In fact, the LR can be used to assess the discriminativity of each feature component efficiently, since max(f_j^d(v)) is inversely proportional to (σ_j^d)^2 (because ∫ f_j^d(v) dv = 1), or equivalently to the dth-dimensional intra-class variation; and f^d(v) is inversely proportional to the dth-dimensional inter-class variation. These imply

LR_j^d = max( f_j^d(v) / f^d(v) ) ∝ max( inter-class variation / intra-class variation ),  j ∈ {1,2,...,J}, d ∈ {1,2,...,D}    (3)

Therefore, adopting the D_fs dimensions with maximum LR would be equivalent to selecting the D_fs feature elements with maximum inter- over intra-class variation.

3.1.2. Signal-to-noise ratio (χ = SNR)
The signal-to-noise ratio (SNR) could possibly be another alternative discriminative measurement, since it is a measure that captures both intra-class and inter-class variations. This measure was first used in feature selection by a user-specific 1-bit RL-based discretization scheme [12] to sort the feature elements which are identified to be reliable. However, instead of using the default average intra-class variance to define the SNR, we adopt the user-specific intra-class variance to compute a user-specific SNR for each feature component to obtain an improved precision:

SNR_j^d = (σ^d)^2 / (σ_j^d)^2 = inter-class variance / intra-class variance,  j ∈ {1,2,...,J}, d ∈ {1,2,...,D}    (4)

3.1.3. Reliability (χ = RL)
Reliability was employed by Kevenaar et al. [9] to sort the discriminability of the feature components in their user-specific 1-bit discretization scheme. Thus, it can be implemented in a straightforward manner in our study. The definition of this measure is given by

RL_j^d = 1/2 ( 1 + erf( |μ_j^d − μ^d| / sqrt(2(σ_j^d)^2) ) ) ∝ inter-class variation / intra-class variation,  j ∈ {1,2,...,J}, d ∈ {1,2,...,D}    (5)
where erf denotes the error function. This RL measure would produce a higher value when a feature element has a larger difference between μ_j^d and μ^d relative to σ_j^d. As a result, a high RL measurement indicates a high discriminating power of a feature component.

3.1.4. Standard deviation (χ = SD)
In dynamic discretization, the amount of bits allocated to a feature dimension indicates how discriminative the user-specific feature component is detected to be. Usually, a more discriminative feature component is assigned a larger quantity of bits, and vice versa. The pure user-specific SD measure σ_j^d, signifying intra-class variation, was adopted by Teoh et al. as a bit-allocation measure [18,19] and hence may serve as a potential discriminative measure.

3.1.5. Detection rate (χ = DR)
Finally, unlike all the above measures, which depend solely on the statistical distribution in determining the discrimination of the feature components, DR could be another efficient discriminative measure for discretization, as it takes an additional factor into account: the position of the user pdf with reference to the constructed genuine interval (the interval that captures the largest portion of the user pdf) in each dimension. This measure, as adopted by Chen et al. in their dynamic bit-allocation scheme [17], is defined as the area under the curve of the user pdf enclosed by the genuine interval upon the respective interval construction in that dimension. It can be described mathematically by

    δ_j^d(S^d) = ∫_{int_j^d} f_j^d(v) dv    (6)

where δ_j^d denotes the jth user's DR in the dth dimension, S^d denotes the number of constructed intervals in the dth dimension, and the integral is taken over the genuine interval int_j^d.

To select D_fs discriminative feature dimensions properly, schemes employing the LR, SNR, RL, and DR measures should take the dimensions with the D_fs largest measurements,

    {d_i | i = 1, ..., D_fs} = argmax_{D_fs max values} [ χ(v_{j1}^1, v_{j2}^1, ..., v_{jI}^1), ..., χ(v_{j1}^D, v_{j2}^D, ..., v_{jI}^D) ],  d_1, ..., d_{D_fs} ∈ [1, D], D_fs < D,    (7)

while schemes employing the SD measure should adopt the dimensions with the D_fs smallest measurements:

    {d_i | i = 1, ..., D_fs} = argmin_{D_fs min values} [ χ(v_{j1}^1, v_{j2}^1, ..., v_{jI}^1), ..., χ(v_{j1}^D, v_{j2}^D, ..., v_{jI}^D) ],  d_1, ..., d_{D_fs} ∈ [1, D], D_fs < D.    (8)

We shall empirically identify discriminative measures that can be reliably employed in the next section.

3.2 Discussions and a summary of our approach
In a biometric-based cryptographic key generation application, there is usually an entropy requirement L imposed on the binary output of the discretization scheme. Based on a fixed-bit-allocation principle, L is divided equally among the D dimensions for typical equal-probable discretization schemes, and among the D_fs dimensions for our feature-selection approach. Since the entropy per dimension l is logarithmically proportional to the number of equal-probable intervals S (or l_fs and S_fs for our approach) constructed in each dimension, this can be written as

    l = L/D = log2 S  for a typical EP discretization scheme,    (9)

    l_fs = L/D_fs = lD/D_fs = log2 S_fs  for our approach.    (10)

By denoting n as the bit length of each one-dimensional binary output, the actual bit length N of the final bit string is simply N = Dn. For LSSC-encoding-based schemes, where n_LSSC = (2^l − 1) bits, and for our approach, where n_LSSC(fs) = (2^{l_fs} − 1) bits, the actual bit lengths N_LSSC and N_LSSC(fs) can respectively be described by

    N_LSSC = D n_LSSC = D(2^l − 1)    (11)

and

    N_LSSC(fs) = D_fs n_LSSC(fs) = D_fs(2^{l_fs} − 1).    (12)

With the above equations, we illustrate the algorithmic description of our approach in Figure 4. Here, g and d* are dimensional variables, and || denotes the binary concatenation operator.

4. Experiments and analysis
4.1. Experiment set-up
Two popular face datasets are selected to evaluate the experimental discretization performance in this section:

FERET
The employed dataset is a subset of the FERET face dataset [29], in which the images were collected under varying illumination conditions and face expressions. It contains a total of 1800 images, with 12 images for each of 150 users.

FRGC
The adopted dataset is a subset of the FRGC dataset (version 2) [30], containing a total of 2124 images, with 12 images for each of the 177 identities. The images were taken under a controlled illumination condition.

For both datasets, proper alignment is applied to the images based on standard face landmarks. Owing to possible strong variation in hair style, only the face region is extracted for recognition by cropping the images to the size of 30 × 36 for the FERET dataset and 61
Figure 4 Our fixed-bit-allocation-based discretization approach.
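To make the discretization flow summarized in Figure 4 concrete, below is a minimal Python sketch of a fixed-bit-allocation pipeline: an RL-based feature selection (Eq. (5)), equal-probable quantization derived from background quantiles, and LSSC encoding with per-dimension concatenation (the || operator). All function names, parameter values, and the synthetic data are our own illustration, not code or data from the paper.

```python
import math

import numpy as np


def rl_measure(user_feats, bg_mean):
    """User-specific reliability per dimension (Eq. (5)):
    RL = 1/2 * (1 + erf(|mu_j - mu| / sqrt(2 * sigma_j^2)))."""
    mu_j = user_feats.mean(axis=0)
    sd_j = user_feats.std(axis=0) + 1e-12          # guard against zero spread
    z = np.abs(mu_j - bg_mean) / (math.sqrt(2.0) * sd_j)
    return np.array([0.5 * (1.0 + math.erf(x)) for x in z])


def select_top(measure, d_fs):
    """Eq. (7)-style selection: keep the D_fs dimensions with the
    largest measurements."""
    return np.sort(np.argsort(measure)[-d_fs:])


def ep_cut_points(bg_samples, s):
    """Equal-probable quantization: S intervals per dimension, delimited
    by the S - 1 interior quantiles of the background samples."""
    return np.quantile(bg_samples, np.linspace(0.0, 1.0, s + 1)[1:-1])


def lssc(index, s):
    """Linearly separable subcode: interval index i in [0, S-1] maps to an
    (S-1)-bit word of i ones followed by zeros, so the Hamming distance
    between codewords i and k equals |i - k|."""
    return "1" * index + "0" * (s - 1 - index)


def discretize(probe, selected, cuts, s):
    """Quantize each selected dimension of a probe vector and concatenate
    (||) the per-dimension LSSC codewords into one binary string."""
    return "".join(
        lssc(int(np.searchsorted(cuts[d], probe[d])), s) for d in selected
    )
```

With l_fs bits of entropy per selected dimension, S = 2^{l_fs} equal-probable intervals yield (S − 1)-bit LSSC codewords, so D_fs selected dimensions produce a D_fs(2^{l_fs} − 1)-bit string, in line with Eq. (12).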
× 73 for the FRGC dataset. Finally, histogram equalization is applied to the cropped images.

Half of each identity's images are used for training, while the remaining half are used for testing. For measuring the system's false acceptance rate (FAR), each image of the corresponding user is matched against that of every other user according to its corresponding image index, while for the false rejection rate (FRR) evaluation, each image is matched against every other image of the same user, for every user. In the subsequent experiments, the equal error rate (EER), i.e., the error rate at which FAR = FRR, is used for comparing the discretization performance among the different discretization schemes, since it is a quick and convenient way to compare the performance accuracy of the discretizations. Basically, the performance is considered to be better when the EER is lower.

The experiments can be divided into three parts. The first part identifies the reliable discriminative feature-selection measures among those listed in the previous
section. The second part examines the performance of our approach and illustrates that replacing LSSC with a DBR- or BRGC-encoding scheme in our approach would achieve a much poorer performance when a high entropy is imposed, owing to the conventional performance-entropy tradeoff of DBR- and BRGC-encoding-based discretization. The last part scrutinizes how one could attain a reliable parameter estimate, i.e., D_fs, in achieving the highest possible discretization performance.

The experiments were carried out based on two different dimensionality-reduction techniques, ERE [28] and FDA [27], and two different datasets, FRGC and FERET. In the first two parts of the experiments, 4453 raw dimensions of the FRGC images and 1080 raw dimensions of the FERET images were both reduced to D = 100 dimensions, while for the last part, the raw dimensions of the images from both datasets were reduced to D = 50 and 100 dimensions for analytic purposes. Note that EP quantization was employed in all parts of the experiments.

4.2. Performance assessment
4.2.1. Experiment Part I: Identification of reliable feature-selection measures
Based on the fixed-bit-allocation principle, n bits are assigned equally to each of the D feature dimensions. A Dn-bit binary string is then extracted for each user by concatenating the n-bit binary outputs of the individual dimensions. Since DBR as well as BRGC is a code comprising the entire set of 2^n n-bit codewords for labeling S = 2^n intervals in every dimension, the single-dimensional entropy l can be deduced from (9) as

    l = log2 S = log2 2^n = n.    (13)

The total entropy L is then equal to the length of the binary string:

    L = Σ_{d=1}^{D} l = Σ_{d=1}^{D} n = Dn.    (14)

Note that L = 100, 200, 300, and 400 correspond to n = 1, 2, 3, and 4, respectively, for each baseline scheme (D = 100). For the feature-selection-based discretization schemes to provide the same amount of entropy (with n_fs and l_fs denoting the number of bits and the entropy of each selected dimension, respectively), we have

    L = Σ_{d=1}^{D_fs} l_fs = Σ_{d=1}^{D_fs} n_fs = D_fs n_fs.    (15)

With this, L = 100, 200, 300, and 400 correspond to l_fs = n_fs = 2, 4, 6, and 8, respectively, for D_fs = 50. This implies that the number of segments in each selected feature dimension is now larger than in the usual case by a factor of 2^{n_fs − n}.

For the LSSC encoding scheme, which utilizes longer codewords than DBR and BRGC in each dimension to fulfil a system-specified entropy requirement, the relation between the bit length n_LSSC and the single-dimensional entropy l can be described by

    n_LSSC = S − 1 = 2^l − 1,    (16)

and for our approach, we have

    n_LSSC(fs) = 2^{l_fs} − 1 = 2^{L/D_fs} − 1    (17)

from (10).

For the baseline discretization scheme of EP + LSSC with D = 100, L = Dl = D log2(n_LSSC + 1) = 100 log2(n_LSSC + 1). Thus, L = {100, 200, 300, 400} corresponds to l = {1, 2, 3, 4} and n_LSSC = {1, 3, 7, 15}, and the actual length of the extracted bit string is Dn_LSSC = {100, 300, 700, 1500}. For the feature-selection schemes with D_fs = 50, where L = D_fs l_fs = D_fs log2(n_LSSC(fs) + 1) = 50 log2(n_LSSC(fs) + 1), L = {100, 200, 300, 400} corresponds to l_fs = {2, 4, 6, 8} and n_LSSC(fs) = {3, 15, 63, 255}, and the actual length of the extracted bit string becomes D_fs n_LSSC(fs) = {150, 750, 3150, 12750}. The implication here is that when a particularly large entropy specification is imposed on a feature-selection scheme, a much longer LSSC-generated bit string will always be required.

Figure 5 illustrates the EER performance of the (I) EP + DBR, (II) EP + BRGC, and (III) EP + LSSC discretization schemes, which adopt the different discriminative-measure-based feature selections, with respect to that of the baseline (discretization without feature selection, where D_fs = D), based on the (a) FERET and (b) FRGC datasets. "Max" and "Min" in each subfigure indicate whether the D_fs largest or smallest measurements were adopted for the corresponding feature-selection method, as given in (7) and (8).

A great discretization performance achieved by a feature-selection scheme basically implies a reliable measure for estimating the discriminativity of the features. In all the subfigures, it is noticed that the discretization schemes that select features based on the LR, RL, and DR measures give the best performance among the feature-selection schemes. RL seems to be the most reliable discriminative measure, followed by LR and DR. In contrast, SNR and SD turn out to be poor discriminative measures that could not guarantee any improvement over the baseline scheme.

When the LSSC encoding in our four-step approach (see Section 3) is replaced with DBR in Figure 5Ia, Ib, and with BRGC in Figure 5IIa, IIb, the RL-, LR-, and DR-based feature-selection schemes manage to outperform the respective baseline scheme at low L. However, in most cases, these DBR- and BRGC-encoding-based