Kaohsiung, Taiwan

Tsai Y.-H.,Digital Signal | Tsai C.-C.,National Dong Hwa University | Wang K.-C.,Shin Chien University
Proceedings - 2010 7th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2010 | Year: 2010

In recent years, MP3 music objects have become a popular type of music file in many internet audio applications, including surveillance systems. However, content-based classification of audio data has received less attention. With the growth of cloud services, the classification of MP3 music has become increasingly important, since cloud computing requires processing large amounts of audio data. In this paper, we propose an approach to classify MP3 objects based on their energy distribution. PCA (principal component analysis) and an RBF (radial basis function) network are used to construct the MP3 classifier. Experiments show that the proposed method achieves good MP3 classification performance. ©2010 IEEE.

Wu W.-P.,Kun Shan University | Yang H.-L.,Shin Chien University
Journal of Quality | Year: 2011

The objective of this study is to investigate, through a survey of home-stay guests, whether the characteristics of home-stay operation affect customer value and relationship quality. Using e-commerce as a moderator, this paper also examines whether a moderating effect exists between customer value and relationship quality. In our research, the characteristics of home-stay operation were classified into three dimensions: style, service, and price. Relationship quality comprises customer satisfaction, commitment, and trust, and e-commerce function comprises two dimensions: traveling information and on-line reservation. The results are as follows. The characteristics of home-stay operation significantly and positively influence customer value. Customer value significantly and positively influences relationship quality. The traveling information dimension of the e-commerce function moderates the relationship between customer value and relationship quality, while on-line reservation does not. We therefore suggest that home-stay operators improve the traveling information offered through their e-commerce function to attract more customers.

Wang K.-C.,Shin Chien University | Chin C.-L.,Chung Shan Medical University
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | Year: 2011

In this paper, we present an approach to detecting speech presence in which the decision rule is based on a combination of multiple features using a sigmoid function. Minimum classification error (MCE) training is used to adjust the weights of the combination. The features, consisting of three parameters (the ratio of ZCR, the spectral energy, and the spectral entropy), are combined linearly with weights derived from the sub-band domain. First, the Bark-scale wavelet decomposition (BSWD) is used to split the input speech into 24 critical sub-bands. Next, the feature parameters are derived from the selected frequency sub-bands to form robust voice feature parameters. In order to discard seriously corrupted frequency sub-bands, a strategy of adaptive frequency sub-band extraction (AFSE) dependent on the sub-band SNR is then applied so that only the useful frequency sub-bands are used. Finally, these three feature parameters, which consider only the useful sub-bands, are combined through a sigmoid-type function incorporating optimal weights based on MCE training to classify each frame as speech-present or speech-absent. Experimental results show that the performance of the proposed algorithm is superior to standard methods such as G.729B and AMR2. © 2011 The Institute of Electronics, Information and Communication Engineers.
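
The decision rule described above, a weighted linear combination of features passed through a sigmoid, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weight values, the bias, the 0.5 decision threshold, and the sample feature vectors are all hypothetical, and the weights here are hand-picked rather than obtained by MCE training.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def speech_presence_score(features, weights, bias=0.0):
    """Combine the features linearly, then squash the sum with a sigmoid."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def is_speech(features, weights, bias=0.0, threshold=0.5):
    """Label a frame as speech when the score reaches the threshold."""
    return speech_presence_score(features, weights, bias) >= threshold


# Hypothetical weights for (ZCR ratio, spectral energy, spectral entropy).
# In the abstract these would come from MCE training; here they are made up.
weights = [-1.0, 2.0, -1.5]

speech_like = [0.2, 1.5, 0.3]   # assumed feature values for a speech frame
noise_like = [0.9, 0.1, 0.95]   # assumed feature values for a noise frame

print(is_speech(speech_like, weights))  # True for these assumed values
print(is_speech(noise_like, weights))   # False for these assumed values
```

With trained weights, the same structure would apply; only the values of `weights` and `bias` would change.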

Wang K.-C.,Shin Chien University | Chin C.-L.,Chung Shan Medical University
WSEAS Transactions on Information Science and Applications | Year: 2010

In this paper, we propose a novel wavelet coefficient threshold (WCT) that depends on both time and frequency information to provide robustness in non-stationary and correlated noisy environments. A perceptual wavelet filter-bank (PWFB) is first used to decompose the noisy speech signal into critical bands according to the psycho-acoustic model of the human auditory system. The estimate of the WCT is then adjusted with the posterior SNR, which is determined from the estimated noise power, through the well-known "Quantum Neural Networks (QNN)". To suppress the musical residual noise produced by the thresholding process, we exploit the masking properties of the human auditory system. Simulation results show that the proposed system is capable of reducing noise with little speech degradation and that its overall performance is superior to several competitive methods.

Wang K.-C.,Shin Chien University | Chin C.-L.,Chung Shan Medical University | Wang C.-M.,Chung Shan Medical University
Lecture Notes in Engineering and Computer Science | Year: 2013

This paper presents an innovative voice activity detector (VAD) based on horizontal spectral entropy with long-span of time (HSELT) feature sets to improve mobile ASR performance in low signal-to-noise ratio (SNR) conditions. Because the signal characteristics of nonstationary noise change with time, long-term information about the noisy speech signal is needed to define a more robust decision rule that yields high accuracy. We find that HSELT measures can horizontally enhance the transition between speech and non-speech segments. Based on these findings, the HSELT measures can be used to detect speech signals with high accuracy in various stationary and nonstationary noises.

Kung C.M.,Shin Chien University
Journal of Multimedia | Year: 2010

Due to the rapid development of computer networks and data communication technologies, communication using digital media (text, picture, sound, video, etc.) has become more and more frequent. Digital media can be readily duplicated, modified, and transmitted, making them easy for people to create, manipulate, and enjoy. Thus the protection of the intellectual property rights of digital images has become an important issue. Watermarking is an effective and popular technique for discouraging illegal copying and distribution of copyrighted digital image information. In this paper, we propose a method for robust watermarking. First, the robust watermarking scheme is performed in the frequency domain and can be used to prove ownership. Second, we provide a high degree of robustness against JPEG compression attacks through source coding, and protect the transmitted information through channel coding. We adopt the idea of data distribution to avoid continuous information attacks, because such attacks would destroy the entire error correction scheme. Experimental results are also presented to demonstrate the validity and robustness of the approach. © 2010 ACADEMY PUBLISHER.

Wang K.-C.,Shin Chien University
International Journal of Computers and Applications | Year: 2011

To obtain reliable performance from a voice activity detector (VAD) algorithm, this paper uses an entropy-based measure to characterize the straight lines on the spectrogram of speech activity, which are robust against noise. The entropy measure is defined on the energy domain of the harmonic subbands. It is shown that the entropy-based measure is well suited for detecting speech in white or quasi-white noise, but performs poorly in coloured noise. To compensate for this limitation, refined minima controlled recursive averaging, which can be updated quickly and accurately even under rapidly increasing noise levels, is used to desensitize the entropy measure to various types of noise. Consequently, the proposed VAD algorithm is shown to significantly outperform the commonly used energy-based algorithm when the SNR drops rapidly, and is moreover insensitive to changing noise levels. Experimental results demonstrate that the performance of the proposed VAD is comparable to modern standard VADs such as ITU-T G.729B and the ETSI front-end VAD, or to statistical model-based VADs.
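
As a rough illustration of what an entropy-based spectral measure computes, the sketch below evaluates plain full-band spectral entropy for a single frame. It is a simplification under stated assumptions: the abstract defines its measure on harmonic subbands, whereas this example uses the whole spectrum, a naive DFT, and hypothetical function names and test signals. A frame whose energy is concentrated in one frequency bin gives an entropy near zero, while a frame with a flat spectrum gives the maximum value.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: no spectral distribution to measure
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


n = 64
# Assumed test signals: a pure tone at bin 4 and a unit impulse.
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
impulse = [1.0] + [0.0] * (n - 1)

print(spectral_entropy(tone))     # close to 0: energy sits in one bin
print(spectral_entropy(impulse))  # close to log(33): flat spectrum
```

The flat-spectrum case is analogous to white noise, which is consistent with the abstract's remark that the measure separates speech from white or quasi-white noise more easily than from coloured noise.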

Wang K.-C.,Shin Chien University
IEICE Transactions on Information and Systems | Year: 2010

Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because of the difficulty of accurately estimating the local noise spectrum. In this paper, a simple method of noise estimation employing a voice activity detector (VAD) is proposed. The output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts can be improved according to the VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition (BSWPD) to convert the noisy signal into wavelet coefficients (WCs). It is found that the VAD using the bark-scale spectral entropy parameter, called BS-Entropy, is superior to other energy-based approaches, especially under variable noise levels. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the VAD result. In a speech-dominated frame, the speech is categorized as either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in the lower subbands, so the WCs of the lower bands must be preserved. In contrast, the WCT tends to increase in the lower bands if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. Objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that the algorithm is valid under various noise conditions, especially colored noise and non-stationary noise. Copyright ©2010 The Institute of Electronics, Information and Communication Engineers.
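
The idea of raising the wavelet coefficient threshold in noise-dominated frames and lowering it in speech-dominated frames can be sketched as follows. This is only an illustration under several assumptions: a single-level Haar transform stands in for the bark-scale wavelet packet decomposition, one threshold is applied to all coefficients instead of one per subband, the VAD decision is passed in as a flag, and the function names and threshold values are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def haar_dwt(signal):
    """Single-level Haar transform of an even-length signal."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by the threshold."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise_frame(frame, is_speech, speech_threshold=0.05, noise_threshold=1.0):
    """Use a small threshold for speech frames and a large one for noise frames."""
    threshold = speech_threshold if is_speech else noise_threshold
    approx, detail = haar_dwt(frame)
    return haar_idwt(
        soft_threshold(approx, threshold),
        soft_threshold(detail, threshold),
    )


frame = [0.1, -0.1, 0.05, -0.05]  # assumed low-level input frame

# Noise-dominated frame: the large threshold removes every coefficient.
print(denoise_frame(frame, is_speech=False))

# Speech-dominated frame with a zero threshold: the frame is reconstructed.
print(denoise_frame(frame, is_speech=True, speech_threshold=0.0))
```

A per-subband version would keep a separate threshold for each band and, following the abstract, preserve the lower-band coefficients in voiced frames.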

Wang K.-C.,Shin Chien University
IEICE Transactions on Information and Systems | Year: 2012

The conventional entropy measure is derived from the full band (ranging from 0 Hz to 4 kHz); however, it cannot clearly describe the spectrum variability during voice activity. Here we propose a novel adaptive long-term sub-band entropy (ALT-SubEnpy) measure and combine it with a multi-thresholding scheme for voice activity detection. In detail, the ALT-SubEnpy measure is developed from four sub-entropy part parameters, each of which uses a different long-term spectral window length. Consequently, the proposed ALT-SubEnpy-based algorithm recursively updates the four adaptive thresholds, one for each part. The proposed ALT-SubEnpy-based VAD method is shown to be effective under variable noise-level conditions. Copyright © 2012 The Institute of Electronics, Information and Communication Engineers.

Wang K.-C.,Shin Chien University
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | Year: 2012

A novel long-term sub-band entropy (LT-SubEntropy) measure, which uses improved long-term spectral analysis and sub-band entropy, is proposed for voice activity detection (VAD). Based on this measure, we can accurately exploit the inherent nature of the formant structure on the speech spectrogram (well known as the voiceprint). Results show that the proposed VAD is superior to existing standard VAD methods at low SNR levels, especially under variable-level noise. Copyright © 2012 The Institute of Electronics, Information and Communication Engineers.
