JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 3            2003 March

Signal Representation Including Waveform Envelope by Clustered Line-Spectrum Modeling

M. KAZAMA

Waseda University, Tokyo, Japan

AND

K. YOSHIDA AND M. TOHYAMA, AES Member

Kogakuin University, Tokyo, Japan

      Clustered line-spectrum modeling (CLSM) around the peaks of interpolated FFT spectrum records has been developed for signal analysis and representation including the signal envelope. The signal waveform, including the envelope, can be represented by components clustered around spectrum peaks. The sinusoidal components at the peaks are estimated by peak picking, whereas the clustered components required in particular for envelope representation can be estimated by obtaining least-squares error (LSE) solutions in the frequency domain, which was originally formulated by Quatieri and Danisewicz and by Maher. Numerical simulation reveals that the basic CLSM algorithm works well, and a narrow-band speech sample or impulse-response type transient-signal analysis shows that acoustic signals that include envelopes can be expressed quite effectively by using clustered components based on CLSM.


Moving-Horizon Optimal Quantizer for Audio Signals

GRAHAM C. GOODWIN AND DANIEL E. QUEVEDO

School of Electrical Engineering and Computer Science, University of Newcastle, Callaghan, NSW 2308, Australia

AND

DAVID MCGRATH, AES Member

Lake Technology Ltd., Ultimo, NSW 2007, Australia

      By analyzing the quantization of audio signals as a deterministic finite-set constrained quadratic optimization problem, a new scheme, called moving-horizon optimal quantizer (MHOQ), is developed. The MHOQ includes a model of the ear's sensitivity to low-level noise power and minimizes directly the perceived error over a finite prediction horizon. Feedback is incorporated by means of the moving-horizon principle. With a prediction horizon equal to 1, the MHOQ reduces to the psychoacoustically optimal noise-shaping quantizer, widely used in practical applications. Larger prediction horizons outperform the noise shaper at the expense of only a small increase in computational complexity.
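
      The noise-shaping quantizer that the abstract names as the horizon-1 special case can be illustrated with a generic first-order error-feedback loop. The sketch below is not the authors' MHOQ and contains no psychoacoustic model: the function name, the step size, and the single-tap feedback are all assumptions chosen for illustration, and clipping to a bounded set of levels is omitted.

```python
def noise_shaping_quantize(samples, step=0.25, feedback=1.0):
    """Quantize to multiples of `step`, feeding back the previous error.

    Hypothetical illustration only. With feedback=1.0 the output error
    is e[n] - e[n-1], i.e. the quantization error filtered by
    (1 - z^-1), which moves error power toward high frequencies.
    """
    out = []
    prev_error = 0.0
    for x in samples:
        v = x - feedback * prev_error   # subtract the error made on the last sample
        q = step * round(v / step)      # nearest available quantizer level
        prev_error = q - v              # error made on this sample
        out.append(q)
    return out


if __name__ == "__main__":
    x = [0.1] * 16
    y = noise_shaping_quantize(x)
    # The output toggles between levels so that its running sum tracks the input.
    print(y)
    print(sum(x), sum(y))
```

      Because the per-sample errors telescope, the accumulated difference between output and input never exceeds half a quantizer step, which is the low-frequency benefit of error feedback. A perceptually motivated design would replace the single feedback tap with a filter derived from a model of hearing.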


Test Signal Generation and Accuracy of Turntable Control in a Dummy-Head Measurement System

GYÖRGY WERSÉNYI, AES Member, AND ANDRÁS ILLÉNYI, AES Member

University of Technology and Economics, Békésy György Acoustical Research Laboratory, Budapest, Hungary

      To measure the variations of the head-related transfer function (HRTF) caused by the acoustical environment near the head, a precisely controlled measurement setup with increased signal-to-noise ratio is needed. Based on the conclusions of an earlier investigation, a fully automatic dummy-head measurement system was installed in an anechoic room. An alternative way of generating a pseudorandom test signal is described as well as the rebuilding of the Brüel & Kjaer turntable for accurate settings of azimuth.


Loudspeaker Equalizer Design for Near-Sound-Field Applications

WEE SER, PENG WANG, AND MING ZHANG

Center for Signal Processing, School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore 639798

      Loudspeaker equalization is an essential technique in audio reproduction. Conventional equalization schemes focus on dealing with the axial impulse response, and thus cannot provide sufficient off-axis equalization, which is required in near-sound-field applications. A new loudspeaker equalizer is proposed. It addresses this problem and the problem of binaural perceptual difference. Equalizer designs for multiuser and/or multiposition environments are also briefly discussed. These methods have the potential of being applied in practice in the implementation of automobile, TV, or desktop loudspeaker systems.



JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 4            2003 April

The Bidirectional Microphone:
A Forgotten Patriarch

RON STREICHER, AES Fellow

Pacific Audio-Visual Enterprises, Pasadena, CA 91107, USA

AND

WES DOOLEY, AES Fellow

Audio Engineering Associates, Pasadena, CA 91104, USA

      Despite being one of the progenitors of all modern microphones and recording techniques, the bidirectional pattern is still not very well understood. Its proper and effective use remains somewhat of a mystery to many recording and sound-reinforcement engineers. The bidirectional microphone is examined from historical, technical, and operational perspectives. Its development is reviewed, as is its role as a fundamental element of almost all other single-order microphone patterns. In the course of describing how this unique pattern responds to sound waves arriving from different angles of incidence, it is shown that very often it can be employed successfully where other more commonly used microphones cannot.


Efficient Tempo and Beat Tracking in Audio Recordings

JEAN LAROCHE

Creative Advanced Technology Center, Scotts Valley, CA 95066, USA

      Automatic beat tracking consists of estimating the number of beats per minute at which a music track is played and identifying exactly when these beats occur. Applications range from music analysis, sound-effect synchronization, and audio editing to automatic playlist generation and deejaying. An off-line beat-tracking technique for estimating a time-varying tempo in an audio track is presented. The algorithm uses an MMSE estimation of local tempo and beat location candidates, followed by a dynamic programming stage used to determine the optimum choice of candidate in each analysis frame. The algorithm is efficient in its use of computational resources, yet provides very good results on a wide range of audio tracks. The algorithm details are presented, followed by a discussion of the performance and suggestions for further improvements.


On the Acoustic Radiation from a Loudspeaker's Cabinet

KEVIN J. BASTYR, AES Member

Graduate Program in Acoustics, The Pennsylvania State University, State College, PA 16804

AND

DEAN E. CAPONE

Applied Research Laboratory, The Pennsylvania State University, State College, PA 16804

      A scanning laser Doppler vibrometer and a computational boundary-element model are used to study the acoustic radiation from loudspeaker cabinets. In contrast to the research findings of Skrodzka, loudspeaker cabinets are shown to contribute significantly to the total radiated pressure at their lower resonance frequencies. This occurs because, despite a cabinet's relatively small surface velocity, its radiation efficiency is many times greater than that of the drivers. The radiation from two different versions of NHT's model 2.9 loudspeaker is investigated. The first is a standard production 2.9, the second a 2.9 without the standard internal bracing. A comparison of their performance yields insight into the effects of wall bracing location: stiffer cabinets with lower amplitude wall vibrations do not always radiate less sound.


The Virtual Loudspeaker Cabinet

J. R. WRIGHT, AES Member

KEF Audio (UK) Ltd., Maidstone, UK

      A method is presented for increasing the acoustic compliance of a loudspeaker cabinet by introducing activated carbon into the enclosure. The process is explained and working examples are discussed.



JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 5              2003 May

Assessment of Voice-Coil Peak Displacement Xmax

WOLFGANG KLIPPEL, AES Fellow

Klippel GmbH, Dresden, Germany

      The voice-coil peak displacement Xmax is an important driver parameter for assessing the maximum acoustic output at low frequencies. The existing standard AES2-1984 defines the peak displacement Xmax by measuring harmonic distortion in either voice-coil current or displacement. This freedom of choice gives completely different and controversial results. After a critical review of this performance-based technique, an amendment of this method is suggested. Alternatively, a parameter-based method is developed giving more detailed information about the cause of the distortion, limitations, and defects. The relationship between performance-based and parameter-based methods is discussed, and both techniques are tested with real drivers.


Modal Equalization of Loudspeaker-Room Responses
at Low Frequencies

AKI MÄKIVIRTA,1 AES Member, POJU ANTSALO,2 MATTI KARJALAINEN,2 AES Fellow,
AND VESA VÄLIMÄKI,2 AES Member

1 Genelec Oy, FIN-74100 Iisalmi, Finland
2 Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing,
FIN-02015 HUT, Espoo, Finland

      The control of excessively long decays in a listening room with strong low-frequency modes is problematic, expensive, and sometimes impossible with conventional passive means. A systematic methodology is presented to design active modal equalization able to selectively reduce the mode decay rate of a loudspeaker-room system at low frequencies in the vicinity of a sound engineer's listening location. Modal equalization is able to increase the rate of initial sound decay at mode frequencies, and can be used with conventional magnitude equalization to optimize the reproduced sound quality. Two methods of implementing active modal equalization are proposed. The first modifies the primary sound such that the mode decay rates are controlled. The second uses separate secondary radiators and controls the mode decays with additional sound fed into the secondary radiators. Case studies are presented of implementing active modal control according to the first method.


A Low-Cost Intensity Probe

R. RAANGS, W. F. DRUYVESTEYN, AES Life Member, AND H. E. DE BREE, AES Member

University of Twente, 7500 AE Enschede,
The Netherlands

      The sound intensity in a sound field can be determined from two acoustical measurements such as pressure and particle velocity. In the p-p method, the most commonly used method, the sound intensity is calculated by means of two spaced microphones. Another method, the p-u method, combines a pressure sensor with a sensor that measures the sound particle velocity. Using the p-u method, measurements performed with a low-cost intensity probe and a computer soundcard combined with open-source software show good agreement with results obtained with a p-p probe.


Industry Evaluation of In-Band On-Channel Digital Audio Broadcast Systems

DAVID WILSON, AES Member

Consumer Electronics Association,
Arlington, VA 22201, USA

      The National Radio Systems Committee's testing and evaluation program for in-band on-channel digital audio broadcast systems is described. The results of laboratory and field tests performed during 2001 on iBiquity Digital Corporation's AM-band and FM-band IBOC DAB systems are reported. The conclusions drawn from the laboratory and field test results are also reported, and implications for the future are discussed.



JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 6             2003 June

Effects of Bandwidth Limitation on
Audio Quality in Consumer
Multichannel Audiovisual
Delivery Systems

SLAWOMIR K. ZIELINSKI, AES Member, AND FRANCIS RUMSEY, AES Member

Institute of Sound Recording, University of Surrey, Guildford, Surrey, GU2 7XH,
UK

AND

SØREN BECH, AES Fellow

Bang & Olufsen, Struer, Denmark

      The subjective effects of controlled limitation of audio bandwidth on the assessment of audio quality were studied. The investigation was focused on the standard 5.1 multichannel audio setup and limited to the optimum listening position. The effect of video presence on the audio quality assessment was also investigated. The results of formal subjective tests indicate that it is possible to limit the bandwidth of the center or the rear channels without significant deterioration of the audio quality for most program material types investigated. Video presence had a small effect on the audio quality assessment.


An Efficient Algorithm for the
Restoration of Audio Signals
Corrupted with Low-Frequency
Pulses

PAULO A. A. ESQUEF,1 AES Member,
LUIZ W. P. BISCAINHO,2 AES Member,
AND VESA VÄLIMÄKI,1 AES Member

1 Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing,
FIN-02015 HUT, Espoo, Finland
2 Universidade Federal do Rio de Janeiro
LPS-DEL/POLI, 21945-970, Rio de Janeiro,
RJ, Brazil

      Digital audio restoration of old recordings is addressed, with a focus on the removal of long pulses with low-frequency content. The main drawback of the state-of-the-art method, which is based on the separation of autoregressive (AR) processes, is its high computational complexity. A method is proposed in which the pulse tails are first estimated via a nonlinear scheme called two-pass split-window (TPSW) filtering, followed by a polynomial smoothing stage. After removing the tail of each pulse by subtraction, the remaining initial clicks are suppressed through a model-based declicking algorithm. The proposed procedure is as effective for pulse removal as the AR-based method but has a substantially lower computational complexity. Moreover, from a user point of view, its processing parameters are more intuitive and easier to adjust.


Study on the Relationship between Some Room Acoustical Descriptors

D. OUIS

Department of Engineering Acoustics,
Lund Institute of Technology, SE-221 00
Lund, Sweden


      The results of a preliminary investigation into the theoretical evaluation and study of the relationship between some room acoustical descriptors used for the subjective assessment of performance halls are considered. The concern is about three parameters, namely, the interaural cross-correlation coefficient (IACC), the early lateral energy fraction (ELEF) or a related measure, the spaciousness S, and the initial time-delay gap (ITDG). To this end, the impulse response (IR) for a hard rectangular room with a side balcony on each lateral wall is calculated. This room configuration is considered a coarse approximation of a small performance hall. The theoretical model used for this calculation represents a combination of the image sources method for the wave reflections at the hard surfaces and an exact model accounting for the diffraction of waves at the wedges of the balconies, the latter being even extended to the second order of multiple diffractions. Furthermore, and in view of a more realistic determination of the IACC, the thus obtained impulse response is convolved with the head-related transfer function (HRTF) for both ears as measured using a dummy head (KEMAR) in an anechoic environment. In this respect the directional characteristics of the different components of the impulse response are accounted for in the IACC, but to a lesser degree than in the ELEF, whereas the ITDG is independent of direction. It is found that simple relations may be established between these parameters, which may be useful for room acoustical assessments or for estimating one of these parameters, when inaccessible, by knowing the value of any of the other ones.


Automated Parameter Optimization for Double Frequency Modulation Synthesis Using a Tree Evolution Algorithm

B. T. G. TAN AND N. LIU

Department of Physics, National University of Singapore, Singapore 117542, Republic of Singapore

      A new algorithm, the tree evolution algorithm (TEA), is proposed for parameter optimization of FM and related synthesis techniques. The algorithm explores each local minimum separately in a complex solution space, with its search algorithm forming a tree structure. It is shown that the algorithm, which is based on the genetic annealing algorithm (GAA), is more accurate and more stable than GAA in estimating the optimum parameters for double frequency modulation (DFM) synthesis.




JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 7/8    2003 July/August

Full-Sphere Sound Field of
Constant-Beamwidth Transducer (CBT)
Loudspeaker Line Arrays

D. B. (DON) KEELE, JR., AES Fellow

Harman/Becker Automotive Systems, Martinsville, IN 46151, USA

      The full-sphere sound radiation pattern of the constant-beamwidth transducer (CBT) circular-wedge curved-line loudspeaker array exhibits a three-dimensional petal or eye-shaped sound radiation pattern that stays surprisingly uniform with frequency. Oriented vertically, it not only exhibits the expected uniform control of vertical coverage, but also provides significant coverage control horizontally. The horizontal control is provided by a vertical coverage that decreases smoothly as a function of the horizontal off-axis angle and reaches a minimum at right angles to the primary listening axis. This is in contrast to a straight-line array, which exhibits a three-dimensional sound field that is axially symmetric about its vertical axis and exhibits only minimal directivity in the horizontal plane due to the inherent directional characteristics of each of the sources that make up the array.


Direct-Radiator Loudspeaker
Systems with High Bl

JOHN VANDERKOOY, AES Fellow

Audio Research Group, Department of Physics, University of Waterloo, Waterloo, ON Canada N2L 3G1

AND

PAUL M. BOERS AND RONALD M. AARTS, AES Fellow

Philips Research Labs, WY81, 5656 AA Eindhoven, The Netherlands

      In an extension of an earlier paper by the authors, additional consequences of a dramatic increase in the motor strength Bl of a driver are shown. Not only is the efficiency of the loudspeaker and amplifier greatly increased, but high Bl values have a positive influence on other aspects of loudspeaker systems. Box volume can be reduced significantly and other parameters can be altered. A prototype driver unit is studied, which performs well in a small sealed box. Vented systems do not benefit as much from high Bl.

Psychoacoustic Investigations on
Sound-Source Occlusion

HANIA FARAG,1 JENS BLAUERT,2 AES Fellow, AND ONSY ABDEL ALIM1

1 Department of Electrical Engineering, Faculty of Engineering, University of Alexandria, 21544, Alexandria, Egypt
2 Institut für Kommunikationsakustik, Ruhr-Universität Bochum, DE-44801,
Bochum, Germany

      Efficient simulation of sound-source occlusion is needed, for example, in auditory virtual environments, and remains an interesting and at present not widely researched topic. In order to achieve plausible and efficient simulation, the changes in psychoacoustical parameters accompanying the perception of sound-source occlusion have to be identified and understood. The impact of occlusion on the localization of auditory events is investigated with the aid of listening tests. Rectangular wood plates of different dimensions are used as occluders. A noticeable shift in the location of the auditory events is observed. The results can be explained on grounds of the precedence effect.


The Differential Pressure Synthesis Method for Efficient Acoustic Pressure Estimation

YUFEI TAO, ANTHONY I. TEW, AES Associate, AND STUART J. PORTER

Department of Electronics, University of York, YO10 5DD, UK

      A differential pressure synthesis (DPS) method is proposed which estimates the free-field acoustic pressure on the boundary of an object from its geometry by precalculating a database of pressure changes caused by introducing orthogonal shape deformations to a template shape. Pressures are synthesized using DPS for a two-dimensional shape and a three-dimensional KEMAR head model. The accuracy of pressure estimates compares favorably with the boundary-element method computation provided that shape deformations are moderate in relation to acoustic wavelength.



JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 9        2003 September

Effects of Down-Mix Algorithms on Quality of Surround Sound

SŁAWOMIR K. ZIELINSKI, AES Member, AND FRANCIS RUMSEY, AES Fellow

Institute of Sound Recording, University of Surrey, Guildford, Surrey, GU2 7XH,
UK

AND

SØREN BECH, AES Fellow

Bang & Olufsen, Struer, Denmark

      Eight down-mix algorithms were evaluated in terms of basic audio quality. The investigation was focused on the standard 5.1 multichannel audio setup (ITU-R BS.775-1) and limited to two listening positions. The results obtained are summarized and detailed specifications of the subjectively best algorithms are given. The effect of the presentation of moving pictures on the assessment of audio quality was also investigated. The results show that exposure to visual content has a considerable effect on the evaluation of the audio quality at the off-center position for some types of program material.


A Study on Head-Shape
Simplification Using Spherical
Harmonics for HRTF Computation at
Low Frequencies

YUFEI TAO, ANTHONY I. TEW, AES Associate,
AND STUART J. PORTER

Department of Electronics, University of York, Heslington, YO10 5DD, UK

      Simplified head shapes, such as spheres and ellipsoids, have often been applied in the research of head-related transfer functions (HRTFs). However, the effects of the missing head-shape features in these simplified head models have not been thoroughly examined. Head shapes are represented using spherical harmonics, which allows the simplification of head shapes to be carried out in a controlled and systematic way. The KEMAR head shape is low-pass filtered to different degrees. The errors in both the head shape and the acoustic pressures introduced by the low-pass filters are studied. Guidelines are presented for examining the tradeoff between head-shape simplification and accuracy of pressure estimation. It is concluded that spherical harmonics above degree 11 may be ignored in the computation of HRTFs below 3 kHz.


Differences in Performance and
Preference of Trained versus
Untrained Listeners in
Loudspeaker Tests: A Case Study

Sean E. Olive, AES Fellow

Research & Development Group, Harman International Industries, Inc.,
Northridge, CA, 91329, USA

      Listening tests on four different loudspeakers were conducted over the course of 18 months using 36 different groups of listeners. The groups included 256 untrained listeners whose occupations fell into one of four categories: audio retailer, marketing and sales, professional audio reviewer, and college student. The loudspeaker preferences and performance of these listeners were compared to those of a panel of 12 trained listeners. Significant differences in performance, expressed in terms of the magnitude of the loudspeaker F statistic FL, were found among the different categories of listeners. The trained listeners were the most discriminating and reliable listeners, with mean FL values 3-27 times higher than the other four listener categories. Performance differences aside, loudspeaker preferences were generally consistent across all categories of listeners, providing evidence that the preferences of trained listeners can be safely extrapolated to a larger population. The highest rated loudspeakers had the flattest measured frequency response maintained uniformly off axis. Effects and interactions between training, programs, and loudspeakers are discussed.


Objective Measures of Listener
Envelopment in Multichannel
Surround Systems

Gilbert A. Soulodre, AES Fellow, Michel C. Lavoie, and Scott G. Norcross, AES Member

Communications Research Centre, Ottawa,
Ont. K2H 8S2, Canada


      A common goal in multichannel musical recordings is to create a better approximation of the concert-hall experience than can be achieved with a traditional stereo reproduction system. Listener envelopment (LEV) is known to be an important part of good concert-hall acoustics and is therefore desirable in multichannel reproduction. In the present study a series of subjective tests were conducted to determine which acoustic parameters are important to the creation of LEV. It is shown that LEV can be controlled systematically in a home listening environment by varying the level and angular distribution of the late arriving sound. While the perceptual transition point between early and late energy has traditionally been set to 80 ms when predicting LEV, this matter has not been investigated rigorously. Subjective tests were conducted wherein the temporal and spatial distributions of the late energy were varied. A new frequency-dependent objective measure GSperc was derived, and it was shown to outperform other objective measures significantly.



JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 10         2003 October

Two-Port Representation of the Connection between Horn Driver
and Horn

Gottfried K. Behler, AES Member, AND
Michael Makarski

Institute of Technical Acoustics, University of Aachen, Aachen, D-52056,
Germany

      A method for measuring and describing horn drivers and horns as independent parts was investigated. It is shown that the well-known two-port representation can be adopted for system characterization considering certain assumptions and limitations. The horn driver is represented as a two-port whereas the horn is characterized by its acoustical input impedance and, due to its three-dimensional sound radiation, by its on-axis transfer function and a relative directivity. With both sets of parameters the electrical input impedance, the transfer function, and the directivity of any horn driver-horn combination can be synthesized by a software tool without a need for measuring the real combination. This method speeds up procedures of either loudspeaker system design or the design and optimization of new horn drivers and horns, respectively. Besides the general-purpose measuring techniques, some specialized measuring equipment is required such as an impedance tube fitted to the horn throat and an anechoic chamber to record the directivity of the horn. Finally, all possible combinations of seven horn drivers and eleven horns have been studied to show the reliability of the method.


Sensitivity of High-Order
Loudspeaker Crossover Networks
with All-Pass Response

Brandon Cochenour, AES Member, Carlos Chai,
and David A. Rich, AES Member

Electrical and Computer Engineering Department, Lafayette College, Easton,
PA 18042, USA

      The sensitivity of high-order filter networks to component matching tolerances increases with the filter order. For the crossover network of an audio loudspeaker that is designed to sum to an all-pass network, it is demonstrated that the sensitivity to component matching tolerances may be dwarfed by sensitivities to other effects. Second- to eighth-order Linkwitz-Riley crossovers are examined. The analysis also subsumes networks with transmission zeros and optimized networks where the effects of frequency-response errors introduced by the respective driver transfer functions are minimized. Crossover networks are considered which are least sensitive to the combined effects of component tolerances, path-delay effects, interaction of filter sections in loudspeakers that divide the incoming signal into three or more subbands, and driver transfer functions.

Wavefront Sculpture Technology

Marcel Urban, Christian Heil, AES Member,
and Paul Bauman, AES Member

L-ACOUSTICS, Marcoussis, 91462 France

      The Fresnel approach in optics is introduced to the field of acoustics. Fresnel analysis provides an effective, intuitive way of understanding complex interference phenomena and allows for the definition of criteria required to couple discrete sound sources effectively and to achieve coverage of a given audience geometry in sound-reinforcement applications. The derived criteria form the basis of what is termed Wavefront Sculpture Technology.


Acoustical Renovation of Tainan
Municipal Cultural Center
Auditorium

Weihwa Chiang, Chingtsung Hwang, and
Yenkun Hsu

Department of Architecture, National Taiwan University of Science and Technology,
Taipei City 106, Taiwan, ROC

      An acoustical analysis was conducted for renovating the auditorium in the Tainan Municipal Cultural Center for orchestral programs. Design strategies included using more reflective surfaces, incorporating the pit as part of the platform, installing suspended reflectors, and splaying the rear sidewalls. The unoccupied midfrequency reverberation time and the strength factor were increased to 2.02 s and 4.6 dB, respectively. The overall impression rating was 5.3 on a 7-point scale.



JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 11        2003 November

The Effect of Nonlinear Distortion on
the Perceived Quality of Music and
Speech Signals

CHIN-TUAN TAN AND BRIAN C. J. MOORE, AES Member

Department of Experimental Psychology, University of Cambridge, Cambridge CB2 3EB, UK

AND

NICK ZACHAROV, AES Member

Nokia Research Center, Audio-Visual
Systems Laboratory, Tampere, Finland

      The effect of various types of nonlinear distortion on the perceived quality of speech and music signals was examined. In experiments 1 and 2, "artificial" distortions were used, including hard and soft symmetrical and asymmetrical peak clipping of various amounts, center clipping, and full-range waveform distortion produced by raising the instantaneous absolute value of the waveform to a power (≠1) while preserving the sign. Subjects were asked to rate the perceived amount of distortion on a ten-point scale (where 1 was most distorted and 10 least distorted). In experiment 1 the distortions were applied to the broad-band signals. In experiment 2 the distortions were applied to subbands of the signal. Results were highly consistent across subjects and test sessions. Center clipping and soft clipping had only small effects on the ratings, whereas hard clipping and the full-range distortions had large effects. The subjective ratings were compared to physical measures of distortion based on multitone test signals. A distortion measure, DS, derived from the output spectrum of each nonlinear system in response to a 10-component multitone signal gave high negative correlations with the subjective ratings (correlations were negative as large values of DS were associated with low ratings). A further experiment was conducted using stimuli for which nonlinear distortion was introduced by recording the outputs of real transducers. The output signals were digitally filtered to reduce irregularities in the amplitude-frequency response as far as possible. The results showed moderately strong negative correlations between the subjective ratings and the objective measure DS. It was concluded that an objective measure of nonlinear distortion based on the use of a multitone signal can predict the perceptual effects of nonlinear distortion reasonably well.
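
      The full-range distortion described in the abstract, raising the absolute value of each sample to a power while keeping its sign, can be written in a few lines. This is a minimal sketch assuming samples normalized to the range -1 to 1; the function name and the exponent values are hypothetical and are not taken from the paper.

```python
import math


def power_law_distort(samples, exponent):
    """Apply y = sign(x) * |x| ** exponent to every sample.

    Hypothetical illustration: an exponent of 1 leaves the waveform
    unchanged, and any other positive exponent bends the transfer curve
    over the full amplitude range while keeping it odd-symmetric.
    """
    return [math.copysign(abs(x) ** exponent, x) for x in samples]


if __name__ == "__main__":
    wave = [-1.0, -0.25, 0.0, 0.25, 1.0]
    print(power_law_distort(wave, 0.5))   # exponent below 1 boosts low-level samples
    print(power_law_distort(wave, 2.0))   # exponent above 1 attenuates them
```

      Unlike peak clipping, which acts only above a threshold, this mapping alters every nonzero sample whose magnitude differs from 1, which is why the abstract calls it a full-range distortion.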


Ultra-High Quality Video Frame
Synchronous Audio Coding

MICHAEL J. SMITHERS, AES Member, BRETT G. CROCKETT, AES Member, AND LOUIS D. FIELDER, AES Fellow

Dolby Laboratories, San Francisco, CA 94103, USA

      Two methods of coding and delivering ultra-high-quality audio are presented. Both methods are video frame synchronous and editable at common video frame rates (23.98, 24, 25, 29.97, and 30 frames per second) without the use of sample-rate converters. The first is an ultra-high-quality audio coder that exceeds 4.8 on the ITU-R five-point audio impairment scale at a bit rate of 256 kbit/s per channel and at up to three generations of encoding/decoding. The second is an enhanced method of video frame synchronous PCM packing. Specifically, the problem of transmitting 48-kHz audio in 29.97-Hz frames is examined.
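
      The difficulty with 48-kHz audio in 29.97-Hz frames is arithmetic: the number of samples per frame is not an integer. The sketch below works this out exactly, assuming that 29.97 stands for the NTSC rate 30000/1001 frames per second. The function names are hypothetical, and the per-frame split shown is only one way to distribute the samples, not the packing scheme of the paper.

```python
import math
from fractions import Fraction

SAMPLE_RATE = 48000
NTSC_FRAME_RATE = Fraction(30000, 1001)   # assumption: "29.97" means 30000/1001


def samples_per_frame(sample_rate=SAMPLE_RATE, frame_rate=NTSC_FRAME_RATE):
    """Exact (generally non-integer) number of audio samples per video frame."""
    return Fraction(sample_rate) / frame_rate


def frame_sample_counts(sample_rate=SAMPLE_RATE, frame_rate=NTSC_FRAME_RATE):
    """Integer sample counts for one repeating cycle of frames."""
    spf = samples_per_frame(sample_rate, frame_rate)
    cycle = spf.denominator   # frames needed before the pattern repeats
    # Each frame receives the samples whose index falls inside its time span.
    return [math.floor((k + 1) * spf) - math.floor(k * spf) for k in range(cycle)]


if __name__ == "__main__":
    print(samples_per_frame())     # 8008/5, i.e. 1601.6 samples per frame
    print(frame_sample_counts())   # [1601, 1602, 1601, 1602, 1602]
```

      The result is that sample and frame boundaries coincide only once every five frames (8008 samples), so any frame-synchronous PCM format has to carry a varying number of samples per frame or pad to a fixed size.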


Large-Signal Analysis of Triode
Vacuum-Tube Amplifiers

MUHAMMAD TAHER ABUELMA'ATTI

King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia

      A mathematical model for the input-output characteristic of a single-stage triode vacuum-tube amplifier is presented. The model, basically a cosine-series function, can easily yield closed-form series expressions for the amplitudes of the output components resulting from multisinusoidal input signals to the amplifier. The special case of an equal-amplitude two-tone input signal is considered in detail. The results show that, similar to transistor-based amplifiers, the vacuum triode amplifier generates both even- and odd-order harmonic and intermodulation components. The results also show that the amplitudes of these components depend strongly on the tube parameters, the biasing voltage, and the amplitudes of the input tones, and do not follow a general pattern.


Acoustical Measurements of
Traditional Theaters Integrated with
Chinese Gardens

WEIHWA CHIANG, YENKUN HSU, AND
JINJAW TSAI

Department of Architecture, National Taiwan University of Science and Technology, Taipei 106, China (Taiwan)


AND

JIQING WANG AND LINPING XUE

Institute of Acoustics, Tongji University, Shanghai 200092, China

      Acoustical measurements were taken in three traditional theaters integrated with Chinese gardens. Each theater consisted of a pavilion-like stage inside a courtyard surrounded by covered galleries, walls, or rock piles. The courtyard was generally rectangular in shape. Water, rocks, bridges, walls, and vegetation made the theater space rather irregular. All measurements were taken when the floors were unoccupied and without seats. Analysis showed an average strength G of 4.7 dB, an average early decay time EDT of 0.74 s, and an average early support STE of -9.3 dB.



JOURNAL OF THE AUDIO ENGINEERING SOCIETY
Volume 51 Number 12        2003 December

Why Are Commercials so Loud? —
Perception and Modeling of the Loudness of Amplitude-Compressed Speech

BRIAN C. J. MOORE, AES Member, BRIAN R. GLASBERG, AND MICHAEL A. STONE

Department of Experimental Psychology, University of Cambridge, Cambridge CB2 3EB, England

      The level of broadcast sound is usually limited to prevent overmodulation of the transmitted signal. To increase the loudness of broadcast sounds, especially commercials, fast-acting amplitude compression is often applied. This allows the root-mean-square (rms) level of the sounds to be increased without exceeding the maximum permissible peak level. In addition, even for a fixed rms level, compression may have an effect on loudness. To assess whether this was the case, we obtained loudness matches between uncompressed speech (short phrases) and speech that was subjected to varying degrees of four-band compression. All rms levels were calculated off line. We found that the compressed speech had a lower rms level than the uncompressed speech (by up to 3 dB) at the point of equal loudness, which implies that, at equal rms level, compressed speech sounds louder than uncompressed speech. The effect increased as the rms level was increased from 50 to 65 to 80 dB SPL. For the largest amount of compression used here, the compression would allow about a 58% increase in loudness for a fixed peak level (equivalent to a change in level of about 6 dB). With a slight modification, the model of loudness described by Glasberg and Moore [1] was able to account accurately for the results.


Smart Digital Loudspeaker Arrays

M. O. J. HAWKSFORD, AES Fellow

Centre for Audio Research and Engineering, University of Essex, Colchester, CO4 3SQ, UK

      A theory of smart loudspeaker arrays is described where a modified Fourier technique yields complex filter coefficients to determine the broad-band radiation characteristics of a uniform array of micro drive units. Beamwidth and direction are individually programmable over a 180° arc, where multiple agile and steerable beams carrying dissimilar signals can be accommodated. A novel method of stochastic filter design is also presented, which endows the directional array with diffuse radiation properties.


Localization of 3-D Sound Presented
through Headphone - Duration of
Sound Presentation and
Localization Accuracy

FANG CHEN

Department of Industrial Ergonomics, Linköping University, SE-581 83
Linköping, Sweden

      The relationship between the duration of a sound presentation and the accuracy of human localization is investigated. The three-dimensional sound is presented via headphones. A head-tracking system was integrated with the sound presentation. Generalized head-related transfer functions (HRTFs) are used in the experiment. Six different types of sounds with durations of 0.5, 2, 4, and 6 seconds were presented in random order at any azimuth in the horizontal plane. Thirty subjects participated in the study. A special location indication system called DINC (directional indication compass) was developed. With DINC the judged location in every test can be recorded accurately. The results showed that the localization accuracy is significantly related to the duration of the sound presentation. As long as the sound has a broad frequency bandwidth, the sound type has little effect on the localization accuracy. A presentation of at least 4-second duration is recommended. There is no significant difference between male and female subjects in the accuracy of detection.











Reconstruction of Mechanically
Recorded Sound by Image
Processing

VITALIY FADEYEV AND CARL HABER

Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA


      Audio information stored in the undulations of grooves in a medium such as a phonograph record may be reconstructed, with no or minimal contact, by measuring the groove shape using precision metrology methods and digital image processing. The effects of damage, wear, and contamination may be compensated, in many cases, through image processing and analysis methods. The speed and data-handling capacity of available computing hardware make this approach practical. Various aspects of this approach are discussed. A feasibility test is reported which used a general-purpose optical metrology system to study a 50-year-old 78-rpm phonograph record. Comparisons are presented with stylus playback of the record and with a digitally remastered version of the original magnetic recording. A more extensive implementation of this approach, with dedicated hardware and software, is considered.