Journal of East China Normal University(Natural Science) ›› 2021, Vol. 2021 ›› Issue (1): 152-164.doi: 10.3969/j.issn.1000-5641.202022009

• Physics and Electronics •

Voice singing by function fitting

Yibu WANG, Jianwen LI*

  1. School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710021, China
  • Received: 2020-06-10 Online: 2021-01-25 Published: 2021-01-28
  • Contact: Jianwen LI


Intonation is the tone of speech, formed by variations in pitch and emphasis; it is one of the main ways in which human speech conveys emotion. By adjusting intonation parameters to change the duration and pitch of selected words in an utterance, controlled intonation can mimic the effect of singing, which in turn helps address the lack of research on singing voice synthesis. The cepstrum method is used to extract the pitch frequency, linear predictive coding (LPC) is used to estimate the formants, and a high-order polynomial is fitted to the pitch contour of the voice; the fitted function is then adjusted in real time to produce the tones required for singing. Starting from two basic speech parameters, pitch frequency and formants, and drawing on the mathematical nature of pronunciation, this paper uses an intuitive mathematical method to synthesize a singing effect; with this method, the original voice and the synthetic voice reach an overall recognition rate of 87.6%. The synthesis results show that adjusting the parameters of speech synthesis gives greater control over voice singing.
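
The pitch-related steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`cepstral_pitch`, `fit_pitch_contour`, `shift_contour`), the 60-400 Hz search range, the polynomial order of 5, and the semitone-based adjustment are all assumptions chosen for illustration, and LPC formant estimation is not covered.

```python
import numpy as np


def cepstral_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch (Hz) of one voiced frame from its real cepstrum.

    fmin and fmax bound the search range; both are assumed values.
    """
    windowed = frame * np.hamming(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    # A small offset avoids log(0) on empty frequency bins.
    cepstrum = np.fft.irfft(np.log(spectrum + 1e-12))
    # Only quefrencies that correspond to plausible pitch values are searched.
    q_min = int(fs / fmax)
    q_max = int(fs / fmin)
    peak = q_min + np.argmax(cepstrum[q_min:q_max + 1])
    return fs / peak


def fit_pitch_contour(times, f0, order=5):
    """Least-squares polynomial fit to per-frame pitch values.

    Returns coefficients, highest power first, usable with np.polyval.
    """
    return np.polyfit(times, f0, order)


def shift_contour(coeffs, semitones):
    """Hypothetical adjustment: transpose the whole fitted contour.

    Scaling every coefficient by the same factor scales the polynomial,
    so the contour keeps its shape but moves by the given interval.
    """
    return np.asarray(coeffs) * 2.0 ** (semitones / 12.0)
```

In use, `cepstral_pitch` would be called on each voiced frame, the resulting values passed to `fit_pitch_contour` together with the frame times, and the adjusted contour evaluated with `np.polyval` to obtain the target pitch at any instant. Transposition is only one possible adjustment; the abstract does not specify how the fitted function is modified.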

Key words: intonation, tone, voice singing, cepstrum, pitch frequency, LPC (linear predictive coding) method, formant, fitting function
