Special problems for deep-learning-based text-to-speech synthesis