CouldntEncodeError: Encoding failed #46

slopezpereyra · 2023-05-04T02:28:36Z

Description of bug / unexpected behavior

I took the following example from the VoiceOver Website:


class MyScene(VoiceoverScene):
    def construct(self):
        self.set_speech_service(RecorderService())
        # The circle must exist before it can be animated.
        circle = Circle()
        with self.voiceover(text="This circle is drawn as I speak.") as tracker:
            self.play(Create(circle), run_time=tracker.duration)

I then ran manim -pqh myfile.py MyScene --disable_caching. I was asked to choose which input device to record from, and I chose "default" (13). I recorded my voice as instructed, holding the 'r' key.

Upon finishing my recording, the following message appeared on the console:

Finished recording, saving to media/voiceovers/alaska-venus-montana-robin.mp3
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim/cli/render/commands.py:115 in render                                                  │
│                                                                                                  │
│   112 │   │   │   try:                                                                           │
│   113 │   │   │   │   with tempconfig({}):                                                       │
│   114 │   │   │   │   │   scene = SceneClass()                                                   │
│ ❱ 115 │   │   │   │   │   scene.render()                                                         │
│   116 │   │   │   except Exception:                                                              │
│   117 │   │   │   │   error_console.print_exception()                                            │
│   118 │   │   │   │   sys.exit(1)                                                                │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim/scene/scene.py:223 in render                                                          │
│                                                                                                  │
│    220 │   │   """                                                                               │
│    221 │   │   self.setup()                                                                      │
│    222 │   │   try:                                                                              │
│ ❱  223 │   │   │   self.construct()                                                              │
│    224 │   │   except EndSceneEarlyException:                                                    │
│    225 │   │   │   pass                                                                          │
│    226 │   │   except RerunSceneException as e:                                                  │
│                                                                                                  │
│ /home/santiago/repos/manim/intro.py:29 in construct                                              │
│                                                                                                  │
│    26 │                                                                                          │
│    27 │   def construct(self):                                                                   │
│    28 │   │   self.set_speech_service(RecorderService(format=1, channels=128, chunk=1024, tran   │
│ ❱  29 │   │   with self.voiceover(text="This circle is drawn as I speak.") as tracker:           │
│    30 │   │   │   self.play(Create(circle), run_time=tracker.duration)                           │
│    31 │   │   v = [r"\{a\}", r"\{b\}", r"\{a, b\}", r"\{a, b, c\}", r"\{a, b, c, f, g\}",        │
│    32 │   │   │    r"\{f\}", r"\{f, g\}"]                                                        │
│                                                                                                  │
│ /usr/lib/python3.11/contextlib.py:137 in __enter__                                               │
│                                                                                                  │
│   134 │   │   # they are only needed for recreation, which is not possible anymore               │
│   135 │   │   del self.args, self.kwds, self.func                                                │
│   136 │   │   try:                                                                               │
│ ❱ 137 │   │   │   return next(self.gen)                                                          │
│   138 │   │   except StopIteration:                                                              │
│   139 │   │   │   raise RuntimeError("generator didn't yield") from None                         │
│   140                                                                                            │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/voiceover_scene.py:180 in voiceover                                         │
│                                                                                                  │
│   177 │   │                                                                                      │
│   178 │   │   try:                                                                               │
│   179 │   │   │   if text is not None:                                                           │
│ ❱ 180 │   │   │   │   yield self.add_voiceover_text(text, **kwargs)                              │
│   181 │   │   │   elif ssml is not None:                                                         │
│   182 │   │   │   │   yield self.add_voiceover_ssml(ssml, **kwargs)                              │
│   183 │   │   finally:                                                                           │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/voiceover_scene.py:63 in add_voiceover_text                                 │
│                                                                                                  │
│    60 │   │   │   │   "You need to call init_voiceover() before adding a voiceover."             │
│    61 │   │   │   )                                                                              │
│    62 │   │                                                                                      │
│ ❱  63 │   │   dict_ = self.speech_service._wrap_generate_from_text(text, **kwargs)               │
│    64 │   │   tracker = VoiceoverTracker(self, dict_, self.speech_service.cache_dir)             │
│    65 │   │   self.add_sound(str(Path(self.speech_service.cache_dir) / dict_["final_audio"]))    │
│    66 │   │   self.current_tracker = tracker                                                     │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/services/base.py:85 in _wrap_generate_from_text                             │
│                                                                                                  │
│    82 │   │   # Replace newlines with lines, reduce multiple consecutive spaces to single        │
│    83 │   │   text = " ".join(text.split())                                                      │
│    84 │   │                                                                                      │
│ ❱  85 │   │   dict_ = self.generate_from_text(text, cache_dir=None, path=path, **kwargs)         │
│    86 │   │   original_audio = dict_["original_audio"]                                           │
│    87 │   │                                                                                      │
│    88 │   │   # Check whether word boundaries exist and if not run stt                           │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/services/recorder/__init__.py:101 in generate_from_text                     │
│                                                                                                  │
│    98 │   │                                                                                      │
│    99 │   │   self.recorder._trigger_set_device()                                                │
│   100 │   │   box = msg_box("Voiceover:\n\n" + input_text)                                       │
│ ❱ 101 │   │   self.recorder.record(str(Path(cache_dir) / audio_path), box)                       │
│   102 │   │                                                                                      │
│   103 │   │   json_dict = {                                                                      │
│   104 │   │   │   "input_text": text,                                                            │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/services/recorder/utility.py:225 in record                                  │
│                                                                                                  │
│   222 │   def record(self, path: str, message: str = None):                                      │
│   223 │   │   if message is not None:                                                            │
│   224 │   │   │   print(message)                                                                 │
│ ❱ 225 │   │   self._record(path)                                                                 │
│   226 │   │                                                                                      │
│   227 │   │   while True:                                                                        │
│   228 │   │   │   print(                                                                         │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/services/recorder/utility.py:110 in _record                                 │
│                                                                                                  │
│   107 │   │   self.event = self.task.enter(                                                      │
│   108 │   │   │   self.callback_delay, 1, self._record_task, ([path])                            │
│   109 │   │   )                                                                                  │
│ ❱ 110 │   │   self.task.run()                                                                    │
│   111 │   │                                                                                      │
│   112 │   │   return                                                                             │
│   113                                                                                            │
│                                                                                                  │
│ /usr/lib/python3.11/sched.py:151 in run                                                          │
│                                                                                                  │
│   148 │   │   │   │   │   return time - now                                                      │
│   149 │   │   │   │   delayfunc(time - now)                                                      │
│   150 │   │   │   else:                                                                          │
│ ❱ 151 │   │   │   │   action(*argument, **kwargs)                                                │
│   152 │   │   │   │   delayfunc(0)   # Let other threads run                                     │
│   153 │                                                                                          │
│   154 │   @property                                                                              │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/services/recorder/utility.py:208 in _record_task                            │
│                                                                                                  │
│   205 │   │   │   │   buffer_start=self.trim_buffer_start,                                       │
│   206 │   │   │   │   buffer_end=self.trim_buffer_end,                                           │
│   207 │   │   │   ).export(wav_path, format="wav")                                               │
│ ❱ 208 │   │   │   wav2mp3(wav_path)                                                              │
│   209 │   │   │                                                                                  │
│   210 │   │   │   for e in self.task._queue:                                                     │
│   211 │   │   │   │   self.task.cancel(e)                                                        │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/manim_voiceover/helper.py:31 in wav2mp3                                                     │
│                                                                                                  │
│    28 │   │   mp3_path = Path(wav_path).with_suffix(".mp3")                                      │
│    29 │                                                                                          │
│    30 │   # Convert to mp3                                                                       │
│ ❱  31 │   AudioSegment.from_wav(wav_path).export(mp3_path, format="mp3", bitrate=bitrate)        │
│    32 │                                                                                          │
│    33 │   if remove_wav:                                                                         │
│    34 │   │   # Remove the .wav file                                                             │
│                                                                                                  │
│ /home/santiago/.local/share/venvs/83241af131ae2bea5e060955fe3fb67f/venv/lib/python3.11/site-pack │
│ ages/pydub/audio_segment.py:970 in export                                                        │
│                                                                                                  │
│    967 │   │   log_subprocess_output(p_err)                                                      │
│    968 │   │                                                                                     │
│    969 │   │   if p.returncode != 0:                                                             │
│ ❱  970 │   │   │   raise CouldntEncodeError(                                                     │
│    971 │   │   │   │   "Encoding failed. ffmpeg/avlib returned error code: {0}\n\nCommand:{1}\n  │
│    972 │   │   │   │   │   p.returncode, conversion_command, p_err.decode(errors='ignore') ))    │
│    973                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CouldntEncodeError: Encoding failed. ffmpeg/avlib returned error code: 1

Command:['ffmpeg', '-y', '-f', 'wav', '-i', '/tmp/tmp8ska799i', '-b:a', '312k', '-f', 'mp3', '/tmp/tmpnp4_0fh8']

Output from ffmpeg/avlib:

ffmpeg version 5.1.2-3ubuntu1 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 12 (Ubuntu 12.2.0-14ubuntu2)
  configuration: --prefix=/usr --extra-version=3ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa
--enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang
--enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband
--enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp
--enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl
--enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
  WARNING: library configuration mismatch
  avfilter    configuration: --prefix=/usr --extra-version=3ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa
--enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang
--enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband
--enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp
--enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl
--enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared --enable-version3
--disable-doc --disable-programs --enable-libaribb24 --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libtesseract --enable-libvo_amrwbenc --enable-libsmbclient
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
Input #0, wav, from '/tmp/tmp8ska799i':
  Duration: 00:00:01.67, bitrate: 90317 kb/s
  Stream #0:0: Audio: pcm_s32le ([1][0][0][0] / 0x0001), 44100 Hz, 64 channels, s32, 90316 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s32le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
[auto_aresample_0 @ 0x559814942200] [SWR @ 0x559814942380] Rematrix is needed between 64 channels and stereo but there is not enough information to do it
[auto_aresample_0 @ 0x559814942200] Failed to configure output pad on auto_aresample_0
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!

I am running Ubuntu 23.04. All dependencies are installed and up to date.
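
The ffmpeg output above reports the input stream as pcm_s32le with 64 channels, and the rematrix error is about mapping those 64 channels to stereo. To help narrow this down, here is a minimal sketch that uses only Python's standard wave module to read the channel count of a recorded WAV file and to write a mono copy containing only the first channel. The helper names wav_channel_count and extract_first_channel are hypothetical and are not part of manim-voiceover or pydub. The sketch assumes the intermediate file is plain PCM WAV, and whether a mono copy would then encode to mp3 without error is untested.

```python
import wave


def wav_channel_count(path):
    """Return the number of channels declared in a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels()


def extract_first_channel(src_path, dst_path):
    """Write a mono copy of src_path that keeps only channel 0.

    Hypothetical helper for diagnosis; assumes uncompressed PCM data.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()  # bytes per sample
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Samples are interleaved: one frame holds one sample per channel.
    frame_size = n_channels * sample_width
    mono = b"".join(
        frames[i:i + sample_width] for i in range(0, len(frames), frame_size)
    )

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono)
```

Calling wav_channel_count on a WAV file produced by the recorder should show whether the "default" device really delivers 64 channels, as the ffmpeg log suggests.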

Expected behavior

After executing manim -pqh myfile.py MyScene --disable_caching and recording my voice with my (functioning) microphone, I expected the recording to be successfully embedded in the video and an .mp4 file containing the recording to be produced.

How to reproduce the issue

Code for reproducing the problem

from manim import *
from manim_voiceover import VoiceoverScene
from manim_voiceover.services.recorder import RecorderService

class MyScene(VoiceoverScene):
    def construct(self):
        self.set_speech_service(RecorderService())
        # The circle must exist before it can be animated.
        circle = Circle()
        with self.voiceover(text="This circle is drawn as I speak.") as tracker:
            self.play(Create(circle), run_time=tracker.duration)

System specifications

System Details

OS: Ubuntu 23.04
RAM: 16 GB
Python version 3.11.2
Installed modules (provide output from pip list):

Package                        Version
------------------------------ ----------
azure-cognitiveservices-speech 1.28.0
build                          0.10.0
certifi                        2022.12.7
charset-normalizer             3.1.0
click                          8.1.3
click-default-group            1.2.2
cloup                          0.13.1
cmake                          3.26.3
colour                         0.1.5
decorator                      5.1.1
docstring-to-markdown          0.12
evdev                          1.6.1
ffmpeg-python                  0.2.0
filelock                       3.12.0
fsspec                         2023.4.0
future                         0.18.3
glcontext                      2.3.7
greenlet                       2.0.2
gTTS                           2.3.2
huggingface-hub                0.14.1
humanhash3                     0.0.6
idna                           3.4
isosurfaces                    0.1.0
jedi                           0.17.2
Jinja2                         3.1.2
lit                            16.0.2
llvmlite                       0.40.0
manim                          0.17.3
manim-voiceover                0.3.0
ManimPango                     0.4.3
mapbox-earcut                  1.0.1
markdown-it-py                 2.2.0
MarkupSafe                     2.1.2
mdurl                          0.1.2
moderngl                       5.8.2
moderngl-window                2.4.3
more-itertools                 9.1.0
mpmath                         1.3.0
msgpack                        1.0.5
multipledispatch               0.6.0
mutagen                        1.46.0
networkx                       2.8.8
numba                          0.57.0
numpy                          1.24.3
nvidia-cublas-cu11             11.10.3.66
nvidia-cuda-cupti-cu11         11.7.101
nvidia-cuda-nvrtc-cu11         11.7.99
nvidia-cuda-runtime-cu11       11.7.99
nvidia-cudnn-cu11              8.5.0.96
nvidia-cufft-cu11              10.9.0.58
nvidia-curand-cu11             10.2.10.91
nvidia-cusolver-cu11           11.4.0.1
nvidia-cusparse-cu11           11.7.4.91
nvidia-nccl-cu11               2.14.3
nvidia-nvtx-cu11               11.7.91
openai-whisper                 20230314
packaging                      23.1
parso                          0.7.1
Pillow                         9.5.0
pip                            23.1.2
pip-tools                      6.13.0
playsound                      1.3.0
pluggy                         1.0.0
PyAudio                        0.2.13
pycairo                        1.23.0
pydub                          0.25.1
pyglet                         2.0.5
Pygments                       2.15.1
pynput                         1.7.6
pynvim                         0.4.3
pyproject_hooks                1.0.0
pyrr                           0.10.3
python-dotenv                  0.21.1
python-jsonrpc-server          0.4.0
python-language-server         0.36.2
python-lsp-jsonrpc             1.0.0
python-lsp-server              1.7.2
python-xlib                    0.33
PyYAML                         6.0
regex                          2023.5.5
requests                       2.29.0
rich                           13.3.5
scipy                          1.10.1
screeninfo                     0.8.1
setuptools                     66.1.1
six                            1.16.0
skia-pathops                   0.7.4
sox                            1.4.1
srt                            3.5.3
stable-ts                      2.5.3
svgelements                    1.9.3
sympy                          1.11.1
tiktoken                       0.3.1
tokenizers                     0.13.3
torch                          2.0.0
torchaudio                     2.0.1
tqdm                           4.65.0
transformers                   4.28.1
triton                         2.0.0
typing_extensions              4.5.0
ujson                          5.7.0
urllib3                        1.26.15
watchdog                       2.3.1
wheel                          0.40.0

FFMPEG

Output of ffmpeg -version:

ffmpeg version 5.1.2-3ubuntu1 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 12 (Ubuntu 12.2.0-14ubuntu2)
configuration: --prefix=/usr --extra-version=3ubuntu1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
libavutil      57. 28.100 / 57. 28.100
libavcodec     59. 37.100 / 59. 37.100
libavformat    59. 27.100 / 59. 27.100
libavdevice    59.  7.100 / 59.  7.100
libavfilter     8. 44.100 /  8. 44.100
libswscale      6.  7.100 /  6.  7.100
libswresample   4.  7.100 /  4.  7.100
libpostproc    56.  6.100 / 56.  6.100

Additional comments

Choosing HDA Intel PCH: ALC897 Analog or HDA Intel PCH: ALC897 Alt Analog as the input device, instead of default, did not produce this error. However, the recordings were of terrible quality. This is not a microphone issue: I tested the same microphone with an online recorder and the quality was good.
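
Since the outcome depends on which input device is selected, it may help to compare how many input channels each device advertises. Below is a minimal sketch, assuming an object that behaves like a pyaudio.PyAudio instance (its get_device_count and get_device_info_by_index methods, and the "name" and "maxInputChannels" keys of the returned dictionary). The function names and the threshold of 2 channels are hypothetical choices for illustration, not something manim-voiceover does.

```python
def list_input_devices(pa):
    """Return (index, name, max_input_channels) for devices that can record.

    `pa` is expected to behave like a pyaudio.PyAudio instance.
    """
    devices = []
    for index in range(pa.get_device_count()):
        info = pa.get_device_info_by_index(index)
        max_in = int(info.get("maxInputChannels", 0))
        if max_in > 0:  # skip output-only devices
            devices.append((index, info.get("name", ""), max_in))
    return devices


def suspicious_devices(devices, limit=2):
    """Return devices advertising more input channels than `limit`.

    The default limit of 2 (stereo) is an assumed threshold.
    """
    return [device for device in devices if device[2] > limit]
```

With PyAudio installed, passing pyaudio.PyAudio() as pa should list the real devices. A device that advertises far more channels than a microphone plausibly has would be a candidate explanation for the 64-channel WAV seen in the ffmpeg log.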


osolmaz · 2023-05-20T09:06:17Z

The plugin saves the recordings under media/voiceovers/. Can you play the ones you recorded with the "default" device? They will probably not play properly in a media player, but I am just trying to narrow the problem down.

slopezpereyra added the bug Something isn't working label May 4, 2023

slopezpereyra assigned osolmaz May 4, 2023
