To complete this task, we'll use OpenAI's Whisper, an open-source speech-recognition model that runs locally and is distributed as a Python package (openai-whisper). We'll build a Docker image that contains Python, Whisper, and ffmpeg, plus a script that performs the transcription.
1. Dockerfile for the custom image
Create a Dockerfile for building the custom image:
# Use a Python base image
FROM python:3.9-slim
# Install required dependencies
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg && \
    rm -rf /var/lib/apt/lists/* && \
    pip install --no-cache-dir torch torchaudio openai-whisper
# Create the working directory
WORKDIR /var/script
# Copy the script into the Docker image
COPY scriptYouBuild.py /var/script/
# Set the entrypoint to run the script
ENTRYPOINT ["python", "scriptYouBuild.py"]
2. Transcription Script (scriptYouBuild.py)
This Python script will handle the transcription of the audio file using Whisper:
import os
import sys

import whisper


def transcribe(input_file, output_file=None):
    # Load the Whisper model
    model = whisper.load_model("base")

    # Transcribe the audio file
    result = model.transcribe(input_file)

    # Output transcription to STDOUT
    print(result["text"])

    # Write transcription to the output file if specified
    if output_file:
        with open(output_file, "w", encoding="utf-8") as f:
            f.write(result["text"])


def main():
    # Errors go to STDERR so that STDOUT carries only the transcript
    if len(sys.argv) < 2:
        print("Usage: python scriptYouBuild.py <input_file> [output_file]", file=sys.stderr)
        sys.exit(1)

    input_file = sys.argv[1]
    output_file = sys.argv[2] if len(sys.argv) > 2 else None

    if not os.path.exists(input_file):
        print(f"Input file '{input_file}' does not exist.", file=sys.stderr)
        sys.exit(1)

    transcribe(input_file, output_file)


if __name__ == "__main__":
    main()
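If timestamps are useful, the dictionary returned by model.transcribe also carries a "segments" list, where each entry has "start" and "end" offsets in seconds and a "text" field. The following is a minimal sketch that assumes that structure; the helper names format_timestamp and format_segments are hypothetical and not part of Whisper, and the sample data is hand-written rather than real model output.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segments(segments):
    """Turn Whisper-style segment dicts into one timestamped line each."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} --> {end}] {seg['text'].strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-written sample standing in for result["segments"]
    sample = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 2.5, "end": 65.25, "text": " This is a test."},
    ]
    print(format_segments(sample))
```

In scriptYouBuild.py, one option would be to write format_segments(result["segments"]) to the output file in place of result["text"].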
3. Build the Docker Image
Save both files in the same directory and build the Docker image:
docker build -t custom-whisper-image:latest .
4. Running the Docker Container
Create the output file on the host first (if it does not exist, Docker creates a directory with that name), then run the container:
touch myfile.txt
docker run --rm -i -v "$(pwd)/myfile.mp4:/var/script/input.mp4:ro" -v "$(pwd)/myfile.txt:/var/script/output.txt" custom-whisper-image:latest /var/script/input.mp4 /var/script/output.txt
- Explanation:
  - -v "$(pwd)/myfile.mp4:/var/script/input.mp4:ro": Mounts the input file as read-only (the :ro suffix).
  - -v "$(pwd)/myfile.txt:/var/script/output.txt": Mounts the output file as read-write.
  - custom-whisper-image:latest: The Docker image we built.
  - /var/script/input.mp4 /var/script/output.txt: Passes the input and output paths as arguments to the script.
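The same invocation can be assembled from a small wrapper script, which also makes it easy to create the output file before Docker sees it. This is only a sketch: build_docker_command and run_transcription are hypothetical names, the image tag and container paths are the ones used in this answer, and the flags are limited to --rm, -v bind mounts, and the :ro suffix for a read-only input mount.

```python
import subprocess
from pathlib import Path


def build_docker_command(input_path, output_path, image="custom-whisper-image:latest"):
    """Return the argv list for `docker run` with both files bind-mounted."""
    src = Path(input_path).resolve()
    dst = Path(output_path).resolve()
    container_input = f"/var/script/input{src.suffix}"
    return [
        "docker", "run", "--rm",
        "-v", f"{src}:{container_input}:ro",    # input file, read-only
        "-v", f"{dst}:/var/script/output.txt",  # output file, read-write
        image,
        container_input,
        "/var/script/output.txt",
    ]


def run_transcription(input_path, output_path):
    """Create the output file if needed, then run the container."""
    # Docker creates a directory for a missing bind-mount source,
    # so make sure the output file exists first.
    Path(output_path).touch(exist_ok=True)
    subprocess.run(build_docker_command(input_path, output_path), check=True)


if __name__ == "__main__":
    # Print the command instead of running it
    print(" ".join(build_docker_command("myfile.mp4", "myfile.txt")))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.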
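Notes:
- The transcription is printed to STDOUT and written to the specified output file (myfile.txt).
- Adjust permissions if needed. On Rocky Linux, where SELinux is typically enforcing, the bind mounts may need the :z or :Z suffix (for example /var/script/output.txt:Z) before the container can read or write them.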
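This setup gives you a self-contained container that transcribes a media file with Whisper, prints the text, and optionally saves it to a file. The "base" model weights are downloaded the first time whisper.load_model runs, so the container needs network access unless the weights are already cached in the image.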