Post

Automating Video Processing with a Bash Script

Image

In this blog post, I will walk you through a Bash script designed to automate the processing of video files in a specified directory. This script performs tasks such as calculating bitrates, adjusting audio channels, transcoding high-bitrate videos, and logging processed files. Additionally, it maintains a debug log to track events and errors, ensuring transparency and ease of troubleshooting.

Overview

The script performs the following key functions:

  • Calculates the bitrate of video files.
  • Determines the number of audio channels.
  • Processes videos based on bitrate and audio channels:
    • Transcodes videos with a bitrate higher than 3800 kbps.
    • Downmixes audio channels if they exceed two channels.
  • Logs processed and failed files in processedFiles.txt and failedFiles.txt, respectively.
  • Stores logs in the directory specified by the user.
  • Uses relative paths in log files relative to their location.
  • Creates a debug log (debug.log) to track events and errors.

The Script

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
#!/bin/bash

# Function to calculate bitrate
calculate_bitrate() {
    local file_path="$1"
    local file_size_bytes
    file_size_bytes=$(stat -c%s "$file_path")
    echo "Filesize = $file_size_bytes" >> "$debug_log"
    local video_duration
    video_duration=$(ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$file_path" 2>> "$debug_log")
    local bitrate_kbps
    bitrate_kbps=$(echo "scale=2; ($file_size_bytes * 8) / $video_duration / 1024" | bc)
    echo "$bitrate_kbps"
}

# Function to get the number of audio channels
get_audio_channels() {
    local file_path="$1"
    ffprobe -v error -select_streams a:0 -show_entries stream=channels \
        -of default=noprint_wrappers=1:nokey=1 "$file_path" 2>> "$debug_log"
}

# Function to get the duration of the video
get_duration() {
    local file_path="$1"
    ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$file_path" 2>> "$debug_log"
}

# Function to get the ffmpeg format based on file extension
get_ffmpeg_format() {
    local file_path="$1"
    local extension="${file_path##*.}"
    case "$extension" in
        mkv) echo "matroska" ;;
        mp4) echo "mp4" ;;
        *) echo "$extension" ;;
    esac
}

# Main script logic

top_level_directory="${1:-.}"

processed_files_log="$top_level_directory/processedFiles.txt"
failed_files_log="$top_level_directory/failedFiles.txt"
debug_log="$top_level_directory/debug.log"

# Ensure the logs exist
if [[ ! -f "$processed_files_log" ]]; then
    touch "$processed_files_log"
fi

if [[ ! -f "$failed_files_log" ]]; then
    touch "$failed_files_log"
fi

if [[ ! -f "$debug_log" ]]; then
    touch "$debug_log"
fi

processed_files=$(cat "$processed_files_log")

find "$top_level_directory" -type f \( -iname "*.mp4" -o -iname "*.mkv" -o -iname "*.avi" \) | while read -r file; do
    relative_file="${file#$top_level_directory/}"

    if grep -Fxq "$relative_file" <<< "$processed_files"; then
        echo "Skipping already processed file $file." | tee -a "$debug_log"
        continue
    fi

    echo "Processing file $file" >> "$debug_log"

    file_size_mb=$(echo "scale=2; $(stat -c%s "$file") / 1048576" | bc)
    echo "File size: $file_size_mb MB" >> "$debug_log"
    if (( $(echo "$file_size_mb < 200" | bc -l) )); then
        echo "Skipping $file due to its size ($file_size_mb MB) being under 200 MB." | tee -a "$debug_log"
        echo "$relative_file" >> "$processed_files_log"
        continue
    fi

    bitrate=$(calculate_bitrate "$file")
    echo "Bitrate: $bitrate kbps" >> "$debug_log"
    audio_channels=$(get_audio_channels "$file")
    echo "Audio channels: $audio_channels" >> "$debug_log"
    original_duration=$(get_duration "$file")
    echo "Original duration: $original_duration" >> "$debug_log"
    ffmpeg_format=$(get_ffmpeg_format "$file")
    echo "FFmpeg format: $ffmpeg_format" >> "$debug_log"

    if (( $(echo "$bitrate > 3800" | bc -l) )); then
        if [[ -f "${file}.tmp" ]]; then
            rm -f "${file}.tmp"
            echo "The file '${file}.tmp' has been deleted." >> "$debug_log"
        else
            echo "The file '${file}.tmp' does not exist." >> "$debug_log"
        fi
        echo "Running ffmpeg to transcode video for high bitrate." >> "$debug_log"
        ffmpeg -hwaccel auto -i "$file" -nostdin -b:v 2M -minrate 1M -maxrate 10M \
            -c:v libx265 -pix_fmt yuv420p10le -x265-params rc-lookahead=120 \
            -profile:v main10 -c:a aac -b:a 128k -ac 2 -af loudnorm -y \
            -f "$ffmpeg_format" "${file}.tmp" 2>> "$debug_log"
        new_duration=$(get_duration "${file}.tmp")
        echo "New duration: $new_duration" >> "$debug_log"
        duration_diff=$(echo "scale=4; ($new_duration - $original_duration) / $original_duration" | bc)
        echo "Duration difference ratio: $duration_diff" >> "$debug_log"

        if (( $(echo "$duration_diff < 0.02 && $duration_diff > -0.02" | bc -l) )); then
            mv -f "${file}.tmp" "$file"
            echo "Duration match for $file" | tee -a "$debug_log"
            echo "$relative_file" >> "$processed_files_log"
        else
            echo "Duration mismatch for $file" | tee -a "$debug_log"
            rm -f "${file}.tmp"
            echo "$relative_file" >> "$failed_files_log"
        fi
    elif (( $(echo "$bitrate <= 3800" | bc -l) )); then
        if (( audio_channels > 2 )); then
            if [[ -f "${file}.tmp" ]]; then
                rm -f "${file}.tmp"
                echo "The file '${file}.tmp' has been deleted." >> "$debug_log"
            else
                echo "The file '${file}.tmp' does not exist." >> "$debug_log"
            fi
            echo "Running ffmpeg to downmix audio channels." >> "$debug_log"
            ffmpeg -i "$file" -nostdin -c:v copy -c:a aac -ac 2 -filter:a loudnorm \
                -f "$ffmpeg_format" "${file}.tmp" 2>> "$debug_log"
            new_duration=$(get_duration "${file}.tmp")
            echo "New duration: $new_duration" >> "$debug_log"
            duration_diff=$(echo "scale=4; ($new_duration - $original_duration) / $original_duration" | bc)
            echo "Duration difference ratio: $duration_diff" >> "$debug_log"

            if (( $(echo "$duration_diff < 0.02 && $duration_diff > -0.02" | bc -l) )); then
                mv -f "${file}.tmp" "$file"
                echo "Duration match for $file" | tee -a "$debug_log"
                echo "$relative_file" >> "$processed_files_log"
            else
                echo "Duration mismatch for $file" | tee -a "$debug_log"
                rm -f "${file}.tmp"
                echo "$relative_file" >> "$failed_files_log"
            fi
        else
            echo "$relative_file" >> "$processed_files_log"
            echo "No processing needed for $file" >> "$debug_log"
        fi
    fi
done

echo "Processing complete." | tee -a "$debug_log"

Prerequisites

Before running the script, ensure you have the following installed on your system:

  • Bash shell
  • FFmpeg: A complete, cross-platform solution to record, convert and stream audio and video.
  • FFprobe: A part of the FFmpeg package used to get information from multimedia streams.
  • BC (Basic Calculator): A language that supports arbitrary precision numbers with interactive execution.

How the Script Works

1. Calculating Bitrate

The calculate_bitrate function computes the bitrate of a video file using its file size and duration.

1
2
3
4
5
6
7
8
9
10
11
12
13
calculate_bitrate() {
    local file_path="$1"
    # Get file size in bytes
    local file_size_bytes=$(stat -c%s "$file_path")
    # Get video duration using ffprobe
    local video_duration=$(ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$file_path" 2>> "$debug_log")
    # Calculate bitrate in kbps
    local bitrate_kbps=$(echo "scale=2; ($file_size_bytes * 8) / $video_duration / 1024" | bc)
    echo "$bitrate_kbps"
}

2. Getting Audio Channels and Duration

The script uses get_audio_channels and get_duration functions to retrieve the number of audio channels and the duration of the video.

1
2
3
4
5
6
7
8
9
10
11
get_audio_channels() {
    local file_path="$1"
    ffprobe -v error -select_streams a:0 -show_entries stream=channels \
        -of default=noprint_wrappers=1:nokey=1 "$file_path" 2>> "$debug_log"
}

get_duration() {
    local file_path="$1"
    ffprobe -v error -show_entries format=duration \
        -of default=noprint_wr

3. Determining FFmpeg Format

The get_ffmpeg_format function determines the correct FFmpeg format based on the file extension.

1
2
3
4
5
6
7
8
9
10
11
get_ffmpeg_format() {
    local file_path="$1"
    local extension="${file_path##*.}"
    case "$extension" in
        mkv) echo "matroska" ;;
        mp4) echo "mp4" ;;
        *) echo "$extension" ;;
    esac
}

4. Main Processing Logic

The script processes each video file in the specified directory:

  • Skips files under 200 MB.
  • Checks if the file has already been processed by consulting processedFiles.txt.
  • Calculates bitrate and audio channels.
  • Transcodes videos if the bitrate is higher than 3800 kbps.
  • Downmixes audio channels if they exceed two channels.
  • Logs the processing details to debug.log.

Processing High-Bitrate Videos

If the video’s bitrate exceeds 3800 kbps, it is transcoded to reduce the bitrate.

1
2
3
4
5
6
ffmpeg -hwaccel auto -i "$file" -nostdin -b:v 2M -minrate 1M -maxrate 10M \
    -c:v libx265 -pix_fmt yuv420p10le -x265-params rc-lookahead=120 \
    -profile:v main10 -c:a aac -b:a 128k -ac 2 -af loudnorm -y \
    -f "$ffmpeg_format" "${file}.tmp" 2>> "$debug_log"

Downmixing Audio Channels

If the audio channels exceed two, the script downmixes them to stereo.

1
2
3
4
ffmpeg -i "$file" -nostdin -c:v copy -c:a aac -ac 2 -filter:a loudnorm \
    -f "$ffmpeg_format" "${file}.tmp" 2>> "$debug_log"

Duration Verification

After processing, the script verifies that the duration of the processed file matches the original within a 2% margin.

1
2
3
4
5
6
7
8
duration_diff=$(echo "scale=4; ($new_duration - $original_duration) / $original_duration" | bc)
if (( $(echo "$duration_diff < 0.02 && $duration_diff > -0.02" | bc -l) )); then
    # Success: Replace the original file
else
    # Failure: Log the file as failed
fi

Running the Script

To run the script, use the following command:

1
bash script_name.sh /path/to/video/directory

If no directory is specified, the script defaults to the current directory.

Logs and Debugging

  • Processed Files: processedFiles.txt contains a list of files that have been successfully processed.
  • Failed Files: failedFiles.txt lists files that failed during processing.
  • Debug Log: debug.log records detailed information about each processing step, which is helpful for troubleshooting.

All log files are stored in the specified top-level directory.

Conclusion

This Bash script automates the tedious task of processing video files, ensuring they meet specific criteria for bitrate and audio channels. By logging each step, it provides transparency and aids in debugging any issues that may arise during processing.

This post is licensed under CC BY 4.0 by the author.