Checking that a file is actually a video before processing it prevents errors later in the pipeline. It also helps improve the security of the application.
Here are three ways to check if a file is a video in Python:
- Using mimetypes (built-in library)
- Using python-magic (third-party library)
- Using the ffprobe command
We will use a video file named “new.mp4” to test our code.
Method 1: Using mimetypes
The fastest and easiest way to check whether an input file is a video is to look up its MIME type. The advantage of this approach is that we don’t need to list each video format ourselves: the mimetypes module maps file extensions to their registered MIME types, so every video extension it knows about resolves to a type that starts with “video/”.
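To make the lookup concrete, here is a minimal sketch of what the module returns for a few file names. The helper name major_type and the sample file names are made up for illustration; only mimetypes.guess_type comes from the standard library.

```python
import mimetypes


def major_type(filename):
    """Return the part of the guessed MIME type before the slash, or None."""
    # guess_type only inspects the name; the file does not need to exist
    mime, _encoding = mimetypes.guess_type(filename)
    return mime.split("/")[0] if mime else None


# Print the guessed MIME type and its major type for a few sample names
for name in ("new.mp4", "new.mov", "new.jpg", "notes.txt"):
    print(name, "->", mimetypes.guess_type(name)[0], "->", major_type(name))
```

Any name whose major type is “video” passes the check, which is why no hand-written list of formats is needed.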
Decision Tree Diagram
As the diagram above shows, we first check whether the input file exists. If it does, we get its MIME type and check whether it starts with “video/”. If it does, the function returns True, which means the input file is a video.
Code example
```python
import mimetypes
import os


def is_video_file(filename):
    try:
        # Check if the file exists
        if not os.path.exists(filename):
            raise FileNotFoundError(f"File '{filename}' not found.")

        # Guess the MIME type based on the file extension
        mimetype, _ = mimetypes.guess_type(filename)

        # Return True if the file is a video, otherwise False
        if mimetype and mimetype.startswith('video/'):
            return True
        else:
            return False
    except FileNotFoundError as e:
        print(f"Error: {e}")
        return False
    except Exception as e:
        print(f"An error occurred: {e}")
        return False


# Usage
filename = 'new.mp4'
if is_video_file(filename):
    print(f"{filename} is a video file.")
else:
    print(f"{filename} is not a video file.")
```
Output
new.mp4 is a video file.
If we check an image file instead, say “new.jpg”, we get the output below:
new.jpg is not a video file.
If the input file uses a rare video format whose extension is not registered, the module might not recognize it. Furthermore, because the guess is based only on the file name, a file with the wrong extension will get the wrong answer.
On the other hand, the lookup is fast and uses very little memory, because the file’s contents are never read.
The time complexity is O(1) with respect to the file size.
The space complexity is O(1).
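Both limitations can be reproduced with a short sketch. The extension “.examplevid” and the type “video/x-example” are hypothetical placeholders, and the helper names are made up for illustration; mimetypes.add_type is the standard-library call for registering an extra mapping.

```python
import mimetypes
import os
import tempfile


def guessed_type(path):
    """Return the MIME type guessed from the file name alone, or None."""
    mime, _encoding = mimetypes.guess_type(path)
    return mime


def register_video_extension(extension, mime_type):
    """Teach mimetypes about an extension it does not know yet."""
    mimetypes.add_type(mime_type, extension)


# Limitation 1: a plain text file with a video extension is still
# reported as a video, because only the name is inspected.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "fake.mp4")
    with open(fake, "w") as fh:
        fh.write("this is plain text, not a video")
    print(guessed_type(fake))

# Limitation 2: an unregistered extension is not recognized
# until it is added by hand (hypothetical extension and type).
print(guessed_type("clip.examplevid"))
register_video_extension(".examplevid", "video/x-example")
print(guessed_type("clip.examplevid"))
```

Registering extra types helps with rare formats, but it does nothing about mislabeled files; for those, the file’s contents have to be inspected.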
Method 2: Using python-magic library
The python-magic library determines the MIME type from the contents of the input file rather than from its name. If the MIME type starts with “video/”, the file is a video; otherwise it is not.
Decision Tree Diagram
The python-magic library depends on the libmagic library, which you need to install first in a way that suits your operating system. Since I am using macOS, I will install it using the command below:
brew install libmagic
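If you are using Windows or Linux, install libmagic with your platform’s usual tooling. The python-magic README lists, for example, the “libmagic1” package for Debian/Ubuntu and the “python-magic-bin” package for Windows.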
Now, you can install the “python-magic” library using the command below:
pip install python-magic
Code example
```python
import os

import magic


def is_video_file(filename):
    try:
        # Check if file exists
        if not os.path.exists(filename):
            raise FileNotFoundError(f"File '{filename}' not found.")

        # Get the MIME type of the file from its contents
        mime = magic.from_file(filename, mime=True)

        # Check if the MIME type starts with 'video/'
        # (bool() ensures we always return True or False)
        return bool(mime and mime.startswith('video/'))
    except FileNotFoundError as e:
        print(f"Error: {e}")
        return False
    except PermissionError as e:
        print(f"Permission Error: {e}")
        return False
    except Exception as e:
        print(f"An error occurred: {e}")
        return False


# Usage
filename = 'new.mp4'
if is_video_file(filename):
    print(f"{filename} is a video file.")
else:
    print(f"{filename} is not a video file.")
```
Output
new.mp4 is a video file.
The python-magic library is highly accurate in determining whether a file is a video and recognizes a wide array of video formats. However, as mentioned earlier, it is a third-party library that depends on libmagic, which can make the setup more complex on some operating systems.
I highly recommend this approach if you are looking for accuracy. It is ideal for server-side file validation. If you are working with files from untrusted sources, it identifies the type from what the file actually contains rather than from what its name claims.
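To give a feel for what content-based detection means, here is a deliberately simplified sketch that checks a few well-known container signatures by hand. It is not how python-magic is implemented and is no substitute for it: the function name looks_like_video and the brand list are illustrative assumptions, and libmagic covers far more formats.

```python
# Illustrative, incomplete set of major brands used by MP4/MOV video files
VIDEO_BRANDS = {b"isom", b"iso2", b"mp41", b"mp42", b"avc1", b"qt  ", b"M4V "}


def looks_like_video(path):
    """Rough content-based check using a few container signatures."""
    with open(path, "rb") as fh:
        header = fh.read(12)

    # MP4 / MOV: bytes 4-7 are "ftyp", bytes 8-11 hold the major brand
    if header[4:8] == b"ftyp" and header[8:12] in VIDEO_BRANDS:
        return True

    # AVI: a RIFF container whose form type is "AVI "
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return True

    # Matroska / WebM: EBML magic number (audio-only files share it,
    # so this check alone can produce false positives)
    if header[:4] == b"\x1a\x45\xdf\xa3":
        return True

    return False
```

Because the decision depends on the first bytes of the file, renaming a text file to “fake.mp4” does not fool this kind of check, which is the property that makes the python-magic approach suitable for untrusted input.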
Method 3: Using the FFprobe command
If you are willing to rely on an external tool instead of a pure Python library, and accuracy matters most, I recommend the ffprobe approach. ffprobe ships with FFmpeg, which you can download from this URL: https://www.ffmpeg.org/
The flow is similar to the previous methods: first, we check whether the file exists. If it does, we run the ffprobe command to look for a video stream, and if the output is “video”, that means it’s a video file.
Code example
```python
import os
import subprocess


def is_video_file(filename):
    try:
        # Check if the file exists
        if not os.path.exists(filename):
            raise FileNotFoundError(f"File '{filename}' not found.")

        # Run ffprobe to get the codec type of the first video stream
        result = subprocess.run(
            ['ffprobe', '-v', 'error',
             '-select_streams', 'v:0',
             '-show_entries', 'stream=codec_type',
             '-of', 'default=noprint_wrappers=1:nokey=1',
             filename],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        output = result.stdout.decode().strip()

        # Check if the output is 'video'
        return output == 'video'
    except FileNotFoundError as e:
        # Raised for a missing input file or a missing ffprobe executable
        print(f"Error: {e}")
        return False
    except subprocess.CalledProcessError as e:
        print(f"FFmpeg error: {e}")
        return False
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return False


# Usage
filename = 'new.mov'
if is_video_file(filename):
    print(f"{filename} is a video file according to FFmpeg using ffprobe")
else:
    print(f"{filename} is not recognized as a video file by FFmpeg.")
```
Output
new.mov is a video file according to FFmpeg using ffprobe
The ffprobe method is slower than the other two because it spawns a subprocess for every file. It also requires FFmpeg to be installed on your machine.
The time complexity is O(n) in the worst case, where n is the amount of data ffprobe has to read to identify the file’s streams.
The space complexity is O(1).
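Since this method depends on an external program, it can help to check that ffprobe is available and to keep the output parsing separate from the subprocess call. The sketch below is one possible variant, not the method used above: the function names and the 10-second timeout are assumptions, and it asks ffprobe for JSON output and checks every stream rather than only the first video stream.

```python
import json
import shutil
import subprocess


def has_video_stream(ffprobe_json):
    """Return True if ffprobe's JSON output lists at least one video stream."""
    data = json.loads(ffprobe_json)
    return any(s.get("codec_type") == "video" for s in data.get("streams", []))


def probe_is_video(filename, timeout=10):
    """Return True/False, or None when ffprobe is not installed."""
    if shutil.which("ffprobe") is None:
        return None  # cannot decide without the tool
    try:
        result = subprocess.run(
            ["ffprobe", "-v", "error",
             "-show_entries", "stream=codec_type",
             "-of", "json", filename],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    if result.returncode != 0:
        return False  # ffprobe could not read the file
    try:
        return has_video_stream(result.stdout)
    except json.JSONDecodeError:
        return False
```

Calling probe_is_video("new.mov") returns None when ffprobe is missing, which lets the caller tell “not a video” apart from “could not check”.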
Time measurement for execution
I measured the execution time of each of the three approaches above and found that “mimetypes” is the fastest method and ffprobe is the slowest. The findings are displayed as a bar chart.
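A measurement like this can be reproduced with the standard timeit module. The sketch below is an assumption about how such a comparison could be set up, not the exact benchmark behind the chart; the helper names and the run count are illustrative, and only the mimetypes checker is timed here so that the snippet runs without extra dependencies.

```python
import mimetypes
import timeit


def check_with_mimetypes(filename):
    """Extension-based check, same idea as Method 1 without the file-exists test."""
    mime, _encoding = mimetypes.guess_type(filename)
    return bool(mime and mime.startswith("video/"))


def average_seconds(check, filename, runs=1000):
    """Average wall-clock time of one call to check(filename)."""
    total = timeit.timeit(lambda: check(filename), number=runs)
    return total / runs


seconds = average_seconds(check_with_mimetypes, "new.mp4")
print(f"mimetypes: {seconds:.8f} s per call")
```

Passing the python-magic and ffprobe versions of is_video_file to average_seconds, with a smaller number of runs for ffprobe, gives comparable figures for the other two methods.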
Conclusion
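The “mimetypes” module is the fastest and simplest option if your video formats are common, registered ones and the file extensions can be trusted. If you are looking for accuracy, use the python-magic method, which inspects the file’s contents.
We did not explore matching against a hand-written list of file extensions because extensions can mislead and give inaccurate answers. Keep in mind that “mimetypes” also relies on the extension, so it shares that weakness.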