Author Topic: LATEST VERSION INFO THREAD - Easy Screencast Recorder - v1.17.01 - May 31, 2017 (Read 345364 times)

TAF2000 · « **Reply #100 on:** March 05, 2013, 01:15 PM »

Wow! This is exactly the thing I have been hunting for!

I am not exactly sure how it "integrates" with Screenshot Captor (my greatest LOVED utility of all time), but I sure like that I can set my own combination hotkey and BOOM! I am recording.

No fuss, no muss, and with another hit of the chosen shortcut keys, I am done and can get on with whatever I need to do with the video file.

I definitely need to donate some more credits and send them to Mouser.

AVI seems to kick up an error for me, but the default works great and I am guessing I only need to get more codecs installed for the rest. SUPER SWEET!

As for the integration, it sure would be cool if I chose the same ScreenShots folder as I use for the Screenshot Captor, and then it could open up the Captor window after recording a video and do the preview of the video right there with the nice folder display on the side. Not seriously necessary, but would sure work nicely for someone who uses the Screenshot Captor a ton.

Maybe Something like below:

mouser · « **Reply #101 on:** March 05, 2013, 02:20 PM »

As for the integration, it sure would be cool if I chose the same ScreenShots folder as I use for the Screenshot Captor, and then it could open up the Captor window after recording a video and do the preview of the video right there with the nice folder display on the side.

it should already do that, should look something like this:
Screenshot 3_5_2013 , 2_22_21 PM.png

You might need to be running the new beta release to see the feature.

TAF2000 · « **Reply #102 on:** March 06, 2013, 01:22 AM »

that is exactly what I was hoping for...

Updated to the new beta release.

Errr... but now I must be completely lost in the setup somehow. I am not sure how to make the caster show up in the captor after recording, and it doesn't load the asf when I click on it if I manually open the captor and click on it.

I probably missed it somewhere in the instructions or documentation. Hopefully you won't mind pointing me in the right direction?

mouser · « **Reply #103 on:** March 06, 2013, 02:07 AM »

Ok quick answers:

Somehow I have overlooked the fact that the SC video component does not seem to be able to decode ASF files, only AVI files. So that means that you are not going to be able to load the video into SC, unless you change to recording AVI. Though note that installing a lossless AVI codec as explained on the forum and in the help file for ESR will get you the best results.

Ok, so having said that -- the answer to your other question is to launch ESR from within Screenshot Captor tool menu -- you should find it's in the Tool menu of SC automatically. Even if you don't load the video into SC, if you save frames from the video inside ESR preview window, they should appear in the SC screenshot folder automatically for editing (that will work even if you use ASF, as long as you extract the frames from ESR instead of trying to use SC to do it).

mouser · « **Reply #104 on:** March 06, 2013, 02:23 AM »

Correction:

It seems that SC *can* in fact load ASF files and play and extract frames from them.. BUT there appears to be a bug dealing with ASF files that have no audio track. As long as the video has been recorded with audio it will load, at least for me.

Can you test and see if you have the same results?

TAF2000 · « **Reply #105 on:** March 06, 2013, 12:01 PM »

Ah HA!

The trouble seemed to be that I was having it save the files into a sub-folder of my screen shots and it won't play them from there. I don't know if it is recording sound or not, but I have it on default, so I think it is recording stereo mix.

I have both the SC and the ESR starting with the startup of my computer, so that I can run them right from hotkeys at any moment that I need them. So the preferred method I am looking for, I think, would be to figure out how to open the SC from the "With last recorded video do..." and display the video that just recorded. I am not familiar with what commands would be required for this. Would it be possible?

TAF2000 · « **Reply #106 on:** March 06, 2013, 12:13 PM »

Oh no. That doesn't seem to be it.

First trouble is that "With last recorded video do..." doesn't happen automatically after the video is finished recording, I thought it might. And second trouble is that the SC is already running in the system tray, so it pops an alert instead of opening. Not sure there would be any way around these issues without making it an option in the ESR program itself that would cause it to open SC and find the file automatically after each video is recorded.

Would it be worth the trouble to add that as an option to the ESR program "Options and Preferences"?

No worries if so... I think I am already getting the hang of finding the file manually in SC, and I now noticed that the ESR preview window has a nice little drop down list of recent files under "file".

Vurbal · « **Reply #107 on:** July 11, 2013, 05:56 AM »

I decided to try this program out as a more barebones alternative to CamStudio. I actually do a lot of screencasting but I don't need anything special like annotations. I am, however, intrigued by some of the options in this program and after skimming through this thread I think I could help out if you're still working on it. I have a lot of experience with video encoding, a decent familiarity with the ffmpeg command line and AviSynth has been my go to video editing tool for close to a decade now.

Since I'm in the middle of a wipe and clean install of my system drive I'll have to wait to try the program out, but I already have a couple questions and comments so I figured I'd get a head start here. I guess the first thing would be just to say I'm a big fan of your software already. After more than a year of playing around with LaunchBar Commander off and on I'm making it the center piece of some articles for the website I work for (AfterDawn.com btw) which will also include some videos for our YouTube channel. You can expect a donation from me on the strength of Screenshot Captor alone once I have a few bucks to spare.

But back to Easy Screencast Recorder, one of the things that caught my attention was the option to create MKV files but I'm not quite clear on what I need to have installed for that. It says the "MKV codec" is required but since Matroska is a container and not a compression standard there's no such thing. What it does need is a MKV muxer. I'm guessing you're using DirectShow and I'm guessing (or hoping at least) this uses the Dshow muxer from MPC-HC. If there's more information in the help manual feel free to just tell me to RTFM. My computer doesn't want to open CHM files right now (I did say I was reinstalling Windows) so I can't check for myself right now.

On the subject of codecs, though, for anyone encoding normal screen activity (ie standard low motion computer use) I would highly recommend the CamStudio Lossless codec. For ultra low motion video it's more efficient than any other realtime codec by a couple orders of magnitude. For anything else it's pretty much worthless since it drops frames left and right.

Also, if you're still thinking about some way to incorporate AviSynth functionality you don't necessarily need to have it installed. You should be able to do everything you need by just linking to avisynth.dll. I wouldn't want to guess how much work that might be since I'm not a programmer but you should be able to get any answers you need at the Doom9 forums.

mouser · « **Reply #108 on:** July 11, 2013, 06:56 PM »

Vurbal, welcome to the site!

And thank you for such a great post.

I have a lot of experience with video encoding, a decent familiarity with the ffmpeg command line and AviSynth has been my go to video editing tool for close to a decade now.

This would actually be of great help, because i have 0 experience with codecs and avisynth, and what i would really love is to understand some common profiles that i could offer the user so that ESR could easily convert to file formats that would be easy for people to upload to youtube, etc.

I have a couple of other projects to work on but i'd love to revisit ESR and get the program to a state where it can really live up to its name. Right now the program would probably be more accurately called "Half-easy, Half-painfully-difficult screencast recorder"..

After more than a year of playing around with LaunchBar Commander off and on I'm making it the center piece of some articles for the website I work for (AfterDawn.com btw) which will also include some videos for our YouTube channel.

Awesome

Afterdawn: very cool!!

Vurbal · « **Reply #109 on:** July 18, 2013, 12:58 AM »

I could have sworn I already replied to this. Must have gotten distracted and forgotten to hit the Post button.

Here's the easy answer to encoding screencasts for uploading. It's probably also incomplete because I've only tried it with YouTube. I encode all my video to H.264 using x264 in lossless mode because it's insanely efficient. The same caveat applies here as using CamStudio Lossless for capture - for anything but low motion screencap all bets are off. To give you an idea just how efficient it is I compress the 16/48 stereo audio with FLAC and it still amounts to almost the entire size of the final MKV file.
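For anyone who wants to try the lossless x264 plus FLAC route described above, here is a minimal sketch of how the FFmpeg command line could be assembled from Python. The function name and the file names are made up for illustration, and it assumes an FFmpeg build that includes libx264 and a FLAC encoder. It only builds the argument list; whether a given video site accepts the result is a separate question.

```python
import shlex


def build_upload_command(src, dst, preset="veryslow"):
    """Build an FFmpeg argument list for a lossless H.264 + FLAC re-encode.

    Hypothetical helper; assumes an FFmpeg build with libx264 and FLAC.
    """
    return [
        "ffmpeg",
        "-i", src,           # the captured file, e.g. an AVI from the recorder
        "-c:v", "libx264",   # encode the video stream with x264
        "-qp", "0",          # quantizer 0 puts x264 into lossless mode
        "-preset", preset,   # slower presets spend CPU time to shrink the file
        "-c:a", "flac",      # losslessly compress the audio stream
        dst,                 # an .mkv name makes FFmpeg pick the Matroska muxer
    ]


if __name__ == "__main__":
    print(shlex.join(build_upload_command("capture.avi", "upload.mkv")))
```

Since this is a post-processing step rather than realtime capture, a slow preset is affordable here; a faster one would only cost file size, not quality, because the mode is lossless.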

You might be able to use the same video format for most video sites but I have no idea how prevalent FLAC support is. I'd be willing to bet MKV support is rare beyond YouTube but MP4 should be pretty much universal.

I'm actually glad it will be a while before you get back to this. After years of procrastination I've decided I need to learn some actual programming skills for various projects I want to pursue so I'm giving myself a crash course in Python. One of those projects happens to be an alternate version of AviSynth called VapourSynth which uses Python instead of native AVS scripts.

mouser · « **Reply #110 on:** July 18, 2013, 09:22 AM »

x264 in lossless mode sounds intriguing as a potential default recording format for ESR..

wraith808 · « **Reply #111 on:** July 18, 2013, 10:15 AM »

why x264 instead of other h.264 implementations? Just wondering...

(I think I found the answer on my own)

Vurbal · « **Reply #112 on:** July 18, 2013, 11:42 AM »

x264 in lossless mode sounds intriguing as a potential default recording format for ESR..
-mouser (July 18, 2013, 09:22 AM)

That would be a tricky proposition. The problem is that it's not designed as a realtime encoder. There used to be a VfW version around but trust me VfW should be considered a last resort option and VfW x264 is a bad idea.

I do put up with VfW for CamStudio because it's better than any comparable program that doesn't cost hundreds of dollars. And because the CamStudio Lossless Codec is only available in VfW.

What you could do in theory is pipe the video to FFmpeg for encoding since x264 is integrated into it. I know FFmpeg supports pipe input but other than that I know basically nothing about it. I can see a lot of potential difficulties there like buffering.
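The pipe idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the capture side is assumed to hand over raw 24-bit BGR frames of a fixed size, the helper names are invented, and nothing here reflects how ESR actually works. The write to the pipe blocks whenever FFmpeg falls behind, which is exactly the buffering concern mentioned above.

```python
import subprocess


def build_pipe_command(width, height, fps, dst):
    """Argument list telling FFmpeg to read raw frames from stdin (hypothetical helper)."""
    return [
        "ffmpeg",
        "-f", "rawvideo",                    # stdin carries bare frames, no container
        "-pixel_format", "bgr24",            # assumed capture layout: 3 bytes per pixel
        "-video_size", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "-",                           # "-" means read the input from stdin
        "-c:v", "libx264",
        "-preset", "ultrafast",              # cheapest preset, for realtime use
        "-qp", "0",                          # lossless mode
        dst,
    ]


def encode_frames(frames, width, height, fps, dst):
    """Feed an iterable of raw frame byte strings to FFmpeg through a pipe."""
    frame_size = width * height * 3
    proc = subprocess.Popen(
        build_pipe_command(width, height, fps, dst), stdin=subprocess.PIPE
    )
    try:
        for frame in frames:
            if len(frame) != frame_size:
                raise ValueError("frame has the wrong size")
            proc.stdin.write(frame)          # blocks while FFmpeg is busy
    finally:
        proc.stdin.close()                   # end of input lets FFmpeg finish the file
    return proc.wait()
```

A real recorder would probably want a bounded queue between the capture thread and this writer so that a slow encoder drops or delays frames in a controlled way instead of stalling the capture.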

Actually, though, that does make me think of a different option. I seem to recall that ffdshow can decode CamStudio Lossless using libavcodec so there's definitely FFmpeg support of some kind. That leads me to believe there's probably encoding support as well.

Well that's it. Now I'm on a mission to work this out. Damn you mouser!

mouser · « **Reply #113 on:** July 18, 2013, 11:49 AM »

The problem is that it's not designed as a realtime encoder.

Fair enough.. And it probably makes sense to split up the job of rapid lossless recording vs the job of getting good compression.

So we're back to the idea that ESR should have a way to post-process (either on demand or automatically) video into a format for uploading and sharing.

Vurbal · « **Reply #114 on:** July 18, 2013, 11:53 AM »

why x264 instead of other h.264 implementations? Just wondering...

(I think I found the answer on my own)
-wraith808 (July 18, 2013, 10:15 AM)

Yep, that's a good technical explanation. The less technical one is that x264 is not just free (as in beer and speech, but not necessarily patent encumbrance) but also the best H.264 encoder available with the exception of certain situations involving gradients where CinemaCraft's encoder is supposed to be the only good choice. I say supposed to be because it's a high end professional encoder that costs something like $50,000 so needless to say I haven't used it.

x264 is so good that The Criterion Collection paid the tens of thousands of dollars required for Blu-ray certification. If there's one thing Criterion is known for (besides their huge selection of art films) it's their uncompromising attitude to quality.

Vurbal · « **Reply #115 on:** July 18, 2013, 11:57 AM »

The problem is that it's not designed as a realtime encoder.

Fair enough.. And it probably makes sense to split up the job of rapid lossless recording vs the job of getting good compression.

So we're back to the idea that ESR should have a way to post-process (either on demand or automatically) video into a format for uploading and sharing.
-mouser (July 18, 2013, 11:49 AM)

Probably - but let me get back to you on the FFmpeg thing first and see what the options are. I suspect x264 ends up being too CPU intensive either way but we should start with a better picture of the possibilities before getting invested in anything.

Vurbal · « **Reply #116 on:** July 19, 2013, 02:27 AM »

Before I get any deeper into this (oops too late) I think the best thing is to establish a common baseline so everybody (including me) knows what I'm talking about. Like anything technical it's essential everybody is speaking the same language so a lot of it will be pretty rudimentary. Also, see my signature.

Basic Terminology

Frame: A frame is the smallest group of samples you should need to be concerned with. Don't think pictures (like video frames) but rather data frames like in networking. Each group of samples has a header both to provide metadata and for muxing and decoding.
- Video Frame: Every video frame, regardless of what standard is used for encoding, contains all the samples for 1 entire picture. The terms are basically interchangeable.
- Audio Frame: The number of samples in an audio frame is determined by the relevant encoding standard. Any further details will be handled by the relevant DirectShow filters so this is already more than you probably need to know.
Stream: A stream is a sequence of video or audio samples.
- Elementary Stream: This is a stream of video or audio frames. Some files appear to be containers (eg MP3) but are really just elementary streams with additional information tacked on.
- Raw Stream: A raw stream consists of nothing but samples. It typically has no file header and there are no frames. I only mention this because H.264 does not use frames (except in the video sense obviously) so except for special circumstances (which we shouldn't have to worry about) should always be in a container.
Mux: Multiplexing multiple streams together is typically referred to as muxing. This is also similar to multiplexing in a data network except that the frames have to be in sync in terms of timing. Typically this means alternating between 1 video frame and multiple audio frames. This should be handled automatically but I figured it was worth explaining.
Container: When you mux streams together you put them into a container. The container has its own header to store metadata about the streams so they can be separated correctly later.
Demux: Predictably demultiplexing the individual streams from a container is better known as demuxing.
Encoding: Just like any other type of information video and audio have to be encoded in some standardized data format for processing by a computer. Encoding and format are not the same thing (see container) but they're commonly used interchangeably. If you're going to do that just make sure you're clear that's what you mean. If you want to be extra clear it's safer to call it encoding, standard, or encoding standard but even I'm not that much of a language nazi.
Encoding Standard: Some encoding standards are inseparable from the software used to create them. For example QuickTime refers to both the encoding standard and the encoder. However the most common encoding standards are entirely separate from the software itself. It makes things easier if you can separate them in your head. For example there are encoders called DivX and XviD but both encode video according to standards defined in MPEG-4 Part 2. It is not DivX or XviD video, but rather MPEG-4 ASP or MPEG-4 SP video.
Definition: Definition is the accuracy of a sample or group of samples. In other words the amount of detail captured. Once a stream is captured the maximum definition is set. It can never be increased but can be decreased.
Resolution: Resolution is the precision of a sample or group of samples. In other words the amount of detail encoded (stored) for a sample or group of samples. Increasing the resolution does not increase definition but decreasing the resolution permanently decreases the definition.
Interpolation: If you're familiar with this mathematical term that's all this is. Otherwise just think of it as a mathematical educated guess. It's used to create new information, typically for upscaling to a higher resolution.
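To make the Mux, Container and Demux entries above a bit more concrete, here is a toy sketch. It is not any real container format: streams are plain lists of (timestamp, payload) pairs, the stream name attached to each packet stands in for a frame header, and all names are invented for the example.

```python
import heapq
from collections import defaultdict


def mux(streams):
    """Interleave several streams into one timestamp-ordered packet list.

    `streams` maps a stream name to a list of (timestamp_ms, payload) tuples,
    each already sorted by timestamp. Every output packet is tagged with its
    stream name so the streams can be separated again later.
    """
    tagged = (
        [(ts, name, payload) for ts, payload in frames]
        for name, frames in streams.items()
    )
    return list(heapq.merge(*tagged))


def demux(packets):
    """Split a muxed packet list back into its individual streams."""
    streams = defaultdict(list)
    for ts, name, payload in packets:
        streams[name].append((ts, payload))
    return dict(streams)


if __name__ == "__main__":
    video = [(0, "V0"), (40, "V1"), (80, "V2")]                    # 25 fps video
    audio = [(0, "A0"), (20, "A1"), (40, "A2"), (60, "A3"), (80, "A4")]
    for packet in mux({"video": video, "audio": audio}):
        print(packet)
```

Running it shows the alternation described in the Mux entry: each video frame ends up surrounded by the audio frames that share its time span.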

Picture Groups

One way to reduce the size of a video stream is to avoid duplicating details which don't change from one frame to the next. This is particularly relevant to screen capture where it's common for several sequential frames to be exactly the same and many others to be nearly so. This section describes how this can be done by grouping pictures together. To best understand this information I recommend you read through the descriptions, attempt to follow the explanation which follows, and then repeat as many times as necessary.

I-frame: An Intra picture, more commonly known as an I-frame, is a full picture equivalent to a normal image file.
Delta Frame: Delta frame is the generic name for any frame which describes changes compared to one or more other frames. It cannot be decoded by itself. Any other frames it references must be decoded first. These may be Intra frames or other Delta frames - often both.
- P Frame: A Predicted picture, more commonly called a P-frame, uses only references to previous frames. These are the most efficient delta frames in terms of encoding and decoding efficiency but the least efficient in terms of file size.
- B Frame: A Bidirectional picture, or B-frame, describes changes relative to both previous and future frames. They are (on average) the most efficient in terms of file size but the least efficient in terms of encoding and decoding efficiency.
Keyframe: In the most basic terms a Keyframe is an I frame where video decoding can begin without referencing any previous frames. Although keyframe and I-frame are sometimes used interchangeably, not every I-frame is automatically a keyframe. Also be aware that whether a particular I-frame is also a keyframe may be a function of a particular application and not determined exclusively by the properties of the video stream.
GOP: A Group Of Pictures, or GOP, is a sequence of frames beginning with an I-frame which is followed by 1 or more P-frames and/or B-frames and sometimes also additional I-frames.
- Open GOP: A GOP is considered open if it includes one or more delta frames which reference frames from preceding and/or subsequent GOPs.
- Closed GOP: A closed GOP is entirely self-contained. No frame in a closed GOP references a frame from another GOP. At the very least a closed GOP cannot end with a B-frame.
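The difference between an I-frame and a delta frame can be sketched with a deliberately simplified model. Real codecs work on blocks with motion compensation, not on single pixels, so treat this purely as an illustration; the frames are short lists of pixel values and the function names are invented.

```python
def encode_delta(previous, current):
    """Describe `current` as a dict of only the pixels that differ from `previous`."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same size")
    return {
        index: new
        for index, (old, new) in enumerate(zip(previous, current))
        if old != new
    }


def decode_delta(previous, delta):
    """Rebuild a full frame from the already decoded previous frame plus a delta."""
    frame = list(previous)
    for index, value in delta.items():
        frame[index] = value
    return frame


if __name__ == "__main__":
    i_frame = [0, 0, 0, 0, 0, 0, 0, 0]    # full picture, decodable by itself
    frame_2 = [0, 0, 0, 9, 9, 0, 0, 0]    # a small part of the screen changed
    frame_3 = list(frame_2)               # nothing changed at all
    print(encode_delta(i_frame, frame_2))  # only the two changed pixels are stored
    print(encode_delta(frame_2, frame_3))  # an identical frame costs nothing
```

This is also why the delta cannot be decoded by itself: `decode_delta` needs the previous frame as input, just as a P-frame needs its reference frame.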

This is an example of a fairly simple GOP structure, like what you might find in a MPEG-2 file. At the top is the order these frames will play. At the bottom is the order they will be encoded, decoded, and stored. Most of the I and P frames are encoded (and must be decoded) out of order because otherwise the encoder or decoder won't have the necessary information for the preceding B frames.

It's also worth mentioning that the first GOP (00-05) ends with a B frame so it cannot be closed while the second (07-09) ends with a P frame so it can be.

For capturing you may use P frames (depending on your choice of encoder primarily) but never B frames because they're not suitable for realtime encoding. When you reach the final step of encoding for upload (or most other distribution methods) you will rely heavily on B frames to retain maximum quality at a minimal file size.
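The reordering between play order and stored order can be sketched as follows. This is a simplified model that assumes every B-frame references the nearest following I- or P-frame, so that frame has to be coded first; the function names are invented and the frame pattern in the example is arbitrary, not taken from any particular file.

```python
def decode_order(display_order):
    """Reorder frame types from display order into coded (stored) order.

    Returns (display_index, frame_type) pairs. Each run of B-frames is moved
    behind the I- or P-frame that follows it in display order.
    """
    coded, pending_b = [], []
    for index, kind in enumerate(display_order):
        if kind == "B":
            pending_b.append((index, kind))   # must wait for its future reference
        else:
            coded.append((index, kind))       # the reference frame goes first
            coded.extend(pending_b)           # then the B-frames that needed it
            pending_b = []
    coded.extend(pending_b)   # trailing B-frames reference a frame in the next GOP
    return coded


def may_be_closed(gop):
    """Necessary condition from the text: starts with I, does not end with B."""
    return bool(gop) and gop[0] == "I" and gop[-1] != "B"


if __name__ == "__main__":
    print(decode_order("IBBPBBP"))
    print(may_be_closed("IBBP"), may_be_closed("IBBPBB"))
```

The need to hold B-frames back until a later frame has been captured and coded is one way to see why they are a poor fit for realtime capture.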

Containers

Matroska: Matroska is a free (as in speech, beer, and patent encumbrance) universal multimedia container. If Matroska doesn't support it you almost certainly shouldn't be using it. The only down side is a lack of support in video editing software and possibly a lack of support by online video services although YouTube is happy to accept Matroska uploads. If you want to use one of the big commercial editors Matroska will give you problems. Matroska files typically have a MKV extension for video or muxed video and audio. It can also be used for just audio although MKA is more common.
WebM: WebM is Google's open source media container. Actually it's just a subset of Matroska. However unlike Matroska you can't put just any stream you want into it. It's intended specifically for VP8 video which is a successor to the VP6 codec Flash Video was based on. Google bought On2, the company behind VP8, a couple years back to create an open source competitor to H.264. That hasn't happened yet. This one is more interesting than useful.
MP4: As the name suggests MP4 is the official MPEG-4 container. You can put H.264 or MPEG-4 ASP (DivX, XviD, and the like) video in this container and MP3, AAC, or AC3 audio. Other than storing H.264 streams (because they're a pain without a container) I don't have much use for it because I use LPCM or FLAC audio in my YouTube videos. If you prefer MP3 or AAC it's not a bad choice. Theoretically you can also put MPEG-2 video in it except I don't know about software support for it and I don't bother with MPEG-2 any more. The file extension should always be MP4 but you will see M4V used for video only files or M4A for audio (thank Apple for that one).
RIFF: RIFF is an ancient and generic container format for any type of data. Because it is not specific to any particular type of data it uses chunks instead of frames. Unlike a frame, a chunk has only a bare-bones header of its own (a four-character ID and a size). Anything more descriptive is kept in 1 or more headers or indexes elsewhere in the RIFF file. Each of these amounts to a master header for a group of chunks.
WAV: The WAV container is an implementation of RIFF specifically designed for audio streams. Although it can hold various types of audio you shouldn't be using it for anything except LPCM. For practical purposes it's enough to know that a WAV file can be treated like an elementary audio stream.
AVI: The AVI container is another specialized application of the RIFF format. It stands for Audio Video Interleave. Interleaving is just another way to describe muxing. Like all RIFF-based formats AVI files store the metadata (equivalent to frame headers) in monolithic indexes at the beginning of the file. There is one index for each stream so in our case there would be 2 - 1 each for video and audio.
The relatively primitive nature of AVI's chunk-based approach makes it unsuitable for B-frames because VfW was designed around the assumption that frames are stored in display order. It can be done, but it is always a hack. That's why VfW has generally been ignored by x264 developers.
- AVI 1.0: The original AVI specification, commonly referred to as AVI 1.0, is the only type of AVI file you can work with via the equally ancient VfW interface. It is officially limited to a file size of 2GB, although with some trickery AVI 1.0 files up to 4GB can be created and read. You may notice that these sizes exactly match the limitations of the FAT16 and FAT32 file systems respectively.
- AVI 2.0/OpenDML: AVI 2.0 files use a Matrox extension of the AVI 1.0 standard called OpenDML which removes the 2GB/4GB file size limitation. There are other technical differences, but at the end of the day that's the problem of your DirectShow filters. As long as we're going through DirectX (which makes use of VfW components but not VfW itself) this should be the only type of AVI files to worry about.
ASF: This is Microsoft's proprietary container for Windows Media (WMV and WMA). If you capture to these formats you'll use it for initial storage and then you'll convert it to something else and switch containers. Almost nobody uses ASF because almost nobody uses WMV and WMA is almost exclusively used in Windows Media Center. The extension, predictably enough, is ASF.
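For the RIFF-based containers (WAV and AVI), the chunk layout is simple enough to walk by hand. As far as I know every RIFF chunk starts with a four-character ID followed by a 32-bit little-endian size, and the data is padded to an even length. The sketch below builds a tiny fake WAV-style file in memory and walks it; the helper names and the dummy payloads are invented for the example.

```python
import struct


def make_chunk(chunk_id, data):
    """Build one RIFF chunk: 4-byte ID, 32-bit little-endian size, data, pad byte."""
    pad = b"\x00" if len(data) % 2 else b""
    return chunk_id + struct.pack("<I", len(data)) + data + pad


def walk_chunks(buf, offset=0, end=None):
    """Yield (chunk_id, data) for each chunk found in buf[offset:end]."""
    end = len(buf) if end is None else end
    while offset + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", buf, offset)
        start = offset + 8
        yield chunk_id, buf[start:start + size]
        offset = start + size + (size % 2)   # skip the pad byte after odd sizes


# Dummy payloads, just enough structure to demonstrate the walk.
body = b"WAVE" + make_chunk(b"fmt ", b"\x01\x00" * 8) + make_chunk(b"data", b"\x00" * 5)
riff = make_chunk(b"RIFF", body)

if __name__ == "__main__":
    (outer_id, outer_data), = walk_chunks(riff)
    print(outer_id, outer_data[:4])
    for chunk_id, data in walk_chunks(outer_data, 4):
        print(chunk_id, len(data))
```

Because the size field is 32 bits wide, a single chunk cannot describe more than 2**32 - 1 bytes, which is consistent with the 4GB ceiling mentioned for AVI 1.0; software that reads the field as a signed number would stop at 2GB.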

Video Standards

AVC/H.264: MPEG-4 Part 10 defines an advanced video encoding standard better known as H.264 (the ITU designation), MPEG-4 AVC, or simply AVC (Advanced Video Coding). It is far and away the best video encoding standard in terms of quality vs. bitrate (file size) but due to the complexity required is also relatively CPU intensive to encode, and potentially also to decode. It also has a lossless profile which is particularly suited to low motion video like typical screen captures. H.264 streams are raw, rather than elementary so they should always be stored in a container. Although they can be stored in MPEG PS or MPEG TS containers MP4 or Matroska are typically used.
WMV 9: Windows Media Video is a family of video encoding standards of which only WMV 9 is of any particular interest as it is designed for everything from screen capture software to streaming. The VC-1 standard (aka WMV3, WVC1 or SMPTE 421M) used on some Blu-ray discs is a subset of WMV 9. At relatively high bitrate the quality is comparable to H.264. Being a proprietary Microsoft technology, WMV files are stored almost exclusively in their ASF container.
CamStudio Lossless: CamStudio Lossless is an encoding standard implemented by the VfW codec of the same name. There is also a decoder, but no encoder, built into FFmpeg. It is highly optimized for realtime encoding (eg screen capture) of low motion video. Within those constraints it offers unbeatable efficiency (small file sizes). The one caveat for this codec is that it can be tricky to decode in AviSynth. For some reason AviSynth doesn't handle the way CamStudio Lossless handles duplicate frames properly. To bypass this you simply need to use the FFmpeg-based FFMS2 source filter rather than VfW to decode it with. It can be stored in either AVI or MKV containers.
UT Video: UT Video is one of the newest lossless codecs. I haven't used it personally but for general purpose use it is reputed to be as good as it gets. I can almost guarantee it won't hold a candle to CamStudio Lossless for screen captures but if it's more "normal" video rather than super low motion like the typical screen capture it is apparently in a class by itself. Even if you don't use it for capturing it should be a great choice for intermediate tasks like editing. Being a codec means you'll need to put it in either an AVI or Matroska container.
FFV1: FFV1 is FFmpeg's lossless video standard. It can be encoded either directly with FFmpeg or via VfW using ffdshow (don't look for a release version it's in perpetual beta). For normal or high motion video it's one of the most efficient but it is fairly CPU intensive. For screen capture probably most suitable as an intermediate format for editing. However it is not considered as fast or efficient as UT Video so I'd go with that one first. You will need to store it in either an AVI or Matroska container.
MSU: MSU is a commercial lossless codec that's free for personal use. People seem to either love this one or hate it - either it works flawlessly or it chews up CPU cycles and takes forever. Tests I've seen seem to indicate its efficiency (output filesize) is better than FFV1 but not as good as UT Video. I'll probably never bother to try it out just because I prefer open source alternatives which the rest of the lossless codecs on this list are. Once again this is an AVI or Matroska container.
HuffYUV: HuffYUV I include more for sentimental reasons than anything else I suppose. It's the original open source lossless codec and was originally written by Ben Rudiak-Gould who also created the first version of AviSynth. It should be less CPU intensive than FFV1 or MSU, but still not as good as UT Video. Efficiency wise definitely at the bottom of the list. However it does have the advantage of being available either as a standalone codec or part of ffdshow. Again, AVI or Matroska container.

Audio Standards

LPCM/PCM: LPCM stands for Linear Pulse Code Modulation which may also be referred to as just PCM or uncompressed. This is a sort of universal and simple way of storing audio used for everything from Betamax to CD to DVD and Blu-ray. There is no standard LPCM elementary stream but LPCM in a WAV container can be treated like one.
FLAC: Free Lossless Audio Codec or FLAC is an open source encoding standard for losslessly compressing LPCM audio. Unlike LPCM, FLAC does use elementary streams which can contain a variety of (mostly CD Audio related) metadata. It can be stored by itself in a FLAC container. It can also be stored in a Matroska container either by itself or muxed with other streams.
MP3: MPEG-1 Audio Layer 3 is lossy compressed audio typically found in an elementary stream which also has a sort of secondary header added for tag metadata. It can also be muxed into pretty much any container although typically it's found in MKV, MP4, or AVI files.
AAC LC: Advanced Audio Coding Low Complexity is part of the MPEG-2 standards family. At very low bitrates (128kbps or less) it tends to have slightly superior sound to MP3 at the same bitrate. At higher bitrates they are more or less equivalent. It is often referred to simply as AAC. Apple uses this encoding standard for iTunes downloads. There is no elementary stream format so raw AAC streams are typically found in MP4 containers by themselves. They can be muxed into MP4, MPEG PS, MPEG TS, or Matroska containers.
WMA: Windows Media Audio is a family of audio encoding standards which include a lossy standard more or less comparable in quality to MP3 as well as a lossless one. I could give you a lot more details but WMA really isn't worth the effort.
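Since LPCM is uncompressed, its size is plain arithmetic: sample rate times bit depth times channel count. The small sketch below works that out for the 16-bit, 48 kHz stereo audio mentioned earlier in the thread; the function names are just for illustration.

```python
def lpcm_bitrate(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed LPCM audio."""
    return sample_rate_hz * bit_depth * channels


def lpcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Bytes needed to store `seconds` of LPCM audio, ignoring container overhead."""
    return lpcm_bitrate(sample_rate_hz, bit_depth, channels) * seconds // 8


if __name__ == "__main__":
    rate = lpcm_bitrate(48_000, 16, 2)
    print(f"{rate / 1000:.0f} kbps")                       # 16/48 stereo
    size = lpcm_size_bytes(48_000, 16, 2, 10 * 60)         # ten minutes
    print(f"{size / 1_000_000:.1f} MB before any compression")
```

That works out to 1536 kbps, which gives a sense of why the audio can dominate the file size next to a very efficiently compressed low motion video stream.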

Windows Specific

Uncompressed Video: In Microsoft land uncompressed video is defined as RGB 4:4:4 (every pixel includes all three color components). Uncompressed video can be encoded or rendered directly without any additional components.
Uncompressed Audio: Microsoft defines uncompressed audio as LPCM at any bit depth and any samplerate. Uncompressed audio can be encoded or rendered directly without any additional components.
Compressor: A component used to encode video or audio to a format other than the ones listed above as uncompressed.
Decompressor: A component for decoding any video format except those listed above as uncompressed.
Splitter: This is Microsoft speak for demuxer. Functionally it's exactly the same thing.
Renderer: A component used to send uncompressed video or audio to your display or speakers.
VfW: VfW (Video for Windows) is Microsoft's ancient attempt to copy QuickTime. Most things you can do via VfW are better handled either through DirectShow or a standalone executable.
Codec: In VfW Compressors and Decompressors for a given format are typically included in a single Codec. However it's still possible to have a component that's just a compressor or a decompressor.
DirectShow: This is the DirectX multimedia framework which replaced VfW. While it is a huge improvement in terms of playback, it isn't always reliable for random access.
Filter: Rather than monolithic components like the codecs in VfW, DirectShow uses discrete components called filters. Each filter performs a single, specialized task like opening or writing a file, splitting streams from a container, decoding or encoding a stream, or rendering uncompressed video or audio. DirectShow filter files have an extension of AX.
Pins: The inputs and outputs of DirectShow filters are called pins. To send information from one filter to another you connect the output pin from the first filter to the input pin on the second. Depending on the filter an input or output pin could also connect to a file or a device.
Filter Graph: In DirectShow the chain of filters used for a given set of operations is called a graph. A graph is built automatically by video playing or processing programs. It consists of the filters themselves and the connections between the various pins.
GraphEdit: Alternatively you can use a free Microsoft tool called GraphEdit to either open a file to see what filters are involved or manually build a graph by selecting your own filters and connecting them yourself. You can even save a graph to open with various tools just like opening a file. GraphEdit is probably one of the better pieces of software Microsoft has ever produced. I actually use an open source tool called GraphStudio which offers the same basic interface but more features. In either case these are good tools for troubleshooting DirectShow problems.
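
A toy model may make the filter, pin, and graph ideas above more concrete. This is only a sketch in Python, not the DirectShow API (which is COM-based and normally used from C++). The `Filter` class, `connect()`, `push()`, and the stage labels are all names invented for illustration.

```python
class Filter:
    """Toy stand-in for a filter: one input pin, one output pin."""

    def __init__(self, name, process):
        self.name = name
        self.process = process      # the single task this filter performs
        self.downstream = None      # whatever is attached to our output pin

    def connect(self, other):
        """Connect this filter's output pin to the other filter's input pin."""
        self.downstream = other
        return other                # returning it lets connections be chained

    def push(self, data):
        """Run our task, then hand the result to the next filter, if any."""
        result = self.process(data)
        if self.downstream is None:
            return result
        return self.downstream.push(result)


def stage(label):
    """Build a fake task that just records that it ran."""
    return lambda data: data + [label]


# Build a playback-style graph: source -> splitter -> decoder -> renderer.
source = Filter("source", stage("read file"))
splitter = Filter("splitter", stage("split streams"))
decoder = Filter("decoder", stage("decode video"))
renderer = Filter("renderer", stage("render"))
source.connect(splitter).connect(decoder).connect(renderer)

trace = source.push([])
print(" -> ".join(trace))
```

The point of the sketch is only that each filter does one job and the graph is nothing more than the filters plus the connections between their pins.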

MF: Media Foundation is essentially DirectShow except that it's locked down so Microsoft doesn't have all those pesky volunteer developers turning their crappy closed ecosystem into something infinitely more useful and interesting like they did with VfW and DirectShow. Just remember that no matter how much a Media Foundation component looks like something from DirectShow they're intentionally designed not to work with it.

Color

This seems like a simple enough subject. When you look at an object in the real world like your keyboard what you're seeing is whatever frequencies of light aren't being absorbed. Instead they're being reflected back. That's subtractive color. When you look at a computer monitor you are actually seeing light being shined directly at you. That's additive color.

Each pixel on your monitor is actually 3 different dots, one red, one green, and one blue. This is called RGB color space. The intensity of each one determines the final color of the pixel and can be expressed in a value from 0-255. That's 8 bits per color or 24 bits per pixel making it 24 bit color.

Video uses a different color scheme. Instead of RGB it uses YUV. The Y represents luma (light and dark) while the U and V represent chroma (color). Technically the chroma only carries blue and red, each stored as a difference from the luma, because your eyes are more sensitive to green and therefore it's more or less included in the luma. This is the YUV color space.
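
As a rough illustration of the paragraph above, here is a minimal Python sketch of an RGB to YUV conversion. It assumes the standard-definition (BT.601) weights, which are only one of several sets in use, and the function name `rgb_to_yuv` is made up for this example.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to Y, U, V using BT.601 weights (an assumption)."""
    # Luma is a weighted sum of all three colors; green gets the largest weight.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chroma is stored as scaled differences between blue/red and the luma.
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


gray = rgb_to_yuv(128, 128, 128)    # no color, so U and V come out near zero
green = rgb_to_yuv(0, 255, 0)
red = rgb_to_yuv(255, 0, 0)
blue = rgb_to_yuv(0, 0, 255)
print(gray, green, red, blue)
```

Pure green ends up with the highest luma of the three primaries, which is the sense in which green is "more or less included in the luma".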

That also means the UV components don't need to be at full resolution because you won't be able to tell the difference. In fact they are usually only 1/4 the resolution which saves a lot of space. So now instead of 24 bits per pixel each block of 4 pixels uses 32 bits for luma but only 16 bits for chroma for a total of 12 bits per pixel. In other words half the original size.
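
The arithmetic above can be checked with a few lines of Python. This is just a sketch; the function name `bits_per_pixel` and its parameters are invented for illustration, and it assumes 8 bits per sample and a block of 4 pixels as in the paragraph.

```python
def bits_per_pixel(luma_samples, chroma_samples_per_plane, pixels=4, bit_depth=8):
    """Average bits per pixel for a block with the given sample counts."""
    # One luma plane plus two chroma planes (U and V).
    total_samples = luma_samples + 2 * chroma_samples_per_plane
    return total_samples * bit_depth / pixels


# Full-resolution chroma: every one of the 4 pixels has its own U and V.
full_chroma = bits_per_pixel(luma_samples=4, chroma_samples_per_plane=4)
# Quarter-resolution chroma: the 4 pixels share a single U and a single V.
quarter_chroma = bits_per_pixel(luma_samples=4, chroma_samples_per_plane=1)

print(full_chroma, quarter_chroma)  # 24.0 12.0
```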

Additionally video colors have a different gamut than computer colors. For video, whether RGB or YUV, the values range from 16-235 instead of 0-255. That means not every color your monitor displays correlates to a legal color for video. These illegal values are called out of gamut colors. What this ultimately means is that the colors in your final video will be different from the colors on your desktop. It shouldn't be a big deal but it's easy to think you're doing something wrong when that happens or that there's some way to "fix" it. You're not and there isn't.
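
To see why the colors can't match exactly, here is a small Python sketch that squeezes the 0-255 desktop range into the 16-235 video range described above. Simple linear scaling is an assumption made for illustration (real tools may clip or scale differently), and both function names are made up.

```python
def full_to_video_range(value):
    """Scale a 0-255 value into the 16-235 video range."""
    return 16 + round(value * 219 / 255)


def video_to_full_range(value):
    """Scale a 16-235 video value back to 0-255, clamping illegal values."""
    clamped = min(max(value, 16), 235)
    return round((clamped - 16) * 255 / 219)


# 256 possible desktop levels have to share 220 video levels,
# so some neighboring desktop values collapse into the same video value.
distinct_video_levels = len({full_to_video_range(v) for v in range(256)})
print(full_to_video_range(0), full_to_video_range(255), distinct_video_levels)
```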

What you should try to make sure of, though, is that you're not converting between RGB and YUV (actually YV12 once you include the chroma downsampling) more than once. If all you do is capture and encode that shouldn't be a problem. If you do any kind of editing in between you should figure out what colorspace your editing tools use and compare that to your capture codec. If your editing tools operate in RGB color space you should make sure you capture RGB. If they use YUV you can capture either RGB or YUV as long as you only perform the conversion once.

Other Useful Software

AviSynth: This is one of the most useful and flexible video tools ever created. It's a script-based editor that uses its own custom scripting language. It's extensible with plugins and you can even link to it directly from your own programs. Even though it's GPL licensed there are also exceptions written in to allow developers to link to AviSynth.dll without worrying about whether they have to open their code as well.
It basically works like this. You write a simple script specifying a source file or files to open and any processing instructions you may have and then save it with an extension of AVS. You can then open that script with just about any program that can open an AVI file and it supplies the output of your script as uncompressed video and audio frames. That sounds a little complicated - and it certainly can be - but it can also be as simple as 1 or 2 lines:

   AviSource("D:\Videos\MyFile.avi")
   ConvertToYV12()

I won't demonstrate how much more complicated it can get because that would be cruel and probably not something you'll ever use.
FFmpeg: This is the holy grail of open source video and audio programs because it allows you to decode just about any file you can come up with and also bundles numerous impressive open source encoders including x264 and FFmpeg's own FFV1. It is also a nest of potential patent litigation so you have to be careful about where your server is located if you choose to distribute it with a program. In most cases it's better to simply link to a website where somebody is providing a download from beyond the reach of the various trade groups.
x264: H.264 is the most important video format in the world for the foreseeable future and besides being free and open source x264 also happens to be the best H.264 encoder in the world. It is a command line tool but there are lots of front ends available and honestly the command line isn't all that intimidating because the built-in presets are probably all you'll ever need. As with FFmpeg though, beware patent traps and the trolls who guard them.
Avidemux: This open source, cross-platform video editor uses FFmpeg libraries to do all kinds of basic editing. It can also be run from the command line, making it potentially a good tool for preparing captured video for upload.
ffdshow: This is a package of DirectShow filters built on open source libraries (predominantly FFmpeg), and it includes a VfW interface for them as well. You may not need it but I've used it for years.
mkvtoolnix: This is the official toolkit for Matroska files. You can put streams in, take streams out, join streams together, and numerous other things.
LAV Filters: Yet another FFmpeg-based toolkit. This one provides individual DirectShow filters implementing FFmpeg features rather than a single package like ffdshow. One of the more useful of these filters is a Matroska splitter. You can't open MKV files using DirectShow without one.
Preferred DirectShow Filter Tweaker: When you have more than one DirectShow filter installed to handle the same type of file each one has a priority. Sometimes you need to change the priorities of one or more filters to make sure DirectShow uses the one you want. This program can do that for you.
Media Player Classic - Home Cinema: This DirectShow based media player can handle just about any file you throw at it. You can also get all of its built-in DirectShow filters as standalone AX files.

[/list]
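
For the x264 entry above, a preset-based command line could be assembled along these lines. This is a sketch under assumptions: the file names are hypothetical, the `build_x264_command` helper is made up, and whether x264 can read a given input (an AVS script, for example) depends on how your build was compiled. The `--preset`, `--crf`, and `--output` options are standard x264 switches.

```python
import shlex


def build_x264_command(source, output, preset="medium", crf=23):
    """Return the argument list for a simple preset-based x264 encode."""
    return [
        "x264",
        "--preset", preset,     # speed versus compression trade-off
        "--crf", str(crf),      # constant quality; lower means better quality
        "--output", output,
        source,
    ]


# Hypothetical file names, purely for illustration.
command = build_x264_command("capture.avs", "capture.mkv", preset="slow", crf=20)
print(shlex.join(command))

# To actually run it (requires x264 to be installed and on your PATH):
# import subprocess
# subprocess.run(command, check=True)
```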

mouser · « **Reply #117 on:** July 20, 2013, 04:51 AM »

Awesome summary, Vurbal

cranioscopical · « **Reply #118 on:** July 20, 2013, 10:40 AM »

Very useful! Saved locally for reference

Edvard · « **Reply #119 on:** July 20, 2013, 02:23 PM »

Wow, finally some PLAIN LANGUAGE about video codecs, containers and mux/demux. This has eluded me for a loooong time.
Thanks Vurbal!

Vurbal · « **Reply #120 on:** July 20, 2013, 03:33 PM »

Glad to help! It took me literally thousands of hours of reading, re-reading, and re-re-reading to understand this stuff so I know how frustrating (and unnecessarily confusing) it can be.

ewemoa · « **Reply #121 on:** July 21, 2013, 09:19 PM »

Making it through took me a while and I feel a bit more comfortable, but I am left wondering what "sample" means.

Any help on that?

Vurbal · « **Reply #122 on:** July 22, 2013, 08:32 AM »

I don't know if this can be simplified enough to make it worth adding to that post but I can explain in basic terms. I'll stick to video sampling for the moment because digital video and digital audio have some fundamental differences.

Sampling is sort of like looking through a screen door. You can't actually see everything on the other side of the door. What you see is everything in between the vertical and horizontal lines. Those spaces are essentially analog samples. Even though your vision is being partially blocked your brain can fill in the details between the samples to produce a complete image.

Digital sampling is a little different. Going back to the screen door, if you increase the size of the spaces you're looking through you also increase the information in each sample. A digital sample, on the other hand, has a fixed amount of information. Each video sample represents one point or a single RGB value. When you reduce the number of spaces in the grid what you are really doing is increasing the space in between samples.

To get away from the screen door analogy entirely, think of a digital image as a grid made up of samples (data points) and gaps (the space between them). More samples means smaller gaps and less information your brain has to fill in. The goal is to have enough samples that you don't notice the gaps.
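
The samples-and-gaps idea can be sketched in Python. This is a simplified one-dimensional illustration, not how any real capture tool works: the `brightness` pattern is hypothetical, and "filling in the gap" is modeled crudely by reusing the nearest sample.

```python
import math


def brightness(x):
    """A hypothetical smooth brightness pattern along a line, x in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x)


def take_samples(signal, count):
    """Take `count` evenly spaced samples of the signal."""
    return [signal(i / (count - 1)) for i in range(count)]


def fill_gap(samples, x):
    """Guess the value at x by reusing the nearest sample."""
    return samples[round(x * (len(samples) - 1))]


def worst_error(signal, count, probes=1000):
    """Largest difference between the real signal and the filled-in guess."""
    samples = take_samples(signal, count)
    return max(abs(signal(p / probes) - fill_gap(samples, p / probes))
               for p in range(probes + 1))


coarse = worst_error(brightness, 8)    # few samples, big gaps
fine = worst_error(brightness, 64)     # many samples, small gaps
print(coarse, fine)
```

With more samples the gaps shrink and the worst-case guess gets closer to the original, which is the "enough samples that you don't notice the gaps" goal.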

wraith808 · « **Reply #123 on:** July 22, 2013, 09:37 AM »

^ That was a very good analogy! Actually, one of the best, if not the best I've seen. Thanks!

ewemoa · « **Reply #124 on:** July 23, 2013, 07:39 PM »

Thanks for the explanation.

Is it far off to say then that for a given source (image, video, audio clip), sampling is a process of creating a set of components (samples) which can later be assembled to create something akin to the original source?