One of the things we do at work is streaming video; in fact it’s a pretty big part of our business. Streaming video presents a number of problems, and monitoring is one of the biggest: people get pretty unhappy when their TV show cuts out, or they miss the winning goal in a soccer match because of some silly issue.
What makes this challenging is that it’s not all that easy to solve reliability issues with just redundancy, and there are a lot of things that can go wrong. So along with checking every component, we need to verify the end result and check every channel to make sure the audio quality is reasonable, there are no MPEG artifacts (which can be caused by anything from a faulty encoder to a lightning storm), and that there’s actually a picture. How does that happen? People. People painstakingly check every channel every hour of the day and make sure viewers are seeing what they want to see.
The problem is that a lot of quality issues are subjective, and people make mistakes, so my job is to automate quality control and measurement and generate alerts when things go wrong. But how on earth is this possible for streaming video?
The first thing is to look at what video is: a stream of data, or at a lower level a multi-dimensional matrix of pixels streaming through time. Of course analysing this is a huge job for a computer, especially with a few hundred channels and endpoints, so you need some dimensionality reduction.
If you’ve ever used photo editing software you’ve probably seen something called the histogram. What it represents is the frequency of colour intensities in the image, which is useful for detecting when videos plummet into darkness.
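As a rough illustration of what that histogram is, the sketch below counts how many pixels fall on each of the 256 intensity levels of an 8-bit greyscale frame. It uses NumPy instead of the OpenCV call that appears later in this post, and the function name `luma_histogram` and the tiny synthetic frames are made up for the example.

```python
import numpy as np

def luma_histogram(gray):
    """Count pixels at each of the 256 intensity levels of an 8-bit greyscale frame."""
    # bincount wants a flat integer array; minlength pads unused bright levels with 0
    return np.bincount(gray.ravel(), minlength=256)

# Synthetic frames (assumed data): one nearly black, one with a spread of tones
dark = np.full((4, 4), 3, dtype=np.uint8)
mixed = np.arange(16, dtype=np.uint8).reshape(4, 4) * 16

dark_hist = luma_histogram(dark)    # all 16 pixels pile up in bin 3
mixed_hist = luma_histogram(mixed)  # one pixel in each of 16 different bins
```

A frame that has plummeted into darkness looks like `dark_hist`: nearly everything stacked in the first few bins.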
Taking a TV show, smoothing out the histogram for each frame and plotting the result as a surface shows what’s going on. It turns out like this.
Pretty, but still too many dimensions to monitor in real time (fortunately I have a monstrous workstation to generate these graphs). Source here.
Obviously we don’t want the histogram leaning heavily to the right or left, but more importantly we expect it to jump all over the place over time, i.e. to have a high variance. So the first thing we do is reduce the dimension: we take the total pixel count and calculate the fraction of pixels at each luminosity value. Then we multiply each luminosity value by its fraction and sum the products, which leaves a single value for how light or dark each frame is. This gives us a value from 0 to 255 that indicates where the histogram is centred (its weighted mean rather than its literal peak), which ideally is not near the extremes if there’s a visible picture and it’s not roasting people’s eyeballs.
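To make that reduction concrete, here is a minimal sketch of collapsing a 256-bin histogram into one brightness number: each intensity level is weighted by the share of pixels that have it, and the products are summed. The helper name `histogram_mean` and the two toy histograms are assumptions for illustration only.

```python
import numpy as np

def histogram_mean(hist):
    """Weighted mean intensity (0-255) of a 256-bin histogram."""
    hist = np.asarray(hist, dtype=np.float64)
    total = hist.sum()
    if total == 0:
        return 0.0  # empty histogram: nothing to average
    share = hist / total            # fraction of pixels at each level
    levels = np.arange(len(hist))   # 0, 1, ..., 255
    return float(np.dot(levels, share))

# Toy histograms (assumed data)
black = np.zeros(256)
black[0] = 100        # every pixel at level 0
even = np.ones(256)   # pixels spread evenly over all levels

black_value = histogram_mean(black)  # sits at the dark extreme
even_value = histogram_mean(even)    # sits in the middle of the range
```

A value hugging 0 or 255 suggests a black or blown-out frame, while anything in between only says a picture might be there, which is why the variance over time matters more.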
You can clearly see the value drop off when the credits roll, but there’s still variance which gives us a good idea that something is going on. Snow, test patterns or black screens will have a very low variance.
Tracking the variance in the histogram gives us an idea that there’s something resembling a video, but another good indicator is some simple motion detection. We can get a matrix of changed pixels by calculating the difference between frames.
```python
def frameDiff(t0, t1, t2):
    d1 = cv2.absdiff(t2, t1)
    d2 = cv2.absdiff(t1, t0)
    return cv2.bitwise_and(d1, d2)
```
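To see what this three-frame difference does, the sketch below reproduces it with plain NumPy on synthetic frames, so it runs without OpenCV. `frame_diff_np` and `make_frame` are hypothetical names for this example; it is a stand-in for the function above, not a replacement for it.

```python
import numpy as np

def frame_diff_np(t0, t1, t2):
    """NumPy stand-in for frameDiff: keep changes seen in both consecutive differences."""
    # Widen to int16 first so subtracting uint8 frames cannot wrap around
    d1 = np.abs(t2.astype(np.int16) - t1.astype(np.int16)).astype(np.uint8)
    d2 = np.abs(t1.astype(np.int16) - t0.astype(np.int16)).astype(np.uint8)
    return np.bitwise_and(d1, d2)

def make_frame(col):
    """Black 4x6 frame with a single white vertical bar in the given column."""
    frame = np.zeros((4, 6), dtype=np.uint8)
    frame[:, col] = 255
    return frame

still = make_frame(2)
static_motion = frame_diff_np(still, still, still)                           # frozen picture
moving_motion = frame_diff_np(make_frame(0), make_frame(1), make_frame(2))  # bar sliding right
```

With a frozen picture every pixel of the result is zero. With the sliding bar, only the column that changed in both steps survives the AND, which filters out changes that appear in just one of the two differences.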
Sending it all to Riemann
Using Python and OpenCV, we end up with the following functions, which take the 95th percentile of motion changes as a percentage of the visible area and return the mean and variance of the histogram value.
```python
import cv2
import numpy


def frameDiff(t0, t1, t2):
    # Changes that show up in both consecutive frame differences
    d1 = cv2.absdiff(t2, t1)
    d2 = cv2.absdiff(t1, t0)
    return cv2.bitwise_and(d1, d2)


def processStream(chan, fname):
    cap = cv2.VideoCapture(fname)
    bright = []       # per-frame brightness values
    frameBuffer = []  # last few greyscale frames for motion detection
    tdiff = []        # per-triplet motion values

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frameBuffer.append(gray)

        # 256-bin luminosity histogram, scaled to the 0-255 range
        r = cv2.calcHist([gray], [0], None, [256], [0, 256])
        cv2.normalize(r, r, 0, 255, cv2.NORM_MINMAX)

        # Calculate value: mean luminosity weighted by each bin's share
        hist = r.ravel()
        ev = float(numpy.dot(numpy.arange(256), hist) / hist.sum())
        bright.append(ev)

        # Perform motion detection on every group of three frames
        if len(frameBuffer) == 3:
            diff = frameDiff(*frameBuffer)
            # numpy.mean accumulates in floating point, so the uint8 pixel
            # values cannot overflow the way they can with the built-in sum()
            avm = float(numpy.mean(diff)) * 100
            tdiff.append(avm)
            frameBuffer = []

    cap.release()

    if bright:
        avsq = float(numpy.mean(bright))
        sqvar = float(numpy.var(bright))
        # Fewer than three frames means there are no motion samples yet
        vmotion = float(numpy.percentile(tdiff, 95)) if tdiff else 0.0
    else:
        avsq, sqvar, vmotion = 0, 0, 0

    return avsq, sqvar, vmotion
```
Unfortunately I can’t show you my Tensor plugin for this because there’s a bunch of secret API and stream source logic wrapped up in it. But let’s pretend a video stream comes from somewhere in some format without any DRM, gets buffered in small chunks in a big ramdisk every few seconds, and then gets sent to the functions above and ends up in Riemann and consequently InfluxDB.
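This is not the Tensor plugin, but as a hedged sketch of the last hop, the three numbers for a chunk could be wrapped up as Riemann-style events like this. The field names (`host`, `service`, `metric`, `time`, `ttl`) are standard Riemann event fields; the function name `build_events`, the service names, the channel name and the TTL are all invented for illustration, and actually sending the events is left to whatever client you use.

```python
import time

def build_events(channel, avsq, sqvar, vmotion, ttl=30.0):
    """Turn one chunk's measurements into a list of Riemann-style event dicts."""
    now = int(time.time())
    # Hypothetical service names, one per measurement
    metrics = {
        "brightness.mean": avsq,
        "brightness.variance": sqvar,
        "motion.p95": vmotion,
    }
    return [
        {
            "host": channel,              # here the channel stands in for the host
            "service": "video." + name,
            "metric": float(value),
            "time": now,
            "ttl": ttl,                   # let the event expire if chunks stop arriving
        }
        for name, value in metrics.items()
    ]

# Made-up measurements for a made-up channel
events = build_events("channel-1", 112.4, 310.2, 4.7)
```

Keeping one event per measurement means each can be graphed and alerted on separately downstream.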
So what happens when things go wrong? Well, at least one of these tends to flat line. The variance of the histogram value goes to zero when there’s no more change in the histogram of the frames, which generally means you’re locked on a static image or stuttering between frames. This is more useful than the histogram value itself, because beyond a fairly small threshold no particular value indicates an issue; otherwise you’d trigger an alert every time credits roll, or miss something like a test pattern coming through.
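A minimal sketch of that flat-line check, assuming a window of per-frame brightness values is already available. The function name `flatlined`, the threshold of 1.0 and the sample numbers are placeholders, not values from the real system.

```python
from statistics import pvariance

def flatlined(brightness, threshold=1.0):
    """True when per-frame brightness barely changes across the window."""
    if len(brightness) < 2:
        return False  # not enough samples to judge
    return pvariance(brightness) < threshold

# Made-up windows of brightness values
credits = [22.0, 30.5, 18.2, 41.0, 25.3]  # dark, but still changing
frozen = [97.0, 97.0, 97.0, 97.0, 97.0]   # stuck on a single frame

credits_alert = flatlined(credits)
frozen_alert = flatlined(frozen)
```

The dark credits window does not trip the check because its values still move around, while the frozen window does even though its brightness looks perfectly healthy.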