500 likes | 569 Views
LSU/SLIS. Multimedia. Session 4 LIS 7008 Information Technologies. Agenda. HW2 Quiz Images Video Audio Streaming SMILe. HW2. Good work Your webpage is well rendered in a browser. Should use paragraphs Findings based on each approach are addressed.
E N D
LSU/SLIS Multimedia Session 4 LIS 7008 Information Technologies
Agenda • HW2 • Quiz • Images • Video • Audio • Streaming • SMILe
HW2 • Good work • Your webpage is well rendered in a browser. • Should use paragraphs • Findings based on each approach are addressed. • A correct (or partially correct) conclusion is drawn by pulling together all the findings. An attempt to draw a conclusion of who owns the website is obviously present. • Some good Examples (although not perfect): • Distributed on Moodle
Quiz • Open book (text, slides, Internet) • Posted on Moodle • Scope: covers Session 1-3 • Readings, slides, homework • Purposes • Measure your learning progress • Preview what the midterm exam will be like • Time: 20 minutes from opening to closing • Due: • See instructions on Moodle • Read the instructions BEFORE opening the quiz!!! • You can download the quiz any time, but do not open it until you are ready to take it.
The Gullibility of Human Senses • Three simple tricks for producing • Images • Video • Audio • But how do you move the bits around fast enough? • Remove redundancy – apply compression! • Throw away stuff that doesn’t matter (b/c eyes cannot see) • Synchronizing different media to create multimedia
Specify color RGB (Red, Green, Blue) are primary colors, which can be combined to produce secondary colors. RGB are used by computer monitors/screens. MCY (Magenta, Cyan, Yellow) are subtractive colors. MCY + K (black) are used by colorful printers. • Additive colors and subtractive colors. • Primary colors: RGB: produce secondary colors • http://en.wikipedia.org/wiki/Primary_color • Red+Green+Blue = White • Subtractive colors: MCY: absorb colors. • http://en.wikipedia.org/wiki/Subtractive_color • Magenta+Cyan+Yellow = Black • Guess the size of that picture on the previous slide? • Typical projector/monitor: 1024x768 = 786,432 pixels. • Horizontal dimension: 1024 pixels, vertical dimension: 768 pixels
Basic Image Coding • An image = Collection of picture elements (pixels) • Each pixel has a “color” • Black/white image: each pixel has 1 color: use 1 bit (either 1 or 0) to code a color • Grayscale image: each pixel has 1 color: use 8 bits to code a color • Colorful image: each pixel has 3 colors - RGB, each color is coded with 8 bits (or 1 byte), so each pixel has 24 bits (or 3 bytes, 1 byte=8 bits) • So, 3 bytes per pixel for colorful images • Screen • Typical projector resolution: 1024x768 pixels • A 1024x768 image requires 2.4 MB (=1024x768x3) • So a picture is worth 400,000 words (1 word = 6 bytes)! • Compression: do not use 3 bytes/pixel • remove some unimportant colors that human eyes cannot see
Nothing new (color the pixels) Georges Seurat, A Sunday Afternoon on the Island of La Grande Jatte
Visual Perception • Closely spaced dots appear solid • But irregularities in diagonal lines can stand out • Any color can be produced from just three • Red, Blue and Green: “additive” primary colors • High frame rates produce apparent motion • Smooth motion picture movie requires about 24 frames/second • Visual acuity varies markedly across features • Discontinuities of features easily seen, absolute difference is less crucial
Monitor Characteristics • Technology • Cathode-Ray Tube (CRT): RGB 3 guns • Flat panel (LCD liquid crystal) • Size (15, 17, 19, 21 inch) • Measured diagonally • For CRT, key figure is “viewable area” • Resolution • 640x480 (VGR video card), 800x600 (old laptop), 1024x768 (popular now or near future), 1280x1024… • Layout (three dot pixel, by lines) • Dot pitch (0.26mm, 0.28mm) • Distance between pixels on a monitor screen • Color Refresh rate (60, 72, 80 Hz) • light flick rate: 60 Hz
Some Questions • How many images can a 1 GB flash card store? • But mine holds about 500 images. How? • How long will it take to send an image at 128KB/s? • But my image-intensive Web page loads faster than that. How? You should be able to answer these questions by the end of this session; otherwise, come to Moodle to discuss.
Compression • Goal: Send the same information using fewer bits • Technology originally developed for fax transmission • Send high quality documents in short calls • Two types of compression: • Lossless compression: can reconstruct the original image exactly after decompression • File size reduced, but no information is lost • Lossy: can’t reconstruct the original image after compression, but the compressed image looks the same as the original • Two compression strategies: • Reduce redundancy • Throw away stuff that doesn’t matter (because human eyes/ears cannot see/hear) • Whether to compress? Depends on: • Computer speed • Network transmission speed
Palette Selection • Opportunity: • No picture uses all 16 million colors • Human eye does not see small differences between colors • Approach: • Select a palette of 256 colors • Indicate which palette entry to use for each pixel • Look up each color in the palette “The rain in Spain falls mainly in the plain”→ [*=ain,^=in] “The r* ^ Sp* falls m*ly ^ the pl*” … 1 pixel = 3 colors …
Compression Using Run-Length Encoding (RLE) • Pixels are organized into lines • Opportunity: • Large regions of a single color are common • Most pixels are the same as the one before • As you can see from the lighthouse image • Approach: • Record # of consecutive pixels for each color • An example of lossless encoding Sheep go baaaaaaaaaa and cows go moooooooooo → Sheep go ba<10> and cows go mo<10>
Graphic Interchange Format (GIF) • Do palette selection first , then do lossless compression • Opportunity: • Common colors are sent more often • Approach: Huffman Encoding • Use fewer bits to represent common colors • Encoding Color % color in image #Bitsvs.#Bits if using regular 2 bits/color • 1 Blue 75% 75x1= 75 vs. 75x2=150 • 01 White 20% 20x2= 40 vs. 20x2= 40 • 001 Red 5% 5x3= 15 vs. 5x2= 10 Total: 130 bits vs. 200 bits What is 10100101? Can you interpret the colors? If you have no idea, come to Moodle to discuss this. PNG (Portable Network Graphics): replacement for GIF (PNG has no patent restrictions, GIF is owned by Compuserv.)
Joint Photographic Experts Group (JPEG) • Opportunity: • Eye sees sharp lines better than subtle shading • Eye more sensitive to small changes in brightness than in color • Approach: • Retain detail only for the most important parts (by human eyes) • Approximate changes in image with mathematical curves: accomplished with Discrete Cosine Transform • Allows user-selectable fidelity (allow users to select compression rate) • Efficiently captures smooth transitions and shading • Not as good at capturing sharp edges • Results: • Typical compression rate is 20:1
Variable Compression Rate in JPEG 37 KB (20% rate) 4 KB (95% rate)
Vector Graphics Line drawing using math functions. Re-scalable without loss of resolution
Raseter vs. Vector Graphics • Raster images (“bitmap graphics”) • Actually describe the contents of the image • Good for natural scenes • Vector images • Mathematically describe how to draw the image • Rescalable without loss of resolution
Discussion Point:Selecting an Image Format • Should I use GIF, JPEG, or vector graphics for … • Color photos? • Scanned black & white text? (Transcript, itinerary) • Line drawings? These are important practical questions to archivists and digital librarians. Please come to Moodle to discuss this.
Hands-On Exercise: Convert Between Formats • Download and save two images • http://www.csc.lsu.edu/~wuyj/Teaching/7008/fa14/Images/image1.jpg • http://www.csc.lsu.edu/~wuyj/Teaching/7008/fa14/Images/image2.gif • Use Microsoft Paint (on Windows: All Programs AccessoriesPaint) to convert each to the other format, and compare quality and the file size • Observe the difference • Why the difference?
Basic Video Coding • Display a sequence of images • Fast enough for smooth motion and no flicker • Motion picture film: smooth show at 24 pictures/second • NTSC Video • National Television System Committee (analog TV system) • 60 “interlaced” half-frames/second, 512x486 pixel images • HDTV • 30 “progressive” full-frames/second, 1280x720 pixel images
Video Data Rates • “NTSC” Quality Computer Display • 640 x 480 pixel image • 3 bytes per pixel (red, green, blue) • 30 Frames per second • Bandwidth requirement • 26.4 MB/second • That exceeds the bandwidth of most disk drives! • Storage • CD-ROM would hold 25 seconds worth of NTSC video • What is the capacity of CD-ROM? 650-900MB • 30 minutes would require 46.3 GB • About 100GB/hr: too big! • Compression! Multimedia is big! Compress harder!
Video Compression • Opportunity: • One frame looks very much like the next • Approach: • Record only the pixels that change (trace the difference) • Standards: • MPEG-1: for Web video (download then play) • MPEG-2: for HDTV and DVD (commercial quality) • MPEG-4: for Web video (streaming) • Next?
FrameTypes: MPEG Encoding • • • • • • I1 B1 B2 B3 P1 B4 B5 B6 P2 B7 B8 B9 I2 I Intra (JPEG) Encode complete image, similar to JPEG P Forward Predicted Motion relative to previous I and P’s B Backward Predicted Motion relative to previous & future I’s & P’s
MPEG1 Frame Reconstruction I1 I1+P1 I1+P1+P2 I2 • • • • • • updates I frames provide complete image P frames provide series of updates to most recent I frame P1 P2 What if drop an I frame? Bad!
Frame Reconstruction I1 I1+P1 I1+P1+P2 I2 • • • • • • Interpolations B frames interpolate between frames represented by I’s & P’s B1 B2 B3 B4 B5 B6 B7 B8 B9
Sampler Basic Audio Encoding (Digitizing) • Sample at twice the highest frequency (22KHz) • 8 or 16 bits per sample, sample rate: X samples/second • Speech (0-4 kHz) requires 8 kB/s • Standard telephone channel (1-byte samples) • Music (0-22 kHz) requires 172 kB/s (uncompressed) • Standard for CD-quality audio (2-byte samples) • Pitch range: • http://www.youtube.com/watch?v=zESbrwRvMyM • Caution! Extremely high pitch: the following can hurt your ears! Stop playing when feel uncomfortable! http://www.youtube.com/watch?v=BX7Ar3Z-oTo
Music Compression • Opportunity: • The human ear cannot hear all frequencies at once • Approach: • Don’t represent “masked” frequencies • Standard: MPEG-1 Layer 3 (.mp3) Loudness: http://www.tlc-direct.co.uk/Technical/Sounds/Decibles.htm loudness frequency
Temporal Masking If we hear a loud sound, then it stops, it takes a while until we can hear a soft tone at about the same frequency. “Psychoacoustic compression” • Eliminate sounds below threshold of hearing • Eliminate sounds that are frequency masked • Eliminate sounds that are temporally masked • Eliminate stereo information for low frequencies
Compact Disk (CD) Recording • Parameters • 44,100 samples per second • Sufficient for frequency response of 22KHz • Each sample takes 16 bits • 48 dB (decibel) range • Two independent channels: stereo sound • Dolby surround-sound uses tricks to pack 5 sound channels + subwoofer effects • Bit Rate • 44.1K samples/sec x 2 channels x 2 bytes/sample = 172 KB/sec • Typical Capacity • 74 Minutes maximum playing time • 747 MB total
Speech Compression • Opportunity: • Human voices vary in predictable ways • Approach: • Predict what’s next, then send only any corrections/changes • Standards: • Real audio can code speech in 6.5 kb/sec • Demo at http://www.data-compression.com/speech.html • Scroll down to near the bottom: “VII. Demonstration” • Listen to the original and LPC10U (2400bps) to understand speech effect with different compression rate.
Narrated PowerPoint • Create your slides using PowerPoint • Slide Show Record Narration • Set microphone level • Record the narration • Slide transitions are automatically captured • Narration plays automatically when displayed • Synchronized between slide flipping and narration
Adding Video to PowerPoint • InsertMovies and Sounds • Movies from file (a .mpg file) • Decide whether you want “autostart” • If not, it starts when you click on it
The “Last Mile”: bandwidth to your desk • Traditional modems • “56” kb/sec modems really move data at ~3 kB/sec • Maximumly 56 kb/s theoretically • Digital Subscriber Lines (DSL) • 384 kb/sec downloads (~38 kB/sec) • 128 kb/sec uploads (~12 kB/sec) • Cable modems • 10 Mb/sec downloads (~1 MB/sec) • 256 kb/sec uploads (~25kB/sec)
Multimedia on a Web Server Web Browser Web Server • Object stored in a file • File transferred as an HTTP object: • Received entirely at the client • Passed to media player for play • This seems stupid because downloading is slow Media Player
Streaming Web Browser Web Server • Browser gets a portion of media file over HTTP • Launches media player to interpret that media file • Media player contacts streaming server Media Player Streaming Server Can be downloaded and installed buffering
Streaming Audio and Video • Begin to play after only a portion received • Buffer provides time to recover lost packets • Interrupts replay when “rebuffering” • Data not saved to hard drive. Media Sever Buffer Internet
Lost Packets (IP Phone) • Network loss • Packets completely lost (e.g., due to collisions) • Delay loss • Packets arrives too late for playout • Due to: queueing; sender and receiver processing delays • IP Phone: Typical maximum tolerable delay: 400 ms • Loss tolerance • 1% to 10% packet loss may be tolerable • Some encoding schemes are more tolerant than others
1.5 Mbps encoding 28.8 Kbps encoding Multiple Client Rates Q:how to handle different client receiving rate capabilities? • 28.8 Kbps dialup • 128 Kbps to 3 Mbps (residential DSL service) • 100Mbps Ethernet A:server stores, transmits multiple copies of video, encoded at different rates, for different users
Synchronizing Multiple Media • Scripting Languages for synchronizing multiple media: • Synchronized Multimedia Integration Language (SMIL) • Custom applications for this: • Macromedia Flash • Content representation standards for this: • MPEG 4
SMIL • Synchronized Multimedia Integration Language • Integration of multimedia with text, audio, video • Supported in RealPlayer Slide from http://www.umiacs.umd.edu/~jimmylin/LBSC690-2007-Spring/content.html (Session 5)
SMILe • Follows W3C standard • Player-specific extensions are common • Real Player implements SMIL (or SMILe) • It is XML, with a structure similar to HTML <smil> <head> … </head> <body> … </body> </smil>
Elements in SMIL • Window controls (in <head>) • Controlling layout: <region>, <root-layout> • Timeline controls (in <body>) • Sequence control: <seq>, <excl>, <par> • Timing control: <begin>, <end>, <dur> • Content types (in <body>) • <audio>, <video>, <img>, <ref>
SMIL Examples • Implemented in RealOne Player • You need to install RealOne Player (or Real Player) to run the following examples • Demo: http://www.csc.lsu.edu/~wuyj/Teaching/7008/fa14/SMIL-demo/index.htm There are 3 sets of executable and text files, at least run/read the last set: • First, run the smildemo.smil (executable) • Then, view smildemo.smil (xml) file • Question: can you make sense of smildemo.smil? • You are welcome to play with the first 2 sets.
SMIL Example <smil> <head> <meta name="title" content="Online Teaching Services promo" /> <meta name="author" content="Jay Moonah, CAT" /> <layout type="text/smil-basic-layout"> <root-layout width="280" height="316" background-color="white"/> <region id="AnimChannel1" title="AnimChannel1" left="0" top="0" height="265" width="280" fit="hidden"/> </layout> </head> <body> <par title="Online Teaching Services promo" author="Jay Moonah, CAT" > <audio src="final.rm" id="Soundtrack" title="Soundtrack"/> <animation src="otscompfin.swf" id="Animation" region="AnimChannel1" title="Animation" fill="freeze"/> <text src="cc.rt" id="caption" region="cc" title="cc" fill="freeze"/> </par> </body> </smil> Synchronizing audio, animation, and text. Slide from http://www.umiacs.umd.edu/~jimmylin/LBSC690-2007-Spring/content.html (Session 5)
From Media to Multimedia… • Tricking the human senses: • Blending pixels into a seamless image • Rapidly cycling through images to create motion (video) • Sampling analog waveforms to create digital recordings • Lots of information required to encode images, movies, and sounds • Result: you get a bulky digital file, not handy for online distribution and access. • The Key is compression! • Synchronization of different media sources leads to multimedia applications
Discussion Point: When is Lossless Compression Important? • For images? • For text? • For sound? • For video? Please Come to Moodle to discuss.