www.calpoly.edu/~acadprog/2000pdf/c_arch.pdf
graphics 2D primitive support (bit block transfers)
Some might have video support
And of course 3D support (a topic at the heart of this presentation)
GPUs are optimized for raster graphics

04/14/05 Ajit Datar, Apurva Padhye Computer Architecture

The Graphics pipeline
Modern graphics pipeline (left) (ref: http://graphics.stanford.edu/courses/cs448a-01-fall/lectures/lecture2/walk010.html)
OpenGL 3D pipeline (right) (ref: http://www.vorlesungen.uos.de/informatik/ifc99-00/opengl/images/pipeline.gif)

3D graphics software interfaces
Low level
Specification not an API
Cross-platform implementations
Popular with some games
A simple sequence of OpenGL calls (in C):
glClearColor(0.0, 0.0, 0.0, 0.0);        /* background color: black */
glClear(GL_COLOR_BUFFER_BIT);            /* clear the color buffer */
glColor3f(1.0, 1.0, 1.0);                /* draw in white */
glOrtho(0.0, 1.0, 0.0, 1.0, -1.0, 1.0);  /* orthographic projection */
glBegin(GL_POLYGON);                     /* one square */
    glVertex3f(0.25, 0.25, 0.0);
    glVertex3f(0.75, 0.25, 0.0);
    glVertex3f(0.75, 0.75, 0.0);
    glVertex3f(0.25, 0.75, 0.0);
glEnd();
OpenGL (v2.0 as of now)

3D graphics software interfaces
High level
3D API part of DirectX
Very popular in the gaming industry
Microsoft platforms only
Direct3D (v9.0c as of now)

NVIDIA GeForce 6800: General info
Impressive performance stats
600 million vertices/s
6.4 billion texels/s
12.8 billion pixels/s (rendering z/stencil only)
64 pixels per clock cycle early z-cull (reject rate)
Riva series (1st DirectX compatible): Riva 128, Riva TNT, Riva TNT2
GeForce series: GeForce 256, GeForce 3 (DirectX 8), GeForce FX, GeForce 6 series

NVIDIA GeForce 6800: Block Diagram

NVIDIA GeForce 6800: Vertex Processor (or vertex shader)
Allow shader to be applied to each vertex
Transformation and other per-vertex ops
Allow vertex shader to fetch texture data (6 series only)

NVIDIA GeForce 6800: Clipping, Z Culling and Rasterization
Cull/clip: per-primitive operation and data preparation for rasterization
Rasterization: primitive to pixel mapping
Z culling: quick pixel elimination based on depth

NVIDIA GeForce 6800: Fragment processor and Texel pipeline
Fragment: a candidate pixel
Varying number of pixel pipelines
Operates on quads for texture LOD
SIMD processing hides texture fetch latency
Texture caches
Texture unit can apply filters.
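The "operates on quads for texture LOD" point can be made concrete with a small sketch. Processing fragments as 2x2 quads gives each fragment a horizontal and a vertical neighbour, so the rate of change of the texture coordinates can be estimated by simple differences and turned into a mipmap level. The code below is only an illustration of that idea: the names `TexCoord` and `quad_mip_level`, the square-texture assumption, and the choice of picking a whole mip level (no trilinear blend) are assumptions, not a description of how the 6800 hardware does it.

```c
#include <assert.h>

/* Texture coordinates (in [0,1]) of the four fragments of a 2x2 quad:
   index 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right. */
typedef struct { float u, v; } TexCoord;

/* Pick a mipmap level for the whole quad.
   tex_size   : width/height of the base texture in texels (square assumed)
   num_levels : number of mip levels available (level 0 = full resolution)
   Returns the first level at which one pixel step covers at most one texel. */
int quad_mip_level(const TexCoord q[4], int tex_size, int num_levels)
{
    assert(tex_size > 0 && num_levels > 0);

    /* Finite differences toward the right and lower neighbour, in texels. */
    float dudx = (q[1].u - q[0].u) * (float)tex_size;
    float dvdx = (q[1].v - q[0].v) * (float)tex_size;
    float dudy = (q[2].u - q[0].u) * (float)tex_size;
    float dvdy = (q[2].v - q[0].v) * (float)tex_size;

    /* Squared length of the larger footprint axis. */
    float len2_x = dudx * dudx + dvdx * dvdx;
    float len2_y = dudy * dudy + dvdy * dvdy;
    float rho2 = len2_x > len2_y ? len2_x : len2_y;

    /* Each mip level halves the resolution, so the squared footprint
       shrinks by a factor of 4 per level. */
    int level = 0;
    while (level < num_levels - 1 && rho2 > 1.0f) {
        rho2 *= 0.25f;
        level++;
    }
    return level;
}
```

For example, a quad whose texture coordinates advance by 4 texels per pixel on a 256-texel texture selects level 2, where the texture is 64 texels wide and one pixel step covers one texel.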
NVIDIA GeForce 6800: Fragment processor and Texel pipeline
Shader units can perform 8 math ops (without a texture load) or 4 math ops (with a texture load) per clock
Fog calculation is done at the end
Pixels almost ready for framebuffer

NVIDIA GeForce 6800: Z compare and blend
Depth testing
Stencil tests
Alpha operations
Render final color to target buffer

NVIDIA GeForce 6800: Features, Geometry Instancing
Vertex stream frequency: hardware support for looping over a subset of vertices
Example: rendering the same object multiple times at different locations (grass, soldiers, people in a stadium)

NVIDIA GeForce 6800: Features, continued
Early culling and clipping: cull nonvisible primitives at a high rate
Rasterization: supports point sprites, aliased and anti-aliased rendering, triangles, etc.
Z-Cull: allows high-speed removal of hidden surfaces
Occlusion Query: keeps a record of the number of fragments passing or failing the depth test and reports it to the CPU

NVIDIA GeForce 6800: Features, continued
Texturing: extended support for non-power-of-two textures to match support for power-of-two textures (mipmapping, wrapping and clamping, cube map and 3D textures)
Shadow Buffer Support: fetches the shadow buffer as a projective texture and performs z-compares of the shadow buffer data to the distance from the light

NVIDIA GeForce 6800
Increased instruction count (up to 65535 instructions)
Fragment processor: multiple render targets
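The depth testing and occlusion query entries above describe the same basic mechanism: each incoming fragment is compared against the stored depth for its pixel, and the hardware can count how many fragments pass. A minimal software sketch of that mechanism follows. The tiny 4x4 buffer, the `Framebuffer` type, the function names, and the "smaller z is closer" convention are all assumptions made for illustration; real hardware adds stencil, alpha blending, and the early z-cull described earlier.

```c
#define FB_W 4
#define FB_H 4

typedef struct {
    float    depth[FB_W * FB_H];  /* depth buffer: smaller value = closer */
    unsigned color[FB_W * FB_H];  /* color buffer */
    unsigned passed;              /* occlusion-query style pass counter */
} Framebuffer;

/* Reset every pixel to the far plane and zero the pass counter. */
void fb_clear(Framebuffer *fb, unsigned clear_color)
{
    for (int i = 0; i < FB_W * FB_H; i++) {
        fb->depth[i] = 1.0f;
        fb->color[i] = clear_color;
    }
    fb->passed = 0;
}

/* Depth-test one fragment; write it only if it is closer than what is
   already stored. Returns 1 if the fragment passed, 0 if it was rejected. */
int fb_write_fragment(Framebuffer *fb, int x, int y, float z, unsigned color)
{
    if (x < 0 || x >= FB_W || y < 0 || y >= FB_H)
        return 0;                 /* outside the buffer: discarded */

    int i = y * FB_W + x;
    if (z >= fb->depth[i])
        return 0;                 /* hidden behind an earlier fragment */

    fb->depth[i] = z;
    fb->color[i] = color;
    fb->passed++;                 /* what an occlusion query would report */
    return 1;
}
```

After drawing an object's bounding box this way, a `passed` count of zero would tell the CPU that the object is fully hidden and need not be drawn in detail, which is the usual use of an occlusion query.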
Features: Shader Support
Dynamic flow control (branching)
Vertex texturing
More temporary registers

NVIDIA GeForce 6800: Features, Co-issue and Dual Issue
Co-issue: each four-component-wide vector unit is capable of executing two independent instructions in parallel, so more scalar computations are done in less time
Dual issue: two independent instructions can be executed on different parts of the shader pipeline, which makes scheduling easier and more efficient

GPGPU
Look at the GPU as a fast SIMD processor
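The co-issue entry above can be pictured as one four-wide vector unit whose components are divided between two different instructions in the same cycle, for example a three-component colour operation alongside a scalar operation on the fourth component. The sketch below models that in plain C. It is a conceptual model only: `Vec4`, `ScalarOp`, `co_issue`, and the `split` parameter are hypothetical names, and the slide does not say which component splits the hardware supports.

```c
typedef struct { float c[4]; } Vec4;          /* components x, y, z, w */
typedef float (*ScalarOp)(float a, float b);  /* one per-component operation */

float op_mul(float a, float b) { return a * b; }
float op_add(float a, float b) { return a + b; }

/* Model of one co-issued cycle: op_lo is applied to the first `split`
   components and op_hi to the remaining ones, so two independent
   instructions share a single pass through the four-wide unit. */
Vec4 co_issue(Vec4 a, Vec4 b, int split, ScalarOp op_lo, ScalarOp op_hi)
{
    Vec4 r;
    for (int i = 0; i < 4; i++) {
        ScalarOp op = (i < split) ? op_lo : op_hi;
        r.c[i] = op(a.c[i], b.c[i]);
    }
    return r;
}
```

Calling `co_issue(a, b, 3, op_mul, op_add)` multiplies x, y, z and adds w in one modelled cycle. Without co-issue the same work would occupy the vector unit twice, which is why the slide says more scalar computations get done in less time.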
It is a specialized processor, so not all programs can be run
Example computational programs: FFT, cryptography, ray tracing, segmentation and even sound processing!

GPU from comp arch perspective: Processing units
Focus on floating point math
fp32 and fp16 precision support for intermediate calculations
6 four-wide fp32 vector MADs/clock in the vertex shaders, plus 1 scalar multifunction op
16 four-wide fp32 vector MADs/clock in the fragment processor, plus 16 four-wide fp32 MULs
Dedicated fp16 normalization hardware

GPU from comp arch perspective
Use dedicated but standard memory architectures (e.g. DRAM)
Multiple small independent memory partitions for improved latency
Memory used to store buffers and optionally textures
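The per-clock MAD and MUL counts on the Processing units slide can be turned into a rough peak floating-point rate. The sketch below does that arithmetic, counting a multiply-add as two floating-point operations per component and a multiply as one. The scalar multifunction op is left out, and the clock frequency is a parameter supplied by the caller because the slides do not state one; treat any number passed in (such as the 400 MHz used in the note below) as an assumption.

```c
/* Per-clock unit counts taken from the slide. */
enum {
    VECTOR_WIDTH  = 4,   /* components per vector operation */
    VERTEX_MADS   = 6,   /* four-wide MADs per clock, vertex shaders */
    FRAGMENT_MADS = 16,  /* four-wide MADs per clock, fragment processor */
    FRAGMENT_MULS = 16   /* four-wide MULs per clock, fragment processor */
};

/* A MAD is one multiply plus one add per component: 2 flops. */
int vertex_flops_per_clock(void)
{
    return VERTEX_MADS * VECTOR_WIDTH * 2;
}

/* Fragment side: MADs count 2 flops per component, MULs count 1. */
int fragment_flops_per_clock(void)
{
    return FRAGMENT_MADS * VECTOR_WIDTH * 2 + FRAGMENT_MULS * VECTOR_WIDTH;
}

/* Peak rate in GFLOPS for a given clock in MHz (caller's assumption). */
double peak_gflops(int flops_per_clock, double clock_mhz)
{
    return flops_per_clock * clock_mhz / 1000.0;
}
```

This gives 48 flops per clock on the vertex side and 192 on the fragment side. At an assumed 400 MHz that would be about 19.2 and 76.8 GFLOPS respectively, as a theoretical peak that real shaders will not sustain.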
Memory
In low-end systems (Intel 855GM), system memory is shared as the graphics memory

GPU from comp arch perspective: System Interface
GPU interfaces with the CPU using fast buses like AGP and PCI Express
Port speeds:
PCI Express: up to 8 GB/s (4 + 4); practically up to (3.2 + 3.2) GB/s
AGP: up to 2 GB/s (for 8x AGP)
Such bus speeds are important because texture and vertex data need to come from the CPU to the GPU (after that it's the internal GPU bandwidth that matters)

GPU from comp arch perspective
Texture caches (2 level)
Shared between vertex procs and fragment procs
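To see why the port speeds on the System Interface slide matter, it helps to estimate how long a single texture upload takes over each bus. The sketch below uses the one-directional figures from the slide (2 GB/s for AGP 8x, 4 GB/s peak and 3.2 GB/s practical for PCI Express). The helper names, the 1 GB = 10^9 bytes convention, and the uncompressed, mipmap-free texture are assumptions made for the example; protocol overhead and driver costs are ignored.

```c
/* One-directional bandwidths from the slide, in GB/s. */
#define AGP_8X_GBPS         2.0
#define PCIE_PEAK_GBPS      4.0
#define PCIE_PRACTICAL_GBPS 3.2

/* Size in bytes of an uncompressed texture without mipmaps. */
double texture_bytes(int width, int height, int bytes_per_texel)
{
    return (double)width * (double)height * (double)bytes_per_texel;
}

/* Milliseconds needed to move `bytes` across a link of `gbps` GB/s,
   taking 1 GB as 1e9 bytes. */
double upload_ms(double bytes, double gbps)
{
    return bytes / (gbps * 1e9) * 1e3;
}
```

A 1024 x 1024 texture at 4 bytes per texel is 4,194,304 bytes, which comes to roughly 2.1 ms over AGP 8x and roughly 1.3 ms at the practical PCI Express rate. At 60 frames per second a frame lasts about 16.7 ms, so only a handful of such uploads fit in a frame, which is why data is kept in GPU memory once it has crossed the bus.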
Caches
Cache processed/filtered textures
Vertex caches: cache processed and unprocessed vertices to improve computation and fetch performance
Z and buffer cache and write queues

Demo
http://download.nvidia.com/downloads/nZone/videos/nvidia/nalu.wmv

References
Nvidia 6800 chapter from GPU Gems 2: http://download.nvidia.com/developer/GPU_Gems_2/GPU_Gems2_ch30.pdf
OpenGL design: http://graphics.stanford.edu/courses/cs448a-01-fall/design_opengl.pdf
OpenGL programming guide (ISBN: 0201604582)
Real time graphics architectures lecture notes: http://graphics.stanford.edu/courses/cs448a-01-fall/
GeForce 256 overview: http://www.nvnews.net/reviews/geforce_256/gpu_overview.shtml
NVIDIA website: http://nvidia.com

So long and thanks for all the fish
(Oh yeah ... any questions?)