File 133279639199.jpg - (27.63KB , 550x327 , amd_radeon_hd_7990.jpg )
No. 143
Afternoon, /nerd/s.

How many of you have dabbled with OpenCL? I'm starting out with it now and the speed increases are ridiculous for tasks it's suited to.

Does anyone have any good resources for getting your head around the concept of massively parallel computing? Picture related.
>> No. 144
The basic idea isn't that hard to grasp. CPUs use serial communication, which essentially means they can do one thing at a time, while video cards use parallel communication, which means they can do 16 things at a time. That's why PCIe x16 is called x16.
>> No. 145
The Radeon 7970 has 2048 processor cores == 2048 things at a time, not 16.
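To make that concrete, here's roughly what the programming model looks like, as a minimal OpenCL C kernel sketch (the kernel name and arguments are made up for illustration). You write the code for one element; the runtime launches one work-item per element across all those cores:

    // Each work-item handles exactly one array element.
    // get_global_id(0) is this work-item's index in the launch range,
    // so launching 2048 work-items really does 2048 elements at once.
    __kernel void scale(__global float *data, float factor, int n)
    {
        int i = get_global_id(0);
        if (i < n)              // guard: launch size may be rounded up
            data[i] *= factor;
    }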

I know the gist of it, but I was wondering if anybody had any resources on learning how to program effectively with OpenCL, coming from a serial-programming background.
>> No. 147
I know nothing about what you are talking about, but your post has piqued my interest, so I'm going to ask you a few noobish questions about it.
What kinds of things is this really used for? Some embarrassingly parallel problem like the Mandelbrot set or something?
And do you need to compile your code into something that'll run on those 2048 cores? I take it they're nothing like some 6502 or ARM11 or x86 core, or anything else that'll be more or less familiar to me.

>>145
>I know the gist of it
So what's the gist of it?
>> No. 148
File 133342328499.png - (58.95KB , 600x580 , CUDA_processing_flow_(En).png )
>>147
(Not OP)
It's useful for supercomputing, letting small labs do some heavy computing without buying (time on) a supercomputer, and for password cracking; in other words, it's useful for almost anything that is parallelizable. You can see a huge list of uses at https://en.wikipedia.org/wiki/GPGPU#Applications. China used 7168 GPUs in one of its supercomputers in 2010 (https://en.wikipedia.org/wiki/Tianhe-I); I'm sure you can find something better by now, though.

You need to compile the code with a special compiler (the binary runs on the CPU, which tells the GPU what to do). Unfortunately, there are three standards you can use: Nvidia's CUDA, ATI's Stream, and OpenCL (which works on both Nvidia and ATI cards). I don't know about the others, but with Nvidia's CUDA you basically divide the data to be processed into a grid and let the GPU take care of it. Doing this in practice, however, can be a bit of a pain: the (global) memory all cores have access to is extremely slow, so you have to use the very limited shared memory within each group of cores, and different cards have different memory sizes, different numbers of cores, etc. For complex problems this can be very difficult. Last time I used CUDA, it didn't support recursion.
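To make the memory point concrete, here's a sketch in OpenCL terms (CUDA's "shared memory" is OpenCL's __local memory). The kernel and argument names are made up, and it assumes the work-group size is a power of two and divides the input length evenly: each group stages its slice of the input in fast local memory, reduces it there, and writes one partial sum back to slow global memory.

    // One partial sum per work-group, using fast __local memory.
    __kernel void partial_sums(__global const float *in,
                               __global float *out,
                               __local float *tile)  // one slot per work-item
    {
        int lid  = get_local_id(0);          // index within this group
        int size = get_local_size(0);        // work-items per group

        tile[lid] = in[get_global_id(0)];    // slow global -> fast local
        barrier(CLK_LOCAL_MEM_FENCE);        // wait for the whole group

        // Tree reduction inside the group: log2(size) steps.
        for (int s = size / 2; s > 0; s /= 2) {
            if (lid < s)
                tile[lid] += tile[lid + s];
            barrier(CLK_LOCAL_MEM_FENCE);
        }

        if (lid == 0)
            out[get_group_id(0)] = tile[0];  // one write per group
    }

On the host side you'd size the __local buffer with something like clSetKernelArg(kernel, 2, group_size * sizeof(float), NULL); juggling those per-card sizes is exactly the pain described above.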
>> No. 172
We use it some in my organization, but only for niche tasks that really benefit from it. One concern we have is the lack of any really open implementation of the language -- which means we're praying that AMD doesn't ditch its compiler distributions, ship hard-to-find flaws/vulnerabilities in them, or suddenly lose interest. But anyway, writing things that really map onto the hardware is MUCH easier in OpenCL than doing it any other way, and that makes it a nice choice for a few things.

As far as learning parallel computing... that's a tricky question. The best single piece of advice I can give is: don't start with parallel computing in the programming-book sense. First take a step back to some more fundamental math theory and study just *why* some problems lend themselves to parallel or serial solutions. This is what trips people up most of the time. The new kids who skipped math class but learned C at home often invent really wild/silly race conditions without realizing it, because they never took the time to contemplate what makes a problem truly parallelizable.
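Here's the classic sort of race, sketched as a hypothetical OpenCL histogram kernel (names made up; bins assumed pre-zeroed, 256 entries). The scatter step looks parallel, but thousands of work-items doing a read-modify-write on the same bin collide:

    __kernel void histogram_racy(__global const uchar *pixels,
                                 __global uint *bins)
    {
        int i = get_global_id(0);
        bins[pixels[i]] = bins[pixels[i]] + 1;  // RACE: concurrent
                                                // read-modify-writes on the
                                                // same bin lose updates
    }

    __kernel void histogram_safe(__global const uchar *pixels,
                                 __global uint *bins)
    {
        int i = get_global_id(0);
        atomic_inc(&bins[pixels[i]]);           // correct, but colliding
                                                // updates get serialized
    }

The atomic version is correct but serializes the colliding updates, which is exactly the kind of data dependency you only see coming if you've done the theory first.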

If you get under that a bit, then actually programming a system that does the computations you want is an exercise in translating your understanding, which is preferable to basically stabbing in the dark with some OpenCL hoping your Radeon will save you without really knowing what you're about.

