com.amd.aparapi
Class Kernel

java.lang.Object
  extended by com.amd.aparapi.Kernel
All Implemented Interfaces:
Cloneable

public abstract class Kernel
extends Object
implements Cloneable

A kernel encapsulates a data-parallel algorithm that will execute either on a GPU (through conversion to OpenCL) or on a CPU via a Java Thread Pool.

To write a new kernel, a developer extends the Kernel class and overrides the Kernel.run() method. To execute this kernel, the developer creates a new instance of it and calls Kernel.execute(int globalSize) with a suitable 'global size'. At runtime Aparapi will attempt to convert the Kernel.run() method (and any method called directly or indirectly by Kernel.run()) into OpenCL for execution on GPU devices made available via the OpenCL platform.

Note that Kernel.run() is not called directly. Instead, the Kernel.execute(int globalSize) method will cause the overridden Kernel.run() method to be invoked once for each value in the range 0 to globalSize - 1.
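
The invocation contract described above can be pictured with a plain-Java emulation. This is a minimal sketch, not Aparapi's implementation: MiniKernel and its sequential execute() are hypothetical stand-ins that only reproduce the "run() once per global id" behaviour, assuming ids start at 0.

```java
// Hypothetical stand-in for illustration only; not part of Aparapi.
abstract class MiniKernel {
    private int globalId; // id handed to the current run() invocation

    protected final int getGlobalId() {
        return globalId;
    }

    public abstract void run();

    // Sequentially invokes run() once for each id from 0 up to globalSize - 1.
    public void execute(int globalSize) {
        for (int id = 0; id < globalSize; id++) {
            globalId = id;
            run();
        }
    }
}

class MiniKernelDemo {
    static int[] square(final int[] values) {
        final int[] squares = new int[values.length];
        MiniKernel kernel = new MiniKernel() {
            @Override
            public void run() {
                int gid = getGlobalId();
                squares[gid] = values[gid] * values[gid];
            }
        };
        kernel.execute(values.length);
        return squares;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(square(new int[] {1, 2, 3, 4})));
    }
}
```

A real kernel would run these invocations in parallel, so run() should not rely on the ordering that this sequential loop happens to provide.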

On the first call to Kernel.execute(int _globalSize), Aparapi will determine the EXECUTION_MODE of the kernel. This decision is made dynamically based on two factors:

  1. Whether OpenCL is available (appropriate drivers are installed and the OpenCL and Aparapi dynamic libraries are included on the system path).
  2. Whether the bytecode of the run() method (and every method that can be called directly or indirectly from the run() method) can be converted into OpenCL.
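
The two factors above can be read as a simple conjunction. The following sketch is an assumption about the shape of that decision, not Aparapi's actual logic: the Mode enum, its constant names, and the choose() method are hypothetical, and the fallback to a Java Thread Pool is inferred from the class description.

```java
// Hypothetical names for illustration; Aparapi's real decision logic is internal.
class ModeChoiceSketch {
    enum Mode { GPU, JAVA_THREAD_POOL }

    // Picks the GPU only when both documented conditions hold.
    static Mode choose(boolean openClAvailable, boolean bytecodeConvertible) {
        return (openClAvailable && bytecodeConvertible) ? Mode.GPU : Mode.JAVA_THREAD_POOL;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true));
        System.out.println(choose(true, false));
        System.out.println(choose(false, true));
    }
}
```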

Below is an example Kernel that calculates the square of a set of input values.

     class SquareKernel extends Kernel{
         private int values[];
         private int squares[];
         public SquareKernel(int values[]){
            this.values = values;
            squares = new int[values.length];
         }
         public void run() {
             int gid = getGlobalId();
             squares[gid] = values[gid]*values[gid];
         }
         public int[] getSquares(){
             return(squares);
         }
     }
 

To execute this kernel, first create a new instance of it and then call execute(int globalSize).

     int[] values = new int[1024];
     // fill values array
     SquareKernel kernel = new SquareKernel(values);
     kernel.execute(values.length);
 

When execute() returns, all the executions of Kernel.run() have completed and the results are available in the squares array.

     int[] squares = kernel.getSquares();
     for (int i=0; i< values.length; i++){
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
     }
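
The blocking behaviour of execute() can be emulated with a standard thread pool. This is a sketch under stated assumptions, not Aparapi's thread-pool code: BlockingExecuteSketch, squaresOf(), and the pool size of 4 are hypothetical. It shows one task per global id and a caller that reads the results only after every task has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical emulation for illustration; not Aparapi's thread-pool code.
class BlockingExecuteSketch {
    static int[] squaresOf(final int[] values) {
        final int[] squares = new int[values.length];
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an arbitrary choice
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int id = 0; id < values.length; id++) {
                final int gid = id;
                tasks.add(() -> {
                    squares[gid] = values[gid] * values[gid];
                    return null;
                });
            }
            // invokeAll returns only after every task has completed.
            for (Future<Void> done : pool.invokeAll(tasks)) {
                done.get(); // rethrows any failure raised inside a task
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return squares;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(squaresOf(new int[] {1, 2, 3, 4})));
    }
}
```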
 

A different approach to creating kernels that avoids extending Kernel is to write an anonymous inner class:

     final int[] values = new int[1024];
     // fill values array
     final int[] squares = new int[values.length];
     Kernel kernel = new Kernel(){
         public void run() {
             int gid = getGlobalId();
             squares[gid] = values[gid]*values[gid];
         }
     };
     kernel.execute(values.length);
     for (int i=0; i< values.length; i++){
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
     }
     
 

Version:
Alpha, 21/09/2010
Author:
gfrost AMD Javalabs

Nested Class Summary
static class Kernel.EXECUTION_MODE
          The execution mode ENUM enumerates the possible modes of executing a kernel.
 
Constructor Summary
Kernel()
           
 
Method Summary
protected  double abs(double _d)
          Delegates to either Math.abs(double) (Java) or fabs(double) (OpenCL).
protected  float abs(float _f)
          Delegates to either Math.abs(float) (Java) or fabs(float) (OpenCL).
protected  int abs(int n)
          Delegates to either Math.abs(int) (Java) or abs(int) (OpenCL).
protected  long abs(long n)
          Delegates to either Math.abs(long) (Java) or abs(long) (OpenCL).
protected  double acos(double a)
          Delegates to either Math.acos(double) (Java) or acos(double) (OpenCL).
protected  float acos(float a)
          Delegates to either Math.acos(double) (Java) or acos(float) (OpenCL).
protected  double asin(double _d)
          Delegates to either Math.asin(double) (Java) or asin(double) (OpenCL).
protected  float asin(float _f)
          Delegates to either Math.asin(double) (Java) or asin(float) (OpenCL).
protected  double atan(double _d)
          Delegates to either Math.atan(double) (Java) or atan(double) (OpenCL).
protected  float atan(float _f)
          Delegates to either Math.atan(double) (Java) or atan(float) (OpenCL).
protected  double atan2(double _d1, double _d2)
          Delegates to either Math.atan2(double, double) (Java) or atan2(double, double) (OpenCL).
protected  float atan2(float _f1, float _f2)
          Delegates to either Math.atan2(double, double) (Java) or atan2(float, float) (OpenCL).
protected  int atomicAdd(int[] _arr, int _index, int _delta)
          Atomically adds _delta value to _index element of array _arr (Java) or delegates to atomic_add(volatile int*, int) (OpenCL).
protected  double ceil(double _d)
          Delegates to either Math.ceil(double) (Java) or ceil(double) (OpenCL).
protected  float ceil(float _f)
          Delegates to either Math.ceil(double) (Java) or ceil(float) (OpenCL).
protected  Object clone()
          When using a Java Thread Pool, Aparapi uses clone to copy the initial instance to each thread.
protected  double cos(double _d)
          Delegates to either Math.cos(double) (Java) or cos(double) (OpenCL).
protected  float cos(float _f)
          Delegates to either Math.cos(double) (Java) or cos(float) (OpenCL).
 void dispose()
          Release any resources associated with this Kernel.
 long execute(int _globalSize)
          Start execution of globalSize kernels.
protected  double exp(double _d)
          Delegates to either Math.exp(double) (Java) or exp(double) (OpenCL).
protected  float exp(float _f)
          Delegates to either Math.exp(double) (Java) or exp(float) (OpenCL).
protected  double floor(double _d)
          Delegates to either Math.floor(double) (Java) or floor(double) (OpenCL).
protected  float floor(float _f)
          Delegates to either Math.floor(double) (Java) or floor(float) (OpenCL).
 Kernel.EXECUTION_MODE getExecutionMode()
          Return the current execution mode.
protected  int getGlobalId()
          Determine the globalId of an executing kernel.
protected  int getGlobalSize()
          Determine the value that was passed to the Kernel.execute(int globalSize) method.
protected  int getGroupId()
          Determine the groupId of an executing kernel.
protected  int getLocalId()
          Determine the local id of an executing kernel.
protected  int getLocalSize()
          Determine the size of the group that an executing kernel is a member of.
protected  int getNumGroups()
          Determine the number of groups that will be used to execute a kernel.
protected  double IEEEremainder(double _d1, double _d2)
          Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(double, double) (OpenCL).
protected  float IEEEremainder(float _f1, float _f2)
          Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(float, float) (OpenCL).
protected  void localBarrier()
          Wait for all kernels in the current group to rendezvous at this call before continuing execution.
protected  double log(double _d)
          Delegates to either Math.log(double) (Java) or log(double) (OpenCL).
protected  float log(float _f)
          Delegates to either Math.log(double) (Java) or log(float) (OpenCL).
protected  double max(double _d1, double _d2)
          Delegates to either Math.max(double, double) (Java) or fmax(double, double) (OpenCL).
protected  float max(float _f1, float _f2)
          Delegates to either Math.max(float, float) (Java) or fmax(float, float) (OpenCL).
protected  int max(int n1, int n2)
          Delegates to either Math.max(int, int) (Java) or max(int, int) (OpenCL).
protected  long max(long n1, long n2)
          Delegates to either Math.max(long, long) (Java) or max(long, long) (OpenCL).
protected  double min(double _d1, double _d2)
          Delegates to either Math.min(double, double) (Java) or fmin(double, double) (OpenCL).
protected  float min(float _f1, float _f2)
          Delegates to either Math.min(float, float) (Java) or fmin(float, float) (OpenCL).
protected  int min(int n1, int n2)
          Delegates to either Math.min(int, int) (Java) or min(int, int) (OpenCL).
protected  long min(long n1, long n2)
          Delegates to either Math.min(long, long) (Java) or min(long, long) (OpenCL).
protected  double pow(double _d1, double _d2)
          Delegates to either Math.pow(double, double) (Java) or pow(double, double) (OpenCL).
protected  float pow(float _f1, float _f2)
          Delegates to either Math.pow(double, double) (Java) or pow(float, float) (OpenCL).
protected  double rint(double _d)
          Delegates to either Math.rint(double) (Java) or rint(double) (OpenCL).
protected  float rint(float _f)
          Delegates to either Math.rint(double) (Java) or rint(float) (OpenCL).
protected  long round(double _d)
          Delegates to either Math.round(double) (Java) or round(double) (OpenCL).
protected  int round(float _f)
          Delegates to either Math.round(float) (Java) or round(float) (OpenCL).
protected  double rsqrt(double _d)
          Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL).
protected  float rsqrt(float _f)
          Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(float) (OpenCL).
abstract  void run()
          The entry point of a kernel.
 void setExecutionMode(Kernel.EXECUTION_MODE _executionMode)
          Set the execution mode.
protected  void setSizes(int _globalSize, int _localSize)
          A callback that Aparapi will invoke before executing a kernel to set the values of globalSize and localSize.
protected  double sin(double _d)
          Delegates to either Math.sin(double) (Java) or sin(double) (OpenCL).
protected  float sin(float _f)
          Delegates to either Math.sin(double) (Java) or sin(float) (OpenCL).
protected  double sqrt(double _d)
          Delegates to either Math.sqrt(double) (Java) or sqrt(double) (OpenCL).
protected  float sqrt(float _f)
          Delegates to either Math.sqrt(double) (Java) or sqrt(float) (OpenCL).
protected  double tan(double _d)
          Delegates to either Math.tan(double) (Java) or tan(double) (OpenCL).
protected  float tan(float _f)
          Delegates to either Math.tan(double) (Java) or tan(float) (OpenCL).
protected  double toDegrees(double _d)
          Delegates to either Math.toDegrees(double) (Java) or degrees(double) (OpenCL).
protected  float toDegrees(float _f)
          Delegates to either Math.toDegrees(double) (Java) or degrees(float) (OpenCL).
protected  double toRadians(double _d)
          Delegates to either Math.toRadians(double) (Java) or radians(double) (OpenCL).
protected  float toRadians(float _f)
          Delegates to either Math.toRadians(double) (Java) or radians(float) (OpenCL).
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Kernel

public Kernel()
Method Detail

getGlobalId

protected final int getGlobalId()
Determine the globalId of an executing kernel.

The kernel implementation uses the globalId to determine which of the executing kernels (in the global domain space) this invocation is expected to deal with.

For example in a SquareKernel implementation:

     class SquareKernel extends Kernel{
         private int values[];
         private int squares[];
         public SquareKernel(int values[]){
            this.values = values;
            squares = new int[values.length];
         }
         public void run() {
             int gid = getGlobalId();
             squares[gid] = values[gid]*values[gid];
         }
         public int[] getSquares(){
             return(squares);
         }
     }
 

Each invocation of SquareKernel.run() retrieves its globalId by calling getGlobalId(), and then computes the value of squares[gid] for a given value of values[gid].

Returns:
The globalId for the Kernel being executed
See Also:
getLocalId(), getGroupId(), getGlobalSize(), getNumGroups(), getLocalSize()

getGroupId

protected final int getGroupId()
Determine the groupId of an executing kernel.

When Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'.

A kernel can use getGroupId() to determine which group it is currently dispatched to.

The following code would capture the groupId for each kernel and map it against globalId.

     final int[] groupIds = new int[1024];
     Kernel kernel = new Kernel(){
         public void run() {
             int gid = getGlobalId();
             groupIds[gid] = getGroupId();
         }
     };
     kernel.execute(groupIds.length);
     for (int i=0; i< groupIds.length; i++){
        System.out.printf("%4d %4d\n", i, groupIds[i]);
     } 
 

Returns:
The groupId for this Kernel being executed
See Also:
getLocalId(), getGlobalId(), getGlobalSize(), getNumGroups(), getLocalSize()

getLocalId

protected final int getLocalId()
Determine the local id of an executing kernel.

When Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'. getLocalId() can be used to determine the relative id of the current kernel within a specific group.

The following code would capture the local id for each kernel and map it against globalId.

     final int[] localIds = new int[1024];
     Kernel kernel = new Kernel(){
         public void run() {
             int gid = getGlobalId();
             localIds[gid] = getLocalId();
         }
     };
     kernel.execute(localIds.length);
     for (int i=0; i< localIds.length; i++){
        System.out.printf("%4d %4d\n", i, localIds[i]);
     } 
 

Returns:
The local id for this Kernel being executed
See Also:
getGroupId(), getGlobalId(), getGlobalSize(), getNumGroups(), getLocalSize()
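
The relationship between the global, group, and local ids can be sketched with plain arithmetic. This assumes every group has the same size and that ids follow the usual OpenCL numbering (global id = group id * local size + local id); the class and method names and the sizes 1024 and 64 are hypothetical, and the runtime is free to choose other group sizes.

```java
// Illustrative arithmetic only; assumes uniform group sizes and OpenCL-style numbering.
class IdSketch {
    static int groupId(int globalId, int localSize) {
        return globalId / localSize;
    }

    static int localId(int globalId, int localSize) {
        return globalId % localSize;
    }

    static int numGroups(int globalSize, int localSize) {
        return globalSize / localSize; // exact only when localSize divides globalSize
    }

    public static void main(String[] args) {
        int globalSize = 1024; // hypothetical value passed to execute()
        int localSize = 64;    // hypothetical group size chosen by the runtime
        for (int gid : new int[] {0, 63, 64, 1023}) {
            System.out.printf("gid=%4d group=%2d local=%2d%n",
                    gid, groupId(gid, localSize), localId(gid, localSize));
        }
        System.out.println("numGroups=" + numGroups(globalSize, localSize));
    }
}
```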

getLocalSize

protected final int getLocalSize()
Determine the size of the group that an executing kernel is a member of.

When Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'. getLocalSize() allows a kernel to determine the size of the current group.

Note that groups may not all be the same size. In particular, if (global size) % (# of compute devices) != 0, the runtime can choose to dispatch kernels to groups with differing sizes.

Returns:
The size of the currently executing group.
See Also:
getLocalId(), getGroupId(), getGlobalId(), getGlobalSize(), getNumGroups()
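
To make the possibility of differing group sizes concrete, here is one way a runtime could split work that does not divide evenly. This is purely an illustration: the policy, the class name, and groupSizes() are hypothetical, and the source does not state how Aparapi actually sizes its groups.

```java
// One possible splitting policy, for illustration; not Aparapi's documented behaviour.
class UnevenGroupsSketch {
    // Splits globalSize work items into groupCount groups whose sizes differ by at most one.
    static int[] groupSizes(int globalSize, int groupCount) {
        int[] sizes = new int[groupCount];
        int base = globalSize / groupCount;
        int extra = globalSize % groupCount; // the first 'extra' groups take one more item
        for (int g = 0; g < groupCount; g++) {
            sizes[g] = base + (g < extra ? 1 : 0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(groupSizes(10, 4)));
    }
}
```

Under such a policy a kernel that indexes per-group data should size it from getLocalSize() at run time instead of assuming a fixed group size.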

getGlobalSize

protected final int getGlobalSize()
Determine the value that was passed to the Kernel.execute(int globalSize) method.

Returns:
The value passed to Kernel.execute(int globalSize) causing the current execution.
See Also:
getGroupId(), getGlobalId(), getNumGroups(), getLocalSize()

getNumGroups

protected final int getNumGroups()
Determine the number of groups that will be used to execute a kernel.

When Kernel.execute(int globalSize) is invoked, the runtime will split the work into multiple 'groups'. getNumGroups() returns the total number of groups that will be used.

Returns:
The number of groups that kernels will be dispatched into.
See Also:
getLocalId(), getGroupId(), getGlobalId(), getGlobalSize(), getLocalSize()

run

public abstract void run()
The entry point of a kernel.

Every kernel must override this method.


clone

protected Object clone()
When using a Java Thread Pool, Aparapi uses clone to copy the initial instance to each thread.

If you choose to override clone(), you are responsible for delegating to super.clone().

Overrides:
clone in class Object
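
A minimal sketch of an override that delegates to super.clone() follows. CloneSketch is a hypothetical plain-Java class used so the example is self-contained; a real kernel would extend Kernel instead. It also shows that Object.clone() is a shallow copy, so array fields remain shared between the original and its copies.

```java
// Hypothetical class for illustration; a real kernel would extend com.amd.aparapi.Kernel.
class CloneSketch implements Cloneable {
    final int[] results = new int[4]; // shared by every copy, because clone() is shallow
    int invocations;                  // per-copy counter, reset in clone()

    @Override
    protected CloneSketch clone() {
        try {
            CloneSketch copy = (CloneSketch) super.clone(); // delegate first
            copy.invocations = 0;                           // then adjust per-copy state
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("cannot happen: class implements Cloneable", e);
        }
    }

    // Returns true when a copy shares the results array and has a fresh counter.
    static boolean copySharesResults() {
        CloneSketch original = new CloneSketch();
        original.invocations = 5;
        CloneSketch copy = original.clone();
        return copy.results == original.results && copy.invocations == 0;
    }

    public static void main(String[] args) {
        System.out.println(copySharesResults());
    }
}
```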

acos

protected float acos(float a)
Delegates to either Math.acos(double) (Java) or acos(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
a - value to delegate to Math.acos(double)/acos(float)
Returns:
Math.acos(double) cast to float/acos(float)
See Also:
Math.acos(double), acos(float)
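
The Java-side behaviour described here (compute in double, then narrow to float) can be sketched as follows. The class and method names are hypothetical, and only the Java path is shown; the OpenCL result for the same input may differ in the last bits.

```java
// Mirrors the documented Java-side delegation; names are hypothetical.
class FloatDelegationSketch {
    // Computes in double precision, then narrows the result to float.
    static float acosJava(float a) {
        return (float) Math.acos(a);
    }

    public static void main(String[] args) {
        float a = 0.5f;
        System.out.println("double result: " + Math.acos(a));
        System.out.println("float result : " + acosJava(a));
    }
}
```

The same pattern applies to the other float overloads that delegate to a double-only Math method.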

acos

protected double acos(double a)
Delegates to either Math.acos(double) (Java) or acos(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
a - value to delegate to Math.acos(double)/acos(double)
Returns:
Math.acos(double)/acos(double)
See Also:
Math.acos(double), acos(double)

asin

protected float asin(float _f)
Delegates to either Math.asin(double) (Java) or asin(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.asin(double)/asin(float)
Returns:
Math.asin(double) cast to float/asin(float)
See Also:
Math.asin(double), asin(float)

asin

protected double asin(double _d)
Delegates to either Math.asin(double) (Java) or asin(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.asin(double)/asin(double)
Returns:
Math.asin(double)/asin(double)
See Also:
Math.asin(double), asin(double)

atan

protected float atan(float _f)
Delegates to either Math.atan(double) (Java) or atan(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.atan(double)/atan(float)
Returns:
Math.atan(double) cast to float/atan(float)
See Also:
Math.atan(double), atan(float)

atan

protected double atan(double _d)
Delegates to either Math.atan(double) (Java) or atan(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.atan(double)/atan(double)
Returns:
Math.atan(double)/atan(double)
See Also:
Math.atan(double), atan(double)

atan2

protected float atan2(float _f1,
                      float _f2)
Delegates to either Math.atan2(double, double) (Java) or atan2(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f1 - value to delegate to first argument of Math.atan2(double, double)/atan2(float, float)
_f2 - value to delegate to second argument of Math.atan2(double, double)/atan2(float, float)
Returns:
Math.atan2(double, double) cast to float/atan2(float, float)
See Also:
Math.atan2(double, double), atan2(float, float)

atan2

protected double atan2(double _d1,
                       double _d2)
Delegates to either Math.atan2(double, double) (Java) or atan2(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d1 - value to delegate to first argument of Math.atan2(double, double)/atan2(double, double)
_d2 - value to delegate to second argument of Math.atan2(double, double)/atan2(double, double)
Returns:
Math.atan2(double, double)/atan2(double, double)
See Also:
Math.atan2(double, double), atan2(double, double)

ceil

protected float ceil(float _f)
Delegates to either Math.ceil(double) (Java) or ceil(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.ceil(double)/ceil(float)
Returns:
Math.ceil(double) cast to float/ceil(float)
See Also:
Math.ceil(double), ceil(float)

ceil

protected double ceil(double _d)
Delegates to either Math.ceil(double) (Java) or ceil(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.ceil(double)/ceil(double)
Returns:
Math.ceil(double)/ceil(double)
See Also:
Math.ceil(double), ceil(double)

cos

protected float cos(float _f)
Delegates to either Math.cos(double) (Java) or cos(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.cos(double)/cos(float)
Returns:
Math.cos(double) cast to float/cos(float)
See Also:
Math.cos(double), cos(float)

cos

protected double cos(double _d)
Delegates to either Math.cos(double) (Java) or cos(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.cos(double)/cos(double)
Returns:
Math.cos(double)/cos(double)
See Also:
Math.cos(double), cos(double)

exp

protected float exp(float _f)
Delegates to either Math.exp(double) (Java) or exp(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.exp(double)/exp(float)
Returns:
Math.exp(double) cast to float/exp(float)
See Also:
Math.exp(double), exp(float)

exp

protected double exp(double _d)
Delegates to either Math.exp(double) (Java) or exp(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.exp(double)/exp(double)
Returns:
Math.exp(double)/exp(double)
See Also:
Math.exp(double), exp(double)

abs

protected float abs(float _f)
Delegates to either Math.abs(float) (Java) or fabs(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.abs(float)/fabs(float)
Returns:
Math.abs(float)/fabs(float)
See Also:
Math.abs(float), fabs(float)

abs

protected double abs(double _d)
Delegates to either Math.abs(double) (Java) or fabs(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.abs(double)/fabs(double)
Returns:
Math.abs(double)/fabs(double)
See Also:
Math.abs(double), fabs(double)

abs

protected int abs(int n)
Delegates to either Math.abs(int) (Java) or abs(int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
n - value to delegate to Math.abs(int)/abs(int)
Returns:
Math.abs(int)/abs(int)
See Also:
Math.abs(int), abs(int)
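
On the Java path, the integer overload inherits an edge case of Math.abs(int): the most negative int has no positive counterpart. The sketch below demonstrates only that Java behaviour; the class name is hypothetical, and no claim is made here about what the OpenCL path returns for the same input.

```java
// Demonstrates a Java-side edge case of Math.abs(int); class name is hypothetical.
class AbsEdgeCaseSketch {
    // True because -Integer.MIN_VALUE overflows back to Integer.MIN_VALUE.
    static boolean absCanStayNegative() {
        return Math.abs(Integer.MIN_VALUE) < 0;
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(-7));
        System.out.println(Math.abs(Integer.MIN_VALUE));
    }
}
```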

abs

protected long abs(long n)
Delegates to either Math.abs(long) (Java) or abs(long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
n - value to delegate to Math.abs(long)/abs(long)
Returns:
Math.abs(long)/abs(long)
See Also:
Math.abs(long), abs(long)

floor

protected float floor(float _f)
Delegates to either Math.floor(double) (Java) or floor(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.floor(double)/floor(float)
Returns:
Math.floor(double) cast to float/floor(float)
See Also:
Math.floor(double), floor(float)

floor

protected double floor(double _d)
Delegates to either Math.floor(double) (Java) or floor(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.floor(double)/floor(double)
Returns:
Math.floor(double)/floor(double)
See Also:
Math.floor(double), floor(double)

max

protected float max(float _f1,
                    float _f2)
Delegates to either Math.max(float, float) (Java) or fmax(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f1 - value to delegate to first argument of Math.max(float, float)/fmax(float, float)
_f2 - value to delegate to second argument of Math.max(float, float)/fmax(float, float)
Returns:
Math.max(float, float)/fmax(float, float)
See Also:
Math.max(float, float), fmax(float, float)
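
Beyond precision, the two delegates can also differ in how they treat NaN: Math.max returns NaN if either argument is NaN, whereas the OpenCL C specification defines fmax to return the other argument when one is NaN. The sketch below emulates the fmax rule in Java for comparison. fmaxLike() is a hypothetical helper, and whether the difference is observable through Aparapi on a given device is an assumption.

```java
// Compares Java's Math.max with an emulation of fmax-style NaN handling.
class MaxNaNSketch {
    // Emulates the C99/OpenCL fmax rule: a single NaN argument is ignored.
    static float fmaxLike(float x, float y) {
        if (Float.isNaN(x)) {
            return y;
        }
        if (Float.isNaN(y)) {
            return x;
        }
        return Math.max(x, y);
    }

    public static void main(String[] args) {
        System.out.println(Math.max(Float.NaN, 1f)); // Java propagates the NaN
        System.out.println(fmaxLike(Float.NaN, 1f)); // fmax-style rule keeps the number
    }
}
```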

max

protected double max(double _d1,
                     double _d2)
Delegates to either Math.max(double, double) (Java) or fmax(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d1 - value to delegate to first argument of Math.max(double, double)/fmax(double, double)
_d2 - value to delegate to second argument of Math.max(double, double)/fmax(double, double)
Returns:
Math.max(double, double)/fmax(double, double)
See Also:
Math.max(double, double), fmax(double, double)

max

protected int max(int n1,
                  int n2)
Delegates to either Math.max(int, int) (Java) or max(int, int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
n1 - value to delegate to first argument of Math.max(int, int)/max(int, int)
n2 - value to delegate to second argument of Math.max(int, int)/max(int, int)
Returns:
Math.max(int, int)/max(int, int)
See Also:
Math.max(int, int), max(int, int)

max

protected long max(long n1,
                   long n2)
Delegates to either Math.max(long, long) (Java) or max(long, long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
n1 - value to delegate to first argument of Math.max(long, long)/max(long, long)
n2 - value to delegate to second argument of Math.max(long, long)/max(long, long)
Returns:
Math.max(long, long)/max(long, long)
See Also:
Math.max(long, long), max(long, long)

min

protected float min(float _f1,
                    float _f2)
Delegates to either Math.min(float, float) (Java) or fmin(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f1 - value to delegate to first argument of Math.min(float, float)/fmin(float, float)
_f2 - value to delegate to second argument of Math.min(float, float)/fmin(float, float)
Returns:
Math.min(float, float)/fmin(float, float)
See Also:
Math.min(float, float), fmin(float, float)

min

protected double min(double _d1,
                     double _d2)
Delegates to either Math.min(double, double) (Java) or fmin(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d1 - value to delegate to first argument of Math.min(double, double)/fmin(double, double)
_d2 - value to delegate to second argument of Math.min(double, double)/fmin(double, double)
Returns:
Math.min(double, double)/fmin(double, double)
See Also:
Math.min(double, double), fmin(double, double)

min

protected int min(int n1,
                  int n2)
Delegates to either Math.min(int, int) (Java) or min(int, int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
n1 - value to delegate to first argument of Math.min(int, int)/min(int, int)
n2 - value to delegate to second argument of Math.min(int, int)/min(int, int)
Returns:
Math.min(int, int)/min(int, int)
See Also:
Math.min(int, int), min(int, int)

min

protected long min(long n1,
                   long n2)
Delegates to either Math.min(long, long) (Java) or min(long, long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
n1 - value to delegate to first argument of Math.min(long, long)/min(long, long)
n2 - value to delegate to second argument of Math.min(long, long)/min(long, long)
Returns:
Math.min(long, long)/min(long, long)
See Also:
Math.min(long, long), min(long, long)

log

protected float log(float _f)
Delegates to either Math.log(double) (Java) or log(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.log(double)/log(float)
Returns:
Math.log(double) cast to float/log(float)
See Also:
Math.log(double), log(float)

log

protected double log(double _d)
Delegates to either Math.log(double) (Java) or log(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.log(double)/log(double)
Returns:
Math.log(double)/log(double)
See Also:
Math.log(double), log(double)

pow

protected float pow(float _f1,
                    float _f2)
Delegates to either Math.pow(double, double) (Java) or pow(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f1 - value to delegate to first argument of Math.pow(double, double)/pow(float, float)
_f2 - value to delegate to second argument of Math.pow(double, double)/pow(float, float)
Returns:
Math.pow(double, double) cast to float/pow(float, float)
See Also:
Math.pow(double, double), pow(float, float)

pow

protected double pow(double _d1,
                     double _d2)
Delegates to either Math.pow(double, double) (Java) or pow(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d1 - value to delegate to first argument of Math.pow(double, double)/pow(double, double)
_d2 - value to delegate to second argument of Math.pow(double, double)/pow(double, double)
Returns:
Math.pow(double, double)/pow(double, double)
See Also:
Math.pow(double, double), pow(double, double)

IEEEremainder

protected float IEEEremainder(float _f1,
                              float _f2)
Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f1 - value to delegate to first argument of Math.IEEEremainder(double, double)/remainder(float, float)
_f2 - value to delegate to second argument of Math.IEEEremainder(double, double)/remainder(float, float)
Returns:
Math.IEEEremainder(double, double) cast to float/remainder(float, float)
See Also:
Math.IEEEremainder(double, double), remainder(float, float)

IEEEremainder

protected double IEEEremainder(double _d1,
                               double _d2)
Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d1 - value to delegate to first argument of Math.IEEEremainder(double, double)/remainder(double, double)
_d2 - value to delegate to second argument of Math.IEEEremainder(double, double)/remainder(double, double)
Returns:
Math.IEEEremainder(double, double)/remainder(double, double)
See Also:
Math.IEEEremainder(double, double), remainder(double, double)
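
The IEEE remainder is not the same operation as Java's % operator, which can surprise callers. The sketch below contrasts the two on the Java side only; RemainderSketch and its method names are hypothetical and not part of Aparapi.

```java
// Hypothetical helper, not part of Aparapi: Java-side behaviour only.
final class RemainderSketch {
    // Float overload as documented: Math.IEEEremainder(double, double) cast to float.
    // The quotient is rounded to the nearest integer (ties to even) before subtracting.
    static float ieeeRemainder(float a, float b) {
        return (float) Math.IEEEremainder(a, b);
    }

    // Java's % operator truncates the quotient toward zero; shown only for contrast.
    static float truncatedRemainder(float a, float b) {
        return a % b;
    }
}
```

For example, ieeeRemainder(5f, 3f) yields -1.0f because the quotient 1.67 rounds to 2, whereas 5f % 3f yields 2.0f.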

toRadians

protected float toRadians(float _f)
Delegates to either Math.toRadians(double) (Java) or radians(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.toRadians(double)/radians(float)
Returns:
Math.toRadians(double) cast to float/radians(float)
See Also:
Math.toRadians(double), radians(float)

toRadians

protected double toRadians(double _d)
Delegates to either Math.toRadians(double) (Java) or radians(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.toRadians(double)/radians(double)
Returns:
Math.toRadians(double)/radians(double)
See Also:
Math.toRadians(double), radians(double)

toDegrees

protected float toDegrees(float _f)
Delegates to either Math.toDegrees(double) (Java) or degrees(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.toDegrees(double)/degrees(float)
Returns:
Math.toDegrees(double) cast to float/degrees(float)
See Also:
Math.toDegrees(double), degrees(float)

toDegrees

protected double toDegrees(double _d)
Delegates to either Math.toDegrees(double) (Java) or degrees(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.toDegrees(double)/degrees(double)
Returns:
Math.toDegrees(double)/degrees(double)
See Also:
Math.toDegrees(double), degrees(double)
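
One simple way to gauge the float conversions described above is a round trip from degrees to radians and back. This is a sketch of the Java-side delegation only; AngleSketch and its method names are hypothetical and not part of Aparapi.

```java
// Hypothetical helper, not part of Aparapi: Java-side behaviour only.
final class AngleSketch {
    // Float overload as documented: Math.toRadians(double) cast to float.
    static float toRadians(float degrees) {
        return (float) Math.toRadians(degrees);
    }

    // Float overload as documented: Math.toDegrees(double) cast to float.
    static float toDegrees(float radians) {
        return (float) Math.toDegrees(radians);
    }

    // Error introduced by converting to radians and back in float precision.
    static float roundTripError(float degrees) {
        return Math.abs(toDegrees(toRadians(degrees)) - degrees);
    }
}
```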

rint

protected float rint(float _f)
Delegates to either Math.rint(double) (Java) or rint(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.rint(double)/rint(float)
Returns:
Math.rint(double) cast to float/rint(float)
See Also:
Math.rint(double), rint(float)

rint

protected double rint(double _d)
Delegates to either Math.rint(double) (Java) or rint(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.rint(double)/rint(double)
Returns:
Math.rint(double)/rint(double)
See Also:
Math.rint(double), rint(double)
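
Math.rint rounds to the nearest integral value and resolves ties toward the even neighbour, which differs from the more familiar round-half-up rule. The sketch below shows the Java-side float path; RintSketch is a hypothetical name and not part of Aparapi.

```java
// Hypothetical helper, not part of Aparapi: Java-side behaviour only.
final class RintSketch {
    // Float overload as documented: Math.rint(double) cast to float.
    // Ties go to the even neighbour, so 2.5 becomes 2.0 and 3.5 becomes 4.0.
    static float rint(float f) {
        return (float) Math.rint(f);
    }
}
```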

round

protected int round(float _f)
Delegates to either Math.round(float) (Java) or round(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.round(float)/round(float)
Returns:
Math.round(float)/round(float)
See Also:
Math.round(float), round(float)

round

protected long round(double _d)
Delegates to either Math.round(double) (Java) or round(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.round(double)/round(double)
Returns:
Math.round(double)/round(double)
See Also:
Math.round(double), round(double)
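
For round, the two delegates can differ by more than precision. Math.round resolves ties toward positive infinity, while OpenCL C specifies that round() resolves halfway cases away from zero, so negative halfway inputs may give different results depending on the execution mode. The sketch below contrasts the two tie rules in plain Java; RoundSketch and halfAwayFromZero are hypothetical names, and the second method only imitates the OpenCL rule rather than calling OpenCL.

```java
// Hypothetical helper, not part of Aparapi.
final class RoundSketch {
    // Java-side delegate: Math.round(float) rounds ties toward positive infinity.
    static int javaRound(float f) {
        return Math.round(f);
    }

    // Plain-Java imitation of the round-half-away-from-zero tie rule.
    // Assumption: this stands in for OpenCL round(); it does not invoke OpenCL.
    static int halfAwayFromZero(float f) {
        return (int) (Math.signum(f) * Math.floor(Math.abs(f) + 0.5));
    }
}
```

With an input of -2.5f, javaRound returns -2 while halfAwayFromZero returns -3.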

sin

protected float sin(float _f)
Delegates to either Math.sin(double) (Java) or sin(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.sin(double)/sin(float)
Returns:
Math.sin(double) cast to float/sin(float)
See Also:
Math.sin(double), sin(float)

sin

protected double sin(double _d)
Delegates to either Math.sin(double) (Java) or sin(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.sin(double)/sin(double)
Returns:
Math.sin(double)/sin(double)
See Also:
Math.sin(double), sin(double)
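
To decide whether a precision difference is acceptable, one option is to compare results in units in the last place (ULPs) rather than with a fixed absolute tolerance. The sketch below is one possible helper; UlpCompareSketch, withinUlps and sinFloat are hypothetical names, not part of Aparapi, and the acceptable ULP count is an application-specific choice.

```java
// Hypothetical helper, not part of Aparapi.
final class UlpCompareSketch {
    // True if actual lies within maxUlps units in the last place of expected.
    static boolean withinUlps(float expected, float actual, int maxUlps) {
        return Math.abs(expected - actual) <= maxUlps * Math.ulp(expected);
    }

    // Java-side float path as documented for sin: Math.sin(double) cast to float.
    static float sinFloat(float f) {
        return (float) Math.sin(f);
    }
}
```

A test harness could run the same kernel in both modes and pass each pair of outputs to withinUlps.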

sqrt

protected float sqrt(float _f)
Delegates to either Math.sqrt(double) (Java) or sqrt(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.sqrt(double)/sqrt(float)
Returns:
Math.sqrt(double) cast to float/sqrt(float)
See Also:
Math.sqrt(double), sqrt(float)

sqrt

protected double sqrt(double _d)
Delegates to either Math.sqrt(double) (Java) or sqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.sqrt(double)/sqrt(double)
Returns:
Math.sqrt(double)/sqrt(double)
See Also:
Math.sqrt(double), sqrt(double)

tan

protected float tan(float _f)
Delegates to either Math.tan(double) (Java) or tan(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.tan(double)/tan(float)
Returns:
Math.tan(double) cast to float/tan(float)
See Also:
Math.tan(double), tan(float)

tan

protected double tan(double _d)
Delegates to either Math.tan(double) (Java) or tan(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.tan(double)/tan(double)
Returns:
Math.tan(double)/tan(double)
See Also:
Math.tan(double), tan(double)

rsqrt

protected float rsqrt(float _f)
Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_f - value to delegate to Math.sqrt(double)/rsqrt(float)
Returns:
( 1.0f / Math.sqrt(double) cast to float )/rsqrt(float)
See Also:
Math.sqrt(double), rsqrt(float)

rsqrt

protected double rsqrt(double _d)
Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.

Parameters:
_d - value to delegate to Math.sqrt(double)/rsqrt(double)
Returns:
( 1.0 / Math.sqrt(double) )/rsqrt(double)
See Also:
Math.sqrt(double), rsqrt(double)
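
The Java-side computation described for rsqrt can be sketched as follows. RsqrtSketch is a hypothetical name and not part of Aparapi; the bodies follow the documented return expressions and are an assumption about, not a copy of, the actual implementation.

```java
// Hypothetical helper, not part of Aparapi.
final class RsqrtSketch {
    // Float overload: reciprocal of Math.sqrt(double) cast to float.
    static float rsqrt(float f) {
        return 1.0f / (float) Math.sqrt(f);
    }

    // Double overload: reciprocal of Math.sqrt(double).
    static double rsqrt(double d) {
        return 1.0 / Math.sqrt(d);
    }
}
```

Note that a zero input produces positive infinity on this path rather than an exception.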

atomicAdd

protected int atomicAdd(int[] _arr,
                        int _index,
                        int _delta)
Atomically adds _delta value to _index element of array _arr (Java) or delegates to atomic_add(volatile int*, int) (OpenCL).

Parameters:
_arr - array for which an element value needs to be atomically incremented by _delta
_index - index of the _arr array that needs to be atomically incremented by _delta
_delta - value by which _index element of _arr array needs to be atomically incremented
Returns:
previous value of _index element of _arr array
See Also:
atomic_add(volatile int*, int)
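
The contract above (add atomically, return the previous value) matches the fetch-and-add semantics of java.util.concurrent.atomic.AtomicIntegerArray.getAndAdd. The sketch below uses that class as a stand-in to show the behaviour under contention. AtomicAddSketch is a hypothetical name; it takes an AtomicIntegerArray instead of the int[] in the Kernel signature, and it is not how Aparapi implements atomicAdd.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical stand-in, not part of Aparapi.
final class AtomicAddSketch {
    // Adds delta to element index atomically and returns the previous value.
    static int atomicAdd(AtomicIntegerArray arr, int index, int delta) {
        return arr.getAndAdd(index, delta);
    }

    // Starts `threads` threads that each add 1 to slot 0 `perThread` times,
    // then returns the final counter value.
    static int countWithThreads(int threads, int perThread) {
        AtomicIntegerArray counters = new AtomicIntegerArray(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    atomicAdd(counters, 0, 1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counters.get(0);
    }
}
```

Because every increment is atomic, no updates are lost and the final count equals threads * perThread.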

localBarrier

protected final void localBarrier()
Wait for all kernels in the current group to rendezvous at this call before continuing execution.
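
The rendezvous behaviour is comparable to java.util.concurrent.CyclicBarrier. The sketch below shows the typical two-phase pattern a barrier enables: every worker writes its own slot, waits, and only then reads a neighbour's slot. BarrierSketch and neighbourSums are hypothetical names, and the use of CyclicBarrier is an analogy, not a statement about how Aparapi implements localBarrier().

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical analogy in plain Java, not part of Aparapi.
final class BarrierSketch {
    // Each worker writes id * 10 to its slot, waits at the barrier,
    // then adds its own slot to the slot of the next worker.
    static int[] neighbourSums(int groupSize) {
        int[] local = new int[groupSize];
        int[] out = new int[groupSize];
        CyclicBarrier barrier = new CyclicBarrier(groupSize);
        Thread[] workers = new Thread[groupSize];
        for (int t = 0; t < groupSize; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                local[id] = id * 10; // phase 1: write own slot
                try {
                    barrier.await(); // rendezvous, analogous to localBarrier()
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
                out[id] = local[id] + local[(id + 1) % groupSize]; // phase 2: read neighbour
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return out;
    }
}
```

Without the barrier, a worker could read its neighbour's slot before the neighbour had written it.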


setSizes

protected void setSizes(int _globalSize,
                        int _localSize)
A callback that Aparapi will invoke before executing a kernel to set the values of globalSize and localSize.

When kernel.execute(globalSize) is invoked, Aparapi will determine the localSize based on the execution mode and the globalSize. If you override setSizes(int _globalSize, int _localSize), your overridden method will be called. Your method must call super.setSizes(_globalSize, _localSize) to ensure that the kernel receives the correct value for localSize.

This callback is also useful for initializing arrays based on the known global and local sizes.

Parameters:
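_globalSize - the global size for the upcoming execution
_localSize - the local size that Aparapi determined for the upcoming execution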
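
The override pattern described above might look like the following sketch. So that it compiles without Aparapi on the classpath, it uses KernelStandIn, a hypothetical stand-in for Kernel that models only setSizes; ScratchKernelSketch, its fields and the sized(...) helper are likewise hypothetical. In real code the class would extend com.amd.aparapi.Kernel instead.

```java
// Hypothetical stand-in for com.amd.aparapi.Kernel; models only the setSizes callback.
abstract class KernelStandIn {
    private int globalSize;
    private int localSize;

    protected void setSizes(int _globalSize, int _localSize) {
        globalSize = _globalSize;
        localSize = _localSize;
    }

    int getGlobalSize() {
        return globalSize;
    }

    int getLocalSize() {
        return localSize;
    }
}

// Hypothetical kernel that sizes its buffers from the callback.
final class ScratchKernelSketch extends KernelStandIn {
    int[] perItem;      // one element per global id
    int[] perGroupSlot; // one element per local id within a group

    @Override
    protected void setSizes(int _globalSize, int _localSize) {
        // Call super first so the base class records the sizes.
        super.setSizes(_globalSize, _localSize);
        perItem = new int[_globalSize];
        perGroupSlot = new int[_localSize];
    }

    // Simulates the framework invoking the callback before execution.
    static ScratchKernelSketch sized(int globalSize, int localSize) {
        ScratchKernelSketch kernel = new ScratchKernelSketch();
        kernel.setSizes(globalSize, localSize);
        return kernel;
    }
}
```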

execute

public long execute(int _globalSize)
Start execution of globalSize kernels.

When kernel.execute(globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.

Parameters:
_globalSize - The number of Kernels that we would like to initiate.
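
Conceptually, the body of run() is invoked once per global id, in no guaranteed order. The sketch below imitates that model in plain Java for the squaring example from the class description, using a parallel IntStream in place of Aparapi's thread pool. JtpExecuteSketch is a hypothetical name, the lambda parameter gid plays the role of getGlobalId(), and this is not Aparapi's scheduler.

```java
import java.util.stream.IntStream;

// Hypothetical illustration in plain Java, not part of Aparapi.
final class JtpExecuteSketch {
    // Computes squares[gid] = values[gid] * values[gid] for every gid in [0, globalSize).
    static int[] squares(int[] values) {
        int globalSize = values.length;
        int[] squares = new int[globalSize];
        // One logical invocation per global id; each writes only its own element,
        // so the invocations are independent and can run concurrently.
        IntStream.range(0, globalSize)
                 .parallel()
                 .forEach(gid -> squares[gid] = values[gid] * values[gid]);
        return squares;
    }
}
```

The example works in parallel only because no invocation reads or writes another invocation's element; kernels that share data need atomicAdd or localBarrier().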

dispose

public void dispose()
Release any resources associated with this Kernel.

When the execution mode is CPU or GPU, Aparapi stores some OpenCL resources in a data structure associated with the kernel instance. The dispose() method must be called to release these resources.

If execute(int _globalSize) is called after dispose(), the results are undefined.


getExecutionMode

public Kernel.EXECUTION_MODE getExecutionMode()
Return the current execution mode. Before a Kernel executes, this return value will be the execution mode as determined by the setting of the EXECUTION_MODE enumeration. By default, this setting is either GPU if OpenCL is available on the target system, or JTP otherwise. This default setting can be changed by calling setExecutionMode().

After a Kernel executes, the return value will be the mode in which the Kernel actually executed.

Returns:
The current execution mode.
See Also:
setExecutionMode(EXECUTION_MODE)

setExecutionMode

public void setExecutionMode(Kernel.EXECUTION_MODE _executionMode)
Set the execution mode.

This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload.

Parameters:
_executionMode - the requested execution mode.
See Also:
getExecutionMode()


Copyright © 2010 AMD INC. All Rights Reserved.