RenderScript rendering is much slower than OpenGL rendering on Android

Question

Jam*_*hao 7 android opengl-es renderscript

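Background:

I want to add real-time filters on top of the code of the Android Camera app. However, the Camera app's architecture is based on OpenGL ES 1.x, and I need shaders to customize our filter implementation. Updating the Camera app to OpenGL ES 2.0 is too difficult, so I had to find some way other than OpenGL to implement real-time filters. After some research, I decided to use RenderScript.

Problem:

I wrote a demo of a simple filter in RenderScript. Its frame rate is much lower than that of the OpenGL implementation: about 5 fps vs. 15 fps.

Questions:

The official Android site says: "The RenderScript runtime will parallelize work across all processors available on a device, such as multi-core CPUs, GPUs, or DSPs, allowing you to focus on expressing algorithms rather than scheduling work or load balancing." So why is the RenderScript implementation slower?
If RenderScript cannot meet my requirements, is there a better way?

Code details:

Hi, I am on the same team as the asker. We want to write a real-time filter camera based on RenderScript. In our test demo project we use a simple filter: a YuvToRGB intrinsic script followed by an overlay-filter ScriptC script. In the OpenGL version, we upload the camera data as a texture and do the image filter processing in a shader, like this: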

    GLES20.glActiveTexture(GLES20.GL_TEXTURE0);
    GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureYHandle);
    GLES20.glUniform1i(shader.uniforms.get("uTextureY"), 0);
    GLES20.glTexSubImage2D(GLES20.GL_TEXTURE_2D, 0, 0, 0, mTextureWidth,
            mTextureHeight, GLES20.GL_LUMINANCE, GLES20.GL_UNSIGNED_BYTE,
            mPixelsYBuffer.position(0));


In the RenderScript version, we copy the camera data into an Allocation and do the image filter processing in script kernels, like this:

    // The code below is from onPreviewFrame(byte[] data, Camera camera), which delivers the camera frame data.
    byte[] imageData = datas[0];
    long timeBegin = System.currentTimeMillis();
    mYUVInAllocation.copyFrom(imageData);

    mYuv.setInput(mYUVInAllocation);
    mYuv.forEach(mRGBAAllocationA);
    // copyTo() is called here to make sure the YUV-to-RGBA conversion has finished.
    mRGBAAllocationA.copyTo(mOutBitmap);
    Log.e(TAG, "RS time: YUV to RGBA : " + String.valueOf((System.currentTimeMillis() - timeBegin)));

    mLayerScript.forEach_overlay(mRGBAAllocationA, mRGBAAllocationB);
    mRGBAAllocationB.copyTo(mOutBitmap);
    // Note: this is measured from timeBegin, so it also includes the YUV-to-RGBA stage above.
    Log.e(TAG, "RS time: overlay : " + String.valueOf((System.currentTimeMillis() - timeBegin)));

    mCameraSurPreview.refresh(mOutBitmap, mCameraDisplayOrientation, timeBegin);


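The two problems are: (1) The RenderScript pipeline seems to be slower than the OpenGL pipeline. (2) According to our timing logs, the YUV-to-RGBA step, which uses the intrinsic script, is very fast at about 6 ms, but the overlay step, which uses ScriptC, is very slow at about 180 ms. Why does this happen?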
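
One detail worth checking in the snippet above: both log lines subtract the same `timeBegin`, so the "overlay" figure is cumulative and also includes the YUV conversion and both `copyTo` calls. A minimal sketch of per-stage timing follows. `StageTimer` is a hypothetical helper, not part of the original project, and the injectable clock exists only so the helper can be checked without a device.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper: records how long each pipeline stage takes on its own. */
final class StageTimer {
    private final LongSupplier clock;                      // returns nanoseconds
    private final Map<String, Long> stageNanos = new LinkedHashMap<>();
    private long last;

    /** Uses System.nanoTime() as the clock. */
    StageTimer() {
        this(System::nanoTime);
    }

    /** Accepts a custom nanosecond clock (an assumption added for testability). */
    StageTimer(LongSupplier clock) {
        this.clock = clock;
        this.last = clock.getAsLong();
    }

    /** Ends the current stage, adds its duration under the given name, and starts the next stage. */
    long mark(String name) {
        long now = clock.getAsLong();
        long elapsed = now - last;
        stageNanos.merge(name, elapsed, Long::sum);
        last = now;
        return elapsed;
    }

    /** Total time recorded for a stage, in milliseconds (0 if the stage was never marked). */
    double millis(String name) {
        Long nanos = stageNanos.get(name);
        return nanos == null ? 0.0 : nanos / 1_000_000.0;
    }
}
```

In `onPreviewFrame` this could be used by creating a `StageTimer` where `timeBegin` is set and calling `mark(...)` after `copyFrom`, after the first `copyTo`, after `forEach_overlay`, and after the second `copyTo`. That would separate kernel time from the cost of copying into the Bitmap.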

Here is the RS kernel code of the ScriptC script we use (mLayerScript):

#pragma version(1)
#pragma rs java_package_name(**.renderscript)
#pragma stateFragment(parent)

#include "rs_graphics.rsh"

static rs_allocation layer;
static uint32_t dimX;
static uint32_t dimY;

void setLayer(rs_allocation layer1) {
    layer = layer1;
}

void setBitmapDim(uint32_t dimX1, uint32_t dimY1) {
    dimX = dimX1;
    dimY = dimY1;
}

static float BlendOverlayf(float base, float blend) {
    return (base < 0.5 ? (2.0 * base * blend) : (1.0 - 2.0 * (1.0 - base) * (1.0 - blend)));
}

static float3 BlendOverlay(float3 base, float3 blend) {
    float3 blendOverLayPixel = {BlendOverlayf(base.r, blend.r), BlendOverlayf(base.g, blend.g), BlendOverlayf(base.b, blend.b)};
    return blendOverLayPixel;
}

uchar4 __attribute__((kernel)) overlay(uchar4 in, uint32_t x, uint32_t y) {
    float4 inPixel = rsUnpackColor8888(in);

    uint32_t layerDimX = rsAllocationGetDimX(layer);
    uint32_t layerDimY = rsAllocationGetDimY(layer);

    uint32_t layerX = x * layerDimX / dimX;
    uint32_t layerY = y * layerDimY / dimY;

    uchar4* p = (uchar4*)rsGetElementAt(layer, layerX, layerY);
    float4 layerPixel = rsUnpackColor8888(*p);

    float3 color = BlendOverlay(inPixel.rgb, layerPixel.rgb);

    float4 outf = {color.r, color.g, color.b, inPixel.a};
    uchar4 outc = rsPackColorTo8888(outf.r, outf.g, outf.b, outf.a);

    return outc;
}

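
For checking the kernel's output independently of the RenderScript runtime, the same overlay blend and nearest-neighbour layer lookup can be written as a plain-Java CPU reference. This is only a sketch under assumptions: `OverlayReference` is a hypothetical class, pixels are assumed to be packed `0xAARRGGBB` ints (as returned by `Bitmap.getPixels`), and rounding may differ from `rsPackColorTo8888` by one step.

```java
/** Hypothetical CPU reference for the overlay kernel; intended for validation, not for real-time use. */
final class OverlayReference {
    private OverlayReference() {}

    /** Overlay blend of one channel; base and blend are in [0, 1]. Mirrors BlendOverlayf. */
    static float blendChannel(float base, float blend) {
        return base < 0.5f
                ? 2.0f * base * blend
                : 1.0f - 2.0f * (1.0f - base) * (1.0f - blend);
    }

    /** Blends two packed 0xAARRGGBB pixels and keeps the alpha of the base pixel. */
    static int blendPixel(int base, int layer) {
        int out = base & 0xFF000000;
        for (int shift = 16; shift >= 0; shift -= 8) {
            float b = ((base >>> shift) & 0xFF) / 255.0f;
            float l = ((layer >>> shift) & 0xFF) / 255.0f;
            int c = Math.round(blendChannel(b, l) * 255.0f);
            out |= Math.max(0, Math.min(255, c)) << shift;
        }
        return out;
    }

    /** Applies the blend to a whole image, sampling the layer with the kernel's index mapping. */
    static int[] overlay(int[] base, int dimX, int dimY,
                         int[] layer, int layerDimX, int layerDimY) {
        int[] out = new int[dimX * dimY];
        for (int y = 0; y < dimY; y++) {
            int layerY = y * layerDimY / dimY;
            for (int x = 0; x < dimX; x++) {
                int layerX = x * layerDimX / dimX;
                out[y * dimX + x] =
                        blendPixel(base[y * dimX + x], layer[layerY * layerDimX + layerX]);
            }
        }
        return out;
    }
}
```

Note that the Java version uses `float` literals (`2.0f`, `0.5f`) throughout, whereas the kernel's unsuffixed literals such as `2.0` are C doubles.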

Answer 1

Cla*_*ery 2

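RenderScript does not use any GPU or DSP cores. That is a common misconception, encouraged by Google's deliberately vague documentation. RenderScript used to have an interface to OpenGL ES, but it has been deprecated and was never used for much beyond animated wallpapers. RenderScript will use multiple CPU cores if they are available, but I suspect RenderScript will be replaced by OpenCL.

Take a look at the Effects class and the Effects demo in the Android SDK. It shows how to use OpenGL ES 2.0 shaders to apply effects to images without writing OpenGL ES code.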

http://software.intel.com/en-us/articles/porting-opengl-games-to-android-on-intel-atom-processors-part-1

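Update:

It is great when I learn more by answering a question than by asking one, and that is the case here. You can see from the lack of answers that RenderScript is hardly used outside Google, because of its strange architecture that ignores industry standards such as OpenCL, and because documentation on how it actually works is almost nonexistent. Nonetheless, my answer did draw a rare response from the RenderScript development team. It included only one link that actually contains any useful information about RenderScript: this article by Alexandru Voica of IMG, the PowerVR GPU vendor: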

http://withimagination.imgtec.com/index.php/powervr/running-renderscript-efficiently-with-powervr-gpus-on-android

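That article has some good information that was new to me. More people have posted comments there saying they are having trouble getting their RenderScript code to actually run on the GPU.

However, I was wrong to assume that Google is no longer developing RenderScript. My statement that "RenderScript does not use any GPU or DSP cores" was true until fairly recently, but I have learned that this changed as of one of the Jelly Bean releases. It would be great if one of the RenderScript developers could explain this, or if there were at least a public web page that explains it or lists which GPUs are actually supported and how to tell whether your code is really running on a GPU.

My opinion is that Google will eventually replace RenderScript with OpenCL, and I would not invest time in developing with it.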

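"No GPU support" is completely wrong; every Nexus device currently on the market (and many other devices) ships with an RS GPU driver. (6 upvotes)
This is completely wrong. RS and GLSL have nothing to do with each other; they are completely separate user-mode driver stacks. RS bitcode can run on the CPU on devices without GPU support, or on the GPU when an appropriate GPU is present. Developers do not have to provide multiple source files or anything like that. (Source: I work on the RS runtime, driver model, and API.) (6 upvotes)
Links, easy. ARM: http://www.arm.com/products/multimedia/mali-graphics-plus-gpu-compute/mali-t604.php?tab=Specifications Qct: http://www.qualcomm.com/media/blog/2013/01/11/inside-snapdragon-800-series-processors-new-adreno-330-gpu Img: http://withimagination.imgtec.com/index.php/powervr/running-renderscript-efficiently-with-powervr-gpus-on-android I could go on, but I would soon run out of characters. (5 upvotes)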
