Efficient way to determine the skew of an image

Question

I'm trying to write a program that programmatically determines the tilt, or angle of rotation, of an arbitrary image.

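The images have the following properties:

They consist of dark text on a light background.
They occasionally contain horizontal or vertical lines, which only intersect at 90 degree angles.
They are skewed by between -45 and 45 degrees.
See this image for reference (it has been skewed 2.8 degrees).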

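So far, I've come up with this strategy: draw a route from left to right, always selecting the nearest white pixel. Presumably, such a route will tend to follow the gaps between lines of text, and so run along the tilt of the image.

Here's my code: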

private bool IsWhite(Color c) { return c.GetBrightness() >= 0.5 || c == Color.Transparent; }

private bool IsBlack(Color c) { return !IsWhite(c); }

private double ToDegrees(decimal slope) { return (180.0 / Math.PI) * Math.Atan(Convert.ToDouble(slope)); }

private void GetSkew(Bitmap image, out double minSkew, out double maxSkew)
{
    decimal minSlope = 0.0M;
    decimal maxSlope = 0.0M;
    for (int start_y = 0; start_y < image.Height; start_y++)
    {
        int end_y = start_y;
        for (int x = 1; x < image.Width; x++)
        {
            int above_y = Math.Max(end_y - 1, 0);
            int below_y = Math.Min(end_y + 1, image.Height - 1);

            Color center = image.GetPixel(x, end_y);
            Color above = image.GetPixel(x, above_y);
            Color below = image.GetPixel(x, below_y);

            if (IsWhite(center)) { /* no change to end_y */ }
            else if (IsWhite(above) && IsBlack(below)) { end_y = above_y; }
            else if (IsBlack(above) && IsWhite(below)) { end_y = below_y; }
        }

        decimal slope = (Convert.ToDecimal(start_y) - Convert.ToDecimal(end_y)) / Convert.ToDecimal(image.Width);
        minSlope = Math.Min(minSlope, slope);
        maxSlope = Math.Max(maxSlope, slope);
    }

    minSkew = ToDegrees(minSlope);
    maxSkew = ToDegrees(maxSlope);
}


This works well on some images and poorly on others, and it's slow.

Is there a more efficient, more reliable way to determine the tilt of an image?

Answer 1

Jul*_*iet 6

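I've made some modifications to my code. It now runs a lot faster, but it's still not very accurate.

I've made the following improvements:

Using Vinko's suggestion, I avoid GetPixel in favor of working with the bytes directly; the code now runs at the speed I need.
My original code simply used "IsBlack" and "IsWhite", but this isn't granular enough. The original code traces the following paths through the image: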

http://img43.imageshack.us/img43/1545/tilted3degtextoriginalw.gif

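Note that a number of paths pass through the text. I now compare the center, above, and below pixels by their actual brightness values and select the brightest one. Basically I'm treating the bitmap as a heightmap, and the path from left to right follows the contours of the image, resulting in a better path: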

http://img10.imageshack.us/img10/5807/tilted3degtextbrightnes.gif

As suggested by Toaomalkster, a Gaussian blur smooths out the height map, and I get even better results:

http://img197.imageshack.us/img197/742/tilted3degtextblurredwi.gif

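Since this is just prototype code, I blurred the image using GIMP; I did not write my own blur function.

The selected path is pretty good for a greedy algorithm.
As Toaomalkster suggested, choosing the min/max slope is naive. A simple linear regression gives a better approximation of the slope of a path. Additionally, I should cut a path short once it runs off the edge of the image; otherwise the path hugs the top of the image and gives an incorrect slope.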
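The blur step above was done in GIMP rather than in code. As a rough illustration of what an in-code equivalent might look like, here is a minimal sketch that smooths a brightness grid with a separable 1-2-1 kernel, a common small approximation to a Gaussian. The `HeightMapBlur` class, the `Blur` method, and the `[y, x]` integer grid layout are assumptions made for this example; they are not part of the original code.

```csharp
using System;

static class HeightMapBlur
{
    // Hypothetical helper: smooths a brightness grid indexed as [y, x] with a
    // separable 1-2-1 kernel, clamping coordinates at the image borders.
    public static int[,] Blur(int[,] src)
    {
        int h = src.GetLength(0), w = src.GetLength(1);
        int[,] tmp = new int[h, w];
        int[,] dst = new int[h, w];

        // Horizontal pass: weighted average of left, center, and right pixels.
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                int left = src[y, Math.Max(x - 1, 0)];
                int right = src[y, Math.Min(x + 1, w - 1)];
                tmp[y, x] = (left + 2 * src[y, x] + right) / 4;
            }
        }

        // Vertical pass over the horizontally smoothed values.
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                int above = tmp[Math.Max(y - 1, 0), x];
                int below = tmp[Math.Min(y + 1, h - 1), x];
                dst[y, x] = (above + 2 * tmp[y, x] + below) / 4;
            }
        }

        return dst;
    }
}
```

A wider blur could be approximated by applying `Blur` several times. How much smoothing helps the path tracing would have to be tuned per image set.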

Code

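using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

// Converts a slope (rise over run) to an angle in degrees.
private double ToDegrees(double slope) { return (180.0 / Math.PI) * Math.Atan(slope); }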

private double GetSkew(Bitmap image)
{
    BrightnessWrapper wrapper = new BrightnessWrapper(image);

    LinkedList<double> slopes = new LinkedList<double>();

    for (int y = 0; y < wrapper.Height; y++)
    {
        int endY = y;

        long sumOfX = 0;
        long sumOfY = y;
        long sumOfXY = 0;
        long sumOfXX = 0;
        int itemsInSet = 1;
        for (int x = 1; x < wrapper.Width; x++)
        {
            int aboveY = endY - 1;
            int belowY = endY + 1;

            if (aboveY < 0 || belowY >= wrapper.Height)
            {
                break;
            }

            int center = wrapper.GetBrightness(x, endY);
            int above = wrapper.GetBrightness(x, aboveY);
            int below = wrapper.GetBrightness(x, belowY);

            if (center >= above && center >= below) { /* no change to endY */ }
            else if (above >= center && above >= below) { endY = aboveY; }
            else if (below >= center && below >= above) { endY = belowY; }

            itemsInSet++;
            sumOfX += x;
            sumOfY += endY;
            // Multiply as long so very large images cannot overflow int.
            sumOfXX += ((long)x * x);
            sumOfXY += ((long)x * endY);
        }

        // least squares slope = (NΣ(XY) - (ΣX)(ΣY)) / (NΣ(X^2) - (ΣX)^2), where N = elements in set
        if (itemsInSet > image.Width / 2) // path covers at least half of the image
        {
            decimal sumOfX_d = Convert.ToDecimal(sumOfX);
            decimal sumOfY_d = Convert.ToDecimal(sumOfY);
            decimal sumOfXY_d = Convert.ToDecimal(sumOfXY);
            decimal sumOfXX_d = Convert.ToDecimal(sumOfXX);
            decimal itemsInSet_d = Convert.ToDecimal(itemsInSet);
            decimal slope =
                ((itemsInSet_d * sumOfXY_d) - (sumOfX_d * sumOfY_d))
                /
                ((itemsInSet_d * sumOfXX_d) - (sumOfX_d * sumOfX_d));

            slopes.AddLast(Convert.ToDouble(slope));
        }
    }

    double mean = slopes.Average();
    double sumOfSquares = slopes.Sum(d => Math.Pow(d - mean, 2));
    double stddev = Math.Sqrt(sumOfSquares / (slopes.Count - 1));

    // select items within 1 standard deviation of the mean
    var testSample = slopes.Where(x => Math.Abs(x - mean) <= stddev);

    return ToDegrees(testSample.Average());
}

class BrightnessWrapper
{
    byte[] rgbValues;
    int stride;
    public int Height { get; private set; }
    public int Width { get; private set; }

    public BrightnessWrapper(Bitmap bmp)
    {
        Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);

        // Lock as 24bpp so GetBrightness can rely on 3 bytes per pixel.
        System.Drawing.Imaging.BitmapData bmpData =
            bmp.LockBits(rect,
                System.Drawing.Imaging.ImageLockMode.ReadOnly,
                System.Drawing.Imaging.PixelFormat.Format24bppRgb);

        try
        {
            int bytes = bmpData.Stride * bmp.Height;
            this.rgbValues = new byte[bytes];

            // Copy the raw pixel data into a managed array.
            System.Runtime.InteropServices.Marshal.Copy(bmpData.Scan0,
                           rgbValues, 0, bytes);

            this.Height = bmp.Height;
            this.Width = bmp.Width;
            this.stride = bmpData.Stride;
        }
        finally
        {
            // Always release the lock, even if the copy fails.
            bmp.UnlockBits(bmpData);
        }
    }

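    public int GetBrightness(int x, int y)
    {
        // Assumes 3 bytes per pixel in B, G, R order; rows are `stride` bytes apart.
        int position = (y * this.stride) + (x * 3);
        int b = rgbValues[position];
        int g = rgbValues[position + 1];
        int r = rgbValues[position + 2];
        // Cheap integer luminance approximation that weights green most heavily.
        return (r + r + b + g + g + g) / 6;
    }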
}
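
For reference, the per-path slope that GetSkew accumulates can be written out as follows. This is just the standard least-squares formula restated in the code's terms, where N corresponds to itemsInSet and the points (x_i, y_i) are the pixels visited along one path:

```latex
m = \frac{N \sum_{i} x_i y_i - \left(\sum_{i} x_i\right)\left(\sum_{i} y_i\right)}
         {N \sum_{i} x_i^2 - \left(\sum_{i} x_i\right)^2},
\qquad
\theta = \frac{180}{\pi} \arctan\left(\bar{m}\right)
```

Here \bar{m} is the mean of the per-path slopes that lie within one standard deviation of the overall mean, which is what the final lines of GetSkew compute.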


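The code is good, but not great. Large amounts of whitespace cause the program to draw relatively flat lines with a slope near 0, so the code underestimates the actual tilt of the image.

Sampling random starting points instead of all points makes no appreciable difference to the accuracy, because the ratio of "flat" paths in a random sample is the same as the ratio of "flat" paths in the entire image.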

Answer 2

Vin*_*vic 5

GetPixel is slow. You can get an order of magnitude speed-up using the approach listed here.

Answer 3

Can*_*ith 3

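If the text is left (right) aligned, you can determine the slope by measuring the distance between the left (right) edge of the image and the first dark pixel at two random positions, then calculating the slope from those two distances. Additional measurements would lower the error at the cost of additional time.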
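
The margin idea above could be sketched roughly as follows for left-aligned text. Everything here is an assumption made for illustration: the `MarginSkew` class, the `[y, x]` brightness grid, the threshold of 128, and the sign convention (positive when the margin moves right as y increases). The caller picks the two rows, for example at random.

```csharp
using System;

static class MarginSkew
{
    // Hypothetical helper: x of the first pixel darker than `threshold`
    // in row y, or -1 if the row contains no dark pixel.
    static int FirstDarkX(int[,] brightness, int y, int threshold)
    {
        int w = brightness.GetLength(1);
        for (int x = 0; x < w; x++)
        {
            if (brightness[y, x] < threshold) return x;
        }
        return -1;
    }

    // Estimates the skew in degrees from the left margin at rows y1 and y2.
    // Returns null when either row contains no dark pixel.
    public static double? Estimate(int[,] brightness, int y1, int y2, int threshold = 128)
    {
        if (y1 == y2) throw new ArgumentException("rows must differ");

        int x1 = FirstDarkX(brightness, y1, threshold);
        int x2 = FirstDarkX(brightness, y2, threshold);
        if (x1 < 0 || x2 < 0) return null;

        // The margin drifts (x2 - x1) pixels over (y2 - y1) rows.
        double drift = (double)(x2 - x1) / (y2 - y1);
        return Math.Atan(drift) * 180.0 / Math.PI;
    }
}
```

A single pair of rows is fragile: a row that falls on an indented line or between lines of text gives a misleading distance. Repeating the measurement over several row pairs and combining the results, as the answer suggests, would likely be needed in practice.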

Archived:	16 years ago
Viewed:	2861 times
Last active:	14 years, 4 months ago