Database fetchrow_array failed: long truncated DBI attribute

Mar*_*Lim 7 perl dbi

I'm using a Perl script to pull URLs from my database. Fetching them with fetchrow_array worked fine until I hit a very long URL, georgelog24.blog.iskreni.net/?bid=6744d9dcf85991ed2e4b8a258153a1ab&lid=ff9963b9a798ea335b75b5f7c0c295d1
After that it started giving me this error.

DBD::ODBC::st fetchrow_array failed: st_fetch/SQLFetch (long truncated DBI attribute LongTruncOk not set and/or LongReadLen too small) (SQL-HY000) [state was HY000 now 01004]
[Microsoft][ODBC SQL Server Driver]String data, right truncation (SQL-01004) at C:\test\multihashtest2.pl line 44.

I believe this is on the database side, because the code I use to pull the URLs has worked before. The database I'm using is MSSQL Server 2005.

The URL column in the database currently uses the text type. I have tried changing it to varchar(max) and nvarchar(max), but the error persists.

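After some trial and error, I found that the longest URL I can successfully fetch with fetchrow_array is 81 characters. Since URLs can sometimes run to absurd lengths, I can't impose a limit on URL length.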

Can anyone help me understand this problem and suggest a fix?

FYI: line 44 is the first line of my code below.

while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do 
    my $thread = threads->create(\&webcrawl); #initiate thread
    my $tid = $thread->tid;
    print "  - Thread $tid started\n";   #obtain thread no. and print
    push (@Threads, $thread);   #push thread into array for "housekeeping" later on
}

Tud*_*tin 12

Try:

# no more errors if content is truncated - you don't necessarily want this
$statement_handle->{'LongTruncOk'} = 1;

#nice, hard coded constant for the length of data to be read from Longs
$statement_handle->{'LongReadLen'} = 20000;
while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do 
    my $thread = threads->create(\&webcrawl); #initiate thread
    my $tid = $thread->tid;
    print "  - Thread $tid started\n";   #obtain thread no. and print
    push (@Threads, $thread);   #push thread into array for "housekeeping" later on
}
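
One caveat: the DBI documentation notes that changing LongReadLen on a statement handle after it has been prepared will typically have no effect, so it is common to set it on the database handle before calling prepare. A minimal sketch of that ordering follows. The helper name prepare_with_long_reads, the DSN, the credentials, and the table and column names are all assumptions made for illustration and do not come from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: configure long-column handling on a connected DBI
# database handle, then prepare the query. Statement handles created after
# this point inherit the settings from $dbh.
sub prepare_with_long_reads {
    my ($dbh, $sql, $max_len) = @_;

    $dbh->{LongReadLen} = $max_len;  # maximum length fetched for a long column
    $dbh->{LongTruncOk} = 0;         # still raise an error if a value exceeds it

    return $dbh->prepare($sql);
}

# Usage sketch (DSN, credentials, table and column names are placeholders):
#   use DBI;
#   my $dbh = DBI->connect('dbi:ODBC:MyDSN', $user, $pass, { RaiseError => 1 });
#   my $sth = prepare_with_long_reads($dbh, 'SELECT id, url FROM urls', 20_000);
#   $sth->execute;
#   while (my ($myid, $url) = $sth->fetchrow_array) { ... }
```

Leaving LongTruncOk at 0 means an over-long value still raises an error instead of being cut short silently, which is probably what you want for URLs you intend to crawl.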

Also, I'd suggest trying Parallel::ForkManager to parallelize the jobs; I find it more intuitive and easier to use than threads.


boh*_*ica 5

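Take a look at the DBI attributes LongTruncOk and LongReadLen.

You will need to either accept truncation or set a maximum size, because text and varchar(max) columns can be huge. If it were left to the DBD, it would have no choice but to allocate a huge amount of memory in case a value was as large as the column's maximum size.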
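
If you would rather not hard-code the maximum size, one option is to ask the server for the longest stored value first and use that as LongReadLen. This is only a sketch under assumptions: the helper name longest_value_length and the urls/url names are hypothetical, the identifiers are interpolated directly into the SQL and so must be trusted, and DATALENGTH is SQL Server specific and reports bytes (for nvarchar that is two bytes per character).

```perl
use strict;
use warnings;

# Hypothetical helper: return the size in bytes of the largest value stored
# in $column of $table, or 0 if the table has no non-NULL values.
# $table and $column are interpolated as-is, so pass only trusted identifiers.
sub longest_value_length {
    my ($dbh, $table, $column) = @_;

    my ($len) = $dbh->selectrow_array(
        "SELECT MAX(DATALENGTH($column)) FROM $table"
    );

    return defined $len ? $len : 0;
}

# Usage sketch, run before prepare (table and column names are placeholders):
#   $dbh->{LongReadLen} = longest_value_length($dbh, 'urls', 'url');
#   my $sth = $dbh->prepare('SELECT id, url FROM urls');
```

This costs an extra query, and a longer row inserted between the two statements would still trigger the truncation error, so a generous fixed limit may be simpler in practice.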