Constant is out of scope but clearly defined (or so I think)

jlr*_*jlr 3 haskell

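I'm trying to scrape a website with Scalpel, but I get a "Not in scope" error when using their own example code. The example can be found on their GitHub page, in the section "My scraping target doesn't return the markup I expected".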

I'm using the ghc-8.6.4 Haskell compiler.

My package.yaml dependencies are:

dependencies:
- base >= 4.7 && < 5
- http-conduit
- http-client
- http-client-tls
- http-types
- scalpel

The code:

{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE OverloadedStrings #-}

module Example where

import Text.HTML.Scalpel
import qualified Network.HTTP.Client as HTTP
import qualified Network.HTTP.Client.TLS as HTTP
import qualified Network.HTTP.Types.Header as HTTP

-- Create a new manager settings based on the default TLS manager that updates
-- the request headers to include a custom user agent.
managerSettings :: HTTP.ManagerSettings
managerSettings = HTTP.tlsManagerSettings {
  HTTP.managerModifyRequest = \req -> do
    req' <- HTTP.managerModifyRequest HTTP.tlsManagerSettings req
    return $ req' {
      HTTP.requestHeaders = (HTTP.hUserAgent, "My Custom UA")
                          : HTTP.requestHeaders req'
    }
}

main = do
    manager <- Just <$> HTTP.newManager managerSettings
    html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector
    maybe printError printHtml html
  where
    url = "https://www.google.com"
    printError = putStrLn "Failed"
    printHtml = mapM_ putStrLn

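As you can see in the code sample, the manager constant sits right next to the def function. Yet somehow manager seems to be hidden... I can't put my finger on what's wrong.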

Here is the full console output of the stack build command, including the reported error:

jroyer$ stack build
my-okr-haskeller-0.1.0.0: build (lib + exe)
Preprocessing library for my-okr-haskeller-0.1.0.0..
Building library for my-okr-haskeller-0.1.0.0..
[2 of 3] Compiling Example          ( src/Example.hs, .stack-work/dist/x86_64-osx/Cabal-2.4.0.1/build/Example.o )

/Users/jroyer/Projects/bizgithub/my-okr-haskeller/src/Example.hs:26:40: error: Not in scope: ‘manager’
   |
26 |     html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector
   |                                        ^^^^^^^


--  While building package my-okr-haskeller-0.1.0.0 using:
      /Users/jroyer/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.4.0.1_ghc-8.6.4 --builddir=.stack-work/dist/x86_64-osx/Cabal-2.4.0.1 build lib:my-okr-haskeller exe:my-okr-haskeller-exe --ghc-options " -ddump-hi -ddump-to-file -fdiagnostics-color=always"
    Process exited with code: ExitFailure 1

Tho*_*son 5

EDIT: I can reproduce the asker's problem with the older version of scalpel that the asker mentioned they are using:

[1 of 1] Compiling Example          ( Main.hs, /var/folders/m7/_2kqsz4n4c3ck8050glq4ggr0000gn/T/cabal-repl.-26184/dist-newstyle/build/x86_64-osx/ghc-8.6.4/fake-package-0/x/script/build/script/script-tmp/Example.o )

Main.hs:34:40: error: Not in scope: ‘manager’
   |
34 |     html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector
   |                                        ^^^^^^^
./so.hs  16.94s user 3.89s system 114% cpu 18.155 total

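This is a suboptimal error message. It appears to result from using a named field pun with a variable that is not a field name. That is, Config in that version of scalpel has no manager field. We can reproduce the issue in a smaller example: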

% cat test.hs
{-# LANGUAGE NamedFieldPuns #-}
data Foo = Foo { bar :: Int } deriving (Show)
main :: IO ()
main = print (Foo { zar})
 where zar = 23 :: Int
% ghc test.hs
...snipt...
test.hs:4:21: error:
    Not in scope: ‘zar’
    Perhaps you meant ‘bar’ (line 3)
  |
4 | main = print (Foo { zar})

So the solution is to update scalpel to a newer version.
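For contrast with the failing Foo example above, here is a minimal sketch of the pun working when the local variable does share its name with the field. The names `punned` and `explicit` are made up for illustration; the point is only that `Foo { bar }` is shorthand for `Foo { bar = bar }`.

```haskell
{-# LANGUAGE NamedFieldPuns #-}
module Main where

-- Same shape as the Foo record in the failing example above.
data Foo = Foo { bar :: Int } deriving (Show, Eq)

-- The local variable `bar` matches the field name, so the pun is accepted.
-- GHC expands `Foo { bar }` to `Foo { bar = bar }`.
punned :: Foo
punned = Foo { bar }
  where bar = 23

-- The equivalent construction without the pun (hypothetical name bar').
explicit :: Foo
explicit = Foo { bar = bar' }
  where bar' = 23

main :: IO ()
main = do
  print punned                -- shows the record built via the pun
  print (punned == explicit)  -- both forms build the same value
```

If the variable were named `zar`, as in the failing example, GHC would look for a field called `zar`, find none, and report the misleading "Not in scope" error.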

html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector

I don't know what this is supposed to be. Specifically, (def { manager }). That isn't any syntax I'm familiar with.

If there is a manager, there should be a field for it. For example:

def { someField = someValue }

Not what you have, def { someValue }, which doesn't make any sense.

Ah, NamedFieldPuns. Honestly, I've never used them; looking at them, I find I use RecordWildCards instead. Moving on.
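To make the two record-update spellings concrete, here is a small sketch. `Config`, `defaultConfig`, and the `retries` field are hypothetical stand-ins defined locally, not scalpel's real types; `defaultConfig` merely plays the role of `def`.

```haskell
{-# LANGUAGE NamedFieldPuns #-}
module Main where

-- Hypothetical stand-in for a library config record (not scalpel's Config).
data Config = Config
  { manager :: Maybe String
  , retries :: Int
  } deriving (Show, Eq)

-- Hypothetical default value, playing the role of `def`.
defaultConfig :: Config
defaultConfig = Config { manager = Nothing, retries = 3 }

-- Record update with an explicit `field = value` assignment.
explicitUpdate :: Maybe String -> Config
explicitUpdate m = defaultConfig { manager = m }

-- The same update written as a pun: the argument is named after the field,
-- so `{ manager }` expands to `{ manager = manager }`.
punnedUpdate :: Maybe String -> Config
punnedUpdate manager = defaultConfig { manager }

main :: IO ()
main = print (explicitUpdate (Just "tls") == punnedUpdate (Just "tls"))
```

Both functions should produce the same record; the pun only saves repeating the name, and it requires the record type to actually have a field with that name.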

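Looking at the Haddocks, the field name is manager, so you have a manager field and a manager value, which is exactly a named field pun. I needed to add an import for def. While I was at it, I took the liberty of using a cabal script with a shebang so that all the packages are stated explicitly: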

#! /usr/bin/env cabal
{- cabal:
build-depends:
      base >= 4
    , scalpel == 0.6.0
    , http-types == 0.12.3
    , http-client-tls == 0.3.5.3
    , http-client == 0.6.4
    , data-default == 0.7.1.1
-}
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.Default
import Text.HTML.Scalpel
import qualified Network.HTTP.Client as HTTP
import qualified Network.HTTP.Client.TLS as HTTP
import qualified Network.HTTP.Types.Header as HTTP

-- Create a new manager settings based on the default TLS manager that updates
-- the request headers to include a custom user agent.
managerSettings :: HTTP.ManagerSettings
managerSettings = HTTP.tlsManagerSettings {
  HTTP.managerModifyRequest = \req -> do
    req' <- HTTP.managerModifyRequest HTTP.tlsManagerSettings req
    return $ req' {
      HTTP.requestHeaders = (HTTP.hUserAgent, "My Custom UA")
                          : HTTP.requestHeaders req'
    }
}

main = do
    manager <- Just <$> HTTP.newManager managerSettings
    html <- scrapeURLWithConfig (def { manager = manager }) url $ htmls anySelector
    maybe printError printHtml html
  where
    url = "https://www.google.com"
    printError = putStrLn "Failed"
    printHtml = mapM_ putStrLn

This seems to run fine. Note that the module containing main should itself be named Main.

  • Although I'm not familiar with the extension, it looks like [NamedFieldPuns](https://ghc.readthedocs.io/en/8.0.1/glasgow_exts.html#record-puns), which is enabled here, allows this syntax? (3 upvotes)