什么是单态限制?

Bak*_*riu 60 polymorphism haskell types type-inference monomorphism-restriction

令我感到困惑的是,haskell编译器有时会推断出比我预期的更不易变形的类型,例如在使用无点定义时.

似乎问题是"单态限制",默认情况下在旧版本的编译器上启用.

考虑以下haskell程序:

{-# LANGUAGE MonomorphismRestriction #-}

import Data.List(sortBy)

plus = (+)
plus' x = (+ x)

sort = sortBy compare

main = do
  print $ plus' 1.0 2.0
  print $ plus 1.0 2.0
  print $ sort [3, 1, 2]
Run Code Online (Sandbox Code Playgroud)

如果我编译它,ghc我没有获得错误,可执行文件的输出是:

3.0
3.0
[1,2,3]
Run Code Online (Sandbox Code Playgroud)

如果我将main身体改为:

main = do
  print $ plus' 1.0 2.0
  print $ plus (1 :: Int) 2
  print $ sort [3, 1, 2]
Run Code Online (Sandbox Code Playgroud)

我没有编译时错误,输出变为:

3.0
3
[1,2,3]
Run Code Online (Sandbox Code Playgroud)

正如所料.但是,如果我尝试将其更改为:

main = do
  print $ plus' 1.0 2.0
  print $ plus (1 :: Int) 2
  print $ plus 1.0 2.0
  print $ sort [3, 1, 2]
Run Code Online (Sandbox Code Playgroud)

我收到类型错误:

test.hs:13:16:
    No instance for (Fractional Int) arising from the literal ‘1.0’
    In the first argument of ‘plus’, namely ‘1.0’
    In the second argument of ‘($)’, namely ‘plus 1.0 2.0’
    In a stmt of a 'do' block: print $ plus 1.0 2.0
Run Code Online (Sandbox Code Playgroud)

尝试sort使用不同类型调用两次时会发生同样的情况:

main = do
  print $ plus' 1.0 2.0
  print $ plus 1.0 2.0
  print $ sort [3, 1, 2]
  print $ sort "cba"
Run Code Online (Sandbox Code Playgroud)

产生以下错误:

test.hs:14:17:
    No instance for (Num Char) arising from the literal ‘3’
    In the expression: 3
    In the first argument of ‘sort’, namely ‘[3, 1, 2]’
    In the second argument of ‘($)’, namely ‘sort [3, 1, 2]’
Run Code Online (Sandbox Code Playgroud)
  • 为什么ghc突然认为这plus不是多态的并且需要Int参数?唯一的参考Int是在一个应用程序plus,当定义明显多态时怎么可能呢?
  • 为什么ghc突然想到sort需要一个Num Char实例?

此外,如果我尝试将函数定义放入自己的模块中,如下所示:

{-# LANGUAGE MonomorphismRestriction #-}

module TestMono where

import Data.List(sortBy)

plus = (+)
plus' x = (+ x)

sort = sortBy compare
Run Code Online (Sandbox Code Playgroud)

编译时出现以下错误:

TestMono.hs:10:15:
    No instance for (Ord a0) arising from a use of ‘compare’
    The type variable ‘a0’ is ambiguous
    Relevant bindings include
      sort :: [a0] -> [a0] (bound at TestMono.hs:10:1)
    Note: there are several potential instances:
      instance Integral a => Ord (GHC.Real.Ratio a)
        -- Defined in ‘GHC.Real’
      instance Ord () -- Defined in ‘GHC.Classes’
      instance (Ord a, Ord b) => Ord (a, b) -- Defined in ‘GHC.Classes’
      ...plus 23 others
    In the first argument of ‘sortBy’, namely ‘compare’
    In the expression: sortBy compare
    In an equation for ‘sort’: sort = sortBy compare
Run Code Online (Sandbox Code Playgroud)
  • 为什么不ghc能够使用多态型Ord a => [a] -> [a]sort
  • 为什么ghc治疗plusplus'不同?plus应该有多态类型Num a => a -> a -> a,我真的没有看到它与类型的不同sort,但只会sort引发错误.

最后一件事:如果我评论sort文件的定义编译.但是,如果我尝试将其加载ghci并检查我得到的类型:

*TestMono> :t plus
plus :: Integer -> Integer -> Integer
*TestMono> :t plus'
plus' :: Num a => a -> a -> a
Run Code Online (Sandbox Code Playgroud)

为什么不是plus多态的类型?


这是关于Haskell中单形态限制的规范问题,正如元问题中所讨论的那样.

Bak*_*riu 83

什么是单态限制?

Haskell维基所述的单同性限制是:

Haskell类型推断中的反直觉规则.如果您忘记提供类型签名,有时此规则将使用"类型默认"规则填充具有特定类型的自由类型变量.

这意味着,在某些情况下,如果您的类型不明确(即多态),编译器将选择将该类型实例化为不模糊的类型.

我如何解决它?

首先,您始终可以显式提供类型签名,这将避免触发限制:

plus :: Num a => a -> a -> a
plus = (+)    -- Okay!

-- Runs as:
Prelude> plus 1.0 1
2.0
Run Code Online (Sandbox Code Playgroud)

或者,如果要定义函数,则可以避免使用无 点样式,例如:

plus x y = x + y
Run Code Online (Sandbox Code Playgroud)

把它关掉

可以简单地关闭限制,这样您就不必对代码执行任何操作来修复它.该行为由两个扩展控制: MonomorphismRestriction将启用它(这是默认值),同时 NoMonomorphismRestriction将禁用它.

您可以将以下行放在文件的最顶部:

{-# LANGUAGE NoMonomorphismRestriction #-}
Run Code Online (Sandbox Code Playgroud)

如果您使用GHCi,可以使用以下:set命令启用扩展:

Prelude> :set -XNoMonomorphismRestriction
Run Code Online (Sandbox Code Playgroud)

您还可以告诉ghc从命令行启用扩展:

ghc ... -XNoMonomorphismRestriction
Run Code Online (Sandbox Code Playgroud)

注意:您应该更喜欢第一个选项而不是通过命令行选项选择扩展.

有关此扩展和其他扩展的说明,请参阅GHC的页面.

一个完整的解释

我将尝试总结下面你需要知道的一切,以了解单态限制是什么,为什么它被引入以及它是如何表现的.

一个例子

采取以下简单的定义:

plus = (+)
Run Code Online (Sandbox Code Playgroud)

你认为能够取代的每次出现+plus.特别是因为(+) :: Num a => a -> a -> a你也期望也有plus :: Num a => a -> a -> a.

不幸的是,这种情况并非如此.例如,我们在GHCi中尝试以下内容:

Prelude> let plus = (+)
Prelude> plus 1.0 1
Run Code Online (Sandbox Code Playgroud)

我们得到以下输出:

<interactive>:4:6:
    No instance for (Fractional Integer) arising from the literal ‘1.0’
    In the first argument of ‘plus’, namely ‘1.0’
    In the expression: plus 1.0 1
    In an equation for ‘it’: it = plus 1.0 1
Run Code Online (Sandbox Code Playgroud)

您可能需要:set -XMonomorphismRestriction 更新的GHCi版本.

事实上,我们可以看到类型plus不是我们所期望的:

Prelude> :t plus
plus :: Integer -> Integer -> Integer
Run Code Online (Sandbox Code Playgroud)

What happened is that the compiler saw that plus had type Num a => a -> a -> a, a polymorphic type. Moreover it happens that the above definition falls under the rules that I'll explain later and so he decided to make the type monomorphic by defaulting the type variable a. The default is Integer as we can see.

Note that if you try to compile the above code using ghc you won't get any errors. This is due to how ghci handles (and must handle) the interactive definitions. Basically every statement entered in ghci must be completely type checked before the following is considered; in other words it's as if every statement was in a separate module. Later I'll explain why this matter.

Some other example

Consider the following definitions:

f1 x = show x

f2 = \x -> show x

f3 :: (Show a) => a -> String
f3 = \x -> show x

f4 = show

f5 :: (Show a) => a -> String
f5 = show
Run Code Online (Sandbox Code Playgroud)

We'd expect all these functions to behave in the same way and have the same type, i.e. the type of show: Show a => a -> String.

Yet when compiling the above definitions we obtain the following errors:

test.hs:3:12:
    No instance for (Show a1) arising from a use of ‘show’
    The type variable ‘a1’ is ambiguous
    Relevant bindings include
      x :: a1 (bound at blah.hs:3:7)
      f2 :: a1 -> String (bound at blah.hs:3:1)
    Note: there are several potential instances:
      instance Show Double -- Defined in ‘GHC.Float’
      instance Show Float -- Defined in ‘GHC.Float’
      instance (Integral a, Show a) => Show (GHC.Real.Ratio a)
        -- Defined in ‘GHC.Real’
      ...plus 24 others
    In the expression: show x
    In the expression: \ x -> show x
    In an equation for ‘f2’: f2 = \ x -> show x

test.hs:8:6:
    No instance for (Show a0) arising from a use of ‘show’
    The type variable ‘a0’ is ambiguous
    Relevant bindings include f4 :: a0 -> String (bound at blah.hs:8:1)
    Note: there are several potential instances:
      instance Show Double -- Defined in ‘GHC.Float’
      instance Show Float -- Defined in ‘GHC.Float’
      instance (Integral a, Show a) => Show (GHC.Real.Ratio a)
        -- Defined in ‘GHC.Real’
      ...plus 24 others
    In the expression: show
    In an equation for ‘f4’: f4 = show
Run Code Online (Sandbox Code Playgroud)

So f2 and f4 don't compile. Moreover when trying to define these function in GHCi we get no errors, but the type for f2 and f4 is () -> String!

Monomorphism restriction is what makes f2 and f4 require a monomorphic type, and the different behaviour bewteen ghc and ghci is due to different defaulting rules.

When does it happen?

In Haskell, as defined by the report, there are two distinct type of bindings. Function bindings and pattern bindings. A function binding is nothing else than a definition of a function:

f x = x + 1
Run Code Online (Sandbox Code Playgroud)

Note that their syntax is:

<identifier> arg1 arg2 ... argn = expr
Run Code Online (Sandbox Code Playgroud)

Modulo guards and where declarations. But they don't really matter.

where there must be at least one argument.

A pattern binding is a declaration of the form:

<pattern> = expr
Run Code Online (Sandbox Code Playgroud)

Again, modulo guards.

Note that variables are patterns, so the binding:

plus = (+)
Run Code Online (Sandbox Code Playgroud)

is a pattern binding. It's binding the pattern plus (a variable) to the expression (+).

When a pattern binding consists of only a variable name it's called a simple pattern binding.

The monomorphism restriction applies to simple pattern bindings!

Well, formally we should say that:

A declaration group is a minimal set of mutually dependent bindings.

Section 4.5.1 of the report.

And then (Section 4.5.5 of the report):

a given declaration group is unrestricted if and only if:

  1. every variable in the group is bound by a function binding (e.g. f x = x) or a simple pattern binding (e.g. plus = (+); Section 4.4.3.2 ), and

  2. an explicit type signature is given for every variable in the group that is bound by simple pattern binding. (e.g. plus :: Num a => a -> a -> a; plus = (+)).

Examples added by me.

So a restricted declaration group is a group where, either there are non-simple pattern bindings (e.g. (x:xs) = f something or (f, g) = ((+), (-))) or there is some simple pattern binding without a type signature (as in plus = (+)).

The monomorphism restriction affects restricted declaration groups.

Most of the time you don't define mutual recursive functions and hence a declaration group becomes just a binding.

What does it do?

The monomorphism restriction is described by two rules in Section 4.5.5 of the report.

First rule

The usual Hindley-Milner restriction on polymorphism is that only type variables that do not occur free in the environment may be generalized. In addition, the constrained type variables of a restricted declaration group may not be generalized in the generalization step for that group. (Recall that a type variable is constrained if it must belong to some type class; see Section 4.5.2 .)

The highlighted part is what the monomorphism restriction introduces. It says that if the type is polymorphic (i.e. it contain some type variable) and that type variable is constrained (i.e. it has a class constraint on it: e.g. the type Num a => a -> a -> a is polymorphic because it contains a and also contrained because the a has the constraint Num over it.) then it cannot be generalized.

In simple words not generalizing means that the uses of the function plus may change its type.

If you had the definitions:

plus = (+)

x :: Integer
x = plus 1 2

y :: Double
y = plus 1.0 2
Run Code Online (Sandbox Code Playgroud)

then you'd get a type error. Because when the compiler sees that plus is called over an Integer in the declaration of x it will unify the type variable a with Integer and hence the type of plus becomes:

Integer -> Integer -> Integer
Run Code Online (Sandbox Code Playgroud)

but then, when it will type check the definition of y, it will see that plus is applied to a Double argument, and the types don't match.

Note that you can still use plus without getting an error:

plus = (+)
x = plus 1.0 2
Run Code Online (Sandbox Code Playgroud)

In this case the type of plus is first inferred to be Num a => a -> a -> a but then its use in the definition of x, where 1.0 requires a Fractional constraint, will change it to Fractional a => a -> a -> a.

Rationale

The report says:

Rule 1 is required for two reasons, both of which are fairly subtle.

  • Rule 1 prevents computations from being unexpectedly repeated. For example, genericLength is a standard function (in library Data.List) whose type is given by

    genericLength :: Num a => [b] -> a
    
    Run Code Online (Sandbox Code Playgroud)

    Now consider the following expression:

    let len = genericLength xs
    in (len, len)
    
    Run Code Online (Sandbox Code Playgroud)

    It looks as if len should be computed only once, but without Rule 1 it might be computed twice, once at each of two different overloadings. If the programmer does actually wish the computation to be repeated, an explicit type signature may be added:

    let len :: Num a => a
        len = genericLength xs
    in (len, len)
    
    Run Code Online (Sandbox Code Playgroud)

For this point the example from the wiki is, I believe, clearer. Consider the function:

f xs = (len, len)
  where
    len = genericLength xs
Run Code Online (Sandbox Code Playgroud)

If len was polymorphic the type of f would be:

f :: Num a, Num b => [c] -> (a, b)
Run Code Online (Sandbox Code Playgroud)

So the two elements of the tuple (len, len) could actually be different values! But this means that the computation done by genericLength must be repeated to obtain the two different values.

The rationale here is: the code contains one function call, but not introducing this rule could produce two hidden function calls, which is counter intuitive.

With the monomorphism restriction the type of f becomes:

f :: Num a => [b] -> (a, a)
Run Code Online (Sandbox Code Playgroud)

In this way there is no need to perform the computation multiple times.

  • Rule 1 prevents ambiguity. For example, consider the declaration group

    [(n,s)] = reads t

    Recall that reads is a standard function whose type is given by the signature

    reads :: (Read a) => String -> [(a,String)]

    Without Rule 1, n would be assigned the type ? a. Read a ? a and s the type ? a. Read a ? String. The latter is an invalid type, because it is inherently ambiguous. It is not possible to determine at what overloading to use s, nor can this be solved by adding a type signature for s. Hence, when non-simple pattern bindings are used (Section 4.4.3.2 ), the types inferred are always monomorphic in their constrained type variables, irrespective of whether a type signature is provided. In this case, both n and s are monomorphic in a.

Well, I believe this example is self-explanatory. There are situations when not applying the rule results in type ambiguity.

If you disable the extension as suggest above you will get a type error when trying to compile the above declaration. However this isn't really a problem: you already know that when using read you have to somehow tell the compiler which type it should try to parse...

Second rule

  1. Any monomorphic type variables that remain when type inference for an entire module is complete, are considered ambiguous, and are resolved to particular types using the defaulting rules (Section 4.3.4 ).

This means that. If you have your usual definition:

plus = (+)
Run Code Online (Sandbox Code Playgroud)

This will have a type Num a => a -> a -> a where a is a monomorphic type variable due to rule 1 described above. Once the whole module is inferred the compiler will simply choose a type that will replace that a according to the defaulting rules.

The final result is: plus :: Integer -> Integer -> Integer.

Note that this is done after the whole module is inferred.

This means that if you have the following declarations:

plus = (+)

x = plus 1.0 2.0
Run Code Online (Sandbox Code Playgroud)

inside a module, before type defaulting the type of plus will be: Fractional a => a -> a -> a (see rule 1 for why this happens). At this point, following the defaulting rules, a will be replaced by Double and so we will have plus :: Double -> Double -> Double and x :: Double.

Defaulting

As stated before there exist some defaulting rules, described in Section 4.3.4 of the Report, that the inferencer can adopt and that will replace a polymorphic type with a monomorphic one. This happens whenever a type is ambiguous.

For example in the expression:

let x = read "<something>" in show x
Run Code Online (Sandbox Code Playgroud)

here the expression is ambiguous because the types for show and read are:

show :: Show a => a -> String
read :: Read a => String -> a
Run Code Online (Sandbox Code Playgroud)

So the x has type Read a => a. But this constraint is satisfied by a lot of types: Int, Double or () for example. Which one to choose? There's nothing that can tell us.

In this case we can resolve the ambiguity by telling the compiler which type we want, adding a type signature:

let x = read "<something>" :: Int in show x
Run Code Online (Sandbox Code Playgroud)

Now the problem is: since Haskell uses the Num type class to handle numbers, there are a lot of cases where numerical expressions contain ambiguities.

Consider:

show 1
Run Code Online (Sandbox Code Playgroud)

What should the result be?

As before 1 has type Num a => a and there are many type of numbers that could be used. Which one to choose?

Having a compiler error almost every time we use a number isn't a good thing, and hence the defaulting rules were introduced. The rules can be controlled using a default declaration. By specifying default (T1, T2, T3) we can change how the inferencer defaults the different types.

An ambiguous type variable v is defaultable if:

  • v appears only in contraints of the kind C v were C is a class (i.e. if it appears as in: Monad (m v) then it is not defaultable).
  • at least one of these classes is Num or a subclass of Num.
  • all of these classes are defined in the Prelude or a standard library.

A defaultable type variable is replaced by the first type in the default list that is an instance of all the ambiguous variable’s classes.

The default default declaration is default (Integer, Double).

For example:

plus = (+)
minus = (-)

x = plus 1.0 1
y = minus 2 1
Run Code Online (Sandbox Code Playgroud)

The types inferred would be:

plus :: Fractional a => a -> a -> a
minus :: Num a => a -> a -> a
Run Code Online (Sandbox Code Playgroud)

which, by defaulting rules, become:

plus :: Double -> Double -> Double
minus :: Integer -> Integer -> Integer
Run Code Online (Sandbox Code Playgroud)

Note that this explains why in the example in the question only the sort definition raises an error. The type Ord a => [a] -> [a] cannot be defaulted because Ord isn't a numeric class.

Extended defaulting

Note that GHCi comes with extended defaulting rules (or here for GHC8), which can be enabled in files as well using the ExtendedDefaultRules extensions.

The defaultable type variables need not only appear in contraints where all the classes are standard and there must be at least one class that is among Eq, Ord, Show or Num and its subclasses.

Moreover the default default declaration is default ((), Integer, Double).

This may produce odd results. Taking the example from the question:

Prelude> :set -XMonomorphismRestriction
Prelude> import Data.List(sortBy)
Prelude Data.List> let sort = sortBy compare
Prelude Data.List> :t sort
sort :: [()] -> [()]
Run Code Online (Sandbox Code Playgroud)

in ghci we don't get a type error but the Ord a constraints results in a default of () which is pretty much useless.

Useful links

There are a lot of resources and discussions about the monomorphism restriction.

Here are some links that I find useful and that may help you understand or deep further into the topic: