Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate the use of error in Generics.SOP.GGP #116

Closed
RyanGlScott opened this issue Feb 7, 2020 · 9 comments
Closed

Eliminate the use of error in Generics.SOP.GGP #116

RyanGlScott opened this issue Feb 7, 2020 · 9 comments

Comments

@RyanGlScott
Copy link
Contributor

While I managed to get rid of most uses of error in #77, there is one use of error in Generics.SOP.GGP that I couldn't remove:

gfrom :: (GFrom a, GHC.Generic a) => a -> SOP I (GCode a)
gfrom x = gSumFrom (GHC.from x) (error "gfrom: internal error" :: SOP.SOP SOP.I '[])

This is rather bothersome, since if you look carefully at how gSumFrom is implemented, that error argument can never be reached. Surely there must be a way to refactor gSumFrom to avoid this use of error!

After some pondering, I realized that the reason this error currently exists is due to the design of GSumFrom:

-- This can most certainly be simplified
class GSumFrom (a :: Type -> Type) where
gSumFrom :: a x -> SOP I xss -> SOP I (ToSumCode a xss)
gSumSkip :: proxy a -> SOP I xss -> SOP I (ToSumCode a xss)

gSumFrom always requires an SOP I xss argument, which makes it awkward to invoke from gfrom, as it requires xss ~ '[]. I realized that we can avoid this SOP I xss argument entirely by splitting up GSumFrom into smaller parts. Here is the design I am envisioning:

class GSumFrom (a :: Type -> Type) where
  gSumFrom :: a x -> SOP I (ToSumCode a '[])

instance GSumFrom V1 where
  gSumFrom x = case x of {}

instance (GSumFrom a) => GSumFrom (M1 D c a) where
  gSumFrom (M1 a) = gSumFrom a

instance (GSumFrom a, GSumPad a, GSumSkip a, GSumFrom b) => GSumFrom (a :+: b) where
  gSumFrom (L1 a) = gSumPad a (Proxy :: Proxy '[]) (gSumFrom a) (Proxy :: Proxy (ToSumCode b '[]))
  gSumFrom (R1 b) = gSumSkip (Proxy :: Proxy a) (gSumFrom b)

instance (GProductFrom a) => GSumFrom (M1 C c a) where
  gSumFrom (M1 a) = SOP (Z (gProductFrom a Nil))

type family (++) xs ys where
  '[] ++ ys = ys
  (x ': xs) ++ ys = x ': (xs ++ ys)
infixr 5 ++

pad_NS :: forall f xs ys proxy. NS f xs -> proxy ys -> NS f (xs ++ ys)
pad_NS fxs _ = go fxs
  where
    go :: forall xs'. NS f xs' -> NS f (xs' ++ ys)
    go (Z x) = Z x
    go (S x) = S (go x)

pad_SOP :: SOP f xs -> proxy ys -> SOP f (xs ++ ys)
pad_SOP (SOP fxs) fys = SOP (pad_NS fxs fys)

class GSumPad (a :: Type -> Type) where
  gSumPad :: a x -> proxy1 xss -> SOP I (ToSumCode a xss) -> proxy2 yss -> SOP I (ToSumCode a (xss ++ yss))

instance (GSumFrom a, GSumPad a, GSumSkip a, GSumFrom b, GSumPad b) => GSumPad (a :+: b) where
  gSumPad (L1 a) (_ :: proxy1 xss) _ (_ :: proxy2 yss) =
    gSumPad a (Proxy :: Proxy '[]) (gSumFrom a) (Proxy :: Proxy (ToSumCode b (xss ++ yss)))
  gSumPad (R1 b) (_ :: proxy1 xss) _ (_ :: proxy2 yss) =
    gSumSkip (Proxy :: Proxy a)
              (gSumPad b (Proxy :: Proxy '[]) (gSumFrom b) (Proxy :: Proxy (xss ++ yss)))

instance GSumPad (M1 C c a) where
  gSumPad _ _ = pad_SOP

class GSumSkip (a :: Type -> Type) where
  gSumSkip :: proxy a -> SOP I xss -> SOP I (ToSumCode a xss)

instance (GSumSkip a, GSumSkip b) => GSumSkip (a :+: b) where
  gSumSkip _ xss = gSumSkip (Proxy :: Proxy a) (gSumSkip (Proxy :: Proxy b) xss)

instance GSumSkip (M1 C c a) where
  gSumSkip _ (SOP xss) = SOP (S xss)

gfrom :: (GFrom a, GHC.Generic a) => a -> SOP I (GCode a)
gfrom x = gSumFrom (GHC.from x)

Some observations about this redesign:

  1. gSumFrom now has the much simpler type a x -> SOP I (ToSumCode a '[]). This allows gfrom to eliminate the error argument entirely.

  2. gSumSkip now lives in its own GSumPad class. There only need to be two instances for GSumSkip: M1 C c a and (a :+: b), since gSumSkip is only ever used to append elements to the right of a Code.

  3. The GSumFrom (a :+: b) instance has changed slightly. The R1 case still uses gSumPad to append elements to the right of the Code. However, the L1 case uses a different GSumPad class instead to prepend elements to the left of the Code. The GSumPad class makes use of the (++) type family, which is being proposed in append and split NP #115.

  4. The GSumPad (M1 C c a) is implemented using the pad_SOP function, which in turn uses the pad_NS function:

    pad_NS :: forall f xs ys proxy. NS f xs -> proxy ys -> NS f (xs ++ ys)
    pad_NS fxs _ = go fxs
      where
        go :: forall xs'. NS f xs' -> NS f (xs' ++ ys)
        go (Z x) = Z x
        go (S x) = S (go x)
    

    If you squint at the definition of pad_NS, you'll notice that it simply walks over the structure of the NS argument. This means that pad_NS can be implemented in a more efficient way:

    pad_NS :: forall f xs ys proxy. NS f xs -> proxy ys -> NS f (xs ++ ys)
    pad_NS fxs _ = unsafeCoerce fxs
    

    This is very similar to how existing functions like coerce_NS have both a fast unsafeCoerce-using implementation and a safe, internal-only implementation.


You might wonder if this refactoring changes the performance characteristics of Generics.SOP.GGP. I ran the generics-sop benchmarks before and after these changes to see how the GGP-related benchmarks were affected. Thankfully, the changes appear to be mostly negligible. Here are the benchmarks results before these changes:

Running 1 benchmarks...
Benchmark generics-sop-bench: RUNNING...
benchmarking Roundtrip/S2/GHCGeneric
time                 2.931 ns   (2.929 ns .. 2.933 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.931 ns   (2.929 ns .. 2.933 ns)
std dev              6.531 ps   (4.972 ps .. 8.465 ps)

benchmarking Roundtrip/S2/SOPGGP
time                 2.928 ns   (2.926 ns .. 2.930 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.934 ns   (2.932 ns .. 2.935 ns)
std dev              5.884 ps   (4.862 ps .. 7.449 ps)

benchmarking Roundtrip/S2/SOPTH
time                 2.933 ns   (2.931 ns .. 2.935 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.932 ns   (2.930 ns .. 2.933 ns)
std dev              4.555 ps   (3.771 ps .. 5.866 ps)

benchmarking Roundtrip/S20/GHCGeneric
time                 12.56 ns   (12.55 ns .. 12.57 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 12.55 ns   (12.54 ns .. 12.56 ns)
std dev              37.12 ps   (31.16 ps .. 44.75 ps)

benchmarking Roundtrip/S20/SOPGGP
time                 53.14 ns   (53.11 ns .. 53.18 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 53.15 ns   (53.13 ns .. 53.18 ns)
std dev              87.68 ps   (69.13 ps .. 117.9 ps)

benchmarking Roundtrip/S20/SOPTH
time                 23.20 ns   (23.19 ns .. 23.21 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 23.20 ns   (23.19 ns .. 23.21 ns)
std dev              31.28 ps   (21.63 ps .. 44.95 ps)

benchmarking Roundtrip/PB2/GHCGeneric
time                 5.281 ns   (5.280 ns .. 5.283 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.283 ns   (5.281 ns .. 5.287 ns)
std dev              9.156 ps   (3.740 ps .. 14.97 ps)

benchmarking Roundtrip/PB2/SOPGGP
time                 5.281 ns   (5.280 ns .. 5.282 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.281 ns   (5.280 ns .. 5.283 ns)
std dev              3.517 ps   (1.940 ps .. 6.257 ps)

benchmarking Roundtrip/PB2/SOPTH
time                 5.281 ns   (5.280 ns .. 5.282 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.281 ns   (5.280 ns .. 5.283 ns)
std dev              3.843 ps   (1.868 ps .. 7.067 ps)

benchmarking Eq/S2/GHCDeriving
time                 5.017 ns   (5.016 ns .. 5.018 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.018 ns   (5.016 ns .. 5.020 ns)
std dev              4.880 ps   (2.478 ps .. 7.793 ps)

benchmarking Eq/S2/SOPGGP
time                 6.073 ns   (6.072 ns .. 6.074 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.073 ns   (6.072 ns .. 6.075 ns)
std dev              4.457 ps   (2.530 ps .. 8.033 ps)

benchmarking Eq/S2/SOPTH
time                 6.179 ns   (6.164 ns .. 6.194 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.194 ns   (6.180 ns .. 6.210 ns)
std dev              48.84 ps   (40.69 ps .. 57.97 ps)

benchmarking Eq/S20/GHCDeriving
time                 4.753 ns   (4.752 ns .. 4.754 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.753 ns   (4.752 ns .. 4.755 ns)
std dev              3.498 ps   (2.302 ps .. 5.969 ps)

benchmarking Eq/S20/SOPGGP
time                 261.7 ns   (260.3 ns .. 263.1 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 260.7 ns   (259.1 ns .. 261.9 ns)
std dev              4.693 ns   (3.499 ns .. 6.103 ns)
variance introduced by outliers: 22% (moderately inflated)

benchmarking Eq/S20/SOPTH
time                 229.2 ns   (227.2 ns .. 231.0 ns)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 228.1 ns   (226.4 ns .. 229.6 ns)
std dev              5.401 ns   (4.852 ns .. 6.013 ns)
variance introduced by outliers: 33% (moderately inflated)

benchmarking Eq/PB2/GHCDeriving
time                 6.338 ns   (6.336 ns .. 6.339 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.338 ns   (6.337 ns .. 6.341 ns)
std dev              5.656 ps   (3.462 ps .. 8.817 ps)

benchmarking Eq/PB2/SOPGGP
time                 217.8 ns   (217.1 ns .. 218.6 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 217.6 ns   (216.8 ns .. 218.2 ns)
std dev              2.462 ns   (2.043 ns .. 3.198 ns)
variance introduced by outliers: 10% (moderately inflated)

benchmarking Eq/PB2/SOPTH
time                 204.9 ns   (204.6 ns .. 205.2 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 204.6 ns   (204.0 ns .. 205.0 ns)
std dev              1.573 ns   (1.171 ns .. 2.074 ns)

benchmarking Eq/Tree/GHCDeriving
time                 29.31 ns   (29.30 ns .. 29.32 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 29.31 ns   (29.30 ns .. 29.32 ns)
std dev              20.60 ps   (13.22 ps .. 30.56 ps)

benchmarking Eq/Tree/SOPGGP
time                 1.440 μs   (1.438 μs .. 1.441 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.431 μs   (1.423 μs .. 1.436 μs)
std dev              20.95 ns   (13.78 ns .. 27.39 ns)
variance introduced by outliers: 14% (moderately inflated)

benchmarking Eq/Tree/SOPTH
time                 1.146 μs   (1.145 μs .. 1.147 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.145 μs   (1.144 μs .. 1.146 μs)
std dev              3.358 ns   (2.733 ns .. 4.219 ns)

benchmarking Eq/Tree large/GHCDeriving
time                 1.135 μs   (1.135 μs .. 1.136 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.135 μs   (1.135 μs .. 1.136 μs)
std dev              1.880 ns   (1.271 ns .. 3.281 ns)

benchmarking Eq/Tree large/SOPGGP
time                 63.31 μs   (63.01 μs .. 63.60 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 63.38 μs   (63.15 μs .. 63.59 μs)
std dev              758.0 ns   (597.9 ns .. 985.1 ns)

benchmarking Eq/Tree large/SOPTH
time                 49.59 μs   (49.55 μs .. 49.63 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 49.58 μs   (49.53 μs .. 49.62 μs)
std dev              153.6 ns   (110.0 ns .. 238.2 ns)

benchmarking Show/S2/GHCDeriving
time                 11.47 ns   (11.46 ns .. 11.48 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 11.50 ns   (11.48 ns .. 11.52 ns)
std dev              72.00 ps   (54.67 ps .. 92.94 ps)

benchmarking Show/S2/SOPGGP
time                 192.4 ns   (191.3 ns .. 193.5 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 192.3 ns   (191.5 ns .. 193.6 ns)
std dev              3.498 ns   (2.417 ns .. 5.028 ns)
variance introduced by outliers: 23% (moderately inflated)

benchmarking Show/S2/SOPTH
time                 192.6 ns   (192.0 ns .. 193.3 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 193.0 ns   (192.4 ns .. 193.6 ns)
std dev              1.975 ns   (1.664 ns .. 2.554 ns)

benchmarking Show/S20/GHCDeriving
time                 14.63 ns   (14.62 ns .. 14.65 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 14.66 ns   (14.64 ns .. 14.68 ns)
std dev              62.25 ps   (54.28 ps .. 73.90 ps)

benchmarking Show/S20/SOPGGP
time                 718.2 ns   (717.3 ns .. 719.0 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 718.6 ns   (717.2 ns .. 719.4 ns)
std dev              3.179 ns   (1.988 ns .. 6.122 ns)

benchmarking Show/S20/SOPTH
time                 668.1 ns   (667.5 ns .. 668.6 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 667.9 ns   (666.8 ns .. 668.7 ns)
std dev              3.108 ns   (2.120 ns .. 4.959 ns)

benchmarking Show/PB2/GHCDeriving
time                 136.0 ns   (135.0 ns .. 136.8 ns)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 134.9 ns   (133.4 ns .. 135.9 ns)
std dev              4.353 ns   (3.076 ns .. 6.450 ns)
variance introduced by outliers: 49% (moderately inflated)

benchmarking Show/PB2/SOPGGP
time                 421.8 ns   (420.0 ns .. 423.9 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 422.7 ns   (420.2 ns .. 424.8 ns)
std dev              7.854 ns   (6.433 ns .. 9.684 ns)
variance introduced by outliers: 22% (moderately inflated)

benchmarking Show/PB2/SOPTH
time                 425.8 ns   (418.1 ns .. 434.5 ns)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 434.7 ns   (428.3 ns .. 440.1 ns)
std dev              20.43 ns   (16.90 ns .. 23.96 ns)
variance introduced by outliers: 65% (severely inflated)

benchmarking Show/Tree/GHCDeriving
time                 471.9 ns   (469.3 ns .. 474.3 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 471.5 ns   (468.5 ns .. 473.8 ns)
std dev              9.080 ns   (6.926 ns .. 11.85 ns)
variance introduced by outliers: 24% (moderately inflated)

benchmarking Show/Tree/SOPGGP
time                 2.520 μs   (2.495 μs .. 2.534 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 2.496 μs   (2.467 μs .. 2.517 μs)
std dev              82.20 ns   (64.75 ns .. 105.0 ns)
variance introduced by outliers: 43% (moderately inflated)

benchmarking Show/Tree/SOPTH
time                 2.536 μs   (2.510 μs .. 2.557 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 2.506 μs   (2.467 μs .. 2.540 μs)
std dev              121.3 ns   (97.99 ns .. 148.3 ns)
variance introduced by outliers: 63% (severely inflated)

benchmarking Show/Tree large/GHCDeriving
time                 20.79 μs   (20.64 μs .. 20.91 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 20.75 μs   (20.62 μs .. 20.85 μs)
std dev              392.4 ns   (300.0 ns .. 520.6 ns)
variance introduced by outliers: 17% (moderately inflated)

benchmarking Show/Tree large/SOPGGP
time                 105.5 μs   (104.2 μs .. 106.6 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 105.3 μs   (104.6 μs .. 105.8 μs)
std dev              2.041 μs   (1.695 μs .. 2.417 μs)
variance introduced by outliers: 14% (moderately inflated)

benchmarking Show/Tree large/SOPTH
time                 105.4 μs   (104.2 μs .. 106.4 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 103.6 μs   (102.1 μs .. 104.8 μs)
std dev              4.419 μs   (3.680 μs .. 5.292 μs)
variance introduced by outliers: 44% (moderately inflated)

Benchmark generics-sop-bench: FINISH

And after these changes:

Running 1 benchmarks...
Benchmark generics-sop-bench: RUNNING...
benchmarking Roundtrip/S2/GHCGeneric
time                 2.935 ns   (2.934 ns .. 2.936 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.934 ns   (2.932 ns .. 2.936 ns)
std dev              6.613 ps   (5.235 ps .. 8.617 ps)

benchmarking Roundtrip/S2/SOPGGP
time                 2.988 ns   (2.985 ns .. 2.991 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.989 ns   (2.986 ns .. 2.991 ns)
std dev              7.953 ps   (6.649 ps .. 10.22 ps)

benchmarking Roundtrip/S2/SOPTH
time                 4.753 ns   (4.752 ns .. 4.754 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.753 ns   (4.752 ns .. 4.754 ns)
std dev              3.213 ps   (2.239 ps .. 4.494 ps)

benchmarking Roundtrip/S20/GHCGeneric
time                 11.62 ns   (11.62 ns .. 11.62 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 11.62 ns   (11.62 ns .. 11.62 ns)
std dev              8.031 ps   (5.638 ps .. 12.68 ps)

benchmarking Roundtrip/S20/SOPGGP
time                 55.01 ns   (54.88 ns .. 55.14 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 54.97 ns   (54.89 ns .. 55.06 ns)
std dev              304.3 ps   (272.0 ps .. 343.9 ps)

benchmarking Roundtrip/S20/SOPTH
time                 22.11 ns   (22.11 ns .. 22.12 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 22.11 ns   (22.10 ns .. 22.12 ns)
std dev              24.87 ps   (19.97 ps .. 35.55 ps)

benchmarking Roundtrip/PB2/GHCGeneric
time                 5.281 ns   (5.280 ns .. 5.281 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.280 ns   (5.280 ns .. 5.281 ns)
std dev              2.084 ps   (1.593 ps .. 3.125 ps)

benchmarking Roundtrip/PB2/SOPGGP
time                 5.282 ns   (5.281 ns .. 5.282 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.281 ns   (5.280 ns .. 5.282 ns)
std dev              2.477 ps   (1.992 ps .. 3.454 ps)

benchmarking Roundtrip/PB2/SOPTH
time                 5.282 ns   (5.281 ns .. 5.283 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.283 ns   (5.282 ns .. 5.284 ns)
std dev              4.150 ps   (3.245 ps .. 5.172 ps)

benchmarking Eq/S2/GHCDeriving
time                 5.017 ns   (5.016 ns .. 5.018 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.018 ns   (5.017 ns .. 5.020 ns)
std dev              5.376 ps   (2.550 ps .. 9.438 ps)

benchmarking Eq/S2/SOPGGP
time                 6.073 ns   (6.072 ns .. 6.074 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.073 ns   (6.072 ns .. 6.074 ns)
std dev              3.039 ps   (2.444 ps .. 3.718 ps)

benchmarking Eq/S2/SOPTH
time                 6.322 ns   (6.311 ns .. 6.339 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.317 ns   (6.312 ns .. 6.326 ns)
std dev              20.63 ps   (13.26 ps .. 38.40 ps)

benchmarking Eq/S20/GHCDeriving
time                 4.753 ns   (4.753 ns .. 4.754 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.755 ns   (4.753 ns .. 4.757 ns)
std dev              5.325 ps   (3.352 ps .. 8.837 ps)

benchmarking Eq/S20/SOPGGP
time                 261.3 ns   (260.0 ns .. 262.4 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 260.8 ns   (259.3 ns .. 261.9 ns)
std dev              4.077 ns   (3.369 ns .. 5.175 ns)
variance introduced by outliers: 18% (moderately inflated)

benchmarking Eq/S20/SOPTH
time                 233.0 ns   (232.4 ns .. 233.5 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 230.9 ns   (228.8 ns .. 232.1 ns)
std dev              5.696 ns   (3.894 ns .. 7.775 ns)
variance introduced by outliers: 35% (moderately inflated)

benchmarking Eq/PB2/GHCDeriving
time                 6.339 ns   (6.337 ns .. 6.340 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 6.338 ns   (6.337 ns .. 6.339 ns)
std dev              3.790 ps   (2.897 ps .. 4.846 ps)

benchmarking Eq/PB2/SOPGGP
time                 214.0 ns   (213.2 ns .. 214.7 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 213.8 ns   (213.3 ns .. 214.6 ns)
std dev              2.037 ns   (1.735 ns .. 2.362 ns)

benchmarking Eq/PB2/SOPTH
time                 200.8 ns   (199.8 ns .. 202.0 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 201.8 ns   (200.8 ns .. 202.7 ns)
std dev              3.236 ns   (2.863 ns .. 3.737 ns)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Eq/Tree/GHCDeriving
time                 29.31 ns   (29.31 ns .. 29.32 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 29.32 ns   (29.31 ns .. 29.35 ns)
std dev              41.60 ps   (18.17 ps .. 86.58 ps)

benchmarking Eq/Tree/SOPGGP
time                 1.151 μs   (1.150 μs .. 1.151 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.150 μs   (1.148 μs .. 1.151 μs)
std dev              3.641 ns   (2.164 ns .. 6.927 ns)

benchmarking Eq/Tree/SOPTH
time                 1.146 μs   (1.144 μs .. 1.147 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.145 μs   (1.144 μs .. 1.147 μs)
std dev              4.367 ns   (3.355 ns .. 5.728 ns)

benchmarking Eq/Tree large/GHCDeriving
time                 1.114 μs   (1.109 μs .. 1.123 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.111 μs   (1.109 μs .. 1.115 μs)
std dev              9.093 ns   (1.036 ns .. 17.90 ns)

benchmarking Eq/Tree large/SOPGGP
time                 50.08 μs   (50.05 μs .. 50.11 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 50.08 μs   (50.05 μs .. 50.10 μs)
std dev              80.15 ns   (54.01 ns .. 112.5 ns)

benchmarking Eq/Tree large/SOPTH
time                 49.98 μs   (49.95 μs .. 50.01 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 49.98 μs   (49.94 μs .. 50.01 μs)
std dev              117.3 ns   (89.50 ns .. 165.3 ns)

benchmarking Show/S2/GHCDeriving
time                 11.78 ns   (11.77 ns .. 11.79 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 11.79 ns   (11.78 ns .. 11.80 ns)
std dev              37.31 ps   (31.38 ps .. 45.40 ps)

benchmarking Show/S2/SOPGGP
time                 194.9 ns   (194.2 ns .. 195.7 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 195.6 ns   (194.6 ns .. 196.6 ns)
std dev              3.492 ns   (2.721 ns .. 4.386 ns)
variance introduced by outliers: 22% (moderately inflated)

benchmarking Show/S2/SOPTH
time                 194.6 ns   (193.2 ns .. 196.3 ns)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 195.1 ns   (194.1 ns .. 196.1 ns)
std dev              3.566 ns   (3.057 ns .. 4.229 ns)
variance introduced by outliers: 23% (moderately inflated)

benchmarking Show/S20/GHCDeriving
time                 15.00 ns   (14.76 ns .. 15.54 ns)
                     0.995 R²   (0.986 R² .. 1.000 R²)
mean                 14.88 ns   (14.76 ns .. 15.34 ns)
std dev              744.0 ps   (11.86 ps .. 1.581 ns)
variance introduced by outliers: 74% (severely inflated)

benchmarking Show/S20/SOPGGP
time                 724.5 ns   (723.4 ns .. 725.4 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 723.4 ns   (721.9 ns .. 724.4 ns)
std dev              3.967 ns   (2.989 ns .. 5.476 ns)

benchmarking Show/S20/SOPTH
time                 661.7 ns   (661.1 ns .. 662.2 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 661.5 ns   (660.9 ns .. 662.0 ns)
std dev              1.788 ns   (1.465 ns .. 2.275 ns)

benchmarking Show/PB2/GHCDeriving
time                 128.6 ns   (127.9 ns .. 129.4 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 128.4 ns   (127.6 ns .. 129.1 ns)
std dev              2.539 ns   (2.086 ns .. 2.969 ns)
variance introduced by outliers: 26% (moderately inflated)

benchmarking Show/PB2/SOPGGP
time                 425.2 ns   (420.5 ns .. 429.0 ns)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 423.7 ns   (419.4 ns .. 427.3 ns)
std dev              13.03 ns   (10.49 ns .. 15.35 ns)
variance introduced by outliers: 44% (moderately inflated)

benchmarking Show/PB2/SOPTH
time                 438.3 ns   (435.6 ns .. 441.1 ns)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 440.5 ns   (436.8 ns .. 443.1 ns)
std dev              10.58 ns   (7.898 ns .. 16.88 ns)
variance introduced by outliers: 32% (moderately inflated)

benchmarking Show/Tree/GHCDeriving
time                 450.5 ns   (443.9 ns .. 456.6 ns)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 450.1 ns   (445.2 ns .. 454.7 ns)
std dev              16.30 ns   (14.75 ns .. 17.99 ns)
variance introduced by outliers: 52% (severely inflated)

benchmarking Show/Tree/SOPGGP
time                 2.488 μs   (2.469 μs .. 2.506 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.501 μs   (2.487 μs .. 2.512 μs)
std dev              41.86 ns   (34.37 ns .. 57.34 ns)
variance introduced by outliers: 16% (moderately inflated)

benchmarking Show/Tree/SOPTH
time                 2.501 μs   (2.484 μs .. 2.515 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 2.499 μs   (2.480 μs .. 2.513 μs)
std dev              56.56 ns   (44.95 ns .. 82.88 ns)
variance introduced by outliers: 26% (moderately inflated)

benchmarking Show/Tree large/GHCDeriving
time                 20.36 μs   (20.11 μs .. 20.65 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 20.14 μs   (19.83 μs .. 20.38 μs)
std dev              915.2 ns   (749.7 ns .. 1.098 μs)
variance introduced by outliers: 53% (severely inflated)

benchmarking Show/Tree large/SOPGGP
time                 102.7 μs   (101.6 μs .. 103.8 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 102.9 μs   (101.9 μs .. 103.6 μs)
std dev              2.936 μs   (2.156 μs .. 3.791 μs)
variance introduced by outliers: 26% (moderately inflated)

benchmarking Show/Tree large/SOPTH
time                 104.8 μs   (103.8 μs .. 105.4 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 104.4 μs   (103.4 μs .. 105.0 μs)
std dev              2.515 μs   (1.704 μs .. 4.026 μs)
variance introduced by outliers: 20% (moderately inflated)

Benchmark generics-sop-bench: FINISH
@phadej
Copy link
Contributor

phadej commented Feb 7, 2020

I'm really not sure if changing error to unsafeCoerce is worth it.

@RyanGlScott
Copy link
Contributor Author

It's important to note that the use of unsafeCoerce in the code above isn't critical to the overall design—it's just an optimization. On the other hand, the use of error in the current design can't really be avoided.

@phadej
Copy link
Contributor

phadej commented Feb 7, 2020

I think that for gfrom error can be avoided by case on whether Code is an empty list or not: if code is empty the implementation is absurd, if not-empty there's a base case. The gfrom shouldn't be unreasonably slow, there's plenty of overhead in generics-sop already (as the conversions aren't optimised away).

@RyanGlScott
Copy link
Contributor Author

I think that for gfrom error can be avoided by case on whether Code is an empty list or not: if code is empty the implementation is absurd, if not-empty there's a base case.

I'm afraid I don't understand what you mean here. gfrom produces a Code as an output, so I don't understand how you could case on it.

The gfrom shouldn't be unreasonably slow, there's plenty of overhead in generics-sop already (as the conversions aren't optimised away).

I agree that it shouldn't be unreasonably slow! If you check the benchmark results that I provide in the original issue description, you'll notice that my redesign gives results comparable to the current design, so that shouldn't be a concern.

@phadej
Copy link
Contributor

phadej commented Feb 7, 2020

Side note about benchmarks, maybe https://hackage.haskell.org/package/criterion-compare should be extended to produce textual tables, as looking at two long criterion outputs (which are not side-by-side) and trying to compare is not job for a human

@phadej
Copy link
Contributor

phadej commented Feb 7, 2020

So in short the proposal is to do foldMap like combining, not foldr-like traversal. And foldMap like tree-thing is fine, as if the entry is

  • on the right, we need to pad with Ss anyway,
  • on the left we can unsafeCoerce, which we do for other things too, so the crime is already committed

@RyanGlScott
Copy link
Contributor Author

@phadej has deftly one-upped my design in #119.

@kosmikus
Copy link
Member

Can we close this given that #119 is merged? Or are there additional insights in this conversation that should be extracted into their own issue?

@RyanGlScott
Copy link
Contributor Author

Yes, let's close this now that #119 has landed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants