`unicode_normalization` benchmark from `rustc-perf` is slow #57718

nnethercote · 2019-01-18T00:31:13Z

The unicode_normalization benchmark was just added to rustc-perf: rust-lang/rustc-perf#328

Here is the high-level data from Cachegrind for a Clean-Check run.

--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
45,053,275,971 (100.0%)  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir                      file:function
--------------------------------------------------------------------------------
3,743,278,359 ( 8.31%)  /home/njn/moz/rust0/src/librustc/infer/lexical_region_resolve/mod.rs:rustc::infer::lexical_region_resolve::LexicalResolver::expand_node
3,027,925,390 ( 6.72%)  /home/njn/moz/rust0/src/librustc/middle/region.rs:<rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq
2,217,916,340 ( 4.92%)  /home/njn/moz/rust0/src/librustc/ty/sty.rs:<rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq
2,165,108,405 ( 4.81%)  /home/njn/moz/rust0/src/librustc/ty/query/plumbing.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
1,872,071,197 ( 4.16%)  /home/njn/moz/rust0/src/librustc/infer/lexical_region_resolve/mod.rs:rustc::infer::lexical_region_resolve::LexicalResolver::infer_variable_values
1,782,556,352 ( 3.96%)  /home/njn/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-0.6.7/lib.rs:rustc::infer::lexical_region_resolve::LexicalResolver::infer_variable_values
1,678,166,351 ( 3.72%)  /home/njn/moz/rust0/src/libstd/collections/hash/map.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
1,652,330,349 ( 3.67%)  /home/njn/moz/rust0/src/libstd/collections/hash/table.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
1,491,568,194 ( 3.31%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::IntRange::from_ctor
1,142,511,763 ( 2.54%)  /home/njn/moz/rust0/src/librustc/ty/layout.rs:<rustc::ty::layout::LayoutCx<'tcx, rustc::ty::context::TyCtxt<'a, 'tcx, 'tcx>> as rustc_target::abi::LayoutOf>::layout_of
  983,408,925 ( 2.18%)  /home/njn/moz/rust0/src/libcore/num/mod.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
  925,902,672 ( 2.06%)  /home/njn/moz/rust0/src/libcore/option.rs:rustc::ty::sty::Const::assert_bits
  836,542,801 ( 1.86%)  /home/njn/moz/rust0/src/libcore/ptr.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
  738,053,639 ( 1.64%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::constructor_intersects_pattern
  698,283,902 ( 1.55%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::specialize
  623,416,500 ( 1.38%)  /home/njn/moz/rust0/src/libcore/cmp.rs:rustc::infer::lexical_region_resolve::LexicalResolver::expand_node
  605,680,300 ( 1.34%)  /home/njn/moz/rust0/<::rustc_data_structures::indexed_vec::newtype_index macros>:<rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq
  590,667,449 ( 1.31%)  /home/njn/moz/rust0/src/librustc/ty/context.rs:<&'a rustc::ty::TyS<'a> as rustc::ty::context::Lift<'tcx>>::lift_to_tcx
  588,790,903 ( 1.31%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::IntRange::from_pat
  566,270,290 ( 1.26%)  /home/njn/moz/rust0/src/librustc/ty/sty.rs:rustc::ty::sty::Const::assert_bits
  535,797,872 ( 1.19%)  /home/njn/moz/rust0/src/libcore/slice/mod.rs:rustc::infer::lexical_region_resolve::LexicalResolver::infer_variable_values
  531,048,306 ( 1.18%)  /home/njn/moz/rust0/src/librustc/ty/mod.rs:rustc::ty::ParamEnv::and
  504,837,733 ( 1.12%)  /home/njn/moz/rust0/src/librustc/ty/layout.rs:<rustc::ty::layout::LayoutCx<'tcx, rustc::ty::context::TyCtxt<'a, 'tcx, 'tcx>>>::record_layout_for_printing
  491,555,270 ( 1.09%)  /home/njn/moz/rust0/src/librustc/mir/interpret/value.rs:<rustc::mir::interpret::value::Scalar<Tag>>::to_bits
  478,260,738 ( 1.06%)  /home/njn/moz/rust0/src/librustc/ty/query/plumbing.rs:<rustc::ty::layout::LayoutCx<'tcx, rustc::ty::context::TyCtxt<'a, 'tcx, 'tcx>> as rustc_target::abi::LayoutOf>::layout_of

The first three entries show that `expand_node` is super-hot:

rust/src/librustc/infer/lexical_region_resolve/mod.rs

Lines 217 to 273 in d45bef9

    
    fn expand_node(
        &self,
        a_region: Region<'tcx>,
        b_vid: RegionVid,
        b_data: &mut VarValue<'tcx>,
    ) -> bool {
        debug!("expand_node({:?}, {:?} == {:?})", a_region, b_vid, b_data);
        match *a_region {
            // Check if this relationship is implied by a given.
            ty::ReEarlyBound(_) | ty::ReFree(_) => if self.data.givens.contains(&(a_region, b_vid))
            {
                debug!("given");
                return false;
            },
            _ => {}
        }
        match *b_data {
            VarValue::Value(cur_region) => {
                let mut lub = self.lub_concrete_regions(a_region, cur_region);
                if lub == cur_region {
                    return false;
                }
                // Watch out for `'b: !1` relationships, where the
                // universe of `'b` can't name the placeholder `!1`. In
                // that case, we have to grow `'b` to be `'static` for the
                // relationship to hold. This is obviously a kind of sub-optimal
                // choice -- in the future, when we incorporate a knowledge
                // of the parameter environment, we might be able to find a
                // tighter bound than `'static`.
                //
                // (This might e.g. arise from being asked to prove `for<'a> { 'b: 'a }`.)
                let b_universe = self.var_infos[b_vid].universe;
                if let ty::RePlaceholder(p) = lub {
                    if b_universe.cannot_name(p.universe) {
                        lub = self.tcx().types.re_static;
                    }
                }
                debug!(
                    "Expanding value of {:?} from {:?} to {:?}",
                    b_vid, cur_region, lub
                );
                *b_data = VarValue::Value(lub);
                return true;
            }
            VarValue::ErrorValue => {
                return false;
            }
        }
    }

Most of the calls take the early return on line 241 of that file (the `return false` right after the `if`), i.e. the `lub == cur_region` comparison succeeds. That explains why `<rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq` shows up as hot.

`expand_node` is called from a closure within `expansion`:

rust/src/librustc/infer/lexical_region_resolve/mod.rs

Lines 188 to 215 in d45bef9

    
    fn expansion(&self, var_values: &mut LexicalRegionResolutions<'tcx>) {
        self.iterate_until_fixed_point("Expansion", |constraint, origin| {
            debug!("expansion: constraint={:?} origin={:?}", constraint, origin);
            match *constraint {
                Constraint::RegSubVar(a_region, b_vid) => {
                    let b_data = var_values.value_mut(b_vid);
                    (self.expand_node(a_region, b_vid, b_data), false)
                }
                Constraint::VarSubVar(a_vid, b_vid) => match *var_values.value(a_vid) {
                    VarValue::ErrorValue => (false, false),
                    VarValue::Value(a_region) => {
                        let b_node = var_values.value_mut(b_vid);
                        let changed = self.expand_node(a_region, b_vid, b_node);
                        let retain = match *b_node {
                            VarValue::Value(ReStatic) | VarValue::ErrorValue => false,
                            _ => true
                        };
                        (changed, retain)
                    }
                },
                Constraint::RegSubReg(..) | Constraint::VarSubReg(..) => {
                    // These constraints are checked after expansion
                    // is done, in `collect_errors`.
                    (false, false)
                }
            }
        })
    }

That closure is passed to `iterate_until_fixed_point`:

rust/src/librustc/infer/lexical_region_resolve/mod.rs

Lines 724 to 745 in d45bef9

    
    fn iterate_until_fixed_point<F>(&self, tag: &str, mut body: F)
    where
        F: FnMut(&Constraint<'tcx>, &SubregionOrigin<'tcx>) -> (bool, bool),
    {
        let mut constraints: SmallVec<[_; 16]> = self.data.constraints.iter().collect();
        let mut iteration = 0;
        let mut changed = true;
        while changed {
            changed = false;
            iteration += 1;
            debug!("---- {} Iteration {}{}", "#", tag, iteration);
            constraints.retain(|(constraint, origin)| {
                let (edge_changed, retain) = body(constraint, origin);
                if edge_changed {
                    debug!("Updated due to constraint {:?}", constraint);
                    changed = true;
                }
                retain
            });
        }
        debug!("---- {} Complete after {} iteration(s)", tag, iteration);
    }
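
To make the control flow easier to follow outside the compiler, here is a minimal stand-alone model of that loop. It is a sketch, not rustc code: `iterate_until_fixed_point` is rewritten as a free function over a plain `Vec<T>` that returns the pass count, and `Edge`, `propagate` and the boolean `reached` flags are hypothetical stand-ins for constraints and region values. Whether the benchmark's constraints have this chain shape is not established here; the sketch only shows that the number of passes this kind of loop needs can depend on the order in which constraints are visited.

```rust
/// Simplified model of the loop above: call `body` on every remaining item
/// until a full pass reports no change. `body` returns `(changed, retain)`.
/// Returns the number of passes made.
fn iterate_until_fixed_point<T, F>(mut items: Vec<T>, mut body: F) -> usize
where
    F: FnMut(&T) -> (bool, bool),
{
    let mut iteration = 0;
    let mut changed = true;
    while changed {
        changed = false;
        iteration += 1;
        items.retain(|item| {
            let (item_changed, retain) = body(item);
            if item_changed {
                changed = true;
            }
            retain
        });
    }
    iteration
}

/// Hypothetical stand-in for a `VarSubVar(a, b)` constraint: "if `from` is
/// reached, `to` must be reached too".
struct Edge {
    from: usize,
    to: usize,
}

/// Propagate a `reached` flag from node 0 along `edges`; returns the pass count.
fn propagate(n: usize, edges: Vec<Edge>) -> usize {
    let mut reached = vec![false; n];
    reached[0] = true;
    iterate_until_fixed_point(edges, |e| {
        if reached[e.from] && !reached[e.to] {
            reached[e.to] = true;
            (true, false) // value grew; this edge can do no more work
        } else {
            (false, !reached[e.to]) // keep only edges that may still fire
        }
    })
}

fn main() {
    let n = 6;
    // Edges listed in the direction values flow: one productive pass.
    let forward: Vec<Edge> = (0..n - 1).map(|i| Edge { from: i, to: i + 1 }).collect();
    // Same edges listed in the opposite order: one node advances per pass.
    let backward: Vec<Edge> = (0..n - 1).rev().map(|i| Edge { from: i, to: i + 1 }).collect();
    println!("forward:  {} passes", propagate(n, forward));
    println!("backward: {} passes", propagate(n, backward));
}
```

With six nodes the forward ordering finishes in 2 passes (one productive, one to confirm nothing changed), while the backward ordering needs 6, i.e. the pass count grows with the length of the chain.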

Most of the calls to `iterate_until_fixed_point` have a tiny number of constraints and iterations. But for `unicode_normalization` there are two exceptional calls.

The first exceptional call has 30,902 constraints (20,598 VarSubVar ones followed by 10,304 RegSubVar ones), coming from this match:
https://github.com/nnethercote/rustc-perf/blob/47adadaf3541db9382d5fb3cba5ca7c75064219f/collector/benchmarks/unicode_normalization/src/tables.rs#L1788-L3853

It takes 2,064 iterations to process; by the end the constraints list is empty.

The second exceptional call has 55,172 constraints (36,778 VarSubVar ones followed by 18,394 RegSubVar ones), coming from this match:

https://github.com/nnethercote/rustc-perf/blob/47adadaf3541db9382d5fb3cba5ca7c75064219f/collector/benchmarks/unicode_normalization/src/tables.rs#L3855-L7538

It takes 3,682 iterations to process; again, by the end the constraints list is empty.
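
For a sense of scale, multiplying constraints by iterations gives a rough upper bound on how often the closure runs in these two calls. It is only a bound: `retain` drops constraints along the way (the list is empty by the end), so the real number of closure calls is lower and was not measured here. A small check of the arithmetic, using only the figures quoted above (`upper_bound_calls` is a hypothetical helper, not part of rustc):

```rust
/// Upper bound on closure invocations: every constraint visited on every pass.
fn upper_bound_calls(constraints: u64, iterations: u64) -> u64 {
    constraints * iterations
}

fn main() {
    let first = upper_bound_calls(30_902, 2_064);
    let second = upper_bound_calls(55_172, 3_682);
    println!("first call:  at most {} closure calls", first);
    println!("second call: at most {} closure calls", second);
    println!("combined:    at most {}", first + second);
}
```

That is at most 63,781,728 calls for the first match and 203,143,304 for the second, about 267 million combined.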

cc @rust-lang/wg-compiler-performance

nnethercote · 2019-01-18T00:44:45Z

#57719 improves things by up to 4% by inlining `expand_node`. But that's just tweaking around the edges. An algorithmic improvement will be necessary to really improve things.

jens1o · 2019-01-26T10:32:38Z

ref #55528

nnethercote · 2019-01-27T10:01:27Z

No point having this open as well as #55528.

Centril added the I-compiletime label (Issue: Problems and improvements with respect to compile times.) Jan 18, 2019

nnethercote closed this as completed Jan 27, 2019

nnethercote mentioned this issue Jan 27, 2019: Long compilation time of crate "unicode-normalization" #55528 (Closed)
