Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PT Run] Improve the Win32 Program Indexing speed #11364

Merged

Conversation

royvou
Copy link
Contributor

@royvou royvou commented May 19, 2021

Summary of the Pull Request

What is this about:
Improve the performance of the Win32 scanning.

What is include in the PR:

  • Performance Improvements
    • Parallelize collection of all paths with multiple sources
    • Deduplicate all paths which are duplicate
    • Resolve all indexed program paths in parallel
      • When checking Steam/Epic games, do not read whole file if not needed
  • Refactored indexing on several points
    • Always LowerCase the fullpath (like done on other places
    • Updated some usage of more modern C# features where i think the code became more readable
  • Removed Indexing from constructor (so it won't run if the plugin is disabled)
  • Fix Chrome PWA's not having the correct AppType
  • Change Security Exceptions to Warnings in the logs
  • Allow Searching by executable name (e.g. WINWORD.EXE) exluded as this changes behavior

How does someone test / validate:

  • Run PT Run and check the logs if it's faster for that device
  • Disable PT Run within the Settings UI & the 'Microsoft.Plugin.Program.Main - Win32Program index cost ' will not be logged

Quality Checklist

Contributor License Agreement (CLA)

A CLA must be signed. If not, go over here and sign the CLA.

Additional Information

@htcfreek
Copy link
Collaborator

@royvou
#10981 can conflict with your PR!

@enricogior
Copy link
Contributor

@royvou
if you have the time, it would also be great to stop logging as exceptions errors that are not fatal. It's ok to report them as info/warn.

@royvou royvou force-pushed the feature/improvewin32scanningspeed branch from fd82b4c to d52e3c9 Compare May 20, 2021 22:07
@mykhailopylyp
Copy link
Contributor

@royvou
Is it ready for review or is it still in progress?

@royvou
Copy link
Contributor Author

royvou commented May 21, 2021

I need to verify the filewatcher/update deduplicate logic as the unit tests are failing.

Main logic/changes are completed though!
Made it as a draft to show the progress and if some useful (high-level) additions need to be done/would be nice

@royvou royvou force-pushed the feature/improvewin32scanningspeed branch from d52e3c9 to 2fe0b11 Compare May 21, 2021 19:44
@microsoft microsoft deleted a comment from github-actions bot May 22, 2021
@microsoft microsoft deleted a comment from github-actions bot May 22, 2021
@mykhailopylyp
Copy link
Contributor

Program plugin with changes doesn't show any results from the Program plugin.

Master branch:

image

Current branch:

image

@htcfreek
Copy link
Collaborator

@enricogior
Does it make sense to look at #10210 while refactoring plugin code?

@mykhailopylyp
Copy link
Contributor

mykhailopylyp commented May 24, 2021

Measured improvement. Debug build.

The branch:
Microsoft.Plugin.Program.Main - Win32Program index cost <796ms>
Microsoft.Plugin.Program.Main - Win32Program index cost <244ms>
Microsoft.Plugin.Program.Main - Win32Program index cost <245ms>

Master:
Microsoft.Plugin.Program.Main - Win32Program index cost <280ms>
Microsoft.Plugin.Program.Main - Win32Program index cost <265ms>
Microsoft.Plugin.Program.Main - Win32Program index cost <264ms>

@@ -17,6 +17,31 @@ namespace Microsoft.Plugin.Program.Logger
/// </summary>
internal static class ProgramLogger
{
/// <summary>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added Warn Method which could be used for non-fatal index related issues (e.g. Not found / not accessible / not parse able)

@@ -118,6 +104,20 @@ public void Init(PluginInitContext context)
_context.API.ThemeChanged += OnThemeChanged;

UpdateUWPIconPath(_context.API.GetCurrentTheme());

var a = Task.Run(() =>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from Constructor to Initialize method so it won't run if you disable the program plugin

@github-actions
Copy link

github-actions bot commented May 24, 2021

@check-spelling-bot Report

Unrecognized words, please review:

  • Executables
  • Hashset
  • Registery
  • WINWORD
Previously acknowledged words that are now absent apos autoupdate bc bh Chris's classmethod cls CMDARG cn cx cz deque df dw EB ev fd fody fx getmembers gh hc hh hk hu ip ismethod jp Kf lambson laute loadingbar messagebox multithreading nonwin NX overlaywindow pb Pn pv pw px QI rpc ru rv rx sz tadele td tt tw Tz ul uv vh vk vm vw Vx watsonportal wostringstream xa XY yy Zc zh zipfolder zm
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the [email protected]:royvou/PowerToys.git repository
on the feature/improvewin32scanningspeed branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spell-check/expect.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spell-check/expect.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
  --header "Content-Type: application/json" \
  "https://api.github.com/repos/microsoft/PowerToys/issues/comments/847356224" > "$comment_json"
comment_body=$(mktemp)
jq -r .body < "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")
  

patch_add=$(perl -e '$/=undef;
$_=<>;
s{<details>.*}{}s;
s{^#.*}{};
s{\n##.*}{};
s{(?:^|\n)\s*\*}{}g;
s{\s+}{ }g;
print' < "$comment_body")
  
update_files
rm $comment_body
git add -u
If you see a bunch of garbage

If it relates to a ...

well-formed pattern

See if there's a pattern that would match it.

If not, try writing one and adding it to the patterns.txt file.

Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

Note that patterns can't match multiline strings.

binary-ish string

Please add a file path to the excludes.txt file instead of just accepting the garbage.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

@@ -28,6 +27,8 @@ namespace Microsoft.Plugin.Program.Programs
[Serializable]
public class Win32Program : IProgram
{
public static readonly Win32Program InvalidProgram = new Win32Program { Valid = false, Enabled = false };
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single Instance Method (which can be reused to reduce allocations)

@@ -53,7 +54,7 @@ public class Win32Program : IProgram

public bool Enabled { get; set; }

public bool HasArguments { get; set; }
public bool HasArguments => !string.IsNullOrEmpty(Arguments);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter can easily be forgotten, this will calculate the HasArgument at runtime

@@ -104,10 +105,12 @@ public bool IsWebApplication()
{
// To Filter PWAs when the user searches for the main application
// All Chromium based applications contain the --app-id argument
// Reference : https://codereview.chromium.org/399045/show
// Reference : https://codereview.chromium.org/399045
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old link was throwing a 404 :(

@@ -119,9 +122,6 @@ public bool FilterWebApplication(string query)
return false;
}

// Set the subtitle to 'Web Application'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this check inside the LnkProgram

@@ -403,65 +395,62 @@ private static Win32Program InternetShortcutProgram(string path)
{
urlPath = line.Substring(urlPrefix.Length);

try
if (!Uri.TryCreate(urlPath, UriKind.RelativeOrAbsolute, out Uri _))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use "Safe" TryParse (which does not throw an exception, but we log the warning anyway for debugging purpose

}
}

private static readonly Regex InternetShortcutURLPrefixes = new Regex(@"^steam:\/\/(rungameid|run)\/|^com\.epicgames\.launcher:\/\/apps\/", RegexOptions.Compiled);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regex can be cached/reused

{
Name = Path.GetFileNameWithoutExtension(path),
ExecutableName = Path.GetFileName(path),
IcoPath = iconPath,
FullPath = urlPath,
FullPath = urlPath.ToLowerInvariant(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All Other Win32Programs use LowerInvariant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change it ToUpperInvariant as it is recommended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since its shown to the user (and all other types also use the lowercase variant) I don't think we should!

@royvou
Copy link
Contributor Author

royvou commented Jun 1, 2021

Ah sorry @htcfreek ! (Yes i can search on those names in this PR)

image
image

@htcfreek
Copy link
Collaborator

htcfreek commented Jun 1, 2021

Ah sorry @htcfreek ! (Yes i can search on those names in this PR)

image
image

@royvou
Which local do you have? Other than englisch? Maybe you see the reason why some tools not searchable by the translated name while working on this PR.

@royvou
Copy link
Contributor Author

royvou commented Jun 1, 2021

Ah sorry @htcfreek ! (Yes i can search on those names in this PR)
image
image

@royvou
Which local do you have? Other than englisch? Maybe you see the reason why some tools not searchable by the translated name while working on this PR.

EN-UK, but this seems not to have changed over the current PT Run release, so let's continue that in the other issue :)

@htcfreek
Copy link
Collaborator

htcfreek commented Jun 1, 2021

Ah sorry @htcfreek ! (Yes i can search on those names in this PR)
image
image

@royvou
Which local do you have? Other than englisch? Maybe you see the reason why some tools not searchable by the translated name while working on this PR.

EN-UK, but this seems not to have changed over the current PT Run release, so let's continue that in the other issue :)

Issue exists since while. :)

@crutkas
Copy link
Member

crutkas commented Jun 2, 2021

@royvou / @mykhailopylyp what is missing to close this out

@mykhailopylyp
Copy link
Contributor

@crutkas @royvou
I'm good with it as long as we keep win32 program hash function as it was before(that is with executable name) and a number of indexed programs don't differ after this change.

royvou added 12 commits June 2, 2021 19:40
…(from multiple sources) & Optimze Paralell indexing
- ProgramLogger.Warn allow Exception NULL
- Add "CustomProgramPaths" back
- Moved Enumeration/Distinct
   - ToList on each "ProgramSource"
   - Moved Enumeration/Distinct to All
   - Moved DisabledProgramList check to All
- Changed ToArray to ToList in All to optimize  allocations
- Add InitialSize to HashSet to prevent HashSet resize
@royvou royvou force-pushed the feature/improvewin32scanningspeed branch from 6855291 to 92012eb Compare June 2, 2021 18:44
@royvou
Copy link
Contributor Author

royvou commented Jun 2, 2021

@mykhailopylyp / @crutkas I'll revert the change that .lnks will always have the .exe name set to the name of the exe (e.g. WINWORD for word) and we should create a seperate issue for that how we want that to behave.
Now it will be the same as on master (e.g. no results)

Also added back the previous comparer.

@royvou royvou marked this pull request as ready for review June 2, 2021 18:46
@github-actions
Copy link

github-actions bot commented Jun 2, 2021

@check-spelling-bot Report

Unrecognized words, please review:

  • Executables
  • Hashset
  • Registery
Previously acknowledged words that are now absent autoupdate CMDARG Dbg deque multithreading nonwin overlaywindow watsonportal wostringstream zipfolder
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the [email protected]:royvou/PowerToys.git repository
on the feature/improvewin32scanningspeed branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spell-check/expect.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spell-check/expect.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
  --header "Content-Type: application/json" \
  "https://api.github.com/repos/microsoft/PowerToys/issues/comments/853295532" > "$comment_json"
comment_body=$(mktemp)
jq -r .body < "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")
  

patch_add=$(perl -e '$/=undef;
$_=<>;
s{<details>.*}{}s;
s{^#.*}{};
s{\n##.*}{};
s{(?:^|\n)\s*\*}{}g;
s{\s+}{ }g;
print' < "$comment_body")
  
update_files
rm $comment_body
git add -u
If you see a bunch of garbage

If it relates to a ...

well-formed pattern

See if there's a pattern that would match it.

If not, try writing one and adding it to the patterns.txt file.

Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

Note that patterns can't match multiline strings.

binary-ish string

Please add a file path to the excludes.txt file instead of just accepting the garbage.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

Copy link
Contributor

@mykhailopylyp mykhailopylyp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Good job! Tested, works as expected

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants