Commit graph

15 commits

Author SHA1 Message Date
Michael Eischer
55bea6e7a6 filter: Fix crash for '**' pattern 2021-05-14 23:50:31 +02:00
Michael Eischer
88c8e903d2 filter: Fix glob matching on absolute path marker on windows
A pattern part containing "/" is used to mark a path or a pattern as
absolute. However, on Windows the path separator is "\" such that glob
patterns like "?" could match the marker. The code now explicitly skips
the marker when the pattern does not represent an absolute path.
2020-10-09 16:11:05 +02:00
greatroar
740758a5fa Optimize filter pattern matching
By replacing "**" with "", checking for this special path component can
be reduced to a length-zero check.

name                          old time/op    new time/op    delta
FilterLines-8                   44.7ms ± 5%    44.9ms ± 5%     ~     (p=0.631 n=10+10)
FilterPatterns/Relative-8       13.6ms ± 4%    13.4ms ± 5%     ~     (p=0.165 n=10+10)
FilterPatterns/Absolute-8       10.9ms ± 5%    10.7ms ± 4%     ~     (p=0.052 n=10+10)
FilterPatterns/Wildcard-8       53.7ms ± 5%    50.4ms ± 5%   -6.00%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-8     128ms ± 2%      95ms ± 1%  -25.54%  (p=0.000 n=10+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-8       3.57MB ± 0%    3.57MB ± 0%     ~     (p=1.000 n=9+8)
FilterPatterns/Absolute-8       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.903 n=9+8)
FilterPatterns/Wildcard-8       19.7MB ± 0%    19.7MB ± 0%   -0.00%  (p=0.022 n=10+9)
FilterPatterns/ManyNoMatch-8    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-8        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-8        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-8        88.7k ± 0%     88.7k ± 0%     ~     (all equal)
FilterPatterns/ManyNoMatch-8     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-09 15:51:14 +02:00
Michael Eischer
8388e67c4b filter: cleanup path separator conversion 2020-10-07 21:14:07 +02:00
Michael Eischer
0acc3c5923 filter: special case patterns without globbing characters
In case a part of a path is a simple string, we can just check for
equality without complex parsing in filepath.Match.

name                          old time/op    new time/op    delta
FilterLines-4                   34.8ms ±17%    41.2ms ±23%  +18.36%  (p=0.000 n=10+10)
FilterPatterns/Relative-4       21.7ms ± 6%    12.1ms ±23%  -44.46%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4       10.0ms ± 5%     9.1ms ±11%   -9.80%  (p=0.006 n=10+9)
FilterPatterns/Wildcard-4       47.0ms ± 7%    42.2ms ± 5%  -10.19%  (p=0.000 n=9+10)
FilterPatterns/ManyNoMatch-4     190ms ± 1%     131ms ±20%  -31.47%  (p=0.000 n=8+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.870 n=9+9)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.145 n=10+10)
FilterPatterns/Wildcard-4       14.3MB ± 0%    19.7MB ± 0%  +37.91%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.421 n=10+9)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        88.7k ± 0%     88.7k ± 0%     ~     (all equal)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:55:43 +02:00
Michael Eischer
bcc3bddcf4 filter: only check whether a child path could match when necessary
When checking excludes there is no need to test whether a child path
could also match the pattern, as it is by definition excluded.
Previously childMayMatch was calculated but then discarded. For simple
absolute paths this can account for half the time spent for checking
pattern matches.

name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       23.3ms ± 9%    21.7ms ± 6%   -6.68%  (p=0.004 n=10+10)
FilterPatterns/Absolute-4       13.9ms ± 7%    10.0ms ± 5%  -27.61%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4       51.4ms ± 7%    47.0ms ± 7%   -8.51%  (p=0.001 n=9+9)
FilterPatterns/ManyNoMatch-4     551ms ± 9%     190ms ± 1%  -65.41%  (p=0.000 n=10+8)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.665 n=10+9)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.480 n=9+10)
FilterPatterns/Wildcard-4       14.3MB ± 0%    14.3MB ± 0%     ~     (p=0.431 n=9+10)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        88.7k ± 0%     88.7k ± 0%     ~     (all equal)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:47:52 +02:00
Michael Eischer
17c53efb0d filter: Optimize double wildcard expansion
This only allocates a single slice to expand the double wildcard and
only copies the pattern prefix once.

name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       22.7ms ± 5%    23.3ms ± 9%     ~     (p=0.353 n=10+10)
FilterPatterns/Absolute-4       14.2ms ±13%    13.9ms ± 7%     ~     (p=0.853 n=10+10)
FilterPatterns/Wildcard-4        266ms ±16%      51ms ± 7%  -80.67%  (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4     554ms ± 6%     551ms ± 9%     ~     (p=0.436 n=10+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.349 n=10+10)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.073 n=10+9)
FilterPatterns/Wildcard-4        141MB ± 0%      14MB ± 0%  -89.89%  (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        1.63M ± 0%     0.09M ± 0%  -94.56%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:47:48 +02:00
Michael Eischer
7959796269 filter: Special case for absolute paths
name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       23.6ms ±20%    22.7ms ± 5%     ~     (p=0.684 n=10+10)
FilterPatterns/Absolute-4       32.3ms ± 8%    14.2ms ±13%  -56.01%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        334ms ±17%     266ms ±16%  -20.56%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     709ms ± 7%     554ms ± 6%  -21.89%  (p=0.000 n=10+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       3.57MB ± 0%    3.57MB ± 0%   +0.00%  (p=0.046 n=9+10)
FilterPatterns/Absolute-4       3.57MB ± 0%    3.57MB ± 0%     ~     (p=0.464 n=10+10)
FilterPatterns/Wildcard-4        141MB ± 0%     141MB ± 0%     ~     (p=0.163 n=9+10)
FilterPatterns/ManyNoMatch-4    3.57MB ± 0%    3.57MB ± 0%     ~     (all equal)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Absolute-4        22.2k ± 0%     22.2k ± 0%     ~     (all equal)
FilterPatterns/Wildcard-4        1.63M ± 0%     1.63M ± 0%     ~     (p=0.072 n=10+10)
FilterPatterns/ManyNoMatch-4     22.2k ± 0%     22.2k ± 0%     ~     (all equal)
2020-10-07 20:47:29 +02:00
Michael Eischer
375c2a56de filter: Parse filter patterns only once
name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       30.3ms ±10%    23.6ms ±20%  -22.12%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4       49.0ms ± 3%    32.3ms ± 8%  -33.94%  (p=0.000 n=8+10)
FilterPatterns/Wildcard-4        345ms ± 9%     334ms ±17%     ~     (p=0.315 n=10+10)
FilterPatterns/ManyNoMatch-4     3.93s ± 2%     0.71s ± 7%  -81.98%  (p=0.000 n=9+10)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       4.63MB ± 0%    3.57MB ± 0%  -22.98%  (p=0.000 n=9+9)
FilterPatterns/Absolute-4       8.54MB ± 0%    3.57MB ± 0%  -58.20%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        146MB ± 0%     141MB ± 0%   -2.93%  (p=0.000 n=9+9)
FilterPatterns/ManyNoMatch-4     907MB ± 0%       4MB ± 0%  -99.61%  (p=0.000 n=9+9)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4        66.6k ± 0%     22.2k ± 0%  -66.67%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4        88.7k ± 0%     22.2k ± 0%  -75.00%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        1.70M ± 0%     1.63M ± 0%   -3.92%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     4.46M ± 0%     0.02M ± 0%  -99.50%  (p=0.000 n=10+10)
2020-10-07 20:47:27 +02:00
Michael Eischer
b8eacd1364 filter: Reduce redundant path and pattern splitting
A single call to filter.List will split the path only once and also
split each search pattern only once and use it for both match and
childMatch.

name                          old time/op    new time/op    delta
FilterPatterns/Relative-4       62.1ms ±15%    30.3ms ±10%  -51.22%  (p=0.000 n=9+10)
FilterPatterns/Absolute-4        111ms ±10%      49ms ± 3%  -56.08%  (p=0.000 n=10+8)
FilterPatterns/Wildcard-4        393ms ±15%     345ms ± 9%  -12.30%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     10.0s ± 3%      3.9s ± 2%  -60.53%  (p=0.000 n=10+9)

name                          old alloc/op   new alloc/op   delta
FilterPatterns/Relative-4       16.4MB ± 0%     4.6MB ± 0%  -71.76%  (p=0.000 n=10+9)
FilterPatterns/Absolute-4       31.4MB ± 0%     8.5MB ± 0%  -72.77%  (p=0.000 n=9+10)
FilterPatterns/Wildcard-4        168MB ± 0%     146MB ± 0%  -13.19%  (p=0.000 n=10+9)
FilterPatterns/ManyNoMatch-4    3.23GB ± 0%    0.91GB ± 0%  -71.96%  (p=0.000 n=10+9)

name                          old allocs/op  new allocs/op  delta
FilterPatterns/Relative-4         178k ± 0%       67k ± 0%  -62.50%  (p=0.000 n=10+10)
FilterPatterns/Absolute-4         266k ± 0%       89k ± 0%  -66.67%  (p=0.000 n=10+10)
FilterPatterns/Wildcard-4        1.87M ± 0%     1.70M ± 0%   -9.47%  (p=0.000 n=10+10)
FilterPatterns/ManyNoMatch-4     17.7M ± 0%      4.5M ± 0%  -74.87%  (p=0.000 n=9+10)
2020-10-07 18:13:19 +02:00
Alexander Neumann
14aead94b3 filter: Allow double wildcard in ChildMatch 2018-06-09 23:18:13 +02:00
Michael Pratt
92eb1cbffd filter: document recursive wildcards
Match/ChildMatch accept a ** pattern which is not noted in the doc
string, nor do any of the docs or tests specify whether the match is
greedy (i.e., can 'foo/**/bar' match paths with additional intermediate
bar directories?).

Add a note to the doc string and add test cases for greedy matches.
2017-09-04 14:38:48 -07:00
Loic Nageleisen
4a36993c19 Smarter filter when children won't match
This improves restore performance by several orders of magniture by not
going through the whole tree recursively when we can anticipate that no
match will ever occur.
2017-08-16 15:25:02 +02:00
Alexander Neumann
6caeff2408 Run goimports 2017-07-23 14:21:03 +02:00
Alexander Neumann
83d1a46526 Moves files 2017-07-23 14:19:13 +02:00
Renamed from src/restic/filter/filter.go (Browse further)