Skip to content
This repository has been archived by the owner on Nov 1, 2017. It is now read-only.

Updates should not ignore items with links #41

Merged
merged 6 commits into from
Sep 26, 2014

Conversation

mtodd
Copy link
Member

@mtodd mtodd commented Sep 25, 2014

A regression as part of #38 is that items followed immediately by links get ignored on update. Whoops. Here's a failing test to reproduce the user reported issue so we can work towards a fix.

cc @github/task-lists @rsese @aroben

@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

Could use some regexp help... struggling to distinguish when to update between these scenarios:

- [ ] one
- [ ] [two] # skip this
- [ ] (three) # and skip this
- [ ] [four] (url) # but not this
- [ ] [five] [ref] # nor this
- [ ] six

@aroben
Copy link
Contributor

aroben commented Sep 25, 2014

Do you know what line of code is failing?

@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

@aroben essentially, this stanza of the itemPattern helps us skip over list items that look like task list items but are links:

(?! # is not a link
\s* # with optional whitespace
(?:\(.*?\)|\[.*?\]) # because of destination or reference
)

But it needs to look further ahead to make sure that it isn't a task list followed by a link.

Feels like a fools errand. Might be worth splitting up the pattern matching to have an explicit ignore step that can better handle this kind of matching.

@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

Here's a scriptular reproduction to play with: http://bit.ly/ZePF7z

@aroben
Copy link
Contributor

aroben commented Sep 25, 2014

Seems like we want to skip even sequences of square bracket pairs. I.e.:

Skip:

- [ ] [ ]
- [ ] [ ] [ ] [ ]
- [ ] [ ] [ ] [ ] [ ] [ ]

Don't skip:

- [ ]
- [ ] [ ] [ ]
- [ ] [ ] [ ] [ ] [ ]

I'm sure there's some regex-fu that would do this given that JS regexes support back references.

@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

@aroben heh, that's a good observation. Will play around with that.

This is particularly frustrating since we're having to ignore what's essentially a mistake, - [ ] (just parenthetical) being turned into a link by Markdown.

In pseudocode, I'm thinking: match item, but not if it's actually a link, unless that link is a independent and separate from the checkbox. The wording may be confusing, but it's the "not...unless" part that seems intractable right now.

@aroben
Copy link
Contributor

aroben commented Sep 25, 2014

I think of it like this:

We match if:

  1. It is [ ] or [x], and
  2. it is not followed by \s*(.*) (because that would mean it itself is a link), and
  3. it is followed by an even number of \s*[.*] pairs (including 0 pairs) (because that means that there are some reference-style links that follow it that the initial brackets are not part of).

1 and 2 are easy. 3 is a little tricky, but I think doable.

@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

@aroben number 2 is incomplete since [ ] [ref] is also a valid link, which makes it harder to account for I think, or at least harder to reason about.

@aroben
Copy link
Contributor

aroben commented Sep 25, 2014

Item 3 handles that case. The item must be followed by an even number of [] pairs. In your example it's followed by 1 [] pair.

-Adam

On Sep 25, 2014, at 5:20 PM, Matt Todd [email protected] wrote:

@aroben number 2 is incomplete since [ ] [ref] is also a valid link, which makes it harder to account for I think, or at least harder to reason about.


Reply to this email directly or view it on GitHub.

@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

@aroben 👍 good point. Taking another swing at it now.

@aroben
Copy link
Contributor

aroben commented Sep 25, 2014

I think I was wrong about needing back references before. I think you can use a negative lookahead assertion to handle the ()-style links and a positive lookahead assertion to handle the []-style links. Something like:

///
  # the good stuff goes here
  (?!\s*\(.*?\)) # Not a [foo](url) link
  (?=
    \s*
    (?:\[.*?\]\s*){2}* # Must be followed by an even number of [] pairs (including 0)
    (?:[^[]|$)         # After all the pairs comes either something that's not a [, or the end of the line
  )
///

(?! # is not a link
\s* # with optional whitespace
(?:\(.*?\)|\[.*?\]) # because of destination or reference
)
(?=\s) # is followed by whitespace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs to be \s+ so that there can be more than one whitespace character.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I guess extra whitespace characters don't need to be matched here. They'll just be after the pattern (or consumed by \s* in the subsequent assertions).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a lookahead assertion at all? Would a bare \s work not enclosed in (?=)? (This may be unrelated to fixing this bug.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aroben lookahead probably isn't needed. Agreed on the extra whitespace not needing to be matched here. Unrelated to the issue, I believe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think removing the (?=) makes the test pass. Give it a try!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that when you have two positive lookahead assertions in a row they both need to match. Instead we want the \s to match and then the reference stuff to come after that.

@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

I'm to the point where I think it'd just be better to only handle the [ ] (*) case and punt on [ ] [*] case until we can revisit it. The former seems to be what's most affected in the real world.

Punting on proper handling for now.
@mtodd
Copy link
Member Author

mtodd commented Sep 25, 2014

@aroben this is what I have right now, and it's still failing:

diff --git a/app/assets/javascripts/task_list.coffee b/app/assets/javascripts/task_list.coffee
index f5cf946..7fec188 100644
--- a/app/assets/javascripts/task_list.coffee
+++ b/app/assets/javascripts/task_list.coffee
@@ -125,11 +125,21 @@ itemPattern = ///
     #{escapePattern(complete)}|
     #{escapePattern(incomplete)}
   )
-  (?=\s)                  # is followed by whitespace
+  \s                  # is followed by whitespace
   (?!                     # but not a link to this destination
     \s*
     \(.*?\)
   )
+  (?=                     # and not part of a link reference
+    \s*
+    (?:(?:\[.*?\]\s*){2})*
+    (?:[^\[]|$)
+  )
 ///

 # Used to filter out code fences from the source for comparison only.
Expected:   
"- [ ] [link label](link)
- [x] [reference label][reference]"
Result:     
"- [ ] [link label](link)
- [ ] [reference label][reference]"
Diff:   
 "-  [  ]  [link  label](link)
 - [x] [ ]  [reference  label][reference]" 

@aroben
Copy link
Contributor

aroben commented Sep 26, 2014

Hm, strange. In both Safari and Chrome's console things seem to work the way we want:

/^(?:\s*(?:>\s*)*(?:[-+*]|(?:\d+\.)))\s*(\[\s\]|\[[xX]\])\s(?!\s*\(.*?\))(?=\s*(?:(?:\[.*?\]\s*){2})*(?:[^\[]|$))/.exec("- [ ] [reference label][reference]")
["- [ ] ", "[ ]"]

…so why isn't it passing the test?

@aroben
Copy link
Contributor

aroben commented Sep 26, 2014

Oh, maybe phantomjs has a too-old JS engine.

@aroben
Copy link
Contributor

aroben commented Sep 26, 2014

Yup, that's the problem:

phantomjs> /^(?:\s*(?:>\s*)*(?:[-+*]|(?:\d+\.)))\s*(\[\s\]|\[[xX]\])\s(?!\s*\(.*?\))(?=\s*(?:(?:\[.*?\]\s*){2})*(?:[^\[]|$))/.exec("- [ ] [reference label][reference]")
null

@aroben
Copy link
Contributor

aroben commented Sep 26, 2014

I wonder if this is fixed in PhantomJS 2 /cc @eanakashima

@mtodd
Copy link
Member Author

mtodd commented Sep 26, 2014

Reproduced something that works here: http://bit.ly/ZfhL2s

@aroben seems that 765f60f works. Does that make sense to you?

@mtodd
Copy link
Member Author

mtodd commented Sep 26, 2014

Here's my PhantomJS version locally:

task_list:skip-links-not-items-with-links $ phantomjs -v
1.9.0

It looks like earlier changes are passing in CI (6+ minute long builds for the loss) which makes this sound reasonable.

Hopefully this most recent change will work though.

Update: 6+ minutes, not 20+.

@mtodd
Copy link
Member Author

mtodd commented Sep 26, 2014

Not loving regexp right meow 😼

@aroben
Copy link
Contributor

aroben commented Sep 26, 2014

@mtodd Nice. Good call on allowing the checklist item to be followed by a []() link. I simplified it a teeny bit in 9bd8e69. Feel free to pull that in if you'd like.

@mtodd
Copy link
Member Author

mtodd commented Sep 26, 2014

@aroben legit, let's see how CI feels about it.

@mtodd
Copy link
Member Author

mtodd commented Sep 26, 2014

Feeling that the style of testing employed here is masking which items are being ignored, which can lead us astray (pretty sure that happened a few times today). Need to make writing these tests much easier and simpler.

mtodd added a commit that referenced this pull request Sep 26, 2014
Updates should not ignore items with links
@mtodd mtodd merged commit b43e109 into master Sep 26, 2014
@mtodd mtodd deleted the skip-links-not-items-with-links branch September 26, 2014 00:58
@mtodd
Copy link
Member Author

mtodd commented Sep 26, 2014

@aroben really appreciate the help here. The brink of sanity is a scary place. ❤️

@aroben
Copy link
Contributor

aroben commented Sep 26, 2014

👬

@eanakashima
Copy link

I wonder if this is fixed in PhantomJS 2

Appears not to be. I also get null instead of ["- [ ] ", "[ ]"]. 😒

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants