
Commit

remove \cite{.*} from abstracts
mjpost committed Jan 15, 2020
1 parent 5d86edc commit 14a75ba
Showing 1 changed file with 10 additions and 7 deletions.
17 changes: 10 additions & 7 deletions bin/latex_to_unicode.py
@@ -179,16 +179,19 @@ def latex_to_unicode(s):
     s = s.replace(r"\&", "&")
     s = s.replace("`", "‘")

-    # Clean up
-    s = re.sub(r"(?<!\\)[{}]", "", s)  # unescaped curly braces
-    s = s.replace(r"\{", "{")
-    s = s.replace(r"\}", "}")
-
     def repl(s):
-        logging.warning("discarding control sequence {}".format(s.group(0)))
+        logging.warning(f"discarding control sequence '{s.group(0)}' from '{s.string}'")
         return ""

-    s = re.sub(r"\\[A-Za-z]+ |\\.", repl, s)
+    ### \cite
+    s = re.sub(r'\\cite \{[a-zA-Z0-9:]+\}', repl, s)

davidweichiang Jan 15, 2020

Collaborator

It's hard to know what to do with citations; deleting them might fail to give credit to the cited author, or even make the remaining text unintelligible.

At any rate, \citet, \citep, and possibly other citation commands should be checked for here as well.
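
As a rough illustration of the point about \citet and \citep, here is a minimal sketch of a broader pattern. It is not part of this commit: `CITE_RE` and `strip_citations` are hypothetical names, and the pattern assumes (like the committed regex) that at most one space separates the control sequence from its argument. It also accepts up to two optional `[...]` arguments and an optional star.

```python
import re

# Hypothetical pattern, not taken from the commit.
CITE_RE = re.compile(
    r"\\cite[A-Za-z]*\*?"      # \cite, \citet, \citep, \citeauthor, ...
    r" ?(?:\[[^\]]*\]){0,2}"   # up to two optional arguments, e.g. [p.~5]
    r" ?\{[^{}]*\}"            # braced, comma-separated citation keys
)


def strip_citations(s):
    """Remove citation commands from s and tidy the whitespace left behind."""
    s = CITE_RE.sub("", s)
    s = re.sub(r"\s+([.,;:])", r"\1", s)  # drop the space left before punctuation
    return re.sub(r" {2,}", " ", s).strip()


if __name__ == "__main__":
    # Parenthetical citations disappear cleanly ...
    print(strip_citations(r"Transformers \cite{vaswani2017} work well \citep[see][]{x:1,y-2}."))
    # ... but a textual citation leaves a broken sentence behind.
    print(strip_citations(r"As shown by \citet {smith:2019}, it works."))
```

The second call prints "As shown by, it works.", which is the unintelligibility concern raised above: dropping a textual citation such as \citet removes the grammatical subject along with the reference.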


+    # Clean up
+    # s = re.sub(r"(?<!\\)[{}]", "", s)  # unescaped curly braces
+    # s = s.replace(r"\{", "{")
+    # s = s.replace(r"\}", "}")
+
+    # s = re.sub(r"\\[A-Za-z]+ |\\.", repl, s)

     return s
