Is there any way to make the parentheses and space be contained in bibtex entrykey? #30

Open
hushidong opened this issue Mar 17, 2019 · 4 comments

Comments

@hushidong

hello

I am a biblatex/biber user and ran into an error while biber was parsing a bib file whose entries have entry keys containing parentheses or spaces. The biber/biblatex author told me that this is caused by btparse (biber uses btparse), so I am here to ask whether there is any improvement or workaround for this problem.

for an entry:

@misc{Euclidean_geometry(hi),
howpublished = {https://zh.wikipedia.org/wiki/abc},
title = {Euclidean geometry},
}

the biber output is:

INFO - This is Biber 2.12
INFO - Logfile is 'egtest.blg'
INFO - Reading 'egtest.bcf'
INFO - Using all citekeys in bib section 0
INFO - Processing section 0
INFO - Looking for bibtex format file 'egtest.bib' for section 0
INFO - LaTeX decoding ...
INFO - Found BibTeX data source 'egtest.bib'
WARN - BibTeX subsystem: C:\Users\ADMINI~1\AppData\Local\Temp\CQM6OZvJS8\egtest.bib_2912.utf8, line 6, warning: "(" in strange place -- should get a syntax error
ERROR - BibTeX subsystem: C:\Users\ADMINI~1\AppData\Local\Temp\CQM6OZvJS8\egtest.bib_2912.utf8, line 6, syntax error: found "(", expected ","
INFO - WARNINGS: 1
INFO - ERRORS: 1

and for an entry:

@misc{how are you,
howpublished = {{www.baidu.com}},
title = {how are you},
}

the biber output is:

INFO - This is Biber 2.12
INFO - Logfile is 'egtest4.blg'
INFO - Reading 'egtest4.bcf'
INFO - Using all citekeys in bib section 0
INFO - Processing section 0
INFO - Looking for bibtex format file 'egtest4.bib' for section 0
INFO - LaTeX decoding ...
INFO - Found BibTeX data source 'egtest4.bib'
ERROR - BibTeX subsystem: C:\Users\ADMINI~1\AppData\Local\Temp\Y796C4jmpi\egtest4.bib_6436.utf8, line 11, syntax error: found "are", expected ","
INFO - ERRORS: 1
@ambs
Owner

ambs commented Mar 17, 2019

Hi. I would love to be able to support that kind of thing. Unfortunately, btparse was written with an ancient version of ANTLR (called PCCTS at the time), and its code generation is no longer supported. We have been manually changing some details of the parser, but that doesn't let us make changes to the code quickly.

So, while I will keep this ticket open, I do not have the time to dig into the parser source and try to change its behaviour. My suggestion would therefore be to change the kind of keys used in your files. I do not see a great reason to have parentheses or spaces in citation keys 😄

@plk
Collaborator

plk commented Mar 17, 2019

The problem, as I remember it, is that the btparse parser allows normal parentheses to play the same role as curly braces, so parentheses cannot appear in keys (just as a key can't contain curly braces). The only option is to change the parser token codes for parentheses, but that would break the ability to use parentheses as braces, which some people might rely on. I agree that there is very little reason to use spaces or parentheses in keys. It's very rare indeed, and you will have trouble with many parsing libraries and tools anyway.

@hushidong
Author

OK, thanks. I will change the bib file, which is not hard to do with regular expressions.
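
A minimal sketch of such a regex-based cleanup, in Python. The function names and the replacement policy (runs of whitespace and parentheses become a single underscore) are assumptions chosen for illustration, not something prescribed by biber or btparse. It only handles brace-style entries whose key sits on one line and contains no comma.

```python
import re

# Opening of a brace-style entry: "@type{" followed by the key up to the
# first comma. @string, @preamble and @comment are skipped because they
# have no citation key.
ENTRY_HEAD = re.compile(
    r"^([ \t]*@(?!(?:string|preamble|comment)\b)\w+[ \t]*\{)([^,\n]*)(,)",
    re.MULTILINE | re.IGNORECASE,
)


def sanitize_key(key: str) -> str:
    """Replace runs of whitespace and parentheses with one underscore."""
    key = re.sub(r"[\s()]+", "_", key.strip())
    return key.strip("_")


def sanitize_bib(text: str) -> str:
    """Rewrite every entry key in a .bib source string; fields are untouched."""
    return ENTRY_HEAD.sub(
        lambda m: m.group(1) + sanitize_key(m.group(2)) + m.group(3), text
    )
```

The same mapping has to be applied to the citation keys in the LaTeX source, and two different keys could collapse to the same sanitized key, so the result is worth checking for duplicates.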

@zepinglee

Note that C<name> is a catch-all token used for entry types, citation
keys, field names, and macro names; because BibTeX has slightly
different (largely undocumented) rules for these various elements, a bit
of trickery is needed to make things work. As a starting point,

recognizes such a digit string as a number first. There are two
problems here: BibTeX entry keys may in fact be entirely numeric, and
field names may not begin with a digit. (Those are two of the

Actually, the NAME pattern is not used for entry keys. The regex pattern for entry keys should be [^ ,\t\n]* for parenthesis-style entries (@entrytype(...)) or [^ ,}\t\n]* for brace-style entries (@entrytype{...}).
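
To see what these two patterns accept, here is a small Python sketch. The helper name `is_valid_key` is hypothetical; the character classes are copied from the patterns above.

```python
import re

# Character classes taken verbatim from the two patterns quoted above.
PAREN_STYLE_KEY = re.compile(r"[^ ,\t\n]*")   # @entrytype(...)
BRACE_STYLE_KEY = re.compile(r"[^ ,}\t\n]*")  # @entrytype{...}


def is_valid_key(key: str, brace_style: bool = True) -> bool:
    """Return True if the whole key matches the pattern for its entry style."""
    pattern = BRACE_STYLE_KEY if brace_style else PAREN_STYLE_KEY
    return pattern.fullmatch(key) is not None
```

Under these patterns a key such as `Euclidean_geometry(hi)` is accepted and an all-digit key is fine, while `how are you` is rejected because of the spaces.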

The relevant code is located at bibtex.web#L6152-L6175. This procedure calls scan1_white(comma) or scan2_white(comma,right_brace), neither of which involves id_class (defined in L877-L896), which specifies the characters allowed in the current NAME.

Also note that the PEG for BibTeX provided by https://github.com/aclements/biblib is worth consulting.
