User:Jakob.scholbach/zeteo/Parsing

Supported formats

BibTeX files (e.g. from MathSciNet or Zentralblatt MATH)

  • example

@article {
AUTHOR = {Puppe, Dieter},
TITLE = {Homotopie und Homologie in abelschen Gruppen- und Monoidkomplexen. I. II},
JOURNAL = {Math. Z.},
FJOURNAL = {Mathematische Zeitschrift},
VOLUME = {68},
YEAR = {1958},
PAGES = {367--406, 407--421},
ISSN = {0025-5874},
}
  • supported fields: author, title, series, publisher, address, year, isbn, pages, number, fjournal, journal, issn, volume, mrnumber

(address goes to Publisher.location, fjournal goes to Journal.name, journal goes to Journal.abbrname, issn to Journal.issn)

  • similar entry types: @incollection, @inproceedings, @book; for @incollection, the additional field booktitle is supported (goes to title), and title goes to chapter.
  • removes { and }, and replaces \"a and similar umlauts and other diacritics with their Unicode equivalents
  • if there is an mrnumber, it goes to the id (using the {{MathSciNet}} template).
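
The field mapping and clean-up described above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the names FIELD_MAP, ACCENTS, clean_value and parse_bibtex_fields are hypothetical, the accent table is a tiny subset, and the sketch assumes one "KEY = {value}," field per line, as in the example.

```python
import re

# Hypothetical mapping from BibTeX fields to the target fields named above.
FIELD_MAP = {
    "address": "Publisher.location",
    "fjournal": "Journal.name",
    "journal": "Journal.abbrname",
    "issn": "Journal.issn",
}

# Small illustrative subset of LaTeX accent commands (assumption: a real
# table would cover many more diacritics).
ACCENTS = {
    r'\"a': "ä", r'\"o': "ö", r'\"u': "ü",
    r'\"A': "Ä", r'\"O': "Ö", r'\"U': "Ü",
    r"\'e": "é", r"\`e": "è",
}

# Matches one field per line, e.g.  AUTHOR = {Puppe, Dieter},
FIELD_RE = re.compile(r"^\s*(\w+)\s*=\s*\{(.*)\},?\s*$")


def clean_value(value):
    """Drop braces first, then replace the known accent commands."""
    value = value.replace("{", "").replace("}", "")
    for latex, char in ACCENTS.items():
        value = value.replace(latex, char)
    return value


def parse_bibtex_fields(entry):
    """Return a dict of cleaned fields, renamed according to FIELD_MAP."""
    fields = {}
    for line in entry.splitlines():
        match = FIELD_RE.match(line)
        if match:
            key = match.group(1).lower()
            fields[FIELD_MAP.get(key, key)] = clean_value(match.group(2))
    return fields
```

Removing the braces before replacing accents means that both {\"a} and \"{a} end up as ä in this sketch.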

{{cite book}}

supported fields: last, first, authorlink, author, coauthors, title, chapter, doi, journal, edition, url, series, publisher, location, year, date, accessdate, origdate, isbn, volume, id, pages

example

 {{cite book
 | author=Eisenbud, David 
 | title=Commutative algebra with a view toward algebraic geometry| location=New York 
 | publisher=Springer-Verlag 
 | Series=Graduate Texts in Mathematics 
 | Volume=150 
 | pages=xvi+785 
 | year=1995 
 | id=ISBN 0-387-94268-8}} 
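
As a rough illustration of how such a template call can be split into fields, here is a minimal sketch. The function name parse_cite_template is hypothetical. Lower-casing the field names (so that "Series" and "Volume" above are read as series and volume) is an assumption of this sketch, not something stated on this page. Pipes inside wikilinks are not handled.

```python
def parse_cite_template(text):
    """Split a {{cite ...}} call into (template_name, {field: value})."""
    inner = text.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call")
    parts = inner[2:-2].split("|")
    name = parts[0].strip().lower()
    fields = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:  # ignore unnamed parameters
            # Assumption: field names are treated case-insensitively.
            fields[key.strip().lower()] = value.strip()
    return name, fields
```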

{{cite journal}}

supported fields: last, first, author, authorlink, coauthors, title, url, doi, journal, series, issue, publisher, location, year, accessdate, origdate, isbn, volume, id, pages

  • {{cite web}} is currently not supported

{{citation}}

supported fields: last, first, author-link, last1, first1, author1-link etc., editor-last, editor-first, editor-link, editor1-last etc., title, edition, chapter, series, journal, issue, pages, publisher, location, accessdate, origdate, year, date, isbn, volume, doi, id, url, contribution (goes to chapter), contribution-url (goes to chapter-url)
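
The two renamings mentioned at the end of this list could be expressed as a small alias table. The names ALIASES and apply_aliases are hypothetical and only illustrate the idea.

```python
# Aliases taken from the list above: contribution -> chapter,
# contribution-url -> chapter-url.
ALIASES = {
    "contribution": "chapter",
    "contribution-url": "chapter-url",
}


def apply_aliases(fields):
    """Return a copy of fields with aliased names replaced."""
    return {ALIASES.get(key, key): value for key, value in fields.items()}
```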

XML files

example

 <wikitool application="cite">
 <query>
 <id type="isbn">0-071391401</id>
 </query>
 <response status="ok">
 <source>http://isbndb.com/api/books.xml?access_key=YourKeyHere&index1=isbn&results=details&value1=0-071391401</source>
 <content template="Template:Cite book">{{cite book |author=Harrison, Tinsley Randolph; Dennis L. Kasper |title=Harrison's principles of internal  medicine |publisher=McGraw-Hill Medical Publishing Division |location=New York |year=2005 |pages= |isbn=0-071391401 |oclc= |doi=}}</content>
 <paramlist>
 <param name="author">Harrison, Tinsley Randolph; Dennis L. Kasper</param>
 <param name="title">Harrison's principles of internal medicine</param>
 <param name="publisher">McGraw-Hill Medical Publishing Division</param>
 <param name="location">New York</param>
 <param name="year">2005</param>
 <param name="pages"/>
 <param name="isbn">0-071391401</param>
 <param name="oclc"/>
 <param name="doi"/>
 <param name="undef"/>
 </paramlist>
 </response>
 </wikitool>

(source: User:Diberri's tool)

supported fields: author, title, series, publisher, location, year, pages, oclc, doi, isbn, volume
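
A minimal sketch of reading the param elements with Python's standard xml.etree.ElementTree module follows. The function name parse_wikitool_xml is hypothetical, and skipping empty or unsupported params is an assumption. The input must be well-formed XML: a bare & inside the source URL, as in the example above, would have to be written as &amp; for a strict parser to accept it.

```python
import xml.etree.ElementTree as ET

# Field names taken from the "supported fields" list above.
SUPPORTED = {"author", "title", "series", "publisher", "location",
             "year", "pages", "oclc", "doi", "isbn", "volume"}


def parse_wikitool_xml(xml_text):
    """Collect the non-empty, supported <param> values into a dict."""
    root = ET.fromstring(xml_text)
    fields = {}
    for param in root.iter("param"):
        name = param.get("name")
        value = (param.text or "").strip()
        if name in SUPPORTED and value:
            fields[name] = value
    return fields
```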

General features

Warning

In some cases there will be a warning requesting the user's attention:

  • The title contains $ or \ (likely to be the result of LaTeX formatting or other incomplete parsing of BibTeX entries).
  • An author, publisher or journal does not (yet) exist in the database.
  • The volume is non-numeric (warning because this is likely due to misplaced edition information etc.).
  • 'Ed' or 'Edition' occurs in the title (the edition should be entered in the appropriate field).
  • '(', ')', 'Eds' or 'Editors' occurs in the author (likely this is an editor instead of an author, or something like '(transl.)', which should be put in the others field).
  • 'Ch.' or 'Chapter' occurs in chapter.

None of these warnings prevents saving the item as it is. If no warning occurs, the item is saved without further notice.
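
The checks listed above could look roughly like this. It is a sketch under several assumptions: collect_warnings is a hypothetical name, the item is a plain dict of strings, known_names stands in for the database lookup, and matching 'Ed', 'Edition', 'Eds' and 'Editors' as whole words is a guess at the intended behaviour.

```python
import re


def collect_warnings(item, known_names=()):
    """Return a list of warning messages for one parsed item."""
    warnings = []
    title = item.get("title", "")
    if "$" in title or "\\" in title:
        warnings.append("title contains $ or \\")
    # known_names is a stand-in for "exists in the database".
    for field in ("author", "publisher", "journal"):
        value = item.get(field)
        if value and value not in known_names:
            warnings.append(f"{field} does not exist in the database")
    volume = item.get("volume", "")
    if volume and not volume.isdigit():
        warnings.append("volume is non-numeric")
    if re.search(r"\b(Ed|Edition)\b", title):
        warnings.append("title mentions an edition")
    author = item.get("author", "")
    if "(" in author or ")" in author or re.search(r"\b(Eds|Editors)\b", author):
        warnings.append("author looks like an editor or contains a remark")
    if re.search(r"Ch\.|\bChapter\b", item.get("chapter", "")):
        warnings.append("chapter contains 'Ch.' or 'Chapter'")
    return warnings
```

Since no warning blocks saving, the function only reports problems and leaves the item unchanged.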

Author strings

Author strings will be parsed using the following algorithm:

  1. the string is separated into wikilinks and other strings.
  2. the wikilinks are parsed into a caption (which is then parsed into firstname and name) and a wikilink. A wikilink must contain only one author.
  3. the remaining strings are separated at ";".
  4. the resulting tokens are parsed as follows:
  • strings without any spaces or commas are understood as "name".
  • strings with commas are split at the commas and each piece is treated separately. If one of these pieces contains only single characters, ".", "-" or spaces, it is considered to be the firstname of the preceding author.
  • tokens without a comma: the last word is the "name", the rest is the "firstname". If "name" is actually a firstname (i.e. only single characters, ".", "-" and spaces) and "firstname" is not, the two are swapped.

example:

Sommerfeld J, [[Branko Grünbaum|Grünbaum, Branko]] ;Shephard, G. C., Klaus, Hansen, J.-P, Sommer

goes to

  • name=Sommerfeld, firstname=J
  • name=Grünbaum, firstname=Branko, wikilink=Branko Grünbaum
  • name=Shephard, firstname=G. C.
  • name=Klaus
  • name=Hansen, firstname=J.-P
  • name=Sommer
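
The algorithm can be sketched in Python as follows. This is an illustrative reading of the steps above, not the actual implementation. All function names are hypothetical, and the handling of a "Last, First" caption inside a wikilink (split at the first comma) is an assumption made so that the example comes out as shown.

```python
import re

WIKILINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def is_initials(text):
    """True if text consists only of single characters, '.', '-' and spaces."""
    parts = [p for p in re.split(r"[.\-\s]+", text) if p]
    return bool(parts) and all(len(p) == 1 for p in parts)


def parse_plain(piece):
    """Parse one comma-free piece: last word is the name, rest the firstname."""
    words = piece.split()
    if len(words) <= 1:
        return {"name": piece.strip()}
    name, firstname = words[-1], " ".join(words[:-1])
    if is_initials(name) and not is_initials(firstname):
        name, firstname = firstname, name  # e.g. "Sommerfeld J"
    return {"name": name, "firstname": firstname}


def parse_segment(segment):
    """Parse one ';'-separated token, splitting it at commas."""
    authors = []
    for piece in segment.split(","):
        piece = piece.strip()
        if not piece:
            continue
        if is_initials(piece) and authors and "firstname" not in authors[-1]:
            authors[-1]["firstname"] = piece  # initials of preceding author
        else:
            authors.append(parse_plain(piece))
    return authors


def parse_text(text):
    """Parse a string that contains no wikilinks."""
    authors = []
    for segment in text.split(";"):
        authors.extend(parse_segment(segment))
    return authors


def parse_wikilink(match):
    """Parse [[target|caption]]; assumes exactly one author per link."""
    target = match.group(1).strip()
    caption = (match.group(2) or target).strip()
    if "," in caption:
        # Assumption: a caption with a comma has the form "name, firstname".
        name, firstname = (s.strip() for s in caption.split(",", 1))
        author = {"name": name, "firstname": firstname}
    else:
        author = parse_plain(caption)
    author["wikilink"] = target
    return author


def parse_authors(text):
    """Split text into wikilinks and plain strings and parse each part."""
    authors, pos = [], 0
    for match in WIKILINK_RE.finditer(text):
        authors.extend(parse_text(text[pos:match.start()]))
        authors.append(parse_wikilink(match))
        pos = match.end()
    authors.extend(parse_text(text[pos:]))
    return authors
```

Calling parse_authors on the example string above yields the six authors listed, in the same order.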

Parsing of internal Wikilinks

If the title, author, publisher or journal contains a wikilink, it is extracted automatically to the appropriate field (e.g. the wikilink of the reference or of the author). For author, publisher and journal this happens only when the respective entry does not yet exist. Mixed titles etc. are possible (see below), but only one wikilink is allowed.

In addition, if the title is a wiki URL link (e.g. [http://en.wikipedia.org Wikipedia]), the URL is moved to the url field of the item and the true title is preserved. Mixed titles (The english [http://en.wikipedia.org Wikipedia]...) are also allowed (this gives url=http://en.wikipedia.org and title=The english Wikipedia...). However, only one URL is allowed, i.e. nothing like "[http://en.wikipedia.org Wikipedia] is an encyclopedia [http://blabla.com]".
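
A minimal sketch of this extraction is given below. The name extract_links is hypothetical, and raising an error when more than one wikilink or URL is found is an assumption; this page does not say how the tool reacts in that case.

```python
import re

INTERNAL_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")
EXTERNAL_RE = re.compile(r"\[(https?://[^\s\]]+)(?:\s+([^\]]*))?\]")


def extract_links(title):
    """Return a dict with the plain title and, if present, wikilink and url."""
    internal = INTERNAL_RE.findall(title)
    external = EXTERNAL_RE.findall(title)
    if len(internal) > 1 or len(external) > 1:
        # Assumption: unsupported input is rejected rather than guessed at.
        raise ValueError("only one wikilink and one URL are supported")
    result = {}
    if internal:
        result["wikilink"] = internal[0][0].strip()
        # Keep the caption (or the target if there is no caption) as text.
        title = INTERNAL_RE.sub(lambda m: m.group(2) or m.group(1), title)
    if external:
        result["url"] = external[0][0]
        # Keep only the link caption in the title.
        title = EXTERNAL_RE.sub(lambda m: m.group(2) or "", title)
    result["title"] = title.strip()
    return result
```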

Parsing of URL in title or chapter

If the title or chapter contains an external URL link, the URL is moved to the url field instead.

ISBN

If the id is something like "ISBN 1-234-56789-0", the number is stored in the isbn field instead of id.
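
This rule might be sketched as follows. The name move_isbn and the exact pattern are assumptions; in particular, the sketch removes the id field once its content has been moved.

```python
import re

# Assumed pattern: "ISBN", optional separator, then digits, hyphens,
# spaces and a possible X check character.
ISBN_RE = re.compile(r"^\s*ISBN[\s:]*([0-9][0-9Xx\- ]*)\s*$")


def move_isbn(item):
    """Return a copy of item with an ISBN-like id moved to the isbn field."""
    match = ISBN_RE.match(item.get("id", ""))
    if not match:
        return item
    moved = dict(item)
    moved["isbn"] = match.group(1).strip()
    del moved["id"]
    return moved
```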