Skaldic Poetry of the Scandinavian Middle Ages

8. Web interface help

B. Unicode conversion of database (TW)

The database was converted to Unicode (UTF-8 encoding) on 24 August 2009. This has caused some changes to the way the database works.

It is recommended that you install a MUFI-compliant Unicode font such as Junicode or Andron Scriptor Web. Assistants on the project should contact Tarrin to obtain the font used for quality control (Adobe Garamond Pro).

Notes:

  • Text is very unlikely to have been lost during these changes. If something appears to have disappeared, it is probably because the text is not displaying properly rather than the text having been deleted. In any case, everything has been backed up and can be recovered.
  • The database now uses modern Icelandic as the language collation, that is, the rules for comparing and ordering text. Accented vowels are therefore no longer treated the same as unaccented vowels when searching and ordering. In alphabetical order, for example, 'Auðunn' comes before 'Ámundi', and a search on 'armoðr' no longer matches 'Ármóðr'. If you can't produce the correct letter, you can use the underscore character in its place in searches, e.g. '_mundi' matches 'Ámundi'. The search fields and some browsing options (e.g. browsing by first letter) reflect these changes, and buttons are now provided to insert the non-English characters. For both searching and ordering, æ and œ are equivalent, as are ø and ö. For searching, o and ǫ are equivalent; in some cases the alphabetical ordering is adjusted so that ǫ is equivalent to ø/ö.
  • ReykholtTimes is no longer used as a font internally in the database. You can continue to use ReykholtTimes, and documents produced in ReykholtTimes, to enter and edit data in the database. The database converts such fields to Unicode for storage and converts them back again for editing. If you want to enter or edit a verse in Unicode instead of ReykholtTimes, you can do so by setting an option in the preferences form. Some other ReykholtTimes fields can be entered in Unicode through alternative forms.
  • The old quality control (‘print-friendly’) format has been replaced with the format tabled at the meeting in Uppsala. This uses the book font and encoding, and is much closer to the final product than the old QC format, making QC more reliable. If you have any problems with the format, let Tarrin know (but please install the Garamond Pro font first!).
  • The character o-hook-acute is not handled consistently. It does not exist in the Unicode standard as a single letter: it can be formed by joining o-hook and a combining acute accent, or o-acute and a combining hook accent. The MUFI project also defines an encoding for this character in the 'Private Use Area', but this does not work with non-MUFI fonts (i.e. almost all pre-installed fonts). O-hook-acute can therefore be displayed correctly, but searches and ordering of information may become unreliable. In most fields it is encoded as ô; in some fields formerly using ReykholtTimes it is encoded as o-hook-macron, and in others as a custom character. Until this issue is resolved, this character may cause display problems. However, the character is stored unambiguously, if not consistently, and no data will be lost or corrupted. If you need to insert or edit this character, use o-circumflex (ô and Ô) to represent it for the time being.
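As a rough illustration of why o-hook-acute is awkward, the following Python sketch builds the character in the two ways described above and compares them. It is not part of the database code. It uses only Python's standard unicodedata module, and the variable and function names are made up for this example.

```python
import unicodedata

O_HOOK = "\u01eb"            # LATIN SMALL LETTER O WITH OGONEK (o-hook)
O_ACUTE = "\u00f3"           # LATIN SMALL LETTER O WITH ACUTE
COMBINING_ACUTE = "\u0301"   # COMBINING ACUTE ACCENT
COMBINING_HOOK = "\u0328"    # COMBINING OGONEK

# The two spellings described in the note above.
via_hook = O_HOOK + COMBINING_ACUTE
via_acute = O_ACUTE + COMBINING_HOOK


def nfc(text: str) -> str:
    """Return the canonical composed (NFC) form of text."""
    return unicodedata.normalize("NFC", text)


# As raw strings the two spellings are different code point sequences.
print(via_hook == via_acute)            # False

# After normalisation they become the same sequence.
print(nfc(via_hook) == nfc(via_acute))  # True

# Even the composed form needs two code points, because Unicode has
# no single precomposed letter for o-hook-acute.
print(len(nfc(via_hook)))               # 2
```

Because the two spellings differ until they are normalised, a search or sort that compares the stored text directly can treat them as different strings, which is one way the unreliability mentioned above can arise.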

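The search behaviour described in the collation note can be tried out with a small Python sketch. It uses SQLite's LIKE operator from the standard sqlite3 module purely as a stand-in, on the assumption that the underscore in the search fields behaves like the SQL single-character wildcard. The Icelandic collation itself is not reproduced, so the equivalences (æ/œ, ø/ö, o/ǫ) and the ordering are not shown, and the helper name like_match is made up for this example.

```python
import sqlite3


def like_match(value: str, pattern: str) -> bool:
    """Return True if value matches the SQL LIKE pattern (SQLite rules)."""
    conn = sqlite3.connect(":memory:")
    try:
        (matched,) = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
        return bool(matched)
    finally:
        conn.close()


# An underscore stands for exactly one character, accented or not.
print(like_match("Ámundi", "_mundi"))   # True

# Unaccented letters do not match their accented counterparts.
print(like_match("Ármóðr", "armoðr"))   # False

# Replacing each letter you cannot type with an underscore still matches.
print(like_match("Ármóðr", "_rm_ðr"))   # True
```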


Other notes...

Known bugs and things to do:
  • still uncertain about how to treat o-hook-acute (in some fields it is encoded as o-circumflex; in former ReykholtTimes fields, as o-ogonek-macron)
  • prose order and translation forms are not fully generalised to use the levels-definition fields (to determine auto-filled text, reordering, etc.)
Log of changes:
  • 18-21/6/09: preparations: conversion scripts, updates to database
  • 24/6/09: backed up database; converted database to Unicode; started changing web forms and scripts
  • 25/6/09: fixed various problems; backed up database again; converted õ to o-hook in all text fields
Detail of changes:
  • Auto conversions: run convert2utf8.php to do conversions (see comments); run unicodeencode.php to convert Reykholt fields to UTF-8
  • add to db-vals: $encoding = 'utf8'; $collation= 'utf8_icelandic_ci';
  • add to lib-db, lib-db-edit: mysql_query("SET character_set_results = '$encoding',....");
  • add new rt-uni functions to lib-trans.php
  • add new lines to lib-db-edit.php (see bits marked with #----)
  • update everything in lib/php/view: Windows-1252 > 'UTF-8' [perl -pi~ -e 's/Windows-1252/UTF-8/;' *.php; rm *~]
  • lib/php/view/lib-verses-app.php CHAR(171) > UNHEX('CBA3')
  • php_query - 39 '<sup>x</sup>' > UNHEX('CBA3')
  • *** reykholt/unicode possibilities for extended verse editing forms ***
  • *** what to do with 8-bit encoding õ and ô ??? ***
  • change queries: SELECT id FROM php_view WHERE body_query LIKE '%convert%', php_query.sql, etc. -- ???
  • ReykholtTimes fields: app.corr_from, app.note, app.reading, notes.note, poems.introduction, refs.transcription, skalds.biography, verses.context, verses.editions, verses.intro, verses.lg_alt, verses.run_rdg, verses.run_rdg_notes, verses.skjatext (see //plato/var/local/find_ents_in_reykholt_fields.pl)
  • ReykholtTimes forms: verses...
  • ReykholtTimes conversion function for forms which use RT
  • --- notes ---
  • Reykholt fields only use entities outside the Latin-1 range (assumption, but checked)
  • Reykholt fields duplicated as *_reykholt
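A minimal Python sketch of the kind of conversion logged above, for illustration only. The actual work was done by convert2utf8.php and unicodeencode.php, whose contents are not shown here, so the function name, the sample value, and the idea of doing both steps in one pass are assumptions. The sketch decodes Windows-1252 bytes, replaces õ with o-hook as in the 25/6/09 log entry, and re-encodes the result as UTF-8. It also checks which character the bytes in UNHEX('CBA3') represent.

```python
import unicodedata


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode Windows-1252 bytes as UTF-8, mapping õ to ǫ (o-hook)."""
    text = raw.decode("cp1252")
    # Lowercase only, following the log entry; uppercase is not covered.
    text = text.replace("\u00f5", "\u01eb")
    return text.encode("utf-8")


sample = b"h\xf5nd"  # hypothetical field value: 'hõnd' in Windows-1252
converted = cp1252_to_utf8(sample)
print(converted)  # b'h\xc7\xabnd', i.e. 'hǫnd' in UTF-8

# The bytes CB A3 are the UTF-8 encoding of a single character.
marker = bytes.fromhex("CBA3").decode("utf-8")
print(f"U+{ord(marker):04X}", unicodedata.name(marker))
# U+02E3 MODIFIER LETTER SMALL X
```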
© Skaldic Project Academic Body, unless otherwise noted. Database structure and interface developed by Tarrin Wills. All users of material on this database are reminded that its content may either be subject to copyright restrictions or be the property of the custodians of linked databases that have given permission for members of the skaldic project to use their material for research purposes. Those users who have been given access to as yet unpublished material are further reminded that they may not use, publish or otherwise manipulate such material except with the express permission of the individual editor of the material in question and the General Editor of the volume in which the material is to be published. Applications for permission to use such material should be made in the first instance to the General Editor of the volume in question. All information that appears in the published volumes has been thoroughly reviewed. If you believe some information here is incorrect, please contact Tarrin Wills with full details.